ERIC Educational Resources Information Center
Siman-Tov, Ayelet; Kaniel, Shlomo
2011-01-01
The research validates a multivariate model that predicts parental adjustment to coping successfully with an autistic child. The model comprises four elements: parental stress, parental resources, parental adjustment and the child's autism symptoms. 176 parents of children aged between 6 and 16 diagnosed with PDD answered several questionnaires…
McKinney, Cliff; Renk, Kimberly
2008-06-01
Although parent-adolescent interactions have been examined, relevant variables have not been integrated into a multivariate model. As a result, this study examined a multivariate model of parent-late adolescent gender dyads in an attempt to capture important predictors of late adolescents' unique transition to adulthood. The sample for this study consisted of 151 male and 324 female late adolescents, who reported on their mothers' and fathers' parenting style, their family environment, their mothers' and fathers' expectations for them, the conflict that they experience with their mothers and fathers, and their own adjustment. Overall, the variables had significant relationships with one another. Further, the male-father, male-mother, and female-father structural equation models that were examined suggested that parenting style has an indirect relationship with late adolescents' adjustment through characteristics of the family environment and the conflict that is experienced in families; such findings were not evident for the female-mother model. Thus, the examination of parent-late adolescent interactions should occur in the context of the gender of parents and their late adolescents. PMID:17710537
ERIC Educational Resources Information Center
McKinney, Cliff; Renk, Kimberly
2008-01-01
Although parent-adolescent interactions have been examined, relevant variables have not been integrated into a multivariate model. As a result, this study examined a multivariate model of parent-late adolescent gender dyads in an attempt to capture important predictors in late adolescents' important and unique transition to adulthood. The sample…
Hirozawa, Anne M; Montez-Rath, Maria E; Johnson, Elizabeth C; Solnit, Stephen A; Drennan, Michael J; Katz, Mitchell H; Marx, Rani
2016-01-01
We compared prospective risk adjustment models for adjusting patient panels at the San Francisco Department of Public Health. We used 4 statistical models (linear regression, two-part model, zero-inflated Poisson, and zero-inflated negative binomial) and 4 subsets of predictor variables (age/gender categories, chronic diagnoses, homelessness, and a loss to follow-up indicator) to predict primary care visit frequency. Predicted visit frequency was then used to calculate patient weights and adjusted panel sizes. The two-part model using all predictor variables performed best (R² = 0.20). This model, designed specifically for safety net patients, may prove useful for panel adjustment in other public health settings. PMID:27576054
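The two-part decomposition behind the winning model, E[Y] = P(Y > 0) · E[Y | Y > 0], can be sketched in a few lines. The groups, rates, and panel below are simulated stand-ins for illustration, not the San Francisco data or the authors' exact regression specification:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Simulated panel: two illustrative patient groups (stand-ins for the
# age/gender and diagnosis categories used as predictors in the abstract).
group = rng.integers(0, 2, size=n)
p_any = np.where(group == 1, 0.9, 0.5)     # P(at least one visit)
mu = np.where(group == 1, 6.0, 2.0)        # mean visit count when "active"
visits = (rng.random(n) < p_any) * rng.poisson(mu)

# Part 1 models whether a patient visits at all; part 2 models how often,
# given any visits.  Expected use is the product of the two parts.
expected = np.empty(n)
for g in (0, 1):
    m = group == g
    p_hat = (visits[m] > 0).mean()             # part 1 estimate
    mu_hat = visits[m][visits[m] > 0].mean()   # part 2 estimate
    expected[m] = p_hat * mu_hat

# Patient weights: expected use relative to the clinic-wide average.
# An adjusted panel size counts each patient by expected workload.
weights = expected / expected.mean()
adjusted_panel = weights.sum()   # equals the raw panel size by construction
```

Heavier-use groups receive weights above 1, so a panel full of them counts as "larger" than its raw headcount when workloads are compared across clinics.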
Adjustment of geochemical background by robust multivariate statistics
Zhou, D.
1985-01-01
Conventional analyses of exploration geochemical data assume that the background is a constant or slowly changing value, equivalent to a plane or a smoothly curved surface. However, it is better to regard the geochemical background as a rugged surface, varying with changes in geology and environment. This rugged surface can be estimated from observed geological, geochemical and environmental properties by using multivariate statistics. A method of background adjustment was developed and applied to groundwater and stream sediment reconnaissance data collected from the Hot Springs Quadrangle, South Dakota, as part of the National Uranium Resource Evaluation (NURE) program. Source-rock lithology appears to be a dominant factor controlling the chemical composition of groundwater or stream sediments. The most efficacious adjustment procedure is to regress uranium concentration on selected geochemical and environmental variables for each lithologic unit, and then to delineate anomalies by a common threshold set as a multiple of the standard deviation of the combined residuals. Robust versions of regression and RQ-mode principal components analysis techniques were used rather than ordinary techniques to guard against distortion caused by outliers. Anomalies delineated by this background adjustment procedure correspond with uranium prospects much better than do anomalies delineated by conventional procedures. The procedure should be applicable to geochemical exploration at different scales for other metals. © 1985.
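The core of the adjustment procedure, a robust regression per lithologic unit with anomalies flagged beyond a multiple of the residual standard deviation, can be sketched as follows. The Huber-weighted IRLS and the simulated uranium values are illustrative assumptions, not the NURE data or the paper's exact robust estimator:

```python
import numpy as np

def huber_irls(X, y, k=1.345, iters=50):
    """Huber-weighted iteratively reweighted least squares regression."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(iters):
        r = y - X @ beta
        s = np.median(np.abs(r - np.median(r))) / 0.6745   # MAD scale
        w = np.minimum(1.0, k * s / np.maximum(np.abs(r), 1e-12))
        sw = np.sqrt(w)                                    # weighted LS
        beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
    return beta

rng = np.random.default_rng(1)
n = 200
lith = rng.normal(size=n)                  # stand-in geochemical covariate
X = np.column_stack([np.ones(n), lith])
u = 2.0 + 0.5 * lith + rng.normal(scale=0.2, size=n)   # background uranium
u[:5] += 3.0                               # five anomalous samples

beta = huber_irls(X, u)                    # robust fit resists the outliers
residuals = u - X @ beta
threshold = 2.0 * residuals.std()          # multiple-of-sigma threshold
anomalies = np.where(residuals > threshold)[0]
```

Because the robust fit is not dragged toward the outliers, the anomalous samples stand out cleanly in the residuals; an ordinary least-squares fit would partially absorb them into the "background".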
A "Model" Multivariable Calculus Course.
ERIC Educational Resources Information Center
Beckmann, Charlene E.; Schlicker, Steven J.
1999-01-01
Describes a rich, investigative approach to multivariable calculus. Introduces a project in which students construct physical models of surfaces that represent real-life applications of their choice. The models, along with student-selected datasets, serve as vehicles to study most of the concepts of the course from both continuous and discrete…
Multivariate Model of Infant Competence.
ERIC Educational Resources Information Center
Kierscht, Marcia Selland; Vietze, Peter M.
This paper describes a multivariate model of early infant competence formulated from variables representing infant-environment transaction including: birthweight, habituation index, personality ratings of infant social orientation and task orientation, ratings of maternal responsiveness to infant distress and social signals, and observational…
Kautter, John; Pope, Gregory C.
2004-01-01
The authors document the development of the CMS frailty adjustment model, a Medicare payment approach that adjusts payments to a Medicare managed care organization (MCO) according to the functional impairment of its community-residing enrollees. Beginning in 2004, this approach is being applied to certain organizations, such as Program of All-Inclusive Care for the Elderly (PACE), that specialize in providing care to the community-residing frail elderly. In the future, frailty adjustment could be extended to more Medicare managed care organizations. PMID:25372243
Estimation of Data Uncertainty Adjustment Parameters for Multivariate Earth Rotation Series
NASA Technical Reports Server (NTRS)
Sung, Li-yu; Steppe, J. Alan
1994-01-01
We have developed a maximum likelihood method to estimate a set of data uncertainty adjustment parameters, including scaling factors and additive variances and covariances, for multivariate Earth rotation series.
Multivariate pluvial flood damage models
Van Ootegem, Luc; Verhofstadt, Elsy; Van Herck, Kristine; Creten, Tom
2015-09-15
Depth–damage functions, relating monetary flood damage to the depth of the inundation, are commonly used in the case of fluvial floods (floods caused by a river overflowing). We construct four multivariate damage models for pluvial floods (caused by extreme rainfall) by differentiating on the one hand between ground floor floods and basement floods and on the other hand between damage to residential buildings and damage to housing contents. We take into account not only the effect of flood depth on damage, but also the effects of non-hazard indicators (building characteristics, behavioural indicators and socio-economic variables). By using a Tobit estimation technique on identified victims of pluvial floods in Flanders (Belgium), we account for cases of reported zero damage. Our results show that flood depth is an important predictor of damage, but with a diverging impact between ground floor floods and basement floods. Non-hazard indicators are also important. For example, being aware of the risk just before the water enters the building reduces content damage considerably, underlining the importance of warning systems and policy in the case of pluvial floods.
Highlights:
• Prediction of damage of pluvial floods using also non-hazard information
• We include 'no damage cases' using a Tobit model.
• The effect of flood depth is stronger for ground floor than for basement floods.
• Non-hazard indicators are especially important for content damage.
• Potential gain of policies that increase awareness of flood risks.
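Tobit estimation handles the many reported-zero-damage cases by treating them as left-censored observations. A minimal sketch of Tobit maximum likelihood with a single flood-depth predictor follows; the predictor and simulated damages are assumptions for illustration, not the Flanders survey data:

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

rng = np.random.default_rng(2)
n = 500
depth = rng.uniform(0.0, 2.0, n)                 # flood depth (illustrative)
X = np.column_stack([np.ones(n), depth])
beta_true, sigma_true = np.array([-0.5, 1.5]), 1.0
latent = X @ beta_true + rng.normal(scale=sigma_true, size=n)
damage = np.maximum(latent, 0.0)                 # zeros = "no damage" reports

def negloglik(theta):
    """Tobit (left-censored at 0) negative log-likelihood."""
    beta, s = theta[:2], np.exp(theta[2])        # log-sigma keeps s > 0
    xb = X @ beta
    zero = damage <= 0.0
    ll = norm.logcdf(-xb[zero] / s).sum()        # censored contribution
    ll += (norm.logpdf((damage[~zero] - xb[~zero]) / s) - np.log(s)).sum()
    return -ll

fit = minimize(negloglik, x0=np.zeros(3), method="BFGS")
beta_hat = fit.x[:2]                             # recovers the depth effect
```

Fitting ordinary least squares to the same data would bias the depth coefficient toward zero, because the pile of zeros is treated as actual damage values rather than censored outcomes.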
ERIC Educational Resources Information Center
Loehlin, John C.; Neiderhiser, Jenae M.; Reiss, David
2005-01-01
Adolescent adjustment measures may be related to each other and to the social environment in various ways. Are these relationships similar in genetic and environmental sources of covariation, or different? A multivariate behaviorgenetic analysis was made of 6 adjustment and 3 treatment composites from the study Nonshared Environment in Adolescent…
A multivariate CAR model for mismatched lattices.
Porter, Aaron T; Oleson, Jacob J
2014-10-01
In this paper, we develop a multivariate Gaussian conditional autoregressive model for use on mismatched lattices. Most current multivariate CAR models are designed for each multivariate outcome to utilize the same lattice structure. In many applications, a change of basis will allow different lattices to be utilized, but this is not always the case, because a change of basis is not always desirable or even possible. Our multivariate CAR model allows each outcome to have a different neighborhood structure which can utilize different lattices for each structure. The model is applied in two real data analyses. The first is a Bayesian learning example in mapping the 2006 Iowa Mumps epidemic, which demonstrates the importance of utilizing multiple channels of infection flow in mapping infectious diseases. The second is a multivariate analysis of poverty levels and educational attainment in the American Community Survey. PMID:25457598
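The univariate building block that such models extend is the proper CAR prior, defined through its precision matrix. A toy sketch on a small lattice (the standard univariate form, not the paper's mismatched-lattice construction):

```python
import numpy as np

def car_precision(W, rho, tau=1.0):
    """Precision matrix of a proper CAR model: Q = tau * (D - rho * W)."""
    D = np.diag(W.sum(axis=1))     # D holds each area's neighbor count
    return tau * (D - rho * W)

# Adjacency of a 4-node path graph, standing in for an areal lattice.
W = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

Q = car_precision(W, rho=0.9)      # |rho| < 1 keeps Q positive definite
eigvals = np.linalg.eigvalsh(Q)    # all positive for a proper CAR
cov = np.linalg.inv(Q)             # implied spatial covariance
```

Multivariate CAR models replace the scalar tau and rho with matrices coupling the outcomes; the mismatched-lattice version additionally lets each outcome carry its own W.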
Learning Adaptive Forecasting Models from Irregularly Sampled Multivariate Clinical Data
Liu, Zitao; Hauskrecht, Milos
2016-01-01
Building accurate predictive models of clinical multivariate time series is crucial for understanding the patient's condition, the dynamics of a disease, and clinical decision making. A challenging aspect of this process is that the model should be flexible and adaptive, reflecting patient-specific temporal behaviors well even when the available patient-specific data are sparse and cover only a short time span. To address this problem we propose and develop an adaptive two-stage forecasting approach for modeling multivariate, irregularly sampled clinical time series of varying lengths. The proposed model (1) learns the population trend from a collection of time series for past patients; (2) captures individual-specific short-term multivariate variability; and (3) adapts by automatically adjusting its predictions based on new observations. The proposed forecasting model is evaluated on a real-world clinical time series dataset. The results demonstrate the benefits of our approach on the prediction tasks for multivariate, irregularly sampled clinical time series, and show that it can outperform both the population-based and patient-specific time series prediction models in terms of prediction accuracy. PMID:27525189
DUALITY IN MULTIVARIATE RECEPTOR MODEL. (R831078)
Multivariate receptor models are used for source apportionment of multiple observations of compositional data of air pollutants that obey mass conservation. Singular value decomposition of the data leads to two sets of eigenvectors. One set of eigenvectors spans a space in whi...
Quality Reporting of Multivariable Regression Models in Observational Studies
Real, Jordi; Forné, Carles; Roso-Llorach, Albert; Martínez-Sánchez, Jose M.
2016-01-01
Controlling for confounders is a crucial step in analytical observational studies, and multivariable models are widely used as statistical adjustment techniques. However, the validation of the assumptions of the multivariable regression models (MRMs) should be made clear in scientific reporting. The objective of this study is to review the quality of statistical reporting of the most commonly used MRMs (logistic, linear, and Cox regression) that were applied in analytical observational studies published between 2003 and 2014 by journals indexed in MEDLINE. We reviewed a representative sample of articles indexed in MEDLINE (n = 428) with an observational design and use of MRMs (logistic, linear, and Cox regression). We assessed the quality of reporting of: model assumptions and goodness-of-fit, interactions, sensitivity analysis, crude and adjusted effect estimates, and specification of more than 1 adjusted model. The tests of underlying assumptions or goodness-of-fit of the MRMs used were described in 26.2% (95% CI: 22.0–30.3) of the articles, and 18.5% (95% CI: 14.8–22.1) reported the interaction analysis. Reporting of all items assessed was higher in articles published in journals with a higher impact factor. A low percentage of articles indexed in MEDLINE that used multivariable techniques provided information demonstrating rigorous application of the model selected as an adjustment method. Given the importance of these methods to the final results and conclusions of observational studies, greater rigor is required in reporting the use of MRMs in the scientific literature. PMID:27196467
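The reported interval for the first percentage is consistent with the usual normal approximation for a proportion; a quick check, assuming that approximation was the one used:

```python
import math

def prop_ci_pct(p, n, z=1.96):
    """Normal-approximation 95% CI for a proportion, in percent."""
    half = z * math.sqrt(p * (1.0 - p) / n)
    return 100.0 * (p - half), 100.0 * (p + half)

# 26.2% of n = 428 articles described assumption/goodness-of-fit tests.
lo, hi = prop_ci_pct(0.262, 428)   # close to the reported 22.0-30.3
```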
Bayesian Transformation Models for Multivariate Survival Data
DE CASTRO, MÁRIO; CHEN, MING-HUI; IBRAHIM, JOSEPH G.; KLEIN, JOHN P.
2014-01-01
In this paper we propose a general class of gamma frailty transformation models for multivariate survival data. The transformation class includes the commonly used proportional hazards and proportional odds models. The proposed class also includes a family of cure rate models. Under an improper prior for the parameters, we establish propriety of the posterior distribution. A novel Gibbs sampling algorithm is developed for sampling from the observed data posterior distribution. A simulation study is conducted to examine the properties of the proposed methodology. An application to a data set from a cord blood transplantation study is also reported. PMID:24904194
Dehesh, Tania; Zare, Najaf; Ayatollahi, Seyyed Mohammad Taghi
2015-01-01
Background. The univariate meta-analysis (UM) procedure, a technique that provides a single overall result, has become increasingly popular. Neglecting the existence of other concomitant covariates in the models leads to loss of treatment efficiency. Our aim was to propose four new approximation approaches for the covariance matrix of the coefficients, which is not readily available for the multivariate generalized least square (MGLS) method as a multivariate meta-analysis approach. Methods. We evaluated the efficiency of four new approaches including zero correlation (ZC), common correlation (CC), estimated correlation (EC), and multivariate multilevel correlation (MMC) on the estimation bias, mean square error (MSE), and 95% probability coverage of the confidence interval (CI) in the synthesis of Cox proportional hazard model coefficients in a simulation study. Results. Comparing the results of the simulation study on the MSE, bias, and CI of the estimated coefficients indicated that the MMC approach was the most accurate procedure compared to the EC, CC, and ZC procedures. The precision ranking of the four approaches according to all the above settings was MMC ≥ EC ≥ CC ≥ ZC. Conclusion. This study highlights the advantages of MGLS meta-analysis over the UM approach. The results suggest the use of the MMC procedure to overcome the lack of information for having a complete covariance matrix of the coefficients. PMID:26413142
Response Surface Modeling Using Multivariate Orthogonal Functions
NASA Technical Reports Server (NTRS)
Morelli, Eugene A.; DeLoach, Richard
2001-01-01
A nonlinear modeling technique was used to characterize response surfaces for non-dimensional longitudinal aerodynamic force and moment coefficients, based on wind tunnel data from a commercial jet transport model. Data were collected using two experimental procedures - one based on modern design of experiments (MDOE), and one using a classical one factor at a time (OFAT) approach. The nonlinear modeling technique used multivariate orthogonal functions generated from the independent variable data as modeling functions in a least squares context to characterize the response surfaces. Model terms were selected automatically using a prediction error metric. Prediction error bounds computed from the modeling data alone were found to be a good measure of actual prediction error for prediction points within the inference space. Root-mean-square model fit error and prediction error were less than 4 percent of the mean response value in all cases. Efficacy and prediction performance of the response surface models identified from both MDOE and OFAT experiments were investigated.
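The idea of orthogonalizing candidate modeling functions before least squares can be sketched with a QR decomposition; the polynomial terms and simulated response below are assumptions for illustration, not the wind tunnel data or the authors' exact term-selection metric:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 300
x = rng.uniform(-1.0, 1.0, n)          # one independent variable (scaled)
# Candidate modeling functions: polynomial terms generated from the data.
F = np.column_stack([x ** k for k in range(4)])
y = 1.0 + 2.0 * x - 0.5 * x ** 3 + rng.normal(scale=0.01, size=n)

# Orthogonalize the candidate functions (thin QR); in the orthogonal basis
# each term's contribution to the fit can be judged independently.
Q, R = np.linalg.qr(F)
coef_orth = Q.T @ y                    # least-squares fit in orthogonal basis
beta = np.linalg.solve(R, coef_orth)   # map back to the original terms
fit_rms = np.sqrt(np.mean((y - F @ beta) ** 2))
```

In the orthogonal basis, dropping or keeping a term does not change the other coefficients, which is what makes automatic term selection by a prediction error metric tractable.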
MacNab, Ying C
2016-09-20
We present a general coregionalization framework for developing coregionalized multivariate Gaussian conditional autoregressive (cMCAR) models for Bayesian analysis of multivariate lattice data in general and multivariate disease mapping data in particular. This framework is inclusive of cMCARs that facilitate flexible modelling of spatially structured symmetric or asymmetric cross-variable local interactions, allowing a wide range of separable or non-separable covariance structures, and symmetric or asymmetric cross-covariances, to be modelled. We present a brief overview of established univariate Gaussian conditional autoregressive (CAR) models for univariate lattice data and develop coregionalized multivariate extensions. Classes of cMCARs are presented by formulating precision structures. The resulting conditional properties of the multivariate spatial models are established, which cast new light on cMCARs with richly structured covariances and cross-covariances of different spatial ranges. The related methods are illustrated via an in-depth Bayesian analysis of a Minnesota county-level cancer data set. We also bring a new dimension to the traditional enterprise of Bayesian disease mapping: estimating and mapping covariances and cross-covariances of the underlying disease risks. Maps of covariances and cross-covariances bring to light spatial characterizations of the cMCARs and inform on spatial risk associations between areas and diseases. Copyright © 2016 John Wiley & Sons, Ltd. PMID:27091685
Multivariate Markov chain modeling for stock markets
NASA Astrophysics Data System (ADS)
Maskawa, Jun-ichi
2003-06-01
We study a multivariate Markov chain model as a stochastic model of the price changes of portfolios in the framework of the mean field approximation. The time series of price changes are coded into sequences of up and down spins according to their signs. We start with the discussion of small portfolios consisting of two stock issues. The generalization of our model to portfolios of arbitrary size is constructed by a recurrence relation. The resultant form of the joint probability of the stationary state coincides with the Gibbs measure assigned to each configuration of a spin glass model. Through the analysis of actual portfolios, it has been shown that the synchronization of the direction of the price changes is well described by the model.
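Coding price changes as up/down spins and estimating a Markov chain can be sketched for a single asset; the paper's model is multivariate with a mean-field coupling across assets, so this sketch shows only the spin coding and transition estimation, on a toy simulated series:

```python
import numpy as np

rng = np.random.default_rng(3)
prices = np.cumsum(rng.normal(size=500))        # toy price series
spins = (np.diff(prices) >= 0).astype(int)      # 1 = up spin, 0 = down spin

# Estimate the 2-state transition matrix from consecutive spin pairs.
counts = np.zeros((2, 2))
for a, b in zip(spins[:-1], spins[1:]):
    counts[a, b] += 1
T = counts / counts.sum(axis=1, keepdims=True)  # rows sum to one
```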
Adaptable Multivariate Calibration Models for Spectral Applications
THOMAS,EDWARD V.
1999-12-20
Multivariate calibration techniques have been used in a wide variety of spectroscopic situations. In many of these situations spectral variation can be partitioned into meaningful classes. For example, suppose that multiple spectra are obtained from each of a number of different objects wherein the level of the analyte of interest varies within each object over time. In such situations the total spectral variation observed across all measurements has two distinct general sources of variation: intra-object and inter-object. One might want to develop a global multivariate calibration model that predicts the analyte of interest accurately both within and across objects, including new objects not involved in developing the calibration model. However, this goal might be hard to realize if the inter-object spectral variation is complex and difficult to model. If the intra-object spectral variation is consistent across objects, an effective alternative approach might be to develop a generic intra-object model that can be adapted to each object separately. This paper contains recommendations for experimental protocols and data analysis in such situations. The approach is illustrated with an example involving the noninvasive measurement of glucose using near-infrared reflectance spectroscopy. Extensions to calibration maintenance and calibration transfer are discussed.
Bayesian Local Contamination Models for Multivariate Outliers
Page, Garritt L.; Dunson, David B.
2013-01-01
In studies where data are generated from multiple locations or sources it is common for there to exist observations that are quite unlike the majority. Motivated by the application of establishing a reference value in an inter-laboratory setting when outlying labs are present, we propose a local contamination model that is able to accommodate unusual multivariate realizations in a flexible way. The proposed method models the process level of a hierarchical model using a mixture with a parametric component and a possibly nonparametric contamination. Much of the flexibility in the methodology is achieved by allowing varying random subsets of the elements in the lab-specific mean vectors to be allocated to the contamination component. Computational methods are developed and the methodology is compared to three other possible approaches using a simulation study. We apply the proposed method to a NIST/NOAA sponsored inter-laboratory study which motivated the methodological development. PMID:24363465
Multivariate models of adult Pacific salmon returns.
Burke, Brian J; Peterson, William T; Beckman, Brian R; Morgan, Cheryl; Daly, Elizabeth A; Litz, Marisa
2013-01-01
Most modeling and statistical approaches encourage simplicity, yet ecological processes are often complex, as they are influenced by numerous dynamic environmental and biological factors. Pacific salmon abundance has been highly variable over the last few decades and most forecasting models have proven inadequate, primarily because of a lack of understanding of the processes affecting variability in survival. Better methods and data for predicting the abundance of returning adults are therefore required to effectively manage the species. We combined 31 distinct indicators of the marine environment collected over an 11-year period into a multivariate analysis to summarize and predict adult spring Chinook salmon returns to the Columbia River in 2012. In addition to forecasts, this tool quantifies the strength of the relationship between various ecological indicators and salmon returns, allowing interpretation of ecosystem processes. The relative importance of indicators varied, but a few trends emerged. Adult returns of spring Chinook salmon were best described using indicators of bottom-up ecological processes such as composition and abundance of zooplankton and fish prey as well as measures of individual fish, such as growth and condition. Local indicators of temperature or coastal upwelling did not contribute as much as large-scale indicators of temperature variability, matching the spatial scale over which salmon spend the majority of their ocean residence. Results suggest that effective management of Pacific salmon requires multiple types of data and that no single indicator can represent the complex early-ocean ecology of salmon. PMID:23326586
Small Sample Properties of Bayesian Multivariate Autoregressive Time Series Models
ERIC Educational Resources Information Center
Price, Larry R.
2012-01-01
The aim of this study was to compare the small sample (N = 1, 3, 5, 10, 15) performance of a Bayesian multivariate vector autoregressive (BVAR-SEM) time series model relative to frequentist power and parameter estimation bias. A multivariate autoregressive model was developed based on correlated autoregressive time series vectors of varying…
Comparison between Mothers and Fathers in Coping with Autistic Children: A Multivariate Model
ERIC Educational Resources Information Center
Kaniel, Shlomo; Siman-Tov, Ayelet
2011-01-01
The main purpose of this research is to compare the differences between how mothers and fathers cope with autistic children based on a multivariate model that describes the relationships between parental psychological resources, parental stress appraisal and parental adjustment. 176 parents who lived in Israel (88 mothers and 88 fathers) of…
A complete procedure for multivariate index-flood model application
NASA Astrophysics Data System (ADS)
Requena, Ana Isabel; Chebana, Fateh; Mediero, Luis
2016-04-01
Multivariate frequency analyses are needed to study floods because of the dependence existing among representative variables of the flood hydrograph. In particular, multivariate analyses are essential when flood-routing processes significantly attenuate flood peaks, such as in dams and flood management in flood-prone areas. In addition, regional analyses improve at-site quantile estimates obtained at gauged sites, especially when short flow series exist, and provide estimates at ungauged sites where flow records are unavailable. However, very few studies deal simultaneously with both multivariate and regional aspects. This study seeks to introduce a complete procedure to conduct a multivariate regional hydrological frequency analysis (HFA), providing guidelines. The methodology combines recent developments in multivariate and regional HFA, such as copulas, multivariate quantiles and the multivariate index-flood model. The proposed multivariate methodology, focused on the bivariate case, is applied to a case study located in Spain by using observed series of hydrograph volume and flood peak. As a result, a set of volume-peak events under a bivariate quantile curve can be obtained for a given return period at a target site, giving practitioners the flexibility to check and decide what the design event for a given purpose should be. In addition, the multivariate regional approach can also be used to obtain the multivariate distribution of the hydrological variables when the aim is to assess structure failure for a given return period.
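A bivariate quantile curve for a given return period can be illustrated with a Gumbel copula; the copula family and its parameter here are assumptions chosen for illustration, not the study's fitted dependence model:

```python
import numpy as np

def gumbel_copula(u, v, theta=2.0):
    """Gumbel copula CDF C(u, v); theta >= 1 sets the dependence strength."""
    s = (-np.log(u)) ** theta + (-np.log(v)) ** theta
    return np.exp(-s ** (1.0 / theta))

T_return = 100.0                     # target joint return period (years)
p = 1.0 - 1.0 / T_return             # quantile curve: C(u, v) = p
theta = 2.0

# Closed-form solution of C(u, v) = p for v, for u slightly above p.
u = np.linspace(0.9901, 0.99999, 200)
w = (-np.log(p)) ** theta - (-np.log(u)) ** theta
v = np.exp(-w ** (1.0 / theta))      # every (u, v) pair lies on the curve
```

Mapping each (u, v) pair back through the marginal distributions of volume and peak yields the set of design events sharing the same return period, which is what gives the practitioner a choice along the curve.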
Bayesian Analysis of Multivariate Probit Models with Surrogate Outcome Data
ERIC Educational Resources Information Center
Poon, Wai-Yin; Wang, Hai-Bin
2010-01-01
A new class of parametric models that generalize the multivariate probit model and the errors-in-variables model is developed to model and analyze ordinal data. A general model structure is assumed to accommodate the information that is obtained via surrogate variables. A hybrid Gibbs sampler is developed to estimate the model parameters. To…
A unifying modeling framework for highly multivariate disease mapping.
Botella-Rocamora, P; Martinez-Beneito, M A; Banerjee, S
2015-04-30
Multivariate disease mapping refers to the joint mapping of multiple diseases from regionally aggregated data and continues to be the subject of considerable attention for biostatisticians and spatial epidemiologists. The key issue is to map multiple diseases accounting for any correlations among themselves. Recently, Martinez-Beneito (2013) provided a unifying framework for multivariate disease mapping. While attractive in that it colligates a variety of existing statistical models for mapping multiple diseases, this and other existing approaches are computationally burdensome and preclude the multivariate analysis of moderate to large numbers of diseases. Here, we propose an alternative reformulation that accrues substantial computational benefits enabling the joint mapping of tens of diseases. Furthermore, the approach subsumes almost all existing classes of multivariate disease mapping models and offers substantial insight into the properties of statistical disease mapping models. PMID:25645551
Preliminary Multi-Variable Parametric Cost Model for Space Telescopes
NASA Technical Reports Server (NTRS)
Stahl, H. Philip; Hendrichs, Todd
2010-01-01
This slide presentation reviews the creation of a preliminary multi-variable cost model for the contract costs of making a space telescope. It discusses the methodology for collecting the data, the definition of the statistical analysis methodology, single-variable model results, testing of historical models, and an introduction to the multi-variable models.
A Multivariate Model of Achievement in Geometry
ERIC Educational Resources Information Center
Bailey, MarLynn; Taasoobshirazi, Gita; Carr, Martha
2014-01-01
Previous studies have shown that several key variables influence student achievement in geometry, but no research has been conducted to determine how these variables interact. A model of achievement in geometry was tested on a sample of 102 high school students. Structural equation modeling was used to test hypothesized relationships among…
A Multivariate Model of Physics Problem Solving
ERIC Educational Resources Information Center
Taasoobshirazi, Gita; Farley, John
2013-01-01
A model of expertise in physics problem solving was tested on undergraduate science, physics, and engineering majors enrolled in an introductory-level physics course. Structural equation modeling was used to test hypothesized relationships among variables linked to expertise in physics problem solving including motivation, metacognitive planning,…
Multivariate Probabilistic Analysis of an Hydrological Model
NASA Astrophysics Data System (ADS)
Franceschini, Samuela; Marani, Marco
2010-05-01
Model predictions derived from rainfall measurements and hydrological model results are often limited by the systematic error of measuring instruments, by the intrinsic variability of the natural processes and by the uncertainty of the mathematical representation. We propose a means to identify such sources of uncertainty and to quantify their effects based on point-estimate approaches, as a valid alternative to cumbersome Monte Carlo methods. We present uncertainty analyses on the hydrologic response to selected meteorological events, in the mountain streamflow-generating portion of the Brenta basin at Bassano del Grappa, Italy. The Brenta river catchment has a relatively uniform morphology and quite a heterogeneous rainfall pattern. In the present work, we evaluate two sources of uncertainty: data uncertainty (the uncertainty due to data handling and analysis) and model uncertainty (the uncertainty related to the formulation of the model). We thus evaluate the effects of the measurement error of tipping-bucket rain gauges, the uncertainty in estimating spatially-distributed rainfall through block kriging, and the uncertainty associated with estimated model parameters. To this end, we coupled a deterministic model based on the geomorphological theory of the hydrologic response to probabilistic methods. In particular we compare the results of Monte Carlo Simulations (MCS) to the results obtained, in the same conditions, using Li's Point Estimate Method (LiM). The LiM is a probabilistic technique that approximates the continuous probability distribution function of the considered stochastic variables by means of discrete points and associated weights. This allows results to be satisfactorily reproduced with only a few evaluations of the model function. The comparison between the LiM and MCS results highlights the pros and cons of using an approximating method. LiM is less computationally demanding than MCS, but has limited applicability especially when the model
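The point-estimate idea, approximating a continuous distribution by a few points and weights, can be illustrated with the classic univariate two-point (Rosenblueth-type) scheme; Li's method, as the abstract notes, generalizes this to multiple correlated inputs:

```python
def two_point_estimate(g, mu, sigma):
    """Rosenblueth-type two-point estimate of mean and variance of g(X)."""
    lo, hi = g(mu - sigma), g(mu + sigma)   # two "model runs"
    mean = 0.5 * (lo + hi)                  # equal weights of 1/2
    var = 0.25 * (hi - lo) ** 2
    return mean, var

# Propagate X ~ N(2, 0.5**2) through g(x) = x**2 with two evaluations.
mean, var = two_point_estimate(lambda x: x * x, mu=2.0, sigma=0.5)
# Exact moments of g(X): mean = mu**2 + sigma**2 = 4.25, variance = 4.125.
# The two-point scheme reproduces the mean here and approximates the variance.
```

Two model evaluations replace the thousands a Monte Carlo simulation would need, which is the computational appeal; the price is reduced accuracy for strongly nonlinear models, matching the trade-off the abstract describes.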
Aspects of model selection in multivariate analyses
Picard, R.
1982-01-01
Analysis of data sets that involve large numbers of variables usually entails some type of model fitting and data reduction. In regression problems, a fitted model that is obtained by a selection process can be difficult to evaluate because of optimism induced by the choice mechanism. Problems in areas such as discriminant analysis, calibration, and the like often lead to similar difficulties. The preceding sections reviewed some of the general ideas behind assessment of regression-type predictors and illustrated how they can be easily incorporated into a standard data analysis.
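The optimism induced by a choice mechanism is easy to reproduce in a small simulation: with pure-noise data, the predictor selected for its in-sample fit shows an inflated apparent R-squared relative to the null baseline. All numbers below are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 200                    # few observations, many candidate predictors
X = rng.normal(size=(n, p))
y = rng.normal(size=n)            # response is pure noise: no real signal exists

def r2(a, b):
    return np.corrcoef(a, b)[0, 1] ** 2

best = max(range(p), key=lambda j: r2(X[:, j], y))   # the selection step
apparent = r2(X[:, best], y)      # optimistic: the maximum of 200 chance R^2 values

# honest baseline: average R^2 between independent noise vectors of the same size
honest = np.mean([r2(rng.normal(size=n), rng.normal(size=n)) for _ in range(20)])
print(apparent, honest)
```

The apparent fit of the selected predictor is far above the null baseline even though nothing real was found, which is exactly why selected models need assessment methods such as cross-validation.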
MULTIVARIATE LINEAR MIXED MODELS FOR MULTIPLE OUTCOMES. (R824757)
We propose a multivariate linear mixed model (MLMM) for the analysis of multiple outcomes, which generalizes the latent variable model of Sammel and Ryan. The proposed model assumes a flexible correlation structure among the multiple outcomes, and allows a global test of the impact of ...
Multivariate Models of Mothers' and Fathers' Aggression toward Their Children
ERIC Educational Resources Information Center
Smith Slep, Amy M.; O'Leary, Susan G.
2007-01-01
Multivariate, biopsychosocial, explanatory models of mothers' and fathers' psychological and physical aggression toward their 3- to 7-year-old children were fitted and cross-validated in 453 representatively sampled families. Models explaining mothers' and fathers' aggression were substantially similar. Surprisingly, many variables identified as…
Multivariate Bayesian Models of Extreme Rainfall
NASA Astrophysics Data System (ADS)
Rahill-Marier, B.; Devineni, N.; Lall, U.; Farnham, D.
2013-12-01
Accounting for spatial heterogeneity in extreme rainfall has important ramifications in hydrological design and climate models alike. Traditional methods, including areal reduction factors and kriging, are sensitive to catchment shape assumptions and return periods, and do not explicitly model spatial dependence between data points. More recent spatially dense rainfall simulators depend on newer data sources such as radar and may struggle to reproduce extremes because of physical assumptions in the model and short historical records. Rain gauges offer the longest historical record, key when considering rainfall extremes and changes over time, and particularly relevant in today's environment of designing for climate change. In this paper we propose a probabilistic approach to accounting for spatial dependence using the lengthy but spatially disparate hourly rainfall network in the greater New York City area. We build a hierarchical Bayesian model allowing extremes at one station to co-vary with concurrent rainfall fields occurring at other stations. Subsequently we pool across the extreme rainfall fields of all stations, and demonstrate that the expected catchment-wide events are significantly lower when considering spatial fields instead of maxima-only fields. We additionally demonstrate the importance of using concurrent spatial fields, rather than annual maxima, in producing covariance matrices that describe true storm dynamics. This approach is also unique in that it considers short duration storms - from one hour to twenty-four hours - rather than the daily values typically derived from rainfall gauges. The same methodology can be extended to include the radar fields available in the past decade. The hierarchical multilevel approach lends itself easily to integration of long-record parameters and short-record parameters at a station or regional level. In addition climate covariates can be introduced to support the relationship of spatial covariance with
Inclusion of Dominance Effects in the Multivariate GBLUP Model.
Dos Santos, Jhonathan Pedroso Rigal; Vasconcellos, Renato Coelho de Castro; Pires, Luiz Paulo Miranda; Balestre, Marcio; Von Pinho, Renzo Garcia
2016-01-01
New proposals for models and applications of prediction processes with data on molecular markers may help reduce the financial costs of and identify superior genotypes in maize breeding programs. Studies evaluating Genomic Best Linear Unbiased Prediction (GBLUP) models including dominance effects have not been performed in the univariate and multivariate context in the data analysis of this crop. A single cross hybrid construction procedure was performed in this study using phenotypic data and actual molecular markers of 4,091 maize lines from the public database Panzea. A total of 400 simple hybrids resulting from this process were analyzed using the univariate and multivariate GBLUP model considering only additive effects or additive plus dominance effects. Historic heritability scenarios of five traits and other genetic architecture settings were used to compare models, evaluating the predictive ability and estimation of variance components. Marginal differences were detected between the multivariate and univariate models. The main explanation for the small discrepancy between models is the low- to moderate-magnitude correlations between the traits studied and moderate heritabilities. These conditions do not favor the advantages of multivariate analysis. The inclusion of dominance effects in the models was an efficient strategy to improve the predictive ability and estimation quality of variance components. PMID:27074056
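A minimal sketch of the additive-plus-dominance idea on simulated markers (a kernel-ridge form of GBLUP with equal variance ratios assumed for both kernels; this is not the paper's REML analysis of the Panzea data, and all sizes and effect scales are invented):

```python
import numpy as np

rng = np.random.default_rng(1)
n_ind, n_mark = 100, 500
M = rng.integers(0, 3, size=(n_ind, n_mark)).astype(float)  # genotypes coded 0/1/2

p = M.mean(axis=0) / 2                        # allele frequencies
Z = M - 2 * p                                 # centred additive marker codes
G = Z @ Z.T / (2 * (p * (1 - p)).sum())       # VanRaden-style additive kernel

W = (M == 1).astype(float) - 2 * p * (1 - p)  # heterozygosity-based dominance codes
D = W @ W.T / ((2 * p * (1 - p)) ** 2).sum()  # dominance kernel

# simulate a trait with additive and dominance marker effects plus noise
a, d = rng.normal(0, 0.1, n_mark), rng.normal(0, 0.1, n_mark)
g = Z @ a + W @ d                             # true total genetic value
y = g + rng.normal(0, 1.0, n_ind)

def gblup(y, kernels, lam=1.0):
    # kernel-ridge form of GBLUP, assuming equal variance ratios for all kernels
    K = sum(kernels)
    alpha = np.linalg.solve(K + lam * np.eye(len(y)), y - y.mean())
    return [Kk @ alpha for Kk in kernels]

u_a, u_d = gblup(y, [G, D])                   # additive + dominance predictions
acc = np.corrcoef(u_a + u_d, g)[0, 1]         # predictive ability (in-sample)
print(acc)
```

Adding the dominance kernel lets the model absorb the non-additive part of the signal, which is the mechanism behind the improved predictive ability reported in the abstract.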
FACTOR ANALYTIC MODELS OF CLUSTERED MULTIVARIATE DATA WITH INFORMATIVE CENSORING
This paper describes a general class of factor analytic models for the analysis of clustered multivariate data in the presence of informative missingness. We assume that there are distinct sets of cluster-level latent variables related to the primary outcomes and to the censorin...
Validation of multivariate model of leaf ionome is fundamentally confounded.
Technology Transfer Automated Retrieval System (TEKTRAN)
The multivariable signature model reported by Baxter et al. (1) to predict Fe and P homeostasis in Arabidopsis is fundamentally flawed for two reasons: 1) The initial experiments identified a correlation between trace metal (Mn, Co, Zn, Mo, Cd) signature and “Fe-deficiency,” which was used to train ...
A Multivariate Descriptive Model of Motivation for Orthodontic Treatment.
ERIC Educational Resources Information Center
Hackett, Paul M. W.; And Others
1993-01-01
Motivation for receiving orthodontic treatment was studied among 109 young adults, and a multivariate model of the process is proposed. The combination of smallest space analysis and Partial Order Scalogram Analysis by base Coordinates (POSAC) illustrates an interesting methodology for health treatment studies and explores motivation for dental…
Studying Resist Stochastics with the Multivariate Poisson Propagation Model
Naulleau, Patrick; Anderson, Christopher; Chao, Weilun; Bhattarai, Suchit; Neureuther, Andrew
2014-01-01
Progress in the ultimate performance of extreme ultraviolet resists has arguably decelerated in recent years, suggesting an approach to stochastic limits both in photon counts and material parameters. Here we report on the performance of a variety of leading extreme ultraviolet resists, both with and without chemical amplification. The measured performance is compared to stochastic modeling results using the Multivariate Poisson Propagation Model. The results show that the best materials are indeed nearing modeled performance limits.
Tvedebrink, Torben; Eriksen, Poul Svante; Morling, Niels
2015-11-01
In this paper, we discuss the construction of a multivariate generalisation of the Dirichlet-multinomial distribution. An example from forensic genetics in the statistical analysis of DNA mixtures motivates the study of this multivariate extension. In forensic genetics, adjustment of the match probabilities due to remote ancestry in the population is often done using the so-called θ-correction. This correction increases the probability of observing multiple copies of rare alleles in a subpopulation and thereby reduces the weight of the evidence for rare genotypes. A recent publication by Cowell et al. (2015) showed elegantly how to use Bayesian networks for efficient computations of likelihood ratios in a forensic genetic context. However, their underlying population genetic model assumed independence of alleles, which is not realistic in real populations. We demonstrate how the so-called θ-correction can be incorporated in Bayesian networks to make efficient computations by modifying the Markov structure of Cowell et al. (2015). By numerical examples, we show how the θ-correction incorporated in the multivariate Dirichlet-multinomial distribution affects the weight of evidence. PMID:26344785
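The θ-correction the abstract refers to is commonly implemented with the Balding-Nichols sampling formula, which increases the probability of seeing additional copies of an allele already observed in the subpopulation. The sketch below computes a θ-corrected homozygote match probability; it illustrates the correction only and does not reproduce the paper's Dirichlet-multinomial generalisation or its Bayesian-network computations.

```python
def next_allele_prob(m, n, p, theta):
    # Balding-Nichols sampling formula: probability that the next sampled allele
    # is of a type already seen m times among n alleles from the subpopulation.
    return (m * theta + (1 - theta) * p) / (1 + (n - 1) * theta)

def homozygote_match_prob(p, theta):
    # P(random person is AA | suspect is AA): condition on the 2 alleles already seen,
    # sampling the third and fourth copies in turn.
    return next_allele_prob(2, 2, p, theta) * next_allele_prob(3, 3, p, theta)

p = 0.01                                  # a rare allele
print(homozygote_match_prob(p, 0.0))      # theta = 0 reduces to p**2 = 1e-4
print(homozygote_match_prob(p, 0.03))     # theta-correction raises it markedly
```

With θ = 0.03 the match probability for this rare homozygote rises by more than an order of magnitude, which is exactly how the correction "reduces the weight of the evidence for rare genotypes."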
Overpaying morbidity adjusters in risk equalization models.
van Kleef, R C; van Vliet, R C J A; van de Ven, W P M M
2016-09-01
Most competitive social health insurance markets include risk equalization to compensate insurers for predictable variation in healthcare expenses. Empirical literature shows that even the most sophisticated risk equalization models-with advanced morbidity adjusters-substantially undercompensate insurers for selected groups of high-risk individuals. In the presence of premium regulation, these undercompensations confront consumers and insurers with incentives for risk selection. An important reason for the undercompensations is that not all information with predictive value regarding healthcare expenses is appropriate for use as a morbidity adjuster. To reduce incentives for selection regarding specific groups we propose overpaying morbidity adjusters that are already included in the risk equalization model. This paper illustrates the idea of overpaying by merging data on morbidity adjusters and healthcare expenses with health survey information, and derives three preconditions for meaningful application. Given these preconditions, we think overpaying may be particularly useful for pharmacy-based cost groups. PMID:26420555
Coercively Adjusted Auto Regression Model for Forecasting in Epilepsy EEG
Kim, Sun-Hee; Faloutsos, Christos; Yang, Hyung-Jeong
2013-01-01
Recently, data with complex characteristics such as epilepsy electroencephalography (EEG) time series has emerged. Epilepsy EEG data has special characteristics including nonlinearity, nonnormality, and nonperiodicity. Therefore, it is important to find a suitable forecasting method that covers these special characteristics. In this paper, we propose a coercively adjusted autoregression (CA-AR) method that forecasts future values from a multivariable epilepsy EEG time series. We use the technique of random coefficients, which forcefully adjusts the coefficients to −1 and 1. The fractal dimension is used to determine the order of the CA-AR model. We applied the CA-AR method reflecting special characteristics of data to forecast the future value of epilepsy EEG data. Experimental results show that, compared to previous methods, the proposed method can forecast faster and more accurately. PMID:23710252
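The paper's coefficient coercion and fractal-dimension order selection are not reproduced here; the sketch below shows only the underlying autoregressive machinery (least-squares AR fit and one-step forecast) on a stand-in oscillatory signal, with the order fixed by hand.

```python
import numpy as np

def fit_ar(x, p):
    # Least-squares AR(p): x[t] ~ c + sum_i a_i * x[t-i]
    X = np.column_stack([x[p - i - 1:len(x) - i - 1] for i in range(p)])
    X = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X, x[p:], rcond=None)
    return coef                      # [intercept, lag-1, ..., lag-p]

def forecast_one(x, coef):
    p = len(coef) - 1
    return coef[0] + coef[1:] @ x[-1:-p - 1:-1]   # most recent lags first

rng = np.random.default_rng(3)
t = np.arange(2000)
x = np.sin(0.07 * t) + 0.1 * rng.normal(size=t.size)  # stand-in oscillatory signal
coef = fit_ar(x[:-1], 6)             # order 6 chosen by hand, not by fractal dimension
print(forecast_one(x[:-1], coef), x[-1])
```

The one-step forecast lands within the noise level of the held-out last sample; the CA-AR contribution of the paper is in how the coefficients and order are chosen, not in this basic recursion.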
Modeling rainfall-runoff relationship using multivariate GARCH model
NASA Astrophysics Data System (ADS)
Modarres, R.; Ouarda, T. B. M. J.
2013-08-01
The traditional hydrologic time series approaches are used for modeling, simulating and forecasting the conditional mean of hydrologic variables but neglect their time varying variance or the second order moment. This paper introduces the multivariate Generalized Autoregressive Conditional Heteroscedasticity (MGARCH) modeling approach to show how the variance-covariance relationship between hydrologic variables varies in time. These approaches are also useful to estimate the dynamic conditional correlation between hydrologic variables. To illustrate the novelty and usefulness of MGARCH models in hydrology, two major types of MGARCH models, the bivariate diagonal VECH and constant conditional correlation (CCC) models, are applied to show the variance-covariance structure and dynamic correlation in a rainfall-runoff process. The bivariate diagonal VECH-GARCH(1,1) and CCC-GARCH(1,1) models indicated both short-run and long-run persistency in the conditional variance-covariance matrix of the rainfall-runoff process. The conditional variance of rainfall appears to have a stronger persistency, especially long-run persistency, than the conditional variance of streamflow which shows a short-lived drastic increasing pattern and a stronger short-run persistency. The conditional covariance and conditional correlation coefficients have different features for each bivariate rainfall-runoff process with different degrees of stationarity and dynamic nonlinearity. The spatial and temporal pattern of variance-covariance features may reflect the signature of different physical and hydrological variables such as drainage area, topography, soil moisture and ground water fluctuations on the strength, stationarity and nonlinearity of the conditional variance-covariance for a rainfall-runoff process.
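A sketch of the CCC idea with illustrative (not fitted) GARCH(1,1) parameters and synthetic stand-ins for rainfall and streamflow innovations: each series gets its own conditional-variance recursion, and the conditional covariance is a constant correlation times the product of the conditional standard deviations.

```python
import numpy as np

def garch11_var(x, omega, alpha, beta):
    # Conditional variance recursion: h[t] = omega + alpha*x[t-1]^2 + beta*h[t-1]
    h = np.empty_like(x)
    h[0] = x.var()
    for t in range(1, len(x)):
        h[t] = omega + alpha * x[t - 1] ** 2 + beta * h[t - 1]
    return h

rng = np.random.default_rng(7)
n = 3000
# simulate two correlated shock series (stand-ins for rainfall/runoff innovations)
z = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], size=n)
rain, flow = z[:, 0], z[:, 1]

h_rain = garch11_var(rain, omega=0.05, alpha=0.08, beta=0.90)  # long-run persistence
h_flow = garch11_var(flow, omega=0.05, alpha=0.15, beta=0.70)  # short-run persistence

# CCC model: constant correlation of the standardized residuals
rho = np.corrcoef(rain / np.sqrt(h_rain), flow / np.sqrt(h_flow))[0, 1]
cov_t = rho * np.sqrt(h_rain * h_flow)   # time-varying conditional covariance
print(rho)
```

Even with a constant correlation, the conditional covariance `cov_t` varies in time through the two variance recursions, which is the structure the abstract describes for the rainfall-runoff pair.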
Collision prediction models using multivariate Poisson-lognormal regression.
El-Basyouny, Karim; Sayed, Tarek
2009-07-01
This paper advocates the use of multivariate Poisson-lognormal (MVPLN) regression to develop models for collision count data. The MVPLN approach presents an opportunity to incorporate the correlations across collision severity levels and their influence on safety analyses. The paper introduces a new multivariate hazardous location identification technique, which generalizes the univariate posterior probability of excess that has been commonly proposed and applied in the literature. In addition, the paper presents an alternative approach for quantifying the effect of the multivariate structure on the precision of expected collision frequency. The MVPLN approach is compared with the independent (separate) univariate Poisson-lognormal (PLN) models with respect to model inference, goodness-of-fit, identification of hot spots and precision of expected collision frequency. The MVPLN is modeled using the WinBUGS platform which facilitates computation of posterior distributions as well as providing a goodness-of-fit measure for model comparisons. The results indicate that the estimates of the extra Poisson variation parameters were considerably smaller under MVPLN leading to higher precision. The improvement in precision is due mainly to the fact that MVPLN accounts for the correlation between the latent variables representing property damage only (PDO) and injuries plus fatalities (I+F). This correlation was estimated at 0.758, which is highly significant, suggesting that higher PDO rates are associated with higher I+F rates, as the collision likelihood for both types is likely to rise due to similar deficiencies in roadway design and/or other unobserved factors. In terms of goodness-of-fit, the MVPLN model provided a superior fit than the independent univariate models. The multivariate hazardous location identification results demonstrated that some hazardous locations could be overlooked if the analysis was restricted to the univariate models. PMID:19540972
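The extra-Poisson variation and cross-severity correlation that MVPLN captures can be illustrated by simulating from the model itself: counts are Poisson conditional on site-specific rates whose logarithms are multivariate normal. Parameter values below are invented for illustration, not taken from the paper's WinBUGS fit.

```python
import numpy as np

rng = np.random.default_rng(11)
n_sites = 5000
mu = np.array([1.5, 0.2])                      # log-mean rates: PDO, I+F
Sigma = np.array([[0.5, 0.3], [0.3, 0.5]])     # latent covariance, correlation 0.6

eps = rng.multivariate_normal([0, 0], Sigma, size=n_sites)
rates = np.exp(mu + eps)                       # site-specific expected counts
counts = rng.poisson(rates)                    # observed PDO and I+F collision counts

# correlated latent rates induce correlated, overdispersed counts
r = np.corrcoef(counts[:, 0], counts[:, 1])[0, 1]
print(r)
```

The observed count correlation is positive but attenuated relative to the latent correlation because of Poisson noise; modelling the latent correlation jointly, rather than fitting separate univariate PLN models, is what sharpens the precision of the expected frequencies.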
A pairwise interaction model for multivariate functional and longitudinal data
Chiou, Jeng-Min; Müller, Hans-Georg
2016-01-01
Functional data vectors consisting of samples of multivariate data where each component is a random function are encountered increasingly often but have not yet been comprehensively investigated. We introduce a simple pairwise interaction model that leads to an interpretable and straightforward decomposition of multivariate functional data and of their variation into component-specific processes and pairwise interaction processes. The latter quantify the degree of pairwise interactions between the components of the functional data vectors, while the component-specific processes reflect the functional variation of a particular functional vector component that cannot be explained by the other components. Thus the proposed model provides an extension of the usual notion of a covariance or correlation matrix for multivariate vector data to functional data vectors and generates an interpretable functional interaction map. The decomposition provided by the model can also serve as a basis for subsequent analysis, such as study of the network structure of functional data vectors. The decomposition of the total variance into componentwise and interaction contributions can be quantified by an R²-like decomposition. We provide consistency results for the proposed methods and illustrate the model by applying it to sparsely sampled longitudinal data from the Baltimore Longitudinal Study of Aging, examining the relationships between body mass index and blood fats. PMID:27279664
Analysis of Forest Foliage Using a Multivariate Mixture Model
NASA Technical Reports Server (NTRS)
Hlavka, C. A.; Peterson, David L.; Johnson, L. F.; Ganapol, B.
1997-01-01
Data with wet chemical measurements and near infrared spectra of ground leaf samples were analyzed to test a multivariate regression technique for estimating component spectra which is based on a linear mixture model for absorbance. The resulting unmixed spectra for carbohydrates, lignin, and protein resemble the spectra of extracted plant starches, cellulose, lignin, and protein. The unmixed protein spectrum has prominent absorption features at wavelengths which have been associated with nitrogen bonds.
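A sketch of the linear mixture idea on synthetic data: absorbance A is modelled as composition C times component spectra S, and S is recovered by least squares given measured compositions. The Gaussian-shaped "pure" spectra and all sizes below are assumptions for illustration, not the paper's data.

```python
import numpy as np

rng = np.random.default_rng(5)
n_samples, n_wave = 60, 120

# hypothetical pure component spectra (carbohydrate, lignin, protein stand-ins)
wl = np.linspace(0, 1, n_wave)
S_true = np.vstack([np.exp(-((wl - c) / 0.08) ** 2) for c in (0.25, 0.55, 0.8)])

C = rng.dirichlet([2, 2, 2], size=n_samples)        # measured compositions (sum to 1)
A = C @ S_true + 0.01 * rng.normal(size=(n_samples, n_wave))  # mixture absorbance

# unmix: solve A ~ C @ S for the component spectra S by least squares
S_hat, *_ = np.linalg.lstsq(C, A, rcond=None)
```

With compositions known from wet chemistry, the unmixed rows of `S_hat` closely track the underlying pure spectra, which is the sense in which the paper's unmixed spectra "resemble" the extracted plant constituents.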
Various forms of indexing HDMR for modelling multivariate classification problems
Aksu, Çağrı; Tunga, M. Alper
2014-12-10
The Indexing HDMR method was recently developed for modelling multivariate interpolation problems. The method uses the Plain HDMR philosophy in partitioning the given multivariate data set into less variate data sets and then constructing an analytical structure through these partitioned data sets to represent the given multidimensional problem. Indexing HDMR makes HDMR applicable to classification problems having real world data. Mostly, we do not know all possible class values in the domain of the given problem, that is, we have a non-orthogonal data structure. However, Plain HDMR needs an orthogonal data structure in the given problem to be modelled. In this sense, the main idea of this work is to offer various forms of Indexing HDMR to successfully model these real life classification problems. To test these different forms, several well-known multivariate classification problems given in UCI Machine Learning Repository were used and it was observed that the accuracy results lie between 80% and 95% which are very satisfactory.
Multivariable Parametric Cost Model for Ground Optical Telescope Assembly
NASA Technical Reports Server (NTRS)
Stahl, H. Philip; Rowell, Ginger Holmes; Reese, Gayle; Byberg, Alicia
2004-01-01
A parametric cost model for ground-based telescopes is developed using multi-variable statistical analysis of both engineering and performance parameters. While diameter continues to be the dominant cost driver, diffraction limited wavelength is found to be a secondary driver. Other parameters such as radius of curvature were examined. The model includes an explicit factor for primary mirror segmentation and/or duplication (i.e. multi-telescope phased-array systems). Additionally, single variable models based on aperture diameter were derived.
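The multivariable power-law regression behind such cost models can be sketched on synthetic data: cost is fit as a product of powers of diameter and diffraction-limited wavelength via ordinary least squares on logarithms. The exponents and sample values below are invented for illustration; the actual telescope database is not reproduced.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 40
D = rng.uniform(1, 10, n)            # aperture diameter (m), synthetic
lam = rng.uniform(0.5, 5, n)         # diffraction-limited wavelength (um), synthetic
cost = 2.0 * D ** 2.7 * lam ** -0.3 * np.exp(0.1 * rng.normal(size=n))

# multivariable power-law fit: log(cost) = b0 + b1*log(D) + b2*log(lam)
X = np.column_stack([np.ones(n), np.log(D), np.log(lam)])
b, *_ = np.linalg.lstsq(X, np.log(cost), rcond=None)
print(b)
```

The fitted exponent on diameter dominates, mirroring the abstract's finding that diameter is the primary cost driver and wavelength a secondary one; a single-variable model is the same fit with the wavelength column dropped.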
Multivariable Parametric Cost Model for Ground Optical Telescope Assembly
NASA Technical Reports Server (NTRS)
Stahl, H. Philip; Rowell, Ginger Holmes; Reese, Gayle; Byberg, Alicia
2005-01-01
A parametric cost model for ground-based telescopes is developed using multivariable statistical analysis of both engineering and performance parameters. While diameter continues to be the dominant cost driver, diffraction-limited wavelength is found to be a secondary driver. Other parameters such as radius of curvature are examined. The model includes an explicit factor for primary mirror segmentation and/or duplication (i.e., multi-telescope phased-array systems). Additionally, single variable models based on aperture diameter are derived.
The following SAS macros can be used to create a multivariate usual intake distribution for multiple dietary components that are consumed nearly every day or episodically. A SAS macro for performing balanced repeated replication (BRR) variance estimation is also included.
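A hedged Python sketch of balanced repeated replication (BRR), not the SAS macros themselves: each stratum contributes a pair of primary sampling units, half-samples are chosen by the rows of a Hadamard matrix, and the variance estimate is the mean squared deviation of the replicate estimates from the full-sample estimate.

```python
import numpy as np

def brr_variance(strata_pairs, estimator):
    # strata_pairs: list of (psu_a, psu_b) data arrays, one pair per stratum.
    # Half-samples follow the rows of a Hadamard matrix (order 4 covers 3 strata).
    H = np.array([[1, 1, 1, 1],
                  [1, -1, 1, -1],
                  [1, 1, -1, -1],
                  [1, -1, -1, 1]])
    full = estimator(np.concatenate([np.concatenate(p) for p in strata_pairs]))
    reps = []
    for row in H:
        half = [pair[0] if s > 0 else pair[1]
                for pair, s in zip(strata_pairs, row)]
        reps.append(estimator(np.concatenate(half)))
    return full, np.mean((np.array(reps) - full) ** 2)

rng = np.random.default_rng(9)
pairs = [(rng.normal(m, 1, 30), rng.normal(m, 1, 30)) for m in (0.0, 0.5, 1.0)]
mean_hat, var_hat = brr_variance(pairs, np.mean)
print(mean_hat, var_hat)
```

Because the estimator is re-applied to each half-sample, BRR handles statistics (ratios, percentiles, usual-intake parameters) whose variances have no simple closed form, which is why the macros bundle it with the multivariate intake model.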
Enhancing scientific reasoning by refining students' models of multivariable causality
NASA Astrophysics Data System (ADS)
Keselman, Alla
Inquiry learning as an educational method is gaining increasing support among elementary and middle school educators. In inquiry activities at the middle school level, students are typically asked to conduct investigations and infer causal relationships about multivariable causal systems. In these activities, students usually demonstrate significant strategic weaknesses and insufficient metastrategic understanding of task demands. The present work suggests that these weaknesses arise from students' deficient mental models of multivariable causality, in which effects of individual features are neither additive nor constant. This study is an attempt to develop an intervention aimed at enhancing scientific reasoning by refining students' models of multivariable causality. Three groups of students engaged in a scientific investigation activity over seven weekly sessions. By creating unique combinations of five features potentially involved in earthquake mechanism and observing associated risk meter readings, students had to find out which of the features were causal, and to learn to predict earthquake risk. Additionally, students in the instructional and practice groups engaged in self-directed practice in making scientific predictions. The instructional group also participated in weekly instructional sessions on making predictions based on multivariable causality. Students in the practice and instructional conditions showed small to moderate improvement in their attention to the evidence and in their metastrategic ability to recognize effective investigative strategies in the work of other students. They also demonstrated a trend towards making a greater number of valid inferences than the control group students. Additionally, students in the instructional condition showed significant improvement in their ability to draw inferences based on multiple records. They also developed more accurate knowledge about non-causal features of the system. These gains were maintained
Multivariate Statistical Modelling of Drought and Heat Wave Events
NASA Astrophysics Data System (ADS)
Manning, Colin; Widmann, Martin; Vrac, Mathieu; Maraun, Douglas; Bevaqua, Emanuele
2016-04-01
Multivariate Statistical Modelling of Drought and Heat Wave Events C. Manning1,2, M. Widmann1, M. Vrac2, D. Maraun3, E. Bevaqua2,3 1. School of Geography, Earth and Environmental Sciences, University of Birmingham, Edgbaston, Birmingham, UK 2. Laboratoire des Sciences du Climat et de l'Environnement, (LSCE-IPSL), Centre d'Etudes de Saclay, Gif-sur-Yvette, France 3. Wegener Center for Climate and Global Change, University of Graz, Brandhofgasse 5, 8010 Graz, Austria Compound extreme events are a combination of two or more contributing events which in themselves may not be extreme but through their joint occurrence produce an extreme impact. Compound events are noted in the latest IPCC report as an important type of extreme event that has been given little attention so far. As part of the CE:LLO project (Compound Events: muLtivariate statisticaL mOdelling) we are developing a multivariate statistical model to gain an understanding of the dependence structure of certain compound events. One focus of this project is on the interaction between drought and heat wave events. Soil moisture has both a local and non-local effect on the occurrence of heat waves, where it strongly controls the latent heat flux affecting the transfer of sensible heat to the atmosphere. These processes can create a feedback whereby a heat wave may be amplified or suppressed by the soil moisture preconditioning and, vice versa, the heat wave may in turn have an effect on soil conditions. An aim of this project is to capture this dependence in order to correctly describe the joint probabilities of these conditions and the resulting probability of their compound impact. We will show an application of Pair Copula Constructions (PCCs) to study the aforementioned compound event. PCCs allow in theory for the formulation of multivariate dependence structures in any dimension, where the PCC is a decomposition of a multivariate distribution into a product of bivariate components modelled using copulas. A
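The payoff of modelling joint dependence rather than assuming independence can be sketched with a bivariate-normal (Gaussian-copula) toy example; the correlation and thresholds below are illustrative stand-ins for dryness and heat indicators, not the project's fitted pair copulas.

```python
import numpy as np

rng = np.random.default_rng(13)
n = 500_000
rho = 0.6   # assumed dependence between dryness and heat indicators (illustrative)

# correlated standard-normal margins, i.e. a Gaussian-copula dependence structure
z = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=n)
q = 1.645   # 95th-percentile threshold for each margin

p1 = np.mean(z[:, 0] > q)                        # marginal exceedance, about 0.05
joint = np.mean((z[:, 0] > q) & (z[:, 1] > q))   # compound-event probability
indep = p1 * np.mean(z[:, 1] > q)                # what independence would predict

print(joint, indep)
```

Under this dependence the compound event is several times more likely than the independence assumption suggests, which is why the joint probability, not the two marginals, must be modelled for compound drought-heat impacts.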
Multivariate 3D modelling of Scottish soil properties
NASA Astrophysics Data System (ADS)
Poggio, Laura; Gimona, Alessandro
2015-04-01
Information regarding soil properties across landscapes at national or continental scales is critical for better soil and environmental management and for climate regulation and adaptation policy. The prediction of soil properties variation in space and time and their uncertainty is an important part of environmental modelling. Soil properties, and in particular the 3 fractions of soil texture, exhibit strong co-variation among themselves and therefore taking into account this correlation leads to spatially more accurate results. In this study the continuous vertical and lateral distributions of relevant soil properties in Scottish soils were modelled with a multivariate 3D-GAM+GS approach. The approach used involves 1) modelling the multivariate trend with full 3D spatial correlation, i.e., exploiting the values of the neighbouring pixels in 3D-space, and 2) 3D kriging to interpolate the residuals. The values at each cell for each of the considered depth layers were defined using a hybrid GAM-geostatistical 3D model, combining the fitting of a GAM (Generalised Additive Model) to estimate the multivariate trend of the variables, using a 3D smoother with related covariates. Gaussian simulations of the model residuals were used as spatial component to account for local details. A dataset of about 26,000 horizons (7,800 profiles) was used for this study. A validation set was randomly selected as 25% of the full dataset. Numerous covariates derived from globally available data, such as MODIS and SRTM, are considered. The results of the 3D-GAM+kriging showed low RMSE values, good R squared and an accurate reproduction of the spatial structure of the data for a range of soil properties. The results have an out-of-sample RMSE between 10 and 15% of the observed range when taking into account the whole profile. The approach followed allows the assessment of the uncertainty of both the trend and the residuals.
A Cyber-Attack Detection Model Based on Multivariate Analyses
NASA Astrophysics Data System (ADS)
Sakai, Yuto; Rinsaka, Koichiro; Dohi, Tadashi
In the present paper, we propose a novel cyber-attack detection model that applies two multivariate-analysis methods to the audit data observed on a host machine. The statistical techniques used here are the well-known Hayashi's quantification method IV and the cluster analysis method. We quantify the observed qualitative audit event sequence via the quantification method IV and group similar audit event sequences based on the cluster analysis. It is shown in simulation experiments that our model can improve the cyber-attack detection accuracy in some realistic cases where both normal and attack activities are intermingled.
Multivariate moment closure techniques for stochastic kinetic models
Lakatos, Eszter; Ale, Angelique; Kirk, Paul D. W.; Stumpf, Michael P. H.
2015-09-07
Stochastic effects dominate many chemical and biochemical processes. Their analysis, however, can be computationally prohibitively expensive, and a range of approximation schemes have been proposed to lighten the computational burden. These, notably the increasingly popular linear noise approximation and the more general moment expansion methods, perform well for many dynamical regimes, especially linear systems. At higher levels of nonlinearity, there is an interplay between the nonlinearities and the stochastic dynamics, which is much harder to capture correctly by such approximations to the true stochastic processes. Moment-closure approaches promise to address this problem by capturing higher-order terms of the temporally evolving probability distribution. Here, we develop a set of multivariate moment-closures that allows us to describe the stochastic dynamics of nonlinear systems. Multivariate closure captures the way that correlations between different molecular species, induced by the reaction dynamics, interact with stochastic effects. We use multivariate Gaussian, gamma, and lognormal closure and illustrate their use in the context of two models that have proved challenging to the previous attempts at approximating stochastic dynamics: oscillations in p53 and Hes1. In addition, we consider a larger system, Erk-mediated mitogen-activated protein kinases signalling, where conventional stochastic simulation approaches incur unacceptably high computational costs. PMID:26342359
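A minimal Gaussian moment-closure sketch on a generic birth and pair-annihilation system (not the paper's p53, Hes1, or Erk models): the second-moment equation involves the third moment E[X³], which the Gaussian closure replaces by 3*m1*m2 - 2*m1**3. Rate constants are invented for illustration.

```python
# Reactions: 0 -> X at rate k (jump +1); 2X -> 0 with propensity c*x*(x-1)/2 (jump -2).
# Exact moment equations:
#   d(m1)/dt = k - c*(m2 - m1)
#   d(m2)/dt = k*(2*m1 + 1) - 2*c*(m3 - 2*m2 + m1)
# The m3 term is closed with the Gaussian identity E[X^3] = 3*mu*m2 - 2*mu^3.
k, c = 10.0, 0.1
m1, m2 = 0.0, 0.0
dt = 1e-3
for _ in range(50_000):                     # integrate to t = 50 by forward Euler
    m3 = 3 * m1 * m2 - 2 * m1 ** 3          # Gaussian moment closure
    dm1 = k - c * (m2 - m1)
    dm2 = k * (2 * m1 + 1) - 2 * c * (m3 - 2 * m2 + m1)
    m1, m2 = m1 + dt * dm1, m2 + dt * dm2

mean, var = m1, m2 - m1 ** 2
print(mean, var)   # pair removal narrows the distribution: variance below the mean
```

The closed system converges to a steady state whose variance is below the mean (sub-Poissonian), a signature of pairwise removal that a mean-only (deterministic) model cannot express; the paper's multivariate closures extend this construction to several interacting species.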
Multivariate moment closure techniques for stochastic kinetic models.
Lakatos, Eszter; Ale, Angelique; Kirk, Paul D W; Stumpf, Michael P H
2015-09-01
Stochastic effects dominate many chemical and biochemical processes. Their analysis, however, can be computationally prohibitively expensive, and a range of approximation schemes have been proposed to lighten the computational burden. These schemes, notably the increasingly popular linear noise approximation and the more general moment expansion methods, perform well for many dynamical regimes, especially linear systems. At higher levels of nonlinearity, however, an interplay arises between the nonlinearities and the stochastic dynamics that is much harder for such approximations to the true stochastic processes to capture correctly. Moment-closure approaches promise to address this problem by capturing higher-order terms of the temporally evolving probability distribution. Here, we develop a set of multivariate moment closures that allows us to describe the stochastic dynamics of nonlinear systems. Multivariate closure captures the way that correlations between different molecular species, induced by the reaction dynamics, interact with stochastic effects. We use multivariate Gaussian, gamma, and lognormal closures and illustrate their use in the context of two models that have proved challenging for previous attempts at approximating stochastic dynamics: oscillations in p53 and Hes1. In addition, we consider a larger system, Erk-mediated mitogen-activated protein kinase signalling, where conventional stochastic simulation approaches incur unacceptably high computational costs. PMID:26342359
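As a purely illustrative sketch of moment closure (not the authors' implementation), consider a toy birth/dimerization system: birth at rate a and dimer decay 2X -> 0. The equation for the second moment involves the third moment, which a Gaussian closure replaces by assuming a vanishing third central moment. All rate constants below are hypothetical.

```python
import numpy as np

# Toy system: birth (rate a) and dimerization decay 2X -> 0 (propensity c/2 * x*(x-1)).
# Moment equations for m = E[X] and M2 = E[X^2]; the third moment M3 appears in
# dM2/dt and is closed with the Gaussian assumption E[(X-m)^3] = 0, i.e.
# M3 = 3*m*M2 - 2*m**3.
a, c = 10.0, 0.1          # hypothetical rate constants
m, M2 = 0.0, 0.0          # initial moments
dt, steps = 5e-4, 40_000  # simple forward-Euler integration to t = 20
for _ in range(steps):
    M3 = 3.0 * m * M2 - 2.0 * m**3                  # Gaussian closure
    dm = a - c * (M2 - m)
    dM2 = a * (2.0 * m + 1.0) + 2.0 * c * (M2 - m) - 2.0 * c * (M3 - M2)
    m, M2 = m + dt * dm, M2 + dt * dM2
var = M2 - m**2            # closed-system estimate of the variance
```

At steady state the closed equations give a mean near sqrt(a/c) with a variance somewhat below the mean, illustrating how the closure tracks the interaction between nonlinearity and noise without simulating trajectories.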
Multivariate longitudinal data analysis with mixed effects hidden Markov models.
Raffa, Jesse D; Dubin, Joel A
2015-09-01
Multiple longitudinal responses are often collected as a means to capture relevant features of the true outcome of interest, which is often hidden and not directly measurable. We outline an approach which models these multivariate longitudinal responses as generated from a hidden disease process. We propose a class of models which uses a hidden Markov model with separate but correlated random effects between multiple longitudinal responses. This approach was motivated by a smoking cessation clinical trial, where a bivariate longitudinal response involving both a continuous and a binomial response was collected for each participant to monitor smoking behavior. A Bayesian method using Markov chain Monte Carlo is used. Comparison of separate univariate response models to the bivariate response models was undertaken. Our methods are demonstrated on the smoking cessation clinical trial dataset, and properties of our approach are examined through extensive simulation studies. PMID:25761965
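A minimal sketch of the hidden-Markov idea above, with a bivariate emission (one continuous, one binary) assumed independent given the hidden state; the paper's correlated random effects and MCMC estimation are omitted, and all parameters are hypothetical.

```python
import numpy as np

# Toy 2-state HMM ("abstinent" vs "smoking") with a bivariate emission per visit:
# a continuous biomarker (Gaussian) and a binary self-report (Bernoulli),
# assumed conditionally independent given the hidden state.
pi = np.array([0.5, 0.5])                # initial state distribution
P = np.array([[0.9, 0.1], [0.2, 0.8]])   # transition matrix
mu, sd = np.array([2.0, 8.0]), np.array([1.0, 2.0])  # biomarker mean/sd per state
p_report = np.array([0.05, 0.7])         # P(self-reported smoking | state)

def loglik(cont, binary):
    """Normalized forward algorithm: log-likelihood and filtered state probabilities."""
    T, K = len(cont), len(pi)
    alpha = np.zeros((T, K))
    ll = 0.0
    for t in range(T):
        dens = (np.exp(-0.5 * ((cont[t] - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))
                * np.where(binary[t], p_report, 1.0 - p_report))
        pred = pi if t == 0 else alpha[t - 1] @ P   # one-step state prediction
        joint = pred * dens
        ll += np.log(joint.sum())
        alpha[t] = joint / joint.sum()              # filtered P(state_t | data 1..t)
    return ll, alpha

ll, alpha = loglik([1.8, 2.5, 7.9, 8.4], [0, 0, 1, 1])
```

With these toy observations the filtered probabilities shift from the "abstinent" to the "smoking" state as both responses change, which is the mechanism the mixed-effects HMM exploits.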
A Multivariate Model for the Study of Parental Acceptance-Rejection and Child Abuse.
ERIC Educational Resources Information Center
Rohner, Ronald P.; Rohner, Evelyn C.
This paper proposes a multivariate strategy for the study of parental acceptance-rejection and child abuse and describes a research study on parental rejection and child abuse which illustrates the advantages of using a multivariate (rather than a simple-model) approach. The multivariate model is a combination of three simple models used to study…
Multivariate models of mothers' and fathers' aggression toward their children.
Smith Slep, Amy M; O'Leary, Susan G
2007-10-01
Multivariate, biopsychosocial, explanatory models of mothers' and fathers' psychological and physical aggression toward their 3- to 7-year-old children were fitted and cross-validated in 453 representatively sampled families. Models explaining mothers' and fathers' aggression were substantially similar. Surprisingly, many variables identified as risk factors in the parental aggression and physical child abuse literatures, such as income, unrealistic expectations, and alcohol problems, although correlated with aggression bivariately, did not contribute uniquely to the models. In contrast, a small number of variables (i.e., child responsible attributions, overreactive discipline style, anger expression, and attitudes approving of aggression) appeared to be important pathways to parent aggression, mediating the effects of more distal risk factors. Models accounted for a moderate proportion of the variance in aggression. PMID:17907856
Optimal model-free prediction from multivariate time series.
Runge, Jakob; Donner, Reik V; Kurths, Jürgen
2015-05-01
Forecasting a time series from multivariate predictors constitutes a challenging problem, especially using model-free approaches. Most techniques, such as nearest-neighbor prediction, quickly suffer from the curse of dimensionality and overfitting when more than a few predictors are used, which has limited their application mostly to the univariate case. Therefore, selection strategies are needed that harness the available information as efficiently as possible. Since often the right combination of predictors matters, ideally all subsets of possible predictors should be tested for their predictive power, but the exponentially growing number of combinations makes such an approach computationally prohibitive. Here a prediction scheme that overcomes this strong limitation is introduced; it utilizes a causal preselection step that drastically reduces the number of possible predictors to the most predictive set of causal drivers, making a globally optimal search scheme tractable. The information-theoretic optimality is derived and practical selection criteria are discussed. As demonstrated for multivariate nonlinear stochastic delay processes, the optimal scheme can even be less computationally expensive than commonly used suboptimal schemes like forward selection. The method suggests a general framework to apply the optimal model-free approach to select variables and subsequently fit a model to further improve a prediction or learn statistical dependencies. The performance of this framework is illustrated on a climatological index of El Niño Southern Oscillation. PMID:26066231
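A rough sketch of the two-stage idea on synthetic data: a preselection screen followed by nearest-neighbour prediction in the reduced predictor space. The paper's preselection is causal and information-theoretic; a simple correlation screen stands in for it here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic multivariate predictors: only column 0 actually drives the target.
n = 500
X = rng.standard_normal((n, 5))
y = 0.9 * X[:, 0] + 0.1 * rng.standard_normal(n)

# Preselection step (a stand-in for the paper's causal/information-theoretic
# screen): rank predictors by absolute correlation with the target.
scores = np.abs([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
selected = np.argsort(scores)[::-1][:1]          # keep only the top predictor

# Model-free nearest-neighbour prediction in the reduced predictor space.
def knn_predict(Xtr, ytr, Xte, k=5):
    preds = []
    for x in Xte:
        d = np.linalg.norm(Xtr - x, axis=1)      # distances to training points
        preds.append(ytr[np.argsort(d)[:k]].mean())
    return np.array(preds)

Xs = X[:, selected]
yhat = knn_predict(Xs[:400], y[:400], Xs[400:])
mse = np.mean((yhat - y[400:]) ** 2)
```

Dropping the four irrelevant predictors before the neighbour search is what keeps the curse of dimensionality at bay; with all five columns retained, the same k-NN scheme degrades noticeably.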
Modeling pharmacokinetic data using heavy-tailed multivariate distributions.
Lindsey, J K; Jones, B
2000-08-01
Pharmacokinetic studies of drug and metabolite concentrations in the blood are usually conducted as crossover trials, especially in Phases I and II. A longitudinal series of measurements is collected on each subject within each period. Dependence among such observations, within and between periods, will generally be fairly complex, requiring two levels of variance components, for the subjects and for the periods within subjects, and an autocorrelation within periods as well as a time-varying variance. Until now, the standard way in which this has been modeled is using a multivariate normal distribution. Here, we introduce procedures for simultaneously handling these various types of dependence in a wider class of distributions called the multivariate power exponential and Student t families. They can have the heavy tails required for handling the extreme observations that may occur in such contexts. We also consider various forms of serial dependence among the observations and find that they provide more improvement to our models than do the variance components. An integrated Ornstein-Uhlenbeck (IOU) stochastic process fits much better to our data set than the conventional continuous first-order autoregression, CAR(1). We apply these models to a Phase I study of the drug, flosequinan, and its metabolite. PMID:10959917
Multivariate Models for Normal and Binary Responses in Intervention Studies
ERIC Educational Resources Information Center
Pituch, Keenan A.; Whittaker, Tiffany A.; Chang, Wanchen
2016-01-01
Use of multivariate analysis (e.g., multivariate analysis of variance) is common when normally distributed outcomes are collected in intervention research. However, when mixed responses--a set of normal and binary outcomes--are collected, standard multivariate analyses are no longer suitable. While mixed responses are often obtained in…
Recency of Pap smear screening: a multivariate model.
Howe, H L; Bzduch, H
1987-01-01
Most descriptive reports of women who have not received recent Pap smear screening have been limited to bivariate descriptions. The purpose of this study was to develop a multivariate model to predict the recency of Pap smear screening. A systematic sample of women residents, aged 25 to 74 years, in upstate New York was selected. The women were asked to report use of Pap smear screening during several time periods, their congruence with recommended medical practice, general use of medical services, and a variety of sociodemographic indicators. A log linear weighted least squares regression model was developed, and it explained 30 percent of the variance in recency of Pap smear screening behavior. While the sociodemographic variables were important predictors in the model, the medical care variables were the strongest predictors of recent Pap smear use. A significant relationship between race and recency of Pap smear testing was not supported by these data. PMID:3108946
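The weighted least squares estimator underlying log-linear WLS modelling can be sketched in a few lines; the design, weights, and coefficients below are synthetic, not the study's data.

```python
import numpy as np

# Weighted least squares: beta = (X'WX)^{-1} X'Wy, the estimator behind
# log-linear WLS regression models of the kind described above.
rng = np.random.default_rng(1)
n = 200
X = np.column_stack([np.ones(n), rng.standard_normal((n, 2))])  # intercept + 2 covariates
beta_true = np.array([0.5, 1.2, -0.7])
y = X @ beta_true                      # noiseless response, so WLS recovers beta exactly
w = rng.uniform(0.5, 2.0, size=n)      # observation weights (e.g. inverse variances)

W = np.diag(w)
beta_hat = np.linalg.solve(X.T @ W @ X, X.T @ W @ y)
```

With heteroscedastic data the weights down-weight noisy observations; here, with a noiseless response, any positive weights return the generating coefficients.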
Assessing Phenotypic Correlation through the Multivariate Phylogenetic Latent Liability Model
Cybis, Gabriela B.; Sinsheimer, Janet S.; Bedford, Trevor; Mather, Alison E.; Lemey, Philippe; Suchard, Marc A.
2016-01-01
Understanding which phenotypic traits are consistently correlated throughout evolution is a highly pertinent problem in modern evolutionary biology. Here, we propose a multivariate phylogenetic latent liability model for assessing the correlation between multiple types of data, while simultaneously controlling for their unknown shared evolutionary history informed through molecular sequences. The latent formulation enables us to consider in a single model combinations of continuous traits, discrete binary traits, and discrete traits with multiple ordered and unordered states. Previous approaches have entertained a single data type generally along a fixed history, precluding estimation of correlation between traits and ignoring uncertainty in the history. We implement our model in a Bayesian phylogenetic framework, and discuss inference techniques for hypothesis testing. Finally, we showcase the method through applications to columbine flower morphology, antibiotic resistance in Salmonella, and epitope evolution in influenza. PMID:27053974
Transfer of multivariate calibration models between spectrometers: A progress report
Haaland, D.; Jones, H.; Rohrback, B.
1994-12-31
Multivariate calibration methods are extremely powerful for quantitative spectral analyses and have myriad uses in quality control and process monitoring. However, when analyses are to be completed at multiple sites or when spectrometers drift, recalibration is required. Often a full recalibration of an instrument can be impractical: the problem is particularly acute when the number of calibration standards is large or the standards chemically unstable. Furthermore, simply using Instrument A's calibration model to predict unknowns on Instrument B can lead to enormous errors. Therefore, a mathematical procedure that would allow for the efficient transfer of a multivariate calibration model from one instrument to others using a small number of transfer standards is highly desirable. In this study, near-infrared spectral data have been collected from two sets of statistically designed round-robin samples on multiple FT-IR and grating spectrometers. One set of samples encompasses a series of dilute aqueous solutions of urea, creatinine, and NaCl while the second set is derived from mixtures of heptane, monochlorobenzene, and toluene. A systematic approach has been used to compare the results from four published transfer algorithms in order to determine parameters that affect the quality of the transfer for each class of sample and each type of spectrometer.
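Direct standardization is one of the simpler published transfer algorithms of the kind compared above: measure the same transfer standards on both instruments and estimate a linear map from B's spectral space into A's, after which A's calibration model can be reused. The spectra below are synthetic.

```python
import numpy as np

# Direct standardization (DS): estimate a transform F with S_B @ F ~= S_A from
# a small set of transfer standards measured on both instruments, so spectra
# acquired on instrument B can be pushed through instrument A's model.
rng = np.random.default_rng(2)
m, p = 10, 50                          # 10 transfer standards, 50 wavelengths
S_A = rng.standard_normal((m, p))      # standards measured on instrument A
T = np.eye(p) + 0.05 * rng.standard_normal((p, p))
S_B = S_A @ T                          # same standards on B (simulated linear distortion)

F = np.linalg.pinv(S_B) @ S_A          # least-squares transfer matrix
S_B_corrected = S_B @ F                # B spectra mapped into A's space
```

With only a handful of standards F fits them exactly; the practical question, as the abstract notes, is how well such a transform generalizes to new unknowns on each class of sample and spectrometer.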
Investigating College and Graduate Students' Multivariable Reasoning in Computational Modeling
ERIC Educational Resources Information Center
Wu, Hsin-Kai; Wu, Pai-Hsing; Zhang, Wen-Xin; Hsu, Ying-Shao
2013-01-01
Drawing upon the literature in computational modeling, multivariable reasoning, and causal attribution, this study aims at characterizing multivariable reasoning practices in computational modeling and revealing the nature of understanding about multivariable causality. We recruited two freshmen, two sophomores, two juniors, two seniors, four…
Multivariable frequency weighted model order reduction for control synthesis
NASA Technical Reports Server (NTRS)
Schmidt, David K.
1989-01-01
Quantitative criteria are presented for model simplification, or order reduction, such that the reduced order model may be used to synthesize and evaluate a control law, and the stability robustness obtained using the reduced order model will be preserved when controlling the full-order system. The error introduced due to model simplification is treated as modeling uncertainty, and some of the results from multivariate robustness theory are brought to bear on the model simplification problem. A numerical procedure developed previously is shown to lead to results that meet the necessary criteria. The procedure is applied to reduce the model of a flexible aircraft. Also, the importance of the control law itself, in meeting the modeling criteria, is underscored. An example is included that demonstrates that an apparently robust control law actually amplifies modest modeling errors in the critical frequency region, and leads to undesirable results. The cause of this problem is associated with the canceling of lightly damped transmission zeroes in the plant. An attempt is made to expand on some of the earlier results and to further clarify the theoretical basis behind the proposed methodology.
Simplex Factor Models for Multivariate Unordered Categorical Data
Bhattacharya, Anirban; Dunson, David B.
2013-01-01
Gaussian latent factor models are routinely used for modeling of dependence in continuous, binary, and ordered categorical data. For unordered categorical variables, Gaussian latent factor models lead to challenging computation and complex modeling structures. As an alternative, we propose a novel class of simplex factor models. In the single-factor case, the model treats the different categorical outcomes as independent with unknown marginals. The model can characterize flexible dependence structures parsimoniously with few factors, and as factors are added, any multivariate categorical data distribution can be accurately approximated. Using a Bayesian approach for computation and inferences, a Markov chain Monte Carlo (MCMC) algorithm is proposed that scales well with increasing dimension, with the number of factors treated as unknown. We develop an efficient proposal for updating the base probability vector in hierarchical Dirichlet models. Theoretical properties are described, and we evaluate the approach through simulation examples. Applications are described for modeling dependence in nucleotide sequences and prediction from high-dimensional categorical features. PMID:23908561
NASA Technical Reports Server (NTRS)
Achtemeier, Gary L.; Ochs, Harry T., III
1988-01-01
The variational method of undetermined multipliers is used to derive a multivariate model for objective analysis. The model is intended for the assimilation of 3-D fields of rawinsonde height, temperature and wind, and mean level temperature observed by satellite into a dynamically consistent data set. Relative measurement errors are taken into account. The dynamic equations are the two nonlinear horizontal momentum equations, the hydrostatic equation, and an integrated continuity equation. The model Euler-Lagrange equations are eleven linear and/or nonlinear partial differential and/or algebraic equations. A cyclical solution sequence is described. Other model features include a nonlinear terrain-following vertical coordinate that eliminates truncation error in the pressure gradient terms of the horizontal momentum equations and easily accommodates satellite observed mean layer temperatures in the middle and upper troposphere. A projection of the pressure gradient onto equivalent pressure surfaces removes most of the adverse impacts of the lower coordinate surface on the variational adjustment.
Multivariate screening in food adulteration: untargeted versus targeted modelling.
López, M Isabel; Trullols, Esther; Callao, M Pilar; Ruisánchez, Itziar
2014-03-15
Two multivariate screening strategies (untargeted and targeted modelling) have been developed to compare their ability to detect food fraud. As a case study, possible adulteration of hazelnut paste is considered. Two different adulterants were studied, almond paste and chickpea flour. The models were developed from near-infrared (NIR) data coupled with soft independent modelling of class analogy (SIMCA) as a classification technique. In the untargeted strategy, only unadulterated samples were modelled, obtaining 96.3% correct classification. The prediction of adulterated samples gave errors between 5.5% and 2%. In the targeted strategy, two classes were modelled: Class 1 (unadulterated samples) and Class 2 (almond-adulterated samples). Samples adulterated with chickpea were predicted in order to test the model's ability to deal with non-modelled adulterants. The results show that samples adulterated with almond were mainly classified in their own class (90.9%), and samples with chickpea were classified in Class 2 (67.3%) or in no class (30.9%), but none were classified only as unadulterated. PMID:24206702
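The untargeted strategy can be caricatured as a one-class model of the authentic product: fit a principal-component subspace to unadulterated samples only and flag anything whose residual is too large. This SIMCA-flavoured sketch uses synthetic data, not NIR spectra.

```python
import numpy as np

# One-class "untargeted" screen: PCA fitted to unadulterated samples only;
# new samples are flagged when their residual (Q statistic) off the retained
# principal-component subspace exceeds a percentile cutoff.
rng = np.random.default_rng(3)
scores_tr = rng.standard_normal((100, 2))
W = rng.standard_normal((2, 10))                 # 2 latent factors, 10 variables
X_tr = scores_tr @ W + 0.05 * rng.standard_normal((100, 10))

mu = X_tr.mean(axis=0)
_, _, Vt = np.linalg.svd(X_tr - mu, full_matrices=False)
V = Vt[:2].T                                     # retain 2 principal components

def q_stat(X):
    R = (X - mu) - (X - mu) @ V @ V.T            # residual off the PC subspace
    return (R ** 2).sum(axis=1)

cutoff = np.percentile(q_stat(X_tr), 95)         # acceptance region for the class

# "Adulterated" samples: same latent structure plus an off-model shift.
shift = rng.standard_normal(10)
X_ad = rng.standard_normal((50, 2)) @ W + 0.05 * rng.standard_normal((50, 10)) + shift
flagged = q_stat(X_ad) > cutoff
```

Because the model never sees any adulterant, it can reject adulterants it was not trained on, which is exactly the property tested above with the chickpea samples.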
Sparse Multivariate Autoregressive Modeling for Mild Cognitive Impairment Classification
Li, Yang; Wee, Chong-Yaw; Jie, Biao; Peng, Ziwen
2014-01-01
Brain connectivity network derived from functional magnetic resonance imaging (fMRI) is becoming increasingly prevalent in the researches related to cognitive and perceptual processes. The capability to detect causal or effective connectivity is highly desirable for understanding the cooperative nature of brain network, particularly when the ultimate goal is to obtain good performance of control-patient classification with biological meaningful interpretations. Understanding directed functional interactions between brain regions via brain connectivity network is a challenging task. Since many genetic and biomedical networks are intrinsically sparse, incorporating sparsity property into connectivity modeling can make the derived models more biologically plausible. Accordingly, we propose an effective connectivity modeling of resting-state fMRI data based on the multivariate autoregressive (MAR) modeling technique, which is widely used to characterize temporal information of dynamic systems. This MAR modeling technique allows for the identification of effective connectivity using the Granger causality concept and reducing the spurious causality connectivity in assessment of directed functional interaction from fMRI data. A forward orthogonal least squares (OLS) regression algorithm is further used to construct a sparse MAR model. By applying the proposed modeling to mild cognitive impairment (MCI) classification, we identify several most discriminative regions, including middle cingulate gyrus, posterior cingulate gyrus, lingual gyrus and caudate regions, in line with results reported in previous findings. A relatively high classification accuracy of 91.89 % is also achieved, with an increment of 5.4 % compared to the fully-connected, non-directional Pearson-correlation-based functional connectivity approach. PMID:24595922
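The core of a MAR model is a coefficient matrix whose entries encode directed influence between regions. A minimal sketch: fit a first-order MAR model by least squares to synthetic data, then hard-threshold small coefficients as a crude stand-in for the paper's forward OLS sparsification.

```python
import numpy as np

# Fit x_t = A x_{t-1} + e_t by least squares, then sparsify A.
rng = np.random.default_rng(4)
A_true = np.array([[0.3, 0.4, 0.0, 0.0],
                   [0.0, 0.3, 0.0, 0.0],
                   [0.0, 0.0, 0.3, 0.4],
                   [0.0, 0.0, 0.0, 0.3]])   # sparse, stable (all eigenvalues 0.3)
T = 4000
X = np.zeros((T, 4))
for t in range(1, T):
    X[t] = X[t - 1] @ A_true.T + 0.5 * rng.standard_normal(4)

# Rows of X[1:] are modelled as X[:-1] @ A.T, so lstsq returns A transposed.
A_hat = np.linalg.lstsq(X[:-1], X[1:], rcond=None)[0].T
A_sparse = np.where(np.abs(A_hat) > 0.15, A_hat, 0.0)   # hard threshold (stand-in)
```

A nonzero entry A[i, j] indicates that region j's past helps predict region i's present, the Granger-causal reading of effective connectivity used above.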
Multivariate models of inter-subject anatomical variability
Ashburner, John; Klöppel, Stefan
2011-01-01
This paper presents a very selective review of some of the approaches for multivariate modelling of inter-subject variability among brain images. It focusses on applying probabilistic kernel-based pattern recognition approaches to pre-processed anatomical MRI, with the aim of most accurately modelling the difference between populations of subjects. Some of the principles underlying the pattern recognition approaches of Gaussian process classification and regression are briefly described, although the reader is advised to look elsewhere for full implementational details. Kernel pattern recognition methods require matrices that encode the degree of similarity between the images of each pair of subjects. This review focusses on similarity measures derived from the relative shapes of the subjects' brains. Pre-processing is viewed as generative modelling of anatomical variability, and there is a special emphasis on the diffeomorphic image registration framework, which provides a very parsimonious representation of relative shapes. Although the review is largely methodological, excessive mathematical notation is avoided as far as possible, as the paper attempts to convey a more intuitive understanding of various concepts. The paper should be of interest to readers wishing to apply pattern recognition methods to MRI data, with the aim of clinical diagnosis or biomarker development. It also tries to explain that the best models are those that most accurately predict, so similar approaches should also be relevant to basic science. Knowledge of some basic linear algebra and probability theory should make the review easier to follow, although it may still have something to offer to those readers whose mathematics may be more limited. PMID:20347998
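The kernel machinery the review describes can be sketched with Gaussian-process regression: a similarity (kernel) matrix between subjects drives prediction of a continuous target. Real applications would derive the kernel from pairwise image similarities; here "subjects" are synthetic one-dimensional features.

```python
import numpy as np

# Minimal GP regression: predictive mean k(x*, X) (K + noise)^{-1} y.
rng = np.random.default_rng(6)
Xtr = rng.uniform(-3, 3, size=(40, 1))
ytr = np.sin(Xtr[:, 0]) + 0.1 * rng.standard_normal(40)

def rbf(A, B, ell=1.0):
    """Squared-exponential kernel: pairwise similarity between subjects."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ell**2)

K = rbf(Xtr, Xtr) + 0.01 * np.eye(40)            # kernel matrix + noise variance
Xte = np.array([[0.0], [1.5]])
mean = rbf(Xte, Xtr) @ np.linalg.solve(K, ytr)   # GP predictive mean
```

The only inputs the predictor needs are the two kernel blocks, which is why, as the review emphasizes, the choice of similarity measure between subjects' anatomies is the modelling decision that matters.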
Harinath, Eranda; Mann, George K I
2008-06-01
This paper describes a design and two-level tuning method for fuzzy proportional-integral-derivative (FPID) controllers for a multivariable process, where the fuzzy inference uses the standard additive model. The proposed method can be used for any n x n multi-input-multi-output process and guarantees closed-loop stability. In the two-level tuning scheme, the tuning follows two steps: low-level tuning followed by high-level tuning. The low-level tuning adjusts apparent linear gains, whereas the high-level tuning changes the nonlinearity in the normalized fuzzy output. In this paper, two types of FPID configurations are considered, and their performances are evaluated using a real-time multizone temperature control problem with a 3 x 3 process system. PMID:18558531
Linear multivariate evaluation models for spatial perception of soundscape.
Deng, Zhiyong; Kang, Jian; Wang, Daiwei; Liu, Aili; Kang, Joe Zhengyu
2015-11-01
Soundscape is a sound environment that emphasizes the awareness of auditory perception and social or cultural understandings. Spatial perception is significant to soundscape, yet previous studies on the auditory spatial perception of the soundscape environment have been limited. Based on 21 native binaural-recorded soundscape samples and a set of auditory experiments on subjective spatial perception (SSP), an analysis among semantic parameters, the inter-aural cross-correlation coefficient (IACC), the A-weighted equivalent sound pressure level (Leq), dynamic (D), and SSP is introduced to verify the independent effect of each parameter and to re-determine some of their possible relationships. The results show that the more noisiness the audience perceived, the worse their spatial awareness, while the closer and more directional the sound-source image variations, dynamics, and numbers of sound sources in the soundscape, the better the spatial awareness. Thus, the sensations of roughness, sound intensity, and transient dynamics, and the values of Leq and IACC, have a suitable range for better spatial perception. Better spatial awareness also appears to slightly increase audience preference. Finally, setting SSPs as functions of the semantic parameters and Leq-D-IACC, two linear multivariate evaluation models of subjective spatial perception are proposed. PMID:26627762
Real, Jordi; Forné, Carles; Roso-Llorach, Albert; Martínez-Sánchez, Jose M
2016-05-01
Controlling for confounders is a crucial step in analytical observational studies, and multivariable models are widely used as statistical adjustment techniques. However, the validation of the assumptions of the multivariable regression models (MRMs) should be made clear in scientific reporting. The objective of this study is to review the quality of statistical reporting of the most commonly used MRMs (logistic, linear, and Cox regression) that were applied in analytical observational studies published between 2003 and 2014 by journals indexed in MEDLINE. Review of a representative sample of articles indexed in MEDLINE (n = 428) with observational design and use of MRMs (logistic, linear, and Cox regression). We assessed the quality of reporting about: model assumptions and goodness-of-fit, interactions, sensitivity analysis, crude and adjusted effect estimate, and specification of more than 1 adjusted model. The tests of underlying assumptions or goodness-of-fit of the MRMs used were described in 26.2% (95% CI: 22.0-30.3) of the articles and 18.5% (95% CI: 14.8-22.1) reported the interaction analysis. Reporting of all items assessed was higher in articles published in journals with a higher impact factor. A low percentage of articles indexed in MEDLINE that used multivariable techniques provided information demonstrating rigorous application of the model selected as an adjustment method. Given the importance of these methods to the final results and conclusions of observational studies, greater rigor is required in reporting the use of MRMs in the scientific literature. PMID:27196467
Smooth-Threshold Multivariate Genetic Prediction with Unbiased Model Selection.
Ueki, Masao; Tamiya, Gen
2016-04-01
We develop a new genetic prediction method, smooth-threshold multivariate genetic prediction, using single nucleotide polymorphism (SNP) data from genome-wide association studies (GWASs). Our method consists of two stages. At the first stage, unlike the usual discontinuous SNP screening used in the gene score method, our method continuously screens SNPs based on the output from standard univariate analysis for marginal association of each SNP. At the second stage, the predictive model is built by a generalized ridge regression simultaneously using the screened SNPs, with SNP weights determined by the strength of marginal association. Continuous SNP screening by smooth thresholding not only makes prediction stable but also leads to a closed-form expression for the generalized degrees of freedom (GDF). The GDF leads to Stein's unbiased risk estimation (SURE), which enables data-dependent choice of the optimal SNP screening cutoff without using cross-validation. Our method is very rapid because the computationally expensive genome-wide scan is required only once, in contrast to penalized regression methods such as the lasso and elastic net. Simulation studies that mimic real GWAS data with quantitative and binary traits demonstrate that the proposed method outperforms the gene score method and genomic best linear unbiased prediction (GBLUP), and also shows comparable or sometimes improved performance relative to the lasso and elastic net, which are known to have good predictive ability but heavy computational cost. Application to whole-genome sequencing (WGS) data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) shows that the proposed method has higher predictive power than the gene score and GBLUP methods. PMID:26947266
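A simplified sketch of the two-stage pipeline: continuous screening weights from each SNP's marginal association, then ridge regression on the weighted genotype matrix. The paper chooses the cutoff by SURE; here it is fixed by hand, and the genotypes and effect sizes are synthetic.

```python
import numpy as np

# Stage 1: continuous screening weights from marginal correlations (a stand-in
# for the paper's smooth thresholding with a SURE-chosen cutoff).
# Stage 2: ridge regression on the weighted SNP matrix.
rng = np.random.default_rng(5)
n, p = 300, 50
G = rng.integers(0, 3, size=(n, p)).astype(float)   # SNP genotypes coded 0/1/2
causal = [0, 1, 2]                                  # hypothetical causal SNPs
y = G[:, causal].sum(axis=1) + rng.standard_normal(n)

r = np.array([np.corrcoef(G[:, j], y)[0, 1] for j in range(p)])
w = np.clip(np.abs(r) - 0.1, 0.0, None)             # continuous threshold at |r| = 0.1
Gw = G * w                                          # screened-and-weighted genotypes

lam = 1.0                                           # ridge penalty (fixed here)
beta = np.linalg.solve(Gw[:200].T @ Gw[:200] + lam * np.eye(p),
                       Gw[:200].T @ y[:200])        # fit on the first 200 samples
yhat = Gw[200:] @ beta                              # predict the held-out 100
r2 = np.corrcoef(yhat, y[200:])[0, 1] ** 2
```

Because the weights vary continuously with the marginal statistic, a SNP near the cutoff enters the model gradually rather than jumping in or out, which is what stabilizes prediction relative to hard screening.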
Pleiotropy analysis of quantitative traits at gene level by multivariate functional linear models.
Wang, Yifan; Liu, Aiyi; Mills, James L; Boehnke, Michael; Wilson, Alexander F; Bailey-Wilson, Joan E; Xiong, Momiao; Wu, Colin O; Fan, Ruzong
2015-05-01
In genetics, pleiotropy describes the genetic effect of a single gene on multiple phenotypic traits. A common approach is to analyze the phenotypic traits separately using univariate analyses and combine the test results through multiple comparisons. This approach may lead to low power. Multivariate functional linear models are developed to connect genetic variant data to multiple quantitative traits adjusting for covariates for a unified analysis. Three types of approximate F-distribution tests based on Pillai-Bartlett trace, Hotelling-Lawley trace, and Wilks's Lambda are introduced to test for association between multiple quantitative traits and multiple genetic variants in one genetic region. The approximate F-distribution tests provide much more significant results than those of F-tests of univariate analysis and optimal sequence kernel association test (SKAT-O). Extensive simulations were performed to evaluate the false positive rates and power performance of the proposed models and tests. We show that the approximate F-distribution tests control the type I error rates very well. Overall, simultaneous analysis of multiple traits can increase power performance compared to an individual test of each trait. The proposed methods were applied to analyze (1) four lipid traits in eight European cohorts, and (2) three biochemical traits in the Trinity Students Study. The approximate F-distribution tests provide much more significant results than those of F-tests of univariate analysis and SKAT-O for the three biochemical traits. The approximate F-distribution tests of the proposed functional linear models are more sensitive than those of the traditional multivariate linear models that in turn are more sensitive than SKAT-O in the univariate case. The analysis of the four lipid traits and the three biochemical traits detects more association than SKAT-O in the univariate case. PMID:25809955
Aguero-Valverde, Jonathan; Wu, Kun-Feng Ken; Donnell, Eric T
2016-02-01
Many studies have proposed the use of a systemic approach to identify sites with promise (SWiPs). Proponents of the systemic approach to road safety management suggest that it is more effective in reducing crash frequency than the traditional hot spot approach. The systemic approach aims to identify SWiPs by crash type(s) and, therefore, effectively connects crashes to their corresponding countermeasures. Nevertheless, a major challenge to implementing this approach is the low precision of crash frequency models, which results from the systemic approach considering subsets (crash types) of total crashes leading to higher variability in modeling outcomes. This study responds to the need for more precise statistical output and proposes a multivariate spatial model for simultaneously modeling crash frequencies for different crash types. The multivariate spatial model not only induces a multivariate correlation structure between crash types at the same site, but also spatial correlation among adjacent sites to enhance model precision. This study utilized crash, traffic, and roadway inventory data on rural two-lane highways in Pennsylvania to construct and test the multivariate spatial model. Four models with and without the multivariate and spatial correlations were tested and compared. The results show that the model that considers both multivariate and spatial correlation has the best fit. Moreover, it was found that the multivariate correlation plays a stronger role than the spatial correlation when modeling crash frequencies in terms of different crash types. PMID:26615494
Bailit, Jennifer L.; Grobman, William A.; Rice, Madeline Murguia; Spong, Catherine Y.; Wapner, Ronald J.; Varner, Michael W.; Thorp, John M.; Leveno, Kenneth J.; Caritis, Steve N.; Shubert, Phillip J.; Tita, Alan T. N.; Saade, George; Sorokin, Yoram; Rouse, Dwight J.; Blackwell, Sean C.; Tolosa, Jorge E.; Van Dorsten, J. Peter
2014-01-01
Objective: Regulatory bodies and insurers evaluate hospital quality using obstetrical outcomes; however, meaningful comparisons should take pre-existing patient characteristics into account. Furthermore, if risk-adjusted outcomes are consistent within a hospital, fewer measures and resources would be needed to assess obstetrical quality. Our objective was to establish risk-adjusted models for five obstetric outcomes and assess hospital performance across these outcomes. Study Design: A cohort study of 115,502 women and their neonates born in 25 hospitals in the United States between March 2008 and February 2011. Hospitals were ranked according to their unadjusted and risk-adjusted frequency of venous thromboembolism, postpartum hemorrhage, peripartum infection, severe perineal laceration, and a composite neonatal adverse outcome. Correlations between hospital risk-adjusted outcome frequencies were assessed. Results: Venous thromboembolism occurred too infrequently (0.03%, 95% CI 0.02% – 0.04%) for meaningful assessment. Other outcomes occurred frequently enough for assessment (postpartum hemorrhage 2.29% (95% CI 2.20–2.38), peripartum infection 5.06% (95% CI 4.93–5.19), severe perineal laceration at spontaneous vaginal delivery 2.16% (95% CI 2.06–2.27), neonatal composite 2.73% (95% CI 2.63–2.84)). Although there was high concordance between unadjusted and adjusted hospital rankings, several individual hospitals had an adjusted rank that was substantially different (by as much as 12 rank tiers) from their unadjusted rank. None of the correlations between hospital adjusted outcome frequencies was significant. For example, the hospital with the lowest adjusted frequency of peripartum infection had the highest adjusted frequency of severe perineal laceration. Conclusions: Evaluations based on a single risk-adjusted outcome cannot be generalized to overall hospital obstetric performance. PMID:23891630
Multivariate Effect Size Estimation: Confidence Interval Construction via Latent Variable Modeling
ERIC Educational Resources Information Center
Raykov, Tenko; Marcoulides, George A.
2010-01-01
A latent variable modeling method is outlined for constructing a confidence interval (CI) of a popular multivariate effect size measure. The procedure uses the conventional multivariate analysis of variance (MANOVA) setup and is applicable with large samples. The approach provides a population range of plausible values for the proportion of…
Multi-Variable Model-Based Parameter Estimation Model for Antenna Radiation Pattern Prediction
NASA Technical Reports Server (NTRS)
Deshpande, Manohar D.; Cravey, Robin L.
2002-01-01
A new procedure is presented to develop a multi-variable model-based parameter estimation (MBPE) model to predict the far field intensity of an antenna. By performing the MBPE model development procedure on a single variable at a time, the present method requires the solution of smaller matrices. The utility of the present method is demonstrated by determining the far field intensity due to a dipole antenna over a frequency range of 100-1000 MHz and an elevation angle range of 0-90 degrees.
Multivariate Search of the Standard Model Higgs Boson at LHC
Mjahed, Mostafa
2007-01-12
We present an attempt to identify the SM Higgs boson at the LHC in the channel pp-bar → HX → W+W-X → l+ν l-ν X. We use multivariate processing of data as a tool for better discrimination between signal and background (via Principal Components Analysis, Genetic Algorithms and Neural Networks). Events were produced at LHC energies (MH = 140-200 GeV) using the Lund Monte Carlo generator PYTHIA 6.1. Higgs boson events (pp-bar → HX → W+W-X → l+ν l-ν X) and the most relevant backgrounds are considered.
The Detection of Metabolite-Mediated Gene Module Co-Expression Using Multivariate Linear Models
Padayachee, Trishanta; Khamiakova, Tatsiana; Shkedy, Ziv; Perola, Markus; Salo, Perttu; Burzykowski, Tomasz
2016-01-01
Investigating whether metabolites regulate the co-expression of a predefined gene module is one of the relevant questions posed in the integrative analysis of metabolomic and transcriptomic data. This article concerns the integrative analysis of the two high-dimensional datasets by means of multivariate models and statistical tests for the dependence between metabolites and the co-expression of a gene module. The general linear model (GLM) for correlated data that we propose models the dependence between adjusted gene expression values through a block-diagonal variance-covariance structure formed by metabolic-subset specific general variance-covariance blocks. The performance of statistical tests for the inference of conditional co-expression is evaluated through a simulation study. The proposed methodology is applied to the gene expression data of the previously characterized lipid-leukocyte module. Our results show that the GLM approach improves on a previous approach by being less prone to the detection of spurious conditional co-expression. PMID:26918614
Technology Transfer Automated Retrieval System (TEKTRAN)
Advanced mathematical models have the potential to capture the complex metabolic and physiological processes that result in heat production, or energy expenditure (EE). Multivariate adaptive regression splines (MARS), is a nonparametric method that estimates complex nonlinear relationships by a seri...
Modelling world gold prices and USD foreign exchange relationship using multivariate GARCH model
NASA Astrophysics Data System (ADS)
Ping, Pung Yean; Ahmad, Maizah Hura Binti
2014-12-01
The world gold price is a popular investment commodity. The series has often been modeled using univariate models. The objective of this paper is to show that there is a co-movement between the gold price and the USD foreign exchange rate. Using the effect of the USD foreign exchange rate on the gold price, a model that can be used to forecast future gold prices is developed. For this purpose, the current paper proposes a multivariate GARCH (bivariate GARCH) model. Using daily prices of both series from 01.01.2000 to 05.05.2014, a causal relation between the two series under study is found and a bivariate GARCH model is produced.
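As an illustration of the multivariate GARCH family invoked above, the following is a minimal simulation of a bivariate constant-conditional-correlation (CCC) GARCH(1,1), one simple member of that family. The parameters are invented for the sketch, not estimates from the gold/USD data:

```python
import numpy as np

def simulate_ccc_garch(n, omega, alpha, beta, rho, seed=0):
    """Simulate a bivariate CCC-GARCH(1,1): each series has its own
    GARCH(1,1) variance; shocks share a constant correlation rho."""
    rng = np.random.default_rng(seed)
    h = omega / (1 - alpha - beta)        # start at unconditional variance
    C = np.array([[1.0, rho], [rho, 1.0]])
    L = np.linalg.cholesky(C)
    r = np.zeros((n, 2))
    for t in range(n):
        z = L @ rng.normal(size=2)        # correlated standard shocks
        r[t] = np.sqrt(h) * z             # returns with time-varying variance
        h = omega + alpha * r[t] ** 2 + beta * h
    return r

r = simulate_ccc_garch(5000,
                       omega=np.array([0.05, 0.02]),
                       alpha=np.array([0.08, 0.05]),
                       beta=np.array([0.90, 0.92]),
                       rho=0.4)
print(np.corrcoef(r.T)[0, 1])  # sample correlation near the imposed 0.4
```

Estimating rather than simulating such a model is what the paper does with the gold and exchange-rate series; the simulation just shows the co-movement structure a bivariate GARCH encodes.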
When univariate model-free time series prediction is better than multivariate
NASA Astrophysics Data System (ADS)
Chayama, Masayoshi; Hirata, Yoshito
2016-07-01
The delay coordinate method is known to be a practically useful technique for reconstructing the states of an observed system. While this method is theoretically supported by Takens' embedding theorem concerning observations of a scalar time series, we can extend the method to include a multivariate time series. It is often assumed that a better prediction can be obtained using a multivariate time series than by using a scalar time series. However, a multivariate time series contains various types of information, and it may be difficult to extract the information that is useful for predicting the states. Thus, univariate prediction may sometimes be superior to multivariate prediction. Here, we compare univariate model-free time series predictions with multivariate ones, and demonstrate that univariate model-free prediction is better than multivariate prediction when the prediction steps are small, while multivariate prediction performs better when the prediction steps become larger. We show the validity of the former finding by using artificial datasets generated from the Lorenz 96 models and a real solar irradiance dataset. The results indicate that it is possible to determine which method is the best choice by considering how far into the future we want to predict.
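The delay coordinate reconstruction plus model-free prediction described above can be sketched in a few lines: embed the scalar series, find the nearest past delay vector, and read off where it went. The embedding dimension, delay, and toy series below are illustrative choices, not the paper's settings:

```python
import numpy as np

def delay_embed(x, dim, tau):
    """Delay-coordinate reconstruction of a scalar series x:
    row k is (x[k], x[k+tau], ..., x[k+(dim-1)*tau])."""
    n = len(x) - (dim - 1) * tau
    return np.column_stack([x[i * tau : i * tau + n] for i in range(dim)])

def nn_predict(x, dim=3, tau=1, steps=1):
    """Model-free prediction: locate the past delay vector nearest to the
    current one and return the value it was followed by `steps` later."""
    emb = delay_embed(x, dim, tau)
    query = emb[-1]
    # exclude vectors whose future would run past the end of the series
    candidates = emb[: len(emb) - steps]
    j = np.argmin(np.linalg.norm(candidates - query, axis=1))
    return x[j + (dim - 1) * tau + steps]

# toy example: one-step prediction of a noisy sine wave
t = np.linspace(0, 20 * np.pi, 2000)
x = np.sin(t) + 0.01 * np.random.default_rng(1).normal(size=t.size)
print(nn_predict(x, steps=1))
```

A multivariate variant would simply stack delay vectors from several observed channels; the paper's point is that the extra channels help only at longer prediction horizons.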
MacNab, Ying C
2016-08-01
This paper concerns multivariate conditional autoregressive models defined by linear combinations of independent or correlated underlying spatial processes. Known as linear models of coregionalization, the method offers a systematic and unified approach for formulating multivariate extensions to a broad range of univariate conditional autoregressive models. The resulting multivariate spatial models represent classes of coregionalized multivariate conditional autoregressive models that enable flexible modelling of multivariate spatial interactions, yielding coregionalization models with symmetric or asymmetric cross-covariances of different spatial variation and smoothness. In the context of multivariate disease mapping, for example, they facilitate borrowing strength both over space and across variables, allowing for more flexible multivariate spatial smoothing. Specifically, we present a broadened coregionalization framework that includes order-dependent, order-free, and order-robust multivariate models; a new class of order-free coregionalized multivariate conditional autoregressive models is introduced. We tackle computational challenges and present solutions that are integral for Bayesian analysis of these models. We also discuss two ways of computing the deviance information criterion for comparison among competing hierarchical models with or without unidentifiable prior parameters. The models and related methodology are developed in the broad context of modelling multivariate data on a spatial lattice and illustrated in the context of multivariate disease mapping. The coregionalization framework and related methods also present a general approach for building spatially structured cross-covariance functions for multivariate geostatistics. PMID:27566769
NASA Technical Reports Server (NTRS)
Waszak, Martin R.
1997-01-01
The Benchmark Active Controls Technology (BACT) project is part of NASA Langley Research Center's Benchmark Models Program for studying transonic aeroelastic phenomena. In January of 1996 the BACT wind-tunnel model was used to successfully demonstrate the application of robust multivariable control design methods (H-infinity and mu-synthesis) to flutter suppression. This paper addresses the design and experimental evaluation of robust multivariable flutter suppression control laws, with particular attention paid to the degree to which stability and performance robustness was achieved.
Ringham, Brandy M; Kreidler, Sarah M; Muller, Keith E; Glueck, Deborah H
2016-07-30
Multilevel and longitudinal studies are frequently subject to missing data. For example, biomarker studies for oral cancer may involve multiple assays for each participant. Assays may fail, resulting in missing data values that can be assumed to be missing completely at random. Catellier and Muller proposed a data analytic technique to account for data missing at random in multilevel and longitudinal studies. They suggested modifying the degrees of freedom for both the Hotelling-Lawley trace F statistic and its null case reference distribution. We propose parallel adjustments to approximate power for this multivariate test in studies with missing data. The power approximations use a modified non-central F statistic, which is a function of (i) the expected number of complete cases, (ii) the expected number of non-missing pairs of responses, or (iii) the trimmed sample size, which is the planned sample size reduced by the anticipated proportion of missing data. The accuracy of the method is assessed by comparing the theoretical results to the Monte Carlo simulated power for the Catellier and Muller multivariate test. Over all experimental conditions, the closest approximation to the empirical power of the Catellier and Muller multivariate test is obtained by adjusting power calculations with the expected number of complete cases. The utility of the method is demonstrated with a multivariate power analysis for a hypothetical oral cancer biomarkers study. We describe how to implement the method using standard, commercially available software products and give example code. Copyright © 2015 John Wiley & Sons, Ltd. PMID:26603500
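The general recipe above (compute power from a non-central F distribution, with the non-centrality adjusted for the expected number of complete cases) can be sketched with scipy. The function name, the way the non-centrality is formed, and the numbers are hypothetical stand-ins, not the exact Catellier and Muller degrees-of-freedom adjustments:

```python
from scipy.stats import f, ncf

def approx_f_power(nc, df1, df2, alpha=0.05):
    """Approximate power of an F test: probability that a non-central F
    with non-centrality nc exceeds the central-F critical value."""
    f_crit = f.ppf(1 - alpha, df1, df2)
    return 1 - ncf.cdf(f_crit, df1, df2, nc)

# Missing-data adjustment (sketch): rescale the non-centrality by the
# expected number of complete cases instead of the planned sample size.
n_planned, p_missing, effect = 60, 0.2, 0.25   # illustrative values
n_complete = n_planned * (1 - p_missing)
print(approx_f_power(nc=effect * n_complete, df1=3, df2=n_complete - 4))
```

The paper's finding is that, among the three candidate sample-size proxies, the expected number of complete cases gives the closest match to simulated power for the Catellier and Muller test.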
Tang, An-Min; Tang, Nian-Sheng
2015-02-28
We propose a semiparametric multivariate skew-normal joint model for multivariate longitudinal and multivariate survival data. One main feature of the posited model is that we relax the commonly used normality assumption for random effects and within-subject error by using a centered Dirichlet process prior to specify the random effects distribution and using a multivariate skew-normal distribution to specify the within-subject error distribution and model trajectory functions of longitudinal responses semiparametrically. A Bayesian approach is proposed to simultaneously obtain Bayesian estimates of unknown parameters, random effects and nonparametric functions by combining the Gibbs sampler and the Metropolis-Hastings algorithm. Particularly, a Bayesian local influence approach is developed to assess the effect of minor perturbations to within-subject measurement error and random effects. Several simulation studies and an example are presented to illustrate the proposed methodologies. PMID:25404574
Multivariate spatial models of excess crash frequency at area level: case of Costa Rica.
Aguero-Valverde, Jonathan
2013-10-01
Recently, areal models of crash frequency have been used in the analysis of various area-wide factors affecting road crashes. On the other hand, disease mapping methods are commonly used in epidemiology to assess the relative risk of the population at different spatial units. A natural next step is to combine these two approaches to estimate the excess crash frequency at the area level as a measure of absolute crash risk. Furthermore, multivariate spatial models of crash severity are explored in order to account for both frequency and severity of crashes and control for the spatial correlation frequently found in crash data. This paper aims to extend the concept of safety performance functions to areal models of crash frequency. A multivariate spatial model is used for that purpose and compared to its univariate counterpart. A full Bayes hierarchical approach is used to estimate the models of crash frequency at the canton level for Costa Rica. An intrinsic multivariate conditional autoregressive model is used for modeling spatial random effects. The results show that the multivariate spatial model performs better than its univariate counterpart in terms of the penalized goodness-of-fit measure Deviance Information Criterion. Additionally, the effects of the spatial smoothing due to the multivariate spatial random effects are evident in the estimation of excess equivalent property damage only crashes. PMID:23872657
Multivariate Regression Models for Estimating Journal Usefulness in Physics.
ERIC Educational Resources Information Center
Bennion, Bruce C.; Karschamroon, Sunee
1984-01-01
This study examines possibility of ranking journals in physics by means of bibliometric regression models that estimate usefulness as it is reported by 167 physicists in United States and Canada. Development of four models, patterns of deviation from models, and validity and application are discussed. Twenty-six references are cited. (EJS)
A Sandwich-Type Standard Error Estimator of SEM Models with Multivariate Time Series
ERIC Educational Resources Information Center
Zhang, Guangjian; Chow, Sy-Miin; Ong, Anthony D.
2011-01-01
Structural equation models are increasingly used as a modeling tool for multivariate time series data in the social and behavioral sciences. Standard error estimators of SEM models, originally developed for independent data, require modifications to accommodate the fact that time series data are inherently dependent. In this article, we extend a…
Preliminary Multi-Variable Cost Model for Space Telescopes
NASA Technical Reports Server (NTRS)
Stahl, H. Philip; Hendrichs, Todd
2010-01-01
Parametric cost models are routinely used to plan missions, compare concepts and justify technology investments. This paper reviews the methodology used to develop space telescope cost models; summarizes recently published single variable models; and presents preliminary results for two and three variable cost models. Some of the findings are that increasing mass reduces cost; it costs less per square meter of collecting aperture to build a large telescope than a small telescope; and technology development as a function of time reduces cost at the rate of 50% per 17 years.
A Multivariate Model for Coastal Water Quality Mapping Using Satellite Remote Sensing Images
Su, Yuan-Fong; Liou, Jun-Jih; Hou, Ju-Chen; Hung, Wei-Chun; Hsu, Shu-Mei; Lien, Yi-Ting; Su, Ming-Daw; Cheng, Ke-Sheng; Wang, Yeng-Fung
2008-01-01
This study demonstrates the feasibility of coastal water quality mapping using satellite remote sensing images. Water quality sampling campaigns were conducted over a coastal area in northern Taiwan for measurements of three water quality variables including Secchi disk depth, turbidity, and total suspended solids. SPOT satellite images nearly concurrent with the water quality sampling campaigns were also acquired. A spectral reflectance estimation scheme proposed in this study was applied to SPOT multispectral images for estimation of the sea surface reflectance. Two models, univariate and multivariate, for water quality estimation using the sea surface reflectance derived from SPOT images were established. The multivariate model takes into consideration the wavelength-dependent combined effect of individual seawater constituents on the sea surface reflectance and is superior to the univariate model. Finally, quantitative coastal water quality mapping was accomplished by substituting the pixel-specific spectral reflectance into the multivariate water quality estimation model.
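The univariate-versus-multivariate comparison described above amounts to regressing a water quality variable on one reflectance band versus all bands jointly and comparing the explained variance. This sketch uses simulated reflectance values standing in for the SPOT-derived sea surface reflectance; the coefficients and noise level are invented:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 120
refl = rng.uniform(0.01, 0.2, size=(n, 3))   # hypothetical 3-band reflectance
# simulated turbidity driven by all three bands plus noise
turb = 5 + 40 * refl[:, 0] - 25 * refl[:, 1] + 10 * refl[:, 2] \
       + rng.normal(0, 0.3, n)

# univariate model: intercept plus a single band
A1 = np.column_stack([np.ones(n), refl[:, 0]])
b1, *_ = np.linalg.lstsq(A1, turb, rcond=None)

# multivariate model: intercept plus all bands jointly
A3 = np.column_stack([np.ones(n), refl])
b3, *_ = np.linalg.lstsq(A3, turb, rcond=None)

def r2(A, b):
    """Coefficient of determination of the fitted linear model."""
    resid = turb - A @ b
    tot = turb - turb.mean()
    return 1 - (resid @ resid) / (tot @ tot)

print(r2(A1, b1), r2(A3, b3))  # the joint model explains more variance
```

Applying the fitted multivariate coefficients pixel by pixel to a reflectance image is what turns the regression into a water quality map.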
Gaussian Copula multivariate modeling for texture image retrieval using wavelet transforms.
Lasmar, Nour-Eddine; Berthoumieu, Yannick
2014-05-01
In the framework of texture image retrieval, a new family of stochastic multivariate modeling is proposed based on Gaussian Copula and wavelet decompositions. We take advantage of the copula paradigm, which makes it possible to separate dependence structure from marginal behavior. We introduce two new multivariate models using, respectively, generalized Gaussian and Weibull densities. These models capture both the subband marginal distributions and the correlation between wavelet coefficients. We derive, as a similarity measure, a closed form expression of the Jeffrey divergence between Gaussian copula-based multivariate models. Experimental results on well-known databases show significant improvements in retrieval rates using the proposed method compared with the best known state-of-the-art approaches. PMID:24686281
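A Gaussian copula separates dependence from marginal behavior exactly as described: correlated Gaussians are mapped to uniforms, then pushed through the desired marginal quantile functions. A minimal sampling sketch follows, with Weibull marginals standing in for the paper's wavelet-subband models (the correlation and shape parameters are illustrative):

```python
import numpy as np
from scipy.stats import norm, weibull_min

def gaussian_copula_sample(corr, marginals, n, seed=0):
    """Draw n samples whose dependence is a Gaussian copula with correlation
    matrix `corr` and whose marginals are the given frozen distributions."""
    rng = np.random.default_rng(seed)
    z = rng.multivariate_normal(np.zeros(len(marginals)), corr, size=n)
    u = norm.cdf(z)  # uniform marginals, Gaussian dependence preserved
    return np.column_stack([m.ppf(u[:, j]) for j, m in enumerate(marginals)])

corr = np.array([[1.0, 0.7],
                 [0.7, 1.0]])
margs = [weibull_min(1.5), weibull_min(2.0)]   # hypothetical subband models
samples = gaussian_copula_sample(corr, margs, 5000)
print(np.corrcoef(samples.T)[0, 1])  # dependence inherited from the copula
```

Fitting rather than sampling (estimate the marginals, transform to Gaussian scores, estimate the correlation) is the direction used for retrieval, with the Jeffrey divergence comparing the fitted models.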
Nonlinear Latent Curve Models for Multivariate Longitudinal Data
ERIC Educational Resources Information Center
Blozis, Shelley A.; Conger, Katherine J.; Harring, Jeffrey R.
2007-01-01
Latent curve models have become a useful approach to analyzing longitudinal data, due in part to their allowance of and emphasis on individual differences in features that describe change. Common applications of latent curve models in developmental studies rely on polynomial functions, such as linear or quadratic functions. Although useful for…
Modeling the Pineapple Express phenomenon via Multivariate Extreme Value Theory
NASA Astrophysics Data System (ADS)
Weller, G.; Cooley, D. S.
2011-12-01
The pineapple express (PE) phenomenon is responsible for producing extreme winter precipitation events in the coastal and mountainous regions of the western United States. Because the PE phenomenon is also associated with warm temperatures, the heavy precipitation and associated snowmelt can cause destructive flooding. In order to study impacts, it is important that regional climate models from NARCCAP are able to reproduce extreme precipitation events produced by PE. We define a daily precipitation quantity which captures the spatial extent and intensity of precipitation events produced by the PE phenomenon. We then use statistical extreme value theory to model the tail dependence of this quantity as seen in an observational data set and each of the six NARCCAP regional models driven by NCEP reanalysis. We find that most NCEP-driven NARCCAP models do exhibit tail dependence between daily model output and observations. Furthermore, we find that not all extreme precipitation events are pineapple express events, as identified by Dettinger et al. (2011). The synoptic-scale atmospheric processes that drive extreme precipitation events produced by PE have only recently begun to be examined. Much of the current work has focused on pattern recognition, rather than quantitative analysis. We use daily mean sea-level pressure (MSLP) fields from NCEP to develop a "pineapple express index" for extreme precipitation, which exhibits tail dependence with our observed precipitation quantity for pineapple express events. We build a statistical model that connects daily precipitation output from the WRFG model, daily MSLP fields from NCEP, and daily observed precipitation in the western US. Finally, we use this model to simulate future observed precipitation based on WRFG output driven by the CCSM model, and our pineapple express index derived from future CCSM output. Our aim is to use this model to develop a better understanding of the frequency and intensity of extreme precipitation events.
Hypoglycemia Early Alarm Systems Based On Multivariable Models
Turksoy, Kamuran; Bayrak, Elif S; Quinn, Lauretta; Littlejohn, Elizabeth; Rollins, Derrick; Cinar, Ali
2013-01-01
Hypoglycemia is a major challenge of artificial pancreas systems and a source of concern for potential users and parents of young children with Type 1 diabetes (T1D). Early alarms to warn the potential of hypoglycemia are essential and should provide enough time to take action to avoid hypoglycemia. Many alarm systems proposed in the literature are based on interpretation of recent trends in glucose values. In the present study, subject-specific recursive linear time series models are introduced as a better alternative to capture glucose variations and predict future blood glucose concentrations. These models are then used in hypoglycemia early alarm systems that notify patients to take action to prevent hypoglycemia before it happens. The models developed and the hypoglycemia alarm system are tested retrospectively using T1D subject data. A Savitzky-Golay filter and a Kalman filter are used to reduce noise in patient data. The hypoglycemia alarm algorithm is developed by using predictions of future glucose concentrations from recursive models. The modeling algorithm enables the dynamic adaptation of models to inter-/intra-subject variation and glycemic disturbances and provides satisfactory glucose concentration prediction with relatively small error. The alarm systems demonstrate good performance in prediction of hypoglycemia and ultimately in prevention of its occurrence. PMID:24187436
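Subject-specific recursive linear time series models of the kind described above can be sketched as recursive least squares with a forgetting factor, which adapts the model as new glucose readings arrive. The AR(2) toy signal below stands in for filtered glucose data; the forgetting factor and model order are illustrative choices, not the paper's:

```python
import numpy as np

class RecursiveLS:
    """Recursive least squares with forgetting factor: each new sample
    updates the coefficient vector w, discounting old data by lam."""
    def __init__(self, order, lam=0.98):
        self.w = np.zeros(order)
        self.P = np.eye(order) * 1000.0   # large initial covariance
        self.lam = lam

    def predict(self, phi):
        return phi @ self.w

    def update(self, phi, y):
        k = self.P @ phi / (self.lam + phi @ self.P @ phi)
        self.w = self.w + k * (y - phi @ self.w)
        self.P = (self.P - np.outer(k, phi @ self.P)) / self.lam

# toy use: one-step-ahead prediction of a stable AR(2) signal
rng = np.random.default_rng(2)
x = np.zeros(300)
for t in range(2, 300):
    x[t] = 1.6 * x[t-1] - 0.64 * x[t-2] + rng.normal(scale=0.1)

model = RecursiveLS(order=2)
errs = []
for t in range(2, 300):
    phi = np.array([x[t-1], x[t-2]])
    errs.append(x[t] - model.predict(phi))   # predict before updating
    model.update(phi, x[t])
print(np.mean(np.abs(errs[-100:])))          # late errors near the noise level
```

An alarm layer would then threshold multi-step-ahead predictions from the adapted model against a hypoglycemia limit, which is the role the alarm algorithm plays in the study.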
Modeling a multivariable reactor and on-line model predictive control.
Yu, D W; Yu, D L
2005-10-01
A nonlinear first-principles model is developed for a laboratory-scale multivariable chemical reactor rig in this paper, and on-line model predictive control (MPC) is implemented on the rig. The reactor has three variables (temperature, pH, and dissolved oxygen) with nonlinear dynamics and is therefore used as a pilot system for the biochemical industry. A nonlinear discrete-time model is derived for each of the three output variables and their model parameters are estimated from real data using an adaptive optimization method. The developed model is used in a nonlinear MPC scheme. An accurate multistep-ahead prediction is obtained for MPC, where the extended Kalman filter is used to estimate unknown system states. The on-line control is implemented and a satisfactory tracking performance is achieved. The MPC is compared with three decentralized PID controllers and the advantage of the nonlinear MPC over the PID is clearly shown. PMID:16294779
Storm Water Management Model Climate Adjustment Tool (SWMM-CAT)
The US EPA’s newest tool, the Stormwater Management Model (SWMM) – Climate Adjustment Tool (CAT) is meant to help municipal stormwater utilities better address potential climate change impacts affecting their operations. SWMM, first released in 1971, models hydrology and hydrauli...
Unified Model for Academic Competence, Social Adjustment, and Psychopathology.
ERIC Educational Resources Information Center
Schaefer, Earl S.; And Others
A unified conceptual model is needed to integrate the extensive research on (1) social competence and adaptive behavior, (2) converging conceptualizations of social adjustment and psychopathology, and (3) emerging concepts and measures of academic competence. To develop such a model, a study was conducted in which teacher ratings were collected on…
IRT-ZIP Modeling for Multivariate Zero-Inflated Count Data
ERIC Educational Resources Information Center
Wang, Lijuan
2010-01-01
This study introduces an item response theory-zero-inflated Poisson (IRT-ZIP) model to investigate psychometric properties of multiple items and predict individuals' latent trait scores for multivariate zero-inflated count data. In the model, two link functions are used to capture two processes of the zero-inflated count data. Item parameters are…
Computer-Aided Decisions in Human Services: Expert Systems and Multivariate Models.
ERIC Educational Resources Information Center
Sicoly, Fiore
1989-01-01
This comparison of two approaches to the development of computerized supports for decision making--expert systems and multivariate models--focuses on computerized systems that assist professionals with tasks related to diagnosis or classification in human services. Validation of both expert systems and statistical models is emphasized. (39…
MULTIVARIATE RECEPTOR MODELS-CURRENT PRACTICE AND FUTURE TRENDS. (R826238)
Multivariate receptor models have been applied to the analysis of air quality data for some time. However, solving the general mixture problem is important in several other fields. This paper looks at the panoply of these models with a view to identifying common challenges and ...
Multivariate crash modeling for motor vehicle and non-motorized modes at the macroscopic level.
Lee, Jaeyoung; Abdel-Aty, Mohamed; Jiang, Ximiao
2015-05-01
Macroscopic traffic crash analyses have been conducted to incorporate traffic safety into long-term transportation planning. This study aims at developing a multivariate Poisson lognormal conditional autoregressive model at the macroscopic level for crashes by different transportation modes such as motor vehicle, bicycle, and pedestrian crashes. Many previous studies have shown the presence of common unobserved factors across different crash types. Thus, it was expected that adopting multivariate model structure would show a better modeling performance since it can capture shared unobserved features across various types. The multivariate model and univariate model were estimated based on traffic analysis zones (TAZs) and compared. It was found that the multivariate model significantly outperforms the univariate model. It is expected that the findings from this study can contribute to more reliable traffic crash modeling, especially when focusing on different modes. Also, variables that are found significant for each mode can be used to guide traffic safety policy decision makers to allocate resources more efficiently for the zones with higher risk of a particular transportation mode. PMID:25790973
Meta-Analytic Structural Equation Modeling (MASEM): Comparison of the Multivariate Methods
ERIC Educational Resources Information Center
Zhang, Ying
2011-01-01
Meta-analytic Structural Equation Modeling (MASEM) has drawn interest from many researchers recently. In doing MASEM, researchers usually first synthesize correlation matrices across studies using meta-analysis techniques and then analyze the pooled correlation matrix using structural equation modeling techniques. Several multivariate methods of…
Multivariate modeling of settling depth of apple fruit (Red Delicious variety) in water.
Kheiralipour, Kamran; Marzbani, Farshid
2016-03-01
Settling depth of apple was determined using a water column and a digital camera. The depth was modeled experimentally via multivariate regression using a program coded in MATLAB software. The best models were based on density, dropping height, and volume/mass, with a coefficient of determination and mean square error of 0.90 and 4.08, respectively. PMID:27004104
NASA Astrophysics Data System (ADS)
Alih, Ekele; Ong, Hong Choon
2014-07-01
The application of Ordinary Least Squares (OLS) to a single equation assumes, among other things, that the predictor variables are truly exogenous; that is, there is only one-way causation between the dependent variable yi and the predictor variables xij. If this is not true and the xij's are at the same time determined by yi, the OLS assumption will be violated and a single-equation method will give biased and inconsistent parameter estimates. OLS also suffers a huge setback in the presence of contaminated data. In order to rectify these problems, simultaneous equation models have been introduced, as has robust regression. In this paper, we construct a simultaneous equation model with variables that exhibit simultaneous dependence and propose a robust multivariate regression procedure for estimating the parameters of such models. The performance of the robust multivariate regression procedure was examined and compared with the OLS multivariate regression technique and the Three-Stage Least Squares (3SLS) procedure using a numerical simulation experiment. The robust multivariate regression and 3SLS performed approximately equally well, and both performed better than OLS, when there was no contamination in the data. Nevertheless, when contamination occurred in the data, the robust multivariate regression outperformed both 3SLS and OLS.
A Multivariate Model of Stakeholder Preference for Lethal Cat Management
Wald, Dara M.; Jacobson, Susan K.
2014-01-01
Identifying stakeholder beliefs and attitudes is critical for resolving management conflicts. Debate over outdoor cat management is often described as a conflict between two groups, environmental advocates and animal welfare advocates, but little is known about the variables predicting differences among these critical stakeholder groups. We administered a mail survey to randomly selected stakeholders representing both of these groups (n = 1,596) in Florida, where contention over the management of outdoor cats has been widespread. We used a structural equation model to evaluate stakeholder intention to support non-lethal management. The cognitive hierarchy model predicted that values influenced beliefs, which predicted general and specific attitudes, which in turn, influenced behavioral intentions. We posited that specific attitudes would mediate the effect of general attitudes, beliefs, and values on management support. Model fit statistics suggested that the final model fit the data well (CFI = 0.94, RMSEA = 0.062). The final model explained 74% of the variance in management support, and positive attitudes toward lethal management (humaneness) had the largest direct effect on management support. Specific attitudes toward lethal management and general attitudes toward outdoor cats mediated the relationship between positive (p<0.05) and negative cat-related impact beliefs (p<0.05) and support for management. These results supported the specificity hypothesis and the use of the cognitive hierarchy to assess stakeholder intention to support non-lethal cat management. Our findings suggest that stakeholders can simultaneously perceive both positive and negative beliefs about outdoor cats, which influence attitudes toward and support for non-lethal management. PMID:24736744
Chen, Xiaohong; Fan, Yanqin; Pouzo, Demian; Ying, Zhiliang
2010-07-01
We study estimation and model selection of semiparametric models of multivariate survival functions for censored data, which are characterized by possibly misspecified parametric copulas and nonparametric marginal survivals. We obtain the consistency and root-n asymptotic normality of a two-step copula estimator to the pseudo-true copula parameter value according to KLIC, and provide a simple consistent estimator of its asymptotic variance, allowing for a first-step nonparametric estimation of the marginal survivals. We establish the asymptotic distribution of the penalized pseudo-likelihood ratio statistic for comparing multiple semiparametric multivariate survival functions subject to copula misspecification and general censorship. An empirical application is provided. PMID:24790286
Modelling household finances: A Bayesian approach to a multivariate two-part model
Brown, Sarah; Ghosh, Pulak; Su, Li; Taylor, Karl
2016-01-01
We contribute to the empirical literature on household finances by introducing a Bayesian multivariate two-part model, which has been developed to further our understanding of household finances. Our flexible approach allows for the potential interdependence between the holding of assets and liabilities at the household level and also encompasses a two-part process to allow for differences in the influences on asset or liability holding and on the respective amounts held. Furthermore, the framework is dynamic in order to allow for persistence in household finances over time. Our findings endorse the joint modelling approach and provide evidence supporting the importance of dynamics. In addition, we find that certain independent variables exert different influences on the binary and continuous parts of the model thereby highlighting the flexibility of our framework and revealing a detailed picture of the nature of household finances. PMID:27212801
Chen, Gang; Adleman, Nancy E.; Saad, Ziad S.; Leibenluft, Ellen; Cox, Robert W.
2014-01-01
All neuroimaging packages can handle group analysis with t-tests or general linear modeling (GLM). However, they are quite hamstrung when there are multiple within-subject factors or when quantitative covariates are involved in the presence of a within-subject factor. In addition, sphericity is typically assumed for the variance–covariance structure when there are more than two levels in a within-subject factor. To overcome such limitations in the traditional AN(C)OVA and GLM, we adopt a multivariate modeling (MVM) approach to analyzing neuroimaging data at the group level with the following advantages: a) there is no limit on the number of factors as long as sample sizes are deemed appropriate; b) quantitative covariates can be analyzed together with within-subject factors; c) when a within-subject factor is involved, three testing methodologies are provided: traditional univariate testing (UVT) with sphericity assumption (UVT-UC) and with correction when the assumption is violated (UVT-SC), and within-subject multivariate testing (MVT-WS); d) to correct for sphericity violation at the voxel level, we propose a hybrid testing (HT) approach that achieves equal or higher power via combining traditional sphericity correction methods (Greenhouse–Geisser and Huynh–Feldt) with MVT-WS. PMID:24954281
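The Greenhouse–Geisser correction mentioned above rescales degrees of freedom by an epsilon computed from the within-subject variance–covariance matrix. A minimal numpy sketch of that factor (our own illustration, not the package's implementation):

```python
import numpy as np

def gg_epsilon(S):
    """Greenhouse-Geisser epsilon from a k x k within-subject covariance matrix.

    Epsilon equals 1 under sphericity and is bounded below by 1/(k-1);
    repeated-measures F degrees of freedom are multiplied by it.
    """
    k = S.shape[0]
    # Double-center the covariance matrix (project out the grand mean).
    J = np.eye(k) - np.ones((k, k)) / k
    Sc = J @ S @ J
    lam = np.linalg.eigvalsh(Sc)
    return (lam.sum() ** 2) / ((k - 1) * (lam ** 2).sum())

# Compound symmetry (sphericity holds) -> epsilon equals 1.
S_cs = 0.5 * np.ones((4, 4)) + 0.5 * np.eye(4)
print(round(gg_epsilon(S_cs), 3))  # 1.0
```

For covariance structures that violate sphericity, the function returns a value below 1, shrinking the degrees of freedom as the correction intends.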
NASA Astrophysics Data System (ADS)
Grujic, O.; Caers, J.
2014-12-01
Modern approaches to uncertainty quantification in the subsurface rely on complex procedures of geological modeling combined with numerical simulation of flow & transport. This approach requires long computational times, rendering any full Monte Carlo simulation infeasible; in particular, solving the flow & transport problem takes hours of computing time in real field problems. This has motivated the development of model selection methods that aim to identify a small subset of models capturing important statistics of a larger ensemble of geological model realizations. A recent method based on model selection in metric space, termed the distance-kernel method (DKM), allows selecting representative models through kernel k-medoid clustering. The distance defining the metric space is usually based on some approximate flow model. However, the output of an approximate flow model can be multivariate (reporting heads/pressures, saturations, rates). In addition, the modeler may have information from several other approximate models (e.g. upscaled models) or summary statistical information about geological heterogeneity that could allow for a more accurate selection. In an effort to perform model selection based on multivariate attributes, we rely on functional data analysis, which allows for an exploitation of covariances between time-varying multivariate numerical simulation outputs. Based on mixed functional principal component analysis, we construct a lower-dimensional space in which kernel k-medoid clustering is used for model selection. In this work we demonstrate the functional approach on a complex compositional flow problem where the geological uncertainty consists of channels with uncertain spatial distribution of facies, proportions, orientations and geometries. We illustrate that using multivariate attributes and multiple approximate models provides an accuracy improvement over using a single attribute.
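The k-medoid selection step can be sketched with a plain k-medoids pass over a precomputed distance matrix; the toy "flow responses" below are synthetic stand-ins for approximate-model output, not the paper's simulator or its kernel construction:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy stand-in for approximate-flow output: 30 realizations in 3 groups,
# each summarized by a multivariate curve (2 quantities x 50 time steps).
curves = np.concatenate([
    rng.normal(loc=m, scale=0.3, size=(10, 2, 50)) for m in (0.0, 1.0, 2.0)
])
flat = curves.reshape(len(curves), -1)
D = np.linalg.norm(flat[:, None, :] - flat[None, :, :], axis=-1)  # metric space

def k_medoids(D, k, n_iter=20):
    """Voronoi-iteration k-medoids on a precomputed distance matrix."""
    # Farthest-point initialization keeps the sketch deterministic.
    medoids = [0]
    for _ in range(k - 1):
        medoids.append(int(np.argmax(D[:, medoids].min(axis=1))))
    medoids = np.array(medoids)
    for _ in range(n_iter):
        labels = np.argmin(D[:, medoids], axis=1)
        # Each cluster's new medoid minimizes total distance to its members.
        new = np.array([
            np.flatnonzero(labels == j)[
                np.argmin(D[np.ix_(labels == j, labels == j)].sum(axis=0))
            ]
            for j in range(k)
        ])
        if np.array_equal(new, medoids):
            break
        medoids = new
    return medoids, labels

medoids, labels = k_medoids(D, 3)
print("representative realizations:", sorted(medoids.tolist()))
```

The returned medoid indices are the representative realizations that would be passed on to the expensive full flow simulation.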
Modeling inflation rates and exchange rates in Ghana: application of multivariate GARCH models.
Nortey, Ezekiel Nn; Ngoh, Delali D; Doku-Amponsah, Kwabena; Ofori-Boateng, Kenneth
2015-01-01
This paper investigated the volatility and conditional relationships among inflation rates, exchange rates and interest rates, and constructed models using the multivariate GARCH DCC and BEKK specifications with Ghanaian data from January 1990 to December 2013. The study revealed that the cumulative depreciation of the cedi against the US dollar from 1990 to 2013 was 7,010.2%, and the yearly weighted depreciation of the cedi against the US dollar for the period was 20.4%. There was evidence that a stable inflation rate does not imply that exchange rates and interest rates will be stable. Rather, when the cedi performs well on the forex market, inflation rates and interest rates react positively and become stable in the long run. The BEKK model is robust for modelling and forecasting the volatility of inflation rates, exchange rates and interest rates. The DCC model is robust for modelling the conditional and unconditional correlations among inflation rates, exchange rates and interest rates. The BEKK model, which forecasted high exchange rate volatility for 2014, is very robust for modelling exchange rates in Ghana. The mean equation of the DCC model is also robust for forecasting inflation rates in Ghana. PMID:25741459
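The DCC and BEKK specifications build on univariate GARCH dynamics. As a self-contained, hedged illustration (synthetic data, not the Ghana series or a multivariate fit), here is a GARCH(1,1) simulation showing the volatility clustering these models capture:

```python
import numpy as np

rng = np.random.default_rng(8)

# Simulate GARCH(1,1): sigma2_t = w + a*r_{t-1}^2 + b*sigma2_{t-1}.
w, a, b = 0.05, 0.10, 0.85
T = 5000
r = np.zeros(T)
s2 = np.full(T, w / (1 - a - b))  # start at the unconditional variance
for t in range(1, T):
    s2[t] = w + a * r[t - 1] ** 2 + b * s2[t - 1]
    r[t] = np.sqrt(s2[t]) * rng.normal()

def acf1(x):
    """Lag-1 sample autocorrelation."""
    x = x - x.mean()
    return (x[:-1] @ x[1:]) / (x @ x)

# Volatility clustering: squared returns are autocorrelated although the
# returns themselves are (approximately) uncorrelated.
print("ACF(1) returns:", round(acf1(r), 3), "squared returns:", round(acf1(r ** 2), 3))
```

The DCC and BEKK models extend exactly this recursion to several series at once, adding time-varying cross-series covariance dynamics.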
Modeling of turbulent supersonic H2-air combustion with a multivariate beta PDF
NASA Technical Reports Server (NTRS)
Baurle, R. A.; Hassan, H. A.
1993-01-01
Recent calculations of turbulent supersonic reacting shear flows using an assumed multivariate beta PDF (probability density function) resulted in reduced production rates and a delay in the onset of combustion. This result is not consistent with available measurements. The present research explores two possible reasons for this behavior: use of PDFs that do not yield Favre-averaged quantities, and the gradient diffusion assumption. A new multivariate beta PDF involving species densities is introduced which makes it possible to compute Favre-averaged mass fractions. However, using this PDF did not improve comparisons with experiment. A countergradient diffusion model is then introduced. Preliminary calculations suggest this to be the cause of the discrepancy.
Using Bibliotherapy to Help Children Adjust to Changing Role Models.
ERIC Educational Resources Information Center
Pardeck, John T.; Pardeck, Jean A.
One technique for helping children adjust to changing role models is bibliotherapy--the use of children's books to facilitate identification with and exploration of sex role behavior. Confronted with change in various social systems, particularly the family, children are faced with conflicts concerning their sex role development. The process…
Catastrophe, Chaos, and Complexity Models and Psychosocial Adjustment to Disability.
ERIC Educational Resources Information Center
Parker, Randall M.; Schaller, James; Hansmann, Sandra
2003-01-01
Rehabilitation professionals may unknowingly rely on stereotypes and specious beliefs when dealing with people with disabilities, despite the formulation of theories that suggest new models of the adjustment process. Suggests that Catastrophe, Chaos, and Complexity Theories hold considerable promise in this regard. This article reviews these…
NASA Astrophysics Data System (ADS)
Leeds, W. B.; Wikle, C. K.
2012-12-01
Spatio-temporal statistical models, and in particular Bayesian hierarchical models (BHMs), have become increasingly popular as means of representing natural processes such as climate and weather that evolve over space and time. Hierarchical models make it possible to specify separate, conditional probability distributions that account for uncertainty in the observations, the underlying process, and parameters in situations when specifying these sources of uncertainty in a joint probability distribution may be difficult. As a result, BHMs are a natural setting for climatologists, meteorologists, and other environmental scientists to incorporate scientific information (e.g., PDEs, IDEs, etc.) a priori into a rigorous statistical framework that accounts for error in measurements, uncertainty in the understanding of the true underlying process, and uncertainty in the parameters that describe the process. While much work has been done in the development of statistical models for linear dynamic spatio-temporal processes, statistical modeling for nonlinear (and particularly, multivariate nonlinear) spatio-temporal dynamical processes is still a relatively open area of inquiry. As a result, the availability of general statistical models with which environmental scientists can model complicated nonlinear processes is limited. We address this limitation in the methodology by introducing a multivariate "general quadratic nonlinear" framework for modeling multivariate, nonlinear spatio-temporal random processes inside of a BHM in a way that is especially applicable for problems in the ocean and atmospheric sciences. We show that in addition to the fact that this model addresses the previously mentioned sources of uncertainty for a wide spectrum of multivariate, nonlinear spatio-temporal processes, it is also a natural framework for data assimilation, allowing for the fusing of observations with computer models, computer model emulators, computer model output, or "mechanistically motivated" statistical models.
ERIC Educational Resources Information Center
Tchumtchoua, Sylvie; Dey, Dipak K.
2012-01-01
This paper proposes a semiparametric Bayesian framework for the analysis of associations among multivariate longitudinal categorical variables in high-dimensional data settings. This type of data is frequent, especially in the social and behavioral sciences. A semiparametric hierarchical factor analysis model is developed in which the…
A Multivariate Model of Parent-Adolescent Relationship Variables in Early Adolescence
ERIC Educational Resources Information Center
McKinney, Cliff; Renk, Kimberly
2011-01-01
Given the importance of predicting outcomes for early adolescents, this study examines a multivariate model of parent-adolescent relationship variables, including parenting, family environment, and conflict. Participants, who completed measures assessing these variables, included 710 culturally diverse 11-14-year-olds who were attending a middle…
ERIC Educational Resources Information Center
MacIntosh, Randall
1997-01-01
Presents KANT, a FORTRAN 77 software program that tests assumptions of multivariate normality in a data set. Based on the test developed by M. V. Mardia (1985), the KANT program is useful for those engaged in structural equation modeling with latent variables. (SLD)
A General Multivariate Latent Growth Model with Applications to Student Achievement
ERIC Educational Resources Information Center
Bianconcini, Silvia; Cagnone, Silvia
2012-01-01
The evaluation of the formative process in the University system has been assuming an ever increasing importance in the European countries. Within this context, the analysis of student performance and capabilities plays a fundamental role. In this work, the authors propose a multivariate latent growth model for studying the performances of a…
Tracking Problem Solving by Multivariate Pattern Analysis and Hidden Markov Model Algorithms
ERIC Educational Resources Information Center
Anderson, John R.
2012-01-01
Multivariate pattern analysis can be combined with Hidden Markov Model algorithms to track the second-by-second thinking as people solve complex problems. Two applications of this methodology are illustrated with a data set taken from children as they interacted with an intelligent tutoring system for algebra. The first "mind reading" application…
The Dirichlet-Multinomial Model for Multivariate Randomized Response Data and Small Samples
ERIC Educational Resources Information Center
Avetisyan, Marianna; Fox, Jean-Paul
2012-01-01
In survey sampling the randomized response (RR) technique can be used to obtain truthful answers to sensitive questions. Although the individual answers are masked due to the RR technique, individual (sensitive) response rates can be estimated when observing multivariate response data. The beta-binomial model for binary RR data will be generalized…
A Multivariate Multilevel Approach to the Modeling of Accuracy and Speed of Test Takers
ERIC Educational Resources Information Center
Klein Entink, R. H.; Fox, J. P.; van der Linden, W. J.
2009-01-01
Response times on test items are easily collected in modern computerized testing. When collecting both (binary) responses and (continuous) response times on test items, it is possible to measure the accuracy and speed of test takers. To study the relationships between these two constructs, the model is extended with a multivariate multilevel…
Web-Based Tools for Modelling and Analysis of Multivariate Data: California Ozone Pollution Activity
ERIC Educational Resources Information Center
Dinov, Ivo D.; Christou, Nicolas
2011-01-01
This article presents a hands-on web-based activity motivated by the relation between human health and ozone pollution in California. This case study is based on multivariate data collected monthly at 20 locations in California between 1980 and 2006. Several strategies and tools for data interrogation and exploratory data analysis, model fitting…
Multivariate Radiological-Based Models for the Prediction of Future Knee Pain: Data from the OAI
Galván-Tejada, Jorge I.; Celaya-Padilla, José M.; Treviño, Victor; Tamez-Peña, José G.
2015-01-01
In this work, the potential of X-ray-based multivariate prognostic models to predict the onset of chronic knee pain is presented. Using quantitative X-ray image assessments of joint-space width (JSW) and paired semiquantitative central X-ray scores from the Osteoarthritis Initiative (OAI), a case-control study is presented. The pain assessments of the right knee at the baseline and 60-month visits were used to screen for case/control subjects. Scores were analyzed at the time of pain incidence (T-0), the year prior to incidence (T-1), and two years before pain incidence (T-2). Multivariate models were created with a cross-validated, elastic-net-regularized generalized linear model feature selection tool. Univariate differences between cases and controls were reported by AUC, C-statistics, and odds ratios. Univariate analysis indicated that medial osteophytes were significantly more prevalent in cases than controls: C-stat 0.62, 0.62, and 0.61 at T-0, T-1, and T-2, respectively. The multivariate JSW models significantly predicted pain: AUC = 0.695, 0.623, and 0.620 at T-0, T-1, and T-2, respectively. Semiquantitative multivariate models predicted pain with C-stat = 0.671, 0.648, and 0.645 at T-0, T-1, and T-2, respectively. Multivariate models derived from plain X-ray radiography assessments may be used to identify subjects at risk of developing knee pain. PMID:26504490
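The elastic-net selection used in the OAI study relies on specialized cross-validation tooling; as a hedged sketch of the penalty itself, here is a minimal proximal-gradient elastic net on synthetic data (all variable names and settings are illustrative, not the study's):

```python
import numpy as np

rng = np.random.default_rng(2)
n, p = 120, 30
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]  # only 3 informative predictors
y = X @ beta_true + rng.normal(scale=0.5, size=n)

def elastic_net(X, y, lam=0.3, alpha=0.5, n_iter=2000):
    """Proximal gradient for 0.5/n*||y-Xb||^2 + lam*(alpha*||b||_1 + (1-alpha)/2*||b||_2^2)."""
    n, p = X.shape
    L = np.linalg.norm(X, 2) ** 2 / n  # Lipschitz constant of the smooth part
    step = 1.0 / (L + lam * (1 - alpha))
    b = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / n + lam * (1 - alpha) * b
        z = b - step * grad
        b = np.sign(z) * np.maximum(np.abs(z) - step * lam * alpha, 0.0)  # soft threshold
    return b

b = elastic_net(X, y)
selected = np.flatnonzero(np.abs(b) > 1e-8)
print("selected predictors:", selected.tolist())
```

The l1 part zeroes out uninformative predictors while the l2 part stabilizes correlated ones, which is the feature-selection behavior the study exploits.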
Multivariable Model for Time to First Treatment in Patients With Chronic Lymphocytic Leukemia
Wierda, William G.; O'Brien, Susan; Wang, Xuemei; Faderl, Stefan; Ferrajoli, Alessandra; Do, Kim-Anh; Garcia-Manero, Guillermo; Cortes, Jorge; Thomas, Deborah; Koller, Charles A.; Burger, Jan A.; Lerner, Susan; Schlette, Ellen; Abruzzo, Lynne; Kantarjian, Hagop M.; Keating, Michael J.
2011-01-01
Purpose The clinical course for patients with chronic lymphocytic leukemia (CLL) is diverse; some patients have indolent disease, never needing treatment, whereas others have aggressive disease requiring early treatment. We continue to use criteria for active disease to initiate therapy. Multivariable analysis was performed to identify prognostic factors independently associated with time to first treatment for patients with CLL. Patients and Methods Traditional laboratory, clinical prognostic, and newer prognostic factors such as fluorescent in situ hybridization (FISH), IGHV mutation status, and ZAP-70 expression evaluated at first patient visit to MD Anderson Cancer Center were correlated by multivariable analysis with time to first treatment. This multivariable model was used to develop a nomogram—a weighted tool to calculate 2- and 4-year probability of treatment and estimate median time to first treatment. Results There were 930 previously untreated patients who had traditional and new prognostic factors evaluated; they did not have active CLL requiring initiation of treatment within 3 months of first visit and were observed for time to first treatment. The following were independently associated with shorter time to first treatment: three involved lymph node sites, increased size of cervical lymph nodes, presence of 17p deletion or 11q deletion by FISH, increased serum lactate dehydrogenase, and unmutated IGHV mutation status. Conclusion We developed a multivariable model that incorporates traditional and newer prognostic factors to identify patients at high risk for progression to treatment. This model may be useful to identify patients for early interventional trials. PMID:21969505
Development of a charge adjustment model for cardiac catheterization.
Brennan, Andrew; Gauvreau, Kimberlee; Connor, Jean; O'Connell, Cheryl; David, Sthuthi; Almodovar, Melvin; DiNardo, James; Banka, Puja; Mayer, John E; Marshall, Audrey C; Bergersen, Lisa
2015-02-01
A methodology that would allow for comparison of charges across institutions has not been developed for catheterization in congenital heart disease. A single institution catheterization database with prospectively collected case characteristics was linked to hospital charges related and limited to an episode of care in the catheterization laboratory for fiscal years 2008-2010. Catheterization charge categories (CCC) were developed to group types of catheterization procedures using a combination of empiric data and expert consensus. A multivariable model with outcome charges was created using CCC and additional patient and procedural characteristics. In 3 fiscal years, 3,839 cases were available for analysis. Forty catheterization procedure types were categorized into 7 CCC yielding a grouper variable with an R² explanatory value of 72.6%. In the final CCC, the largest proportion of cases was in CCC 2 (34%), which included diagnostic cases without intervention. Biopsy cases were isolated in CCC 1 (12%), and percutaneous pulmonary valve placement alone made up CCC 7 (2%). The final model included CCC, number of interventions, and cardiac diagnosis (R² = 74.2%). Additionally, current financial metrics such as APR-DRG severity of illness and case mix index demonstrated a lack of correlation with CCC. We have developed a catheterization procedure type financial grouper that accounts for the diverse case population encountered in catheterization for congenital heart disease. CCC and our multivariable model could be used to understand financial characteristics of a population at a single point in time, longitudinally, and to compare populations. PMID:25113520
NASA Astrophysics Data System (ADS)
Ng, W.; Rasmussen, P. F.; Panu, U. S.
2009-12-01
Stochastic weather modeling is subject to a number of challenges, including varied spatial dependency and the existence of missing observations. Daily precipitation possesses unique distributional characteristics, such as a high frequency of zero records and high skewness in the distribution of precipitation amounts. To address these difficulties, a methodology based on the multivariate truncated Normal distribution model is proposed. The methodology transforms the skewed distribution of precipitation amounts at multiple sites into a multivariate Normal distribution model. The missing observations are then estimated through the conditional mean and variance obtained from the multivariate Normal distribution model. The adequacy of the proposed model structure was first verified using a synthetic data set. Subsequently, 30 years of historical daily precipitation records from 10 Canadian meteorological stations were used to evaluate the performance of the model. The results of the evaluation show that the proposed model can reasonably preserve the statistical characteristics of the historical records when estimating the missing records at multiple sites.
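The estimation step described above is the standard conditional-Gaussian computation. The sketch below covers only the Gaussian conditioning (with made-up site parameters), not the transformation and truncation that handle zero-precipitation records:

```python
import numpy as np

# Toy 3-site field in the transformed (Gaussian) domain: illustrative values only.
mu = np.array([1.0, 0.5, 0.8])
Sigma = np.array([[1.0, 0.6, 0.3],
                  [0.6, 1.0, 0.5],
                  [0.3, 0.5, 1.0]])

def conditional_normal(mu, Sigma, obs_idx, obs_val):
    """Mean and covariance of the missing block given the observed block."""
    miss_idx = [i for i in range(len(mu)) if i not in obs_idx]
    S11 = Sigma[np.ix_(miss_idx, miss_idx)]
    S12 = Sigma[np.ix_(miss_idx, obs_idx)]
    S22 = Sigma[np.ix_(obs_idx, obs_idx)]
    K = S12 @ np.linalg.inv(S22)  # regression of missing sites on observed sites
    cond_mu = mu[miss_idx] + K @ (obs_val - mu[obs_idx])
    cond_Sigma = S11 - K @ S12.T
    return cond_mu, cond_Sigma

# Site 0 is missing; sites 1 and 2 were observed.
m, C = conditional_normal(mu, Sigma, [1, 2], np.array([1.2, 0.9]))
print("imputed mean:", m, "conditional variance:", C)
```

Conditioning on the observed sites both shifts the imputed mean toward the observed anomalies and shrinks the variance below the marginal value, which is exactly what makes multi-site imputation more accurate than site-by-site filling.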
Applying the multivariate time-rescaling theorem to neural population models
Gerhard, Felipe; Haslinger, Robert; Pipa, Gordon
2011-01-01
Statistical models of neural activity are integral to modern neuroscience. Recently, interest has grown in modeling the spiking activity of populations of simultaneously recorded neurons to study the effects of correlations and functional connectivity on neural information processing. However any statistical model must be validated by an appropriate goodness-of-fit test. Kolmogorov-Smirnov tests based upon the time-rescaling theorem have proven to be useful for evaluating point-process-based statistical models of single-neuron spike trains. Here we discuss the extension of the time-rescaling theorem to the multivariate (neural population) case. We show that even in the presence of strong correlations between spike trains, models which neglect couplings between neurons can be erroneously passed by the univariate time-rescaling test. We present the multivariate version of the time-rescaling theorem, and provide a practical step-by-step procedure for applying it towards testing the sufficiency of neural population models. Using several simple analytically tractable models and also more complex simulated and real data sets, we demonstrate that important features of the population activity can only be detected using the multivariate extension of the test. PMID:21395436
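The univariate time-rescaling test that the multivariate extension builds on can be sketched as follows (synthetic inhomogeneous Poisson data and our own notation; the population version applies analogous rescaling jointly across neurons):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)

# Inhomogeneous Poisson spike train with intensity lambda(t) = 5 + 4*sin(t),
# simulated by thinning a rate-9 homogeneous process.
T = 200.0
lam_max = 9.0
cand = np.cumsum(rng.exponential(1 / lam_max, size=4000))
cand = cand[cand < T]
keep = rng.uniform(size=cand.size) < (5 + 4 * np.sin(cand)) / lam_max
spikes = cand[keep]

# Time-rescaling: Lambda(t) = integral of lambda = 5t + 4(1 - cos t).
Lam = 5 * spikes + 4 * (1 - np.cos(spikes))
tau = np.diff(Lam)        # rescaled intervals: Exp(1) if the model is correct
u = 1 - np.exp(-tau)      # further transformed to Uniform(0, 1)
ks = stats.kstest(u, "uniform")
print("KS statistic:", ks.statistic)
```

Because the spikes were generated from the same intensity used for rescaling, the KS statistic is small; a misspecified intensity (or, in the multivariate case, ignored couplings) would inflate it.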
Multivariate Calibration Models for Sorghum Composition using Near-Infrared Spectroscopy
Wolfrum, E.; Payne, C.; Stefaniak, T.; Rooney, W.; Dighe, N.; Bean, B.; Dahlberg, J.
2013-03-01
NREL developed calibration models based on near-infrared (NIR) spectroscopy coupled with multivariate statistics to predict compositional properties relevant to cellulosic biofuels production for a variety of sorghum cultivars. A robust calibration population was developed in an iterative fashion. The quality of models developed using the same sample geometry on two different types of NIR spectrometers and two different sample geometries on the same spectrometer did not vary greatly.
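A minimal NIPALS PLS1 calibration on synthetic "spectra" illustrates the kind of NIR-plus-multivariate-statistics model described above; NREL's actual instruments, cultivars, and data are not reproduced here, and all settings are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic spectra: two latent constituents with smooth overlapping bands.
wav = np.linspace(0, 1, 200)
band1 = np.exp(-((wav - 0.3) ** 2) / 0.01)
band2 = np.exp(-((wav - 0.6) ** 2) / 0.02)
c1 = rng.uniform(0.5, 2.0, size=80)  # property of interest
c2 = rng.uniform(0.5, 2.0, size=80)  # interfering constituent
X = np.outer(c1, band1) + np.outer(c2, band2) + rng.normal(scale=0.01, size=(80, 200))
y = c1

def pls1_fit(X, y, n_comp):
    """PLS1 via NIPALS; returns the regression vector in the original X space."""
    Xc, yc = X - X.mean(0), y - y.mean()
    W, P, Q = [], [], []
    for _ in range(n_comp):
        w = Xc.T @ yc
        w /= np.linalg.norm(w)
        t = Xc @ w                    # score
        p = Xc.T @ t / (t @ t)        # X loading
        q = (yc @ t) / (t @ t)        # y loading
        Xc = Xc - np.outer(t, p)      # deflate
        yc = yc - q * t
        W.append(w); P.append(p); Q.append(q)
    W, P, Q = np.array(W).T, np.array(P).T, np.array(Q)
    B = W @ np.linalg.solve(P.T @ W, Q)
    return B, X.mean(0), y.mean()

B, xm, ym = pls1_fit(X, y, n_comp=2)
pred = (X - xm) @ B + ym
r2 = 1 - np.sum((y - pred) ** 2) / np.sum((y - y.mean()) ** 2)
print("calibration R^2:", round(r2, 3))
```

Two latent components suffice here because the synthetic spectra contain exactly two constituents; real NIR calibrations choose the component count by cross-validation.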
Kay, D; McDonald, A
1983-01-01
This paper reports on the calibration and use of a multiple regression model designed to predict concentrations of Escherichia coli and total coliforms in two upland British impoundments. The multivariate approach has improved predictive capability over previous univariate linear models because it includes predictor variables for the timing and magnitude of hydrological input to the reservoirs and physiochemical parameters of water quality. The significance of these results for catchment management research is considered. PMID:6639016
ERIC Educational Resources Information Center
Pakenham, Kenneth I.; Samios, Christina; Sofronoff, Kate
2005-01-01
The present study examined the applicability of the double ABCX model of family adjustment in explaining maternal adjustment to caring for a child diagnosed with Asperger syndrome. Forty-seven mothers completed questionnaires at a university clinic while their children were participating in an anxiety intervention. The children were aged between…
NASA Astrophysics Data System (ADS)
Ghanate, A. D.; Kothiwale, S.; Singh, S. P.; Bertrand, Dominique; Krishna, C. Murali
2011-02-01
Cancer is now recognized as one of the major causes of morbidity and mortality. Histopathological diagnosis, the gold standard, is shown to be subjective, time consuming, prone to interobserver disagreement, and often fails to predict prognosis. Optical spectroscopic methods are being contemplated as adjuncts or alternatives to conventional cancer diagnostics. The most important aspect of these approaches is their objectivity, and multivariate statistical tools play a major role in realizing it. However, rigorous evaluation of the robustness of spectral models is a prerequisite. The utility of Raman spectroscopy in the diagnosis of cancers has been well established. Until now, the specificity and applicability of spectral models have been evaluated for specific cancer types. In this study, we have evaluated the utility of spectroscopic models representing normal and malignant tissues of the breast, cervix, colon, larynx, and oral cavity in a broader perspective, using different multivariate tests. The limit test, which was used in our earlier study, gave high sensitivity but suffered from poor specificity. The performance of other methods such as factorial discriminant analysis and partial least square discriminant analysis are at par with more complex nonlinear methods such as decision trees, but they provide very little information about the classification model. This comparative study thus demonstrates not just the efficacy of Raman spectroscopic models but also the applicability and limitations of different multivariate tools for discrimination under complex conditions such as the multicancer scenario.
NASA Astrophysics Data System (ADS)
Golay, Jean; Kanevski, Mikhaïl
2013-04-01
The present research deals with the exploration and modeling of a complex dataset of 200 measurement points of sediment pollution by heavy metals in Lake Geneva. The fundamental idea was to use multivariate Artificial Neural Networks (ANN) along with geostatistical models and tools in order to improve the accuracy and the interpretability of data modeling. The results obtained with ANN were compared to those of traditional geostatistical algorithms like ordinary (co)kriging and (co)kriging with an external drift. Exploratory data analysis highlighted a great variety of relationships (i.e. linear, non-linear, independence) between the 11 variables of the dataset (i.e. Cadmium, Mercury, Zinc, Copper, Titanium, Chromium, Vanadium and Nickel as well as the spatial coordinates of the measurement points and their depth). Then, exploratory spatial data analysis (i.e. anisotropic variography, local spatial correlations and moving window statistics) was carried out. It was shown that the different phenomena to be modeled were characterized by high spatial anisotropies, complex spatial correlation structures and heteroscedasticity. A feature selection procedure based on General Regression Neural Networks (GRNN) was also applied to create subsets of variables that improve the predictions during the modeling phase. The basic modeling was conducted using a Multilayer Perceptron (MLP), which is a workhorse of ANN. MLP models are robust and highly flexible tools which can incorporate in a nonlinear manner different kinds of high-dimensional information. In the present research, the input layer was made of either two (spatial coordinates) or three neurons (when depth as auxiliary information could possibly capture an underlying trend) and the output layer was composed of one (univariate MLP) to eight neurons corresponding to the heavy metals of the dataset (multivariate MLP). MLP models with three input neurons can be referred to as Artificial Neural Networks with EXternal drift.
Vickers, Andrew J; Cronin, Angel M; Kattan, Michael W; Gonen, Mithat; Scardino, Peter T; Milowsky, Matthew I.; Dalbagni, Guido; Bochner, Bernard H.
2009-01-01
Background Multivariable prediction models have been shown to predict cancer outcomes more accurately than cancer stage. The effects on clinical management are unclear. We aimed to determine whether a published multivariable prediction model for bladder cancer (“bladder nomogram”) improves medical decision making, using referral for adjuvant chemotherapy as a model. Methods We analyzed data from an international cohort study of 4462 patients undergoing cystectomy without chemotherapy 1969 – 2004. The number of patients eligible for chemotherapy was determined using pathologic stage criteria (lymph node positive or stage pT3 or pT4), and for three cut-offs on the bladder nomogram (10%, 25% and 70% risk of recurrence with surgery alone). The number of recurrences was calculated by applying a relative risk reduction to eligible patients' baseline risk. Clinical net benefit was then calculated by combining recurrences and treatments, weighting the latter by a factor related to drug tolerability. Results A nomogram cut-off outperformed pathologic stage for chemotherapy for every scenario of drug effectiveness and tolerability. For a drug with a relative risk of 0.80, where clinicians would treat no more than 20 patients to prevent one recurrence, use of the nomogram was equivalent to a strategy that resulted in 60 fewer chemotherapy treatments per 1000 patients without any increase in recurrence rates. Conclusions Referring cystectomy patients to adjuvant chemotherapy on the basis of a multivariable model is likely to lead to better patient outcomes than the use of pathological stage. Further research is warranted to evaluate the clinical effects of multivariable prediction models. PMID:19823979
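The weighting of treatments against recurrences described above is the net-benefit calculation of decision-curve analysis; the following is a hedged numeric sketch with synthetic, well-calibrated risks (not the cystectomy cohort, and the "stage" rule is a crude stand-in):

```python
import numpy as np

def net_benefit(risk, outcome, threshold):
    """Decision-curve net benefit of treating everyone with risk >= threshold.

    False positives are weighted by the threshold odds, which encodes how many
    treatments a clinician will accept per event prevented.
    """
    n = len(outcome)
    treat = risk >= threshold
    tp = np.sum(treat & (outcome == 1))
    fp = np.sum(treat & (outcome == 0))
    return tp / n - fp / n * threshold / (1 - threshold)

rng = np.random.default_rng(6)
n = 2000
# Continuous model risk, generated so that it is well calibrated.
risk = rng.uniform(0.01, 0.9, size=n)
outcome = (rng.uniform(size=n) < risk).astype(int)
# Binary "stage-like" rule: collapses the same information to treat/no-treat.
stage_rule = (risk >= 0.5).astype(float)

for pt in (0.10, 0.25):
    print(pt, net_benefit(risk, outcome, pt), net_benefit(stage_rule, outcome, pt))
```

At thresholds below the binary rule's implicit cut-off, the continuous model captures additional true positives at an acceptable false-positive cost, which is the mechanism behind the abstract's "fewer treatments without more recurrences" result.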
2014-01-01
Background Before considering whether to use a multivariable (diagnostic or prognostic) prediction model, it is essential that its performance be evaluated in data that were not used to develop the model (referred to as external validation). We critically appraised the methodological conduct and reporting of external validation studies of multivariable prediction models. Methods We conducted a systematic review of articles describing some form of external validation of one or more multivariable prediction models indexed in PubMed core clinical journals published in 2010. Study data were extracted in duplicate on design, sample size, handling of missing data, reference to the original study developing the prediction models and predictive performance measures. Results 11,826 articles were identified and 78 were included for full review, which described the evaluation of 120 prediction models in participant data that were not used to develop the model. Thirty-three articles described both the development of a prediction model and an evaluation of its performance on a separate dataset, and 45 articles described only the evaluation of an existing published prediction model on another dataset. Fifty-seven percent of the prediction models were presented and evaluated as simplified scoring systems. Sixteen percent of articles failed to report the number of outcome events in the validation datasets. Fifty-four percent of studies made no explicit mention of missing data. Sixty-seven percent did not report evaluating model calibration whilst most studies evaluated model discrimination. It was often unclear whether the reported performance measures were for the full regression model or for the simplified models. Conclusions The vast majority of studies describing some form of external validation of a multivariable prediction model were poorly reported, with key details frequently not presented. The validation studies were characterised by poor design and inappropriate handling of missing data.
Hieke, Stefanie; Benner, Axel; Schlenk, Richard F.; Schumacher, Martin; Bullinger, Lars; Binder, Harald
2016-01-01
Clinical cohorts with time-to-event endpoints are increasingly characterized by measurements of a number of single nucleotide polymorphisms that is by a magnitude larger than the number of measurements typically considered at the gene level. At the same time, the size of clinical cohorts often is still limited, calling for novel analysis strategies for identifying potentially prognostic SNPs that can help to better characterize disease processes. We propose such a strategy, drawing on univariate testing ideas from epidemiological case-control studies on the one hand, and multivariable regression techniques as developed for gene expression data on the other hand. In particular, we focus on stable selection of a small set of SNPs and corresponding genes for subsequent validation. For univariate analysis, a permutation-based approach is proposed to test at the gene level. We use regularized multivariable regression models for considering all SNPs simultaneously and selecting a small set of potentially important prognostic SNPs. Stability is judged according to resampling inclusion frequencies for both the univariate and the multivariable approach. The overall strategy is illustrated with data from a cohort of acute myeloid leukemia patients and explored in a simulation study. The multivariable approach is seen to automatically focus on a smaller set of SNPs compared to the univariate approach, roughly in line with blocks of correlated SNPs. This more targeted extraction of SNPs results in more stable selection at the SNP as well as at the gene level. Thus, the multivariable regression approach with resampling provides a perspective in the proposed analysis strategy for SNP data in clinical cohorts highlighting what can be added by regularized regression techniques compared to univariate analyses. PMID:27159447
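The key stability device in the abstract, resampling inclusion frequencies, can be sketched in a few lines. The selection step below uses univariate correlation ranking as a simple stand-in for the paper's regularized multivariable regression, and the toy data are hypothetical:

```python
import random

def pearson(x, y):
    n = len(x)
    mx, my = sum(x)/n, sum(y)/n
    sxy = sum((a - mx)*(b - my) for a, b in zip(x, y))
    sxx = sum((a - mx)**2 for a in x)
    syy = sum((b - my)**2 for b in y)
    return sxy / (sxx*syy)**0.5

def inclusion_frequencies(X, y, top_k=1, n_resamples=100, seed=7):
    """Share of half-samples in which each predictor ranks in the top_k
    by |correlation| with the outcome (a univariate stand-in for the
    paper's regularized selection step)."""
    rng = random.Random(seed)
    n, p = len(y), len(X[0])
    counts = [0]*p
    for _ in range(n_resamples):
        idx = rng.sample(range(n), n//2)
        ys = [y[i] for i in idx]
        scores = [abs(pearson([X[i][j] for i in idx], ys)) for j in range(p)]
        for j in sorted(range(p), key=lambda k: -scores[k])[:top_k]:
            counts[j] += 1
    return [c/n_resamples for c in counts]

# Toy cohort: only the first of three "SNP scores" drives the outcome.
gen = random.Random(42)
X = [[gen.gauss(0, 1) for _ in range(3)] for _ in range(40)]
y = [row[0] + 0.1*gen.gauss(0, 1) for row in X]
freqs = inclusion_frequencies(X, y)
```

Predictors whose inclusion frequency stays high across resamples are the stable candidates put forward for validation.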
Chen, Gang; Adleman, Nancy E; Saad, Ziad S; Leibenluft, Ellen; Cox, Robert W
2014-10-01
All neuroimaging packages can handle group analysis with t-tests or general linear modeling (GLM). However, they are quite hamstrung when there are multiple within-subject factors or when quantitative covariates are involved in the presence of a within-subject factor. In addition, sphericity is typically assumed for the variance-covariance structure when there are more than two levels in a within-subject factor. To overcome such limitations in the traditional AN(C)OVA and GLM, we adopt a multivariate modeling (MVM) approach to analyzing neuroimaging data at the group level with the following advantages: a) there is no limit on the number of factors as long as sample sizes are deemed appropriate; b) quantitative covariates can be analyzed together with within-subject factors; c) when a within-subject factor is involved, three testing methodologies are provided: traditional univariate testing (UVT) with sphericity assumption (UVT-UC) and with correction when the assumption is violated (UVT-SC), and within-subject multivariate testing (MVT-WS); d) to correct for sphericity violation at the voxel level, we propose a hybrid testing (HT) approach that achieves equal or higher power via combining traditional sphericity correction methods (Greenhouse-Geisser and Huynh-Feldt) with MVT-WS. To validate the MVM methodology, we performed simulations to assess the control of false positives and the power achieved. A real FMRI dataset was analyzed to demonstrate the capability of the MVM approach. The methodology has been implemented into an open source program 3dMVM in AFNI, and all the statistical tests can be performed through symbolic coding with variable names instead of the tedious process of dummy coding. Our data indicate that the severity of sphericity violation varies substantially across brain regions. The differences among various modeling methodologies were addressed through direct comparisons between the MVM approach and some of the GLM implementations in
Franco-Pedroso, Javier; Ramos, Daniel; Gonzalez-Rodriguez, Joaquin
2016-01-01
In forensic science, trace evidence found at a crime scene and on a suspect has to be evaluated from the measurements performed on them, usually in the form of multivariate data (for example, several chemical compounds or physical characteristics). In order to assess the strength of that evidence, the likelihood ratio framework is being increasingly adopted. Several methods have been derived in order to obtain likelihood ratios directly from univariate or multivariate data by modelling both the variation appearing between observations (or features) coming from the same source (within-source variation) and that appearing between observations coming from different sources (between-source variation). In the widely used multivariate kernel likelihood-ratio, the within-source distribution is assumed to be normally distributed and constant among different sources and the between-source variation is modelled through a kernel density function (KDF). In order to better fit the observed distribution of the between-source variation, this paper presents a different approach in which a Gaussian mixture model (GMM) is used instead of a KDF. As it will be shown, this approach provides better-calibrated likelihood ratios as measured by the log-likelihood ratio cost (Cllr) in experiments performed on freely available forensic datasets involving different types of trace evidence: inks, glass fragments and car paints. PMID:26901680
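The KDF-versus-GMM contrast is easiest to see in one dimension. A minimal sketch, with a hypothetical evidence score, background sample, and within-source model (all parameter values below are assumptions for illustration, not fitted values from the paper):

```python
import math

def normal_pdf(x, mu, sigma):
    return math.exp(-0.5*((x - mu)/sigma)**2) / (sigma*math.sqrt(2*math.pi))

def kdf_pdf(x, samples, h):
    """Between-source density as a kernel density function (KDF):
    one Gaussian kernel of bandwidth h per background observation."""
    return sum(normal_pdf(x, s, h) for s in samples) / len(samples)

def gmm_pdf(x, weights, mus, sigmas):
    """Between-source density as a Gaussian mixture model (GMM):
    a small number of fitted components instead of one kernel per point."""
    return sum(w*normal_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas))

# Hypothetical 1-D evidence score and background measurements.
evidence = 1.2
background = [0.1, 0.4, 2.1, 2.5, 2.8]
within = normal_pdf(evidence, 1.0, 0.3)            # assumed within-source model
lr_kdf = within / kdf_pdf(evidence, background, 0.5)
lr_gmm = within / gmm_pdf(evidence, [0.4, 0.6], [0.3, 2.5], [0.3, 0.4])
```

In both cases the likelihood ratio is the within-source density at the evidence value divided by the between-source density; the paper's point is that a fitted mixture can track the between-source distribution more faithfully than a kernel estimate, yielding better-calibrated ratios.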
A multivariate conditional model for streamflow prediction and spatial precipitation refinement
NASA Astrophysics Data System (ADS)
Liu, Zhiyong; Zhou, Ping; Chen, Xiuzhi; Guan, Yinghui
2015-10-01
The effective prediction and estimation of hydrometeorological variables are important for water resources planning and management. In this study, we propose a multivariate conditional model for streamflow prediction and the refinement of spatial precipitation estimates. This model consists of high dimensional vine copulas, conditional bivariate copula simulations, and a quantile-copula function. The vine copula is employed because of its flexibility in modeling the high dimensional joint distribution of multivariate data by building a hierarchy of conditional bivariate copulas. We investigate two cases to evaluate the performance and applicability of the proposed approach. In the first case, we generate one month ahead streamflow forecasts that incorporate multiple predictors including antecedent precipitation and streamflow records in a basin located in South China. The prediction accuracy of the vine-based model is compared with that of traditional data-driven models such as the support vector regression (SVR) and the adaptive neuro-fuzzy inference system (ANFIS). The results indicate that the proposed model produces more skillful forecasts than SVR and ANFIS. Moreover, this probabilistic model yields additional information concerning the predictive uncertainty. The second case involves refining spatial precipitation estimates derived from the Tropical Rainfall Measuring Mission (TRMM) precipitation product for the Yangtze River basin by incorporating remotely sensed soil moisture data and the observed precipitation from meteorological gauges over the basin. The validation results indicate that the proposed model successfully refines the spatial precipitation estimates. Although this model is tested for specific cases, it can be extended to other hydrometeorological variables for predictions and spatial estimations.
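The conditional-bivariate-copula step that vines are built from can be sketched with a Gaussian pair-copula (an assumption for illustration; the paper's pair-copulas need not be Gaussian, and a full vine chains many such steps):

```python
import math
from statistics import NormalDist

def cond_quantile_gaussian_copula(u1, rho, q):
    """q-th quantile of U2 given U1 = u1 under a Gaussian pair-copula
    with correlation rho. Conditioning on the predictor's percentile u1
    and sweeping q traces out the full predictive distribution."""
    nd = NormalDist()
    z1 = nd.inv_cdf(u1)
    z2 = rho*z1 + math.sqrt(1.0 - rho*rho)*nd.inv_cdf(q)
    return nd.cdf(z2)

# Hypothetical: antecedent precipitation at its 90th percentile shifts
# the conditional median of next-month streamflow percentile upward.
median_flow_pct = cond_quantile_gaussian_copula(0.9, 0.7, 0.5)
```

Because the whole conditional distribution is available (vary q from 0 to 1), the forecast carries its own predictive uncertainty, which is the advantage over point-forecast models like SVR and ANFIS noted in the abstract.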
NASA Astrophysics Data System (ADS)
Rupšys, P.
2015-10-01
A system of stochastic differential equations (SDE) with mixed-effects parameters and a multivariate normal copula density function was used to develop a tree height model for Scots pine trees in Lithuania. A two-step maximum likelihood parameter estimation method is used and computational guidelines are given. After fitting the conditional probability density functions to outside bark diameter at breast height, and total tree height, a bivariate normal copula distribution model was constructed. Predictions from the mixed-effects parameters SDE tree height model calculated during this research were compared to the regression tree height equations. The results are implemented in the symbolic computational language MAPLE.
Bradshaw, Sam; Mason, Shelley S.; Looft, Fred J.
2010-01-01
Tactile sensation is a complex manifestation of mechanical stimuli applied to the skin. At the most fundamental level of the somatosensory system is the cutaneous mechanoreceptor. The objective here was to establish a framework for modeling afferent mechanoreceptor behavior as a nanoscale biosensor under dynamic compressive loads using multivariate regression techniques. A multivariate logistical model was chosen because the system contains continuous input variables and a singular binary-output variable corresponding to the nerve action potential. Subsequently, this method was used to quantify the sensitivity of ten rapidly adapting afferents from rat hairy skin due to the stimulus metrics of compressive stress, strain, their respective time derivatives, and interactions. In vitro experiments involving compressive stimulation of isolated afferents using pseudorandom and nonrepeating noise sequences were completed. An analysis of the data was performed using multivariate logistical regression producing odds ratios (ORs) as a metric associated with mechanotransduction. It was determined that cutaneous mechanoreceptors are preferentially sensitive to stress (mean ORmax = 26.10), stress rate (mean ORmax = 15.03), strain (mean ORmax = 12.01), and strain rate (mean ORmax = 7.29) typically occurring within 7.3 ms of the nerve response. As a novel approach to receptor characterization, this analytical framework was validated for the multiple-input, binary-output neural system. PMID:21197157
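The continuous-input, binary-output setup maps directly onto logistic regression, where each fitted coefficient exponentiates into an odds ratio. A one-predictor sketch on hypothetical stimulus/spike data (real analyses would use a stats package and report confidence intervals for the ORs):

```python
import math

def fit_logistic(x, y, lr=0.5, iters=2000):
    """One-predictor logistic regression by batch gradient ascent on the
    log-likelihood (illustrative; not the authors' fitting procedure)."""
    b0 = b1 = 0.0
    n = len(x)
    for _ in range(iters):
        g0 = g1 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0/(1.0 + math.exp(-(b0 + b1*xi)))
            g0 += yi - p
            g1 += (yi - p)*xi
        b0 += lr*g0/n
        b1 += lr*g1/n
    return b0, b1

# Hypothetical data: higher compressive stress makes a spike more likely.
stress = [0.1, 0.3, 0.5, 0.9, 1.1, 1.4, 1.8, 2.0]
spike = [0, 0, 0, 0, 1, 1, 1, 1]
b0, b1 = fit_logistic(stress, spike)
odds_ratio = math.exp(b1)   # odds multiplier per unit increase in stress
```

An odds ratio above 1 means the stimulus metric raises the spiking odds, which is how the paper's ORmax values rank stress, stress rate, strain, and strain rate.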
Forecasting of municipal solid waste quantity in a developing country using multivariate grey models
Intharathirat, Rotchana; Abdul Salam, P.; Kumar, S.; Untong, Akarapong
2015-05-15
Highlights: • Grey model can be used to forecast MSW quantity accurately with the limited data. • Prediction interval overcomes the uncertainty of MSW forecast effectively. • A multivariate model gives accuracy associated with factors affecting MSW quantity. • Population, urbanization, employment and household size play a role in MSW quantity. - Abstract: In order to plan, manage and use municipal solid waste (MSW) in a sustainable way, accurate forecasting of MSW generation and composition plays a key role. It is difficult to obtain reliable estimates using existing models due to the limited data available in developing countries. This study aims to forecast MSW collected in Thailand with prediction intervals over the long term by using an optimized multivariate grey model, a mathematical approach. For multivariate models, the representative factors of residential and commercial sectors affecting waste collected are identified, classified and quantified based on statistics and mathematics of grey system theory. Results show that GMC (1, 5), the grey model with convolution integral, is the most accurate with the least error of 1.16% MAPE. MSW collected would increase 1.40% per year from 43,435–44,994 tonnes per day in 2013 to 55,177–56,735 tonnes per day in 2030. This model also illustrates that population density is the most important factor affecting MSW collected, followed by urbanization, proportion of employment and household size, respectively. This means that the representative factors of the commercial sector may affect MSW collected more than those of the residential sector. Results can help decision makers to develop measures and policies of waste management over the long term.
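The grey-model mechanics behind forecasts like these are simplest in the univariate GM(1,1): accumulate the series, fit a first-order difference equation by least squares, then difference the fitted response back. The paper's GMC(1,5) adds four driving factors and a convolution term; the sketch below, on a hypothetical series, shows only the basic family member:

```python
import math

def gm11_forecast(x, steps=1):
    """GM(1,1) grey forecast of the next `steps` values of series x.
    Assumes the fitted development coefficient a is nonzero."""
    n = len(x)
    x1 = [sum(x[:i+1]) for i in range(n)]               # 1-AGO accumulation
    z = [0.5*(x1[k] + x1[k-1]) for k in range(1, n)]    # background values
    y = x[1:]
    # Least squares for x0(k) = -a*z(k) + b via 2x2 normal equations.
    m = len(z)
    szz, sz = sum(v*v for v in z), sum(z)
    sy, szy = sum(y), sum(v*w for v, w in zip(z, y))
    det = szz*m - sz*sz
    a = (sz*sy - m*szy) / det
    b = (szz*sy - sz*szy) / det
    def x1_hat(k):                                      # 0-based index
        return (x[0] - b/a)*math.exp(-a*k) + b/a
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]

# Hypothetical series growing about 10% per step.
hist = [100.0, 110.0, 121.0, 133.1]
nxt = gm11_forecast(hist, 1)[0]   # close to the true continuation 146.41
```

Grey models need only a handful of observations, which is exactly why they suit the data-poor setting the abstract describes.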
Multivariate model to characterise relations between maize mutant starches and hydrolysis kinetics.
Kansou, Kamal; Buléon, Alain; Gérard, Catherine; Rolland-Sabaté, Agnès
2015-11-20
Many studies of amylolysis have collected considerable information on the contribution of starch physico-chemical properties. However, the elaborate and variable structure of granular starch, and consequently the multifactorial nature of the system, hinders the interpretation of the experimental results. The immediate benefit of multivariate statistical analysis in this regard is twofold: it considers the factors, possibly interrelated, all together rather than independently, and it provides a first estimation of the magnitude and confidence level of the relations between factors and amylolysis kinetic parameters. Based on data from amylolysis of 13 starch samples from wild type, single and double mutants of maize by porcine pancreatic α-amylase (PPA), a multivariate analysis is proposed. Amylolysis progress-curves were fitted by a Weibull function, as proposed in a previous work, to extract three kinetic parameters: the reaction rate coefficient during the first time-unit, k, the reaction rate retardation over time, h, and the final hydrolysis extent, X∞. Multivariate models relate the macromolecular composition and the fractions of crystalline polymorphic types to the kinetic parameters. h and X∞ are found to be highly related to the measured properties. Thus the amylose content appears to be significantly correlated to the hydrolysis rate retardation, which sheds some light on the probable contribution of the amylose molecules contained in the granules. The multivariate models give correct prediction performances except for k, part of whose variability remains unexplained. A further analysis indicates the extent of granule-structure characterisation needed to extend the fraction of explained variability. PMID:26344307
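The three kinetic parameters come from fitting a Weibull progress curve. One common parameterization is sketched below; this exact form is an assumption for illustration, and the paper's expression may differ in detail:

```python
import math

def weibull_progress(t, x_inf, k, h):
    """Hydrolysis extent at time t under an assumed Weibull form
    X(t) = X_inf*(1 - exp(-k*t**(1 - h))), with 0 <= h < 1:
    k sets the first-time-unit rate, h the retardation over time,
    and X_inf the final plateau."""
    return x_inf * (1.0 - math.exp(-k * t**(1.0 - h)))
```

Note that at t = 1 the extent depends only on k and X∞ (t**(1-h) = 1), matching k's role as the first-time-unit rate coefficient, while larger h slows all later progress toward the plateau.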
Modarres, Reza; Ouarda, Taha B M J; Vanasse, Alain; Orzanco, Maria Gabriela; Gosselin, Pierre
2014-07-01
Changes in extreme meteorological variables and the demographic shift towards an older population have made it important to investigate the association of climate variables and hip fracture by advanced methods in order to determine the climate variables that most affect hip fracture incidence. The nonlinear autoregressive moving average with exogenous variable-generalized autoregressive conditional heteroscedasticity (ARMAX-GARCH) and multivariate GARCH (MGARCH) time series approaches were applied to investigate the nonlinear association between hip fracture rate in female and male patients aged 40-74 and 75+ years and climate variables in the period of 1993-2004, in Montreal, Canada. The models describe 50-56% of daily variation in hip fracture rate and identify snow depth, air temperature, day length and air pressure as the influencing variables on the time-varying mean and variance of the hip fracture rate. The conditional covariance between climate variables and hip fracture rate is increasing exponentially, showing that the effect of climate variables on hip fracture rate is most acute when rates are high and climate conditions are at their worst. In Montreal, climate variables, particularly snow depth and air temperature, appear to be important predictors of hip fracture incidence. The association of climate variables and hip fracture does not seem to change linearly with time, but increases exponentially under harsh climate conditions. The results of this study can be used to provide an adaptive climate-related public health program and to guide allocation of services for avoiding hip fracture risk. PMID:23722925
NASA Astrophysics Data System (ADS)
Libera, D.; Arumugam, S.
2015-12-01
Water quality observations are usually not available on a continuous basis because of the expense and labor requirements, so calibrating and validating a mechanistic model is often difficult. Further, any model predictions inherently have bias (i.e., under/over estimation) and require techniques that preserve the long-term mean monthly attributes. This study suggests and compares two multivariate bias-correction techniques to improve the performance of the SWAT model in predicting daily streamflow and TN loads across the Southeast based on split-sample validation. The first approach is a dimension reduction technique, canonical correlation analysis, that regresses the observed multivariate attributes on the SWAT model simulated values. The second approach is from signal processing, importance weighting, which applies a weight based on the ratio of the observed and model densities to the model data to shift the mean, variance, and cross-correlation towards the observed values. These procedures were applied to 3 watersheds chosen from the Water Quality Network in the Southeast Region; specifically watersheds with sufficiently large drainage areas and numbers of observed data points. The performance of these two approaches is also compared with independent estimates from the USGS LOADEST model. Uncertainties in the bias-corrected estimates due to limited water quality observations are also discussed.
Multi-variate models are essential for understanding vertebrate diversification in deep time
Benson, Roger B. J.; Mannion, Philip D.
2012-01-01
Statistical models are helping palaeontologists to elucidate the history of biodiversity. Sampling standardization has been extensively applied to remedy the effects of uneven sampling in large datasets of fossil invertebrates. However, many vertebrate datasets are smaller, and the issue of uneven sampling has commonly been ignored, or approached using pairwise comparisons with a numerical proxy for sampling effort. Although most authors find a strong correlation between palaeodiversity and sampling proxies, weak correlation is recorded in some datasets. This has led several authors to conclude that uneven sampling does not influence our view of vertebrate macroevolution. We demonstrate that multi-variate regression models incorporating a model of underlying biological diversification, as well as a sampling proxy, fit observed sauropodomorph dinosaur palaeodiversity best. This bivariate model is a better fit than separate univariate models, and illustrates that observed palaeodiversity is a composite pattern, representing a biological signal overprinted by variation in sampling effort. Multi-variate models and other approaches that consider sampling as an essential component of palaeodiversity are central to gaining a more complete understanding of deep time vertebrate diversification. PMID:21697163
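The core move in the abstract, regressing observed palaeodiversity on a sampling proxy plus a model of underlying diversification rather than on either alone, is ordinary multiple regression. A self-contained sketch on hypothetical counts (the coefficients and data are invented for illustration):

```python
def ols(X, y):
    """Multiple linear regression via normal equations and Gaussian
    elimination with partial pivoting (fine for a handful of predictors)."""
    p = len(X[0])
    A = [[sum(row[i]*row[j] for row in X) for j in range(p)] for i in range(p)]
    b = [sum(row[i]*yi for row, yi in zip(X, y)) for i in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        b[c], b[piv] = b[piv], b[c]
        for r in range(c + 1, p):
            f = A[r][c]/A[c][c]
            for j in range(c, p):
                A[r][j] -= f*A[c][j]
            b[r] -= f*b[c]
    beta = [0.0]*p
    for r in range(p - 1, -1, -1):
        beta[r] = (b[r] - sum(A[r][j]*beta[j] for j in range(r + 1, p)))/A[r][r]
    return beta

# Hypothetical: diversity = 2 + 1.5*sampling_proxy + 0.5*biological_signal.
sampling = [1.0, 2.0, 3.0, 4.0, 5.0]
signal = [2.0, 1.0, 4.0, 3.0, 5.0]
rows = [[1.0, s, g] for s, g in zip(sampling, signal)]
diversity = [2.0 + 1.5*s + 0.5*g for s, g in zip(sampling, signal)]
beta = ols(rows, diversity)   # intercept, sampling effect, biological effect
```

A nonzero coefficient on both predictors is the "composite pattern" the authors describe: a biological signal overprinted by variation in sampling effort.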
NASA Astrophysics Data System (ADS)
Ghosh, S.; Reichenbach, P.; Rossi, M.; Guzzetti, F.; van Westen, C.; Carranza, E. J. M.
2009-04-01
The results of multivariate landslide statistical susceptibility models are highly sensitive to the type of statistical and spatial distribution of the mass movement used as grouping variable, and to the type of geofactors used as explanatory variables. Different classifications of the landslide data set can result in different model performance and validation fit. Using discriminant analysis (DA) and logistic regression (LR) models, we prepared different landslide susceptibility zonations for a study area around Kurseong town in the Darjeeling Himalaya region, Eastern India. To prepare the models, we used as training data set 342 shallow translational rock slides and 168 shallow translational debris slides, which occurred between 1968 and 2003. To validate the models we used a different set of landslides that occurred between 2004 and 2007. 62 relevant factors including morphometric and geo-environmental parameters were used as explanatory variables. We present and discuss the performance and the validation results of the landslide susceptibility zonation prepared with the two different statistical multivariate models using as grouping variables the rock slides data set, the debris slides data set, and the two data sets together. The discriminant analysis performs better than the logistic regression, and this is probably due to: a) lack of coherence in the selected training data set and the corresponding explanatory variables; b) landslide type classification problems; c) frequency distribution of landslide/no-landslide mapping units.
Haaland, David M.
1999-07-14
The analysis precision of any multivariate calibration method will be severely degraded if unmodeled sources of spectral variation are present in the unknown sample spectra. This paper describes a synthetic method for correcting for the errors generated by the presence of unmodeled components or other sources of unmodeled spectral variation. If the spectral shape of the unmodeled component can be obtained and mathematically added to the original calibration spectra, then a new synthetic multivariate calibration model can be generated to accommodate the presence of the unmodeled source of spectral variation. This new method is demonstrated for the presence of unmodeled temperature variations in the unknown sample spectra of dilute aqueous solutions of urea, creatinine, and NaCl. When constant-temperature PLS models are applied to spectra of samples of variable temperature, the standard errors of prediction (SEP) are approximately an order of magnitude higher than that of the original cross-validated SEPs of the constant-temperature partial least squares models. Synthetic models using the classical least squares estimates of temperature from pure water or variable-temperature mixture sample spectra reduce the errors significantly for the variable temperature samples. Spectrometer drift adds additional error to the analyte determinations, but a method is demonstrated that can minimize the effect of drift on prediction errors through the measurement of the spectra of a small subset of samples during both calibration and prediction. In addition, sample temperature can be predicted with high precision with this new synthetic model without the need to recalibrate using actual variable-temperature sample data. The synthetic methods eliminate the need for expensive generation of new calibration samples and collection of their spectra. The methods are quite general and can be applied using any known source of spectral variation and can be used with any multivariate
Fuzzy modeling with multivariate membership functions: gray-box identification and control design.
Abonyi, J; Babuska, R; Szeifert, F
2001-01-01
A novel framework for fuzzy modeling and model-based control design is described. The fuzzy model is of the Takagi-Sugeno (TS) type with constant consequents. It uses multivariate antecedent membership functions obtained by Delaunay triangulation of their characteristic points. The number and position of these points are determined by an iterative insertion algorithm. Constrained optimization is used to estimate the consequent parameters, where the constraints are based on control-relevant a priori knowledge about the modeled process. Finally, methods for control design through linearization and inversion of this model are developed. The proposed techniques are demonstrated by means of two benchmark examples: identification of the well-known Box-Jenkins gas furnace and inverse model-based control of a pH process. The obtained results are compared with results from the literature. PMID:18244840
Xu Chengjian; Schaaf, Arjen van der; Schilstra, Cornelis; Langendijk, Johannes A.; Veld, Aart A. van't
2012-03-15
Purpose: To study the impact of different statistical learning methods on the prediction performance of multivariate normal tissue complication probability (NTCP) models. Methods and Materials: In this study, three learning methods, stepwise selection, least absolute shrinkage and selection operator (LASSO), and Bayesian model averaging (BMA), were used to build NTCP models of xerostomia following radiotherapy treatment for head and neck cancer. Performance of each learning method was evaluated by a repeated cross-validation scheme in order to obtain a fair comparison among methods. Results: It was found that the LASSO and BMA methods produced models with significantly better predictive power than that of the stepwise selection method. Furthermore, the LASSO method yields an easily interpretable model as the stepwise method does, in contrast to the less intuitive BMA method. Conclusions: The commonly used stepwise selection method, which is simple to execute, may be insufficient for NTCP modeling. The LASSO method is recommended.
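The selection behaviour that distinguishes LASSO from stepwise methods is the soft-thresholding of coefficients, which shrinks weak predictors to exactly zero. A coordinate-descent sketch on an orthogonal toy design (an illustration of the estimator family, not the study's NTCP modeling pipeline):

```python
def soft_threshold(z, g):
    return z - g if z > g else z + g if z < -g else 0.0

def lasso_cd(X, y, lam, iters=100):
    """Coordinate-descent LASSO; assumes roughly standardized predictors.
    Each pass updates one coefficient against the partial residual."""
    n, p = len(X), len(X[0])
    beta = [0.0]*p
    for _ in range(iters):
        for j in range(p):
            resid = [yi - sum(X[i][k]*beta[k] for k in range(p) if k != j)
                     for i, yi in enumerate(y)]
            rho = sum(X[i][j]*resid[i] for i in range(n))/n
            denom = sum(X[i][j]**2 for i in range(n))/n
            beta[j] = soft_threshold(rho, lam)/denom
    return beta

# Only the first predictor matters; the L1 penalty shrinks it and
# zeroes the second outright, leaving an interpretable sparse model.
X = [[1.0, 1.0], [-1.0, 1.0], [1.0, -1.0], [-1.0, -1.0]]
y = [2.0, -2.0, 2.0, -2.0]
beta = lasso_cd(X, y, lam=0.5)
```

This built-in shrinkage is what gives LASSO its better predictive power over stepwise selection while keeping the model as readable as a stepwise one.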
NASA Technical Reports Server (NTRS)
MCKissick, Burnell T. (Technical Monitor); Plassman, Gerald E.; Mall, Gerald H.; Quagliano, John R.
2005-01-01
Linear multivariable regression models for predicting day and night Eddy Dissipation Rate (EDR) from available meteorological data sources are defined and validated. Model definition is based on a combination of 1997-2000 Dallas/Fort Worth (DFW) data sources, EDR from Aircraft Vortex Spacing System (AVOSS) deployment data, and regression variables primarily from corresponding Automated Surface Observation System (ASOS) data. Model validation is accomplished through EDR predictions on a similar combination of 1994-1995 Memphis (MEM) AVOSS and ASOS data. Model forms include an intercept plus a single term of fixed optimal power for each of the following regression variables: 30-minute forward averaged mean and variance of near-surface wind speed and temperature, variance of wind direction, and a discrete cloud cover metric. Distinct day and night models, regressing on EDR and the natural log of EDR respectively, yield best performance and avoid model discontinuity over day/night data boundaries.
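The "intercept plus a single term of fixed optimal power" form can be sketched for one regression variable by grid-searching the exponent and fitting the rest by simple least squares (the candidate powers and the toy data below are assumptions for illustration):

```python
def fit_power_term(x, y, powers):
    """For each candidate exponent p, fit y ≈ b0 + b1*x**p by ordinary
    least squares and keep the power with the smallest SSE."""
    best = None
    n = len(x)
    for p in powers:
        t = [xi**p for xi in x]
        mt, my = sum(t)/n, sum(y)/n
        b1 = sum((a - mt)*(c - my) for a, c in zip(t, y)) / \
             sum((a - mt)**2 for a in t)
        b0 = my - b1*mt
        sse = sum((b0 + b1*a - c)**2 for a, c in zip(t, y))
        if best is None or sse < best[0]:
            best = (sse, p, b0, b1)
    return best[1:]   # (power, intercept, slope)

# Toy "EDR" that is exactly 1 + 2*sqrt(wind): power 0.5 should win.
wind = [1.0, 2.0, 3.0, 4.0, 5.0]
edr = [1.0 + 2.0*w**0.5 for w in wind]
power, b0, b1 = fit_power_term(wind, edr, [0.25, 0.5, 1.0, 2.0])
```

The full models repeat this per variable and, for the night model, replace y with log(EDR) before fitting.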
Inouye, David I.; Ravikumar, Pradeep; Dhillon, Inderjit S.
2016-01-01
We develop Square Root Graphical Models (SQR), a novel class of parametric graphical models that provides multivariate generalizations of univariate exponential family distributions. Previous multivariate graphical models (Yang et al., 2015) did not allow positive dependencies for the exponential and Poisson generalizations. However, in many real-world datasets, variables clearly have positive dependencies. For example, the airport delay time in New York—modeled as an exponential distribution—is positively related to the delay time in Boston. With this motivation, we give an example of our model class derived from the univariate exponential distribution that allows for almost arbitrary positive and negative dependencies with only a mild condition on the parameter matrix—a condition akin to the positive definiteness of the Gaussian covariance matrix. Our Poisson generalization allows for both positive and negative dependencies without any constraints on the parameter values. We also develop parameter estimation methods using node-wise regressions with ℓ1 regularization and likelihood approximation methods using sampling. Finally, we demonstrate our exponential generalization on a synthetic dataset and a real-world dataset of airport delay times.
On the Bayesian Treed Multivariate Gaussian Process with Linear Model of Coregionalization
Konomi, Bledar A.; Karagiannis, Georgios; Lin, Guang
2015-02-01
The Bayesian treed Gaussian process (BTGP) has gained popularity in recent years because it provides a straightforward mechanism for modeling non-stationary data and can alleviate computational demands by fitting models to less data. The extension of BTGP to the multivariate setting requires us to model the cross-covariance and to propose efficient algorithms that can deal with trans-dimensional MCMC moves. In this paper we extend the cross-covariance of the Bayesian treed multivariate Gaussian process (BTMGP) to linear model of coregionalization (LMC) cross-covariances. Different strategies have been developed to improve the MCMC mixing and invert smaller matrices in the Bayesian inference. Moreover, we compare the proposed BTMGP with existing multiple-BTGP and BTMGP approaches in test cases and in a multiphase-flow computer experiment in a full-scale regenerator of a carbon capture unit. The BTMGP with LMC cross-covariance predicted the computer experiments better than existing competitors. The proposed model has a wide variety of applications, such as computer experiments and environmental data. In the case of computer experiments we also develop an adaptive sampling strategy for the BTMGP with LMC cross-covariance function.
An interface model for dosage adjustment connects hematotoxicity to pharmacokinetics.
Meille, C; Iliadis, A; Barbolosi, D; Frances, N; Freyer, G
2008-12-01
When modeling is required to describe pharmacokinetics and pharmacodynamics simultaneously, it is difficult to link time-concentration profiles and drug effects. For patients undergoing chemotherapy, despite the large number of blood count measurements collected during monitoring, there is a lack of exposure variables describing hematotoxicity linked with circulating drug blood levels. We developed an interface model that transforms circulating pharmacokinetic concentrations into adequate exposures, intended as inputs to the pharmacodynamic process. The model takes the form of a nonlinear differential equation involving three parameters. The relevance of the interface model for dosage adjustment is illustrated by numerous simulations. In particular, the interface model is incorporated into a complex system including pharmacokinetics and neutropenia induced by docetaxel and by cisplatin. Emphasis is placed on the sensitivity of neutropenia with respect to variations in the drug amount. This complex system including pharmacokinetic, interface, and pharmacodynamic hematotoxicity models is an interesting tool for the analysis of hematotoxicity induced by anticancer agents. The model could be a new basis for further improvements aimed at incorporating new experimental features. PMID:19107581
NASA Astrophysics Data System (ADS)
Gautam, N.; Rasmussen, P. F.
2005-05-01
Stochastic weather generators are frequently used in climate change studies to simulate input to hydrologic models. In this presentation, we focus on the particular problem of simulating daily precipitation at multiple stations in a region for which records are available. Daily precipitation is an intermittent process, highly variable in space, and typically has a strongly skewed distribution. A stochastic precipitation model should ideally preserve the regional pattern of intermittence, the autocorrelation, the cross-correlation, and the marginal distributions of observed precipitation. For this purpose, we employed a multivariate autoregressive model in which latent values below zero are treated as days with no rain. To preserve the marginal distributions of observed precipitation at the different stations, a prior transformation of the data was required. The presentation will describe the experience gained from applying the model to precipitation records in Canada. Focus will be on analytical model properties, methods of parameter estimation, and the preservation of observed statistics in the application.
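The scheme above — a latent multivariate autoregressive process, zero-censoring for dry days, and a transformation to skew the wet-day amounts — can be sketched as follows. The lag-1 matrix, cross-correlations, and power transform are illustrative stand-ins for the fitted quantities:

```python
import numpy as np

rng = np.random.default_rng(42)

def simulate_precip(n_days, A, Sigma, power=3.0):
    """Latent multivariate AR(1): z_t = A z_{t-1} + e_t, e_t ~ N(0, Sigma).
    Negative latent values become dry days; positive values are transformed
    (a power transform here, an assumed stand-in for the station-specific
    back-transformation) to give a skewed wet-day distribution."""
    p = A.shape[0]
    z = np.zeros((n_days, p))
    e = rng.multivariate_normal(np.zeros(p), Sigma, size=n_days)
    for t in range(1, n_days):
        z[t] = A @ z[t - 1] + e[t]
    return np.where(z > 0, z, 0.0) ** power  # zero = dry day

A = 0.3 * np.eye(3)                          # lag-1 autocorrelation per station
Sigma = np.array([[1.0, 0.6, 0.4],
                  [0.6, 1.0, 0.5],
                  [0.4, 0.5, 1.0]])          # spatial cross-correlation
precip = simulate_precip(1000, A, Sigma)
```

The censoring reproduces intermittence, while the innovation covariance carries the regional cross-correlation.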
Gilbert, P. B.
2014-01-01
Summary: In randomized placebo-controlled preventive HIV vaccine efficacy trials, an objective is to evaluate the relationship between vaccine efficacy to prevent infection and genetic distances of the exposing HIV strains to the multiple HIV sequences included in the vaccine construct, where the set of genetic distances is considered as the continuous multivariate ‘mark’ observed in infected subjects only. This research develops a multivariate mark-specific hazard ratio model in the competing risks failure time analysis framework for the assessment of mark-specific vaccine efficacy. It allows improved efficiency of estimation by employing the semiparametric method of maximum profile likelihood estimation in the vaccine-to-placebo mark density ratio model. The model also enables the use of a more efficient estimation method for the overall log hazard ratio in the Cox model. Additionally, we propose testing procedures to evaluate two relevant hypotheses concerning mark-specific vaccine efficacy. The asymptotic properties and finite-sample performance of the inferential procedures are investigated. Finally, we apply the proposed methods to data collected in the Thai RV144 HIV vaccine efficacy trial. PMID:23421613
A pairwise likelihood-based approach for changepoint detection in multivariate time series models
Ma, Ting Fung; Yau, Chun Yip
2016-01-01
This paper develops a composite likelihood-based approach for multiple changepoint estimation in multivariate time series. We derive a criterion based on pairwise likelihood and minimum description length for estimating the number and locations of changepoints and for performing model selection in each segment. The number and locations of the changepoints can be consistently estimated under mild conditions and the computation can be conducted efficiently with a pruned dynamic programming algorithm. Simulation studies and real data examples demonstrate the statistical and computational efficiency of the proposed method. PMID:27279666
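The search over changepoint locations described above can be carried out by dynamic programming. Below is a minimal sketch with a simple Gaussian segment cost and a fixed per-segment penalty standing in for the pairwise-likelihood/MDL criterion of the paper; pruning candidate start points that can no longer win yields the efficient algorithm the authors describe:

```python
import numpy as np

def segment_cost(x):
    """Gaussian cost for one segment: n/2 * log(residual variance)."""
    return 0.5 * len(x) * np.log(np.var(x) + 1e-8)

def changepoints(x, penalty, minlen=10):
    """Optimal partitioning by dynamic programming: F[t] is the best
    total cost of segmenting x[:t]; every segment pays `penalty`."""
    n = len(x)
    F = np.full(n + 1, np.inf)
    F[0] = -penalty
    prev = np.zeros(n + 1, dtype=int)
    for t in range(minlen, n + 1):
        for s in range(0, t - minlen + 1):
            if np.isinf(F[s]):
                continue
            c = F[s] + segment_cost(x[s:t]) + penalty
            if c < F[t]:
                F[t], prev[t] = c, s
    cps, t = [], n
    while prev[t] > 0:                 # backtrack the segment boundaries
        t = prev[t]
        cps.append(t)
    return sorted(cps)

rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(0, 1, 100), rng.normal(4, 1, 100)])
cps = changepoints(x, penalty=10.0)
print(cps)
```

On this simulated mean shift the search recovers a single boundary near observation 100.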
On the interpretation of weight vectors of linear models in multivariate neuroimaging.
Haufe, Stefan; Meinecke, Frank; Görgen, Kai; Dähne, Sven; Haynes, John-Dylan; Blankertz, Benjamin; Bießmann, Felix
2014-02-15
The increase in spatiotemporal resolution of neuroimaging devices is accompanied by a trend towards more powerful multivariate analysis methods. Often it is desired to interpret the outcome of these methods with respect to the cognitive processes under study. Here we discuss which methods allow for such interpretations, and provide guidelines for choosing an appropriate analysis for a given experimental goal: For a surgeon who needs to decide where to remove brain tissue it is most important to determine the origin of cognitive functions and associated neural processes. In contrast, when communicating with paralyzed or comatose patients via brain-computer interfaces, it is most important to accurately extract the neural processes specific to a certain mental state. These equally important but complementary objectives require different analysis methods. Determining the origin of neural processes in time or space from the parameters of a data-driven model requires what we call a forward model of the data; such a model explains how the measured data were generated from the neural sources. Examples are general linear models (GLMs). Methods for the extraction of neural information from data can be considered as backward models, as they attempt to reverse the data generating process. Examples are multivariate classifiers. Here we demonstrate that the parameters of forward models are neurophysiologically interpretable in the sense that significant nonzero weights are only observed at channels the activity of which is related to the brain process under study. In contrast, the interpretation of backward model parameters can lead to wrong conclusions regarding the spatial or temporal origin of the neural signals of interest, since significant nonzero weights may also be observed at channels the activity of which is statistically independent of the brain process under study. As a remedy for the linear case, we propose a procedure for transforming backward models into forward models.
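For a linear backward model with weight vector w, the corresponding forward-model pattern is a ∝ Σx·w (here normalized by wᵀΣx·w so the pattern is on the scale of the data). A small simulation, with an invented two-channel mixing, shows why the raw weights mislead while the pattern does not:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulate 2 channels: the signal s appears only in channel 0, but a
# distractor d leaks into both channels.
n = 5000
s = rng.normal(size=n)                  # latent signal of interest
d = rng.normal(size=n)                  # distractor source
X = np.column_stack([s + d, d])         # channel 1 is signal-free

# Backward (extraction) model: w = (1, -1) recovers s exactly,
# yet it puts a large nonzero weight on the signal-free channel 1.
w = np.array([1.0, -1.0])
s_hat = X @ w

# Corresponding forward model (activation pattern): a = Sigma_x w / (w' Sigma_x w)
Sigma_X = np.cov(X, rowvar=False)
a = Sigma_X @ w / (w @ Sigma_X @ w)
print(a)    # close to (1, 0): only channel 0 is signal-related
```

The pattern correctly assigns (almost) zero to channel 1, whose activity is statistically independent of the signal, illustrating the abstract's central point.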
Cella, Laura; Liuzzi, Raffaele; Conson, Manuel; D’Avino, Vittoria; Salvatore, Marco; Pacelli, Roberto
2013-10-01
Purpose: To establish a multivariate normal tissue complication probability (NTCP) model for radiation-induced asymptomatic heart valvular defects (RVD). Methods and Materials: Fifty-six patients treated with sequential chemoradiation therapy for Hodgkin lymphoma (HL) were retrospectively reviewed for RVD events. Clinical information along with whole heart, cardiac chambers, and lung dose distribution parameters was collected, and the correlations to RVD were analyzed by means of Spearman's rank correlation coefficient (Rs). For the selection of the model order and parameters for NTCP modeling, a multivariate logistic regression method using resampling techniques (bootstrapping) was applied. Model performance was evaluated using the area under the receiver operating characteristic curve (AUC). Results: When we analyzed the whole heart, a 3-variable NTCP model including the maximum dose, whole heart volume, and lung volume was shown to be the optimal predictive model for RVD (Rs = 0.573, P<.001, AUC = 0.83). When we analyzed the cardiac chambers individually, for the left atrium and for the left ventricle, an NTCP model based on 3 variables including the percentage volume exceeding 30 Gy (V30), cardiac chamber volume, and lung volume was selected as the most predictive model (Rs = 0.539, P<.001, AUC = 0.83; and Rs = 0.557, P<.001, AUC = 0.82, respectively). The NTCP values increase as heart maximum dose or cardiac chambers V30 increase. They also increase with larger volumes of the heart or cardiac chambers and decrease when lung volume is larger. Conclusions: We propose logistic NTCP models for RVD considering not only heart irradiation dose but also the combined effects of lung and heart volumes. Our study establishes statistical evidence of the indirect effect of lung size on radiation-induced heart toxicity.
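A three-variable logistic NTCP model of the kind described above has the form NTCP = 1/(1 + exp(−(b0 + b1·Dmax + b2·V_heart + b3·V_lung))). The coefficients below are illustrative placeholders chosen to match the signs reported in the abstract, not the fitted values from the study:

```python
import math

def ntcp_logistic(dmax_gy, heart_vol_cc, lung_vol_cc, b):
    """Three-variable logistic NTCP model:
    NTCP = 1 / (1 + exp(-(b0 + b1*Dmax + b2*HeartVol + b3*LungVol)))."""
    z = b[0] + b[1] * dmax_gy + b[2] * heart_vol_cc + b[3] * lung_vol_cc
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical coefficients: risk rises with heart Dmax and heart volume,
# falls with lung volume, as in the abstract.
b = (-6.0, 0.08, 0.002, -0.0005)
print(ntcp_logistic(40.0, 600.0, 3500.0, b))
```

With such a model, increasing the heart maximum dose or heart volume raises the predicted complication probability, while a larger lung volume lowers it.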
Lee, S. H.; van der Werf, J. H. J.
2016-01-01
Summary: We have developed an algorithm for genetic analysis of complex traits using genome-wide SNPs in a linear mixed model framework. Compared to current standard REML software based on the mixed model equation, our method is substantially faster. The advantage is largest when there is only a single genetic covariance structure. The method is particularly useful for multivariate analysis, including multi-trait models and random regression models for studying reaction norms. We applied our proposed method to publicly available mice and human data and discuss the advantages and limitations. Availability and implementation: MTG2 is available in https://sites.google.com/site/honglee0707/mtg2. Contact: hong.lee@une.edu.au Supplementary information: Supplementary data are available at Bioinformatics online. PMID:26755623
Probabilistic, multi-variate flood damage modelling using random forests and Bayesian networks
NASA Astrophysics Data System (ADS)
Kreibich, Heidi; Schröter, Kai
2015-04-01
Decisions on flood risk management and adaptation are increasingly based on risk analyses. Such analyses are associated with considerable uncertainty, even more if changes in risk due to global change are expected. Although uncertainty analysis and probabilistic approaches have received increased attention recently, they are hardly applied in flood damage assessments. Most of the damage models usually applied in standard practice have in common that complex damaging processes are described by simple, deterministic approaches like stage-damage functions. This presentation will show approaches for probabilistic, multi-variate flood damage modelling on the micro- and meso-scale and discuss their potential and limitations. Reference: Merz, B.; Kreibich, H.; Lall, U. (2013): Multi-variate flood damage assessment: a tree-based data-mining approach. NHESS, 13(1), 53-64. Schröter, K., Kreibich, H., Vogel, K., Riggelsen, C., Scherbaum, F., Merz, B. (2014): How useful are complex flood damage models? - Water Resources Research, 50, 4, p. 3378-3395.
Multivariate model of female black bear habitat use for a Geographic Information System
Clark, Joseph D.; Dunn, James E.; Smith, Kimberly G.
1993-01-01
Simple univariate statistical techniques may not adequately assess the multidimensional nature of habitats used by wildlife. Thus, we developed a multivariate method to model habitat-use potential using a set of female black bear (Ursus americanus) radio locations and habitat data consisting of forest cover type, elevation, slope, aspect, distance to roads, distance to streams, and forest cover type diversity score in the Ozark Mountains of Arkansas. The model is based on the Mahalanobis distance statistic coupled with Geographic Information System (GIS) technology. That statistic is a measure of dissimilarity and represents a standardized squared distance between a set of sample variates and an ideal based on the mean of variates associated with animal observations. Calculations were made with the GIS to produce a map containing Mahalanobis distance values within each cell on a 60- × 60-m grid. The model identified areas of high habitat use potential that could not otherwise be identified by independent perusal of any single map layer. This technique avoids many pitfalls that commonly affect typical multivariate analyses of habitat use and is a useful tool for habitat manipulation or mitigation to favor terrestrial vertebrates that use habitats on a landscape scale.
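The Mahalanobis-distance habitat score described above is straightforward to compute: each grid cell's habitat vector is scored by its standardized squared distance from the mean of the vectors at animal observations. The habitat variables and values below are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(7)

# Habitat variables at radio locations (rows: observations; columns:
# e.g. elevation (m), slope (deg), distance to roads (km)).
use = rng.normal([800.0, 10.0, 2.0], [50.0, 3.0, 0.5], size=(200, 3))
mu = use.mean(axis=0)
S_inv = np.linalg.inv(np.cov(use, rowvar=False))

def mahalanobis_sq(cells):
    """Squared Mahalanobis distance of each cell's habitat vector from
    the mean of the use locations; small values = high use potential."""
    d = cells - mu
    return np.einsum('ij,jk,ik->i', d, S_inv, d)

# Score a batch of candidate grid cells:
grid = rng.normal([750.0, 12.0, 1.0], [100.0, 5.0, 1.0], size=(1000, 3))
d2 = mahalanobis_sq(grid)
ideal = mahalanobis_sq(mu[None, :])    # the "ideal" cell scores zero
```

In the GIS application, the same quadratic form is evaluated for every 60- × 60-m cell and mapped, so that low-distance (high-potential) areas emerge from the joint behavior of all layers at once.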
Prediction of lip response to orthodontic treatment using a multivariable regression model
Shirvani, Amin; Sadeghian, Saeid; Abbasi, Safieh
2016-01-01
Background: This retrospective cephalometric study aimed to develop a more precise estimate of soft tissue changes related to underlying tooth movement than the simple relationship between hard and soft tissues. Materials and Methods: The lateral cephalograms of 61 adult patients undergoing orthodontic treatment (31 premolar extraction, 31 nonextraction) were obtained, scanned and digitized before and immediately after the end of treatment. Hard and soft tissue angular and linear measures were calculated with Viewbox 4.0 software. Changes in the values were analyzed using the paired t-test. The accuracy of predictions of soft tissue changes was compared between two methods: (1) use of ratios of the means of soft tissue to hard tissue changes (Viewbox 4.0 software); (2) use of stepwise multivariable regression analysis to create prediction equations for soft tissue changes at the superior labial sulcus, labrale superius, stomion superius, inferior labial sulcus, labrale inferius and stomion inferius (all on a horizontal plane). Results: Stepwise multiple regression to predict lip movement showed strong relations for the upper lip (adjusted R2 = 0.92) and the lower lip (adjusted R2 = 0.91) in the extraction group. Regression analysis showed slightly weaker relations in the nonextraction group. Conclusion: Within the limitations of this study, the multiple regression technique was slightly more accurate than the ratio-of-means prediction (Viewbox 4.0 software) and appears useful for predicting soft tissue changes. As the variability of the predicted individual outcome seems relatively high, caution should be taken in predicting hard and soft tissue positional changes. PMID:26962314
Tracking Problem Solving by Multivariate Pattern Analysis and Hidden Markov Model Algorithms
Anderson, John R.
2011-01-01
Multivariate pattern analysis can be combined with hidden Markov model algorithms to track the second-by-second thinking as people solve complex problems. Two applications of this methodology are illustrated with a data set taken from children as they interacted with an intelligent tutoring system for algebra. The first “mind reading” application involves using fMRI activity to track what students are doing as they solve a sequence of algebra problems. The methodology achieves considerable accuracy at determining both what problem-solving step the students are taking and whether they are performing that step correctly. The second “model discovery” application involves using statistical model evaluation to determine how many substates are involved in performing a step of algebraic problem solving. This research indicates that different steps involve different numbers of substates and these substates are associated with different fluency in algebra problem solving. PMID:21820455
Valle, Denis; Baiser, Benjamin; Woodall, Christopher W; Chazdon, Robin
2014-01-01
We propose a novel multivariate method to analyse biodiversity data based on the Latent Dirichlet Allocation (LDA) model. LDA, a probabilistic model, reduces assemblages to sets of distinct component communities. It produces easily interpretable results, can represent abrupt and gradual changes in composition, accommodates missing data and allows for coherent estimates of uncertainty. We illustrate our method using tree data for the eastern United States and from a tropical successional chronosequence. The model is able to detect pervasive declines in the oak community in Minnesota and Indiana, potentially due to fire suppression, increased growing season precipitation and herbivory. The chronosequence analysis is able to delineate clear successional trends in species composition, while also revealing that site-specific factors significantly impact these successional trajectories. The proposed method provides a means to decompose and track the dynamics of species assemblages along temporal and spatial gradients, including effects of global change and forest disturbances. PMID:25328064
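In the LDA decomposition above, sites play the role of documents and species of words, so that each individual stem is assigned to one of K component communities. A minimal collapsed Gibbs sampler (not the authors' implementation; the count matrix and hyperparameters are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)

def lda_gibbs(counts, K, alpha=0.1, beta=0.1, iters=200):
    """Collapsed Gibbs sampler for LDA on a site-by-species count matrix."""
    D, V = counts.shape
    docs, words = [], []
    for d in range(D):
        for v in range(V):
            docs += [d] * counts[d, v]
            words += [v] * counts[d, v]
    docs, words = np.array(docs), np.array(words)
    z = rng.integers(K, size=len(docs))          # community of each stem
    ndk = np.zeros((D, K))                       # site-community counts
    nkv = np.zeros((K, V))                       # community-species counts
    nk = np.zeros(K)
    np.add.at(ndk, (docs, z), 1)
    np.add.at(nkv, (z, words), 1)
    np.add.at(nk, z, 1)
    for _ in range(iters):
        for i in range(len(docs)):
            d, v, k = docs[i], words[i], z[i]
            ndk[d, k] -= 1; nkv[k, v] -= 1; nk[k] -= 1
            p = (ndk[d] + alpha) * (nkv[:, v] + beta) / (nk + V * beta)
            k = rng.choice(K, p=p / p.sum())
            z[i] = k
            ndk[d, k] += 1; nkv[k, v] += 1; nk[k] += 1
    return (ndk + alpha) / (ndk + alpha).sum(axis=1, keepdims=True)

# Four sites, four species: sites 0-1 and 2-3 have distinct compositions.
counts = np.array([[20, 15, 0, 1], [18, 12, 1, 0],
                   [0, 1, 25, 14], [1, 0, 22, 17]])
theta = lda_gibbs(counts, K=2)                   # community proportions per site
```

The returned proportions are the site-level community mixtures that, tracked along a chronosequence or a spatial gradient, reveal the compositional shifts the abstract describes.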
Ecological prediction with nonlinear multivariate time-frequency functional data models
Yang, Wen-Hsi; Wikle, Christopher K.; Holan, Scott H.; Wildhaber, Mark L.
2013-01-01
Time-frequency analysis has become a fundamental component of many scientific inquiries. Due to improvements in technology, the amount of high-frequency signals that are collected for ecological and other scientific processes is increasing at a dramatic rate. In order to facilitate the use of these data in ecological prediction, we introduce a class of nonlinear multivariate time-frequency functional models that can identify important features of each signal as well as the interaction of signals corresponding to the response variable of interest. Our methodology is of independent interest and utilizes stochastic search variable selection to improve model selection and performs model averaging to enhance prediction. We illustrate the effectiveness of our approach through simulation and by application to predicting spawning success of shovelnose sturgeon in the Lower Missouri River.
Developing a multivariate electronic medical record integration model for primary health care.
Lau, Francis; Price, Morgan; Lesperance, Mary
2013-01-01
This paper describes the development of a multivariate electronic medical record (EMR) integration model for the primary health care setting. Our working hypothesis is that an integrated EMR is associated with high quality primary health care. Our assumption is that EMR integration should be viewed as a form of complex intervention with multiple interacting components that can impact the quality of care. Depending on how well the EMR is integrated in the practice setting, one can expect a corresponding change in the quality of care as measured through a set of primary health care quality indicators. To test the face validity of this model, a Delphi study is being planned where health care providers and information technology professionals involved with EMR adoption are polled for their feedback. This model has the potential to quantify and explain the factors that influence successful EMR integration to improve primary health care. PMID:23388317
Wu, Chuanli; Gao, Yuexia; Hua, Tianqi; Xu, Chenwu
2016-01-01
Background: It is challenging to deal with mixture models when missing values occur in clustering datasets. Methods and Results: We propose a dynamic clustering algorithm based on a multivariate Gaussian mixture model that efficiently imputes missing values to generate a “pseudo-complete” dataset. Parameters from different clusters and missing values are estimated according to the maximum likelihood implemented with an expectation-maximization algorithm, and multivariate individuals are clustered with Bayesian posterior probability. A simulation showed that our proposed method has a fast convergence speed and it accurately estimates missing values. Our proposed algorithm was further validated with Fisher’s Iris dataset, the Yeast Cell-cycle Gene-expression dataset, and the CIFAR-10 images dataset. The results indicate that our algorithm offers highly accurate clustering, comparable to that using a complete dataset without missing values. Furthermore, our algorithm resulted in a lower misjudgment rate than both clustering algorithms with missing data deleted and with missing-value imputation by mean replacement. Conclusion: We demonstrate that our missing-value imputation clustering algorithm is feasible and superior to both of these other clustering algorithms in certain situations. PMID:27552203
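The E-step imputation at the heart of such algorithms replaces each missing entry by its conditional mean given the row's observed entries. A one-component simplification of the mixture scheme (the full algorithm alternates this with multi-cluster posterior assignment, and the conditional-covariance correction is omitted for brevity):

```python
import numpy as np

def em_impute(X, iters=50):
    """EM-style imputation under a single multivariate Gaussian: (E) fill
    missing entries with conditional means given the observed entries,
    (M) re-estimate the mean and covariance."""
    X = X.copy()
    miss = np.isnan(X)
    X[miss] = np.nanmean(X, axis=0)[np.where(miss)[1]]  # start from column means
    for _ in range(iters):
        mu = X.mean(axis=0)
        S = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])
        for i in np.where(miss.any(axis=1))[0]:
            m, o = miss[i], ~miss[i]
            if not o.any():            # fully missing row: use the mean
                X[i] = mu
                continue
            # conditional mean: mu_m + S_mo S_oo^{-1} (x_o - mu_o)
            X[i, m] = mu[m] + S[np.ix_(m, o)] @ np.linalg.solve(
                S[np.ix_(o, o)], X[i, o] - mu[o])
    return X

rng = np.random.default_rng(5)
z = rng.normal(size=300)
data = np.column_stack([z, 2 * z + 0.1 * rng.normal(size=300),
                        rng.normal(size=300)])
data[rng.random(data.shape) < 0.1] = np.nan   # 10% missing at random
filled = em_impute(data)
```

Because column 1 is nearly 2× column 0 in this toy data, the conditional-mean fill recovers that relationship rather than flattening it, which is exactly why such imputation outperforms mean replacement.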
NASA Astrophysics Data System (ADS)
Nieto, Paulino José García; Antón, Juan Carlos Álvarez; Vilán, José Antonio Vilán; García-Gonzalo, Esperanza
2014-10-01
The aim of this research work is to build a regression model of particulate matter up to 10 micrometers in size (PM10) using the multivariate adaptive regression splines (MARS) technique in the Oviedo urban area (Northern Spain) at local scale. MARS is a nonparametric regression algorithm able to approximate the relationship between inputs and outputs and to express that relationship mathematically. In this context, hazardous air pollutants or toxic air contaminants refer to any substance that may cause or contribute to an increase in mortality or serious illness, or that may pose a present or potential hazard to human health. To accomplish the objective of this study, experimental datasets of nitrogen oxides (NOx), carbon monoxide (CO), sulfur dioxide (SO2), ozone (O3) and dust (PM10) were collected over 3 years (2006-2008) and used to create a highly nonlinear model of PM10 in the Oviedo urban nucleus based on the MARS technique. One main objective of this model is to obtain a preliminary estimate of the dependence of PM10 on the other pollutants in the Oviedo urban area at local scale. A second aim is to determine the factors with the greatest bearing on air quality with a view to proposing health and lifestyle improvements. The United States National Ambient Air Quality Standards (NAAQS) establish the limit values of the main pollutants in the atmosphere in order to protect the health of healthy people. Firstly, this MARS regression model captures the main insights of statistical learning theory in order to obtain a good prediction of the dependence among the main pollutants in the Oviedo urban area. Secondly, the main advantages of MARS are its capacity to produce simple, easy-to-interpret models, its ability to estimate the contributions of the input variables, and its computational efficiency. Finally, on the basis of
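MARS builds its models from hinge basis functions max(0, ±(x − t)). A minimal sketch fitting a two-hinge model at a fixed knot (real MARS searches knot locations adaptively and prunes terms; the pollutant values and the knot at 40 are invented for illustration):

```python
import numpy as np

def hinge(x, knot, sign):
    """MARS basis function: max(0, sign * (x - knot))."""
    return np.maximum(0.0, sign * (x - knot))

rng = np.random.default_rng(2)
nox = rng.uniform(10, 80, 500)          # illustrative NOx level
# True relation: PM10 flat below 40, rising with slope 0.6 above it.
pm10 = 20 + 0.6 * np.maximum(0, nox - 40) + rng.normal(0, 2, 500)

# Design matrix: intercept plus a reflected pair of hinges at the knot.
B = np.column_stack([np.ones_like(nox),
                     hinge(nox, 40, +1), hinge(nox, 40, -1)])
coef, *_ = np.linalg.lstsq(B, pm10, rcond=None)
pred = B @ coef
```

The fitted coefficients recover the piecewise-linear structure (near-zero slope below the knot, ~0.6 above it), which is what makes MARS models easy to interpret.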
Freitas, Mirlaine R; Barigye, Stephen J; Daré, Joyce K; Freitas, Matheus P
2016-06-01
The bioconcentration factor (BCF) is an important parameter used to estimate the propensity of chemicals to accumulate in aquatic organisms from the ambient environment. While simple regressions for estimating the BCF of chemical compounds from water solubility or the n-octanol/water partition coefficient have been proposed in the literature, these models do not always yield good correlations, and more descriptive variables are required for better modeling of BCF data for a given series of organic pollutants, such as some herbicides. Thus, the logBCF values for a set of carbonyl herbicides comprising amide, urea, carbamate and thiocarbamate groups were quantitatively modeled using multivariate image analysis (MIA) descriptors, derived from colored image representations of the chemical structures. The logBCF model was calibrated and rigorously validated (r(2) = 0.79, q(2) = 0.70 and rtest(2) = 0.81), providing a comprehensive three-parameter linear equation after variable selection (logBCF = 5.682 - 0.00233 × X9774 - 0.00070 × X813 - 0.00273 × X5144); the variables represent pixel coordinates in the multivariate image. Finally, chemical interpretation of the obtained models in terms of the structural characteristics responsible for the enhanced or reduced logBCF values was performed, providing key leads in the prospective development of more eco-friendly synthetic herbicides. PMID:26971171
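The reported three-parameter equation is directly usable once the pixel intensities at the selected coordinates are known. A sketch evaluating it (the intensity values passed in are hypothetical; only the coefficients come from the abstract):

```python
def log_bcf(x9774, x813, x5144):
    """Reported MIA-QSAR model:
    logBCF = 5.682 - 0.00233*X9774 - 0.00070*X813 - 0.00273*X5144,
    where each X is the pixel intensity at a fixed coordinate of the
    multivariate image representation of the structure."""
    return 5.682 - 0.00233 * x9774 - 0.00070 * x813 - 0.00273 * x5144

# Hypothetical 0-255 grayscale intensities for one compound:
print(log_bcf(200, 150, 120))
```

Since all three coefficients are negative, darker (higher-intensity) pixels at those coordinates lower the predicted bioconcentration, which is the lever the authors exploit when interpreting the model structurally.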
Disaster Hits Home: A Model of Displaced Family Adjustment after Hurricane Katrina
ERIC Educational Resources Information Center
Peek, Lori; Morrissey, Bridget; Marlatt, Holly
2011-01-01
The authors explored individual and family adjustment processes among parents (n = 30) and children (n = 55) who were displaced to Colorado after Hurricane Katrina. Drawing on in-depth interviews with 23 families, this article offers an inductive model of displaced family adjustment. Four stages of family adjustment are presented in the model: (a)…
ERIC Educational Resources Information Center
Tay, Louis; Drasgow, Fritz
2012-01-01
Two Monte Carlo simulation studies investigated the effectiveness of the mean adjusted X[superscript 2]/df statistic proposed by Drasgow and colleagues and, because of problems with the method, a new approach for assessing the goodness of fit of an item response theory model was developed. It has been previously recommended that mean adjusted…
El-Basyouny, Karim; Barua, Sudip; Islam, Md Tazul
2014-12-01
Previous research shows that various weather elements have significant effects on crash occurrence and risk; however, little is known about how these elements affect different crash types. Consequently, this study investigates the impact of weather elements and sudden extreme snow or rain weather changes on crash type. Multivariate models were used for seven crash types using five years of daily weather and crash data collected for the entire City of Edmonton. In addition, the yearly trend and random variation of parameters across the years were analyzed by using four different modeling formulations. The proposed models were estimated in a full Bayesian context via Markov Chain Monte Carlo simulation. The multivariate Poisson lognormal model with yearly varying coefficients provided the best fit for the data according to Deviance Information Criteria. Overall, results showed that temperature and snowfall were statistically significant with intuitive signs (crashes decrease with increasing temperature; crashes increase as snowfall intensity increases) for all crash types, while rainfall was mostly insignificant. Previous snow showed mixed results, being statistically significant and positively related to certain crash types, while negatively related or insignificant in other cases. Maximum wind gust speed was found mostly insignificant with a few exceptions that were positively related to crash type. Major snow or rain events following a dry weather condition were highly significant and positively related to three crash types: Follow-Too-Close, Stop-Sign-Violation, and Ran-Off-Road crashes. The day-of-the-week dummy variables were statistically significant, indicating a possible weekly variation in exposure. Transportation authorities might use the above results to improve road safety by providing drivers with information regarding the risk of certain crash types for a particular weather condition. PMID:25190632
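In a multivariate Poisson lognormal model like the one above, each crash type gets a Poisson count whose log-mean combines weather covariates with a multivariate normal error that correlates the crash types. A simulation sketch (the coefficients and covariance are invented; only their signs follow the abstract):

```python
import numpy as np

rng = np.random.default_rng(11)

def simulate_mvpln(n_days, beta, weather, Sigma):
    """Multivariate Poisson-lognormal: y_tj ~ Poisson(exp(x_t' beta_j + eps_tj)),
    with eps_t ~ N(0, Sigma) correlating the crash types."""
    eps = rng.multivariate_normal(np.zeros(Sigma.shape[0]), Sigma, size=n_days)
    log_mu = weather @ beta + eps
    return rng.poisson(np.exp(log_mu))

# Covariates: intercept, temperature (C), snowfall (cm).
weather = np.column_stack([np.ones(365),
                           rng.normal(5, 10, 365),
                           np.clip(rng.normal(0, 2, 365), 0, None)])
# Signs as reported: crashes fall with temperature, rise with snowfall.
beta = np.array([[2.0, 1.8],
                 [-0.02, -0.03],
                 [0.10, 0.15]])
Sigma = np.array([[0.2, 0.1],
                  [0.1, 0.2]])
crashes = simulate_mvpln(365, beta, weather, Sigma)   # daily counts, 2 crash types
```

The off-diagonal term in Sigma is what lets the model capture that different crash types spike on the same bad-weather days, beyond what the shared covariates explain.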
Rahman, Ziyaur; Mohammad, Adil; Siddiqui, Akhtar; Khan, Mansoor A
2015-12-01
The focus of the present investigation was to explore the use of solid-state nuclear magnetic resonance ((13)C ssNMR) and X-ray powder diffraction (XRPD) for quantification of nimodipine polymorphs (form I and form II) crystallized in a cosolvent formulation. The cosolvent formulation was composed of polyethylene glycol 400, glycerin, water, and 2.5% drug, and was stored at 5°C to allow drug crystallization. The (13)C ssNMR and XRPD data of sample matrices containing varying percentages of nimodipine form I and form II were collected. Univariate and multivariate models were developed using the data. The least squares method was used for univariate model generation; partial least squares and principal component regression were used for multivariate model development. The univariate models from the (13)C ssNMR data were better than those from XRPD, as indicated by statistical parameters such as the correlation coefficient, R(2), root mean square error, and standard error. On the other hand, the XRPD multivariate models were better than the (13)C ssNMR models, as indicated by precision and accuracy parameters. Similar values were predicted by the univariate and multivariate models for independent samples. In conclusion, the univariate and multivariate models of (13)C ssNMR and XRPD can be used to quantitate nimodipine polymorphs. PMID:25956485
Wolfrum, E. J.; Sluiter, A. D.
2009-01-01
We have studied rapid calibration models to predict the composition of a variety of biomass feedstocks by correlating near-infrared (NIR) spectroscopic data to compositional data produced using traditional wet chemical analysis techniques. The rapid calibration models are developed using multivariate statistical analysis of the spectroscopic and wet chemical data. This work discusses the latest versions of the NIR calibration models for corn stover feedstock and dilute-acid pretreated corn stover. Measures of the calibration precision and uncertainty are presented. No statistically significant differences (p = 0.05) are seen between NIR calibration models built using different mathematical pretreatments. Finally, two common algorithms for building NIR calibration models are compared; no statistically significant differences (p = 0.05) are seen for the major constituents glucan, xylan, and lignin, but the algorithms did produce different predictions for total extractives. A single calibration model combining the corn stover feedstock and dilute-acid pretreated corn stover samples gave less satisfactory predictions than the separate models.
Analysis of Multivariate Experimental Data Using A Simplified Regression Model Search Algorithm
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert M.
2013-01-01
A new regression model search algorithm was developed that may be applied to both general multivariate experimental data sets and wind tunnel strain-gage balance calibration data. The algorithm is a simplified version of a more complex algorithm that was originally developed for the NASA Ames Balance Calibration Laboratory. The new algorithm performs regression model term reduction to prevent overfitting of data. It has the advantage that it needs only about one tenth of the original algorithm's CPU time for the completion of a regression model search. In addition, extensive testing showed that the prediction accuracy of math models obtained from the simplified algorithm is similar to the prediction accuracy of math models obtained from the original algorithm. The simplified algorithm, however, cannot guarantee that search constraints related to a set of statistical quality requirements are always satisfied in the optimized regression model. Therefore, the simplified algorithm is not intended to replace the original algorithm. Instead, it may be used to generate an alternate optimized regression model of experimental data whenever the application of the original search algorithm fails or requires too much CPU time. Data from a machine calibration of NASA's MK40 force balance is used to illustrate the application of the new search algorithm.
Warnell, I; Chincholkar, M; Eccles, M
2015-01-01
Predicting the risk of perioperative mortality after oesophagectomy for cancer may assist patients in making treatment choices and allow balanced comparison of providers. The aim of this systematic review of multivariate prediction models is to report their performance in new patients and to compare study methods against current recommendations. We followed PRISMA guidelines and searched Medline, Embase, and standard texts from 1990 to 2012. Inclusion criteria were English-language articles reporting development and validation of prediction models of perioperative mortality after open oesophagectomy. Two reviewers screened articles and extracted data on methods, results, and potential biases. We identified 11 development, 10 external validation, and two clinical impact studies. Overestimation of predicted mortality was common (5-200% error), discrimination was poor to moderate (areas under receiver operating characteristic curves ranged from 0.58 to 0.78), and reporting of potential bias was poor. There were potentially important case-mix differences between modelling and validation samples, and sample sizes were considerably smaller than is currently recommended. Steyerberg and colleagues' model used the most 'transportable' predictors and was validated in the largest sample. Most models have not been adequately validated, and reported performance has been unsatisfactory. There is a need to clarify the definition, effect size, and selection of currently available candidate predictors for inclusion in prediction models, and to identify new ones strongly associated with outcome. Adoption of prediction models into practice requires further development and validation in well-designed, large-sample prospective studies. PMID:25231768
NASA Technical Reports Server (NTRS)
Belcastro, Christine M.
1998-01-01
Robust control system analysis and design is based on an uncertainty description, called a linear fractional transformation (LFT), which separates the uncertain (or varying) part of the system from the nominal system. These models are also useful in the design of gain-scheduled control systems based on Linear Parameter Varying (LPV) methods. Low-order LFT models are difficult to form for problems involving nonlinear parameter variations. This paper presents a numerical computational method for constructing an LFT model for a given LPV model. The method is developed for multivariate polynomial problems, and uses simple matrix computations to obtain an exact low-order LFT representation of the given LPV system without the use of model reduction. Although the method is developed for multivariate polynomial problems, multivariate rational problems can also be solved using this method by reformulating the rational problem into a polynomial form.
NASA Astrophysics Data System (ADS)
Hynds, Paul; Misstear, Bruce D.; Gill, Laurence W.; Murphy, Heather M.
2014-04-01
An integrated domestic well sampling and "susceptibility assessment" programme was undertaken in the Republic of Ireland from April 2008 to November 2010. Overall, 211 domestic wells were sampled, assessed and collated with local climate data. Based upon groundwater physicochemical profile, three clusters have been identified and characterised by source type (borehole or hand-dug well) and local geological setting. Statistical analysis indicates that cluster membership is significantly associated with the prevalence of bacteria (p = 0.001), with mean Escherichia coli presence within clusters ranging from 15.4% (Cluster-1) to 47.6% (Cluster-3). Bivariate risk factor analysis shows that on-site septic tank presence was the only risk factor significantly associated (p < 0.05) with bacterial presence within all clusters. Point agriculture adjacency was significantly associated with both borehole-related clusters. Well design criteria were associated with hand-dug wells and boreholes in areas characterised by high permeability subsoils, while local geological setting was significant for hand-dug wells and boreholes in areas dominated by low/moderate permeability subsoils. Multivariate susceptibility models were developed for all clusters, with predictive accuracies of 84% (Cluster-1) to 91% (Cluster-2) achieved. Septic tank setback was a common variable within all multivariate models, while agricultural sources were also significant, albeit to a lesser degree. Furthermore, well liner clearance was a significant factor in all models, indicating that direct surface ingress is a significant well contamination mechanism. Identification and elucidation of cluster-specific contamination mechanisms may be used to develop improved overall risk management and wellhead protection strategies, while also informing future remediation and maintenance efforts.
Hynds, Paul; Misstear, Bruce D; Gill, Laurence W; Murphy, Heather M
2014-04-01
An integrated domestic well sampling and "susceptibility assessment" programme was undertaken in the Republic of Ireland from April 2008 to November 2010. Overall, 211 domestic wells were sampled, assessed and collated with local climate data. Based upon groundwater physicochemical profile, three clusters have been identified and characterised by source type (borehole or hand-dug well) and local geological setting. Statistical analysis indicates that cluster membership is significantly associated with the prevalence of bacteria (p=0.001), with mean Escherichia coli presence within clusters ranging from 15.4% (Cluster-1) to 47.6% (Cluster-3). Bivariate risk factor analysis shows that on-site septic tank presence was the only risk factor significantly associated (p<0.05) with bacterial presence within all clusters. Point agriculture adjacency was significantly associated with both borehole-related clusters. Well design criteria were associated with hand-dug wells and boreholes in areas characterised by high permeability subsoils, while local geological setting was significant for hand-dug wells and boreholes in areas dominated by low/moderate permeability subsoils. Multivariate susceptibility models were developed for all clusters, with predictive accuracies of 84% (Cluster-1) to 91% (Cluster-2) achieved. Septic tank setback was a common variable within all multivariate models, while agricultural sources were also significant, albeit to a lesser degree. Furthermore, well liner clearance was a significant factor in all models, indicating that direct surface ingress is a significant well contamination mechanism. Identification and elucidation of cluster-specific contamination mechanisms may be used to develop improved overall risk management and wellhead protection strategies, while also informing future remediation and maintenance efforts. PMID:24583518
Multi-variate spatial explicit constraining of a large scale hydrological model
NASA Astrophysics Data System (ADS)
Rakovec, Oldrich; Kumar, Rohini; Samaniego, Luis
2016-04-01
Increased availability and quality of near real-time data should lead to a better understanding of the predictive skills of distributed hydrological models. Nevertheless, prediction of regional-scale water fluxes and states remains a great challenge for the scientific community. Large-scale hydrological models are used to predict soil moisture, evapotranspiration and other related water states and fluxes. They are usually constrained against river discharge alone, which is an integral variable. Rakovec et al. (2016) recently demonstrated that constraining model parameters against river discharge is a necessary, but not sufficient, condition. We therefore aim at scrutinizing the appropriate incorporation of readily available complementary information into a hydrological model, which may help to improve the realism of the simulated hydrological processes. It is important to analyse how complementary datasets, beyond observed streamflow and related signature measures, can improve the skill of internal model variables during parameter estimation. Among the products suitable for such scrutiny are the GRACE satellite observations. Recent developments in using this dataset in a multivariate fashion to complement the traditionally used streamflow data within the distributed model mHM (www.ufz.de/mhm) are presented. The study domain consists of 80 European basins, which cover a wide range of distinct physiographic and hydrologic regimes. A first-order data quality check ensures that heavily human-influenced basins are eliminated. For river discharge simulations we show that model performance remains unchanged when the calibration is complemented by information from the GRACE product (at both daily and monthly time steps). Moreover, the complementary GRACE data lead to consistent and statistically significant improvements in evapotranspiration estimates, which are evaluated using an independent gridded FLUXNET product. We also show that the choice of the objective function used to estimate
Analysis of Multivariate Experimental Data Using A Simplified Regression Model Search Algorithm
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert Manfred
2013-01-01
A new regression model search algorithm was developed in 2011 that may be used to analyze both general multivariate experimental data sets and wind tunnel strain-gage balance calibration data. The new algorithm is a simplified version of a more complex search algorithm that was originally developed at the NASA Ames Balance Calibration Laboratory. The new algorithm has the advantage that it needs only about one tenth of the original algorithm's CPU time for the completion of a search. In addition, extensive testing showed that the prediction accuracy of math models obtained from the simplified algorithm is similar to the prediction accuracy of math models obtained from the original algorithm. The simplified algorithm, however, cannot guarantee that search constraints related to a set of statistical quality requirements are always satisfied in the optimized regression models. Therefore, the simplified search algorithm is not intended to replace the original search algorithm. Instead, it may be used to generate an alternate optimized regression model of experimental data whenever the application of the original search algorithm either fails or requires too much CPU time. Data from a machine calibration of NASA's MK40 force balance is used to illustrate the application of the new regression model search algorithm.
Application of multivariate storage model to quantify trends in seasonally frozen soil
NASA Astrophysics Data System (ADS)
Woody, Jonathan; Wang, Yan; Dyer, Jamie
2016-06-01
This article presents a study of the ground thermal regime recorded at 11 stations in the North Dakota Agricultural Network. Particular focus is placed on detecting trends in the annual ground-freeze portion of the ground thermal regime's daily temperature signature. A multivariate storage model from queueing theory is fitted to a series of estimated daily depths of frozen soil. Statistical inference on a trend parameter is obtained by minimizing a weighted sum of squares of a sequence of daily one-step-ahead predictions. Standard errors for the trend estimates are presented. It is shown that the daily quantity of frozen ground experienced at these 11 sites exhibited a negative trend over the observation period.
Web-based tools for modelling and analysis of multivariate data: California ozone pollution activity
Dinov, Ivo D.; Christou, Nicolas
2014-01-01
This article presents a hands-on web-based activity motivated by the relation between human health and ozone pollution in California. This case study is based on multivariate data collected monthly at 20 locations in California between 1980 and 2006. Several strategies and tools for data interrogation and exploratory data analysis, model fitting and statistical inference on these data are presented. All components of this case study (data, tools, activity) are freely available online at: http://wiki.stat.ucla.edu/socr/index.php/SOCR_MotionCharts_CAOzoneData. Several types of exploratory (motion charts, box-and-whisker plots, spider charts) and quantitative (inference, regression, analysis of variance (ANOVA)) data analyses tools are demonstrated. Two specific human health related questions (temporal and geographic effects of ozone pollution) are discussed as motivational challenges. PMID:24465054
Giacomo, Della Riccia; Stefania, Del Zotto
2013-12-15
Fumonisins are mycotoxins produced by Fusarium species that commonly live in maize. Whereas the fungi damage plants, fumonisins cause disease in both livestock and human beings. Legal limits set the tolerable daily intake of fumonisins for several maize-based feeds and foods. Chemical techniques provide the most reliable and accurate measurements, but they are expensive and time consuming. A method based on near-infrared spectroscopy and multivariate statistical regression is described as a simpler, cheaper and faster alternative. We apply partial least squares with full cross-validation. Two models are described, having high correlations of calibration (0.995, 0.998) and of validation (0.908, 0.909), respectively. The description of the observed phenomenon is accurate and overfitting is avoided. Screening of contaminated maize against the European legal limit of 4 mg kg(-1) should be assured. PMID:23993617
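The "full cross validation" named above is leave-one-out validation: each sample is predicted by a model fitted to all of the others. A one-variable Python sketch with invented NIR-feature/fumonisin pairs follows; a real PLS model would use the whole spectrum.

```python
# Sketch of leave-one-out ("full") cross-validation for a one-predictor
# calibration. The signal/toxin values are invented for illustration.

def fit_line(x, y):
    """Ordinary least-squares slope and intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = (sum((a - mx) * (b - my) for a, b in zip(x, y))
          / sum((a - mx) ** 2 for a in x))
    return my - b1 * mx, b1

def pearson(u, v):
    """Pearson correlation coefficient."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    num = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    den = (sum((a - mu) ** 2 for a in u)
           * sum((b - mv) ** 2 for b in v)) ** 0.5
    return num / den

signal = [0.11, 0.23, 0.29, 0.41, 0.52, 0.58]   # hypothetical NIR feature
toxin = [0.9, 2.1, 2.8, 4.0, 5.1, 5.8]          # fumonisin, mg/kg

# Leave each sample out, refit, and predict it from the remaining samples.
loo_pred = []
for i in range(len(signal)):
    xs = signal[:i] + signal[i + 1:]
    ys = toxin[:i] + toxin[i + 1:]
    b0, b1 = fit_line(xs, ys)
    loo_pred.append(b0 + b1 * signal[i])

r_val = pearson(loo_pred, toxin)   # correlation of validation
```

The correlation of calibration in the abstract uses in-sample fits, while the (lower) correlation of validation is computed on these held-out predictions.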
Mac Nally, Ralph; Thomson, James R.; Kimmerer, Wim J.; Feyrer, Frederick; Newman, Ken B.; Sih, Andy; Bennett, William A.; Brown, Larry; Fleishman, Erica; Culberson, Steven D.; Castillo, Gonzalo
2010-01-01
Four species of pelagic fish of particular management concern in the upper San Francisco Estuary, California, USA, have declined precipitously since ca. 2002: delta smelt (Hypomesus transpacificus), longfin smelt (Spirinchus thaleichthys), striped bass (Morone saxatilis), and threadfin shad (Dorosoma petenense). The estuary has been monitored since the late 1960s with extensive collection of data on the fishes, their pelagic prey, phytoplankton biomass, invasive species, and physical factors. We used multivariate autoregressive (MAR) modeling to discern the main factors responsible for the declines. An expert-elicited model was built to describe the system. Fifty-four relationships were built into the model, only one of which was of uncertain direction a priori. Twenty-eight of the proposed relationships were strongly supported by or consistent with the data, while 26 were close to zero (not supported by the data but not contrary to expectations). The position of the 2-psu isohaline (X2, a measure of the physical response of the estuary to freshwater flow) and increased water clarity over the period of analyses were two factors affecting multiple declining taxa (including fishes and the fishes' main zooplankton prey). Our results were relatively robust with respect to the form of stock–recruitment model used and to inclusion of subsidiary covariates but may be enhanced by using detailed state–space models that describe more fully the life-history dynamics of the declining species.
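A minimal sketch of the MAR idea used above: a first-order model x_t = A·x_{t-1} whose coefficient matrix encodes directed interactions between the monitored series. The two synthetic series and coefficients below are purely illustrative, not estuary data.

```python
# Sketch of fitting a first-order multivariate autoregressive (MAR/VAR(1))
# model x_t = A x_{t-1} by least squares: A = C S^-1 with
# C = sum of x_t x_{t-1}^T and S = sum of x_{t-1} x_{t-1}^T.
# The coefficient matrix and series are synthetic.

def fit_var1(series):
    """series: list of 2-vectors; returns the 2x2 coefficient matrix A."""
    C = [[0.0] * 2 for _ in range(2)]
    S = [[0.0] * 2 for _ in range(2)]
    for prev, cur in zip(series, series[1:]):
        for i in range(2):
            for j in range(2):
                C[i][j] += cur[i] * prev[j]
                S[i][j] += prev[i] * prev[j]
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    Sinv = [[S[1][1] / det, -S[0][1] / det],
            [-S[1][0] / det, S[0][0] / det]]
    return [[sum(C[i][k] * Sinv[k][j] for k in range(2))
             for j in range(2)] for i in range(2)]

A_true = [[0.8, 0.1], [-0.2, 0.7]]   # hypothetical interaction matrix
x = [1.0, 0.5]
series = [x]
for _ in range(30):                   # simulate a noise-free trajectory
    x = [A_true[0][0] * x[0] + A_true[0][1] * x[1],
         A_true[1][0] * x[0] + A_true[1][1] * x[1]]
    series.append(x)

A_hat = fit_var1(series)              # recovers A_true on noise-free data
```

In the study's setting, the entries of A (with process noise added) express the expert-elicited relationships, and support for each relationship is judged from the estimated coefficients.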
NASA Astrophysics Data System (ADS)
Pajor, A.
2006-11-01
In the paper we compare the modelling ability of discrete-time multivariate Stochastic Volatility (SV) models to describe the conditional correlations between stock index returns. We consider four tri-variate SV models, which differ in the structure of the conditional covariance matrix. Specifications with zero, constant and time-varying conditional correlations are taken into account. As an example we study tri-variate volatility models for the daily log returns on the WIG, S&P 500, and FTSE 100 indexes. In order to formally compare the relative explanatory power of the SV specifications we use the Bayesian principles of comparing statistical models. Our results are based on Bayes factors and implemented through Markov Chain Monte Carlo techniques. The results indicate that the most adequate specifications are those that allow for time-varying conditional correlations and that have as many latent processes as there are conditional variances and covariances. The empirical results clearly show that the data strongly reject the assumption of constant conditional correlations.
Multivariate probabilistic projections using imperfect climate models part I: outline of methodology
NASA Astrophysics Data System (ADS)
Sexton, David M. H.; Murphy, James M.; Collins, Mat; Webb, Mark J.
2012-06-01
We demonstrate a method for making probabilistic projections of climate change at global and regional scales, using examples consisting of the equilibrium response to doubled CO2 concentrations of global annual mean temperature and regional climate changes in summer and winter temperature and precipitation over Northern Europe and England-Wales. This method combines information from a perturbed physics ensemble, a set of international climate models, and observations. Our approach is based on a multivariate Bayesian framework which enables the prediction of a joint probability distribution for several variables constrained by more than one observational metric. This is important if different sets of impacts scientists are to use these probabilistic projections to make coherent forecasts for the impacts of climate change, by inputting several uncertain climate variables into their impacts models. Unlike a single metric, multiple metrics reduce the risk of rewarding a model variant which scores well due to a fortuitous compensation of errors rather than because it is providing a realistic simulation of the observed quantity. We provide some physical interpretation of how the key metrics constrain our probabilistic projections. The method also has a quantity, called discrepancy, which represents the degree of imperfection in the climate model i.e. it measures the extent to which missing processes, choices of parameterisation schemes and approximations in the climate model affect our ability to use outputs from climate models to make inferences about the real system. Other studies have, sometimes without realising it, treated the climate model as if it had no model error. We show that omission of discrepancy increases the risk of making over-confident predictions. Discrepancy also provides a transparent way of incorporating improvements in subsequent generations of climate models into probabilistic assessments. The set of international climate models is used to derive
ERIC Educational Resources Information Center
Timmerman, Marieke E.; Kiers, Henk A. L.
2003-01-01
Discusses a class of four simultaneous component models for the explanatory analysis of multivariate time series collected from more than one subject simultaneously. Shows how the models can be ordered hierarchically and illustrates their use through an empirical example. (SLD)
Bell, Susan P; Schnelle, John; Nwosu, Samuel K; Schildcrout, Jonathan; Goggins, Kathryn; Cawthon, Courtney; Mixon, Amanda S; Vasilevskis, Eduard E; Kripalani, Sunil
2015-01-01
Objectives To identify vulnerable cardiovascular patients in the hospital using a self-reported function-based screening tool. Participants Prospective observational cohort study of 445 individuals aged ≥65 years admitted to a university medical centre hospital within the USA with acute coronary syndrome and/or decompensated heart failure. Methods Participants completed an in-person interview during hospitalisation, which included vulnerable functional status using the Vulnerable Elders Survey (VES-13), sociodemographic characteristics, healthcare utilisation practices and clinical patient-specific measures. A multivariable proportional odds logistic regression model examined associations between VES-13 and prior healthcare utilisation, as well as other coincident medical and psychosocial risk factors for poor outcomes in cardiovascular disease. Results Vulnerability was highly prevalent (54%) and associated with a higher number of clinic visits, emergency room visits and hospitalisations (all p<0.001). Multivariable analysis demonstrated that a 1-point increase in VES-13 score (greater vulnerability) was independently associated with being female (OR 1.55, p=0.030), a diagnosis of heart failure (OR 3.11, p<0.001), prior hospitalisations (OR 1.30, p<0.001), low social support (OR 1.42, p=0.007) and depression (p<0.001). A lower VES-13 score (lower vulnerability) was associated with increased health literacy (OR 0.70, p=0.002). Conclusions Vulnerability to functional decline is highly prevalent in hospitalised older cardiovascular patients and was associated with patient risk factors for adverse outcomes and an increased use of healthcare services. PMID:26316650
NASA Technical Reports Server (NTRS)
Crutcher, H. L.; Falls, L. W.
1976-01-01
Sets of experimentally determined or routinely observed data provide information about the past, present and, hopefully, future sets of similarly produced data. An infinite set of statistical models exists which may be used to describe the data sets. The normal distribution is one model. If it serves at all, it serves well. If a data set, or a transformation of the set, representative of a larger population can be described by the normal distribution, then valid statistical inferences can be drawn. There are several tests which may be applied to a data set to determine whether the univariate normal model adequately describes the set. The chi-square test based on Pearson's work in the late nineteenth and early twentieth centuries is often used. Like all tests, it has some weaknesses which are discussed in elementary texts. Extension of the chi-square test to the multivariate normal model is provided. Tables and graphs permit easier application of the test in the higher dimensions. Several examples, using recorded data, illustrate the procedures. Tests of maximum absolute differences, mean sum of squares of residuals, runs and changes of sign are included in these tests. Dimensions one through five with selected sample sizes 11 to 101 are used to illustrate the statistical tests developed.
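The univariate chi-square test discussed above compares observed bin counts with those expected under a normal model. A Python sketch against a fully specified N(mu, sigma) follows; the sample and bin edges are illustrative, and testing against a normal with estimated parameters would reduce the degrees of freedom accordingly.

```python
import math

# Sketch of a chi-square goodness-of-fit test of a sample against a fully
# specified normal model N(mu, sigma). Data and bin edges are illustrative.

def normal_cdf(x, mu, sigma):
    """Normal cumulative distribution function via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def chi_square_stat(data, mu, sigma, edges):
    """Sum of (observed - expected)^2 / expected over the bins."""
    bounds = [float("-inf")] + edges + [float("inf")]
    stat = 0.0
    for lo, hi in zip(bounds, bounds[1:]):
        observed = sum(1 for v in data if lo < v <= hi)
        expected = len(data) * (normal_cdf(hi, mu, sigma)
                                - normal_cdf(lo, mu, sigma))
        stat += (observed - expected) ** 2 / expected
    return stat

data = [-1.5, -1.2, -1.1, -0.9, -0.6, -0.5, -0.4, -0.3, -0.2, -0.1,
        0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.9, 1.1, 1.2, 1.5]
stat = chi_square_stat(data, 0.0, 1.0, [-1.0, 0.0, 1.0])
# Compare stat with a chi-square critical value for (bins - 1) degrees
# of freedom; a small statistic is consistent with the normal model.
```

The multivariate extension described in the report applies the same observed-versus-expected comparison to regions of the multivariate normal density.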
Technology Transfer Automated Retrieval System (TEKTRAN)
The mixed linear model (MLM) is currently among the most advanced and flexible statistical modeling techniques and its use in tackling problems in plant pathology has begun surfacing in the literature. The longitudinal MLM is a multivariate extension that handles repeatedly measured data, such as r...
Michalareas, George; Schoffelen, Jan-Mathijs; Paterson, Gavin; Gross, Joachim
2013-01-01
In this work, we investigate the feasibility of estimating causal interactions between brain regions based on multivariate autoregressive (MAR) models fitted to magnetoencephalographic (MEG) sensor measurements. We first demonstrate the theoretical feasibility of estimating source-level causal interactions after projection of the sensor-level model coefficients onto the locations of the neural sources. Next, we show with simulated MEG data that causality, as measured by partial directed coherence (PDC), can be correctly reconstructed if the locations of the interacting brain areas are known. We further demonstrate that, if a very large number of brain voxels is considered as potential activation sources, PDC is a less accurate measure for reconstructing causal interactions; in that case the MAR model coefficients alone contain the meaningful causality information. The proposed method overcomes the problems of model non-robustness and long computation times encountered in causality analysis by existing methods, which first project MEG sensor time series onto a large number of brain locations and then build the MAR model on this large number of source-level time series. Instead, we demonstrate that by building the MAR model at the sensor level and then projecting only the MAR coefficients into source space, the true causal pathways are recovered even when a very large number of locations are considered as sources. The main contribution of this work is that with this methodology entire-brain causality maps can be derived efficiently without any a priori selection of regions of interest. Hum Brain Mapp, 2013. © 2012 Wiley Periodicals, Inc. PMID:22328419
Nieto, P J García; Antón, J C Álvarez; Vilán, J A Vilán; García-Gonzalo, E
2015-05-01
The aim of this research work is to build a regression model of air quality by using the multivariate adaptive regression splines (MARS) technique in the Oviedo urban area (northern Spain) at a local scale. To accomplish the objective of this study, an experimental data set made up of nitrogen oxides (NOx), carbon monoxide (CO), sulfur dioxide (SO2), ozone (O3), and dust (PM10) was collected over 3 years (2006-2008). The US National Ambient Air Quality Standards (NAAQS) establish the limit values of the main pollutants in the atmosphere in order to protect public health. Firstly, this MARS regression model captures the main insight of statistical learning theory in order to obtain a good prediction of the dependence among the main pollutants in the Oviedo urban area. Secondly, the main advantages of MARS are its capacity to produce simple, easy-to-interpret models, its ability to estimate the contributions of the input variables, and its computational efficiency. Finally, on the basis of these numerical calculations using the MARS technique, the conclusions of this research work are presented. PMID:25414030
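MARS builds its regressions on piecewise-linear "hinge" basis functions max(0, x - t) and max(0, t - x). A minimal Python sketch of that basis follows; the knot location and data are illustrative, and the full MARS forward/backward pass that selects knots automatically is beyond this sketch.

```python
# Sketch of the hinge ("spline") basis underlying MARS. A response whose
# slope changes at a knot is exactly linear in the two hinge features.
# The knot t = 2.0 and the data are illustrative.

def hinge(x, knot):
    """The MARS basis pair for one variable and one knot."""
    return max(0.0, x - knot), max(0.0, knot - x)

xs = [0.0, 1.0, 2.0, 3.0, 4.0]
# Piecewise-linear response: slope -1 left of the knot, +3 right of it.
ys = [1.0 * max(0.0, 2.0 - x) + 3.0 * max(0.0, x - 2.0) for x in xs]

# MARS would regress ys on these hinge features (plus an intercept).
features = [hinge(x, 2.0) for x in xs]
```

Fitting ordinary least squares on such features recovers the two slopes exactly, which is why MARS models remain easy to interpret, one of the advantages the abstract highlights.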
Joint prediction of multiple quantitative traits using a Bayesian multivariate antedependence model
Jiang, J; Zhang, Q; Ma, L; Li, J; Wang, Z; Liu, J-F
2015-01-01
Predicting organismal phenotypes from genotype data is important for preventive and personalized medicine as well as plant and animal breeding. Although genome-wide association studies (GWAS) for complex traits have discovered a large number of trait- and disease-associated variants, phenotype prediction based on associated variants usually has low accuracy even for high-heritability traits, because these variants typically account for a limited fraction of total genetic variance. In comparison with GWAS, whole-genome prediction (WGP) methods can increase prediction accuracy by making use of a huge number of variants simultaneously. Among various statistical methods for WGP, the multiple-trait model and the antedependence model show their respective advantages. To take advantage of both strategies within a unified framework, we proposed a novel multivariate antedependence-based method for joint prediction of multiple quantitative traits using a Bayesian algorithm via modeling a linear relationship of effect vectors between each pair of adjacent markers. Through both simulation and real-data analyses, our studies demonstrated that the proposed antedependence-based multiple-trait WGP method is more accurate and robust than the corresponding traditional counterparts (Bayes A and multi-trait Bayes A) under various scenarios. Our method can be readily extended to deal with missing phenotypes and resequencing data with rare variants, offering a feasible way to jointly predict phenotypes for multiple complex traits in human genetic epidemiology as well as plant and livestock breeding. PMID:25873147
Hackstadt, Amber J.; Peng, Roger D.
2014-01-01
Time series studies have suggested that air pollution can negatively impact health. These studies have typically focused on the total mass of fine particulate matter air pollution or the individual chemical constituents that contribute to it, and not source-specific contributions to air pollution. Source-specific contribution estimates are useful from a regulatory standpoint by allowing regulators to focus limited resources on reducing emissions from sources that are major contributors to air pollution and are also desired when estimating source-specific health effects. However, researchers often lack direct observations of the emissions at the source level. We propose a Bayesian multivariate receptor model to infer information about source contributions from ambient air pollution measurements. The proposed model incorporates information from national databases containing data on both the composition of source emissions and the amount of emissions from known sources of air pollution. The proposed model is used to perform source apportionment analyses for two distinct locations in the United States (Boston, Massachusetts and Phoenix, Arizona). Our results mirror previous source apportionment analyses that did not utilize the information from national databases and provide additional information about uncertainty that is relevant to the estimation of health effects. PMID:25309119
spBayes: An R Package for Univariate and Multivariate Hierarchical Point-referenced Spatial Models
Finley, Andrew O.; Banerjee, Sudipto; Carlin, Bradley P.
2010-01-01
Scientists and investigators in such diverse fields as geological and environmental sciences, ecology, forestry, disease mapping, and economics often encounter spatially referenced data collected over a fixed set of locations with coordinates (latitude–longitude, Easting–Northing etc.) in a region of study. Such point-referenced or geostatistical data are often best analyzed with Bayesian hierarchical models. Unfortunately, fitting such models involves computationally intensive Markov chain Monte Carlo (MCMC) methods whose efficiency depends upon the specific problem at hand. This requires extensive coding on the part of the user and the situation is not helped by the lack of available software for such algorithms. Here, we introduce a statistical software package, spBayes, built upon the R statistical computing platform that implements a generalized template encompassing a wide variety of Gaussian spatial process models for univariate as well as multivariate point-referenced data. We discuss the algorithms behind our package and illustrate its use with a synthetic and real data example. PMID:21494410
Impact of Fractionation and Dose in a Multivariate Model for Radiation-Induced Chest Wall Pain
Din, Shaun U.; Williams, Eric L.; Jackson, Andrew; Rosenzweig, Kenneth E.; Wu, Abraham J.; Foster, Amanda; Yorke, Ellen D.; Rimner, Andreas
2015-10-01
Purpose: To determine the role of patient/tumor characteristics, radiation dose, and fractionation using the linear-quadratic (LQ) model to predict stereotactic body radiation therapy-induced grade ≥2 chest wall pain (CWP2) in a larger series, and to develop clinically useful constraints for patients treated with different fraction numbers. Methods and Materials: A total of 316 lung tumors in 295 patients were treated with stereotactic body radiation therapy in 3 to 5 fractions to 39 to 60 Gy. Absolute dose-absolute volume chest wall (CW) histograms were acquired. The raw dose-volume histograms (α/β = ∞ Gy) were converted via the LQ model to equivalent doses in 2-Gy fractions (normalized total dose, NTD) with α/β from 0 to 25 Gy in 0.1-Gy steps. The Cox proportional hazards (CPH) model was used in univariate and multivariate models to identify and assess CWP2 for a given physical and NTD dose. Results: The median follow-up was 15.4 months, and the median time to development of CWP2 was 7.4 months. On a univariate CPH model, prescription dose, prescription dose per fraction, number of fractions, D83cc, distance of tumor to CW, and body mass index were all statistically significant for the development of CWP2. Linear-quadratic correction improved the CPH model significance over the physical dose. The best-fit α/β was 2.1 Gy, and the physical dose (α/β = ∞ Gy) was outside the upper 95% confidence limit. With α/β = 2.1 Gy, V_NTD99Gy was most significant, with median V_NTD99Gy = 31.5 cm³ (hazard ratio 3.87, P<.001). Conclusion: There were several predictive factors for the development of CWP2. The LQ-adjusted dose using the best-fit α/β = 2.1 Gy is a better predictor of CWP2 than the physical dose. To aid dosimetrists, we have calculated the physical dose equivalent corresponding to V_NTD99Gy = 31.5 cm³ for the 3- to 5-fraction groups.
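The NTD conversion described above follows the standard LQ equivalence NTD = D·(d + α/β)/(2 + α/β), where D is the total physical dose and d the dose per fraction. A Python sketch using the best-fit α/β = 2.1 Gy from the abstract follows; the example dose is illustrative.

```python
# Sketch of the linear-quadratic conversion of a physical dose to a
# normalized total dose in 2-Gy fractions (NTD, also called EQD2), using
# the best-fit alpha/beta = 2.1 Gy reported above. The 50 Gy / 5-fraction
# example point dose is illustrative.

def ntd_2gy(total_dose, n_fractions, alpha_beta):
    """Equivalent dose in 2-Gy fractions under the LQ model."""
    d = total_dose / n_fractions   # physical dose per fraction
    return total_dose * (d + alpha_beta) / (2.0 + alpha_beta)

converted = ntd_2gy(50.0, 5, 2.1)   # NTD for a hypothetical CW point dose
```

Note that a dose already delivered in 2-Gy fractions is unchanged by the conversion, and that setting α/β = ∞ recovers the physical dose, which is why the raw histograms correspond to α/β = ∞ Gy in the abstract.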
Application of Multivariate Modeling for Radiation Injury Assessment: A Proof of Concept
Bolduc, David L.; Villa, Vilmar; Sandgren, David J.; Ledney, G. David; Blakely, William F.; Bünger, Rolf
2014-01-01
Multivariate radiation injury estimation algorithms were formulated for estimating severe hematopoietic acute radiation syndrome (H-ARS) injury (i.e., response category three or RC3) in a rhesus monkey total-body irradiation (TBI) model. Classical CBC and serum chemistry blood parameters were examined prior to irradiation (d 0) and on d 7, 10, 14, 21, and 25 after irradiation involving 24 nonhuman primates (NHP) (Macaca mulatta) given 6.5-Gy 60Co γ-rays (0.4 Gy min−1) TBI. A correlation matrix was formulated with the RC3 severity level designated as the “dependent variable” and independent variables down-selected based on their radioresponsiveness and relatively low multicollinearity using stepwise-linear regression analyses. Final candidate independent variables included CBC counts (absolute numbers of neutrophils, lymphocytes, and platelets) in formulating the “CBC” RC3 estimation algorithm. Additionally, the formulation of a diagnostic CBC and serum chemistry “CBC-SCHEM” RC3 algorithm expanded upon the CBC algorithm model with the addition of hematocrit and the serum enzyme levels of aspartate aminotransferase, creatine kinase, and lactate dehydrogenase. Both algorithms estimated RC3 with over 90% predictive power. Only the CBC-SCHEM RC3 algorithm, however, met the three critical assumptions of linear least squares; it demonstrated slightly greater precision for radiation injury estimation and significantly decreased prediction error, indicating increased statistical robustness. PMID:25165485
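A greedy forward-stepwise selection of the kind used for the down-selection above can be sketched on synthetic data; everything here (variable names, coefficients, sample size) is invented for illustration and is not the NHP dataset:

```python
import numpy as np

def forward_stepwise(X, y, names, max_vars=3):
    # Greedy forward selection: repeatedly add the predictor that most
    # increases R^2 of an OLS fit -- a simplified stand-in for the
    # stepwise-linear down-selection described in the abstract.
    def r2(cols):
        A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        return 1.0 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))
    chosen, remaining = [], list(range(X.shape[1]))
    while remaining and len(chosen) < max_vars:
        best = max(remaining, key=lambda c: r2(chosen + [c]))
        chosen.append(best)
        remaining.remove(best)
    return [names[c] for c in chosen], r2(chosen)

# Invented stand-in data: "injury severity" driven mainly by neutrophil
# and platelet counts (not the real NHP measurements).
rng = np.random.default_rng(0)
neut, lymph, plate = rng.normal(size=(3, 40))
y = 2.0 * neut + 1.0 * plate + 0.1 * rng.normal(size=40)
order, fit = forward_stepwise(np.column_stack([neut, lymph, plate]), y,
                              ["neutrophils", "lymphocytes", "platelets"])
print(order[0], round(fit, 2))
```

With the dominant coefficient on neutrophils, the procedure picks that variable first, mirroring how radioresponsive parameters rise to the top of a stepwise fit.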
Wang, Xiuquan; Huang, Guohe; Zhao, Shan; Guo, Junhong
2015-09-01
This paper presents an open-source software package, rSCA, which is developed based upon a stepwise cluster analysis method and serves as a statistical tool for modeling the relationships between multiple dependent and independent variables. The rSCA package is efficient in dealing with both continuous and discrete variables, as well as nonlinear relationships between the variables. It divides the sample sets of dependent variables into different subsets (or subclusters) through a series of cutting and merging operations based upon the theory of multivariate analysis of variance (MANOVA). The modeling results are given by a cluster tree, which includes both intermediate and leaf subclusters as well as the flow paths from the root of the tree to each leaf subcluster specified by a series of cutting and merging actions. The rSCA package is a handy and easy-to-use tool and is freely available at http://cran.r-project.org/package=rSCA . By applying the developed package to air quality management in an urban environment, we demonstrate its effectiveness in dealing with the complicated relationships among multiple variables in real-world problems. PMID:25966889
NASA Astrophysics Data System (ADS)
Sakaguchi, Kaori; Nagatsuma, Tsutomu; Reeves, Geoffrey D.; Spence, Harlan E.
2015-12-01
The Van Allen radiation belts surrounding the Earth are filled with MeV-energy electrons. This region poses ionizing radiation risks for spacecraft that operate within it, including those in geostationary orbit (GEO) and medium Earth orbit. To provide alerts of electron flux enhancements, 16 prediction models of the electron log-flux variation throughout the equatorial outer radiation belt as a function of the McIlwain L parameter were developed using the multivariate autoregressive model and Kalman filter. Measurements of omnidirectional 2.3 MeV electron flux from the Van Allen Probes mission as well as >2 MeV electrons from the GOES 15 spacecraft were used as the predictors. Model explanatory parameters were selected from solar wind parameters, the electron log-flux at GEO, and geomagnetic indices. For the innermost region of the outer radiation belt, the electron flux is best predicted by using the Dst index as the sole input parameter. For the central to outermost regions, at L ≥ 4.8 and L ≥ 5.6, the electron flux is predicted most accurately by also including the solar wind velocity and then the dynamic pressure, respectively. The Dst index is the best overall single parameter for predicting at 3 ≤ L ≤ 6, while for the GEO flux prediction, the Kp index is better than Dst. A test calculation demonstrates that the model successfully predicts the timing and location of the flux maximum as much as 2 days in advance and that the electron flux decreases faster with time at higher L values, both model features consistent with the actually observed behavior.
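The autoregressive core of such a predictor can be sketched as a one-step ARX regression on synthetic data; the real models wrap this multivariate AR structure in a Kalman filter and use actual Dst/solar wind inputs, and all numbers below are invented:

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented stand-in for daily electron log-flux driven by a Dst-like
# index (the real models use Van Allen Probes / GOES data).
n = 400
dst = np.convolve(rng.normal(size=n), np.ones(5) / 5, mode="same")
logflux = np.zeros(n)
for t in range(1, n):
    logflux[t] = 0.8 * logflux[t - 1] - 0.5 * dst[t - 1] + 0.1 * rng.normal()

# One-step ARX(1) predictor: logflux[t] ~ a*logflux[t-1] + b*dst[t-1] + c
A = np.column_stack([logflux[:-1], dst[:-1], np.ones(n - 1)])
coef, *_ = np.linalg.lstsq(A, logflux[1:], rcond=None)
pred = A @ coef
rmse = np.sqrt(np.mean((logflux[1:] - pred) ** 2))
print(np.round(coef[:2], 2), round(rmse, 2))
```

The regression recovers the persistence and driver coefficients (0.8 and -0.5) from the data, the same structure that lets geomagnetic indices forecast flux a day or two ahead.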
NASA Astrophysics Data System (ADS)
Ayoko, Godwin A.; Singh, Kirpal; Balerea, Steven; Kokot, Serge
2007-03-01
Physico-chemical properties of surface water and groundwater samples from some developing countries have been subjected to multivariate analyses by the non-parametric multi-criteria decision-making methods, PROMETHEE and GAIA. Complete ranking information necessary to select one source of water in preference to all others was obtained, and this enabled relationships between the physico-chemical properties and water quality to be assessed. Thus, the ranking of the quality of the water bodies was found to be strongly dependent on the total dissolved solid, phosphate, sulfate, ammonia-nitrogen, calcium, iron, chloride, magnesium, zinc, nitrate and fluoride contents of the waters. However, potassium, manganese and zinc composition showed the least influence in differentiating the water bodies. To model and predict the water quality influencing parameters, partial least squares analyses were carried out on a matrix made up of the results of water quality assessment studies carried out in Nigeria, Papua New Guinea, Egypt, Thailand and India/Pakistan. The results showed that the total dissolved solid, calcium, sulfate, sodium and chloride contents can be used to predict a wide range of physico-chemical characteristics of water. The potential implications of these observations on the financial and opportunity costs associated with elaborate water quality monitoring are discussed.
Multivariate Multi-data Assimilation System in Regional Model with High Resolution
NASA Astrophysics Data System (ADS)
Benkiran, M.; Chanut, J.; Giraud St Albin, S.; Drillet, Y.
2010-12-01
Mercator Ocean has developed a regional North East Shelf forecasting system over the North East Atlantic, taking advantage of recent developments in NEMO (1/12°). This regional forecasting system uses boundary conditions from the operational real-time Mercator Ocean North Atlantic high-resolution system (1/12°). The assimilation component of the Mercator Ocean system is based on a reduced-order Kalman filter (the SEEK, or Singular Extended Evolutive Kalman filter). The error statistics are represented in a sub-space spanned by a small number of dominant 3D error directions. The data assimilation system constrains the model in a multivariate way with Sea Surface Temperature (RTG-SST), together with all available satellite Sea Level Anomalies and with in situ observations from the CORIOLIS database, including ARGO float temperature and salinity measurements. Finally, we used the PALM coupler, which provides a general structure for a modular implementation of a data assimilation system and makes changes to the analysis algorithm easier. We will compare the results obtained with the regional forecast system (1/12°), with IAU (Incremental Analysis Updates), to those obtained with the Mercator Ocean North Atlantic high-resolution system (1/12°).
Multivariate Multi-Data Assimilation System in Regional Model With High Resolution
NASA Astrophysics Data System (ADS)
Benkiran, M.; Chanut, J.; Greiner, E.; Giraud St Albin, S.; Drillet, Y.
2008-12-01
Mercator Ocean has developed a regional North East Shelf forecasting system over the North East Atlantic, taking advantage of recent developments in NEMO (1/12°). This regional forecasting system uses boundary conditions from the operational real-time Mercator Ocean North Atlantic high-resolution system (1/12°). The assimilation component of the Mercator Ocean system is based on a reduced-order Kalman filter (the SEEK, or Singular Extended Evolutive Kalman filter). The error statistics are represented in a sub-space spanned by a small number of dominant 3D error directions. The data assimilation system constrains the model in a multivariate way with Sea Surface Temperature (RTG-SST), together with all available satellite Sea Level Anomalies and with in situ observations from the CORIOLIS database, including ARGO float temperature and salinity measurements. Finally, we used the PALM coupler, which provides a general structure for a modular implementation of a data assimilation system and makes changes to the analysis algorithm easier. We will compare the results obtained with the regional forecast system (1/12°), with IAU (Incremental Analysis Updates), to those obtained with the Mercator Ocean North Atlantic high-resolution system (1/12°).
Multivariate Multi-data Assimilation System in Regional Model with High Resolution
NASA Astrophysics Data System (ADS)
Benkiran, M.; Bourdalle-Badie, R.; Drillet, Y.; Greiner, E.; Chanut, J.
2009-12-01
Mercator Ocean has developed a regional North East Shelf forecasting system over the North East Atlantic, taking advantage of recent developments in NEMO (1/12°). This regional forecasting system uses boundary conditions from the operational real-time Mercator Ocean North Atlantic high-resolution system (1/12°). The assimilation component of the Mercator Ocean system is based on a reduced-order Kalman filter (the SEEK, or Singular Extended Evolutive Kalman filter). The error statistics are represented in a sub-space spanned by a small number of dominant 3D error directions. The data assimilation system constrains the model in a multivariate way with Sea Surface Temperature (RTG-SST), together with all available satellite Sea Level Anomalies and with in situ observations from the CORIOLIS database, including ARGO float temperature and salinity measurements. Finally, we used the PALM coupler, which provides a general structure for a modular implementation of a data assimilation system and makes changes to the analysis algorithm easier. We will compare the results obtained with the regional forecast system (1/12°), with IAU (Incremental Analysis Updates), to those obtained with the Mercator Ocean North Atlantic high-resolution system (1/12°).
Multivariate time series modeling of short-term system scale irrigation demand
NASA Astrophysics Data System (ADS)
Perera, Kushan C.; Western, Andrew W.; George, Biju; Nawarathna, Bandara
2015-12-01
Travel time limits the ability of irrigation system operators to react to short-term irrigation demand fluctuations that result from variations in weather, including very hot periods and rainfall events, as well as the various other pressures and opportunities that farmers face. Short-term system-wide irrigation demand forecasts can assist in system operation. Here we developed a multivariate time series (ARMAX) model to forecast irrigation demands with respect to aggregated service point flows (IDCGi, ASP) and off-take regulator flows (IDCGi, OTR) across 5 command areas, which included the areas covered by four irrigation channels and the whole study area. These command-area-specific ARMAX models forecast 1-5 days ahead daily IDCGi, ASP and IDCGi, OTR using the real-time flow data recorded at the service points and the uppermost regulators and observed meteorological data collected from automatic weather stations. The model efficiency and the predictive performance were quantified using the root mean squared error (RMSE), Nash-Sutcliffe model efficiency coefficient (NSE), anomaly correlation coefficient (ACC) and mean square skill score (MSSS). During the evaluation period, NSE for IDCGi, ASP and IDCGi, OTR across the 5 command areas ranged from 0.78 to 0.98. These models were capable of generating skillful forecasts (MSSS ⩾ 0.5 and ACC ⩾ 0.6) of IDCGi, ASP and IDCGi, OTR for all 5 lead days, and the IDCGi, ASP and IDCGi, OTR forecasts were better than using the long-term monthly mean irrigation demand. Overall, the predictive performance of these ARMAX time series models was higher than in almost all previous studies we are aware of. Further, the IDCGi, ASP and IDCGi, OTR forecasts improved the operators' ability to react to near-future irrigation demand fluctuations, as the developed ARMAX time series models were self-adaptive to reflect short-term changes in irrigation demand with respect to various pressures and opportunities that farmers face, such as
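The two skill metrics quoted above, NSE and MSSS against a reference forecast, can be sketched as follows (toy demand numbers, not the study data):

```python
import numpy as np

def nse(obs, sim):
    # Nash-Sutcliffe efficiency: 1 - SSE / total variance of observations
    obs, sim = np.asarray(obs, float), np.asarray(sim, float)
    return 1.0 - np.sum((obs - sim) ** 2) / np.sum((obs - obs.mean()) ** 2)

def msss(obs, sim, ref):
    # Mean square skill score of sim against a reference forecast
    mse = lambda a, b: np.mean((np.asarray(a, float) - np.asarray(b, float)) ** 2)
    return 1.0 - mse(obs, sim) / mse(obs, ref)

# Toy daily demands (invented), with the long-term mean as reference
obs = [10.0, 12.0, 15.0, 11.0, 9.0, 14.0]
sim = [10.5, 11.5, 14.2, 11.3, 9.4, 13.6]
ref = [float(np.mean(obs))] * len(obs)
print(round(nse(obs, sim), 2), round(msss(obs, sim, ref), 2))
```

With a mean-of-observations reference, MSSS reduces algebraically to NSE, which is why the two printed values agree; a monthly-mean climatology reference, as used in the study, makes the two metrics differ.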
Snell, Kym I.E.; Hua, Harry; Debray, Thomas P.A.; Ensor, Joie; Look, Maxime P.; Moons, Karel G.M.; Riley, Richard D.
2016-01-01
Objectives: Our aim was to improve meta-analysis methods for summarizing a prediction model's performance when individual participant data are available from multiple studies for external validation. Study Design and Setting: We suggest multivariate meta-analysis for jointly synthesizing calibration and discrimination performance, while accounting for their correlation. The approach estimates a prediction model's average performance, the heterogeneity in performance across populations, and the probability of “good” performance in new populations. This allows different implementation strategies (e.g., recalibration) to be compared. Application is made to a diagnostic model for deep vein thrombosis (DVT) and a prognostic model for breast cancer mortality. Results: In both examples, multivariate meta-analysis reveals that calibration performance is excellent on average but highly heterogeneous across populations unless the model's intercept (baseline hazard) is recalibrated. For the cancer model, the probability of “good” performance (defined by C statistic ≥0.7 and calibration slope between 0.9 and 1.1) in a new population was 0.67 with recalibration but 0.22 without recalibration. For the DVT model, even with recalibration, there was only a 0.03 probability of “good” performance. Conclusion: Multivariate meta-analysis can be used to externally validate a prediction model's calibration and discrimination performance across multiple populations and to evaluate different implementation strategies. PMID:26142114
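The "probability of good performance in a new population" can be approximated by Monte Carlo once pooled means, heterogeneity, and their correlation are in hand; this sketch draws (C statistic, calibration slope) pairs from a bivariate normal with invented parameters, a simplification of the multivariate random-effects distribution the paper actually estimates:

```python
import numpy as np

rng = np.random.default_rng(2)

# Invented pooled performance and between-population heterogeneity
mean = np.array([0.72, 1.00])        # pooled C statistic, calibration slope
sd = np.array([0.04, 0.08])          # heterogeneity SDs (assumed)
corr = 0.3                           # assumed C-slope correlation
cov = np.array([[sd[0] ** 2, corr * sd[0] * sd[1]],
                [corr * sd[0] * sd[1], sd[1] ** 2]])

# Draw performance in many hypothetical new populations and apply the
# same "good" definition as the abstract: C >= 0.7, slope in [0.9, 1.1]
draws = rng.multivariate_normal(mean, cov, size=200_000)
good = (draws[:, 0] >= 0.7) & (draws[:, 1] >= 0.9) & (draws[:, 1] <= 1.1)
print(round(good.mean(), 2))
```

Shrinking the heterogeneity SDs (as recalibration does for the slope) raises this probability, which is the mechanism behind the 0.22 → 0.67 improvement reported for the cancer model.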
Multivariate dynamical systems models for estimating causal interactions in fMRI
Ryali, Srikanth; Supekar, Kaustubh; Chen, Tianwen; Menon, Vinod
2010-01-01
Analysis of dynamical interactions between distributed brain areas is of fundamental importance for understanding cognitive information processing. However, estimating dynamic causal interactions between brain regions using functional magnetic resonance imaging (fMRI) poses several unique challenges. For one, fMRI measures Blood Oxygenation Level Dependent (BOLD) signals, rather than the underlying latent neuronal activity. Second, regional variations in the hemodynamic response function (HRF) can significantly influence estimation of causal interactions between regions. Third, causal interactions between brain regions can change with experimental context over time. To overcome these problems, we developed a novel state-space Multivariate Dynamical Systems (MDS) model to estimate intrinsic and experimentally induced modulatory causal interactions between multiple brain regions. A probabilistic graphical framework is then used to estimate the parameters of MDS as applied to fMRI data. We show that MDS accurately takes into account regional variations in the HRF and estimates dynamic causal interactions at the level of latent signals. We develop and compare two estimation procedures using maximum likelihood estimation (MLE) and variational Bayesian (VB) approaches for inferring model parameters. Using extensive computer simulations, we demonstrate that, compared to Granger causal analysis (GCA), MDS exhibits superior performance for a wide range of signal-to-noise ratios (SNRs), sample lengths and network sizes. Our simulations also suggest that GCA fails to uncover causal interactions when there is a conflict between the direction of intrinsic and modulatory influences. Furthermore, we show that MDS estimation using VB methods is more robust and performs significantly better at low SNRs and shorter time series than MDS with MLE. Our study suggests that VB estimation of MDS provides a robust method for estimating and interpreting causal network interactions in fMRI data.
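The Granger-causal baseline that MDS is compared against can be illustrated with a two-region toy system in which x drives y at lag 1; this is purely an invented example with no HRF or latent-signal modeling:

```python
import numpy as np

rng = np.random.default_rng(3)

# Toy two-region system: x drives y with lag 1, not vice versa
n = 500
x, y = np.zeros(n), np.zeros(n)
for t in range(1, n):
    x[t] = 0.5 * x[t - 1] + rng.normal()
    y[t] = 0.5 * y[t - 1] + 0.6 * x[t - 1] + rng.normal()

def resid_var(target, other=None):
    # Residual variance of a lag-1 autoregression, optionally adding
    # the other region's past as a predictor.
    cols = [target[:-1], np.ones(n - 1)]
    if other is not None:
        cols.append(other[:-1])
    A = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(A, target[1:], rcond=None)
    return np.var(target[1:] - A @ beta)

# Granger-style gains: past of x helps predict y, but not vice versa
gain_xy = 1 - resid_var(y, x) / resid_var(y)
gain_yx = 1 - resid_var(x, y) / resid_var(x)
print(round(gain_xy, 2), round(gain_yx, 2))
```

A large gain in one direction and a near-zero gain in the other is the Granger signature of directed influence; the abstract's point is that this signature can fail on BOLD data, which is what motivates the latent-state MDS approach.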
Multivariate calibration modeling of liver oxygen saturation using near-infrared spectroscopy
NASA Astrophysics Data System (ADS)
Cingo, Ndumiso A.; Soller, Babs R.; Puyana, Juan C.
2000-05-01
The liver has been identified as an ideal site to spectroscopically monitor changes in oxygen saturation during liver transplantation and shock because it is susceptible to reduced blood flow and oxygen transport. Near-IR spectroscopy, combined with multivariate calibration techniques, has been shown to be a viable technique for monitoring oxygen saturation changes in various organs in a minimally invasive manner. The liver has a dual-system circulation: blood enters the liver through the portal vein and hepatic artery, and leaves through the hepatic vein. Therefore, it is of utmost importance to determine how the liver NIR spectroscopic information correlates with the different regions of the hepatic lobule as the dual circulation flows from the presinusoidal space into the post-sinusoidal region of the central vein. For NIR spectroscopic information to reliably represent the status of liver oxygenation, the NIR oxygen saturation should best correlate with the post-sinusoidal region. In a series of six pigs undergoing induced hemorrhagic shock, NIR spectra collected from the liver were used together with oxygen saturation reference data from the hepatic and portal veins, and an average of the two, to build partial least-squares regression models. Results obtained from these models show that the hepatic vein, and an average of the hepatic and portal veins, provide information that correlates best with the NIR spectral information, while the portal vein reference measurement provides poorer correlation and accuracy. These results indicate that NIR determination of oxygen saturation in the liver can provide an assessment of liver oxygen utilization.
Multivariate analysis of groundwater quality and modeling impact of ground heat pump system
NASA Astrophysics Data System (ADS)
Thuyet, D. Q.; Saito, H.; Muto, H.; Saito, T.; Hamamoto, S.; Komatsu, T.
2013-12-01
The ground source heat pump system (GSHP) has recently become a popular building heating or cooling method, especially in North America, Western Europe, and Asia, due to advantages in reducing energy consumption and greenhouse gas emissions. Because of the stability of the ground temperature, GSHP can effectively exchange the excess or demanded heat of the building with the ground during building air conditioning in the different seasons. The extensive use of GSHP can potentially disturb subsurface soil temperature and thus groundwater quality. Therefore, the assessment of subsurface thermal and environmental impacts from GSHP operations is necessary to ensure sustainable use of the GSHP system as well as the safe use of groundwater resources. This study aims to monitor groundwater quality during GSHP operation and to develop a numerical model to assess changes in subsurface soil temperature and in groundwater quality as affected by GSHP operation. A GSHP system was installed in Fuchu city, Tokyo, and consists of two closed double U-tubes (50-m length) buried vertically in the ground, 7.3 m apart, located outside a building. An anti-freezing solution was circulated inside the U-tubes for exchanging heat between the building and the ground. The temperature at every 5-m depth and the groundwater quality, including concentrations of 16 trace elements, pH, EC, Eh and DO in the shallow aquifer (32-m depth) and the deep aquifer (44-m depth), were monitored monthly since 2012 in an observation well installed 3 m from the center of the two U-tubes. Temporal variations of each element were evaluated using multivariate analysis and geostatistics. A three-dimensional heat exchange model was developed in COMSOL Multiphysics 4.3b to simulate the heat exchange processes in subsurface soils. Results showed the difference in groundwater quality between the shallow and deep aquifers to be significant for some element concentrations and DO, but
Sediment fingerprinting experiments to test the sensitivity of multivariate mixing models
NASA Astrophysics Data System (ADS)
Gaspar, Leticia; Blake, Will; Smith, Hugh; Navas, Ana
2014-05-01
(e.g. P). In general, the best fits between actual and modeled proportions were found using a set of nine tracer properties (Sr, Rb, Fe, Ti, Ca, Al, P, Si, K, Si) that were derived using DFA coupled with a multivariate stepwise algorithm, with errors between real and estimated values that did not exceed 6.7% and values of GOF above 94.5%. The second set of experiments aimed to explore the sensitivity of model output to variability in the particle size of source materials, assuming that a degree of fluvial sorting of the resulting mixture took place. Most particle size correction procedures assume grain size effects are consistent across sources and tracer properties, which is not always the case. Consequently, the < 40 µm fraction of selected soil mixtures was analysed to simulate the effect of selective fluvial transport of finer particles, and the results were compared to those for source materials. Preliminary findings from this experiment demonstrate the sensitivity of the numerical mixing model outputs to different particle size distributions of source material and the variable impact of fluvial sorting on end-member signatures used in mixing models. The results suggest that particle size correction procedures require careful scrutiny in the context of variable source characteristics.
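The core of such a multivariate mixing model is a constrained least-squares unmixing of tracer signatures; a minimal sketch with invented source signatures and a known mixture:

```python
import numpy as np
from scipy.optimize import nnls

# Invented tracer signatures: rows = tracer properties, cols = sources
sources = np.array([[12.0, 30.0, 22.0],    # e.g. Sr
                    [ 5.0, 14.0,  9.0],    # e.g. Rb
                    [ 8.0,  2.0,  6.0],    # e.g. Ca
                    [40.0, 55.0, 70.0]])   # e.g. Fe
true_p = np.array([0.5, 0.2, 0.3])
mixture = sources @ true_p                 # a perfectly mixed sample

# Non-negative least squares recovers the source proportions
p, resid = nnls(sources, mixture)
p = p / p.sum()                            # renormalize to sum to 1
print(np.round(p, 2), round(resid, 3))
```

Real applications fit many sediment samples with measurement noise and report a goodness of fit; the exact-recovery case here just checks the machinery, and the particle-size sensitivity discussed above enters through how `sources` shifts when signatures are measured on different size fractions.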
Technology Transfer Automated Retrieval System (TEKTRAN)
We examined multivariate relationships in structural carbohydrates plus lignin (STC) and non-structural (NSC) carbohydrates and their impact on C:N ratio and the dynamics of active (ka) and passive (kp) residue decomposition of alfalfa, corn, soybean, cuphea and switchgrass as candidates in diverse ...
Technology Transfer Automated Retrieval System (TEKTRAN)
Wheat grain attributes that influence tortilla quality are not fully understood. This impedes genetic improvement efforts to develop wheat varieties for the growing market. This study used a multivariate discriminant analysis to predict tortilla quality using a set of 16 variables derived from kerne...
Bilgel, Murat; Prince, Jerry L; Wong, Dean F; Resnick, Susan M; Jedynak, Bruno M
2016-07-01
It is important to characterize the temporal trajectories of disease-related biomarkers in order to monitor progression and identify potential points of intervention. These are especially important for neurodegenerative diseases, as therapeutic intervention is most likely to be effective in the preclinical disease stages prior to significant neuronal damage. Neuroimaging allows for the measurement of structural, functional, and metabolic integrity of the brain at the level of voxels, whose volumes are on the order of mm³. These voxelwise measurements provide a rich collection of disease indicators. Longitudinal neuroimaging studies enable the analysis of changes in these voxelwise measures. However, commonly used longitudinal analysis approaches, such as linear mixed effects models, do not account for the fact that individuals enter a study at various disease stages and progress at different rates, and generally consider each voxelwise measure independently. We propose a multivariate nonlinear mixed effects model for estimating the trajectories of voxelwise neuroimaging biomarkers from longitudinal data that accounts for such differences across individuals. The method involves the prediction of a progression score for each visit based on a collective analysis of voxelwise biomarker data within an expectation-maximization framework that efficiently handles large amounts of measurements and variable numbers of visits per individual, and accounts for spatial correlations among voxels. This score allows individuals with similar progressions to be aligned and analyzed together, which enables the construction of a trajectory of brain changes as a function of an underlying progression or disease stage. We apply our method to studying cortical β-amyloid deposition, a hallmark of preclinical Alzheimer's disease, as measured using positron emission tomography. Results on 104 individuals with a total of 300 visits suggest that precuneus is the earliest cortical region to
A MULTIVARIATE FIT LUMINOSITY FUNCTION AND WORLD MODEL FOR LONG GAMMA-RAY BURSTS
Shahmoradi, Amir
2013-04-01
It is proposed that the luminosity function, the rest-frame spectral correlations, and distributions of cosmological long-duration (Type-II) gamma-ray bursts (LGRBs) may be very well described as a multivariate log-normal distribution. This result is based on careful selection, analysis, and modeling of LGRBs' temporal and spectral variables in the largest catalog of GRBs available to date: 2130 BATSE GRBs, while taking into account the detection threshold and possible selection effects. Constraints on the joint rest-frame distribution of the isotropic peak luminosity (L_iso), total isotropic emission (E_iso), the time-integrated spectral peak energy (E_p,z), and duration (T_90,z) of LGRBs are derived. The presented analysis provides evidence for a relatively large fraction of LGRBs that have been missed by the BATSE detector, with E_iso extending down to ~10^49 erg and observed spectral peak energies (E_p) as low as ~5 keV. LGRBs with rest-frame duration T_90,z ≲ 1 s or observer-frame duration T_90 ≲ 2 s appear to be rare events (≲ 0.1% chance of occurrence). The model predicts a fairly strong and highly significant correlation (ρ = 0.58 ± 0.04) between E_iso and E_p,z of LGRBs. Also predicted are strong correlations of L_iso and E_iso with T_90,z and a moderate correlation between L_iso and E_p,z. The strength and significance of the correlations found encourage the search for underlying mechanisms, though they undermine the correlations' capabilities as probes of dark energy's equation of state at high redshifts. The presented analysis favors, but does not necessitate, a cosmic rate for BATSE LGRBs tracing metallicity evolution consistent with a cutoff Z/Z☉ ~ 0.2-0.5, assuming no luminosity-redshift evolution.
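A multivariate log-normal world model of this kind can be illustrated by drawing correlated burst properties; all parameter values below are invented placeholders, not the paper's fitted constraints, with the correlation set to 0.58 to echo the E_iso-E_p,z value quoted above:

```python
import numpy as np

rng = np.random.default_rng(4)

# Invented placeholders: median L_iso-like [erg] and E_p,z-like [keV]
# values, log-space scatters, and a 0.58 correlation.
mu = np.log(np.array([1e51, 300.0]))
sig = np.array([0.9, 0.6])
rho = 0.58
cov = np.array([[sig[0] ** 2, rho * sig[0] * sig[1]],
                [rho * sig[0] * sig[1], sig[1] ** 2]])

# A multivariate log-normal is a multivariate normal in log space
samples = np.exp(rng.multivariate_normal(mu, cov, size=100_000))
emp_rho = np.corrcoef(np.log(samples[:, 0]), np.log(samples[:, 1]))[0, 1]
print(round(emp_rho, 2))
```

Fitting such a model to real catalogs additionally requires a detection-threshold term in the likelihood, which is the hard part of the analysis described above.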
Yu, P.
2008-01-01
More recently, an advanced synchrotron radiation-based bioanalytical technique (SRFTIRM) has been applied as a novel non-invasive analysis tool to study molecular, functional group and biopolymer chemistry, nutrient make-up and structural conformation in biomaterials. This novel synchrotron technique, taking advantage of bright synchrotron light (which is a million times brighter than sunlight), is capable of exploring biomaterials at the molecular and cellular levels. However, with the SRFTIRM technique, a large number of molecular spectral data are usually collected. The objective of this article was to illustrate how to use two multivariate statistical techniques: (1) agglomerative hierarchical cluster analysis (AHCA) and (2) principal component analysis (PCA), and two advanced multicomponent modeling methods: (1) Gaussian and (2) Lorentzian multi-component peak modeling, for molecular spectrum analysis of bio-tissues. The studies indicated that the two multivariate analyses (AHCA, PCA) are able to create molecular spectral correlations by including not just one intensity or frequency point of a molecular spectrum, but by utilizing the entire spectral information. Gaussian and Lorentzian modeling techniques are able to quantify spectral component peaks of molecular structure, functional groups and biopolymers. By applying these four methods (the two multivariate techniques and Gaussian and Lorentzian modeling), inherent molecular structures, functional groups and biopolymer conformation between and among biological samples can be quantified, discriminated and classified with great efficiency.
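Gaussian and Lorentzian multi-component peak modeling of a spectrum reduces to nonlinear least squares; a sketch on a synthetic two-peak "spectrum" (wavenumbers and peak parameters invented):

```python
import numpy as np
from scipy.optimize import curve_fit

def gaussian(x, a, x0, w):
    return a * np.exp(-((x - x0) ** 2) / (2 * w ** 2))

def lorentzian(x, a, x0, w):
    return a * w ** 2 / ((x - x0) ** 2 + w ** 2)

def two_peaks(x, a1, c1, w1, a2, c2, w2):
    # One Gaussian plus one Lorentzian component
    return gaussian(x, a1, c1, w1) + lorentzian(x, a2, c2, w2)

# Synthetic two-peak "spectrum" with a little noise
rng = np.random.default_rng(5)
x = np.linspace(1500, 1700, 400)
y = two_peaks(x, 1.0, 1550, 10, 0.7, 1650, 8) + 0.01 * rng.normal(size=x.size)

p0 = [0.8, 1545, 12, 0.5, 1645, 10]        # rough initial guesses
popt, _ = curve_fit(two_peaks, x, y, p0=p0)
print(np.round(popt[[1, 4]], 1))           # fitted peak centres
```

The fitted amplitudes, centres, and widths quantify the component peaks, which is the information the abstract uses to compare functional groups across tissues.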
NASA Astrophysics Data System (ADS)
McFadden, Fiona J. Stevens; Welch, Barry J.; Austin, Paul C.
2006-02-01
This paper investigates the application of multivariable model-based control to improve the regulatory control of electrolyte temperature, aluminum fluoride concentration, liquidus temperature, superheat, and electrolyte height. Also examined are the appropriateness of different control structures and the possible inclusion of recently developed sensors for alumina concentration and individual cell duct flowrate, temperature, and heat loss. For the smelter in this study, the maximum improvement possible with a multivariable model-based controller is predicted to be a 30-40% reduction in the standard deviation of electrolyte temperature, aluminum fluoride concentration, liquidus temperature, and superheat, and around half this for electrolyte height. Three control structures were found to be appropriate; all are different from the existing control structure, which was found to be suboptimal. Linear Quadratic Gaussian controllers were designed for each control structure and their predicted performance compared.
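The state-feedback half of a Linear Quadratic Gaussian design (the LQR gain) can be sketched for a toy two-state system; the matrices below are invented stand-ins, not a real reduction-cell model:

```python
import numpy as np
from scipy.linalg import solve_discrete_are

# Toy two-state, one-input plant (matrices invented)
A = np.array([[0.95, 0.10],
              [0.00, 0.90]])
B = np.array([[0.0],
              [0.5]])
Q = np.diag([1.0, 1.0])     # state penalty
R = np.array([[0.1]])       # control penalty

# Discrete-time LQR: solve the Riccati equation, then the optimal gain
P = solve_discrete_are(A, B, Q, R)
K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
closed = A - B @ K
radius = max(abs(np.linalg.eigvals(closed)))
print(round(float(radius), 3))   # spectral radius < 1: stable loop
```

A full LQG controller pairs this gain with a Kalman-filter state estimate; comparing such designs across control structures is what the paper does with the identified smelter model.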
Covariate-Adjusted Linear Mixed Effects Model with an Application to Longitudinal Data
Nguyen, Danh V.; Şentürk, Damla; Carroll, Raymond J.
2009-01-01
Linear mixed effects (LME) models are useful for longitudinal data/repeated measurements. We propose a new class of covariate-adjusted LME models for longitudinal data that nonparametrically adjusts for a normalizing covariate. The proposed approach involves fitting a parametric LME model to the data after adjusting for the nonparametric effects of a baseline confounding covariate. In particular, the effect of the observable covariate on the response and predictors of the LME model is modeled nonparametrically via smooth unknown functions. In addition to covariate-adjusted estimation of fixed/population parameters and random effects, an estimation procedure for the variance components is also developed. Numerical properties of the proposed estimators are investigated with simulation studies. The consistency and convergence rates of the proposed estimators are also established. An application to a longitudinal data set on calcium absorption, accounting for baseline distortion from body mass index, illustrates the proposed methodology. PMID:19266053
NASA Astrophysics Data System (ADS)
Bernasocchi, M.; Coltekin, A.; Gruber, S.
2012-07-01
In environmental change studies, multiple variables are often measured or modelled, and temporal information is essential for the task. These multivariate geographic time-series datasets are often big and difficult to analyse. While many established methods such as PCP (parallel coordinate plots), STC (space-time cubes), scatter-plots and multiple (linked) visualisations help provide more information, we observe that most of the common geovisual analytics suites do not include three-dimensional (3D) visualisations. However, in many environmental studies, we hypothesize that the addition of 3D terrain visualisations, along with appropriate data plots and two-dimensional views, can improve the analysts' ability to interpret spatial relevance. To test our ideas, we conceptualize, develop, implement and evaluate a geovisual analytics toolbox in a user-centred manner. The conceptualization of the tool is based on concrete user needs that were identified and collected during informal brainstorming sessions and in a structured focus group session prior to the development. The design process, therefore, is based on a combination of user-centred design with a requirement analysis and agile development. Based on the findings from this phase, the toolbox was designed to have a modular structure and was built on the open-source geographic information systems (GIS) program Quantum GIS (QGIS), thus benefiting from existing GIS functionality. The modules include a globe view for 3D terrain visualisation (OSGEarth), a scattergram, a time vs. value plot, and a 3D helix visualisation, as well as the possibility to view the raw data. The visualisation frame allows real-time linking of these representations. After the design and development stage, a case study was created featuring data from the Zermatt valley, and the toolbox was evaluated based on expert interviews. Analysts performed multiple spatial and temporal tasks with the case study using the toolbox. The expert
A New Climate Adjustment Tool: An update to EPA’s Storm Water Management Model
The US EPA’s newest tool, the Stormwater Management Model (SWMM) – Climate Adjustment Tool (CAT), is meant to help municipal stormwater utilities better address potential climate change impacts affecting their operations.
Liu, Xianhua; Wang, Lili
2015-01-01
A series of ultraviolet-visible (UV-Vis) spectra from seawater samples collected from sites along the coastline of Tianjin Bohai Bay in China were subjected to multivariate partial least squares (PLS) regression analysis. Calibration models were developed for monitoring chemical oxygen demand (COD) and concentrations of total organic carbon (TOC). Three different PLS models were developed using the spectra from raw samples (Model-1), diluted samples (Model-2), and diluted and raw samples combined (Model-3). Experimental results showed that: (i) possible nonlinearities in the signal concentration relationships were well accounted for by the multivariate PLS model; (ii) the predicted values of COD and TOC fit the analytical values well; the high correlation coefficients and small root mean squared error of cross-validation (RMSECV) showed that this method can be used for seawater quality monitoring; and (iii) compared with Model-1 and Model-2, Model-3 had the highest coefficient of determination (R2) and the lowest number of latent variables. This latter finding suggests that only large data sets that include data representing different combinations of conditions (i.e., various seawater matrices) will produce stable site-specific regressions. The results of this study illustrate the effectiveness of the proposed method and its potential for use as a seawater quality monitoring technique. PMID:26442484
NASA Technical Reports Server (NTRS)
Aires, Filipe; Rossow, William B.; Hansen, James E. (Technical Monitor)
2001-01-01
A new approach is presented for the analysis of feedback processes in a nonlinear dynamical system by observing its variations. The new methodology consists of statistical estimates of the sensitivities between all pairs of variables in the system based on a neural network model of the dynamical system. The model can then be used to estimate the instantaneous, multivariate and nonlinear sensitivities, which are shown to be essential for the analysis of the feedback processes involved in the dynamical system. The method is described and tested on synthetic data from the low-order Lorenz circulation model, where the correct sensitivities can be evaluated analytically.
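The Lorenz testbed makes the idea concrete: the "sensitivities" are the entries of the Jacobian of the system's tendencies. The sketch below skips the neural-network step and instead checks a finite-difference sensitivity estimate against the analytic Jacobian of the Lorenz-63 model; in the paper's method, the differentiated model would be the trained network rather than the known equations.

```python
import numpy as np

# Lorenz-63 tendencies: the low-order circulation model used as the testbed.
def lorenz(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def jacobian_analytic(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    # Exact instantaneous sensitivities d(dx_i/dt)/dx_j.
    x, y, z = state
    return np.array([[-sigma, sigma, 0.0],
                     [rho - z, -1.0, -x],
                     [y, x, -beta]])

def jacobian_fd(f, state, eps=1e-6):
    # Central-difference estimate of the same sensitivity matrix.
    n = state.size
    J = np.empty((n, n))
    for j in range(n):
        d = np.zeros(n)
        d[j] = eps
        J[:, j] = (f(state + d) - f(state - d)) / (2 * eps)
    return J

state = np.array([1.0, 2.0, 20.0])
err = np.max(np.abs(jacobian_fd(lorenz, state) - jacobian_analytic(state)))
```

Because the Lorenz tendencies are at most quadratic, central differences agree with the analytic Jacobian to roundoff, which is what makes this model a clean benchmark for sensitivity estimators.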
NASA Astrophysics Data System (ADS)
Kisi, Ozgur
2015-09-01
Pan evaporation (Ep) modeling is an important issue in reservoir management, regional water resources planning and evaluation of drinking-water supplies. The main purpose of this study is to investigate the accuracy of least squares support vector machines (LSSVM), multivariate adaptive regression splines (MARS) and the M5 model tree (M5Tree) in modeling Ep. The first part of the study tested the ability of the LSSVM, MARS and M5Tree models to estimate the Ep data of the Mersin and Antalya stations, located in the Mediterranean Region of Turkey, using cross-validation. The LSSVM models outperformed the MARS and M5Tree models in estimating Ep at Mersin and Antalya with local input and output data. The average root mean square error (RMSE) of the M5Tree and MARS models was decreased by 24-32.1% and 10.8-18.9%, respectively, when LSSVM models were used for the Mersin and Antalya stations. In the second part of the study (cross-station application without local input data), the three methods were examined for estimating Ep using air temperature, solar radiation, relative humidity and wind speed data from a nearby station. The results showed that the MARS models provided better accuracy than the LSSVM and M5Tree models with respect to the RMSE, mean absolute error (MAE) and determination coefficient (R2) criteria. The average RMSE accuracy of the LSSVM and M5Tree was increased by 3.7% and 16.5% using MARS. In the case without local input data, the average RMSE accuracy of the LSSVM and M5Tree was increased by 11.4% and 18.4%, respectively, using MARS. In the third part of the study, the applied models were examined for Ep estimation using input and output data from a nearby station. The MARS models again performed better than the other models with respect to the RMSE, MAE and R2 criteria; the average RMSE of the LSSVM and M5Tree was decreased by 54% and 3.4%, respectively, using MARS. The overall results indicated that
Procedures for adjusting regional regression models of urban-runoff quality using local data
Hoos, Anne B.; Lizarraga, Joy S.
1996-01-01
Statistical operations termed model-adjustment procedures can be used to incorporate local data into existing regression models to improve the prediction of urban-runoff quality. Each procedure is a form of regression analysis in which the local data base is used as a calibration data set; the resulting adjusted regression models can then be used to predict storm-runoff quality at unmonitored sites. Statistical tests of the calibration data set guide selection among proposed procedures.
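One simple form of such a model-adjustment procedure is to regress locally observed values on the regional model's predictions at a few monitored sites, then apply the fitted adjustment at unmonitored sites. The numbers below are hypothetical, and Hoos and Lizarraga describe several alternative procedures; this is only a minimal single-factor sketch.

```python
import numpy as np

# Hypothetical calibration set: regional-model predictions and local
# observations of storm-runoff quality at monitored sites.
regional_pred = np.array([12.0, 20.0, 35.0, 50.0, 80.0])
local_obs = np.array([10.0, 18.0, 30.0, 46.0, 70.0])

# Fit the adjustment: observed = slope * predicted + intercept.
slope, intercept = np.polyfit(regional_pred, local_obs, 1)

# Apply the adjusted model to regional predictions at unmonitored sites.
adjusted = slope * np.array([25.0, 60.0]) + intercept
```

A slope below 1 here would indicate the regional model systematically overpredicts for the local data set, which is exactly the bias the adjustment corrects.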
Modeling of an Adjustable Beam Solid State Light Project
NASA Technical Reports Server (NTRS)
Clark, Toni
2015-01-01
This proposal is for the development of a computational model of a prototype variable beam light source using optical modeling software, Zemax Optics Studio. The variable beam light source would be designed to generate flood, spot, and directional beam patterns, while maintaining the same average power usage. The optical model would demonstrate the possibility of such a light source and its ability to address several issues: commonality of design, human task variability, and light source design process improvements. An adaptive lighting solution that utilizes the same electronics footprint and power constraints while addressing variability of lighting needed for the range of exploration tasks can save costs and allow for the development of common avionics for lighting controls.
Naccarato, Attilio; Furia, Emilia; Sindona, Giovanni; Tagarelli, Antonio
2016-09-01
Four class-modeling techniques (soft independent modeling of class analogy (SIMCA), unequal dispersed classes (UNEQ), potential functions (PF), and multivariate range modeling (MRM)) were applied to multielement distributions to build chemometric models able to authenticate chili pepper samples grown in Calabria with respect to those grown outside of Calabria. The multivariate techniques were applied considering both all the variables (32 elements: Al, As, Ba, Ca, Cd, Ce, Co, Cr, Cs, Cu, Dy, Fe, Ga, La, Li, Mg, Mn, Na, Nd, Ni, Pb, Pr, Rb, Sc, Se, Sr, Tl, Tm, V, Y, Yb, Zn) and variables selected by means of stepwise linear discriminant analysis (S-LDA). In the first case, satisfactory and comparable results in terms of CV efficiency were obtained with SIMCA and MRM (82.3 and 83.2%, respectively), whereas MRM performed better than SIMCA in terms of forced model efficiency (96.5%). The selection of variables by S-LDA permitted the building of models characterized, in general, by higher efficiency. MRM again provided the best results for CV efficiency (87.7%, with an effective balance of sensitivity and specificity) as well as forced model efficiency (96.5%). PMID:27041319
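The core idea of a class-modeling technique such as SIMCA is to model only the target class and flag anything that does not fit it. The sketch below is a minimal SIMCA-style one-class model on fabricated "element concentration" data (full SIMCA implementations also use Hotelling's T² and per-class scaling, which are omitted here): fit a PCA on the in-class samples, then reject samples whose reconstruction residual (Q statistic) exceeds an in-class quantile.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(2)
# Hypothetical multielement profiles: target-class samples cluster around
# one composition, out-of-class samples around a shifted one.
in_class = rng.normal(loc=1.0, scale=0.1, size=(40, 8))
out_class = rng.normal(loc=1.6, scale=0.1, size=(20, 8))

# One-class model: PCA fitted on the target class only.
pca = PCA(n_components=3).fit(in_class)

def q_residual(X):
    # Squared reconstruction error (Q statistic) under the class model.
    recon = pca.inverse_transform(pca.transform(X))
    return np.sum((X - recon) ** 2, axis=1)

# Acceptance threshold: 95th percentile of in-class residuals.
threshold = np.quantile(q_residual(in_class), 0.95)
# Fraction of out-of-class samples wrongly accepted (1 - specificity).
accepted_out = np.mean(q_residual(out_class) <= threshold)
```

Sensitivity and specificity, whose balance the abstract highlights for MRM, are read directly off the accept/reject decisions of such a model.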
Circumplex and Spherical Models for Child School Adjustment and Competence.
ERIC Educational Resources Information Center
Schaefer, Earl S.; Edgerton, Marianna
The goal of this study is to broaden the scope of a conceptual model for child behavior by analyzing constructs relevant to cognition, conation, and affect. Two samples were drawn from school populations. For the first sample, 28 teachers from 8 rural, suburban, and urban schools rated 193 kindergarten children. Each teacher rated up to eight…
A General Linear Model Approach to Adjusting the Cumulative GPA.
ERIC Educational Resources Information Center
Young, John W.
A general linear model (GLM), using least-squares techniques, was used to develop a criterion measure to replace freshman year grade point average (GPA) in college admission predictive validity studies. Problems with the use of GPA include those associated with the combination of grades from different courses and disciplines into a single measure,…
Oskrochi, Gholamreza; Lesaffre, Emmanuel; Oskrochi, Youssof; Shamley, Delva
2016-01-01
In this study, four major muscles acting on the scapula were investigated in patients who had been treated in the last six years for unilateral carcinoma of the breast. Muscle activity was assessed by electromyography during abduction and adduction of the affected and unaffected arms. The principal aim of the study was to compare shoulder muscle activity in the affected and unaffected shoulder during elevation of the arm. A multivariate linear mixed model was introduced and applied to address this aim. The result of fitting this model to the data shows a substantial improvement over the alternatives. PMID:26950134
Small Bowel Obstruction—Who Needs an Operation? A Multivariate Prediction Model
Eiken, Patrick W.; Bannon, Michael P.; Heller, Stephanie F.; Lohse, Christine M.; Huebner, Marianne; Sarr, Michael G.
2016-01-01
Background: Proper management of small bowel obstruction (SBO) requires a methodology to prevent nontherapeutic laparotomy while minimizing the chance of overlooking strangulation obstruction causing intestinal ischemia. Our aim was to identify preoperative risk factors associated with strangulating SBO and to develop a model to predict the need for operative intervention in the presence of an SBO. Our hypothesis was that free intraperitoneal fluid on computed tomography (CT) is associated with the presence of bowel ischemia and need for exploration. Methods: We reviewed 100 consecutive patients with SBO, all of whom had undergone CT that was reviewed by a radiologist blinded to outcome. The need for operative management was confirmed retrospectively by four surgeons based on operative findings and the patient’s clinical course. Results: Patients were divided into two groups: group 1, who required operative management on retrospective review, and group 2, who did not. Four patients who were treated nonoperatively had ischemia or died of malignant SBO and were then included in group 1; two patients who had a nontherapeutic exploration were included in group 2. On univariate analysis, the need for exploration (n = 48) was associated (p < 0.05) with a history of malignancy (29% vs. 12%), vomiting (85% vs. 63%), and CT findings of either free intraperitoneal fluid (67% vs. 31%), mesenteric edema (67% vs. 37%), mesenteric vascular engorgement (85% vs. 67%), small bowel wall thickening (44% vs. 25%) or absence of the “small bowel feces sign” (so-called fecalization) (10% vs. 29%). Ischemia (n = 11) was associated (p < 0.05 each) with peritonitis (36% vs. 1%), free intraperitoneal fluid (82% vs. 44%), serum lactate concentration (2.7 ± 1.6 vs. 1.3 ± 0.6 mmol/l), mesenteric edema (91% vs. 46%), closed loop obstruction (27% vs. 2%), pneumatosis intestinalis (18% vs. 0%), and portal venous gas (18% vs. 0%). On multivariate analysis, free intraperitoneal fluid [odds ratio
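A multivariate prediction model of this kind is typically a logistic regression on binary findings, with exponentiated coefficients reported as odds ratios. The sketch below uses simulated data with made-up effect sizes, not the study's 100-patient series; it only illustrates how odds ratios and an individual patient's predicted risk come out of such a model.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
n = 200
# Hypothetical binary CT findings per patient: free fluid, mesenteric edema,
# bowel-wall thickening, absent small-bowel feces sign.
X = rng.integers(0, 2, size=(n, 4)).astype(float)
# Simulated outcome (need for operative management) from assumed log-odds.
logit = -1.5 + 1.2 * X[:, 0] + 0.9 * X[:, 1] + 0.6 * X[:, 2] + 0.8 * X[:, 3]
y = rng.random(n) < 1 / (1 + np.exp(-logit))

model = LogisticRegression().fit(X, y)
odds_ratios = np.exp(model.coef_.ravel())        # per-finding odds ratios
# Predicted operative risk for a patient with free fluid and edema only.
risk = model.predict_proba([[1.0, 1.0, 0.0, 0.0]])[0, 1]
```

An odds ratio above 1 for a finding (e.g. free intraperitoneal fluid) is what the abstract's multivariate analysis reports as an association with the need for exploration.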
Cross-Correlations and Joint Gaussianity in Multivariate Level Crossing Models
2014-01-01
A variety of phenomena in physical and biological sciences can be mathematically understood by considering the statistical properties of level crossings of random Gaussian processes. Notably, a growing number of these phenomena demand a consideration of correlated level crossings emerging from multiple correlated processes. While many theoretical results have been obtained in the last decades for individual Gaussian level-crossing processes, few results are available for multivariate, jointly correlated threshold crossings. Here, we address bivariate upward crossing processes and derive the corresponding bivariate Central Limit Theorem as well as provide closed-form expressions for their joint level-crossing correlations. PMID:24742344
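The univariate building block behind these results is Rice's formula for the mean rate of upward level crossings of a stationary Gaussian process. The sketch below synthesizes such a process from random-phase cosines, counts empirical upcrossings of a level, and compares against the Rice prediction computed from the spectral moments; the bivariate, jointly correlated case treated in the paper builds on this quantity.

```python
import numpy as np

rng = np.random.default_rng(4)
# Stationary, zero-mean, unit-variance Gaussian-like process synthesized
# from 50 random-phase cosines (a standard spectral construction).
n_comp = 50
omegas = rng.uniform(0.5, 2.0, n_comp)
phases = rng.uniform(0.0, 2.0 * np.pi, n_comp)
amps = np.full(n_comp, np.sqrt(2.0 / n_comp))

t = np.linspace(0.0, 1000.0, 40000)
x = (amps[:, None] * np.cos(np.outer(omegas, t) + phases[:, None])).sum(axis=0)

u = 1.0                                            # crossing level
ups = int(np.sum((x[:-1] < u) & (x[1:] >= u)))     # empirical upward crossings

lam0 = np.sum(amps ** 2) / 2.0                     # variance (spectral moment 0)
lam2 = np.sum(amps ** 2 * omegas ** 2) / 2.0       # variance of the derivative
# Rice's formula: expected number of upcrossings of level u over [0, T].
rice = t[-1] / (2.0 * np.pi) * np.sqrt(lam2 / lam0) * np.exp(-u ** 2 / (2.0 * lam0))
```

Over a long realization the empirical count should track the Rice prediction; the paper's contribution is the analogous closed-form expressions for *correlations* between the crossing counts of two coupled processes.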
Lewin, M.D.; Sarasua, S.; Jones, P.A. (Div. of Health Studies)
1999-07-01
For the purpose of examining the association between blood lead levels and household-specific soil lead levels, the authors used a multivariate linear regression model to find a slope factor relating soil lead levels to blood lead levels. They used previously collected data from the Agency for Toxic Substances and Disease Registry's (ATSDR's) multisite lead and cadmium study. The data included blood lead measurements of 1,015 children aged 6--71 months and corresponding household-specific environmental samples. The environmental samples included lead in soil, house dust, interior paint, and tap water. After adjusting for income, education of the parents, presence of a smoker in the household, sex, and dust lead, and using a double log transformation, they found a slope factor of 0.1388 with a 95% confidence interval of 0.09--0.19 for the dose-response relationship between the natural log of the soil lead level and the natural log of the blood lead level. The predicted blood lead level corresponding to a soil lead level of 500 mg/kg was 5.99 µg/kg with a 95% prediction interval of 2.08--17.29. Predicted values and their corresponding prediction intervals varied by covariate level. The model shows that increased soil lead levels are associated with elevated blood lead levels in children, but predictions based on this regression model are subject to high levels of uncertainty and variability.
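The double-log dose-response form can be made concrete with the numbers the abstract reports. The sketch below takes the slope factor 0.1388 and backs out the intercept implied by the reported prediction of 5.99 at 500 mg/kg; this is an illustration only, since the fitted model also carries covariate terms that shift the intercept by covariate level.

```python
import numpy as np

# Double-log dose-response: ln(blood) = a + b * ln(soil).
b = 0.1388                                  # slope factor reported in the study
# Intercept implied by the reported point prediction (5.99 at 500 mg/kg);
# the actual fitted intercept depends on the covariate values.
a = np.log(5.99) - b * np.log(500.0)

def predicted_blood(soil_mg_per_kg):
    # Point prediction on the original scale.
    return np.exp(a + b * np.log(soil_mg_per_kg))
```

Because the slope sits on the log-log scale, doubling the soil lead level multiplies the predicted blood lead level by 2^0.1388, i.e. by only about 10%, which is why the association is real but the absolute effect is modest.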
Comparison of the Properties of Regression and Categorical Risk-Adjustment Models
Averill, Richard F.; Muldoon, John H.; Hughes, John S.
2016-01-01
Clinical risk-adjustment, the ability to standardize the comparison of individuals with different health needs, is based upon 2 main alternative approaches: regression models and clinical categorical models. In this article, we examine the impact of the differences in the way these models are constructed on end user applications. PMID:26945302
ERIC Educational Resources Information Center
Olejnik, Stephen; Mills, Jamie; Keselman, Harvey
2000-01-01
Evaluated the use of Mallows' C(p) and Wherry's adjusted R-squared (R. Wherry, 1931) statistics to select a final model from a pool of model solutions using computer-generated data. Neither statistic identified the underlying regression model any better than, and usually less well than, the stepwise selection method, which itself was poor for…
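Mallows' C(p) compares each candidate submodel's residual sum of squares against the full model's error variance; a well-specified submodel should have C(p) near p. The sketch below computes it on simulated data with two true predictors out of four (the convention used here counts the intercept in p, one of several equivalent variants):

```python
import numpy as np

def mallows_cp(sse_p, mse_full, n, p):
    # Cp = SSE_p / s^2 - n + 2p, with p counting all fitted coefficients
    # (including the intercept) and s^2 the full model's MSE.
    return sse_p / mse_full - n + 2 * p

rng = np.random.default_rng(5)
n = 100
X = rng.standard_normal((n, 4))
# Only the first two predictors actually enter the true model.
y = 1.0 + 2.0 * X[:, 0] - 1.0 * X[:, 1] + rng.standard_normal(n)

def fit_sse(cols):
    # Ordinary least squares with intercept on the chosen columns.
    A = np.column_stack([np.ones(n)] + [X[:, j] for j in cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return np.sum((y - A @ beta) ** 2)

mse_full = fit_sse([0, 1, 2, 3]) / (n - 5)          # full model: 5 coefficients
cp_true = mallows_cp(fit_sse([0, 1]), mse_full, n, 3)   # correct submodel
cp_under = mallows_cp(fit_sse([0]), mse_full, n, 2)     # underspecified model
```

The underspecified model's C(p) blows up because its SSE absorbs the omitted predictor's variance, while the correct submodel's C(p) stays near its parameter count; the study's finding is that, on simulated data, this criterion still picked the true model less reliably than stepwise selection.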
NASA Astrophysics Data System (ADS)
Khoshravesh, Mojtaba; Sefidkouhi, Mohammad Ali Gholami; Valipour, Mohammad
2015-12-01
The proper evaluation of evapotranspiration is essential in food security investigation, farm management, pollution detection, irrigation scheduling, nutrient flows, carbon balance and hydrologic modeling, especially in arid environments. To achieve sustainable development and ensure water supply in such environments, irrigation experts need tools to estimate reference evapotranspiration on a large scale. In this study, the monthly reference evapotranspiration was estimated by three different regression models, the multivariate fractional polynomial (MFP), robust regression, and Bayesian regression, in Ardestan, Esfahan, and Kashan. The results were compared with the Food and Agriculture Organization (FAO) Penman-Monteith method (FAO-PM) to select the best model. The results show that at a monthly scale, all models provided close agreement with the FAO-PM values (R2 > 0.95 and RMSE < 12.07 mm month-1). However, the MFP model gives better estimates than the other two models for estimating reference evapotranspiration at all stations.
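The agreement criteria used in this comparison, R2 and RMSE against the FAO-PM reference, are straightforward to compute. The monthly values below are invented for illustration; only the metric definitions matter.

```python
import numpy as np

# Hypothetical monthly reference evapotranspiration (mm/month): a FAO-PM
# reference series and one regression model's estimates.
fao_pm = np.array([45.0, 60.0, 95.0, 130.0, 170.0, 185.0])
model = np.array([48.0, 58.0, 99.0, 125.0, 168.0, 190.0])

rmse = np.sqrt(np.mean((model - fao_pm) ** 2))      # mm/month
r2 = np.corrcoef(model, fao_pm)[0, 1] ** 2          # determination coefficient
```

A model passing the abstract's reported thresholds would show R2 above 0.95 and RMSE below 12.07 mm/month against the FAO-PM series.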