Correlative and multivariate analysis of increased radon concentration in underground laboratory.
Maletić, Dimitrije M; Udovičić, Vladimir I; Banjanac, Radomir M; Joković, Dejan R; Dragić, Aleksandar L; Veselinović, Nikola B; Filipović, Jelena
2014-11-01
The results of an analysis of the relation between variations of increased radon concentration and climate variables in a shallow underground laboratory are presented. The analysis used correlative and multivariate methods, as developed for data analysis in high-energy physics and implemented in the Toolkit for Multivariate Analysis software package. Multivariate regression analysis identified a number of multivariate methods that can give a good evaluation of increased radon concentrations based on climate variables. The use of multivariate regression methods will enable investigation of the relation between specific climate variables and increased radon concentrations, because the regression methods yield a 'mapped' underlying functional behaviour of radon concentrations depending on a wide spectrum of climate variables. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Dinç, Erdal; Ozdemir, Abdil
2005-01-01
A multivariate chromatographic calibration technique was developed for the quantitative analysis of binary mixtures of enalapril maleate (EA) and hydrochlorothiazide (HCT) in tablets in the presence of losartan potassium (LST). The mathematical algorithm of the multivariate chromatographic calibration technique is based on linear regression equations constructed from the relationship between concentration and peak area at a set of five wavelengths. The algorithm of this calibration model, which has simple mathematical content, is briefly described. This approach is a powerful mathematical tool for optimum chromatographic multivariate calibration and for the elimination of fluctuations arising from instrumental and experimental conditions. The multivariate chromatographic calibration involves the reduction of multivariate linear regression functions to a univariate data set. The model was validated by analyzing various synthetic binary mixtures and by using the standard addition technique. The developed calibration technique was applied to the analysis of real pharmaceutical tablets containing EA and HCT. The results were compared with those obtained by the classical HPLC method. It was observed that the proposed multivariate chromatographic calibration gives better results than classical HPLC.
Mueller, Daniela; Ferrão, Marco Flôres; Marder, Luciano; da Costa, Adilson Ben; de Cássia de Souza Schneider, Rosana
2013-01-01
The main objective of this study was to use infrared spectroscopy to identify vegetable oils used as raw material for biodiesel production and apply multivariate analysis to the data. Six different vegetable oil sources (canola, cotton, corn, palm, sunflower and soybeans) were used to produce biodiesel batches. The spectra were acquired by Fourier transform infrared spectroscopy using a universal attenuated total reflectance sensor (FTIR-UATR). For the multivariate analysis, principal component analysis (PCA), hierarchical cluster analysis (HCA), interval principal component analysis (iPCA) and soft independent modeling of class analogy (SIMCA) were used. The results indicate that it is possible to develop a methodology to identify vegetable oils used as raw material in the production of biodiesel by FTIR-UATR applying multivariate analysis. It was also observed that iPCA found the best spectral range for separation of biodiesel batches using FTIR-UATR data, and with this result, the SIMCA method classified 100% of the soybean biodiesel samples. PMID:23539030
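The PCA step named in the abstract above can be sketched as follows. This is a minimal illustration, not the study's workflow: the `spectra` array is synthetic, the function name `pca_scores` is hypothetical, and the iPCA and SIMCA stages are not reproduced.

```python
import numpy as np

def pca_scores(spectra, n_components=2):
    """Project mean-centred samples (rows) onto the leading principal components."""
    centered = spectra - spectra.mean(axis=0)
    # SVD of the centred data: rows of vt are the principal axes
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    explained = (s ** 2) / np.sum(s ** 2)
    return scores, explained[:n_components]

rng = np.random.default_rng(0)
# Hypothetical data: two batches of 10 "spectra" with 50 variables each;
# the second batch is offset so the batches separate along the first component.
group_a = rng.normal(0.0, 0.1, size=(10, 50))
group_b = rng.normal(0.0, 0.1, size=(10, 50)) + 1.0
spectra = np.vstack([group_a, group_b])

scores, explained = pca_scores(spectra)
```

Plotting `scores[:, 0]` against `scores[:, 1]` gives the kind of score plot on which batches are usually inspected for separation.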
Multivariate meta-analysis: potential and promise.
Jackson, Dan; Riley, Richard; White, Ian R
2011-09-10
The multivariate random effects model is a generalization of the standard univariate model. Multivariate meta-analysis is becoming more commonly used and the techniques and related computer software, although continually under development, are now in place. In order to raise awareness of the multivariate methods, and discuss their advantages and disadvantages, we organized a one day 'Multivariate meta-analysis' event at the Royal Statistical Society. In addition to disseminating the most recent developments, we also received an abundance of comments, concerns, insights, critiques and encouragement. This article provides a balanced account of the day's discourse. By giving others the opportunity to respond to our assessment, we hope to ensure that the various view points and opinions are aired before multivariate meta-analysis simply becomes another widely used de facto method without any proper consideration of it by the medical statistics community. We describe the areas of application that multivariate meta-analysis has found, the methods available, the difficulties typically encountered and the arguments for and against the multivariate methods, using four representative but contrasting examples. We conclude that the multivariate methods can be useful, and in particular can provide estimates with better statistical properties, but also that these benefits come at the price of making more assumptions which do not result in better inference in every case. Although there is evidence that multivariate meta-analysis has considerable potential, it must be even more carefully applied than its univariate counterpart in practice. Copyright © 2011 John Wiley & Sons, Ltd.
Analyzing Multiple Outcomes in Clinical Research Using Multivariate Multilevel Models
Baldwin, Scott A.; Imel, Zac E.; Braithwaite, Scott R.; Atkins, David C.
2014-01-01
Objective Multilevel models have become a standard data analysis approach in intervention research. Although the vast majority of intervention studies involve multiple outcome measures, few studies use multivariate analysis methods. The authors discuss multivariate extensions to the multilevel model that can be used by psychotherapy researchers. Method and Results Using simulated longitudinal treatment data, the authors show how multivariate models extend common univariate growth models and how the multivariate model can be used to examine multivariate hypotheses involving fixed effects (e.g., does the size of the treatment effect differ across outcomes?) and random effects (e.g., is change in one outcome related to change in the other?). An online supplemental appendix provides annotated computer code and simulated example data for implementing a multivariate model. Conclusions Multivariate multilevel models are flexible, powerful models that can enhance clinical research. PMID:24491071
Richard D. Wood-Smith; John M. Buffington
1996-01-01
Multivariate statistical analyses of geomorphic variables from 23 forest stream reaches in southeast Alaska result in successful discrimination between pristine streams and those disturbed by land management, specifically timber harvesting and associated road building. Results of discriminant function analysis indicate that a three-variable model discriminates 10...
Deconstructing multivariate decoding for the study of brain function.
Hebart, Martin N; Baker, Chris I
2017-08-04
Multivariate decoding methods were developed originally as tools to enable accurate predictions in real-world applications. The realization that these methods can also be employed to study brain function has led to their widespread adoption in the neurosciences. However, prior to the rise of multivariate decoding, the study of brain function was firmly embedded in a statistical philosophy grounded on univariate methods of data analysis. In this way, multivariate decoding for brain interpretation grew out of two established frameworks: multivariate decoding for predictions in real-world applications, and classical univariate analysis based on the study and interpretation of brain activation. We argue that this led to two confusions, one reflecting a mixture of multivariate decoding for prediction or interpretation, and the other a mixture of the conceptual and statistical philosophies underlying multivariate decoding and classical univariate analysis. Here we attempt to systematically disambiguate multivariate decoding for the study of brain function from the frameworks it grew out of. After elaborating these confusions and their consequences, we describe six, often unappreciated, differences between classical univariate analysis and multivariate decoding. We then focus on how the common interpretation of what is signal and noise changes in multivariate decoding. Finally, we use four examples to illustrate where these confusions may impact the interpretation of neuroimaging data. We conclude with a discussion of potential strategies to help resolve these confusions in interpreting multivariate decoding results, including the potential departure from multivariate decoding methods for the study of brain function. Copyright © 2017. Published by Elsevier Inc.
NASA Astrophysics Data System (ADS)
Safi, A.; Campanella, B.; Grifoni, E.; Legnaioli, S.; Lorenzetti, G.; Pagnotta, S.; Poggialini, F.; Ripoll-Seguer, L.; Hidalgo, M.; Palleschi, V.
2018-06-01
The introduction of the multivariate calibration curve approach in Laser-Induced Breakdown Spectroscopy (LIBS) quantitative analysis has led to a general improvement of LIBS analytical performance, since a multivariate approach makes it possible to exploit the redundancy of elemental information that is typically present in a LIBS spectrum. Software packages implementing multivariate methods are available in the most widely used commercial and open source analytical programs; in most cases, the multivariate algorithms are robust against noise and operate in unsupervised mode. The other side of the coin of the availability and ease of use of such packages is the (perceived) difficulty in assessing the reliability of the results obtained, which often leads to the multivariate algorithms being regarded as 'black boxes' whose inner mechanism is supposed to remain hidden from the user. In this paper, we discuss the dangers of a 'black box' approach in LIBS multivariate analysis, and how to overcome them using the chemical-physical knowledge that is at the base of any LIBS quantitative analysis.
Multivariate meta-analysis: Potential and promise
Jackson, Dan; Riley, Richard; White, Ian R
2011-01-01
The multivariate random effects model is a generalization of the standard univariate model. Multivariate meta-analysis is becoming more commonly used and the techniques and related computer software, although continually under development, are now in place. In order to raise awareness of the multivariate methods, and discuss their advantages and disadvantages, we organized a one day ‘Multivariate meta-analysis’ event at the Royal Statistical Society. In addition to disseminating the most recent developments, we also received an abundance of comments, concerns, insights, critiques and encouragement. This article provides a balanced account of the day's discourse. By giving others the opportunity to respond to our assessment, we hope to ensure that the various view points and opinions are aired before multivariate meta-analysis simply becomes another widely used de facto method without any proper consideration of it by the medical statistics community. We describe the areas of application that multivariate meta-analysis has found, the methods available, the difficulties typically encountered and the arguments for and against the multivariate methods, using four representative but contrasting examples. We conclude that the multivariate methods can be useful, and in particular can provide estimates with better statistical properties, but also that these benefits come at the price of making more assumptions which do not result in better inference in every case. Although there is evidence that multivariate meta-analysis has considerable potential, it must be even more carefully applied than its univariate counterpart in practice. Copyright © 2011 John Wiley & Sons, Ltd. PMID:21268052
Multivariate generalized multifactor dimensionality reduction to detect gene-gene interactions
2013-01-01
Background Recently, one of the greatest challenges in genome-wide association studies is to detect gene-gene and/or gene-environment interactions for common complex human diseases. Ritchie et al. (2001) proposed multifactor dimensionality reduction (MDR) method for interaction analysis. MDR is a combinatorial approach to reduce multi-locus genotypes into high-risk and low-risk groups. Although MDR has been widely used for case-control studies with binary phenotypes, several extensions have been proposed. One of these methods, a generalized MDR (GMDR) proposed by Lou et al. (2007), allows adjusting for covariates and applying to both dichotomous and continuous phenotypes. GMDR uses the residual score of a generalized linear model of phenotypes to assign either high-risk or low-risk group, while MDR uses the ratio of cases to controls. Methods In this study, we propose multivariate GMDR, an extension of GMDR for multivariate phenotypes. Jointly analysing correlated multivariate phenotypes may have more power to detect susceptible genes and gene-gene interactions. We construct generalized estimating equations (GEE) with multivariate phenotypes to extend generalized linear models. Using the score vectors from GEE we discriminate high-risk from low-risk groups. We applied the multivariate GMDR method to the blood pressure data of the 7,546 subjects from the Korean Association Resource study: systolic blood pressure (SBP) and diastolic blood pressure (DBP). We compare the results of multivariate GMDR for SBP and DBP to the results from separate univariate GMDR for SBP and DBP, respectively. We also applied the multivariate GMDR method to the repeatedly measured hypertension status from 5,466 subjects and compared its result with those of univariate GMDR at each time point. 
Results Results from the univariate GMDR and multivariate GMDR in the two-locus model with both blood pressure and hypertension phenotypes indicate the best combinations of SNPs whose interaction has a significant association with risk for high blood pressure or hypertension. Although the test balanced accuracy (BA) of multivariate analysis was not always greater than that of univariate analysis, the multivariate BAs were more stable, with smaller standard deviations. Conclusions In this study, we have developed a multivariate GMDR method using the GEE approach. It is useful to apply multivariate GMDR to correlated multiple phenotypes of interest. PMID:24565370
Bonetti, Jennifer; Quarino, Lawrence
2014-05-01
This study has shown that the combination of simple techniques with the use of multivariate statistics offers the potential for the comparative analysis of soil samples. Five samples were obtained from each of twelve state parks across New Jersey in both the summer and fall seasons. Each sample was examined using particle-size distribution, pH analysis in both water and 1 M CaCl2, and a loss on ignition technique. Data from each of the techniques were combined, and principal component analysis (PCA) and canonical discriminant analysis (CDA) were used for multivariate data transformation. Samples from different locations could be visually differentiated from one another using these multivariate plots. Hold-one-out cross-validation analysis showed error rates as low as 3.33%. Ten blind study samples were analyzed resulting in no misclassifications using Mahalanobis distance calculations and visual examinations of multivariate plots. Seasonal variation was minimal between corresponding samples, suggesting potential success in forensic applications. © 2014 American Academy of Forensic Sciences.
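The Mahalanobis distance classification mentioned in the abstract above can be sketched roughly as below. This is an assumption-laden illustration: the function name, the use of a pooled within-class covariance, and the tiny two-variable data set are all hypothetical and are not taken from the study.

```python
import numpy as np

def mahalanobis_classify(train_x, train_y, query):
    """Assign `query` to the class whose mean is nearest in squared Mahalanobis distance.

    Uses a pooled within-class covariance (an assumption of this sketch).
    """
    classes = [int(c) for c in np.unique(train_y)]
    means = {c: train_x[train_y == c].mean(axis=0) for c in classes}
    centered = np.vstack([train_x[train_y == c] - means[c] for c in classes])
    pooled = centered.T @ centered / (len(train_x) - len(classes))
    inv = np.linalg.inv(pooled)
    dists = {c: float((query - means[c]) @ inv @ (query - means[c])) for c in classes}
    return min(dists, key=dists.get), dists

# Hypothetical measurements (two variables per sample) from two locations.
train_x = np.array([[1.0, 5.0], [1.2, 5.1], [0.9, 4.8], [1.1, 5.3],
                    [3.0, 7.0], [3.2, 7.1], [2.9, 6.8], [3.1, 7.3]])
train_y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# A "blind" sample to be assigned to one of the two locations.
label, dists = mahalanobis_classify(train_x, train_y, np.array([2.9, 7.0]))
```

Unlike Euclidean distance, this measure accounts for the scale and correlation of the variables, which matters when measurements such as pH and particle size are combined.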
Use of Multivariate Linkage Analysis for Dissection of a Complex Cognitive Trait
Marlow, Angela J.; Fisher, Simon E.; Francks, Clyde; MacPhie, I. Laurence; Cherny, Stacey S.; Richardson, Alex J.; Talcott, Joel B.; Stein, John F.; Monaco, Anthony P.; Cardon, Lon R.
2003-01-01
Replication of linkage results for complex traits has been exceedingly difficult, owing in part to the inability to measure the precise underlying phenotype, small sample sizes, genetic heterogeneity, and statistical methods employed in analysis. Often, in any particular study, multiple correlated traits have been collected, yet these have been analyzed independently or, at most, in bivariate analyses. Theoretical arguments suggest that full multivariate analysis of all available traits should offer more power to detect linkage; however, this has not yet been evaluated on a genomewide scale. Here, we conduct multivariate genomewide analyses of quantitative-trait loci that influence reading- and language-related measures in families affected with developmental dyslexia. The results of these analyses are substantially clearer than those of previous univariate analyses of the same data set, helping to resolve a number of key issues. These outcomes highlight the relevance of multivariate analysis for complex disorders for dissection of linkage results in correlated traits. The approach employed here may aid positional cloning of susceptibility genes in a wide spectrum of complex traits. PMID:12587094
Characterizing multivariate decoding models based on correlated EEG spectral features
McFarland, Dennis J.
2013-01-01
Objective Multivariate decoding methods are popular techniques for analysis of neurophysiological data. The present study explored potential interpretative problems with these techniques when predictors are correlated. Methods Data from sensorimotor rhythm-based cursor control experiments was analyzed offline with linear univariate and multivariate models. Features were derived from autoregressive (AR) spectral analysis of varying model order which produced predictors that varied in their degree of correlation (i.e., multicollinearity). Results The use of multivariate regression models resulted in much better prediction of target position as compared to univariate regression models. However, with lower order AR features interpretation of the spectral patterns of the weights was difficult. This is likely to be due to the high degree of multicollinearity present with lower order AR features. Conclusions Care should be exercised when interpreting the pattern of weights of multivariate models with correlated predictors. Comparison with univariate statistics is advisable. Significance While multivariate decoding algorithms are very useful for prediction their utility for interpretation may be limited when predictors are correlated. PMID:23466267
Zhi, Ruicong; Zhao, Lei; Xie, Nan; Wang, Houyin; Shi, Bolin; Shi, Jingye
2016-01-13
A framework for establishing a standard reference scale (texture) is proposed, using multivariate statistical analysis of instrumental measurements and sensory evaluation. Multivariate statistical analysis is conducted to rapidly select typical reference samples with characteristics of universality, representativeness, stability, substitutability, and traceability. The reasonableness of the framework is verified by establishing a standard reference scale for a texture attribute (hardness) with well-known Chinese foods. More than 100 food products in 16 categories were tested using instrumental measurement (TPA test), and the results were analyzed with clustering analysis, principal component analysis, relative standard deviation, and analysis of variance. As a result, nine kinds of foods were selected to construct the hardness standard reference scale. The results indicate that the regression coefficient between the estimated sensory value and the instrumentally measured value is significant (R² = 0.9765), which fits well with Stevens's theory. The research provides a reliable theoretical basis and practical guide for establishing quantitative standard reference scales for food texture characteristics.
Multivariate Meta-Analysis Using Individual Participant Data
ERIC Educational Resources Information Center
Riley, R. D.; Price, M. J.; Jackson, D.; Wardle, M.; Gueyffier, F.; Wang, J.; Staessen, J. A.; White, I. R.
2015-01-01
When combining results across related studies, a multivariate meta-analysis allows the joint synthesis of correlated effect estimates from multiple outcomes. Joint synthesis can improve efficiency over separate univariate syntheses, may reduce selective outcome reporting biases, and enables joint inferences across the outcomes. A common issue is…
Cichy, Radoslaw Martin; Pantazis, Dimitrios
2017-09-01
Multivariate pattern analysis of magnetoencephalography (MEG) and electroencephalography (EEG) data can reveal the rapid neural dynamics underlying cognition. However, MEG and EEG have systematic differences in sampling neural activity. This poses the question to which degree such measurement differences consistently bias the results of multivariate analysis applied to MEG and EEG activation patterns. To investigate, we conducted a concurrent MEG/EEG study while participants viewed images of everyday objects. We applied multivariate classification analyses to MEG and EEG data, and compared the resulting time courses to each other, and to fMRI data for an independent evaluation in space. We found that both MEG and EEG revealed the millisecond spatio-temporal dynamics of visual processing with largely equivalent results. Beyond yielding convergent results, we found that MEG and EEG also captured partly unique aspects of visual representations. Those unique components emerged earlier in time for MEG than for EEG. Identifying the sources of those unique components with fMRI, we found the locus for both MEG and EEG in high-level visual cortex, and in addition for MEG in low-level visual cortex. Together, our results show that multivariate analyses of MEG and EEG data offer a convergent and complementary view on neural processing, and motivate the wider adoption of these methods in both MEG and EEG research. Copyright © 2017 Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Bejar, Isaac I.
1981-01-01
Effects of nutritional supplementation on the physical development of malnourished children were analyzed by univariate and multivariate methods for the analysis of repeated measures. Results showed that the nutritional treatment was successful, but it was necessary to resort to the multivariate approach. (Author/GK)
Multivariate Autoregressive Modeling and Granger Causality Analysis of Multiple Spike Trains
Krumin, Michael; Shoham, Shy
2010-01-01
Recent years have seen the emergence of microelectrode arrays and optical methods allowing simultaneous recording of spiking activity from populations of neurons in various parts of the nervous system. The analysis of multiple neural spike train data could benefit significantly from existing methods for multivariate time-series analysis which have proven to be very powerful in the modeling and analysis of continuous neural signals like EEG signals. However, those methods have not generally been well adapted to point processes. Here, we use our recent results on correlation distortions in multivariate Linear-Nonlinear-Poisson spiking neuron models to derive generalized Yule-Walker-type equations for fitting “hidden” Multivariate Autoregressive models. We use this new framework to perform Granger causality analysis in order to extract the directed information flow pattern in networks of simulated spiking neurons. We discuss the relative merits and limitations of the new method. PMID:20454705
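For orientation, the classical Yule-Walker idea that the abstract above generalizes can be sketched for continuous signals. This is not the authors' method for spike trains: it fits an ordinary first-order multivariate autoregressive model to simulated Gaussian data, and the function name and coefficient matrix are assumptions made for illustration.

```python
import numpy as np

def fit_var1_yule_walker(x):
    """Estimate A in x[t] = A @ x[t-1] + noise from lag-0 and lag-1 covariances."""
    x = x - x.mean(axis=0)
    n = len(x) - 1
    c0 = x[:-1].T @ x[:-1] / n   # estimate of E[x_{t-1} x_{t-1}^T]
    c1 = x[1:].T @ x[:-1] / n    # estimate of E[x_t x_{t-1}^T] = A @ C0
    return c1 @ np.linalg.inv(c0)

rng = np.random.default_rng(1)
# Hypothetical coupling: signal 0 drives signal 1, but not the reverse.
a_true = np.array([[0.5, 0.0],
                   [0.4, 0.3]])
x = np.zeros((20000, 2))
for t in range(1, len(x)):
    x[t] = a_true @ x[t - 1] + rng.normal(size=2)

a_hat = fit_var1_yule_walker(x)
```

In this first-order setting, a clearly non-zero `a_hat[1, 0]` alongside a near-zero `a_hat[0, 1]` suggests directed influence from signal 0 to signal 1; a formal Granger test would compare prediction errors of models with and without the other signal's past.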
DOE Office of Scientific and Technical Information (OSTI.GOV)
Van Benthem, Mark Hilary; Mowry, Curtis Dale; Kotula, Paul Gabriel
Thermal decomposition of poly dimethyl siloxane compounds, Sylgard® 184 and 186, was examined using thermal desorption coupled gas chromatography-mass spectrometry (TD/GC-MS) and multivariate analysis. This work describes a method of producing multiway data using a stepped thermal desorption. The technique involves sequentially heating a sample of the material of interest with subsequent analysis in a commercial GC/MS system. The decomposition chromatograms were analyzed using multivariate analysis tools including principal component analysis (PCA), factor rotation employing the varimax criterion, and multivariate curve resolution. The results of the analysis show seven components related to offgassing of various fractions of siloxanes that vary as a function of temperature. Thermal desorption coupled with gas chromatography-mass spectrometry (TD/GC-MS) is a powerful analytical technique for analyzing chemical mixtures. It has great potential in numerous analytic areas including materials analysis, sports medicine, the detection of designer drugs, and biological research for metabolomics. Data analysis is complicated, far from automated and can result in high false positive or false negative rates. We have demonstrated a step-wise TD/GC-MS technique that removes more volatile compounds from a sample before extracting the less volatile compounds. This creates an additional dimension of separation before the GC column, while simultaneously generating three-way data. Sandia's proven multivariate analysis methods, when applied to these data, have several advantages over current commercial options. They have also demonstrated potential for success in finding and enabling identification of trace compounds.
Several challenges remain, however, including understanding the sources of noise in the data, outlier detection, improving the data pretreatment and analysis methods, developing a software tool for ease of use by the chemist, and demonstrating our belief that this multivariate analysis will enable superior differentiation capabilities. In addition, noise and system artifacts challenge the analysis of GC-MS data collected on lower cost equipment, ubiquitous in commercial laboratories. This research has the potential to affect many areas of analytical chemistry including materials analysis, medical testing, and environmental surveillance. It could also provide a method to measure adsorption parameters for chemical interactions on various surfaces by measuring desorption as a function of temperature for mixtures. We have presented results of a novel method for examining offgas products of a common PDMS material. Our method involves utilizing a stepped TD/GC-MS data acquisition scheme that may be almost totally automated, coupled with multivariate analysis schemes. This method of data generation and analysis can be applied to a number of materials aging and thermal degradation studies.
Multivariate meta-analysis for non-linear and other multi-parameter associations
Gasparrini, A; Armstrong, B; Kenward, M G
2012-01-01
In this paper, we formalize the application of multivariate meta-analysis and meta-regression to synthesize estimates of multi-parameter associations obtained from different studies. This modelling approach extends the standard two-stage analysis used to combine results across different sub-groups or populations. The most straightforward application is for the meta-analysis of non-linear relationships, described for example by regression coefficients of splines or other functions, but the methodology easily generalizes to any setting where complex associations are described by multiple correlated parameters. The modelling framework of multivariate meta-analysis is implemented in the package mvmeta within the statistical environment R. As an illustrative example, we propose a two-stage analysis for investigating the non-linear exposure–response relationship between temperature and non-accidental mortality using time-series data from multiple cities. Multivariate meta-analysis represents a useful analytical tool for studying complex associations through a two-stage procedure. Copyright © 2012 John Wiley & Sons, Ltd. PMID:22807043
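The second stage of the two-stage procedure described above can be illustrated with a simplified fixed-effect pooling of multi-parameter estimates. This sketch is written in Python and does not use or imitate the mvmeta package; it also omits the between-study (random-effects) covariance, and the function name and numbers are hypothetical.

```python
import numpy as np

def pool_fixed_effect(estimates, covariances):
    """Inverse-variance (generalized least squares) pooling of coefficient vectors.

    estimates:   list of length-k coefficient vectors, one per study
    covariances: list of k x k within-study covariance matrices
    """
    weights = [np.linalg.inv(s) for s in covariances]
    pooled_cov = np.linalg.inv(sum(weights))
    pooled = pooled_cov @ sum(w @ b for w, b in zip(weights, estimates))
    return pooled, pooled_cov

# Hypothetical first-stage output: two spline coefficients from each of two cities.
estimates = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]
covariances = [np.eye(2), np.eye(2) / 3.0]   # the second city is three times as precise

pooled, pooled_cov = pool_fixed_effect(estimates, covariances)
```

Because the second study carries three times the weight, the pooled vector sits three quarters of the way from the first estimate to the second; a random-effects model would add a between-study covariance to each within-study matrix before weighting.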
Motegi, Hiromi; Tsuboi, Yuuri; Saga, Ayako; Kagami, Tomoko; Inoue, Maki; Toki, Hideaki; Minowa, Osamu; Noda, Tetsuo; Kikuchi, Jun
2015-11-04
There is an increasing need to use multivariate statistical methods for understanding biological functions, identifying the mechanisms of diseases, and exploring biomarkers. In addition to classical analyses such as hierarchical cluster analysis, principal component analysis, and partial least squares discriminant analysis, various multivariate strategies, including independent component analysis, non-negative matrix factorization, and multivariate curve resolution, have recently been proposed. However, determining the number of components is problematic. Despite the proposal of several different methods, no satisfactory approach has yet been reported. To resolve this problem, we implemented a new idea: classifying a component as "reliable" or "unreliable" based on the reproducibility of its appearance, regardless of the number of components in the calculation. Using the clustering method for classification, we applied this idea to multivariate curve resolution-alternating least squares (MCR-ALS). Comparisons between conventional and modified methods applied to proton nuclear magnetic resonance (¹H-NMR) spectral datasets derived from known standard mixtures and biological mixtures (urine and feces of mice) revealed that more plausible results are obtained by the modified method. In particular, clusters containing little information were detected with reliability. This strategy, named "cluster-aided MCR-ALS," will facilitate the attainment of more reliable results in metabolomics datasets.
A Study of Effects of MultiCollinearity in the Multivariable Analysis
Yoo, Wonsuk; Mayberry, Robert; Bae, Sejong; Singh, Karan; (Peter) He, Qinghua; Lillard, James W.
2015-01-01
A multivariable analysis is the most popular approach when investigating associations between risk factors and disease. However, the efficiency of multivariable analysis depends highly on the correlation structure among predictive variables. When the covariates in the model are not independent of one another, collinearity/multicollinearity problems arise in the analysis, which leads to biased estimation. This work aims to perform a simulation study with various scenarios of different collinearity structures to investigate the effects of collinearity under various correlation structures amongst predictive and explanatory variables and to compare these results with existing guidelines to decide harmful collinearity. Three correlation scenarios among predictor variables are considered: (1) bivariate collinear structure as the simplest collinearity case, (2) multivariate collinear structure where an explanatory variable is correlated with two other covariates, (3) a more realistic scenario when an independent variable can be expressed by various functions including the other variables. PMID:25664257
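A small simulation in the spirit of scenario (1) above can be sketched with the variance inflation factor (VIF), a widely used collinearity diagnostic. Treating VIF as one of the "existing guidelines" is an assumption of this sketch, as are the function name and the simulated variables; none of this reproduces the study's simulation design.

```python
import numpy as np

def variance_inflation_factors(x):
    """VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing column j on the others."""
    n, p = x.shape
    vifs = []
    for j in range(p):
        target = x[:, j]
        others = np.column_stack([np.ones(n), np.delete(x, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - resid.var() / target.var()
        vifs.append(1.0 / (1.0 - r2))
    return np.array(vifs)

rng = np.random.default_rng(2)
x1 = rng.normal(size=500)
x2 = x1 + 0.1 * rng.normal(size=500)   # nearly a copy of x1: bivariate collinearity
x3 = rng.normal(size=500)              # generated independently of x1 and x2

vifs = variance_inflation_factors(np.column_stack([x1, x2, x3]))
```

A common rule of thumb flags VIF values above about 10 as potentially harmful; here the two collinear predictors should far exceed that while the independent one stays near 1.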
A Study of Effects of MultiCollinearity in the Multivariable Analysis.
Yoo, Wonsuk; Mayberry, Robert; Bae, Sejong; Singh, Karan; Peter He, Qinghua; Lillard, James W
2014-10-01
A multivariable analysis is the most popular approach when investigating associations between risk factors and disease. However, the efficiency of multivariable analysis depends highly on the correlation structure among predictive variables. When the covariates in the model are not independent of one another, collinearity/multicollinearity problems arise in the analysis, which leads to biased estimation. This work aims to perform a simulation study with various scenarios of different collinearity structures to investigate the effects of collinearity under various correlation structures amongst predictive and explanatory variables and to compare these results with existing guidelines to decide harmful collinearity. Three correlation scenarios among predictor variables are considered: (1) bivariate collinear structure as the simplest collinearity case, (2) multivariate collinear structure where an explanatory variable is correlated with two other covariates, (3) a more realistic scenario when an independent variable can be expressed by various functions including the other variables.
Cichonska, Anna; Rousu, Juho; Marttinen, Pekka; Kangas, Antti J.; Soininen, Pasi; Lehtimäki, Terho; Raitakari, Olli T.; Järvelin, Marjo-Riitta; Salomaa, Veikko; Ala-Korpela, Mika; Ripatti, Samuli; Pirinen, Matti
2016-01-01
Motivation: A dominant approach to genetic association studies is to perform univariate tests between genotype-phenotype pairs. However, analyzing related traits together increases statistical power, and certain complex associations become detectable only when several variants are tested jointly. Currently, modest sample sizes of individual cohorts, and restricted availability of individual-level genotype-phenotype data across the cohorts limit conducting multivariate tests. Results: We introduce metaCCA, a computational framework for summary statistics-based analysis of a single or multiple studies that allows multivariate representation of both genotype and phenotype. It extends the statistical technique of canonical correlation analysis to the setting where original individual-level records are not available, and employs a covariance shrinkage algorithm to achieve robustness. Multivariate meta-analysis of two Finnish studies of nuclear magnetic resonance metabolomics by metaCCA, using standard univariate output from the program SNPTEST, shows an excellent agreement with the pooled individual-level analysis of original data. Motivated by strong multivariate signals in the lipid genes tested, we envision that multivariate association testing using metaCCA has a great potential to provide novel insights from already published summary statistics from high-throughput phenotyping technologies. Availability and implementation: Code is available at https://github.com/aalto-ics-kepaco Contacts: anna.cichonska@helsinki.fi or matti.pirinen@helsinki.fi Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27153689
ERIC Educational Resources Information Center
Kim, Soyoung; Olejnik, Stephen
2005-01-01
The sampling distributions of five popular measures of association with and without two bias adjusting methods were examined for the single factor fixed-effects multivariate analysis of variance model. The number of groups, sample sizes, number of outcomes, and the strength of association were manipulated. The results indicate that all five…
NASA Astrophysics Data System (ADS)
Vittal, H.; Singh, Jitendra; Kumar, Pankaj; Karmakar, Subhankar
2015-06-01
In watershed management, flood frequency analysis (FFA) is performed to quantify the risk of flooding at different spatial locations and also to provide guidelines for determining the design periods of flood control structures. Traditional FFA was extensively performed by considering a univariate scenario for both at-site and regional estimation of return periods. However, due to the inherent mutual dependence of the flood variables or characteristics [i.e., peak flow (P), flood volume (V) and flood duration (D), which are random in nature], the analysis has been further extended to a multivariate scenario, with some restrictive assumptions. To overcome the assumption of the same family of marginal density function for all flood variables, the concept of copula has been introduced. Although the advancement from univariate to multivariate analyses drew considerable attention from the FFA research community, the basic limitation was that the analyses were performed with the implementation of only parametric families of distributions. The aim of the current study is to emphasize the importance of nonparametric approaches in the field of multivariate FFA; however, the nonparametric distribution may not always be a good fit and capable of replacing well-implemented multivariate parametric and multivariate copula-based applications. Nevertheless, the potential of obtaining the best fit using nonparametric distributions might be improved because such distributions reproduce the sample's characteristics, resulting in more accurate estimations of the multivariate return period. Hence, the current study shows the importance of conjugating the multivariate nonparametric approach with multivariate parametric and copula-based approaches, thereby resulting in a comprehensive framework for complete at-site FFA. Although the proposed framework is designed for at-site FFA, this approach can also be applied to regional FFA because regional estimations ideally include at-site estimations. 
The framework is based on the following steps: (i) comprehensive trend analysis to assess nonstationarity in the observed data; (ii) selection of the best-fit univariate marginal distribution with a comprehensive set of parametric and nonparametric distributions for the flood variables; (iii) multivariate frequency analyses with parametric, copula-based and nonparametric approaches; and (iv) estimation of joint and various conditional return periods. The proposed framework for frequency analysis is demonstrated using 110 years of observed data from Allegheny River at Salamanca, New York, USA. The results show that for both univariate and multivariate cases, the nonparametric Gaussian kernel provides the best estimate. Further, we perform FFA for twenty major rivers over continental USA, which shows for seven rivers, all the flood variables followed nonparametric Gaussian kernel; whereas for other rivers, parametric distributions provide the best-fit either for one or two flood variables. Thus the summary of results shows that the nonparametric method cannot substitute the parametric and copula-based approaches, but should be considered during any at-site FFA to provide the broadest choices for best estimation of the flood return periods.
Imaging of polysaccharides in the tomato cell wall with Raman microspectroscopy
2014-01-01
Background The primary cell wall of fruits and vegetables is a structure mainly composed of polysaccharides (pectins, hemicelluloses, cellulose). Polysaccharides are assembled into a network and linked together. It is thought that the percentage of components and of plant cell wall has an important influence on mechanical properties of fruits and vegetables. Results In this study the Raman microspectroscopy technique was introduced to the visualization of the distribution of polysaccharides in cell wall of fruit. The methodology of the sample preparation, the measurement using Raman microscope and multivariate image analysis are discussed. Single band imaging (for preliminary analysis) and multivariate image analysis methods (principal component analysis and multivariate curve resolution) were used for the identification and localization of the components in the primary cell wall. Conclusions Raman microspectroscopy supported by multivariate image analysis methods is useful in distinguishing cellulose and pectins in the cell wall in tomatoes. It presents how the localization of biopolymers was possible with minimally prepared samples. PMID:24917885
Dehesh, Tania; Zare, Najaf; Ayatollahi, Seyyed Mohammad Taghi
2015-01-01
The univariate meta-analysis (UM) procedure, as a technique that provides a single overall result, has become increasingly popular. Neglecting the existence of other concomitant covariates in the models leads to loss of treatment efficiency. Our aim was to propose four new approximation approaches for the covariance matrix of the coefficients, which is not readily available for the multivariate generalized least square (MGLS) method as a multivariate meta-analysis approach. We evaluated the efficiency of four new approaches including zero correlation (ZC), common correlation (CC), estimated correlation (EC), and multivariate multilevel correlation (MMC) on the estimation bias, mean square error (MSE), and 95% probability coverage of the confidence interval (CI) in the synthesis of Cox proportional hazard models coefficients in a simulation study. Comparing the results of the simulation study on the MSE, bias, and CI of the estimated coefficients indicated that the MMC approach was the most accurate procedure compared to the EC, CC, and ZC procedures. The precision ranking of the four approaches according to all above settings was MMC ≥ EC ≥ CC ≥ ZC. This study highlights the advantages of MGLS meta-analysis over the UM approach. The results suggested the use of the MMC procedure to overcome the lack of information for having a complete covariance matrix of the coefficients.
New robust bilinear least squares method for the analysis of spectral-pH matrix data.
Goicoechea, Héctor C; Olivieri, Alejandro C
2005-07-01
A new second-order multivariate method has been developed for the analysis of spectral-pH matrix data, based on a bilinear least-squares (BLLS) model achieving the second-order advantage and handling multiple calibration standards. A simulated Monte Carlo study of synthetic absorbance-pH data allowed comparison of the newly proposed BLLS methodology with constrained parallel factor analysis (PARAFAC) and with the combination multivariate curve resolution-alternating least-squares (MCR-ALS) technique under different conditions of sample-to-sample pH mismatch and analyte-background ratio. The results indicate an improved prediction ability for the new method. Experimental data generated by measuring absorption spectra of several calibration standards of ascorbic acid and samples of orange juice were subjected to second-order calibration analysis with PARAFAC, MCR-ALS, and the new BLLS method. The results indicate that the latter method provides the best analytical results in regard to analyte recovery in samples of complex composition requiring strict adherence to the second-order advantage. Linear dependencies appear when multivariate data are produced by using the pH or a reaction time as one of the data dimensions, posing a challenge to classical multivariate calibration models. The presently discussed algorithm is useful for these latter systems.
Ramdani, Sofiane; Bonnet, Vincent; Tallon, Guillaume; Lagarde, Julien; Bernard, Pierre Louis; Blain, Hubert
2016-08-01
Entropy measures are often used to quantify the regularity of postural sway time series. Recent methodological developments provided both multivariate and multiscale approaches allowing the extraction of complexity features from physiological signals; see "Dynamical complexity of human responses: A multivariate data-adaptive framework," in Bulletin of Polish Academy of Science and Technology, vol. 60, p. 433, 2012. The resulting entropy measures are good candidates for the analysis of bivariate postural sway signals exhibiting nonstationarity and multiscale properties. These methods are dependent on several input parameters such as embedding parameters. Using two data sets collected from institutionalized frail older adults, we numerically investigate the behavior of a recent multivariate and multiscale entropy estimator; see "Multivariate multiscale entropy: A tool for complexity analysis of multichannel data," Physical Review E, vol. 84, p. 061918, 2011. We propose criteria for the selection of the input parameters. Using these optimal parameters, we statistically compare the multivariate and multiscale entropy values of postural sway data of non-faller subjects to those of fallers. These two groups are discriminated by the resulting measures over multiple time scales. We also demonstrate that the typical parameter settings proposed in the literature lead to entropy measures that do not distinguish the two groups. This last result confirms the importance of the selection of appropriate input parameters.
Sun, Hui; Wang, Huiyu; Zhang, Aihua; Yan, Guangli; Han, Ying; Li, Yuan; Wu, Xiuhong; Meng, Xiangcai; Wang, Xijun
2016-01-01
As herbal medicines have an important position in health care systems worldwide, their current assessment and quality control are a major bottleneck. Cortex Phellodendri chinensis (CPC) and Cortex Phellodendri amurensis (CPA) are widely used in China; however, how to identify the species of CPA and CPC has become an urgent question. In this study, a multivariate analysis approach was applied to investigate the chemical discrimination of CPA and CPC. Principal component analysis showed that the two herbs could be separated clearly. Chemical markers such as berberine, palmatine, phellodendrine, magnoflorine, obacunone, and obaculactone were found through orthogonal partial least squares discriminant analysis, and were identified tentatively by accurate mass from quadrupole time-of-flight mass spectrometry. A total of 29 components can be used as chemical markers for discrimination of CPA and CPC. Of them, phellodendrine is significantly higher in CPC than in CPA, whereas obacunone and obaculactone are significantly higher in CPA than in CPC. The present study shows that chemical analysis based on a multivariate analysis approach contributes greatly to the investigation of CPA and CPC, and that the identified chemical markers as a whole should be used to discriminate the two herbal medicines; the results also provide chemical information for their quality assessment. A multivariate analysis approach was applied to investigate the herbal medicine. The chemical markers were identified through the multivariate analysis approach. A total of 29 components can be used as the chemical markers. UPLC-Q/TOF-MS-based multivariate analysis method for the herbal medicine samples. Abbreviations used: CPC: Cortex Phellodendri chinensis, CPA: Cortex Phellodendri amurensis, PCA: Principal component analysis, OPLS-DA: Orthogonal partial least squares discriminant analysis, BPI: Base peaks ion intensity.
Multivariate missing data in hydrology - Review and applications
NASA Astrophysics Data System (ADS)
Ben Aissia, Mohamed-Aymen; Chebana, Fateh; Ouarda, Taha B. M. J.
2017-12-01
Water resources planning and management require complete data sets of a number of hydrological variables, such as flood peaks and volumes. However, hydrologists are often faced with the problem of missing data (MD) in hydrological databases. Several methods are used to deal with the imputation of MD. During the last decade, multivariate approaches have gained popularity in the field of hydrology, especially in hydrological frequency analysis (HFA). However, the treatment of MD remains neglected in the multivariate HFA literature, where the focus has been mainly on the modeling component. For a complete analysis and in order to optimize the use of data, MD should also be treated in the multivariate setting prior to modeling and inference. Imputation of MD in the multivariate hydrological framework can have direct implications on the quality of the estimation. Indeed, the dependence between the series represents important additional information that can be included in the imputation process. The objective of the present paper is to highlight the importance of treating MD in multivariate hydrological frequency analysis by reviewing and applying multivariate imputation methods and by comparing univariate and multivariate imputation methods. An application is carried out for multiple flood attributes on three sites in order to evaluate the performance of the different methods based on the leave-one-out procedure. The results indicate that the performance of imputation methods can be improved by adopting the multivariate setting, compared to mean substitution and interpolation methods, especially when using the copula-based approach.
Wang, Yong; Yao, Xiaomei; Parthasarathy, Ranganathan
2008-01-01
Fourier transform infrared (FTIR) chemical imaging can be used to investigate molecular chemical features of the adhesive/dentin interfaces. However, the information is not straightforward, and is not easily extracted. The objective of this study was to use multivariate analysis methods, principal component analysis and fuzzy c-means clustering, to analyze spectral data in comparison with univariate analysis. The spectral imaging data collected from both the adhesive/healthy dentin and adhesive/caries-affected dentin specimens were used and compared. The univariate statistical methods such as mapping of intensities of specific functional group do not always accurately identify functional group locations and concentrations due to more or less band overlapping in adhesive and dentin. Apart from the ease with which information can be extracted, multivariate methods highlight subtle and often important changes in the spectra that are difficult to observe using univariate methods. The results showed that the multivariate methods gave more satisfactory, interpretable results than univariate methods and were conclusive in showing that they can discriminate and classify differences between healthy dentin and caries-affected dentin within the interfacial regions. It is demonstrated that the multivariate FTIR imaging approaches can be used in the rapid characterization of heterogeneous, complex structure. PMID:18980198
Docking and multivariate methods to explore HIV-1 drug-resistance: a comparative analysis
NASA Astrophysics Data System (ADS)
Almerico, Anna Maria; Tutone, Marco; Lauria, Antonino
2008-05-01
In this paper we describe a comparative analysis between multivariate and docking methods in the study of the drug resistance to the reverse transcriptase and the protease inhibitors. In our early papers we developed a simple but efficient method to evaluate the features of compounds that are less likely to trigger resistance or are effective against mutant HIV strains, using the multivariate statistical procedures PCA and DA. In the attempt to create a more solid background for the prediction of susceptibility or resistance, we carried out a comparative analysis between our previous multivariate approach and molecular docking study. The intent of this paper is not only to find further support to the results obtained by the combined use of PCA and DA, but also to evidence the structural features, in terms of molecular descriptors, similarity, and energetic contributions, derived from docking, which can account for the arising of drug-resistance against mutant strains.
Multivariate meta-analysis using individual participant data
Riley, R. D.; Price, M. J.; Jackson, D.; Wardle, M.; Gueyffier, F.; Wang, J.; Staessen, J. A.; White, I. R.
2016-01-01
When combining results across related studies, a multivariate meta-analysis allows the joint synthesis of correlated effect estimates from multiple outcomes. Joint synthesis can improve efficiency over separate univariate syntheses, may reduce selective outcome reporting biases, and enables joint inferences across the outcomes. A common issue is that within-study correlations needed to fit the multivariate model are unknown from published reports. However, provision of individual participant data (IPD) allows them to be calculated directly. Here, we illustrate how to use IPD to estimate within-study correlations, using a joint linear regression for multiple continuous outcomes and bootstrapping methods for binary, survival and mixed outcomes. In a meta-analysis of 10 hypertension trials, we then show how these methods enable multivariate meta-analysis to address novel clinical questions about continuous, survival and binary outcomes; treatment–covariate interactions; adjusted risk/prognostic factor effects; longitudinal data; prognostic and multiparameter models; and multiple treatment comparisons. Both frequentist and Bayesian approaches are applied, with example software code provided to derive within-study correlations and to fit the models. PMID:26099484
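The bootstrap idea mentioned above can be sketched as follows: resample participants with replacement, recompute the effect estimate for each outcome on every resample, and take the correlation of the paired estimates as the within-study correlation. This is only a minimal illustration and not the authors' software code. As a stand-in for real treatment-effect estimates it uses the plain mean of each outcome, and the function names, the number of resamples and the toy data are all assumptions.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def bootstrap_within_study_corr(outcome_a, outcome_b, n_boot=500, seed=0):
    """Estimate the correlation between two effect estimates in one study.

    Participants (index positions) are resampled jointly, so each resample
    keeps both outcomes of a participant together. The 'effect estimate'
    here is just the outcome mean, a placeholder for a fitted model.
    """
    rng = random.Random(seed)
    n = len(outcome_a)
    est_a, est_b = [], []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        est_a.append(sum(outcome_a[i] for i in idx) / n)
        est_b.append(sum(outcome_b[i] for i in idx) / n)
    return pearson_r(est_a, est_b)


# Toy individual participant data (invented for illustration only).
systolic = [142.0, 150.0, 138.0, 160.0, 155.0, 147.0, 152.0, 141.0]
diastolic = [88.0, 95.0, 85.0, 99.0, 97.0, 90.0, 93.0, 89.0]
print("within-study correlation:", bootstrap_within_study_corr(systolic, diastolic))
```

Resampling whole participants rather than individual outcome values is what preserves the dependence between outcomes that the within-study correlation is meant to capture.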
Multivariate geometry as an approach to algal community analysis
Allen, T.F.H.; Skagen, S.
1973-01-01
Multivariate analyses are put in the context of more usual approaches to phycological investigations. The intuitive common-sense involved in methods of ordination, classification and discrimination are emphasised by simple geometric accounts which avoid jargon and matrix algebra. Warnings are given that artifacts result from technique abuses by the naive or over-enthusiastic. An analysis of a simple periphyton data set is presented as an example of the approach. Suggestions are made as to situations in phycological investigations, where the techniques could be appropriate. The discipline is reprimanded for its neglect of the multivariate approach.
Multivariate optimum interpolation of surface pressure and surface wind over oceans
NASA Technical Reports Server (NTRS)
Bloom, S. C.; Baker, W. E.; Nestler, M. S.
1984-01-01
The present multivariate analysis method for surface pressure and winds incorporates ship wind observations into the analysis of surface pressure. For the specific case of 0000 GMT, on February 3, 1979, the additional data resulted in a global rms difference of 0.6 mb; individual maxima as large as 5 mb occurred over the North Atlantic and East Pacific Oceans. These differences are noted to be smaller than the analysis increments to the first-guess fields.
In situ X-ray diffraction analysis of (CF x) n batteries: signal extraction by multivariate analysis
Rodriguez, Mark A.; Keenan, Michael R.; Nagasubramanian, Ganesan
2007-11-10
In this study, (CF x) n cathode reaction during discharge has been investigated using in situ X-ray diffraction (XRD). Mathematical treatment of the in situ XRD data set was performed using multivariate curve resolution with alternating least squares (MCR–ALS), a technique of multivariate analysis. MCR–ALS analysis successfully separated the relatively weak XRD signal intensity due to the chemical reaction from the other inert cell component signals. The resulting dynamic reaction component revealed the loss of (CF x) n cathode signal together with the simultaneous appearance of LiF by-product intensity. Careful examination of the XRD data set revealed an additional dynamic component which may be associated with the formation of an intermediate compound during the discharge process.
Hurtado Rúa, Sandra M; Mazumdar, Madhu; Strawderman, Robert L
2015-12-30
Bayesian meta-analysis is an increasingly important component of clinical research, with multivariate meta-analysis a promising tool for studies with multiple endpoints. Model assumptions, including the choice of priors, are crucial aspects of multivariate Bayesian meta-analysis (MBMA) models. In a given model, two different prior distributions can lead to different inferences about a particular parameter. A simulation study was performed in which the impact of families of prior distributions for the covariance matrix of a multivariate normal random effects MBMA model was analyzed. Inferences about effect sizes were not particularly sensitive to prior choice, but the related covariance estimates were. A few families of prior distributions with small relative biases, tight mean squared errors, and close to nominal coverage for the effect size estimates were identified. Our results demonstrate the need for sensitivity analysis and suggest some guidelines for choosing prior distributions in this class of problems. The MBMA models proposed here are illustrated in a small meta-analysis example from the periodontal field and a medium meta-analysis from the study of stroke. Copyright © 2015 John Wiley & Sons, Ltd.
Effect of altered sensory conditions on multivariate descriptors of human postural sway
NASA Technical Reports Server (NTRS)
Kuo, A. D.; Speers, R. A.; Peterka, R. J.; Horak, F. B.; Peterson, B. W. (Principal Investigator)
1998-01-01
Multivariate descriptors of sway were used to test whether altered sensory conditions result not only in changes in amount of sway but also in postural coordination. Eigenvalues and directions of eigenvectors of the covariance of shank and hip angles were used as a set of multivariate descriptors. These quantities were measured in 14 healthy adult subjects performing the Sensory Organization test, which disrupts visual and somatosensory information used for spatial orientation. Multivariate analysis of variance and discriminant analysis showed that resulting sway changes were at least bivariate in character, with visual and somatosensory conditions producing distinct changes in postural coordination. The most significant changes were found when somatosensory information was disrupted by sway-referencing of the support surface (P = 3.2 × 10^(-10)). The resulting covariance measurements showed that subjects not only swayed more but also used increased hip motion analogous to the hip strategy. Disruption of vision, by either closing the eyes or sway-referencing the visual surround, also resulted in altered sway (P = 1.7 × 10^(-10)), with proportionately more motion of the center of mass than with platform sway-referencing. As shown by discriminant analysis, an optimal univariate measure could explain at most 90% of the behavior due to altered sensory conditions. The remaining 10%, while smaller, are highly significant changes in posture control that depend on sensory conditions. The results imply that normal postural coordination of the trunk and legs requires both somatosensory and visual information and that each sensory modality makes a unique contribution to posture control. Descending postural commands are multivariate in nature, and the motion at each joint is affected uniquely by input from multiple sensors.
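The descriptors named in this abstract, the eigenvalues and eigenvector directions of the covariance of shank and hip angles, can be computed in closed form for two signals. The sketch below is an illustration under stated assumptions: the function names and the toy angle series are hypothetical, the sample (n - 1) covariance is assumed, and the study's own preprocessing is not reproduced.

```python
import math


def sample_covariance_2d(x, y):
    """Return (var_x, cov_xy, var_y) using the n - 1 denominator."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    var_x = sum((a - mx) ** 2 for a in x) / (n - 1)
    var_y = sum((b - my) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    return var_x, cov_xy, var_y


def sway_descriptors(shank, hip):
    """Eigenvalues and principal direction of the 2x2 shank/hip covariance.

    For a symmetric matrix [[a, b], [b, c]] the eigenvalues are
    (a + c) / 2 +/- sqrt(((a - c) / 2)^2 + b^2), and the major axis makes
    an angle of 0.5 * atan2(2b, a - c) with the first (shank) axis.
    """
    a, b, c = sample_covariance_2d(shank, hip)
    mid = (a + c) / 2.0
    half_gap = math.hypot((a - c) / 2.0, b)
    major = mid + half_gap  # variance along the dominant coordination pattern
    minor = mid - half_gap  # variance orthogonal to it
    angle = 0.5 * math.atan2(2.0 * b, a - c)
    return major, minor, angle


# Toy joint-angle series in degrees (invented for illustration only).
shank_angle = [0.10, -0.20, 0.30, 0.00, -0.10, 0.20]
hip_angle = [0.30, -0.50, 0.70, 0.10, -0.40, 0.50]
major, minor, angle = sway_descriptors(shank_angle, hip_angle)
print("eigenvalues:", major, minor)
print("major-axis direction (deg):", math.degrees(angle))
```

The eigenvalues summarise how much sway there is, while the direction of the major axis summarises how shank and hip motion are combined, which is why the pair carries more information than any single univariate measure.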
Comparison of connectivity analyses for resting state EEG data
NASA Astrophysics Data System (ADS)
Olejarczyk, Elzbieta; Marzetti, Laura; Pizzella, Vittorio; Zappasodi, Filippo
2017-06-01
Objective. In the present work, a nonlinear measure (transfer entropy, TE) was used in a multivariate approach for the analysis of effective connectivity in high density resting state EEG data in eyes open and eyes closed. Advantages of the multivariate approach in comparison to the bivariate one were tested. Moreover, the multivariate TE was compared to an effective linear measure, i.e. directed transfer function (DTF). Finally, the existence of a relationship between the information transfer and the level of brain synchronization as measured by phase synchronization value (PLV) was investigated. Approach. The comparison between the connectivity measures, i.e. bivariate versus multivariate TE, TE versus DTF, TE versus PLV, was performed by means of statistical analysis of indexes based on graph theory. Main results. The multivariate approach is less sensitive to false indirect connections with respect to the bivariate estimates. The multivariate TE differentiated better between eyes closed and eyes open conditions compared to DTF. Moreover, the multivariate TE evidenced non-linear phenomena in information transfer, which are not evidenced by the use of DTF. We also showed that the target of information flow, in particular the frontal region, is an area of greater brain synchronization. Significance. Comparison of different connectivity analysis methods pointed to the advantages of nonlinear methods, and indicated a relationship existing between the flow of information and the level of synchronization of the brain.
NASA Astrophysics Data System (ADS)
Azami, Hamed; Escudero, Javier
2017-01-01
Multiscale entropy (MSE) is an appealing tool to characterize the complexity of time series over multiple temporal scales. Recent developments in the field have tried to extend the MSE technique in different ways. Building on these trends, we propose the so-called refined composite multivariate multiscale fuzzy entropy (RCmvMFE) whose coarse-graining step uses variance (RCmvMFEσ²) or mean (RCmvMFEμ). We investigate the behavior of these multivariate methods on multichannel white Gaussian and 1/f noise signals, and two publicly available biomedical recordings. Our simulations demonstrate that RCmvMFEσ² and RCmvMFEμ lead to more stable results and are less sensitive to the signals' length in comparison with the other existing multivariate multiscale entropy-based methods. The classification results also show that using both the variance and mean in the coarse-graining step offers complexity profiles with complementary information for biomedical signal analysis. We also made freely available all the Matlab codes used in this paper.
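The coarse-graining step referred to above can be sketched for a single channel as follows. This is a minimal illustration of non-overlapping window coarse-graining with either the mean or the variance as the summary statistic; it does not implement the refined composite averaging over shifted windows or the fuzzy entropy computation itself. The function name is hypothetical, and the use of the population variance is an assumption, since the abstract does not specify the estimator.

```python
from statistics import fmean, pvariance


def coarse_grain(series, scale, stat="mean"):
    """Summarise consecutive non-overlapping windows of length `scale`.

    stat="mean" gives the classical multiscale coarse-graining;
    stat="variance" replaces the window mean with the window variance
    (population variance here, which is an assumption of this sketch).
    Trailing samples that do not fill a whole window are dropped.
    """
    if scale < 1:
        raise ValueError("scale must be a positive integer")
    if stat == "mean":
        reducer = fmean
    elif stat == "variance":
        reducer = pvariance
    else:
        raise ValueError("stat must be 'mean' or 'variance'")
    n_windows = len(series) // scale
    return [reducer(series[i * scale:(i + 1) * scale]) for i in range(n_windows)]


# For a multichannel signal, the same coarse-graining is applied per channel.
channels = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
    [0.5, 0.1, 0.9, 0.3, 0.7, 0.2],
]
for channel in channels:
    print(coarse_grain(channel, 2, "mean"), coarse_grain(channel, 2, "variance"))
```

Note that variance-based coarse-graining is only informative for scales of at least 2, because the variance of a single-sample window is always zero.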
Chen, Yong; Luo, Sheng; Chu, Haitao; Wei, Peng
2013-05-01
Multivariate meta-analysis is useful in combining evidence from independent studies which involve several comparisons among groups based on a single outcome. For binary outcomes, the commonly used statistical models for multivariate meta-analysis are multivariate generalized linear mixed effects models which assume risks, after some transformation, follow a multivariate normal distribution with possible correlations. In this article, we consider an alternative model for multivariate meta-analysis where the risks are modeled by the multivariate beta distribution proposed by Sarmanov (1966). This model has several attractive features compared to the conventional multivariate generalized linear mixed effects models, including simplicity of the likelihood function, no need to specify a link function, and a closed-form expression of distribution functions for study-specific risk differences. We investigate the finite sample performance of this model by simulation studies and illustrate its use with an application to multivariate meta-analysis of adverse events of tricyclic antidepressants treatment in clinical trials.
NASA Astrophysics Data System (ADS)
Huang, W.; Campredon, R.; Abrao, J. J.; Bernat, M.; Latouche, C.
1994-06-01
In the last decade, the Atlantic coast of south-eastern Brazil has been affected by increasing deforestation and anthropogenic effluents. Sediments in the coastal lagoons have recorded the process of such environmental change. Thirty-seven sediment samples from three cores in Piratininga Lagoon, Rio de Janeiro, were analyzed for their major components and minor element concentrations in order to examine geochemical characteristics and the depositional environment and to investigate the variation of heavy metals of environmental concern. Two multivariate analysis methods, principal component analysis and cluster analysis, were performed on the analytical data set to help visualize the sample clusters and the element associations. On the whole, the sediment samples from each core are similar and the sample clusters corresponding to the three cores are clearly separated, as a result of the different conditions of sedimentation. Some changes in the depositional environment are recognized using the results of multivariate analysis. The enrichment of Pb, Cu, and Zn in the upper parts of cores is in agreement with increasing anthropogenic influx (pollution).
DOE Office of Scientific and Technical Information (OSTI.GOV)
Loveday, D.L.; Craggs, C.
Box-Jenkins-based multivariate stochastic modeling is carried out using data recorded from a domestic heating system. The system comprises an air-source heat pump sited in the roof space of a house, solar assistance being provided by the conventional tile roof acting as a radiation absorber. Multivariate models are presented which illustrate the time-dependent relationships between three air temperatures - at external ambient, at entry to, and at exit from, the heat pump evaporator. Using a deterministic modeling approach, physical interpretations are placed on the results of the multivariate technique. It is concluded that the multivariate Box-Jenkins approach is a suitable technique for building thermal analysis. Application to multivariate model-based control is discussed, with particular reference to building energy management systems. It is further concluded that stochastic modeling of data drawn from a short monitoring period offers a means of retrofitting an advanced model-based control system in existing buildings, which could be used to optimize energy savings. An approach to system simulation is suggested.
Maione, Camila; Barbosa, Rommel Melgaço
2018-01-24
Rice is one of the most important staple foods around the world. Authentication of rice is one of the most addressed concerns in the present literature, which includes recognition of its geographical origin and variety, certification of organic rice and many other issues. Good results have been achieved by multivariate data analysis and data mining techniques when combined with specific parameters for ascertaining authenticity and many other useful characteristics of rice, such as quality, yield and others. This paper brings a review of the recent research projects on discrimination and authentication of rice using multivariate data analysis and data mining techniques. We found that data obtained from image processing, molecular and atomic spectroscopy, elemental fingerprinting, genetic markers, molecular content and others are promising sources of information regarding geographical origin, variety and other aspects of rice, being widely used combined with multivariate data analysis techniques. Principal component analysis and linear discriminant analysis are the preferred methods, but several other data classification techniques such as support vector machines, artificial neural networks and others are also frequently present in some studies and show high performance for discrimination of rice.
Li, Jinling; He, Ming; Han, Wei; Gu, Yifan
2009-05-30
An investigation of the sources of heavy metals, i.e., Cu, Zn, Ni, Pb, Cr, and Cd, in the coastal soils of Shanghai, China, was conducted using multivariate statistical methods (principal component analysis, clustering analysis, and correlation analysis). All the results of the multivariate analysis showed that: (i) Cu, Ni, Pb, and Cd had anthropogenic sources (e.g., overuse of chemical fertilizers and pesticides, industrial and municipal discharges, animal wastes, sewage irrigation, etc.); (ii) Zn and Cr were associated with parent materials and therefore had natural sources (e.g., the weathering process of parent materials and subsequent pedogenesis due to the alluvial deposits). Heavy metals in the soils were greatly affected by soil formation, atmospheric deposition, and human activities. These findings provided essential information on the possible sources of heavy metals, which would contribute to the monitoring and assessment process of agricultural soils in regions worldwide.
Characterizing multivariate decoding models based on correlated EEG spectral features.
McFarland, Dennis J
2013-07-01
Multivariate decoding methods are popular techniques for analysis of neurophysiological data. The present study explored potential interpretative problems with these techniques when predictors are correlated. Data from sensorimotor rhythm-based cursor control experiments was analyzed offline with linear univariate and multivariate models. Features were derived from autoregressive (AR) spectral analysis of varying model order which produced predictors that varied in their degree of correlation (i.e., multicollinearity). The use of multivariate regression models resulted in much better prediction of target position as compared to univariate regression models. However, with lower order AR features interpretation of the spectral patterns of the weights was difficult. This is likely to be due to the high degree of multicollinearity present with lower order AR features. Care should be exercised when interpreting the pattern of weights of multivariate models with correlated predictors. Comparison with univariate statistics is advisable. While multivariate decoding algorithms are very useful for prediction their utility for interpretation may be limited when predictors are correlated. Copyright © 2013 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
Multivariate meta-analysis using individual participant data.
Riley, R D; Price, M J; Jackson, D; Wardle, M; Gueyffier, F; Wang, J; Staessen, J A; White, I R
2015-06-01
When combining results across related studies, a multivariate meta-analysis allows the joint synthesis of correlated effect estimates from multiple outcomes. Joint synthesis can improve efficiency over separate univariate syntheses, may reduce selective outcome reporting biases, and enables joint inferences across the outcomes. A common issue is that within-study correlations needed to fit the multivariate model are unknown from published reports. However, provision of individual participant data (IPD) allows them to be calculated directly. Here, we illustrate how to use IPD to estimate within-study correlations, using a joint linear regression for multiple continuous outcomes and bootstrapping methods for binary, survival and mixed outcomes. In a meta-analysis of 10 hypertension trials, we then show how these methods enable multivariate meta-analysis to address novel clinical questions about continuous, survival and binary outcomes; treatment-covariate interactions; adjusted risk/prognostic factor effects; longitudinal data; prognostic and multiparameter models; and multiple treatment comparisons. Both frequentist and Bayesian approaches are applied, with example software code provided to derive within-study correlations and to fit the models. © 2014 The Authors. Research Synthesis Methods published by John Wiley & Sons, Ltd.
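The bootstrap idea in the abstract above can be sketched as follows. This is a minimal illustration and not the authors' software code: the outcome names `sbp` and `dbp`, the simulated trial, and all function names are hypothetical, and the bootstrap is applied to two continuous outcomes only to keep the example short (the abstract uses a joint linear regression for continuous outcomes and bootstrapping for binary, survival and mixed outcomes). Participants are resampled with replacement, both treatment effects are re-estimated on each resample, and the correlation between the two sets of estimates is taken as the within-study correlation.

```python
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def mean_difference(rows, outcome):
    """Treated-minus-control difference in means for one outcome column."""
    treated = [r[outcome] for r in rows if r["treat"] == 1]
    control = [r[outcome] for r in rows if r["treat"] == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def bootstrap_within_study_corr(rows, n_boot=500, seed=0):
    """Correlation between two treatment-effect estimates across bootstrap resamples."""
    rng = random.Random(seed)
    treated = [r for r in rows if r["treat"] == 1]
    control = [r for r in rows if r["treat"] == 0]
    est1, est2 = [], []
    for _ in range(n_boot):
        # Resample participants within each arm so neither arm can end up empty.
        sample = ([rng.choice(treated) for _ in treated]
                  + [rng.choice(control) for _ in control])
        est1.append(mean_difference(sample, "sbp"))
        est2.append(mean_difference(sample, "dbp"))
    return pearson(est1, est2)


# Simulated IPD for one hypothetical trial with two positively correlated outcomes.
sim = random.Random(42)
ipd = []
for i in range(200):
    treat = i % 2
    shared = sim.gauss(0, 1)  # shared noise induces correlation between outcomes
    ipd.append({
        "treat": treat,
        "sbp": -5 * treat + 10 * shared + sim.gauss(0, 5),
        "dbp": -3 * treat + 6 * shared + sim.gauss(0, 4),
    })

rho = bootstrap_within_study_corr(ipd)
print(round(rho, 2))
```

Resampling within each arm is a design choice made here so that every resample keeps both arms populated; other resampling schemes are possible.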
A Multivariate Test of the Bott Hypothesis in an Urban Irish Setting
ERIC Educational Resources Information Center
Gordon, Michael; Downing, Helen
1978-01-01
Using a sample of 686 married Irish women in Cork City the Bott hypothesis was tested, and the results of a multivariate regression analysis revealed that neither network connectedness nor the strength of the respondent's emotional ties to the network had any explanatory power. (Author)
Borrowing of strength and study weights in multivariate and network meta-analysis.
Jackson, Dan; White, Ian R; Price, Malcolm; Copas, John; Riley, Richard D
2017-12-01
Multivariate and network meta-analysis have the potential for the estimated mean of one effect to borrow strength from the data on other effects of interest. The extent of this borrowing of strength is usually assessed informally. We present new mathematical definitions of 'borrowing of strength'. Our main proposal is based on a decomposition of the score statistic, which we show can be interpreted as comparing the precision of estimates from the multivariate and univariate models. Our definition of borrowing of strength therefore emulates the usual informal assessment. We also derive a method for calculating study weights, which we embed into the same framework as our borrowing of strength statistics, so that percentage study weights can accompany the results from multivariate and network meta-analyses as they do in conventional univariate meta-analyses. Our proposals are illustrated using three meta-analyses involving correlated effects for multiple outcomes, multiple risk factor associations and multiple treatments (network meta-analysis).
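For orientation, the conventional univariate percentage study weights mentioned above, and the informal precision comparison between models, can be sketched as below. This is an illustrative sketch only: the function names and numbers are hypothetical, the weights are the standard inverse-variance ones from a univariate fixed-effect meta-analysis, and `precision_gain` is a simple variance ratio that is not necessarily the score-based statistic the authors define.

```python
def percentage_weights(variances):
    """Inverse-variance percentage weights for a univariate fixed-effect meta-analysis."""
    precisions = [1.0 / v for v in variances]
    total = sum(precisions)
    return [100.0 * p / total for p in precisions]


def pooled_estimate(estimates, variances):
    """Inverse-variance weighted mean and its variance."""
    precisions = [1.0 / v for v in variances]
    total = sum(precisions)
    mean = sum(p * y for p, y in zip(precisions, estimates)) / total
    return mean, 1.0 / total


def precision_gain(var_univariate, var_multivariate):
    """Percentage reduction in variance when moving from the univariate to the multivariate model."""
    return 100.0 * (1.0 - var_multivariate / var_univariate)


# Hypothetical study estimates and within-study variances for one outcome.
estimates = [0.30, 0.10, 0.25]
variances = [0.04, 0.01, 0.02]

weights = percentage_weights(variances)
mean, var_uni = pooled_estimate(estimates, variances)

# Assumed variance of the same pooled effect from a multivariate model (made up).
var_multi = 0.005
gain = precision_gain(var_uni, var_multi)
print([round(w, 1) for w in weights], round(mean, 3), round(gain, 1))
```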
Multivariate longitudinal data analysis with censored and intermittent missing responses.
Lin, Tsung-I; Lachos, Victor H; Wang, Wan-Lun
2018-05-08
The multivariate linear mixed model (MLMM) has emerged as an important analytical tool for longitudinal data with multiple outcomes. However, the analysis of multivariate longitudinal data could be complicated by the presence of censored measurements because of a detection limit of the assay in combination with unavoidable missing values arising when subjects miss some of their scheduled visits intermittently. This paper presents a generalization of the MLMM approach, called the MLMM-CM, for a joint analysis of the multivariate longitudinal data with censored and intermittent missing responses. A computationally feasible expectation maximization-based procedure is developed to carry out maximum likelihood estimation within the MLMM-CM framework. Moreover, the asymptotic standard errors of fixed effects are explicitly obtained via the information-based method. We illustrate our methodology by using simulated data and a case study from an AIDS clinical trial. Experimental results reveal that the proposed method is able to provide more satisfactory performance as compared with the traditional MLMM approach. Copyright © 2018 John Wiley & Sons, Ltd.
Borrowing of strength and study weights in multivariate and network meta-analysis
Jackson, Dan; White, Ian R; Price, Malcolm; Copas, John; Riley, Richard D
2016-01-01
Multivariate and network meta-analysis have the potential for the estimated mean of one effect to borrow strength from the data on other effects of interest. The extent of this borrowing of strength is usually assessed informally. We present new mathematical definitions of ‘borrowing of strength’. Our main proposal is based on a decomposition of the score statistic, which we show can be interpreted as comparing the precision of estimates from the multivariate and univariate models. Our definition of borrowing of strength therefore emulates the usual informal assessment. We also derive a method for calculating study weights, which we embed into the same framework as our borrowing of strength statistics, so that percentage study weights can accompany the results from multivariate and network meta-analyses as they do in conventional univariate meta-analyses. Our proposals are illustrated using three meta-analyses involving correlated effects for multiple outcomes, multiple risk factor associations and multiple treatments (network meta-analysis). PMID:26546254
Ghisi, Nédia C; Oliveira, Elton C; Mendonça Mota, Thais F; Vanzetto, Guilherme V; Roque, Aliciane A; Godinho, Jayson P; Bettim, Franciele Lima; Silva de Assis, Helena Cristina da; Prioli, Alberto J
2016-10-01
Aquatic pollutants produce multiple consequences in organisms, populations, communities and ecosystems, affecting the function of organs, reproductive state, population size, species survival and even biodiversity. In order to monitor the health of aquatic organisms, biomarkers have been used as effective tools in environmental risk assessment. The aim of this study is to evaluate, through a multivariate and integrative analysis, the response of the native species Hypostomus ancistroides over a pollution gradient in the main water supply body of northwestern Paraná state (Brazil). The condition factor, micronucleus test and erythrocyte nuclear abnormalities (ENA), comet assay, measurement of the cerebral and muscular enzyme acetylcholinesterase (AChE), and histopathological analysis of liver and gill were evaluated in fishes from three sites of the Pirapó River during the dry and rainy seasons. The multivariate general result showed that the interaction between the seasons and the sites was significant: there are variations in the rates of alterations in the biological parameters, depending on the time of year researched at each site. In general, the best results were observed for the site nearest the spring, and alterations in the parameters at the intermediate and downstream sites. In sum, the results of this study showed the necessity of a multivariate analysis, evaluating several biological parameters, to obtain an integrated response to the effects of the environmental pollutants on the organisms. Copyright © 2016 Elsevier Ltd. All rights reserved.
Westman, Eric; Aguilar, Carlos; Muehlboeck, J-Sebastian; Simmons, Andrew
2013-01-01
Automated structural magnetic resonance imaging (MRI) processing pipelines are gaining popularity for Alzheimer's disease (AD) research. They generate regional volumes, cortical thickness measures and other measures, which can be used as input for multivariate analysis. It is not clear which combination of measures and normalization approach is most useful for AD classification and for predicting mild cognitive impairment (MCI) conversion. The current study includes MRI scans from 699 subjects [AD, MCI and controls (CTL)] from the Alzheimer's Disease Neuroimaging Initiative (ADNI). The Freesurfer pipeline was used to generate regional volume, cortical thickness, gray matter volume, surface area, mean curvature, Gaussian curvature, folding index and curvature index measures. 259 variables were used for orthogonal partial least squares to latent structures (OPLS) multivariate analysis. Normalisation approaches were explored and the optimal combination of measures determined. Results indicate that cortical thickness measures should not be normalized, while volumes should probably be normalized by intracranial volume (ICV). Combining regional cortical thickness measures (not normalized) with cortical and subcortical volumes (normalized with ICV) using OPLS gave a prediction accuracy of 91.5 % when distinguishing AD versus CTL. This model prospectively predicted future decline from MCI to AD with 75.9 % of converters correctly classified. Normalization strategy did not have a significant effect on the accuracies of multivariate models containing multiple MRI measures for this large dataset. The appropriate choice of input for multivariate analysis in AD and MCI is of great importance. The results support the use of un-normalised cortical thickness measures and volumes normalised by ICV.
STAMATE, MIRELA CRISTINA; TODOR, NICOLAE; COSGAREA, MARCEL
2015-01-01
Background and aim: The clinical utility of otoacoustic emissions as a noninvasive objective test of cochlear function has long been studied. Both transient otoacoustic emissions and distortion products can be used to identify hearing loss, but to what extent they can be used as predictors for hearing loss is still debated. Most studies agree that multivariate analyses have better test performances than univariate analyses. The aim of the study was to determine the performance of transient otoacoustic emissions and distortion products in identifying normal and impaired hearing, using the pure tone audiogram as a gold standard procedure and different multivariate statistical approaches. Methods: The study included 105 adult subjects with normal hearing and hearing loss who underwent the same test battery: pure-tone audiometry, tympanometry, otoacoustic emission tests. We chose to use logistic regression as a multivariate statistical technique. Three logistic regression models were developed to characterize the relations between different risk factors (age, sex, tinnitus, demographic features, cochlear status defined by otoacoustic emissions) and hearing status defined by pure-tone audiometry. The multivariate analyses allow the calculation of the logistic score, which is a combination of the inputs, weighted by coefficients, calculated within the analyses. The accuracy of each model was assessed using receiver operating characteristic curve analysis. We used the logistic score to generate receiver operating characteristic curves and to estimate the areas under the curves in order to compare different multivariate analyses. Results: We compared the performance of each otoacoustic emission (transient, distortion product) using three different multivariate analyses for each ear, when multi-frequency gold standards were used. We demonstrated that all multivariate analyses provided high values of the area under the curve, proving the performance of the otoacoustic emissions.
Each otoacoustic emission test presented high values of area under the curve, suggesting that implementing a multivariate approach to evaluate the performance of each otoacoustic emission test would serve to increase the accuracy in identifying normal and impaired ears. We encountered the highest area under the curve value for the combined multivariate analysis, suggesting that both otoacoustic emission tests should be used in assessing hearing status. Our multivariate analyses revealed that age is a constant predictive factor of the auditory status for both ears, but the presence of tinnitus was the most important predictor for the hearing level, only for the left ear. Age presented similar coefficients, but tinnitus coefficients, by their high value, produced the highest variations of the logistic scores, only for the left ear group, thus increasing the risk of hearing loss. We did not find gender differences between ears for any otoacoustic emission tests, but studies still debate this question as the results are contradictory. Neither gender nor environment of origin had any predictive value for the hearing status, according to the results of our study. Conclusion: Like any other audiological test, using otoacoustic emissions to identify hearing loss is not without error. Even when applying multivariate analysis, perfect test performance is never achieved. Although most studies demonstrated the benefit of using multivariate analysis, it has not been incorporated into clinical decisions, perhaps because of the idiosyncratic nature of multivariate solutions or because of the lack of validation studies. PMID:26733749
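The logistic score and the area under the receiver operating characteristic curve described in this abstract can be sketched as below. Everything numeric here is made up for illustration: the coefficients, the intercept, the feature choice (age, tinnitus, emission amplitude) and the labels are hypothetical and are not taken from the study. The AUC is computed as the proportion of (impaired, normal) pairs in which the impaired ear receives the higher score, with ties counted as one half.

```python
import math


def logistic_score(features, coefficients, intercept):
    """Weighted combination of the inputs passed through the logistic function."""
    linear = intercept + sum(c * x for c, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-linear))


def roc_auc(scores, labels):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count half
    return wins / (len(pos) * len(neg))


# Hypothetical ears: (age in decades, tinnitus 0/1, emission amplitude in dB SPL).
ears = [
    (3.0, 0, 12.0),
    (4.5, 0, 9.0),
    (6.0, 1, 2.0),
    (7.0, 1, -1.0),
    (5.0, 0, 4.0),
    (4.0, 0, 0.0),
]
labels = [0, 0, 1, 1, 1, 0]  # 1 = hearing loss on pure-tone audiometry (made up)

coefs = [0.6, 1.2, -0.3]  # hypothetical regression coefficients
intercept = -2.0          # hypothetical intercept

scores = [logistic_score(e, coefs, intercept) for e in ears]
auc = roc_auc(scores, labels)
print(round(auc, 3))
```

Comparing models then amounts to computing `roc_auc` on the scores of each fitted model against the same gold-standard labels.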
Multivariate analysis in thoracic research.
Mengual-Macenlle, Noemí; Marcos, Pedro J; Golpe, Rafael; González-Rivas, Diego
2015-03-01
Multivariate analysis is based on the observation and analysis of more than one statistical outcome variable at a time. In design and analysis, the technique is used to perform trade studies across multiple dimensions while taking into account the effects of all variables on the responses of interest. Multivariate methods were developed to analyze large databases and increasingly complex data. Since modeling is the best way to represent knowledge of reality, we should use multivariate statistical methods. Multivariate methods are designed to analyze data sets simultaneously, i.e., to analyze different variables for each person or object studied. Keep in mind at all times that all variables must be treated so that they accurately reflect the reality of the problem addressed. There are different types of multivariate analysis, and each one should be employed according to the type of variables to analyze: dependence, interdependence and structural methods. In conclusion, multivariate methods are ideal for the analysis of large data sets and for finding cause-and-effect relationships between variables; there is a wide range of analysis types that we can use.
Vitte, Joana; Ranque, Stéphane; Carsin, Ania; Gomez, Carine; Romain, Thomas; Cassagne, Carole; Gouitaa, Marion; Baravalle-Einaudi, Mélisande; Bel, Nathalie Stremler-Le; Reynaud-Gaubert, Martine; Dubus, Jean-Christophe; Mège, Jean-Louis; Gaudart, Jean
2017-01-01
Molecular-based allergy diagnosis yields multiple biomarker datasets. The classical diagnostic score for allergic bronchopulmonary aspergillosis (ABPA), a severe disease usually occurring in asthmatic patients and people with cystic fibrosis, comprises succinct immunological criteria formulated in 1977: total IgE, anti-Aspergillus fumigatus (Af) IgE, anti-Af "precipitins," and anti-Af IgG. Progress achieved over the last four decades led to multiple IgE and IgG(4) Af biomarkers available with quantitative, standardized, molecular-level reports. These newly available biomarkers have not been included in the current diagnostic criteria, either individually or in algorithms, despite persistent underdiagnosis of ABPA. Large numbers of individual biomarkers may hinder their use in clinical practice. Conversely, multivariate analysis using new tools may offer a better chance of fewer diagnostic mistakes. We report here a proof-of-concept work consisting of a three-step multivariate analysis of Af IgE, IgG, and IgG4 biomarkers through a combination of principal component analysis, hierarchical ascendant classification, and classification and regression tree multivariate analysis. The resulting diagnostic algorithms might show the way for novel criteria and improved diagnostic efficiency in Af-sensitized patients at risk for ABPA.
Hou, Deyi; O'Connor, David; Nathanail, Paul; Tian, Li; Ma, Yan
2017-12-01
Heavy metal soil contamination is associated with potential toxicity to humans or ecotoxicity. Scholars have increasingly used a combination of geographical information science (GIS) with geostatistical and multivariate statistical analysis techniques to examine the spatial distribution of heavy metals in soils at a regional scale. A review of such studies showed that most soil sampling programs were based on grid patterns and composite sampling methodologies. Many programs intended to characterize various soil types and land use types. The most often used sampling depth intervals were 0-0.10 m or 0-0.20 m below surface, and the sampling densities used ranged from 0.0004 to 6.1 samples per km², with a median of 0.4 samples per km². The most widely used spatial interpolators were inverse distance weighted interpolation and ordinary kriging, and the most often used multivariate statistical analysis techniques were principal component analysis and cluster analysis. The review also identified several determining and correlating factors in heavy metal distribution in soils, including soil type, soil pH, soil organic matter, land use type, Fe, Al, and heavy metal concentrations. The major natural and anthropogenic sources of heavy metals were found to derive from lithogenic origin, roadway and transportation, atmospheric deposition, wastewater and runoff from industrial and mining facilities, fertilizer application, livestock manure, and sewage sludge. This review argues that the full potential of integrated GIS and multivariate statistical analysis for assessing heavy metal distribution in soils on a regional scale has not yet been fully realized.
It is proposed that future research be conducted to map multivariate results in GIS to pinpoint specific anthropogenic sources, to analyze temporal trends in addition to spatial patterns, to optimize modeling parameters, and to expand the use of different multivariate analysis tools beyond principal component analysis (PCA) and cluster analysis (CA). Copyright © 2017 Elsevier Ltd. All rights reserved.
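Inverse distance weighted interpolation, named in this review as one of the most widely used spatial interpolators, can be sketched in a few lines. The sample coordinates and Pb concentrations below are hypothetical, and the power parameter of 2 is a common default assumed here rather than a value taken from the review. The estimate at a location is the weighted mean of the sampled values, with each weight equal to the inverse of the distance raised to the chosen power.

```python
import math


def idw(x, y, samples, power=2.0):
    """Inverse distance weighted estimate at (x, y) from (xi, yi, value) samples."""
    num = 0.0
    den = 0.0
    for xi, yi, value in samples:
        d = math.hypot(x - xi, y - yi)
        if d == 0.0:
            return value  # the query point coincides with a sample point
        w = 1.0 / d ** power
        num += w * value
        den += w
    return num / den


# Hypothetical Pb concentrations (mg/kg) on a 2 km grid (coordinates in km).
samples = [
    (0.0, 0.0, 20.0),
    (2.0, 0.0, 40.0),
    (0.0, 2.0, 60.0),
    (2.0, 2.0, 80.0),
]

center = idw(1.0, 1.0, samples)     # equidistant from all four samples
near_first = idw(0.1, 0.1, samples)  # dominated by the sample at the origin
print(round(center, 1), round(near_first, 1))
```

Larger values of `power` make the estimate more local; ordinary kriging, the other interpolator mentioned, additionally requires a fitted variogram and is not shown.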
Multivariate Methods for Meta-Analysis of Genetic Association Studies.
Dimou, Niki L; Pantavou, Katerina G; Braliou, Georgia G; Bagos, Pantelis G
2018-01-01
Multivariate meta-analysis of genetic association studies and genome-wide association studies has received remarkable attention as it improves the precision of the analysis. Here, we review, summarize and present in a unified framework methods for multivariate meta-analysis of genetic association studies and genome-wide association studies. Starting with the statistical methods used for robust analysis and genetic model selection, we present in brief univariate methods for meta-analysis and we then scrutinize multivariate methodologies. Multivariate models of meta-analysis for single gene-disease association studies, including models for haplotype association studies, multiple linked polymorphisms and multiple outcomes, are discussed. The popular Mendelian randomization approach and special cases of meta-analysis addressing issues such as the assumption of the mode of inheritance, deviation from Hardy-Weinberg Equilibrium and gene-environment interactions are also presented. All available methods are enriched with practical applications, and methodologies that could be developed in the future are discussed. Links for all available software implementing multivariate meta-analysis methods are also provided.
Kia, Seyed Mostafa; Vega Pons, Sandro; Weisz, Nathan; Passerini, Andrea
2016-01-01
Brain decoding is a popular multivariate approach for hypothesis testing in neuroimaging. Linear classifiers are widely employed in the brain decoding paradigm to discriminate among experimental conditions. Then, the derived linear weights are visualized in the form of multivariate brain maps to further study spatio-temporal patterns of underlying neural activities. It is well known that the brain maps derived from weights of linear classifiers are hard to interpret because of high correlations between predictors, low signal to noise ratios, and the high dimensionality of neuroimaging data. Therefore, improving the interpretability of brain decoding approaches is of primary interest in many neuroimaging studies. Despite extensive studies of this type, at present, there is no formal definition for interpretability of multivariate brain maps. As a consequence, there is no quantitative measure for evaluating the interpretability of different brain decoding methods. In this paper, first, we present a theoretical definition of interpretability in brain decoding; we show that the interpretability of multivariate brain maps can be decomposed into their reproducibility and representativeness. Second, as an application of the proposed definition, we exemplify a heuristic for approximating the interpretability in multivariate analysis of evoked magnetoencephalography (MEG) responses. Third, we propose to combine the approximated interpretability and the generalization performance of the brain decoding into a new multi-objective criterion for model selection. Our results, for the simulated and real MEG data, show that optimizing the hyper-parameters of the regularized linear classifier based on the proposed criterion results in more informative multivariate brain maps. 
More importantly, the presented definition provides the theoretical background for quantitative evaluation of interpretability, and hence, facilitates the development of more effective brain decoding algorithms in the future.
Kia, Seyed Mostafa; Vega Pons, Sandro; Weisz, Nathan; Passerini, Andrea
2017-01-01
Brain decoding is a popular multivariate approach for hypothesis testing in neuroimaging. Linear classifiers are widely employed in the brain decoding paradigm to discriminate among experimental conditions. Then, the derived linear weights are visualized in the form of multivariate brain maps to further study spatio-temporal patterns of underlying neural activities. It is well known that the brain maps derived from weights of linear classifiers are hard to interpret because of high correlations between predictors, low signal to noise ratios, and the high dimensionality of neuroimaging data. Therefore, improving the interpretability of brain decoding approaches is of primary interest in many neuroimaging studies. Despite extensive studies of this type, at present, there is no formal definition for interpretability of multivariate brain maps. As a consequence, there is no quantitative measure for evaluating the interpretability of different brain decoding methods. In this paper, first, we present a theoretical definition of interpretability in brain decoding; we show that the interpretability of multivariate brain maps can be decomposed into their reproducibility and representativeness. Second, as an application of the proposed definition, we exemplify a heuristic for approximating the interpretability in multivariate analysis of evoked magnetoencephalography (MEG) responses. Third, we propose to combine the approximated interpretability and the generalization performance of the brain decoding into a new multi-objective criterion for model selection. Our results, for the simulated and real MEG data, show that optimizing the hyper-parameters of the regularized linear classifier based on the proposed criterion results in more informative multivariate brain maps. 
More importantly, the presented definition provides the theoretical background for quantitative evaluation of interpretability, and hence, facilitates the development of more effective brain decoding algorithms in the future. PMID:28167896
ERIC Educational Resources Information Center
Grochowalski, Joseph H.
2015-01-01
Component Universe Score Profile analysis (CUSP) is introduced in this paper as a psychometric alternative to multivariate profile analysis. The theoretical foundations of CUSP analysis are reviewed, which include multivariate generalizability theory and constrained principal components analysis. Because CUSP is a combination of generalizability…
Cain, Meghan K; Zhang, Zhiyong; Yuan, Ke-Hai
2017-10-01
Nonnormality of univariate data has been extensively examined previously (Blanca et al., Methodology: European Journal of Research Methods for the Behavioral and Social Sciences, 9(2), 78-84, 2013; Micceri, Psychological Bulletin, 105(1), 156, 1989). However, less is known of the potential nonnormality of multivariate data although multivariate analysis is commonly used in psychological and educational research. Using univariate and multivariate skewness and kurtosis as measures of nonnormality, this study examined 1,567 univariate distributions and 254 multivariate distributions collected from authors of articles published in Psychological Science and the American Educational Research Journal. We found that 74 % of univariate distributions and 68 % of multivariate distributions deviated from normal distributions. In a simulation study using typical values of skewness and kurtosis that we collected, we found that the resulting type I error rates were 17 % in a t-test and 30 % in a factor analysis under some conditions. Hence, we argue that it is time to routinely report skewness and kurtosis along with other summary statistics such as means and variances. To facilitate future reporting of skewness and kurtosis, we provide a tutorial on how to compute univariate and multivariate skewness and kurtosis using SAS, SPSS, R and a newly developed Web application.
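A minimal sketch of the quantities this abstract recommends reporting is given below. It is not the authors' tutorial code. The univariate functions use the plain moment-based definitions (software packages often apply small-sample corrections, so their output can differ slightly), and the multivariate function assumes Mardia's definitions, which the abstract does not name explicitly; it is restricted to two variables so that the covariance matrix can be inverted by hand. All data values are made up.

```python
def central_moment(xs, k):
    """k-th central moment with an n denominator."""
    n = len(xs)
    mean = sum(xs) / n
    return sum((x - mean) ** k for x in xs) / n


def skewness(xs):
    """Moment-based skewness m3 / m2^1.5."""
    return central_moment(xs, 3) / central_moment(xs, 2) ** 1.5


def excess_kurtosis(xs):
    """Moment-based kurtosis m4 / m2^2 minus 3, so a normal distribution gives 0."""
    return central_moment(xs, 4) / central_moment(xs, 2) ** 2 - 3.0


def mardia_bivariate(points):
    """Mardia's multivariate skewness and kurtosis for 2-D points (n-denominator covariance)."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    centered = [(p[0] - mx, p[1] - my) for p in points]
    sxx = sum(a * a for a, _ in centered) / n
    syy = sum(b * b for _, b in centered) / n
    sxy = sum(a * b for a, b in centered) / n
    det = sxx * syy - sxy * sxy
    # Entries of the inverse of the 2x2 covariance matrix.
    ixx, iyy, ixy = syy / det, sxx / det, -sxy / det

    def form(u, v):
        # Bilinear form u^T S^-1 v.
        return u[0] * (ixx * v[0] + ixy * v[1]) + u[1] * (ixy * v[0] + iyy * v[1])

    skew = sum(form(u, v) ** 3 for u in centered for v in centered) / n ** 2
    kurt = sum(form(u, u) ** 2 for u in centered) / n
    return skew, kurt


right_skewed = [1.0, 2.0, 3.0, 4.0, 10.0]
g1 = skewness(right_skewed)
g2 = excess_kurtosis(right_skewed)

square = [(1.0, 1.0), (1.0, -1.0), (-1.0, 1.0), (-1.0, -1.0)]
b1, b2 = mardia_bivariate(square)
print(round(g1, 3), round(g2, 3), b1, b2)
```

Under bivariate normality Mardia's kurtosis is expected to be close to 8 (p(p + 2) with p = 2), which gives a reference point for the second value returned.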
Preliminary Multi-Variable Parametric Cost Model for Space Telescopes
NASA Technical Reports Server (NTRS)
Stahl, H. Philip; Hendrichs, Todd
2010-01-01
This slide presentation reviews creating a preliminary multi-variable cost model for the contract costs of making a space telescope. There is discussion of the methodology for collecting the data, definition of the statistical analysis methodology, single variable model results, testing of historical models and an introduction of the multi variable models.
ERIC Educational Resources Information Center
Grung, Bjorn; Nodland, Egil; Forland, Geir Martin
2007-01-01
The analysis of the infrared spectra of an alcohol dissolved in carbon tetrachloride gives a better understanding of the various multivariate curve resolution methods. The resulting concentration profile is found to be very useful for calculating the degree of association and equilibrium constants of different compounds.
Ferreira, Fábio S.; Pereira, João M.S.; Duarte, João V.; Castelo-Branco, Miguel
2017-01-01
Background: Although voxel-based morphometry studies are still the standard for analyzing brain structure, their dependence on massive univariate inferential methods is a limiting factor. A better understanding of brain pathologies can be achieved by applying inferential multivariate methods, which allow the study of multiple dependent variables, e.g. different imaging modalities of the same subject. Objective: Given the widespread use of SPM software in the brain imaging community, the main aim of this work is the implementation of massive multivariate inferential analysis as a toolbox in this software package, applied to T1 and T2 structural data from diabetic patients and controls. This implementation was compared with the traditional ANCOVA in SPM and a similar multivariate GLM toolbox (MRM). Method: We implemented the new toolbox and tested it by investigating brain alterations in a cohort of twenty-eight type 2 diabetes patients and twenty-six matched healthy controls, using information from both T1 and T2 weighted structural MRI scans, both separately, using standard univariate VBM, and simultaneously, with multivariate analyses. Results: Univariate VBM replicated predominantly bilateral changes in basal ganglia and insular regions in type 2 diabetes patients. On the other hand, multivariate analyses replicated key findings of univariate results, while also revealing the thalami as additional foci of pathology. Conclusion: While the presented algorithm must be further optimized, the proposed toolbox is the first implementation of multivariate statistics in SPM8 as a user-friendly toolbox, which shows great potential and is ready to be validated in other clinical cohorts and modalities. PMID:28761571
Geladi, Paul; Nelson, Andrew; Lindholm-Sethson, Britta
2007-07-09
Electrical impedance gives multivariate complex number data as results. Two examples of multivariate electrical impedance data measured on lipid monolayers in different solutions give rise to matrices (16x50 and 38x50) of complex numbers. Multivariate data analysis by principal component analysis (PCA) or singular value decomposition (SVD) can be used for complex data and the necessary equations are given. The scores and loadings obtained are vectors of complex numbers. It is shown that the complex number PCA and SVD are better at concentrating information in a few components than the naïve juxtaposition method and that Argand diagrams can replace score and loading plots. Different concentrations of Magainin and Gramicidin A give different responses and also the role of the electrolyte medium can be studied. An interaction of Gramicidin A in the solution with the monolayer over time can be observed.
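The complex-valued PCA via SVD described in this abstract can be sketched roughly as follows. This is a minimal illustration in Python with NumPy, not the authors' code: the helper name `complex_pca`, the synthetic 16x50 matrix, and the choice to column-center the data are all assumptions made for the example.

```python
import numpy as np

def complex_pca(Z, n_components=2):
    """PCA of a complex data matrix via SVD (hypothetical helper).

    Rows are samples, columns are variables (e.g. frequencies).
    Returns complex scores, complex loadings and the fraction of
    total variance carried by each retained component.
    """
    Zc = Z - Z.mean(axis=0, keepdims=True)             # column-center (assumed)
    U, s, Vh = np.linalg.svd(Zc, full_matrices=False)  # works on complex input
    scores = U[:, :n_components] * s[:n_components]
    loadings = Vh[:n_components].conj().T
    explained = s**2 / np.sum(s**2)
    return scores, loadings, explained[:n_components]

# Synthetic stand-in for impedance data: one complex factor plus small noise.
rng = np.random.default_rng(0)
t = rng.normal(size=(16, 1)) + 1j * rng.normal(size=(16, 1))
p = rng.normal(size=(1, 50)) + 1j * rng.normal(size=(1, 50))
noise = 0.01 * (rng.normal(size=(16, 50)) + 1j * rng.normal(size=(16, 50)))
Z = t @ p + noise

scores, loadings, explained = complex_pca(Z)
# Real and imaginary parts of the scores can be plotted as an Argand diagram.
```

Because the scores and loadings stay complex, their real and imaginary parts can be plotted against each other directly, which is the role the abstract assigns to Argand diagrams.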
Sciutto, Giorgia; Oliveri, Paolo; Catelli, Emilio; Bonacini, Irene
2017-01-01
In the field of applied research in heritage science, the use of multivariate approaches is still quite limited, and the chemometric results obtained are often underinterpreted. Within this scenario, the present paper aims to disseminate the use of suitable multivariate methodologies and proposes a procedural workflow applied to a representative group of case studies, of considerable importance for conservation purposes, as a sort of guideline on the processing and interpretation of FTIR data. Initially, principal component analysis (PCA) is performed and the score values are converted into chemical maps. Subsequently, the brushing approach is applied, demonstrating its usefulness for a deep understanding of the relationships between the multivariate map and PC score space, as well as for the identification of the spectral bands mainly involved in the definition of each area localised within the score maps. PMID:29333162
NONPARAMETRIC MANOVA APPROACHES FOR NON-NORMAL MULTIVARIATE OUTCOMES WITH MISSING VALUES
He, Fanyin; Mazumdar, Sati; Tang, Gong; Bhatia, Triptish; Anderson, Stewart J.; Dew, Mary Amanda; Krafty, Robert; Nimgaonkar, Vishwajit; Deshpande, Smita; Hall, Martica; Reynolds, Charles F.
2017-01-01
Between-group comparisons often entail many correlated response variables. The multivariate linear model, with its assumption of multivariate normality, is the accepted standard tool for these tests. When this assumption is violated, the nonparametric multivariate Kruskal-Wallis (MKW) test is frequently used. However, this test requires complete cases with no missing values in response variables. Deletion of cases with missing values likely leads to inefficient statistical inference. Here we extend the MKW test to retain information from partially-observed cases. Results of simulated studies and analysis of real data show that the proposed method provides adequate coverage and superior power to complete-case analyses. PMID:29416225
Galas, David J; Sakhanenko, Nikita A; Skupin, Alexander; Ignac, Tomasz
2014-02-01
Context dependence is central to the description of complexity. Keying on the pairwise definition of "set complexity," we use an information theory approach to formulate general measures of systems complexity. We examine the properties of multivariable dependency starting with the concept of interaction information. We then present a new measure for unbiased detection of multivariable dependency, "differential interaction information." This quantity for two variables reduces to the pairwise "set complexity" previously proposed as a context-dependent measure of information in biological systems. We generalize it here to an arbitrary number of variables. Critical limiting properties of the "differential interaction information" are key to the generalization. This measure extends previous ideas about biological information and provides a more sophisticated basis for the study of complexity. The properties of "differential interaction information" also suggest new approaches to data analysis. Given a data set of system measurements, differential interaction information can provide a measure of collective dependence, which can be represented in hypergraphs describing complex system interaction patterns. We investigate this kind of analysis using simulated data sets. The conjoining of a generalized set complexity measure, multivariable dependency analysis, and hypergraphs is our central result. While our focus is on complex biological systems, our results are applicable to any complex system.
DigOut: viewing differential expression genes as outliers.
Yu, Hui; Tu, Kang; Xie, Lu; Li, Yuan-Yuan
2010-12-01
With regard to well-replicated two-conditional microarray datasets, the selection of differentially expressed (DE) genes is a well-studied computational topic, but for multi-conditional microarray datasets with limited or no replication, the same task has not been properly addressed by previous studies. This paper adopts multivariate outlier analysis to analyze replication-lacking multi-conditional microarray datasets, finding that it performs significantly better than the widely used limit fold change (LFC) model in a simulated comparative experiment. Compared with the LFC model, the multivariate outlier analysis also demonstrates improved stability against sample variations in a series of manipulated real expression datasets. The reanalysis of a real non-replicated multi-conditional expression dataset series leads to satisfactory results. In conclusion, a multivariate outlier analysis algorithm, like DigOut, is particularly useful for selecting DE genes from non-replicated multi-conditional gene expression datasets.
Cho, Hwui-Dong; Kim, Ki-Hun; Hwang, Shin; Ahn, Chul-Soo; Moon, Deok-Bog; Ha, Tae-Yong; Song, Gi-Won; Jung, Dong-Hwan; Park, Gil-Chun; Lee, Sung-Gyu
2018-02-01
To compare the outcomes of pure laparoscopic left hemihepatectomy (LLH) versus open left hemihepatectomy (OLH) for benign and malignant conditions using multivariate analysis. All consecutive cases of LLH and OLH between October 2007 and December 2013 in a tertiary referral hospital were enrolled in this retrospective cohort study. All surgical procedures were performed by one surgeon. The LLH and OLH groups were compared in terms of patient demographics, preoperative data, clinical perioperative outcomes, and tumor characteristics in patients with malignancy. Multivariate analysis of the prognostic factors associated with severe complications was then performed. The LLH group (n = 62) had a significantly shorter postoperative hospital stay than the OLH group (n = 118) (9.53 ± 3.30 vs 14.88 ± 11.36 days, p < 0.001). Multivariate analysis revealed that the OLH group had >4 times the risk of the LLH group in terms of developing severe complications (Clavien-Dindo grade ≥III) (odds ratio 4.294, 95% confidence intervals 1.165-15.832, p = 0.029). LLH was a safe and feasible procedure for selected patients. LLH required shorter hospital stay and resulted in less operative blood loss. Multivariate analysis revealed that LLH was associated with a lower risk of severe complications compared to OLH. The authors suggest that LLH could be a reasonable treatment option for selected patients.
Applications of modern statistical methods to analysis of data in physical science
NASA Astrophysics Data System (ADS)
Wicker, James Eric
Modern methods of statistical and computational analysis offer solutions to dilemmas confronting researchers in physical science. Although the ideas behind modern statistical and computational analysis methods were originally introduced in the 1970's, most scientists still rely on methods written during the early era of computing. These researchers, who analyze increasingly voluminous and multivariate data sets, need modern analysis methods to extract the best results from their studies. The first section of this work showcases applications of modern linear regression. Since the 1960's, many researchers in spectroscopy have used classical stepwise regression techniques to derive molecular constants. However, problems with thresholds of entry and exit for model variables plague this analysis method. Other criticisms of this kind of stepwise procedure include its inefficient searching method, the order in which variables enter or leave the model and problems with overfitting data. We implement an information scoring technique that overcomes the assumptions inherent in the stepwise regression process to calculate molecular model parameters. We believe that this kind of information based model evaluation can be applied to more general analysis situations in physical science. The second section proposes new methods of multivariate cluster analysis. The K-means algorithm and the EM algorithm, introduced in the 1960's and 1970's respectively, formed the basis of multivariate cluster analysis methodology for many years. However, several shortcomings of these methods include strong dependence on initial seed values and inaccurate results when the data seriously depart from hypersphericity. We propose new cluster analysis methods based on genetic algorithms that overcome the strong dependence on initial seed values.
In addition, we propose a generalization of the Genetic K-means algorithm which can accurately identify clusters with complex hyperellipsoidal covariance structures. We then use this new algorithm in a genetic algorithm based Expectation-Maximization process that can accurately calculate parameters describing complex clusters in a mixture model routine. Using the accuracy of this GEM algorithm, we assign information scores to cluster calculations in order to best identify the number of mixture components in a multivariate data set. We will showcase how these algorithms can be used to process multivariate data from astronomical observations.
2013-01-01
Background Matching pursuit algorithm (MP), especially with recent multivariate extensions, offers unique advantages in analysis of EEG and MEG. Methods We propose a novel construction of an optimal Gabor dictionary, based upon the metrics introduced in this paper. We implement this construction in a freely available software for MP decomposition of multivariate time series, with a user friendly interface via the Svarog package (Signal Viewer, Analyzer and Recorder On GPL, http://braintech.pl/svarog), and provide a hands-on introduction to its application to EEG. Finally, we describe numerical and mathematical optimizations used in this implementation. Results Optimal Gabor dictionaries, based on the metric introduced in this paper, for the first time allowed for a priori assessment of maximum one-step error of the MP algorithm. Variants of multivariate MP, implemented in the accompanying software, are organized according to the mathematical properties of the algorithms, relevant in the light of EEG/MEG analysis. Some of these variants have been successfully applied to both multichannel and multitrial EEG and MEG in previous studies, improving preprocessing for EEG/MEG inverse solutions and parameterization of evoked potentials in single trials; we mention also ongoing work and possible novel applications. Conclusions Mathematical results presented in this paper improve our understanding of the basics of the MP algorithm. Simple introduction of its properties and advantages, together with the accompanying stable and user-friendly Open Source software package, pave the way for a widespread and reproducible analysis of multivariate EEG and MEG time series and novel applications, while retaining a high degree of compatibility with the traditional, visual analysis of EEG. PMID:24059247
Robustness of reduced-order multivariable state-space self-tuning controller
NASA Technical Reports Server (NTRS)
Yuan, Zhuzhi; Chen, Zengqiang
1994-01-01
In this paper, we present a quantitative analysis of the robustness of a reduced-order pole-assignment state-space self-tuning controller for a multivariable adaptive control system in which the order of the real process is higher than that of the model used in the controller design. The result of stability analysis shows that, under a specific bounded modelling error, the adaptively controlled closed-loop real system via the reduced-order state-space self-tuner is BIBO stable in the presence of unmodelled dynamics.
Evaluation of functional outcome of the floating knee injury using multivariate analysis.
Yokoyama, Kazuhiko; Tsukamoto, Tatsuro; Aoki, Shinichi; Wakita, Ryuji; Uchino, Masataka; Noumi, Takashi; Fukushima, Nobuaki; Itoman, Moritoshi
2002-11-01
The objective of this study is to evaluate significant contributing factors affecting the functional prognosis of floating knee injuries using multivariate analysis. A total of 68 floating knee injuries (67 patients) were treated at Kitasato University Hospital from 1986 to 1999. Both the femoral fractures and the tibial fractures were managed surgically by various methods. The functional results of these injuries were evaluated using the grading system of Karlström and Olerud. Follow-up periods ranged from 2 to 19 years (mean 50.2 months) after the original injury. We defined satisfactory (S) outcomes as those cases with excellent or good results and unsatisfactory (US) outcomes as those cases with acceptable or poor results. Logistic regression analysis was used as a multivariate analysis, and the dependent variables were defined as a satisfactory outcome or as an unsatisfactory outcome. The explanatory variables were predicting factors influencing the functional outcome such as age at trauma, gender, severity of soft-tissue injury in the femur and the tibia, AO fracture grade in the femur and the tibia, Fraser type (type I or type II), Injury Severity Score (ISS), and fixation time after injury (less than 1 week or more than 1 week) in the femur and the tibia. The final functional results were as follows: 25 cases had excellent results, 15 cases good results, 16 cases acceptable results, and 12 cases poor results. The predictive logistic regression equation was as follows: log[(1 - p)/p] = 3.12 - 1.52 x Fraser type - 1.65 x severity of soft-tissue injury in the tibia - 1.31 x fixation time after injury in the tibia - 0.821 x AO fracture grade in the tibia + 1.025 x fixation time after injury in the femur - 0.687 x AO fracture grade in the femur (p = 0.01). Among the variables, Fraser type and the severity of soft-tissue injury in the tibia were significantly related to the final result.
The multivariate analysis showed that both the involvement of the knee joint and the severity grade of soft-tissue injury in the tibia represented significant risk factors of poor outcome in floating knee injuries in this study.
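To make the reported logistic regression equation concrete, the sketch below evaluates its linear predictor and solves for p. It is only an illustration under stated assumptions: the left-hand side is read as log((1 - p)/p) as printed, the dictionary keys are hypothetical names, and the numeric coding of each predictor (for example 0/1 for Fraser type or fixation time) is not given in the abstract and is assumed here.

```python
import math

# Coefficients copied from the reported equation. The key names are
# hypothetical, and how each predictor is coded numerically is assumed.
INTERCEPT = 3.12
COEFFS = {
    "fraser_type": -1.52,
    "tibia_soft_tissue_severity": -1.65,
    "tibia_fixation_time": -1.31,
    "tibia_ao_grade": -0.821,
    "femur_fixation_time": 1.025,
    "femur_ao_grade": -0.687,
}

def linear_predictor(x):
    """Right-hand side of the reported equation for coded predictors x."""
    return INTERCEPT + sum(COEFFS[name] * x[name] for name in COEFFS)

def p_from_equation(eta):
    """Solve log((1 - p) / p) = eta for p, reading the equation as printed."""
    return 1.0 / (1.0 + math.exp(eta))

# Example with every predictor coded 0 (an arbitrary reference case).
baseline = {name: 0 for name in COEFFS}
eta0 = linear_predictor(baseline)
p0 = p_from_equation(eta0)
```

Which outcome p refers to (satisfactory or unsatisfactory) is not stated in the abstract, so the numerical value of `p0` should not be interpreted clinically.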
Keenan, Michael R; Smentkowski, Vincent S; Ulfig, Robert M; Oltman, Edward; Larson, David J; Kelly, Thomas F
2011-06-01
We demonstrate for the first time that multivariate statistical analysis techniques can be applied to atom probe tomography data to estimate the chemical composition of a sample at the full spatial resolution of the atom probe in three dimensions. Whereas the raw atom probe data provide the specific identity of an atom at a precise location, the multivariate results can be interpreted in terms of the probabilities that an atom representing a particular chemical phase is situated there. When aggregated to the size scale of a single atom (∼0.2 nm), atom probe spectral-image datasets are huge and extremely sparse. In fact, the average spectrum will have somewhat less than one total count per spectrum due to imperfect detection efficiency. These conditions, under which the variance in the data is completely dominated by counting noise, test the limits of multivariate analysis, and an extensive discussion of how to extract the chemical information is presented. Efficient numerical approaches to performing principal component analysis (PCA) on these datasets, which may number hundreds of millions of individual spectra, are put forward, and it is shown that PCA can be computed in a few seconds on a typical laptop computer.
Bastidas, Camila Y; von Plessing, Carlos; Troncoso, José; Del P Castillo, Rosario
2018-04-15
Fourier Transform infrared imaging and multivariate analysis were used to identify, at the microscopic level, the presence of florfenicol (FF), a heavily-used antibiotic in the salmon industry, supplied to fish in feed pellets for the treatment of salmonid rickettsial septicemia (SRS). The FF distribution was evaluated using Principal Component Analysis (PCA) and Augmented Multivariate Curve Resolution with Alternating Least Squares (augmented MCR-ALS) on the spectra obtained from images with pixel sizes of 6.25 μm × 6.25 μm and 1.56 μm × 1.56 μm, in different zones of feed pellets. Since the concentration of the drug was 3.44 mg FF/g pellet, this is the first report showing the powerful ability of spectroscopic techniques and multivariate analysis, especially the augmented MCR-ALS, to describe the FF distribution in both the surface and inner parts of feed pellets at low concentration, in a complex matrix and at the microscopic level. The results allow monitoring the incorporation of the drug into the feed pellets. Copyright © 2018 Elsevier B.V. All rights reserved.
Multivariate Models for Normal and Binary Responses in Intervention Studies
ERIC Educational Resources Information Center
Pituch, Keenan A.; Whittaker, Tiffany A.; Chang, Wanchen
2016-01-01
Use of multivariate analysis (e.g., multivariate analysis of variance) is common when normally distributed outcomes are collected in intervention research. However, when mixed responses--a set of normal and binary outcomes--are collected, standard multivariate analyses are no longer suitable. While mixed responses are often obtained in…
NASA Astrophysics Data System (ADS)
Minaya, Veronica; Corzo, Gerald; van der Kwast, Johannes; Galarraga, Remigio; Mynett, Arthur
2014-05-01
Simulations of carbon cycling are prone to uncertainties from different sources, which in general are related to input data, parameters and the representational capacity of the model itself. The gross carbon uptake in the cycle is represented by the gross primary production (GPP), which deals with the spatio-temporal variability of the precipitation and the soil moisture dynamics. This variability associated with uncertainty of the parameters can be modelled by multivariate probabilistic distributions. Our study presents a novel methodology that uses multivariate copula analysis to assess the GPP. Multi-species and elevation variables are included in a first scenario of the analysis. Hydro-meteorological conditions that might generate a change in the next 50 or more years are included in a second scenario of this analysis. The biogeochemical model BIOME-BGC was applied in the Ecuadorian Andean region in elevations greater than 4000 masl with the presence of typical vegetation of páramo. The change of GPP over time is crucial for climate scenarios of the carbon cycling in this type of ecosystem. The results help to improve our understanding of the ecosystem function and clarify the dynamics and the relationship with the change of climate variables. Keywords: multivariate analysis, Copula, BIOME-BGC, NPP, páramos
Comprehensive drought characteristics analysis based on a nonlinear multivariate drought index
NASA Astrophysics Data System (ADS)
Yang, Jie; Chang, Jianxia; Wang, Yimin; Li, Yunyun; Hu, Hui; Chen, Yutong; Huang, Qiang; Yao, Jun
2018-02-01
It is vital to identify drought events and to evaluate multivariate drought characteristics based on a composite drought index for better drought risk assessment and sustainable development of water resources. However, most composite drought indices are constructed by linear combination, principal component analysis or the entropy weight method, assuming a linear relationship among different drought indices. In this study, a multidimensional copula function was applied to construct a nonlinear multivariate drought index (NMDI) to solve the complicated and nonlinear relationship due to its dependence structure and flexibility. The NMDI was constructed by combining meteorological, hydrological, and agricultural variables (precipitation, runoff, and soil moisture) to better reflect the multivariate variables simultaneously. Based on the constructed NMDI and runs theory, drought events for a particular area were identified in terms of three drought characteristics: duration, peak, and severity. Finally, multivariate drought risk was analyzed as a tool for providing reliable support in drought decision-making. The results indicate that: (1) multidimensional copulas can effectively solve the complicated and nonlinear relationship among multivariate variables; (2) compared with single and other composite drought indices, the NMDI is slightly more sensitive in capturing recorded drought events; and (3) drought risk shows a spatial variation; out of the five partitions studied, the Jing River Basin as well as the upstream and midstream of the Wei River Basin are characterized by a higher multivariate drought risk. In general, multidimensional copulas provide a reliable way to solve the nonlinear relationship when constructing a comprehensive drought index and evaluating multivariate drought characteristics.
Iorgulescu, E; Voicu, V A; Sârbu, C; Tache, F; Albu, F; Medvedovici, A
2016-08-01
The influence of the experimental variability (instrumental repeatability, instrumental intermediate precision and sample preparation variability) and data pre-processing (normalization, peak alignment, background subtraction) on the discrimination power of multivariate data analysis methods (Principal Component Analysis -PCA- and Cluster Analysis -CA-) as well as a new algorithm based on linear regression was studied. Data used in the study were obtained through positive or negative ion monitoring electrospray mass spectrometry (+/-ESI/MS) and reversed phase liquid chromatography/UV spectrometric detection (RPLC/UV) applied to green tea extracts. Extractions in ethanol and heated water infusion were used as sample preparation procedures. The multivariate methods were directly applied to mass spectra and chromatograms, involving strictly a holistic comparison of shapes, without assignment of any structural identity to compounds. An alternative data interpretation based on linear regression analysis mutually applied to data series is also discussed. Slopes, intercepts and correlation coefficients produced by the linear regression analysis applied on pairs of very large experimental data series successfully retain information resulting from high frequency instrumental acquisition rates, obviously better defining the profiles being compared. Consequently, each type of sample or comparison between samples produces in the Cartesian space an ellipsoidal volume defined by the normal variation intervals of the slope, intercept and correlation coefficient. Distances between volumes graphically illustrate (dis)similarities between compared data. The instrumental intermediate precision had the major effect on the discrimination power of the multivariate data analysis methods.
Mass spectra produced through ionization from liquid state in atmospheric pressure conditions of bulk complex mixtures resulting from extracted materials of natural origins provided an excellent data basis for multivariate analysis methods, equivalent to data resulting from chromatographic separations. The alternative evaluation of very large data series based on linear regression analysis produced information equivalent to results obtained through application of PCA and CA. Copyright © 2016 Elsevier B.V. All rights reserved.
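The regression-based comparison described in this abstract, in which each pair of data series is reduced to a slope, an intercept and a correlation coefficient, can be sketched as below. This is an assumed reading rather than the authors' algorithm: the function names `regression_signature` and `signature_distance`, the synthetic profiles, and the use of a plain Euclidean distance between signatures are all illustrative choices.

```python
import numpy as np

def regression_signature(x, y):
    """Slope, intercept and Pearson r of series y regressed on series x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)   # degree-1 least-squares fit
    r = np.corrcoef(x, y)[0, 1]
    return float(slope), float(intercept), float(r)

def signature_distance(a, b):
    """Euclidean distance between two (slope, intercept, r) points (assumed metric)."""
    return float(np.linalg.norm(np.subtract(a, b)))

# Synthetic profiles standing in for two acquisitions of the same sample
# and one acquisition of a different sample.
grid = np.linspace(0.0, 10.0, 2000)
reference = np.exp(-(grid - 3.0) ** 2) + 0.5 * np.exp(-(grid - 7.0) ** 2)
replicate = 1.02 * reference + 0.01           # same shape, small gain/offset
different = np.exp(-(grid - 5.0) ** 2)        # different shape

sig_replicate = regression_signature(reference, replicate)
sig_different = regression_signature(reference, different)
ideal = (1.0, 0.0, 1.0)                       # signature of identical series
```

A series compared with a near-identical replicate lands close to (1, 0, 1), while a profile with a different shape moves away from that point, mainly through the correlation coefficient.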
Griswold, Cortland K
2015-12-21
Epistatic gene action occurs when mutations or alleles interact to produce a phenotype. Theoretically and empirically it is of interest to know whether gene interactions can facilitate the evolution of diversity. In this paper, we explore how epistatic gene action affects the additive genetic component or heritable component of multivariate trait variation, as well as how epistatic gene action affects the evolvability of multivariate traits. The analysis involves a sexually reproducing and recombining population. Our results indicate that under stabilizing selection conditions a population with a mixed additive and epistatic genetic architecture can have greater multivariate additive genetic variation and evolvability than a population with a purely additive genetic architecture. That greater multivariate additive genetic variation can occur with epistasis is in contrast to previous theory that indicated univariate additive genetic variation is decreased with epistasis under stabilizing selection conditions. In a multivariate setting, epistasis leads to less relative covariance among individuals in their genotypic, as well as their breeding values, which facilitates the maintenance of additive genetic variation and increases a population's evolvability. Our analysis involves linking the combinatorial nature of epistatic genetic effects to the ancestral graph structure of a population to provide insight into the consequences of epistasis on multivariate trait variation and evolution. Copyright © 2015 Elsevier Ltd. All rights reserved.
Almeida, Tiago P; Chu, Gavin S; Li, Xin; Dastagir, Nawshin; Tuan, Jiun H; Stafford, Peter J; Schlindwein, Fernando S; Ng, G André
2017-01-01
Purpose: Complex fractionated atrial electrograms (CFAE)-guided ablation after pulmonary vein isolation (PVI) has been used for persistent atrial fibrillation (persAF) therapy. This strategy has shown suboptimal outcomes due to, among other factors, undetected changes in the atrial tissue following PVI. In the present work, we investigate CFAE distribution before and after PVI in patients with persAF using a multivariate statistical model. Methods: 207 pairs of atrial electrograms (AEGs) were collected before and after PVI, respectively, from corresponding LA regions in 18 persAF patients. Twelve attributes were measured from the AEGs, before and after PVI. Statistical models based on multivariate analysis of variance (MANOVA) and linear discriminant analysis (LDA) have been used to characterize the atrial regions and AEGs. Results: PVI significantly reduced CFAEs in the LA (70 vs. 40%; P < 0.0001). Four types of LA regions were identified, based on the AEG characteristics: (i) fractionated before PVI that remained fractionated after PVI (31% of the collected points); (ii) fractionated that converted to normal (39%); (iii) normal prior to PVI that became fractionated (9%); and (iv) normal that remained normal (21%). Individually, the attributes failed to distinguish these LA regions, but multivariate statistical models were effective in their discrimination (P < 0.0001). Conclusion: Our results have unveiled that there are LA regions resistant to PVI, while others are affected by it. Although traditional methods were unable to identify these different regions, the proposed multivariate statistical model discriminated LA regions resistant to PVI from those affected by it without prior ablation information.
MacNab, Ying C
2016-08-01
This paper is concerned with multivariate conditional autoregressive models defined by linear combination of independent or correlated underlying spatial processes. Known as linear models of coregionalization, the method offers a systematic and unified approach for formulating multivariate extensions to a broad range of univariate conditional autoregressive models. The resulting multivariate spatial models represent classes of coregionalized multivariate conditional autoregressive models that enable flexible modelling of multivariate spatial interactions, yielding coregionalization models with symmetric or asymmetric cross-covariances of different spatial variation and smoothness. In the context of multivariate disease mapping, for example, they facilitate borrowing strength both over space and across variables, allowing for more flexible multivariate spatial smoothing. Specifically, we present a broadened coregionalization framework to include order-dependent, order-free, and order-robust multivariate models; a new class of order-free coregionalized multivariate conditional autoregressives is introduced. We tackle computational challenges and present solutions that are integral for Bayesian analysis of these models. We also discuss two ways of computing deviance information criterion for comparison among competing hierarchical models with or without unidentifiable prior parameters. The models and related methodology are developed in the broad context of modelling multivariate data on spatial lattice and illustrated in the context of multivariate disease mapping. The coregionalization framework and related methods also present a general approach for building spatially structured cross-covariance functions for multivariate geostatistics. © The Author(s) 2016.
Forcino, Frank L; Leighton, Lindsey R; Twerdy, Pamela; Cahill, James F
2015-01-01
Community ecologists commonly perform multivariate techniques (e.g., ordination, cluster analysis) to assess patterns and gradients of taxonomic variation. A critical requirement for a meaningful statistical analysis is accurate information on the taxa found within an ecological sample. However, oversampling (too many individuals counted per sample) also comes at a cost, particularly for ecological systems in which identification and quantification is substantially more resource consuming than the field expedition itself. In such systems, an increasingly larger sample size will eventually result in diminishing returns in improving any pattern or gradient revealed by the data, but will also lead to continually increasing costs. Here, we examine 396 datasets: 44 previously published and 352 created datasets. Using meta-analytic and simulation-based approaches, the research within the present paper seeks (1) to determine minimal sample sizes required to produce robust multivariate statistical results when conducting abundance-based, community ecology research. Furthermore, we seek (2) to determine the dataset parameters (i.e., evenness, number of taxa, number of samples) that require larger sample sizes, regardless of resource availability. We found that in the 44 previously published and the 220 created datasets with randomly chosen abundances, a conservative estimate of a sample size of 58 produced the same multivariate results as all larger sample sizes. However, this minimal number varies as a function of evenness, where increased evenness resulted in increased minimal sample sizes. Sample sizes as small as 58 individuals are sufficient for a broad range of multivariate abundance-based research. In cases when resource availability is the limiting factor for conducting a project (e.g., small university, time to conduct the research project), statistically viable results can still be obtained with less of an investment.
Tourkmani, Abdo Karim; Sánchez-Huerta, Valeria; De Wit, Guillermo; Martínez, Jaime D.; Mingo, David; Mahillo-Fernández, Ignacio; Jiménez-Alfaro, Ignacio
2017-01-01
AIM To analyze the relationship between the score obtained in the Risk Score System (RSS) proposed by Hicks et al and penetrating keratoplasty (PKP) graft failure at 1y postoperatively, and between each factor in the RSS and the risk of PKP graft failure, using univariate and multivariate analysis. METHODS The retrospective cohort study had 152 PKPs from 152 patients. Eighteen cases were excluded from our study due to primary failure (10 cases), incomplete medical notes (5 cases) and follow-up less than 1y (3 cases). We included 134 PKPs from 134 patients stratified by preoperative risk score. Spearman coefficient was calculated for the relationship between the score obtained and risk of failure at 1y. Univariate and multivariate analyses were performed to assess the impact of every single risk factor included in the RSS on graft failure at 1y. RESULTS Spearman coefficient showed statistically significant correlation between the score in the RSS and graft failure (P<0.05). Multivariate logistic regression analysis showed no statistically significant relationship (P>0.05) between diagnosis and lens status with graft failure. The relationship between the other risk factors studied and graft failure was significant (P<0.05), although the results for previous grafts and graft failure were unreliable. None of our patients had previous blood transfusion, thus, it had no impact. CONCLUSION After the application of multivariate analysis techniques, some risk factors do not show the expected impact on graft failure at 1y. PMID:28393027
Use of direct gradient analysis to uncover biological hypotheses in 16s survey data and beyond.
Erb-Downward, John R; Sadighi Akha, Amir A; Wang, Juan; Shen, Ning; He, Bei; Martinez, Fernando J; Gyetko, Margaret R; Curtis, Jeffrey L; Huffnagle, Gary B
2012-01-01
This study investigated the use of direct gradient analysis of bacterial 16S pyrosequencing surveys to identify relevant bacterial community signals in the midst of a "noisy" background, and to facilitate hypothesis-testing both within and beyond the realm of ecological surveys. The results, utilizing 3 different real world data sets, demonstrate the utility of adding direct gradient analysis to any analysis that draws conclusions from indirect methods such as Principal Component Analysis (PCA) and Principal Coordinates Analysis (PCoA). Direct gradient analysis produces testable models, and can identify significant patterns in the midst of noisy data. Additionally, we demonstrate that direct gradient analysis can be used with other kinds of multivariate data sets, such as flow cytometric data, to identify differentially expressed populations. The results of this study demonstrate the utility of direct gradient analysis in microbial ecology and in other areas of research where large multivariate data sets are involved.
NASA Astrophysics Data System (ADS)
Pujiwati, Arie; Nakamura, K.; Watanabe, N.; Komai, T.
2018-02-01
Multivariate analysis is applied to investigate the geochemistry of several trace elements in top soils and their relation to the contamination source under the influence of coal mines in Jorong, South Kalimantan. Total concentration of Cd, V, Co, Ni, Cr, Zn, As, Pb, Sb, Cu and Ba was determined in 20 soil samples by bulk analysis. Pearson correlation is applied to specify the linear correlation among the elements. Principal Component Analysis (PCA) and Cluster Analysis (CA) were applied to observe the classification of trace elements and contamination sources. The results suggest that contamination loading is contributed by Cr, Cu, Ni, Zn, As, and Pb. The elemental loading mostly affects the non-coal mining area, for instance the area near settlements and agricultural land use. Moreover, the contamination source is classified into the areas that are influenced by the coal mining activity, the agricultural types, and the river mixing zone. Multivariate analysis could elucidate the elemental loading and the contamination sources of trace elements in the vicinity of the coal mine area.
Authentication of Trappist beers by LC-MS fingerprints and multivariate data analysis.
Mattarucchi, Elia; Stocchero, Matteo; Moreno-Rojas, José Manuel; Giordano, Giuseppe; Reniero, Fabiano; Guillou, Claude
2010-12-08
The aim of this study was to assess the applicability of LC-MS profiling to authenticate a selected Trappist beer as part of a program on traceability funded by the European Commission. A total of 232 beers were fingerprinted and classified through multivariate data analysis. The selected beer was clearly distinguished from beers of different brands, while only 3 samples (3.5% of the test set) were wrongly classified when compared with other types of beer of the same Trappist brewery. The fingerprints were further analyzed to extract the most discriminating variables, which proved to be sufficient for classification, even using a simplified unsupervised model. This reduced fingerprint allowed us to study the influence of batch-to-batch variability on the classification model. Our results can easily be applied to different matrices and they confirmed the effectiveness of LC-MS profiling in combination with multivariate data analysis for the characterization of food products.
Water quality analysis of the Rapur area, Andhra Pradesh, South India using multivariate techniques
NASA Astrophysics Data System (ADS)
Nagaraju, A.; Sreedhar, Y.; Thejaswi, A.; Sayadi, Mohammad Hossein
2017-10-01
The groundwater samples from Rapur area were collected from different sites to evaluate the major ion chemistry. The large volume of data can lead to difficulties in the integration, interpretation, and representation of the results. Two multivariate statistical methods, hierarchical cluster analysis (HCA) and factor analysis (FA), were applied to evaluate their usefulness to classify and identify geochemical processes controlling groundwater geochemistry. Four statistically significant clusters were obtained from 30 sampling stations. This resulted in two important clusters, viz., cluster 1 (pH, Si, CO3, Mg, SO4, Ca, K, HCO3, alkalinity, Na, Na + K, Cl, and hardness) and cluster 2 (EC and TDS), which are released to the study area from different sources. The application of different multivariate statistical techniques, such as principal component analysis (PCA), assists in the interpretation of complex data matrices for a better understanding of water quality of a study area. From PCA, it is clear that the first factor (factor 1), which accounted for 36.2% of the total variance, had high positive loadings on EC, Mg, Cl, TDS, and hardness. Based on the PCA scores, four significant cluster groups of sampling locations were detected on the basis of similarity of their water quality.
Multivariate Longitudinal Analysis with Bivariate Correlation Test
Adjakossa, Eric Houngla; Sadissou, Ibrahim; Hounkonnou, Mahouton Norbert; Nuel, Gregory
2016-01-01
In the context of multivariate multilevel data analysis, this paper focuses on the multivariate linear mixed-effects model, including all the correlations between the random effects when the dimensional residual terms are assumed uncorrelated. Using the EM algorithm, we suggest more general expressions of the model’s parameters estimators. These estimators can be used in the framework of the multivariate longitudinal data analysis as well as in the more general context of the analysis of multivariate multilevel data. By using a likelihood ratio test, we test the significance of the correlations between the random effects of two dependent variables of the model, in order to investigate whether or not it is useful to model these dependent variables jointly. Simulation studies are done to assess both the parameter recovery performance of the EM estimators and the power of the test. Using two empirical data sets which are of longitudinal multivariate type and multivariate multilevel type, respectively, the usefulness of the test is illustrated. PMID:27537692
Stamate, Mirela Cristina; Todor, Nicolae; Cosgarea, Marcel
2015-01-01
The clinical utility of otoacoustic emissions as a noninvasive objective test of cochlear function has been long studied. Both transient otoacoustic emissions and distortion products can be used to identify hearing loss, but to what extent they can be used as predictors for hearing loss is still debated. Most studies agree that multivariate analyses have better test performances than univariate analyses. The aim of the study was to determine the performance of transient otoacoustic emissions and distortion products in identifying normal and impaired hearing, using the pure tone audiogram as a gold standard procedure and different multivariate statistical approaches. The study included 105 adult subjects with normal hearing and hearing loss who underwent the same test battery: pure-tone audiometry, tympanometry, otoacoustic emission tests. We chose to use logistic regression as a multivariate statistical technique. Three logistic regression models were developed to characterize the relations between different risk factors (age, sex, tinnitus, demographic features, cochlear status defined by otoacoustic emissions) and hearing status defined by pure-tone audiometry. The multivariate analyses allow the calculation of the logistic score, which is a combination of the inputs, weighted by coefficients, calculated within the analyses. The accuracy of each model was assessed using receiver operating characteristic curve analysis. We used the logistic score to generate receiver operating characteristic curves and to estimate the areas under the curves in order to compare different multivariate analyses. We compared the performance of each otoacoustic emission (transient, distortion product) using three different multivariate analyses for each ear, when multi-frequency gold standards were used. We demonstrated that all multivariate analyses provided high values of the area under the curve, confirming the performance of the otoacoustic emissions. 
Each otoacoustic emission test presented high values of area under the curve, suggesting that implementing a multivariate approach to evaluate the performances of each otoacoustic emission test would serve to increase the accuracy in identifying the normal and impaired ears. We encountered the highest area under the curve value for the combined multivariate analysis suggesting that both otoacoustic emission tests should be used in assessing hearing status. Our multivariate analyses revealed that age is a constant predictor factor of the auditory status for both ears, but the presence of tinnitus was the most important predictor for the hearing level, only for the left ear. Age presented similar coefficients, but tinnitus coefficients, by their high value, produced the highest variations of the logistic scores, only for the left ear group, thus increasing the risk of hearing loss. We did not find gender differences between ears for any otoacoustic emission tests, but studies still debate this question as the results are contradictory. Neither gender, nor environment origin had any predictive value for the hearing status, according to the results of our study. Like any other audiological test, using otoacoustic emissions to identify hearing loss is not without error. Even when applying multivariate analysis, perfect test performance is never achieved. Although most studies demonstrated the benefit of using the multivariate analysis, it has not been incorporated into clinical decisions maybe because of the idiosyncratic nature of multivariate solutions or because of the lack of the validation studies.
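The logistic score and the area under the receiver operating characteristic curve described in this abstract can be sketched as follows. This is an illustrative sketch only: the coefficients, inputs, and labels are hypothetical and not taken from the study, and the area is computed with the rank-based (Mann-Whitney) formulation rather than by tracing the curve.

```python
import math


def logistic_score(inputs, coefficients, intercept):
    """Combine the inputs, weighted by coefficients, into a score in (0, 1)."""
    z = intercept + sum(c * x for c, x in zip(coefficients, inputs))
    return 1.0 / (1.0 + math.exp(-z))


def auc(scores, labels):
    """Area under the ROC curve: the probability that a randomly chosen
    positive case scores higher than a randomly chosen negative case
    (ties count as one half)."""
    positives = [s for s, label in zip(scores, labels) if label == 1]
    negatives = [s for s, label in zip(scores, labels) if label == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical example: inputs are (age in decades, tinnitus present 0/1).
coefficients = [0.6, 1.2]   # assumed values, for illustration only
intercept = -3.5            # assumed value, for illustration only
subjects = [(2.5, 0), (3.0, 0), (4.5, 1), (6.0, 0), (6.5, 1), (7.0, 1)]
hearing_loss = [0, 0, 0, 1, 1, 1]   # 1 = impaired by the gold standard
scores = [logistic_score(s, coefficients, intercept) for s in subjects]
area = auc(scores, hearing_loss)
```

Comparing `area` across models built from different inputs mirrors, in outline, how the models in the abstract are compared.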
Kwon, Yong-Kook; Ahn, Myung Suk; Park, Jong Suk; Liu, Jang Ryol; In, Dong Su; Min, Byung Whan; Kim, Suk Weon
2013-01-01
To determine whether Fourier transform (FT)-IR spectral analysis combined with multivariate analysis of whole-cell extracts from ginseng leaves can be applied as a high-throughput discrimination system of cultivation ages and cultivars, a total of 480 leaf samples belonging to 12 categories corresponding to four different cultivars (Yunpung, Kumpung, Chunpung, and an open-pollinated variety) and three different cultivation ages (1 yr, 2 yr, and 3 yr) were subjected to FT-IR. The spectral data were analyzed by principal component analysis and partial least squares-discriminant analysis. A dendrogram based on hierarchical clustering analysis of the FT-IR spectral data on ginseng leaves showed that leaf samples were initially segregated into three groups in a cultivation age-dependent manner. Then, within the same cultivation age group, leaf samples were clustered into four subgroups in a cultivar-dependent manner. The overall prediction accuracy for discrimination of cultivars and cultivation ages was 94.8% in a cross-validation test. These results clearly show that the FT-IR spectra combined with multivariate analysis from ginseng leaves can be applied as an alternative tool for discriminating ginseng cultivars and cultivation ages. Therefore, we suggest that this result could be used as a rapid and reliable F1 hybrid seed-screening tool for accelerating the conventional breeding of ginseng. PMID:24558311
Taylor, Sandra L; Ruhaak, L Renee; Weiss, Robert H; Kelly, Karen; Kim, Kyoungmi
2017-01-01
High-throughput mass spectrometry (MS) is now being used to profile small molecular compounds across multiple biological sample types from the same subjects with the goal of leveraging information across biospecimens. Multivariate statistical methods that combine information from all biospecimens could be more powerful than the usual univariate analyses. However, missing values are common in MS data and imputation can impact between-biospecimen correlation and multivariate analysis results. We propose two multivariate two-part statistics that accommodate missing values and combine data from all biospecimens to identify differentially regulated compounds. Statistical significance is determined using a multivariate permutation null distribution. Relative to univariate tests, the multivariate procedures detected more significant compounds in three biological datasets. In a simulation study, we showed that multi-biospecimen testing procedures were more powerful than single-biospecimen methods when compounds are differentially regulated in multiple biospecimens but univariate methods can be more powerful if compounds are differentially regulated in only one biospecimen. We provide R functions to implement and illustrate our method as supplementary information. Contact: sltaylor@ucdavis.edu. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
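The idea of a permutation null distribution can be illustrated with a much simpler statistic than the two-part multivariate statistics proposed in this abstract. The sketch below is an assumption-laden simplification: it uses a univariate difference in group means, ignores missing values, and uses hypothetical function names and data. It is not the authors' R implementation.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_p_value(group_a, group_b, n_permutations=2000, seed=0):
    """Two-sided permutation p-value for a difference in group means.

    Group labels are shuffled repeatedly; the p-value is the share of
    shuffles whose statistic is at least as extreme as the observed one
    (with the usual +1 correction so the p-value is never zero).
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        statistic = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if statistic >= observed:
            at_least_as_extreme += 1
    return (at_least_as_extreme + 1) / (n_permutations + 1)


# Hypothetical compound intensities in two groups of subjects.
controls = [0] * 10
cases = [10] * 10
p_value = permutation_p_value(controls, cases)
```

In a multivariate setting the same shuffling would be applied to whole subjects so that correlation between biospecimens is preserved under the null.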
Independent Predictors of Prognosis Based on Oral Cavity Squamous Cell Carcinoma Surgical Margins.
Buchakjian, Marisa R; Ginader, Timothy; Tasche, Kendall K; Pagedar, Nitin A; Smith, Brian J; Sperry, Steven M
2018-05-01
Objective To conduct a multivariate analysis of a large cohort of oral cavity squamous cell carcinoma (OCSCC) cases for independent predictors of local recurrence (LR) and overall survival (OS), with emphasis on the relationship between (1) prognosis and (2) main specimen permanent margins and intraoperative tumor bed frozen margins. Study Design Retrospective cohort study. Setting Tertiary academic head and neck cancer program. Subjects and Methods This study included 426 patients treated with OCSCC resection between 2005 and 2014 at University of Iowa Hospitals and Clinics. Patients underwent excision of OCSCC with intraoperative tumor bed frozen margin sampling and main specimen permanent margin assessment. Multivariate analysis of the data set to predict LR and OS was performed. Results Independent predictors of LR included nodal involvement, histologic grade, and main specimen permanent margin status. Specifically, the presence of a positive margin (odds ratio, 6.21; 95% CI, 3.3-11.9) or <1-mm/carcinoma in situ margin (odds ratio, 2.41; 95% CI, 1.19-4.87) on the main specimen was an independent predictor of LR, whereas intraoperative tumor bed margins were not predictive of LR on multivariate analysis. Similarly, independent predictors of OS on multivariate analysis included nodal involvement, extracapsular extension, and a positive main specimen margin. Tumor bed margins did not independently predict OS. Conclusion The main specimen margin is a strong independent predictor of LR and OS on multivariate analysis. Intraoperative tumor bed frozen margins do not independently predict prognosis. We conclude that emphasis should be placed on evaluating the main specimen margins when estimating prognosis after OCSCC resection.
Jha, Dilip Kumar; Vinithkumar, Nambali Valsalan; Sahu, Biraja Kumar; Dheenan, Palaiya Sukumaran; Das, Apurba Kumar; Begum, Mehmuna; Devi, Marimuthu Prashanthi; Kirubagaran, Ramalingam
2015-07-15
Chidiyatappu Bay is one of the least disturbed marine environments of Andaman & Nicobar Islands, the union territory of India. Oceanic flushing from southeast and northwest direction is prevalent in this bay. Further, anthropogenic activity is minimal in the adjoining environment. Considering the pristine nature of this bay, seawater samples collected from 12 sampling stations covering three seasons were analyzed. Principal Component Analysis (PCA) revealed 69.9% of total variance and exhibited strong factor loading for nitrite, chlorophyll a and phaeophytin. In addition, analysis of variance (ANOVA-one way), regression analysis, box-whisker plots and Geographical Information System based hot spot analysis further simplified and supported multivariate results. The results obtained are important to establish reference conditions for comparative study with other similar ecosystems in the region. Copyright © 2015 Elsevier Ltd. All rights reserved.
Carlesi, Serena; Ricci, Marilena; Cucci, Costanza; La Nasa, Jacopo; Lofrumento, Cristiana; Picollo, Marcello; Becucci, Maurizio
2015-07-01
This work explores the application of chemometric techniques to the analysis of lipidic paint binders (i.e., drying oils) by means of Raman and near-infrared spectroscopy. These binders have been widely used by artists throughout history, both individually and in mixtures. We prepared various model samples of the pure binders (linseed, poppy seed, and walnut oils) obtained from different manufacturers. These model samples were left to dry and then characterized by Raman and reflectance near-infrared spectroscopy. Multivariate analysis was performed by applying principal component analysis (PCA) on the first derivative of the corresponding Raman spectra (1800-750 cm(-1)), near-infrared spectra (6000-3900 cm(-1)), and their combination to test whether spectral differences could enable samples to be distinguished on the basis of their composition. The vibrational bands we found most useful to discriminate between the different products we studied are the fundamental ν(C=C) stretching and methylenic stretching and bending combination bands. The results of the multivariate analysis demonstrated the potential of chemometric approaches for characterizing and identifying drying oils, and also for gaining a deeper insight into the aging process. Comparison with high-performance liquid chromatography data was conducted to check the PCA results.
Indelicato, Serena; Bongiorno, David; Tuzzolino, Nicola; Mannino, Maria Rosaria; Muscarella, Rosalia; Fradella, Pasquale; Gargano, Maria Elena; Nicosia, Salvatore; Ceraulo, Leopoldo
2018-03-14
Multivariate analysis was performed on a large data set of groundwater and leachate samples collected during 9 years of operation of the Bellolampo municipal solid waste landfill (located above Palermo, Italy). The aim was to obtain the most likely correlations among the data. The analysis results are presented. Groundwater samples were collected in the period 2004-2013, whereas the leachate analysis refers to the period 2006-2013. For groundwater, statistical data evaluation revealed notable differences among the samples taken from the numerous wells located around the landfill. Characteristic parameters revealed by principal component analysis (PCA) were more deeply investigated, and corresponding thematic maps were drawn. The composition of the leachate was also thoroughly investigated. Several chemical macro-descriptors were calculated, and the results are presented. A comparison of PCA results for the leachate and groundwater data clearly reveals that the groundwater's main components substantially differ from those of the leachate. This outcome strongly suggests excluding leachate permeation through the multiple landfill lining.
Analysis of Developmental Data: Comparison Among Alternative Methods
ERIC Educational Resources Information Center
Wilson, Ronald S.
1975-01-01
To examine the ability of the correction factor epsilon to counteract statistical bias in univariate analysis, an analysis of variance (adjusted by epsilon) and a multivariate analysis of variance were performed on the same data. The results indicated that univariate analysis is a fully protected design when used with epsilon. (JMB)
Yang, James J; Li, Jia; Williams, L Keoki; Buu, Anne
2016-01-05
In genome-wide association studies (GWAS) for complex diseases, the association between a SNP and each phenotype is usually weak. Combining multiple related phenotypic traits can increase the power of gene search and thus is a practically important area that requires methodology work. This study provides a comprehensive review of existing methods for conducting GWAS on complex diseases with multiple phenotypes including the multivariate analysis of variance (MANOVA), the principal component analysis (PCA), the generalized estimating equations (GEE), the trait-based association test involving the extended Simes procedure (TATES), and the classical Fisher combination test. We propose a new method that relaxes the unrealistic independence assumption of the classical Fisher combination test and is computationally efficient. To demonstrate applications of the proposed method, we also present the results of statistical analysis on the Study of Addiction: Genetics and Environment (SAGE) data. Our simulation study shows that the proposed method has higher power than existing methods while controlling for the type I error rate. The GEE and the classical Fisher combination test, on the other hand, do not control the type I error rate and thus are not recommended. In general, the power of the competing methods decreases as the correlation between phenotypes increases. All the methods tend to have lower power when the multivariate phenotypes come from long tailed distributions. The real data analysis also demonstrates that the proposed method allows us to compare the marginal results with the multivariate results and specify which SNPs are specific to a particular phenotype or contribute to the common construct. The proposed method outperforms existing methods in most settings and also has great applications in GWAS on complex diseases with multiple phenotypes such as the substance abuse disorders.
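For reference, the classical Fisher combination test mentioned above combines k independent p-values through the statistic -2 Σ ln p_i, which follows a chi-square distribution with 2k degrees of freedom under the null. The sketch below implements only this classical test (the one whose independence assumption the abstract calls unrealistic), not the proposed method; the function name and example p-values are hypothetical.

```python
import math


def fisher_combination(p_values):
    """Classical Fisher combination of independent p-values.

    Returns (statistic, combined_p). For an even number of degrees of
    freedom 2k, the chi-square survival function has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!, so no external library
    is needed.
    """
    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0
    term = 1.0    # (x/2)^0 / 0!
    total = 1.0
    for i in range(1, k):
        term *= half / i
        total += term
    return statistic, math.exp(-half) * total


# Hypothetical per-phenotype p-values for one SNP.
statistic, combined_p = fisher_combination([0.05, 0.05])
```

When the phenotypes are correlated, the chi-square reference distribution no longer holds, which is consistent with the abstract's remark that the classical test does not control the type I error rate.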
Multivariate Meta-Analysis of Genetic Association Studies: A Simulation Study
Neupane, Binod; Beyene, Joseph
2015-01-01
In a meta-analysis with multiple end points of interests that are correlated between or within studies, multivariate approach to meta-analysis has a potential to produce more precise estimates of effects by exploiting the correlation structure between end points. However, under random-effects assumption the multivariate estimation is more complex (as it involves estimation of more parameters simultaneously) than univariate estimation, and sometimes can produce unrealistic parameter estimates. Usefulness of multivariate approach to meta-analysis of the effects of a genetic variant on two or more correlated traits is not well understood in the area of genetic association studies. In such studies, genetic variants are expected to roughly maintain Hardy-Weinberg equilibrium within studies, and also their effects on complex traits are generally very small to modest and could be heterogeneous across studies for genuine reasons. We carried out extensive simulation to explore the comparative performance of multivariate approach with most commonly used univariate inverse-variance weighted approach under random-effects assumption in various realistic meta-analytic scenarios of genetic association studies of correlated end points. We evaluated the performance with respect to relative mean bias percentage, and root mean square error (RMSE) of the estimate and coverage probability of corresponding 95% confidence interval of the effect for each end point. Our simulation results suggest that multivariate approach performs similarly or better than univariate method when correlations between end points within or between studies are at least moderate and between-study variation is similar or larger than average within-study variation for meta-analyses of 10 or more genetic studies. 
Multivariate approach produces estimates with smaller bias and RMSE especially for the end point that has randomly or informatively missing summary data in some individual studies, when the missing data in the endpoint are imputed with null effects and quite large variance. PMID:26196398
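The univariate inverse-variance weighted approach under a random-effects assumption, used here as the comparator, can be sketched as below. The abstract does not state which between-study variance estimator was used; this sketch assumes the DerSimonian-Laird moment estimator, and the function name and example numbers are hypothetical.

```python
def random_effects_pool(effects, variances):
    """Univariate inverse-variance random-effects pooling.

    Assumes the DerSimonian-Laird estimator for the between-study
    variance tau^2. Returns (pooled_effect, standard_error, tau2).
    """
    k = len(effects)
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    # Fixed-effect estimate, needed for Cochran's Q.
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(star, effects)) / sum(star)
    standard_error = (1.0 / sum(star)) ** 0.5
    return pooled, standard_error, tau2


# Hypothetical per-study effect estimates and within-study variances.
pooled, se, tau2 = random_effects_pool([0.0, 1.0], [0.01, 0.01])
interval_95 = (pooled - 1.96 * se, pooled + 1.96 * se)
```

A multivariate approach would instead pool the vector of end-point effects jointly, using within- and between-study covariance matrices in place of the scalar variances.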
Multivariate Regression Analysis and Slaughter Livestock,
AGRICULTURE, *ECONOMICS), (*MEAT, PRODUCTION), MULTIVARIATE ANALYSIS, REGRESSION ANALYSIS , ANIMALS, WEIGHT, COSTS, PREDICTIONS, STABILITY, MATHEMATICAL MODELS, STORAGE, BEEF, PORK, FOOD, STATISTICAL DATA, ACCURACY
MULTIVARIATE ANALYSIS OF DRINKING BEHAVIOUR IN A RURAL POPULATION
Mathrubootham, N.; Bashyam, V.S.P.; Shahjahan
1997-01-01
This study was carried out to determine the drinking pattern in a rural population, using multivariate techniques. 386 current users identified in a community were assessed with regard to their drinking behaviours using a structured interview. For purposes of the study the questions were condensed into 46 meaningful variables. In bivariate analysis, 14 variables including dependent variables such as dependence, MAST & CAGE (measuring alcoholic status), Q.F. Index and troubled drinking were found to be significant. Taking these variables, other multivariate techniques such as ANOVA, correlation, regression analysis and factor analysis were carried out using both SPSS PC+ and an HCL Magnum mainframe computer with the FOCUS package and UNIX systems. Results revealed that a number of factors such as drinking style, duration of drinking, pattern of abuse, Q.F. Index and various problems influenced drinking and some of them set up a vicious circle. Factor analysis revealed mainly 3 factors: abuse, dependence and social drinking. Dependence could be divided into low/moderate dependence. The implications and practical applications of these tests are also discussed. PMID:21584077
NASA Astrophysics Data System (ADS)
Lee, An-Sheng; Lu, Wei-Li; Huang, Jyh-Jaan; Chang, Queenie; Wei, Kuo-Yen; Lin, Chin-Jung; Liou, Sofia Ya Hsuan
2016-04-01
Owing to the geological and climatic characteristics of Taiwan, rivers generally carry large amounts of suspended particles. After these particles settle, they become sediments, which are good sorbents for heavy metals in the river system. Consequently, sediments can record the contamination footprint in regions of low flow energy, such as estuaries. Seven sediment cores were collected along the Nankan River, northern Taiwan, which is seriously contaminated by factory, household and agricultural inputs. Physico-chemical properties of these cores were derived from Itrax-XRF Core Scanner and grain size analysis. In order to interpret these complex data matrices, multivariate statistical techniques (cluster analysis, factor analysis and discriminant analysis) were applied in this study. The statistical analysis indicates four types of sediment. One of them represents a contamination event, showing high concentrations of Cu, Zn, Pb, Ni and Fe and low concentrations of Si and Zr. Furthermore, three possible contamination sources of this type of sediment were revealed by factor analysis. The combination of sediment analysis and multivariate statistical techniques provides new insights into the contamination depositional history of the Nankan River and could be similarly applied to other river systems to determine the scale of anthropogenic contamination.
Selvarasu, Suresh; Kim, Do Yun; Karimi, Iftekhar A; Lee, Dong-Yup
2010-10-01
We present an integrated framework for characterizing fed-batch cultures of mouse hybridoma cells producing monoclonal antibody (mAb). This framework systematically combines data preprocessing, elemental balancing and statistical analysis technique. Initially, specific rates of cell growth, glucose/amino acid consumptions and mAb/metabolite productions were calculated via curve fitting using logistic equations, with subsequent elemental balancing of the preprocessed data indicating the presence of experimental measurement errors. Multivariate statistical analysis was then employed to understand physiological characteristics of the cellular system. The results from principal component analysis (PCA) revealed three major clusters of amino acids with similar trends in their consumption profiles: (i) arginine, threonine and serine, (ii) glycine, tyrosine, phenylalanine, methionine, histidine and asparagine, and (iii) lysine, valine and isoleucine. Further analysis using partial least square (PLS) regression identified key amino acids which were positively or negatively correlated with the cell growth, mAb production and the generation of lactate and ammonia. Based on these results, the optimal concentrations of key amino acids in the feed medium can be inferred, potentially leading to an increase in cell viability and productivity, as well as a decrease in toxic waste production. The study demonstrated how the current methodological framework using multivariate statistical analysis techniques can serve as a potential tool for deriving rational medium design strategies. Copyright © 2010 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Ye, M.; Pacheco Castro, R. B.; Pacheco Avila, J.; Cabrera Sansores, A.
2014-12-01
The karstic aquifer of Yucatan is a vulnerable and complex system. The first fifteen meters of this aquifer have been polluted; protecting this resource is therefore important because it is the only source of potable water for the entire State. Assessing groundwater quality provides knowledge about the main processes governing water chemistry as well as spatial patterns, which are important for establishing protection zones. In this work, multivariate statistical techniques are used to assess the groundwater quality of the supply wells (30 to 40 meters deep) in the hydrogeologic region of the Ring of Cenotes, located in Yucatan, Mexico. Cluster analysis and principal component analysis are applied to groundwater chemistry data from the study area. Results of the principal component analysis show that the main sources of variation in the data are sea water intrusion, the interaction of the water with the carbonate rocks of the system, and some pollution processes. The cluster analysis shows that the data can be divided into four clusters. The spatial distribution of the clusters seems to be random, but is consistent with sea water intrusion and pollution with nitrates. The overall results show that multivariate statistical analysis can be successfully applied in the groundwater quality assessment of this karstic aquifer.
NASA Astrophysics Data System (ADS)
Theodorakou, Chrysoula; Farquharson, Michael J.
2009-08-01
The motivation behind this study is to assess whether angular dispersive x-ray diffraction (ADXRD) data, processed using multivariate analysis techniques, can be used for classifying secondary colorectal liver cancer tissue and normal surrounding liver tissue in human liver biopsy samples. The ADXRD profiles from a total of 60 samples of normal liver tissue and colorectal liver metastases were measured using a synchrotron radiation source. The data were analysed for 56 samples using nonlinear peak-fitting software. Four peaks were fitted to all of the ADXRD profiles, and the amplitude, area, amplitude and area ratios for three of the four peaks were calculated and used for the statistical and multivariate analysis. The statistical analysis showed that there are significant differences between all the peak-fitting parameters and ratios between the normal and the diseased tissue groups. The technique of soft independent modelling of class analogy (SIMCA) was used to classify normal liver tissue and colorectal liver metastases resulting in 67% of the normal tissue samples and 60% of the secondary colorectal liver tissue samples being classified correctly. This study has shown that the ADXRD data of normal and secondary colorectal liver cancer are statistically different and x-ray diffraction data analysed using multivariate analysis have the potential to be used as a method of tissue classification.
NASA Astrophysics Data System (ADS)
Mansouri, Edris; Feizi, Faranak; Jafari Rad, Alireza; Arian, Mehran
2018-03-01
This paper uses multivariate regression to create a mathematical model for iron skarn exploration in the Sarvian area, central Iran, as a method for mineral prospectivity mapping (MPM). The main aim is to apply multivariate regression analysis (as an MPM method) to map iron outcrops in the northeastern part of the study area in order to discover new iron deposits in other parts of the study area. Two types of multivariate regression models using two linear equations were employed to discover new mineral deposits. This method is one of the reliable methods for processing satellite images. ASTER satellite images (14 bands) were used as unique independent variables (UIVs), and iron outcrops were mapped as dependent variables for MPM. According to the probability value (p value), coefficient of determination (R²) and adjusted coefficient of determination (adjusted R²), the second regression model (which consisted of multiple UIVs) fitted better than the other models. The accuracy of the model was confirmed by the iron outcrop map and geological observation. Based on field observation, iron mineralization occurs at the contact of limestone and intrusive rocks (skarn type).
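The model comparison above relies on the coefficient of determination and its adjusted form. A minimal sketch of both quantities follows; the function names and the example numbers are illustrative assumptions, not values from the study.

```python
def r_squared(observed, fitted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def adjusted_r_squared(r2, n_samples, n_predictors):
    """Penalise R^2 for the number of predictors (for example, the number of bands used)."""
    return 1.0 - (1.0 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)

# Hypothetical case: the same R^2 looks less convincing when more predictors were used.
print(adjusted_r_squared(0.90, n_samples=50, n_predictors=1))
print(adjusted_r_squared(0.90, n_samples=50, n_predictors=14))
```

Because the adjustment grows with the number of predictors, a model with many independent variables has to improve the raw fit noticeably before its adjusted value exceeds that of a simpler model.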
NASA Astrophysics Data System (ADS)
Chen, Quansheng; Qi, Shuai; Li, Huanhuan; Han, Xiaoyan; Ouyang, Qin; Zhao, Jiewen
2014-10-01
To rapidly and efficiently detect the presence of adulterants in honey, the three-dimensional fluorescence spectroscopy (3DFS) technique was employed with the help of multivariate calibration. The 3D fluorescence spectra were compressed using characteristic extraction and principal component analysis (PCA). Then, partial least squares (PLS) and back propagation neural network (BP-ANN) algorithms were used for modeling. The model was optimized by cross validation, and its performance was evaluated according to the root mean square error of prediction (RMSEP) and the correlation coefficient (R) in the prediction set. The results showed that the BP-ANN model was superior to the PLS models, and the optimum prediction results of the mixed group (sunflower ± longan ± buckwheat ± rape) model were achieved as follows: RMSEP = 0.0235 and R = 0.9787 in the prediction set. The study demonstrated that the 3D fluorescence spectroscopy technique combined with multivariate calibration has high potential in rapid, nondestructive, and accurate quantitative analysis of honey adulteration.
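The two figures of merit quoted above, RMSEP and R in the prediction set, can be computed as in the following sketch. The helper names and the toy numbers are assumptions for illustration only and are unrelated to the honey data.

```python
import math

def rmsep(reference, predicted):
    """Root mean square error of prediction over a held-out prediction set."""
    n = len(reference)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)

def pearson_r(x, y):
    """Pearson correlation coefficient between reference and predicted values."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical adulteration fractions: reference values versus model predictions.
reference = [0.00, 0.10, 0.20, 0.30, 0.40]
predicted = [0.02, 0.09, 0.22, 0.28, 0.41]
print(rmsep(reference, predicted), pearson_r(reference, predicted))
```

The two numbers answer different questions: RMSEP is in the units of the predicted quantity and measures absolute error, while R only measures how well predictions track the reference values linearly.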
Multivariate time series clustering on geophysical data recorded at Mt. Etna from 1996 to 2003
NASA Astrophysics Data System (ADS)
Di Salvo, Roberto; Montalto, Placido; Nunnari, Giuseppe; Neri, Marco; Puglisi, Giuseppe
2013-02-01
Time series clustering is an important task in data analysis issues in order to extract implicit, previously unknown, and potentially useful information from a large collection of data. Finding useful similar trends in multivariate time series represents a challenge in several areas including geophysics environment research. While traditional time series analysis methods deal only with univariate time series, multivariate time series analysis is a more suitable approach in the field of research where different kinds of data are available. Moreover, the conventional time series clustering techniques do not provide desired results for geophysical datasets due to the huge amount of data whose sampling rate is different according to the nature of signal. In this paper, a novel approach concerning geophysical multivariate time series clustering is proposed using dynamic time series segmentation and Self Organizing Maps techniques. This method allows finding coupling among trends of different geophysical data recorded from monitoring networks at Mt. Etna spanning from 1996 to 2003, when the transition from summit eruptions to flank eruptions occurred. This information can be used to carry out a more careful evaluation of the state of volcano and to define potential hazard assessment at Mt. Etna.
A multivariate time series approach to modeling and forecasting demand in the emergency department.
Jones, Spencer S; Evans, R Scott; Allen, Todd L; Thomas, Alun; Haug, Peter J; Welch, Shari J; Snow, Gregory L
2009-02-01
The goals of this investigation were to study the temporal relationships between the demands for key resources in the emergency department (ED) and the inpatient hospital, and to develop multivariate forecasting models. Hourly data were collected from three diverse hospitals for the year 2006. Descriptive analysis and model fitting were carried out using graphical and multivariate time series methods. Multivariate models were compared to a univariate benchmark model in terms of their ability to provide out-of-sample forecasts of ED census and the demands for diagnostic resources. Descriptive analyses revealed little temporal interaction between the demand for inpatient resources and the demand for ED resources at the facilities considered. Multivariate models provided more accurate forecasts of ED census and of the demands for diagnostic resources. Our results suggest that multivariate time series models can be used to reliably forecast ED patient census; however, forecasts of the demands for diagnostic resources were not sufficiently reliable to be useful in the clinical setting.
Riley, Richard D; Elia, Eleni G; Malin, Gemma; Hemming, Karla; Price, Malcolm P
2015-07-30
A prognostic factor is any measure that is associated with the risk of future health outcomes in those with existing disease. Often, the prognostic ability of a factor is evaluated in multiple studies. However, meta-analysis is difficult because primary studies often use different methods of measurement and/or different cut-points to dichotomise continuous factors into 'high' and 'low' groups; selective reporting is also common. We illustrate how multivariate random effects meta-analysis models can accommodate multiple prognostic effect estimates from the same study, relating to multiple cut-points and/or methods of measurement. The models account for within-study and between-study correlations, which utilises more information and reduces the impact of unreported cut-points and/or measurement methods in some studies. The applicability of the approach is improved with individual participant data and by assuming a functional relationship between prognostic effect and cut-point to reduce the number of unknown parameters. The models provide important inferential results for each cut-point and method of measurement, including the summary prognostic effect, the between-study variance and a 95% prediction interval for the prognostic effect in new populations. Two applications are presented. The first reveals that, in a multivariate meta-analysis using published results, the Apgar score is prognostic of neonatal mortality but effect sizes are smaller at most cut-points than previously thought. In the second, a multivariate meta-analysis of two methods of measurement provides weak evidence that microvessel density is prognostic of mortality in lung cancer, even when individual participant data are available so that a continuous prognostic trend is examined (rather than cut-points). © 2015 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xiaoyan Tang; Min Shao; Yuanhang Zhang
1996-12-31
Ambient aerosol is one of the most important pollutants in China. This paper presents the sources of aerosol in the Beijing area as revealed by a combination of multivariate analysis models and the 14C tracer measured by Accelerator Mass Spectrometry (AMS). The results indicated that the mass concentration of particulate (<100 (M)) did not increase rapidly compared with economic development in Beijing city. The multivariate analysis showed that the predominant source was soil dust, which contributed more than 50% to atmospheric particles. However, it would be risky to conclude from this that aerosol pollution from anthropogenic sources was less important in Beijing city. Owing to the lack of reliable tracers, it was very hard to distinguish coal burning from the soil source. Thus, it was suspected that the soil source above might be a mixture of soil dust and coal burning. The 14C measurement showed that the carbonaceous species of aerosol had quite different emission sources. For carbonaceous aerosols in Beijing, the contribution from fossil fuel to ambient particles was nearly 2/3; as man-made activities (coal burning, etc.) increased, the fossil part would contribute more to atmospheric carbonaceous particles. For example, in downtown Beijing in the space-heating season, fossil fuel contributed more than 95% of carbonaceous particles, which is potentially harmful to the population. By using multivariate analysis together with 14C data, two important sources of aerosols in Beijing (soil and coal combustion) were more reliably distinguished, which is critically important for assessing the aerosol problem in China.
Fongaro, Lorenzo; Alamprese, Cristina; Casiraghi, Ernestina
2015-03-01
During ripening of salami, colour changes occur due to oxidation phenomena involving myoglobin. Moreover, shrinkage due to dehydration results in aspect modifications, mainly ascribable to fat aggregation. The aim of this work was the application of image analysis (IA) and multivariate image analysis (MIA) techniques to the study of colour and aspect changes occurring in salami during ripening. IA results showed that red, green, blue, and intensity parameters decreased due to the development of a global darker colour, while Heterogeneity increased due to fat aggregation. By applying MIA, different salami slice areas corresponding to fat and three different degrees of oxidised meat were identified and quantified. It was thus possible to study the trend of these different areas as a function of ripening, making objective an evaluation usually performed by subjective visual inspection. Copyright © 2014 Elsevier Ltd. All rights reserved.
Multivariate analysis: A statistical approach for computations
NASA Astrophysics Data System (ADS)
Michu, Sachin; Kaushik, Vandana
2014-10-01
Multivariate analysis is a statistical approach commonly used in automotive diagnosis, education, evaluating clusters in finance, etc., and more recently in the health-related professions. The objective of the paper is to provide a detailed exploratory discussion of factor analysis (FA) in image retrieval and correlation analysis (CA) of network traffic. Image retrieval methods aim to retrieve relevant images from a collected database based on their content. The problem is made more difficult by the high dimension of the variable space in which the images are represented. Multivariate correlation analysis provides an anomaly detection and analysis method based on the correlation coefficient matrix. Anomalous behaviors in the network include various attacks such as DDoS attacks and network scanning.
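One simple way to picture anomaly detection based on a correlation coefficient matrix is to compare the matrix of a baseline traffic window with that of a test window. The sketch below is an assumption about how such a comparison could look, not the method of the paper; the feature names and the idea of flagging on the largest coefficient change are hypothetical.

```python
import math

def correlation_matrix(rows):
    """Pearson correlation matrix for observations given as rows of feature values."""
    n_features = len(rows[0])
    cols = [[row[j] for row in rows] for j in range(n_features)]

    def corr(a, b):
        ma, mb = sum(a) / len(a), sum(b) / len(b)
        sab = sum((x - ma) * (y - mb) for x, y in zip(a, b))
        saa = sum((x - ma) ** 2 for x in a)
        sbb = sum((y - mb) ** 2 for y in b)
        # A constant feature has no defined correlation; treat it as 0 in this sketch.
        return sab / math.sqrt(saa * sbb) if saa > 0 and sbb > 0 else 0.0

    return [[corr(cols[i], cols[j]) for j in range(n_features)] for i in range(n_features)]

def max_correlation_shift(baseline_rows, test_rows):
    """Largest absolute change in any correlation coefficient between two windows."""
    base = correlation_matrix(baseline_rows)
    test = correlation_matrix(test_rows)
    return max(abs(b - t) for brow, trow in zip(base, test) for b, t in zip(brow, trow))

# Hypothetical features per time slot: (packets, distinct destination ports).
baseline = [[10, 2], [20, 4], [30, 6], [40, 8]]
suspect = [[10, 8], [20, 6], [30, 4], [40, 2]]
print(max_correlation_shift(baseline, suspect))
```

A window would be flagged when the shift exceeds a threshold chosen from normal traffic; the threshold itself is a tuning decision that this sketch leaves open.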
Multivariate Cluster Analysis.
ERIC Educational Resources Information Center
McRae, Douglas J.
Procedures for grouping students into homogeneous subsets have long interested educational researchers. The research reported in this paper is an investigation of a set of objective grouping procedures based on multivariate analysis considerations. Four multivariate functions that might serve as criteria for adequate grouping are given and…
Exploring the Replicability of a Study's Results: Bootstrap Statistics for the Multivariate Case.
ERIC Educational Resources Information Center
Thompson, Bruce
1995-01-01
Use of the bootstrap method in a canonical correlation analysis to evaluate the replicability of a study's results is illustrated. More confidence may be vested in research results that replicate. (SLD)
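The bootstrap idea can be shown on a simpler statistic than the one in the abstract. The sketch below resamples cases with replacement and builds a percentile interval for an ordinary Pearson correlation; substituting the Pearson coefficient for canonical correlation is a simplification made here to keep the example short, and the data, resample count and seed are arbitrary assumptions.

```python
import math
import random

def pearson_r(x, y):
    """Pearson correlation, or None when a resample has no spread in x or y."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)

def bootstrap_correlation(x, y, n_resamples=2000, seed=0):
    """Percentile 95% interval for the correlation, resampling (x, y) pairs together."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        r = pearson_r([x[i] for i in idx], [y[i] for i in idx])
        if r is not None:
            estimates.append(r)
    estimates.sort()
    lower = estimates[int(0.025 * len(estimates))]
    upper = estimates[min(len(estimates) - 1, int(0.975 * len(estimates)))]
    return lower, upper

# Hypothetical paired scores: y follows x with a small alternating disturbance.
x = list(range(20))
y = [v + (-1) ** v for v in x]
print(bootstrap_correlation(x, y, n_resamples=500, seed=1))
```

A narrow interval across resamples is the sense in which a result "replicates" here: the estimate does not depend heavily on which particular cases happened to be sampled.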
Goh, Choon Fu; Craig, Duncan Q M; Hadgraft, Jonathan; Lane, Majella E
2017-02-01
Drug permeation through the intercellular lipids, which pack around and between corneocytes, may be enhanced by increasing the thermodynamic activity of the active in a formulation. However, this may also result in unwanted drug crystallisation on and in the skin. In this work, we explore the combination of ATR-FTIR spectroscopy and multivariate data analysis to study drug crystallisation in the skin. Ex vivo permeation studies of saturated solutions of diclofenac sodium (DF Na) in two vehicles, propylene glycol (PG) and dimethyl sulphoxide (DMSO), were carried out in porcine ear skin. Tape stripping and ATR-FTIR spectroscopy were conducted simultaneously to collect spectral data as a function of skin depth. Multivariate data analysis was applied to visualise and categorise the spectral data in the region of interest (1700-1500 cm⁻¹) containing the carboxylate (COO⁻) asymmetric stretching vibrations of DF Na. Spectral data showed the redshifts of the COO⁻ asymmetric stretching vibrations for DF Na in the solution compared with solid drug. Similar shifts were evident following application of saturated solutions of DF Na to porcine skin samples. Multivariate data analysis categorised the spectral data based on the spectral differences and drug crystallisation was found to be confined to the upper layers of the skin. This proof-of-concept study highlights the utility of ATR-FTIR spectroscopy in combination with multivariate data analysis as a simple and rapid approach in the investigation of drug deposition in the skin. The approach described here will be extended to the study of other actives for topical application to the skin. Copyright © 2016 Elsevier B.V. All rights reserved.
Enhancing e-waste estimates: Improving data quality by multivariate Input–Output Analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, Feng, E-mail: fwang@unu.edu; Design for Sustainability Lab, Faculty of Industrial Design Engineering, Delft University of Technology, Landbergstraat 15, 2628CE Delft; Huisman, Jaco
2013-11-15
Highlights: • A multivariate Input–Output Analysis method for e-waste estimates is proposed. • Applying multivariate analysis to consolidate data can enhance e-waste estimates. • We examine the influence of model selection and data quality on e-waste estimates. • Datasets of all e-waste related variables in a Dutch case study have been provided. • Accurate modeling of time-variant lifespan distributions is critical for estimate.
Abstract: Waste electrical and electronic equipment (or e-waste) is one of the fastest growing waste streams, which encompasses a wide and increasing spectrum of products. Accurate estimation of e-waste generation is difficult, mainly due to lack of high quality data referred to market and socio-economic dynamics. This paper addresses how to enhance e-waste estimates by providing techniques to increase data quality. An advanced, flexible and multivariate Input–Output Analysis (IOA) method is proposed. It links all three pillars in IOA (product sales, stock and lifespan profiles) to construct mathematical relationships between various data points. By applying this method, the data consolidation steps can generate more accurate time-series datasets from available data pool. This can consequently increase the reliability of e-waste estimates compared to the approach without data processing. A case study in the Netherlands is used to apply the advanced IOA model. As a result, for the first time ever, complete datasets of all three variables for estimating all types of e-waste have been obtained. The result of this study also demonstrates significant disparity between various estimation models, arising from the use of data under different conditions. It shows the importance of applying multivariate approach and multiple sources to improve data quality for modelling, specifically using appropriate time-varying lifespan parameters. Following the case study, a roadmap with a procedural guideline is provided to enhance e-waste estimation studies.
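The link between the three IOA pillars (sales, stock and lifespan) can be sketched with a basic sales-lifespan calculation. This is a simplified illustration under stated assumptions, not the paper's model: the discard probabilities are fixed per age rather than time-varying, and all names and numbers are hypothetical.

```python
def waste_generated(sales_by_year, discard_prob_by_age, year):
    """Units discarded in `year`: past sales weighted by the chance of discard at that age."""
    total = 0.0
    for sale_year, units in sales_by_year.items():
        age = year - sale_year
        if 0 <= age < len(discard_prob_by_age):
            total += units * discard_prob_by_age[age]
    return total

def stock_in_use(sales_by_year, discard_prob_by_age, year):
    """Units still in use at the end of `year`: cumulative sales minus cumulative discards."""
    sold = sum(units for y, units in sales_by_year.items() if y <= year)
    first_year = min(sales_by_year)
    discarded = sum(
        waste_generated(sales_by_year, discard_prob_by_age, y)
        for y in range(first_year, year + 1)
    )
    return sold - discarded

# Hypothetical product: nothing discarded in the sale year, half after 1 year, half after 2.
sales = {2000: 100.0, 2001: 200.0}
lifespan = [0.0, 0.5, 0.5]
for y in range(2000, 2005):
    print(y, waste_generated(sales, lifespan, y), stock_in_use(sales, lifespan, y))
```

Because any two of the three variables constrain the third, an inconsistent stock figure computed this way points to a problem in either the sales series or the assumed lifespan profile.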
Quantifying the impact of between-study heterogeneity in multivariate meta-analyses
Jackson, Dan; White, Ian R; Riley, Richard D
2012-01-01
Measures that quantify the impact of heterogeneity in univariate meta-analysis, including the very popular I2 statistic, are now well established. Multivariate meta-analysis, where studies provide multiple outcomes that are pooled in a single analysis, is also becoming more commonly used. The question of how to quantify heterogeneity in the multivariate setting is therefore raised. It is the univariate R2 statistic, the ratio of the variance of the estimated treatment effect under the random and fixed effects models, that generalises most naturally, so this statistic provides our basis. This statistic is then used to derive a multivariate analogue of I2, which we call . We also provide a multivariate H2 statistic, the ratio of a generalisation of Cochran's heterogeneity statistic and its associated degrees of freedom, with an accompanying generalisation of the usual I2 statistic, . Our proposed heterogeneity statistics can be used alongside all the usual estimates and inferential procedures used in multivariate meta-analysis. We apply our methods to some real datasets and show how our statistics are equally appropriate in the context of multivariate meta-regression, where study level covariate effects are included in the model. Our heterogeneity statistics may be used when applying any procedure for fitting the multivariate random effects model. Copyright © 2012 John Wiley & Sons, Ltd. PMID:22763950
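For orientation, the well-established univariate quantities that the abstract takes as its starting point can be computed as follows. This sketch covers only the univariate Cochran Q, H2 and I2 statistics with inverse-variance weights; it does not implement the multivariate generalisations proposed in the paper, and the effect sizes are made up.

```python
def heterogeneity(effects, variances):
    """Return Cochran's Q, H^2 = Q / df and I^2 = max(0, (Q - df) / Q) for one outcome."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    h2 = q / df
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, h2, i2

# Hypothetical study estimates with equal within-study variances.
q, h2, i2 = heterogeneity([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
print(q, h2, i2)
```

I2 is truncated at zero because Q can fall below its degrees of freedom by chance, in which case the data show no more variation than within-study error alone would produce.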
Brito Lopes, Fernando; da Silva, Marcelo Corrêa; Magnabosco, Cláudio Ulhôa; Goncalves Narciso, Marcelo; Sainz, Roberto Daniel
2016-01-01
This research evaluated a multivariate approach as an alternative tool for the purpose of selection regarding expected progeny differences (EPDs). Data were fitted using a multi-trait model and consisted of growth traits (birth weight and weights at 120, 210, 365 and 450 days of age) and carcass traits (longissimus muscle area (LMA), back-fat thickness (BF), and rump fat thickness (RF)), registered over 21 years in extensive breeding systems of Polled Nellore cattle in Brazil. Multivariate analyses were performed using standardized (zero mean and unit variance) EPDs. The k mean method revealed that the best fit of data occurred using three clusters (k = 3) (P < 0.001). Estimates of genetic correlation among growth and carcass traits and the estimates of heritability were moderate to high, suggesting that a correlated response approach is suitable for practical decision making. Estimates of correlation between selection indices and the multivariate index (LD1) were moderate to high, ranging from 0.48 to 0.97. This reveals that both types of indices give similar results and that the multivariate approach is reliable for the purpose of selection. The alternative tool seems very handy when economic weights are not available or in cases where more rapid identification of the best animals is desired. Interestingly, multivariate analysis allowed forecasting information based on the relationships among breeding values (EPDs). Also, it enabled fine discrimination, rapid data summarization after genetic evaluation, and permitted accounting for maternal ability and the genetic direct potential of the animals. In addition, we recommend the use of longissimus muscle area and subcutaneous fat thickness as selection criteria, to allow estimation of breeding values before the first mating season in order to accelerate the response to individual selection. PMID:26789008
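The two preprocessing and grouping steps named above, standardising values to zero mean and unit variance and then clustering with k-means, can be sketched in plain Python. The starting centroids, the toy values and the use of the population standard deviation are assumptions of this example; the abstract does not describe how the authors initialised or ran the algorithm.

```python
import math

def standardize(rows):
    """Scale every column to zero mean and unit (population) variance."""
    n = len(rows)
    n_features = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(n_features)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) for j in range(n_features)]
    return [[(r[j] - means[j]) / stds[j] for j in range(n_features)] for r in rows]

def k_means(rows, initial_centroids, n_iter=100):
    """Lloyd's algorithm: assign rows to the nearest centroid, then recompute centroids."""
    centroids = [list(c) for c in initial_centroids]
    labels = []
    for _ in range(n_iter):
        new_labels = [
            min(range(len(centroids)),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centroids[c])))
            for row in rows
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the solution has converged
        labels = new_labels
        for c in range(len(centroids)):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Hypothetical animals described by two EPD-like values on very different scales.
animals = [[1.0, 10.0], [1.2, 11.0], [0.8, 9.0], [5.0, 30.0], [5.2, 31.0], [4.8, 29.0]]
scaled = standardize(animals)
labels, _ = k_means(scaled, initial_centroids=[scaled[0], scaled[3]])
print(labels)
```

Standardising first matters because k-means uses Euclidean distance, so without it the trait with the largest numeric range would dominate the grouping.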
Rich, Ashleigh J; Lachowsky, Nathan J; Cui, Zishan; Sereda, Paul; Lal, Allan; Moore, David M; Hogg, Robert S; Roth, Eric A
2015-01-01
This study analyzed event-level partnership data from a computer-assisted survey of 719 gay and bisexual men (GBM) enrolled in the Momentum Health Study to delineate potential linkages between anal sex roles and so-called “sex drugs”, i.e. erectile dysfunction drugs (EDD), poppers and crystal methamphetamine. Univariable and multivariable analyses using generalized linear mixed models with logit link function with sexual encounters (n=2,514) as the unit of analysis tested four hypotheses: 1) EDD are significantly associated with insertive anal sex roles, 2) poppers are significantly associated with receptive anal sex, 3) both poppers and EDD are significantly associated with anal sexual versatility and, 4) crystal methamphetamine is significantly associated with all anal sex roles. Data for survey respondents and their sexual partners allowed testing these hypotheses for both anal sex partners in the same encounter. Multivariable results supported the first three hypotheses. Crystal methamphetamine was significantly associated with all anal sex roles in the univariable models, but not significant in any multivariable ones. Other multivariable significant variables included attending group sex events, venue where first met, and self-described sexual orientation. Results indicate that GBM sex-drug use behavior features rational decision-making strategies linked to anal sex roles. They also suggest that more research on anal sex roles, particularly versatility, is needed, and that sexual behavior research can benefit from partnership analysis. PMID:26525571
Rich, Ashleigh J; Lachowsky, Nathan J; Cui, Zishan; Sereda, Paul; Lal, Allan; Moore, David M; Hogg, Robert S; Roth, Eric A
2016-08-01
This study analyzed event-level partnership data from a computer-assisted survey of 719 gay and bisexual men (GBM) enrolled in the Momentum Health Study to delineate potential linkages between anal sex roles and the so-called "sex drugs," i.e., erectile dysfunction drugs (EDD), poppers, and crystal methamphetamine. Univariable and multivariable analyses using generalized linear mixed models with logit link function with sexual encounters (n = 2514) as the unit of analysis tested four hypotheses: (1) EDD are significantly associated with insertive anal sex roles, (2) poppers are significantly associated with receptive anal sex, (3) both poppers and EDD are significantly associated with anal sexual versatility, and (4) crystal methamphetamine is significantly associated with all anal sex roles. Data for survey respondents and their sexual partners allowed testing these hypotheses for both anal sex partners in the same encounter. Multivariable results supported the first three hypotheses. Crystal methamphetamine was significantly associated with all anal sex roles in the univariable models, but not significant in any multivariable ones. Other multivariable significant variables included attending group sex events, venue where first met, and self-described sexual orientation. Results indicate that GBM sex-drug use behavior features rational decision-making strategies linked to anal sex roles. They also suggest that more research on anal sex roles, particularly versatility, is needed, and that sexual behavior research can benefit from partnership analysis.
Medland, Sarah E; Loesch, Danuta Z; Mdzewski, Bogdan; Zhu, Gu; Montgomery, Grant W; Martin, Nicholas G
2007-01-01
The finger ridge count (a measure of pattern size) is one of the most heritable complex traits studied in humans and has been considered a model human polygenic trait in quantitative genetic analysis. Here, we report the results of the first genome-wide linkage scan for finger ridge count in a sample of 2,114 offspring from 922 nuclear families. Both univariate linkage to the absolute ridge count (a sum of all the ridge counts on all ten fingers), and multivariate linkage analyses of the counts on individual fingers, were conducted. The multivariate analyses yielded significant linkage to 5q14.1 (Logarithm of odds [LOD] = 3.34, pointwise-empirical p-value = 0.00025) that was predominantly driven by linkage to the ring, index, and middle fingers. The strongest univariate linkage was to 1q42.2 (LOD = 2.04, point-wise p-value = 0.002, genome-wide p-value = 0.29). In summary, the combination of univariate and multivariate results was more informative than simple univariate analyses alone. Patterns of quantitative trait loci factor loadings consistent with developmental fields were observed, and the simple pleiotropic model underlying the absolute ridge count was not sufficient to characterize the interrelationships between the ridge counts of individual fingers. PMID:17907812
Beer fermentation: monitoring of process parameters by FT-NIR and multivariate data analysis.
Grassi, Silvia; Amigo, José Manuel; Lyndgaard, Christian Bøge; Foschino, Roberto; Casiraghi, Ernestina
2014-07-15
This work investigates the capability of Fourier-Transform near infrared (FT-NIR) spectroscopy to monitor and assess process parameters in beer fermentation at different operative conditions. For this purpose, the fermentation of wort with two different yeast strains and at different temperatures was monitored for nine days by FT-NIR. To correlate the collected spectra with °Brix, pH and biomass, different multivariate data methodologies were applied. Principal component analysis (PCA), partial least squares (PLS) and locally weighted regression (LWR) were used to assess the relationship between FT-NIR spectra and the abovementioned process parameters that define the beer fermentation. The accuracy and robustness of the obtained results clearly show the suitability of FT-NIR spectroscopy, combined with multivariate data analysis, as a quality control tool in the beer fermentation process. FT-NIR spectroscopy, when combined with LWR, proves to be a suitable quantitative method for implementation in the production of beer. Copyright © 2014 Elsevier Ltd. All rights reserved.
Fadel, Maya Abou; de Juan, Anna; Vezin, Hervé; Duponchel, Ludovic
2016-12-01
Electron paramagnetic resonance (EPR) spectroscopy is a powerful technique that is able to characterize radicals formed in kinetic reactions. However, spectral characterization of individual chemical species is often limited or even unmanageable due to the severe kinetic and spectral overlap among species in kinetic processes. Therefore, we applied, for the first time, multivariate curve resolution-alternating least squares (MCR-ALS) method to EPR time evolving data sets to model and characterize the different constituents in a kinetic reaction. Here we demonstrate the advantage of multivariate analysis in the investigation of radicals formed along the kinetic process of hydroxycoumarin in alkaline medium. Multiset analysis of several EPR-monitored kinetic experiments performed in different conditions revealed the individual paramagnetic centres as well as their kinetic profiles. The results obtained by MCR-ALS method demonstrate its prominent potential in analysis of EPR time evolved spectra. Copyright © 2016 Elsevier B.V. All rights reserved.
Influence of the Rh (D) blood group system on graft survival in renal transplantation.
Bryan, C F; Mitchell, S I; Lin, H M; Nelson, P W; Shield, C F; Luger, A M; Pierce, G E; Ross, G; Warady, B A; Aeder, M I; Helling, T S; Landreneau, M D; Harrell, K M
1998-02-27
The Rh (D) blood group system has not traditionally been considered to be a clinically relevant histocompatibility barrier in transplantation since conflicting results of its clinical importance have been reported. We analyzed 786 consecutive primary cadaveric renal transplants performed by transplant centers in our Organ Procurement Organization (OPO) between 1990 and 1997. We also analyzed United Network for Organ Sharing (UNOS) data on 26,469 kidney transplants done from April 1994 to June 1996. Multivariate analysis revealed that Rh identity between the recipient and donor was significantly related to better graft outcome (risk ratio, 0.43; 95% confidence interval, 0.30 to 0.61; P=0.0001). Multivariate analysis of the UNOS data revealed that the Rh -/- group may have a positive influence on graft survival with a risk ratio of 0.43 (P=0.14). Multivariate analysis of primary cadaveric renal allografts performed within the Midwest Organ Bank OPO indicates that Rh (D) is a clinically relevant histocompatibility barrier that influences 7-year graft survival.
Jantzi, Sarah C; Almirall, José R
2014-01-01
Elemental analysis of soil is a useful application of both laser ablation inductively coupled plasma mass spectrometry (LA-ICP-MS) and laser-induced breakdown spectroscopy (LIBS) in geological, agricultural, environmental, archeological, planetary, and forensic sciences. In forensic science, the question to be answered is often whether soil specimens found on objects (e.g., shoes, tires, or tools) originated from the crime scene or other location of interest. Elemental analysis of the soil from the object and the locations of interest results in a characteristic elemental profile of each specimen, consisting of the amount of each element present. Because multiple elements are measured, multivariate statistics can be used to compare the elemental profiles in order to determine whether the specimen from the object is similar to one of the locations of interest. Previous work involved milling and pressing 0.5 g of soil into pellets before analysis using LA-ICP-MS and LIBS. However, forensic examiners prefer techniques that require smaller samples, are less time consuming, and are less destructive, allowing for future analysis by other techniques. An alternative sample introduction method was developed to meet these needs while still providing quantitative results suitable for multivariate comparisons. The tape-mounting method involved deposition of a thin layer of soil onto double-sided adhesive tape. A comparison of tape-mounting and pellet method performance is reported for both LA-ICP-MS and LIBS. Calibration standards and reference materials, prepared using the tape method, were analyzed by LA-ICP-MS and LIBS. As with the pellet method, linear calibration curves were achieved with the tape method, as well as good precision and low bias. Soil specimens from Miami-Dade County were prepared by both the pellet and tape methods and analyzed by LA-ICP-MS and LIBS. Principal components analysis and linear discriminant analysis were applied to the multivariate data. Results from both the tape method and the pellet method were nearly identical, with clear groupings and correct classification rates of >94%.
NASA Astrophysics Data System (ADS)
Gómez, Wilmar
2017-04-01
By analyzing the spatial and temporal variability of extreme precipitation events we can prevent or reduce the threat and risk. Many water resources projects require joint probability distributions of random variables such as precipitation intensity and duration, which cannot be assumed independent of each other. The problem of defining a probability model for observations of several dependent variables is greatly simplified by expressing the joint distribution in terms of its marginals through copulas. This document presents a general framework for bivariate and multivariate frequency analysis using Archimedean copulas for extreme hydroclimatological events such as severe storms. This analysis was conducted in the lower Tunjuelo River basin in Colombia for precipitation events. The results show that, for a joint study of intensity, duration and frequency, IDF curves can be obtained through copulas, thus establishing more accurate and reliable information on design storms and associated risks. It shows how the use of copulas greatly simplifies the study of multivariate distributions and introduces the concept of joint return period, used to represent the needs of hydrological designs properly in frequency analysis.
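As a hedged illustration of the joint return period mentioned in the abstract above, the sketch below evaluates a Gumbel-Hougaard copula, one member of the Archimedean family. The abstract does not state which Archimedean copula was fitted, so the copula family, the dependence parameter `theta`, the `mean_interarrival` argument and both function names are assumptions made purely for illustration.

```python
import math


def gumbel_copula(u: float, v: float, theta: float) -> float:
    """Gumbel-Hougaard copula C(u, v) for marginal probabilities u, v in (0, 1].

    theta >= 1 controls dependence; theta = 1 reduces to independence (C = u * v).
    """
    if theta < 1.0:
        raise ValueError("theta must be >= 1 for the Gumbel-Hougaard copula")
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))


def joint_return_periods(u: float, v: float, theta: float,
                         mean_interarrival: float = 1.0):
    """Return the ('OR', 'AND') joint return periods.

    u and v are the marginal non-exceedance probabilities of the two variables
    (for example storm intensity and duration); mean_interarrival is the mean
    time between events (hypothetical parameter, in the time unit of interest).
    """
    c = gumbel_copula(u, v, theta)
    t_or = mean_interarrival / (1.0 - c)            # either variable exceeded
    t_and = mean_interarrival / (1.0 - u - v + c)   # both variables exceeded
    return t_or, t_and
```

With `theta = 1` the two variables are treated as independent; larger values of `theta` shorten the "AND" return period because joint exceedances become more likely.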
Multivariate analysis of prognostic factors in synovial sarcoma.
Koh, Kyoung Hwan; Cho, Eun Yoon; Kim, Dong Wook; Seo, Sung Wook
2009-11-01
Many studies have described the diversity of synovial sarcoma in terms of its biological characteristics and clinical features. Moreover, much effort has been expended on the identification of prognostic factors because of the unpredictable behavior of synovial sarcomas. However, with the exception of tumor size, published results have been inconsistent. We attempted to identify independent risk factors using survival analysis. Forty-one consecutive patients with synovial sarcoma were prospectively followed from January 1997 to March 2008. Overall and progression-free survival for age, sex, tumor size, tumor location, metastasis at presentation, histologic subtype, chemotherapy, radiation therapy, and resection margin were analyzed, and standard multivariate Cox proportional hazard regression analysis was used to evaluate potential prognostic factors. Tumor size (>5 cm), nonlimb-based tumors, metastasis at presentation, and a monophasic subtype were associated with poorer overall survival. Multivariate analysis showed that metastasis at presentation and monophasic tumor subtype affected overall survival. For progression-free survival, monophasic subtype was found to be the only prognostic factor. The study confirmed that histologic subtype is the single most important independent prognostic factor of synovial sarcoma regardless of tumor stage.
Clinical Trials With Large Numbers of Variables: Important Advantages of Canonical Analysis.
Cleophas, Ton J
2016-01-01
Canonical analysis assesses the combined effects of a set of predictor variables on a set of outcome variables, but it is little used in clinical trials despite the omnipresence of multiple variables. The aim of this study was to assess the performance of canonical analysis as compared with traditional multivariate methods using multivariate analysis of covariance (MANCOVA). As an example, a simulated data file with 12 gene expression levels and 4 drug efficacy scores was used. The correlation coefficient between the 12 predictor and 4 outcome variables was 0.87 (P = 0.0001), meaning that 76% of the variability in the outcome variables was explained by the 12 covariates. Repeated testing after the removal of 5 unimportant predictor variables and 1 outcome variable produced virtually the same overall result. The MANCOVA identified identical unimportant variables, but it was unable to provide overall statistics. (1) Canonical analysis is remarkable, because it can handle many more variables than traditional multivariate methods such as MANCOVA can. (2) At the same time, it accounts for the relative importance of the separate variables, their interactions and differences in units. (3) Canonical analysis provides overall statistics of the effects of sets of variables, whereas traditional multivariate methods only provide the statistics of the separate variables. (4) Unlike other methods for combining the effects of multiple variables such as factor analysis/partial least squares, canonical analysis is scientifically entirely rigorous. (5) Limitations include that it is less flexible than factor analysis/partial least squares, because only 2 sets of variables are used and because multiple solutions instead of one are offered. We do hope that this article will stimulate clinical investigators to start using this remarkable method.
Analysis techniques for multivariate root loci. [a tool in linear control systems]
NASA Technical Reports Server (NTRS)
Thompson, P. M.; Stein, G.; Laub, A. J.
1980-01-01
Analysis and techniques are developed for the multivariable root locus and the multivariable optimal root locus. The generalized eigenvalue problem is used to compute angles and sensitivities for both types of loci, and an algorithm is presented that determines the asymptotic properties of the optimal root locus.
Methods for presentation and display of multivariate data
NASA Technical Reports Server (NTRS)
Myers, R. H.
1981-01-01
Methods for the presentation and display of multivariate data are discussed with emphasis placed on the multivariate analysis of variance problems and the Hotelling T² solution in the two-sample case. The methods utilize the concepts of stepwise discrimination analysis and the computation of partial correlation coefficients.
A Primer on Multivariate Analysis of Variance (MANOVA) for Behavioral Scientists
ERIC Educational Resources Information Center
Warne, Russell T.
2014-01-01
Reviews of statistical procedures (e.g., Bangert & Baumberger, 2005; Kieffer, Reese, & Thompson, 2001; Warne, Lazo, Ramos, & Ritter, 2012) show that one of the most common multivariate statistical methods in psychological research is multivariate analysis of variance (MANOVA). However, MANOVA and its associated procedures are often not…
Liu, Yong; Su, Chao; Zhang, Hong; Li, Xiaoting; Pei, Jingfei
2014-01-01
Many studies indicated that industrialization and urbanization caused serious soil heavy metal pollution from industrialized age. However, fewer previous studies have conducted a combined analysis of the landscape pattern, urbanization, industrialization, and heavy metal pollution. This paper was aimed at exploring the relationships of heavy metals in the soil (Pb, Cu, Ni, As, Cd, Cr, Hg, and Zn) with landscape pattern, industrialisation, urbanisation in Taiyuan city using multivariate analysis. The multivariate analysis included correlation analysis, analysis of variance (ANOVA), independent-sample T test, and principal component analysis (PCA). Geographic information system (GIS) was also applied to determine the spatial distribution of the heavy metals. The spatial distribution maps showed that the heavy metal pollution of the soil was more serious in the centre of the study area. The results of the multivariate analysis indicated that the correlations among heavy metals were significant, and industrialisation could significantly affect the concentrations of some heavy metals. Landscape diversity showed a significant negative correlation with the heavy metal concentrations. The PCA showed that a two-factor model for heavy metal pollution, industrialisation, and the landscape pattern could effectively demonstrate the relationships between these variables. The model explained 86.71% of the total variance of the data. Moreover, the first factor was mainly loaded with the comprehensive pollution index (P), and the second factor was primarily loaded with landscape diversity and dominance (H and D). An ordination of 80 samples could show the pollution pattern of all the samples. The results revealed that local industrialisation caused heavy metal pollution of the soil, but such pollution could respond negatively to the landscape pattern. The results of the study could provide a basis for agricultural, suburban, and urban planning. PMID:25251460
Ferreira, Fábio S; Pereira, João M S; Duarte, João V; Castelo-Branco, Miguel
2017-01-01
Although voxel based morphometry studies are still the standard for analyzing brain structure, their dependence on massive univariate inferential methods is a limiting factor. A better understanding of brain pathologies can be achieved by applying inferential multivariate methods, which allow the study of multiple dependent variables, e.g. different imaging modalities of the same subject. Given the widespread use of SPM software in the brain imaging community, the main aim of this work is the implementation of massive multivariate inferential analysis as a toolbox in this software package, applied to T1 and T2 structural data from diabetic patients and controls. This implementation was compared with the traditional ANCOVA in SPM and a similar multivariate GLM toolbox (MRM). We implemented the new toolbox and tested it by investigating brain alterations on a cohort of twenty-eight type 2 diabetes patients and twenty-six matched healthy controls, using information from both T1 and T2 weighted structural MRI scans, both separately - using standard univariate VBM - and simultaneously, with multivariate analyses. Univariate VBM replicated predominantly bilateral changes in basal ganglia and insular regions in type 2 diabetes patients. On the other hand, multivariate analyses replicated key findings of univariate results, while also revealing the thalami as additional foci of pathology. While the presented algorithm must be further optimized, the proposed toolbox is the first implementation of multivariate statistics in SPM8 as a user-friendly toolbox, which shows great potential and is ready to be validated in other clinical cohorts and modalities.
Tailored multivariate analysis for modulated enhanced diffraction
DOE Office of Scientific and Technical Information (OSTI.GOV)
Caliandro, Rocco; Guccione, Pietro; Nico, Giovanni
2015-10-21
Modulated enhanced diffraction (MED) is a technique allowing the dynamic structural characterization of crystalline materials subjected to an external stimulus, which is particularly suited for in situ and operando structural investigations at synchrotron sources. Contributions from the (active) part of the crystal system that varies synchronously with the stimulus can be extracted by an offline analysis, which can only be applied in the case of periodic stimuli and linear system responses. In this paper a new decomposition approach based on multivariate analysis is proposed. The standard principal component analysis (PCA) is adapted to treat MED data: specific figures of merit based on their scores and loadings are found, and the directions of the principal components obtained by PCA are modified to maximize such figures of merit. As a result, a general method to decompose MED data, called optimum constrained components rotation (OCCR), is developed, which produces very precise results on simulated data, even in the case of nonperiodic stimuli and/or nonlinear responses. The multivariate analysis approach is able to supply in one shot both the diffraction pattern related to the active atoms (through the OCCR loadings) and the time dependence of the system response (through the OCCR scores). When applied to real data, OCCR was able to supply only the latter information, as the former was hindered by changes in abundances of different crystal phases, which occurred besides structural variations in the specific case considered. To develop a decomposition procedure able to cope with this combined effect represents the next challenge in MED analysis.
Wang, Li; Wang, Xiaoyi; Jin, Xuebo; Xu, Jiping; Zhang, Huiyan; Yu, Jiabin; Sun, Qian; Gao, Chong; Wang, Lingbin
2017-03-01
The formation process of algae is described inaccurately and water blooms are predicted with a low precision by current methods. In this paper, chemical mechanism of algae growth is analyzed, and a correlation analysis of chlorophyll-a and algal density is conducted by chemical measurement. Taking into account the influence of multi-factors on algae growth and water blooms, the comprehensive prediction method combined with multivariate time series and intelligent model is put forward in this paper. Firstly, through the process of photosynthesis, the main factors that affect the reproduction of the algae are analyzed. A compensation prediction method of multivariate time series analysis based on neural network and Support Vector Machine has been put forward which is combined with Kernel Principal Component Analysis to deal with dimension reduction of the influence factors of blooms. Then, Genetic Algorithm is applied to improve the generalization ability of the BP network and Least Squares Support Vector Machine. Experimental results show that this method could better compensate the prediction model of multivariate time series analysis which is an effective way to improve the description accuracy of algae growth and prediction precision of water blooms.
Li, Min; Zhang, Lu; Yao, Xiaolong; Jiang, Xingyu
2017-01-01
The emerging membrane introduction mass spectrometry technique has been successfully used to detect benzene, toluene, ethyl benzene and xylene (BTEX), while overlapped spectra have unfortunately hindered its further application to the analysis of mixtures. Multivariate calibration, an efficient method to analyze mixtures, has been widely applied. In this paper, we compared univariate and multivariate analyses for quantification of the individual components of mixture samples. The results showed that the univariate analysis creates poor models with regression coefficients of 0.912, 0.867, 0.440 and 0.351 for BTEX, respectively. For multivariate analysis, a comparison to the partial-least squares (PLS) model shows that the orthogonal partial-least squares (OPLS) regression exhibits an optimal performance with regression coefficients of 0.995, 0.999, 0.980 and 0.976, favorable calibration parameters (RMSEC and RMSECV) and a favorable validation parameter (RMSEP). Furthermore, the OPLS exhibits a good recovery of 73.86 - 122.20% and relative standard deviation (RSD) of the repeatability of 1.14 - 4.87%. Thus, MIMS coupled with the OPLS regression provides an optimal approach for a quantitative BTEX mixture analysis in monitoring and predicting water pollution.
Mapping Informative Clusters in a Hierarchical Framework of fMRI Multivariate Analysis
Xu, Rui; Zhen, Zonglei; Liu, Jia
2010-01-01
Pattern recognition methods have become increasingly popular in fMRI data analysis, which are powerful in discriminating between multi-voxel patterns of brain activities associated with different mental states. However, when they are used in functional brain mapping, the location of discriminative voxels varies significantly, raising difficulties in interpreting the locus of the effect. Here we proposed a hierarchical framework of multivariate approach that maps informative clusters rather than voxels to achieve reliable functional brain mapping without compromising the discriminative power. In particular, we first searched for local homogeneous clusters that consisted of voxels with similar response profiles. Then, a multi-voxel classifier was built for each cluster to extract discriminative information from the multi-voxel patterns. Finally, through multivariate ranking, outputs from the classifiers were served as a multi-cluster pattern to identify informative clusters by examining interactions among clusters. Results from both simulated and real fMRI data demonstrated that this hierarchical approach showed better performance in the robustness of functional brain mapping than traditional voxel-based multivariate methods. In addition, the mapped clusters were highly overlapped for two perceptually equivalent object categories, further confirming the validity of our approach. In short, the hierarchical framework of multivariate approach is suitable for both pattern classification and brain mapping in fMRI studies. PMID:21152081
Multivariate Analysis and Machine Learning in Cerebral Palsy Research
Zhang, Jing
2017-01-01
Cerebral palsy (CP), a common pediatric movement disorder, causes the most severe physical disability in children. Early diagnosis in high-risk infants is critical for early intervention and possible early recovery. In recent years, multivariate analytic and machine learning (ML) approaches have been increasingly used in CP research. This paper aims to identify such multivariate studies and provide an overview of this relatively young field. Studies reviewed in this paper have demonstrated that multivariate analytic methods are useful in identification of risk factors, detection of CP, movement assessment for CP prediction, and outcome assessment, and ML approaches have made it possible to automatically identify movement impairments in high-risk infants. In addition, outcome predictors for surgical treatments have been identified by multivariate outcome studies. To make the multivariate and ML approaches useful in clinical settings, further research with large samples is needed to verify and improve these multivariate methods in risk factor identification, CP detection, movement assessment, and outcome evaluation or prediction. As multivariate analysis, ML and data processing technologies advance in the era of Big Data of this century, it is expected that multivariate analysis and ML will play a bigger role in improving the diagnosis and treatment of CP to reduce mortality and morbidity rates, and enhance patient care for children with CP. PMID:29312134
Huang, Jun; Kaul, Goldi; Cai, Chunsheng; Chatlapalli, Ramarao; Hernandez-Abad, Pedro; Ghosh, Krishnendu; Nagi, Arwinder
2009-12-01
To facilitate an in-depth process understanding, and offer opportunities for developing control strategies to ensure product quality, a combination of experimental design, optimization and multivariate techniques was integrated into the process development of a drug product. A process DOE was used to evaluate effects of the design factors on manufacturability and final product CQAs, and establish design space to ensure desired CQAs. Two types of analyses were performed to extract maximal information, DOE effect & response surface analysis and multivariate analysis (PCA and PLS). The DOE effect analysis was used to evaluate the interactions and effects of three design factors (water amount, wet massing time and lubrication time), on response variables (blend flow, compressibility and tablet dissolution). The design space was established by the combined use of DOE, optimization and multivariate analysis to ensure desired CQAs. Multivariate analysis of all variables from the DOE batches was conducted to study relationships between the variables and to evaluate the impact of material attributes/process parameters on manufacturability and final product CQAs. The integrated multivariate approach exemplifies application of QbD principles and tools to drug product and process development.
A simple prognostic model for overall survival in metastatic renal cell carcinoma
Assi, Hazem I.; Patenaude, Francois; Toumishey, Ethan; Ross, Laura; Abdelsalam, Mahmoud; Reiman, Tony
2016-01-01
Introduction: The primary purpose of this study was to develop a simpler prognostic model to predict overall survival for patients treated for metastatic renal cell carcinoma (mRCC) by examining variables shown in the literature to be associated with survival. Methods: We conducted a retrospective analysis of patients treated for mRCC at two Canadian centres. All patients who started first-line treatment were included in the analysis. A multivariate Cox proportional hazards regression model was constructed using a stepwise procedure. Patients were assigned to risk groups depending on how many of the three risk factors from the final multivariate model they had. Results: There were three risk factors in the final multivariate model: hemoglobin, prior nephrectomy, and time from diagnosis to treatment. Patients in the high-risk group (two or three risk factors) had a median survival of 5.9 months, while those in the intermediate-risk group (one risk factor) had a median survival of 16.2 months, and those in the low-risk group (no risk factors) had a median survival of 50.6 months. Conclusions: In multivariate analysis, shorter survival times were associated with hemoglobin below the lower limit of normal, absence of prior nephrectomy, and initiation of treatment within one year of diagnosis. PMID:27217858
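The risk-group assignment described in the abstract above can be sketched as a simple count of risk factors. This is a minimal illustration and not the authors' code: the function and parameter names are hypothetical, and treating "within one year" as strictly less than 1.0 year is an assumption. The median survival figures are the ones reported in the abstract.

```python
# Median overall survival (months) per risk group, as reported in the abstract.
MEDIAN_SURVIVAL_MONTHS = {"low": 50.6, "intermediate": 16.2, "high": 5.9}


def count_risk_factors(hemoglobin_below_lln: bool,
                       prior_nephrectomy: bool,
                       years_diagnosis_to_treatment: float) -> int:
    """Count how many of the three risk factors a patient has (hypothetical helper)."""
    return sum([
        hemoglobin_below_lln,                 # hemoglobin below lower limit of normal
        not prior_nephrectomy,                # absence of prior nephrectomy
        years_diagnosis_to_treatment < 1.0,   # assumed cut-off: treated within one year
    ])


def risk_group(n_risk_factors: int) -> str:
    """Map a risk-factor count (0 to 3) to the group named in the abstract."""
    if not 0 <= n_risk_factors <= 3:
        raise ValueError("expected between 0 and 3 risk factors")
    if n_risk_factors == 0:
        return "low"
    if n_risk_factors == 1:
        return "intermediate"
    return "high"
```

For example, a patient with low hemoglobin, a prior nephrectomy and treatment started two years after diagnosis has one risk factor and falls in the intermediate group.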
Analysis of Forest Foliage Using a Multivariate Mixture Model
NASA Technical Reports Server (NTRS)
Hlavka, C. A.; Peterson, David L.; Johnson, L. F.; Ganapol, B.
1997-01-01
Data with wet chemical measurements and near infrared spectra of ground leaf samples were analyzed to test a multivariate regression technique for estimating component spectra which is based on a linear mixture model for absorbance. The resulting unmixed spectra for carbohydrates, lignin, and protein resemble the spectra of extracted plant starches, cellulose, lignin, and protein. The unmixed protein spectrum has prominent absorption spectra at wavelengths which have been associated with nitrogen bonds.
Goicoechea, H C; Olivieri, A C
1999-08-01
The use of multivariate spectrophotometric calibration is presented for the simultaneous determination of the active components of tablets used in the treatment of pulmonary tuberculosis. The resolution of ternary mixtures of rifampicin, isoniazid and pyrazinamide has been accomplished by using partial least squares (PLS-1) regression analysis. Although the components show an important degree of spectral overlap, they have been simultaneously determined with high accuracy and precision, rapidly and with no need of nonaqueous solvents for dissolving the samples. No interference has been observed from the tablet excipients. A comparison is presented with the related multivariate method of classical least squares (CLS) analysis, which is shown to yield less reliable results due to the severe spectral overlap among the studied compounds. This is highlighted in the case of isoniazid, due to the small absorbances measured for this component.
Estimating an Effect Size in One-Way Multivariate Analysis of Variance (MANOVA)
ERIC Educational Resources Information Center
Steyn, H. S., Jr.; Ellis, S. M.
2009-01-01
When two or more univariate population means are compared, the proportion of variation in the dependent variable accounted for by population group membership is eta-squared. This effect size can be generalized by using multivariate measures of association, based on the multivariate analysis of variance (MANOVA) statistics, to establish whether…
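For the univariate case described in the abstract above, eta-squared is the between-group sum of squares divided by the total sum of squares. The following minimal sketch computes it for a one-way layout; the function name and the example data are illustrative assumptions, and the multivariate generalization discussed in the article is not attempted here.

```python
def eta_squared(groups):
    """Univariate eta-squared: SS_between / SS_total for a one-way layout.

    groups is a list of lists, one list of observations per group.
    """
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    if ss_total == 0:
        raise ValueError("no variation in the data; eta-squared is undefined")
    ss_between = sum(
        len(g) * ((sum(g) / len(g)) - grand_mean) ** 2 for g in groups
    )
    return ss_between / ss_total


# Two small illustrative groups (made-up numbers, not data from the article).
example = eta_squared([[1, 2, 3], [4, 5, 6]])
```

In this made-up example the group means are 2 and 5 around a grand mean of 3.5, so SS_between is 13.5 and SS_total is 17.5.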
Generating Virtual Patients by Multivariate and Discrete Re-Sampling Techniques.
Teutonico, D; Musuamba, F; Maas, H J; Facius, A; Yang, S; Danhof, M; Della Pasqua, O
2015-10-01
Clinical Trial Simulations (CTS) are a valuable tool for decision-making during drug development. However, to obtain realistic simulation scenarios, the patients included in the CTS must be representative of the target population. This is particularly important when covariate effects exist that may affect the outcome of a trial. The objective of our investigation was to evaluate and compare CTS results using re-sampling from a population pool and multivariate distributions to simulate patient covariates. COPD was selected as paradigm disease for the purposes of our analysis, FEV1 was used as response measure and the effects of a hypothetical intervention were evaluated in different populations in order to assess the predictive performance of the two methods. Our results show that the multivariate distribution method produces realistic covariate correlations, comparable to the real population. Moreover, it allows simulation of patient characteristics beyond the limits of inclusion and exclusion criteria in historical protocols. Both methods, discrete re-sampling and multivariate distribution, generate realistic pools of virtual patients. However, the use of a multivariate distribution enables more flexible simulation scenarios, since it is not necessarily bound to the existing covariate combinations in the available clinical data sets.
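The contrast between the two approaches in the abstract above can be sketched in a few lines. This is an illustrative sketch only: the two covariates (age and baseline FEV1), the bivariate normal form, and all names and numbers are assumptions, not details taken from the study.

```python
import math
import random


def resample_patients(pool, n, rng):
    """Discrete re-sampling: draw whole patients, with replacement, from a pool.

    Every virtual patient is a covariate combination that already exists.
    """
    return [rng.choice(pool) for _ in range(n)]


def sample_bivariate_normal(mean_a, sd_a, mean_b, sd_b, rho, n, rng):
    """Multivariate-distribution approach for two correlated covariates.

    Uses the Cholesky factor of a 2x2 correlation matrix, so new covariate
    combinations can appear that are not present in any observed pool.
    """
    out = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rng.gauss(0.0, 1.0)
        a = mean_a + sd_a * z1
        b = mean_b + sd_b * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        out.append((a, b))
    return out


rng = random.Random(0)  # fixed seed so the sketch is reproducible
pool = [(60, 1.5), (65, 1.2), (70, 1.0)]  # made-up (age, FEV1 in litres) pairs
virtual_a = resample_patients(pool, 5, rng)
virtual_b = sample_bivariate_normal(65.0, 5.0, 1.2, 0.3, -0.5, 5, rng)
```

The re-sampled patients preserve observed covariate combinations exactly, while the distribution-based patients can fall outside the observed range, which mirrors the flexibility argument made in the abstract.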
Dangers in Using Analysis of Covariance Procedures.
ERIC Educational Resources Information Center
Campbell, Kathleen T.
Problems associated with the use of analysis of covariance (ANCOVA) as a statistical control technique are explained. Three problems relate to the use of "OVA" methods (analysis of variance, analysis of covariance, multivariate analysis of variance, and multivariate analysis of covariance) in general. These are: (1) the wasting of information when…
NASA Astrophysics Data System (ADS)
Song, Biao; Lu, Dan; Peng, Ming; Li, Xia; Zou, Ye; Huang, Meizhen; Lu, Feng
2017-02-01
Raman spectroscopy is developed as a fast and non-destructive method for the discrimination and classification of hydroxypropyl methyl cellulose (HPMC) samples. 44 E series and 41 K series HPMC samples are measured by a self-developed portable Raman spectrometer (Hx-Raman), which is excited by a 785 nm diode laser and covers a spectral range of 200-2700 cm⁻¹ with a resolution (FWHM) of 6 cm⁻¹. Multivariate analysis is applied for discrimination of E series from K series. By methods of principal components analysis (PCA) and Fisher discriminant analysis (FDA), a discrimination result with sensitivity of 90.91% and specificity of 95.12% is achieved. The corresponding receiver operating characteristic (ROC) is 0.99, indicating the accuracy of the predictive model. This result demonstrates the prospect of portable Raman spectrometer for rapid, non-destructive classification and discrimination of E series and K series samples of HPMC.
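As a worked check of how such sensitivity and specificity figures arise, the sketch below computes both from confusion-matrix counts. The abstract does not report the confusion matrix; the counts used here (E series treated as the positive class, 40 of 44 E samples and 39 of 41 K samples classified correctly) are an assumption chosen only because they are consistent with the reported percentages.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of actual positives that are classified as positive."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Fraction of actual negatives that are classified as negative."""
    return true_neg / (true_neg + false_pos)

# Hypothetical counts (not reported in the abstract).
sens_pct = round(100 * sensitivity(true_pos=40, false_neg=4), 2)
spec_pct = round(100 * specificity(true_neg=39, false_pos=2), 2)
print(sens_pct, spec_pct)  # 90.91 95.12
```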
Domingo-Almenara, Xavier; Perera, Alexandre; Brezmes, Jesus
2016-11-25
Gas chromatography-mass spectrometry (GC-MS) produces large and complex datasets characterized by compounds that are co-eluted and at trace levels, and by a distinct compound ion-redundancy as a result of the high fragmentation by electron impact ionization. Compounds in GC-MS can be resolved by taking advantage of the multivariate nature of GC-MS data by applying multivariate resolution methods. However, multivariate methods have to be applied in small regions of the chromatogram, and therefore chromatograms are segmented prior to the application of the algorithms. The automation of this segmentation process is a challenging task, as it implies separating informative data from noise in the chromatogram. This study demonstrates the capabilities of independent component analysis-orthogonal signal deconvolution (ICA-OSD) and multivariate curve resolution-alternating least squares (MCR-ALS) with an overlapping moving window implementation to avoid the typical hard chromatographic segmentation. Also, after being resolved, compounds are aligned across samples by an automated alignment algorithm. We evaluated the proposed methods through a quantitative analysis of GC-qTOF MS data from 25 serum samples. The quantitative performance of both moving window ICA-OSD and MCR-ALS-based implementations was compared with the quantification of 33 compounds by the XCMS package. Results showed that most of the R² coefficients of determination exhibited a high correlation (R² > 0.90) in both ICA-OSD and MCR-ALS moving window-based approaches. Copyright © 2016 Elsevier B.V. All rights reserved.
2013-01-01
Background The sea louse Lepeophtheirus salmonis is the most important ectoparasite of farmed Atlantic salmon (Salmo salar) in Norwegian aquaculture. Control of sea lice is primarily dependent on the use of delousing chemotherapeutants, which are both expensive and toxic to other wildlife. The method most commonly used for monitoring treatment effectiveness relies on measuring the percentage reduction in the mobile stages of Lepeophtheirus salmonis only. However, this does not account for changes in the other sea lice stages and may result in misleading or incomplete interpretation regarding the effectiveness of treatment. With the aim of improving the evaluation of delousing treatments, we explored multivariate analyses of bath treatments using the topical pyrethroid, cypermethrin, in salmon pens at five Norwegian production sites. Results Conventional univariate analysis indicated reductions of over 90% in mobile stages at all sites. In contrast, multivariate analyses indicated differing treatment effectiveness between sites (p-value < 0.01) based on changes in the proportion and abundance of the chalimus and PAAM (pre-adult and adult males) stages. Low water temperatures and shortened intervals between sampling after treatment may account for the differences in the composition of chalimus and PAAM stage groups following treatment. Using multivariate analysis, such factors could be separated from those which were attributable to inadequate treatment or chemotherapeutant failure. Conclusions Multivariate analyses for evaluation of treatment effectiveness against multiple life cycle stages of L. salmonis yield additional information beyond that derivable from univariate methods. This can aid in the identification of causes of apparent treatment failure in salmon aquaculture. PMID:24354936
Pleiotropy Analysis of Quantitative Traits at Gene Level by Multivariate Functional Linear Models
Wang, Yifan; Liu, Aiyi; Mills, James L.; Boehnke, Michael; Wilson, Alexander F.; Bailey-Wilson, Joan E.; Xiong, Momiao; Wu, Colin O.; Fan, Ruzong
2015-01-01
In genetics, pleiotropy describes the genetic effect of a single gene on multiple phenotypic traits. A common approach is to analyze the phenotypic traits separately using univariate analyses and combine the test results through multiple comparisons. This approach may lead to low power. Multivariate functional linear models are developed to connect genetic variant data to multiple quantitative traits adjusting for covariates for a unified analysis. Three types of approximate F-distribution tests based on Pillai–Bartlett trace, Hotelling–Lawley trace, and Wilks’s Lambda are introduced to test for association between multiple quantitative traits and multiple genetic variants in one genetic region. The approximate F-distribution tests provide much more significant results than those of F-tests of univariate analysis and optimal sequence kernel association test (SKAT-O). Extensive simulations were performed to evaluate the false positive rates and power performance of the proposed models and tests. We show that the approximate F-distribution tests control the type I error rates very well. Overall, simultaneous analysis of multiple traits can increase power performance compared to an individual test of each trait. The proposed methods were applied to analyze (1) four lipid traits in eight European cohorts, and (2) three biochemical traits in the Trinity Students Study. The approximate F-distribution tests provide much more significant results than those of F-tests of univariate analysis and SKAT-O for the three biochemical traits. The approximate F-distribution tests of the proposed functional linear models are more sensitive than those of the traditional multivariate linear models that in turn are more sensitive than SKAT-O in the univariate case. The analysis of the four lipid traits and the three biochemical traits detects more association than SKAT-O in the univariate case. PMID:25809955
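For reference, the three multivariate test statistics named above can be written in standard MANOVA notation. The symbols are not taken from the paper and are assumed here: $H$ is the hypothesis sum-of-squares-and-cross-products matrix, $E$ is the error matrix, and $\lambda_1, \dots, \lambda_s$ are the nonzero eigenvalues of $E^{-1}H$. Each statistic is then converted to an approximate F statistic.

```latex
\begin{align}
  \text{Wilks's Lambda:} \quad
    \Lambda &= \frac{\det(E)}{\det(H + E)} = \prod_{i=1}^{s} \frac{1}{1 + \lambda_i}, \\
  \text{Pillai-Bartlett trace:} \quad
    V &= \operatorname{tr}\!\left[ H (H + E)^{-1} \right] = \sum_{i=1}^{s} \frac{\lambda_i}{1 + \lambda_i}, \\
  \text{Hotelling-Lawley trace:} \quad
    U &= \operatorname{tr}\!\left[ H E^{-1} \right] = \sum_{i=1}^{s} \lambda_i .
\end{align}
```

Small values of $\Lambda$ and large values of $V$ or $U$ indicate association between the set of traits and the genetic variants.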
Pleiotropy analysis of quantitative traits at gene level by multivariate functional linear models.
Wang, Yifan; Liu, Aiyi; Mills, James L; Boehnke, Michael; Wilson, Alexander F; Bailey-Wilson, Joan E; Xiong, Momiao; Wu, Colin O; Fan, Ruzong
2015-05-01
In genetics, pleiotropy describes the genetic effect of a single gene on multiple phenotypic traits. A common approach is to analyze the phenotypic traits separately using univariate analyses and combine the test results through multiple comparisons. This approach may lead to low power. Multivariate functional linear models are developed to connect genetic variant data to multiple quantitative traits adjusting for covariates for a unified analysis. Three types of approximate F-distribution tests based on Pillai-Bartlett trace, Hotelling-Lawley trace, and Wilks's Lambda are introduced to test for association between multiple quantitative traits and multiple genetic variants in one genetic region. The approximate F-distribution tests provide much more significant results than those of F-tests of univariate analysis and optimal sequence kernel association test (SKAT-O). Extensive simulations were performed to evaluate the false positive rates and power performance of the proposed models and tests. We show that the approximate F-distribution tests control the type I error rates very well. Overall, simultaneous analysis of multiple traits can increase power performance compared to an individual test of each trait. The proposed methods were applied to analyze (1) four lipid traits in eight European cohorts, and (2) three biochemical traits in the Trinity Students Study. The approximate F-distribution tests provide much more significant results than those of F-tests of univariate analysis and SKAT-O for the three biochemical traits. The approximate F-distribution tests of the proposed functional linear models are more sensitive than those of the traditional multivariate linear models that in turn are more sensitive than SKAT-O in the univariate case. The analysis of the four lipid traits and the three biochemical traits detects more association than SKAT-O in the univariate case. © 2015 WILEY PERIODICALS, INC.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fietkau, Rainer; Roedel, Claus; Hohenberger, Werner
2007-03-15
Purpose: The impact of the delivery of radiotherapy (RT) on treatment results in rectal cancer patients is unknown. Methods and Materials: The data from 788 patients with rectal cancer treated within the German CAO/AIO/ARO-94 phase III trial were analyzed concerning the impact of the delivery of RT (adequate RT: minimal radiation RT dose delivered, 4300 cGy for neoadjuvant RT or 4700 cGy for adjuvant RT; completion of RT in <44 days for neoadjuvant RT or <49 days for adjuvant RT) in different centers on the locoregional recurrence rate (LRR) and disease-free survival (DFS) at 5 years. The LRR, DFS, and delivery of RT were analyzed as endpoints in multivariate analysis. Results: A significant difference was found between the centers and the delivery of RT. The overall delivery of RT was a prognostic factor for the LRR (no RT, 29.6% ± 7.8%; inadequate RT, 21.2% ± 5.6%; adequate RT, 6.8% ± 1.4%; p = 0.0001) and DFS (no RT, 55.1% ± 9.1%; inadequate RT, 57.4% ± 6.3%; adequate RT, 69.1% ± 2.3%; p = 0.02). Postoperatively, delivery of RT was a prognostic factor for LRR on multivariate analysis (together with pathologic stage) but not for DFS (independent parameters, pathologic stage and age). Preoperatively, on multivariate analysis, pathologic stage, but not delivery of RT, was an independent prognostic parameter for LRR and DFS (together with adequate chemotherapy). On multivariate analysis, the treatment center, treatment schedule (neoadjuvant vs. adjuvant RT), and gender were prognostic parameters for adequate RT. Conclusion: Delivery of RT should be regarded as a prognostic factor for LRR in rectal cancer and is influenced by the treatment center, treatment schedule, and patient gender.
Fast classification of hazelnut cultivars through portable infrared spectroscopy and chemometrics
NASA Astrophysics Data System (ADS)
Manfredi, Marcello; Robotti, Elisa; Quasso, Fabio; Mazzucco, Eleonora; Calabrese, Giorgio; Marengo, Emilio
2018-01-01
The authentication and traceability of hazelnuts is very important for both the consumer and the food industry, to safeguard the protected varieties and the food quality. This study investigates the use of a portable FTIR spectrometer coupled to multivariate statistical analysis for the classification of raw hazelnuts. The method discriminates hazelnuts from different origins/cultivars based on differences of the signal intensities of their IR spectra. The multivariate classification methods, namely principal component analysis (PCA) followed by linear discriminant analysis (LDA) and partial least square discriminant analysis (PLS-DA), with or without variable selection, allowed a very good discrimination among the groups, with PLS-DA coupled to variable selection providing the best results. Due to the fast analysis, high sensitivity, simplicity and no sample preparation, the proposed analytical methodology could be successfully used to verify the cultivar of hazelnuts, and the analysis can be performed quickly and directly on site.
Analysis of laser printer and photocopier toners by spectral properties and chemometrics
NASA Astrophysics Data System (ADS)
Verma, Neha; Kumar, Raj; Sharma, Vishal
2018-05-01
The use of printers to generate falsified documents has become a common practice in today's world. The examination and identification of the printed matter in the suspected documents (civil or criminal cases) may provide important information about the authenticity of the document. In the present study, a total of 100 black toner samples from both laser printers and photocopiers were examined using diffuse reflectance UV-Vis spectroscopy. The present research is divided into two parts: visual discrimination and discrimination using multivariate analysis. A comparison between qualitative and quantitative analysis showed that multivariate analysis (principal component analysis) provides 99.59% pair-wise discriminating power for laser printer toners and 99.84% pair-wise discriminating power for photocopier toners. The overall results obtained confirm the applicability of UV-Vis spectroscopy and chemometrics in the nondestructive analysis of toner printed documents while enhancing their evidential value for forensic applications.
Yu, Hao; Dick, Andrew W
2012-01-01
Background Given the rapid growth of health care costs, some experts were concerned with erosion of employment-based private insurance (EBPI). This empirical analysis aims to quantify the concern. Methods Using the National Health Account, we generated a cost index to represent state-level annual cost growth. We merged it with the 1996–2003 Medical Expenditure Panel Survey. The unit of analysis is the family. We conducted both bivariate and multivariate logistic analyses. Results The bivariate analysis found a significant inverse association between the cost index and the proportion of families receiving an offer of EBPI. The multivariate analysis showed that the cost index was significantly negatively associated with the likelihood of receiving an EBPI offer for the entire sample and for families in the first, second, and third quartiles of income distribution. The cost index was also significantly negatively associated with the proportion of families with EBPI for the entire year for each family member (EBPI-EYEM). The multivariate analysis confirmed significance of the relationship for the entire sample, and for families in the second and third quartiles of income distribution. Among the families with EBPI-EYEM, there was a positive relationship between the cost index and this group's likelihood of having out-of-pocket expenditures exceeding 10 percent of family income. The multivariate analysis confirmed significance of the relationship for the entire group and for families in the second and third quartiles of income distribution. Conclusions Rising health costs reduce EBPI availability and enrollment, and the financial protection provided by it, especially for middle-class families. PMID:22417314
Schreier, Christina; Kremer, Werner; Huber, Fritz; Neumann, Sindy; Pagel, Philipp; Lienemann, Kai; Pestel, Sabine
2013-01-01
Introduction. Spectroscopic analysis of urine samples from laboratory animals can be used to predict the efficacy and side effects of drugs. This employs methods combining 1H NMR spectroscopy with quantification of biomarkers or with multivariate data analysis. The most critical steps in data evaluation are analytical reproducibility of NMR data (collection, storage, and processing) and the health status of the animals, which may influence urine pH and osmolarity. Methods. We treated rats with a solvent, a diuretic, or a nephrotoxicant and collected urine samples. Samples were titrated to pH 3 to 9, or salt concentrations increased up to 20-fold. The effects of storage conditions and freeze-thaw cycles were monitored. Selected metabolites and multivariate data analysis were evaluated after 1H NMR spectroscopy. Results. We showed that variation of pH from 3 to 9 and increases in osmolarity up to 6-fold had no effect on the quantification of the metabolites or on multivariate data analysis. Storage led to changes after 14 days at 4°C or after 12 months at −20°C, independent of sample composition. Multiple freeze-thaw cycles did not affect data analysis. Conclusion. Reproducibility of NMR measurements is not dependent on sample composition under physiological or pathological conditions. PMID:23865070
NASA Astrophysics Data System (ADS)
Hashim, Noor Haslinda Noor; Latip, Jalifah; Khatib, Alfi
2016-11-01
The metabolites of Clinacanthus nutans leaves extracts and their dependence on drying process were systematically characterized using 1H nuclear magnetic resonance spectroscopy (NMR) multivariate data analysis. Principal component analysis (PCA) and partial least square-discriminant analysis (PLS-DA) were able to distinguish the leaves extracts obtained from different drying methods. The identified metabolites were carbohydrates, amino acid, flavonoids and sulfur glucoside compounds. The major metabolites responsible for the separation in PLS-DA loading plots were lupeol, cycloclinacosides, betulin, cerebrosides and choline. The results showed that the combination of 1H NMR spectroscopy and multivariate data analyses could act as an efficient technique to understand the C. nutans composition and its variation.
ERIC Educational Resources Information Center
McFarland, Dennis J.
2014-01-01
Purpose: Factor analysis is a useful technique to aid in organizing multivariate data characterizing speech, language, and auditory abilities. However, knowledge of the limitations of factor analysis is essential for proper interpretation of results. The present study used simulated test scores to illustrate some characteristics of factor…
High Ki-67 Immunohistochemical Reactivity Correlates With Poor Prognosis in Bladder Carcinoma
Luo, Yihuan; Zhang, Xin; Mo, Meile; Tan, Zhong; Huang, Lanshan; Zhou, Hong; Wang, Chunqin; Wei, Fanglin; Qiu, Xiaohui; He, Rongquan; Chen, Gang
2016-01-01
Abstract Ki-67 is considered one of the prime biomarkers to reflect cell proliferation, and immunohistochemical Ki-67 staining has been widely applied in clinical pathology. To resolve the widespread controversy over whether Ki-67 reactivity significantly predicts clinical prognosis of bladder carcinoma (BC), we performed a comprehensive meta-analysis by combining results from different literature. A comprehensive search was conducted in the Chinese databases of WanFang, China National Knowledge Infrastructure and Chinese VIP as well as English databases of PubMed, ISI web of science, EMBASE, Science Direct, and Wiley online library. Independent studies linking Ki-67 to cancer-specific survival (CSS), disease-free survival (DFS), overall survival (OS), progression-free survival (PFS), and recurrence-free survival (RFS) were included in our meta-analysis. With the cut-off values the literature provided, hazard ratio (HR) values between the survival distributions were extracted and later combined with STATA 12.0. In total, 76 studies (n = 13,053 patients) were eligible for the meta-analysis. Both univariate and multivariate analyses for survival indicated that high Ki-67 reactivity significantly predicted poor prognosis. In the univariate analysis, the combined HR for CSS, DFS, OS, PFS, and RFS were 2.588 (95% confidence interval [CI]: 1.623–4.127, P < 0.001), 2.697 (95%CI: 1.874–3.883, P < 0.001), 2.649 (95%CI: 1.632–4.300, P < 0.001), 3.506 (95%CI: 2.231–5.508, P < 0.001), and 1.792 (95%CI: 1.409–2.279, P < 0.001), respectively. The pooled HR of multivariate analysis for CSS, DFS, OS, PFS, and RFS were 1.868 (95%CI: 1.343–2.597, P < 0.001), 2.626 (95%CI: 2.089–3.301, P < 0.001), 1.104 (95%CI: 1.008–1.209, P = 0.032), 1.518 (95%CI: 1.299–1.773, P < 0.001), and 1.294 (95%CI: 1.203–1.392, P < 0.001), respectively.
Subgroup analysis of univariate analysis by origin showed that Ki-67 reactivity significantly correlated with all 5 clinical outcomes in Asian and European-American patients (P < 0.05). For multivariate analysis, however, the pooled results were only significant for DFS, OS, and RFS in Asian patients, and for CSS, DFS, PFS, and RFS in European-American patients (P < 0.05). In the subgroup with low cut-off value (<20%), our meta-analysis indicated that high Ki-67 reactivity was significantly correlated with worsened CSS, DFS, OS, PFS, and RFS on univariate analysis (P < 0.05). For multivariate analysis, the meta-analysis of literature with low cut-off value (<20%) demonstrated that high Ki-67 reactivity predicted shorter DFS, PFS, and RFS in BC patients (P < 0.05). In the subgroup analysis of high cut-off value (≥20%), our meta-analysis indicated that high Ki-67 reactivity, in either univariate or multivariate analysis, significantly correlated with all five clinical outcomes in BC patients (P < 0.05). The meta-analysis indicates that high Ki-67 reactivity significantly correlates with deteriorated clinical outcomes in BC patients and that Ki-67 can be considered as an independent indicator for the prognosis by the meta-analyses of multivariate analysis. PMID:27082587
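Combining per-study hazard ratios into a pooled HR can be sketched as below. This is a minimal fixed-effect, inverse-variance version written for illustration: the abstract states only that HRs were combined with STATA 12.0 and does not say whether a fixed- or random-effects model was used, and the study values in the example are hypothetical. The standard error of each log HR is recovered from its 95% confidence interval.

```python
import math

Z95 = 1.959963984540054  # two-sided 95% normal quantile

def pool_hazard_ratios(studies):
    """Fixed-effect inverse-variance pooling on the log scale.

    studies: iterable of (hr, ci_low, ci_high) with 95% confidence limits.
    Returns (pooled_hr, pooled_ci_low, pooled_ci_high).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for hr, ci_low, ci_high in studies:
        # The CI is symmetric on the log scale: width = 2 * Z95 * SE.
        se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z95)
        weight = 1.0 / se ** 2
        weighted_sum += weight * math.log(hr)
        weight_total += weight
    pooled_log = weighted_sum / weight_total
    pooled_se = math.sqrt(1.0 / weight_total)
    return (math.exp(pooled_log),
            math.exp(pooled_log - Z95 * pooled_se),
            math.exp(pooled_log + Z95 * pooled_se))

# Hypothetical per-study results, not taken from the meta-analysis.
example = [(2.0, 1.0, 4.0), (1.5, 1.1, 2.05), (3.0, 1.2, 7.5)]
print(pool_hazard_ratios(example))
```

Studies with narrower confidence intervals receive larger weights, so the pooled estimate is pulled towards the most precise studies.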
Rotorcraft flying qualities improvement using advanced control
NASA Technical Reports Server (NTRS)
Walker, D.; Postlethwaite, I.; Howitt, J.; Foster, N.
1993-01-01
We report on recent experience gained when a multivariable helicopter flight control law was tested on the Large Motion Simulator (LMS) at DRA Bedford. This was part of a study into the application of multivariable control theory to the design of full-authority flight control systems for high-performance helicopters. In this paper, we present some of the results that were obtained during the piloted simulation trial and from subsequent off-line simulation and analysis. The performance provided by the control law led to level 1 handling quality ratings for almost all of the mission task elements assessed, both during the real-time and off-line analysis.
Using redundancy analysis (RDA) and canonical correlation analysis (CCA), we assessed relationships between chemical and physical characteristics and periphyton at 105 stream sites sampled by REMAP in the mineral belt of the southern Rockies ecoregion in Colorado. We contrasted results ob...
Corvucci, Francesca; Nobili, Lara; Melucci, Dora; Grillenzoni, Francesca-Vittoria
2015-02-15
Honey traceability to food quality is required by consumers and food control institutions. Melissopalynologists traditionally use percentages of nectariferous pollens to discriminate the botanical origin and the entire pollen spectrum (presence/absence, type and quantities and association of some pollen types) to determine the geographical origin of honeys. To improve melissopalynological routine analysis, principal components analysis (PCA) was used. A remarkable and innovative result was that the most significant pollens for the traditional discrimination of the botanical and geographical origin of honeys were the same as those identified by the chemometric model. The reliability of assignments of samples to honey classes was estimated through explained variance (85%). This confirms that the chemometric model properly describes the melissopalynological data. With the aim of improving honey discrimination, FT-microRaman spectrography and multivariate analysis were also applied. Well-performing PCA models and good agreement with known classes were achieved. Encouraging results were obtained for botanical discrimination. Copyright © 2014 Elsevier Ltd. All rights reserved.
Lapolla, Annunziata; Ragazzi, Eugenio; Andretta, Barbara; Fedele, Domenico; Tubaro, Michela; Seraglia, Roberta; Molin, Laura; Traldi, Pietro
2007-06-01
To clarify the possible pathogenetic role of oxidation products originating from the glycation of proteins, human globins from nephropathic patients have been studied by matrix-assisted laser desorption/ionization mass spectrometry (MALDI), revealing not only unglycated and monoglycated globins, but also a series of different species. For the latter, structural assignments were made tentatively on the basis of observed masses and expectations for the Maillard reaction pattern. Consequently, they must be considered only tentative, and the discussion that follows must be viewed in this light. In our opinion this approach does not seem to compromise the intended diagnostic use of the data because distinctions are valid even if the assignments are uncertain. We studied nine healthy subjects and 19 nephropathic patients and processed the data obtained from the MALDI spectra using a multivariate analysis. Our results showed that multivariate analytical techniques enable differential aspects of the profile of molecular species to be identified in the blood of end stage nephropathic patients. A correct grouping can be achieved by principal component analysis (PCA) and the results suggest that several products involved in carbonyl stress exist in nephropathic patients. These compounds may have a relevant role as specific markers of the pathological state.
Multiple imputation for handling missing outcome data when estimating the relative risk.
Sullivan, Thomas R; Lee, Katherine J; Ryan, Philip; Salter, Amy B
2017-09-06
Multiple imputation is a popular approach to handling missing data in medical research, yet little is known about its applicability for estimating the relative risk. Standard methods for imputing incomplete binary outcomes involve logistic regression or an assumption of multivariate normality, whereas relative risks are typically estimated using log binomial models. It is unclear whether misspecification of the imputation model in this setting could lead to biased parameter estimates. Using simulated data, we evaluated the performance of multiple imputation for handling missing data prior to estimating adjusted relative risks from a correctly specified multivariable log binomial model. We considered an arbitrary pattern of missing data in both outcome and exposure variables, with missing data induced under missing at random mechanisms. Focusing on standard model-based methods of multiple imputation, missing data were imputed using multivariate normal imputation or fully conditional specification with a logistic imputation model for the outcome. Multivariate normal imputation performed poorly in the simulation study, consistently producing estimates of the relative risk that were biased towards the null. Despite outperforming multivariate normal imputation, fully conditional specification also produced somewhat biased estimates, with greater bias observed for higher outcome prevalences and larger relative risks. Deleting imputed outcomes from analysis datasets did not improve the performance of fully conditional specification. Both multivariate normal imputation and fully conditional specification produced biased estimates of the relative risk, presumably since both use a misspecified imputation model. Based on simulation results, we recommend researchers use fully conditional specification rather than multivariate normal imputation and retain imputed outcomes in the analysis when estimating relative risks. 
However fully conditional specification is not without its shortcomings, and so further research is needed to identify optimal approaches for relative risk estimation within the multiple imputation framework.
Cichonska, Anna; Rousu, Juho; Marttinen, Pekka; Kangas, Antti J; Soininen, Pasi; Lehtimäki, Terho; Raitakari, Olli T; Järvelin, Marjo-Riitta; Salomaa, Veikko; Ala-Korpela, Mika; Ripatti, Samuli; Pirinen, Matti
2016-07-01
A dominant approach to genetic association studies is to perform univariate tests between genotype-phenotype pairs. However, analyzing related traits together increases statistical power, and certain complex associations become detectable only when several variants are tested jointly. Currently, modest sample sizes of individual cohorts, and restricted availability of individual-level genotype-phenotype data across the cohorts limit conducting multivariate tests. We introduce metaCCA, a computational framework for summary statistics-based analysis of a single or multiple studies that allows multivariate representation of both genotype and phenotype. It extends the statistical technique of canonical correlation analysis to the setting where original individual-level records are not available, and employs a covariance shrinkage algorithm to achieve robustness. Multivariate meta-analysis of two Finnish studies of nuclear magnetic resonance metabolomics by metaCCA, using standard univariate output from the program SNPTEST, shows an excellent agreement with the pooled individual-level analysis of original data. Motivated by strong multivariate signals in the lipid genes tested, we envision that multivariate association testing using metaCCA has a great potential to provide novel insights from already published summary statistics from high-throughput phenotyping technologies. Code is available at https://github.com/aalto-ics-kepaco. Contact: anna.cichonska@helsinki.fi or matti.pirinen@helsinki.fi. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
Jåstad, Eirik O; Torheim, Turid; Villeneuve, Kathleen M; Kvaal, Knut; Hole, Eli O; Sagstuen, Einar; Malinen, Eirik; Futsaether, Cecilia M
2017-09-28
The amino acid l-α-alanine is the most commonly used material for solid-state electron paramagnetic resonance (EPR) dosimetry, due to the formation of highly stable radicals upon irradiation, with yields proportional to the radiation dose. Two major alanine radical components designated R1 and R2 have previously been uniquely characterized from EPR and electron-nuclear double resonance (ENDOR) studies as well as from quantum chemical calculations. There is also convincing experimental evidence of a third minor radical component R3, and a tentative radical structure has been suggested, even though no well-defined spectral signature has been observed experimentally. In the present study, temperature dependent EPR spectra of X-ray irradiated polycrystalline alanine were analyzed using five multivariate methods in further attempts to understand the composite nature of the alanine dosimeter EPR spectrum. Principal component analysis (PCA), maximum likelihood common factor analysis (MLCFA), independent component analysis (ICA), self-modeling mixture analysis (SMA), and multivariate curve resolution (MCR) were used to extract pure radical spectra and their fractional contributions from the experimental EPR spectra. All methods yielded spectral estimates resembling the established R1 spectrum. Furthermore, SMA and MCR consistently predicted both the established R2 spectrum and the shape of the R3 spectrum. The predicted shape of the R3 spectrum corresponded well with the proposed tentative spectrum derived from spectrum simulations. Thus, results from two independent multivariate data analysis techniques strongly support the previous evidence that three radicals are indeed present in irradiated alanine samples.
Risk factors for baclofen pump infection in children: a multivariate analysis.
Spader, Heather S; Bollo, Robert J; Bowers, Christian A; Riva-Cambrin, Jay
2016-06-01
OBJECTIVE Intrathecal baclofen infusion systems to manage severe spasticity and dystonia are associated with higher infection rates in children than in adults. Factors unique to this population, such as poor nutrition and physical limitations for pump placement, have been hypothesized as the reasons for this disparity. The authors assessed potential risk factors for infection in a multivariate analysis. METHODS Patients who underwent implantation of a programmable pump and intrathecal catheter for baclofen infusion at a single center between January 1, 2000, and March 1, 2012, were identified in this retrospective cohort study. The primary end point was infection. Potential risk factors investigated included preoperative (i.e., demographics, body mass index [BMI], gastrostomy tube, tracheostomy, previous spinal fusion), intraoperative (i.e., surgeon, antibiotics, pump size, catheter location), and postoperative (i.e., wound dehiscence, CSF leak, and number of revisions) factors. Univariate analysis was performed, and a multivariate logistic regression model was created to identify independent risk factors for infection. RESULTS A total of 254 patients were evaluated. The overall infection rate was 9.8%. Univariate analysis identified young age, shorter height, lower weight, dehiscence, CSF leak, and number of revisions within 6 months of pump placement as significantly associated with infection. Multivariate analysis identified young age, dehiscence, and number of revisions as independent risk factors for infection. CONCLUSIONS Young age, wound dehiscence, and number of revisions were independent risk factors for infection in this pediatric cohort. A low BMI and the presence of either a gastrostomy or tracheostomy were not associated with infection and may not be contraindications for this procedure.
Tourkmani, Abdo Karim; Sánchez-Huerta, Valeria; De Wit, Guillermo; Martínez, Jaime D; Mingo, David; Mahillo-Fernández, Ignacio; Jiménez-Alfaro, Ignacio
2017-01-01
To analyze the relationship between the score obtained in the Risk Score System (RSS) proposed by Hicks et al and penetrating keratoplasty (PKP) graft failure at 1y postoperatively, and between each factor in the RSS and the risk of PKP graft failure, using univariate and multivariate analysis. The retrospective cohort study had 152 PKPs from 152 patients. Eighteen cases were excluded from our study due to primary failure (10 cases), incomplete medical notes (5 cases) and follow-up less than 1y (3 cases). We included 134 PKPs from 134 patients stratified by preoperative risk score. Spearman coefficient was calculated for the relationship between the score obtained and risk of failure at 1y. Univariate and multivariate analyses were performed to assess the impact of every single risk factor included in the RSS on graft failure at 1y. Spearman coefficient showed a statistically significant correlation between the score in the RSS and graft failure (P<0.05). Multivariate logistic regression analysis showed no statistically significant relationship (P>0.05) between diagnosis or lens status and graft failure. The relationship between the other risk factors studied and graft failure was significant (P<0.05), although the results for previous grafts and graft failure were unreliable. None of our patients had a previous blood transfusion; thus, it had no impact. After the application of multivariate analysis techniques, some risk factors do not show the expected impact on graft failure at 1y.
Moseson, Heidi; Gerdts, Caitlin; Dehlendorf, Christine; Hiatt, Robert A; Vittinghoff, Eric
2017-12-21
The list experiment is a promising measurement tool for eliciting truthful responses to stigmatized or sensitive health behaviors. However, investigators may be hesitant to adopt the method due to previously untestable assumptions and the perceived inability to conduct multivariable analysis. With a recently developed statistical test that can detect the presence of a design effect - the absence of which is a central assumption of the list experiment method - we sought to test the validity of a list experiment conducted on self-reported abortion in Liberia. We also aim to introduce recently developed multivariable regression estimators for the analysis of list experiment data, to explore relationships between respondent characteristics and having had an abortion - an important component of understanding the experiences of women who have abortions. To test the null hypothesis of no design effect in the Liberian list experiment data, we calculated the percentage of each respondent "type," characterized by response to the control items, and compared these percentages across treatment and control groups with a Bonferroni-adjusted alpha criterion. We then implemented two least squares and two maximum likelihood models (four total), each representing different bias-variance trade-offs, to estimate the association between respondent characteristics and abortion. We find no clear evidence of a design effect in list experiment data from Liberia (p = 0.18), affirming the first key assumption of the method. Multivariable analyses suggest a negative association between education and history of abortion. The retrospective nature of measuring lifetime experience of abortion, however, complicates interpretation of results, as the timing and safety of a respondent's abortion may have influenced her ability to pursue an education. 
Our work demonstrates that multivariable analyses, as well as statistical testing of a key design assumption, are possible with list experiment data, although with important limitations when considering lifetime measures. We outline how to implement this methodology with list experiment data in future research.
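For orientation, the simplest list experiment estimator (not the multivariable regression estimators the abstract introduces) is a difference in mean item counts between the treatment group, which sees the sensitive item, and the control group, which does not. The sketch below is a minimal illustration under that assumption; the function name and the counts are hypothetical and are not taken from the Liberian data.

```python
from math import sqrt
from statistics import mean, variance


def list_experiment_estimate(control_counts, treatment_counts):
    """Difference-in-means prevalence estimate for a list experiment.

    control_counts: item counts from respondents shown only the control items.
    treatment_counts: item counts from respondents also shown the sensitive item.
    Returns (estimate, standard_error), using sample variances.
    """
    estimate = mean(treatment_counts) - mean(control_counts)
    se = sqrt(variance(treatment_counts) / len(treatment_counts)
              + variance(control_counts) / len(control_counts))
    return estimate, se


# Made-up illustrative counts (hypothetical, not study data).
control = [1, 2, 1, 2]
treatment = [2, 3, 2, 2]
est, se = list_experiment_estimate(control, treatment)
print(f"estimated prevalence: {est:.2f} (SE {se:.2f})")
```

The estimate is only interpretable as a prevalence if there is no design effect, which is the assumption the abstract tests.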
de Brito, Aila Riany; Santos Reis, Nadabe Dos; Silva, Tatielle Pereira; Ferreira Bonomo, Renata Cristina; Trovatti Uetanabaro, Ana Paula; de Assis, Sandra Aparecida; da Silva, Erik Galvão Paranhos; Aguiar-Oliveira, Elizama; Oliveira, Julieta Rangel; Franco, Marcelo
2017-11-26
Endoglucanase production by Aspergillus oryzae ATCC 10124 cultivated in rice husks or peanut shells was optimized by experimental design as a function of humidity, time, and temperature. The optimum temperature for endoglucanase activity was estimated by univariate analysis (one factor at a time) as 50°C (rice husks) and 60°C (peanut shells); however, multivariate analysis (synergism of factors) gave a different temperature (56°C) for endoglucanase from peanut shells. For the optimum pH, the values determined by univariate and multivariate analysis were 5 and 5.2 (rice husks) and 5 and 7.6 (peanut shells). In addition, the best half-lives were observed at 50°C: 22.8 hr (rice husks) and 7.3 hr (peanut shells). Also, 80% of residual activity was obtained between 30 and 50°C for both substrates, and the pH stability was improved at 5-7 (rice husks) and 6-9 (peanut shells). The two endoglucanases obtained presented different characteristics as a result of the versatility of fungi in different substrates.
Smyczek-Gargya, B; Volz, B; Geppert, M; Dietl, J
1997-01-01
Clinical and histological data of 168 patients with squamous cell carcinoma of the vulva were analyzed with respect to survival. 151 patients underwent surgery, 12 patients were treated with primary radiation and in 5 patients no treatment was performed. Follow-up lasted from at least 2 up to 22 years posttreatment. In univariate analysis, the following factors were highly significant: presurgery lymph node status, tumor infiltration beyond the vulva, tumor grading, histological inguinal lymph node status, pre- and postsurgery tumor stage, depth of invasion and tumor diameter. In the multivariate analysis (Cox regression), the most powerful factors were shown to be histological inguinal lymph node status, tumor diameter and tumor grading. The multivariate logistic regression analysis identified the following as the main prognostic factors for metastases of inguinal lymph nodes: presurgery inguinal lymph node status, tumor size, depth of invasion and tumor grading. Based on these results, tumor biology seems to be the decisive factor concerning recurrence and survival. Therefore, we suggest a more conservative treatment of vulvar carcinoma. Patients with carcinoma confined to the vulva, with a tumor diameter up to 3 cm and without clinically suspected lymph nodes, should be treated by wide excision/partial vulvectomy with ipsilateral lymphadenectomy.
Using sperm morphometry and multivariate analysis to differentiate species of gray Mazama
Duarte, José Maurício Barbanti
2016-01-01
There is genetic evidence that the two species of Brazilian gray Mazama, Mazama gouazoubira and Mazama nemorivaga, belong to different genera. This study identified significant differences that separated them into distinct groups, based on characteristics of the spermatozoa and ejaculate of both species. The characteristics that most clearly differentiated between the species were ejaculate colour, white for M. gouazoubira and reddish for M. nemorivaga, and sperm head dimensions. Multivariate analysis of sperm head dimension and format data accurately discriminated three groups for species with total percentage of misclassified of 0.71. The individual analysis, by animal, and the multivariate analysis have also discriminated correctly all five animals (total percentage of misclassified of 13.95%), and the canonical plot has shown three different clusters: Cluster 1, including individuals of M. nemorivaga; Cluster 2, including two individuals of M. gouazoubira; and Cluster 3, including a single individual of M. gouazoubira. The results obtained in this work corroborate the hypothesis of the formation of new genera and species for gray Mazama. Moreover, the easily applied method described herein can be used as an auxiliary tool to identify sibling species of other taxonomic groups. PMID:28018612
Falahati, Farshad; Westman, Eric; Simmons, Andrew
2014-01-01
Machine learning algorithms and multivariate data analysis methods have been widely utilized in the field of Alzheimer's disease (AD) research in recent years. Advances in medical imaging and medical image analysis have provided a means to generate and extract valuable neuroimaging information. Automatic classification techniques provide tools to analyze this information and observe inherent disease-related patterns in the data. In particular, these classifiers have been used to discriminate AD patients from healthy control subjects and to predict conversion from mild cognitive impairment to AD. In this paper, recent studies are reviewed that have used machine learning and multivariate analysis in the field of AD research. The main focus is on studies that used structural magnetic resonance imaging (MRI), but studies that included positron emission tomography and cerebrospinal fluid biomarkers in addition to MRI are also considered. A wide variety of materials and methods has been employed in different studies, resulting in a range of different outcomes. Influential factors such as classifiers, feature extraction algorithms, feature selection methods, validation approaches, and cohort properties are reviewed, as well as key MRI-based and multi-modal based studies. Current and future trends are discussed.
Using Interactive Graphics to Teach Multivariate Data Analysis to Psychology Students
ERIC Educational Resources Information Center
Valero-Mora, Pedro M.; Ledesma, Ruben D.
2011-01-01
This paper discusses the use of interactive graphics to teach multivariate data analysis to Psychology students. Three techniques are explored through separate activities: parallel coordinates/boxplots; principal components/exploratory factor analysis; and cluster analysis. With interactive graphics, students may perform important parts of the…
Wang, Longfei; Lee, Sungyoung; Gim, Jungsoo; Qiao, Dandi; Cho, Michael; Elston, Robert C; Silverman, Edwin K; Won, Sungho
2016-09-01
Family-based designs have been repeatedly shown to be powerful in detecting the significant rare variants associated with human diseases. Furthermore, human diseases are often defined by the outcomes of multiple phenotypes, and thus we expect multivariate family-based analyses may be very efficient in detecting associations with rare variants. However, few statistical methods implementing this strategy have been developed for family-based designs. In this report, we describe one such implementation: the multivariate family-based rare variant association tool (mFARVAT). mFARVAT is a quasi-likelihood-based score test for rare variant association analysis with multiple phenotypes, and tests both homogeneous and heterogeneous effects of each variant on multiple phenotypes. Simulation results show that the proposed method is generally robust and efficient for various disease models, and we identify some promising candidate genes associated with chronic obstructive pulmonary disease. The software of mFARVAT is freely available at http://healthstat.snu.ac.kr/software/mfarvat/, implemented in C++ and supported on Linux and MS Windows. © 2016 WILEY PERIODICALS, INC.
Tailored multivariate analysis for modulated enhanced diffraction
Caliandro, Rocco; Guccione, Pietro; Nico, Giovanni; ...
2015-10-21
Modulated enhanced diffraction (MED) is a technique allowing the dynamic structural characterization of crystalline materials subjected to an external stimulus, which is particularly suited for in situ and operando structural investigations at synchrotron sources. Contributions from the (active) part of the crystal system that varies synchronously with the stimulus can be extracted by an offline analysis, which can only be applied in the case of periodic stimuli and linear system responses. In this paper a new decomposition approach based on multivariate analysis is proposed. The standard principal component analysis (PCA) is adapted to treat MED data: specific figures of merit based on their scores and loadings are found, and the directions of the principal components obtained by PCA are modified to maximize such figures of merit. As a result, a general method to decompose MED data, called optimum constrained components rotation (OCCR), is developed, which produces very precise results on simulated data, even in the case of nonperiodic stimuli and/or nonlinear responses. Furthermore, the multivariate analysis approach is able to supply in one shot both the diffraction pattern related to the active atoms (through the OCCR loadings) and the time dependence of the system response (through the OCCR scores). When applied to real data, however, OCCR was able to supply only the latter information, as the former was hindered by changes in abundances of different crystal phases, which occurred besides structural variations in the specific case considered. Developing a decomposition procedure able to cope with this combined effect represents the next challenge in MED analysis.
Motivational Profiles of Adult Learners
ERIC Educational Resources Information Center
Rothes, Ana; Lemos, Marina S.; Gonçalves, Teresa
2017-01-01
This study investigated profiles of autonomous and controlled motivation and their effects in a sample of 188 adult learners from two Portuguese urban areas. Using a person-centered approach, results of cluster analysis and multivariate analysis of covariance revealed four motivational groups with different effects in self-efficacy, engagement,…
A power analysis for multivariate tests of temporal trend in species composition.
Irvine, Kathryn M; Dinger, Eric C; Sarr, Daniel
2011-10-01
Long-term monitoring programs emphasize power analysis as a tool to determine the sampling effort necessary to effectively document ecologically significant changes in ecosystems. Programs that monitor entire multispecies assemblages require a method for determining the power of multivariate statistical models to detect trend. We provide a method to simulate presence-absence species assemblage data that are consistent with increasing or decreasing directional change in species composition within multiple sites. This step is the foundation for using Monte Carlo methods to approximate the power of any multivariate method for detecting temporal trends. We focus on comparing the power of the Mantel test, permutational multivariate analysis of variance, and constrained analysis of principal coordinates. We find that the power of the various methods we investigate is sensitive to the number of species in the community, univariate species patterns, and the number of sites sampled over time. For increasing directional change scenarios, constrained analysis of principal coordinates was as powerful as or more powerful than permutational multivariate analysis of variance, and the Mantel test was the least powerful. However, in our investigation of decreasing directional change, the Mantel test was typically as powerful as or more powerful than the other models.
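The Monte Carlo idea in this abstract can be sketched as: simulate many presence-absence data sets with a known trend, run the test on each, and report the fraction of rejections as the approximate power. The sketch below is a simplified illustration, not the authors' simulation procedure. It assumes a single site, a hypothetical linear drift in each species' presence probability (half rising, half falling), Jaccard dissimilarity, and a one-sided permutation Mantel test; all function names and parameter values are made up for the example.

```python
import random
from itertools import combinations


def simulate_site(n_species, n_times, slope, rng):
    """Presence-absence matrix (rows = time points, columns = species).

    Half the species become more likely over time and half less likely,
    so slope > 0 gives directional turnover and slope = 0 gives none.
    """
    rows = []
    for t in range(n_times):
        shift = slope * (t / (n_times - 1) - 0.5)
        rows.append([
            1 if rng.random() < (0.5 + shift if s % 2 == 0 else 0.5 - shift) else 0
            for s in range(n_species)
        ])
    return rows


def jaccard_dissimilarity(a, b):
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return 0.0 if either == 0 else 1.0 - both / either


def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / (sxx * syy) ** 0.5


def mantel_p_value(data, n_perm, rng):
    """One-sided Mantel test: does dissimilarity grow with the time gap?"""
    n = len(data)
    pairs = list(combinations(range(n), 2))
    dist = [[jaccard_dissimilarity(data[i], data[j]) for j in range(n)]
            for i in range(n)]
    time_gap = [float(j - i) for i, j in pairs]
    observed = pearson([dist[i][j] for i, j in pairs], time_gap)
    order = list(range(n))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(order)  # permute sample labels, keep time gaps fixed
        permuted = pearson([dist[order[i]][order[j]] for i, j in pairs], time_gap)
        if permuted >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)


def estimate_power(slope, n_sims=100, n_perm=99, alpha=0.05,
                   n_species=20, n_times=8, seed=1):
    """Fraction of simulated data sets in which the test rejects at alpha."""
    rng = random.Random(seed)
    rejections = sum(
        mantel_p_value(simulate_site(n_species, n_times, slope, rng), n_perm, rng) <= alpha
        for _ in range(n_sims)
    )
    return rejections / n_sims


print("power, no trend:    ", estimate_power(0.0))
print("power, strong trend:", estimate_power(0.9))
```

With no trend the rejection rate should stay near the nominal alpha, which is a useful sanity check before comparing methods. Swapping `mantel_p_value` for another test would give the kind of comparison the abstract describes.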
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lines, Amanda M.; Nelson, Gilbert L.; Casella, Amanda J.
Microfluidic devices are a growing field with significant potential for application to small scale processing of solutions. Much like large scale processing, fast, reliable, and cost effective means of monitoring the streams during processing are needed. Here we apply a novel Micro-Raman probe to the on-line monitoring of streams within a microfluidic device. For either macro or micro scale process monitoring via spectroscopic response, there is the danger of interfering or confounded bands obfuscating results. By utilizing chemometric analysis, a form of multivariate analysis, species can be accurately quantified in solution despite the presence of overlapping or confounded spectroscopic bands. This is demonstrated on solutions of HNO3 and NaNO3 within micro-flow and microfluidic devices.
Yang, Heejung; Lee, Dong Young; Jeon, Minji; Suh, Youngbae; Sung, Sang Hyun
2014-05-01
Five active compounds, chlorogenic acid, 3,5-di-O-caffeoylquinic acid, 4,5-di-O-caffeoylquinic acid, jaceosidin, and eupatilin, in Artemisia princeps (Compositae) were simultaneously determined by ultra-performance liquid chromatography coupled to a diode array detector. The morphological resemblance between A. princeps and A. capillaris makes it difficult to identify the species properly, which occasionally leads to misuse or misapplication in Korean traditional medicine. In this study, discrimination between A. princeps and A. capillaris was optimally performed using the developed and validated method, which showed a definite difference between the two species. In addition, the most reliable markers contributing to the discrimination of the two species were identified by multivariate analysis methods, namely principal component analysis and partial least squares discriminant analysis.
NASA Astrophysics Data System (ADS)
Hasyim, M.; Prastyo, D. D.
2018-03-01
Survival analysis models the relationship between independent variables and survival time as the dependent variable. In practice, not all survival data can be recorded completely, for a variety of reasons; such data are called censored data. Moreover, several models for survival analysis require assumptions. One approach in survival analysis is nonparametric, which relaxes these assumptions. In this research, the nonparametric approach employed is multivariate adaptive regression splines (MARS). This study aims to measure the performance of private university lecturers. The survival time in this study is the time a lecturer needs to obtain a professional certificate. The results show that research activity is a significant factor, along with developing course materials, good publication in international or national journals, and activities in research collaboration.
Badran, M; Morsy, R; Soliman, H; Elnimr, T
2016-01-01
The metabolism of trace elements has been reported to play specific roles in the pathogenesis and progression of diabetes mellitus. Given the continuous increase in the population of patients with Type 2 diabetes (T2D), this study aims to assess the levels and inter-relationships of fasting blood glucose (FBG) and serum trace elements in Type 2 diabetic patients. This study was conducted on 40 Egyptian Type 2 diabetic patients and 36 healthy volunteers (Hospital of Tanta University, Tanta, Egypt). The blood serum was digested and then used to determine the levels of 24 trace elements using inductively coupled plasma mass spectrometry (ICP-MS). Multivariate statistical analysis based on correlation coefficients, cluster analysis (CA), and principal component analysis (PCA) was used to analyze the data. The results showed significant changes in FBG and in the levels of eight trace elements (Zn, Cu, Se, Fe, Mn, Cr, Mg, and As) in the blood serum of Type 2 diabetic patients relative to those of healthy controls. The multivariate statistical techniques were effective in reducing the number of experimental variables and grouped the trace elements in patients into three clusters. The application of PCA revealed a distinct difference in the associations of trace elements and their clustering patterns between the control and patient groups, in particular for Mg, Fe, Cu, and Zn, which appeared to be the factors most closely related to Type 2 diabetes. Therefore, on the basis of this study, the contributions of trace element content in Type 2 diabetic patients can be determined and specified using correlation and multivariate statistical analysis, confirming that the alteration of some essential trace metals may play a role in the development of diabetes mellitus. Copyright © 2015 Elsevier GmbH. All rights reserved.
NASA Astrophysics Data System (ADS)
Darvishzadeh, R.; Skidmore, A. K.; Mirzaie, M.; Atzberger, C.; Schlerf, M.
2014-12-01
Accurate estimation of grassland biomass at peak productivity can provide crucial information regarding the functioning and productivity of rangelands. Hyperspectral remote sensing has proved to be valuable for estimating vegetation biophysical parameters such as biomass using different statistical techniques. However, in statistical analysis of hyperspectral data, multicollinearity is a common problem due to the large number of correlated hyperspectral reflectance measurements. The aim of this study was to examine the prospect of above ground biomass estimation in a heterogeneous Mediterranean rangeland employing multivariate calibration methods. Canopy spectral measurements were made in the field using a GER 3700 spectroradiometer, along with concomitant in situ measurements of above ground biomass for 170 sample plots. Multivariate calibrations including partial least squares regression (PLSR), principal component regression (PCR), and least-squares support vector machine (LS-SVM) were used to estimate the above ground biomass. The prediction accuracy of the multivariate calibration methods was assessed using cross-validated R2 and RMSE. The best model performance was obtained using LS-SVM and then PLSR, both calibrated with the first derivative reflectance dataset, with R2cv = 0.88 and 0.86 and RMSEcv = 1.15 and 1.07, respectively. The weakest prediction accuracy was obtained when PCR was used (R2cv = 0.31 and RMSEcv = 2.48). The obtained results highlight the importance of multivariate calibration methods for biomass estimation when hyperspectral data are used.
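The two accuracy measures reported here, cross-validated R2 and RMSE, can be computed from held-out predictions regardless of the calibration model. The sketch below is a minimal illustration that assumes leave-one-out cross-validation and uses a one-predictor least-squares line as a stand-in for PLSR, PCR, or LS-SVM; the abstract does not state the cross-validation scheme, and definitions of R2cv vary (the form used here is 1 - PRESS/SS_tot). The data values are made up.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x (stand-in calibration model)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b


def leave_one_out_scores(x, y):
    """Return (R2cv, RMSEcv) computed from leave-one-out predictions."""
    preds = []
    for i in range(len(x)):
        # Refit without sample i, then predict the held-out sample.
        a, b = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        preds.append(a + b * x[i])
    press = sum((yi - pi) ** 2 for yi, pi in zip(y, preds))
    my = sum(y) / len(y)
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - press / ss_tot, sqrt(press / len(y))


# Hypothetical reflectance-derived predictor and biomass values.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
r2cv, rmsecv = leave_one_out_scores(x, y)
print(f"R2cv = {r2cv:.3f}, RMSEcv = {rmsecv:.3f}")
```

Because each prediction comes from a model that never saw the held-out sample, these scores penalize overfitting in a way that calibration-set R2 does not.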
The Potential of Multivariate Analysis in Assessing Students' Attitude to Curriculum Subjects
ERIC Educational Resources Information Center
Gaotlhobogwe, Michael; Laugharne, Janet; Durance, Isabelle
2011-01-01
Background: Understanding student attitudes to curriculum subjects is central to providing evidence-based options to policy makers in education. Purpose: We illustrate how quantitative approaches used in the social sciences and based on multivariate analysis (categorical Principal Components Analysis, Clustering Analysis and General Linear…
Two-sample tests and one-way MANOVA for multivariate biomarker data with nondetects.
Thulin, M
2016-09-10
Testing whether the mean vector of a multivariate set of biomarkers differs between several populations is an increasingly common problem in medical research. Biomarker data is often left censored because some measurements fall below the laboratory's detection limit. We investigate how such censoring affects multivariate two-sample and one-way multivariate analysis of variance tests. Type I error rates, power and robustness to increasing censoring are studied, under both normality and non-normality. Parametric tests are found to perform better than non-parametric alternatives, indicating that the current recommendations for analysis of censored multivariate data may have to be revised. Copyright © 2016 John Wiley & Sons, Ltd.
A non-iterative extension of the multivariate random effects meta-analysis.
Makambi, Kepher H; Seung, Hyunuk
2015-01-01
Multivariate methods in meta-analysis are becoming popular and more accepted in biomedical research despite computational issues in some of the techniques. A number of approaches, both iterative and non-iterative, have been proposed including the multivariate DerSimonian and Laird method by Jackson et al. (2010), which is non-iterative. In this study, we propose an extension of the method by Hartung and Makambi (2002) and Makambi (2001) to multivariate situations. A comparison of the bias and mean square error from a simulation study indicates that, in some circumstances, the proposed approach performs better than the multivariate DerSimonian-Laird approach. An example is presented to demonstrate the application of the proposed approach.
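As background for the non-iterative approaches mentioned above, the univariate DerSimonian-Laird estimator obtains the between-study variance in closed form from Cochran's Q. The sketch below shows only this univariate special case; it is not the multivariate method of Jackson et al. nor the extension proposed in this abstract. The function name and example numbers are hypothetical, and at least two studies are assumed.

```python
from math import sqrt


def dersimonian_laird(effects, variances):
    """Univariate DerSimonian-Laird random-effects meta-analysis.

    effects: study effect estimates; variances: their within-study variances.
    Returns (tau2, pooled_estimate, standard_error). Requires >= 2 studies.
    """
    k = len(effects)
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)  # moment estimate, truncated at zero
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    return tau2, pooled, sqrt(1.0 / sum(w_star))


# Two hypothetical studies with unit variances.
print(dersimonian_laird([0.0, 2.0], [1.0, 1.0]))  # (1.0, 1.0, 1.0)
```

No iteration is needed because tau2 comes from a single moment equation.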
2014-01-01
Background Network meta-analysis (NMA) enables simultaneous comparison of multiple treatments while preserving randomisation. When summarising evidence to inform an economic evaluation, it is important that the analysis accurately reflects the dependency structure within the data, as correlations between outcomes may have implications for estimating the net benefit associated with treatment. A multivariate NMA offers a framework for evaluating multiple treatments across multiple outcome measures while accounting for the correlation structure between outcomes. Methods The standard NMA model is extended to multiple outcome settings in two stages. In the first stage, information is borrowed across outcomes as well as across studies through modelling the within-study and between-study correlation structure. In the second stage, we make use of the additional assumption that intervention effects are exchangeable between outcomes to predict effect estimates for all outcomes, including effect estimates on outcomes where evidence is either sparse or the treatment had not been considered by any one of the studies included in the analysis. We apply the methods to binary outcome data from a systematic review evaluating the effectiveness of nine home safety interventions on uptake of three poisoning prevention practices (safe storage of medicines, safe storage of other household products, and possession of poison centre control telephone number) in households with children. Analyses are conducted in WinBUGS using Markov Chain Monte Carlo (MCMC) simulations. Results Univariate and the first stage multivariate models produced broadly similar point estimates of intervention effects but the uncertainty around the multivariate estimates varied depending on the prior distribution specified for the between-study covariance structure. 
The second stage multivariate analyses produced more precise effect estimates while enabling intervention effects to be predicted for all outcomes, including intervention effects on outcomes not directly considered by the studies included in the analysis. Conclusions Accounting for the dependency between outcomes in a multivariate meta-analysis may or may not improve the precision of effect estimates from a network meta-analysis compared to analysing each outcome separately. PMID:25047164
Truu, Jaak; Heinaru, Eeva; Talpsep, Ene; Heinaru, Ain
2002-01-01
The oil-shale industry has created serious pollution problems in northeastern Estonia. Untreated, phenol-rich leachate from semi-coke mounds formed as a by-product of oil-shale processing is discharged into the Baltic Sea via channels and rivers. An exploratory analysis of water chemical and microbiological data sets from the low-flow period was carried out using different multivariate analysis techniques. Principal component analysis allowed us to distinguish different locations in the river system. The riverine microbial community response to water chemical parameters was assessed by co-inertia analysis. Water pH, COD and total nitrogen were negatively related to the number of biodegradative bacteria, while oxygen concentration promoted the abundance of these bacteria. The results demonstrate the utility of multivariate statistical techniques as tools for estimating the magnitude and extent of pollution based on river water chemical and microbiological parameters. An evaluation of river chemical and microbiological data suggests that the ambient natural attenuation mechanisms only partly eliminate pollutants from river water, and that a sufficient reduction of more recalcitrant compounds could be achieved through the reduction of wastewater discharge from the oil-shale chemical industry into the rivers.
A refined method for multivariate meta-analysis and meta-regression.
Jackson, Daniel; Riley, Richard D
2014-02-20
Making inferences about the average treatment effect using the random effects model for meta-analysis is problematic in the common situation where there is a small number of studies. This is because estimates of the between-study variance are not precise enough to accurately apply the conventional methods for testing and deriving a confidence interval for the average effect. We have found that a refined method for univariate meta-analysis, which applies a scaling factor to the estimated effects' standard error, provides more accurate inference. We explain how to extend this method to the multivariate scenario and show that our proposal for refined multivariate meta-analysis and meta-regression can provide more accurate inferences than the more conventional approach. We explain how our proposed approach can be implemented using standard output from multivariate meta-analysis software packages and apply our methodology to two real examples. Copyright © 2013 John Wiley & Sons, Ltd.
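One widely used univariate refinement that scales the standard error of the pooled effect is the Hartung-Knapp adjustment. The sketch below assumes that form, which this abstract does not spell out, and covers only the univariate case, not the multivariate extension the authors propose. The between-study variance `tau2` is taken as an input, and the function name and example values are hypothetical.

```python
from math import sqrt


def scaled_standard_error(effects, variances, tau2):
    """Pooled random-effects estimate with a Hartung-Knapp style scaled SE.

    Returns (pooled, conventional_se, refined_se). Requires >= 2 studies.
    """
    k = len(effects)
    w = [1.0 / (v + tau2) for v in variances]
    sw = sum(w)
    pooled = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Scaling factor: weighted residual variance around the pooled estimate.
    scale = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, effects)) / (k - 1)
    conventional_se = sqrt(1.0 / sw)
    return pooled, conventional_se, sqrt(scale) * conventional_se


# Three hypothetical studies with unit variances and tau2 = 0.
pooled, se_conv, se_ref = scaled_standard_error([0.0, 2.0, 4.0], [1.0, 1.0, 1.0], 0.0)
print(f"pooled = {pooled:.2f}, conventional SE = {se_conv:.3f}, refined SE = {se_ref:.3f}")
```

In the Hartung-Knapp approach the refined standard error is paired with a t distribution on k - 1 degrees of freedom rather than a normal distribution, which is what widens the interval when there are few studies.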
Toda, Hiroyuki; Inoue, Takeshi; Tsunoda, Tomoya; Nakai, Yukiei; Tanichi, Masaaki; Tanaka, Teppei; Hashimoto, Naoki; Nakato, Yasuya; Nakagawa, Shin; Kitaichi, Yuji; Mitsui, Nobuyuki; Boku, Shuken; Tanabe, Hajime; Nibuya, Masashi; Yoshino, Aihide; Kusumi, Ichiro
2015-01-01
Background Previous studies have shown the interaction between heredity and childhood stress or life events on the pathogenesis of a major depressive disorder (MDD). In this study, we tested our hypothesis that childhood abuse, affective temperaments, and adult stressful life events interact and influence the diagnosis of MDD. Patients and methods A total of 170 healthy controls and 98 MDD patients were studied using the following self-administered questionnaire surveys: the Patient Health Questionnaire-9 (PHQ-9), the Life Experiences Survey, the Temperament Evaluation of the Memphis, Pisa, Paris, and San Diego Autoquestionnaire, and the Child Abuse and Trauma Scale (CATS). The data were analyzed with univariate analysis, multivariable analysis, and structural equation modeling. Results The neglect scores of the CATS indirectly predicted the diagnosis of MDD through cyclothymic and anxious temperament scores of the Temperament Evaluation of the Memphis, Pisa, Paris, and San Diego Autoquestionnaire in the structural equation modeling. Two temperaments – cyclothymic and anxious – directly predicted the diagnosis of MDD. The validity of this result was supported by the results of the stepwise multivariate logistic regression analysis as follows: three factors – neglect, cyclothymic, and anxious temperaments – were significant predictors of MDD. Neglect and the total CATS scores were also predictors of remission vs treatment-resistance in MDD patients independently of depressive symptoms. Limitations The sample size was small for the comparison between the remission and treatment-resistant groups in MDD patients in multivariable analysis. Conclusion This study suggests that childhood abuse, especially neglect, indirectly predicted the diagnosis of MDD through increased affective temperaments. The important role as a mediator of affective temperaments in the effect of childhood abuse on MDD was suggested. PMID:26316754
Waskitho, Dri; Lukitaningsih, Endang; Sudjadi; Rohman, Abdul
2016-01-01
Analysis of lard extracted from a lipstick formulation containing castor oil has been performed using an FTIR spectroscopic method combined with multivariate calibration. Three different extraction methods were compared, namely a saponification method followed by liquid/liquid extraction with hexane/dichloromethane/ethanol/water, a saponification method followed by liquid/liquid extraction with dichloromethane/ethanol/water, and the Bligh & Dyer method using chloroform/methanol/water as extracting solvent. Qualitative and quantitative analyses of lard were performed using principal component analysis (PCA) and partial least squares (PLS) analysis, respectively. The results showed that, in all samples prepared by the three extraction methods, PCA was capable of identifying lard in the wavenumber region of 1200-800 cm-1, with the best result obtained by the Bligh & Dyer method. Furthermore, PLS analysis in the same wavenumber region used for qualification showed that Bligh & Dyer was the most suitable extraction method, with the highest determination coefficient (R2) and the lowest root mean square error of calibration (RMSEC) as well as root mean square error of prediction (RMSEP) values.
ERIC Educational Resources Information Center
Fouladi, Rachel T.
2000-01-01
Provides an overview of standard and modified normal theory and asymptotically distribution-free covariance and correlation structure analysis techniques and details Monte Carlo simulation results on Type I and Type II error control. Demonstrates through the simulation that robustness and nonrobustness of structure analysis techniques vary as a…
Hakimzadeh, Neda; Parastar, Hadi; Fattahi, Mohammad
2014-01-24
In this study, multivariate curve resolution (MCR) and multivariate classification methods are proposed to develop a new chemometric strategy for comprehensive analysis of high-performance liquid chromatography-diode array absorbance detection (HPLC-DAD) fingerprints of sixty Salvia reuterana samples from five different geographical regions. Different chromatographic problems that occurred during HPLC-DAD analysis of S. reuterana samples, such as baseline/background contribution and noise, low signal-to-noise ratio (S/N), asymmetric peaks, elution time shifts, and peak overlap, are handled using the proposed strategy. In this way, chromatographic fingerprints of sixty samples are properly segmented into ten common chromatographic regions using local rank analysis, and the corresponding segments are then column-wise augmented for subsequent MCR analysis. Extended multivariate curve resolution-alternating least squares (MCR-ALS) is used to obtain pure component profiles in each segment. In general, thirty-one chemical components were resolved using MCR-ALS in sixty S. reuterana samples, and the lack of fit (LOF) values of MCR-ALS models were below 10.0% in all cases. Pure spectral profiles are considered for identification of chemical components by comparing their resolved spectra with the standard ones, and twenty-four of the thirty-one components were identified. Additionally, pure elution profiles are used to obtain relative concentrations of chemical components in different samples for multivariate classification analysis by principal component analysis (PCA) and k-nearest neighbors (kNN). Inspection of the PCA score plot (three PCs accounting for 76.1% of the variance) showed that S. reuterana samples belong to four clusters. The degree of class separation (DCS), which quantifies the distance separating clusters in relation to the scatter within each cluster, was calculated for the four clusters and was in the range of 1.6-5.8.
These results are then confirmed by kNN. In addition, according to the PCA loading plot and kNN dendrogram of thirty-one variables, five chemical constituents of luteolin-7-o-glucoside, salvianolic acid D, rosmarinic acid, lithospermic acid and trijuganone A are identified as the most important variables (i.e., chemical markers) for clusters discrimination. Finally, the effect of different chemical markers on samples differentiation is investigated using counter-propagation artificial neural network (CP-ANN) method. It is concluded that the proposed strategy can be successfully applied for comprehensive analysis of chromatographic fingerprints of complex natural samples. Copyright © 2013 Elsevier B.V. All rights reserved.
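The kNN confirmation step can be illustrated with a minimal sketch. Everything concrete here is an assumption: the two-marker feature vectors, the region labels, the Euclidean metric and k = 3 are invented for illustration and are not stated in the abstract.

```python
import math
from collections import Counter

def knn_predict(train_points, train_labels, query, k=3):
    """Label a query by majority vote among its k nearest training points."""
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Hypothetical relative concentrations of two chemical markers per sample.
train_points = [
    (0.90, 0.10), (1.00, 0.20), (0.80, 0.15),   # samples from region_A
    (0.20, 0.90), (0.10, 1.00), (0.25, 0.80),   # samples from region_B
]
train_labels = ["region_A"] * 3 + ["region_B"] * 3

predicted = knn_predict(train_points, train_labels, query=(0.85, 0.20))
print(predicted)
```

In practice the feature vectors would be the relative concentrations of all resolved components, and k would be chosen by cross-validation.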
1993-06-18
...rule rather than the exception. In the Standardized Aquatic Microcosm and the Mixed Flask Culture (MFC) microcosms, multivariate analysis and clustering methods...experiments using two microcosm protocols. We use nonmetric clustering, a multivariate pattern recognition technique developed by Matthews and Hearne (1991
Liu, Ya-Juan; André, Silvère; Saint Cristau, Lydia; Lagresle, Sylvain; Hannas, Zahia; Calvosa, Éric; Devos, Olivier; Duponchel, Ludovic
2017-02-01
Multivariate statistical process control (MSPC) is increasingly popular for meeting the challenge posed by large multivariate datasets from analytical instruments such as Raman spectroscopy, used for the monitoring of complex cell cultures in the biopharmaceutical industry. However, Raman spectroscopy for in-line monitoring often produces unsynchronized data sets, resulting in time-varying batches. Moreover, unsynchronized data sets are common for cell culture monitoring because spectroscopic measurements are generally recorded in alternation, with more than one optical probe connected in parallel to the same spectrometer. Synchronized batches are a prerequisite for the application of multivariate analysis such as multi-way principal component analysis (MPCA) for MSPC monitoring. Correlation optimized warping (COW) is a popular method for data alignment with satisfactory performance; however, it has never before been applied to synchronize the acquisition time of spectroscopic datasets in an MSPC application. In this paper we propose, for the first time, to use COW to synchronize batches with varying durations analyzed with Raman spectroscopy. In a second step, we developed MPCA models at different time intervals based on the normal operation condition (NOC) batches synchronized by COW. New batches are finally projected onto the corresponding MPCA model. We monitored the evolution of the batches using two multivariate control charts based on Hotelling's T² and Q. As illustrated by the results, the MSPC model was able to identify abnormal operation conditions, including contaminated batches, which is of prime importance in cell culture monitoring. We showed that Raman-based MSPC monitoring can be used to diagnose batches deviating from the normal condition with higher efficacy than traditional diagnosis, which would save time and money in the biopharmaceutical industry. Copyright © 2016 Elsevier B.V. All rights reserved.
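The two control-chart statistics mentioned in this abstract can be sketched as follows. This is not the authors' MPCA pipeline: it is a one-component PCA fitted by power iteration to invented two-variable "normal operation" data, and it omits control limits. Hotelling's T² measures how far a sample lies from the centre within the model plane, while Q (the squared prediction error) measures how far it lies from the model itself.

```python
import math

def fit_one_component_pca(rows):
    """Fit a one-component PCA by power iteration on the sample covariance."""
    n, d = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - mean[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    loading = [1.0] * d
    for _ in range(100):
        v = [sum(cov[i][j] * loading[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in v))
        loading = [x / norm for x in v]
    # Rayleigh quotient: variance of the scores along the loading vector.
    cov_l = [sum(cov[i][j] * loading[j] for j in range(d)) for i in range(d)]
    eigenvalue = sum(l * c for l, c in zip(loading, cov_l))
    return mean, loading, eigenvalue

def t2_and_q(sample, mean, loading, eigenvalue):
    """Return Hotelling's T^2 and the Q residual of one sample."""
    x = [s - m for s, m in zip(sample, mean)]
    score = sum(a * b for a, b in zip(x, loading))
    residual = [a - score * b for a, b in zip(x, loading)]
    return score ** 2 / eigenvalue, sum(r * r for r in residual)

# Hypothetical NOC data: two strongly correlated process variables.
noc = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.8), (5.0, 5.0)]
mean, loading, eigenvalue = fit_one_component_pca(noc)

t2_along, q_along = t2_and_q((6.0, 6.0), mean, loading, eigenvalue)  # extreme but on-model
t2_off, q_off = t2_and_q((4.0, 2.0), mean, loading, eigenvalue)      # breaks the correlation
print(t2_along, q_along, t2_off, q_off)
```

The first test sample raises T² only, the second raises Q only, which is why the two charts are used together.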
Multivariate analysis for scanning tunneling spectroscopy data
NASA Astrophysics Data System (ADS)
Yamanishi, Junsuke; Iwase, Shigeru; Ishida, Nobuyuki; Fujita, Daisuke
2018-01-01
We applied principal component analysis (PCA) to two-dimensional tunneling spectroscopy (2DTS) data obtained on a Si(111)-(7 × 7) surface to explore the effectiveness of multivariate analysis for interpreting 2DTS data. We demonstrated that several components that originated mainly from specific atoms at the Si(111)-(7 × 7) surface can be extracted by PCA. Furthermore, we showed that hidden components in the tunneling spectra can be decomposed (peak separation), which is difficult to achieve with normal 2DTS analysis without the support of theoretical calculations. Our analysis showed that multivariate analysis can be an additional powerful way to analyze 2DTS data and extract hidden information from a large amount of spectroscopic data.
Processes and subdivisions in diogenites, a multivariate statistical analysis
NASA Technical Reports Server (NTRS)
Harriott, T. A.; Hewins, R. H.
1984-01-01
Multivariate statistical techniques used on diogenite orthopyroxene analyses show the relationships that occur within diogenites and the two orthopyroxenite components (class I and II) in the polymict diogenite Garland. Cluster analysis shows that only Peckelsheim is similar to Garland class I (Fe-rich) and the other diogenites resemble Garland class II. The unique diogenite Y 75032 may be related to type I by fractionation. Factor analysis confirms the subdivision and shows that Fe does not correlate with the weakly incompatible elements across the entire pyroxene composition range, indicating that igneous fractionation is not the process controlling total diogenite composition variation. The occurrence of two groups of diogenites is interpreted as the result of sampling or mixing of two main sequences of orthopyroxene cumulates with slightly different compositions.
NASA Astrophysics Data System (ADS)
Huang, Shaohua; Wang, Lan; Chen, Weisheng; Feng, Shangyuan; Lin, Juqiang; Huang, Zufang; Chen, Guannan; Li, Buhong; Chen, Rong
2014-11-01
Non-invasive esophagus cancer detection based on urine surface-enhanced Raman spectroscopy (SERS) analysis was presented. Urine SERS spectra were measured on esophagus cancer patients (n = 56) and healthy volunteers (n = 36) for control analysis. Tentative assignments of the urine SERS spectra indicated some interesting esophagus cancer-specific biomolecular changes, including a decrease in the relative content of urea and an increase in the percentage of uric acid in the urine of esophagus cancer patients compared to that of healthy subjects. Principal component analysis (PCA) combined with linear discriminant analysis (LDA) was employed to analyze and differentiate the SERS spectra between normal and esophagus cancer urine. The diagnostic algorithms utilizing a multivariate analysis method achieved a diagnostic sensitivity of 89.3% and specificity of 83.3% for separating esophagus cancer samples from normal urine samples. These results from the explorative work suggested that silver nano particle-based urine SERS analysis coupled with PCA-LDA multivariate analysis has potential for non-invasive detection of esophagus cancer.
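The reported sensitivity and specificity follow from a confusion matrix, as in the short sketch below. The counts are back-calculated assumptions: with 56 patients and 36 controls, 50 correctly classified patients and 30 correctly classified controls are the counts consistent with the reported 89.3% and 83.3%; the abstract itself does not list them.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity

# Assumed counts, inferred from the reported percentages (n = 56 and n = 36).
sens, spec = sensitivity_specificity(tp=50, fn=6, tn=30, fp=6)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
# prints "sensitivity = 89.3%, specificity = 83.3%"
```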
Multivariate Analysis of Schools and Educational Policy.
ERIC Educational Resources Information Center
Kiesling, Herbert J.
This report describes a multivariate analysis technique that approaches the problems of educational production function analysis by (1) using comparable measures of output across large experiments, (2) accounting systematically for differences in socioeconomic background, and (3) treating the school as a complete system in which different…
NASA Technical Reports Server (NTRS)
Wolf, S. F.; Lipschutz, M. E.
1993-01-01
Multivariate statistical analysis techniques (linear discriminant analysis and logistic regression) can provide powerful discrimination tools which are generally unfamiliar to the planetary science community. Fall parameters were used to identify a group of 17 H chondrites (Cluster 1) that were part of a coorbital stream which intersected Earth's orbit in May, from 1855 - 1895, and can be distinguished from all other H chondrite falls. Using multivariate statistical techniques, it was demonstrated that by a totally different criterion, labile trace element contents - hence thermal histories - of 13 Cluster 1 meteorites are distinguishable from those of 45 non-Cluster 1 H chondrites. Here, we focus upon the principles of multivariate statistical techniques and illustrate their application using non-meteoritic and meteoritic examples.
Xu, Min; Zhang, Lei; Yue, Hong-Shui; Pang, Hong-Wei; Ye, Zheng-Liang; Ding, Li
2017-10-01
This study aimed to establish an on-line monitoring method for the extraction process of Schisandrae Chinensis Fructus, a formula medicinal material of Yiqi Fumai lyophilized injection, by combining near infrared spectroscopy with multivariate data analysis technology. The multivariate statistical process control (MSPC) model was established based on 5 normal production batches, and 2 test batches were monitored by PC scores, DModX and Hotelling's T² control charts. The results showed that the MSPC model had a good monitoring ability for the extraction process. The application of the MSPC model to the actual production process could effectively achieve on-line monitoring of the extraction process of Schisandrae Chinensis Fructus and reflect changes in material properties during production in real time. The established process monitoring method could provide a reference for the application of process analysis technology in the process quality control of traditional Chinese medicine injections. Copyright © by the Chinese Pharmaceutical Association.
Wang, Gang; Teng, Chaolin; Li, Kuo; Zhang, Zhonglin; Yan, Xiangguo
2016-09-01
The recorded electroencephalography (EEG) signals are usually contaminated by electrooculography (EOG) artifacts. In this paper, by using independent component analysis (ICA) and multivariate empirical mode decomposition (MEMD), the ICA-based MEMD method was proposed to remove EOG artifacts (EOAs) from multichannel EEG signals. First, the EEG signals were decomposed by the MEMD into multiple multivariate intrinsic mode functions (MIMFs). The EOG-related components were then extracted by reconstructing the MIMFs corresponding to EOAs. After performing the ICA of EOG-related signals, the EOG-linked independent components were distinguished and rejected. Finally, the clean EEG signals were reconstructed by implementing the inverse transform of ICA and MEMD. The results of simulated and real data suggested that the proposed method could successfully eliminate EOAs from EEG signals and preserve useful EEG information with little loss. By comparing with other existing techniques, the proposed method achieved much improvement in terms of the increase of signal-to-noise and the decrease of mean square error after removing EOAs.
Igloo-Plot: a tool for visualization of multidimensional datasets.
Kuntal, Bhusan K; Ghosh, Tarini Shankar; Mande, Sharmila S
2014-01-01
Advances in science and technology have resulted in an exponential growth of multivariate (or multi-dimensional) datasets, which are being generated from various research areas, especially in the domain of biological sciences. Visualization and analysis of such data (with the objective of uncovering the hidden patterns therein) is an important and challenging task. We present a tool, called Igloo-Plot, for efficient visualization of multidimensional datasets. The tool addresses some of the key limitations of contemporary multivariate visualization and analysis tools. The visualization layout facilitates easy identification not only of clusters of data-points having similar feature compositions, but also of the 'marker features' specific to each of these clusters. The applicability of the various functionalities implemented herein is demonstrated using several well studied multi-dimensional datasets. Igloo-Plot is expected to be a valuable resource for researchers working in multivariate data mining studies. Igloo-Plot is available for download from: http://metagenomics.atc.tcs.com/IglooPlot/. Copyright © 2014 Elsevier Inc. All rights reserved.
Cardiovascular reactivity patterns and pathways to hypertension: a multivariate cluster analysis.
Brindle, R C; Ginty, A T; Jones, A; Phillips, A C; Roseboom, T J; Carroll, D; Painter, R C; de Rooij, S R
2016-12-01
Substantial evidence links exaggerated mental stress induced blood pressure reactivity to future hypertension, but the results for heart rate reactivity are less clear. For this reason multivariate cluster analysis was carried out to examine the relationship between heart rate and blood pressure reactivity patterns and hypertension in a large prospective cohort (age range 55-60 years). Four clusters emerged with statistically different systolic and diastolic blood pressure and heart rate reactivity patterns. Cluster 1 was characterised by a relatively exaggerated blood pressure and heart rate response while the blood pressure and heart rate responses of cluster 2 were relatively modest and in line with the sample mean. Cluster 3 was characterised by blunted cardiovascular stress reactivity across all variables and cluster 4, by an exaggerated blood pressure response and modest heart rate response. Membership to cluster 4 conferred an increased risk of hypertension at 5-year follow-up (hazard ratio=2.98 (95% CI: 1.50-5.90), P<0.01) that survived adjustment for a host of potential confounding variables. These results suggest that the cardiac reactivity plays a potentially important role in the link between blood pressure reactivity and hypertension and support the use of multivariate approaches to stress psychophysiology.
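The reported hazard ratio, confidence interval and P value can be checked for internal consistency with a short calculation. This sketch assumes a Wald-type interval that is symmetric on the log scale, which is the usual construction for Cox models but is not stated in the abstract; only the three reported numbers are taken from the text.

```python
import math

# Values reported in the abstract for cluster 4 membership.
hr, ci_low, ci_high = 2.98, 1.50, 5.90

# Under the Wald assumption the CI is log(HR) +/- 1.96 * SE.
log_hr = math.log(hr)
se = (math.log(ci_high) - math.log(ci_low)) / (2 * 1.96)
z = log_hr / se

# Two-sided P value from the standard normal distribution.
p_two_sided = math.erfc(z / math.sqrt(2))

# The geometric mean of the CI bounds should reproduce the point estimate.
hr_from_ci = math.sqrt(ci_low * ci_high)
print(f"SE = {se:.3f}, z = {z:.2f}, p = {p_two_sided:.4f}, HR from CI = {hr_from_ci:.2f}")
```

The recovered P value falls below 0.01 and the midpoint of the interval matches the point estimate after rounding, in line with the figures quoted above.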
2011-01-01
Introduction Necrotizing fasciitis (NF) is a life threatening infectious disease with a high mortality rate. We carried out a microbiological characterization of the causative pathogens. We investigated the correlation of mortality in NF with bloodstream infection and with the presence of co-morbidities. Methods In this retrospective study, we analyzed 323 patients who presented with necrotizing fasciitis at two different institutions. Bloodstream infection (BSI) was defined as a positive blood culture result. The patients were categorized as survivors and non-survivors. Eleven clinically important variables which were statistically significant by univariate analysis were selected for multivariate regression analysis and a stepwise logistic regression model was developed to determine the association between BSI and mortality. Results Univariate logistic regression analysis showed that patients with hypotension, heart disease, liver disease, presence of Vibrio spp. in wound cultures, presence of fungus in wound cultures, and presence of Streptococcus group A, Aeromonas spp. or Vibrio spp. in blood cultures, had a significantly higher risk of in-hospital mortality. Our multivariate logistic regression analysis showed a higher risk of mortality in patients with pre-existing conditions like hypotension, heart disease, and liver disease. Multivariate logistic regression analysis also showed that presence of Vibrio spp. in wound cultures, and presence of Streptococcus group A in blood cultures were associated with a high risk of mortality while debridement ≥ 3 was associated with improved survival. Conclusions Mortality in patients with necrotizing fasciitis was significantly associated with the presence of Vibrio in wound cultures and Streptococcus group A in blood cultures. PMID:21693053
Ferreira, Ana P; Tobyn, Mike
2015-01-01
In the pharmaceutical industry, chemometrics is rapidly establishing itself as a tool that can be used at every step of product development and beyond: from early development to commercialization. This set of multivariate analysis methods allows the extraction of information contained in large, complex data sets thus contributing to increase product and process understanding which is at the core of the Food and Drug Administration's Process Analytical Tools (PAT) Guidance for Industry and the International Conference on Harmonisation's Pharmaceutical Development guideline (Q8). This review is aimed at providing pharmaceutical industry professionals an introduction to multivariate analysis and how it is being adopted and implemented by companies in the transition from "quality-by-testing" to "quality-by-design". It starts with an introduction to multivariate analysis and the two methods most commonly used: principal component analysis and partial least squares regression, their advantages, common pitfalls and requirements for their effective use. That is followed with an overview of the diverse areas of application of multivariate analysis in the pharmaceutical industry: from the development of real-time analytical methods to definition of the design space and control strategy, from formulation optimization during development to the application of quality-by-design principles to improve manufacture of existing commercial products.
Lie, Octavian V; van Mierlo, Pieter
2017-01-01
The visual interpretation of intracranial EEG (iEEG) is the standard method used in complex epilepsy surgery cases to map the regions of seizure onset targeted for resection. Still, visual iEEG analysis is labor-intensive and biased due to interpreter dependency. Multivariate parametric functional connectivity measures using adaptive autoregressive (AR) modeling of the iEEG signals based on the Kalman filter algorithm have been used successfully to localize the electrographic seizure onsets. Due to their high computational cost, these methods have been applied to a limited number of iEEG time-series (<60). The aim of this study was to test two Kalman filter implementations, a well-known multivariate adaptive AR model (Arnold et al. 1998) and a simplified, computationally efficient derivation of it, for their potential application to connectivity analysis of high-dimensional (up to 192 channels) iEEG data. When used on simulated seizures together with a multivariate connectivity estimator, the partial directed coherence, the two AR models were compared for their ability to reconstitute the designed seizure signal connections from noisy data. Next, focal seizures from iEEG recordings (73-113 channels) in three patients rendered seizure-free after surgery were mapped with the outdegree, a graph-theory index of outward directed connectivity. Simulation results indicated high levels of mapping accuracy for the two models in the presence of low-to-moderate noise cross-correlation. Accordingly, both AR models correctly mapped the real seizure onset to the resection volume. This study supports the possibility of conducting fully data-driven multivariate connectivity estimations on high-dimensional iEEG datasets using the Kalman filter approach.
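The outdegree index used for mapping can be illustrated with a minimal sketch. Several things here are assumptions rather than the authors' procedure: the connectivity values are invented, the matrix is oriented as `connectivity[source][target]`, and a connection is counted when its strength exceeds a fixed threshold (the study may instead have used statistical thresholds or a weighted sum).

```python
def outdegree(connectivity, threshold):
    """Count, for each source channel, its outgoing connections above threshold."""
    n = len(connectivity)
    return [
        sum(1 for target in range(n)
            if target != source and connectivity[source][target] > threshold)
        for source in range(n)
    ]

# Hypothetical directed connectivity among three iEEG channels.
connectivity = [
    [0.00, 0.80, 0.60],   # channel 0 drives channels 1 and 2
    [0.10, 0.00, 0.20],   # channel 1 drives nothing above threshold
    [0.05, 0.70, 0.00],   # channel 2 drives channel 1
]

degrees = outdegree(connectivity, threshold=0.5)
onset_candidate = max(range(len(degrees)), key=degrees.__getitem__)
print(degrees, onset_candidate)
```

The channel with the highest outdegree is the candidate seizure-onset node in this toy example.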
Duffy, Sonia A.; Ronis, David L.; McLean, Scott; Fowler, Karen E.; Gruber, Stephen B.; Wolf, Gregory T.; Terrell, Jeffrey E.
2009-01-01
Purpose Our prior work has shown that the health behaviors of head and neck cancer patients are interrelated and are associated with quality of life; however, other than smoking, the relationship between health behaviors and survival is unclear. Patients and Methods A prospective cohort study was conducted to determine the relationship between five pretreatment health behaviors (smoking, alcohol, diet, physical activity, and sleep) and all-cause survival among 504 head and neck cancer patients. Results Smoking status was the strongest predictor of survival, with both current smokers (hazard ratio [HR] = 2.4; 95% CI, 1.3 to 4.4) and former smokers (HR = 2.0; 95% CI, 1.2 to 3.5) showing significant associations with poor survival. Problem drinking was associated with survival in the univariate analysis (HR = 1.4; 95% CI, 1.0 to 2.0) but lost significance when controlling for other factors. Low fruit intake was negatively associated with survival in the univariate analysis only (HR = 1.6; 95% CI, 1.1 to 2.1), whereas vegetable intake was not significant in either univariate or multivariate analyses. Although physical activity was associated with survival in the univariate analysis (HR = 0.95; 95% CI, 0.93 to 0.97), it was not significant in the multivariate model. Sleep was not significantly associated with survival in either univariate or multivariate analysis. Control variables that were also independently associated with survival in the multivariate analysis were age, education, tumor site, cancer stage, and surgical treatment. Conclusion Variation in selected pretreatment health behaviors (eg, smoking, fruit intake, and physical activity) in this population is associated with variation in survival. PMID:19289626
Multivariate pattern analysis for MEG: A comparison of dissimilarity measures.
Guggenmos, Matthias; Sterzer, Philipp; Cichy, Radoslaw Martin
2018-06-01
Multivariate pattern analysis (MVPA) methods such as decoding and representational similarity analysis (RSA) are growing rapidly in popularity for the analysis of magnetoencephalography (MEG) data. However, little is known about the relative performance and characteristics of the specific dissimilarity measures used to describe differences between evoked activation patterns. Here we used a multisession MEG data set to qualitatively characterize a range of dissimilarity measures and to quantitatively compare them with respect to decoding accuracy (for decoding) and between-session reliability of representational dissimilarity matrices (for RSA). We tested dissimilarity measures from a range of classifiers (Linear Discriminant Analysis - LDA, Support Vector Machine - SVM, Weighted Robust Distance - WeiRD, Gaussian Naïve Bayes - GNB) and distances (Euclidean distance, Pearson correlation). In addition, we evaluated three key processing choices: 1) preprocessing (noise normalisation, removal of the pattern mean), 2) weighting decoding accuracies by decision values, and 3) computing distances in three different partitioning schemes (non-cross-validated, cross-validated, within-class-corrected). Four main conclusions emerged from our results. First, appropriate multivariate noise normalization substantially improved decoding accuracies and the reliability of dissimilarity measures. Second, LDA, SVM and WeiRD yielded high peak decoding accuracies and nearly identical time courses. Third, while using decoding accuracies for RSA was markedly less reliable than continuous distances, this disadvantage was ameliorated by decision-value-weighting of decoding accuracies. Fourth, the cross-validated Euclidean distance provided unbiased distance estimates and highly replicable representational dissimilarity matrices. 
Overall, we strongly advise the use of multivariate noise normalisation as a general preprocessing step, recommend LDA, SVM and WeiRD as classifiers for decoding and highlight the cross-validated Euclidean distance as a reliable and unbiased default choice for RSA. Copyright © 2018 Elsevier Inc. All rights reserved.
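The cross-validated Euclidean distance highlighted in this abstract can be sketched in a few lines. The idea, as commonly described, is to estimate the pattern difference between two conditions in two independent data partitions and take the inner product of the two estimates, so that independent noise does not add a positive bias. The sensor values below are invented, and the sketch leaves out the noise normalisation the authors recommend.

```python
def crossvalidated_sq_euclidean(a_part1, b_part1, a_part2, b_part2):
    """Inner product of the condition difference estimated in two partitions."""
    diff1 = [a - b for a, b in zip(a_part1, b_part1)]
    diff2 = [a - b for a, b in zip(a_part2, b_part2)]
    return sum(d1 * d2 for d1, d2 in zip(diff1, diff2))

# Hypothetical two-sensor patterns for conditions A and B in two partitions.
a_part1, b_part1 = [1.0, 2.0], [0.0, 0.0]
a_part2, b_part2 = [1.2, 1.8], [0.1, -0.1]

distance = crossvalidated_sq_euclidean(a_part1, b_part1, a_part2, b_part2)
print(round(distance, 3))
```

Unlike the ordinary squared Euclidean distance, this estimate can be negative when the two partitions disagree, which is what allows it to average to zero when there is no true difference.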
Multi-variant study of obesity risk genes in African Americans: The Jackson Heart Study.
Liu, Shijian; Wilson, James G; Jiang, Fan; Griswold, Michael; Correa, Adolfo; Mei, Hao
2016-11-30
Genome-wide association study (GWAS) has been successful in identifying obesity risk genes by single-variant association analysis. For this study, we designed steps of analysis strategy and aimed to identify multi-variant effects on obesity risk among candidate genes. Our analyses were focused on 2137 African American participants with body mass index measured in the Jackson Heart Study and 657 common single nucleotide polymorphisms (SNPs) genotyped at 8 GWAS-identified obesity risk genes. Single-variant association test showed that no SNPs reached significance after multiple testing adjustment. The following gene-gene interaction analysis, which was focused on SNPs with unadjusted p-value<0.10, identified 6 significant multi-variant associations. Logistic regression showed that SNPs in these associations did not have significant linear interactions; examination of genetic risk score evidenced that 4 multi-variant associations had significant additive effects of risk SNPs; and haplotype association test presented that all multi-variant associations contained one or several combinations of particular alleles or haplotypes, associated with increased obesity risk. Our study evidenced that obesity risk genes generated multi-variant effects, which can be additive or non-linear interactions, and multi-variant study is an important supplement to existing GWAS for understanding genetic effects of obesity risk genes. Copyright © 2016 Elsevier B.V. All rights reserved.
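A genetic risk score of the additive kind mentioned in this abstract can be sketched as a simple count of risk alleles. The SNP identifiers and genotypes are hypothetical, and the unweighted sum is an assumption; the abstract does not say whether the study weighted alleles by effect size.

```python
def genetic_risk_score(risk_allele_counts):
    """Unweighted score: total number of risk alleles carried across SNPs."""
    for snp, count in risk_allele_counts.items():
        if count not in (0, 1, 2):
            raise ValueError(f"{snp}: risk allele count must be 0, 1 or 2")
    return sum(risk_allele_counts.values())

# Hypothetical genotypes for one participant, coded as risk allele counts.
participant = {"snp_1": 2, "snp_2": 0, "snp_3": 1, "snp_4": 1}

score = genetic_risk_score(participant)
print(score)
```

Regressing obesity status on such a score tests for an additive effect of the SNPs in a multi-variant association.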
Guo, Jing; Yuan, Yahong; Dou, Pei; Yue, Tianli
2017-10-01
Fifty-one kiwifruit juice samples of seven kiwifruit varieties from five regions in China were analyzed to determine their polyphenols contents and to trace fruit varieties and geographical origins by multivariate statistical analysis. Twenty-one polyphenols belonging to four compound classes were determined by ultra-high-performance liquid chromatography coupled with ultra-high-resolution TOF mass spectrometry. (-)-Epicatechin, (+)-catechin, procyanidin B1 and caffeic acid derivatives were the predominant phenolic compounds in the juices. Principal component analysis (PCA) allowed a clear separation of the juices according to kiwifruit varieties. Stepwise linear discriminant analysis (SLDA) yielded satisfactory categorization of samples, provided 100% success rate according to kiwifruit varieties and 92.2% success rate according to geographical origins. The result showed that polyphenolic profiles of kiwifruit juices contain enough information to trace fruit varieties and geographical origins. Copyright © 2017 Elsevier Ltd. All rights reserved.
Fakayode, Sayo O; Mitchell, Breanna S; Pollard, David A
2014-08-01
Accurate understanding of analyte boiling points (BP) is of critical importance in gas chromatographic (GC) separation and crude oil refinery operation in petrochemical industries. This study reported the first combined use of GC separation and partial-least-squares (PLS1) multivariate regression analysis (MRA) of petrochemical structural activity relationship (SAR) for accurate BP determination of two commercially available (D3710 and MA VHP) calibration gas mix samples. The results of the BP determination using PLS1 multivariate regression were further compared with the results of the traditional simulated distillation method of BP determination. The developed PLS1 regression was able to correctly predict analyte BPs in the D3710 and MA VHP calibration gas mix samples, with root-mean-square-%-relative-errors (RMS%RE) of 6.4% and 10.8%, respectively. In contrast, the overall RMS%RE values of 32.9% and 40.4% obtained for BP determination in D3710 and MA VHP, respectively, using the traditional simulated distillation method were approximately four times larger than the corresponding RMS%RE of BP prediction using MRA, demonstrating the better predictive ability of MRA. The reported method is rapid, robust, and promising, and can be potentially used routinely for fast analysis, pattern recognition, and analyte BP determination in petrochemical industries. Copyright © 2014 Elsevier B.V. All rights reserved.
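The error metric quoted in this abstract can be sketched as below. The formula is an assumption based on the name of the metric (the square root of the mean squared percent relative error); the abstract does not spell it out, and the boiling points are invented for illustration.

```python
import math

def rms_percent_relative_error(true_values, predicted_values):
    """Root mean square of the per-sample percent relative errors."""
    percent_errors = [
        100.0 * (pred - true) / true
        for true, pred in zip(true_values, predicted_values)
    ]
    return math.sqrt(sum(e * e for e in percent_errors) / len(percent_errors))

# Hypothetical boiling points (arbitrary units); not taken from the study.
true_bp = [100.0, 100.0]
predicted_bp = [110.0, 90.0]

rms_re = rms_percent_relative_error(true_bp, predicted_bp)
print(rms_re)
# prints 10.0
```

Because each error is scaled by the true value, the metric treats a 10-unit miss on a low-boiling analyte as worse than the same miss on a high-boiling one.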
Multivariate Classification of Original and Fake Perfumes by Ion Analysis and Ethanol Content.
Gomes, Clêrton L; de Lima, Ari Clecius A; Loiola, Adonay R; da Silva, Abel B R; Cândido, Manuela C L; Nascimento, Ronaldo F
2016-07-01
The increased marketing of fake perfumes has encouraged us to investigate how to identify such products by their chemical characteristics and multivariate analysis. The aim of this study was to present an alternative approach to distinguish original from fake perfumes by means of the investigation of sodium, potassium, chloride ions, and ethanol contents by chemometric tools. For this, 50 perfumes were used (25 original and 25 counterfeit) for the analysis of ions (ion chromatography) and ethanol (gas chromatography). The results demonstrated that the fake perfume had low levels of ethanol and high levels of chloride compared to the original product. The data were treated by chemometric tools such as principal component analysis and linear discriminant analysis. This study proved that the analysis of ethanol is an effective method of distinguishing original from the fake products, and it may potentially be used to assist legal authorities in such cases. © 2016 American Academy of Forensic Sciences.
Coexpression of aPKCλ/ι and IL-6 in prostate cancer tissue correlates with biochemical recurrence.
Ishiguro, Hitoshi; Akimoto, Kazunori; Nagashima, Yoji; Kagawa, Eriko; Sasaki, Takeshi; Sano, Jin-yu; Takagawa, Ryo; Fujinami, Kiyoshi; Sasaki, Kazunori; Aoki, Ichiro; Ohno, Shigeo; Kubota, Yoshinobu; Uemura, Hiroji
2011-08-01
Atypical protein kinase C λ/ι (aPKCλ/ι) and interleukin-6 (IL-6) have been implicated in prostate cancer progression, the mechanisms of which have been demonstrated both in vitro and in vivo. However, the clinical significance of the correlation between the expressions of these factors remains to be clarified. In the present study, we report a significant correlation between aPKCλ/ι and IL-6 proteins in prostate cancer tissue by immunohistochemical staining. We evaluated the association of both proteins by analyzing clinicopathological parameters using chi-square test, Kaplan-Meier with log-rank test, and a Cox proportional hazard regression model in univariate and multivariate analyses. The results again showed that the expression of aPKCλ/ι and IL-6 correlates in prostate cancer tissue (P < 0.001). Atypical protein kinase C λ/ι was also found to correlate with the Gleason score (P < 0.001) and with biochemical recurrence after prostatectomy (P = 0.02). Furthermore, aPKCλ/ι correlated with biochemical recurrence in a Kaplan-Meier and log-rank test (P = 0.01) and Cox analysis (P = 0.02 in the univariate analysis, P = 0.02 in the multivariate analysis). The coexpression of aPKCλ/ι and IL-6 also correlated with biochemical recurrence by Kaplan-Meier and log-rank test (P = 0.005) and Cox analysis (P = 0.01 in the univariate analysis, P = 0.03 in the multivariate analysis). These results indicate a strong correlation between aPKCλ/ι and IL-6 in prostate tumors, and that the aPKCλ/ι-IL-6 axis is a reliable prognostic factor for the biochemical recurrence of this cancer. © 2011 Japanese Cancer Association.
Instrumental Neutron Activation Analysis and Multivariate Statistics for Pottery Provenance
NASA Astrophysics Data System (ADS)
Glascock, M. D.; Neff, H.; Vaughn, K. J.
2004-06-01
The application of instrumental neutron activation analysis and multivariate statistics to archaeological studies of ceramics and clays is described. A small pottery data set from the Nasca culture in southern Peru is presented for illustration.
Localization of genes involved in the metabolic syndrome using multivariate linkage analysis.
Olswold, Curtis; de Andrade, Mariza
2003-12-31
There are no well accepted criteria for the diagnosis of the metabolic syndrome. However, the metabolic syndrome is identified clinically by the presence of three or more of these five variables: larger waist circumference, higher triglyceride levels, lower HDL-cholesterol concentrations, hypertension, and impaired fasting glucose. We use sets of two or three variables, which are available in the Framingham Heart Study data set, to localize genes responsible for this syndrome using multivariate quantitative linkage analysis. This analysis demonstrates the applicability of using multivariate linkage analysis and how its use increases the power to detect linkage when genes are involved in the same disease mechanism.
A Comparative Analysis of Student Motivation in Traditional Classroom and E-Learning Courses
ERIC Educational Resources Information Center
Rovai, Alfred; Ponton, Michael; Wighting, Mervyn; Baker, Jason
2007-01-01
Multivariate analysis of variance was used to determine if there were differences in seven measures of motivation between students enrolled in 12 e-learning and 12 traditional classroom university courses (N = 353). Study results provide evidence that e-learning students possess stronger intrinsic motivation than on-campus students who attend…
ERIC Educational Resources Information Center
Shieh, Gwowen
2006-01-01
This paper considers the problem of analysis of correlation coefficients from a multivariate normal population. A unified theorem is derived for the regression model with normally distributed explanatory variables and the general results are employed to provide useful expressions for the distributions of simple, multiple, and partial-multiple…
NASA Astrophysics Data System (ADS)
Di Anibal, Carolina V.; Marsal, Lluís F.; Callao, M. Pilar; Ruisánchez, Itziar
2012-02-01
Raman spectroscopy combined with multivariate analysis was evaluated as a tool for detecting Sudan I dye in culinary spices. Three Raman modalities were studied: normal Raman, FT-Raman and SERS. The results show that SERS is the most appropriate modality, capable of providing a proper Raman signal when a complex matrix is analyzed. To remove the spectral noise and background, Savitzky-Golay smoothing with polynomial baseline correction and wavelet transform were applied. Finally, to check whether unadulterated samples can be differentiated from samples adulterated with Sudan I dye, an exploratory analysis, principal component analysis (PCA), was applied to raw data and to data processed with the two mentioned strategies. The results obtained by PCA show that Raman spectra need to be properly treated if useful information is to be obtained, and that both spectral treatments are appropriate for processing the Raman signal. The proposed methodology shows that SERS combined with appropriate spectral treatment can be used as a practical screening tool to identify samples suspected of being adulterated with Sudan I dye.
Kawashima, Atsunari; Nakai, Yasutomo; Nakayama, Masashi; Ujike, Takeshi; Tanigawa, Go; Ono, Yutaka; Kamoto, Akihito; Takada, Tsuyosi; Yamaguchi, Yuichiro; Takayama, Hitoshi; Nishimura, Kazuo; Nonomura, Norio; Tsujimura, Akira
2012-10-01
To determine through the analysis of our multi-institutional database whether postoperative adjuvant chemotherapy for localized invasive upper urinary tract carcinoma (UUTC) is beneficial. A study population of 93 patients with pT3N0/xM0 UUTC was eligible for this study. Clinical features evaluated were sex, tumor location, adjuvant chemotherapy status, tumor pathology (histology, grade, infiltrating growth, lymphovascular invasion (LVI)), and cause of death. Cancer-specific survival (CSS) was estimated by the Kaplan-Meier method. Prognostic factors related to CSS were analyzed by a Cox proportional hazards regression model for multivariate analysis. In pT3 patients, the overall 5-year CSS rate was 68.4% and median CSS time was 31 months (range 3-114 months). In the adjuvant chemotherapy group, the 5-year CSS rate was 80.8%, whereas it was 64.4% in the non-adjuvant chemotherapy group. By multivariate analysis, adjuvant chemotherapy status was significantly associated with CSS (P = 0.008), as were sex, tumor grade, tumor histology, and LVI presence. This study, although retrospective, revealed by multivariate analysis that adjuvant chemotherapy after radical nephroureterectomy (RNU) may be beneficial in pT3N0/X patients. Prospective studies evaluating adjuvant therapy regimens for UUTC are required.
Malaquias, José B; Ramalho, Francisco S; Dos S Dias, Carlos T; Brugger, Bruno P; S Lira, Aline Cristina; Wilcken, Carlos F; Pachú, Jéssica K S; Zanuncio, José C
2017-02-09
The relationship between pests and natural enemies on cotton at different spacings has not yet been documented using multivariate analysis. Using multivariate approaches, it is possible to optimize strategies to control Aphis gossypii at different crop spacings, because they allow better use of aphid sampling strategies as well as the conservation and release of the aphid's natural enemies. The aims of the study were (i) to characterize the temporal abundance data of aphids and their natural enemies using principal components, (ii) to analyze the degree of correlation between the insects and between groups of variables (pests and natural enemies), (iii) to identify the main natural enemies responsible for regulating A. gossypii populations, and (iv) to investigate the similarities in arthropod occurrence patterns at different spacings of cotton crops over two seasons. High correlations in the occurrence of Scymnus rubicundus with aphids are shown through principal component analysis and through the important role the species plays in canonical correlation analysis. Clustering the presence of apterous aphids matches the pattern verified for Chrysoperla externa at the three different spacings between rows. Our results indicate that S. rubicundus is the main candidate to regulate the aphid populations in all spacings studied.
Multivariate frequency domain analysis of protein dynamics
NASA Astrophysics Data System (ADS)
Matsunaga, Yasuhiro; Fuchigami, Sotaro; Kidera, Akinori
2009-03-01
Multivariate frequency domain analysis (MFDA) is proposed to characterize collective vibrational dynamics of protein obtained by a molecular dynamics (MD) simulation. MFDA performs principal component analysis (PCA) for a bandpass filtered multivariate time series using the multitaper method of spectral estimation. By applying MFDA to MD trajectories of bovine pancreatic trypsin inhibitor, we determined the collective vibrational modes in the frequency domain, which were identified by their vibrational frequencies and eigenvectors. At near zero temperature, the vibrational modes determined by MFDA agreed well with those calculated by normal mode analysis. At 300 K, the vibrational modes exhibited characteristic features that were considerably different from the principal modes of the static distribution given by the standard PCA. The influences of aqueous environments were discussed based on two different sets of vibrational modes, one derived from a MD simulation in water and the other from a simulation in vacuum. Using the varimax rotation, an algorithm of the multivariate statistical analysis, the representative orthogonal set of eigenmodes was determined at each vibrational frequency.
A refined method for multivariate meta-analysis and meta-regression
Jackson, Daniel; Riley, Richard D
2014-01-01
Making inferences about the average treatment effect using the random effects model for meta-analysis is problematic in the common situation where there is a small number of studies. This is because estimates of the between-study variance are not precise enough to accurately apply the conventional methods for testing and deriving a confidence interval for the average effect. We have found that a refined method for univariate meta-analysis, which applies a scaling factor to the estimated effects’ standard error, provides more accurate inference. We explain how to extend this method to the multivariate scenario and show that our proposal for refined multivariate meta-analysis and meta-regression can provide more accurate inferences than the more conventional approach. We explain how our proposed approach can be implemented using standard output from multivariate meta-analysis software packages and apply our methodology to two real examples. © 2013 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd. PMID:23996351
Rosen, Sophia; Davidov, Ori
2012-07-20
Multivariate outcomes are often measured longitudinally. For example, in hearing loss studies, hearing thresholds for each subject are measured repeatedly over time at several frequencies. Thus, each patient is associated with a multivariate longitudinal outcome. The multivariate mixed-effects model is a useful tool for the analysis of such data. There are situations in which the parameters of the model are subject to some restrictions or constraints. For example, it is known that hearing thresholds, at every frequency, increase with age. Moreover, this age-related threshold elevation is monotone in frequency, that is, the higher the frequency, the higher, on average, is the rate of threshold elevation. This means that there is a natural ordering among the different frequencies in the rate of hearing loss. In practice, this amounts to imposing a set of constraints on the different frequencies' regression coefficients modeling the mean effect of time and age at entry to the study on hearing thresholds. The aforementioned constraints should be accounted for in the analysis. The result is a multivariate longitudinal model with restricted parameters. We propose estimation and testing procedures for such models. We show that ignoring the constraints may lead to misleading inferences regarding the direction and the magnitude of various effects. Moreover, simulations show that incorporating the constraints substantially improves the mean squared error of the estimates and the power of the tests. We used this methodology to analyze a real hearing loss study. Copyright © 2012 John Wiley & Sons, Ltd.
Sampling effort affects multivariate comparisons of stream assemblages
Cao, Y.; Larsen, D.P.; Hughes, R.M.; Angermeier, P.L.; Patton, T.M.
2002-01-01
Multivariate analyses are used widely for determining patterns of assemblage structure, inferring species-environment relationships and assessing human impacts on ecosystems. The estimation of ecological patterns often depends on sampling effort, so the degree to which sampling effort affects the outcome of multivariate analyses is a concern. We examined the effect of sampling effort on site and group separation, which was measured using a mean similarity method. Two similarity measures, the Jaccard Coefficient and Bray-Curtis Index were investigated with 1 benthic macroinvertebrate and 2 fish data sets. Site separation was significantly improved with increased sampling effort because the similarity between replicate samples of a site increased more rapidly than between sites. Similarly, the faster increase in similarity between sites of the same group than between sites of different groups caused clearer separation between groups. The strength of site and group separation completely stabilized only when the mean similarity between replicates reached 1. These results are applicable to commonly used multivariate techniques such as cluster analysis and ordination because these multivariate techniques start with a similarity matrix. Completely stable outcomes of multivariate analyses are not feasible. Instead, we suggest 2 criteria for estimating the stability of multivariate analyses of assemblage data: 1) mean within-site similarity across all sites compared, indicating sample representativeness, and 2) the SD of within-site similarity across sites, measuring sample comparability.
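The two similarity measures and the two stability criteria named in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the toy abundance counts are hypothetical, the Bray-Curtis index is written in its similarity form, and nothing here reproduces the authors' data or their mean similarity method in detail.

```python
from itertools import combinations
from statistics import mean, stdev


def jaccard(a, b):
    """Jaccard coefficient on presence/absence: shared taxa / taxa in either sample."""
    present_a = {i for i, n in enumerate(a) if n > 0}
    present_b = {i for i, n in enumerate(b) if n > 0}
    union = present_a | present_b
    if not union:
        return 1.0  # assumed convention: two empty samples count as identical
    return len(present_a & present_b) / len(union)


def bray_curtis_similarity(a, b):
    """Bray-Curtis similarity on abundances: 2 * shared abundance / total abundance."""
    total = sum(a) + sum(b)
    if total == 0:
        return 1.0  # same assumed convention for empty samples
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 2.0 * shared / total


def within_site_similarity(replicates, similarity):
    """Mean similarity over all pairs of replicate samples from one site."""
    return mean(similarity(a, b) for a, b in combinations(replicates, 2))


# Hypothetical toy data: two sites, two replicate samples each, four taxa.
sites = {
    "site_A": [[10, 0, 3, 1], [8, 1, 4, 0]],
    "site_B": [[0, 5, 5, 2], [0, 6, 4, 2]],
}

within = [within_site_similarity(r, bray_curtis_similarity) for r in sites.values()]
mean_within = mean(within)  # criterion 1: sample representativeness
sd_within = stdev(within)   # criterion 2: sample comparability
print(mean_within, sd_within)
```

A mean within-site similarity close to 1 with a small SD would, on the abstract's reasoning, suggest that the replicates are representative and comparable across sites.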
Jackson, Dan; White, Ian R; Riley, Richard D
2013-01-01
Multivariate meta-analysis is becoming more commonly used. Methods for fitting the multivariate random effects model include maximum likelihood, restricted maximum likelihood, Bayesian estimation and multivariate generalisations of the standard univariate method of moments. Here, we provide a new multivariate method of moments for estimating the between-study covariance matrix with the properties that (1) it allows for either complete or incomplete outcomes and (2) it allows for covariates through meta-regression. Further, for complete data, it is invariant to linear transformations. Our method reduces to the usual univariate method of moments, proposed by DerSimonian and Laird, in a single dimension. We illustrate our method and compare it with some of the alternatives using a simulation study and a real example. PMID:23401213
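For orientation, the univariate DerSimonian and Laird method of moments, to which the abstract says the new method reduces in one dimension, can be sketched as follows. This is the standard textbook estimator, not the authors' multivariate extension; the function name and the example numbers are hypothetical.

```python
def dersimonian_laird(effects, variances):
    """Univariate DerSimonian-Laird random-effects meta-analysis.

    effects   -- study effect estimates y_i
    variances -- within-study variances v_i
    Returns (tau2, pooled_effect, standard_error).
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) weights and pooled estimate.
    w = [1.0 / v for v in variances]
    s1 = sum(w)
    s2 = sum(wi * wi for wi in w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / s1

    # Cochran's Q and the moment estimate of the between-study variance,
    # truncated at zero.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    tau2 = max(0.0, (q - (k - 1)) / (s1 - s2 / s1))

    # Random-effects weights include tau2 in each study's variance.
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = (1.0 / sum(w_star)) ** 0.5
    return tau2, pooled, se


# Hypothetical example: two studies with equal within-study variance.
tau2, pooled, se = dersimonian_laird([0.0, 1.0], [0.1, 0.1])
print(tau2, pooled, se)
```

In the multivariate setting the scalar between-study variance is replaced by a between-study covariance matrix, which is what the abstract's method estimates.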
Biostatistics Series Module 10: Brief Overview of Multivariate Methods.
Hazra, Avijit; Gogtay, Nithya
2017-01-01
Multivariate analysis refers to statistical techniques that simultaneously look at three or more variables in relation to the subjects under investigation with the aim of identifying or clarifying the relationships between them. These techniques have been broadly classified as dependence techniques, which explore the relationship between one or more dependent variables and their independent predictors, and interdependence techniques, that make no such distinction but treat all variables equally in a search for underlying relationships. Multiple linear regression models a situation where a single numerical dependent variable is to be predicted from multiple numerical independent variables. Logistic regression is used when the outcome variable is dichotomous in nature. The log-linear technique models count type of data and can be used to analyze cross-tabulations where more than two variables are included. Analysis of covariance is an extension of analysis of variance (ANOVA), in which an additional independent variable of interest, the covariate, is brought into the analysis. It tries to examine whether a difference persists after "controlling" for the effect of the covariate that can impact the numerical dependent variable of interest. Multivariate analysis of variance (MANOVA) is a multivariate extension of ANOVA used when multiple numerical dependent variables have to be incorporated in the analysis. Interdependence techniques are more commonly applied to psychometrics, social sciences and market research. Exploratory factor analysis and principal component analysis are related techniques that seek to extract from a larger number of metric variables, a smaller number of composite factors or components, which are linearly related to the original variables. Cluster analysis aims to identify, in a large number of cases, relatively homogeneous groups called clusters, without prior information about the groups. 
The calculation intensive nature of multivariate analysis has so far precluded most researchers from using these techniques routinely. The situation is now changing with wider availability, and increasing sophistication of statistical software and researchers should no longer shy away from exploring the applications of multivariate methods to real-life data sets.
Islam, Ebtesam A.; Limsuwat, Chok; Nantsupawat, Teerapat; Berdine, Gilbert G.; Nugent, Kenneth M.
2015-01-01
BACKGROUND: Corticosteroids used for chronic obstructive pulmonary disease (COPD) exacerbations can cause hyperglycemia in hospitalized patients, and hyperglycemia may be associated with increased mortality, length of stay (LOS), and re-admissions in these patients. MATERIALS AND METHODS: We did three retrospective studies using charts from July 2008 through June 2009, January 2006 through December 2010, and October 2010 through March 2011. We collected demographic and clinical information, laboratory results, radiographic results, and information on LOS, mortality, and re-admission. RESULTS: Glucose levels did not predict outcomes in any of the studied cohorts, after adjustment for covariates in multivariable analysis. The first database included 30 patients admitted to non-intensive care unit (ICU) hospital beds. Six of 20 non-diabetic patients had peak glucoses above 200 mg/dl. Nine of the ten diabetic patients had peak glucoses above 200 mg/dl. The maximum daily corticosteroid dose had no apparent effect on the glucose levels. The second database included 217 patients admitted to ICUs. The initial blood glucose was higher in patients who died than those who survived using bivariate analysis (P = 0.015; odds ratio, OR, 1.01) but not in multivariable analysis. Multivariable logistic regression analysis also demonstrated that glucose levels did not affect LOS. The third database analyzing COPD re-admission rates included 81 patients; the peak glucose levels were not associated with re-admission. CONCLUSIONS: Our data demonstrate that COPD patients treated with corticosteroids developed significant hyperglycemia, but the increase in blood glucose levels did not correlate with the maximum dose of corticosteroids. Blood glucose levels were not associated with mortality, LOS, or re-admission rates. PMID:25829959
[Analysis of variance of repeated data measured by water maze with SPSS].
Qiu, Hong; Jin, Guo-qin; Jin, Ru-feng; Zhao, Wei-kang
2007-01-01
To introduce a method for analyzing repeated-measures water maze data with SPSS 11.0, and to offer a reference statistical method to clinical and basic medicine researchers who use repeated-measures designs. The repeated measures and multivariate analysis of variance (ANOVA) procedures of the general linear model in SPSS were used, with pairwise comparisons among different groups and different measurement times. Firstly, Mauchly's test of sphericity should be used to judge whether there were relations among the repeatedly measured data. If any (P
Pilatti, Fernanda Kokowicz; Ramlov, Fernanda; Schmidt, Eder Carlos; Costa, Christopher; Oliveira, Eva Regina de; Bauer, Claudia M; Rocha, Miguel; Bouzon, Zenilda Laurita; Maraschin, Marcelo
2017-01-30
Fossil fuels, e.g. gasoline and diesel oil, account for a substantial share of the pollution that affects marine ecosystems. Environmental metabolomics is an emerging field that may help unravel the effect of these xenobiotics on seaweeds and provide methodologies for biomonitoring coastal ecosystems. In the present study, FTIR and multivariate analysis were used to discriminate metabolic profiles of Ulva lactuca after in vitro exposure to diesel oil and gasoline, in combinations of concentrations (0.001%, 0.01%, 0.1%, and 1.0% v/v) and times of exposure (30 min, 1 h, 12 h, and 24 h). PCA and HCA performed on the entire mid-infrared spectral window were able to discriminate diesel oil-exposed thalli from the gasoline-exposed ones. HCA performed on the spectral window related to protein absorbance (1700-1500 cm⁻¹) enabled the best discrimination between gasoline-exposed samples regarding the time of exposure, and between diesel oil-exposed samples according to the concentration. The results indicate that the combination of FTIR with multivariate analysis is a simple and efficient methodology for metabolic profiling with potential use for biomonitoring strategies. Copyright © 2016 Elsevier Ltd. All rights reserved.
Chieng, Norman; Trnka, Hjalte; Boetker, Johan; Pikal, Michael; Rantanen, Jukka; Grohganz, Holger
2013-09-15
The purpose of this study is to investigate the use of multivariate data analysis for powder X-ray diffraction-pair-wise distribution function (PXRD-PDF) data to detect phase separation in freeze-dried binary amorphous systems. Polymer-polymer and polymer-sugar binary systems at various ratios were freeze-dried. All samples were analyzed by PXRD, transformed to PDF and analyzed by principal component analysis (PCA). These results were validated by differential scanning calorimetry (DSC) through characterization of the glass transition of the maximally freeze-concentrated solute (Tg'). Analysis of PXRD-PDF data using PCA provides a clearer 'miscible' or 'phase separated' interpretation, through the distribution pattern of samples on a score plot, than the residual plot method. In a phase separated system, samples were found to be evenly distributed around the theoretical PDF profile. For systems that were miscible, a clear deviation of samples away from the theoretical PDF profile was observed. Moreover, PCA allows simultaneous analysis of replicate samples. The phase behavior analysis from the PXRD-PDF-PCA method was in agreement with the DSC results. Overall, the combined PXRD-PDF-PCA approach improves the clarity of the PXRD-PDF results and can be used as an alternative explorative data analytical tool in detecting phase separation in freeze-dried binary amorphous systems. Copyright © 2013 Elsevier B.V. All rights reserved.
Bujkiewicz, Sylwia; Riley, Richard D
2016-01-01
Multivariate random-effects meta-analysis allows the joint synthesis of correlated results from multiple studies, for example, for multiple outcomes or multiple treatment groups. In a Bayesian univariate meta-analysis of one endpoint, the importance of specifying a sensible prior distribution for the between-study variance is well understood. However, in multivariate meta-analysis, there is little guidance about the choice of prior distributions for the variances or, crucially, the between-study correlation, ρB; for the latter, researchers often use a Uniform(−1,1) distribution assuming it is vague. In this paper, an extensive simulation study and a real illustrative example is used to examine the impact of various (realistically) vague prior distributions for ρB and the between-study variances within a Bayesian bivariate random-effects meta-analysis of two correlated treatment effects. A range of diverse scenarios are considered, including complete and missing data, to examine the impact of the prior distributions on posterior results (for treatment effect and between-study correlation), amount of borrowing of strength, and joint predictive distributions of treatment effectiveness in new studies. Two key recommendations are identified to improve the robustness of multivariate meta-analysis results. First, the routine use of a Uniform(−1,1) prior distribution for ρB should be avoided, if possible, as it is not necessarily vague. Instead, researchers should identify a sensible prior distribution, for example, by restricting values to be positive or negative as indicated by prior knowledge. Second, it remains critical to use sensible (e.g. empirically based) prior distributions for the between-study variances, as an inappropriate choice can adversely impact the posterior distribution for ρB, which may then adversely affect inferences such as joint predictive probabilities. These recommendations are especially important with a small number of studies and missing data. 
PMID:26988929
Multivariate time series analysis of neuroscience data: some challenges and opportunities.
Pourahmadi, Mohsen; Noorbaloochi, Siamak
2016-04-01
Neuroimaging data may be viewed as high-dimensional multivariate time series, and analyzed using techniques from regression analysis, time series analysis and spatiotemporal analysis. We discuss issues related to data quality, model specification, estimation, interpretation, dimensionality and causality. Some recent research areas addressing aspects of some recurring challenges are introduced. Copyright © 2015 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Ding, Hao; Cao, Ming; DuPont, Andrew W.; Scott, Larry D.; Guha, Sushovan; Singhal, Shashideep; Younes, Mamoun; Pence, Isaac; Herline, Alan; Schwartz, David; Xu, Hua; Mahadevan-Jansen, Anita; Bi, Xiaohong
2016-03-01
Inflammatory bowel disease (IBD) is an idiopathic disease that is typically characterized by chronic inflammation of the gastrointestinal tract. Recently much effort has been devoted to the development of novel diagnostic tools that can assist physicians for fast, accurate, and automated diagnosis of the disease. Previous research based on Raman spectroscopy has shown promising results in differentiating IBD patients from normal screening cases. In the current study, we examined IBD patients in vivo through a colonoscope-coupled Raman system. Optical diagnosis for IBD discrimination was conducted based on full-range spectra using multivariate statistical methods. Further, we incorporated several feature selection methods in machine learning into the classification model. The diagnostic performance for disease differentiation was significantly improved after feature selection. Our results showed that improved IBD diagnosis can be achieved using Raman spectroscopy in combination with multivariate analysis and feature selection.
NASA Astrophysics Data System (ADS)
Yamasaki, Hideki; Morita, Shigeaki
2018-05-01
Multivariate curve resolution (MCR) was applied to a hetero-spectrally combined dataset consisting of mid-infrared (MIR) and near-infrared (NIR) spectra collected during the isothermal curing reaction of an epoxy resin. An epoxy monomer, bisphenol A diglycidyl ether (BADGE), and a hardening agent, 4,4'-diaminodiphenyl methane (DDM), were used for the reaction. The fundamental modes of the N-H and O-H stretches were highly overlapped in the MIR region, while their first overtones could be independently identified in the NIR region. The concentration profiles obtained by MCR using the hetero-spectral combination showed good agreement with the results of calculations based on the Beer-Lambert law and the mass balance. The band assignments and absorption sites estimated by the analysis also showed good agreement with the results using two-dimensional (2D) hetero-correlation spectroscopy.
NASA Technical Reports Server (NTRS)
Park, Steve
1990-01-01
A large and diverse number of computational techniques are routinely used to process and analyze remotely sensed data. These techniques include: univariate statistics; multivariate statistics; principal component analysis; pattern recognition and classification; other multivariate techniques; geometric correction; registration and resampling; radiometric correction; enhancement; restoration; Fourier analysis; and filtering. Each of these techniques will be considered, in order.
Chemical structure of wood charcoal by infrared spectroscopy and multivariate analysis
Nicole Labbe; David Harper; Timothy Rials; Thomas Elder
2006-01-01
In this work, the effect of temperature on charcoal structure and chemical composition is investigated for four tree species. Wood charcoal carbonized at various temperatures is analyzed by mid infrared spectroscopy coupled with multivariate analysis and by thermogravimetric analysis to characterize the chemical composition during the carbonization process. The...
Yan, Binjun; Fang, Zhonghua; Shen, Lijuan; Qu, Haibin
2015-01-01
The batch-to-batch quality consistency of herbal drugs has always been an important issue. To propose a methodology for batch-to-batch quality control based on HPLC-MS fingerprints and process knowledgebase. The extraction process of Compound E-jiao Oral Liquid was taken as a case study. After establishing the HPLC-MS fingerprint analysis method, the fingerprints of the extract solutions produced under normal and abnormal operation conditions were obtained. Multivariate statistical models were built for fault detection and a discriminant analysis model was built using the probabilistic discriminant partial-least-squares method for fault diagnosis. Based on multivariate statistical analysis, process knowledge was acquired and the cause-effect relationship between process deviations and quality defects was revealed. The quality defects were detected successfully by multivariate statistical control charts and the type of process deviations were diagnosed correctly by discriminant analysis. This work has demonstrated the benefits of combining HPLC-MS fingerprints, process knowledge and multivariate analysis for the quality control of herbal drugs. Copyright © 2015 John Wiley & Sons, Ltd.
Kukreti, B M; Pandey, Pradeep; Singh, R V
2012-08-01
Non-coring based exploratory drilling was undertaken in the sedimentary environment of Rangsohkham block, East Khasi Hills district, to examine the eastern extension of existing uranium resources located at Domiasiat and Wakhyn in the Mahadek basin of Meghalaya (India). Although radiometric survey and radiometric analysis of surface grab/channel samples in the block indicate high uranium content, the gamma ray logging of exploratory boreholes in the block did not yield the expected results. To understand this abrupt discontinuity between the two sets of data (surface and subsurface), multivariate statistical analysis of primordial radioactive elements (⁴⁰K, ²³⁸U and ²³²Th) was performed using the concept of representative subsurface samples, drawn from 11 randomly selected boreholes of this block. The study was performed to a high confidence level (99%), and results are discussed for assessing the U and Th behavior in the block. Results not only confirm the continuation of three distinct geological formations in the area but also the uranium bearing potential in the Mahadek sandstone of the eastern part of Mahadek Basin. Copyright © 2012 Elsevier Ltd. All rights reserved.
Comparison of plant cover of river valley fragments by using GIS tools and multivariate analysis
NASA Astrophysics Data System (ADS)
Waldon-Rudzionek, Barbara
2017-11-01
Selected landscape registers and results of ecological analyses of flora used in studies of transformations of anthropogenic plant cover and river valley landscapes were presented. The results were shown pursuant to a comparison of fragments of two adjacent valleys in north-western Poland.
NASA Astrophysics Data System (ADS)
Sikora, Roman; Markiewicz, Przemysław; Pabjańczyk, Wiesława
2018-04-01
Power systems usually include a number of nonlinear receivers. Nonlinear receivers are a source of disturbances injected into the power system in the form of higher harmonics. The level of these disturbances is described by the total harmonic distortion coefficient (THD). Its value depends on many factors, among them the deformation and change in RMS value of the supply voltage. A modern LED luminaire is a nonlinear receiver as well. The paper presents the results of an analysis of the influence of the change in RMS value of the supply voltage and the dimming level of the tested luminaire on the value of the current THD. The analysis was made using a mathematical model based on multivariable polynomial fitting.
Multivariate analysis: greater insights into complex systems
USDA-ARS's Scientific Manuscript database
Many agronomic researchers measure and collect multiple response variables in an effort to understand the more complex nature of the system being studied. Multivariate (MV) statistical methods encompass the simultaneous analysis of all random variables (RV) measured on each experimental or sampling ...
Extracting chemical information from high-resolution Kβ X-ray emission spectroscopy
NASA Astrophysics Data System (ADS)
Limandri, S.; Robledo, J.; Tirao, G.
2018-06-01
High-resolution X-ray emission spectroscopy allows studying the chemical environment of a wide variety of materials. Chemical information can be obtained by fitting the X-ray spectra and observing the behavior of some spectral features. Spectral changes can also be quantified by means of statistical parameters calculated by considering the spectrum as a probability distribution. Another possibility is to perform statistical multivariate analysis, such as principal component analysis. In this work the performance of these procedures for extracting chemical information from X-ray emission spectra of mixtures of Mn2+ and Mn4+ oxides is studied. A detailed analysis of the parameters obtained, as well as the associated uncertainties, is shown. The methodologies are also applied to the Mn oxidation state characterization of double perovskite oxides Ba1+xLa1-xMnSbO6 (with 0 ≤ x ≤ 0.7). The results show that statistical parameters and multivariate analysis are the most suitable for the analysis of this kind of spectra.
NASA Astrophysics Data System (ADS)
He, Shixuan; Xie, Wanyi; Zhang, Wei; Zhang, Liqun; Wang, Yunxia; Liu, Xiaoling; Liu, Yulong; Du, Chunlei
2015-02-01
A novel strategy which combines iteratively cubic spline fitting baseline correction method with discriminant partial least squares qualitative analysis is employed to analyze the surface enhanced Raman scattering (SERS) spectroscopy of banned food additives, such as Sudan I dye and Rhodamine B in food, Malachite green residues in aquaculture fish. Multivariate qualitative analysis methods, using the combination of spectra preprocessing iteratively cubic spline fitting (ICSF) baseline correction with principal component analysis (PCA) and discriminant partial least squares (DPLS) classification respectively, are applied to investigate the effectiveness of SERS spectroscopy for predicting the class assignments of unknown banned food additives. PCA cannot be used to predict the class assignments of unknown samples. However, the DPLS classification can discriminate the class assignment of unknown banned additives using the information of differences in relative intensities. The results demonstrate that SERS spectroscopy combined with ICSF baseline correction method and exploratory analysis methodology DPLS classification can be potentially used for distinguishing the banned food additives in field of food safety.
Grootswagers, Tijl; Wardle, Susan G; Carlson, Thomas A
2017-04-01
Multivariate pattern analysis (MVPA) or brain decoding methods have become standard practice in analyzing fMRI data. Although decoding methods have been extensively applied in brain-computer interfaces, these methods have only recently been applied to time series neuroimaging data such as MEG and EEG to address experimental questions in cognitive neuroscience. In a tutorial style review, we describe a broad set of options to inform future time series decoding studies from a cognitive neuroscience perspective. Using example MEG data, we illustrate the effects that different options in the decoding analysis pipeline can have on experimental results where the aim is to "decode" different perceptual stimuli or cognitive states over time from dynamic brain activation patterns. We show that decisions made at both preprocessing (e.g., dimensionality reduction, subsampling, trial averaging) and decoding (e.g., classifier selection, cross-validation design) stages of the analysis can significantly affect the results. In addition to standard decoding, we describe extensions to MVPA for time-varying neuroimaging data including representational similarity analysis, temporal generalization, and the interpretation of classifier weight maps. Finally, we outline important caveats in the design and interpretation of time series decoding experiments.
Dermatoglyphic analysis of La Liébana (Cantabria, Spain). 2. Finger ridge counts.
Martín, J; Gómez, P
1993-06-01
The results of univariate and multivariate analyses of the quantitative finger dermatoglyphic traits (i.e. ridge counts) of a sample of 109 males and 88 females from La Liébana (Cantabria, Spain) are reported. Univariate results follow the trends usually found in previous studies, e.g., ranking of finger ridge counts, bilateral asymmetry or shape of the distributions of the frequencies. However, sexual dimorphism is nearly inexistent concerning finger ridge counts. This lack of dimorphism could be related to certain characteristics of the distribution of finger dermatoglyphic patterns previously reported by the same authors. The multivariate description has been carried out by means of principal component analysis (with varimax rotation to obtain the final solution) of the correlation matrices computed from the 10 maximal finger ridge counts. Although the results do not necessarily prove the concept of developmental fields ("field theory" and later modifications), some precepts of the theory are present: field polarization and field overlapping.
Practical robustness measures in multivariable control system analysis. Ph.D. Thesis
NASA Technical Reports Server (NTRS)
Lehtomaki, N. A.
1981-01-01
The robustness of the stability of multivariable linear time invariant feedback control systems with respect to model uncertainty is considered using frequency domain criteria. Available robustness tests are unified under a common framework based on the nature and structure of model errors. These results are derived using a multivariable version of Nyquist's stability theorem in which the minimum singular value of the return difference transfer matrix is shown to be the multivariable generalization of the distance to the critical point on a single input, single output Nyquist diagram. Using the return difference transfer matrix, a very general robustness theorem is presented from which all of the robustness tests dealing with specific model errors may be derived. The robustness tests that explicitly utilized model error structure are able to guarantee feedback system stability in the face of model errors of larger magnitude than those robustness tests that do not. The robustness of linear quadratic Gaussian control systems are analyzed.
Huang, Cheng-Wei; Sue, Pei-Der; Abbod, Maysam F.; Jiang, Bernard C.; Shieh, Jiann-Shing
2013-08-08
To assess the improvement of human body balance, a low cost and portable measuring device of center of pressure (COP), known as center of pressure and complexity monitoring system (CPCMS), has been developed for data logging and analysis. In order to prove that the system can estimate the different magnitude of different sways in comparison with the commercial Advanced Mechanical Technology Incorporation (AMTI) system, four sway tests have been developed (i.e., eyes open, eyes closed, eyes open with water pad, and eyes closed with water pad) to produce different sway displacements. Firstly, static and dynamic tests were conducted to investigate the feasibility of the system. Then, correlation tests of the CPCMS and AMTI systems have been compared with four sway tests. The results are within the acceptable range. Furthermore, multivariate empirical mode decomposition (MEMD) and enhanced multivariate multiscale entropy (MMSE) analysis methods have been used to analyze COP data reported by the CPCMS and compare it with the AMTI system. The improvements of the CPCMS are 35% to 70% (open eyes test) and 60% to 70% (eyes closed test) with and without water pad. The AMTI system has shown an improvement of 40% to 80% (open eyes test) and 65% to 75% (closed eyes test). The results indicate that the CPCMS system can achieve similar results to the commercial product so it can determine the balance. PMID:23966184
Changes in Concurrent Risk of Warm and Dry Years under Impact of Climate Change
NASA Astrophysics Data System (ADS)
Sarhadi, A.; Wiper, M.; Touma, D. E.; Ausín, M. C.; Diffenbaugh, N. S.
2017-12-01
Anthropogenic global warming has changed the nature and the risk of extreme climate phenomena. The changing concurrence of multiple climatic extremes (warm and dry years) may result in intensification of undesirable consequences for water resources, human and ecosystem health, and environmental equity. The present study assesses how global warming influences the probability that warm and dry years co-occur at a global scale. In the first step of the study a designed multivariate Mann-Kendall trend analysis is used to detect the areas in which the concurrence of warm and dry years has increased in the historical climate records and in climate models at the global scale. The next step investigates the concurrent risk of the extremes under dynamic nonstationary conditions. A fully generalized multivariate risk framework is designed to evolve through time under dynamic nonstationary conditions. In this methodology, Bayesian dynamic copulas are developed to model the time-varying dependence structure between the two different climate extremes (warm and dry years). The results reveal an increasing trend in the concurrence risk of warm and dry years, which is in agreement with the multivariate trend analysis from historical records and climate models. In addition to providing a novel quantification of the changing probability of compound extreme events, the results of this study can help decision makers develop short- and long-term strategies to prepare for climate stresses now and in the future.
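As an illustration of the trend test named in the abstract above, the following is a minimal sketch of the classical univariate Mann-Kendall statistic, not the multivariate variant designed by the authors. The function name `mann_kendall` is hypothetical, and the variance formula assumes there are no tied values in the series.

```python
import math

def mann_kendall(series):
    """Return (S, Z) for the classical univariate Mann-Kendall trend test.

    Assumption: the series has no tied values, so the simple variance
    formula n(n-1)(2n+5)/18 applies without a tie correction.
    """
    n = len(series)
    s = 0
    # S counts concordant minus discordant pairs (later value vs earlier value)
    for i in range(n - 1):
        for j in range(i + 1, n):
            diff = series[j] - series[i]
            s += (diff > 0) - (diff < 0)
    var_s = n * (n - 1) * (2 * n + 5) / 18.0
    # Continuity-corrected normal approximation
    if s > 0:
        z = (s - 1) / math.sqrt(var_s)
    elif s < 0:
        z = (s + 1) / math.sqrt(var_s)
    else:
        z = 0.0
    return s, z
```

A positive Z suggests an increasing trend and a negative Z a decreasing one; significance would be judged against the standard normal distribution.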
Kilborn, Joshua P; Jones, David L; Peebles, Ernst B; Naar, David F
2017-04-01
Clustering data continues to be a highly active area of data analysis, and resemblance profiles are being incorporated into ecological methodologies as a hypothesis testing-based approach to clustering multivariate data. However, these new clustering techniques have not been rigorously tested to determine the performance variability based on the algorithm's assumptions or any underlying data structures. Here, we use simulation studies to estimate the statistical error rates for the hypothesis test for multivariate structure based on dissimilarity profiles (DISPROF). We concurrently tested a widely used algorithm that employs the unweighted pair group method with arithmetic mean (UPGMA) to estimate the proficiency of clustering with DISPROF as a decision criterion. We simulated unstructured multivariate data from different probability distributions with increasing numbers of objects and descriptors, and grouped data with increasing overlap, overdispersion for ecological data, and correlation among descriptors within groups. Using simulated data, we measured the resolution and correspondence of clustering solutions achieved by DISPROF with UPGMA against the reference grouping partitions used to simulate the structured test datasets. Our results highlight the dynamic interactions between dataset dimensionality, group overlap, and the properties of the descriptors within a group (i.e., overdispersion or correlation structure) that are relevant to resemblance profiles as a clustering criterion for multivariate data. These methods are particularly useful for multivariate ecological datasets that benefit from distance-based statistical analyses. We propose guidelines for using DISPROF as a clustering decision tool that will help future users avoid potential pitfalls during the application of methods and the interpretation of results.
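The UPGMA step mentioned in the abstract above can be sketched as plain average-linkage agglomerative clustering. This is a minimal illustration only: the function name `upgma` is hypothetical, Euclidean distance is an assumption, and a fixed cluster count `k` stands in for the DISPROF hypothesis test that the authors use as the stopping criterion.

```python
import math

def upgma(points, k):
    """Average-linkage (UPGMA) agglomerative clustering until k clusters remain.

    points: list of coordinate tuples; returns clusters as sorted index lists.
    Assumption: a fixed k replaces the DISPROF-based decision rule.
    """
    clusters = [[i] for i in range(len(points))]

    def avg_link(c1, c2):
        # Unweighted mean of all pairwise distances between the two clusters
        total = sum(math.dist(points[a], points[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = avg_link(clusters[i], clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        # Merge the closest pair of clusters
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]
```

For ecological data a dissimilarity such as Bray-Curtis would likely replace the Euclidean distance used here.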
Li, Yan; Zhang, Ji; Zhao, Yanli; Liu, Honggao; Wang, Yuanzhong; Jin, Hang
2016-01-01
In this study the geographical differentiation of dried sclerotia of the medicinal mushroom Wolfiporia extensa, obtained from different regions in Yunnan Province, China, was explored using Fourier-transform infrared (FT-IR) spectroscopy coupled with multivariate data analysis. The FT-IR spectra of 97 samples were obtained for wave numbers ranging from 4000 to 400 cm⁻¹. Then, the fingerprint region of 1800-600 cm⁻¹ of the FT-IR spectrum, rather than the full spectrum, was analyzed. Different pretreatments were applied on the spectra, and a discriminant analysis model based on the Mahalanobis distance was developed to select an optimal pretreatment combination. Two unsupervised pattern recognition procedures (principal component analysis and hierarchical cluster analysis) were applied to enhance the authenticity of discrimination of the specimens. The results showed that excellent classification could be obtained after optimizing spectral pretreatment. The tested samples were successfully discriminated according to their geographical locations. The chemical properties of dried sclerotia of W. extensa were clearly dependent on the mushroom's geographical origins. Furthermore, an interesting finding implied that the elevations of collection areas may have effects on the chemical components of wild W. extensa sclerotia. Overall, this study highlights the feasibility of FT-IR spectroscopy combined with multivariate data analysis in particular for exploring the distinction of different regional W. extensa sclerotia samples. This research could also serve as a basis for the exploitation and utilization of medicinal mushrooms.
MULTIVARIATE CURVE RESOLUTION OF NMR SPECTROSCOPY METABONOMIC DATA
Sandia National Laboratories is working with the EPA to evaluate and develop mathematical tools for analysis of the collected NMR spectroscopy data. Initially, we have focused on the use of Multivariate Curve Resolution (MCR) also known as molecular factor analysis (MFA), a tech...
Drunk driving detection based on classification of multivariate time series.
Li, Zhenlong; Jin, Xue; Zhao, Xiaohua
2015-09-01
This paper addresses the problem of detecting drunk driving based on classification of multivariate time series. First, driving performance measures were collected from a test in a driving simulator located in the Traffic Research Center, Beijing University of Technology. Lateral position and steering angle were used to detect drunk driving. Second, multivariate time series analysis was performed to extract the features. A piecewise linear representation was used to represent multivariate time series. A bottom-up algorithm was then employed to separate multivariate time series. The slope and time interval of each segment were extracted as the features for classification. Third, a support vector machine classifier was used to classify driver's state into two classes (normal or drunk) according to the extracted features. The proposed approach achieved an accuracy of 80.0%. Drunk driving detection based on the analysis of multivariate time series is feasible and effective. The approach has implications for drunk driving detection. Copyright © 2015 Elsevier Ltd and National Safety Council. All rights reserved.
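The feature-extraction steps described above (piecewise linear representation, bottom-up segmentation, slope and time interval per segment) can be sketched as follows. This is an illustrative single-channel version under stated assumptions: the function names, the squared-error merge cost, and the `max_error` threshold are hypothetical choices, not the authors' published parameters, and the paper applies the idea to several signals at once.

```python
def fit_line(ts, ys):
    """Least-squares line through (t, y) points; returns (slope, intercept, sse)."""
    n = len(ts)
    mt, my = sum(ts) / n, sum(ys) / n
    stt = sum((t - mt) ** 2 for t in ts)
    slope = sum((t - mt) * (y - my) for t, y in zip(ts, ys)) / stt if stt else 0.0
    intercept = my - slope * mt
    sse = sum((y - (slope * t + intercept)) ** 2 for t, y in zip(ts, ys))
    return slope, intercept, sse

def bottom_up_segments(ts, ys, max_error):
    """Greedily merge adjacent segments while the cheapest merge costs <= max_error.

    Returns a list of (start_index, end_index) pairs, both inclusive.
    """
    # Start from the finest segmentation: consecutive pairs of samples
    segs = [(i, min(i + 1, len(ts) - 1)) for i in range(0, len(ts), 2)]
    while len(segs) > 1:
        costs = []
        for k in range(len(segs) - 1):
            lo, hi = segs[k][0], segs[k + 1][1]
            costs.append(fit_line(ts[lo:hi + 1], ys[lo:hi + 1])[2])
        k = min(range(len(costs)), key=costs.__getitem__)
        if costs[k] > max_error:
            break
        segs[k:k + 2] = [(segs[k][0], segs[k + 1][1])]
    return segs

def segment_features(ts, ys, segs):
    """Slope and time interval of each segment, as inputs for a classifier."""
    feats = []
    for lo, hi in segs:
        slope, _, _ = fit_line(ts[lo:hi + 1], ys[lo:hi + 1])
        feats.append((slope, ts[hi] - ts[lo]))
    return feats
```

The resulting (slope, interval) pairs would then be passed to a classifier such as a support vector machine, which is outside the scope of this sketch.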
NASA Astrophysics Data System (ADS)
DSouza, Adora M.; Abidin, Anas Z.; Leistritz, Lutz; Wismüller, Axel
2017-02-01
We investigate the applicability of large-scale Granger Causality (lsGC) for extracting a measure of multivariate information flow between pairs of regional brain activities from resting-state functional MRI (fMRI) and test the effectiveness of these measures for predicting a disease state. Such pairwise multivariate measures of interaction provide high-dimensional representations of connectivity profiles for each subject and are used in a machine learning task to distinguish between healthy controls and individuals presenting with symptoms of HIV Associated Neurocognitive Disorder (HAND). Cognitive impairment in several domains can occur as a result of HIV infection of the central nervous system. The current paradigm for assessing such impairment is through neuropsychological testing. With fMRI data analysis, we aim at non-invasively capturing differences in brain connectivity patterns between healthy subjects and subjects presenting with symptoms of HAND. To classify the extracted interaction patterns among brain regions, we use a prototype-based learning algorithm called Generalized Matrix Learning Vector Quantization (GMLVQ). Our approach to characterize connectivity using lsGC followed by GMLVQ for subsequent classification yields good prediction results with an accuracy of 87% and an area under the ROC curve (AUC) of up to 0.90. We obtain a statistically significant improvement (p<0.01) over a conventional Granger causality approach (accuracy = 0.76, AUC = 0.74). High accuracy and AUC values using our multivariate method to connectivity analysis suggests that our approach is able to better capture changes in interaction patterns between different brain regions when compared to conventional Granger causality analysis known from the literature.
Assessment of water quality parameters using multivariate analysis for Klang River basin, Malaysia.
Mohamed, Ibrahim; Othman, Faridah; Ibrahim, Adriana I N; Alaa-Eldin, M E; Yunus, Rossita M
2015-01-01
This case study uses several univariate and multivariate statistical techniques to evaluate and interpret a water quality data set obtained from the Klang River basin located within the state of Selangor and the Federal Territory of Kuala Lumpur, Malaysia. The river drains an area of 1,288 km², from the steep mountain rainforests of the main Central Range along Peninsular Malaysia to the river mouth in Port Klang, into the Straits of Malacca. Water quality was monitored at 20 stations, nine of which are situated along the main river and 11 along six tributaries. Data was collected from 1997 to 2007 for seven parameters used to evaluate the status of the water quality, namely dissolved oxygen, biochemical oxygen demand, chemical oxygen demand, suspended solids, ammoniacal nitrogen, pH, and temperature. The data were first investigated using descriptive statistical tools, followed by two practical multivariate analyses that reduced the data dimensions for better interpretation. The analyses employed were factor analysis and principal component analysis, which explain 60 and 81.6% of the total variation in the data, respectively. We found that the resulting latent variables from the factor analysis are interpretable and beneficial for describing the water quality in the Klang River. This study presents the usefulness of several statistical methods in evaluating and interpreting water quality data for the purpose of monitoring the effectiveness of water resource management. The results should provide more straightforward data interpretation as well as valuable insight for managers to conceive optimum action plans for controlling pollution in river water.
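To make the notion of "percentage of total variation explained" concrete, here is a minimal sketch of principal component analysis for just two variables, where the eigenvalues of the covariance matrix have a closed form. The function name `pca_two_variables` is hypothetical, and the sketch works on the unstandardized covariance matrix; the study itself uses seven parameters and may have standardized them first.

```python
import math

def pca_two_variables(xs, ys):
    """Fraction of total variance carried by each of the two principal components."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Sample covariance matrix [[a, b], [b, c]]
    a = sum((x - mx) ** 2 for x in xs) / (n - 1)
    c = sum((y - my) ** 2 for y in ys) / (n - 1)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    # Closed-form eigenvalues of a symmetric 2x2 matrix
    mid = (a + c) / 2.0
    half_gap = math.sqrt(((a - c) / 2.0) ** 2 + b ** 2)
    lam1, lam2 = mid + half_gap, mid - half_gap
    total = lam1 + lam2
    return lam1 / total, lam2 / total
```

With more than two variables the same ratio is computed from the eigenvalues of the full covariance or correlation matrix, typically with a numerical linear algebra library.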
The Recoverability of P-Technique Factor Analysis
ERIC Educational Resources Information Center
Molenaar, Peter C. M.; Nesselroade, John R.
2009-01-01
It seems that just when we are about to lay P-technique factor analysis finally to rest as obsolete because of newer, more sophisticated multivariate time-series models using latent variables--dynamic factor models--it rears its head to inform us that an obituary may be premature. We present the results of some simulations demonstrating that even…
ERIC Educational Resources Information Center
Muslihah, Oleh Eneng
2015-01-01
The research examines the correlation between the understanding of school-based management, emotional intelligences and headmaster performance. Data was collected, using quantitative methods. The statistical analysis used was the Pearson Correlation, and multivariate regression analysis. The results of this research suggest firstly that there is…
Hebart, Martin N.; Görgen, Kai; Haynes, John-Dylan
2015-01-01
The multivariate analysis of brain signals has recently sparked a great amount of interest, yet accessible and versatile tools to carry out decoding analyses are scarce. Here we introduce The Decoding Toolbox (TDT) which represents a user-friendly, powerful and flexible package for multivariate analysis of functional brain imaging data. TDT is written in Matlab and equipped with an interface to the widely used brain data analysis package SPM. The toolbox allows running fast whole-brain analyses, region-of-interest analyses and searchlight analyses, using machine learning classifiers, pattern correlation analysis, or representational similarity analysis. It offers automatic creation and visualization of diverse cross-validation schemes, feature scaling, nested parameter selection, a variety of feature selection methods, multiclass capabilities, and pattern reconstruction from classifier weights. While basic users can implement a generic analysis in one line of code, advanced users can extend the toolbox to their needs or exploit the structure to combine it with external high-performance classification toolboxes. The toolbox comes with an example data set which can be used to try out the various analysis methods. Taken together, TDT offers a promising option for researchers who want to employ multivariate analyses of brain activity patterns. PMID:25610393
ANALYSIS OF LOTIC MACROINVERTEBRATE ASSEMBLAGES IN CALIFORNIA'S CENTRAL VALLEY
Using multivariate and cluster analyses, we examined the relationships between chemical and physical characteristics and macroinvertebrate assemblages at sites sampled by R-EMAP in California's Central Valley. By contrasting results where community structure was summarized as met...
Application of multivariable statistical techniques in plant-wide WWTP control strategies analysis.
Flores, X; Comas, J; Roda, I R; Jiménez, L; Gernaey, K V
2007-01-01
The main objective of this paper is to present the application of selected multivariable statistical techniques in plant-wide wastewater treatment plant (WWTP) control strategies analysis. In this study, cluster analysis (CA), principal component analysis/factor analysis (PCA/FA) and discriminant analysis (DA) are applied to the evaluation matrix data set obtained by simulation of several control strategies applied to the plant-wide IWA Benchmark Simulation Model No 2 (BSM2). These techniques make it possible i) to determine natural groups or clusters of control strategies with a similar behaviour, ii) to find and interpret hidden, complex and causal relation features in the data set and iii) to identify important discriminant variables within the groups found by the cluster analysis. This study illustrates the usefulness of multivariable statistical techniques for both analysis and interpretation of the complex multicriteria data sets and allows an improved use of information for effective evaluation of control strategies.
ERIC Educational Resources Information Center
Barton, Mitch; Yeatts, Paul E.; Henson, Robin K.; Martin, Scott B.
2016-01-01
There has been a recent call to improve data reporting in kinesiology journals, including the appropriate use of univariate and multivariate analysis techniques. For example, a multivariate analysis of variance (MANOVA) with univariate post hocs and a Bonferroni correction is frequently used to investigate group differences on multiple dependent…
NASA Astrophysics Data System (ADS)
Gaudio, P.; Malizia, A.; Gelfusa, M.; Martinelli, E.; Di Natale, C.; Poggi, L. A.; Bellecci, C.
2017-01-01
Nowadays Toxic Industrial Components (TICs) and Toxic Industrial Materials (TIMs) are among the most dangerous and widespread vehicles of contamination in urban and industrial areas. The academic world, together with the industrial and military ones, is working on innovative solutions to monitor the diffusion of such pollutants in the atmosphere. At present the most common commercial sensors are based on “point detection” technology, but it is clear that such instruments cannot satisfy the needs of smart cities. The new challenge is developing stand-off systems to continuously monitor the atmosphere. The Quantum Electronics and Plasma Physics (QEP) research group has long experience in laser system development and has built two demonstrators based on DIAL (Differential Absorption of Light) technology that could identify chemical agents in the atmosphere. In this work the authors present one of those DIAL systems, the miniaturized one, together with the preliminary results of an experimental campaign conducted on TIC and TIM simulants in a cell, with the aim of using the absorption database for further atmospheric analysis with the same DIAL system. The experimental results are analysed with a standard multivariate data analysis technique, Principal Component Analysis (PCA), to develop a classification model aimed at identifying organic chemical compounds in the atmosphere. The preliminary absorption coefficients of some chemical compounds are shown together with the preliminary PCA analysis.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rades, Dirk, E-mail: Rades.Dirk@gmx.net; Setter, Cornelia; Dahl, Olav
2012-01-01
Purpose: The prognostic value of the tumor cell expression of the fibroblast growth factor 2 (FGF-2) in patients with non-small-cell lung cancer (NSCLC) is unclear. The present study investigated the effect of tumor cell expression of FGF-2 on the outcome of 60 patients irradiated for Stage II-III NSCLC. Methods and Materials: The effect of FGF-2 expression and 13 additional factors on locoregional control (LRC), metastasis-free survival (MFS), and overall survival (OS) were retrospectively evaluated. These additional factors included age, gender, Karnofsky performance status, histologic type, histologic grade, T and N category, American Joint Committee on Cancer stage, surgery, chemotherapy, pack-years, smoking during radiotherapy, and hemoglobin during radiotherapy. Locoregional failure was identified by endoscopy or computed tomography. Univariate analyses were performed with the Kaplan-Meier method and the Wilcoxon test and multivariate analyses with the Cox proportional hazard model. Results: On univariate analysis, improved LRC was associated with surgery (p = .017), greater hemoglobin levels (p = .036), and FGF-2 negativity (p <.001). On multivariate analysis of LRC, surgery (relative risk [RR], 2.44; p = .037), and FGF-2 expression (RR, 5.06; p <.001) maintained significance. On univariate analysis, improved MFS was associated with squamous cell carcinoma (p = .020), greater hemoglobin levels (p = .007), and FGF-2 negativity (p = .001). On multivariate analysis of MFS, the hemoglobin levels (RR, 2.65; p = .019) and FGF-2 expression (RR, 3.05; p = .004) were significant. On univariate analysis, improved OS was associated with a lower N category (p = .048), greater hemoglobin levels (p <.001), and FGF-2 negativity (p <.001). On multivariate analysis of OS, greater hemoglobin levels (RR, 4.62; p = .002) and FGF-2 expression (RR, 3.25; p = .002) maintained significance. Conclusions: Tumor cell expression of FGF-2 appeared to be an independent negative predictor of LRC, MFS, and OS.
Duarte, João V; Ribeiro, Maria J; Violante, Inês R; Cunha, Gil; Silva, Eduardo; Castelo-Branco, Miguel
2014-01-01
Neurofibromatosis Type 1 (NF1) is a common genetic condition associated with cognitive dysfunction. However, the pathophysiology of the NF1 cognitive deficits is not well understood. Abnormal brain structure, including increased total brain volume, white matter (WM) and grey matter (GM) abnormalities have been reported in the NF1 brain. These previous studies employed univariate model-driven methods preventing detection of subtle and spatially distributed differences in brain anatomy. Multivariate pattern analysis allows the combination of information from multiple spatial locations yielding a discriminative power beyond that of single voxels. Here we investigated for the first time subtle anomalies in the NF1 brain, using a multivariate data-driven classification approach. We used support vector machines (SVM) to classify whole-brain GM and WM segments of structural T1-weighted MRI scans from 39 participants with NF1 and 60 non-affected individuals, divided in children/adolescents and adults groups. We also employed voxel-based morphometry (VBM) as a univariate gold standard to study brain structural differences. SVM classifiers correctly classified 94% of cases (sensitivity 92%; specificity 96%) revealing the existence of brain structural anomalies that discriminate NF1 individuals from controls. Accordingly, VBM analysis revealed structural differences in agreement with the SVM weight maps representing the most relevant brain regions for group discrimination. These included the hippocampus, basal ganglia, thalamus, and visual cortex. This multivariate data-driven analysis thus identified subtle anomalies in brain structure in the absence of visible pathology. Our results provide further insight into the neuroanatomical correlates of known features of the cognitive phenotype of NF1. Copyright © 2012 Wiley Periodicals, Inc.
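The accuracy, sensitivity and specificity figures quoted above follow from the confusion counts of a binary classifier. A minimal sketch of that arithmetic is given below; the function name `classification_metrics` is hypothetical, and any counts passed to it are for illustration only since the abstract does not report the underlying confusion matrix.

```python
def classification_metrics(tp, fn, tn, fp):
    """Sensitivity, specificity and accuracy from binary confusion counts.

    tp/fn: affected cases classified correctly/incorrectly
    tn/fp: controls classified correctly/incorrectly
    """
    sensitivity = tp / (tp + fn)   # fraction of affected cases detected
    specificity = tn / (tn + fp)   # fraction of controls correctly rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy
```

Because accuracy weights each group by its size, it need not be the simple mean of sensitivity and specificity when the groups are unbalanced.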
Casarrubea, M; Magnusson, M S; Roy, V; Arabo, A; Sorbera, F; Santangelo, A; Faulisi, F; Crescimanno, G
2014-08-30
The aim of this article is to illustrate the application of a multivariate approach known as t-pattern analysis to the study of rat behavior in the elevated plus maze. With this multivariate approach, significant relationships among behavioral events over time can be described. Both quantitative and t-pattern analyses were used to analyze data obtained from fifteen male Wistar rats following a trial 1-trial 2 protocol. In trial 2, in comparison with the initial exposure, mean occurrences of behavioral elements performed in protected zones of the maze showed a significant increase, counterbalanced by a significant decrease in mean occurrences of behavioral elements in unprotected zones. Multivariate t-pattern analysis revealed 134 t-patterns of different composition in trial 1. In trial 2, the temporal structure of behavior became simpler, with only 32 different t-patterns present. Behavioral strings and stripes (i.e. graphical representations of each t-pattern onset) of all t-patterns were presented for both trial 1 and trial 2. Finally, percent distributions in the three zones of the maze show a clear-cut increase of t-patterns in the closed arm and a significant reduction in the remaining zones. Results show that previous experience deeply modifies the temporal structure of rat behavior in the elevated plus maze. In addition, by highlighting several conceptual, methodological and illustrative aspects of the use of t-pattern analysis, this article could provide a useful background for employing such a refined approach in the study of rat behavior in the elevated plus maze. Copyright © 2014 Elsevier B.V. All rights reserved.
Cohen, Mitchell J; Grossman, Adam D; Morabito, Diane; Knudson, M Margaret; Butte, Atul J; Manley, Geoffrey T
2010-01-01
Advances in technology have made extensive monitoring of patient physiology the standard of care in intensive care units (ICUs). While many systems exist to compile these data, there has been no systematic multivariate analysis and categorization across patient physiological data. The sheer volume and complexity of these data make pattern recognition or identification of patient state difficult. Hierarchical cluster analysis allows visualization of high dimensional data and enables pattern recognition and identification of physiologic patient states. We hypothesized that processing of multivariate data using hierarchical clustering techniques would allow identification of otherwise hidden patient physiologic patterns that would be predictive of outcome. Multivariate physiologic and ventilator data were collected continuously using a multimodal bioinformatics system in the surgical ICU at San Francisco General Hospital. These data were incorporated with non-continuous data and stored on a server in the ICU. A hierarchical clustering algorithm grouped each minute of data into 1 of 10 clusters. Clusters were correlated with outcome measures including incidence of infection, multiple organ failure (MOF), and mortality. We identified 10 clusters, which we defined as distinct patient states. While patients transitioned between states, they spent significant amounts of time in each. Clusters were enriched for our outcome measures: 2 of the 10 states were enriched for infection, 6 of 10 were enriched for MOF, and 3 of 10 were enriched for death. Further analysis of correlations between pairs of variables within each cluster reveals significant differences in physiology between clusters. Here we show for the first time the feasibility of clustering physiological measurements to identify clinically relevant patient states after trauma. 
These results demonstrate that hierarchical clustering techniques can be useful for visualizing complex multivariate data and may provide new insights for the care of critically injured patients.
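The clustering step described above can be sketched in a few lines. This is a minimal illustration only: the abstract does not state the distance metric or linkage rule, so Euclidean distance and average linkage are assumptions, and the function name `agglomerative_clusters` and the two-variable observations are hypothetical.

```python
import math


def agglomerative_clusters(points, n_clusters):
    """Merge the two closest clusters (average linkage) until n_clusters remain."""
    if not 1 <= n_clusters <= len(points):
        raise ValueError("n_clusters must be between 1 and len(points)")
    # Start with every observation in its own cluster (stored as index lists).
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        # Average pairwise Euclidean distance between members of a and b.
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


# Hypothetical per-minute observations: (heart rate, mean arterial pressure).
minutes = [(80, 90), (82, 88), (81, 91), (130, 60), (128, 62), (100, 75)]
states = agglomerative_clusters(minutes, n_clusters=3)
```

In the study each minute of data was assigned to 1 of 10 clusters; the sketch uses 3 clusters on six toy observations purely to keep the example readable.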
MGAS: a powerful tool for multivariate gene-based genome-wide association analysis.
Van der Sluis, Sophie; Dolan, Conor V; Li, Jiang; Song, Youqiang; Sham, Pak; Posthuma, Danielle; Li, Miao-Xin
2015-04-01
Standard genome-wide association studies, testing the association between one phenotype and a large number of single nucleotide polymorphisms (SNPs), are limited in two ways: (i) traits are often multivariate, and analysis of composite scores entails loss in statistical power and (ii) gene-based analyses may be preferred, e.g. to decrease the multiple testing problem. Here we present a new method, multivariate gene-based association test by extended Simes procedure (MGAS), that allows gene-based testing of multivariate phenotypes in unrelated individuals. Through extensive simulation, we show that under most trait-generating genotype-phenotype models MGAS has superior statistical power to detect associated genes compared with gene-based analyses of univariate phenotypic composite scores (i.e. GATES, multiple regression), and multivariate analysis of variance (MANOVA). Re-analysis of metabolic data revealed 32 False Discovery Rate controlled genome-wide significant genes, and 12 regions harboring multiple genes; of these 44 regions, 30 were not reported in the original analysis. MGAS allows researchers to conduct their multivariate gene-based analyses efficiently, and without the loss of power that is often associated with an incorrectly specified genotype-phenotype models. MGAS is freely available in KGG v3.0 (http://statgenpro.psychiatry.hku.hk/limx/kgg/download.php). Access to the metabolic dataset can be requested at dbGaP (https://dbgap.ncbi.nlm.nih.gov/). The R-simulation code is available from http://ctglab.nl/people/sophie_van_der_sluis. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
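MGAS is described as an extension of the Simes procedure for combining p-values. As a rough sketch of the classical Simes test only (not the extended procedure, and not the KGG implementation), the combined p-value for m ordered p-values p_(1) <= ... <= p_(m) is the minimum over i of m * p_(i) / i. The function name and the example p-values are hypothetical.

```python
def simes_p_value(p_values):
    """Classical Simes combined p-value: min over i of m * p_(i) / i."""
    if not p_values:
        raise ValueError("at least one p-value is required")
    if any(p < 0.0 or p > 1.0 for p in p_values):
        raise ValueError("p-values must lie in [0, 1]")
    ordered = sorted(p_values)
    m = len(ordered)
    combined = min(m * p / rank for rank, p in enumerate(ordered, start=1))
    # Guard against values slightly above 1 from floating-point arithmetic.
    return min(combined, 1.0)


# Hypothetical SNP-level p-values for a single gene (illustrative only).
example = [0.04, 0.01, 0.20, 0.03]
gene_p = simes_p_value(example)
```

The extended versions used in gene-based testing replace the raw counts m and i with effective numbers of independent tests to account for correlation; that adjustment is deliberately left out of this sketch.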
Naccarato, Attilio; Furia, Emilia; Sindona, Giovanni; Tagarelli, Antonio
2016-09-01
Four class-modeling techniques (soft independent modeling of class analogy (SIMCA), unequal dispersed classes (UNEQ), potential functions (PF), and multivariate range modeling (MRM)) were applied to multielement distribution to build chemometric models able to authenticate chili pepper samples grown in Calabria with respect to those grown outside of Calabria. The multivariate techniques were applied by considering both all the variables (32 elements, Al, As, Ba, Ca, Cd, Ce, Co, Cr, Cs, Cu, Dy, Fe, Ga, La, Li, Mg, Mn, Na, Nd, Ni, Pb, Pr, Rb, Sc, Se, Sr, Tl, Tm, V, Y, Yb, Zn) and variables selected by means of stepwise linear discriminant analysis (S-LDA). In the first case, satisfactory and comparable results in terms of CV efficiency are obtained with the use of SIMCA and MRM (82.3 and 83.2% respectively), whereas MRM performs better than SIMCA in terms of forced model efficiency (96.5%). The selection of variables by S-LDA made it possible to build models characterized, in general, by a higher efficiency. MRM again provided the best results for CV efficiency (87.7% with an effective balance of sensitivity and specificity) as well as forced model efficiency (96.5%). Copyright © 2016 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Kanzaki, Yasushi
Many water products are marketed with claims of unusual efficacy beyond current scientific knowledge, even at a time when advanced scientific and technological research is highly promoted. It seems quite obvious, however, that such efficacy does not exist; if it did, it should be explainable by a suitable scientific procedure. In this study, the extraction of paeoniflorin from paeoniae radix was examined by varying the kind of water used for extraction. The result was then analyzed using multivariate analysis, in which the effect on the extraction was assumed to be ascribed to the ionic species dissolved in each water examined. The dissolved species were determined by chemical and instrumental analyses. According to the multivariate analysis, the amount of extracted paeoniflorin (Y) was represented by the following regression equation. The result shows that pH, [Ca2+], and [HCO3-] were significant parameters and that the combination of Ca2+ and HCO3- negatively affected the extraction of paeoniflorin.
Y = 28.11 - 0.71 pH - 0.0034 [Ca2+] - 0.93 [HCO3-]
where [Ca2+] is the concentration of calcium ion and [HCO3-] is that of bicarbonate ion.
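To make the sign of each term concrete, the regression equation can be evaluated directly. This is a small sketch: the coefficients are those quoted in the abstract, but the function name and the input values are hypothetical, and the abstract does not state the units of Y or of the concentrations, so the numbers only illustrate the direction of each effect.

```python
def predicted_paeoniflorin(ph, ca, hco3):
    """Evaluate Y = 28.11 - 0.71*pH - 0.0034*[Ca2+] - 0.93*[HCO3-]."""
    return 28.11 - 0.71 * ph - 0.0034 * ca - 0.93 * hco3


# Hypothetical inputs; concentration units are not given in the abstract.
y_without_ions = predicted_paeoniflorin(ph=7.0, ca=0.0, hco3=0.0)
y_with_ions = predicted_paeoniflorin(ph=7.0, ca=100.0, hco3=1.0)
```

Because all three coefficients are negative, raising pH or either ion concentration lowers the predicted amount of extracted paeoniflorin, which matches the abstract's statement that Ca2+ and HCO3- act negatively on the extraction.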
Graphite Web: web tool for gene set analysis exploiting pathway topology
Sales, Gabriele; Calura, Enrica; Martini, Paolo; Romualdi, Chiara
2013-01-01
Graphite web is a novel web tool for pathway analyses and network visualization for gene expression data of both microarray and RNA-seq experiments. Several pathway analyses have been proposed either in the univariate or in the global and multivariate context to tackle the complexity and the interpretation of expression results. These methods can be further divided into ‘topological’ and ‘non-topological’ methods according to their ability to gain power from pathway topology. Biological pathways are, in fact, not only gene lists but can be represented through a network where genes and connections are, respectively, nodes and edges. To this day, the most used approaches are non-topological and univariate although they miss the relationship among genes. On the contrary, topological and multivariate approaches are more powerful, but difficult to be used by researchers without bioinformatic skills. Here we present Graphite web, the first public web server for pathway analysis on gene expression data that combines topological and multivariate pathway analyses with an efficient system of interactive network visualizations for easy results interpretation. Specifically, Graphite web implements five different gene set analyses on three model organisms and two pathway databases. Graphite Web is freely available at http://graphiteweb.bio.unipd.it/. PMID:23666626
Finley, Andrew O.; Banerjee, Sudipto; Cook, Bruce D.; Bradford, John B.
2013-01-01
In this paper we detail a multivariate spatial regression model that couples LiDAR, hyperspectral and forest inventory data to predict forest outcome variables at a high spatial resolution. The proposed model is used to analyze forest inventory data collected on the US Forest Service Penobscot Experimental Forest (PEF), ME, USA. In addition to helping meet the regression model's assumptions, results from the PEF analysis suggest that the addition of multivariate spatial random effects improves model fit and predictive ability, compared with two commonly applied modeling approaches. This improvement results from explicitly modeling the covariation among forest outcome variables and spatial dependence among observations through the random effects. Direct application of such multivariate models to even moderately large datasets is often computationally infeasible because of cubic order matrix algorithms involved in estimation. We apply a spatial dimension reduction technique to help overcome this computational hurdle without sacrificing richness in modeling.
Goudey, Benjamin; Abedini, Mani; Hopper, John L; Inouye, Michael; Makalic, Enes; Schmidt, Daniel F; Wagner, John; Zhou, Zeyu; Zobel, Justin; Reumann, Matthias
2015-01-01
Genome-wide association studies (GWAS) are a common approach for systematic discovery of single nucleotide polymorphisms (SNPs) which are associated with a given disease. Univariate analysis approaches commonly employed may miss important SNP associations that only appear through multivariate analysis in complex diseases. However, multivariate SNP analysis is currently limited by its inherent computational complexity. In this work, we present a computational framework that harnesses supercomputers. Based on our results, we estimate a three-way interaction analysis on 1.1 million SNP GWAS data requiring over 5.8 years on the full "Avoca" IBM Blue Gene/Q installation at the Victorian Life Sciences Computation Initiative. This is hundreds of times faster than estimates for other CPU based methods and four times faster than runtimes estimated for GPU methods, indicating how the improvement in the level of hardware applied to interaction analysis may alter the types of analysis that can be performed. Furthermore, the same analysis would take under 3 months on the currently largest IBM Blue Gene/Q supercomputer "Sequoia" at the Lawrence Livermore National Laboratory assuming linear scaling is maintained as our results suggest. Given that the implementation used in this study can be further optimised, this runtime means it is becoming feasible to carry out exhaustive analysis of higher order interaction studies on large modern GWAS.
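The scale of an exhaustive interaction analysis follows from simple counting: the number of unordered SNP triples is C(n, 3). The sketch below computes it for the 1.1 million SNPs quoted in the abstract. It only counts tests; the cost per test and the authors' implementation are outside its scope, and the variable names are illustrative.

```python
import math

n_snps = 1_100_000  # SNP count quoted in the abstract

pairs = math.comb(n_snps, 2)    # two-way interactions
triples = math.comb(n_snps, 3)  # three-way interactions

# Moving from pairs to triples multiplies the workload by (n_snps - 2) / 3.
ratio = triples / pairs
```

With roughly 2.2e17 triples to test, going from two-way to three-way analysis increases the number of tests by a factor of several hundred thousand, which is why the abstract frames the problem in terms of supercomputer-years.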
SPReM: Sparse Projection Regression Model For High-dimensional Linear Regression *
Sun, Qiang; Zhu, Hongtu; Liu, Yufeng; Ibrahim, Joseph G.
2014-01-01
The aim of this paper is to develop a sparse projection regression modeling (SPReM) framework to perform multivariate regression modeling with a large number of responses and a multivariate covariate of interest. We propose two novel heritability ratios to simultaneously perform dimension reduction, response selection, estimation, and testing, while explicitly accounting for correlations among multivariate responses. Our SPReM is devised to specifically address the low statistical power issue of many standard statistical approaches, such as the Hotelling’s T2 test statistic or a mass univariate analysis, for high-dimensional data. We formulate the estimation problem of SPReM as a novel sparse unit rank projection (SURP) problem and propose a fast optimization algorithm for SURP. Furthermore, we extend SURP to the sparse multi-rank projection (SMURP) by adopting a sequential SURP approximation. Theoretically, we have systematically investigated the convergence properties of SURP and the convergence rate of SURP estimates. Our simulation results and real data analysis have shown that SPReM outperforms other state-of-the-art methods. PMID:26527844
Multiscale analysis of information dynamics for linear multivariate processes.
Faes, Luca; Montalto, Alessandro; Stramaglia, Sebastiano; Nollo, Giandomenico; Marinazzo, Daniele
2016-08-01
In the study of complex physical and physiological systems represented by multivariate time series, an issue of great interest is the description of the system dynamics over a range of different temporal scales. While information-theoretic approaches to the multiscale analysis of complex dynamics are being increasingly used, the theoretical properties of the applied measures are poorly understood. This study introduces for the first time a framework for the analytical computation of information dynamics for linear multivariate stochastic processes explored at different time scales. After showing that the multiscale processing of a vector autoregressive (VAR) process introduces a moving average (MA) component, we describe how to represent the resulting VARMA process using state space (SS) models and how to exploit the SS model parameters to compute analytical measures of information storage and information transfer for the original and rescaled processes. The framework is then used to quantify multiscale information dynamics for simulated unidirectionally and bidirectionally coupled VAR processes, showing that rescaling may lead to insightful patterns of information storage and transfer but also to potentially misleading behaviors.
Hybrid least squares multivariate spectral analysis methods
Haaland, David M.
2002-01-01
A set of hybrid least squares multivariate spectral analysis methods in which spectral shapes of components or effects not present in the original calibration step are added in a following estimation or calibration step to improve the accuracy of the estimation of the amount of the original components in the sampled mixture. The "hybrid" method herein means a combination of an initial classical least squares analysis calibration step with subsequent analysis by an inverse multivariate analysis method. A "spectral shape" herein means normally the spectral shape of a non-calibrated chemical component in the sample mixture but can also mean the spectral shapes of other sources of spectral variation, including temperature drift, shifts between spectrometers, spectrometer drift, etc. The "shape" can be continuous, discontinuous, or even discrete points illustrative of the particular effect.
Darwish, Hany W; Bakheit, Ahmed H; Abdelhameed, Ali S
2016-03-01
Simultaneous spectrophotometric analysis of a multi-component dosage form of olmesartan, amlodipine and hydrochlorothiazide used for the treatment of hypertension has been carried out using various chemometric methods. Multivariate calibration methods include classical least squares (CLS) executed by net analyte processing (NAP-CLS), orthogonal signal correction (OSC-CLS) and direct orthogonal signal correction (DOSC-CLS) in addition to multivariate curve resolution-alternating least squares (MCR-ALS). Results demonstrated the efficiency of the proposed methods as quantitative tools of analysis as well as their qualitative capability. The three analytes were determined precisely using the aforementioned methods in an external data set and in a dosage form after optimization of experimental conditions. Finally, the efficiency of the models was validated via comparison with the partial least squares (PLS) method in terms of accuracy and precision.
Multivariate space - time analysis of PRE-STORM precipitation
NASA Technical Reports Server (NTRS)
Polyak, Ilya; North, Gerald R.; Valdes, Juan B.
1994-01-01
This paper presents the methodologies and results of the multivariate modeling and two-dimensional spectral and correlation analysis of PRE-STORM rainfall gauge data. Estimated parameters of the models for the specific spatial averages clearly indicate the eastward and southeastward wave propagation of rainfall fluctuations. A relationship between the coefficients of the diffusion equation and the parameters of the stochastic model of rainfall fluctuations is derived that leads directly to the exclusive use of rainfall data to estimate advection speed (about 12 m/s) as well as other coefficients of the diffusion equation of the corresponding fields. The statistical methodology developed here can be used for confirmation of physical models by comparison of the corresponding second-moment statistics of the observed and simulated data, for generating multiple samples of any size, for solving the inverse problem of the hydrodynamic equations, and for application in some other areas of meteorological and climatological data analysis and modeling.
NASA Astrophysics Data System (ADS)
Kumar, S.; Jasinski, M. F.; Mocko, D. M.; Rodell, M.; Borak, J.; Li, B.; Beaudoing, H. K.; Peters-Lidard, C. D.
2017-12-01
This presentation will describe one of the first successful examples of multisensor, multivariate land data assimilation, encompassing a large suite of soil moisture, snow depth, snow cover and irrigation intensity environmental data records (EDRs) from Scanning Multi-channel Microwave Radiometer (SMMR), the Special Sensor Microwave Imager (SSM/I), the Advanced Scatterometer (ASCAT), the Moderate-Resolution Imaging Spectroradiometer (MODIS), the Advanced Microwave Scanning Radiometer (AMSR-E and AMSR2), the Soil Moisture Ocean Salinity (SMOS) mission and the Soil Moisture Active Passive (SMAP) mission. The analysis is performed using the NASA Land Information System (LIS) as an enabling tool for the U.S. National Climate Assessment (NCA). The performance of NCA Land Data Assimilation System (NCA-LDAS) is evaluated by comparing to a number of hydrological reference data products. Results indicate that multivariate assimilation provides systematic improvements in simulated soil moisture and snow depth, with marginal effects on the accuracy of simulated streamflow and ET. An important conclusion is that across all evaluated variables, assimilation of data from increasingly more modern sensors (e.g. SMOS, SMAP, AMSR2, ASCAT) produces more skillful results than assimilation of data from older sensors (e.g. SMMR, SSM/I, AMSR-E). The evaluation also indicates high skill of NCA-LDAS when compared with other land analysis products. Further, drought indicators based on NCA-LDAS output suggest a trend of longer and more severe droughts over parts of Western U.S. during 1979-2015, particularly in the Southwestern U.S.
D'Amico, E J; Neilands, T B; Zambarano, R
2001-11-01
Although power analysis is an important component in the planning and implementation of research designs, it is often ignored. Computer programs for performing power analysis are available, but most have limitations, particularly for complex multivariate designs. An SPSS procedure is presented that can be used for calculating power for univariate, multivariate, and repeated measures models with and without time-varying and time-constant covariates. Three examples provide a framework for calculating power via this method: an ANCOVA, a MANOVA, and a repeated measures ANOVA with two or more groups. The benefits and limitations of this procedure are discussed.
Zhou, Fei; Zhao, Yajing; Peng, Jiyu; Jiang, Yirong; Li, Maiquan; Jiang, Yuan; Lu, Baiyi
2017-07-01
Osmanthus fragrans flowers are used as folk medicine and as additives for teas, beverages and foods. The metabolites of O. fragrans flowers from different geographical origins are inconsistent to some extent. Chromatography and mass spectrometry combined with multivariable analysis methods provide an approach for discriminating the origin of O. fragrans flowers. The objective was to discriminate Osmanthus fragrans var. thunbergii flowers from different origins using the identified metabolites. GC-MS and UPLC-PDA were conducted to analyse the metabolites in O. fragrans var. thunbergii flowers (in total 150 samples). Principal component analysis (PCA), soft independent modelling of class analogy analysis (SIMCA) and random forest (RF) analysis were applied to group the GC-MS and UPLC-PDA data. GC-MS identified 32 compounds common to all samples while UPLC-PDA/QTOF-MS identified 16 common compounds. PCA of the UPLC-PDA data generated a better clustering than PCA of the GC-MS data. Ten metabolites (six from GC-MS and four from UPLC-PDA) were selected as effective compounds for discrimination by PCA loadings. SIMCA and RF analysis were used to build classification models, and the RF model, based on the four effective compounds (caffeic acid derivative, acteoside, ligustroside and compound 15), yielded better results with a classification rate of 100% in the calibration set and 97.8% in the prediction set. GC-MS and UPLC-PDA combined with multivariable analysis methods can discriminate the origin of Osmanthus fragrans var. thunbergii flowers. Copyright © 2017 John Wiley & Sons, Ltd.
Li, Hongru; Xu, Yadong; Li, Hui
2017-01-01
Objective To assess the prognostic and clinicopathological characteristics of CD147 in human bladder cancer. Methods Studies on CD147 expression in bladder cancer were retrieved from PubMed, EMBASE, the Cochrane Library, Web of Science, China National Knowledge Infrastructure, and the WanFang databases. Outcomes were pooled with the meta-analysis software packages RevMan 5.3 and STATA 14.0. Results Twenty-four studies with 25 datasets demonstrated that CD147 expression was higher in bladder cancer than in non-cancer tissues (OR=43.64, P<0.00001). Moreover, this increase was associated with more advanced clinical stages (OR=73.89, P<0.0001), deeper invasion (OR=3.22, P<0.00001), lower histological differentiation (OR=4.54, P=0.0005), poorer overall survival (univariate analysis, HR=2.63, P<0.00001; multivariate analysis, HR=1.86, P=0.00036), disease specific survival (univariate analysis, HR=1.65, P=0.002), disease recurrence-free survival (univariate analysis, HR=2.78, P=0.001; multivariate analysis, HR=5.51, P=0.017), rate of recurrence (OR=1.91, P=0.0006), invasive depth (pT2∼T4 vs. pTa∼T1; OR=3.22, P<0.00001), and histological differentiation (low versus moderate-to-high; OR=4.54, P=0.0005). No significant difference was found for disease specific survival in multivariate analysis (P=0.067), lymph node metastasis (P=0.12), or sex (P=0.15). Conclusion CD147 could be a biomarker for early diagnosis, treatment, and prognosis of bladder cancer. PMID:28977970
NASA Astrophysics Data System (ADS)
Ngan Nguyen, Thi To; Liu, Cheng-Chien
2013-04-01
How landslides occur, and which factors trigger and accelerate them, are questions researchers have frequently asked over the past decades. Many investigations have been carried out around the world to find methods that predict landslides and prevent the damage they cause. The Chen-Yu-Lan River watershed is reputed to be a 'hot spot' of landslide research in Taiwan because of its complicated geological structures, with significant tectonic fault systems and steep mountainous terrain. Besides high annual precipitation and abrupt slopes, natural disasters such as typhoons (Sinlaku-2008, Kalmaegi-2008, and Morakot-2009) and earthquakes (Chi-Chi earthquake-1999) are also triggering factors that cause landslides with serious damage in this area. This research presents quantitative approaches for generating a landslide susceptibility map of the Chen-Yu-Lan watershed, a mountainous area in central Taiwan. Landslide inventory data, detected from Formosat-2 imagery over the eight years from 2004 to 2011, were used to carry out the landslide susceptibility mapping. Bivariate and multivariate statistical analyses were applied to calculate the landslide susceptibility index. The weights of the parameters were computed from the landslide data for the same eight years. To validate how strongly each factor affects landslide occurrence, several multivariate algorithms were built and their results were compared with actual landslide occurrences. In addition, the historical landslide data were used to assess and classify landslide susceptibility levels. From the long-term landslide data, the relation between landslide susceptibility levels and landslide repetition was determined. The results demonstrated that potential factors such as slope gradient, drainage density, lithology and land use affect landslide phenomena to different degrees. 
The results also showed a logical relationship between the weights and the characteristics of the factors' classes. These results can help planning managers identify areas at high risk of landslides, as well as areas that are safe for building and human activities.
Moya, Claudio E; Raiber, Matthias; Taulis, Mauricio; Cox, Malcolm E
2015-03-01
The Galilee and Eromanga basins are sub-basins of the Great Artesian Basin (GAB). In this study, a multivariate statistical approach (hierarchical cluster analysis, principal component analysis and factor analysis) is carried out to identify hydrochemical patterns and assess the processes that control hydrochemical evolution within key aquifers of the GAB in these basins. The results of the hydrochemical assessment are integrated into a 3D geological model (previously developed) to support the analysis of spatial patterns of hydrochemistry, and to identify the hydrochemical and hydrological processes that control hydrochemical variability. In this area of the GAB, the hydrochemical evolution of groundwater is dominated by evapotranspiration near the recharge area resulting in a dominance of the Na-Cl water types. This is shown conceptually using two selected cross-sections which represent discrete groundwater flow paths from the recharge areas to the deeper parts of the basins. With increasing distance from the recharge area, a shift towards a dominance of carbonate (e.g. Na-HCO3 water type) has been observed. The assessment of hydrochemical changes along groundwater flow paths highlights how aquifers are separated in some areas, and how mixing between groundwater from different aquifers occurs elsewhere controlled by geological structures, including between GAB aquifers and coal bearing strata of the Galilee Basin. The results of this study suggest that distinct hydrochemical differences can be observed within the previously defined Early Cretaceous-Jurassic aquifer sequence of the GAB. A revision of the two previously recognised hydrochemical sequences is being proposed, resulting in three hydrochemical sequences based on systematic differences in hydrochemistry, salinity and dominant hydrochemical processes. 
The integrated approach presented in this study, which combines different complementary multivariate statistical techniques with a detailed assessment of the geological framework of these sedimentary basins, can be adopted in other complex multi-aquifer systems to assess hydrochemical evolution and its geological controls. Copyright © 2014 Elsevier B.V. All rights reserved.
Multi-Sample Cluster Analysis Using Akaike’s Information Criterion.
1982-12-20
of Likelihood Criteria for Different Hypotheses," in P. R. Krishnaiah (Ed.), Multivariate Analysis-II, New York: Academic Press. [5] Fisher, R. A...Methods of Simultaneous Inference in MANOVA," in P. R. Krishnaiah (Ed.), Multivariate Analysis-II, New York: Academic Press. [8] Kendall, M. G. (1966...1982), Applied Multivariate Statistical Analysis, Englewood Cliffs: Prentice-Hall, Inc. [10] Krishnaiah, P. R. (1969), "Simultaneous Test
SUGGESTIONS FOR OPTIMIZED PLANNING OF MULTIVARIATE MONITORING OF ATMOSPHERIC POLLUTION
Recent work in factor analysis of multivariate data sets has shown that variables with little signal should not be included in the factor analysis. Work also shows that rotational ambiguity is reduced if sources impacting a receptor have both large and small contributions. Thes...
Kim, Sungduk; Chen, Ming-Hui; Ibrahim, Joseph G.; Shah, Arvind K.; Lin, Jianxin
2013-01-01
In this paper, we propose a class of Box-Cox transformation regression models with multidimensional random effects for analyzing multivariate responses for individual patient data (IPD) in meta-analysis. Our modeling formulation uses a multivariate normal response meta-analysis model with multivariate random effects, in which each response is allowed to have its own Box-Cox transformation. Prior distributions are specified for the Box-Cox transformation parameters as well as the regression coefficients in this complex model, and the Deviance Information Criterion (DIC) is used to select the best transformation model. Since the model is quite complex, a novel Monte Carlo Markov chain (MCMC) sampling scheme is developed to sample from the joint posterior of the parameters. This model is motivated by a very rich dataset comprising 26 clinical trials involving cholesterol lowering drugs where the goal is to jointly model the three dimensional response consisting of Low Density Lipoprotein Cholesterol (LDL-C), High Density Lipoprotein Cholesterol (HDL-C), and Triglycerides (TG) (LDL-C, HDL-C, TG). Since the joint distribution of (LDL-C, HDL-C, TG) is not multivariate normal and in fact quite skewed, a Box-Cox transformation is needed to achieve normality. In the clinical literature, these three variables are usually analyzed univariately: however, a multivariate approach would be more appropriate since these variables are correlated with each other. A detailed analysis of these data is carried out using the proposed methodology. PMID:23580436
Kim, Sungduk; Chen, Ming-Hui; Ibrahim, Joseph G; Shah, Arvind K; Lin, Jianxin
2013-10-15
In this paper, we propose a class of Box-Cox transformation regression models with multidimensional random effects for analyzing multivariate responses for individual patient data in meta-analysis. Our modeling formulation uses a multivariate normal response meta-analysis model with multivariate random effects, in which each response is allowed to have its own Box-Cox transformation. Prior distributions are specified for the Box-Cox transformation parameters as well as the regression coefficients in this complex model, and the deviance information criterion is used to select the best transformation model. Because the model is quite complex, we develop a novel Monte Carlo Markov chain sampling scheme to sample from the joint posterior of the parameters. This model is motivated by a very rich dataset comprising 26 clinical trials involving cholesterol-lowering drugs where the goal is to jointly model the three-dimensional response consisting of low density lipoprotein cholesterol (LDL-C), high density lipoprotein cholesterol (HDL-C), and triglycerides (TG) (LDL-C, HDL-C, TG). Because the joint distribution of (LDL-C, HDL-C, TG) is not multivariate normal and in fact quite skewed, a Box-Cox transformation is needed to achieve normality. In the clinical literature, these three variables are usually analyzed univariately; however, a multivariate approach would be more appropriate because these variables are correlated with each other. We carry out a detailed analysis of these data by using the proposed methodology. Copyright © 2013 John Wiley & Sons, Ltd.
Xu, Z J; Pan, J; Zhou, Q; Wang, D J
2017-10-24
Objective: To estimate the prevalence and the risk factors of preoperative coronary angiography (CAG) confirmed coronary stenosis in patients with degenerative valvular heart disease. Methods: A total of 491 patients who underwent screening CAG before valvular surgery due to degenerative valvular heart disease were enrolled from January 2011 to September 2014 in our hospital, and clinical data were analyzed. According to CAG results, patients were divided into a positive CAG result (PCAG) group or a negative CAG result (NCAG) group. A positive CAG result was defined as stenosis ≥50% of the diameter of the left main coronary artery or stenosis ≥70% of the diameter of the left anterior descending artery, left circumflex artery, and right coronary artery. Risk factors for a positive CAG result were analyzed by multivariable logistic regression analysis, and the bootstrap method was used to verify the results. Results: There were 47 (9.57%) degenerative valvular heart disease patients with PCAG. Patients were older ((68.0±7.6) years vs. (62.6±7.1) years, P<0.001) and the prevalence of typical angina was significantly higher (14.89% (7/47) vs. 2.03% (9/444), P<0.001) in the PCAG group than in the NCAG group. Multivariable logistic regression analysis showed that age (OR=1.118, 95% CI 1.067-1.172, P<0.001), typical angina (OR=8.970, 95% CI 2.963-27.154, P<0.001), and serum concentration of apolipoprotein B (OR=20.311, 95% CI 4.774-86.416, P<0.001) were the independent risk factors of PCAG in degenerative valvular heart disease patients. The bootstrap method revealed satisfactory repeatability of the multivariable logistic regression analysis results (age: OR=1.118, 95% CI 1.068-1.178, P=0.001; typical angina: OR=8.970, 95% CI 2.338-35.891, P=0.001; serum concentration of apolipoprotein B: OR=20.311, 95% CI 4.639-91.977, P=0.001). Conclusions: A low prevalence of PCAG before valvular surgery is observed in degenerative valvular heart disease patients in this patient cohort. Age, typical angina, and serum concentration of apolipoprotein B are independent risk factors of PCAG in this patient cohort.
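The odds ratios and intervals reported above can be related to logistic regression coefficients with a short worked calculation. This is a sketch under the assumption that the intervals are Wald intervals on the log-odds scale, which the abstract does not state; the helper names are hypothetical.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval

def odds_ratio_ci(beta, se, z=Z95):
    """Return (OR, lower, upper) for a logistic coefficient and its SE."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

def coef_from_report(or_, lower, upper, z=Z95):
    """Back out (beta, se) from a reported OR and an assumed Wald interval."""
    beta = math.log(or_)
    se = (math.log(upper) - math.log(lower)) / (2.0 * z)
    return beta, se

# Reported for age in the abstract: OR = 1.118, 95% CI 1.067-1.172
beta, se = coef_from_report(1.118, 1.067, 1.172)
or_, lo, hi = odds_ratio_ci(beta, se)
```

Rebuilding the interval from the recovered coefficient and standard error reproduces the reported bounds to within rounding, which is consistent with (but does not prove) a Wald construction.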
Prognostic implications of adhesion molecule expression in colorectal cancer
Seo, Kyung-Jin; Kim, Maru; Kim, Jeana
2015-01-01
Research on the expression of adhesion molecules, E-cadherin (ECAD), CD24, CD44 and osteopontin (OPN) in colorectal cancer (CRC) has been limited, even though CRC is one of the leading causes of cancer-related deaths. This study was conducted to evaluate the expression of adhesion molecules in CRC and to determine their relationships with clinicopathologic variables, and the prognostic significance. The expression of ECAD, CD24, CD44 and OPN was examined in 174 stage II and III CRC specimens by immunohistochemistry of TMA. Negative ECAD expression was significantly correlated with advanced nodal stage and poor tumor differentiation. Multivariate analysis showed that both negative expression of ECAD and positive expression of CD24 were independent prognostic factors for disease-free survival (DFS) in CRC patients (P<0.001, relative risk [RR] = 5.596, 95% CI = 2.712-11.549; P = 0.038, RR = 3.768, 95% CI = 1.077-13.185, respectively). However, for overall survival (OS), only ECAD negativity showed statistically significant results in multivariate analysis (P<0.001, RR = 4.819, 95% CI = 2.515-9.234). Positive expression of CD24 was associated with poor OS in univariate analysis but was of no prognostic value in multivariate analysis. In conclusion, our study suggests that among these four adhesion molecules, ECAD and CD24 expression can be considered independent prognostic factors. The role of CD44 and OPN may need further evaluation. PMID:26097606
Wangkahad, Bencharong; Mongkolsuk, Skorn; Sirikanchana, Kwanrawee
2017-02-21
We developed sewage-specific microbial source tracking (MST) tools using enterococci bacteriophages and evaluated their performance with univariate and multivariate analyses involving data below detection limits. Newly isolated Enterococcus faecalis bacterial strains AIM06 (DSM100702) and SR14 (DSM100701) demonstrated 100% specificity and 90% sensitivity to human sewage without detecting 68 animal manure pooled samples of cats, chickens, cows, dogs, ducks, pigs, and pigeons. AIM06 and SR14 bacteriophages were present in human sewage at 2-4 orders of magnitude. A principal component analysis confirmed the importance of both phages as main water quality parameters. The phages were present only in the polluted water, as classified by a cluster analysis, at median concentrations of 1.71 × 10^2 and 4.27 × 10^2 PFU/100 mL, respectively, higher than those of nonhost-specific RYC2056 phages and sewage-specific KS148 phages (p < 0.05). Interestingly, AIM06 and SR14 phages exhibited significant correlations with each other and with total coliforms, E. coli, enterococci, and biochemical oxygen demand (Kendall's tau = 0.348 to 0.605, p < 0.05), a result supporting their roles as water quality indicators. This research demonstrates the multiregional applicability of enterococci hosts in MST application and highlights the significance of multivariate analysis with nondetects in evaluating the performance of new MST host strains.
Evaluation and simplification of the occupational slip, trip and fall risk-assessment test
NAKAMURA, Takehiro; OYAMA, Ichiro; FUJINO, Yoshihisa; KUBO, Tatsuhiko; KADOWAKI, Koji; KUNIMOTO, Masamizu; ODOI, Haruka; TABATA, Hidetoshi; MATSUDA, Shinya
2016-01-01
Objective: The purpose of this investigation is to evaluate the efficacy of the occupational slip, trip and fall (STF) risk assessment test developed by the Japan Industrial Safety and Health Association (JISHA). We further intended to simplify the test to improve efficiency. Methods: A previous cohort study was performed using 540 employees aged ≥50 years who took the JISHA’s STF risk assessment test. We conducted multivariate analysis using these previous results as baseline values and answers to questionnaire items or score on physical fitness tests as variables. The screening efficiency of each model was evaluated based on the obtained receiver operating characteristic (ROC) curve. Results: The area under the ROC obtained in multivariate analysis was 0.79 when using all items. Six of the 25 questionnaire items were selected for stepwise analysis, giving an area under the ROC curve of 0.77. Conclusion: Based on the results of follow-up performed one year after the initial examination, we successfully determined the usefulness of the STF risk assessment test. Administering a questionnaire alone is sufficient for screening subjects at risk of STF during the subsequent one-year period. PMID:27021057
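The screening efficiency measure used above, the area under the ROC curve, can be computed directly from scores and outcomes. The sketch below uses the pair-counting (Mann-Whitney) identity; the scores and outcomes are hypothetical and are not data from the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the Mann-Whitney pair-counting identity.

    scores: risk scores (higher = more at risk); labels: 1 = event, 0 = none.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # event scored above non-event
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))

# Hypothetical questionnaire scores and one-year fall outcomes.
scores = [1, 2, 3, 4, 5, 6]
labels = [0, 0, 1, 0, 1, 1]
auc = roc_auc(scores, labels)
```

Read this way, an area of 0.77 means that a randomly chosen employee who later fell would outscore a randomly chosen employee who did not in roughly 77% of pairs.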
Bello, Alessandra; Bianchi, Federica; Careri, Maria; Giannetto, Marco; Mori, Giovanni; Musci, Marilena
2007-11-05
A new NIR method based on multivariate calibration for determination of ethanol in industrially packed wholemeal bread was developed and validated. GC-FID was used as the reference method for determining the actual ethanol concentration of different samples of wholemeal bread with a known content of added ethanol, ranging from 0 to 3.5% (w/w). Stepwise discriminant analysis was carried out on the NIR dataset in order to reduce the number of original variables by selecting those that were able to discriminate between samples of different ethanol concentrations. With the variables so selected, a multivariate calibration model was then obtained by multiple linear regression. The predictive power of the linear model was optimized by a new "leave one out" method, so that the number of original variables was further reduced.
TENSOR DECOMPOSITIONS AND SPARSE LOG-LINEAR MODELS
Johndrow, James E.; Bhattacharya, Anirban; Dunson, David B.
2017-01-01
Contingency table analysis routinely relies on log-linear models, with latent structure analysis providing a common alternative. Latent structure models lead to a reduced rank tensor factorization of the probability mass function for multivariate categorical data, while log-linear models achieve dimensionality reduction through sparsity. Little is known about the relationship between these notions of dimensionality reduction in the two paradigms. We derive several results relating the support of a log-linear model to nonnegative ranks of the associated probability tensor. Motivated by these findings, we propose a new collapsed Tucker class of tensor decompositions, which bridge existing PARAFAC and Tucker decompositions, providing a more flexible framework for parsimoniously characterizing multivariate categorical data. Taking a Bayesian approach to inference, we illustrate empirical advantages of the new decompositions. PMID:29332971
Influence factors and forecast of carbon emission in China: structure adjustment for emission peak
NASA Astrophysics Data System (ADS)
Wang, B.; Cui, C. Q.; Li, Z. P.
2018-02-01
This paper uses Principal Component Analysis and a Multivariate Linear Regression Model to verify long-term balance relationships between carbon emissions and their impact factors. The integrated model, which combines an improved PCA with multivariate regression analysis, can identify the pattern of carbon emission sources. The main empirical results indicate that, among all selected variables, the scale of energy consumption plays the largest role. GDP and population follow and also have significant impacts on carbon emissions. The industrialization rate and the fossil fuel proportion, which are indicators of the economic structure and the energy structure, are more important than the urbanization rate and the consumption level of urban residents. On this basis, some suggestions are put forward for the government to achieve the peak of carbon emissions.
Ji, Hong; Petro, Nathan M; Chen, Badong; Yuan, Zejian; Wang, Jianji; Zheng, Nanning; Keil, Andreas
2018-02-06
Over the past decade, the simultaneous recording of electroencephalogram (EEG) and functional magnetic resonance imaging (fMRI) data has garnered growing interest because it may provide an avenue towards combining the strengths of both imaging modalities. Given their pronounced differences in temporal and spatial statistics, the combination of EEG and fMRI data is, however, methodologically challenging. Here, we propose a novel screening approach that relies on a Cross Multivariate Correlation Coefficient (xMCC) framework. This approach accomplishes three tasks: (1) it provides a measure for testing multivariate correlation and multivariate uncorrelation of the two modalities; (2) it provides a criterion for the selection of EEG features; (3) it performs a screening of relevant EEG information by grouping the EEG channels into clusters to improve efficiency and to reduce computational load when searching for the best predictors of the BOLD signal. The present report applies this approach to a data set with concurrent recordings of steady-state visual evoked potentials (ssVEPs) and fMRI, recorded while observers viewed phase-reversing Gabor patches. We test the hypothesis that fluctuations in visuo-cortical mass potentials systematically covary with BOLD fluctuations not only in visual cortical, but also in anterior temporal and prefrontal areas. Results supported the hypothesis and showed that the xMCC-based analysis provides straightforward identification of neurophysiologically plausible brain regions with EEG-fMRI covariance. Furthermore, xMCC converged with other extant methods for EEG-fMRI analysis. © 2018 The Authors Journal of Neuroscience Research Published by Wiley Periodicals, Inc.
A survey of variable selection methods in two Chinese epidemiology journals
2010-01-01
Background Although much has been written on developing better procedures for variable selection, there is little research on how it is practiced in actual studies. This review surveys the variable selection methods reported in two high-ranking Chinese epidemiology journals. Methods Articles published in 2004, 2006, and 2008 in the Chinese Journal of Epidemiology and the Chinese Journal of Preventive Medicine were reviewed. Five categories of methods were identified whereby variables were selected using: A - bivariate analyses; B - multivariable analysis; e.g. stepwise or individual significance testing of model coefficients; C - first bivariate analyses, followed by multivariable analysis; D - bivariate analyses or multivariable analysis; and E - other criteria like prior knowledge or personal judgment. Results Among the 287 articles that reported using variable selection methods, 6%, 26%, 30%, 21%, and 17% were in categories A through E, respectively. One hundred sixty-three studies selected variables using bivariate analyses, 80% (130/163) via multiple significance testing at the 5% alpha-level. Of the 219 multivariable analyses, 97 (44%) used stepwise procedures, 89 (41%) tested individual regression coefficients, but 33 (15%) did not mention how variables were selected. Sixty percent (58/97) of the stepwise routines also did not specify the algorithm and/or significance levels. Conclusions The variable selection methods reported in the two journals were limited in variety, and details were often missing. Many studies still relied on problematic techniques like stepwise procedures and/or multiple testing of bivariate associations at the 0.05 alpha-level. These deficiencies should be rectified to safeguard the scientific validity of articles published in Chinese epidemiology journals. PMID:20920252
Schnabel, Thomas; Musso, Maurizio; Tondi, Gianluca
2014-01-01
Vibrational spectroscopy is one of the most powerful tools in polymer science. Three main techniques (Fourier transform infrared spectroscopy (FT-IR), FT-Raman spectroscopy, and FT near-infrared (NIR) spectroscopy) can also be applied to wood science. Here, these three techniques were used to investigate the chemical modification occurring in wood after impregnation with tannin-hexamine preservatives. These spectroscopic techniques have the capacity to detect the externally added tannin. FT-IR has very strong sensitivity to the aromatic peak at around 1610 cm⁻¹ in the tannin-treated samples, whereas FT-Raman reflects the peak at around 1600 cm⁻¹ for the externally added tannin. This high efficacy in distinguishing chemical features was demonstrated in univariate analysis and confirmed via cluster analysis. Conversely, the results of the NIR measurements show noticeable sensitivity for small differences. For this technique, multivariate analysis is required and with this chemometric tool, it is also possible to predict the concentration of tannin on the surface.
Lim, Jongguk; Kim, Giyoung; Mo, Changyeun; Oh, Kyoungmin; Yoo, Hyeonchae; Ham, Hyeonheui; Kim, Moon S.
2017-01-01
The purpose of this study is to use near-infrared reflectance (NIR) spectroscopy equipment to nondestructively and rapidly discriminate Fusarium-infected hulled barley. Both normal hulled barley and Fusarium-infected hulled barley were scanned by using a NIR spectrometer with a wavelength range of 1175 to 2170 nm. Multiple mathematical pretreatments were applied to the reflectance spectra obtained for Fusarium discrimination and the multivariate analysis method of partial least squares discriminant analysis (PLS-DA) was used for discriminant prediction. The PLS-DA prediction model developed by applying the second-order derivative pretreatment to the reflectance spectra obtained from the side of hulled barley without crease achieved 100% accuracy in discriminating the normal hulled barley and the Fusarium-infected hulled barley. These results demonstrated the feasibility of rapid discrimination of the Fusarium-infected hulled barley by combining multivariate analysis with the NIR spectroscopic technique, which is utilized as a nondestructive detection method. PMID:28974012
NASA Astrophysics Data System (ADS)
Oh, Han Bin; Leach, Franklin E.; Arungundram, Sailaja; Al-Mafraji, Kanar; Venot, Andre; Boons, Geert-Jan; Amster, I. Jonathan
2011-03-01
The structural characterization of glycosaminoglycan (GAG) carbohydrates by mass spectrometry has been a long-standing analytical challenge due to the inherent heterogeneity of these biomolecules, specifically polydispersity, variability in sulfation, and hexuronic acid stereochemistry. Recent advances in tandem mass spectrometry methods employing threshold and electron-based ion activation have resulted in the ability to determine the location of the labile sulfate modification as well as assign the stereochemistry of hexuronic acid residues. To facilitate the analysis of complex electron detachment dissociation (EDD) spectra, principal component analysis (PCA) is employed to differentiate the hexuronic acid stereochemistry of four synthetic GAG epimers whose EDD spectra are nearly identical upon visual inspection. For comparison, PCA is also applied to infrared multiphoton dissociation spectra (IRMPD) of the examined epimers. To assess the applicability of multivariate methods in GAG mixture analysis, PCA is utilized to identify the relative content of two epimers in a binary mixture.
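To make the role of PCA above concrete, the sketch below extracts the leading principal component of a small set of spectra by power iteration and returns each spectrum's score along it. It is a stdlib-only illustration with hypothetical intensity values, not the software or data used in the study; real spectra have many more channels and would normally be handled with a linear algebra library.

```python
import math

def first_principal_component(rows, iters=500):
    """Leading PCA loading vector and scores via power iteration.

    rows: list of equal-length spectra (lists of floats).
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    x = [[r[j] - means[j] for j in range(d)] for r in rows]   # column-center
    # Non-uniform start vector; a start orthogonal to the leading
    # component would stall the iteration (assumed not to occur here).
    v = [float(j + 1) for j in range(d)]
    norm = math.sqrt(sum(c * c for c in v))
    v = [c / norm for c in v]
    for _ in range(iters):
        s = [sum(xi[j] * v[j] for j in range(d)) for xi in x]        # X v
        w = [sum(x[i][j] * s[i] for i in range(n)) for j in range(d)]  # X^T X v
        norm = math.sqrt(sum(c * c for c in w))
        if norm == 0.0:
            break
        v = [c / norm for c in w]
    scores = [sum(xi[j] * v[j] for j in range(d)) for xi in x]
    return v, scores

# Hypothetical three-channel "spectra" that differ along one direction.
spectra = [[1.0, 0.0, 2.0], [2.0, 0.0, 4.0], [3.0, 0.0, 6.0], [4.0, 0.0, 8.0]]
loading, scores = first_principal_component(spectra)
```

Spectra that look nearly identical by eye can still separate along such a score axis, which is the property the abstract relies on to distinguish epimers and to estimate mixture content.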
Song, Seung Yeob; Lee, Young Koung; Kim, In-Jung
2016-01-01
A high-throughput screening system for Citrus lines with higher sugar and acid contents was established using Fourier transform infrared (FT-IR) spectroscopy in combination with multivariate analysis. FT-IR spectra confirmed typical spectral differences between the frequency regions of 950-1100 cm⁻¹, 1300-1500 cm⁻¹, and 1500-1700 cm⁻¹. Principal component analysis (PCA) and subsequent partial least squares discriminant analysis (PLS-DA) were able to discriminate five Citrus lines into three separate clusters corresponding to their taxonomic relationships. The quantitative predictive modeling of sugar and acid contents from Citrus fruits was established using partial least squares regression algorithms from FT-IR spectra. The regression coefficients (R²) between predicted values and estimated sugar and acid content values were 0.99. These results demonstrate that by using FT-IR spectra and applying quantitative prediction modeling to Citrus sugar and acid contents, excellent Citrus lines can be detected early with greater accuracy. Copyright © 2015 Elsevier Ltd. All rights reserved.
An improved method for bivariate meta-analysis when within-study correlations are unknown.
Hong, Chuan; D Riley, Richard; Chen, Yong
2018-03-01
Multivariate meta-analysis, which jointly analyzes multiple and possibly correlated outcomes in a single analysis, has become increasingly popular in recent years. An attractive feature of the multivariate meta-analysis is its ability to account for the dependence between multiple estimates from the same study. However, standard inference procedures for multivariate meta-analysis require the knowledge of within-study correlations, which are usually unavailable. This limits standard inference approaches in practice. Riley et al proposed a working model and an overall synthesis correlation parameter to account for the marginal correlation between outcomes, where the only data needed are those required for a separate univariate random-effects meta-analysis. As within-study correlations are not required, the Riley method is applicable to a wide variety of evidence synthesis situations. However, the standard variance estimator of the Riley method is not entirely correct under many important settings. As a consequence, the coverage of a function of pooled estimates may not reach the nominal level even when the number of studies in the multivariate meta-analysis is large. In this paper, we improve the Riley method by proposing a robust variance estimator, which is asymptotically correct even when the model is misspecified (ie, when the likelihood function is incorrect). Simulation studies of a bivariate meta-analysis, in a variety of settings, show a function of pooled estimates has improved performance when using the proposed robust variance estimator. In terms of individual pooled estimates themselves, the standard variance estimator and robust variance estimator give similar results to the original method, with appropriate coverage. The proposed robust variance estimator performs well when the number of studies is relatively large. Therefore, we recommend the use of the robust method for meta-analyses with a relatively large number of studies (eg, m≥50). When the sample size is relatively small, we recommend the use of the robust method under the working independence assumption. We illustrate the proposed method through 2 meta-analyses. Copyright © 2017 John Wiley & Sons, Ltd.
A Multivariate Descriptive Model of Motivation for Orthodontic Treatment.
ERIC Educational Resources Information Center
Hackett, Paul M. W.; And Others
1993-01-01
Motivation for receiving orthodontic treatment was studied among 109 young adults, and a multivariate model of the process is proposed. The combination of smallest scale analysis and Partial Order Scalogram Analysis by base Coordinates (POSAC) illustrates an interesting methodology for health treatment studies and explores motivation for dental…
ERIC Educational Resources Information Center
Grundmann, Matthias
Following the assumptions of ecological socialization research, adequate analysis of socialization conditions must take into account the multilevel and multivariate structure of social factors that impact on human development. This statement implies that complex models of family configurations or of socialization factors are needed to explain the…
Univariate Analysis of Multivariate Outcomes in Educational Psychology.
ERIC Educational Resources Information Center
Hubble, L. M.
1984-01-01
The author examined the prevalence of multiple operational definitions of outcome constructs and an estimate of the incidence of Type I error rates when univariate procedures were applied to multiple variables in educational psychology. Multiple operational definitions of constructs were advocated and wider use of multivariate analysis was…
Applied Statistics: From Bivariate through Multivariate Techniques [with CD-ROM]
ERIC Educational Resources Information Center
Warner, Rebecca M.
2007-01-01
This book provides a clear introduction to widely used topics in bivariate and multivariate statistics, including multiple regression, discriminant analysis, MANOVA, factor analysis, and binary logistic regression. The approach is applied and does not require formal mathematics; equations are accompanied by verbal explanations. Students are asked…
Evaluation of Meteorite Amino Acid Analysis Data Using Multivariate Techniques
NASA Technical Reports Server (NTRS)
McDonald, G.; Storrie-Lombardi, M.; Nealson, K.
1999-01-01
The amino acid distributions in the Murchison carbonaceous chondrite, Mars meteorite ALH84001, and ice from the Allan Hills region of Antarctica are shown, using a multivariate technique known as Principal Component Analysis (PCA), to be statistically distinct from the average amino acid composition of 101 terrestrial protein superfamilies.
Microenvironmental and biological/personal monitoring information were collected during the National Human Exposure Assessment Survey (NHEXAS), conducted in the six states comprising U.S. EPA Region Five. They have been analyzed by multivariate analysis techniques with general ...
Multivariate meta-analysis: a robust approach based on the theory of U-statistic.
Ma, Yan; Mazumdar, Madhu
2011-10-30
Meta-analysis is the methodology for combining findings from similar research studies asking the same question. When the question of interest involves multiple outcomes, multivariate meta-analysis is used to synthesize the outcomes simultaneously, taking into account the correlation between the outcomes. Likelihood-based approaches, in particular the restricted maximum likelihood (REML) method, are commonly utilized in this context. REML assumes a multivariate normal distribution for the random-effects model. This assumption is difficult to verify, especially for meta-analyses with a small number of component studies. The use of REML also requires iterative estimation between parameters, needing moderately high computation time, especially when the dimension of outcomes is large. A multivariate method of moments (MMM) is available and is shown to perform equally well to REML. However, there is a lack of information on the performance of these two methods when the true data distribution is far from normality. In this paper, we propose a new nonparametric and non-iterative method for multivariate meta-analysis on the basis of the theory of U-statistic and compare the properties of these three procedures under both normal and skewed data through simulation studies. It is shown that the effect on estimates from REML because of non-normal data distribution is marginal and that the estimates from MMM and U-statistic-based approaches are very similar. Therefore, we conclude that for performing multivariate meta-analysis, the U-statistic estimation procedure is a viable alternative to REML and MMM. Easy implementation of all three methods is illustrated by their application to data from two published meta-analyses from the fields of hip fracture and periodontal disease. We discuss ideas for future research based on U-statistic for testing significance of between-study heterogeneity and for extending the work to the meta-regression setting. Copyright © 2011 John Wiley & Sons, Ltd.
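For orientation, the non-iterative flavour of a method-of-moments estimator can be seen in the familiar one-outcome case. The sketch below implements the univariate DerSimonian-Laird estimator as an analogue only; it is not the multivariate method of moments or the U-statistic procedure from the paper, and the function name and inputs are hypothetical.

```python
def dersimonian_laird(effects, variances):
    """Univariate random-effects pooling with the method-of-moments tau^2.

    effects: study estimates y_i; variances: within-study variances s_i^2.
    Returns (pooled_estimate, pooled_variance, tau2).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                  # fixed-effect weights
    sw = sum(w)
    y_fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean
    q = sum(wi * (yi - y_fixed) ** 2 for wi, yi in zip(w, effects))
    c = sw - sum(wi * wi for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]    # random-effects weights
    sw_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sw_star
    return pooled, 1.0 / sw_star, tau2

# Hypothetical two-study example.
pooled, pooled_var, tau2 = dersimonian_laird([0.0, 1.0], [0.1, 0.1])
```

The between-study variance comes from a single closed-form expression, with no likelihood maximization, which is the computational advantage the abstract attributes to moment-based approaches over REML.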
Regression analysis for LED color detection of visual-MIMO system
NASA Astrophysics Data System (ADS)
Banik, Partha Pratim; Saha, Rappy; Kim, Ki-Doo
2018-04-01
Color detection from a light emitting diode (LED) array using a smartphone camera is very difficult in a visual multiple-input multiple-output (visual-MIMO) system. In this paper, we propose a method to determine the LED color using a smartphone camera by applying regression analysis. We employ a multivariate regression model to identify the LED color. After taking a picture of an LED array, we select the LED array region and detect the LEDs using an image processing algorithm. We then apply the k-means clustering algorithm to determine the number of potential colors for feature extraction of each LED. Finally, we apply the multivariate regression model to predict the color of the transmitted LEDs. We show our results for three types of environmental light condition: room environmental light, low environmental light (560 lux), and strong environmental light (2450 lux). We compare the results of our proposed algorithm in terms of training and test R-Square (%) values and the percentage of closeness of transmitted and predicted colors, and we also report the number of distorted test data points from the analysis of the distortion bar graph in CIE1931 color space.
A systematic uncertainty analysis for liner impedance eduction technology
NASA Astrophysics Data System (ADS)
Zhou, Lin; Bodén, Hans
2015-11-01
The so-called impedance eduction technology is widely used for obtaining acoustic properties of liners used in aircraft engines. The measurement uncertainties for this technology are still not well understood though it is essential for data quality assessment and model validation. A systematic framework based on multivariate analysis is presented in this paper to provide 95 percent confidence interval uncertainty estimates in the process of impedance eduction. The analysis is made using a single mode straightforward method based on transmission coefficients involving the classic Ingard-Myers boundary condition. The multivariate technique makes it possible to obtain an uncertainty analysis for the possibly correlated real and imaginary parts of the complex quantities. The results show that the errors in impedance results at low frequency mainly depend on the variability of transmission coefficients, while the mean Mach number accuracy is the most important source of error at high frequencies. The effect of Mach numbers used in the wave dispersion equation and in the Ingard-Myers boundary condition has been separated for comparison of the outcome of impedance eduction. A local Mach number based on friction velocity is suggested as a way to reduce the inconsistencies found when estimating impedance using upstream and downstream acoustic excitation.
Lizier, Joseph T; Heinzle, Jakob; Horstmann, Annette; Haynes, John-Dylan; Prokopenko, Mikhail
2011-02-01
The human brain undertakes highly sophisticated information processing facilitated by the interaction between its sub-regions. We present a novel method for interregional connectivity analysis, using multivariate extensions to the mutual information and transfer entropy. The method allows us to identify the underlying directed information structure between brain regions, and how that structure changes according to behavioral conditions. This method is distinguished in using asymmetric, multivariate, information-theoretical analysis, which captures not only directional and non-linear relationships, but also collective interactions. Importantly, the method is able to estimate multivariate information measures with only relatively little data. We demonstrate the method to analyze functional magnetic resonance imaging time series to establish the directed information structure between brain regions involved in a visuo-motor tracking task. Importantly, this results in a tiered structure, with known movement planning regions driving visual and motor control regions. Also, we examine the changes in this structure as the difficulty of the tracking task is increased. We find that task difficulty modulates the coupling strength between regions of a cortical network involved in movement planning and between motor cortex and the cerebellum which is involved in the fine-tuning of motor control. It is likely these methods will find utility in identifying interregional structure (and experimentally induced changes in this structure) in other cognitive tasks and data modalities.
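The directional, asymmetric character of transfer entropy mentioned above can be illustrated with a simple plug-in estimate for two discrete series and a history length of one. This is a deliberately simplified sketch with hypothetical binary data; the paper works with multivariate, continuous fMRI signals and estimators suited to small samples, none of which is reproduced here.

```python
import math
from collections import Counter

def transfer_entropy(source, target):
    """Plug-in transfer entropy source -> target (bits), history length 1.

    Both inputs are equal-length lists of discrete symbols.
    """
    # Each triple is (next target value, current target value, current source value)
    triples = list(zip(target[1:], target[:-1], source[:-1]))
    n = len(triples)
    c_full = Counter(triples)
    c_yx = Counter((y, x) for _, y, x in triples)
    c_yy = Counter((yn, y) for yn, y, _ in triples)
    c_y = Counter(y for _, y, _ in triples)
    te = 0.0
    for (yn, y, x), cnt in c_full.items():
        p_given_both = cnt / c_yx[(y, x)]        # p(y_next | y, x)
        p_given_self = c_yy[(yn, y)] / c_y[y]    # p(y_next | y)
        te += (cnt / n) * math.log2(p_given_both / p_given_self)
    return te

# Hypothetical series: y copies x with a one-step delay.
x = [0, 1, 1, 0, 1, 0, 0, 1]
y = [0] + x[:-1]
te_xy = transfer_entropy(x, y)   # information flowing from x to y
te_yx = transfer_entropy(y, x)   # generally differs: the measure is asymmetric
```

Because the target's next value is conditioned on its own past, the measure credits the source only with information the target's history does not already provide. Plug-in estimates like this one are biased for short series, so the numbers here are illustrative only.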
Wang, Yalin; Zhang, Jie; Gutman, Boris; Chan, Tony F.; Becker, James T.; Aizenstein, Howard J.; Lopez, Oscar L.; Tamburo, Robert J.; Toga, Arthur W.; Thompson, Paul M.
2010-01-01
Here we developed a new method, called multivariate tensor-based surface morphometry (TBM), and applied it to study lateral ventricular surface differences associated with HIV/AIDS. Using concepts from differential geometry and the theory of differential forms, we created mathematical structures known as holomorphic one-forms, to obtain an efficient and accurate conformal parameterization of the lateral ventricular surfaces in the brain. The new meshing approach also provides a natural way to register anatomical surfaces across subjects, and improves on prior methods as it handles surfaces that branch and join at complex 3D junctions. To analyze anatomical differences, we computed new statistics from the Riemannian surface metrics - these retain multivariate information on local surface geometry. We applied this framework to analyze lateral ventricular surface morphometry in 3D MRI data from 11 subjects with HIV/AIDS and 8 healthy controls. Our method detected a 3D profile of surface abnormalities even in this small sample. Multivariate statistics on the local tensors gave better effect sizes for detecting group differences, relative to other TBM-based methods including analysis of the Jacobian determinant, the largest and smallest eigenvalues of the surface metric, and the pair of eigenvalues of the Jacobian matrix. The resulting analysis pipeline may improve the power of surface-based morphometry studies of the brain. PMID:19900560
Multivariate spatial models of excess crash frequency at area level: case of Costa Rica.
Aguero-Valverde, Jonathan
2013-10-01
Recently, areal models of crash frequency have been used in the analysis of various area-wide factors affecting road crashes. On the other hand, disease mapping methods are commonly used in epidemiology to assess the relative risk of the population at different spatial units. A natural next step is to combine these two approaches to estimate the excess crash frequency at area level as a measure of absolute crash risk. Furthermore, multivariate spatial models of crash severity are explored in order to account for both frequency and severity of crashes and control for the spatial correlation frequently found in crash data. This paper aims to extend the concept of safety performance functions to be used in areal models of crash frequency. A multivariate spatial model is used for that purpose and compared to its univariate counterpart. A full Bayes hierarchical approach is used to estimate the models of crash frequency at canton level for Costa Rica. An intrinsic multivariate conditional autoregressive model is used for modeling spatial random effects. The results show that the multivariate spatial model performs better than its univariate counterpart in terms of the penalized goodness-of-fit measure Deviance Information Criterion. Additionally, the effects of the spatial smoothing due to the multivariate spatial random effects are evident in the estimation of excess equivalent property damage only crashes. Copyright © 2013 Elsevier Ltd. All rights reserved.
Challenging a dogma: five-year survival does not equal cure in all colorectal cancer patients.
Abdel-Rahman, Omar
2018-02-01
The current study tried to evaluate the factors affecting 10- to 20-year survival among long-term survivors (>5 years) of colorectal cancer (CRC). The Surveillance, Epidemiology and End Results (SEER) database (1988-2008) was queried through the SEER*Stat program. Univariate probability of overall and cancer-specific survival was determined and the difference between groups was examined. Multivariate analysis for factors affecting overall and cancer-specific survival was also conducted. Among node positive patients (Dukes C), 34% of the deaths beyond 5 years can be attributed to CRC; while among M1 patients, 63% of the deaths beyond 5 years can be attributed to CRC. The following factors were predictors of better overall survival in multivariate analysis: younger age, white race (versus black race), female gender, right colon location (versus rectal location), earlier stage and surgery (P < 0.0001 for all parameters). Similarly, the following factors were predictors of better cancer-specific survival in multivariate analysis: younger age, white race (versus black race), female gender, right colon location (versus left colon and rectal locations), earlier stage and surgery (P < 0.0001 for all parameters). Among node positive long-term CRC survivors, more than one third of all deaths can be attributed to CRC.
Redo surgery risk in patients with cardiac prosthetic valve dysfunction
Maciejewski, Marek; Piestrzeniewicz, Katarzyna; Bielecka-Dąbrowa, Agata; Piechowiak, Monika; Jaszewski, Ryszard
2011-01-01
Introduction The aim of the study was to analyse the risk factors of early and late mortality in patients undergoing the first reoperation for prosthetic valve dysfunction. Material and methods A retrospective observational study was performed in 194 consecutive patients (M = 75, F = 119; mean age 53.2 ± 11 years) with a mechanical prosthetic valve (n = 103 cases; 53%) or bioprosthesis (91; 47%). Univariate and multivariate Cox statistical analysis was performed to determine risk factors of early and late mortality. Results The overall early mortality was 18.6%: 31.4% in patients with symptoms of NYHA functional class III-IV and 3.4% in patients in NYHA class I-II. Multivariate analysis identified symptoms of NYHA class III-IV and endocarditis as independent predictors of early mortality. The overall late mortality (> 30 days) was 8.2% (0.62% year/patient). Multivariate analysis identified age at the time of reoperation as a strong independent predictor of late mortality. Conclusions Reoperation in patients with prosthetic valves, performed urgently, especially in patients with symptoms of NYHA class III-IV or in the case of endocarditis, bears a high mortality rate. Risk of planned reoperation, mostly in patients with symptoms of NYHA class I-II, does not differ from the risk of the first operation. PMID:22291767
Horsch, Salome; Kopczynski, Dominik; Kuthe, Elias; Baumbach, Jörg Ingo; Rahmann, Sven
2017-01-01
Motivation Disease classification from molecular measurements typically requires an analysis pipeline from raw noisy measurements to final classification results. Multi capillary column—ion mobility spectrometry (MCC-IMS) is a promising technology for the detection of volatile organic compounds in the air of exhaled breath. From raw measurements, the peak regions representing the compounds have to be identified, quantified, and clustered across different experiments. Currently, several steps of this analysis process require manual intervention of human experts. Our goal is to identify a fully automatic pipeline that yields competitive disease classification results compared to an established but subjective and tedious semi-manual process. Method We combine a large number of modern methods for peak detection, peak clustering, and multivariate classification into analysis pipelines for raw MCC-IMS data. We evaluate all combinations on three different real datasets in an unbiased cross-validation setting. We determine which specific algorithmic combinations lead to high AUC values in disease classifications across the different medical application scenarios. Results The best fully automated analysis process achieves even better classification results than the established manual process. The best algorithms for the three analysis steps are (i) SGLTR (Savitzky-Golay Laplace-operator filter thresholding regions) and LM (Local Maxima) for automated peak identification, (ii) EM clustering (Expectation Maximization) and DBSCAN (Density-Based Spatial Clustering of Applications with Noise) for the clustering step and (iii) RF (Random Forest) for multivariate classification. Thus, automated methods can replace the manual steps in the analysis process to enable an unbiased high throughput use of the technology. PMID:28910313
Classical least squares multivariate spectral analysis
Haaland, David M.
2002-01-01
An improved classical least squares (CLS) multivariate spectral analysis method, prediction-augmented classical least squares (PACLS), adds spectral shapes describing non-calibrated components and system effects (other than baseline corrections) present in the analyzed mixture to the prediction phase of the method. These improvements decrease or eliminate many of the restrictions of CLS-type methods and greatly extend their capabilities, accuracy, and precision. One new application of PACLS is the ability to accurately predict unknown sample concentrations when new unmodeled spectral components are present in the unknown samples. Other applications of PACLS include the incorporation of spectrometer drift into the quantitative multivariate model and the maintenance of a calibration on a drifting spectrometer. Finally, the ability of PACLS to transfer a multivariate model between spectrometers is demonstrated.
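The idea of adding a spectral shape at the prediction step can be sketched with an ordinary least-squares fit. The example below is a toy illustration under stated assumptions, not the patented procedure: the five-point "spectra", the helper names `solve` and `least_squares_amounts`, and the shape of the unmodeled component are all hypothetical.

```python
def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def least_squares_amounts(shapes, spectrum):
    """Coefficients c minimizing || sum_k c[k] * shapes[k] - spectrum ||^2 (normal equations)."""
    gram = [[sum(a * b for a, b in zip(s1, s2)) for s2 in shapes] for s1 in shapes]
    rhs = [sum(a * y for a, y in zip(s, spectrum)) for s in shapes]
    return solve(gram, rhs)


# Hypothetical pure-component spectra from a calibration, plus one shape
# that was not present during calibration.
pure_a = [1.0, 2.0, 1.0, 0.0, 0.0]
pure_b = [0.0, 0.0, 1.0, 2.0, 1.0]
unmodeled = [0.0, 1.0, 0.0, 1.0, 0.0]

# Unknown sample: 2 units of A, 3 units of B, 0.5 units of the unmodeled shape.
sample = [2 * a + 3 * b + 0.5 * u for a, b, u in zip(pure_a, pure_b, unmodeled)]

plain = least_squares_amounts([pure_a, pure_b], sample)                  # biased estimate
augmented = least_squares_amounts([pure_a, pure_b, unmodeled], sample)   # shape added
print(plain, augmented)
```

In this toy case the plain fit absorbs the unmodeled signal into the estimates for A and B, while the augmented fit recovers the amounts used to build the sample.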
Effect of microwave radiation on Jayadhar cotton fibers: WAXS studies
DOE Office of Scientific and Technical Information (OSTI.GOV)
Niranjana, A. R., E-mail: arnphysics@gmail.com; Mahesh, S. S.; Divakara, S.
The thermal effect of microwave energy on Jayadhar cotton fiber has been investigated. Microstructural parameters have been estimated using wide angle x-ray scattering (WAXS) data and a line profile analysis program developed by us. Physical properties such as tensile strength are correlated with the X-ray results. We observe that microwave radiation significantly affects many parameters, and we suggest a multivariate analysis of these parameters to arrive at a significant result.
NASA Astrophysics Data System (ADS)
Bhattacharjee, T.; Kumar, P.; Fillipe, L.
2018-02-01
Vibrational spectroscopy, especially FTIR and Raman, has shown enormous potential in disease diagnosis, especially in cancers. Its potential for detecting varied pathological conditions is regularly reported. However, to prove its applicability in clinics, large multi-center multi-national studies need to be undertaken; and these will result in an enormous amount of data. A parallel effort to develop analytical methods, including user-friendly software that can quickly pre-process data and subject them to the required multivariate analysis, is warranted in order to obtain results in real time. This study reports a MATLAB based script that can automatically import data, preprocess spectra (interpolation, derivatives, normalization), and then carry out Principal Component Analysis (PCA) followed by Linear Discriminant Analysis (LDA) of the first 10 PCs; all with a single click. The software has been verified on data obtained from cell lines, animal models, and in vivo patient datasets, and gives results comparable to Minitab 16 software. The software can be used to import a variety of file extensions, including .asc, .txt, .xls, and many others. Options to ignore noisy data, plot all possible graphs with PCA factors 1 to 5, and save loading factors, confusion matrices and other parameters are also present. The software can provide results for a dataset of 300 spectra within 0.01 s. We believe that the software will be vital not only in clinical trials using vibrational spectroscopic data, but also to obtain rapid results when these tools get translated into clinics.
Family Influences on College Students' Occupational Identity
ERIC Educational Resources Information Center
Berrios-Allison, Ana C.
2005-01-01
The occupational identity statuses of 232 college students were analyzed by examining their family emotional environment and the identity control processes that drive career decision making. Results of multivariate analysis showed that each family differentiation construct, family tolerance for connectedness, and separateness explained significant…
Walling, Craig A; Morrissey, Michael B; Foerster, Katharina; Clutton-Brock, Tim H; Pemberton, Josephine M; Kruuk, Loeske E B
2014-12-01
Evolutionary theory predicts that genetic constraints should be widespread, but empirical support for their existence is surprisingly rare. Commonly applied univariate and bivariate approaches to detecting genetic constraints can underestimate their prevalence, with important aspects potentially tractable only within a multivariate framework. However, multivariate genetic analyses of data from natural populations are challenging because of modest sample sizes, incomplete pedigrees, and missing data. Here we present results from a study of a comprehensive set of life history traits (juvenile survival, age at first breeding, annual fecundity, and longevity) for both males and females in a wild, pedigreed, population of red deer (Cervus elaphus). We use factor analytic modeling of the genetic variance-covariance matrix (G) to reduce the dimensionality of the problem and take a multivariate approach to estimating genetic constraints. We consider a range of metrics designed to assess the effect of G on the deflection of a predicted response to selection away from the direction of fastest adaptation and on the evolvability of the traits. We found limited support for genetic constraint through genetic covariances between traits, both within sex and between sexes. We discuss these results with respect to other recent findings and to the problems of estimating these parameters for natural populations. Copyright © 2014 Walling et al.
Walling, Craig A.; Morrissey, Michael B.; Foerster, Katharina; Clutton-Brock, Tim H.; Pemberton, Josephine M.; Kruuk, Loeske E. B.
2014-01-01
Evolutionary theory predicts that genetic constraints should be widespread, but empirical support for their existence is surprisingly rare. Commonly applied univariate and bivariate approaches to detecting genetic constraints can underestimate their prevalence, with important aspects potentially tractable only within a multivariate framework. However, multivariate genetic analyses of data from natural populations are challenging because of modest sample sizes, incomplete pedigrees, and missing data. Here we present results from a study of a comprehensive set of life history traits (juvenile survival, age at first breeding, annual fecundity, and longevity) for both males and females in a wild, pedigreed, population of red deer (Cervus elaphus). We use factor analytic modeling of the genetic variance–covariance matrix (G) to reduce the dimensionality of the problem and take a multivariate approach to estimating genetic constraints. We consider a range of metrics designed to assess the effect of G on the deflection of a predicted response to selection away from the direction of fastest adaptation and on the evolvability of the traits. We found limited support for genetic constraint through genetic covariances between traits, both within sex and between sexes. We discuss these results with respect to other recent findings and to the problems of estimating these parameters for natural populations. PMID:25278555
The intervals method: a new approach to analyse finite element outputs using multivariate statistics
De Esteban-Trivigno, Soledad; Püschel, Thomas A.; Fortuny, Josep
2017-01-01
Background In this paper, we propose a new method, named the intervals’ method, to analyse data from finite element models in a comparative multivariate framework. As a case study, several armadillo mandibles are analysed, showing that the proposed method is useful to distinguish and characterise biomechanical differences related to diet/ecomorphology. Methods The intervals’ method consists of generating a set of variables, each one defined by an interval of stress values. Each variable is expressed as a percentage of the area of the mandible occupied by those stress values. Afterwards these newly generated variables can be analysed using multivariate methods. Results Applying this novel method to the biological case study of whether armadillo mandibles differ according to dietary groups, we show that the intervals’ method is a powerful tool to characterize biomechanical performance and how this relates to different diets. This allows us to positively discriminate between specialist and generalist species. Discussion We show that the proposed approach is a useful methodology not affected by the characteristics of the finite element mesh. Additionally, the positive discriminating results obtained when analysing a difficult case study suggest that the proposed method could be a very useful tool for comparative studies in finite element analysis using multivariate statistical approaches. PMID:29043107
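The variable construction described in the Methods can be sketched as follows. This is a minimal reading of the abstract, assuming one stress value and one area per mesh element; the function name `interval_variables`, the bin edges, and the sample values are hypothetical.

```python
from bisect import bisect_right


def interval_variables(stresses, areas, edges):
    """Percentage of total area whose stress falls in each interval.

    Interval i covers [edges[i], edges[i+1]); the last interval is open-ended
    (stress >= edges[-1]). Values below edges[0] are counted in the first interval.
    """
    total = sum(areas)
    area_per_bin = [0.0] * len(edges)
    for stress, area in zip(stresses, areas):
        idx = max(bisect_right(edges, stress) - 1, 0)
        area_per_bin[idx] += area
    return [100.0 * a / total for a in area_per_bin]


# Hypothetical element stresses (arbitrary units) and element areas.
stresses = [0.5, 1.5, 2.5, 3.5]
areas = [1.0, 1.0, 2.0, 4.0]
edges = [0.0, 1.0, 2.0, 3.0]
variables = interval_variables(stresses, areas, edges)
print(variables)
```

Weighting by element area rather than counting elements is one plausible way to make the variables insensitive to mesh density, which matches the claim in the Discussion; the resulting percentages can then be passed to any multivariate method.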
Time Series Model Identification by Estimating Information.
1982-11-01
principle, Applications of Statistics, P. R. Krishnaiah, ed., North-Holland: Amsterdam, 27-41. Anderson, T. W. (1971). The Statistical Analysis of Time Series...E. (1969). Multiple Time Series Modeling, Multivariate Analysis II, edited by P. Krishnaiah, Academic Press: New York, 389-409. Parzen, E. (1981...Newton, H. J. (1980). Multiple Time Series Modeling, II Multivariate Analysis - V, edited by P. Krishnaiah, North Holland: Amsterdam, 181-197. Shibata, R
Genomic Analysis of Complex Microbial Communities in Wounds
2012-01-01
thoroughly in the ecology literature. Permutation Multivariate Analysis of Variance (PerMANOVA). We used PerMANOVA to test the null-hypothesis of no...difference between the bacterial communities found within a single wound compared to those from different patients (α = 0.05). PerMANOVA is a...permutation-based version of the multivariate analysis of variance (MANOVA). PerMANOVA uses the distances between samples to partition variance and
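A distance-based permutation test of this kind can be sketched in a few lines. The code below follows the commonly described pseudo-F formulation (sums of squared pairwise distances, group labels shuffled to build the null distribution); it uses Euclidean distance on made-up two-dimensional points, and the function names are assumptions, so it should be read as an illustration rather than the procedure used in the report.

```python
import random
from math import dist


def pseudo_f(points, labels):
    """Pseudo-F statistic from squared pairwise distances."""
    n = len(points)
    groups = sorted(set(labels))
    ss_total = sum(dist(points[i], points[j]) ** 2
                   for i in range(n) for j in range(i + 1, n)) / n
    ss_within = 0.0
    for g in groups:
        idx = [i for i in range(n) if labels[i] == g]
        ss_within += sum(dist(points[i], points[j]) ** 2
                         for a, i in enumerate(idx) for j in idx[a + 1:]) / len(idx)
    ss_among = ss_total - ss_within
    return (ss_among / (len(groups) - 1)) / (ss_within / (n - len(groups)))


def permanova(points, labels, n_perm=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    rng = random.Random(seed)
    f_obs = pseudo_f(points, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(points, shuffled) >= f_obs:
            hits += 1
    return f_obs, (hits + 1) / (n_perm + 1)


# Hypothetical samples: two groups of five points each, clearly separated.
group_a = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2), (0.1, 0.1)]
group_b = [(5.0, 5.0), (5.2, 5.1), (5.1, 5.3), (5.3, 5.2), (5.1, 5.1)]
points = group_a + group_b
labels = ["A"] * 5 + ["B"] * 5
f_obs, p_value = permanova(points, labels)
print(f_obs, p_value)
```

For community data, the Euclidean distance would typically be replaced by an ecological dissimilarity; only the `dist` calls would need to change.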
NASA Astrophysics Data System (ADS)
Guimarães Nobre, Gabriela; Arnbjerg-Nielsen, Karsten; Rosbjerg, Dan; Madsen, Henrik
2016-04-01
Traditionally, flood risk assessment studies have been carried out from a univariate frequency analysis perspective. However, statistical dependence between hydrological variables, such as extreme rainfall and extreme sea surge, plausibly exists, since both variables are to some extent driven by common meteorological conditions. To overcome this limitation, multivariate statistical techniques have the potential to combine different sources of flooding in the investigation. The aim of this study was to apply a range of statistical methodologies for analyzing combined extreme hydrological variables that can lead to coastal and urban flooding. The study area is the Elwood Catchment, which is a highly urbanized catchment located in the city of Port Phillip, Melbourne, Australia. The first part of the investigation dealt with the marginal extreme value distributions. Two approaches to extract extreme value series were applied (Annual Maximum and Partial Duration Series), and different probability distribution functions were fit to the observed sample. Results obtained by using the Generalized Pareto distribution demonstrate the ability of the Pareto family to model the extreme events. Advancing into multivariate extreme value analysis, first an investigation regarding the asymptotic properties of extremal dependence was carried out. As a weak positive asymptotic dependence between the bivariate extreme pairs was found, the Conditional method proposed by Heffernan and Tawn (2004) was chosen. This approach is suitable for modeling bivariate extreme values that are relatively unlikely to occur together. The results show that the probability of an extreme sea surge occurring during a one-hour intensity extreme precipitation event (or vice versa) can be twice as great as what would occur when assuming independent events. Therefore, presuming independence between these two variables would result in severe underestimation of the flooding risk in the study area.
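The comparison with the independence assumption can be made concrete with a simple empirical ratio of joint to product-of-marginal exceedance probabilities. This is a sketch only: it does not implement the Heffernan and Tawn conditional model, and the function name, thresholds, and indicator series below are hypothetical.

```python
def joint_exceedance_ratio(x, y, u, v):
    """Empirical P(X > u, Y > v) divided by P(X > u) * P(Y > v).

    A ratio of 1 is what independence would give; a ratio above 1 means
    joint extremes are more frequent than independence suggests.
    """
    n = len(x)
    p_x = sum(xi > u for xi in x) / n
    p_y = sum(yi > v for yi in y) / n
    p_xy = sum(xi > u and yi > v for xi, yi in zip(x, y)) / n
    return p_xy / (p_x * p_y)


# Hypothetical indicator series over 100 events: 1 marks an extreme value.
rain = [1] * 10 + [0] * 90                        # 10 extreme rainfall events
surge = [1] * 2 + [0] * 8 + [1] * 8 + [0] * 82    # 10 extreme surges, 2 coincide with rain
ratio = joint_exceedance_ratio(rain, surge, 0.5, 0.5)
print(ratio)
```

With these made-up counts the joint probability is 0.02 against 0.01 under independence, the same factor-of-two situation the abstract reports.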
Nojima, Masanori; Tokunaga, Mutsumi; Nagamura, Fumitaka
2018-05-05
To investigate under what circumstances inappropriate use of 'multivariate analysis' is likely to occur and to identify the population that needs more support with medical statistics. The frequency of inappropriate regression model construction in multivariate analysis and related factors were investigated in observational medical research publications. The inappropriate algorithm of using only variables that were significant in univariate analysis was estimated to occur at 6.4% (95% CI 4.8% to 8.5%). This was observed in 1.1% of the publications with a medical statistics expert (hereinafter 'expert') as the first author, 3.5% if an expert was included as coauthor and in 12.2% if experts were not involved. In the publications where the number of cases was 50 or less and the study did not include experts, inappropriate algorithm usage was observed with a high proportion of 20.2%. The OR of the involvement of experts for this outcome was 0.28 (95% CI 0.15 to 0.53). A further, nation-level analysis showed that the involvement of experts and the implementation of unfavourable multivariate analysis are negatively correlated (R=-0.652). Based on the results of this study, the benefit of participation of medical statistics experts is obvious. Experts should be involved for proper confounding adjustment and interpretation of statistical models. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Wen, Cheng; Dallimer, Martin; Carver, Steve; Ziv, Guy
2018-05-06
Despite the great potential of mitigating carbon emission, development of wind farms is often opposed by local communities due to the visual impact on landscape. A growing number of studies have applied nonmarket valuation methods like Choice Experiments (CE) to value the visual impact by eliciting respondents' willingness to pay (WTP) or willingness to accept (WTA) for hypothetical wind farms through survey questions. Several meta-analyses have been found in the literature to synthesize results from different valuation studies, but they have various limitations related to the use of the prevailing multivariate meta-regression analysis. In this paper, we propose a new meta-analysis method to establish general functions for the relationships between the estimated WTP or WTA and three wind farm attributes, namely the distance to residential/coastal areas, the number of turbines and turbine height. This method involves establishing WTA or WTP functions for individual studies, fitting the average derivative functions and deriving the general integral functions of WTP or WTA against wind farm attributes. Results indicate that respondents in different studies consistently showed increasing WTP for moving wind farms to greater distances, which can be fitted by non-linear (natural logarithm) functions. However, divergent preferences for the number of turbines and turbine height were found in different studies. We argue that the new analysis method proposed in this paper is an alternative to the mainstream multivariate meta-regression analysis for synthesizing CE studies and the general integral functions of WTP or WTA against wind farm attributes are useful for future spatial modelling and benefit transfer studies. We also suggest that future multivariate meta-analyses should include non-linear components in the regression functions. Copyright © 2018. Published by Elsevier B.V.
Mujica Ascencio, Saul; Choe, ChunSik; Meinke, Martina C; Müller, Rainer H; Maksimov, George V; Wigger-Alberti, Walter; Lademann, Juergen; Darvin, Maxim E
2016-07-01
Propylene glycol is one of the known substances added in cosmetic formulations as a penetration enhancer. Recently, nanocrystals have also been employed to increase the skin penetration of active components. Caffeine is a component with many applications and its penetration into the epidermis is controversially discussed in the literature. In the present study, the penetration ability of two components - caffeine nanocrystals and propylene glycol, applied topically on porcine ear skin in the form of a gel, was investigated ex vivo using two confocal Raman microscopes operated at different excitation wavelengths (785 nm and 633 nm). Several depth profiles were acquired in the fingerprint region and different spectral ranges, i.e., 526-600 cm⁻¹ and 810-880 cm⁻¹, were chosen for independent analysis of caffeine and propylene glycol penetration into the skin, respectively. Multivariate statistical methods such as principal component analysis (PCA) and linear discriminant analysis (LDA) combined with Student's t-test were employed to calculate the maximum penetration depths of each substance (caffeine and propylene glycol). The results show that propylene glycol penetrates significantly deeper than caffeine (20.7-22.0 μm versus 12.3-13.0 μm) without any penetration enhancement effect on caffeine. The results confirm that different substances, even if applied onto the skin as a mixture, can penetrate differently. The penetration depths of caffeine and propylene glycol obtained using two different confocal Raman microscopes are comparable, showing that both types of microscopes are well suited for such investigations and that multivariate statistical PCA-LDA methods combined with Student's t-test are very useful for analyzing the penetration of different substances into the skin. Copyright © 2016 Elsevier B.V. All rights reserved.
Settle, Steven; Vickery, Lillian; Nemirovskiy, Olga; Vidmar, Tom; Bendele, Alison; Messing, Dean; Ruminski, Peter; Schnute, Mark; Sunyer, Teresa
2010-10-01
To demonstrate that the novel highly selective matrix metalloproteinase 13 (MMP-13) inhibitor PF152 reduces joint lesions in adult dogs with osteoarthritis (OA) and decreases biomarkers of cartilage degradation. The potency and selectivity of PF152 were evaluated in vitro using 16 MMPs, TACE, and ADAMTS-4 and ADAMTS-5, as well as ex vivo in human cartilage explants. In vivo effects were evaluated at 3 concentrations in mature beagles with partial medial meniscectomy. Gross and histologic changes in the femorotibial joints were evaluated using various measures of cartilage degeneration. Biomarkers of cartilage turnover were examined in serum, urine, or synovial fluid. Results were analyzed individually and in combination using multivariate analysis. The potent and selective MMP-13 inhibitor PF152 decreased human cartilage degradation ex vivo in a dose-dependent manner. PF152 treatment of dogs with OA reduced cartilage lesions and decreased biomarkers of type II collagen (type II collagen neoepitope) and aggrecan (peptides ending in ARGN or AGEG) degradation. The dose required for significant inhibition varied with the measure used, but multivariate analysis of 6 gross and histologic measures indicated that all doses differed significantly from vehicle but not from each other. Combined analysis of cartilage degradation markers showed similar results. This highly selective MMP-13 inhibitor exhibits chondroprotective effects in mature animals. Biomarkers of cartilage degradation, when evaluated in combination, parallel the joint structural changes induced by the MMP-13 inhibitor. These data support the potential therapeutic value of selective MMP-13 inhibitors and the use of a set of appropriate biomarkers to predict efficacy in OA clinical trials.
Wood, Marnie J; Powell, Lawrie W; Dixon, Jeannette L; Subramaniam, V Nathan; Ramm, Grant A
2013-01-01
AIM: To investigate the role of genetic polymorphisms in the progression of hepatic fibrosis in hereditary haemochromatosis. METHODS: A cohort of 245 well-characterised C282Y homozygous patients with haemochromatosis was studied, with all subjects having liver biopsy data and DNA available for testing. This study assessed the association of eight single nucleotide polymorphisms (SNPs) in a total of six genes including toll-like receptor 4 (TLR4), transforming growth factor-beta (TGF-β), oxoguanine DNA glycosylase, monocyte chemoattractant protein 1, chemokine C-C motif receptor 2 and interleukin-10 with liver disease severity. Genotyping was performed using high resolution melt analysis and sequencing. The results were analysed in relation to the stage of hepatic fibrosis in multivariate analysis incorporating other cofactors including alcohol consumption and hepatic iron concentration. RESULTS: There were significant associations between the cofactors of male gender (P = 0.0001), increasing age (P = 0.006), alcohol consumption (P = 0.0001), steatosis (P = 0.03), hepatic iron concentration (P < 0.0001) and the presence of hepatic fibrosis. Of the candidate gene polymorphisms studied, none showed a significant association with hepatic fibrosis in univariate or multivariate analysis incorporating cofactors. We also specifically studied patients with hepatic iron loading above threshold levels for cirrhosis and compared the genetic polymorphisms between those with no fibrosis vs cirrhosis; however, there was no significant effect from any of the candidate genes studied. Importantly, in this large, well-characterised cohort of patients there was no association between SNPs for TGF-β or TLR4 and the presence of fibrosis, cirrhosis or increasing fibrosis stage in multivariate analysis.
CONCLUSION: In our large, well characterised group of haemochromatosis subjects we did not demonstrate any relationship between candidate gene polymorphisms and hepatic fibrosis or cirrhosis. PMID:24409064
Lavergne, Céline; Jeison, David; Ortega, Valentina; Chamy, Rolando; Donoso-Bravo, Andrés
2018-09-15
Considerable variability in the experimental results of anaerobic digestion lab tests has been reported. This study presents a meta-analysis coupled with multivariate analysis aiming to assess the impact of this experimental variability in batch and continuous operation at mesophilic and thermophilic anaerobic digestion of waste activated sludge. An analysis of variance showed that there was no significant difference between mesophilic and thermophilic conditions in either continuous or batch operation. Concerning the operation mode, the values of methane yield were significantly higher in batch experiments than in continuous reactors. According to the PCA, in both cases the methane yield is positively correlated with temperature increases. Interestingly, in the batch experiments, the higher the volatile solids content of the substrate, the lower the methane production, which is linked to experimental flaws when setting up those tests. In continuous mode, unlike in the batch tests, the methane yield is strongly (positively) correlated with the organic content of the substrate. Experimental standardization, above all for batch conditions, is urgently necessary; otherwise, results should be reported from continuous experiments. The modeling can also be a source of disturbance in batch tests. Copyright © 2018 Elsevier Ltd. All rights reserved.
Hybrid least squares multivariate spectral analysis methods
Haaland, David M.
2004-03-23
A set of hybrid least squares multivariate spectral analysis methods in which spectral shapes of components or effects not present in the original calibration step are added in a following prediction or calibration step to improve the accuracy of the estimation of the amount of the original components in the sampled mixture. The hybrid method herein means a combination of an initial calibration step with subsequent analysis by an inverse multivariate analysis method. A spectral shape herein means normally the spectral shape of a non-calibrated chemical component in the sample mixture but can also mean the spectral shapes of other sources of spectral variation, including temperature drift, shifts between spectrometers, spectrometer drift, etc. The shape can be continuous, discontinuous, or even discrete points illustrative of the particular effect.
ERIC Educational Resources Information Center
Tchumtchoua, Sylvie; Dey, Dipak K.
2012-01-01
This paper proposes a semiparametric Bayesian framework for the analysis of associations among multivariate longitudinal categorical variables in high-dimensional data settings. This type of data is frequent, especially in the social and behavioral sciences. A semiparametric hierarchical factor analysis model is developed in which the…
The impact of moderate wine consumption on the risk of developing prostate cancer
Ferro, Matteo; Foerster, Beat; Abufaraj, Mohammad; Briganti, Alberto; Karakiewicz, Pierre I; Shariat, Shahrokh F
2018-01-01
Objective To investigate the impact of moderate wine consumption on the risk of prostate cancer (PCa). We focused on the differential effect of moderate consumption of red versus white wine. Design This study was a meta-analysis that includes data from case–control and cohort studies. Materials and methods A systematic search of Web of Science, Medline/PubMed, and Cochrane library was performed on December 1, 2017. Studies were deemed eligible if they assessed the risk of PCa due to red, white, or any wine using multivariable logistic regression analysis. We performed a formal meta-analysis for the risk of PCa according to moderate wine and wine type consumption (white or red). Heterogeneity between studies was assessed using Cochran’s Q test and I² statistics. Publication bias was assessed using Egger’s regression test. Results A total of 930 abstracts and titles were initially identified. After removal of duplicates, reviews, and conference abstracts, 83 full-text original articles were screened. Seventeen studies (611,169 subjects) were included for final evaluation and fulfilled the inclusion criteria. For moderate wine consumption, the pooled risk ratio (RR) for the risk of PCa was 0.98 (95% CI 0.92–1.05, p=0.57) in the multivariable analysis. Moderate white wine consumption increased the risk of PCa with a pooled RR of 1.26 (95% CI 1.10–1.43, p=0.001) in the multivariable analysis. Meanwhile, moderate red wine consumption had a protective role reducing the risk by 12% (RR 0.88, 95% CI 0.78–0.999, p=0.047) in the multivariable analysis that comprised 222,447 subjects. Conclusions In this meta-analysis, moderate wine consumption did not impact the risk of PCa. Interestingly, regarding the type of wine, moderate consumption of white wine increased the risk of PCa, whereas moderate consumption of red wine had a protective effect.
Further analyses are needed to assess the differential molecular effect of white and red wine conferring their impact on PCa risk. PMID:29713200
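The heterogeneity measures named in this abstract can be illustrated with a minimal sketch. It assumes inverse-variance (fixed-effect) pooling on the log risk ratio scale, which the abstract does not specify, and the three (RR, lower, upper) triples are hypothetical values, not data from the included studies.

```python
import math

def log_rr_and_se(rr, ci_low, ci_high, z=1.96):
    """Convert a risk ratio and its 95% CI into a log estimate and standard error."""
    return math.log(rr), (math.log(ci_high) - math.log(ci_low)) / (2 * z)

def fixed_effect_summary(studies):
    """Inverse-variance pooled RR, Cochran's Q, and I^2 (in percent)."""
    estimates = [log_rr_and_se(*s) for s in studies]
    weights = [1.0 / se ** 2 for _, se in estimates]
    pooled = sum(w * y for w, (y, _) in zip(weights, estimates)) / sum(weights)
    # Cochran's Q: weighted squared deviations from the pooled log estimate.
    q = sum(w * (y - pooled) ** 2 for w, (y, _) in zip(weights, estimates))
    df = len(studies) - 1
    # I^2: share of total variation attributed to heterogeneity, floored at 0.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return math.exp(pooled), q, i2

# Hypothetical (RR, CI lower, CI upper) triples for illustration only.
example_studies = [(0.85, 0.70, 1.03), (0.95, 0.80, 1.13), (0.80, 0.60, 1.07)]
pooled_rr, q_stat, i2 = fixed_effect_summary(example_studies)
```

A random-effects model would additionally estimate a between-study variance; that step is left out here to keep the sketch short.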
The association between body mass index and severe biliary infections: a multivariate analysis.
Stewart, Lygia; Griffiss, J McLeod; Jarvis, Gary A; Way, Lawrence W
2012-11-01
Obesity has been associated with worse infectious disease outcomes. It is a risk factor for cholesterol gallstones, but little is known about associations between body mass index (BMI) and biliary infections. We studied this using factors associated with biliary infections. A total of 427 patients with gallstones were studied. Gallstones, bile, and blood (as applicable) were cultured. Illness severity was classified as follows: none (no infection or inflammation), systemic inflammatory response syndrome (fever, leukocytosis), severe (abscess, cholangitis, empyema), or multi-organ dysfunction syndrome (bacteremia, hypotension, organ failure). Associations between BMI and biliary bacteria, bacteremia, gallstone type, and illness severity were examined using bivariate and multivariate analysis. BMI inversely correlated with pigment stones, biliary bacteria, bacteremia, and increased illness severity on bivariate and multivariate analysis. Obesity correlated with less severe biliary infections. BMI inversely correlated with pigment stones and biliary bacteria; multivariate analysis showed an independent correlation between lower BMI and illness severity. Most patients with severe biliary infections had a normal BMI, suggesting that obesity may be protective in biliary infections. This study examined the correlation between BMI and biliary infection severity. Published by Elsevier Inc.
Multivariate analysis of longitudinal rates of change.
Bryan, Matthew; Heagerty, Patrick J
2016-12-10
Longitudinal data allow direct comparison of the change in patient outcomes associated with treatment or exposure. Frequently, several longitudinal measures are collected that either reflect a common underlying health status, or characterize processes that are influenced in a similar way by covariates such as exposure or demographic characteristics. Statistical methods that can combine multivariate response variables into common measures of covariate effects have been proposed in the literature. Current methods for characterizing the relationship between covariates and the rate of change in multivariate outcomes are limited to select models. For example, 'accelerated time' methods have been developed which assume that covariates rescale time in longitudinal models for disease progression. In this manuscript, we detail an alternative multivariate model formulation that directly structures longitudinal rates of change and that permits a common covariate effect across multiple outcomes. We detail maximum likelihood estimation for a multivariate longitudinal mixed model. We show via asymptotic calculations the potential gain in power that may be achieved with a common analysis of multiple outcomes. We apply the proposed methods to the analysis of a trivariate outcome for infant growth and compare rates of change for HIV infected and uninfected infants. Copyright © 2016 John Wiley & Sons, Ltd.
He, Shixuan; Xie, Wanyi; Zhang, Wei; Zhang, Liqun; Wang, Yunxia; Liu, Xiaoling; Liu, Yulong; Du, Chunlei
2015-02-25
A novel strategy which combines the iteratively cubic spline fitting baseline correction method with discriminant partial least squares qualitative analysis is employed to analyze the surface enhanced Raman scattering (SERS) spectroscopy of banned food additives, such as Sudan I dye and Rhodamine B in food, and Malachite green residues in aquaculture fish. Multivariate qualitative analysis methods, using the combination of iteratively cubic spline fitting (ICSF) baseline correction as spectral preprocessing with principal component analysis (PCA) and discriminant partial least squares (DPLS) classification, respectively, are applied to investigate the effectiveness of SERS spectroscopy for predicting the class assignments of unknown banned food additives. PCA cannot be used to predict the class assignments of unknown samples. However, the DPLS classification can discriminate the class assignment of unknown banned additives using the information of differences in relative intensities. The results demonstrate that SERS spectroscopy combined with the ICSF baseline correction method and the exploratory analysis methodology of DPLS classification can potentially be used for distinguishing banned food additives in the field of food safety. Copyright © 2014 Elsevier B.V. All rights reserved.
[Quality evaluation of American ginseng using UPLC coupled with multivariate analysis].
Tang, Yan; Yan, Shu-Mo; Wang, Jing-Jing; Yuan, Yuan; Yang, Bin
2016-05-01
An ultra performance liquid chromatography (UPLC) method combined with multivariate data analysis was developed to evaluate the quality of American ginseng by simultaneously determining the concentrations of six ginsenosides (Rg₁, Re, Rb₁, Rc, Ro and Rd) in the samples. For UPLC, acetonitrile with 0.01% formic acid and water with 0.01% formic acid were used as the mobile phase with gradient elution. Under the established chromatographic conditions, the six ginsenosides could be well separated, and the results for linearity, stability, precision, repeatability, and recovery rate all met the requirements of quantitative analysis. The total contents of Rg₁, Re, and Rb₁ in 57 samples all met the requirement of the 2015 edition of the Chinese Pharmacopoeia. At the same time, the experimental data were analyzed by principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA). The crude drugs and the decoction pieces can be discriminated by the PCA method, and samples of different ages can be distinguished by the PLS-DA method. Copyright © by the Chinese Pharmaceutical Association.
Multivariate analysis of toxicity experimental results of environmental endpoints. (FutureToxII)
The toxicity of hundreds of chemicals has been assessed in laboratory animal studies through EPA chemical regulation and toxicological research. Currently, over 5000 laboratory animal toxicity studies have been collected in the Toxicity Reference Database (ToxRefDB). In addition...
College Student Invulnerability Beliefs and HIV Vaccine Acceptability
ERIC Educational Resources Information Center
Ravert, Russell D.; Zimet, Gregory D.
2009-01-01
Objective: To examine behavioral history, beliefs, and vaccine characteristics as predictors of HIV vaccine acceptability. Methods: Two hundred forty-five US undergraduates were surveyed regarding their sexual history, risk beliefs, and likelihood of accepting hypothetical HIV vaccines. Results: Multivariate regression analysis indicated that…
Cai, Li-mei; Ma, Jin; Zhou, Yong-zhang; Huang, Lan-chun; Dou, Lei; Zhang, Cheng-bo; Fu, Shan-ming
2008-12-01
One hundred and eighteen surface soil samples were collected from Dongguan City and analyzed for concentrations of Cu, Zn, Ni, Cr, Pb, Cd, As and Hg, as well as pH and OM. The spatial distribution and sources of soil heavy metals were studied using multivariate geostatistical methods and GIS techniques. The results indicated that concentrations of Cu, Zn, Ni, Pb, Cd and Hg exceeded the soil background content in Guangdong province, and that concentrations of Pb, Cd and Hg in particular greatly exceeded it. The factor analysis grouped Cu, Zn, Ni, Cr and As in Factor 1, Pb and Hg in Factor 2 and Cd in Factor 3. The spatial maps based on geostatistical analysis show a definite association of Factor 1 with the soil parent material, whereas Factor 2 was mainly affected by industries. The spatial distribution of Factor 3 was attributed to anthropogenic influence.
Exploratory analysis of TOF-SIMS data from biological surfaces
NASA Astrophysics Data System (ADS)
Vaidyanathan, Seetharaman; Fletcher, John S.; Henderson, Alex; Lockyer, Nicholas P.; Vickerman, John C.
2008-12-01
The application of multivariate analytical tools enables simplification of TOF-SIMS datasets so that useful information can be extracted from complex spectra and images, especially those that do not give readily interpretable results. There is however a challenge in understanding the outputs from such analyses. The problem is complicated when analysing images, given the additional dimensions in the dataset. Here we demonstrate how the application of simple pre-processing routines can enable the interpretation of TOF-SIMS spectra and images. For the spectral data, TOF-SIMS spectra used to discriminate bacterial isolates associated with urinary tract infection were studied. Using different criteria for picking peaks before carrying out PC-DFA enabled identification of the discriminatory information with greater certainty. For the image data, an air-dried salt stressed bacterial sample, discussed in another paper by us in this issue, was studied. Exploration of the image datasets with and without normalisation prior to multivariate analysis by PCA or MAF resulted in different regions of the image being highlighted by the techniques.
Willis, Michael; Asseburg, Christian; Nilsson, Andreas; Johnsson, Kristina; Kartman, Bernt
2017-03-01
Type 2 diabetes mellitus (T2DM) is chronic and progressive and the cost-effectiveness of new treatment interventions must be established over long time horizons. Given the limited durability of drugs, assumptions regarding downstream rescue medication can drive results. Especially for insulin, for which treatment effects and adverse events are known to depend on patient characteristics, this can be problematic for health economic evaluation involving modeling. To estimate parsimonious multivariate equations of treatment effects and hypoglycemic event risks for use in parameterizing insulin rescue therapy in model-based cost-effectiveness analysis. Clinical evidence for insulin use in T2DM was identified in PubMed and from published reviews and meta-analyses. Study and patient characteristics and treatment effects and adverse event rates were extracted and the data used to estimate parsimonious treatment effect and hypoglycemic event risk equations using multivariate regression analysis. Data from 91 studies featuring 171 usable study arms were identified, mostly for premix and basal insulin types. Multivariate prediction equations for glycated hemoglobin A1c lowering and weight change were estimated separately for insulin-naive and insulin-experienced patients. Goodness of fit (R²) for both outcomes was generally good, ranging from 0.44 to 0.84. Multivariate prediction equations for symptomatic, nocturnal, and severe hypoglycemic events were also estimated, though considerable heterogeneity in definitions limits their usefulness. Parsimonious and robust multivariate prediction equations were estimated for glycated hemoglobin A1c and weight change, separately for insulin-naive and insulin-experienced patients. Using these in economic simulation modeling in T2DM can improve realism and flexibility in modeling insulin rescue medication. Copyright © 2017 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
A Statistical Discrimination Experiment for Eurasian Events Using a Twenty-Seven-Station Network
1980-07-08
to test the effectiveness of a multivariate method of analysis for distinguishing earthquakes from explosions. The data base for the experiment...the weight assigned to each variable whenever a new one is added. Jennrich, R. I. (1977). Stepwise discriminant analysis, in Statistical Methods for
2015-01-01
different PRBC transfusion volumes. We performed multivariate regression analysis using HRV metrics and routine vital signs to test the hypothesis that...study sponsors did not have any role in the study design, data collection, analysis and interpretation of data, report writing, or the decision to...primary outcome was hemorrhagic injury plus different PRBC transfusion volumes. We performed multivariate regression analysis using HRV metrics and
Multivariate optimum interpolation of surface pressure and winds over oceans
NASA Technical Reports Server (NTRS)
Bloom, S. C.
1984-01-01
The observations of surface pressure are quite sparse over oceanic areas. An effort to improve the analysis of surface pressure over oceans through the development of a multivariate surface analysis scheme which makes use of surface pressure and wind data is discussed. Although the present research used ship winds, future versions of this analysis scheme could utilize winds from additional sources, such as satellite scatterometer data.
Nonlinear multivariate and time series analysis by neural network methods
NASA Astrophysics Data System (ADS)
Hsieh, William W.
2004-03-01
Methods in multivariate statistical analysis are essential for working with large amounts of geophysical data, data from observational arrays, from satellites, or from numerical model output. In classical multivariate statistical analysis, there is a hierarchy of methods, starting with linear regression at the base, followed by principal component analysis (PCA) and finally canonical correlation analysis (CCA). A multivariate time series method, the singular spectrum analysis (SSA), has been a fruitful extension of the PCA technique. The common drawback of these classical methods is that only linear structures can be correctly extracted from the data. Since the late 1980s, neural network methods have become popular for performing nonlinear regression and classification. More recently, neural network methods have been extended to perform nonlinear PCA (NLPCA), nonlinear CCA (NLCCA), and nonlinear SSA (NLSSA). This paper presents a unified view of the NLPCA, NLCCA, and NLSSA techniques and their applications to various data sets of the atmosphere and the ocean (especially for the El Niño-Southern Oscillation and the stratospheric quasi-biennial oscillation). These data sets reveal that the linear methods are often too simplistic to describe real-world systems, with a tendency to scatter a single oscillatory phenomenon into numerous unphysical modes or higher harmonics, which can be largely alleviated in the new nonlinear paradigm.
Yoon, Jong H.; Tamir, Diana; Minzenberg, Michael J.; Ragland, J. Daniel; Ursu, Stefan; Carter, Cameron S.
2009-01-01
Background Multivariate pattern analysis is an alternative method of analyzing fMRI data, which is capable of decoding distributed neural representations. We applied this method to test the hypothesis of the impairment in distributed representations in schizophrenia. We also compared the results of this method with traditional GLM-based univariate analysis. Methods 19 schizophrenia and 15 control subjects viewed two runs of stimuli--exemplars of faces, scenes, objects, and scrambled images. To verify engagement with stimuli, subjects completed a 1-back matching task. A multi-voxel pattern classifier was trained to identify category-specific activity patterns on one run of fMRI data. Classification testing was conducted on the remaining run. Correlation of voxel-wise activity across runs evaluated variance over time in activity patterns. Results Patients performed the task less accurately. This group difference was reflected in the pattern analysis results with diminished classification accuracy in patients compared to controls, 59% and 72% respectively. In contrast, there was no group difference in GLM-based univariate measures. In both groups, classification accuracy was significantly correlated with behavioral measures. Both groups showed highly significant correlation between inter-run correlations and classification accuracy. Conclusions Distributed representations of visual objects are impaired in schizophrenia. This impairment is correlated with diminished task performance, suggesting that decreased integrity of cortical activity patterns is reflected in impaired behavior. Comparisons with univariate results suggest greater sensitivity of pattern analysis in detecting group differences in neural activity and reduced likelihood of non-specific factors driving these results. PMID:18822407
Dong, Chunjiao; Clarke, David B; Yan, Xuedong; Khattak, Asad; Huang, Baoshan
2014-09-01
Crash data are collected through police reports and integrated with road inventory data for further analysis. Integrated police reports and inventory data yield correlated multivariate data for roadway entities (e.g., segments or intersections). Analysis of such data reveals important relationships that can help focus on high-risk situations and develop safety countermeasures. To understand relationships between crash frequencies and associated variables, while taking full advantage of the available data, multivariate random-parameters models are appropriate since they can simultaneously consider the correlation among the specific crash types and account for unobserved heterogeneity. However, a key issue that arises with correlated multivariate data is that the number of crash-free samples increases, as crash counts have many categories. In this paper, we describe a multivariate random-parameters zero-inflated negative binomial (MRZINB) regression model for jointly modeling crash counts. The full Bayesian method is employed to estimate the model parameters. Crash frequencies at urban signalized intersections in Tennessee are analyzed. The paper investigates the performance of MZINB and MRZINB regression models in establishing the relationship between crash frequencies, pavement conditions, traffic factors, and geometric design features of roadway intersections. Compared to the MZINB model, the MRZINB model identifies additional statistically significant factors and provides better goodness of fit in developing the relationships. The empirical results show that the MRZINB model possesses most of the desirable statistical properties in terms of its ability to accommodate unobserved heterogeneity and excess zero counts in correlated data. Notably, in the random-parameters MZINB model, the estimated parameters vary significantly across intersections for different crash types. Copyright © 2014 Elsevier Ltd. All rights reserved.
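The zero-inflation idea behind the models in this abstract can be sketched for a single crash type. The following is a univariate illustration with fixed parameters, not the multivariate random-parameters Bayesian model the paper describes; the parameter names mu, alpha and pi_zero are assumptions chosen for the example.

```python
import math

def nb_pmf(k, mu, alpha):
    """Negative binomial pmf with mean mu and dispersion alpha
    (variance = mu + alpha * mu**2)."""
    r = 1.0 / alpha
    log_p = (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
             + r * math.log(r / (r + mu)) + k * math.log(mu / (r + mu)))
    return math.exp(log_p)

def zinb_pmf(k, mu, alpha, pi_zero):
    """Zero-inflated NB: a structural zero with probability pi_zero,
    otherwise a count drawn from the negative binomial."""
    base = (1.0 - pi_zero) * nb_pmf(k, mu, alpha)
    return pi_zero + base if k == 0 else base

# Hypothetical parameters: mean 2 crashes, dispersion 0.5, 30% structural zeros.
p_zero = zinb_pmf(0, mu=2.0, alpha=0.5, pi_zero=0.3)
```

With these illustrative parameters the zero-inflated model assigns more probability to a crash-free intersection than the plain negative binomial does, which is the excess-zero behaviour the abstract refers to.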
NASA Astrophysics Data System (ADS)
Das, Bappa; Sahoo, Rabi N.; Pargal, Sourabh; Krishna, Gopal; Verma, Rakesh; Chinnusamy, Viswanathan; Sehgal, Vinay K.; Gupta, Vinod K.; Dash, Sushanta K.; Swain, Padmini
2018-03-01
In the present investigation, the changes in sucrose, reducing and total sugar content due to water-deficit stress in rice leaves were modeled using visible, near infrared (VNIR) and shortwave infrared (SWIR) spectroscopy. The objectives of the study were to identify the best vegetation indices and suitable multivariate technique based on precise analysis of hyperspectral data (350 to 2500 nm) and sucrose, reducing sugar and total sugar content measured at different stress levels from 16 different rice genotypes. Spectral data analysis was done to identify suitable spectral indices and models for sucrose estimation. Novel spectral indices in the near infrared (NIR) range, viz. ratio spectral index (RSI) and normalised difference spectral indices (NDSI), sensitive to sucrose, reducing sugar and total sugar content were identified and subsequently calibrated and validated. The RSI and NDSI models had R² values of 0.65, 0.71 and 0.67 and RPD values of 1.68, 1.95 and 1.66 for sucrose, reducing sugar and total sugar, respectively, for the validation dataset. Different multivariate spectral models such as artificial neural network (ANN), multivariate adaptive regression splines (MARS), multiple linear regression (MLR), partial least square regression (PLSR), random forest regression (RFR) and support vector machine regression (SVMR) were also evaluated. The best performing multivariate models for sucrose, reducing sugars and total sugars were found to be MARS, ANN and MARS, respectively, with RPD values of 2.08, 2.44, and 1.93. Results indicated that VNIR and SWIR spectroscopy combined with multivariate calibration can be used as a reliable alternative to conventional methods for measurement of sucrose, reducing sugars and total sugars of rice under water-deficit stress as this technique is fast, economic, and noninvasive.
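The indices and the RPD figure of merit mentioned in this abstract are not defined in it. The sketch below uses the definitions common in the spectroscopy literature (a two-band ratio, a two-band normalised difference, and the ratio of the standard deviation of reference values to the root mean square prediction error); these are assumptions, and the band reflectances are hypothetical.

```python
import math

def rsi(r_i, r_j):
    """Ratio spectral index of two band reflectances."""
    return r_i / r_j

def ndsi(r_i, r_j):
    """Normalised difference spectral index of two band reflectances."""
    return (r_i - r_j) / (r_i + r_j)

def rpd(observed, predicted):
    """Ratio of performance to deviation: sample SD of the observed
    values divided by the RMSE of the predictions."""
    n = len(observed)
    mean_obs = sum(observed) / n
    sd = math.sqrt(sum((y - mean_obs) ** 2 for y in observed) / (n - 1))
    rmse = math.sqrt(sum((y - p) ** 2 for y, p in zip(observed, predicted)) / n)
    return sd / rmse

# Hypothetical reflectances at two NIR bands and a toy validation set.
example_ndsi = ndsi(0.6, 0.2)
example_rpd = rpd([1.0, 2.0, 3.0, 4.0], [1.5, 1.5, 3.5, 3.5])
```

Under these definitions a higher RPD means the prediction error is small relative to the natural spread of the measured values.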
Alkarkhi, Abbas F M; Ramli, Saifullah Bin; Easa, Azhar Mat
2009-01-01
Major (sodium, potassium, calcium, magnesium) and minor elements (iron, copper, zinc, manganese) and one heavy metal (lead) of Cavendish banana flour and Dream banana flour were determined, and data were analyzed using multivariate statistical techniques of factor analysis and discriminant analysis. Factor analysis yielded four factors explaining more than 81% of the total variance: the first factor explained 28.73%, comprising magnesium, sodium, and iron; the second factor explained 21.47%, comprising only manganese and copper; the third factor explained 15.66%, comprising zinc and lead; while the fourth factor explained 15.50%, comprising potassium. Discriminant analysis showed that magnesium and sodium exhibited a strong contribution in discriminating the two types of banana flour, affording 100% correct assignation. This study presents the usefulness of multivariate statistical techniques for analysis and interpretation of complex mineral content data from banana flour of different varieties.
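The "more than 81%" figure follows directly from the four factor shares reported in the abstract, as this short arithmetic check shows (the dictionary keys are labels chosen for the example).

```python
# Percent of total variance explained by each factor, as reported above.
factor_variance = {
    "factor 1 (Mg, Na, Fe)": 28.73,
    "factor 2 (Mn, Cu)": 21.47,
    "factor 3 (Zn, Pb)": 15.66,
    "factor 4 (K)": 15.50,
}

# Cumulative variance explained by the four retained factors.
total_explained = sum(factor_variance.values())
unexplained = 100.0 - total_explained
```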
Thiagarajah, Shankar; Wilkinson, J. Mark; Panoutsopoulou, Kalliope; Day‐Williams, Aaron G.; Cootes, Timothy F.; Wallis, Gillian A.; Loughlin, John; Arden, Nigel; Birrell, Fraser; Carr, Andrew; Chapman, Kay; Deloukas, Panos; Doherty, Michael; McCaskie, Andrew; Ollier, William E. R.; Rai, Ashok; Ralston, Stuart H.; Spector, Timothy D.; Valdes, Ana M.; Wallis, Gillian A.; Mark Wilkinson, J.; Zeggini, Eleftheria
2015-01-01
Objective To test whether previously reported hip morphology or osteoarthritis (OA) susceptibility loci are associated with proximal femur shape as represented by statistical shape model (SSM) modes and as univariate or multivariate quantitative traits. Methods We used pelvic radiographs and genotype data from 929 subjects with unilateral hip OA who had been recruited previously for the Arthritis Research UK Osteoarthritis Genetics Consortium genome‐wide association study. We built 3 SSMs capturing the shape variation of the OA‐unaffected proximal femur in the entire mixed‐sex cohort and for male/female‐stratified cohorts. We selected 41 candidate single‐nucleotide polymorphisms (SNPs) previously reported as being associated with hip morphology (for replication analysis) or OA (for discovery analysis) and for which genotype data were available. We performed 2 types of analysis for genotype–phenotype associations between these SNPs and the modes of the SSMs: 1) a univariate analysis using individual SSM modes and 2) a multivariate analysis using combinations of SSM modes. Results The univariate analysis identified association between rs4836732 (within the ASTN2 gene) and mode 5 of the female SSM (P = 0.0016) and between rs6976 (within the GLT8D1 gene) and mode 7 of the mixed‐sex SSM (P = 0.0003). The multivariate analysis identified association between rs5009270 (near the IFRD1 gene) and a combination of modes 3, 4, and 9 of the mixed‐sex SSM (P = 0.0004). Evidence of associations remained significant following adjustment for multiple testing. All 3 SNPs had previously been associated with hip OA. Conclusion These de novo findings suggest that rs4836732, rs6976, and rs5009270 may contribute to hip OA susceptibility by altering proximal femur shape. PMID:25939412
Phung, Dung; Huang, Cunrui; Rutherford, Shannon; Dwirahmadi, Febi; Chu, Cordia; Wang, Xiaoming; Nguyen, Minh; Nguyen, Nga Huy; Do, Cuong Manh; Nguyen, Trung Hieu; Dinh, Tuan Anh Diep
2015-05-01
The present study is an evaluation of temporal/spatial variations of surface water quality using multivariate statistical techniques, comprising cluster analysis (CA), principal component analysis (PCA), factor analysis (FA) and discriminant analysis (DA). Eleven water quality parameters were monitored at 38 different sites in Can Tho City, a Mekong Delta area of Vietnam from 2008 to 2012. Hierarchical cluster analysis grouped the 38 sampling sites into three clusters, representing mixed urban-rural areas, agricultural areas and industrial zone. FA/PCA resulted in three latent factors for the entire research location, three for cluster 1, four for cluster 2, and four for cluster 3 explaining 60, 60.2, 80.9, and 70% of the total variance in the respective water quality. The varifactors from FA indicated that the parameters responsible for water quality variations are related to erosion from disturbed land or inflow of effluent from sewage plants and industry, discharges from wastewater treatment plants and domestic wastewater, agricultural activities and industrial effluents, and contamination by sewage waste with faecal coliform bacteria through sewer and septic systems. Discriminant analysis (DA) revealed that nephelometric turbidity units (NTU), chemical oxygen demand (COD) and NH₃ are the discriminating parameters in space, affording 67% correct assignation in spatial analysis; pH and NO₂ are the discriminating parameters according to season, assigning approximately 60% of cases correctly. The findings suggest a possible revised sampling strategy that can reduce the number of sampling sites and the indicator parameters responsible for large variations in water quality. This study demonstrates the usefulness of multivariate statistical techniques for evaluation of temporal/spatial variations in water quality assessment and management.
Tan, Chao; Zhao, Jia; Dong, Feng
2015-03-01
Flow behavior characterization is important for understanding gas-liquid two-phase flow mechanics and for establishing a model to describe it. Electrical Resistance Tomography (ERT) provides information regarding flow conditions in the different directions where the sensing electrodes are installed. We extracted the multivariate sample entropy (MSampEn) by treating ERT data as a multivariate time series. The dynamic experimental results indicate that the MSampEn is sensitive to changes in the complexity of flow patterns including bubbly flow, stratified flow, plug flow and slug flow. MSampEn can characterize the flow behavior in different directions of two-phase flow, and reveal the transition between flow patterns when flow velocity changes. The proposed method is effective for analyzing two-phase flow pattern transitions by incorporating information from different scales and different spatial directions. Copyright © 2014 ISA. Published by Elsevier Ltd. All rights reserved.
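As a rough illustration of the entropy measure involved, the sketch below computes ordinary single-channel sample entropy. It is a simplification: the paper's MSampEn embeds several ERT channels jointly, which is not reproduced here. The tolerance r is taken as an absolute value (it is often set relative to the series' standard deviation), and both test series are synthetic.

```python
import math

def sample_entropy(series, m=2, r=0.2):
    """Univariate sample entropy: -ln(A / B), where B counts pairs of
    length-m templates within tolerance r (Chebyshev distance) and A
    counts the same pairs for length m + 1."""
    n = len(series)

    def count_matches(length):
        # Use n - m templates for both lengths so the counts are comparable.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                if max(abs(a - b) for a, b in zip(templates[i], templates[j])) <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")
    return -math.log(a / b)

# A perfectly regular series versus a chaotic logistic-map series.
regular = [0.0, 1.0] * 20
chaotic, x = [], 0.3
for _ in range(150):
    chaotic.append(x)
    x = 4.0 * x * (1.0 - x)

entropy_regular = sample_entropy(regular)
entropy_chaotic = sample_entropy(chaotic)
```

The regular series is fully predictable and scores zero, while the chaotic one scores higher, which mirrors how the measure separates flow patterns of different complexity.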
Mathew, Boby; Holand, Anna Marie; Koistinen, Petri; Léon, Jens; Sillanpää, Mikko J
2016-02-01
A novel reparametrization-based INLA approach as a fast alternative to MCMC for the Bayesian estimation of genetic parameters in multivariate animal models is presented. Multi-trait genetic parameter estimation is a relevant topic in animal and plant breeding programs because multi-trait analysis can take into account the genetic correlation between different traits, which significantly improves the accuracy of the genetic parameter estimates. Generally, multi-trait analysis is computationally demanding and requires initial estimates of genetic and residual correlations among the traits, which are difficult to obtain. In this study, we illustrate how to reparametrize covariance matrices of multivariate animal models using modified Cholesky decompositions. This reparametrization-based approach is used in the Integrated Nested Laplace Approximation (INLA) methodology to estimate genetic parameters of multivariate animal models. Immediate benefits are: (1) it avoids the difficulty of finding good starting values for the analysis, which can be a problem, for example, in Restricted Maximum Likelihood (REML); (2) Bayesian estimation of (co)variance components using INLA is faster to execute than using Markov Chain Monte Carlo (MCMC), especially when realized relationship matrices are dense. The slight drawback is that priors for covariance matrices are assigned to elements of the Cholesky factor rather than directly to the covariance matrix elements as in MCMC. Additionally, we illustrate the concordance of the INLA results with traditional approaches such as MCMC and REML. We also present results obtained from simulated data sets with replicates and field data in rice.
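A modified Cholesky decomposition writes a covariance matrix as L D Lᵀ, with L unit lower triangular and D diagonal. The sketch below shows that factorization on a hypothetical 2×2 covariance matrix; the exact parametrization used in the paper may differ, and the function names are illustrative.

```python
def ldl_decompose(sigma):
    """Factor a symmetric positive definite matrix as L * diag(d) * L^T,
    with L unit lower triangular."""
    n = len(sigma)
    lower = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        d[j] = sigma[j][j] - sum(lower[j][k] ** 2 * d[k] for k in range(j))
        for i in range(j + 1, n):
            partial = sum(lower[i][k] * lower[j][k] * d[k] for k in range(j))
            lower[i][j] = (sigma[i][j] - partial) / d[j]
    return lower, d

def rebuild(lower, d):
    """Recompose the covariance matrix from its factors."""
    n = len(d)
    return [[sum(lower[i][k] * d[k] * lower[j][k] for k in range(n))
             for j in range(n)] for i in range(n)]

# Hypothetical 2x2 covariance matrix for two traits.
sigma = [[4.0, 2.0], [2.0, 3.0]]
lower, d = ldl_decompose(sigma)
```

The appeal of this form is that the off-diagonal entries of L are unconstrained and only the entries of D must be positive, so any such choice rebuilds a valid covariance matrix.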
Guo, L W; Liu, S Z; Zhang, M; Chen, Q; Zhang, S K; Sun, X B
2017-12-10
Objective: To investigate the effect of fried food intake on the pathogenesis of esophageal cancer and precancerous lesions. Methods: From 2005 to 2013, all residents aged 40-69 years from 11 counties (cities) in rural areas of Henan province where screening for upper gastrointestinal cancer had been conducted were recruited as study subjects. Information on demography and lifestyle was collected. The residents under study were screened with iodine staining endoscopic examination, and biopsy samples were diagnosed pathologically under standardized criteria. Subjects with high risk were divided into groups based on their pathological degrees. Multivariate ordinal logistic regression analysis was used to analyze the relationship between the frequency of fried food intake and esophageal cancer and precancerous lesions. Results: A total of 8 792 cases with normal esophagus, 3 680 with mild hyperplasia, 972 with moderate hyperplasia, 413 with severe hyperplasia carcinoma in situ, and 336 cases of esophageal cancer were recruited. Results from multivariate logistic regression analysis showed that, compared with those who did not eat fried food, the intake of fried food (<2 times/week: OR=1.60, 95% CI: 1.40-1.83; ≥2 times/week: OR=2.58, 95% CI: 1.98-3.37) appeared to be a risk factor for both esophageal cancer and precancerous lesions after adjustment for age, sex, marital status, educational level, body mass index, smoking and alcohol intake. Conclusion: The intake of fried food appeared to be a risk factor for both esophageal cancer and precancerous lesions.
Liebenberg, Leandi; L'Abbé, Ericka N; Stull, Kyra E
2015-12-01
The cranium is widely recognized as the most important skeletal element to use when evaluating population differences and estimating ancestry. However, the cranium is not always intact or available for analysis, which emphasizes the need for postcranial alternatives. The purpose of this study was to quantify postcraniometric differences among South Africans that can be used to estimate ancestry. Thirty-nine standard measurements from 11 postcranial bones were collected from 360 modern black, white and coloured South Africans; the sex and ancestry distribution were equal. Group differences were explored with analysis of variance (ANOVA) and Tukey's honestly significant difference (HSD) test. Linear and flexible discriminant analysis (LDA and FDA, respectively) were conducted with bone models as well as numerous multivariate subsets to identify the model and method that yielded the highest correct classifications. Leave-one-out (LDA) and k-fold (k=10; FDA) cross-validation with equal priors were used for all models. ANOVA and Tukey's HSD results reveal statistically significant differences between at least two of the three groups for the majority of the variables, with varying degrees of group overlap. Bone models, which consisted of all measurements per bone, resulted in low accuracies that ranged from 46% to 63% (LDA) and 41% to 66% (FDA). In contrast, the multivariate subsets, which consisted of different variable combinations from all elements, achieved accuracies as high as 85% (LDA) and 87% (FDA). Thus, when using a multivariate approach, the postcranial skeleton can distinguish among three modern South African groups with high accuracy. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
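The leave-one-out procedure mentioned in this abstract can be sketched as follows. A nearest-centroid rule is used here as a simple stand-in for LDA, which it is not, and the measurements and group labels are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, x):
    """Assign x to the class whose mean vector is closest (squared Euclidean)."""
    sums, counts = {}, {}
    for features, label in zip(train_x, train_y):
        acc = sums.setdefault(label, [0.0] * len(features))
        for k, value in enumerate(features):
            acc[k] += value
        counts[label] = counts.get(label, 0) + 1

    def sq_dist(label):
        return sum((s / counts[label] - v) ** 2 for s, v in zip(sums[label], x))

    return min(sums, key=sq_dist)

def leave_one_out_accuracy(xs, ys):
    """Hold out each sample in turn, train on the rest, and score the hit rate."""
    hits = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        hits += nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]
    return hits / len(xs)

# Hypothetical two-measurement samples from two well-separated groups.
xs = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [5.0, 5.0], [5.2, 4.9], [4.8, 5.1]]
ys = ["a", "a", "a", "b", "b", "b"]
accuracy = leave_one_out_accuracy(xs, ys)
```

Because every sample is classified by a model that never saw it, the resulting accuracy is less optimistic than one computed on the training data itself.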
NASA Astrophysics Data System (ADS)
Åberg Lindell, M.; Andersson, P.; Grape, S.; Hellesen, C.; Håkansson, A.; Thulin, M.
2018-03-01
This paper investigates how concentrations of certain fission products and their related gamma-ray emissions can be used to discriminate between uranium oxide (UOX) and mixed oxide (MOX) type fuel. Discrimination of irradiated MOX fuel from irradiated UOX fuel is important in nuclear facilities and for transport of nuclear fuel, for purposes of both criticality safety and nuclear safeguards. Although facility operators keep records on the identity and properties of each fuel, tools for nuclear safeguards inspectors that enable independent verification of the fuel are critical in the recovery of continuity of knowledge, should it be lost. A discrimination methodology for classification of UOX and MOX fuel, based on passive gamma-ray spectroscopy data and multivariate analysis methods, is presented. Nuclear fuels and their gamma-ray emissions were simulated in the Monte Carlo code Serpent, and the resulting data was used as input to train seven different multivariate classification techniques. The trained classifiers were subsequently implemented and evaluated with respect to their capabilities to correctly predict the classes of unknown fuel items. The best results concerning successful discrimination of UOX and MOX-fuel were acquired when using non-linear classification techniques, such as the k nearest neighbors method and the Gaussian kernel support vector machine. For fuel with cooling times up to 20 years, when it is considered that gamma-rays from the isotope 134Cs can still be efficiently measured, success rates of 100% were obtained. A sensitivity analysis indicated that these methods were also robust.
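A minimal sketch of the k nearest neighbors method named above, assuming each fuel item is summarized by a short feature vector. The two features and all values below are invented for illustration and are not Serpent output.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Majority vote among the k training items closest to `query`.

    train: list of (feature_list, label) pairs.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical feature vectors (for example, two gamma-peak intensity ratios).
train = [([0.10, 0.80], "UOX"), ([0.12, 0.75], "UOX"), ([0.09, 0.82], "UOX"),
         ([0.30, 0.40], "MOX"), ([0.33, 0.38], "MOX"), ([0.28, 0.45], "MOX")]

print(knn_predict(train, [0.11, 0.79]))  # an item resembling the UOX group
print(knn_predict(train, [0.31, 0.41]))  # an item resembling the MOX group
```

The method is non-linear in the sense that the decision boundary follows the local arrangement of training points rather than a single hyperplane.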
PYCHEM: a multivariate analysis package for python.
Jarvis, Roger M; Broadhurst, David; Johnson, Helen; O'Boyle, Noel M; Goodacre, Royston
2006-10-15
We have implemented a multivariate statistical analysis toolbox, with an optional standalone graphical user interface (GUI), using the Python scripting language. This is a free and open source project that addresses the need for a multivariate analysis toolbox in Python. Although the functionality provided does not cover the full range of multivariate tools that are available, it has a broad complement of methods that are widely used in the biological sciences. In contrast to tools like MATLAB, PyChem 2.0.0 is easily accessible and free, allows for rapid extension using a range of Python modules and is part of the growing amount of complementary and interoperable scientific software in Python based upon SciPy. One of the attractions of PyChem is that it is an open source project and so there is an opportunity, through collaboration, to increase the scope of the software and to continually evolve a user-friendly platform that has applicability across a wide range of analytical and post-genomic disciplines. http://sourceforge.net/projects/pychem
¹H NMR and multivariate data analysis of the relationship between the age and quality of duck meat.
Liu, Chunli; Pan, Daodong; Ye, Yangfang; Cao, Jinxuan
2013-11-15
To contribute to a better understanding of the factors affecting meat quality, we investigated the influence of age on the chemical composition of duck meat. Aging probably affects the quality of meat through changes in metabolism. Therefore, we studied the metabolic composition of duck meat using ¹H nuclear magnetic resonance (NMR) spectroscopy. Comprehensive multivariate data analysis showed significant differences between extracts from ducks that had been aged for four different time periods. Although lactate and anserine increased with age, fumarate, betaine, taurine, inosine and alkyl-substituted free amino acids decreased. These results contribute to a better understanding of changes in duck meat metabolism as meat ages, which could be used to help assess the quality of duck meat as a food. Copyright © 2013 Elsevier Ltd. All rights reserved.
Unsupervised pattern recognition methods in ciders profiling based on GCE voltammetric signals.
Jakubowska, Małgorzata; Sordoń, Wanda; Ciepiela, Filip
2016-07-15
This work presents a complete methodology of distinguishing between different brands of cider and ageing degrees, based on voltammetric signals, utilizing dedicated data preprocessing procedures and unsupervised multivariate analysis. It was demonstrated that voltammograms recorded on glassy carbon electrode in Britton-Robinson buffer at pH 2 are reproducible for each brand. By application of clustering algorithms and principal component analysis visible homogenous clusters were obtained. Advanced signal processing strategy which included automatic baseline correction, interval scaling and continuous wavelet transform with dedicated mother wavelet, was a key step in the correct recognition of the objects. The results show that voltammetry combined with optimized univariate and multivariate data processing is a sufficient tool to distinguish between ciders from various brands and to evaluate their freshness. Copyright © 2016 Elsevier Ltd. All rights reserved.
Kernel canonical-correlation Granger causality for multiple time series
NASA Astrophysics Data System (ADS)
Wu, Guorong; Duan, Xujun; Liao, Wei; Gao, Qing; Chen, Huafu
2011-04-01
Canonical-correlation analysis as a multivariate statistical technique has been applied to multivariate Granger causality analysis to infer information flow in complex systems. It shows unique appeal and great superiority over the traditional vector autoregressive method, due to the simplified procedure that detects causal interaction between multiple time series, and the avoidance of potential model estimation problems. However, it is limited to the linear case. Here, we extend the framework of canonical correlation to include the estimation of multivariate nonlinear Granger causality for drawing inference about directed interaction. Its feasibility and effectiveness are verified on simulated data.
Keller, Lisa A; Clauser, Brian E; Swanson, David B
2010-12-01
In recent years, demand for performance assessments has continued to grow. However, performance assessments are notorious for lower reliability, and in particular, low reliability resulting from task specificity. Since reliability analyses typically treat the performance tasks as randomly sampled from an infinite universe of tasks, these estimates of reliability may not be accurate. For tests built according to a table of specifications, tasks are randomly sampled from different strata (content domains, skill areas, etc.). If these strata remain fixed in the test construction process, ignoring this stratification in the reliability analysis results in an underestimate of "parallel forms" reliability, and an overestimate of the person-by-task component. This research explores the effect of representing and misrepresenting the stratification appropriately in estimation of reliability and the standard error of measurement. Both multivariate and univariate generalizability studies are reported. Results indicate that the proper specification of the analytic design is essential in yielding the proper information both about the generalizability of the assessment and the standard error of measurement. Further, illustrative D studies present the effect under a variety of situations and test designs. Additional benefits of multivariate generalizability theory in test design and evaluation are also discussed.
Comparison of Optimum Interpolation and Cressman Analyses
NASA Technical Reports Server (NTRS)
Baker, W. E.; Bloom, S. C.; Nestler, M. S.
1984-01-01
The objective of this investigation is to develop a state-of-the-art optimum interpolation (O/I) objective analysis procedure for use in numerical weather prediction studies. A three-dimensional multivariate O/I analysis scheme has been developed. Some characteristics of the GLAS O/I compared with those of the NMC and ECMWF systems are summarized. Some recent enhancements of the GLAS scheme include a univariate analysis of water vapor mixing ratio, a geographically dependent model prediction error correlation function and a multivariate oceanic surface analysis.
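For context on the Cressman analysis named in the title, the sketch below shows the classical Cressman distance weighting applied to observation-minus-background differences at one grid point. The function names, the radius and the sample values are assumptions for illustration. Optimum interpolation differs in that its weights are derived from error covariances rather than from distance alone.

```python
def cressman_weight(distance, radius):
    """Cressman weight: (R^2 - d^2) / (R^2 + d^2) inside the radius, else 0."""
    if distance >= radius:
        return 0.0
    r2, d2 = radius ** 2, distance ** 2
    return (r2 - d2) / (r2 + d2)

def cressman_correction(background, innovations, distances, radius):
    """Correct a background value with a weighted mean of innovations.

    innovations: observation minus background at each observation site.
    distances:   distance from the grid point to each observation site.
    """
    weights = [cressman_weight(d, radius) for d in distances]
    total = sum(weights)
    if total == 0.0:
        return background  # no observation within the influence radius
    weighted = sum(w * inc for w, inc in zip(weights, innovations))
    return background + weighted / total

# Hypothetical grid-point value with two nearby observations (distances in km).
print(cressman_correction(10.0, [2.0, 1.0], [10.0, 50.0], radius=100.0))
```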
Comparison of Optimum Interpolation and Cressman Analyses
NASA Technical Reports Server (NTRS)
Baker, W. E.; Bloom, S. C.; Nestler, M. S.
1985-01-01
The development of a state of the art optimum interpolation (O/I) objective analysis procedure for use in numerical weather prediction studies was investigated. A three dimensional multivariate O/I analysis scheme was developed. Some characteristics of the GLAS O/I compared with those of the NMC and ECMWF systems are summarized. Some recent enhancements of the GLAS scheme include a univariate analysis of water vapor mixing ratio, a geographically dependent model prediction error correlation function and a multivariate oceanic surface analysis.
Feng, Xiao-Liang; He, Yun-Biao; Liang, Yi-Zeng; Wang, Yu-Lin; Huang, Lan-Fang; Xie, Jian-Wei
2013-01-01
Gas chromatography-mass spectrometry and multivariate curve resolution were applied to the differential analysis of the volatile components in Agrimonia eupatoria specimens from different plant parts. After extraction by water distillation, the volatile components from leaves and roots were detected by GC-MS. The qualitative and quantitative analysis of the volatile components in the main root was then completed with the help of subwindow factor analysis, which resolves the two-dimensional original data into mass spectra and chromatograms. Of the 87 separated constituents in the total ion chromatogram of the volatile components, 68 were identified and quantified, accounting for about 87.03% of the total content. The common peaks in the leaf were then extracted with the orthogonal projection resolution method. Among the components determined, 52 coexisted in the studied samples, although the relative content of each component differed to some extent. The results showed fair consistency in the GC-MS fingerprints. This was the first application of the orthogonal projection method to compare different plant parts of Agrimonia eupatoria, and it reduced both the burden and the subjectivity of qualitative analysis. The results show that the combined approach is powerful for the analysis of complex Agrimonia eupatoria samples. The developed method can be used for further study and quality control of Agrimonia eupatoria. PMID:24286016
The Multivariate Nature of Professional Job Satisfaction.
ERIC Educational Resources Information Center
Wood, Donald A.; LeBold, William K.
Discussed are two theories of professional job satisfaction--(1) unidimensional and (2) multidimensional with special reference to Herzberg's two factor theory. A national sample of over 3,000 engineering graduates responded to a questionnaire and satisfaction index. Analysis of results revealed that job satisfaction is multidimensional. Job…
On Recruiting: A Multivariate Analysis of Marine Corps Recruiters and the Market
Three recommendations result from this study. The quantitative recommendation developed in this thesis is to add approximately three missioned...canvassing recruiters per Recruiting Station, or 144 total, where the marginal cost of the 1,400 potentially gained contracts is the most economical manpower
Tracking Problem Solving by Multivariate Pattern Analysis and Hidden Markov Model Algorithms
ERIC Educational Resources Information Center
Anderson, John R.
2012-01-01
Multivariate pattern analysis can be combined with Hidden Markov Model algorithms to track the second-by-second thinking as people solve complex problems. Two applications of this methodology are illustrated with a data set taken from children as they interacted with an intelligent tutoring system for algebra. The first "mind reading" application…
ERIC Educational Resources Information Center
Martin, James L.
This paper reports on attempts by the author to construct a theoretical framework of adult education participation using a theory development process and the corresponding multivariate statistical techniques. Two problems are identified: the lack of theoretical framework in studying problems, and the limiting of statistical analysis to univariate…
Missing Data and Multiple Imputation in the Context of Multivariate Analysis of Variance
ERIC Educational Resources Information Center
Finch, W. Holmes
2016-01-01
Multivariate analysis of variance (MANOVA) is widely used in educational research to compare means on multiple dependent variables across groups. Researchers faced with the problem of missing data often use multiple imputation of values in place of the missing observations. This study compares the performance of 2 methods for combining p values in…
Web-Based Tools for Modelling and Analysis of Multivariate Data: California Ozone Pollution Activity
ERIC Educational Resources Information Center
Dinov, Ivo D.; Christou, Nicolas
2011-01-01
This article presents a hands-on web-based activity motivated by the relation between human health and ozone pollution in California. This case study is based on multivariate data collected monthly at 20 locations in California between 1980 and 2006. Several strategies and tools for data interrogation and exploratory data analysis, model fitting…
Multivariate analysis of climate along the southern coast of Alaska: some forestry implications.
Wilbur A. Farr; John S. Hard
1987-01-01
A multivariate analysis of climate was used to delineate 10 significantly different groups of climatic stations along the southern coast of Alaska based on latitude, longitude, seasonal temperatures and precipitation, frost-free periods, and total number of growing degree days. The climatic stations were too few to delineate this rugged, mountainous region into...
Shi, Wenhao; Zhang, Silin; Zhao, Wanqiu; Xia, Xue; Wang, Min; Wang, Hui; Bai, Haiyan; Shi, Juanzi
2013-07-01
What factors does multivariate logistic regression show to be significantly associated with the likelihood of clinical pregnancy in vitrified-warmed embryo transfer (VET) cycles? Assisted hatching (AH) and freezing embryos to avoid the risk of ovarian hyperstimulation syndrome (OHSS) were significantly positively associated with a greater likelihood of clinical pregnancy. Single-factor analysis has shown AH, the number of embryos transferred and freezing because of OHSS to be positively, and damaged blastomeres to be negatively, significantly associated with the chance of clinical pregnancy after VET. It remained unclear which factors would be significant after multivariate analysis. The study was a retrospective analysis of 2313 VET cycles from 1481 patients performed between January 2008 and April 2012. A multivariate logistic regression analysis was performed to identify the factors affecting the clinical pregnancy outcome of VET. There were 22 candidate variables selected on the basis of clinical experience and the literature. With thresholds of α entry = α removal = 0.05 for both variable entry and variable removal, eight variables were chosen for the multivariable model by the bootstrap stepwise variable selection algorithm (n = 1000). The eight variables were age at controlled ovarian hyperstimulation (COH), reason for freezing, AH, endometrial thickness, damaged blastomere, number of embryos transferred, number of good-quality embryos, and blood presence on the transfer catheter. A descriptive comparison of relative importance was made using the proportion of explained variation (PEV).
Among the reasons for freezing, the OHSS group showed a higher OR than the surplus embryo group when each was compared with the group frozen for other reasons (OHSS versus Other, OR: 2.145; CI: 1.4-3.286; Surplus embryos versus Other, OR: 1.152; CI: 0.761-1.743) and high PEV (marginal 2.77%, P = 0.2911; partial 1.68%; CI of area under the receiver operating characteristic (ROC) curve: 0.5576-0.6000). AH also showed a high OR (OR: 2.105, CI: 1.554-2.85) and high PEV (marginal 1.97%; partial 1.02%; CI of area under ROC: 0.5344-0.5647). The number of good-quality embryos showed the highest marginal PEV and partial PEV (marginal 3.91%, partial 2.28%; CI of area under ROC: 0.5886-0.6343). This was a retrospective multivariate analysis of the data obtained in 5 years from a single IVF center. Repeated cycles in the same woman were treated as independent observations, which could introduce bias. Results are based on clinical pregnancy and not live births. Prospective analysis of a larger data set from a multicenter study based on live births is necessary to confirm the findings. Paying attention to the quality of embryos, the number of good embryos, AH and the reasons for freezing that are associated with clinical pregnancy after VET will assist the improvement of success rates.
Kragelj, Borut
2016-03-01
Aiming at improving treatment individualization in patients with prostate cancer treated with combination of external beam radiotherapy and high-dose-rate brachytherapy to boost the dose to prostate (HDRB-B), the objective was to evaluate factors that have potential impact on obstructive urination problems (OUP) after HDRB-B. In the follow-up study 88 patients consecutively treated with HDRB-B at the Institute of Oncology Ljubljana in the period 2006-2011 were included. The observed outcome was deterioration of OUP (DOUP) during the follow-up period longer than 1 year. Univariate and multivariate relationship analysis between DOUP and potential risk factors (treatment factors, patients' characteristics) was carried out by using binary logistic regression. ROC curve was constructed on predicted values and the area under the curve (AUC) calculated to assess the performance of the multivariate model. Analysis was carried out on 71 patients who completed 3 years of follow-up. DOUP was noted in 13/71 (18.3%) of them. The results of multivariate analysis showed statistically significant relationship between DOUP and anti-coagulation treatment (OR 4.86, 95% C.I. limits: 1.21-19.61, p = 0.026). Also minimal dose received by 90% of the urethra volume was close to statistical significance (OR = 1.23; 95% C.I. limits: 0.98-1.07, p = 0.099). The value of AUC was 0.755. The study emphasized the relationship between DOUP and anticoagulation treatment, and suggested the multivariate model with fair predictive performance. This model potentially enables a reduction of DOUP after HDRB-B. It supports the belief that further research should be focused on urethral sphincter as a critical structure for OUP.
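The area under the ROC curve reported above can be understood as the probability that a randomly chosen positive case receives a higher predicted value than a randomly chosen negative case. A minimal sketch under that definition follows. The scores and labels are invented, and `roc_auc` is a hypothetical helper name.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    labels: 1 for the outcome of interest, 0 otherwise.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical predicted probabilities from a logistic model.
print(roc_auc([0.9, 0.8, 0.3, 0.2], [1, 1, 0, 0]))
print(roc_auc([0.9, 0.2, 0.8, 0.3], [1, 1, 0, 0]))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the abstract's 0.755 is judged "fair".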
Eigenvalue and eigenvector sensitivity and approximate analysis for repeated eigenvalue problems
NASA Technical Reports Server (NTRS)
Hou, Gene J. W.; Kenny, Sean P.
1991-01-01
A set of computationally efficient equations for eigenvalue and eigenvector sensitivity analysis is derived, and a method for eigenvalue and eigenvector approximate analysis in the presence of repeated eigenvalues is presented. The method developed for approximate analysis involves a reparameterization of the multivariable structural eigenvalue problem in terms of a single positive-valued parameter. The resulting equations yield first-order approximations of changes in both the eigenvalues and eigenvectors associated with the repeated eigenvalue problem. Examples are given to demonstrate the application of such equations for sensitivity and approximate analysis.
Gerhardt, H Carl; Brooks, Robert
2009-10-01
Even simple biological signals vary in several measurable dimensions. Understanding their evolution requires, therefore, a multivariate understanding of selection, including how different properties interact to determine the effectiveness of the signal. We combined experimental manipulation with multivariate selection analysis to assess female mate choice on the simple trilled calls of male gray treefrogs. We independently and randomly varied five behaviorally relevant acoustic properties in 154 synthetic calls. We compared response times of each of 154 females to one of these calls with its response to a standard call that had mean values of the five properties. We found directional and quadratic selection on two properties indicative of the amount of signaling, pulse number, and call rate. Canonical rotation of the fitness surface showed that these properties, along with pulse rate, contributed heavily to a major axis of stabilizing selection, a result consistent with univariate studies showing diminishing effects of increasing pulse number well beyond the mean. Spectral properties contributed to a second major axis of stabilizing selection. The single major axis of disruptive selection suggested that a combination of two temporal and two spectral properties with values differing from the mean should be especially attractive.
Prat, Chantal; Besalú, Emili; Bañeras, Lluís; Anticó, Enriqueta
2011-06-15
The volatile fraction of aqueous cork macerates of tainted and non-tainted agglomerate cork stoppers was analysed by headspace solid-phase microextraction (HS-SPME)/gas chromatography. Twenty compounds containing terpenoids, aliphatic alcohols, lignin-related compounds and others were selected and analysed in individual corks. Cork stoppers were previously classified in six different classes according to sensory descriptions including, 2,4,6-trichloroanisole taint and other frequent, non-characteristic odours found in cork. A multivariate analysis of the chromatographic data of 20 selected chemical compounds using linear discriminant analysis models helped in the differentiation of the a priori made groups. The discriminant model selected five compounds as the best combination. Selected compounds appear in the model in the following order; 2,4,6 TCA, fenchyl alcohol, 1-octen-3-ol, benzyl alcohol and benzothiazole. Unfortunately, not all six a priori differentiated sensory classes were clearly discriminated in the model, probably indicating that no measurable differences exist in the chromatographic data for some categories. The predictive analyses of a refined model in which two sensory classes were fused together resulted in a good classification. Prediction rates of control (non-tainted), TCA, musty-earthy-vegetative, vegetative and chemical descriptions were 100%, 100%, 85%, 67.3% and 100%, respectively, when the modified model was used. The multivariate analysis of chromatographic data will help in the classification of stoppers and provide a perfect complement to sensory analyses. Copyright © 2010 Elsevier Ltd. All rights reserved.
Bohn, Justin; Eddings, Wesley; Schneeweiss, Sebastian
2017-03-15
Distributed networks of health-care data sources are increasingly being utilized to conduct pharmacoepidemiologic database studies. Such networks may contain data that are not physically pooled but instead are distributed horizontally (separate patients within each data source) or vertically (separate measures within each data source) in order to preserve patient privacy. While multivariable methods for the analysis of horizontally distributed data are frequently employed, few practical approaches have been put forth to deal with vertically distributed health-care databases. In this paper, we propose 2 propensity score-based approaches to vertically distributed data analysis and test their performance using 5 example studies. We found that these approaches produced point estimates close to what could be achieved without partitioning. We further found a performance benefit (i.e., lower mean squared error) for sequentially passing a propensity score through each data domain (called the "sequential approach") as compared with fitting separate domain-specific propensity scores (called the "parallel approach"). These results were validated in a small simulation study. This proof-of-concept study suggests a new multivariable analysis approach to vertically distributed health-care databases that is practical, preserves patient privacy, and warrants further investigation for use in clinical research applications that rely on health-care databases. © The Author 2017. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Bludau, Sebastian; Bzdok, Danilo; Gruber, Oliver; Kohn, Nils; Riedl, Valentin; Sorg, Christian; Palomero-Gallagher, Nicola; Müller, Veronika I.; Hoffstaedter, Felix; Amunts, Katrin; Eickhoff, Simon B.
2017-01-01
Objective The heterogeneous human frontal pole has been identified as a node in the dysfunctional network of major depressive disorder. The contribution of the medial (socio-affective) versus lateral (cognitive) frontal pole to major depression pathogenesis is currently unclear. The present study performs morphometric comparison of the microstructurally informed subdivisions of human frontal pole between depressed patients and controls using both uni- and multivariate statistics. Methods Multi-site voxel- and region-based morphometric MRI analysis of 73 depressed patients and 73 matched controls without psychiatric history. Frontal pole volume was first compared between depressed patients and controls by subdivision-wise classical morphometric analysis. In a second approach, frontal pole volume was compared by subdivision-naive multivariate searchlight analysis based on support vector machines. Results Subdivision-wise morphometric analysis found a significantly smaller medial frontal pole in depressed patients with a negative correlation of disease severity and duration. Histologically uninformed multivariate voxel-wise statistics provided converging evidence for structural aberrations specific to the microstructurally defined medial area of the frontal pole in depressed patients. Conclusions Across disparate methods, we demonstrated subregion specificity in the left medial frontal pole volume in depressed patients. Indeed, the frontal pole was shown to structurally and functionally connect to other key regions in major depression pathology like the anterior cingulate cortex and the amygdala via the uncinate fasciculus. Present and previous findings consolidate the left medial portion of the frontal pole as particularly altered in major depression. PMID:26621569
NASA Astrophysics Data System (ADS)
Anggraeni, Anni; Arianto, Fernando; Mutalib, Abdul; Pratomo, Uji; Bahti, Husein H.
2017-05-01
Rare earth elements (REE) have many applications, such as in metallurgy, optical devices, and the manufacture of electronic devices. REE occur in minerals, in which the individual elements have similar properties. Currently, REE content is determined with instruments such as ICP-OES, ICP-MS, XRF, and HPLC, but each of these instruments still has some weaknesses. An alternative analytical method for determining rare earth metal content is therefore needed; one option is a combination of UV-Visible spectrophotometry and multivariate analysis, including Principal Component Analysis (PCA), Principal Component Regression (PCR), and Partial Least Squares Regression (PLSR). The purpose of this experiment is to determine the content of light and medium rare earth elements in the mineral monazite without chemical separation by combining multivariate analysis with UV-Visible spectrophotometry. A training set of 22 concentration variations was prepared and its absorbance measured with a UV-Vis spectrophotometer; the data were then processed by PCA, PCR, and PLSR. The results were compared and validated to obtain the mathematical equation with the smallest percent error. After validation, the equation obtained with the PLS method performed better than that obtained with PCR, with RMSE values for La, Ce, Pr, Nd, Gd, Sm, Eu, and Tb of 0.095, 0.573, 0.538, 0.440, 3.387, 1.240, 1.870, and 0.639, respectively.
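The RMSE figures quoted above compare predicted and reference concentrations for each element. A minimal sketch of that metric is given below. The function name `rmse` and the sample values are assumptions and are unrelated to the study's data.

```python
import math

def rmse(predicted, reference):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared / len(predicted))

# Hypothetical validation-set concentrations for one element.
reference = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(round(rmse(predicted, reference), 3))
```

Because RMSE carries the units of the measured quantity, values for different elements are comparable only when their concentration ranges are similar.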
Anthropometric profile of combat athletes via multivariate analysis.
Burdukiewicz, Anna; Pietraszewska, Jadwiga; Stachoń, Aleksandra; Andrzejewska, Justyna
2017-11-07
Athletic success is a complex phenotype influenced by multiple factors, from sport-specific skills to anthropometric characteristics. Considering the latter, the literature has repeatedly indicated that athletes possess distinct physical characteristics depending on the practiced discipline. The aim of the present study was to apply univariate and multivariate methods to assess a wide range of morphometric and somatotypic characteristics in male combat athletes. Biometric data were obtained from 206 male university-level practitioners of judo, jiu-jitsu, karate, kickboxing, taekwondo, and wrestling. Measures included height- and length-based variables, breadths, circumferences, and skinfolds. Body proportions and somatotype, using Sheldon's method of somatotyping as modified by Heath and Carter, were then determined. Body fat percentage was assessed by bioelectrical impedance analysis using tetrapolar hand-to-foot electrodes. Data were subjected to a wide array of statistical analyses. The results show between-group differences in the magnitudes of the analyzed characteristics. While mesomorphy was the dominant component of each group somatotype, enhanced ectomorphy was observed in those disciplines that require a high level of agility. Principal component analysis reduced the multivariate dimensionality of the data to three components (characterizing body size, height-based measures, and the anthropometric structure of the upper extremities) that explained the majority of data variance. The development of a sport-specific anthropometric profile via height- and mass-based and morphometric and somatotypic variables can aid in the design of training protocols and the identification of athlete markers as well as serve as a diagnostic criterion in predicting combat athlete performance.
Kaul, Goldi; Huang, Jun; Chatlapalli, Ramarao; Ghosh, Krishnendu; Nagi, Arwinder
2011-12-01
The role of poloxamer 188, water and binder addition rate, on retarding dissolution in immediate-release tablets of a model drug from BCS class II was investigated by means of multivariate data analysis (MVDA) combined with design of experiments (DOE). While the DOE analysis yielded important clues into the cause-and-effect relationship between the responses and design factors, multivariate data analysis of the 40+ variables provided additional information on slowdown in tablet dissolution. A steep dependence of both tablet dissolution and disintegration on the poloxamer and less so on other design variables was observed. Poloxamer was found to increase dissolution rates in granules as expected of surfactants in general but retard dissolution in tablets. The unexpected effect of poloxamer in tablets was accompanied by an increase in tablet-disintegration-time-mediated slowdown of tablet dissolution and by a surrogate binding effect of poloxamer at higher concentrations. It was additionally realized through MVDA that poloxamer in tablets either acts as a binder by itself or promotes binder action of the binder povidone resulting in increased intragranular cohesion. Additionally, poloxamer was found to mediate tablet dissolution on stability as well. In contrast to tablet dissolution at release (time zero), poloxamer appeared to increase tablet dissolution in a concentration-dependent manner on accelerated open-dish stability. Substituting polysorbate 80 as an alternate surfactant in place of poloxamer in the formulation was found to stabilize tablet dissolution.
NASA Astrophysics Data System (ADS)
Kwiatkowski, Mirosław
2015-09-01
The paper presents the results of the research on the application of the LBET class adsorption models with the fast multivariant identification procedure as a tool for analysing the microporous structure of the active carbons obtained by chemical activation using potassium and sodium hydroxides as an activator. The proposed technique of the fast multivariant fitting of the LBET class models to the empirical adsorption data was employed particularly to evaluate the impact of the used activator and the impregnation ratio on the obtained microporous structure of the carbonaceous adsorbents.
Many multivariate methods are used in describing and predicting relation; each has its unique usage of categorical and non-categorical data. In multivariate analysis of variance (MANOVA), many response variables (y's) are related to many independent variables that are categorical...
Multivariate Density Estimation and Remote Sensing
NASA Technical Reports Server (NTRS)
Scott, D. W.
1983-01-01
Current efforts to develop methods and computer algorithms to effectively represent multivariate data commonly encountered in remote sensing applications are described. While this may involve scatter diagrams, multivariate representations of nonparametric probability density estimates are emphasized. The density function provides a useful graphical tool for looking at data and a useful theoretical tool for classification. This approach is called a thunderstorm data analysis.
Zhou, Yan; Wang, Pei; Wang, Xianlong; Zhu, Ji; Song, Peter X-K
2017-01-01
The multivariate regression model is a useful tool to explore complex associations between two kinds of molecular markers, which enables the understanding of the biological pathways underlying disease etiology. For a set of correlated response variables, accounting for such dependency can increase statistical power. Motivated by integrative genomic data analyses, we propose a new methodology, the sparse multivariate factor analysis regression model (smFARM), in which correlations of response variables are assumed to follow a factor analysis model with latent factors. This proposed method not only allows us to address the challenge that the number of association parameters is larger than the sample size, but also to adjust for unobserved genetic and/or nongenetic factors that potentially conceal the underlying response-predictor associations. The proposed smFARM is implemented by the EM algorithm and the blockwise coordinate descent algorithm. The proposed methodology is evaluated and compared to the existing methods through extensive simulation studies. Our results show that accounting for latent factors through the proposed smFARM can improve sensitivity of signal detection and accuracy of sparse association map estimation. We illustrate smFARM by two integrative genomics analysis examples, a breast cancer dataset, and an ovarian cancer dataset, to assess the relationship between DNA copy numbers and gene expression arrays to understand genetic regulatory patterns relevant to the disease. We identify two trans-hub regions: one in cytoband 17q12 whose amplification influences the RNA expression levels of important breast cancer genes, and the other in cytoband 9q21.32-33, which is associated with chemoresistance in ovarian cancer. © 2016 WILEY PERIODICALS, INC.
van Mierlo, Pieter; Lie, Octavian; Staljanssens, Willeke; Coito, Ana; Vulliémoz, Serge
2018-04-26
We investigated the influence of processing steps in the estimation of multivariate directed functional connectivity during seizures recorded with intracranial EEG (iEEG) on seizure-onset zone (SOZ) localization. We studied the effect of (i) the number of nodes, (ii) time-series normalization, (iii) the choice of multivariate time-varying connectivity measure: Adaptive Directed Transfer Function (ADTF) or Adaptive Partial Directed Coherence (APDC) and (iv) graph theory measure: outdegree or shortest path length. First, simulations were performed to quantify the influence of the various processing steps on the accuracy of SOZ localization. Afterwards, the SOZ was estimated from a 113-electrode iEEG seizure recording and compared with the resection that rendered the patient seizure-free. The simulations revealed that ADTF is preferred over APDC to localize the SOZ from ictal iEEG recordings. Normalizing the time series before analysis resulted in an increase of 25-35% of correctly localized SOZ, while adding more nodes to the connectivity analysis led to a moderate decrease of 10%, when comparing 128 with 32 input nodes. The real-seizure connectivity estimates localized the SOZ inside the resection area using the ADTF coupled to outdegree or shortest path length. Our study showed that normalizing the time series is an important pre-processing step, while adding nodes to the analysis only marginally affected the SOZ localization. The study shows that directed multivariate Granger-based connectivity analysis is feasible with many input nodes (> 100) and that normalization of the time series before connectivity analysis is preferred.
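The two simplest steps named in this abstract, time-series normalization and the outdegree graph measure, can be illustrated with a minimal sketch. It does not implement ADTF or APDC; the connectivity matrix below is invented, and the convention that `conn[src][dst]` holds the flow from node `src` to node `dst` is an assumption made only for this example.

```python
from statistics import mean, pstdev

def zscore(series):
    """Normalize one time series to zero mean and unit variance."""
    mu, sigma = mean(series), pstdev(series)
    return [(x - mu) / sigma for x in series]

def outdegree(conn):
    """Sum of outgoing connection strengths for every node.

    Assumed convention: conn[src][dst] is the directed connectivity
    from node src to node dst; self-connections are ignored.
    """
    n = len(conn)
    return [sum(conn[src][dst] for dst in range(n) if dst != src)
            for src in range(n)]

def strongest_driver(conn):
    """Index of the node with the largest outdegree."""
    deg = outdegree(conn)
    return max(range(len(deg)), key=deg.__getitem__)

# Hypothetical 3-node connectivity matrix (not data from the study).
conn = [
    [0.0, 0.1, 0.2],
    [0.6, 0.0, 0.7],
    [0.1, 0.2, 0.0],
]
driver = strongest_driver(conn)  # node 1 has the largest summed outflow
```

In a setting like the one described, the node returned by `strongest_driver` would be the candidate to compare against the resected area; the matrix itself would come from a connectivity estimator, which is outside the scope of this sketch.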
Ringham, Brandy M; Kreidler, Sarah M; Muller, Keith E; Glueck, Deborah H
2016-07-30
Multilevel and longitudinal studies are frequently subject to missing data. For example, biomarker studies for oral cancer may involve multiple assays for each participant. Assays may fail, resulting in missing data values that can be assumed to be missing completely at random. Catellier and Muller proposed a data analytic technique to account for data missing at random in multilevel and longitudinal studies. They suggested modifying the degrees of freedom for both the Hotelling-Lawley trace F statistic and its null case reference distribution. We propose parallel adjustments to approximate power for this multivariate test in studies with missing data. The power approximations use a modified non-central F statistic, which is a function of (i) the expected number of complete cases, (ii) the expected number of non-missing pairs of responses, or (iii) the trimmed sample size, which is the planned sample size reduced by the anticipated proportion of missing data. The accuracy of the method is assessed by comparing the theoretical results to the Monte Carlo simulated power for the Catellier and Muller multivariate test. Over all experimental conditions, the closest approximation to the empirical power of the Catellier and Muller multivariate test is obtained by adjusting power calculations with the expected number of complete cases. The utility of the method is demonstrated with a multivariate power analysis for a hypothetical oral cancer biomarkers study. We describe how to implement the method using standard, commercially available software products and give example code. Copyright © 2015 John Wiley & Sons, Ltd.
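Two of the sample-size quantities listed in this abstract can be worked out with simple arithmetic. The sketch below assumes that every response is missing independently with the same probability, which is a simplification introduced here and not necessarily the authors' exact definition; the function names and numbers are hypothetical.

```python
def trimmed_sample_size(planned_n, p_missing):
    """Planned sample size reduced by the anticipated proportion missing."""
    return planned_n * (1.0 - p_missing)

def expected_complete_cases(planned_n, p_missing, n_responses):
    """Expected number of participants with no missing response.

    Assumption: each of the n_responses measurements is missing
    independently with probability p_missing (missing completely at random).
    """
    return planned_n * (1.0 - p_missing) ** n_responses

# Hypothetical planning values: 100 participants, 3 assays each,
# 10% of assays expected to fail.
n_trimmed = trimmed_sample_size(100, 0.10)
n_complete = expected_complete_cases(100, 0.10, 3)
```

Under these assumptions the expected number of complete cases is smaller than the trimmed sample size whenever more than one response is measured, so the two adjustments can lead to noticeably different power estimates.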
Study on rapid valid acidity evaluation of apple by fiber optic diffuse reflectance technique
NASA Astrophysics Data System (ADS)
Liu, Yande; Ying, Yibin; Fu, Xiaping; Jiang, Xuesong
2004-03-01
Some issues related to nondestructive evaluation of valid acidity in intact apples by means of the Fourier transform near-infrared (FTNIR) (800-2631 nm) method were addressed. A relationship was established between the diffuse reflectance spectra recorded with a bifurcated optic fiber and the valid acidity. The data were analyzed by multivariate calibration techniques such as partial least squares (PLS) analysis and principal component regression (PCR). A total of 120 Fuji apples were tested and 80 of them were used to form a calibration data set. The influence of data preprocessing and different spectra treatments was also investigated. Models based on smoothed spectra were slightly worse than models based on derivative spectra, and the best result was obtained when the segment length was 5 and the gap size was 10. Depending on data preprocessing and multivariate calibration technique, the best prediction model, obtained by PLS analysis, had a correlation coefficient of 0.871, a low RMSEP (0.0677), a low RMSEC (0.056) and a small difference between RMSEP and RMSEC. The results point to the feasibility of FTNIR spectral analysis to predict the fruit valid acidity non-destructively. The ratio of data standard deviation to the root mean square error of prediction (SDR) is better to be less than 3 in calibration models; however, the results cannot yet meet the demands of actual application. Therefore, further study is required for better calibration and prediction.
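The figures of merit quoted in this abstract (RMSEC, RMSEP and SDR) share one formula applied to different sample sets. A minimal sketch, assuming the usual definitions (root mean square error between reference and predicted values, and SDR as the sample standard deviation of the reference values divided by that error); the acidity values are invented for illustration.

```python
from math import sqrt
from statistics import stdev

def rmse(reference, predicted):
    """Root mean square error; RMSEC on calibration samples, RMSEP on prediction samples."""
    return sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted))
                / len(reference))

def sdr(reference, predicted):
    """Ratio of the standard deviation of the reference data to the RMSE."""
    return stdev(reference) / rmse(reference, predicted)

# Hypothetical reference and predicted acidity values (not from the study).
reference = [3.0, 3.2, 3.4, 3.6]
predicted = [3.1, 3.1, 3.5, 3.5]
error = rmse(reference, predicted)
ratio = sdr(reference, predicted)
```

Computing `rmse` separately on the calibration and prediction sets gives the RMSEC and RMSEP pair, whose difference the abstract uses as a check on model stability.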
Effect of Contact Damage on the Strength of Ceramic Materials.
1982-10-01
variables that are important to erosion, and a multivariate linear regression analysis is used to fit the data to the dimensional analysis. The...of Equations 7 and 8 by a multivariable regression analysis (room temperature data)...
Detecting spatial regimes in ecosystems | Science Inventory ...
Research on early warning indicators has generally focused on assessing temporal transitions with limited application of these methods to detecting spatial regimes. Traditional spatial boundary detection procedures that result in ecoregion maps are typically based on ecological potential (i.e. potential vegetation), and often fail to account for ongoing changes due to stressors such as land use change and climate change and their effects on plant and animal communities. We use Fisher information, an information theory based method, on both terrestrial and aquatic animal data (US Breeding Bird Survey and marine zooplankton) to identify ecological boundaries, and compare our results to traditional early warning indicators, conventional ecoregion maps, and multivariate analysis such as nMDS (non-metric Multidimensional Scaling) and cluster analysis. We successfully detect spatial regimes and transitions in both terrestrial and aquatic systems using Fisher information. Furthermore, Fisher information provided explicit spatial information about community change that is absent from other multivariate approaches. Our results suggest that defining spatial regimes based on animal communities may better reflect ecological reality than do traditional ecoregion maps, especially in our current era of rapid and unpredictable ecological change. Use an information theory based method to identify ecological boundaries and compare our results to traditional early warning
NASA Astrophysics Data System (ADS)
Valder, J.; Kenner, S.; Long, A.
2008-12-01
Portions of the Cheyenne River are characterized as impaired by the U.S. Environmental Protection Agency because of water-quality exceedances. The Cheyenne River watershed includes the Black Hills National Forest and part of the Badlands National Park. Preliminary analysis indicates that the Badlands National Park is a major contributor to the exceedances of the water-quality constituents for total dissolved solids and total suspended solids. Water-quality data have been collected continuously since 2007, and in the second year of collection (2008), monthly grab and passive sediment samplers are being used to collect total suspended sediment and total dissolved solids in both base-flow and runoff-event conditions. In addition, sediment samples from the river channel, including bed, bank, and floodplain, have been collected. These samples are being analyzed at the South Dakota School of Mines and Technology's X-Ray Diffraction Lab to quantify the mineralogy of the sediments. A multivariate statistical approach (including principal components, least squares, and maximum likelihood techniques) is applied to the mineral percentages that were characterized for each site to identify the contributing source areas that are causing exceedances of sediment transport in the Cheyenne River watershed. Results of the multivariate analysis demonstrate the likely sources of solids found in the Cheyenne River samples. A further refinement of the methods is in progress that utilizes a conceptual model which, when applied with the multivariate statistical approach, provides a better estimate for sediment sources.
Ríos-Reina, Rocío; Morales, M Lourdes; García-González, Diego L; Amigo, José M; Callejón, Raquel M
2018-03-01
High-quality wine vinegars have been registered in Spain under protected designation of origin (PDO): "Vinagre de Jerez", "Vinagre de Condado de Huelva" and "Vinagre de Montilla-Moriles". The raw material, production and aging processes determine their quality and their aromatic composition. The vinegar volatile profile is usually analyzed by gas chromatography-mass spectrometry (GC-MS), which requires a prior extraction step. Thus, three different sampling methods (headspace solid phase microextraction "HS-SPME", headspace stir bar sorptive extraction "HSSE" and dynamic headspace extraction "DHS") were studied for the analysis of the volatile composition of Spanish PDO wine vinegars. Multivariate curve resolution (MCR) was used to solve chromatographic problems, improving the results obtained. Principal component analysis (PCA) showed that not all the sampling methods were equally suitable for the characterization and differentiation between PDOs and categories, with HSSE being the technique that enabled the best vinegar characterization. Copyright © 2017 Elsevier Ltd. All rights reserved.
Zhang, Xiuxiu; Li, Yubo; Zhou, Huifang; Fan, Simiao; Zhang, Zhenzhu; Wang, Lei; Zhang, Yanjun
2014-08-01
Acyclovir (ACV) is an antiviral agent. However, its use is limited by adverse side effects, particularly nephrotoxicity. Metabonomics technology can provide essential information on the metabolic profiles of biofluids and organs upon drug administration. Therefore, in this study, mass spectrometry-based metabonomics coupled with multivariate data analysis was used to identify the plasma metabolites and metabolic pathways related to nephrotoxicity caused by intraperitoneal injection of low (50 mg/kg) and high (100 mg/kg) doses of acyclovir. Sixteen biomarkers were identified by metabonomics, and nephrotoxicity results revealed the dose-dependent effect of acyclovir on kidney tissues. The present study showed that the top four metabolic pathways interrupted by acyclovir included the metabolisms of arachidonic acid, tryptophan, arginine and proline, and glycerophospholipid. This research shows that the established metabonomic approach can provide information on changes in metabolites and metabolic pathways, which can be applied to in-depth research on the mechanism of acyclovir-induced kidney injury. Copyright © 2014 Elsevier B.V. All rights reserved.
Time-varying nonstationary multivariate risk analysis using a dynamic Bayesian copula
NASA Astrophysics Data System (ADS)
Sarhadi, Ali; Burn, Donald H.; Concepción Ausín, María.; Wiper, Michael P.
2016-03-01
A time-varying risk analysis is proposed for an adaptive design framework in nonstationary conditions arising from climate change. A Bayesian, dynamic conditional copula is developed for modeling the time-varying dependence structure between mixed continuous and discrete multiattributes of multidimensional hydrometeorological phenomena. Joint Bayesian inference is carried out to fit the marginals and copula in an illustrative example using an adaptive, Gibbs Markov Chain Monte Carlo (MCMC) sampler. Posterior mean estimates and credible intervals are provided for the model parameters and the Deviance Information Criterion (DIC) is used to select the model that best captures different forms of nonstationarity over time. This study also introduces a fully Bayesian, time-varying joint return period for multivariate time-dependent risk analysis in nonstationary environments. The results demonstrate that the nature and the risk of extreme-climate multidimensional processes are changed over time under the impact of climate change, and accordingly the long-term decision making strategies should be updated based on the anomalies of the nonstationary environment.
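The notion of a joint return period can be illustrated in a much simpler, stationary setting than the one this abstract develops. The sketch below is not the Bayesian, time-varying model of the study: it assumes a fixed Gumbel-Hougaard copula chosen here purely for illustration, and it uses the standard "OR" and "AND" joint return period formulas with hypothetical parameter values.

```python
from math import exp, log

def gumbel_copula(u, v, theta):
    """Gumbel-Hougaard copula C(u, v); theta >= 1, theta = 1 is independence."""
    return exp(-(((-log(u)) ** theta + (-log(v)) ** theta) ** (1.0 / theta)))

def joint_return_periods(u, v, theta, mu=1.0):
    """Return (T_or, T_and) for marginal non-exceedance probabilities u and v.

    mu is the mean time between events (for example 1 year for annual maxima).
    T_or:  either variable exceeds its threshold.
    T_and: both variables exceed their thresholds.
    """
    c = gumbel_copula(u, v, theta)
    t_or = mu / (1.0 - c)
    t_and = mu / (1.0 - u - v + c)
    return t_or, t_and

# Two 10-year marginal events (u = v = 0.9), hypothetical dependence values.
independent = joint_return_periods(0.9, 0.9, theta=1.0)
dependent = joint_return_periods(0.9, 0.9, theta=2.0)
```

With stronger dependence (larger `theta`) the "AND" return period shortens, because joint exceedances become more likely. In a nonstationary analysis such as the one described, `u`, `v` and the dependence parameter would themselves change over time, so the return periods would need to be recomputed along the time axis.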
Detection of Leukemia with Blood Samples Using Raman Spectroscopy and Multivariate Analysis
NASA Astrophysics Data System (ADS)
Martínez-Espinosa, J. C.; González-Solís, J. L.; Frausto-Reyes, C.; Miranda-Beltrán, M. L.; Soria-Fregoso, C.; Medina-Valtierra, J.
2009-06-01
The use of Raman spectroscopy to analyze blood biochemistry and hence distinguish between normal and abnormal blood was investigated. Blood samples were obtained from 6 patients who were clinically diagnosed with leukemia and 6 healthy volunteers. The imprint was put under the microscope and several points were chosen for Raman measurement. All the spectra were collected with a confocal Raman micro-spectroscopy system (Renishaw) with a NIR 830 nm laser. It is shown that the serum samples from patients with leukemia and from the control group can be discriminated when the multivariate statistical methods of principal component analysis (PCA) and linear discriminant analysis (LDA) are applied to their Raman spectra. The ratios of some band intensities were analyzed; some band ratios were significant and corresponded to proteins, phospholipids, and polysaccharides. The preliminary results suggest that Raman spectroscopy could be a new technique to study the degree of damage to the bone marrow using just blood samples instead of biopsies, a procedure that is very painful for patients.
[Prevalence and determinants of exclusive breastfeeding in the city of Serrana, São Paulo, Brazil].
Queluz, Mariângela Carletti; Pereira, Maria José Bistafa; dos Santos, Claudia Benedita; Leite, Adriana Moraes; Ricco, Rubens Garcia
2012-06-01
The objective of this cross-sectional and quantitative study was to identify the prevalence and determinants of exclusive breastfeeding among infants less than six months of age in the city of Serrana, Sao Paulo, Brazil in 2009. A validated semi-structured questionnaire was administered to the guardians of the children less than six months of age who attended the second phase of a Brazilian vaccination campaign against polio. Univariate and multivariate analyses, reported as odds ratios with confidence intervals, were performed. Of the total of 275 infant participants, only 29.8% were exclusively breastfed. Univariate analysis revealed that mothers who work outside the home without maternity leave, mothers who did not work outside the home, adolescent mothers, and the use of pacifiers have a greater chance of interrupting exclusive breastfeeding. In the multivariate analysis, mothers who work outside the home without maternity leave are three times more likely to wean their children early. Results provide suggestions for the redirection and planning of interventions targeting breastfeeding.
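For readers unfamiliar with how a univariate odds ratio and its confidence interval are obtained from a 2x2 table, a minimal sketch follows. It uses the common log-based (Woolf) interval, which is an assumption about the method, and the counts are invented; they are not data from the study. The adjusted odds ratios of a multivariate analysis would instead come from a logistic regression model.

```python
from math import exp, log, sqrt

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Crude odds ratio with a Woolf (log-based) confidence interval.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    z = 1.96 gives an approximate 95% interval.
    """
    odds_ratio = (a * d) / (b * c)
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log)
    upper = exp(log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

# Hypothetical counts: exposure = pacifier use,
# outcome = exclusive breastfeeding interrupted.
or_value, ci_low, ci_high = odds_ratio_ci(30, 10, 20, 20)
```

An interval that excludes 1 is conventionally read as a statistically significant association at the chosen level.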
Handwriting Examination: Moving from Art to Science
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jarman, K.H.; Hanlen, R.C.; Manzolillo, P.A.
In this document, we present a method for validating the premises and methodology of forensic handwriting examination. This method is intuitively appealing because it relies on quantitative measurements currently used qualitatively by FDE's in making comparisons, and it is scientifically rigorous because it exploits the power of multivariate statistical analysis. This approach uses measures of both central tendency and variation to construct a profile for a given individual. (Central tendency and variation are important for characterizing an individual's writing and both are currently used by FDE's in comparative analyses). Once constructed, different profiles are then compared for individuality using cluster analysis; they are grouped so that profiles within a group cannot be differentiated from one another based on the measured characteristics, whereas profiles between groups can. The cluster analysis procedure used here exploits the power of multivariate hypothesis testing. The result is not only a profile grouping but also an indication of statistical significance of the groups generated.
Study of archaeological coins of different dynasties using libs coupled with multivariate analysis
NASA Astrophysics Data System (ADS)
Awasthi, Shikha; Kumar, Rohit; Rai, G. K.; Rai, A. K.
2016-04-01
Laser-induced breakdown spectroscopy (LIBS) is an atomic emission spectroscopic technique with the unique capability of serving as an in-situ monitoring tool for the detection and quantification of elements present in different artifacts. Archaeological coins collected from the G.R. Sharma Memorial Museum, University of Allahabad, India, have been analyzed using the LIBS technique. These coins were obtained from the excavation of Kausambi, Uttar Pradesh, India. A LIBS system assembled in the laboratory (Nd:YAG laser, 532 nm, 4 ns pulse width FWHM, with an Ocean Optics LIBS 2000+ spectrometer) was employed for spectral acquisition. The spectral lines of Ag, Cu, Ca, Sn, Si, Fe and Mg are identified in the LIBS spectra of different coins. LIBS combined with multivariate analysis plays an effective role in classifying the coins and in identifying the contribution of spectral lines in different coins. The discrimination between five coins of archaeological interest has been carried out using principal component analysis (PCA). The results show the potential relevance of the methodology used in the elemental identification and classification of artifacts with high accuracy and robustness.
SPICE: exploration and analysis of post-cytometric complex multivariate datasets.
Roederer, Mario; Nozzi, Joshua L; Nason, Martha C
2011-02-01
Polychromatic flow cytometry results in complex, multivariate datasets. To date, tools for the aggregate analysis of these datasets across multiple specimens grouped by different categorical variables, such as demographic information, have not been optimized. Often, the exploration of such datasets is accomplished by visualization of patterns with pie charts or bar charts, without easy access to statistical comparisons of measurements that comprise multiple components. Here we report on algorithms and a graphical interface we developed for these purposes. In particular, we discuss thresholding necessary for accurate representation of data in pie charts, the implications for display and comparison of normalized versus unnormalized data, and the effects of averaging when samples with significant background noise are present. Finally, we define a statistic for the nonparametric comparison of complex distributions to test for difference between groups of samples based on multi-component measurements. While originally developed to support the analysis of T cell functional profiles, these techniques are amenable to a broad range of datatypes. Published 2011 Wiley-Liss, Inc.
Bena, Antonella; Giraudo, Massimiliano
2013-01-01
The objective was to study the relationship between job tenure and injury risk, controlling for individual factors and company characteristics. Incidence and injury risk were analyzed by job tenure, controlling for gender, age, nationality, economic activity, and firm size. The setting was a 7% sample of Italian workers registered in the INPS (National Institute of Social Insurance) database; participants were private-sector employees who worked as blue-collar workers or apprentices. The outcomes were first-time occupational injuries, all occupational injuries, and serious occupational injuries. Our findings show an increase in injury risk among those who start a new job and an inverse relationship between job tenure and injury risk. Multivariate analysis confirms these results. Recommendations for improving this situation include the adoption of organizational models that provide periods of mentoring from colleagues already in the company and assignment to simple, low-hazard tasks. The economic crisis may exacerbate this problem: it is important for Italy to improve the systems for monitoring the relationship between temporary employment and health.
Bourne, Roger; Himmelreich, Uwe; Sharma, Ansuiya; Mountford, Carolyn; Sorrell, Tania
2001-01-01
A new fingerprinting technique with the potential for rapid identification of bacteria was developed by combining proton magnetic resonance spectroscopy (1H MRS) with multivariate statistical analysis. This resulted in an objective identification strategy for common clinical isolates belonging to the bacterial species Staphylococcus aureus, Staphylococcus epidermidis, Enterococcus faecalis, Streptococcus pneumoniae, Streptococcus pyogenes, Streptococcus agalactiae, and the Streptococcus milleri group. Duplicate cultures of 104 different isolates were examined one or more times using 1H MRS. A total of 312 cultures were examined. An optimized classifier was developed using a bootstrapping process and a seven-group linear discriminant analysis to provide objective classification of the spectra. Identification of isolates was based on consistent high-probability classification of spectra from duplicate cultures and achieved 92% agreement with conventional methods of identification. Fewer than 1% of isolates were identified incorrectly. Identification of the remaining 7% of isolates was defined as indeterminate. PMID:11474013
Wang, Jinjia; Zhang, Yanna
2015-02-01
Brain-computer interface (BCI) systems identify brain signals by extracting features from them. In view of the limitations of the autoregressive-model feature extraction method and of traditional principal component analysis in dealing with multichannel signals, this paper presents a multichannel feature extraction method that combines the multivariate autoregressive (MVAR) model with multilinear principal component analysis (MPCA) and applies it to the recognition of magnetoencephalography (MEG) and electroencephalography (EEG) signals. First, we calculated the MVAR model coefficient matrix of the MEG/EEG signals, and then reduced its dimensionality using MPCA. Finally, we recognized the brain signals with a Bayes classifier. The key innovation of our investigation is the extension of the traditional single-channel feature extraction method to the multichannel case. We then carried out experiments using the data sets IV-III and IV-I. The experimental results showed that the method proposed in this paper is feasible.
A Course in... Multivariable Control Methods.
ERIC Educational Resources Information Center
Deshpande, Pradeep B.
1988-01-01
Describes an engineering course for graduate study in process control. Lists four major topics: interaction analysis, multiloop controller design, decoupling, and multivariable control strategies. Suggests a course outline and gives information about each topic. (MVL)
Metabolic changes in different developmental stages of Vanilla planifolia pods.
Palama, Tony Lionel; Khatib, Alfi; Choi, Young Hae; Payet, Bertrand; Fock, Isabelle; Verpoorte, Robert; Kodja, Hippolyte
2009-09-09
The metabolomic analysis of developing Vanilla planifolia green pods (between 3 and 8 months after pollination) was carried out by nuclear magnetic resonance (NMR) spectroscopy and multivariate data analysis. Multivariate data analysis of the (1)H NMR spectra, such as principal component analysis (PCA) and partial least-squares-discriminant analysis (PLS-DA), showed a trend of separation of those samples based on the metabolites present in the methanol/water (1:1) extract. Older pods had a higher content of glucovanillin, vanillin, p-hydroxybenzaldehyde glucoside, p-hydroxybenzaldehyde, and sucrose, while younger pods had more bis[4-(beta-D-glucopyranosyloxy)-benzyl]-2-isopropyltartrate (glucoside A), bis[4-(beta-D-glucopyranosyloxy)-benzyl]-2-(2-butyl)tartrate (glucoside B), glucose, malic acid, and homocitric acid. A liquid chromatography-mass spectrometry (LC-MS) analysis targeted at phenolic compound content was also performed on the developing pods and confirmed the NMR results. Ratios of aglycones/glucosides were estimated and thus allowed for detection of more minor metabolites in the green vanilla pods. Quantification of compounds based on both LC-MS and NMR analyses showed that free vanillin can reach 24% of the total vanillin content after 8 months of development in the vanilla green pods.
Zhang, Yong; Zhong, Miner; Geng, Nana; Jiang, Yunjian
2017-01-01
The market demand for electric vehicles (EVs) has increased in recent years. Suitable models are necessary to understand and forecast EV sales. This study presents a singular spectrum analysis (SSA) as a univariate time-series model and vector autoregressive model (VAR) as a multivariate model. Empirical results suggest that SSA satisfactorily indicates the evolving trend and provides reasonable results. The VAR model, which comprised exogenous parameters related to the market on a monthly basis, can significantly improve the prediction accuracy. The EV sales in China, which are categorized into battery and plug-in EVs, are predicted in both short term (up to December 2017) and long term (up to 2020), as statistical proofs of the growth of the Chinese EV industry. PMID:28459872
Macpherson, Ignacio; Roqué-Sánchez, María V; Legget Bn, Finola O; Fuertes, Ferran; Segarra, Ignacio
2016-10-01
Personalised support provided to women by health professionals is one of the prime factors in attaining women's satisfaction during pregnancy and childbirth. However, the multifactorial nature of 'satisfaction' makes it difficult to assess. Statistical multivariate analysis may be an effective technique to obtain in-depth quantitative evidence of the importance of this factor and its interaction with the other factors involved. This technique allows us to estimate the importance of overall satisfaction in its context and suggest actions for healthcare services. A systematic review was conducted of studies that quantitatively measure the personal relationship between women and healthcare professionals (gynecologists, obstetricians, nurses, midwives, etc.) regarding maternity care satisfaction. The literature search focused on studies carried out between 1970 and 2014 that used multivariate analyses and included the woman-caregiver relationship as a factor of their analysis. Twenty-four studies which applied various multivariate analysis tools to different periods of maternity care (antenatal, perinatal, post partum) were selected. The studies included discrete scale scores and questionnaires from women with low-risk pregnancies. The "personal relationship" factor appeared under various names: care received, personalised treatment, professional support, amongst others. The most common multivariate techniques used to assess the percentage of variance explained and the odds ratio of each factor were principal component analysis and logistic regression. The data, variables and factor analysis suggest that continuous, personalised care provided by the usual midwife and delivered within a family or a specialised setting generates the highest level of satisfaction. In addition, these factors foster the woman's psychological and physiological recovery, often surpassing clinical action (e.g. medicalization and hospital organization) and/or physiological determinants (e.g. pain, pathologies, etc.). Copyright © 2016 Elsevier Ltd. All rights reserved.
Dimopoulos, M A; Kastritis, E; Michalis, E; Tsatalas, C; Michael, M; Pouli, A; Kartasis, Z; Delimpasi, S; Gika, D; Zomas, A; Roussou, M; Konstantopoulos, K; Parcharidou, A; Zervas, K; Terpos, E
2012-03-01
The International Staging System (ISS) is the most widely used staging system for patients with multiple myeloma (MM). However, serum β2-microglobulin increases in renal impairment (RI) and there have been concerns that ISS-3 stage may include 'up-staged' MM patients in whom elevated β2-microglobulin reflects the degree of renal dysfunction rather than tumor load. In order to assess the impact of RI on the prognostic value of ISS, we analyzed 1516 patients with symptomatic MM and the degree of RI was classified according to the Kidney Disease Outcomes Quality Initiative-Chronic Kidney Disease (CKD) criteria. Forty-eight percent of patients had stages 3-5 CKD while 29% of patients had ISS-1, 38% had ISS-2 and 33% ISS-3. RI was more frequent and more severe in ISS-3 patients. RI was associated with inferior survival in univariate but not in multivariate analysis. When analyzed separately, ISS-1 and ISS-2 patients with RI had inferior survival in univariate but not in multivariate analysis. In ISS-3 MM patients, RI had no prognostic impact either in univariate or multivariate analysis. Results were similar when we analyzed only patients with Bence-Jones >200 mg/day. ISS remains unaffected by the degree of RI, even in patients with ISS-3, which includes most patients with renal dysfunction.
Lee, Yune-Sang; Turkeltaub, Peter; Granger, Richard; Raizada, Rajeev D S
2012-03-14
Although much effort has been directed toward understanding the neural basis of speech processing, the neural processes involved in the categorical perception of speech have been relatively less studied, and many questions remain open. In this functional magnetic resonance imaging (fMRI) study, we probed the cortical regions mediating categorical speech perception using an advanced brain-mapping technique, whole-brain multivariate pattern-based analysis (MVPA). Normal healthy human subjects (native English speakers) were scanned while they listened to 10 consonant-vowel syllables along the /ba/-/da/ continuum. Outside of the scanner, individuals' own category boundaries were measured to divide the fMRI data into /ba/ and /da/ conditions per subject. The whole-brain MVPA revealed that Broca's area and the left pre-supplementary motor area evoked distinct neural activity patterns between the two perceptual categories (/ba/ vs /da/). Broca's area was also found when the same analysis was applied to another dataset (Raizada and Poldrack, 2007), which previously yielded the supramarginal gyrus using a univariate adaptation-fMRI paradigm. The consistent MVPA findings from two independent datasets strongly indicate that Broca's area participates in categorical speech perception, with a possible role of translating speech signals into articulatory codes. The difference in results between univariate and multivariate pattern-based analyses of the same data suggests that processes in different cortical areas along the dorsal speech perception stream are distributed on different spatial scales.
Giacomino, Agnese; Abollino, Ornella; Malandrino, Mery; Mentasti, Edoardo
2011-03-04
Single and sequential extraction procedures are used for studying element mobility and availability in solid matrices, like soils, sediments, sludge, and airborne particulate matter. In the first part of this review we reported an overview on these procedures and described the applications of chemometric uni- and bivariate techniques and of multivariate pattern recognition techniques based on variable reduction to the experimental results obtained. The second part of the review deals with the use of chemometrics not only for the visualization and interpretation of data, but also for the investigation of the effects of experimental conditions on the response, the optimization of their values and the calculation of element fractionation. We will describe the principles of the multivariate chemometric techniques considered, the aims for which they were applied and the key findings obtained. The following topics will be critically addressed: pattern recognition by cluster analysis (CA), linear discriminant analysis (LDA) and other less common techniques; modelling by multiple linear regression (MLR); investigation of spatial distribution of variables by geostatistics; calculation of fractionation patterns by a mixture resolution method (Chemometric Identification of Substrates and Element Distributions, CISED); optimization and characterization of extraction procedures by experimental design; other multivariate techniques less commonly applied. Copyright © 2010 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
O'Brien, S. J.; Fitzpatrick, P. J.; Dzwonkowski, B.; Dykstra, S. L.; Wallace, D. J.; Church, I.; Wiggert, J. D.
2016-02-01
The Mississippi Sound is influenced by a high volume of sediment discharge from the Biloxi River, Mobile Bay via Pas aux Herons, Pascagoula River, Pearl River, Wolf River, and Lake Pontchartrain through the Rigolets. The river discharge, variable wind speed, wind direction and tides have a significant impact on the turbidity and transport of sediments in the Sound. Level 1 Moderate Resolution Imaging Spectroradiometer (MODIS) data is processed to extract the remote sensing reflectance at the wavelength of 645 nm and binned into an 8-day composite at a resolution of 500 m. The study uses a regional ocean color algorithm to compute suspended particulate matter (SPM) concentration based on these 8-day composite images. Multivariate analysis is applied between the SPM and time series of tides, wind, turbidity and river discharge measured at federal and academic institutions' stations and moorings. The multivariate analysis also includes in situ measurements of suspended sediment concentration and advective exchanges through the Mississippi Sound's tidal inlets between the coastal shelf and the nearshore estuarine waters. Mechanisms underlying the observed spatiotemporal distribution of SPM, including material exchange between the Sound and adjacent shelf waters, will be explored. The results of this study will contribute to current understanding of exchange mechanisms and pathways with the Mississippi Bight via the Mississippi Sound's tidal inlets.
Marriage Type and Reproductive Decisions: A Comparative Study in Sub-Saharan Africa.
ERIC Educational Resources Information Center
Dodoo, F. Nii-Amoo
1998-01-01
The effect of marriage type (polygamy vs. monogamy) on reproductive decisions is investigated using comparative data from the 1988, 1989, and 1993 Demographic and Health Surveys of Ghana and Kenya. Multivariate analysis is used. Inconclusive results are discussed with a focus on future research directions. (Author/EMK)
Robustness properties of discrete time regulators, LQG regulators and hybrid systems
NASA Technical Reports Server (NTRS)
Stein, G.; Athans, M.
1979-01-01
Robustness properties of sampled-data LQ regulators are derived which show that these regulators have fundamentally inferior uncertainty tolerances when compared to their continuous-time counterparts. Results are also presented in stability theory, multivariable frequency domain analysis, LQG robustness, and mathematical representations of hybrid systems.
Method for factor analysis of GC/MS data
Van Benthem, Mark H; Kotula, Paul G; Keenan, Michael R
2012-09-11
The method of the present invention provides a fast, robust, and automated multivariate statistical analysis of gas chromatography/mass spectroscopy (GC/MS) data sets. The method can involve systematic elimination of undesired, saturated peak masses to yield data that follow a linear, additive model. The cleaned data can then be subjected to a combination of PCA and orthogonal factor rotation followed by refinement with MCR-ALS to yield highly interpretable results.
Sun, Li-Li; Wang, Meng; Zhang, Hui-Jie; Liu, Ya-Nan; Ren, Xiao-Liang; Deng, Yan-Ru; Qi, Ai-Di
2018-01-01
Polygoni Multiflori Radix (PMR) is increasingly being used not just as a traditional herbal medicine but also as a popular functional food. In this study, multivariate chemometric methods and mass spectrometry were combined to analyze the ultra-high-performance liquid chromatograph (UPLC) fingerprints of PMR from six different geographical origins. A chemometric strategy based on multivariate curve resolution-alternating least squares (MCR-ALS) and three classification methods is proposed to analyze the UPLC fingerprints obtained. Common chromatographic problems, including the background contribution, baseline contribution, and peak overlap, were handled by the established MCR-ALS model. A total of 22 components were resolved. Moreover, relative species concentrations were obtained from the MCR-ALS model, which was used for multivariate classification analysis. Principal component analysis (PCA) and Ward's method have been applied to classify 72 PMR samples from six different geographical regions. The PCA score plot showed that the PMR samples fell into four clusters, which related to the geographical location and climate of the source areas. The results were then corroborated by Ward's method. In addition, according to the variance-weighted distance between cluster centers obtained from Ward's method, five components were identified as the most significant variables (chemical markers) for cluster discrimination. A counter-propagation artificial neural network has been applied to confirm and predict the effects of chemical markers on different samples. Finally, the five chemical markers were identified by UPLC-quadrupole time-of-flight mass spectrometer. Components 3, 12, 16, 18, and 19 were identified as 2,3,5,4'-tetrahydroxy-stilbene-2-O-β-d-glucoside, emodin-8-O-β-d-glucopyranoside, emodin-8-O-(6'-O-acetyl)-β-d-glucopyranoside, emodin, and physcion, respectively. In conclusion, the proposed method can be applied for the comprehensive analysis of natural samples. 
Copyright © 2016. Published by Elsevier B.V.
The analysis of morphometric data on rocky mountain wolves and artic wolves using statistical method
NASA Astrophysics Data System (ADS)
Ammar Shafi, Muhammad; Saifullah Rusiman, Mohd; Hamzah, Nor Shamsidah Amir; Nor, Maria Elena; Ahmad, Noor’ani; Azia Hazida Mohamad Azmi, Nur; Latip, Muhammad Faez Ab; Hilmi Azman, Ahmad
2018-04-01
Morphometrics is a quantitative analysis depending on the shape and size of several specimens. Morphometric quantitative analyses are commonly used to analyse the fossil record, the shape and size of specimens, and others. The aim of the study is to find the differences between rocky mountain wolves and arctic wolves based on gender. The sample utilised secondary data which included seven variables as independent variables and two dependent variables. Statistical modelling was used in the analysis, such as analysis of variance (ANOVA) and multivariate analysis of variance (MANOVA). The results showed differences between arctic wolves and rocky mountain wolves based on the independent factors and gender.
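The abstract above names ANOVA without giving details. As a minimal sketch of the univariate case only (MANOVA is not covered), the one-way F statistic compares between-group to within-group variability. The function name and the measurements below are hypothetical and are not taken from the study.

```python
def one_way_anova_f(groups):
    """Return the one-way ANOVA F statistic for a list of numeric groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]
    # Between-group sum of squares: group sizes times squared mean offsets.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Within-group sum of squares: squared deviations from each group's own mean.
    ss_within = sum(sum((x - m) ** 2 for x in g) for g, m in zip(groups, means))
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Hypothetical measurements for three groups (illustrative values only).
samples = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
print(f"F = {one_way_anova_f(samples):.2f}")  # prints "F = 13.00"
```

A large F relative to the F distribution with (k - 1, N - k) degrees of freedom indicates that at least one group mean differs; the p-value lookup is omitted here.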
Li, Cen; Yang, Hongxia; Xiao, Yuancan; Zhandui; Sanglao; Wang, Zhang; Ladan, Duojie; Bi, Hongtao
2016-01-01
Zuotai (gTso thal) is one of the famous drugs containing mercury in Tibetan medicine. However, little is known about the chemical substance basis of its pharmacodynamics and the intrinsic link of different samples sources so far. Given this, energy dispersive spectrometry of X-ray (EDX), scanning electron microscopy (SEM), atomic force microscopy (AFM), and powder X-ray diffraction (XRD) were used to assay the elements, micromorphology, and phase composition of nine Zuotai samples from different regions, respectively; the XRD fingerprint features of Zuotai were analyzed by multivariate statistical analysis. EDX result shows that Zuotai contains Hg, S, O, Fe, Al, Cu, and other elements. SEM and AFM observations suggest that Zuotai is a kind of ancient nanodrug. Its particles are mainly in the range of 100–800 nm, which commonly further aggregate into 1–30 μm loosely amorphous particles. XRD test shows that β-HgS, S8, and α-HgS are its main phase compositions. XRD fingerprint analysis indicates that the similarity degrees of nine samples are very high, and the results of multivariate statistical analysis are broadly consistent with sample sources. The present research has revealed the physicochemical characteristics of Zuotai, and it would play a positive role in interpreting this mysterious Tibetan drug. PMID:27738409
Li, Cen; Yang, Hongxia; Du, Yuzhi; Xiao, Yuancan; Zhandui; Sanglao; Wang, Zhang; Ladan, Duojie; Bi, Hongtao; Wei, Lixin
2016-01-01
Zuotai (gTso thal) is one of the famous drugs containing mercury in Tibetan medicine. However, little is known about the chemical substance basis of its pharmacodynamics and the intrinsic link of different samples sources so far. Given this, energy dispersive spectrometry of X-ray (EDX), scanning electron microscopy (SEM), atomic force microscopy (AFM), and powder X-ray diffraction (XRD) were used to assay the elements, micromorphology, and phase composition of nine Zuotai samples from different regions, respectively; the XRD fingerprint features of Zuotai were analyzed by multivariate statistical analysis. EDX result shows that Zuotai contains Hg, S, O, Fe, Al, Cu, and other elements. SEM and AFM observations suggest that Zuotai is a kind of ancient nanodrug. Its particles are mainly in the range of 100-800 nm, which commonly further aggregate into 1-30 μm loosely amorphous particles. XRD test shows that β-HgS, S8, and α-HgS are its main phase compositions. XRD fingerprint analysis indicates that the similarity degrees of nine samples are very high, and the results of multivariate statistical analysis are broadly consistent with sample sources. The present research has revealed the physicochemical characteristics of Zuotai, and it would play a positive role in interpreting this mysterious Tibetan drug.
Factors Controlling Sediment Load in The Central Anatolia Region of Turkey: Ankara River Basin.
Duru, Umit; Wohl, Ellen; Ahmadi, Mehdi
2017-05-01
Better understanding of the factors controlling sediment load at a catchment scale can facilitate estimation of soil erosion and sediment transport rates. The research summarized here enhances understanding of correlations between potential control variables on suspended sediment loads. The Soil and Water Assessment Tool was used to simulate flow and sediment at the Ankara River basin. Multivariable regression analysis and principal component analysis were then performed between sediment load and controlling variables. The physical variables were either directly derived from a Digital Elevation Model or from field maps or computed using established equations. Mean observed sediment rate is 6697 ton/year and mean sediment yield is 21 ton/y/km² from the gage. Soil and Water Assessment Tool satisfactorily simulated observed sediment load with Nash-Sutcliffe efficiency, relative error, and coefficient of determination (R²) values of 0.81, -1.55, and 0.93, respectively in the catchment. Therefore, parameter values from the physically based model were applied to the multivariable regression analysis as well as principal component analysis. The results indicate that stream flow, drainage area, and channel width explain most of the variability in sediment load among the catchments. The results imply that efficient siltation management practices in the catchment should target stream flow, drainage area, and channel width.
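The abstract above reports a Nash-Sutcliffe efficiency and an R² for the simulated sediment load. The two goodness-of-fit measures can be sketched as follows, using the standard definitions; the function names and the load values are hypothetical and are not data from the study.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 minus squared error over spread of observations."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - sse / spread


def r_squared(observed, simulated):
    """Coefficient of determination as the squared Pearson correlation."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_s = sum(simulated) / n
    cov = sum((o - mean_o) * (s - mean_s) for o, s in zip(observed, simulated))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    return cov * cov / (var_o * var_s)


# Hypothetical observed and simulated loads (illustrative values only).
observed = [10.0, 20.0, 30.0, 40.0]
simulated = [12.0, 18.0, 33.0, 37.0]
print(f"NSE = {nash_sutcliffe(observed, simulated):.3f}")
print(f"R^2 = {r_squared(observed, simulated):.3f}")
```

The two measures answer different questions: a simulation with a constant bias keeps R² at 1 while its Nash-Sutcliffe efficiency drops, which is one reason both are usually reported together.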
[Risk factors for anorexia in children].
Liu, Wei-Xiao; Lang, Jun-Feng; Zhang, Qin-Feng
2016-11-01
To investigate the risk factors for anorexia in children, and to reduce the prevalence of anorexia in children. A questionnaire survey and a case-control study were used to collect the general information of 150 children with anorexia (case group) and 150 normal children (control group). Univariate analysis and multivariate logistic stepwise regression analysis were performed to identify the risk factors for anorexia in children. The results of the univariate analysis showed significant differences between the case and control groups in the age in months when supplementary food was added, feeding pattern, whether they liked meat, vegetables and salty food, whether they often took snacks and beverages, whether they liked to play while eating, and whether their parents asked them to eat food on time (P<0.05). The results of the multivariate logistic regression analysis showed that late addition of supplementary food (OR=5.408), high frequency of taking snacks and/or drinks (OR=11.813), and eating while playing (OR=6.654) were major risk factors for anorexia in children. Liking of meat (OR=0.093) and vegetables (OR=0.272) and eating on time required by parents (OR=0.079) were protective factors against anorexia in children. Timely addition of supplementary food, a proper diet, and development of children's proper eating and living habits can reduce the incidence of anorexia in children.
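The odds ratios quoted above come from a multivariate logistic regression and are therefore adjusted for the other factors. As a simpler illustration of what an odds ratio measures in a case-control design, the sketch below computes an unadjusted odds ratio from a 2x2 table with a Woolf (log-based) 95% confidence interval. The function name and all counts are hypothetical; only the group sizes of 150 mirror the abstract.

```python
import math


def odds_ratio_ci(exposed_cases, unexposed_cases,
                  exposed_controls, unexposed_controls, z=1.96):
    """Unadjusted odds ratio and Woolf confidence interval from a 2x2 table."""
    a, b, c, d = exposed_cases, unexposed_cases, exposed_controls, unexposed_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    # Odds of exposure among cases divided by odds of exposure among controls.
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation.
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts: 150 cases and 150 controls, split by one exposure.
or_value, lower, upper = odds_ratio_ci(90, 60, 30, 120)
print(f"OR = {or_value:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

An odds ratio above 1 with an interval that excludes 1 points to a risk factor; a value below 1 points to a protective factor, as with the meat and vegetable preferences reported in the abstract.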
Multivariate analysis of cytokine profiles in pregnancy complications.
Azizieh, Fawaz; Dingle, Kamaludin; Raghupathy, Raj; Johnson, Kjell; VanderPlas, Jacob; Ansari, Ali
2018-03-01
The immunoregulation to tolerate the semiallogeneic fetus during pregnancy includes a harmonious dynamic balance between anti- and pro-inflammatory cytokines. Several earlier studies reported significantly different levels and/or ratios of several cytokines in complicated pregnancy as compared to normal pregnancy. However, as cytokines operate in networks with potentially complex interactions, it is also interesting to compare groups with multi-cytokine data sets, with multivariate analysis. Such analysis will further examine how great the differences are, and which cytokines are more different than others. Various multivariate statistical tools, such as Cramer test, classification and regression trees, partial least squares regression figures, 2-dimensional Kolmogorov-Smirnov test, principal component analysis and gap statistic, were used to compare cytokine data of normal vs anomalous groups of different pregnancy complications. Multivariate analysis assisted in examining if the groups were different, how strongly they differed, in what ways they differed and further reported evidence for subgroups in 1 group (pregnancy-induced hypertension), possibly indicating multiple causes for the complication. This work contributes to a better understanding of cytokine interactions and may have important implications for targeting cytokine balance modulation or for the design of future medications or interventions that best direct management or prevention from an immunological approach. © 2018 The Authors. American Journal of Reproductive Immunology Published by John Wiley & Sons Ltd.
A single determinant dominates the rate of yeast protein evolution.
Drummond, D Allan; Raval, Alpan; Wilke, Claus O
2006-02-01
A gene's rate of sequence evolution is among the most fundamental evolutionary quantities in common use, but what determines evolutionary rates has remained unclear. Here, we carry out the first combined analysis of seven predictors (gene expression level, dispensability, protein abundance, codon adaptation index, gene length, number of protein-protein interactions, and the gene's centrality in the interaction network) previously reported to have independent influences on protein evolutionary rates. Strikingly, our analysis reveals a single dominant variable linked to the number of translation events which explains 40-fold more variation in evolutionary rate than any other, suggesting that protein evolutionary rate has a single major determinant among the seven predictors. The dominant variable explains nearly half the variation in the rate of synonymous and protein evolution. We show that the two most commonly used methods to disentangle the determinants of evolutionary rate, partial correlation analysis and ordinary multivariate regression, produce misleading or spurious results when applied to noisy biological data. We overcome these difficulties by employing principal component regression, a multivariate regression of evolutionary rate against the principal components of the predictor variables. Our results support the hypothesis that translational selection governs the rate of synonymous and protein sequence evolution in yeast.
A first application of independent component analysis to extracting structure from stock returns.
Back, A D; Weigend, A S
1997-08-01
This paper explores the application of a signal processing technique known as independent component analysis (ICA) or blind source separation to multivariate financial time series such as a portfolio of stocks. The key idea of ICA is to linearly map the observed multivariate time series into a new space of statistically independent components (ICs). We apply ICA to three years of daily returns of the 28 largest Japanese stocks and compare the results with those obtained using principal component analysis. The results indicate that the estimated ICs fall into two categories, (i) infrequent large shocks (responsible for the major changes in the stock prices), and (ii) frequent smaller fluctuations (contributing little to the overall level of the stocks). We show that the overall stock price can be reconstructed surprisingly well by using a small number of thresholded weighted ICs. In contrast, when using shocks derived from principal components instead of independent components, the reconstructed price is less similar to the original one. ICA is shown to be a potentially powerful method of analyzing and understanding driving mechanisms in financial time series. The application to portfolio optimization is described in Chin and Weigend (1998).
Backes, Dirce Stein; Zanatta, Fabrício Batistin; Costenaro, Regina Santini; Rangel, Rosiane Filipin; Vidal, Janice; Kruel, Cristina Saling; de Mattos, Karen Mallo
2014-03-01
This study sought to identify the risk indicators associated with the consumption of illicit drugs by schoolchildren in public schools in a community in the south of Brazil. This is a non-experimental cross-sectional study conducted with 535 primary school students from six public schools. Data were collected using a questionnaire between October 2011 and March 2012. The results were presented as simple and relative frequency distributions, and odds ratios (OR) with 95% confidence intervals were calculated to verify the association between the dependent and independent variables. Multivariate analysis was also performed using the question "have you ever used illicit drugs?" Univariate analysis revealed an association between family income, color, period in which the child studied, failure to pass annual tests, use of methods of prevention, smoking habit and knowing someone who uses drugs with the fact of having experimented with the use of illicit drugs. After multivariate analysis, the smoking habit was the only indicator significantly associated with the question of having made use of illicit drugs. The results indicate that the smoking habit is an important predictive risk indicator for the use of illicit drugs.
Ielpo, Pierina; Leardi, Riccardo; Pappagallo, Giuseppe; Uricchio, Vito Felice
2017-06-01
In this paper, the results obtained from multivariate statistical techniques such as PCA (Principal component analysis) and LDA (Linear discriminant analysis) applied to a wide soil data set are presented. The results have been compared with those obtained on a groundwater data set, whose samples were collected together with soil ones, within the project "Improvement of the Regional Agro-meteorological Monitoring Network (2004-2007)". LDA, applied to soil data, made it possible to distinguish the geographical origin of the sample from either one of the two macroareas: Bari and Foggia provinces vs Brindisi, Lecce and Taranto provinces, with a percentage of correct prediction in cross validation of 87%. In the case of the groundwater data set, the best classification was obtained when the samples were grouped into three macroareas: Foggia province, Bari province and Brindisi, Lecce and Taranto provinces, reaching a percentage of correct predictions in cross validation of 84%. The obtained information can be very useful in supporting soil and water resource management, such as the reduction of water consumption and the reduction of energy and chemical (nutrients and pesticides) inputs in agriculture.
Investigating the Moisture Content of Polyamide 6 by Raman-Microscopy and Multivariate Data Analysis
NASA Astrophysics Data System (ADS)
Lechner, Tobias; Noack, Kristina; Thöne, Manuel; Amend, Philipp; Schmidt, Michael; Will, Stefan
Thermal malleability of thermoplastics results in a high product diversity in various industry sectors. However, industrial applications require a constant and high component quality. Hence, material processing such as laser welding has to consider that, e.g., the moisture content of thermoplastics influences the mechanical properties such as the tensile strength. Moreover, water evaporates during laser welding and can form pores and defects. Thus, there is a large need for non-invasive material inspection before processing. To that end, we developed a methodology based on Raman-microscopy and multivariate data analysis (MVD) to determine the moisture content of polyamide (MCP). Further, the impact of the MCP on the mechanical properties was verified. For samples with a defined variation of the MCP, xyz-Raman-scans were carried out and analysed using MVD. For reference purposes, the samples were weighed and tensile tests were performed. An evaluation by means of partial least squares regression analysis (PLSR) resulted in a prediction of the MCP with a correlation coefficient >98%. Consequently, Raman-microscopy shows large potential for developing new techniques for inspection and quality control of plastics before processing. Dedicated to Professor Alfred Leipertz on the occasion of his 70th birthday.
Multivariate Analysis of Longitudinal Rates of Change
Bryan, Matthew; Heagerty, Patrick J.
2016-01-01
Longitudinal data allow direct comparison of the change in patient outcomes associated with treatment or exposure. Frequently, several longitudinal measures are collected that either reflect a common underlying health status, or characterize processes that are influenced in a similar way by covariates such as exposure or demographic characteristics. Statistical methods that can combine multivariate response variables into common measures of covariate effects have been proposed by Roy and Lin [1]; Proust-Lima, Letenneur and Jacqmin-Gadda [2]; and Gray and Brookmeyer [3] among others. Current methods for characterizing the relationship between covariates and the rate of change in multivariate outcomes are limited to select models. For example, Gray and Brookmeyer [3] introduce an “accelerated time” method which assumes that covariates rescale time in longitudinal models for disease progression. In this manuscript we detail an alternative multivariate model formulation that directly structures longitudinal rates of change, and that permits a common covariate effect across multiple outcomes. We detail maximum likelihood estimation for a multivariate longitudinal mixed model. We show via asymptotic calculations the potential gain in power that may be achieved with a common analysis of multiple outcomes. We apply the proposed methods to the analysis of a trivariate outcome for infant growth and compare rates of change for HIV infected and uninfected infants. PMID:27417129
Composition Studies with the Telescope Array Surface Detector
NASA Astrophysics Data System (ADS)
Kuznetsov, Mikhail; Piskunov, Maxim; Rubtsov, Grigory; Troitsky, Sergey; Zhezher, Yana
The results on ultra-high-energy cosmic-ray chemical composition based on the data from the Telescope Array surface-detector are presented. The method is based on the multivariate boosted decision tree (BDT) analysis which uses surface-detector observables. The results on average atomic mass in the energy range 10^18.0-10^20.0 eV are presented. A comparison with the Telescope Array hybrid results and the Pierre Auger Observatory surface detector results is shown.
The NLS-Based Nonlinear Grey Multivariate Model for Forecasting Pollutant Emissions in China.
Pei, Ling-Ling; Li, Qin; Wang, Zheng-Xin
2018-03-08
The relationship between pollutant discharge and economic growth has been a major research focus in environmental economics. To accurately estimate the nonlinear change law of China's pollutant discharge with economic growth, this study establishes a transformed nonlinear grey multivariable (TNGM(1, N)) model based on the nonlinear least squares (NLS) method. The Gauss-Seidel iterative algorithm was used to solve the parameters of the TNGM(1, N) model based on the NLS basic principle. This algorithm improves the precision of the model by continuous iteration and constantly approximating the optimal regression coefficient of the nonlinear model. In our empirical analysis, the traditional grey multivariate model GM(1, N) and the NLS-based TNGM(1, N) model were respectively adopted to forecast and analyze the relationship among wastewater discharge per capita (WDPC), and per capita emissions of SO₂ and dust, alongside GDP per capita in China during the period 1996-2015. Results indicated that the NLS algorithm is able to effectively help the grey multivariable model identify the nonlinear relationship between pollutant discharge and economic growth. The results show that the NLS-based TNGM(1, N) model presents greater precision when forecasting WDPC, SO₂ emissions and dust emissions per capita, compared to the traditional GM(1, N) model; WDPC indicates a growing tendency aligned with the growth of GDP, while the per capita emissions of SO₂ and dust reduce accordingly.
ERIC Educational Resources Information Center
Joo, Soohyung; Kipp, Margaret E. I.
2015-01-01
Introduction: This study examines the structure of Web space in the field of library and information science using multivariate analysis of social tags from the Website, Delicious.com. A few studies have examined mathematical modelling of tags, mainly examining tagging in terms of tripartite graphs, pattern tracing and descriptive statistics. This…
2016-06-01
Multivariate Analysis of High Through-Put Adhesively Bonded Single Lap Joints: Experimental and Workflow Protocols, by Robert E Jensen, Daniel C DeSchepper, and David P Flanagan. US Army Research Laboratory, ...TR-7696, June 2016.
A Multivariate Model for the Meta-Analysis of Study Level Survival Data at Multiple Times
ERIC Educational Resources Information Center
Jackson, Dan; Rollins, Katie; Coughlin, Patrick
2014-01-01
Motivated by our meta-analytic dataset involving survival rates after treatment for critical leg ischemia, we develop and apply a new multivariate model for the meta-analysis of study level survival data at multiple times. Our data set involves 50 studies that provide mortality rates at up to seven time points, which we model simultaneously, and…
Application of Multivariate Statistical Analysis to Biomarkers in Se-Turkey Crude Oils
NASA Astrophysics Data System (ADS)
Gürgey, K.; Canbolat, S.
2017-11-01
Twenty-four crude oil samples were collected from 24 oil fields distributed in different districts of SE-Turkey. API and sulphur content (%), stable carbon isotope, gas chromatography (GC), and gas chromatography-mass spectrometry (GC-MS) data were used to construct a geochemical data matrix. The aim of this study is to examine the genetic grouping or correlations in the crude oil samples, and hence the number of source rocks present in SE-Turkey. To achieve these aims, two multivariate statistical analysis techniques (Principal Component Analysis [PCA] and Cluster Analysis) were applied to the data matrix of 24 samples and 8 source-specific biomarker variables/parameters. The results showed that there are 3 genetically different oil groups (Batman-Nusaybin Oils, Adıyaman-Kozluk Oils and Diyarbakir Oils), in addition to one mixed group. These groupings imply that at least three different source rocks are present in South-Eastern (SE) Turkey. Grouping of the crude oil samples appears to be consistent with the geographic locations of the oil fields, the subsurface stratigraphy, and the geology of the area.
Thomassen, Yvonne E; van Sprang, Eric N M; van der Pol, Leo A; Bakker, Wilfried A M
2010-09-01
Historical manufacturing data can potentially harbor a wealth of information for process optimization and enhancement of efficiency and robustness. To extract useful data, multivariate data analysis (MVDA) using projection methods is often applied. In this contribution, the results obtained from applying MVDA to data from inactivated polio vaccine (IPV) production runs are described. Data from over 50 batches at two different production scales (700-L and 1,500-L) were available. The explorative analysis performed on single unit operations indicated consistent manufacturing. Known outliers (e.g., rejected batches) were identified using principal component analysis (PCA). The source of operational variation was pinpointed to variation of input such as media. Other relevant process parameters were in control and, using this manufacturing data, could not be correlated to product quality attributes. The knowledge gained of the IPV production process, not only from the MVDA but also from digitalizing the available historical data, has proven useful for troubleshooting, understanding the limitations of available data, and seeing opportunities for improvement. 2010 Wiley Periodicals, Inc.
Chen, Zhixiang; Shao, Peng; Sun, Qizhao; Zhao, Dong
2015-03-01
The purpose of the present study was to use prospectively collected data to evaluate the rate of incidental durotomy (ID) during lumbar surgery and determine the associated risk factors by using univariate and multivariate analysis. We retrospectively reviewed 2184 patients who underwent lumbar surgery from January 1, 2009 to December 31, 2011 at a single hospital. Patients with ID (n=97) were compared with the patients without ID (n=2019). The influences of several potential risk factors that might affect the occurrence of ID were assessed using univariate and multivariate analyses. The overall incidence of ID was 4.62%. Univariate analysis demonstrated that older age, diabetes, lumbar central stenosis, posterior approach, revision surgery, prior lumbar surgery and minimally invasive surgery are risk factors for ID during lumbar surgery. However, multivariate analysis identified only older age, prior lumbar surgery, revision surgery, and minimally invasive surgery as independent risk factors. These findings may guide clinicians making future surgical decisions regarding ID and aid in the patient counseling process to alleviate risks and complications. Copyright © 2015 Elsevier B.V. All rights reserved.
Multivariate reference technique for quantitative analysis of fiber-optic tissue Raman spectroscopy.
Bergholt, Mads Sylvest; Duraipandian, Shiyamala; Zheng, Wei; Huang, Zhiwei
2013-12-03
We report a novel method making use of multivariate reference signals of fused silica and sapphire Raman signals generated from a ball-lens fiber-optic Raman probe for quantitative analysis of in vivo tissue Raman measurements in real time. Partial least-squares (PLS) regression modeling is applied to extract the characteristic internal reference Raman signals (e.g., shoulder of the prominent fused silica boson peak (~130 cm⁻¹); distinct sapphire ball-lens peaks (380, 417, 646, and 751 cm⁻¹)) from the ball-lens fiber-optic Raman probe for quantitative analysis of fiber-optic Raman spectroscopy. To evaluate the analytical value of this novel multivariate reference technique, a rapid Raman spectroscopy system coupled with a ball-lens fiber-optic Raman probe is used for in vivo oral tissue Raman measurements (n = 25 subjects) under 785 nm laser excitation powers ranging from 5 to 65 mW. An accurate linear relationship (R² = 0.981) with a root-mean-square error of cross validation (RMSECV) of 2.5 mW can be obtained for predicting the laser excitation power changes based on a leave-one-subject-out cross-validation, which is superior to the normal univariate reference method (RMSE = 6.2 mW). A root-mean-square error of prediction (RMSEP) of 2.4 mW (R² = 0.985) can also be achieved for laser power prediction in real time when we applied the multivariate method independently on the five new subjects (n = 166 spectra). We further apply the multivariate reference technique for quantitative analysis of gelatin tissue phantoms that gives rise to an RMSEP of ~2.0% (R² = 0.998) independent of laser excitation power variations.
This work demonstrates that multivariate reference technique can be advantageously used to monitor and correct the variations of laser excitation power and fiber coupling efficiency in situ for standardizing the tissue Raman intensity to realize quantitative analysis of tissue Raman measurements in vivo, which is particularly appealing in challenging Raman endoscopic applications.
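The leave-one-subject-out cross-validation mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' PLS pipeline: a one-variable least-squares line stands in for the PLS model, and the function names and the toy records are hypothetical. The point shown is only the validation scheme, in which all spectra from one subject are held out together and the RMSECV is computed over the held-out predictions.

```python
import math

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x (stand-in for a PLS model)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def rmsecv_leave_one_subject_out(records):
    """records: list of (subject_id, x, y); returns RMSE of held-out predictions."""
    subjects = sorted({subject for subject, _, _ in records})
    squared_errors = []
    for held_out in subjects:
        # Train on every subject except the held-out one.
        train = [(x, y) for s, x, y in records if s != held_out]
        test = [(x, y) for s, x, y in records if s == held_out]
        a, b = fit_line([x for x, _ in train], [y for _, y in train])
        squared_errors.extend((a + b * x - y) ** 2 for x, y in test)
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical toy data: (subject, reference signal, laser power in mW).
toy = [("s1", 1.0, 10.0), ("s1", 2.0, 21.0),
       ("s2", 3.0, 29.0), ("s2", 4.0, 41.0),
       ("s3", 5.0, 50.0), ("s3", 6.0, 61.0)]
print(rmsecv_leave_one_subject_out(toy))
```

Grouping the folds by subject rather than by spectrum keeps spectra from the same person out of both the training and the test set at once, which is what makes the error estimate relevant to new subjects.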
A novel multi-variant epitope ensemble vaccine against avian leukosis virus subgroup J.
Wang, Xiaoyu; Zhou, Defang; Wang, Guihua; Huang, Libo; Zheng, Qiankun; Li, Chengui; Cheng, Ziqiang
2017-12-04
The hypervariable antigenicity and immunosuppressive features of avian leukosis virus subgroup J (ALV-J) have made it very challenging to develop effective vaccines. Epitope vaccines are a promising direction. Previously, we identified a variant antigenic neutralizing epitope in hypervariable region 1 (hr1) of ALV-J, N-LRDFIA/E/TKWKS/GDDL/HLIRPYVNQS-C. BLAST analysis showed that the A, E, T and H mutations in this epitope cover 79% of all ALV-J strains. Based on these data, we designed a multi-variant epitope ensemble vaccine comprising the four mutation variants linked with glycine and serine. The recombinant multi-variant epitope gene was expressed in Escherichia coli BL21. The expressed protein of the multi-variant epitope gene reacted with ALV-J-positive sera and monoclonal antibodies but did not react with ALV-J-negative sera. The multi-variant epitope vaccine conjugated with complete/incomplete Freund's adjuvant showed high immunogenicity, reaching a titer of 1:64,000 at 42 days post immunization and maintaining immunity for at least 126 days in SPF chickens. Further, we demonstrated that the antibody induced by the multi-variant epitope ensemble vaccine recognized and neutralized different ALV-J strains (NX0101, TA1, WS1, BZ1224 and BZ4). A protection experiment evaluated by clinical symptoms, viral shedding, weight gain, and gross and histopathology showed that 100% of chickens inoculated with the multi-epitope vaccine were well protected against ALV-J challenge. The results point to a promising multi-variant epitope ensemble vaccine against hypervariable viruses in animals. Copyright © 2017 Elsevier Ltd. All rights reserved.
TATES: Efficient Multivariate Genotype-Phenotype Analysis for Genome-Wide Association Studies
van der Sluis, Sophie; Posthuma, Danielle; Dolan, Conor V.
2013-01-01
To date, the genome-wide association study (GWAS) is the primary tool to identify genetic variants that cause phenotypic variation. As GWAS analyses are generally univariate in nature, multivariate phenotypic information is usually reduced to a single composite score. This practice often results in loss of statistical power to detect causal variants. Multivariate genotype–phenotype methods do exist but attain maximal power only in special circumstances. Here, we present a new multivariate method that we refer to as TATES (Trait-based Association Test that uses Extended Simes procedure), inspired by the GATES procedure proposed by Li et al (2011). For each component of a multivariate trait, TATES combines p-values obtained in standard univariate GWAS to acquire one trait-based p-value, while correcting for correlations between components. Extensive simulations, probing a wide variety of genotype–phenotype models, show that TATES's false positive rate is correct, and that TATES's statistical power to detect causal variants explaining 0.5% of the variance can be 2.5–9 times higher than the power of univariate tests based on composite scores and 1.5–2 times higher than the power of the standard MANOVA. Unlike other multivariate methods, TATES detects both genetic variants that are common to multiple phenotypes and genetic variants that are specific to a single phenotype, i.e. TATES provides a more complete view of the genetic architecture of complex traits. As the actual causal genotype–phenotype model is usually unknown and probably phenotypically and genetically complex, TATES, available as an open source program, constitutes a powerful new multivariate strategy that allows researchers to identify novel causal variants, while the complexity of traits is no longer a limiting factor. PMID:23359524
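As a rough illustration of the idea of combining per-component p-values into one trait-based p-value, the sketch below implements the plain Simes procedure: sort the m p-values and take the minimum of m·p(j)/j. This is a simplification and not TATES itself; the correction for correlations between components that the abstract describes is not reproduced here, and the function name and example values are hypothetical.

```python
def simes_p(p_values):
    """Plain Simes combination: min over ranks j of m * p_(j) / j, capped at 1."""
    if not p_values:
        raise ValueError("need at least one p-value")
    ordered = sorted(p_values)
    m = len(ordered)
    combined = min(m * p / rank for rank, p in enumerate(ordered, start=1))
    return min(combined, 1.0)

# Hypothetical univariate p-values for four components of one trait.
component_p = [0.004, 0.03, 0.20, 0.65]
print(simes_p(component_p))
```

With independent components this behaves like a slightly less conservative Bonferroni correction; when components are correlated, an unadjusted m is too strict, which is the situation the extended procedure in the abstract is meant to address.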
Causal diagrams and multivariate analysis II: precision work.
Jupiter, Daniel C
2014-01-01
In this Investigators' Corner, I continue my discussion of when and why we researchers should include variables in multivariate regression. My examination focuses on studies comparing treatment groups and situations for which we can either exclude variables from multivariate analyses or include them for reasons of precision. Copyright © 2014 American College of Foot and Ankle Surgeons. Published by Elsevier Inc. All rights reserved.
A CLIPS expert system for clinical flow cytometry data analysis
NASA Technical Reports Server (NTRS)
Salzman, G. C.; Duque, R. E.; Braylan, R. C.; Stewart, C. C.
1990-01-01
An expert system is being developed using CLIPS to assist clinicians in the analysis of multivariate flow cytometry data from cancer patients. Cluster analysis is used to find subpopulations representing various cell types in multiple datasets each consisting of four to five measurements on each of 5000 cells. CLIPS facts are derived from results of the clustering. CLIPS rules are based on the expertise of Drs. Stewart, Duque, and Braylan. The rules incorporate certainty factors based on case histories.
NASA Astrophysics Data System (ADS)
Hsiao, Y. R.; Tsai, C.
2017-12-01
As the WHO Air Quality Guideline indicates, ambient air pollution exposes world populations to the threat of fatal conditions (e.g. heart disease, lung cancer, asthma), raising concerns about air pollution sources and related factors. This study presents a novel approach to investigating the multiscale variations of PM2.5 in southern Taiwan over the past decade, with four meteorological influencing factors (temperature, relative humidity, precipitation and wind speed), based on the Noise-assisted Multivariate Empirical Mode Decomposition (NAMEMD) algorithm, Hilbert Spectral Analysis (HSA) and the Time-dependent Intrinsic Correlation (TDIC) method. The NAMEMD algorithm is a fully data-driven approach designed for nonlinear and nonstationary multivariate signals, and is performed to decompose multivariate signals into a collection of channels of Intrinsic Mode Functions (IMFs). The TDIC method is an EMD-based method using a set of sliding window sizes to quantify localized correlation coefficients for multiscale signals. With the alignment property and quasi-dyadic filter bank of the NAMEMD algorithm, one is able to produce the same number of IMFs for all variables and estimate the cross correlation more accurately. The performance of the spectral representation of the NAMEMD-HSA method is compared with Complementary Empirical Mode Decomposition/Hilbert Spectral Analysis (CEEMD-HSA) and Wavelet Analysis. The nature of NAMEMD-based TDIC analysis is then compared with CEEMD-based TDIC analysis and the traditional correlation analysis.
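The notion of a localized correlation coefficient computed over sliding windows can be sketched as follows. This is a deliberately simplified stand-in, not the TDIC method: it uses one fixed window on raw series, whereas the abstract describes a set of window sizes applied to IMFs. The function names and the toy series are hypothetical.

```python
import math

def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (nan if one is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)

def sliding_correlation(xs, ys, window):
    """Correlation in each length-`window` segment, stepping one sample at a time."""
    if len(xs) != len(ys):
        raise ValueError("series must have the same length")
    if window < 2 or window > len(xs):
        raise ValueError("window must be between 2 and the series length")
    return [pearson(xs[i:i + window], ys[i:i + window])
            for i in range(len(xs) - window + 1)]

# Hypothetical toy series: the relation flips sign partway through.
pm25 = [1.0, 2.0, 3.0, 4.0, 5.0]
wind = [2.0, 4.0, 6.0, 5.0, 4.0]
print(sliding_correlation(pm25, wind, window=3))
```

A single global coefficient would average the positive and negative segments together, which is the motivation for looking at correlation locally in time.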
The Statistical Consulting Center for Astronomy (SCCA)
NASA Technical Reports Server (NTRS)
Akritas, Michael
2001-01-01
The process by which raw astronomical data acquisition is transformed into scientifically meaningful results and interpretation typically involves many statistical steps. Traditional astronomy limits itself to a narrow range of old and familiar statistical methods: means and standard deviations; least-squares methods like χ² minimization; and simple nonparametric procedures such as the Kolmogorov-Smirnov tests. These tools are often inadequate for the complex problems and datasets under investigation, and recent years have witnessed an increased usage of maximum-likelihood, survival analysis, multivariate analysis, wavelet and advanced time-series methods. The Statistical Consulting Center for Astronomy (SCCA) assisted astronomers in using sophisticated tools and in matching these tools with specific problems. The SCCA operated with two professors of statistics and a professor of astronomy working together. Questions were received by e-mail, and were discussed in detail with the questioner. Summaries of those questions and answers leading to new approaches were posted on the Web (www.state.psu.edu/ mga/SCCA). In addition to serving individual astronomers, the SCCA established a Web site for general use that provides hypertext links to selected on-line public-domain statistical software and services. The StatCodes site (www.astro.psu.edu/statcodes) provides over 200 links in the areas of: Bayesian statistics; censored and truncated data; correlation and regression; density estimation and smoothing; general statistics packages and information; image analysis; interactive Web tools; multivariate analysis; multivariate clustering and classification; nonparametric analysis; software written by astronomers; spatial statistics; statistical distributions; time series analysis; and visualization tools. StatCodes has received a remarkably high and constant hit rate of 250 hits/week (over 10,000/year) since its inception in mid-1997.
It is of interest to scientists both within and outside of astronomy. The most popular sections are multivariate techniques, image analysis, and time series analysis. Hundreds of copies of the ASURV, SLOPES and CENS-TAU codes developed by SCCA scientists were also downloaded from the StatCodes site. In addition to formal SCCA duties, SCCA scientists continued a variety of related activities in astrostatistics, including refereeing of statistically oriented papers submitted to the Astrophysical Journal, talks in meetings including Feigelson's talk to science journalists entitled "The reemergence of astrostatistics" at the American Association for the Advancement of Science meeting, and published papers of astrostatistical content.
Analysis/forecast experiments with a multivariate statistical analysis scheme using FGGE data
NASA Technical Reports Server (NTRS)
Baker, W. E.; Bloom, S. C.; Nestler, M. S.
1985-01-01
A three-dimensional, multivariate, statistical analysis method, optimal interpolation (OI) is described for modeling meteorological data from widely dispersed sites. The model was developed to analyze FGGE data at the NASA-Goddard Laboratory of Atmospherics. The model features a multivariate surface analysis over the oceans, including maintenance of the Ekman balance and a geographically dependent correlation function. Preliminary comparisons are made between the OI model and similar schemes employed at the European Center for Medium Range Weather Forecasts and the National Meteorological Center. The OI scheme is used to provide input to a GCM, and model error correlations are calculated for forecasts of 500 mb vertical water mixing ratios and the wind profiles. Comparisons are made between the predictions and measured data. The model is shown to be as accurate as a successive corrections model out to 4.5 days.
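The core update behind optimal interpolation can be shown in its simplest form: one background value and one observation of the same quantity, combined with a weight set by their error variances. The sketch below is this scalar textbook case only, not the three-dimensional multivariate scheme of the abstract; the function name and the numbers are hypothetical.

```python
def oi_update(background, observation, var_background, var_observation):
    """Scalar optimal-interpolation analysis.

    gain     = var_b / (var_b + var_o)
    analysis = background + gain * (observation - background)
    Returns (analysis value, analysis error variance).
    """
    gain = var_background / (var_background + var_observation)
    analysis = background + gain * (observation - background)
    var_analysis = (1.0 - gain) * var_background
    return analysis, var_analysis

# Hypothetical numbers: a forecast of 10.0 and an observation of 14.0,
# with the observation three times more trustworthy than the forecast.
print(oi_update(10.0, 14.0, var_background=3.0, var_observation=1.0))
```

The analysis is pulled toward whichever input has the smaller error variance, and its own error variance is never larger than that of the background.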
Zubrick, Stephen R; Taylor, Catherine L; Christensen, Daniel
2015-01-01
Oral language is the foundation of literacy. Naturally, policies and practices to promote children's literacy begin in early childhood and have a strong focus on developing children's oral language, especially for children with known risk factors for low language ability. The underlying assumption is that children's progress along the oral to literate continuum is stable and predictable, such that low language ability foretells low literacy ability. This study investigated patterns and predictors of children's oral language and literacy abilities at 4, 6, 8 and 10 years. The study sample comprised 2,316 to 2,792 children from the first nationally representative Longitudinal Study of Australian Children (LSAC). Six developmental patterns were observed, a stable middle-high pattern, a stable low pattern, an improving pattern, a declining pattern, a fluctuating low pattern, and a fluctuating middle-high pattern. Most children (69%) fit a stable middle-high pattern. By contrast, less than 1% of children fit a stable low pattern. These results challenged the view that children's progress along the oral to literate continuum is stable and predictable. Multivariate logistic regression was used to investigate risks for low literacy ability at 10 years and sensitivity-specificity analysis was used to examine the predictive utility of the multivariate model. Predictors were modelled as risk variables with the lowest level of risk as the reference category. In the multivariate model, substantial risks for low literacy ability at 10 years, in order of descending magnitude, were: low school readiness, Aboriginal and/or Torres Strait Islander status and low language ability at 8 years. Moderate risks were high temperamental reactivity, low language ability at 4 years, and low language ability at 6 years. 
The following risk factors were not statistically significant in the multivariate model: Low maternal consistency, low family income, health care card, child not read to at home, maternal smoking, maternal education, family structure, temperamental persistence, and socio-economic area disadvantage. The results of the sensitivity-specificity analysis showed that a well-fitted multivariate model featuring risks of substantive magnitude did not do particularly well in predicting low literacy ability at 10 years.
Camelo-Méndez, G A; Ragazzo-Sánchez, J A; Jiménez-Aparicio, A R; Vanegas-Espinoza, P E; Paredes-López, O; Del Villar-Martínez, A A
2013-09-01
Anthocyanins are a group of water-soluble pigments that provide red, purple or blue color to leaves, flowers, and fruits. In addition, benefits against hypertension and cardiovascular diseases have been attributed to them. This study compared the content of total anthocyanins and volatile compounds in aqueous and ethanolic extracts of four varieties of Mexican roselle with different levels of pigmentation. The multivariable analysis of categorical data demonstrated that ethanol was the best solvent for the extraction of both anthocyanins and volatile compounds. The concentration of anthocyanin in pigmented varieties ranged from 17.3 to 32.2 mg of cyanidin 3-glucoside/g dry weight, while volatile compounds analysis showed that geraniol was the main compound in extracts from the four varieties. The principal component analysis (PCA) allowed description of the results with 77.38% of variance, establishing a clear grouping for each variety in addition to similarities among some of these varieties. These results were validated by the confusion matrix obtained in the classification by factorial discriminant analysis (FDA), which can be useful for classifying roselle varieties. Small differences in anthocyanin and volatile compound content could be detected, which may be of interest to the food industry for classifying a new individual into one of several groups using different variables at once.
Can we discover double Higgs production at the LHC?
NASA Astrophysics Data System (ADS)
Alves, Alexandre; Ghosh, Tathagata; Sinha, Kuver
2017-08-01
We explore double Higgs production via gluon fusion in the bb̄γγ channel at the high-luminosity LHC using machine learning tools. We first propose a Bayesian optimization approach to select cuts on kinematic variables, obtaining a 30%-50% increase in the significance compared to current results in the literature. We show that this improvement persists once systematic uncertainties are taken into account. We next use boosted decision trees (BDT) to further discriminate signal and background events. Our analysis shows that a joint optimization of kinematic cuts and BDT hyperparameters results in an appreciable improvement in the significance. Finally, we perform a multivariate analysis of the output scores of the BDT. We find that assuming a very low level of systematics, the techniques proposed here will be able to confirm the production of a pair of standard model Higgs bosons at 5σ level with 3 ab⁻¹ of data. Assuming a more realistic projection of the level of systematics, around 10%, the optimization of cuts to train BDTs combined with a multivariate analysis delivers a respectable significance of 4.6σ. Even assuming large systematics of 20%, our analysis predicts a 3.6σ significance, which represents at least strong evidence in favor of double Higgs production. We carefully incorporate background contributions coming from light flavor jets or c jets being misidentified as b jets and jets being misidentified as photons in our analysis.
Amit, Moran; Binenbaum, Yoav; Sharma, Kanika; Ramer, Naomi; Ramer, Ilana; Agbetoba, Abib; Glick, Joelle; Yang, Xinjie; Lei, Delin; Bjørndal, Kristine; Godballe, Christian; Mücke, Thomas; Wolff, Klaus-Dietrich; Fliss, Dan; Eckardt, André M.; Copelli, Chiara; Sesenna, Enrico; Palmer, Frank; Ganly, Ian; Patel, Snehal; Gil, Ziv
2016-01-01
Background The patterns of regional metastasis in adenoid cystic carcinoma (ACC) of the head and neck and its association with outcome is not established. Methods We conducted a retrospective multicentered multivariate analysis of 270 patients who underwent neck dissection. Results The incidence rate of neck metastases was 29%. The rate observed in the oral cavity is 37%, and in the major salivary glands is 19% (p = .001). The rate of occult nodal metastases was 17%. Overall 5-year survival rates were 44% in patients undergoing therapeutic neck dissections, and 65% and 73% among those undergoing elective neck dissections, with and without nodal metastases, respectively (p = .017). Multivariate analysis revealed that the primary site, nodal classification, and margin status were independent predictors of survival. Conclusion Our findings support the consideration of elective neck treatment in patients with ACC of the oral cavity. PMID:25060927
Jamshidi-Zanjani, Ahmad; Saeedi, Mohsen
2017-07-01
Vertical distribution of metals (Cu, Zn, Cr, Fe, Mn, Pb, Ni, Cd, and Li) in four sediment core samples (C1, C2, C3, and C4) from Anzali international wetland located southwest of the Caspian Sea was examined. Background concentration of each metal was calculated according to different statistical approaches. The results of multivariate statistical analysis showed that Fe and Mn might have a significant role in the fate of Ni and Zn in sediment core samples. Different sediment quality indexes were utilized to assess metal pollution in sediment cores. Moreover, a new sediment quality index named aggregative toxicity index (ATI) based on sediment quality guidelines (SQGs) was developed to assess the degree of metal toxicity in an aggregative manner. The increasing pattern of metal pollution and their toxicity degree in upper layers of core samples indicated increasing effects of anthropogenic sources in the study area.
NASA Astrophysics Data System (ADS)
Gu, Huaying; Liu, Zhixue; Weng, Yingliang
2017-04-01
The present study applies the multivariate generalized autoregressive conditional heteroscedasticity (MGARCH) with spatial effects approach for the analysis of the time-varying conditional correlations and contagion effects among global real estate markets. A distinguishing feature of the proposed model is that it can simultaneously capture the spatial interactions and the dynamic conditional correlations compared with the traditional MGARCH models. Results reveal that the estimated dynamic conditional correlations have exhibited significant increases during the global financial crisis from 2007 to 2009, thereby suggesting contagion effects among global real estate markets. The analysis further indicates that the returns of the regional real estate markets that are in close geographic and economic proximities exhibit strong co-movement. In addition, evidence of significantly positive leverage effects in global real estate markets is also determined. The findings have significant implications on global portfolio diversification opportunities and risk management practices.
Spectral compression algorithms for the analysis of very large multivariate images
Keenan, Michael R.
2007-10-16
A method for spectrally compressing data sets enables the efficient analysis of very large multivariate images. The spectral compression algorithm uses a factored representation of the data that can be obtained from Principal Components Analysis or other factorization technique. Furthermore, a block algorithm can be used for performing common operations more efficiently. An image analysis can be performed on the factored representation of the data, using only the most significant factors. The spectral compression algorithm can be combined with a spatial compression algorithm to provide further computational efficiencies.
NASA Astrophysics Data System (ADS)
Feng, Shangyuan; Lin, Juqiang; Huang, Zufang; Chen, Guannan; Chen, Weisheng; Wang, Yue; Chen, Rong; Zeng, Haishan
2013-01-01
The capability of using silver nanoparticle based near-infrared surface enhanced Raman scattering (SERS) spectroscopy combined with principal component analysis (PCA) and linear discriminant analysis (LDA) to differentiate esophageal cancer tissue from normal tissue was presented. Significant differences in Raman intensities of prominent SERS bands were observed between normal and cancer tissues. PCA-LDA multivariate analysis of the measured tissue SERS spectra achieved diagnostic sensitivity of 90.9% and specificity of 97.8%. This exploratory study demonstrated great potential for developing label-free tissue SERS analysis into a clinical tool for esophageal cancer detection.
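For readers less familiar with the diagnostic figures quoted here, the sketch below shows how sensitivity and specificity are computed from true and predicted class labels. The labels are made up for illustration and are not the study's data; the function name is hypothetical.

```python
def sensitivity_specificity(true_labels, predicted_labels, positive="cancer"):
    """Return (sensitivity, specificity) for a two-class problem.

    sensitivity = TP / (TP + FN): fraction of positive cases that were detected
    specificity = TN / (TN + FP): fraction of negative cases correctly cleared
    """
    tp = fn = tn = fp = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == positive:
            if predicted == positive:
                tp += 1
            else:
                fn += 1
        else:
            if predicted == positive:
                fp += 1
            else:
                tn += 1
    return tp / (tp + fn), tn / (tn + fp)

# Made-up labels: 4 cancer and 5 normal tissue samples.
truth = ["cancer"] * 4 + ["normal"] * 5
predicted = ["cancer", "cancer", "cancer", "normal",
             "normal", "normal", "normal", "normal", "cancer"]
print(sensitivity_specificity(truth, predicted))
```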
New multivariable capabilities of the INCA program
NASA Technical Reports Server (NTRS)
Bauer, Frank H.; Downing, John P.; Thorpe, Christopher J.
1989-01-01
The INteractive Controls Analysis (INCA) program was developed at NASA's Goddard Space Flight Center to provide a user-friendly, efficient environment for the design and analysis of control systems, specifically spacecraft control systems. Since its inception, INCA has found extensive use in the design, development, and analysis of control systems for spacecraft, instruments, robotics, and pointing systems. The INCA program was initially developed as a comprehensive classical design analysis tool for small and large order control systems. The latest version of INCA, expected to be released in February of 1990, was expanded to include the capability to perform multivariable controls analysis and design.
Bathke, Arne C.; Friedrich, Sarah; Pauly, Markus; Konietschke, Frank; Staffen, Wolfgang; Strobl, Nicolas; Höller, Yvonne
2018-01-01
ABSTRACT To date, there is a lack of satisfactory inferential techniques for the analysis of multivariate data in factorial designs, when only minimal assumptions on the data can be made. Presently available methods are limited to very particular study designs or assume either multivariate normality or equal covariance matrices across groups, or they do not allow for an assessment of the interaction effects across within-subjects and between-subjects variables. We propose and methodologically validate a parametric bootstrap approach that does not suffer from any of the above limitations, and thus provides a rather general and comprehensive methodological route to inference for multivariate and repeated measures data. As an example application, we consider data from two different Alzheimer’s disease (AD) examination modalities that may be used for precise and early diagnosis, namely, single-photon emission computed tomography (SPECT) and electroencephalogram (EEG). These data violate the assumptions of classical multivariate methods, and indeed classical methods would not have yielded the same conclusions with regards to some of the factors involved. PMID:29565679
Development of multivariate exposure and fatal accident involvement rates for 1977
DOT National Transportation Integrated Search
1985-10-01
The need for multivariate accident involvement rates is often encountered in accident analysis. The FARS (Fatal Accident Reporting System) files contain records of fatal involvements characterized by many variables while NPTS (National Personal T...
Bayesian multivariate hierarchical transformation models for ROC analysis.
O'Malley, A James; Zou, Kelly H
2006-02-15
A Bayesian multivariate hierarchical transformation model (BMHTM) is developed for receiver operating characteristic (ROC) curve analysis based on clustered continuous diagnostic outcome data with covariates. Two special features of this model are that it incorporates non-linear monotone transformations of the outcomes and that multiple correlated outcomes may be analysed. The mean, variance, and transformation components are all modelled parametrically, enabling a wide range of inferences. The general framework is illustrated by focusing on two problems: (1) analysis of the diagnostic accuracy of a covariate-dependent univariate test outcome requiring a Box-Cox transformation within each cluster to map the test outcomes to a common family of distributions; (2) development of an optimal composite diagnostic test using multivariate clustered outcome data. In the second problem, the composite test is estimated using discriminant function analysis and compared to the test derived from logistic regression analysis where the gold standard is a binary outcome. The proposed methodology is illustrated on prostate cancer biopsy data from a multi-centre clinical trial.
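Two ingredients named in this abstract, the Box-Cox transformation and the ROC curve, can be illustrated with a short sketch. It is not the Bayesian hierarchical model: it only computes the empirical area under the ROC curve (the proportion of diseased/healthy pairs ranked correctly, ties counted as one half) and shows that a strictly increasing transformation such as Box-Cox leaves that area unchanged. Function names and test outcomes are hypothetical.

```python
import math

def box_cox(y, lam):
    """Box-Cox transform of a positive value: (y**lam - 1) / lam, or log(y) if lam == 0."""
    if y <= 0:
        raise ValueError("Box-Cox requires positive values")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam

def empirical_auc(diseased, healthy):
    """Empirical AUC: share of (diseased, healthy) pairs with the diseased value higher."""
    score = 0.0
    for d in diseased:
        for h in healthy:
            if d > h:
                score += 1.0
            elif d == h:
                score += 0.5
    return score / (len(diseased) * len(healthy))

# Hypothetical positive test outcomes for the two groups.
diseased = [3.0, 4.0, 5.0]
healthy = [1.0, 2.0, 3.0]
raw_auc = empirical_auc(diseased, healthy)
transformed_auc = empirical_auc([box_cox(v, 0.5) for v in diseased],
                                [box_cox(v, 0.5) for v in healthy])
print(raw_auc, transformed_auc)
```

Because the empirical AUC depends only on the ordering of outcomes, the transformation matters for the parametric modelling of means and variances rather than for the rank-based accuracy itself.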
Bayesian multivariate hierarchical transformation models for ROC analysis
O'Malley, A. James; Zou, Kelly H.
2006-01-01
SUMMARY A Bayesian multivariate hierarchical transformation model (BMHTM) is developed for receiver operating characteristic (ROC) curve analysis based on clustered continuous diagnostic outcome data with covariates. Two special features of this model are that it incorporates non-linear monotone transformations of the outcomes and that multiple correlated outcomes may be analysed. The mean, variance, and transformation components are all modelled parametrically, enabling a wide range of inferences. The general framework is illustrated by focusing on two problems: (1) analysis of the diagnostic accuracy of a covariate-dependent univariate test outcome requiring a Box–Cox transformation within each cluster to map the test outcomes to a common family of distributions; (2) development of an optimal composite diagnostic test using multivariate clustered outcome data. In the second problem, the composite test is estimated using discriminant function analysis and compared to the test derived from logistic regression analysis where the gold standard is a binary outcome. The proposed methodology is illustrated on prostate cancer biopsy data from a multi-centre clinical trial. PMID:16217836
DOE Office of Scientific and Technical Information (OSTI.GOV)
Beppler, Christina L
2015-12-01
A new approach was created for studying energetic material degradation. This approach used liquid chromatography-mass spectrometry (LC-MS) with multivariate statistical data analysis to detect and tentatively identify the non-volatile chemical species that form as the CL-20 energetic material thermally degrades. Multivariate data analysis showed clear separation and clustering of samples based on sample group: either pristine or aged material. Further analysis showed counter-clockwise trends in the scores plots from principal components analysis (PCA), a type of multivariate data analysis. These trends may indicate that there was a discrete shift in the chemical markers as the material went from pristine to aged, and then again when the aged CL-20 mixed with a potentially incompatible material was thermally aged for 4, 6, or 9 months. This new approach to studying energetic material degradation should provide greater knowledge of potential degradation markers in these materials.
Risk Factors for Central Serous Chorioretinopathy: Multivariate Approach in a Case-Control Study.
Chatziralli, Irini; Kabanarou, Stamatina A; Parikakis, Efstratios; Chatzirallis, Alexandros; Xirou, Tina; Mitropoulos, Panagiotis
2017-07-01
The purpose of this prospective study was to investigate the potential risk factors associated independently with central serous retinopathy (CSR) in a Greek population, using a multivariate approach. Participants in the study were 183 consecutive patients diagnosed with CSR and 183 controls, matched for age. All participants underwent complete ophthalmological examination, and information regarding their sociodemographic, clinical, medical and ophthalmological history was recorded, so as to assess potential risk factors for CSR. Univariate and multivariate analyses were performed. Univariate analysis showed that male sex, high educational status, high income, alcohol consumption, smoking, hypertension, coronary heart disease, obstructive sleep apnea, autoimmune disorders, H. pylori infection, type A personality and stress, steroid use, pregnancy and hyperopia were associated with CSR, while myopia was found to protect from CSR. In multivariate analysis, alcohol consumption, hypertension, coronary heart disease and autoimmune disorders lost their significance, while the remaining factors were all independently associated with CSR. It is important to take into account the various risk factors for CSR, so as to define vulnerable groups and to shed light on the pathogenesis of the disease.
Yang, James J; Williams, L Keoki; Buu, Anne
2017-08-24
A multivariate genome-wide association test is proposed for analyzing data on multivariate quantitative phenotypes collected from related subjects. The proposed method is a two-step approach. The first step models the association between the genotype and marginal phenotype using a linear mixed model. The second step uses the correlation between residuals of the linear mixed model to estimate the null distribution of the Fisher combination test statistic. The simulation results show that the proposed method controls the type I error rate and is more powerful than the marginal tests across different population structures (admixed or non-admixed) and relatedness (related or independent). The statistical analysis on the database of the Study of Addiction: Genetics and Environment (SAGE) demonstrates that applying the multivariate association test may facilitate identification of the pleiotropic genes contributing to the risk for alcohol dependence commonly expressed by four correlated phenotypes. This study proposes a multivariate method for identifying pleiotropic genes while adjusting for cryptic relatedness and population structure between subjects. The two-step approach is not only powerful but also computationally efficient even when the number of subjects and the number of phenotypes are both very large.
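The second step described above, calibrating a Fisher combination statistic when the per-phenotype tests are correlated, can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure. The null distribution is simulated from equicorrelated standard normal test statistics with a hypothetical correlation `rho`, whereas the paper estimates the dependence from linear mixed model residuals. The p-values and all function names are made up for the example.

```python
import math
import random


def fisher_statistic(p_values):
    """Fisher combination statistic: -2 * sum(log p_k)."""
    return -2.0 * sum(math.log(p) for p in p_values)


def two_sided_p(z):
    """Two-sided p-value of a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def simulate_null(n_traits, rho, n_draws, seed=0):
    """Null draws of the Fisher statistic for equicorrelated z-scores.

    Assumption: every pair of z-scores has correlation rho, produced by a
    shared factor; this stands in for a residual correlation matrix.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        shared = rng.gauss(0.0, 1.0)
        zs = [
            math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
            for _ in range(n_traits)
        ]
        draws.append(fisher_statistic([two_sided_p(z) for z in zs]))
    return draws


def empirical_p(observed, null_draws):
    """Monte Carlo p-value with the usual +1 correction."""
    exceed = sum(1 for t in null_draws if t >= observed)
    return (exceed + 1) / (len(null_draws) + 1)


# Hypothetical marginal p-values for four correlated phenotypes.
observed = fisher_statistic([0.01, 0.03, 0.20, 0.04])
null_independent = simulate_null(n_traits=4, rho=0.0, n_draws=20000)
null_correlated = simulate_null(n_traits=4, rho=0.6, n_draws=20000)
p_independent = empirical_p(observed, null_independent)
p_correlated = empirical_p(observed, null_correlated)
print(p_independent, p_correlated)
```

With independent tests the statistic follows a chi-square distribution with 2K degrees of freedom. Positive correlation widens the null distribution, so treating correlated phenotypes as independent would overstate significance.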
Higher-order Multivariable Polynomial Regression to Estimate Human Affective States
NASA Astrophysics Data System (ADS)
Wei, Jie; Chen, Tong; Liu, Guangyuan; Yang, Jiemin
2016-03-01
Over the past decade, computational models such as multivariate linear-regression analysis, support vector regression, and artificial neural networks have been proposed for estimating human affective states from direct observations and from facial, vocal, gestural, physiological, and central nervous signals. Among these models, linear models generally lack precision because they ignore the intrinsic nonlinearities of complex psychophysiological processes, and nonlinear models commonly adopt complicated algorithms. To improve accuracy and simplify the model, we introduce a new computational modeling method, higher-order multivariable polynomial regression, to estimate human affective states. The study employs standardized pictures in the International Affective Picture System to induce thirty subjects’ affective states, and obtains pure affective patterns of skin conductance as input variables to the higher-order multivariable polynomial model for predicting affective valence and arousal. Experimental results show that our method obtains correlation coefficients of 0.98 and 0.96 for estimation of affective valence and arousal, respectively. Moreover, the method may provide indirect evidence that valence and arousal originate in the brain’s motivational circuits. Thus, the proposed method can serve as a novel approach for efficiently estimating human affective states.
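What "higher-order multivariable polynomial" means in practice can be shown with a small sketch of the feature expansion only. It assumes the model is linear in all monomials of the inputs up to a chosen degree, including cross terms, which would then be fitted by ordinary least squares. The abstract does not give the authors' exact term set or fitting procedure, so the input values and the function name `polynomial_terms` are hypothetical.

```python
from itertools import combinations_with_replacement
from math import comb, prod


def polynomial_terms(x, degree):
    """All monomials of the inputs x up to the given total degree.

    Terms are ordered by degree; the degree-0 term is the constant 1.
    """
    terms = []
    for d in range(degree + 1):
        for combo in combinations_with_replacement(range(len(x)), d):
            terms.append(prod(x[i] for i in combo))
    return terms


# Two hypothetical skin-conductance features for one trial.
features = [2.0, 3.0]
expanded = polynomial_terms(features, degree=2)
print(expanded)  # [1, 2.0, 3.0, 4.0, 6.0, 9.0]

# The number of terms grows as C(n + d, d) for n inputs and degree d.
print(len(polynomial_terms(features, degree=3)), comb(2 + 3, 3))  # 10 10
```

Because the model stays linear in its coefficients, the nonlinearity comes entirely from the expanded inputs. The number of terms grows quickly with degree, so the choice of order trades flexibility against overfitting.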
Higher-order Multivariable Polynomial Regression to Estimate Human Affective States
Wei, Jie; Chen, Tong; Liu, Guangyuan; Yang, Jiemin
2016-01-01
Over the past decade, computational models such as multivariate linear-regression analysis, support vector regression, and artificial neural networks have been proposed for estimating human affective states from direct observations and from facial, vocal, gestural, physiological, and central nervous signals. Among these models, linear models generally lack precision because they ignore the intrinsic nonlinearities of complex psychophysiological processes, and nonlinear models commonly adopt complicated algorithms. To improve accuracy and simplify the model, we introduce a new computational modeling method, higher-order multivariable polynomial regression, to estimate human affective states. The study employs standardized pictures in the International Affective Picture System to induce thirty subjects’ affective states, and obtains pure affective patterns of skin conductance as input variables to the higher-order multivariable polynomial model for predicting affective valence and arousal. Experimental results show that our method obtains correlation coefficients of 0.98 and 0.96 for estimation of affective valence and arousal, respectively. Moreover, the method may provide indirect evidence that valence and arousal originate in the brain’s motivational circuits. Thus, the proposed method can serve as a novel approach for efficiently estimating human affective states. PMID:26996254
ERIC Educational Resources Information Center
Keegan, John; Ditchman, Nicole; Dutta, Alo; Chiu, Chung-Yi; Muller, Veronica; Chan, Fong; Kundu, Madan
2016-01-01
Purpose: To apply the constructs of social cognitive theory (SCT) and the theory of planned behavior (TPB) to understand the stages of change (SOC) for physical activities among individuals with a spinal cord injury (SCI). Method: Ex post facto design using multivariate analysis of variance (MANOVA). The participants were 144 individuals with SCI…
ERIC Educational Resources Information Center
Pezzolo, Alessandra De Lorenzi
2011-01-01
The diffuse reflectance infrared Fourier transform (DRIFT) spectra of sand samples exhibit features reflecting their composition. Basic multivariate analysis (MVA) can be used to effectively sort subsets of homogeneous specimens collected from nearby locations, as well as pointing out similarities in composition among sands of different origins.…
Oosterhof, Nikolaas N; Wiggett, Alison J; Cross, Emily S
2014-04-01
Cook et al. overstate the evidence supporting their associative account of mirror neurons in humans: most studies do not address a key property, action-specificity that generalizes across the visual and motor domains. Multivariate pattern analysis (MVPA) of neuroimaging data can address this concern, and we illustrate how MVPA can be used to test key predictions of their account.
Multivariate Quantitative Chemical Analysis
NASA Technical Reports Server (NTRS)
Kinchen, David G.; Capezza, Mary
1995-01-01
Technique of multivariate quantitative chemical analysis devised for use in determining relative proportions of two components mixed and sprayed together onto object to form thermally insulating foam. Potentially adaptable to other materials, especially in process-monitoring applications in which necessary to know and control critical properties of products via quantitative chemical analyses of products. In addition to chemical composition, also used to determine such physical properties as densities and strengths.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wong, Jonathan; Xu, Beibei; Moores Cancer Center, University of California San Diego, La Jolla, California
Purpose/Objective: Palliative radiation therapy represents an important treatment option among patients with advanced cancer, although research shows decreased use among older patients. This study evaluated age-related patterns of palliative radiation use among an elderly Medicare population. Methods and Materials: We identified 63,221 patients with metastatic lung, breast, prostate, or colorectal cancer diagnosed between 2000 and 2007 from the Surveillance, Epidemiology, and End Results (SEER)-Medicare linked database. Receipt of palliative radiation therapy was extracted from Medicare claims. Multivariate Poisson regression analysis determined residual age-related disparity in the receipt of palliative radiation therapy after controlling for confounding covariates including age-related differences in patient and demographic covariates, length of life, and patient preferences for aggressive cancer therapy. Results: The use of radiation decreased steadily with increasing patient age. Forty-two percent of patients aged 66 to 69 received palliative radiation therapy. Rates of palliative radiation decreased to 38%, 32%, 24%, and 14% among patients aged 70 to 74, 75 to 79, 80 to 84, and over 85, respectively. Multivariate analysis found that confounding covariates attenuated these findings, although the decreased relative rate of palliative radiation therapy among the elderly remained clinically and statistically significant. On multivariate analysis, compared to patients 66 to 69 years old, those aged 70 to 74, 75 to 79, 80 to 84, and over 85 had a 7%, 15%, 25%, and 44% decreased rate of receiving palliative radiation, respectively (all P<.0001). Conclusions: Age disparity with palliative radiation therapy exists among older cancer patients. Further research should strive to identify barriers to palliative radiation among the elderly, and extra effort should be made to give older patients the opportunity to receive this quality of life-enhancing treatment at the end of life.
Parastar, Hadi; Akvan, Nadia
2014-03-13
In the present contribution, a new combination of multivariate curve resolution-correlation optimized warping (MCR-COW) with trilinear parallel factor analysis (PARAFAC) is developed to exploit second-order advantage in complex chromatographic measurements. In MCR-COW, the complexity of the chromatographic data is reduced by arranging the data in a column-wise augmented matrix, analyzing using MCR bilinear model and aligning the resolved elution profiles using COW in a component-wise manner. The aligned chromatographic data is then decomposed using trilinear model of PARAFAC in order to exploit pure chromatographic and spectroscopic information. The performance of this strategy is evaluated using simulated and real high-performance liquid chromatography-diode array detection (HPLC-DAD) datasets. The obtained results showed that the MCR-COW can efficiently correct elution time shifts of target compounds that are completely overlapped by coeluted interferences in complex chromatographic data. In addition, the PARAFAC analysis of aligned chromatographic data has the advantage of unique decomposition of overlapped chromatographic peaks to identify and quantify the target compounds in the presence of interferences. Finally, to confirm the reliability of the proposed strategy, the performance of the MCR-COW-PARAFAC is compared with the frequently used methods of PARAFAC, COW-PARAFAC, multivariate curve resolution-alternating least squares (MCR-ALS), and MCR-COW-MCR. In general, in most of the cases the MCR-COW-PARAFAC showed an improvement in terms of lack of fit (LOF), relative error (RE) and spectral correlation coefficients in comparison to the PARAFAC, COW-PARAFAC, MCR-ALS and MCR-COW-MCR results. Copyright © 2014 Elsevier B.V. All rights reserved.
Multivariate image analysis of laser-induced photothermal imaging used for detection of caries tooth
NASA Astrophysics Data System (ADS)
El-Sherif, Ashraf F.; Abdel Aziz, Wessam M.; El-Sharkawy, Yasser H.
2010-08-01
Time-resolved photothermal imaging has been investigated to characterize teeth for the purpose of discriminating between normal and carious areas of the hard tissue using a thermal camera. Ultrasonic thermoelastic waves were generated in hard tissue by the absorption of fiber-coupled Q-switched Nd:YAG laser pulses operating at 1064 nm, in conjunction with a laser-induced photothermal technique used to detect the thermal radiation waves for diagnosis of human teeth. The concepts behind the use of photothermal techniques for off-line detection of carious tooth features were presented by our group in earlier work. This paper illustrates the application of multivariate image analysis (MIA) techniques to detect the presence of caries. MIA is used to rapidly detect the presence and quantity of common caries features as the teeth are scanned by high-resolution color (RGB) thermal cameras. Multivariate principal component analysis is used to decompose the acquired three-channel tooth images into a two-dimensional principal component (PC) space. Masking score point clusters in the score space and highlighting the corresponding pixels in the image space of the two dominant PCs enables isolation of caries defect pixels based on contrast and color information. The technique provides a qualitative result that can be used for early-stage caries detection. The proposed technique can potentially be used on-line or in real time to prescreen for the existence of caries through vision-based systems such as a real-time thermal camera. Experimental results on a large number of extracted teeth, as well as on a thermal image panorama of a human volunteer's teeth, are investigated and presented.
Time to antibiotics and outcomes in cancer patients with febrile neutropenia
2014-01-01
Background Febrile neutropenia is an oncologic emergency. Delays in antibiotic administration in patients with febrile neutropenia may result in adverse outcomes. Our study aims to determine time-to-antibiotic administration in patients with febrile neutropenia, and its relationship with length of hospital stay, intensive care unit monitoring, and hospital mortality. Methods The study population comprised adult cancer patients with febrile neutropenia who were hospitalized at a tertiary care hospital between January 2010 and December 2011. Using the Multinational Association of Supportive Care in Cancer (MASCC) risk score, the study cohort was divided into high and low risk groups. A multivariate regression analysis was performed to assess the relationship between time-to-antibiotic administration and various outcome variables. Results One hundred and five eligible patients with a median age of 60 years (range: 18–89) and M:F of 43:62 were identified. Thirty-seven (35%) patients were in the MASCC high risk group. Median time-to-antibiotic administration was 2.5 hrs (range: 0.03-50) and median length of hospital stay was 6 days (range: 1–57). In the multivariate analysis, time-to-antibiotic administration (regression coefficient [RC]: 0.31 days [95% CI: 0.13-0.48]), known source of fever (RC: 4.1 days [95% CI: 0.76-7.5]), and MASCC high risk group (RC: 4 days [95% CI: 1.1-7.0]) were significantly correlated with longer hospital stay. Of 105 patients, 5 (4.7%) died and/or required ICU monitoring. In multivariate analysis, no variables significantly correlated with mortality or ICU monitoring. Conclusions Our study revealed that delay in antibiotic administration is associated with a longer hospital stay. PMID:24716604
Khurrum, Huma; Bedaiwi, Khalid M.; AlBalahi, Naif Meshael
2017-01-01
Background The Koebner phenomenon (KP) is a common entity observed in dermatological disorders. The reported incidence of KP in vitiligo varies widely. Although KP is frequently observed in patients with vitiligo, the factors associated with KP have not yet been established. Objective The aim is to estimate the prevalence of KP in vitiligo patients and to investigate the factors associated with KP among vitiligo characteristics. Methods A cross-sectional observational study was conducted using 381 vitiligo patients. Demographic and clinical information was obtained via the completion of Vitiligo European Task Force (VETF) questionnaires. Patients with a positive history of KP were extracted from this vitiligo database. Multivariate analysis was performed to assess associations with KP. Results The median age of cases was 24 years (range, 0.6~76). In total, 237 of the patients were male (62.2%). Vitiligo vulgaris was the most common type observed (152/381, 39.9%). Seventy-two percent (274/381) of patients did not exhibit KP, whereas 28.1% (107/381) of patients exhibited this condition. Multivariable analysis showed the following to be independently associated with KP in patients with vitiligo: progressive disease (odds ratio [OR], 1.82; 95% confidence interval [95% CI], 1.17~2.92; p=0.041), disease duration longer than 5 years (OR, 1.92; 95% CI, 1.22~2.11; p=0.003), and body surface area more than 2% (OR, 2.20; 95% CI, 1.26~3.24; p<0.001). Conclusion Our results suggest that KP may be used to evaluate disease activity and investigate different associations between the clinical profile and course of vitiligo. Further studies are needed to predict the relationship between KP and responsiveness to therapy. PMID:28566906
NASA Astrophysics Data System (ADS)
Chen, Long; Wang, Yue; Liu, Nenrong; Lin, Duo; Weng, Cuncheng; Zhang, Jixue; Zhu, Lihuan; Chen, Weisheng; Chen, Rong; Feng, Shangyuan
2013-06-01
The diagnostic capability of using tissue intrinsic micro-Raman signals to obtain biochemical information from human esophageal tissue is presented in this paper. Near-infrared micro-Raman spectroscopy combined with multivariate analysis was applied for discrimination of esophageal cancer tissue from normal tissue samples. Micro-Raman spectroscopy measurements were performed on 54 esophageal cancer tissues and 55 normal tissues in the 400-1750 cm⁻¹ range. The mean Raman spectra showed significant differences between the two groups. Tentative assignments of the Raman bands in the measured tissue spectra suggested some changes in protein structure, a decrease in the relative amount of lactose, and increases in the percentages of tryptophan, collagen and phenylalanine content in esophageal cancer tissue as compared to those of a normal subject. The diagnostic algorithms based on principal component analysis (PCA) and linear discriminant analysis (LDA) achieved a diagnostic sensitivity of 87.0% and specificity of 70.9% for separating cancer from normal esophageal tissue samples. The result demonstrated that near-infrared micro-Raman spectroscopy combined with PCA-LDA analysis could be an effective and sensitive tool for identification of esophageal cancer.
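The reported sensitivity and specificity can be related to the sample sizes with a short worked calculation. The counts of correctly classified samples used below (47 of 54 cancer tissues, 39 of 55 normal tissues) are not stated in the abstract. They are inferred here as the only whole numbers that reproduce the reported 87.0% and 70.9%, and the overall accuracy is a derived figure, not a reported one.

```python
def sensitivity(true_pos, false_neg):
    """Share of diseased samples that the classifier flags as diseased."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of normal samples that the classifier labels as normal."""
    return true_neg / (true_neg + false_pos)


cancer_total, normal_total = 54, 55   # sample sizes from the abstract
true_pos, true_neg = 47, 39           # inferred counts (assumption)

sens = sensitivity(true_pos, cancer_total - true_pos)
spec = specificity(true_neg, normal_total - true_neg)
accuracy = (true_pos + true_neg) / (cancer_total + normal_total)

print(round(100 * sens, 1))      # 87.0
print(round(100 * spec, 1))      # 70.9
print(round(100 * accuracy, 1))  # 78.9
```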
Multivariate statistical analysis of low-voltage EDS spectrum images
DOE Office of Scientific and Technical Information (OSTI.GOV)
Anderson, I.M.
1998-03-01
Whereas energy-dispersive X-ray spectrometry (EDS) has been used for compositional analysis in the scanning electron microscope for 30 years, the benefits of using low operating voltages for such analyses have been explored only during the last few years. This paper couples low-voltage EDS with two other emerging areas of characterization: spectrum imaging and multivariate statistical analysis. The specimen analyzed for this study was a finished Intel Pentium processor, with the polyimide protective coating stripped off to expose the final active layers.
NASA Astrophysics Data System (ADS)
Cannon, Alex J.
2018-01-01
Most bias correction algorithms used in climatology, for example quantile mapping, are applied to univariate time series. They neglect the dependence between different variables. Those that are multivariate often correct only limited measures of joint dependence, such as Pearson or Spearman rank correlation. Here, an image processing technique designed to transfer colour information from one image to another—the N-dimensional probability density function transform—is adapted for use as a multivariate bias correction algorithm (MBCn) for climate model projections/predictions of multiple climate variables. MBCn is a multivariate generalization of quantile mapping that transfers all aspects of an observed continuous multivariate distribution to the corresponding multivariate distribution of variables from a climate model. When applied to climate model projections, changes in quantiles of each variable between the historical and projection period are also preserved. The MBCn algorithm is demonstrated on three case studies. First, the method is applied to an image processing example with characteristics that mimic a climate projection problem. Second, MBCn is used to correct a suite of 3-hourly surface meteorological variables from the Canadian Centre for Climate Modelling and Analysis Regional Climate Model (CanRCM4) across a North American domain. Components of the Canadian Forest Fire Weather Index (FWI) System, a complicated set of multivariate indices that characterizes the risk of wildfire, are then calculated and verified against observed values. Third, MBCn is used to correct biases in the spatial dependence structure of CanRCM4 precipitation fields. Results are compared against a univariate quantile mapping algorithm, which neglects the dependence between variables, and two multivariate bias correction algorithms, each of which corrects a different form of inter-variable correlation structure. 
MBCn outperforms these alternatives, often by a large margin, particularly for annual maxima of the FWI distribution and spatiotemporal autocorrelation of precipitation fields.
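For orientation, the univariate quantile mapping that MBCn generalizes can be sketched in a few lines. This is the single-variable baseline, not the MBCn algorithm: each model value is replaced by the observed value at the same empirical quantile, so dependence between variables is untouched. The sample values and the function name `quantile_map` are hypothetical.

```python
from bisect import bisect_right


def quantile_map(value, model_hist, observed):
    """Map a model value to the observed value at the same empirical quantile."""
    model_sorted = sorted(model_hist)
    obs_sorted = sorted(observed)
    # Rank of the value within the historical model sample (0..n).
    rank = bisect_right(model_sorted, value)
    # Matching position in the observed sample, using integer ceiling division.
    n_model, n_obs = len(model_sorted), len(obs_sorted)
    idx = (rank * n_obs + n_model - 1) // n_model - 1
    idx = max(0, min(n_obs - 1, idx))
    return obs_sorted[idx]


# Hypothetical temperatures: the model runs a constant 2 degrees too warm.
observed = [1.0, 3.0, 2.0, 5.0, 4.0]
model_hist = [v + 2.0 for v in observed]

corrected = [quantile_map(v, model_hist, observed) for v in model_hist]
print(corrected)  # [1.0, 3.0, 2.0, 5.0, 4.0]
```

Applying this variable by variable corrects each marginal distribution but cannot repair a wrong correlation between, say, temperature and humidity, which is the gap the multivariate approach addresses.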
NASA Astrophysics Data System (ADS)
Teye, Ernest; Huang, Xingyi; Dai, Huang; Chen, Quansheng
2013-10-01
A quick, accurate and reliable technique for discriminating cocoa beans according to geographical origin is essential for quality control and traceability management. This study presents the application of near-infrared (NIR) spectroscopy and multivariate classification to the differentiation of Ghana cocoa beans. A total of 194 cocoa bean samples from seven cocoa growing regions were used. Principal component analysis (PCA) was used to extract relevant information from the spectral data and this gave visible cluster trends. The performance of four multivariate classification methods was compared: linear discriminant analysis (LDA), K-nearest neighbors (KNN), back propagation artificial neural network (BPANN) and support vector machine (SVM). The performances of the models were optimized by cross validation. The results revealed that the SVM model was superior to all the other methods, with a discrimination rate of 100% in both the training and prediction sets after preprocessing with mean centering (MC). BPANN had a discrimination rate of 99.23% for the training set and 96.88% for the prediction set, while the LDA model had 96.15% and 90.63% for the training and prediction sets, respectively. The KNN model had 75.01% for the training set and 72.31% for the prediction set. The non-linear classification methods used were superior to the linear ones. Generally, the results revealed that NIR spectroscopy coupled with an SVM model could be used successfully to discriminate cocoa beans according to their geographical origins for effective quality assurance.
Enhancing e-waste estimates: improving data quality by multivariate Input-Output Analysis.
Wang, Feng; Huisman, Jaco; Stevels, Ab; Baldé, Cornelis Peter
2013-11-01
Waste electrical and electronic equipment (or e-waste) is one of the fastest growing waste streams, which encompasses a wide and increasing spectrum of products. Accurate estimation of e-waste generation is difficult, mainly due to a lack of high-quality data on market and socio-economic dynamics. This paper addresses how to enhance e-waste estimates by providing techniques to increase data quality. An advanced, flexible and multivariate Input-Output Analysis (IOA) method is proposed. It links all three pillars in IOA (product sales, stock and lifespan profiles) to construct mathematical relationships between various data points. By applying this method, the data consolidation steps can generate more accurate time-series datasets from the available data pool. This can consequently increase the reliability of e-waste estimates compared to the approach without data processing. A case study in the Netherlands is used to apply the advanced IOA model. As a result, for the first time ever, complete datasets of all three variables for estimating all types of e-waste have been obtained. The result of this study also demonstrates significant disparity between various estimation models, arising from the use of data under different conditions. It shows the importance of applying a multivariate approach and multiple sources to improve data quality for modelling, specifically using appropriate time-varying lifespan parameters. Following the case study, a roadmap with a procedural guideline is provided to enhance e-waste estimation studies. Copyright © 2013 Elsevier Ltd. All rights reserved.
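How sales, stock and lifespan constrain one another can be shown with a minimal sales-lifespan sketch. It assumes a single fixed discard-age distribution, whereas the abstract stresses time-varying lifespan parameters. The sales figures, the lifespan profile and the function names are hypothetical and are not taken from the Netherlands case study.

```python
def waste_generated(sales_by_year, discard_prob_by_age):
    """Units discarded per year: each cohort's sales spread over its lifespan profile."""
    waste = {}
    for sale_year, units in sales_by_year.items():
        for age, prob in enumerate(discard_prob_by_age):
            year = sale_year + age
            waste[year] = waste.get(year, 0.0) + units * prob
    return waste


def stock_at_end_of(year, sales_by_year, waste_by_year):
    """Units still in use: cumulative sales minus cumulative waste."""
    sold = sum(u for y, u in sales_by_year.items() if y <= year)
    discarded = sum(w for y, w in waste_by_year.items() if y <= year)
    return sold - discarded


# Hypothetical inputs: units sold per year and probability of discard at age 0..3.
sales = {2010: 100, 2011: 120}
lifespan_profile = [0.0, 0.2, 0.5, 0.3]

waste = waste_generated(sales, lifespan_profile)
for year in sorted(waste):
    print(year, round(waste[year], 6))

print(round(stock_at_end_of(2012, sales, waste), 6))
```

Because the lifespan profile sums to one, total waste over all years equals total sales. Any two of the three quantities determine the third, which is what allows gaps in one dataset to be checked or filled from the others.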
Brodsky, Burton S; Owens, Gary M; Scotti, Dennis J; Needham, Keith A; Cool, Christina L
2017-10-01
Ovarian cancer is the eighth most common cancer among women, but ranks fifth in cancer-related causes of death, the majority of which are detected in late stages, after the cancer has metastasized. The CA125 test is the standard of care for assessing suspicious pelvic masses. However, the primary use of CA125 is to monitor treatment progress rather than to screen for disease, and its sensitivity is exceedingly low, unlike the multivariate assay OVA1. A cost-effective treatment of ovarian cancer requires early and accurate diagnosis of pelvic masses and reduced referrals of patients with benign tumors to a gynecologic oncologist. To analyze the economic impact of increased utilization of a multivariate assay, such as OVA1, to guide the treatment of ovarian cancer. The study population was drawn from Medicare and commercial health plan claims data. A budget impact model was constructed to estimate the economic consequences of substituting the multivariate assay OVA1 to replace the single biomarker assay CA125 to assess the likelihood of pelvic mass malignancy in premenopausal and/or postmenopausal women. All patients selected for the analysis had CA125 testing before surgical intervention. A total of 92,843 health plan members were included for analysis, comprising 48,113 commercially insured members and 44,730 Medicare beneficiaries. Estimates of future health plan expenditures, which were calculated from base-case assumptions, projected overall savings of $0.05 per-member per-month (PMPM) for commercially insured members and $0.01 PMPM for Medicare beneficiaries as a result of increased utilization of OVA1. Sensitivity analysis revealed potential savings of up to $0.17 PMPM for commercially insured patients and up to $0.05 for Medicare beneficiaries. 
The results of the budget impact model support the use of OVA1 instead of CA125 by indicating that modest cost-savings can be achieved, while reaping the clinical benefits of improved diagnostic accuracy, early disease detection, and reductions in multiple, and possibly unnecessary, referrals to gynecologic oncologists.
Spatial assessment of air quality patterns in Malaysia using multivariate analysis
NASA Astrophysics Data System (ADS)
Dominick, Doreena; Juahir, Hafizan; Latif, Mohd Talib; Zain, Sharifuddin M.; Aris, Ahmad Zaharin
2012-12-01
This study aims to investigate possible sources of air pollutants and the spatial patterns within the eight selected Malaysian air monitoring stations based on a two-year database (2008-2009). Multivariate analysis was applied to the dataset. It incorporated Hierarchical Agglomerative Cluster Analysis (HACA) to assess the spatial patterns, Principal Component Analysis (PCA) to determine the major sources of the air pollution and Multiple Linear Regression (MLR) to assess the percentage contribution of each air pollutant. The HACA results grouped the eight monitoring stations into three different clusters, based on the characteristics of the air pollutants and meteorological parameters. The PCA analysis showed that the major sources of air pollution were emissions from motor vehicles, aircraft, industries and areas of high population density. The MLR analysis demonstrated that the main pollutant contributing to variability in the Air Pollutant Index (API) at all stations was particulate matter with a diameter of less than 10 μm (PM10). Further MLR analysis showed that the main air pollutant influencing the high concentration of PM10 was carbon monoxide (CO). This was due to combustion processes, particularly originating from motor vehicles. Meteorological factors such as ambient temperature, wind speed and humidity were also noted to influence the concentration of PM10.
Al-Shudifat, Abdul Rahman; Kahlon, Babar; Höglund, Peter; Soliman, Ahmed Y; Lindskog, Kristoffer; Siesjo, Peter
2014-01-01
The aim of the present study was to identify predictive factors for outcome after surgery of vestibular schwannomas. This is a retrospective study with partially collected prospective data of patients who were surgically treated for vestibular schwannomas at a single institution from 1979 to 2000. Patients with recurrent tumours, NF2 and those incapable of answering questionnaires were excluded from the study. The short form 36 (SF36) questionnaire and a specific questionnaire regarding neurological status, work status and independent life (IL) status were sent to all eligible patients. The questionnaires were sent to 430 eligible patients (out of 537) and 395 (93%) responded. Scores for work capacity (WC) and IL were compared with SF36 scores as outcome estimates. Patients were divided into two groups (<64, ≥64-years-old) in order to assess them for either WC or IL. Putative preoperative and postoperative predictive factors were tested in univariate and multivariable regression analysis for the outcome scores of WC, IL and SF36. In the group <64 years, age, gender and tumour diameter were independent predictive factors for postoperative WC in multivariate analysis. A high-risk group was identified in women with age >50 years and tumour diameter >25 mm. In patients ≥64, gender and tumour diameter were significant predictive factors for IL in univariate analysis. Perioperative and postoperative objective factors as length of surgery, blood loss and complications did not predict outcome in the multivariable analysis for any age group. Patients' assessment of change in balance function was the only neurological factor that showed significance both in univariate and multivariable analysis in both age cohorts. While SF36 scores were lower in surgically treated patients in relation to normograms for the general population, they did not correlate significantly to WC and IL. 
The SF36 questionnaire did not correlate with outcome measures such as WC and IL in patients undergoing surgery for vestibular schwannomas. Women and patients above 50 years with larger tumours have a high risk of reduced WC after surgical treatment. These results question the validity of quality of life scores in the assessment of outcome after surgery for benign skull base lesions.
NASA Astrophysics Data System (ADS)
Gourdol, L.; Hissler, C.; Pfister, L.
2012-04-01
The Luxembourg sandstone aquifer is of major relevance for the national supply of drinking water in Luxembourg. The city of Luxembourg (20% of the country's population) gets almost 2/3 of its drinking water from this aquifer. As a consequence, the study of the groundwater hydrochemistry and of its spatial and temporal variations is considered to be of the highest priority. Since 2005, a monitoring network has been implemented by the Water Department of Luxembourg City, with a view to a more sustainable management of this strategic water resource. The data collected to date form a large and complex dataset, describing spatial and temporal variations of many hydrochemical parameters. Data treatment is a central issue for water monitoring programs of this kind and for the complex databases they produce. Standard multivariate statistical techniques, such as principal components analysis and hierarchical cluster analysis, have been widely used as unbiased methods for extracting meaningful information from groundwater quality data and are now classically used in many hydrogeological studies, in particular to characterize temporal or spatial hydrochemical variations induced by natural and anthropogenic factors. But these classical multivariate methods deal with two-way matrices, usually parameters/sites or parameters/time, while the dataset resulting from water quality monitoring programs should often be seen as a parameters/sites/time datacube. Three-way matrices, such as the one we propose here, are difficult to handle and to analyse with classical multivariate statistical tools and thus should be treated with approaches dealing with three-way data structures. One possible approach is partial triadic analysis (PTA). PTA has previously been used with success in many ecological studies but never, to date, in the domain of hydrogeology.
Applied to the dataset of the Luxembourg Sandstone aquifer, PTA appears to be a promising new statistical instrument for hydrogeologists, in particular for characterizing temporal and spatial hydrochemical variations induced by natural and anthropogenic factors. This new approach for groundwater management offers potential for 1) identifying a common multivariate spatial structure, 2) uncovering the different hydrochemical patterns and explaining their controlling factors and 3) analysing the temporal variability of this structure and grasping hydrochemical changes.
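The three-way idea behind PTA can be illustrated with a minimal sketch. The abstract does not give the authors' computational steps, so what follows is only one common formulation (interstructure, compromise, intrastructure) written with NumPy. The function name `partial_triadic_analysis`, the cube layout (dates, sites, parameters), the per-date standardization and the choice to normalize the weights to sum to 1 are all assumptions made for illustration.

```python
import numpy as np

def partial_triadic_analysis(cube, n_axes=2):
    """Hypothetical PTA sketch for a cube of shape (n_dates, n_sites, n_params)."""
    cube = np.asarray(cube, dtype=float)

    # Standardize each parameter within each date table (assumed preprocessing).
    mean = cube.mean(axis=1, keepdims=True)
    std = cube.std(axis=1, keepdims=True)
    std[std == 0] = 1.0
    tables = (cube - mean) / std

    # Interstructure: cosine (vector correlation) between the date tables.
    flat = tables.reshape(tables.shape[0], -1)
    inner = flat @ flat.T
    norms = np.sqrt(np.diag(inner))
    similarity = inner / np.outer(norms, norms)

    # The leading eigenvector gives one weight per date table.
    _, eigvecs = np.linalg.eigh(similarity)  # eigenvalues in ascending order
    first = eigvecs[:, -1]
    weights = first / first.sum()  # normalization convention chosen here

    # Compromise: weighted combination of the tables, shape (n_sites, n_params).
    compromise = np.tensordot(weights, tables, axes=(0, 0))

    # PCA of the compromise via SVD: the common multivariate spatial structure.
    u, s, vt = np.linalg.svd(compromise, full_matrices=False)
    site_scores = u[:, :n_axes] * s[:n_axes]
    loadings = vt[:n_axes].T

    # Intrastructure: project every date table onto the compromise axes.
    trajectories = tables @ loadings  # shape (n_dates, n_sites, n_axes)

    return {
        "weights": weights,
        "compromise": compromise,
        "site_scores": site_scores,
        "loadings": loadings,
        "trajectories": trajectories,
    }

# Synthetic example: 5 dates sharing one spatial structure plus small noise.
rng = np.random.default_rng(0)
base = rng.normal(size=(6, 4))  # 6 sites, 4 hydrochemical parameters
cube = np.stack([base + 0.1 * rng.normal(size=(6, 4)) for _ in range(5)])
result = partial_triadic_analysis(cube)
print(result["weights"])
```

In this formulation the compromise plays the role of the common spatial structure (point 1 above), its loadings describe the hydrochemical patterns (point 2), and the per-date trajectories show how the structure varies over time (point 3). The synthetic data are invented purely to exercise the code and say nothing about the Luxembourg dataset.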
Mariappan, Shanthi; Sekar, Uma; Kamalanathan, Arunagiri
2017-01-01
Background: Carbapenemase-producing Enterobacteriaceae (CPE) have increased in recent years, leading to limitations of treatment options. The present study was undertaken to detect CPE, risk factors for acquiring them and their impact on clinical outcomes. Methods: This retrospective observational study included 111 clinically significant Enterobacteriaceae resistant to cephalosporins subclass III and exhibiting a positive modified Hodge test. Screening for carbapenemase production was done by phenotypic methods, and polymerase chain reaction was performed to detect genes encoding them. The medical records of the patients were reviewed retrospectively to assess risk factors for infections with CPE and their impact. The data collected were duration of hospital stay, Intensive Care Unit (ICU) stay, use of invasive devices, mechanical ventilation, the presence of comorbidities, and antimicrobial therapy. The outcome was followed up. Univariate and multivariate analyses of the data were performed using SPSS software. Results: Carbapenemase-encoding genes were detected in 67 isolates. The genes detected were New Delhi metallo-β-lactamase, Verona integron-encoded metallo-β-lactamase, and oxacillinase-181. Although univariate analysis identified risk factors associated with acquiring CPE infections as ICU stay (P = 0.021), mechanical ventilation (P = 0.013), indwelling device (P = 0.011), diabetes mellitus (P = 0.036), usage of multiple antimicrobial agents (P = 0.007), administration of carbapenems (P = 0.042), presence of focal infection or sepsis (P = 0.013), and surgical interventions (P = 0.016), multivariate analysis revealed that all these factors were insignificant. Mortality rate was 56.7% in patients with CPE infections.
In both univariate and multivariate analyses of the impact of the variables on mortality in these patients, the significant factors were mechanical ventilation (odds ratio [OR]: 0.141, 95% confidence interval [CI]: 0.024–0.812) and presence of an indwelling invasive device (OR: 8.034; 95% CI: 2.060–31.335). Conclusion: In this study, no specific factor was identified as an independent risk for acquisition of CPE infection. However, as is evident from the multivariate analysis, there is an increased risk of mortality in patients with CPE infections when they are ventilated and are supported by indwelling devices. PMID:28251105
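A reported odds ratio and its confidence interval can be cross-checked with a little arithmetic. A Wald-type 95% CI is symmetric on the log scale, so the geometric mean of its bounds should recover the point estimate, and the width of the interval gives the standard error of the log odds ratio. The sketch below assumes the intervals above are Wald intervals from a logistic regression, which the abstract does not state; the function name `log_scale_summary` and the z value are illustrative choices.

```python
import math

def log_scale_summary(lower, upper, z=1.959964):
    """Return (standard error of log OR, geometric midpoint) for a Wald-type CI.

    Assumes the interval was built as exp(log(OR) +/- z * SE).
    """
    log_lower, log_upper = math.log(lower), math.log(upper)
    standard_error = (log_upper - log_lower) / (2 * z)
    midpoint = math.exp((log_lower + log_upper) / 2)
    return standard_error, midpoint

# Intervals as reported in the abstract.
reported = {
    "indwelling invasive device": (8.034, 2.060, 31.335),
    "mechanical ventilation": (0.141, 0.024, 0.812),
}

for factor, (odds_ratio, lower, upper) in reported.items():
    se, midpoint = log_scale_summary(lower, upper)
    print(f"{factor}: reported OR={odds_ratio}, midpoint={midpoint:.3f}, SE(log OR)={se:.3f}")
```

If the midpoint is close to the reported OR, the interval is at least internally consistent with a log-scale Wald construction. The check says nothing about whether the underlying model is appropriate.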
Achana, Felix A; Cooper, Nicola J; Bujkiewicz, Sylwia; Hubbard, Stephanie J; Kendrick, Denise; Jones, David R; Sutton, Alex J
2014-07-21
Network meta-analysis (NMA) enables simultaneous comparison of multiple treatments while preserving randomisation. When summarising evidence to inform an economic evaluation, it is important that the analysis accurately reflects the dependency structure within the data, as correlations between outcomes may have implications for estimating the net benefit associated with treatment. A multivariate NMA offers a framework for evaluating multiple treatments across multiple outcome measures while accounting for the correlation structure between outcomes. The standard NMA model is extended to multiple outcome settings in two stages. In the first stage, information is borrowed across outcomes as well as across studies through modelling the within-study and between-study correlation structure. In the second stage, we make use of the additional assumption that intervention effects are exchangeable between outcomes to predict effect estimates for all outcomes, including effect estimates on outcomes where evidence is either sparse or the treatment had not been considered by any one of the studies included in the analysis. We apply the methods to binary outcome data from a systematic review evaluating the effectiveness of nine home safety interventions on uptake of three poisoning prevention practices (safe storage of medicines, safe storage of other household products, and possession of poison centre control telephone number) in households with children. Analyses are conducted in WinBUGS using Markov Chain Monte Carlo (MCMC) simulations. Univariate and the first stage multivariate models produced broadly similar point estimates of intervention effects but the uncertainty around the multivariate estimates varied depending on the prior distribution specified for the between-study covariance structure.
The second stage multivariate analyses produced more precise effect estimates while enabling intervention effects to be predicted for all outcomes, including intervention effects on outcomes not directly considered by the studies included in the analysis. Accounting for the dependency between outcomes in a multivariate meta-analysis may or may not improve the precision of effect estimates from a network meta-analysis compared to analysing each outcome separately.