The Potential of Multivariate Analysis in Assessing Students' Attitude to Curriculum Subjects
ERIC Educational Resources Information Center
Gaotlhobogwe, Michael; Laugharne, Janet; Durance, Isabelle
2011-01-01
Background: Understanding student attitudes to curriculum subjects is central to providing evidence-based options to policy makers in education. Purpose: We illustrate how quantitative approaches used in the social sciences and based on multivariate analysis (categorical Principal Components Analysis, Clustering Analysis and General Linear…
Multivariate analysis: A statistical approach for computations
NASA Astrophysics Data System (ADS)
Michu, Sachin; Kaushik, Vandana
2014-10-01
Multivariate analysis is a statistical approach commonly used in automotive diagnosis, education, evaluating clusters in finance, etc., and more recently in the health-related professions. The objective of the paper is to provide a detailed exploratory discussion about factor analysis (FA) in image retrieval methods and correlation analysis (CA) of network traffic. Image retrieval methods aim to retrieve relevant images from a collected database, based on their content. The problem is made more difficult by the high dimension of the variable space in which the images are represented. Multivariate correlation analysis provides an anomaly detection and analysis method based on the correlation coefficient matrix. Anomalous behaviors in the network include various attacks on the network, such as DDoS attacks and network scanning.
Multivariate Cluster Analysis.
ERIC Educational Resources Information Center
McRae, Douglas J.
Procedures for grouping students into homogeneous subsets have long interested educational researchers. The research reported in this paper is an investigation of a set of objective grouping procedures based on multivariate analysis considerations. Four multivariate functions that might serve as criteria for adequate grouping are given and…
Chen, Yong; Luo, Sheng; Chu, Haitao; Wei, Peng
2013-05-01
Multivariate meta-analysis is useful in combining evidence from independent studies which involve several comparisons among groups based on a single outcome. For binary outcomes, the commonly used statistical models for multivariate meta-analysis are multivariate generalized linear mixed effects models which assume risks, after some transformation, follow a multivariate normal distribution with possible correlations. In this article, we consider an alternative model for multivariate meta-analysis where the risks are modeled by the multivariate beta distribution proposed by Sarmanov (1966). This model has several attractive features compared to the conventional multivariate generalized linear mixed effects models, including simplicity of the likelihood function, no need to specify a link function, and closed-form expressions of the distribution functions for study-specific risk differences. We investigate the finite sample performance of this model by simulation studies and illustrate its use with an application to multivariate meta-analysis of adverse events of tricyclic antidepressants treatment in clinical trials.
MGAS: a powerful tool for multivariate gene-based genome-wide association analysis.
Van der Sluis, Sophie; Dolan, Conor V; Li, Jiang; Song, Youqiang; Sham, Pak; Posthuma, Danielle; Li, Miao-Xin
2015-04-01
Standard genome-wide association studies, testing the association between one phenotype and a large number of single nucleotide polymorphisms (SNPs), are limited in two ways: (i) traits are often multivariate, and analysis of composite scores entails loss in statistical power and (ii) gene-based analyses may be preferred, e.g. to decrease the multiple testing problem. Here we present a new method, multivariate gene-based association test by extended Simes procedure (MGAS), that allows gene-based testing of multivariate phenotypes in unrelated individuals. Through extensive simulation, we show that under most trait-generating genotype-phenotype models MGAS has superior statistical power to detect associated genes compared with gene-based analyses of univariate phenotypic composite scores (i.e. GATES, multiple regression), and multivariate analysis of variance (MANOVA). Re-analysis of metabolic data revealed 32 False Discovery Rate controlled genome-wide significant genes, and 12 regions harboring multiple genes; of these 44 regions, 30 were not reported in the original analysis. MGAS allows researchers to conduct their multivariate gene-based analyses efficiently, and without the loss of power that is often associated with incorrectly specified genotype-phenotype models. MGAS is freely available in KGG v3.0 (http://statgenpro.psychiatry.hku.hk/limx/kgg/download.php). Access to the metabolic dataset can be requested at dbGaP (https://dbgap.ncbi.nlm.nih.gov/). The R-simulation code is available from http://ctglab.nl/people/sophie_van_der_sluis. Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press.
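As a rough illustration of the idea behind a Simes-type gene-based test, the sketch below combines per-SNP p-values into one gene-level p-value with the classical Simes procedure. It is not the extended procedure used by MGAS or GATES, which is not reproduced here; the function name `simes_p_value` and the example p-values are hypothetical.

```python
def simes_p_value(p_values):
    """Classical Simes combined p-value: min over ranks i of m * p_(i) / i."""
    if not p_values:
        raise ValueError("need at least one p-value")
    ordered = sorted(p_values)
    m = len(ordered)
    return min(m * p / rank for rank, p in enumerate(ordered, start=1))


# Hypothetical per-SNP p-values for a single gene (illustration only).
snp_p_values = [0.04, 0.01, 0.20, 0.03]
gene_p = simes_p_value(snp_p_values)
print(f"gene-level Simes p-value: {gene_p:.4f}")
```

The classical procedure treats all m tests alike; the abstract's "extended" variant is the part that would have to be taken from the paper itself.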
Correlative and multivariate analysis of increased radon concentration in underground laboratory.
Maletić, Dimitrije M; Udovičić, Vladimir I; Banjanac, Radomir M; Joković, Dejan R; Dragić, Aleksandar L; Veselinović, Nikola B; Filipović, Jelena
2014-11-01
The results of an analysis using correlative and multivariate methods, as developed for data analysis in high-energy physics and implemented in the Toolkit for Multivariate Analysis software package, of the relations between variations of increased radon concentration and climate variables in a shallow underground laboratory are presented. Multivariate regression analysis identified a number of multivariate methods which can give a good evaluation of increased radon concentrations based on climate variables. The use of the multivariate regression methods will enable the investigation of the relations of specific climate variables with increased radon concentrations by analysis of regression methods, resulting in a 'mapped' underlying functional behaviour of radon concentrations depending on a wide spectrum of climate variables. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Web-Based Tools for Modelling and Analysis of Multivariate Data: California Ozone Pollution Activity
ERIC Educational Resources Information Center
Dinov, Ivo D.; Christou, Nicolas
2011-01-01
This article presents a hands-on web-based activity motivated by the relation between human health and ozone pollution in California. This case study is based on multivariate data collected monthly at 20 locations in California between 1980 and 2006. Several strategies and tools for data interrogation and exploratory data analysis, model fitting…
Estimating an Effect Size in One-Way Multivariate Analysis of Variance (MANOVA)
ERIC Educational Resources Information Center
Steyn, H. S., Jr.; Ellis, S. M.
2009-01-01
When two or more univariate population means are compared, the proportion of variation in the dependent variable accounted for by population group membership is eta-squared. This effect size can be generalized by using multivariate measures of association, based on the multivariate analysis of variance (MANOVA) statistics, to establish whether…
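For the univariate case described in the first sentence, eta-squared can be sketched as the between-group sum of squares divided by the total sum of squares. The code below is a minimal sample-based illustration, not the multivariate generalization the paper develops; the function name and the data are hypothetical.

```python
def eta_squared(groups):
    """Sample eta-squared: SS_between / SS_total for a one-way layout."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)
    if ss_total == 0:
        raise ValueError("all observations are identical")
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups
    )
    return ss_between / ss_total


# Two hypothetical groups of scores (illustration only).
scores = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
effect = eta_squared(scores)
print(f"eta-squared: {effect:.3f}")
```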
Multivariate analysis in thoracic research.
Mengual-Macenlle, Noemí; Marcos, Pedro J; Golpe, Rafael; González-Rivas, Diego
2015-03-01
Multivariate analysis is based on the observation and analysis of more than one statistical outcome variable at a time. In design and analysis, the technique is used to perform trade studies across multiple dimensions while taking into account the effects of all variables on the responses of interest. Multivariate methods emerged to analyze large databases and increasingly complex data. Since modeling is the best way to represent knowledge of reality, multivariate statistical methods should be used. Multivariate methods are designed to analyze data sets simultaneously, i.e., to analyze different variables for each person or object studied. Keep in mind at all times that all variables must be treated so that they accurately reflect the reality of the problem addressed. There are different types of multivariate analysis, and each one should be employed according to the type of variables to analyze: dependence, interdependence and structural methods. In conclusion, multivariate methods are ideal for the analysis of large data sets and for finding cause and effect relationships between variables; there is a wide range of analysis types that we can use.
Sun, Hui; Wang, Huiyu; Zhang, Aihua; Yan, Guangli; Han, Ying; Li, Yuan; Wu, Xiuhong; Meng, Xiangcai; Wang, Xijun
2016-01-01
Herbal medicines have an important position in health care systems worldwide, but their assessment and quality control remain a major bottleneck. Cortex Phellodendri chinensis (CPC) and Cortex Phellodendri amurensis (CPA) are widely used in China; however, identifying the species of CPA and CPC has become an urgent problem. In this study, a multivariate analysis approach was applied to the chemical discrimination of CPA and CPC. Principal component analysis showed that the two herbs could be separated clearly. Chemical markers such as berberine, palmatine, phellodendrine, magnoflorine, obacunone, and obaculactone were found through orthogonal partial least squares discriminant analysis, and were identified tentatively by the accurate mass from quadrupole time-of-flight mass spectrometry. A total of 29 components can be used as chemical markers for discrimination of CPA and CPC. Of them, phellodendrine is significantly higher in CPC than in CPA, whereas obacunone and obaculactone are significantly higher in CPA than in CPC. The present study shows that chemical analysis based on a multivariate analysis approach greatly contributes to the investigation of CPA and CPC, and that the identified chemical markers as a whole should be used to discriminate the two herbal medicines; the results also provide chemical information for their quality assessment. A multivariate analysis approach was used to investigate the herbal medicines. The chemical markers were identified through the multivariate analysis approach. A total of 29 components can be used as chemical markers. UPLC-Q/TOF-MS-based multivariate analysis method for the herbal medicine samples. Abbreviations used: CPC: Cortex Phellodendri chinensis, CPA: Cortex Phellodendri amurensis, PCA: Principal component analysis, OPLS-DA: Orthogonal partial least squares discriminant analysis, BPI: Base peaks ion intensity.
Drunk driving detection based on classification of multivariate time series.
Li, Zhenlong; Jin, Xue; Zhao, Xiaohua
2015-09-01
This paper addresses the problem of detecting drunk driving based on classification of multivariate time series. First, driving performance measures were collected from a test in a driving simulator located in the Traffic Research Center, Beijing University of Technology. Lateral position and steering angle were used to detect drunk driving. Second, multivariate time series analysis was performed to extract the features. A piecewise linear representation was used to represent multivariate time series. A bottom-up algorithm was then employed to separate multivariate time series. The slope and time interval of each segment were extracted as the features for classification. Third, a support vector machine classifier was used to classify driver's state into two classes (normal or drunk) according to the extracted features. The proposed approach achieved an accuracy of 80.0%. Drunk driving detection based on the analysis of multivariate time series is feasible and effective. The approach has implications for drunk driving detection. Copyright © 2015 Elsevier Ltd and National Safety Council. All rights reserved.
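The feature-extraction step (piecewise linear representation, bottom-up merging, then slope and time interval per segment) can be sketched roughly as follows. This is a minimal single-channel illustration under assumptions: the function names, the `max_error` threshold and the toy signal are hypothetical, and the abstract does not say how the authors combine lateral position and steering angle or which merge cost they use.

```python
def fit_line(ts, ys):
    """Least-squares line through the points; returns (slope, intercept, sse)."""
    n = len(ts)
    t_mean = sum(ts) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in ts)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, ys))
    slope = sxy / sxx if sxx else 0.0
    intercept = y_mean - slope * t_mean
    sse = sum((y - (slope * t + intercept)) ** 2 for t, y in zip(ts, ys))
    return slope, intercept, sse


def bottom_up_segments(ts, ys, max_error):
    """Start from two-point segments and merge the cheapest adjacent pair
    until the cheapest merge would exceed max_error (squared-error cost)."""
    bounds = [(i, min(i + 2, len(ts))) for i in range(0, len(ts), 2)]
    while len(bounds) > 1:
        costs = [
            fit_line(ts[left[0]:right[1]], ys[left[0]:right[1]])[2]
            for left, right in zip(bounds, bounds[1:])
        ]
        best = min(range(len(costs)), key=costs.__getitem__)
        if costs[best] > max_error:
            break
        bounds[best:best + 2] = [(bounds[best][0], bounds[best + 1][1])]
    return bounds


def segment_features(ts, ys, bounds):
    """One (slope, time interval) pair per segment."""
    features = []
    for start, stop in bounds:
        slope, _, _ = fit_line(ts[start:stop], ys[start:stop])
        features.append((slope, ts[stop - 1] - ts[start]))
    return features


# Toy signal (illustration only): a rise followed by a fall.
times = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
values = [0.0, 1.0, 2.0, 3.0, 3.0, 2.0, 1.0, 0.0]
segments = bottom_up_segments(times, values, max_error=0.5)
features = segment_features(times, values, segments)
print(segments, features)
```

The resulting feature pairs are what a classifier such as a support vector machine would then consume; that step is omitted here.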
A Statistical Discrimination Experiment for Eurasian Events Using a Twenty-Seven-Station Network
1980-07-08
to test the effectiveness of a multivariate method of analysis for distinguishing earthquakes from explosions. The data base for the experiment...the weight assigned to each variable whenever a new one is added. Jennrich, R. I. (1977). Stepwise discriminant analysis, in Statistical Methods for
DOE Office of Scientific and Technical Information (OSTI.GOV)
Loveday, D.L.; Craggs, C.
Box-Jenkins-based multivariate stochastic modeling is carried out using data recorded from a domestic heating system. The system comprises an air-source heat pump sited in the roof space of a house, solar assistance being provided by the conventional tile roof acting as a radiation absorber. Multivariate models are presented which illustrate the time-dependent relationships between three air temperatures - at external ambient, at entry to, and at exit from, the heat pump evaporator. Using a deterministic modeling approach, physical interpretations are placed on the results of the multivariate technique. It is concluded that the multivariate Box-Jenkins approach is a suitable technique for building thermal analysis. Application to multivariate model-based control is discussed, with particular reference to building energy management systems. It is further concluded that stochastic modeling of data drawn from a short monitoring period offers a means of retrofitting an advanced model-based control system in existing buildings, which could be used to optimize energy savings. An approach to system simulation is suggested.
NASA Astrophysics Data System (ADS)
Hsiao, Y. R.; Tsai, C.
2017-12-01
As the WHO Air Quality Guideline indicates, ambient air pollution exposes world populations to the threat of fatal diseases (e.g. heart disease, lung cancer, asthma, etc.), raising concerns about air pollution sources and related factors. This study presents a novel approach to investigating the multiscale variations of PM2.5 in southern Taiwan over the past decade, with four meteorological influencing factors (temperature, relative humidity, precipitation and wind speed), based on the Noise-assisted Multivariate Empirical Mode Decomposition (NAMEMD) algorithm, Hilbert Spectral Analysis (HSA) and the Time-dependent Intrinsic Correlation (TDIC) method. The NAMEMD algorithm is a fully data-driven approach designed for nonlinear and nonstationary multivariate signals, and is performed to decompose multivariate signals into a collection of channels of Intrinsic Mode Functions (IMFs). The TDIC method is an EMD-based method using a set of sliding window sizes to quantify localized correlation coefficients for multiscale signals. With the alignment property and quasi-dyadic filter bank of the NAMEMD algorithm, one is able to produce the same number of IMFs for all variables and estimate the cross correlation more accurately. The performance of the spectral representation of the NAMEMD-HSA method is compared with Complementary Empirical Mode Decomposition/Hilbert Spectral Analysis (CEEMD-HSA) and Wavelet Analysis. The nature of NAMEMD-based TDIC analysis is then compared with CEEMD-based TDIC analysis and traditional correlation analysis.
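The notion of a localized correlation coefficient can be illustrated with a plain sliding-window Pearson correlation between two series. This is only a simplified stand-in: it uses one fixed window on raw series, whereas the abstract describes a set of window sizes applied to decomposed signals, and the function names and data are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (nan if constant)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def sliding_correlation(xs, ys, window):
    """Correlation inside each window of the given length, one value per start index."""
    return [
        pearson(xs[i:i + window], ys[i:i + window])
        for i in range(len(xs) - window + 1)
    ]


# Hypothetical series (illustration only): the relationship flips sign midway.
series_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
series_b = [2.0, 4.0, 6.0, 5.0, 4.0, 3.0]
local_corr = sliding_correlation(series_a, series_b, window=3)
print([round(r, 2) for r in local_corr])
```

A single global correlation over these two series would hide the sign change that the windowed values expose, which is the motivation for time-dependent correlation measures.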
Cichonska, Anna; Rousu, Juho; Marttinen, Pekka; Kangas, Antti J; Soininen, Pasi; Lehtimäki, Terho; Raitakari, Olli T; Järvelin, Marjo-Riitta; Salomaa, Veikko; Ala-Korpela, Mika; Ripatti, Samuli; Pirinen, Matti
2016-07-01
A dominant approach to genetic association studies is to perform univariate tests between genotype-phenotype pairs. However, analyzing related traits together increases statistical power, and certain complex associations become detectable only when several variants are tested jointly. Currently, modest sample sizes of individual cohorts, and restricted availability of individual-level genotype-phenotype data across the cohorts limit conducting multivariate tests. We introduce metaCCA, a computational framework for summary statistics-based analysis of a single or multiple studies that allows multivariate representation of both genotype and phenotype. It extends the statistical technique of canonical correlation analysis to the setting where original individual-level records are not available, and employs a covariance shrinkage algorithm to achieve robustness. Multivariate meta-analysis of two Finnish studies of nuclear magnetic resonance metabolomics by metaCCA, using standard univariate output from the program SNPTEST, shows an excellent agreement with the pooled individual-level analysis of original data. Motivated by strong multivariate signals in the lipid genes tested, we envision that multivariate association testing using metaCCA has a great potential to provide novel insights from already published summary statistics from high-throughput phenotyping technologies. Code is available at https://github.com/aalto-ics-kepaco. Contact: anna.cichonska@helsinki.fi or matti.pirinen@helsinki.fi. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
Cichonska, Anna; Rousu, Juho; Marttinen, Pekka; Kangas, Antti J.; Soininen, Pasi; Lehtimäki, Terho; Raitakari, Olli T.; Järvelin, Marjo-Riitta; Salomaa, Veikko; Ala-Korpela, Mika; Ripatti, Samuli; Pirinen, Matti
2016-01-01
Motivation: A dominant approach to genetic association studies is to perform univariate tests between genotype-phenotype pairs. However, analyzing related traits together increases statistical power, and certain complex associations become detectable only when several variants are tested jointly. Currently, modest sample sizes of individual cohorts, and restricted availability of individual-level genotype-phenotype data across the cohorts limit conducting multivariate tests. Results: We introduce metaCCA, a computational framework for summary statistics-based analysis of a single or multiple studies that allows multivariate representation of both genotype and phenotype. It extends the statistical technique of canonical correlation analysis to the setting where original individual-level records are not available, and employs a covariance shrinkage algorithm to achieve robustness. Multivariate meta-analysis of two Finnish studies of nuclear magnetic resonance metabolomics by metaCCA, using standard univariate output from the program SNPTEST, shows an excellent agreement with the pooled individual-level analysis of original data. Motivated by strong multivariate signals in the lipid genes tested, we envision that multivariate association testing using metaCCA has a great potential to provide novel insights from already published summary statistics from high-throughput phenotyping technologies. Availability and implementation: Code is available at https://github.com/aalto-ics-kepaco Contacts: anna.cichonska@helsinki.fi or matti.pirinen@helsinki.fi Supplementary information: Supplementary data are available at Bioinformatics online. PMID:27153689
Multivariate meta-analysis: a robust approach based on the theory of U-statistic.
Ma, Yan; Mazumdar, Madhu
2011-10-30
Meta-analysis is the methodology for combining findings from similar research studies asking the same question. When the question of interest involves multiple outcomes, multivariate meta-analysis is used to synthesize the outcomes simultaneously, taking into account the correlation between the outcomes. Likelihood-based approaches, in particular the restricted maximum likelihood (REML) method, are commonly utilized in this context. REML assumes a multivariate normal distribution for the random-effects model. This assumption is difficult to verify, especially for meta-analyses with a small number of component studies. The use of REML also requires iterative estimation between parameters, needing moderately high computation time, especially when the dimension of outcomes is large. A multivariate method of moments (MMM) is available and is shown to perform equally well to REML. However, there is a lack of information on the performance of these two methods when the true data distribution is far from normality. In this paper, we propose a new nonparametric and non-iterative method for multivariate meta-analysis on the basis of the theory of U-statistic and compare the properties of these three procedures under both normal and skewed data through simulation studies. It is shown that the effect on estimates from REML because of non-normal data distribution is marginal and that the estimates from MMM and U-statistic-based approaches are very similar. Therefore, we conclude that for performing multivariate meta-analysis, the U-statistic estimation procedure is a viable alternative to REML and MMM. Easy implementation of all three methods is illustrated by their application to data from two published meta-analyses from the fields of hip fracture and periodontal disease. We discuss ideas for future research based on U-statistic for testing significance of between-study heterogeneity and for extending the work to the meta-regression setting. Copyright © 2011 John Wiley & Sons, Ltd.
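To give a feel for what a non-iterative moment-based estimator looks like, the sketch below implements the classical univariate DerSimonian-Laird estimator of between-study variance. It is a single-outcome analogue only, not the multivariate MMM or the U-statistic method of the paper; the function name and the effect sizes are hypothetical.

```python
def dersimonian_laird(effects, variances):
    """Univariate method-of-moments random-effects meta-analysis.

    Returns (tau2, pooled): the between-study variance estimate and the
    random-effects pooled effect. Needs at least two studies.
    """
    if len(effects) < 2 or len(effects) != len(variances):
        raise ValueError("need two or more studies with matching variances")
    weights = [1.0 / v for v in variances]
    total_w = sum(weights)
    fixed_mean = sum(w * y for w, y in zip(weights, effects)) / total_w
    # Cochran's Q measures heterogeneity around the fixed-effect mean.
    q = sum(w * (y - fixed_mean) ** 2 for w, y in zip(weights, effects))
    scale = total_w - sum(w * w for w in weights) / total_w
    tau2 = max(0.0, (q - (len(effects) - 1)) / scale)
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum(re_weights)
    return tau2, pooled


# Hypothetical study effects and within-study variances (illustration only).
tau2, pooled = dersimonian_laird([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
print(f"tau^2 = {tau2:.3f}, pooled effect = {pooled:.3f}")
```

Everything is computed in closed form in one pass, which is the contrast with the iterative REML fitting mentioned in the abstract.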
Yan, Binjun; Fang, Zhonghua; Shen, Lijuan; Qu, Haibin
2015-01-01
The batch-to-batch quality consistency of herbal drugs has always been an important issue. The aim of this work was to propose a methodology for batch-to-batch quality control based on HPLC-MS fingerprints and a process knowledgebase. The extraction process of Compound E-jiao Oral Liquid was taken as a case study. After establishing the HPLC-MS fingerprint analysis method, the fingerprints of the extract solutions produced under normal and abnormal operation conditions were obtained. Multivariate statistical models were built for fault detection and a discriminant analysis model was built using the probabilistic discriminant partial-least-squares method for fault diagnosis. Based on multivariate statistical analysis, process knowledge was acquired and the cause-effect relationship between process deviations and quality defects was revealed. The quality defects were detected successfully by multivariate statistical control charts and the types of process deviations were diagnosed correctly by discriminant analysis. This work has demonstrated the benefits of combining HPLC-MS fingerprints, process knowledge and multivariate analysis for the quality control of herbal drugs. Copyright © 2015 John Wiley & Sons, Ltd.
Characterizing multivariate decoding models based on correlated EEG spectral features
McFarland, Dennis J.
2013-01-01
Objective: Multivariate decoding methods are popular techniques for analysis of neurophysiological data. The present study explored potential interpretative problems with these techniques when predictors are correlated. Methods: Data from sensorimotor rhythm-based cursor control experiments were analyzed offline with linear univariate and multivariate models. Features were derived from autoregressive (AR) spectral analysis of varying model order which produced predictors that varied in their degree of correlation (i.e., multicollinearity). Results: The use of multivariate regression models resulted in much better prediction of target position as compared to univariate regression models. However, with lower order AR features interpretation of the spectral patterns of the weights was difficult. This is likely to be due to the high degree of multicollinearity present with lower order AR features. Conclusions: Care should be exercised when interpreting the pattern of weights of multivariate models with correlated predictors. Comparison with univariate statistics is advisable. Significance: While multivariate decoding algorithms are very useful for prediction, their utility for interpretation may be limited when predictors are correlated. PMID:23466267
Multivariate missing data in hydrology - Review and applications
NASA Astrophysics Data System (ADS)
Ben Aissia, Mohamed-Aymen; Chebana, Fateh; Ouarda, Taha B. M. J.
2017-12-01
Water resources planning and management require complete data sets of a number of hydrological variables, such as flood peaks and volumes. However, hydrologists are often faced with the problem of missing data (MD) in hydrological databases. Several methods are used to deal with the imputation of MD. During the last decade, multivariate approaches have gained popularity in the field of hydrology, especially in hydrological frequency analysis (HFA). However, the treatment of MD remains neglected in the multivariate HFA literature, where the focus has been mainly on the modeling component. For a complete analysis and in order to optimize the use of data, MD should also be treated in the multivariate setting prior to modeling and inference. Imputation of MD in the multivariate hydrological framework can have direct implications on the quality of the estimation. Indeed, the dependence between the series represents important additional information that can be included in the imputation process. The objective of the present paper is to highlight the importance of treating MD in multivariate hydrological frequency analysis by reviewing and applying multivariate imputation methods and by comparing univariate and multivariate imputation methods. An application is carried out for multiple flood attributes on three sites in order to evaluate the performance of the different methods based on the leave-one-out procedure. The results indicate that the performance of imputation methods can be improved by adopting the multivariate setting, compared to mean substitution and interpolation methods, especially when using the copula-based approach.
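The leave-one-out comparison of univariate and multivariate imputation can be sketched as below. As a simple stand-in for the multivariate methods reviewed in the paper, the example imputes a flood volume from the flood peak by linear regression and compares it with mean substitution; it does not implement the copula-based approach, and all names and numbers are hypothetical.

```python
import math


def mean_substitution(train_peaks, train_volumes, peak):
    """Univariate imputation: ignore the peak, return the mean volume."""
    return sum(train_volumes) / len(train_volumes)


def regression_imputation(train_peaks, train_volumes, peak):
    """Bivariate imputation: predict the volume from the peak by least squares."""
    n = len(train_peaks)
    mp = sum(train_peaks) / n
    mv = sum(train_volumes) / n
    sxx = sum((p - mp) ** 2 for p in train_peaks)
    sxy = sum((p - mp) * (v - mv) for p, v in zip(train_peaks, train_volumes))
    return mv + (sxy / sxx) * (peak - mp)


def loo_rmse(peaks, volumes, impute):
    """Hide each volume in turn, impute it from the rest, and return the RMSE."""
    squared_errors = []
    for i in range(len(volumes)):
        train_p = peaks[:i] + peaks[i + 1:]
        train_v = volumes[:i] + volumes[i + 1:]
        guess = impute(train_p, train_v, peaks[i])
        squared_errors.append((guess - volumes[i]) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical flood peaks and volumes (illustration only).
peaks = [100.0, 150.0, 200.0, 250.0, 300.0]
volumes = [21.0, 29.0, 41.0, 52.0, 59.0]
rmse_mean = loo_rmse(peaks, volumes, mean_substitution)
rmse_regression = loo_rmse(peaks, volumes, regression_imputation)
print(f"mean substitution RMSE: {rmse_mean:.2f}")
print(f"regression RMSE:        {rmse_regression:.2f}")
```

On data where the two attributes are strongly dependent, the method that uses the companion series should score the lower error, which is the effect the abstract reports for multivariate imputation.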
NASA Astrophysics Data System (ADS)
Vittal, H.; Singh, Jitendra; Kumar, Pankaj; Karmakar, Subhankar
2015-06-01
In watershed management, flood frequency analysis (FFA) is performed to quantify the risk of flooding at different spatial locations and also to provide guidelines for determining the design periods of flood control structures. Traditional FFA was extensively performed by considering the univariate scenario for both at-site and regional estimation of return periods. However, due to the inherent mutual dependence of the flood variables or characteristics [i.e., peak flow (P), flood volume (V) and flood duration (D), which are random in nature], analysis has been further extended to the multivariate scenario, with some restrictive assumptions. To overcome the assumption of the same family of marginal density function for all flood variables, the concept of copula has been introduced. Although the advancement from univariate to multivariate analyses drew formidable attention from the FFA research community, the basic limitation was that the analyses were performed with the implementation of only parametric families of distributions. The aim of the current study is to emphasize the importance of nonparametric approaches in the field of multivariate FFA; however, the nonparametric distribution may not always be a good fit and capable of replacing well-implemented multivariate parametric and multivariate copula-based applications. Nevertheless, the potential of obtaining the best fit using nonparametric distributions might be improved because such distributions reproduce the sample's characteristics, resulting in more accurate estimations of the multivariate return period. Hence, the current study shows the importance of conjugating the multivariate nonparametric approach with multivariate parametric and copula-based approaches, thereby resulting in a comprehensive framework for complete at-site FFA. Although the proposed framework is designed for at-site FFA, this approach can also be applied to regional FFA because regional estimations ideally include at-site estimations.
The framework is based on the following steps: (i) comprehensive trend analysis to assess nonstationarity in the observed data; (ii) selection of the best-fit univariate marginal distribution with a comprehensive set of parametric and nonparametric distributions for the flood variables; (iii) multivariate frequency analyses with parametric, copula-based and nonparametric approaches; and (iv) estimation of joint and various conditional return periods. The proposed framework for frequency analysis is demonstrated using 110 years of observed data from the Allegheny River at Salamanca, New York, USA. The results show that for both univariate and multivariate cases, the nonparametric Gaussian kernel provides the best estimate. Further, we perform FFA for twenty major rivers over the continental USA, which shows that for seven rivers all the flood variables followed the nonparametric Gaussian kernel, whereas for the other rivers parametric distributions provide the best fit for either one or two flood variables. Thus the summary of results shows that the nonparametric method cannot substitute for the parametric and copula-based approaches, but should be considered during any at-site FFA to provide the broadest choices for best estimation of the flood return periods.
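Step (iv), the estimation of joint return periods from a copula, can be sketched with the commonly used "OR" and "AND" definitions. The Gumbel-Hougaard copula, the parameter value and the marginal probabilities below are assumptions chosen for illustration; the abstract does not state which copula families or return-period definitions the authors used.

```python
import math


def gumbel_copula(u, v, theta):
    """Gumbel-Hougaard copula C(u, v); theta = 1 gives independence."""
    if theta < 1.0:
        raise ValueError("theta must be at least 1")
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))


def joint_return_periods(u, v, theta, mean_interarrival=1.0):
    """Return (T_or, T_and) for marginal non-exceedance probabilities u and v.

    T_or:  either variable exceeds its threshold, mu / (1 - C(u, v)).
    T_and: both variables exceed their thresholds, mu / (1 - u - v + C(u, v)).
    mean_interarrival is mu, e.g. 1 year for annual maximum series.
    """
    c = gumbel_copula(u, v, theta)
    t_or = mean_interarrival / (1.0 - c)
    t_and = mean_interarrival / (1.0 - u - v + c)
    return t_or, t_and


# Hypothetical case: peak and volume each at their 10-year marginal level.
t_or, t_and = joint_return_periods(0.9, 0.9, theta=2.0)
print(f"T_or = {t_or:.1f} years, T_and = {t_and:.1f} years")
```

With stronger dependence (larger theta) the two joint return periods move closer to the marginal 10-year value, which is why ignoring dependence between flood variables can misstate design risk.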
NASA Astrophysics Data System (ADS)
Safi, A.; Campanella, B.; Grifoni, E.; Legnaioli, S.; Lorenzetti, G.; Pagnotta, S.; Poggialini, F.; Ripoll-Seguer, L.; Hidalgo, M.; Palleschi, V.
2018-06-01
The introduction of the multivariate calibration curve approach in Laser-Induced Breakdown Spectroscopy (LIBS) quantitative analysis has led to a general improvement of LIBS analytical performance, since a multivariate approach makes it possible to exploit the redundancy of elemental information that is typically present in a LIBS spectrum. Software packages implementing multivariate methods are available in the most widespread commercial and open source analytical programs; in most cases, the multivariate algorithms are robust against noise and operate in unsupervised mode. The other side of the coin of the availability and ease of use of such packages is the (perceived) difficulty in assessing the reliability of the results obtained, which often leads to the multivariate algorithms being regarded as 'black boxes' whose inner mechanism is supposed to remain hidden from the user. In this paper, we will discuss the dangers of a 'black box' approach in LIBS multivariate analysis, and will discuss how to overcome them using the chemical-physical knowledge that is at the base of any LIBS quantitative analysis.
Dinç, Erdal; Ozdemir, Abdil
2005-01-01
A multivariate chromatographic calibration technique was developed for the quantitative analysis of binary mixtures of enalapril maleate (EA) and hydrochlorothiazide (HCT) in tablets in the presence of losartan potassium (LST). The mathematical algorithm of the multivariate chromatographic calibration technique is based on the use of linear regression equations constructed using the relationship between concentration and peak area at a five-wavelength set. The algorithm of this mathematical calibration model, which has a simple mathematical content, is briefly described. This approach is a powerful mathematical tool for optimum chromatographic multivariate calibration and elimination of fluctuations coming from instrumental and experimental conditions. This multivariate chromatographic calibration involves the reduction of multivariate linear regression functions to a univariate data set. The validation of the model was carried out by analyzing various synthetic binary mixtures and using the standard addition technique. The developed calibration technique was applied to the analysis of real pharmaceutical tablets containing EA and HCT. The obtained results were compared with those obtained by the classical HPLC method. It was observed that the proposed multivariate chromatographic calibration gives better results than classical HPLC.
Deconstructing multivariate decoding for the study of brain function.
Hebart, Martin N; Baker, Chris I
2017-08-04
Multivariate decoding methods were developed originally as tools to enable accurate predictions in real-world applications. The realization that these methods can also be employed to study brain function has led to their widespread adoption in the neurosciences. However, prior to the rise of multivariate decoding, the study of brain function was firmly embedded in a statistical philosophy grounded on univariate methods of data analysis. In this way, multivariate decoding for brain interpretation grew out of two established frameworks: multivariate decoding for predictions in real-world applications, and classical univariate analysis based on the study and interpretation of brain activation. We argue that this led to two confusions, one reflecting a mixture of multivariate decoding for prediction or interpretation, and the other a mixture of the conceptual and statistical philosophies underlying multivariate decoding and classical univariate analysis. Here we attempt to systematically disambiguate multivariate decoding for the study of brain function from the frameworks it grew out of. After elaborating these confusions and their consequences, we describe six, often unappreciated, differences between classical univariate analysis and multivariate decoding. We then focus on how the common interpretation of what is signal and noise changes in multivariate decoding. Finally, we use four examples to illustrate where these confusions may impact the interpretation of neuroimaging data. We conclude with a discussion of potential strategies to help resolve these confusions in interpreting multivariate decoding results, including the potential departure from multivariate decoding methods for the study of brain function. Copyright © 2017. Published by Elsevier Inc.
A Multivariate Descriptive Model of Motivation for Orthodontic Treatment.
ERIC Educational Resources Information Center
Hackett, Paul M. W.; And Others
1993-01-01
Motivation for receiving orthodontic treatment was studied among 109 young adults, and a multivariate model of the process is proposed. The combination of smallest scale analysis and Partial Order Scalogram Analysis by base Coordinates (POSAC) illustrates an interesting methodology for health treatment studies and explores motivation for dental…
NASA Astrophysics Data System (ADS)
Feng, Shangyuan; Lin, Juqiang; Huang, Zufang; Chen, Guannan; Chen, Weisheng; Wang, Yue; Chen, Rong; Zeng, Haishan
2013-01-01
The capability of silver nanoparticle based near-infrared surface enhanced Raman scattering (SERS) spectroscopy, combined with principal component analysis (PCA) and linear discriminant analysis (LDA), to differentiate esophageal cancer tissue from normal tissue was presented. Significant differences in Raman intensities of prominent SERS bands were observed between normal and cancer tissues. PCA-LDA multivariate analysis of the measured tissue SERS spectra achieved a diagnostic sensitivity of 90.9% and a specificity of 97.8%. This exploratory study demonstrated great potential for developing label-free tissue SERS analysis into a clinical tool for esophageal cancer detection.
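The sensitivity and specificity figures quoted above are ratios taken from a confusion matrix. As a minimal sketch of that arithmetic only, the following assumes binary labels (1 = cancer, 0 = normal); the function name and the label lists are hypothetical illustrations and are not data from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical labels for illustration only (not measurements from the paper).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}")
```

Sensitivity is the fraction of true cancer samples flagged as cancer, and specificity is the fraction of true normal samples flagged as normal.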
A Cyber-Attack Detection Model Based on Multivariate Analyses
NASA Astrophysics Data System (ADS)
Sakai, Yuto; Rinsaka, Koichiro; Dohi, Tadashi
In the present paper, we propose a novel cyber-attack detection model that applies two multivariate analysis methods to the audit data observed on a host machine. The statistical techniques used here are the well-known Hayashi's quantification method IV and cluster analysis. We quantify the observed qualitative audit event sequences via quantification method IV, and collect similar audit event sequences into the same groups based on the cluster analysis. Simulation experiments show that our model can improve cyber-attack detection accuracy in some realistic cases where normal and attack activities are intermingled.
NASA Technical Reports Server (NTRS)
Hague, D. S.; Vanderberg, J. D.; Woodbury, N. W.
1974-01-01
A method for rapidly examining the probable applicability of weight estimating formulae to a specific aerospace vehicle design is presented. The Multivariate Analysis Retrieval and Storage System (MARS) comprises three computer programs which sequentially operate on the weight and geometry characteristics of past aerospace vehicle designs. Weight and geometric characteristics are stored in a set of data bases which are fully computerized. Additional data bases are readily added to the MARS system and/or the existing data bases may be easily expanded to include additional vehicles or vehicle characteristics.
Characterizing multivariate decoding models based on correlated EEG spectral features.
McFarland, Dennis J
2013-07-01
Multivariate decoding methods are popular techniques for analysis of neurophysiological data. The present study explored potential interpretative problems with these techniques when predictors are correlated. Data from sensorimotor rhythm-based cursor control experiments was analyzed offline with linear univariate and multivariate models. Features were derived from autoregressive (AR) spectral analysis of varying model order which produced predictors that varied in their degree of correlation (i.e., multicollinearity). The use of multivariate regression models resulted in much better prediction of target position as compared to univariate regression models. However, with lower order AR features interpretation of the spectral patterns of the weights was difficult. This is likely to be due to the high degree of multicollinearity present with lower order AR features. Care should be exercised when interpreting the pattern of weights of multivariate models with correlated predictors. Comparison with univariate statistics is advisable. While multivariate decoding algorithms are very useful for prediction their utility for interpretation may be limited when predictors are correlated. Copyright © 2013 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.
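The caution about interpreting weights under correlated predictors can be illustrated with a small constructed case. The sketch below is an assumption-laden toy example, not the EEG analysis from the study: one feature carries signal plus noise, a second feature carries only the noise, and all names and values are hypothetical.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def corr(a, b):
    """Pearson correlation for sequences that already have zero mean."""
    return dot(a, b) / math.sqrt(dot(a, a) * dot(b, b))


# Hypothetical zero-mean toy data: a signal and an orthogonal noise source.
signal = [1, -1, 1, -1, 1, -1, 1, -1]
noise = [1, 1, -1, -1, 1, 1, -1, -1]

x1 = [s + n for s, n in zip(signal, noise)]  # feature with signal plus noise
x2 = list(noise)                             # feature with noise only
y = list(signal)                             # target to predict

# Univariate view: correlation of each feature with the target.
r1, r2 = corr(x1, y), corr(x2, y)

# Multivariate view: solve the 2x2 normal equations (X^T X) w = X^T y.
a, b, d = dot(x1, x1), dot(x1, x2), dot(x2, x2)
p, q = dot(x1, y), dot(x2, y)
det = a * d - b * b
w1 = (d * p - b * q) / det
w2 = (a * q - b * p) / det

print(f"univariate r: x1={r1:.3f}, x2={r2:.3f}")
print(f"multivariate weights: w1={w1:.1f}, w2={w2:.1f}")
```

In this toy case the second feature is uncorrelated with the target yet receives a large negative weight, because the model uses it to cancel the noise in the first feature. That is one reason a comparison with univariate statistics is advisable.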
Comprehensive drought characteristics analysis based on a nonlinear multivariate drought index
NASA Astrophysics Data System (ADS)
Yang, Jie; Chang, Jianxia; Wang, Yimin; Li, Yunyun; Hu, Hui; Chen, Yutong; Huang, Qiang; Yao, Jun
2018-02-01
It is vital to identify drought events and to evaluate multivariate drought characteristics based on a composite drought index for better drought risk assessment and sustainable development of water resources. However, most composite drought indices are constructed by linear combination, principal component analysis, or the entropy weight method, all of which assume a linear relationship among different drought indices. In this study, a multidimensional copula function was applied to construct a nonlinear multivariate drought index (NMDI) that captures the complicated and nonlinear relationship, owing to the copula's dependence structure and flexibility. The NMDI was constructed by combining meteorological, hydrological, and agricultural variables (precipitation, runoff, and soil moisture) to better reflect these variables simultaneously. Based on the constructed NMDI and runs theory, drought events for a particular area were identified in terms of three drought characteristics: duration, peak, and severity. Finally, multivariate drought risk was analyzed as a tool for providing reliable support in drought decision-making. The results indicate that: (1) multidimensional copulas can effectively solve the complicated and nonlinear relationship among multivariate variables; (2) compared with single and other composite drought indices, the NMDI is slightly more sensitive in capturing recorded drought events; and (3) drought risk shows a spatial variation; out of the five partitions studied, the Jing River Basin as well as the upstream and midstream of the Wei River Basin are characterized by a higher multivariate drought risk. In general, multidimensional copulas provide a reliable way to solve the nonlinear relationship when constructing a comprehensive drought index and evaluating multivariate drought characteristics.
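Runs theory extracts drought events as consecutive periods in which an index stays below a threshold. The following is a minimal sketch under stated assumptions: the threshold of 0.0, the definition of peak as the largest single-period deficit, the function name, and the sample series are all hypothetical and are not taken from the study.

```python
def drought_events(index, threshold):
    """Find runs where index < threshold and summarize each run.

    Assumed definitions: duration is the number of periods in the run,
    severity is the summed deficit (threshold - value), and peak is the
    largest single-period deficit.
    """
    events = []
    current = None
    for t, value in enumerate(index):
        if value < threshold:
            deficit = threshold - value
            if current is None:
                current = {"start": t, "duration": 0, "severity": 0.0, "peak": 0.0}
            current["duration"] += 1
            current["severity"] += deficit
            current["peak"] = max(current["peak"], deficit)
        elif current is not None:
            events.append(current)  # the run ended at the previous period
            current = None
    if current is not None:
        events.append(current)      # the series ended while still in a run
    return events


# Hypothetical index series for illustration only.
series = [0.5, -1.0, -2.0, -0.5, 0.25, -1.5, 0.0]
events = drought_events(series, threshold=0.0)
for event in events:
    print(event)
```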
Moazami-Goudarzi, K; Laloë, D
2002-01-01
To determine the relationships among closely related populations or species, two methods are commonly used in the literature: phylogenetic reconstruction or multivariate analysis. The aim of this article is to assess the reliability of multivariate analysis. We describe a method that is based on principal component analysis and Mantel correlations, using a two-step process: The first step consists of a single-marker analysis and the second step tests if each marker reveals the same typology concerning population differentiation. We conclude that if single markers are not congruent, the compromise structure is not meaningful. Our model is not based on any particular mutation process and it can be applied to most of the commonly used genetic markers. This method is also useful to determine the contribution of each marker to the typology of populations. We test whether our method is efficient with two real data sets based on microsatellite markers. Our analysis suggests that for closely related populations, it is not always possible to accept the hypothesis that an increase in the number of markers will increase the reliability of the typology analysis. PMID:12242255
Statistical analysis of multivariate atmospheric variables. [cloud cover
NASA Technical Reports Server (NTRS)
Tubbs, J. D.
1979-01-01
Topics covered include: (1) estimation in discrete multivariate distributions; (2) a procedure to predict cloud cover frequencies in the bivariate case; (3) a program to compute conditional bivariate normal parameters; (4) the transformation of nonnormal multivariate to near-normal; (5) test of fit for the extreme value distribution based upon the generalized minimum chi-square; (6) test of fit for continuous distributions based upon the generalized minimum chi-square; (7) effect of correlated observations on confidence sets based upon chi-square statistics; and (8) generation of random variates from specified distributions.
Multivariate longitudinal data analysis with censored and intermittent missing responses.
Lin, Tsung-I; Lachos, Victor H; Wang, Wan-Lun
2018-05-08
The multivariate linear mixed model (MLMM) has emerged as an important analytical tool for longitudinal data with multiple outcomes. However, the analysis of multivariate longitudinal data could be complicated by the presence of censored measurements because of a detection limit of the assay in combination with unavoidable missing values arising when subjects miss some of their scheduled visits intermittently. This paper presents a generalization of the MLMM approach, called the MLMM-CM, for a joint analysis of the multivariate longitudinal data with censored and intermittent missing responses. A computationally feasible expectation maximization-based procedure is developed to carry out maximum likelihood estimation within the MLMM-CM framework. Moreover, the asymptotic standard errors of fixed effects are explicitly obtained via the information-based method. We illustrate our methodology by using simulated data and a case study from an AIDS clinical trial. Experimental results reveal that the proposed method is able to provide more satisfactory performance as compared with the traditional MLMM approach. Copyright © 2018 John Wiley & Sons, Ltd.
Wang, Longfei; Lee, Sungyoung; Gim, Jungsoo; Qiao, Dandi; Cho, Michael; Elston, Robert C; Silverman, Edwin K; Won, Sungho
2016-09-01
Family-based designs have been repeatedly shown to be powerful in detecting the significant rare variants associated with human diseases. Furthermore, human diseases are often defined by the outcomes of multiple phenotypes, and thus we expect multivariate family-based analyses may be very efficient in detecting associations with rare variants. However, few statistical methods implementing this strategy have been developed for family-based designs. In this report, we describe one such implementation: the multivariate family-based rare variant association tool (mFARVAT). mFARVAT is a quasi-likelihood-based score test for rare variant association analysis with multiple phenotypes, and tests both homogeneous and heterogeneous effects of each variant on multiple phenotypes. Simulation results show that the proposed method is generally robust and efficient for various disease models, and we identify some promising candidate genes associated with chronic obstructive pulmonary disease. The software of mFARVAT is freely available at http://healthstat.snu.ac.kr/software/mfarvat/, implemented in C++ and supported on Linux and MS Windows. © 2016 WILEY PERIODICALS, INC.
Genomic Analysis of Complex Microbial Communities in Wounds
2012-01-01
thoroughly in the ecology literature. Permutation Multivariate Analysis of Variance (PerMANOVA). We used PerMANOVA to test the null-hypothesis of no...difference between the bacterial communities found within a single wound compared to those from different patients (α = 0.05). PerMANOVA is a...permutation-based version of the multivariate analysis of variance (MANOVA). PerMANOVA uses the distances between samples to partition variance and
Multivariate analysis of climate along the southern coast of Alaska: some forestry implications.
Wilbur A. Farr; John S. Hard
1987-01-01
A multivariate analysis of climate was used to delineate 10 significantly different groups of climatic stations along the southern coast of Alaska based on latitude, longitude, seasonal temperatures and precipitation, frost-free periods, and total number of growing degree days. The climatic stations were too few to delineate this rugged, mountainous region into...
NASA Astrophysics Data System (ADS)
Jia, Xiaoliang; An, Haizhong; Sun, Xiaoqi; Huang, Xuan; Gao, Xiangyun
2016-04-01
The globalization and regionalization of crude oil trade inevitably give rise to the difference of crude oil prices. The understanding of the pattern of the crude oil prices' mutual propagation is essential for analyzing the development of global oil trade. Previous research has focused mainly on the fuzzy long- or short-term one-to-one propagation of bivariate oil prices, generally ignoring various patterns of periodical multivariate propagation. This study presents a wavelet-based network approach to help uncover the multipath propagation of multivariable crude oil prices in a joint time-frequency period. The weekly oil spot prices of the OPEC member states from June 1999 to March 2011 are adopted as the sample data. First, we used wavelet analysis to find different subseries based on an optimal decomposing scale to describe the periodical feature of the original oil price time series. Second, a complex network model was constructed based on an optimal threshold selection to describe the structural feature of multivariable oil prices. Third, Bayesian network analysis (BNA) was conducted to find the probability causal relationship based on periodical structural features to describe the various patterns of periodical multivariable propagation. Finally, the significance of the leading and intermediary oil prices is discussed. These findings are beneficial for the implementation of periodical target-oriented pricing policies and investment strategies.
Jackson, Dan; White, Ian R; Riley, Richard D
2013-01-01
Multivariate meta-analysis is becoming more commonly used. Methods for fitting the multivariate random effects model include maximum likelihood, restricted maximum likelihood, Bayesian estimation and multivariate generalisations of the standard univariate method of moments. Here, we provide a new multivariate method of moments for estimating the between-study covariance matrix with the properties that (1) it allows for either complete or incomplete outcomes and (2) it allows for covariates through meta-regression. Further, for complete data, it is invariant to linear transformations. Our method reduces to the usual univariate method of moments, proposed by DerSimonian and Laird, in a single dimension. We illustrate our method and compare it with some of the alternatives using a simulation study and a real example. PMID:23401213
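For orientation, the univariate DerSimonian and Laird method of moments, to which the proposed estimator reduces in a single dimension, can be sketched as follows. This shows only the standard univariate case, not the multivariate estimator of the paper; the function name and the study values are hypothetical.

```python
def dersimonian_laird(effects, variances):
    """Univariate DerSimonian-Laird estimate of tau^2 and the pooled effect."""
    w = [1.0 / v for v in variances]                   # inverse-variance weights
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    k = len(effects)
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c)                 # truncated at zero
    w_star = [1.0 / (v + tau2) for v in variances]     # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    return tau2, pooled


# Hypothetical study effects and within-study variances (illustration only).
tau2, pooled = dersimonian_laird([0.0, 1.0, 2.0], [0.25, 0.25, 0.25])
print(f"tau^2={tau2:.3f}, pooled effect={pooled:.3f}")
```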
ERIC Educational Resources Information Center
Joo, Soohyung; Kipp, Margaret E. I.
2015-01-01
Introduction: This study examines the structure of Web space in the field of library and information science using multivariate analysis of social tags from the Website, Delicious.com. A few studies have examined mathematical modelling of tags, mainly examining tagging in terms of tripartite graphs, pattern tracing and descriptive statistics. This…
Multivariate frequency domain analysis of protein dynamics
NASA Astrophysics Data System (ADS)
Matsunaga, Yasuhiro; Fuchigami, Sotaro; Kidera, Akinori
2009-03-01
Multivariate frequency domain analysis (MFDA) is proposed to characterize collective vibrational dynamics of protein obtained by a molecular dynamics (MD) simulation. MFDA performs principal component analysis (PCA) for a bandpass filtered multivariate time series using the multitaper method of spectral estimation. By applying MFDA to MD trajectories of bovine pancreatic trypsin inhibitor, we determined the collective vibrational modes in the frequency domain, which were identified by their vibrational frequencies and eigenvectors. At near zero temperature, the vibrational modes determined by MFDA agreed well with those calculated by normal mode analysis. At 300 K, the vibrational modes exhibited characteristic features that were considerably different from the principal modes of the static distribution given by the standard PCA. The influences of aqueous environments were discussed based on two different sets of vibrational modes, one derived from a MD simulation in water and the other from a simulation in vacuum. Using the varimax rotation, an algorithm of the multivariate statistical analysis, the representative orthogonal set of eigenmodes was determined at each vibrational frequency.
PyMVPA: A Python toolbox for multivariate pattern analysis of fMRI data
Hanke, Michael; Halchenko, Yaroslav O.; Sederberg, Per B.; Hanson, Stephen José; Haxby, James V.; Pollmann, Stefan
2009-01-01
Decoding patterns of neural activity onto cognitive states is one of the central goals of functional brain imaging. Standard univariate fMRI analysis methods, which correlate cognitive and perceptual function with the blood oxygenation-level dependent (BOLD) signal, have proven successful in identifying anatomical regions based on signal increases during cognitive and perceptual tasks. Recently, researchers have begun to explore new multivariate techniques that have proven to be more flexible, more reliable, and more sensitive than standard univariate analysis. Drawing on the field of statistical learning theory, these new classifier-based analysis techniques possess explanatory power that could provide new insights into the functional properties of the brain. However, unlike the wealth of software packages for univariate analyses, there are few packages that facilitate multivariate pattern classification analyses of fMRI data. Here we introduce a Python-based, cross-platform, and open-source software toolbox, called PyMVPA, for the application of classifier-based analysis techniques to fMRI datasets. PyMVPA makes use of Python's ability to access libraries written in a large variety of programming languages and computing environments to interface with the wealth of existing machine-learning packages. We present the framework in this paper and provide illustrative examples on its usage, features, and programmability. PMID:19184561
Multi-country health surveys: are the analyses misleading?
Masood, Mohd; Reidpath, Daniel D
2014-05-01
The aim of this paper was to review the approaches currently used in the analysis of multi-country survey data, focusing on design and modeling issues in analyses of significant multi-country surveys published in 2010. A systematic search strategy was used to identify 10 multi-country surveys and the articles published from them in 2010. The surveys were selected to reflect diverse topics and foci and to provide insight into analytic approaches across research themes. The search identified 159 articles appropriate for full text review and data extraction. The analyses adopted in the multi-country surveys can be broadly classified as univariate/bivariate analyses and multivariate/multivariable analyses. Multivariate/multivariable analyses may be further divided into design- and model-based analyses. Of the 159 articles reviewed, 129 used model-based analyses and 30 used design-based analyses. Similar patterns could be seen in all the individual surveys. While there is general agreement among survey statisticians that complex surveys are most appropriately analyzed using design-based analyses, most researchers continued to use the more common model-based approaches. Recent developments in design-based multi-level analysis may be one approach to include all the survey design characteristics. This is a relatively new area, however, and further statistical and applied analytic research is still required. An important limitation of this study relates to the selection of the surveys used and the choice of year for the analysis, i.e., year 2010 only. There is, however, no strong reason to believe that analytic strategies have changed radically in the past few years, and 2010 provides a credible snapshot of current practice.
Dankers, Frank; Wijsman, Robin; Troost, Esther G C; Monshouwer, René; Bussink, Johan; Hoffmann, Aswin L
2017-05-07
In our previous work, a multivariable normal-tissue complication probability (NTCP) model for acute esophageal toxicity (AET) Grade ⩾2 after highly conformal (chemo-)radiotherapy for non-small cell lung cancer (NSCLC) was developed using multivariable logistic regression analysis incorporating clinical parameters and mean esophageal dose (MED). Since the esophagus is a tubular organ, spatial information of the esophageal wall dose distribution may be important in predicting AET. We investigated whether the incorporation of esophageal wall dose-surface data with spatial information improves the predictive power of our established NTCP model. For 149 NSCLC patients treated with highly conformal radiation therapy esophageal wall dose-surface histograms (DSHs) and polar dose-surface maps (DSMs) were generated. DSMs were used to generate new DSHs and dose-length-histograms that incorporate spatial information of the dose-surface distribution. From these histograms dose parameters were derived and univariate logistic regression analysis showed that they correlated significantly with AET. Following our previous work, new multivariable NTCP models were developed using the most significant dose histogram parameters based on univariate analysis (19 in total). However, the 19 new models incorporating esophageal wall dose-surface data with spatial information did not show improved predictive performance (area under the curve, AUC range 0.79-0.84) over the established multivariable NTCP model based on conventional dose-volume data (AUC = 0.84). For prediction of AET, based on the proposed multivariable statistical approach, spatial information of the esophageal wall dose distribution is of no added value and it is sufficient to only consider MED as a predictive dosimetric parameter.
NASA Astrophysics Data System (ADS)
Dankers, Frank; Wijsman, Robin; Troost, Esther G. C.; Monshouwer, René; Bussink, Johan; Hoffmann, Aswin L.
2017-05-01
In our previous work, a multivariable normal-tissue complication probability (NTCP) model for acute esophageal toxicity (AET) Grade ⩾2 after highly conformal (chemo-)radiotherapy for non-small cell lung cancer (NSCLC) was developed using multivariable logistic regression analysis incorporating clinical parameters and mean esophageal dose (MED). Since the esophagus is a tubular organ, spatial information of the esophageal wall dose distribution may be important in predicting AET. We investigated whether the incorporation of esophageal wall dose-surface data with spatial information improves the predictive power of our established NTCP model. For 149 NSCLC patients treated with highly conformal radiation therapy esophageal wall dose-surface histograms (DSHs) and polar dose-surface maps (DSMs) were generated. DSMs were used to generate new DSHs and dose-length-histograms that incorporate spatial information of the dose-surface distribution. From these histograms dose parameters were derived and univariate logistic regression analysis showed that they correlated significantly with AET. Following our previous work, new multivariable NTCP models were developed using the most significant dose histogram parameters based on univariate analysis (19 in total). However, the 19 new models incorporating esophageal wall dose-surface data with spatial information did not show improved predictive performance (area under the curve, AUC range 0.79-0.84) over the established multivariable NTCP model based on conventional dose-volume data (AUC = 0.84). For prediction of AET, based on the proposed multivariable statistical approach, spatial information of the esophageal wall dose distribution is of no added value and it is sufficient to only consider MED as a predictive dosimetric parameter.
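The models above are compared by area under the ROC curve (AUC). As a minimal sketch of that metric alone, AUC can be computed as the probability that a randomly chosen patient with the complication receives a higher predicted risk than a randomly chosen patient without it. The function name and the values are hypothetical and are not patient data from the study.

```python
def auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities and observed outcomes (illustration only).
predicted = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
observed = [1, 1, 1, 0, 0, 0]
model_auc = auc(predicted, observed)
print(f"AUC={model_auc:.3f}")
```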
Information extraction from multivariate images
NASA Technical Reports Server (NTRS)
Park, S. K.; Kegley, K. A.; Schiess, J. R.
1986-01-01
An overview of several multivariate image processing techniques is presented, with emphasis on techniques based upon the principal component transformation (PCT). Multiimages in various formats have a multivariate pixel value, associated with each pixel location, which has been scaled and quantized into a gray level vector, and the bivariate of the extent to which two images are correlated. The PCT of a multiimage decorrelates the multiimage to reduce its dimensionality and reveal its intercomponent dependencies if some off-diagonal elements are not small, and for the purposes of display the principal component images must be postprocessed into multiimage format. The principal component analysis of a multiimage is a statistical analysis based upon the PCT whose primary application is to determine the intrinsic component dimensionality of the multiimage. Computational considerations are also discussed.
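The decorrelating effect of the principal component transformation can be shown for the simplest case of a two-band image, where the transformation is a single rotation of the centered band values. This is a sketch under that two-band assumption; the function names and gray levels are hypothetical, and a general multiimage would need a full eigendecomposition of the band covariance matrix.

```python
import math


def mean(values):
    return sum(values) / len(values)


def covariance(a, b):
    """Sample covariance of two equally long sequences."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (len(a) - 1)


def principal_component_transform(band1, band2):
    """Rotate two centered bands onto their principal axes."""
    m1, m2 = mean(band1), mean(band2)
    c11 = covariance(band1, band1)
    c22 = covariance(band2, band2)
    c12 = covariance(band1, band2)
    # Rotation angle that zeroes the off-diagonal covariance term.
    theta = 0.5 * math.atan2(2.0 * c12, c11 - c22)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    pc1 = [cos_t * (x - m1) + sin_t * (y - m2) for x, y in zip(band1, band2)]
    pc2 = [-sin_t * (x - m1) + cos_t * (y - m2) for x, y in zip(band1, band2)]
    return pc1, pc2


# Hypothetical gray levels for five pixels in two correlated bands.
band1 = [10, 12, 14, 16, 18]
band2 = [11, 12, 15, 15, 19]
pc1, pc2 = principal_component_transform(band1, band2)
print("covariance of components:", round(covariance(pc1, pc2), 9))
print("variances:", covariance(pc1, pc1), covariance(pc2, pc2))
```

The first component carries most of the variance, which is the basis for judging the intrinsic dimensionality of the multiimage.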
Applications of modern statistical methods to analysis of data in physical science
NASA Astrophysics Data System (ADS)
Wicker, James Eric
Modern methods of statistical and computational analysis offer solutions to dilemmas confronting researchers in physical science. Although the ideas behind modern statistical and computational analysis methods were originally introduced in the 1970s, most scientists still rely on methods written during the early era of computing. These researchers, who analyze increasingly voluminous and multivariate data sets, need modern analysis methods to extract the best results from their studies. The first section of this work showcases applications of modern linear regression. Since the 1960s, many researchers in spectroscopy have used classical stepwise regression techniques to derive molecular constants. However, problems with thresholds of entry and exit for model variables plague this analysis method. Other criticisms of this kind of stepwise procedure include its inefficient searching method, the order in which variables enter or leave the model, and problems with overfitting data. We implement an information scoring technique that overcomes the assumptions inherent in the stepwise regression process to calculate molecular model parameters. We believe that this kind of information-based model evaluation can be applied to more general analysis situations in physical science. The second section proposes new methods of multivariate cluster analysis. The K-means algorithm and the EM algorithm, introduced in the 1960s and 1970s respectively, formed the basis of multivariate cluster analysis methodology for many years. However, shortcomings of these methods include strong dependence on initial seed values and inaccurate results when the data seriously depart from hypersphericity. We propose new cluster analysis methods based on genetic algorithms that overcome the strong dependence on initial seed values. In addition, we propose a generalization of the Genetic K-means algorithm which can accurately identify clusters with complex hyperellipsoidal covariance structures. We then use this new algorithm in a genetic algorithm based Expectation-Maximization process that can accurately calculate parameters describing complex clusters in a mixture model routine. Using the accuracy of this GEM algorithm, we assign information scores to cluster calculations in order to best identify the number of mixture components in a multivariate data set. We will showcase how these algorithms can be used to process multivariate data from astronomical observations.
The Fourier decomposition method for nonlinear and non-stationary time series analysis.
Singh, Pushpendra; Joshi, Shiv Dutt; Patney, Rakesh Kumar; Saha, Kaushik
2017-03-01
For many decades, there has been a general perception in the literature that Fourier methods are not suitable for the analysis of nonlinear and non-stationary data. In this paper, we propose a novel and adaptive Fourier decomposition method (FDM), based on the Fourier theory, and demonstrate its efficacy for the analysis of nonlinear and non-stationary time series. The proposed FDM decomposes any data into a small number of 'Fourier intrinsic band functions' (FIBFs). The FDM presents a generalized Fourier expansion with variable amplitudes and variable frequencies of a time series by the Fourier method itself. We propose an idea of zero-phase filter bank-based multivariate FDM (MFDM), for the analysis of multivariate nonlinear and non-stationary time series, using the FDM. We also present an algorithm to obtain cut-off frequencies for MFDM. The proposed MFDM generates a finite number of band-limited multivariate FIBFs (MFIBFs). The MFDM preserves some intrinsic physical properties of the multivariate data, such as scale alignment, trend and instantaneous frequency. The proposed methods provide a time-frequency-energy (TFE) distribution that reveals the intrinsic structure of the data. Numerical computations and simulations have been carried out and comparisons are made with the empirical mode decomposition algorithms.
The Fourier decomposition method for nonlinear and non-stationary time series analysis
Joshi, Shiv Dutt; Patney, Rakesh Kumar; Saha, Kaushik
2017-01-01
For many decades, there has been a general perception in the literature that Fourier methods are not suitable for the analysis of nonlinear and non-stationary data. In this paper, we propose a novel and adaptive Fourier decomposition method (FDM), based on the Fourier theory, and demonstrate its efficacy for the analysis of nonlinear and non-stationary time series. The proposed FDM decomposes any data into a small number of ‘Fourier intrinsic band functions’ (FIBFs). The FDM presents a generalized Fourier expansion with variable amplitudes and variable frequencies of a time series by the Fourier method itself. We propose an idea of zero-phase filter bank-based multivariate FDM (MFDM), for the analysis of multivariate nonlinear and non-stationary time series, using the FDM. We also present an algorithm to obtain cut-off frequencies for MFDM. The proposed MFDM generates a finite number of band-limited multivariate FIBFs (MFIBFs). The MFDM preserves some intrinsic physical properties of the multivariate data, such as scale alignment, trend and instantaneous frequency. The proposed methods provide a time–frequency–energy (TFE) distribution that reveals the intrinsic structure of the data. Numerical computations and simulations have been carried out and comparisons are made with the empirical mode decomposition algorithms. PMID:28413352
Avalappampatty Sivasamy, Aneetha; Sundan, Bose
2015-01-01
The ever expanding communication requirements in today's world demand extensive and efficient network systems with equally efficient and reliable security features integrated for safe, confident, and secured communication and data transfer. Providing effective security protocols for any network environment, therefore, assumes paramount importance. Attempts are made continuously for designing more efficient and dynamic network intrusion detection models. In this work, an approach based on Hotelling's T² method, a multivariate statistical analysis technique, has been employed for intrusion detection, especially in network environments. Components such as preprocessing, multivariate statistical analysis, and attack detection have been incorporated in developing the multivariate Hotelling's T² statistical model and necessary profiles have been generated based on the T-square distance metrics. With a threshold range obtained using the central limit theorem, observed traffic profiles have been classified either as normal or attack types. Performance of the model, as evaluated through validation and testing using KDD Cup'99 dataset, has shown very high detection rates for all classes with low false alarm rates. Accuracy of the model presented in this work, in comparison with the existing models, has been found to be much better. PMID:26357668
Sivasamy, Aneetha Avalappampatty; Sundan, Bose
2015-01-01
The ever expanding communication requirements in today's world demand extensive and efficient network systems with equally efficient and reliable security features integrated for safe, confident, and secured communication and data transfer. Providing effective security protocols for any network environment, therefore, assumes paramount importance. Attempts are made continuously for designing more efficient and dynamic network intrusion detection models. In this work, an approach based on Hotelling's T² method, a multivariate statistical analysis technique, has been employed for intrusion detection, especially in network environments. Components such as preprocessing, multivariate statistical analysis, and attack detection have been incorporated in developing the multivariate Hotelling's T² statistical model and necessary profiles have been generated based on the T-square distance metrics. With a threshold range obtained using the central limit theorem, observed traffic profiles have been classified either as normal or attack types. Performance of the model, as evaluated through validation and testing using KDD Cup'99 dataset, has shown very high detection rates for all classes with low false alarm rates. Accuracy of the model presented in this work, in comparison with the existing models, has been found to be much better.
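As a rough illustration of the T-square distance idea in the abstract above, the following Python sketch scores a two-feature traffic profile against a baseline of normal traffic using T² = (x - mean)ᵀ S⁻¹ (x - mean). The feature values, the `baseline` list, and the fixed `THRESHOLD` are hypothetical; the article derives its threshold range from the central limit theorem, which is not reproduced here.

```python
from statistics import mean

# Hypothetical baseline of normal traffic: (connections per second, mean packet size in KB).
baseline = [(10, 1.0), (12, 1.2), (11, 0.9), (13, 1.1), (9, 0.8)]

# Hypothetical cut-off, chosen for illustration only.
THRESHOLD = 10.0


def hotelling_t2(profile, baseline):
    """T-square distance of a 2-feature profile from the baseline mean."""
    xs = [row[0] for row in baseline]
    ys = [row[1] for row in baseline]
    mx, my = mean(xs), mean(ys)
    n = len(baseline)
    # Entries of the sample covariance matrix S (n - 1 denominator).
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in baseline) / (n - 1)
    det = sxx * syy - sxy ** 2
    dx, dy = profile[0] - mx, profile[1] - my
    # d^T S^-1 d, using the closed-form inverse of a 2x2 matrix.
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det


def classify(profile, baseline, threshold=THRESHOLD):
    """Label a profile as 'attack' when its T-square distance exceeds the threshold."""
    return "attack" if hotelling_t2(profile, baseline) > threshold else "normal"


print(classify((11, 1.0), baseline))
print(classify((30, 3.0), baseline))
```

A profile at the baseline mean has a distance near zero, while a profile far from it in both features gets a large distance and is flagged. A real detector would use many more features and a general matrix inverse.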
Structural analysis and design of multivariable control systems: An algebraic approach
NASA Technical Reports Server (NTRS)
Tsay, Yih Tsong; Shieh, Leang-San; Barnett, Stephen
1988-01-01
The application of algebraic system theory to the design of controllers for multivariable (MV) systems is explored analytically using an approach based on state-space representations and matrix-fraction descriptions. Chapters are devoted to characteristic lambda matrices and canonical descriptions of MIMO systems; spectral analysis, divisors, and spectral factors of nonsingular lambda matrices; feedback control of MV systems; and structural decomposition theories and their application to MV control systems.
NASA Astrophysics Data System (ADS)
Huang, Shaohua; Wang, Lan; Chen, Weisheng; Feng, Shangyuan; Lin, Juqiang; Huang, Zufang; Chen, Guannan; Li, Buhong; Chen, Rong
2014-11-01
Non-invasive esophagus cancer detection based on urine surface-enhanced Raman spectroscopy (SERS) analysis was presented. Urine SERS spectra were measured on esophagus cancer patients (n = 56) and healthy volunteers (n = 36) for control analysis. Tentative assignments of the urine SERS spectra indicated some interesting esophagus cancer-specific biomolecular changes, including a decrease in the relative content of urea and an increase in the percentage of uric acid in the urine of esophagus cancer patients compared to that of healthy subjects. Principal component analysis (PCA) combined with linear discriminant analysis (LDA) was employed to analyze and differentiate the SERS spectra between normal and esophagus cancer urine. The diagnostic algorithms utilizing a multivariate analysis method achieved a diagnostic sensitivity of 89.3% and specificity of 83.3% for separating esophagus cancer samples from normal urine samples. These results from the explorative work suggested that silver nanoparticle-based urine SERS analysis coupled with PCA-LDA multivariate analysis has potential for non-invasive detection of esophagus cancer.
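The reported sensitivity and specificity can be related to the group sizes with a short calculation. In the sketch below, the counts of correctly classified samples (50 of 56 patients, 30 of 36 controls) are an assumption inferred from the rounded percentages; the abstract does not state them.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of diseased samples that are correctly identified."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Fraction of healthy samples that are correctly identified."""
    return true_neg / (true_neg + false_pos)


# Assumed counts, consistent with n = 56 patients and n = 36 controls.
sens = sensitivity(true_pos=50, false_neg=6)
spec = specificity(true_neg=30, false_pos=6)

print(f"sensitivity = {100 * sens:.1f}%")
print(f"specificity = {100 * spec:.1f}%")
```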
Alkarkhi, Abbas F M; Ramli, Saifullah Bin; Easa, Azhar Mat
2009-01-01
Major (sodium, potassium, calcium, magnesium) and minor elements (iron, copper, zinc, manganese) and one heavy metal (lead) of Cavendish banana flour and Dream banana flour were determined, and data were analyzed using multivariate statistical techniques of factor analysis and discriminant analysis. Factor analysis yielded four factors explaining more than 81% of the total variance: the first factor explained 28.73%, comprising magnesium, sodium, and iron; the second factor explained 21.47%, comprising only manganese and copper; the third factor explained 15.66%, comprising zinc and lead; while the fourth factor explained 15.50%, comprising potassium. Discriminant analysis showed that magnesium and sodium exhibited a strong contribution in discriminating the two types of banana flour, affording 100% correct assignation. This study presents the usefulness of multivariate statistical techniques for analysis and interpretation of complex mineral content data from banana flour of different varieties.
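The statement that four factors explain more than 81% of the total variance follows from adding the per-factor percentages given in the abstract. A minimal check in Python (the dictionary name is illustrative only):

```python
from math import fsum

# Percentage of total variance explained by each factor, as reported in the abstract.
explained = {
    "factor 1 (Mg, Na, Fe)": 28.73,
    "factor 2 (Mn, Cu)": 21.47,
    "factor 3 (Zn, Pb)": 15.66,
    "factor 4 (K)": 15.50,
}

total = fsum(explained.values())
print(f"cumulative variance explained = {total:.2f}%")
```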
Barimani, Shirin; Kleinebudde, Peter
2017-10-01
A multivariate analysis method, Science-Based Calibration (SBC), was used for the first time for endpoint determination of a tablet coating process using Raman data. Two types of tablet cores, placebo and caffeine cores, received a coating suspension comprising a polyvinyl alcohol-polyethylene glycol graft-copolymer and titanium dioxide to a maximum coating thickness of 80 µm. Raman spectroscopy was used as an in-line PAT tool. The spectra were acquired every minute and correlated to the amount of applied aqueous coating suspension. SBC was compared to another well-known multivariate analysis method, Partial Least Squares-regression (PLS) and a simpler approach, Univariate Data Analysis (UVDA). All developed calibration models had coefficient of determination values (R²) higher than 0.99. The coating endpoints could be predicted with root mean square errors (RMSEP) less than 3.1% of the applied coating suspensions. Compared to PLS and UVDA, SBC proved to be an alternative multivariate calibration method with high predictive power. Copyright © 2017 Elsevier B.V. All rights reserved.
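The two figures of merit quoted above, the coefficient of determination (R²) and the root mean square error of prediction (RMSEP), can be computed as in the following sketch. The `observed` and `predicted` values are hypothetical and are not taken from the study.

```python
from math import sqrt


def rmsep(observed, predicted):
    """Root mean square error of prediction."""
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(squared) / len(squared))


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical amounts of applied coating suspension (arbitrary units).
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [11.0, 19.0, 31.0, 39.0]

print(rmsep(observed, predicted))
print(r_squared(observed, predicted))
```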
Comparison of connectivity analyses for resting state EEG data
NASA Astrophysics Data System (ADS)
Olejarczyk, Elzbieta; Marzetti, Laura; Pizzella, Vittorio; Zappasodi, Filippo
2017-06-01
Objective. In the present work, a nonlinear measure (transfer entropy, TE) was used in a multivariate approach for the analysis of effective connectivity in high density resting state EEG data in eyes open and eyes closed. Advantages of the multivariate approach in comparison to the bivariate one were tested. Moreover, the multivariate TE was compared to an effective linear measure, i.e. directed transfer function (DTF). Finally, the existence of a relationship between the information transfer and the level of brain synchronization as measured by phase synchronization value (PLV) was investigated. Approach. The comparison between the connectivity measures, i.e. bivariate versus multivariate TE, TE versus DTF, TE versus PLV, was performed by means of statistical analysis of indexes based on graph theory. Main results. The multivariate approach is less sensitive to false indirect connections with respect to the bivariate estimates. The multivariate TE differentiated better between eyes closed and eyes open conditions compared to DTF. Moreover, the multivariate TE evidenced non-linear phenomena in information transfer, which are not evidenced by the use of DTF. We also showed that the target of information flow, in particular the frontal region, is an area of greater brain synchronization. Significance. Comparison of different connectivity analysis methods pointed to the advantages of nonlinear methods, and indicated a relationship existing between the flow of information and the level of synchronization of the brain.
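For readers unfamiliar with the synchronization measure mentioned above, PLV is commonly computed as the magnitude of the average unit phasor of the phase difference between two signals. The sketch below assumes the instantaneous phases have already been extracted (for example with a Hilbert transform, not shown); the function name and inputs are illustrative and do not come from the article.

```python
import cmath
import math


def phase_locking_value(phases_a, phases_b):
    """|mean(exp(i * (phi_a - phi_b)))|: 1 for a constant lag, near 0 for no locking."""
    diffs = [a - b for a, b in zip(phases_a, phases_b)]
    return abs(sum(cmath.exp(1j * d) for d in diffs) / len(diffs))


# Two phase series with a constant lag of 0.5 rad: fully locked.
locked_a = [0.1 * k for k in range(100)]
locked_b = [p - 0.5 for p in locked_a]

# Phase differences spread evenly around the circle: no locking.
spread_a = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2]
spread_b = [0.0, 0.0, 0.0, 0.0]

print(phase_locking_value(locked_a, locked_b))
print(phase_locking_value(spread_a, spread_b))
```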
Generating Nonnormal Multivariate Data Using Copulas: Applications to SEM
ERIC Educational Resources Information Center
Mair, Patrick; Satorra, Albert; Bentler, Peter M.
2012-01-01
This article develops a procedure based on copulas to simulate multivariate nonnormal data that satisfy a prespecified variance-covariance matrix. The covariance matrix used can comply with a specific moment structure form (e.g., a factor analysis or a general structural equation model). Thus, the method is particularly useful for Monte Carlo…
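The general copula idea can be sketched in a few lines: draw correlated normal variates, map them to uniforms with the normal CDF, then apply the inverse CDF of a nonnormal marginal. The example below is a minimal bivariate Gaussian-copula sketch with exponential marginals, using only the Python standard library. It is an assumption-laden illustration, not the article's procedure: in particular, it does not reproduce a prespecified variance-covariance matrix, and the function names are hypothetical.

```python
import math
import random
from statistics import NormalDist


def gaussian_copula_exponential(n, rho, seed=0):
    """Draw n pairs with exponential(1) marginals linked by a Gaussian copula."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    samples = []
    for _ in range(n):
        # Correlated standard normals with correlation rho.
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        # Map to uniforms; clamp below 1 so the logarithm stays finite.
        u1 = min(std_normal.cdf(z1), 1.0 - 1e-12)
        u2 = min(std_normal.cdf(z2), 1.0 - 1e-12)
        # Inverse CDF of the exponential(1) distribution.
        samples.append((-math.log(1.0 - u1), -math.log(1.0 - u2)))
    return samples


def pearson(pairs):
    """Sample Pearson correlation of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy)


data = gaussian_copula_exponential(2000, rho=0.7)
print(pearson(data))
```

The resulting variables are skewed (nonnormal) but remain positively dependent. The correlation of the transformed data generally differs from `rho`, which is why a procedure that targets a specific covariance matrix needs an additional calibration step.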
A Versatile Cell Death Screening Assay Using Dye-Stained Cells and Multivariate Image Analysis.
Collins, Tony J; Ylanko, Jarkko; Geng, Fei; Andrews, David W
2015-11-01
A novel dye-based method for measuring cell death in image-based screens is presented. Unlike conventional high- and medium-throughput cell death assays that measure only one form of cell death accurately, using multivariate analysis of micrographs of cells stained with the inexpensive mix, red dye nonyl acridine orange, and a nuclear stain, it was possible to quantify cell death induced by a variety of different agonists even without a positive control. Surprisingly, using a single known cytotoxic agent as a positive control for training a multivariate classifier allowed accurate quantification of cytotoxicity for mechanistically unrelated compounds enabling generation of dose-response curves. Comparison with low throughput biochemical methods suggested that cell death was accurately distinguished from cell stress induced by low concentrations of the bioactive compounds Tunicamycin and Brefeldin A. High-throughput image-based format analyses of more than 300 kinase inhibitors correctly identified 11 as cytotoxic with only 1 false positive. The simplicity and robustness of this dye-based assay makes it particularly suited to live cell screening for toxic compounds.
A Versatile Cell Death Screening Assay Using Dye-Stained Cells and Multivariate Image Analysis
Collins, Tony J.; Ylanko, Jarkko; Geng, Fei
2015-01-01
A novel dye-based method for measuring cell death in image-based screens is presented. Unlike conventional high- and medium-throughput cell death assays that measure only one form of cell death accurately, using multivariate analysis of micrographs of cells stained with the inexpensive mix, red dye nonyl acridine orange, and a nuclear stain, it was possible to quantify cell death induced by a variety of different agonists even without a positive control. Surprisingly, using a single known cytotoxic agent as a positive control for training a multivariate classifier allowed accurate quantification of cytotoxicity for mechanistically unrelated compounds enabling generation of dose–response curves. Comparison with low throughput biochemical methods suggested that cell death was accurately distinguished from cell stress induced by low concentrations of the bioactive compounds Tunicamycin and Brefeldin A. High-throughput image-based format analyses of more than 300 kinase inhibitors correctly identified 11 as cytotoxic with only 1 false positive. The simplicity and robustness of this dye-based assay makes it particularly suited to live cell screening for toxic compounds. PMID:26422066
Choi, Young Hae; Sertic, Sarah; Kim, Hye Kyong; Wilson, Erica G; Michopoulos, Filippos; Lefeber, Alfons W M; Erkelens, Cornelis; Prat Kricun, Sergio D; Verpoorte, Robert
2005-02-23
The metabolomic analysis of 11 Ilex species, I. argentina, I. brasiliensis, I. brevicuspis, I. dumosa var. dumosa, I. dumosa var. guaranina, I. integerrima, I. microdonta, I. paraguariensis var. paraguariensis, I. pseudobuxus, I. taubertiana, and I. theezans, was carried out by NMR spectroscopy and multivariate data analysis. The analysis using principal component analysis and classification of the ¹H NMR spectra showed a clear discrimination of those samples based on the metabolites present in the organic and aqueous fractions. The major metabolites that contribute to the discrimination are arbutin, caffeine, phenylpropanoids, and theobromine. Among those metabolites, arbutin, which has not been reported yet as a constituent of Ilex species, was found to be a biomarker for I. argentina, I. brasiliensis, I. brevicuspis, I. integerrima, I. microdonta, I. pseudobuxus, I. taubertiana, and I. theezans. This reliable method based on the determination of a large number of metabolites makes the chemotaxonomical analysis of Ilex species possible.
Alegre-Cortés, J; Soto-Sánchez, C; Pizá, Á G; Albarracín, A L; Farfán, F D; Felice, C J; Fernández, E
2016-07-15
Linear analysis has classically provided powerful tools for understanding the behavior of neural populations, but the neuron responses to real-world stimulation are nonlinear under some conditions, and many neuronal components demonstrate strong nonlinear behavior. In spite of this, temporal and frequency dynamics of neural populations to sensory stimulation have been usually analyzed with linear approaches. In this paper, we propose the use of Noise-Assisted Multivariate Empirical Mode Decomposition (NA-MEMD), a data-driven template-free algorithm, plus the Hilbert transform as a suitable tool for analyzing population oscillatory dynamics in a multi-dimensional space with instantaneous frequency (IF) resolution. The proposed approach was able to extract oscillatory information of neurophysiological data of deep vibrissal nerve and visual cortex multiunit recordings that were not evidenced using linear approaches with fixed bases such as the Fourier analysis. Texture discrimination analysis performance was increased when Noise-Assisted Multivariate Empirical Mode plus Hilbert transform was implemented, compared to linear techniques. Cortical oscillatory population activity was analyzed with precise time-frequency resolution. Similarly, NA-MEMD provided increased time-frequency resolution of cortical oscillatory population activity. Noise-Assisted Multivariate Empirical Mode Decomposition plus Hilbert transform is an improved method to analyze neuronal population oscillatory dynamics overcoming linear and stationary assumptions of classical methods. Copyright © 2016 Elsevier B.V. All rights reserved.
PYCHEM: a multivariate analysis package for python.
Jarvis, Roger M; Broadhurst, David; Johnson, Helen; O'Boyle, Noel M; Goodacre, Royston
2006-10-15
We have implemented a multivariate statistical analysis toolbox, with an optional standalone graphical user interface (GUI), using the Python scripting language. This is a free and open source project that addresses the need for a multivariate analysis toolbox in Python. Although the functionality provided does not cover the full range of multivariate tools that are available, it has a broad complement of methods that are widely used in the biological sciences. In contrast to tools like MATLAB, PyChem 2.0.0 is easily accessible and free, allows for rapid extension using a range of Python modules and is part of the growing amount of complementary and interoperable scientific software in Python based upon SciPy. One of the attractions of PyChem is that it is an open source project and so there is an opportunity, through collaboration, to increase the scope of the software and to continually evolve a user-friendly platform that has applicability across a wide range of analytical and post-genomic disciplines. http://sourceforge.net/projects/pychem
Borrowing of strength and study weights in multivariate and network meta-analysis.
Jackson, Dan; White, Ian R; Price, Malcolm; Copas, John; Riley, Richard D
2017-12-01
Multivariate and network meta-analysis have the potential for the estimated mean of one effect to borrow strength from the data on other effects of interest. The extent of this borrowing of strength is usually assessed informally. We present new mathematical definitions of 'borrowing of strength'. Our main proposal is based on a decomposition of the score statistic, which we show can be interpreted as comparing the precision of estimates from the multivariate and univariate models. Our definition of borrowing of strength therefore emulates the usual informal assessment. We also derive a method for calculating study weights, which we embed into the same framework as our borrowing of strength statistics, so that percentage study weights can accompany the results from multivariate and network meta-analyses as they do in conventional univariate meta-analyses. Our proposals are illustrated using three meta-analyses involving correlated effects for multiple outcomes, multiple risk factor associations and multiple treatments (network meta-analysis).
Borrowing of strength and study weights in multivariate and network meta-analysis
Jackson, Dan; White, Ian R; Price, Malcolm; Copas, John; Riley, Richard D
2016-01-01
Multivariate and network meta-analysis have the potential for the estimated mean of one effect to borrow strength from the data on other effects of interest. The extent of this borrowing of strength is usually assessed informally. We present new mathematical definitions of ‘borrowing of strength’. Our main proposal is based on a decomposition of the score statistic, which we show can be interpreted as comparing the precision of estimates from the multivariate and univariate models. Our definition of borrowing of strength therefore emulates the usual informal assessment. We also derive a method for calculating study weights, which we embed into the same framework as our borrowing of strength statistics, so that percentage study weights can accompany the results from multivariate and network meta-analyses as they do in conventional univariate meta-analyses. Our proposals are illustrated using three meta-analyses involving correlated effects for multiple outcomes, multiple risk factor associations and multiple treatments (network meta-analysis). PMID:26546254
Network structure of multivariate time series.
Lacasa, Lucas; Nicosia, Vincenzo; Latora, Vito
2015-10-21
Our understanding of a variety of phenomena in physics, biology and economics crucially depends on the analysis of multivariate time series. While a wide range of tools and techniques for time series analysis already exist, the increasing availability of massive data structures calls for new approaches for multidimensional signal processing. We present here a non-parametric method to analyse multivariate time series, based on the mapping of a multidimensional time series into a multilayer network, which allows information on a high dimensional dynamical system to be extracted through the analysis of the structure of the associated multiplex network. The method is simple to implement, general, scalable, does not require ad hoc phase space partitioning, and is thus suitable for the analysis of large, heterogeneous and non-stationary time series. We show that simple structural descriptors of the associated multiplex networks allow one to extract and quantify nontrivial properties of coupled chaotic maps, including the transition between different dynamical phases and the onset of various types of synchronization. As a concrete example we then study financial time series, showing that a multiplex network analysis can efficiently discriminate crises from periods of financial stability, where standard methods based on time-series symbolization often fail.
Motegi, Hiromi; Tsuboi, Yuuri; Saga, Ayako; Kagami, Tomoko; Inoue, Maki; Toki, Hideaki; Minowa, Osamu; Noda, Tetsuo; Kikuchi, Jun
2015-11-04
There is an increasing need to use multivariate statistical methods for understanding biological functions, identifying the mechanisms of diseases, and exploring biomarkers. In addition to classical analyses such as hierarchical cluster analysis, principal component analysis, and partial least squares discriminant analysis, various multivariate strategies, including independent component analysis, non-negative matrix factorization, and multivariate curve resolution, have recently been proposed. However, determining the number of components is problematic. Despite the proposal of several different methods, no satisfactory approach has yet been reported. To resolve this problem, we implemented a new idea: classifying a component as "reliable" or "unreliable" based on the reproducibility of its appearance, regardless of the number of components in the calculation. Using the clustering method for classification, we applied this idea to multivariate curve resolution-alternating least squares (MCR-ALS). Comparisons between conventional and modified methods applied to proton nuclear magnetic resonance (¹H-NMR) spectral datasets derived from known standard mixtures and biological mixtures (urine and feces of mice) revealed that more plausible results are obtained by the modified method. In particular, clusters containing little information were detected with reliability. This strategy, named "cluster-aided MCR-ALS," will facilitate the attainment of more reliable results in the metabolomics datasets.
Multivariate assessment of event-related potentials with the t-CWT method.
Bostanov, Vladimir
2015-11-05
Event-related brain potentials (ERPs) are usually assessed with univariate statistical tests although they are essentially multivariate objects. Brain-computer interface applications are a notable exception to this practice, because they are based on multivariate classification of single-trial ERPs. Multivariate ERP assessment can be facilitated by feature extraction methods. One such method is t-CWT, a mathematical-statistical algorithm based on the continuous wavelet transform (CWT) and Student's t-test. This article begins with a geometric primer on some basic concepts of multivariate statistics as applied to ERP assessment in general and to the t-CWT method in particular. Further, it presents for the first time a detailed, step-by-step, formal mathematical description of the t-CWT algorithm. A new multivariate outlier rejection procedure based on principal component analysis in the frequency domain is presented as an important pre-processing step. The MATLAB and GNU Octave implementation of t-CWT is also made publicly available for the first time as free and open source code. The method is demonstrated on some example ERP data obtained in a passive oddball paradigm. Finally, some conceptually novel applications of the multivariate approach in general and of the t-CWT method in particular are suggested and discussed. Hopefully, the publication of both the t-CWT source code and its underlying mathematical algorithm along with a didactic geometric introduction to some basic concepts of multivariate statistics would make t-CWT more accessible to both users and developers in the field of neuroscience research.
Web-based tools for modelling and analysis of multivariate data: California ozone pollution activity
Dinov, Ivo D.; Christou, Nicolas
2014-01-01
This article presents a hands-on web-based activity motivated by the relation between human health and ozone pollution in California. This case study is based on multivariate data collected monthly at 20 locations in California between 1980 and 2006. Several strategies and tools for data interrogation and exploratory data analysis, model fitting and statistical inference on these data are presented. All components of this case study (data, tools, activity) are freely available online at: http://wiki.stat.ucla.edu/socr/index.php/SOCR_MotionCharts_CAOzoneData. Several types of exploratory (motion charts, box-and-whisker plots, spider charts) and quantitative (inference, regression, analysis of variance (ANOVA)) data analyses tools are demonstrated. Two specific human health related questions (temporal and geographic effects of ozone pollution) are discussed as motivational challenges. PMID:24465054
Dinov, Ivo D; Christou, Nicolas
2011-09-01
This article presents a hands-on web-based activity motivated by the relation between human health and ozone pollution in California. This case study is based on multivariate data collected monthly at 20 locations in California between 1980 and 2006. Several strategies and tools for data interrogation and exploratory data analysis, model fitting and statistical inference on these data are presented. All components of this case study (data, tools, activity) are freely available online at: http://wiki.stat.ucla.edu/socr/index.php/SOCR_MotionCharts_CAOzoneData. Several types of exploratory (motion charts, box-and-whisker plots, spider charts) and quantitative (inference, regression, analysis of variance (ANOVA)) data analyses tools are demonstrated. Two specific human health related questions (temporal and geographic effects of ozone pollution) are discussed as motivational challenges.
Multivariable Parametric Cost Model for Ground Optical Telescope Assembly
NASA Technical Reports Server (NTRS)
Stahl, H. Philip; Rowell, Ginger Holmes; Reese, Gayle; Byberg, Alicia
2005-01-01
A parametric cost model for ground-based telescopes is developed using multivariable statistical analysis of both engineering and performance parameters. While diameter continues to be the dominant cost driver, diffraction-limited wavelength is found to be a secondary driver. Other parameters such as radius of curvature are examined. The model includes an explicit factor for primary mirror segmentation and/or duplication (i.e., multi-telescope phased-array systems). Additionally, single variable models based on aperture diameter are derived.
Zhi, Ruicong; Zhao, Lei; Xie, Nan; Wang, Houyin; Shi, Bolin; Shi, Jingye
2016-01-13
A framework for establishing a standard reference scale (texture) is proposed, using multivariate statistical analysis of instrumental measurement and sensory evaluation data. Multivariate statistical analysis is conducted to rapidly select typical reference samples with characteristics of universality, representativeness, stability, substitutability, and traceability. The reasonableness of the framework method is verified by establishing a standard reference scale of a texture attribute (hardness) with well-known Chinese foods. More than 100 food products in 16 categories were tested using instrumental measurement (TPA test), and the results were analyzed with clustering analysis, principal component analysis, relative standard deviation, and analysis of variance. As a result, nine kinds of foods were determined to construct the hardness standard reference scale. The results indicate that the regression coefficient between the estimated sensory value and the instrumentally measured value is significant (R² = 0.9765), which fits well with Stevens's theory. The research provides a reliable theoretical basis and practical guide for quantitative standard reference scale establishment on food texture characteristics.
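Stevens's power law relates perceived intensity ψ to physical stimulus magnitude φ as ψ = k·φ^n, which becomes a straight line after taking logarithms of both sides. The sketch below fits k and n by ordinary least squares on log-transformed values and reports R². The abstract does not give the regression form actually used in the study, so the log-log fit is an assumption, and the data are synthetic values generated from ψ = 2·φ^0.5.

```python
import math


def fit_power_law(stimulus, sensation):
    """Least-squares fit of log(sensation) = log(k) + n * log(stimulus)."""
    xs = [math.log(s) for s in stimulus]
    ys = [math.log(p) for p in sensation]
    count = len(xs)
    mean_x = sum(xs) / count
    mean_y = sum(ys) / count
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    exponent = sxy / sxx
    log_k = mean_y - exponent * mean_x
    # R^2 of the straight-line fit in log-log space.
    ss_res = sum((y - (log_k + exponent * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return math.exp(log_k), exponent, 1 - ss_res / ss_tot


# Synthetic data following sensation = 2 * stimulus ** 0.5 exactly.
stimulus = [1.0, 4.0, 9.0, 16.0]
sensation = [2.0, 4.0, 6.0, 8.0]

k, n, r2 = fit_power_law(stimulus, sensation)
print(k, n, r2)
```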
Wang, Yalin; Zhang, Jie; Gutman, Boris; Chan, Tony F.; Becker, James T.; Aizenstein, Howard J.; Lopez, Oscar L.; Tamburo, Robert J.; Toga, Arthur W.; Thompson, Paul M.
2010-01-01
Here we developed a new method, called multivariate tensor-based surface morphometry (TBM), and applied it to study lateral ventricular surface differences associated with HIV/AIDS. Using concepts from differential geometry and the theory of differential forms, we created mathematical structures known as holomorphic one-forms, to obtain an efficient and accurate conformal parameterization of the lateral ventricular surfaces in the brain. The new meshing approach also provides a natural way to register anatomical surfaces across subjects, and improves on prior methods as it handles surfaces that branch and join at complex 3D junctions. To analyze anatomical differences, we computed new statistics from the Riemannian surface metrics - these retain multivariate information on local surface geometry. We applied this framework to analyze lateral ventricular surface morphometry in 3D MRI data from 11 subjects with HIV/AIDS and 8 healthy controls. Our method detected a 3D profile of surface abnormalities even in this small sample. Multivariate statistics on the local tensors gave better effect sizes for detecting group differences, relative to other TBM-based methods including analysis of the Jacobian determinant, the largest and smallest eigenvalues of the surface metric, and the pair of eigenvalues of the Jacobian matrix. The resulting analysis pipeline may improve the power of surface-based morphometry studies of the brain. PMID:19900560
ERIC Educational Resources Information Center
Mun, Eun Young; von Eye, Alexander; Bates, Marsha E.; Vaschillo, Evgeny G.
2008-01-01
Model-based cluster analysis is a new clustering procedure to investigate population heterogeneity utilizing finite mixture multivariate normal densities. It is an inferentially based, statistically principled procedure that allows comparison of nonnested models using the Bayesian information criterion to compare multiple models and identify the…
Vitte, Joana; Ranque, Stéphane; Carsin, Ania; Gomez, Carine; Romain, Thomas; Cassagne, Carole; Gouitaa, Marion; Baravalle-Einaudi, Mélisande; Bel, Nathalie Stremler-Le; Reynaud-Gaubert, Martine; Dubus, Jean-Christophe; Mège, Jean-Louis; Gaudart, Jean
2017-01-01
Molecular-based allergy diagnosis yields multiple biomarker datasets. The classical diagnostic score for allergic bronchopulmonary aspergillosis (ABPA), a severe disease usually occurring in asthmatic patients and people with cystic fibrosis, comprises succinct immunological criteria formulated in 1977: total IgE, anti-Aspergillus fumigatus (Af) IgE, anti-Af "precipitins," and anti-Af IgG. Progress achieved over the last four decades led to multiple IgE and IgG4 Af biomarkers available with quantitative, standardized, molecular-level reports. These newly available biomarkers have not been included in the current diagnostic criteria, either individually or in algorithms, despite persistent underdiagnosis of ABPA. Large numbers of individual biomarkers may hinder their use in clinical practice. Conversely, multivariate analysis using new tools may bring about a better chance of fewer diagnostic mistakes. We report here a proof-of-concept work consisting of a three-step multivariate analysis of Af IgE, IgG, and IgG4 biomarkers through a combination of principal component analysis, hierarchical ascendant classification, and classification and regression tree multivariate analysis. The resulting diagnostic algorithms might show the way for novel criteria and improved diagnostic efficiency in Af-sensitized patients at risk for ABPA.
An Individualized Student Term Project for Multivariate Calculus
ERIC Educational Resources Information Center
Gordon, Sheldon P.
2004-01-01
In this article, the author describes an individualized term project that is designed to increase student understanding of some of the major concepts and methods in multivariate calculus. The project involves having each student conduct a complete max-min analysis of a third degree polynomial in x and y that is based on his or her social security…
Iorgulescu, E; Voicu, V A; Sârbu, C; Tache, F; Albu, F; Medvedovici, A
2016-08-01
The influence of the experimental variability (instrumental repeatability, instrumental intermediate precision and sample preparation variability) and data pre-processing (normalization, peak alignment, background subtraction) on the discrimination power of multivariate data analysis methods (Principal Component Analysis -PCA- and Cluster Analysis -CA-) as well as a new algorithm based on linear regression was studied. Data used in the study were obtained through positive or negative ion monitoring electrospray mass spectrometry (+/-ESI/MS) and reversed phase liquid chromatography/UV spectrometric detection (RPLC/UV) applied to green tea extracts. Extractions in ethanol and heated water infusion were used as sample preparation procedures. The multivariate methods were directly applied to mass spectra and chromatograms, involving strictly a holistic comparison of shapes, without assignment of any structural identity to compounds. An alternative data interpretation based on linear regression analysis mutually applied to data series is also discussed. Slopes, intercepts and correlation coefficients produced by the linear regression analysis applied on pairs of very large experimental data series successfully retain information resulting from high frequency instrumental acquisition rates, obviously better defining the profiles being compared. Consequently, each type of sample or comparison between samples produces in the Cartesian space an ellipsoidal volume defined by the normal variation intervals of the slope, intercept and correlation coefficient. Distances between volumes graphically illustrates (dis)similarities between compared data. The instrumental intermediate precision had the major effect on the discrimination power of the multivariate data analysis methods. 
Mass spectra produced through ionization from liquid state in atmospheric pressure conditions of bulk complex mixtures resulting from extracted materials of natural origins provided an excellent data basis for multivariate analysis methods, equivalent to data resulting from chromatographic separations. The alternative evaluation of very large data series based on linear regression analysis produced information equivalent to results obtained through application of PCA and CA. Copyright © 2016 Elsevier B.V. All rights reserved.
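The regression-based comparison described in the abstract above can be sketched as follows. This is only an illustrative sketch, not the authors' algorithm: the function name `compare_profiles` and the two short data series are hypothetical, and ordinary least squares is assumed as the regression method. It shows how two profiles sampled at the same points reduce to a (slope, intercept, correlation coefficient) triple.

```python
import math


def compare_profiles(x, y):
    """Ordinary least-squares fit of y on x; returns (slope, intercept, r).

    x and y are two data series (e.g. two spectra or chromatograms)
    sampled at the same points. Hypothetical helper for illustration.
    """
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Made-up profiles: the second is the first scaled by 2 and offset by 0.5,
# so the fit should recover slope 2, intercept 0.5 and r close to 1.
reference = [0.0, 1.0, 4.0, 9.0, 4.0, 1.0, 0.0]
scaled = [2.0 * v + 0.5 for v in reference]
slope, intercept, r = compare_profiles(reference, scaled)
print(slope, intercept, r)
```

Repeating such a fit over replicate measurements would give the spread of each of the three quantities, which is what the abstract uses to define a volume per sample type.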
NASA Technical Reports Server (NTRS)
Djorgovski, George
1993-01-01
The existing and forthcoming data bases from NASA missions contain an abundance of information whose complexity cannot be efficiently tapped with simple statistical techniques. Powerful multivariate statistical methods already exist which can be used to harness much of the richness of these data. Automatic classification techniques have been developed to solve the problem of identifying known types of objects in multiparameter data sets, in addition to leading to the discovery of new physical phenomena and classes of objects. We propose an exploratory study and integration of promising techniques in the development of a general and modular classification/analysis system for very large data bases, which would enhance and optimize data management and the use of human research resources.
NASA Technical Reports Server (NTRS)
Djorgovski, Stanislav
1992-01-01
The existing and forthcoming data bases from NASA missions contain an abundance of information whose complexity cannot be efficiently tapped with simple statistical techniques. Powerful multivariate statistical methods already exist which can be used to harness much of the richness of these data. Automatic classification techniques have been developed to solve the problem of identifying known types of objects in multi parameter data sets, in addition to leading to the discovery of new physical phenomena and classes of objects. We propose an exploratory study and integration of promising techniques in the development of a general and modular classification/analysis system for very large data bases, which would enhance and optimize data management and the use of human research resources.
2013-01-01
Background: Matching pursuit algorithm (MP), especially with recent multivariate extensions, offers unique advantages in analysis of EEG and MEG. Methods: We propose a novel construction of an optimal Gabor dictionary, based upon the metrics introduced in this paper. We implement this construction in a freely available software for MP decomposition of multivariate time series, with a user friendly interface via the Svarog package (Signal Viewer, Analyzer and Recorder On GPL, http://braintech.pl/svarog), and provide a hands-on introduction to its application to EEG. Finally, we describe numerical and mathematical optimizations used in this implementation. Results: Optimal Gabor dictionaries, based on the metric introduced in this paper, for the first time allowed for a priori assessment of maximum one-step error of the MP algorithm. Variants of multivariate MP, implemented in the accompanying software, are organized according to the mathematical properties of the algorithms, relevant in the light of EEG/MEG analysis. Some of these variants have been successfully applied to both multichannel and multitrial EEG and MEG in previous studies, improving preprocessing for EEG/MEG inverse solutions and parameterization of evoked potentials in single trials; we mention also ongoing work and possible novel applications. Conclusions: Mathematical results presented in this paper improve our understanding of the basics of the MP algorithm. Simple introduction of its properties and advantages, together with the accompanying stable and user-friendly Open Source software package, pave the way for a widespread and reproducible analysis of multivariate EEG and MEG time series and novel applications, while retaining a high degree of compatibility with the traditional, visual analysis of EEG. PMID:24059247
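For readers unfamiliar with matching pursuit, the basic greedy iteration can be sketched as below. This is a generic single-channel sketch and not the Svarog implementation: the four-atom dictionary is a toy stand-in for a Gabor dictionary, and all names (`matching_pursuit`, `atoms`, `signal`) are hypothetical. At each step the atom with the largest absolute inner product with the residual is selected and its contribution is subtracted.

```python
import math


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def matching_pursuit(signal, atoms, n_iter):
    """Greedy MP: returns ([(atom_index, coefficient), ...], residual).

    Atoms are assumed to have unit norm, so the inner product is the
    optimal coefficient for the selected atom.
    """
    residual = list(signal)
    picks = []
    for _ in range(n_iter):
        products = [dot(residual, atom) for atom in atoms]
        best = max(range(len(atoms)), key=lambda k: abs(products[k]))
        coef = products[best]
        residual = [r - coef * a for r, a in zip(residual, atoms[best])]
        picks.append((best, coef))
    return picks, residual


# Toy dictionary (an assumption for illustration; real EEG work would use
# Gabor atoms) and a signal built from atoms 0 and 3.
atoms = [normalize(a) for a in ([1, 0, 0, 0], [0, 1, 0, 0],
                                [1, 1, 1, 1], [0, 0, 1, -1])]
signal = [2.0, 0.0, 1.0, -1.0]
picks, residual = matching_pursuit(signal, atoms, n_iter=2)
print(picks, residual)
```

The multivariate variants mentioned in the abstract differ in how the atom is chosen across channels or trials; that selection rule is not shown here.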
Hou, Deyi; O'Connor, David; Nathanail, Paul; Tian, Li; Ma, Yan
2017-12-01
Heavy metal soil contamination is associated with potential toxicity to humans or ecotoxicity. Scholars have increasingly used a combination of geographical information science (GIS) with geostatistical and multivariate statistical analysis techniques to examine the spatial distribution of heavy metals in soils at a regional scale. A review of such studies showed that most soil sampling programs were based on grid patterns and composite sampling methodologies. Many programs intended to characterize various soil types and land use types. The most often used sampling depth intervals were 0-0.10 m, or 0-0.20 m, below surface; and the sampling densities used ranged from 0.0004 to 6.1 samples per km², with a median of 0.4 samples per km². The most widely used spatial interpolators were inverse distance weighted interpolation and ordinary kriging; and the most often used multivariate statistical analysis techniques were principal component analysis and cluster analysis. The review also identified several determining and correlating factors in heavy metal distribution in soils, including soil type, soil pH, soil organic matter, land use type, Fe, Al, and heavy metal concentrations. The major natural and anthropogenic sources of heavy metals were found to derive from lithogenic origin, roadway and transportation, atmospheric deposition, wastewater and runoff from industrial and mining facilities, fertilizer application, livestock manure, and sewage sludge. This review argues that the full potential of integrated GIS and multivariate statistical analysis for assessing heavy metal distribution in soils on a regional scale has not yet been fully realized.
It is proposed that future research be conducted to map multivariate results in GIS to pinpoint specific anthropogenic sources, to analyze temporal trends in addition to spatial patterns, to optimize modeling parameters, and to expand the use of different multivariate analysis tools beyond principal component analysis (PCA) and cluster analysis (CA). Copyright © 2017 Elsevier Ltd. All rights reserved.
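Inverse distance weighted interpolation, named in the review above as one of the most widely used spatial interpolators, can be sketched in a few lines. This is a minimal sketch under stated assumptions: the function name `idw`, the power parameter default of 2, and the sample coordinates and concentrations are all hypothetical and are not taken from any of the reviewed studies.

```python
import math


def idw(samples, target, power=2.0):
    """Inverse distance weighted estimate at `target`.

    samples: list of ((x, y), value) pairs, e.g. sampling locations with
    a measured concentration. Weights are 1 / distance**power.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (x, y), value in samples:
        d = math.hypot(x - target[0], y - target[1])
        if d == 0.0:
            # Target coincides with a sampling point: return its value.
            return value
        w = 1.0 / d ** power
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total


# Made-up sampling points (coordinates in km, values in mg/kg).
samples = [((0.0, 0.0), 10.0), ((2.0, 0.0), 30.0)]
midpoint_estimate = idw(samples, (1.0, 0.0))
print(midpoint_estimate)
```

A target equidistant from both points receives equal weights, so the estimate is their mean. Ordinary kriging, the other interpolator named in the review, additionally requires a fitted variogram and is not shown.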
Yu, Hao; Dick, Andrew W
2012-10-01
Given the rapid growth of health care costs, some experts were concerned with erosion of employment-based private insurance (EBPI). This empirical analysis aims to quantify the concern. Using the National Health Account, we generated a cost index to represent state-level annual cost growth. We merged it with the 1996-2003 Medical Expenditure Panel Survey. The unit of analysis is the family. We conducted both bivariate and multivariate logistic analyses. The bivariate analysis found a significant inverse association between the cost index and the proportion of families receiving an offer of EBPI. The multivariate analysis showed that the cost index was significantly negatively associated with the likelihood of receiving an EBPI offer for the entire sample and for families in the first, second, and third quartiles of income distribution. The cost index was also significantly negatively associated with the proportion of families with EBPI for the entire year for each family member (EBPI-EYEM). The multivariate analysis confirmed significance of the relationship for the entire sample, and for families in the second and third quartiles of income distribution. Among the families with EBPI-EYEM, there was a positive relationship between the cost index and this group's likelihood of having out-of-pocket expenditures exceeding 10 percent of family income. The multivariate analysis confirmed significance of the relationship for the entire group and for families in the second and third quartiles of income distribution. Rising health costs reduce EBPI availability and enrollment, and the financial protection provided by it, especially for middle-class families. © Health Research and Educational Trust.
Yu, Hao; Dick, Andrew W
2012-01-01
Background: Given the rapid growth of health care costs, some experts were concerned with erosion of employment-based private insurance (EBPI). This empirical analysis aims to quantify the concern. Methods: Using the National Health Account, we generated a cost index to represent state-level annual cost growth. We merged it with the 1996–2003 Medical Expenditure Panel Survey. The unit of analysis is the family. We conducted both bivariate and multivariate logistic analyses. Results: The bivariate analysis found a significant inverse association between the cost index and the proportion of families receiving an offer of EBPI. The multivariate analysis showed that the cost index was significantly negatively associated with the likelihood of receiving an EBPI offer for the entire sample and for families in the first, second, and third quartiles of income distribution. The cost index was also significantly negatively associated with the proportion of families with EBPI for the entire year for each family member (EBPI-EYEM). The multivariate analysis confirmed significance of the relationship for the entire sample, and for families in the second and third quartiles of income distribution. Among the families with EBPI-EYEM, there was a positive relationship between the cost index and this group's likelihood of having out-of-pocket expenditures exceeding 10 percent of family income. The multivariate analysis confirmed significance of the relationship for the entire group and for families in the second and third quartiles of income distribution. Conclusions: Rising health costs reduce EBPI availability and enrollment, and the financial protection provided by it, especially for middle-class families. PMID:22417314
Nonlinear multivariate and time series analysis by neural network methods
NASA Astrophysics Data System (ADS)
Hsieh, William W.
2004-03-01
Methods in multivariate statistical analysis are essential for working with large amounts of geophysical data, data from observational arrays, from satellites, or from numerical model output. In classical multivariate statistical analysis, there is a hierarchy of methods, starting with linear regression at the base, followed by principal component analysis (PCA) and finally canonical correlation analysis (CCA). A multivariate time series method, the singular spectrum analysis (SSA), has been a fruitful extension of the PCA technique. The common drawback of these classical methods is that only linear structures can be correctly extracted from the data. Since the late 1980s, neural network methods have become popular for performing nonlinear regression and classification. More recently, neural network methods have been extended to perform nonlinear PCA (NLPCA), nonlinear CCA (NLCCA), and nonlinear SSA (NLSSA). This paper presents a unified view of the NLPCA, NLCCA, and NLSSA techniques and their applications to various data sets of the atmosphere and the ocean (especially for the El Niño-Southern Oscillation and the stratospheric quasi-biennial oscillation). These data sets reveal that the linear methods are often too simplistic to describe real-world systems, with a tendency to scatter a single oscillatory phenomenon into numerous unphysical modes or higher harmonics, which can be largely alleviated in the new nonlinear paradigm.
Falahati, Farshad; Westman, Eric; Simmons, Andrew
2014-01-01
Machine learning algorithms and multivariate data analysis methods have been widely utilized in the field of Alzheimer's disease (AD) research in recent years. Advances in medical imaging and medical image analysis have provided a means to generate and extract valuable neuroimaging information. Automatic classification techniques provide tools to analyze this information and observe inherent disease-related patterns in the data. In particular, these classifiers have been used to discriminate AD patients from healthy control subjects and to predict conversion from mild cognitive impairment to AD. In this paper, recent studies are reviewed that have used machine learning and multivariate analysis in the field of AD research. The main focus is on studies that used structural magnetic resonance imaging (MRI), but studies that included positron emission tomography and cerebrospinal fluid biomarkers in addition to MRI are also considered. A wide variety of materials and methods has been employed in different studies, resulting in a range of different outcomes. Influential factors such as classifiers, feature extraction algorithms, feature selection methods, validation approaches, and cohort properties are reviewed, as well as key MRI-based and multi-modal based studies. Current and future trends are discussed.
Multivariable Parametric Cost Model for Ground Optical: Telescope Assembly
NASA Technical Reports Server (NTRS)
Stahl, H. Philip; Rowell, Ginger Holmes; Reese, Gayle; Byberg, Alicia
2004-01-01
A parametric cost model for ground-based telescopes is developed using multi-variable statistical analysis of both engineering and performance parameters. While diameter continues to be the dominant cost driver, diffraction limited wavelength is found to be a secondary driver. Other parameters such as radius of curvature were examined. The model includes an explicit factor for primary mirror segmentation and/or duplication (i.e. multi-telescope phased-array systems). Additionally, single variable models based on aperture diameter were derived.
Bayesian multivariate hierarchical transformation models for ROC analysis.
O'Malley, A James; Zou, Kelly H
2006-02-15
A Bayesian multivariate hierarchical transformation model (BMHTM) is developed for receiver operating characteristic (ROC) curve analysis based on clustered continuous diagnostic outcome data with covariates. Two special features of this model are that it incorporates non-linear monotone transformations of the outcomes and that multiple correlated outcomes may be analysed. The mean, variance, and transformation components are all modelled parametrically, enabling a wide range of inferences. The general framework is illustrated by focusing on two problems: (1) analysis of the diagnostic accuracy of a covariate-dependent univariate test outcome requiring a Box-Cox transformation within each cluster to map the test outcomes to a common family of distributions; (2) development of an optimal composite diagnostic test using multivariate clustered outcome data. In the second problem, the composite test is estimated using discriminant function analysis and compared to the test derived from logistic regression analysis where the gold standard is a binary outcome. The proposed methodology is illustrated on prostate cancer biopsy data from a multi-centre clinical trial.
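The Box-Cox transformation mentioned in problem (1) above is the standard one-parameter power family. The sketch below shows only that transformation; it does not reproduce the Bayesian hierarchical model, and the function name `box_cox` and the example values are hypothetical.

```python
import math


def box_cox(y, lam):
    """Box-Cox transform of a positive value y with parameter lam.

    (y**lam - 1) / lam for lam != 0, and log(y) in the limit lam == 0.
    """
    if y <= 0:
        raise ValueError("Box-Cox requires a positive outcome")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


# Made-up test outcomes transformed with three different parameters.
outcomes = [0.5, 1.0, 4.0, 9.0]
for lam in (0.0, 0.5, 1.0):
    print(lam, [round(box_cox(y, lam), 4) for y in outcomes])
```

Because the transform is monotone increasing in y for any fixed parameter, it preserves the ranking of test outcomes. In the abstract's setting the parameter is estimated within each cluster rather than fixed in advance as it is here.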
Bayesian multivariate hierarchical transformation models for ROC analysis
O'Malley, A. James; Zou, Kelly H.
2006-01-01
SUMMARY A Bayesian multivariate hierarchical transformation model (BMHTM) is developed for receiver operating characteristic (ROC) curve analysis based on clustered continuous diagnostic outcome data with covariates. Two special features of this model are that it incorporates non-linear monotone transformations of the outcomes and that multiple correlated outcomes may be analysed. The mean, variance, and transformation components are all modelled parametrically, enabling a wide range of inferences. The general framework is illustrated by focusing on two problems: (1) analysis of the diagnostic accuracy of a covariate-dependent univariate test outcome requiring a Box–Cox transformation within each cluster to map the test outcomes to a common family of distributions; (2) development of an optimal composite diagnostic test using multivariate clustered outcome data. In the second problem, the composite test is estimated using discriminant function analysis and compared to the test derived from logistic regression analysis where the gold standard is a binary outcome. The proposed methodology is illustrated on prostate cancer biopsy data from a multi-centre clinical trial. PMID:16217836
DOE Office of Scientific and Technical Information (OSTI.GOV)
Beppler, Christina L
2015-12-01
A new approach was created for studying energetic material degradation. This approach involved detecting and tentatively identifying the non-volatile chemical species that form as the CL-20 energetic material thermally degrades, using liquid chromatography-mass spectrometry (LC-MS) with multivariate statistical data analysis. Multivariate data analysis showed clear separation and clustering of samples based on sample group: either pristine or aged material. Further analysis showed counter-clockwise trends in the scores plots from principal components analysis (PCA), a type of multivariate data analysis. These trends may indicate that there was a discrete shift in the chemical markers as the samples went from pristine to aged material, and then again when the aged CL-20 mixed with a potentially incompatible material was thermally aged for 4, 6, or 9 months. This new approach to studying energetic material degradation should provide greater knowledge of potential degradation markers in these materials.
NASA Technical Reports Server (NTRS)
Belcastro, Christine M.
1998-01-01
Robust control system analysis and design is based on an uncertainty description, called a linear fractional transformation (LFT), which separates the uncertain (or varying) part of the system from the nominal system. These models are also useful in the design of gain-scheduled control systems based on Linear Parameter Varying (LPV) methods. Low-order LFT models are difficult to form for problems involving nonlinear parameter variations. This paper presents a numerical computational method for constructing an LFT model for a given LPV model. The method is developed for multivariate polynomial problems, and uses simple matrix computations to obtain an exact low-order LFT representation of the given LPV system without the use of model reduction. Although the method is developed for multivariate polynomial problems, multivariate rational problems can also be solved using this method by reformulating the rational problem into a polynomial form.
Forcino, Frank L; Leighton, Lindsey R; Twerdy, Pamela; Cahill, James F
2015-01-01
Community ecologists commonly perform multivariate techniques (e.g., ordination, cluster analysis) to assess patterns and gradients of taxonomic variation. A critical requirement for a meaningful statistical analysis is accurate information on the taxa found within an ecological sample. However, oversampling (too many individuals counted per sample) also comes at a cost, particularly for ecological systems in which identification and quantification is substantially more resource consuming than the field expedition itself. In such systems, an increasingly larger sample size will eventually result in diminishing returns in improving any pattern or gradient revealed by the data, but will also lead to continually increasing costs. Here, we examine 396 datasets: 44 previously published and 352 created datasets. Using meta-analytic and simulation-based approaches, the research within the present paper seeks (1) to determine minimal sample sizes required to produce robust multivariate statistical results when conducting abundance-based, community ecology research. Furthermore, we seek (2) to determine the dataset parameters (i.e., evenness, number of taxa, number of samples) that require larger sample sizes, regardless of resource availability. We found that in the 44 previously published and the 220 created datasets with randomly chosen abundances, a conservative estimate of a sample size of 58 produced the same multivariate results as all larger sample sizes. However, this minimal number varies as a function of evenness, where increased evenness resulted in increased minimal sample sizes. Sample sizes as small as 58 individuals are sufficient for a broad range of multivariate abundance-based research. In cases when resource availability is the limiting factor for conducting a project (e.g., small university, time to conduct the research project), statistically viable results can still be obtained with less of an investment.
Measures of precision for dissimilarity-based multivariate analysis of ecological communities
Anderson, Marti J; Santana-Garcon, Julia
2015-01-01
Ecological studies require key decisions regarding the appropriate size and number of sampling units. No methods currently exist to measure precision for multivariate assemblage data when dissimilarity-based analyses are intended to follow. Here, we propose a pseudo multivariate dissimilarity-based standard error (MultSE) as a useful quantity for assessing sample-size adequacy in studies of ecological communities. Based on sums of squared dissimilarities, MultSE measures variability in the position of the centroid in the space of a chosen dissimilarity measure under repeated sampling for a given sample size. We describe a novel double resampling method to quantify uncertainty in MultSE values with increasing sample size. For more complex designs, values of MultSE can be calculated from the pseudo residual mean square of a permanova model, with the double resampling done within appropriate cells in the design. R code functions for implementing these techniques, along with ecological examples, are provided. PMID:25438826
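A rough sketch of the MultSE calculation follows. The abstract says only that MultSE is based on sums of squared dissimilarities, so the exact formula used here is an assumption: SS = (1/n) × the sum of squared pairwise dissimilarities, pseudo variance V = SS / (n − 1), and MultSE = sqrt(V / n). The function name `mult_se` and the data are hypothetical, and the authors' R code should be consulted for the actual implementation, including the double resampling step, which is not shown.

```python
import math
import statistics


def mult_se(dist):
    """Pseudo multivariate standard error from a square dissimilarity matrix.

    Assumed formula (not stated in the abstract):
      SS = (1/n) * sum over pairs i<j of d_ij**2
      V = SS / (n - 1);  MultSE = sqrt(V / n)
    """
    n = len(dist)
    ss = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    pseudo_variance = ss / (n - 1)
    return math.sqrt(pseudo_variance / n)


# Sanity check on made-up univariate data: with Euclidean distance on a
# single variable, the quantity reduces to the classical standard error.
x = [1.0, 2.0, 4.0, 7.0]
dist = [[abs(a - b) for b in x] for a in x]
classical_se = statistics.stdev(x) / math.sqrt(len(x))
print(mult_se(dist), classical_se)
```

The same function accepts any dissimilarity matrix (for example Bray-Curtis), which is the situation the abstract has in mind.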
Nagraj, Nandini; Slocik, Joseph M; Phillips, David M; Kelley-Loughnane, Nancy; Naik, Rajesh R; Potyrailo, Radislav A
2013-08-07
Peptide-capped AYSSGAPPMPPF gold nanoparticles were demonstrated for highly selective chemical vapor sensing using individual multivariable inductor-capacitor-resistor (LCR) resonators. Their multivariable response was achieved by measuring their resonance impedance spectra followed by multivariate spectral analysis. Detection of model toxic vapors and chemical agent simulants, such as acetonitrile, dichloromethane and methyl salicylate, was performed. Dichloromethane (dielectric constant εr = 9.1) and methyl salicylate (εr = 9.0) were discriminated using a single sensor. These sensing materials coupled to multivariable transducers can provide numerous opportunities for tailoring the vapor response selectivity based on the diversity of the amino acid composition of the peptides, and by the modulation of the nature of peptide-nanoparticle interactions through designed combinations of hydrophobic and hydrophilic amino acids.
Almeida, Tiago P; Chu, Gavin S; Li, Xin; Dastagir, Nawshin; Tuan, Jiun H; Stafford, Peter J; Schlindwein, Fernando S; Ng, G André
2017-01-01
Purpose: Complex fractionated atrial electrograms (CFAE)-guided ablation after pulmonary vein isolation (PVI) has been used for persistent atrial fibrillation (persAF) therapy. This strategy has shown suboptimal outcomes due to, among other factors, undetected changes in the atrial tissue following PVI. In the present work, we investigate CFAE distribution before and after PVI in patients with persAF using a multivariate statistical model. Methods: 207 pairs of atrial electrograms (AEGs) were collected before and after PVI respectively, from corresponding LA regions in 18 persAF patients. Twelve attributes were measured from the AEGs, before and after PVI. Statistical models based on multivariate analysis of variance (MANOVA) and linear discriminant analysis (LDA) have been used to characterize the atrial regions and AEGs. Results: PVI significantly reduced CFAEs in the LA (70 vs. 40%; P < 0.0001). Four types of LA regions were identified, based on the AEGs characteristics: (i) fractionated before PVI that remained fractionated after PVI (31% of the collected points); (ii) fractionated that converted to normal (39%); (iii) normal prior to PVI that became fractionated (9%); and (iv) normal that remained normal (21%). Individually, the attributes failed to distinguish these LA regions, but multivariate statistical models were effective in their discrimination (P < 0.0001). Conclusion: Our results have unveiled that there are LA regions resistant to PVI, while others are affected by it. Although traditional methods were unable to identify these different regions, the proposed multivariate statistical model discriminated LA regions resistant to PVI from those affected by it without prior ablation information.
Multiple imputation for handling missing outcome data when estimating the relative risk.
Sullivan, Thomas R; Lee, Katherine J; Ryan, Philip; Salter, Amy B
2017-09-06
Multiple imputation is a popular approach to handling missing data in medical research, yet little is known about its applicability for estimating the relative risk. Standard methods for imputing incomplete binary outcomes involve logistic regression or an assumption of multivariate normality, whereas relative risks are typically estimated using log binomial models. It is unclear whether misspecification of the imputation model in this setting could lead to biased parameter estimates. Using simulated data, we evaluated the performance of multiple imputation for handling missing data prior to estimating adjusted relative risks from a correctly specified multivariable log binomial model. We considered an arbitrary pattern of missing data in both outcome and exposure variables, with missing data induced under missing at random mechanisms. Focusing on standard model-based methods of multiple imputation, missing data were imputed using multivariate normal imputation or fully conditional specification with a logistic imputation model for the outcome. Multivariate normal imputation performed poorly in the simulation study, consistently producing estimates of the relative risk that were biased towards the null. Despite outperforming multivariate normal imputation, fully conditional specification also produced somewhat biased estimates, with greater bias observed for higher outcome prevalences and larger relative risks. Deleting imputed outcomes from analysis datasets did not improve the performance of fully conditional specification. Both multivariate normal imputation and fully conditional specification produced biased estimates of the relative risk, presumably since both use a misspecified imputation model. Based on simulation results, we recommend researchers use fully conditional specification rather than multivariate normal imputation and retain imputed outcomes in the analysis when estimating relative risks. 
However fully conditional specification is not without its shortcomings, and so further research is needed to identify optimal approaches for relative risk estimation within the multiple imputation framework.
Domingo-Almenara, Xavier; Perera, Alexandre; Brezmes, Jesus
2016-11-25
Gas chromatography-mass spectrometry (GC-MS) produces large and complex datasets characterized by co-eluted compounds and at trace levels, and with a distinct compound ion-redundancy as a result of the high fragmentation by the electron impact ionization. Compounds in GC-MS can be resolved by taking advantage of the multivariate nature of GC-MS data by applying multivariate resolution methods. However, multivariate methods have to be applied in small regions of the chromatogram, and therefore chromatograms are segmented prior to the application of the algorithms. The automation of this segmentation process is a challenging task as it implies separating between informative data and noise from the chromatogram. This study demonstrates the capabilities of independent component analysis-orthogonal signal deconvolution (ICA-OSD) and multivariate curve resolution-alternating least squares (MCR-ALS) with an overlapping moving window implementation to avoid the typical hard chromatographic segmentation. Also, after being resolved, compounds are aligned across samples by an automated alignment algorithm. We evaluated the proposed methods through a quantitative analysis of GC-qTOF MS data from 25 serum samples. The quantitative performance of both moving window ICA-OSD and MCR-ALS-based implementations was compared with the quantification of 33 compounds by the XCMS package. Results showed that most of the R² coefficients of determination exhibited a high correlation (R² > 0.90) in both ICA-OSD and MCR-ALS moving window-based approaches. Copyright © 2016 Elsevier B.V. All rights reserved.
Multivariate Meta-Analysis of Preference-Based Quality of Life Values in Coronary Heart Disease.
Stevanović, Jelena; Pechlivanoglou, Petros; Kampinga, Marthe A; Krabbe, Paul F M; Postma, Maarten J
2016-01-01
There are numerous health-related quality of life (HRQoL) measurements used in coronary heart disease (CHD) in the literature. However, only values assessed with preference-based instruments can be directly applied in a cost-utility analysis (CUA). To summarize and synthesize instrument-specific preference-based values in CHD and the underlying disease-subgroups, stable angina and post-acute coronary syndrome (post-ACS), for developed countries, while accounting for study-level characteristics, and within- and between-study correlation. A systematic review was conducted to identify studies reporting preference-based values in CHD. A multivariate meta-analysis was applied to synthesize the HRQoL values. Meta-regression analyses examined the effect of study level covariates age, publication year, prevalence of diabetes and gender. A total of 40 studies providing preference-based values were detected. Synthesized estimates of HRQoL in post-ACS ranged from 0.64 (Quality of Well-Being) to 0.92 (EuroQol European "tariff"), while in stable angina they ranged from 0.64 (Short form 6D) to 0.89 (Standard Gamble). Similar findings were observed in estimates applying to general CHD. No significant improvement in model fit was found after adjusting for study-level covariates. Large between-study heterogeneity was observed in all the models investigated. The main finding of our study is the presence of large heterogeneity both within and between instrument-specific HRQoL values. Current economic models in CHD ignore this between-study heterogeneity. Multivariate meta-analysis can quantify this heterogeneity and offers the means for uncertainty around HRQoL values to be translated to uncertainty in CUAs.
Xu, Min; Zhang, Lei; Yue, Hong-Shui; Pang, Hong-Wei; Ye, Zheng-Liang; Ding, Li
2017-10-01
To establish an on-line monitoring method for extraction process of Schisandrae Chinensis Fructus, the formula medicinal material of Yiqi Fumai lyophilized injection by combining near infrared spectroscopy with multi-variable data analysis technology. The multivariate statistical process control (MSPC) model was established based on 5 normal batches in production and 2 test batches were monitored by PC scores, DModX and Hotelling T² control charts. The results showed that MSPC model had a good monitoring ability for the extraction process. The application of the MSPC model to actual production process could effectively achieve on-line monitoring for extraction process of Schisandrae Chinensis Fructus, and can reflect the change of material properties in the production process in real time. This established process monitoring method could provide reference for the application of process analysis technology in the process quality control of traditional Chinese medicine injections. Copyright © by the Chinese Pharmaceutical Association.
Kaier, Klaus; Hagist, Christian; Frank, Uwe; Conrad, Andreas; Meyer, Elisabeth
2009-04-01
To determine the impact of antibiotic consumption and alcohol-based hand disinfection on the incidences of nosocomial methicillin-resistant Staphylococcus aureus (MRSA) infection and Clostridium difficile infection (CDI). Two multivariate time-series analyses were performed that used as dependent variables the monthly incidences of nosocomial MRSA infection and CDI at the Freiburg University Medical Center during the period January 2003 through October 2007. The volume of alcohol-based hand rub solution used per month was quantified in liters per 1,000 patient-days. Antibiotic consumption was calculated in terms of the number of defined daily doses per 1,000 patient-days per month. The use of alcohol-based hand rub was found to have a significant impact on the incidence of nosocomial MRSA infection (P < .001). The multivariate analysis (R² = 0.66) showed that a higher volume of use of alcohol-based hand rub was associated with a lower incidence of nosocomial MRSA infection. Conversely, a higher level of consumption of selected antimicrobial agents was associated with a higher incidence of nosocomial MRSA infection. This analysis showed this relationship was the same for the use of second-generation cephalosporins (P = .023), third-generation cephalosporins (P = .05), fluoroquinolones (P = .01), and lincosamides (P = .05). The multivariate analysis (R² = 0.55) showed that a higher level of consumption of third-generation cephalosporins (P = .008), fluoroquinolones (P = .084), and/or macrolides (P = .007) was associated with a higher incidence of CDI. A correlation with use of alcohol-based hand rub was not detected. In 2 multivariate time-series analyses, we were able to show the impact of hand hygiene and antibiotic use on the incidence of nosocomial MRSA infection, but we found no association between hand hygiene and incidence of CDI.
Al-Aziz, Jameel; Christou, Nicolas; Dinov, Ivo D.
2011-01-01
The amount, complexity and provenance of data have dramatically increased in the past five years. Visualization of observed and simulated data is a critical component of any social, environmental, biomedical or scientific quest. Dynamic, exploratory and interactive visualization of multivariate data, without preprocessing by dimensionality reduction, remains a nearly insurmountable challenge. The Statistics Online Computational Resource (www.SOCR.ucla.edu) provides portable online aids for probability and statistics education, technology-based instruction and statistical computing. We have developed a new Java-based infrastructure, SOCR Motion Charts, for discovery-based exploratory analysis of multivariate data. This interactive data visualization tool enables the visualization of high-dimensional longitudinal data. SOCR Motion Charts allows mapping of ordinal, nominal and quantitative variables onto time, 2D axes, size, colors, glyphs and appearance characteristics, which facilitates the interactive display of multidimensional data. We validated this new visualization paradigm using several publicly available multivariate datasets including Ice-Thickness, Housing Prices, Consumer Price Index, and California Ozone Data. SOCR Motion Charts is designed using object-oriented programming, implemented as a Java Web-applet and is available to the entire community on the web at www.socr.ucla.edu/SOCR_MotionCharts. It can be used as an instructional tool for rendering and interrogating high-dimensional data in the classroom, as well as a research tool for exploratory data analysis. PMID:21479108
Clarke, Nicholas; McNamara, Deirdre; Kearney, Patricia M; O'Morain, Colm A; Shearer, Nikki; Sharp, Linda
2016-12-01
This study aimed to investigate the effects of sex and deprivation on participation in a population-based faecal immunochemical test (FIT) colorectal cancer screening programme. The study population included 9785 individuals invited to participate in two rounds of a population-based biennial FIT-based screening programme, in a relatively deprived area of Dublin, Ireland. Explanatory variables included in the analysis were sex, deprivation category of area of residence and age (at end of screening). The primary outcome variable modelled was participation status in both rounds combined (with "participation" defined as having taken part in either or both rounds of screening). Poisson regression with a log link and robust error variance was used to estimate relative risks (RR) for participation. As a sensitivity analysis, data were stratified by screening round. In both the univariable and multivariable models deprivation was strongly associated with participation. Increasing affluence was associated with higher participation; participation was 26% higher in people resident in the most affluent compared to the most deprived areas (multivariable RR=1.26: 95% CI 1.21-1.30). Participation was significantly lower in males (multivariable RR=0.96: 95%CI 0.95-0.97) and generally increased with increasing age (trend per age group, multivariable RR=1.02: 95%CI, 1.01-1.02). No significant interactions between the explanatory variables were found. The effects of deprivation and sex were similar by screening round. Deprivation and male gender are independently associated with lower uptake of population-based FIT colorectal cancer screening, even in a relatively deprived setting. Development of evidence-based interventions to increase uptake in these disadvantaged groups is urgently required. Copyright © 2016. Published by Elsevier Inc.
NASA Astrophysics Data System (ADS)
Azami, Hamed; Escudero, Javier
2017-01-01
Multiscale entropy (MSE) is an appealing tool to characterize the complexity of time series over multiple temporal scales. Recent developments in the field have tried to extend the MSE technique in different ways. Building on these trends, we propose the so-called refined composite multivariate multiscale fuzzy entropy (RCmvMFE) whose coarse-graining step uses variance (RCmvMFEσ²) or mean (RCmvMFEμ). We investigate the behavior of these multivariate methods on multichannel white Gaussian and 1/f noise signals, and two publicly available biomedical recordings. Our simulations demonstrate that RCmvMFEσ² and RCmvMFEμ lead to more stable results and are less sensitive to the signals' length in comparison with the other existing multivariate multiscale entropy-based methods. The classification results also show that using both the variance and mean in the coarse-graining step offers complexity profiles with complementary information for biomedical signal analysis. We also made freely available all the Matlab codes used in this paper.
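The coarse-graining step mentioned above can be illustrated with a short sketch. It handles a single channel, uses non-overlapping windows, and takes the population variance within each window; all three choices are assumptions made for illustration, and the refined composite and fuzzy entropy steps of the method are not shown. The function name is hypothetical.

```python
from statistics import mean, pvariance


def coarse_grain(signal, scale, stat="mean"):
    """Collapse non-overlapping windows of length `scale` into one value each.

    stat="mean" gives the classic mean-based coarse-graining;
    stat="variance" replaces each window by its population variance.
    Samples left over after the last full window are dropped.
    """
    reducer = mean if stat == "mean" else pvariance
    n_windows = len(signal) // scale
    return [reducer(signal[j * scale:(j + 1) * scale]) for j in range(n_windows)]


x = [1.0, 3.0, 2.0, 2.0, 0.0, 4.0, 5.0]
mu_2 = coarse_grain(x, 2, "mean")        # window means
var_2 = coarse_grain(x, 2, "variance")   # window variances
```

In this toy series the three window means are identical while the window variances differ, which hints at why the two coarse-graining choices can carry complementary information.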
Lie, Octavian V; van Mierlo, Pieter
2017-01-01
The visual interpretation of intracranial EEG (iEEG) is the standard method used in complex epilepsy surgery cases to map the regions of seizure onset targeted for resection. Still, visual iEEG analysis is labor-intensive and biased due to interpreter dependency. Multivariate parametric functional connectivity measures using adaptive autoregressive (AR) modeling of the iEEG signals based on the Kalman filter algorithm have been used successfully to localize the electrographic seizure onsets. Due to their high computational cost, these methods have been applied to a limited number of iEEG time-series (<60). The aim of this study was to test two Kalman filter implementations, a well-known multivariate adaptive AR model (Arnold et al. 1998) and a simplified, computationally efficient derivation of it, for their potential application to connectivity analysis of high-dimensional (up to 192 channels) iEEG data. When used on simulated seizures together with a multivariate connectivity estimator, the partial directed coherence, the two AR models were compared for their ability to reconstitute the designed seizure signal connections from noisy data. Next, focal seizures from iEEG recordings (73-113 channels) in three patients rendered seizure-free after surgery were mapped with the outdegree, a graph-theory index of outward directed connectivity. Simulation results indicated high levels of mapping accuracy for the two models in the presence of low-to-moderate noise cross-correlation. Accordingly, both AR models correctly mapped the real seizure onset to the resection volume. This study supports the possibility of conducting fully data-driven multivariate connectivity estimations on high-dimensional iEEG datasets using the Kalman filter approach.
Independent Predictors of Prognosis Based on Oral Cavity Squamous Cell Carcinoma Surgical Margins.
Buchakjian, Marisa R; Ginader, Timothy; Tasche, Kendall K; Pagedar, Nitin A; Smith, Brian J; Sperry, Steven M
2018-05-01
Objective To conduct a multivariate analysis of a large cohort of oral cavity squamous cell carcinoma (OCSCC) cases for independent predictors of local recurrence (LR) and overall survival (OS), with emphasis on the relationship between (1) prognosis and (2) main specimen permanent margins and intraoperative tumor bed frozen margins. Study Design Retrospective cohort study. Setting Tertiary academic head and neck cancer program. Subjects and Methods This study included 426 patients treated with OCSCC resection between 2005 and 2014 at University of Iowa Hospitals and Clinics. Patients underwent excision of OCSCC with intraoperative tumor bed frozen margin sampling and main specimen permanent margin assessment. Multivariate analysis of the data set to predict LR and OS was performed. Results Independent predictors of LR included nodal involvement, histologic grade, and main specimen permanent margin status. Specifically, the presence of a positive margin (odds ratio, 6.21; 95% CI, 3.3-11.9) or <1-mm/carcinoma in situ margin (odds ratio, 2.41; 95% CI, 1.19-4.87) on the main specimen was an independent predictor of LR, whereas intraoperative tumor bed margins were not predictive of LR on multivariate analysis. Similarly, independent predictors of OS on multivariate analysis included nodal involvement, extracapsular extension, and a positive main specimen margin. Tumor bed margins did not independently predict OS. Conclusion The main specimen margin is a strong independent predictor of LR and OS on multivariate analysis. Intraoperative tumor bed frozen margins do not independently predict prognosis. We conclude that emphasis should be placed on evaluating the main specimen margins when estimating prognosis after OCSCC resection.
Kilborn, Joshua P; Jones, David L; Peebles, Ernst B; Naar, David F
2017-04-01
Clustering data continues to be a highly active area of data analysis, and resemblance profiles are being incorporated into ecological methodologies as a hypothesis testing-based approach to clustering multivariate data. However, these new clustering techniques have not been rigorously tested to determine the performance variability based on the algorithm's assumptions or any underlying data structures. Here, we use simulation studies to estimate the statistical error rates for the hypothesis test for multivariate structure based on dissimilarity profiles (DISPROF). We concurrently tested a widely used algorithm that employs the unweighted pair group method with arithmetic mean (UPGMA) to estimate the proficiency of clustering with DISPROF as a decision criterion. We simulated unstructured multivariate data from different probability distributions with increasing numbers of objects and descriptors, and grouped data with increasing overlap, overdispersion for ecological data, and correlation among descriptors within groups. Using simulated data, we measured the resolution and correspondence of clustering solutions achieved by DISPROF with UPGMA against the reference grouping partitions used to simulate the structured test datasets. Our results highlight the dynamic interactions between dataset dimensionality, group overlap, and the properties of the descriptors within a group (i.e., overdispersion or correlation structure) that are relevant to resemblance profiles as a clustering criterion for multivariate data. These methods are particularly useful for multivariate ecological datasets that benefit from distance-based statistical analyses. We propose guidelines for using DISPROF as a clustering decision tool that will help future users avoid potential pitfalls during the application of methods and the interpretation of results.
ERIC Educational Resources Information Center
Borsuk, Ellen R.; Watkins, Marley W.; Canivez, Gary L.
2006-01-01
Although often applied in practice, clinically based cognitive subtest profile analysis has failed to achieve empirical support. Nonlinear multivariate subtest profile analysis may have benefits over clinically based techniques, but the psychometric properties of these methods must be studied prior to their implementation and interpretation. The…
ERIC Educational Resources Information Center
Muslihah, Oleh Eneng
2015-01-01
The research examines the correlation between the understanding of school-based management, emotional intelligences and headmaster performance. Data was collected, using quantitative methods. The statistical analysis used was the Pearson Correlation, and multivariate regression analysis. The results of this research suggest firstly that there is…
Liu, Ya-Juan; André, Silvère; Saint Cristau, Lydia; Lagresle, Sylvain; Hannas, Zahia; Calvosa, Éric; Devos, Olivier; Duponchel, Ludovic
2017-02-01
Multivariate statistical process control (MSPC) is increasingly popular as a way to handle the large multivariate datasets produced by analytical instruments such as Raman spectrometers when monitoring complex cell cultures in the biopharmaceutical industry. However, Raman spectroscopy for in-line monitoring often produces unsynchronized data sets, resulting in time-varying batches. Moreover, unsynchronized data sets are common for cell culture monitoring because spectroscopic measurements are generally recorded in an alternating way, with more than one optical probe connected in parallel to the same spectrometer. Synchronized batches are a prerequisite for the application of multivariate analysis such as multi-way principal component analysis (MPCA) for MSPC monitoring. Correlation optimized warping (COW) is a popular method for data alignment with satisfactory performance; however, it has never before been applied to synchronize the acquisition time of spectroscopic datasets in an MSPC application. In this paper we propose, for the first time, to use COW to synchronize batches with varying durations analyzed with Raman spectroscopy. In a second step, we developed MPCA models at different time intervals based on the normal operation condition (NOC) batches synchronized by COW. New batches are finally projected onto the corresponding MPCA model. We monitored the evolution of the batches using two multivariate control charts based on Hotelling's T2 and Q. As the results illustrate, the MSPC model was able to identify abnormal operation conditions, including contaminated batches, which is of prime importance in cell culture monitoring. We showed that Raman-based MSPC monitoring can be used to diagnose batches deviating from the normal condition, with higher efficacy than traditional diagnosis, which would save time and money in the biopharmaceutical industry. Copyright © 2016 Elsevier B.V. All rights reserved.
Hierarchical multivariate covariance analysis of metabolic connectivity.
Carbonell, Felix; Charil, Arnaud; Zijdenbos, Alex P; Evans, Alan C; Bedell, Barry J
2014-12-01
Conventional brain connectivity analysis is typically based on the assessment of interregional correlations. Given that correlation coefficients are derived from both covariance and variance, group differences in covariance may be obscured by differences in the variance terms. To facilitate a comprehensive assessment of connectivity, we propose a unified statistical framework that interrogates the individual terms of the correlation coefficient. We have evaluated the utility of this method for metabolic connectivity analysis using [18F]2-fluoro-2-deoxyglucose (FDG) positron emission tomography (PET) data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) study. As an illustrative example of the utility of this approach, we examined metabolic connectivity in angular gyrus and precuneus seed regions of mild cognitive impairment (MCI) subjects with low and high β-amyloid burdens. This new multivariate method allowed us to identify alterations in the metabolic connectome, which would not have been detected using classic seed-based correlation analysis. Ultimately, this novel approach should be extensible to brain network analysis and broadly applicable to other imaging modalities, such as functional magnetic resonance imaging (MRI).
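The point that a correlation coefficient mixes covariance and variance terms can be made concrete with a small sketch. The two groups below are hypothetical numbers, not ADNI data: the second group is the first scaled by two, so its covariance is four times larger while the correlation is unchanged.

```python
from math import sqrt


def covariance(x, y):
    """Sample covariance of two equally long sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def correlation_terms(x, y):
    """Return (covariance, variance of x, variance of y, correlation)."""
    cov_xy = covariance(x, y)
    var_x = covariance(x, x)
    var_y = covariance(y, y)
    return cov_xy, var_x, var_y, cov_xy / sqrt(var_x * var_y)


# Hypothetical seed-region and target-region values for two groups.
group_a = ([1.0, 2.0, 3.0, 4.0], [2.0, 1.0, 4.0, 3.0])
group_b = ([2.0, 4.0, 6.0, 8.0], [4.0, 2.0, 8.0, 6.0])

cov_a, _, _, r_a = correlation_terms(*group_a)
cov_b, _, _, r_b = correlation_terms(*group_b)
```

A comparison of `r_a` and `r_b` alone would report no group difference, whereas inspecting the covariance and variance terms separately reveals one.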
Linn, Kristin A; Gaonkar, Bilwaj; Satterthwaite, Theodore D; Doshi, Jimit; Davatzikos, Christos; Shinohara, Russell T
2016-05-15
Normalization of feature vector values is a common practice in machine learning. Generally, each feature value is standardized to the unit hypercube or by normalizing to zero mean and unit variance. Classification decisions based on support vector machines (SVMs) or by other methods are sensitive to the specific normalization used on the features. In the context of multivariate pattern analysis using neuroimaging data, standardization effectively up- and down-weights features based on their individual variability. Since the standard approach uses the entire data set to guide the normalization, it utilizes the total variability of these features. This total variation is inevitably dependent on the amount of marginal separation between groups. Thus, such a normalization may attenuate the separability of the data in high dimensional space. In this work we propose an alternate approach that uses an estimate of the control-group standard deviation to normalize features before training. We study our proposed approach in the context of group classification using structural MRI data. We show that control-based normalization leads to better reproducibility of estimated multivariate disease patterns and improves the classifier performance in many cases. Copyright © 2016 Elsevier Inc. All rights reserved.
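A minimal sketch of control-based normalization follows, assuming each feature is centred on the control-group mean and divided by the control-group standard deviation. The abstract only specifies the standard deviation, so the centring choice, the function name and the data are assumptions for illustration.

```python
from statistics import mean, stdev


def control_based_normalize(features, is_control):
    """Scale every feature column using statistics of the control rows only.

    features: list of rows, one row of feature values per subject.
    is_control: list of booleans marking which rows are control subjects.
    """
    n_features = len(features[0])
    controls = [row for row, flag in zip(features, is_control) if flag]
    centers = [mean([row[j] for row in controls]) for j in range(n_features)]
    scales = [stdev([row[j] for row in controls]) for j in range(n_features)]
    return [
        [(row[j] - centers[j]) / scales[j] for j in range(n_features)]
        for row in features
    ]


# Hypothetical data: three controls followed by two patients.
features = [[10.0, 1.0], [12.0, 3.0], [14.0, 2.0], [20.0, 2.0], [24.0, 2.5]]
is_control = [True, True, True, False, False]
normalized = control_based_normalize(features, is_control)
```

Because the scale comes from controls only, the large patient values in the first feature do not inflate its standard deviation, so the group separation in that feature is preserved after scaling.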
New robust bilinear least squares method for the analysis of spectral-pH matrix data.
Goicoechea, Héctor C; Olivieri, Alejandro C
2005-07-01
A new second-order multivariate method has been developed for the analysis of spectral-pH matrix data, based on a bilinear least-squares (BLLS) model achieving the second-order advantage and handling multiple calibration standards. A simulated Monte Carlo study of synthetic absorbance-pH data allowed comparison of the newly proposed BLLS methodology with constrained parallel factor analysis (PARAFAC) and with the combination multivariate curve resolution-alternating least-squares (MCR-ALS) technique under different conditions of sample-to-sample pH mismatch and analyte-background ratio. The results indicate an improved prediction ability for the new method. Experimental data generated by measuring absorption spectra of several calibration standards of ascorbic acid and samples of orange juice were subjected to second-order calibration analysis with PARAFAC, MCR-ALS, and the new BLLS method. The results indicate that the latter method provides the best analytical results in regard to analyte recovery in samples of complex composition requiring strict adherence to the second-order advantage. Linear dependencies appear when multivariate data are produced by using the pH or a reaction time as one of the data dimensions, posing a challenge to classical multivariate calibration models. The presently discussed algorithm is useful for these latter systems.
NASA Astrophysics Data System (ADS)
Anderson, R. B.; Finch, N.; Clegg, S.; Graff, T.; Morris, R. V.; Laura, J.
2017-06-01
We present a Python-based library and graphical interface for the analysis of point spectra. The tool is being developed with a focus on methods used for ChemCam data, but is flexible enough to handle spectra from other instruments.
NASA Astrophysics Data System (ADS)
Mansouri, Edris; Feizi, Faranak; Jafari Rad, Alireza; Arian, Mehran
2018-03-01
This paper uses multivariate regression as a mineral prospectivity mapping (MPM) method to create a mathematical model for iron skarn exploration in the Sarvian area, central Iran. The main aim is to apply multivariate regression analysis to the mapped iron outcrops in the northeastern part of the study area in order to discover new iron deposits in other parts of the study area. Two types of multivariate regression models using two linear equations were employed to discover new mineral deposits. This method is one of the reliable methods for processing satellite images. ASTER satellite images (14 bands) were used as unique independent variables (UIVs), and iron outcrops were mapped as dependent variables for MPM. According to the probability value (p value), coefficient of determination (R2) and adjusted coefficient of determination (Radj2), the second regression model (which consisted of multiple UIVs) fitted better than the other models. The accuracy of the model was confirmed by the iron outcrop map and geological observation. Based on field observation, iron mineralization occurs at the contact of limestone and intrusive rocks (skarn type).
Measures of precision for dissimilarity-based multivariate analysis of ecological communities.
Anderson, Marti J; Santana-Garcon, Julia
2015-01-01
Ecological studies require key decisions regarding the appropriate size and number of sampling units. No methods currently exist to measure precision for multivariate assemblage data when dissimilarity-based analyses are intended to follow. Here, we propose a pseudo multivariate dissimilarity-based standard error (MultSE) as a useful quantity for assessing sample-size adequacy in studies of ecological communities. Based on sums of squared dissimilarities, MultSE measures variability in the position of the centroid in the space of a chosen dissimilarity measure under repeated sampling for a given sample size. We describe a novel double resampling method to quantify uncertainty in MultSE values with increasing sample size. For more complex designs, values of MultSE can be calculated from the pseudo residual mean square of a permanova model, with the double resampling done within appropriate cells in the design. R code functions for implementing these techniques, along with ecological examples, are provided. © 2014 The Authors. Ecology Letters published by John Wiley & Sons Ltd and CNRS.
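The abstract describes MultSE as based on sums of squared dissimilarities but does not give the formula. The sketch below assumes one common way to write it: the sum of squared pairwise dissimilarities divided by n gives a pseudo sum of squares, dividing by n - 1 gives a pseudo variance, and MultSE is the square root of that variance over n. The function name and data are hypothetical, and the double resampling procedure is not shown.

```python
from itertools import combinations
from math import sqrt


def mult_se(dissimilarities):
    """Pseudo multivariate standard error from a square dissimilarity matrix."""
    n = len(dissimilarities)
    pseudo_ss = sum(
        dissimilarities[i][j] ** 2 for i, j in combinations(range(n), 2)
    ) / n
    pseudo_variance = pseudo_ss / (n - 1)
    return sqrt(pseudo_variance / n)


# Sanity check on one variable with Euclidean distance, where this
# quantity should coincide with the classical standard error of the mean.
values = [1.0, 2.0, 4.0, 7.0]
matrix = [[abs(a - b) for b in values] for a in values]
se = mult_se(matrix)
```

For these four values the result equals the ordinary standard error, sqrt(7 / 4); with a dissimilarity such as Bray-Curtis on multivariate data the same function would apply, although no classical counterpart exists.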
Bludau, Sebastian; Bzdok, Danilo; Gruber, Oliver; Kohn, Nils; Riedl, Valentin; Sorg, Christian; Palomero-Gallagher, Nicola; Müller, Veronika I.; Hoffstaedter, Felix; Amunts, Katrin; Eickhoff, Simon B.
2017-01-01
Objective The heterogeneous human frontal pole has been identified as a node in the dysfunctional network of major depressive disorder. The contribution of the medial (socio-affective) versus lateral (cognitive) frontal pole to major depression pathogenesis is currently unclear. The present study performs morphometric comparison of the microstructurally informed subdivisions of human frontal pole between depressed patients and controls using both uni- and multivariate statistics. Methods Multi-site voxel- and region-based morphometric MRI analysis of 73 depressed patients and 73 matched controls without psychiatric history. Frontal pole volume was first compared between depressed patients and controls by subdivision-wise classical morphometric analysis. In a second approach, frontal pole volume was compared by subdivision-naive multivariate searchlight analysis based on support vector machines. Results Subdivision-wise morphometric analysis found a significantly smaller medial frontal pole in depressed patients with a negative correlation of disease severity and duration. Histologically uninformed multivariate voxel-wise statistics provided converging evidence for structural aberrations specific to the microstructurally defined medial area of the frontal pole in depressed patients. Conclusions Across disparate methods, we demonstrated subregion specificity in the left medial frontal pole volume in depressed patients. Indeed, the frontal pole was shown to structurally and functionally connect to other key regions in major depression pathology like the anterior cingulate cortex and the amygdala via the uncinate fasciculus. Present and previous findings consolidate the left medial portion of the frontal pole as particularly altered in major depression. PMID:26621569
Anthropometric profile of combat athletes via multivariate analysis.
Burdukiewicz, Anna; Pietraszewska, Jadwiga; Stachoń, Aleksandra; Andrzejewska, Justyna
2017-11-07
Athletic success is a complex phenotype influenced by multiple factors, from sport-specific skills to anthropometric characteristics. Considering the latter, the literature has repeatedly indicated that athletes possess distinct physical characteristics depending on the practiced discipline. The aim of the present study was to apply univariate and multivariate methods to assess a wide range of morphometric and somatotypic characteristics in male combat athletes. Biometric data were obtained from 206 male university-level practitioners of judo, jiu-jitsu, karate, kickboxing, taekwondo, and wrestling. Measures included height- and length-based variables, breadths, circumferences, and skinfolds. Body proportions and somatotype, using Sheldon's method of somatotyping as modified by Heath and Carter, were then determined. Body fat percentage was assessed by bioelectrical impedance analysis using tetrapolar hand-to-foot electrodes. Data were subjected to a wide array of statistical analyses. The results show between-group differences in the magnitudes of the analyzed characteristics. While mesomorphy was the dominant component of each group somatotype, enhanced ectomorphy was observed in those disciplines that require a high level of agility. Principal component analysis reduced the multivariate dimensionality of the data to three components (characterizing body size, height-based measures, and the anthropometric structure of the upper extremities) that explained the majority of data variance. The development of a sport-specific anthropometric profile via height- and mass-based and morphometric and somatotypic variables can aid in the design of training protocols and the identification of athlete markers as well as serve as a diagnostic criterion in predicting combat athlete performance.
Multivariate Boosting for Integrative Analysis of High-Dimensional Cancer Genomic Data
Xiong, Lie; Kuan, Pei-Fen; Tian, Jianan; Keles, Sunduz; Wang, Sijian
2015-01-01
In this paper, we propose a novel multivariate component-wise boosting method for fitting multivariate response regression models under the high-dimension, low sample size setting. Our method is motivated by modeling the association among different biological molecules based on multiple types of high-dimensional genomic data. Particularly, we are interested in two applications: studying the influence of DNA copy number alterations on RNA transcript levels and investigating the association between DNA methylation and gene expression. For this purpose, we model the dependence of the RNA expression levels on DNA copy number alterations and the dependence of gene expression on DNA methylation through multivariate regression models and utilize boosting-type method to handle the high dimensionality as well as model the possible nonlinear associations. The performance of the proposed method is demonstrated through simulation studies. Finally, our multivariate boosting method is applied to two breast cancer studies. PMID:26609213
Lee, Yune-Sang; Turkeltaub, Peter; Granger, Richard; Raizada, Rajeev D S
2012-03-14
Although much effort has been directed toward understanding the neural basis of speech processing, the neural processes involved in the categorical perception of speech have been relatively less studied, and many questions remain open. In this functional magnetic resonance imaging (fMRI) study, we probed the cortical regions mediating categorical speech perception using an advanced brain-mapping technique, whole-brain multivariate pattern-based analysis (MVPA). Normal healthy human subjects (native English speakers) were scanned while they listened to 10 consonant-vowel syllables along the /ba/-/da/ continuum. Outside of the scanner, individuals' own category boundaries were measured to divide the fMRI data into /ba/ and /da/ conditions per subject. The whole-brain MVPA revealed that Broca's area and the left pre-supplementary motor area evoked distinct neural activity patterns between the two perceptual categories (/ba/ vs /da/). Broca's area was also found when the same analysis was applied to another dataset (Raizada and Poldrack, 2007), which previously yielded the supramarginal gyrus using a univariate adaptation-fMRI paradigm. The consistent MVPA findings from two independent datasets strongly indicate that Broca's area participates in categorical speech perception, with a possible role of translating speech signals into articulatory codes. The difference in results between univariate and multivariate pattern-based analyses of the same data suggests that processes in different cortical areas along the dorsal speech perception stream are distributed on different spatial scales.
Improving Cluster Analysis with Automatic Variable Selection Based on Trees
2014-12-01
regression trees Daisy DISsimilAritY PAM partitioning around medoids PMA penalized multivariate analysis SPC sparse principal components UPGMA unweighted...unweighted pair-group average method (UPGMA). This method measures dissimilarities between all objects in two clusters and takes the average value
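The UPGMA between-cluster dissimilarity described in the excerpt above (the average over all pairs of objects drawn from the two clusters) can be sketched as follows. The function name, the absolute-difference dissimilarity and the sample values are hypothetical choices for illustration.

```python
def average_linkage(cluster_a, cluster_b, dissimilarity):
    """Mean dissimilarity over every pair with one object from each cluster."""
    total = sum(dissimilarity(a, b) for a in cluster_a for b in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def absolute_difference(a, b):
    """Toy dissimilarity between two one-dimensional objects."""
    return abs(a - b)


between = average_linkage([1.0, 2.0], [5.0, 9.0], absolute_difference)
```

In an agglomerative run, the pair of clusters with the smallest such average would be merged at each step.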
NASA Technical Reports Server (NTRS)
Aires, Filipe; Rossow, William B.; Hansen, James E. (Technical Monitor)
2001-01-01
A new approach is presented for the analysis of feedback processes in a nonlinear dynamical system by observing its variations. The new methodology consists of statistical estimates of the sensitivities between all pairs of variables in the system based on a neural network modeling of the dynamical system. The model can then be used to estimate the instantaneous, multivariate and nonlinear sensitivities, which are shown to be essential for the analysis of the feedback processes involved in the dynamical system. The method is described and tested on synthetic data from the low-order Lorenz circulation model where the correct sensitivities can be evaluated analytically.
Multivariate Methods for Meta-Analysis of Genetic Association Studies.
Dimou, Niki L; Pantavou, Katerina G; Braliou, Georgia G; Bagos, Pantelis G
2018-01-01
Multivariate meta-analysis of genetic association studies and genome-wide association studies has received remarkable attention as it improves the precision of the analysis. Here, we review, summarize and present in a unified framework methods for multivariate meta-analysis of genetic association studies and genome-wide association studies. Starting with the statistical methods used for robust analysis and genetic model selection, we briefly present univariate methods for meta-analysis and then scrutinize multivariate methodologies. Multivariate models of meta-analysis for single gene-disease association studies, including models for haplotype association studies, multiple linked polymorphisms and multiple outcomes, are discussed. The popular Mendelian randomization approach and special cases of meta-analysis addressing issues such as the assumption of the mode of inheritance, deviation from Hardy-Weinberg Equilibrium and gene-environment interactions are also presented. All available methods are enriched with practical applications, and methodologies that could be developed in the future are discussed. Links for all available software implementing multivariate meta-analysis methods are also provided.
Tailored multivariate analysis for modulated enhanced diffraction
DOE Office of Scientific and Technical Information (OSTI.GOV)
Caliandro, Rocco; Guccione, Pietro; Nico, Giovanni
2015-10-21
Modulated enhanced diffraction (MED) is a technique allowing the dynamic structural characterization of crystalline materials subjected to an external stimulus, which is particularly suited for in situ and operando structural investigations at synchrotron sources. Contributions from the (active) part of the crystal system that varies synchronously with the stimulus can be extracted by an offline analysis, which can only be applied in the case of periodic stimuli and linear system responses. In this paper a new decomposition approach based on multivariate analysis is proposed. The standard principal component analysis (PCA) is adapted to treat MED data: specific figures of merit based on their scores and loadings are found, and the directions of the principal components obtained by PCA are modified to maximize such figures of merit. As a result, a general method to decompose MED data, called optimum constrained components rotation (OCCR), is developed, which produces very precise results on simulated data, even in the case of nonperiodic stimuli and/or nonlinear responses. The multivariate analysis approach is able to supply in one shot both the diffraction pattern related to the active atoms (through the OCCR loadings) and the time dependence of the system response (through the OCCR scores). When applied to real data, OCCR was able to supply only the latter information, as the former was hindered by changes in abundances of different crystal phases, which occurred besides structural variations in the specific case considered. To develop a decomposition procedure able to cope with this combined effect represents the next challenge in MED analysis.
ERIC Educational Resources Information Center
Grochowalski, Joseph H.
2015-01-01
Component Universe Score Profile analysis (CUSP) is introduced in this paper as a psychometric alternative to multivariate profile analysis. The theoretical foundations of CUSP analysis are reviewed, which include multivariate generalizability theory and constrained principal components analysis. Because CUSP is a combination of generalizability…
Demanuele, Charmaine; Bähner, Florian; Plichta, Michael M; Kirsch, Peter; Tost, Heike; Meyer-Lindenberg, Andreas; Durstewitz, Daniel
2015-01-01
Multivariate pattern analysis can reveal new information from neuroimaging data to illuminate human cognition and its disturbances. Here, we develop a methodological approach, based on multivariate statistical/machine learning and time series analysis, to discern cognitive processing stages from functional magnetic resonance imaging (fMRI) blood oxygenation level dependent (BOLD) time series. We apply this method to data recorded from a group of healthy adults whilst performing a virtual reality version of the delayed win-shift radial arm maze (RAM) task. This task has been frequently used to study working memory and decision making in rodents. Using linear classifiers and multivariate test statistics in conjunction with time series bootstraps, we show that different cognitive stages of the task, as defined by the experimenter, namely, the encoding/retrieval, choice, reward and delay stages, can be statistically discriminated from the BOLD time series in brain areas relevant for decision making and working memory. Discrimination of these task stages was significantly reduced during poor behavioral performance in dorsolateral prefrontal cortex (DLPFC), but not in the primary visual cortex (V1). Experimenter-defined dissection of time series into class labels based on task structure was confirmed by an unsupervised, bottom-up approach based on Hidden Markov Models. Furthermore, we show that different groupings of recorded time points into cognitive event classes can be used to test hypotheses about the specific cognitive role of a given brain region during task execution. We found that whilst the DLPFC strongly differentiated between task stages associated with different memory loads, but not between different visual-spatial aspects, the reverse was true for V1. 
Our methodology illustrates how different aspects of cognitive information processing during one and the same task can be separated and attributed to specific brain regions based on information contained in multivariate patterns of voxel activity.
Analysis of Forest Foliage Using a Multivariate Mixture Model
NASA Technical Reports Server (NTRS)
Hlavka, C. A.; Peterson, David L.; Johnson, L. F.; Ganapol, B.
1997-01-01
Data with wet chemical measurements and near infrared spectra of ground leaf samples were analyzed to test a multivariate regression technique for estimating component spectra which is based on a linear mixture model for absorbance. The resulting unmixed spectra for carbohydrates, lignin, and protein resemble the spectra of extracted plant starches, cellulose, lignin, and protein. The unmixed protein spectrum has prominent absorption spectra at wavelengths which have been associated with nitrogen bonds.
NASA Astrophysics Data System (ADS)
Sadegh, Mojtaba; Ragno, Elisa; AghaKouchak, Amir
2017-06-01
We present a newly developed Multivariate Copula Analysis Toolbox (MvCAT) which includes a wide range of copula families with different levels of complexity. MvCAT employs a Bayesian framework with a residual-based Gaussian likelihood function for inferring copula parameters and estimating the underlying uncertainties. The contribution of this paper is threefold: (a) providing a Bayesian framework to approximate the predictive uncertainties of fitted copulas, (b) introducing a hybrid-evolution Markov Chain Monte Carlo (MCMC) approach designed for numerical estimation of the posterior distribution of copula parameters, and (c) enabling the community to explore a wide range of copulas and evaluate them relative to the fitting uncertainties. We show that the commonly used local optimization methods for copula parameter estimation often get trapped in local minima. The proposed method, however, addresses this limitation and improves describing the dependence structure. MvCAT also enables evaluation of uncertainties relative to the length of record, which is fundamental to a wide range of applications such as multivariate frequency analysis.
Sornborger, Andrew T; Lauderdale, James D
2016-11-01
Neural data analysis has increasingly incorporated causal information to study circuit connectivity. Dimensional reduction forms the basis of most analyses of large multivariate time series. Here, we present a new, multitaper-based decomposition for stochastic, multivariate time series that acts on the covariance of the time series at all lags, C ( τ ), as opposed to standard methods that decompose the time series, X ( t ), using only information at zero-lag. In both simulated and neural imaging examples, we demonstrate that methods that neglect the full causal structure may be discarding important dynamical information in a time series.
Van Hertem, T; Bahr, C; Schlageter Tello, A; Viazzi, S; Steensels, M; Romanini, C E B; Lokhorst, C; Maltz, E; Halachmi, I; Berckmans, D
2016-09-01
The objective of this study was to evaluate whether a multi-sensor system (milk, activity, body posture) was a better classifier for lameness than the single-sensor-based detection models. Between September 2013 and August 2014, 3629 cow observations were collected on a commercial dairy farm in Belgium. Human locomotion scoring was used as reference for the model development and evaluation. Cow behaviour and performance were measured with existing sensors that were already present at the farm. A prototype of a three-dimensional-based video recording system was used to automatically quantify the back posture of a cow. For the single predictor comparisons, a receiver operating characteristics curve was made. For the multivariate detection models, logistic regression and generalized linear mixed models (GLMM) were developed. The best lameness classification model was obtained by the multi-sensor analysis (area under the receiver operating characteristics curve (AUC)=0.757±0.029), containing a combination of milk and milking variables, activity and gait and posture variables from videos. Second, the multivariate video-based system (AUC=0.732±0.011) performed better than the multivariate milk sensors (AUC=0.604±0.026) and the multivariate behaviour sensors (AUC=0.633±0.018). The video-based system performed better than the combined behaviour and performance-based detection model (AUC=0.669±0.028), indicating that it is worthwhile to consider a video-based lameness detection system, regardless of the presence of other existing sensors on the farm. The results suggest that Θ2, the feature variable for the back curvature around the hip joints, with an AUC of 0.719, is the best single predictor variable for lameness detection based on locomotion scoring. In general, this study showed that the video-based back posture monitoring system outperforms the behaviour and performance sensing techniques for locomotion scoring-based lameness detection. 
A GLMM with seven specific variables (walking speed, back posture measurement, daytime activity, milk yield, lactation stage, milk peak flow rate and milk peak conductivity) is the best combination of variables for lameness classification. The accuracy on four-level lameness classification was 60.3%. The accuracy improved to 79.8% for binary lameness classification. The binary GLMM obtained a sensitivity of 68.5% and a specificity of 87.6%, which both exceed the sensitivity (52.1%±4.7%) and specificity (83.2%±2.3%) of the multi-sensor logistic regression model. This shows that the repeated measures analysis in the GLMM, taking into account the individual history of the animal, outperforms the classification when thresholds based on herd level (a statistical population) are used.
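The AUC values quoted in the abstract above summarize how well a score separates lame from non-lame observations. As a minimal sketch of what that number measures (not the authors' pipeline), the AUC can be computed as the probability that a randomly chosen positive case scores higher than a randomly chosen negative one, with ties counted as one half. The function name `auc` and the toy scores below are hypothetical, chosen only for illustration.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison.

    Equals the fraction of (positive, negative) pairs in which the
    positive case has the higher score; a tie contributes 0.5.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical model scores, NOT data from the study.
lame_scores = [0.9, 0.8, 0.4]
sound_scores = [0.7, 0.3, 0.2, 0.1]
print(auc(lame_scores, sound_scores))  # 11 of 12 pairs are ordered correctly
```

The pairwise form is quadratic in the number of observations, which is fine for illustration; a rank-based formulation would be preferable for thousands of observations.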
Multivariate reference technique for quantitative analysis of fiber-optic tissue Raman spectroscopy.
Bergholt, Mads Sylvest; Duraipandian, Shiyamala; Zheng, Wei; Huang, Zhiwei
2013-12-03
We report a novel method making use of multivariate reference signals of fused silica and sapphire Raman signals generated from a ball-lens fiber-optic Raman probe for quantitative analysis of in vivo tissue Raman measurements in real time. Partial least-squares (PLS) regression modeling is applied to extract the characteristic internal reference Raman signals (e.g., shoulder of the prominent fused silica boson peak (~130 cm⁻¹); distinct sapphire ball-lens peaks (380, 417, 646, and 751 cm⁻¹)) from the ball-lens fiber-optic Raman probe for quantitative analysis of fiber-optic Raman spectroscopy. To evaluate the analytical value of this novel multivariate reference technique, a rapid Raman spectroscopy system coupled with a ball-lens fiber-optic Raman probe is used for in vivo oral tissue Raman measurements (n = 25 subjects) under 785 nm laser excitation powers ranging from 5 to 65 mW. An accurate linear relationship (R² = 0.981) with a root-mean-square error of cross validation (RMSECV) of 2.5 mW can be obtained for predicting the laser excitation power changes based on a leave-one-subject-out cross-validation, which is superior to the normal univariate reference method (RMSE = 6.2 mW). A root-mean-square error of prediction (RMSEP) of 2.4 mW (R² = 0.985) can also be achieved for laser power prediction in real time when we applied the multivariate method independently on the five new subjects (n = 166 spectra). We further apply the multivariate reference technique for quantitative analysis of gelatin tissue phantoms that gives rise to an RMSEP of ~2.0% (R² = 0.998) independent of laser excitation power variations. 
This work demonstrates that the multivariate reference technique can be advantageously used to monitor and correct the variations of laser excitation power and fiber coupling efficiency in situ for standardizing the tissue Raman intensity to realize quantitative analysis of tissue Raman measurements in vivo, which is particularly appealing in challenging Raman endoscopic applications.
TATES: Efficient Multivariate Genotype-Phenotype Analysis for Genome-Wide Association Studies
van der Sluis, Sophie; Posthuma, Danielle; Dolan, Conor V.
2013-01-01
To date, the genome-wide association study (GWAS) is the primary tool to identify genetic variants that cause phenotypic variation. As GWAS analyses are generally univariate in nature, multivariate phenotypic information is usually reduced to a single composite score. This practice often results in loss of statistical power to detect causal variants. Multivariate genotype–phenotype methods do exist but attain maximal power only in special circumstances. Here, we present a new multivariate method that we refer to as TATES (Trait-based Association Test that uses Extended Simes procedure), inspired by the GATES procedure proposed by Li et al (2011). For each component of a multivariate trait, TATES combines p-values obtained in standard univariate GWAS to acquire one trait-based p-value, while correcting for correlations between components. Extensive simulations, probing a wide variety of genotype–phenotype models, show that TATES's false positive rate is correct, and that TATES's statistical power to detect causal variants explaining 0.5% of the variance can be 2.5–9 times higher than the power of univariate tests based on composite scores and 1.5–2 times higher than the power of the standard MANOVA. Unlike other multivariate methods, TATES detects both genetic variants that are common to multiple phenotypes and genetic variants that are specific to a single phenotype, i.e. TATES provides a more complete view of the genetic architecture of complex traits. As the actual causal genotype–phenotype model is usually unknown and probably phenotypically and genetically complex, TATES, available as an open source program, constitutes a powerful new multivariate strategy that allows researchers to identify novel causal variants, while the complexity of traits is no longer a limiting factor. PMID:23359524
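The abstract above describes combining the univariate p-values of a multivariate trait into one trait-based p-value with an extended Simes procedure. The sketch below shows only the classical Simes combination, which is the starting point for such methods; the correlation correction that the abstract attributes to TATES is deliberately left out, so this is not the TATES statistic itself. The function name `simes_pvalue` and the example p-values are hypothetical.

```python
def simes_pvalue(pvalues):
    """Classical Simes combined p-value.

    Sort the m p-values in ascending order and return
    min over ranks j of (m * p_(j) / j).
    This plain form does not adjust for correlation between
    the phenotypes, unlike the extended procedure in the abstract.
    """
    m = len(pvalues)
    ordered = sorted(pvalues)
    return min(m * p / rank for rank, p in enumerate(ordered, start=1))


# Hypothetical per-phenotype p-values for one SNP (illustration only).
per_trait_p = [0.04, 0.01, 0.20]
print(simes_pvalue(per_trait_p))  # candidates are 0.03, 0.06, 0.20
```

Because the minimum is taken over all ranks, a single very small p-value or several moderately small ones can both drive the combined result, which fits the abstract's point about detecting variants that are either phenotype-specific or shared.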
Multivariate analysis of prognostic factors in synovial sarcoma.
Koh, Kyoung Hwan; Cho, Eun Yoon; Kim, Dong Wook; Seo, Sung Wook
2009-11-01
Many studies have described the diversity of synovial sarcoma in terms of its biological characteristics and clinical features. Moreover, much effort has been expended on the identification of prognostic factors because of the unpredictable behavior of synovial sarcomas. However, with the exception of tumor size, published results have been inconsistent. We attempted to identify independent risk factors using survival analysis. Forty-one consecutive patients with synovial sarcoma were prospectively followed from January 1997 to March 2008. Overall and progression-free survival for age, sex, tumor size, tumor location, metastasis at presentation, histologic subtype, chemotherapy, radiation therapy, and resection margin were analyzed, and standard multivariate Cox proportional hazard regression analysis was used to evaluate potential prognostic factors. Tumor size (>5 cm), nonlimb-based tumors, metastasis at presentation, and a monophasic subtype were associated with poorer overall survival. Multivariate analysis showed that metastasis at presentation and monophasic tumor subtype affected overall survival. For progression-free survival, monophasic subtype was found to be the only prognostic factor. The study confirmed that histologic subtype is the single most important independent prognostic factor of synovial sarcoma regardless of tumor stage.
Yang, James J; Li, Jia; Williams, L Keoki; Buu, Anne
2016-01-05
In genome-wide association studies (GWAS) for complex diseases, the association between a SNP and each phenotype is usually weak. Combining multiple related phenotypic traits can increase the power of gene search and thus is a practically important area that requires methodology work. This study provides a comprehensive review of existing methods for conducting GWAS on complex diseases with multiple phenotypes, including the multivariate analysis of variance (MANOVA), the principal component analysis (PCA), the generalized estimating equations (GEE), the trait-based association test involving the extended Simes procedure (TATES), and the classical Fisher combination test. We propose a new method that relaxes the unrealistic independence assumption of the classical Fisher combination test and is computationally efficient. To demonstrate applications of the proposed method, we also present the results of statistical analysis on the Study of Addiction: Genetics and Environment (SAGE) data. Our simulation study shows that the proposed method has higher power than existing methods while controlling for the type I error rate. The GEE and the classical Fisher combination test, on the other hand, do not control the type I error rate and thus are not recommended. In general, the power of the competing methods decreases as the correlation between phenotypes increases. All the methods tend to have lower power when the multivariate phenotypes come from long-tailed distributions. The real data analysis also demonstrates that the proposed method allows us to compare the marginal results with the multivariate results and specify which SNPs are specific to a particular phenotype or contribute to the common construct. The proposed method outperforms existing methods in most settings and also has great applications in GWAS on complex diseases with multiple phenotypes such as substance abuse disorders.
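For orientation, the classical Fisher combination test named in the abstract above can be sketched as follows. This is the textbook version that assumes independent p-values, which is exactly the assumption the abstract calls unrealistic for correlated phenotypes; it is not the authors' proposed method. The function name `fisher_combined_pvalue` is hypothetical. The statistic is X = -2 Σ ln p_i, referred to a chi-square distribution with 2k degrees of freedom, whose upper tail has a closed form when the degrees of freedom are even.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Classical Fisher combination of k independent p-values.

    Statistic: X = -2 * sum(ln p_i), chi-square with 2k degrees of freedom
    under the null. For even degrees of freedom the survival function is
    exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!, so no external library is needed.
    """
    k = len(pvalues)
    statistic = -2.0 * sum(math.log(p) for p in pvalues)
    half = statistic / 2.0
    term = 1.0   # (X/2)^0 / 0!
    total = 1.0
    for i in range(1, k):
        term *= half / i   # builds (X/2)^i / i! incrementally
        total += term
    return math.exp(-half) * total


# Hypothetical p-values for two phenotypes (illustration only).
print(fisher_combined_pvalue([0.05, 0.05]))
```

With a single p-value the function returns that p-value unchanged, which is a quick sanity check on the closed form. When phenotypes are correlated, the chi-square reference distribution no longer holds, consistent with the abstract's remark that the classical test does not control the type I error rate in that setting.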
Innovation Analysis | Energy Analysis | NREL
. New empirical methods for estimating technical and commercial impact (based on patent citations and Commercial Breakthroughs, NREL employed regression models and multivariate simulations to compare social in the marketplace and found that: Web presence may provide a better representation of the commercial
Goh, Choon Fu; Craig, Duncan Q M; Hadgraft, Jonathan; Lane, Majella E
2017-02-01
Drug permeation through the intercellular lipids, which pack around and between corneocytes, may be enhanced by increasing the thermodynamic activity of the active in a formulation. However, this may also result in unwanted drug crystallisation on and in the skin. In this work, we explore the combination of ATR-FTIR spectroscopy and multivariate data analysis to study drug crystallisation in the skin. Ex vivo permeation studies of saturated solutions of diclofenac sodium (DF Na) in two vehicles, propylene glycol (PG) and dimethyl sulphoxide (DMSO), were carried out in porcine ear skin. Tape stripping and ATR-FTIR spectroscopy were conducted simultaneously to collect spectral data as a function of skin depth. Multivariate data analysis was applied to visualise and categorise the spectral data in the region of interest (1700-1500 cm⁻¹) containing the carboxylate (COO⁻) asymmetric stretching vibrations of DF Na. Spectral data showed the redshifts of the COO⁻ asymmetric stretching vibrations for DF Na in the solution compared with solid drug. Similar shifts were evident following application of saturated solutions of DF Na to porcine skin samples. Multivariate data analysis categorised the spectral data based on the spectral differences and drug crystallisation was found to be confined to the upper layers of the skin. This proof-of-concept study highlights the utility of ATR-FTIR spectroscopy in combination with multivariate data analysis as a simple and rapid approach in the investigation of drug deposition in the skin. The approach described here will be extended to the study of other actives for topical application to the skin. Copyright © 2016 Elsevier B.V. All rights reserved.
Shi, Wenhao; Zhang, Silin; Zhao, Wanqiu; Xia, Xue; Wang, Min; Wang, Hui; Bai, Haiyan; Shi, Juanzi
2013-07-01
What factors does multivariate logistic regression show to be significantly associated with the likelihood of clinical pregnancy in vitrified-warmed embryo transfer (VET) cycles? Assisted hatching (AH) and freezing embryos to avoid the risk of ovarian hyperstimulation syndrome (OHSS) were significantly positively associated with a greater likelihood of clinical pregnancy. Single factor analysis has shown AH, number of embryos transferred and freezing because of OHSS risk to be positively, and damaged blastomere to be negatively, significantly associated with the chance of clinical pregnancy after VET. It remains unclear what factors would be significant after multivariate analysis. The study was a retrospective analysis of 2313 VET cycles from 1481 patients performed between January 2008 and April 2012. A multivariate logistic regression analysis was performed to identify the factors that affect the clinical pregnancy outcome of VET. There were 22 candidate variables selected based on clinical experience and the literature. With thresholds of α_entry = α_removal = 0.05 for both variable entry and variable removal, eight variables were chosen to contribute to the multivariable model by the bootstrap stepwise variable selection algorithm (n = 1000). The eight variables were age at controlled ovarian hyperstimulation (COH), reason for freezing, AH, endometrial thickness, damaged blastomere, number of embryos transferred, number of good-quality embryos, and blood presence on transfer catheter. A descriptive comparison of the relative importance was accomplished by the proportion of explained variation (PEV). 
Among the reasons for freezing, the OHSS group showed a higher OR than the surplus embryo group when compared with other reasons for VET groups (OHSS versus Other, OR: 2.145; CI: 1.4-3.286; Surplus embryos versus Other, OR: 1.152; CI: 0.761-1.743) and high PEV (marginal 2.77%, P = 0.2911; partial 1.68%; CI of area under receiver operating characteristic curve (ROC): 0.5576-0.6000). AH also showed a high OR (OR: 2.105, CI: 1.554-2.85) and high PEV (marginal 1.97%; partial 1.02%; CI of area under ROC: 0.5344-0.5647). The number of good-quality embryos showed the highest marginal PEV and partial PEV (marginal 3.91%, partial 2.28%; CI of area under ROC: 0.5886-0.6343). This was a retrospective multivariate analysis of the data obtained over 5 years from a single IVF center. Repeated cycles in the same woman were treated as independent observations, which could introduce bias. Results are based on clinical pregnancy and not live births. Prospective analysis of a larger data set from a multicenter study based on live births is necessary to confirm the findings. Paying attention to the quality of embryos, the number of good embryos, AH and the reasons for freezing that are associated with clinical pregnancy after VET will assist the improvement of success rates.
Aguirre-Gamboa, Raul; Trevino, Victor
2014-06-01
MicroRNAs (miRNAs) play a key role in post-transcriptional regulation of mRNA levels. Their function in cancer has been studied by high-throughput methods generating valuable sources of public information. Thus, miRNA signatures predicting cancer clinical outcomes are emerging. An important step to propose miRNA-based biomarkers before clinical validation is their evaluation in independent cohorts. Although it can be carried out using public data, such task is time-consuming and requires a specialized analysis. Therefore, to aid and simplify the evaluation of prognostic miRNA signatures in cancer, we developed SurvMicro, a free and easy-to-use web tool that assesses miRNA signatures from publicly available miRNA profiles using multivariate survival analysis. SurvMicro is composed of a wide and updated database of >40 cohorts in different tissues and a web tool where survival analysis can be done in minutes. We presented evaluations to portray the straightforward functionality of SurvMicro in liver and lung cancer. To our knowledge, SurvMicro is the only bioinformatic tool that aids the evaluation of multivariate prognostic miRNA signatures in cancer. SurvMicro and its tutorial are freely available at http://bioinformatica.mty.itesm.mx/SurvMicro. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Multivariate meta-analysis: potential and promise.
Jackson, Dan; Riley, Richard; White, Ian R
2011-09-10
The multivariate random effects model is a generalization of the standard univariate model. Multivariate meta-analysis is becoming more commonly used and the techniques and related computer software, although continually under development, are now in place. In order to raise awareness of the multivariate methods, and discuss their advantages and disadvantages, we organized a one day 'Multivariate meta-analysis' event at the Royal Statistical Society. In addition to disseminating the most recent developments, we also received an abundance of comments, concerns, insights, critiques and encouragement. This article provides a balanced account of the day's discourse. By giving others the opportunity to respond to our assessment, we hope to ensure that the various view points and opinions are aired before multivariate meta-analysis simply becomes another widely used de facto method without any proper consideration of it by the medical statistics community. We describe the areas of application that multivariate meta-analysis has found, the methods available, the difficulties typically encountered and the arguments for and against the multivariate methods, using four representative but contrasting examples. We conclude that the multivariate methods can be useful, and in particular can provide estimates with better statistical properties, but also that these benefits come at the price of making more assumptions which do not result in better inference in every case. Although there is evidence that multivariate meta-analysis has considerable potential, it must be even more carefully applied than its univariate counterpart in practice. Copyright © 2011 John Wiley & Sons, Ltd.
The NLS-Based Nonlinear Grey Multivariate Model for Forecasting Pollutant Emissions in China.
Pei, Ling-Ling; Li, Qin; Wang, Zheng-Xin
2018-03-08
The relationship between pollutant discharge and economic growth has been a major research focus in environmental economics. To accurately estimate the nonlinear change law of China's pollutant discharge with economic growth, this study establishes a transformed nonlinear grey multivariable (TNGM(1, N)) model based on the nonlinear least squares (NLS) method. The Gauss-Seidel iterative algorithm was used to solve the parameters of the TNGM(1, N) model based on the NLS basic principle. This algorithm improves the precision of the model by continuous iteration and constantly approximating the optimal regression coefficient of the nonlinear model. In our empirical analysis, the traditional grey multivariate model GM(1, N) and the NLS-based TNGM(1, N) model were respectively adopted to forecast and analyze the relationship among wastewater discharge per capita (WDPC), and per capita emissions of SO₂ and dust, alongside GDP per capita in China during the period 1996-2015. Results indicated that the NLS algorithm is able to effectively help the grey multivariable model identify the nonlinear relationship between pollutant discharge and economic growth. The results show that the NLS-based TNGM(1, N) model presents greater precision when forecasting WDPC, SO₂ emissions and dust emissions per capita, compared to the traditional GM(1, N) model; WDPC indicates a growing tendency aligned with the growth of GDP, while the per capita emissions of SO₂ and dust reduce accordingly.
Impact of Gender on 30-Day Complications After Primary Total Joint Arthroplasty.
Robinson, Jonathan; Shin, John I; Dowdell, James E; Moucha, Calin S; Chen, Darwin D
2017-08-01
The impact of gender on 30-day complications has been investigated in other surgical procedures but has not yet been studied in total hip arthroplasty (THA) or total knee arthroplasty (TKA). Patients who received THA or TKA from 2012 to 2014 were identified in the National Surgical Quality Improvement Program database. Patients were divided into 2 groups based on gender. Bivariate and multivariate analyses were performed to assess associations between gender and patient factors and complications after THA or TKA and to assess whether gender was an independent risk factor. THA patients consisted of 45.1% male and 54.9% female. In a multivariate analysis, female gender was found to be a protective factor for mortality, sepsis, cardiovascular complications, unplanned reintubation, and renal complications and an independent risk factor for urinary tract infection, blood transfusion, and nonhome discharge after THA. TKA patients consisted of 36.7% male and 62.3% female. Multivariate analysis revealed female gender as a protective factor for sepsis, cardiovascular complications, and renal complications and as an independent risk factor for urinary tract infection, blood transfusion, and nonhome discharge after TKA. There are discrepancies in THA and TKA complications based on gender, and the multivariate analyses confirmed gender as an independent risk factor for certain complications. Physicians should be mindful of a patient's gender for better risk stratification and informed consent. Copyright © 2017 Elsevier Inc. All rights reserved.
Tanabe, Kenji
2016-04-27
Small-molecule compounds are widely used as biological research tools and therapeutic drugs. Therefore, uncovering novel targets of these compounds should provide insights that are valuable in both basic and clinical studies. I developed a method for image-based compound profiling by quantitating the effects of compounds on signal transduction and vesicle trafficking of epidermal growth factor receptor (EGFR). Using six signal transduction molecules and two markers of vesicle trafficking, 570 image features were obtained and subjected to multivariate analysis. Fourteen compounds that affected EGFR or its pathways were classified into four clusters, based on their phenotypic features. Surprisingly, one EGFR inhibitor (CAS 879127-07-8) was classified into the same cluster as nocodazole, a microtubule depolymerizer. In fact, this compound directly depolymerized microtubules. These results indicate that CAS 879127-07-8 could be used as a chemical probe to investigate both the EGFR pathway and microtubule dynamics. The image-based multivariate analysis developed herein has potential as a powerful tool for discovering unexpected drug properties.
Estimation of failure criteria in multivariate sensory shelf life testing using survival analysis.
Giménez, Ana; Gagliardi, Andrés; Ares, Gastón
2017-09-01
For most food products, shelf life is determined by changes in their sensory characteristics. A predetermined increase or decrease in the intensity of a sensory characteristic has frequently been used to signal that a product has reached the end of its shelf life. Considering all attributes change simultaneously, the concept of multivariate shelf life allows a single measurement of deterioration that takes into account all these sensory changes at a certain storage time. The aim of the present work was to apply survival analysis to estimate failure criteria in multivariate sensory shelf life testing using two case studies, hamburger buns and orange juice, by modelling the relationship between consumers' rejection of the product and the deterioration index estimated using PCA. In both studies, a panel of 13 trained assessors evaluated the samples using descriptive analysis whereas a panel of 100 consumers answered a "yes" or "no" question regarding intention to buy or consume the product. PC1 explained the great majority of the variance, indicating all sensory characteristics evolved similarly with storage time. Thus, PC1 could be regarded as index of sensory deterioration and a single failure criterion could be estimated through survival analysis for 25 and 50% consumers' rejection. The proposed approach based on multivariate shelf life testing may increase the accuracy of shelf life estimations. Copyright © 2017 Elsevier Ltd. All rights reserved.
Wang, Li; Wang, Xiaoyi; Jin, Xuebo; Xu, Jiping; Zhang, Huiyan; Yu, Jiabin; Sun, Qian; Gao, Chong; Wang, Lingbin
2017-03-01
Current methods describe the formation process of algae inaccurately and predict water blooms with low precision. In this paper, the chemical mechanism of algae growth is analyzed, and a correlation analysis of chlorophyll-a and algal density is conducted by chemical measurement. Taking into account the influence of multiple factors on algae growth and water blooms, a comprehensive prediction method that combines multivariate time series with an intelligent model is put forward. Firstly, the main factors that affect the reproduction of the algae are analyzed through the process of photosynthesis. A compensation prediction method for multivariate time series analysis, based on a neural network and a Support Vector Machine, is put forward and combined with Kernel Principal Component Analysis to reduce the dimension of the factors influencing blooms. Then, a Genetic Algorithm is applied to improve the generalization ability of the BP network and the Least Squares Support Vector Machine. Experimental results show that this method better compensates the prediction model of multivariate time series analysis and is an effective way to improve the description accuracy of algae growth and the prediction precision of water blooms.
Kia, Seyed Mostafa; Vega Pons, Sandro; Weisz, Nathan; Passerini, Andrea
2016-01-01
Brain decoding is a popular multivariate approach for hypothesis testing in neuroimaging. Linear classifiers are widely employed in the brain decoding paradigm to discriminate among experimental conditions. Then, the derived linear weights are visualized in the form of multivariate brain maps to further study spatio-temporal patterns of underlying neural activities. It is well known that the brain maps derived from weights of linear classifiers are hard to interpret because of high correlations between predictors, low signal to noise ratios, and the high dimensionality of neuroimaging data. Therefore, improving the interpretability of brain decoding approaches is of primary interest in many neuroimaging studies. Despite extensive studies of this type, at present, there is no formal definition for interpretability of multivariate brain maps. As a consequence, there is no quantitative measure for evaluating the interpretability of different brain decoding methods. In this paper, first, we present a theoretical definition of interpretability in brain decoding; we show that the interpretability of multivariate brain maps can be decomposed into their reproducibility and representativeness. Second, as an application of the proposed definition, we exemplify a heuristic for approximating the interpretability in multivariate analysis of evoked magnetoencephalography (MEG) responses. Third, we propose to combine the approximated interpretability and the generalization performance of the brain decoding into a new multi-objective criterion for model selection. Our results, for the simulated and real MEG data, show that optimizing the hyper-parameters of the regularized linear classifier based on the proposed criterion results in more informative multivariate brain maps. 
More importantly, the presented definition provides the theoretical background for quantitative evaluation of interpretability, and hence, facilitates the development of more effective brain decoding algorithms in the future.
Kia, Seyed Mostafa; Vega Pons, Sandro; Weisz, Nathan; Passerini, Andrea
2017-01-01
Brain decoding is a popular multivariate approach for hypothesis testing in neuroimaging. Linear classifiers are widely employed in the brain decoding paradigm to discriminate among experimental conditions. Then, the derived linear weights are visualized in the form of multivariate brain maps to further study spatio-temporal patterns of underlying neural activities. It is well known that the brain maps derived from weights of linear classifiers are hard to interpret because of high correlations between predictors, low signal to noise ratios, and the high dimensionality of neuroimaging data. Therefore, improving the interpretability of brain decoding approaches is of primary interest in many neuroimaging studies. Despite extensive studies of this type, at present, there is no formal definition for interpretability of multivariate brain maps. As a consequence, there is no quantitative measure for evaluating the interpretability of different brain decoding methods. In this paper, first, we present a theoretical definition of interpretability in brain decoding; we show that the interpretability of multivariate brain maps can be decomposed into their reproducibility and representativeness. Second, as an application of the proposed definition, we exemplify a heuristic for approximating the interpretability in multivariate analysis of evoked magnetoencephalography (MEG) responses. Third, we propose to combine the approximated interpretability and the generalization performance of the brain decoding into a new multi-objective criterion for model selection. Our results, for the simulated and real MEG data, show that optimizing the hyper-parameters of the regularized linear classifier based on the proposed criterion results in more informative multivariate brain maps. 
More importantly, the presented definition provides the theoretical background for quantitative evaluation of interpretability, and hence, facilitates the development of more effective brain decoding algorithms in the future. PMID:28167896
Meeker, Daniella; Jiang, Xiaoqian; Matheny, Michael E; Farcas, Claudiu; D'Arcy, Michel; Pearlman, Laura; Nookala, Lavanya; Day, Michele E; Kim, Katherine K; Kim, Hyeoneui; Boxwala, Aziz; El-Kareh, Robert; Kuo, Grace M; Resnic, Frederic S; Kesselman, Carl; Ohno-Machado, Lucila
2015-11-01
Centralized and federated models for sharing data in research networks currently exist. To build multivariate data analysis for centralized networks, transfer of patient-level data to a central computation resource is necessary. The authors implemented distributed multivariate models for federated networks in which patient-level data is kept at each site and data exchange policies are managed in a study-centric manner. The objective was to implement infrastructure that supports the functionality of some existing research networks (e.g., cohort discovery, workflow management, and estimation of multivariate analytic models on centralized data) while adding additional important new features, such as algorithms for distributed iterative multivariate models, a graphical interface for multivariate model specification, synchronous and asynchronous response to network queries, investigator-initiated studies, and study-based control of staff, protocols, and data sharing policies. Based on the requirements gathered from statisticians, administrators, and investigators from multiple institutions, the authors developed infrastructure and tools to support multisite comparative effectiveness studies using web services for multivariate statistical estimation in the SCANNER federated network. The authors implemented massively parallel (map-reduce) computation methods and a new policy management system to enable each study initiated by network participants to define the ways in which data may be processed, managed, queried, and shared. The authors illustrated the use of these systems among institutions with highly different policies and operating under different state laws. Federated research networks need not limit distributed query functionality to count queries, cohort discovery, or independently estimated analytic models. 
Multivariate analyses can be efficiently and securely conducted without patient-level data transport, allowing institutions with strict local data storage requirements to participate in sophisticated analyses based on federated research networks. © The Author 2015. Published by Oxford University Press on behalf of the American Medical Informatics Association.
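The idea of fitting a multivariate model without moving patient-level data can be illustrated with a much simpler case than the iterative models described above. The sketch below is an assumption-laden illustration, not the SCANNER implementation: it uses ordinary least squares, where each site only needs to share the aggregates X'X and X'y. The function names `site_summary` and `pooled_fit` and the simulated data are hypothetical.

```python
import numpy as np

def site_summary(X, y):
    """Run locally at one site; only these aggregates leave the site."""
    return X.T @ X, X.T @ y

def pooled_fit(summaries):
    """Coordinator: add up the per-site aggregates and solve the normal equations."""
    xtx = sum(s[0] for s in summaries)
    xty = sum(s[1] for s in summaries)
    return np.linalg.solve(xtx, xty)

# Simulated data for three hypothetical sites (intercept plus two covariates).
rng = np.random.default_rng(0)
true_beta = np.array([1.0, -2.0, 0.5])
sites = []
for n in (40, 60, 80):
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])
    y = X @ true_beta + rng.normal(scale=0.1, size=n)
    sites.append((X, y))

# Federated estimate: built from site-level aggregates only.
beta_federated = pooled_fit([site_summary(X, y) for X, y in sites])

# Centralized estimate, for comparison: requires pooling all rows in one place.
X_all = np.vstack([X for X, _ in sites])
y_all = np.concatenate([y for _, y in sites])
beta_central = np.linalg.lstsq(X_all, y_all, rcond=None)[0]
```

For linear regression the two estimates agree up to rounding. Models such as logistic regression would instead need several rounds in which sites share gradient-type summaries, which is presumably what "distributed iterative" refers to in the abstract.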
Mapping Informative Clusters in a Hierarchial Framework of fMRI Multivariate Analysis
Xu, Rui; Zhen, Zonglei; Liu, Jia
2010-01-01
Pattern recognition methods have become increasingly popular in fMRI data analysis, which are powerful in discriminating between multi-voxel patterns of brain activities associated with different mental states. However, when they are used in functional brain mapping, the location of discriminative voxels varies significantly, raising difficulties in interpreting the locus of the effect. Here we proposed a hierarchical framework of multivariate approach that maps informative clusters rather than voxels to achieve reliable functional brain mapping without compromising the discriminative power. In particular, we first searched for local homogeneous clusters that consisted of voxels with similar response profiles. Then, a multi-voxel classifier was built for each cluster to extract discriminative information from the multi-voxel patterns. Finally, through multivariate ranking, outputs from the classifiers were served as a multi-cluster pattern to identify informative clusters by examining interactions among clusters. Results from both simulated and real fMRI data demonstrated that this hierarchical approach showed better performance in the robustness of functional brain mapping than traditional voxel-based multivariate methods. In addition, the mapped clusters were highly overlapped for two perceptually equivalent object categories, further confirming the validity of our approach. In short, the hierarchical framework of multivariate approach is suitable for both pattern classification and brain mapping in fMRI studies. PMID:21152081
Rosa, Maria J; Mehta, Mitul A; Pich, Emilio M; Risterucci, Celine; Zelaya, Fernando; Reinders, Antje A T S; Williams, Steve C R; Dazzan, Paola; Doyle, Orla M; Marquand, Andre F
2015-01-01
An increasing number of neuroimaging studies are based on either combining more than one data modality (inter-modal) or combining more than one measurement from the same modality (intra-modal). To date, most intra-modal studies using multivariate statistics have focused on differences between datasets, for instance relying on classifiers to differentiate between effects in the data. However, to fully characterize these effects, multivariate methods able to measure similarities between datasets are needed. One classical technique for estimating the relationship between two datasets is canonical correlation analysis (CCA). However, in the context of high-dimensional data the application of CCA is extremely challenging. A recent extension of CCA, sparse CCA (SCCA), overcomes this limitation, by regularizing the model parameters while yielding a sparse solution. In this work, we modify SCCA with the aim of facilitating its application to high-dimensional neuroimaging data and finding meaningful multivariate image-to-image correspondences in intra-modal studies. In particular, we show how the optimal subset of variables can be estimated independently and we look at the information encoded in more than one set of SCCA transformations. We illustrate our framework using Arterial Spin Labeling data to investigate multivariate similarities between the effects of two antipsychotic drugs on cerebral blood flow.
Clark, Neil R.; Szymkiewicz, Maciej; Wang, Zichen; Monteiro, Caroline D.; Jones, Matthew R.; Ma’ayan, Avi
2016-01-01
Gene set analysis of differential expression, which identifies collectively differentially expressed gene sets, has become an important tool for biology. The power of this approach lies in its reduction of the dimensionality of the statistical problem and its incorporation of biological interpretation by construction. Many approaches to gene set analysis have been proposed, but benchmarking their performance in the setting of real biological data is difficult due to the lack of a gold standard. In a previously published work we proposed a geometrical approach to differential expression which performed highly in benchmarking tests and compared well to the most popular methods of differential gene expression. As reported, this approach has a natural extension to gene set analysis which we call Principal Angle Enrichment Analysis (PAEA). PAEA employs dimensionality reduction and a multivariate approach for gene set enrichment analysis. However, the performance of this method has not been assessed nor its implementation as a web-based tool. Here we describe new benchmarking protocols for gene set analysis methods and find that PAEA performs highly. The PAEA method is implemented as a user-friendly web-based tool, which contains 70 gene set libraries and is freely available to the community. PMID:26848405
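Principal angles, which give PAEA its name, measure how closely two linear subspaces are aligned. The following is a minimal sketch of the standard computation (orthonormalize each basis, then take singular values of the cross product). It does not reproduce the PAEA procedure itself, and the function name `principal_angles` and the toy inputs are hypothetical.

```python
import numpy as np

def principal_angles(A, B):
    """Principal angles (radians) between the column spaces of A and B."""
    Qa, _ = np.linalg.qr(A)  # orthonormal basis for span(A)
    Qb, _ = np.linalg.qr(B)  # orthonormal basis for span(B)
    # Singular values of Qa^T Qb are the cosines of the principal angles.
    cosines = np.linalg.svd(Qa.T @ Qb, compute_uv=False)
    return np.arccos(np.clip(cosines, -1.0, 1.0))

e = np.eye(4)
same = principal_angles(e[:, :2], e[:, :2])  # identical subspaces
orth = principal_angles(e[:, :2], e[:, 2:])  # orthogonal subspaces
```

Small angles indicate that the two subspaces share directions; angles near pi/2 indicate that they do not.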
Clark, Neil R; Szymkiewicz, Maciej; Wang, Zichen; Monteiro, Caroline D; Jones, Matthew R; Ma'ayan, Avi
2015-11-01
Gene set analysis of differential expression, which identifies collectively differentially expressed gene sets, has become an important tool for biology. The power of this approach lies in its reduction of the dimensionality of the statistical problem and its incorporation of biological interpretation by construction. Many approaches to gene set analysis have been proposed, but benchmarking their performance in the setting of real biological data is difficult due to the lack of a gold standard. In a previously published work we proposed a geometrical approach to differential expression which performed highly in benchmarking tests and compared well to the most popular methods of differential gene expression. As reported, this approach has a natural extension to gene set analysis which we call Principal Angle Enrichment Analysis (PAEA). PAEA employs dimensionality reduction and a multivariate approach for gene set enrichment analysis. However, the performance of this method has not been assessed nor its implementation as a web-based tool. Here we describe new benchmarking protocols for gene set analysis methods and find that PAEA performs highly. The PAEA method is implemented as a user-friendly web-based tool, which contains 70 gene set libraries and is freely available to the community.
Kamal, Ghulam Mustafa; Wang, Xiaohua; Bin Yuan; Wang, Jie; Sun, Peng; Zhang, Xu; Liu, Maili
2016-09-01
Soy sauce, a well-known seasoning all over the world, especially in Asia, is available in the global market in a wide range of types based on its purpose and the processing methods. Its composition varies with respect to the fermentation processes and the addition of additives, preservatives and flavor enhancers. A comprehensive 1H NMR based study regarding the metabonomic variations of soy sauce to differentiate among different types of soy sauce available on the global market has been limited due to the complexity of the mixture. In the present study, 13C NMR spectroscopy coupled with multivariate statistical data analysis, such as principal component analysis (PCA) and orthogonal partial least square-discriminant analysis (OPLS-DA), was applied to investigate metabonomic variations among different types of soy sauce, namely super light, super dark, red cooking and mushroom soy sauce. The main additives in soy sauce, such as glutamate, sucrose and glucose, were easily distinguished and quantified using 13C NMR spectroscopy; they were otherwise difficult to assign and quantify due to serious signal overlaps in 1H NMR spectra. The significantly higher concentration of sucrose in dark, red cooking and mushroom flavored soy sauce can be directly linked to the addition of caramel in soy sauce. Similarly, the significantly higher level of glutamate in super light as compared to super dark and mushroom flavored soy sauce may come from the addition of monosodium glutamate. The study highlights the potential of 13C NMR based metabonomics coupled with multivariate statistical data analysis in differentiating between the types of soy sauce on the basis of level of additives, raw materials and fermentation procedures. Copyright © 2016 Elsevier B.V. All rights reserved.
Multivariate Models for Normal and Binary Responses in Intervention Studies
ERIC Educational Resources Information Center
Pituch, Keenan A.; Whittaker, Tiffany A.; Chang, Wanchen
2016-01-01
Use of multivariate analysis (e.g., multivariate analysis of variance) is common when normally distributed outcomes are collected in intervention research. However, when mixed responses--a set of normal and binary outcomes--are collected, standard multivariate analyses are no longer suitable. While mixed responses are often obtained in…
NASA Astrophysics Data System (ADS)
Hashim, Noor Haslinda Noor; Latip, Jalifah; Khatib, Alfi
2016-11-01
The metabolites of Clinacanthus nutans leaves extracts and their dependence on drying process were systematically characterized using 1H nuclear magnetic resonance spectroscopy (NMR) multivariate data analysis. Principal component analysis (PCA) and partial least square-discriminant analysis (PLS-DA) were able to distinguish the leaves extracts obtained from different drying methods. The identified metabolites were carbohydrates, amino acid, flavonoids and sulfur glucoside compounds. The major metabolites responsible for the separation in PLS-DA loading plots were lupeol, cycloclinacosides, betulin, cerebrosides and choline. The results showed that the combination of 1H NMR spectroscopy and multivariate data analyses could act as an efficient technique to understand the C. nutans composition and its variation.
Ferreira, Fábio S; Pereira, João M S; Duarte, João V; Castelo-Branco, Miguel
2017-01-01
Although voxel based morphometry studies are still the standard for analyzing brain structure, their dependence on massive univariate inferential methods is a limiting factor. A better understanding of brain pathologies can be achieved by applying inferential multivariate methods, which allow the study of multiple dependent variables, e.g. different imaging modalities of the same subject. Given the widespread use of SPM software in the brain imaging community, the main aim of this work is the implementation of massive multivariate inferential analysis as a toolbox in this software package, applied to T1 and T2 structural data from diabetic patients and controls. This implementation was compared with the traditional ANCOVA in SPM and a similar multivariate GLM toolbox (MRM). We implemented the new toolbox and tested it by investigating brain alterations in a cohort of twenty-eight type 2 diabetes patients and twenty-six matched healthy controls, using information from both T1 and T2 weighted structural MRI scans, both separately (using standard univariate VBM) and simultaneously, with multivariate analyses. Univariate VBM replicated predominantly bilateral changes in basal ganglia and insular regions in type 2 diabetes patients. On the other hand, multivariate analyses replicated key findings of univariate results, while also revealing the thalami as additional foci of pathology. While the presented algorithm must be further optimized, the proposed toolbox is the first implementation of multivariate statistics in SPM8 as a user-friendly toolbox, which shows great potential and is ready to be validated in other clinical cohorts and modalities.
Ferreira, Fábio S.; Pereira, João M.S.; Duarte, João V.; Castelo-Branco, Miguel
2017-01-01
Background: Although voxel based morphometry studies are still the standard for analyzing brain structure, their dependence on massive univariate inferential methods is a limiting factor. A better understanding of brain pathologies can be achieved by applying inferential multivariate methods, which allow the study of multiple dependent variables, e.g. different imaging modalities of the same subject. Objective: Given the widespread use of SPM software in the brain imaging community, the main aim of this work is the implementation of massive multivariate inferential analysis as a toolbox in this software package, applied to T1 and T2 structural data from diabetic patients and controls. This implementation was compared with the traditional ANCOVA in SPM and a similar multivariate GLM toolbox (MRM). Method: We implemented the new toolbox and tested it by investigating brain alterations in a cohort of twenty-eight type 2 diabetes patients and twenty-six matched healthy controls, using information from both T1 and T2 weighted structural MRI scans, both separately (using standard univariate VBM) and simultaneously, with multivariate analyses. Results: Univariate VBM replicated predominantly bilateral changes in basal ganglia and insular regions in type 2 diabetes patients. On the other hand, multivariate analyses replicated key findings of univariate results, while also revealing the thalami as additional foci of pathology. Conclusion: While the presented algorithm must be further optimized, the proposed toolbox is the first implementation of multivariate statistics in SPM8 as a user-friendly toolbox, which shows great potential and is ready to be validated in other clinical cohorts and modalities. PMID:28761571
Fast classification of hazelnut cultivars through portable infrared spectroscopy and chemometrics
NASA Astrophysics Data System (ADS)
Manfredi, Marcello; Robotti, Elisa; Quasso, Fabio; Mazzucco, Eleonora; Calabrese, Giorgio; Marengo, Emilio
2018-01-01
The authentication and traceability of hazelnuts is very important for both the consumer and the food industry, to safeguard the protected varieties and the food quality. This study investigates the use of a portable FTIR spectrometer coupled to multivariate statistical analysis for the classification of raw hazelnuts. The method discriminates hazelnuts from different origins/cultivars based on differences of the signal intensities of their IR spectra. The multivariate classification methods, namely principal component analysis (PCA) followed by linear discriminant analysis (LDA) and partial least square discriminant analysis (PLS-DA), with or without variable selection, allowed a very good discrimination among the groups, with PLS-DA coupled to variable selection providing the best results. Due to the fast analysis, high sensitivity, simplicity and no sample preparation, the proposed analytical methodology could be successfully used to verify the cultivar of hazelnuts, and the analysis can be performed quickly and directly on site.
Hierarchical multivariate covariance analysis of metabolic connectivity
Carbonell, Felix; Charil, Arnaud; Zijdenbos, Alex P; Evans, Alan C; Bedell, Barry J
2014-01-01
Conventional brain connectivity analysis is typically based on the assessment of interregional correlations. Given that correlation coefficients are derived from both covariance and variance, group differences in covariance may be obscured by differences in the variance terms. To facilitate a comprehensive assessment of connectivity, we propose a unified statistical framework that interrogates the individual terms of the correlation coefficient. We have evaluated the utility of this method for metabolic connectivity analysis using [18F]2-fluoro-2-deoxyglucose (FDG) positron emission tomography (PET) data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) study. As an illustrative example of the utility of this approach, we examined metabolic connectivity in angular gyrus and precuneus seed regions of mild cognitive impairment (MCI) subjects with low and high β-amyloid burdens. This new multivariate method allowed us to identify alterations in the metabolic connectome, which would not have been detected using classic seed-based correlation analysis. Ultimately, this novel approach should be extensible to brain network analysis and broadly applicable to other imaging modalities, such as functional magnetic resonance imaging (MRI). PMID:25294129
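The point that a correlation coefficient mixes covariance and variance can be seen in a small numerical sketch. The example below is illustrative only and is not the authors' framework: the regional values are made up, and `cov_var` and `correlation` are hypothetical helper names. Group B has four times the covariance of group A, yet both groups have the same correlation because the variances scale by the same factor.

```python
from math import sqrt

def cov_var(x, y):
    """Sample covariance of (x, y) and the sample variances of x and y."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    vx = sum((a - mx) ** 2 for a in x) / (n - 1)
    vy = sum((b - my) ** 2 for b in y) / (n - 1)
    return cov, vx, vy

def correlation(x, y):
    """Pearson correlation written as covariance over the root of the variances."""
    cov, vx, vy = cov_var(x, y)
    return cov / sqrt(vx * vy)

# Made-up regional measurements for two groups of five subjects.
seed_a = [1.0, 2.0, 3.0, 4.0, 5.0]
target_a = [1.0, 3.0, 2.0, 5.0, 4.0]
seed_b = [2.0 * v for v in seed_a]      # same pattern, doubled amplitude
target_b = [2.0 * v for v in target_a]

cov_a, _, _ = cov_var(seed_a, target_a)
cov_b, _, _ = cov_var(seed_b, target_b)
r_a = correlation(seed_a, target_a)
r_b = correlation(seed_b, target_b)
```

A comparison based on r alone would report no group difference here, while the individual covariance and variance terms differ by a factor of four.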
Fontes, Cristiano Hora; Budman, Hector
2017-11-01
A clustering problem involving multivariate time series (MTS) requires the selection of similarity metrics. This paper shows the limitations of the PCA similarity factor (SPCA) as a single metric in nonlinear problems where there are differences in magnitude of the same process variables due to expected changes in operation conditions. A novel method for clustering MTS based on a combination between SPCA and the average-based Euclidean distance (AED) within a fuzzy clustering approach is proposed. Case studies involving either simulated or real industrial data collected from a large scale gas turbine are used to illustrate that the hybrid approach enhances the ability to recognize normal and fault operating patterns. This paper also proposes an oversampling procedure to create synthetic multivariate time series that can be useful in commonly occurring situations involving unbalanced data sets. Copyright © 2017 ISA. Published by Elsevier Ltd. All rights reserved.
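The limitation of SPCA noted above can be sketched numerically. The code below assumes the usual Krzanowski form of the PCA similarity factor, trace(L'MM'L)/k for loading matrices L and M with k components. The average-based Euclidean distance is taken here, as an assumption, to be the distance between the per-variable means of the two series, since the abstract does not define it. The way the paper combines the two metrics inside fuzzy clustering is not reproduced, and all function names are hypothetical.

```python
import numpy as np

def pca_loadings(X, k):
    """First k principal directions (columns) of a series X (time x variables)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[:k].T

def spca(X1, X2, k):
    """PCA similarity factor: 1 when the two k-dimensional subspaces coincide."""
    L, M = pca_loadings(X1, k), pca_loadings(X2, k)
    return float(np.trace(L.T @ M @ M.T @ L) / k)

def aed(X1, X2):
    """Assumed definition: Euclidean distance between per-variable means."""
    return float(np.linalg.norm(X1.mean(axis=0) - X2.mean(axis=0)))

rng = np.random.default_rng(1)
base = rng.normal(size=(200, 3)) * np.array([3.0, 2.0, 1.0])
shifted = base + 5.0  # same dynamics, different operating level

similarity = spca(base, shifted, k=2)
level_gap = aed(base, shifted)
```

Because PCA centers the data, SPCA reports the two series as identical even though their operating levels differ; a mean-based distance picks up that difference.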
Tailored multivariate analysis for modulated enhanced diffraction
Caliandro, Rocco; Guccione, Pietro; Nico, Giovanni; ...
2015-10-21
Modulated enhanced diffraction (MED) is a technique allowing the dynamic structural characterization of crystalline materials subjected to an external stimulus, which is particularly suited for in situ and operando structural investigations at synchrotron sources. Contributions from the (active) part of the crystal system that varies synchronously with the stimulus can be extracted by an offline analysis, which can only be applied in the case of periodic stimuli and linear system responses. In this paper a new decomposition approach based on multivariate analysis is proposed. The standard principal component analysis (PCA) is adapted to treat MED data: specific figures of merit based on their scores and loadings are found, and the directions of the principal components obtained by PCA are modified to maximize such figures of merit. As a result, a general method to decompose MED data, called optimum constrained components rotation (OCCR), is developed, which produces very precise results on simulated data, even in the case of nonperiodic stimuli and/or nonlinear responses. Furthermore, the multivariate analysis approach is able to supply in one shot both the diffraction pattern related to the active atoms (through the OCCR loadings) and the time dependence of the system response (through the OCCR scores). When applied to real data, however, OCCR was able to supply only the latter information, as the former was hindered by changes in abundances of different crystal phases, which occurred besides structural variations in the specific case considered. Developing a decomposition procedure able to cope with this combined effect represents the next challenge in MED analysis.
Physical Violence between Siblings: A Theoretical and Empirical Analysis
ERIC Educational Resources Information Center
Hoffman, Kristi L.; Kiecolt, K. Jill; Edwards, John N.
2005-01-01
This study develops and tests a theoretical model to explain sibling violence based on the feminist, conflict, and social learning theoretical perspectives and research in psychology and sociology. A multivariate analysis of data from 651 young adults generally supports hypotheses from all three theoretical perspectives. Males with brothers have…
Unsupervised pattern recognition methods in ciders profiling based on GCE voltammetric signals.
Jakubowska, Małgorzata; Sordoń, Wanda; Ciepiela, Filip
2016-07-15
This work presents a complete methodology of distinguishing between different brands of cider and ageing degrees, based on voltammetric signals, utilizing dedicated data preprocessing procedures and unsupervised multivariate analysis. It was demonstrated that voltammograms recorded on glassy carbon electrode in Britton-Robinson buffer at pH 2 are reproducible for each brand. By application of clustering algorithms and principal component analysis visible homogenous clusters were obtained. Advanced signal processing strategy which included automatic baseline correction, interval scaling and continuous wavelet transform with dedicated mother wavelet, was a key step in the correct recognition of the objects. The results show that voltammetry combined with optimized univariate and multivariate data processing is a sufficient tool to distinguish between ciders from various brands and to evaluate their freshness. Copyright © 2016 Elsevier Ltd. All rights reserved.
Chiu, Chi-yang; Jung, Jeesun; Chen, Wei; Weeks, Daniel E; Ren, Haobo; Boehnke, Michael; Amos, Christopher I; Liu, Aiyi; Mills, James L; Ting Lee, Mei-ling; Xiong, Momiao; Fan, Ruzong
2017-01-01
To analyze next-generation sequencing data, multivariate functional linear models are developed for a meta-analysis of multiple studies to connect genetic variant data to multiple quantitative traits adjusting for covariates. The goal is to take advantage of both meta-analysis and pleiotropic analysis in order to improve power and to carry out a unified association analysis of multiple studies and multiple traits of complex disorders. Three types of approximate F-distributions based on Pillai–Bartlett trace, Hotelling–Lawley trace, and Wilks's Lambda are introduced to test for association between multiple quantitative traits and multiple genetic variants. Simulation analysis is performed to evaluate false-positive rates and power of the proposed tests. The proposed methods are applied to analyze lipid traits in eight European cohorts. It is shown that it is more advantageous to perform multivariate analysis than univariate analysis in general, and it is more advantageous to perform meta-analysis of multiple studies instead of analyzing the individual studies separately. The proposed models require individual observations. The value of the current paper can be seen for at least two reasons: (a) the proposed methods can be applied to studies that have individual genotype data; (b) the proposed methods can be used as a criterion for future work that uses summary statistics to build test statistics to meta-analyze the data. PMID:28000696
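The three statistics named above have standard definitions in terms of a hypothesis sum-of-squares matrix H and an error matrix E. The sketch below computes only those raw statistics from the eigenvalues of E⁻¹H. The approximate F-distributions used in the paper are not reproduced, and the function name and toy matrices are hypothetical.

```python
import numpy as np

def manova_statistics(H, E):
    """Wilks's Lambda, Pillai-Bartlett trace, Hotelling-Lawley trace from H and E."""
    # Eigenvalues of E^{-1} H; real and non-negative for valid SSCP matrices.
    eig = np.linalg.eigvals(np.linalg.solve(E, H)).real
    wilks = float(np.prod(1.0 / (1.0 + eig)))   # det(E) / det(H + E)
    pillai = float(np.sum(eig / (1.0 + eig)))   # trace(H (H + E)^{-1})
    hotelling = float(np.sum(eig))              # trace(H E^{-1})
    return wilks, pillai, hotelling

# Toy matrices: one trait direction carries all of the hypothesis effect.
H = np.diag([2.0, 0.0])
E = np.eye(2)
wilks, pillai, hotelling = manova_statistics(H, E)
```

A small Wilks's Lambda, or a large Pillai-Bartlett or Hotelling-Lawley trace, points toward association; each statistic is then converted to an approximate F value for testing.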
Chiu, Chi-Yang; Jung, Jeesun; Chen, Wei; Weeks, Daniel E; Ren, Haobo; Boehnke, Michael; Amos, Christopher I; Liu, Aiyi; Mills, James L; Ting Lee, Mei-Ling; Xiong, Momiao; Fan, Ruzong
2017-02-01
To analyze next-generation sequencing data, multivariate functional linear models are developed for a meta-analysis of multiple studies to connect genetic variant data to multiple quantitative traits adjusting for covariates. The goal is to take advantage of both meta-analysis and pleiotropic analysis in order to improve power and to carry out a unified association analysis of multiple studies and multiple traits of complex disorders. Three types of approximate F-distributions based on Pillai-Bartlett trace, Hotelling-Lawley trace, and Wilks's Lambda are introduced to test for association between multiple quantitative traits and multiple genetic variants. Simulation analysis is performed to evaluate false-positive rates and power of the proposed tests. The proposed methods are applied to analyze lipid traits in eight European cohorts. It is shown that it is more advantageous to perform multivariate analysis than univariate analysis in general, and it is more advantageous to perform meta-analysis of multiple studies instead of analyzing the individual studies separately. The proposed models require individual observations. The value of the current paper can be seen for at least two reasons: (a) the proposed methods can be applied to studies that have individual genotype data; (b) the proposed methods can be used as a criterion for future work that uses summary statistics to build test statistics to meta-analyze the data.
Multivariate meta-analysis: Potential and promise
Jackson, Dan; Riley, Richard; White, Ian R
2011-01-01
The multivariate random effects model is a generalization of the standard univariate model. Multivariate meta-analysis is becoming more commonly used and the techniques and related computer software, although continually under development, are now in place. In order to raise awareness of the multivariate methods, and discuss their advantages and disadvantages, we organized a one day ‘Multivariate meta-analysis’ event at the Royal Statistical Society. In addition to disseminating the most recent developments, we also received an abundance of comments, concerns, insights, critiques and encouragement. This article provides a balanced account of the day's discourse. By giving others the opportunity to respond to our assessment, we hope to ensure that the various view points and opinions are aired before multivariate meta-analysis simply becomes another widely used de facto method without any proper consideration of it by the medical statistics community. We describe the areas of application that multivariate meta-analysis has found, the methods available, the difficulties typically encountered and the arguments for and against the multivariate methods, using four representative but contrasting examples. We conclude that the multivariate methods can be useful, and in particular can provide estimates with better statistical properties, but also that these benefits come at the price of making more assumptions which do not result in better inference in every case. Although there is evidence that multivariate meta-analysis has considerable potential, it must be even more carefully applied than its univariate counterpart in practice. Copyright © 2011 John Wiley & Sons, Ltd. PMID:21268052
A systematic uncertainty analysis for liner impedance eduction technology
NASA Astrophysics Data System (ADS)
Zhou, Lin; Bodén, Hans
2015-11-01
The so-called impedance eduction technology is widely used for obtaining acoustic properties of liners used in aircraft engines. The measurement uncertainties for this technology are still not well understood though it is essential for data quality assessment and model validation. A systematic framework based on multivariate analysis is presented in this paper to provide 95 percent confidence interval uncertainty estimates in the process of impedance eduction. The analysis is made using a single mode straightforward method based on transmission coefficients involving the classic Ingard-Myers boundary condition. The multivariate technique makes it possible to obtain an uncertainty analysis for the possibly correlated real and imaginary parts of the complex quantities. The results show that the errors in impedance results at low frequency mainly depend on the variability of transmission coefficients, while the mean Mach number accuracy is the most important source of error at high frequencies. The effect of Mach numbers used in the wave dispersion equation and in the Ingard-Myers boundary condition has been separated for comparison of the outcome of impedance eduction. A local Mach number based on friction velocity is suggested as a way to reduce the inconsistencies found when estimating impedance using upstream and downstream acoustic excitation.
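Treating the real and imaginary parts of a complex quantity jointly, rather than as two independent intervals, can be sketched with a bivariate 95 percent region. This is a generic large-sample illustration under a normality assumption, not the framework of the paper: the sample values are made up, and `mean_and_cov` and `inside_95_region` are hypothetical names. For two degrees of freedom the chi-square 95 percent quantile has the closed form -2 ln(0.05).

```python
from math import log

CHI2_2DOF_95 = -2.0 * log(0.05)  # 95% quantile of chi-square with 2 dof (about 5.99)

def mean_and_cov(samples):
    """Mean and 2x2 sample covariance of the (real, imaginary) parts."""
    n = len(samples)
    mr = sum(z.real for z in samples) / n
    mi = sum(z.imag for z in samples) / n
    srr = sum((z.real - mr) ** 2 for z in samples) / (n - 1)
    sii = sum((z.imag - mi) ** 2 for z in samples) / (n - 1)
    sri = sum((z.real - mr) * (z.imag - mi) for z in samples) / (n - 1)
    return (mr, mi), (srr, sri, sii)

def inside_95_region(z, mean, cov):
    """True if z lies inside the 95% ellipse defined by the mean and covariance."""
    (mr, mi), (srr, sri, sii) = mean, cov
    dr, di = z.real - mr, z.imag - mi
    det = srr * sii - sri ** 2
    # Squared Mahalanobis distance, using the explicit inverse of a 2x2 matrix.
    d2 = (sii * dr * dr - 2.0 * sri * dr * di + srr * di * di) / det
    return d2 <= CHI2_2DOF_95

# Made-up repeated measurements of one complex-valued quantity.
samples = [1.0 + 2.0j, 1.2 + 2.3j, 0.8 + 1.9j, 1.1 + 2.0j, 0.9 + 1.8j]
mean, cov = mean_and_cov(samples)
center_ok = inside_95_region(complex(*mean), mean, cov)
far_ok = inside_95_region(5.0 + 5.0j, mean, cov)
```

The off-diagonal term `sri` is what lets the region tilt when the real and imaginary parts are correlated; dropping it reduces the ellipse to two independent intervals.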
Zanchetti Meneghini, Leonardo; Rübensam, Gabriel; Claudino Bica, Vinicius; Ceccon, Amanda; Barreto, Fabiano; Flores Ferrão, Marco; Bergold, Ana Maria
2014-01-01
A simple and inexpensive method based on solvent extraction followed by low temperature clean-up was applied for determination of seven pyrethroids residues in bovine raw milk using gas chromatography coupled to tandem mass spectrometry (GC-MS/MS) and gas chromatography with electron-capture detector (GC-ECD). Sample extraction procedure was established through the evaluation of seven different extraction protocols, evaluated in terms of analyte recovery and cleanup efficiency. Sample preparation optimization was based on Doehlert design using fifteen runs with three different variables. Response surface methodologies and polynomial analysis were used to define the best extraction conditions. Method validation was carried out based on SANCO guide parameters and assessed by multivariate analysis. Method performance was considered satisfactory since mean recoveries were between 87% and 101% for three distinct concentrations. Accuracy and precision were lower than ±20%, and led to no significant differences (p < 0.05) between results obtained by GC-ECD and GC-MS/MS techniques. The method has been applied to routine analysis for determination of pyrethroid residues in bovine raw milk in the Brazilian National Residue Control Plan since 2013, in which a total of 50 samples were analyzed. PMID:25380457
Rathi, Monika; Ahrenkiel, S P; Carapella, J J; Wanlass, M W
2013-02-01
Given an unknown multicomponent alloy, and a set of standard compounds or alloys of known composition, can one improve upon popular standards-based methods for energy dispersive X-ray (EDX) spectrometry to quantify the elemental composition of the unknown specimen? A method is presented here for determining elemental composition of alloys using transmission electron microscopy-based EDX with appropriate standards. The method begins with a discrete set of related reference standards of known composition, applies multivariate statistical analysis to those spectra, and evaluates the compositions with a linear matrix algebra method to relate the spectra to elemental composition. By using associated standards, only limited assumptions about the physical origins of the EDX spectra are needed. Spectral absorption corrections can be performed by providing an estimate of the foil thickness of one or more reference standards. The technique was applied to III-V multicomponent alloy thin films: composition and foil thickness were determined for various III-V alloys. The results were then validated by comparing with X-ray diffraction and photoluminescence analysis, demonstrating accuracy of approximately 1% in atomic fraction.
Multivariate Longitudinal Analysis with Bivariate Correlation Test
Adjakossa, Eric Houngla; Sadissou, Ibrahim; Hounkonnou, Mahouton Norbert; Nuel, Gregory
2016-01-01
In the context of multivariate multilevel data analysis, this paper focuses on the multivariate linear mixed-effects model, including all the correlations between the random effects when the dimensional residual terms are assumed uncorrelated. Using the EM algorithm, we suggest more general expressions of the model’s parameters estimators. These estimators can be used in the framework of the multivariate longitudinal data analysis as well as in the more general context of the analysis of multivariate multilevel data. By using a likelihood ratio test, we test the significance of the correlations between the random effects of two dependent variables of the model, in order to investigate whether or not it is useful to model these dependent variables jointly. Simulation studies are done to assess both the parameter recovery performance of the EM estimators and the power of the test. Using two empirical data sets which are of longitudinal multivariate type and multivariate multilevel type, respectively, the usefulness of the test is illustrated. PMID:27537692
Jha, Dilip Kumar; Vinithkumar, Nambali Valsalan; Sahu, Biraja Kumar; Dheenan, Palaiya Sukumaran; Das, Apurba Kumar; Begum, Mehmuna; Devi, Marimuthu Prashanthi; Kirubagaran, Ramalingam
2015-07-15
Chidiyatappu Bay is one of the least disturbed marine environments of Andaman & Nicobar Islands, the union territory of India. Oceanic flushing from southeast and northwest direction is prevalent in this bay. Further, anthropogenic activity is minimal in the adjoining environment. Considering the pristine nature of this bay, seawater samples collected from 12 sampling stations covering three seasons were analyzed. Principal Component Analysis (PCA) revealed 69.9% of total variance and exhibited strong factor loading for nitrite, chlorophyll a and phaeophytin. In addition, analysis of variance (ANOVA-one way), regression analysis, box-whisker plots and Geographical Information System based hot spot analysis further simplified and supported multivariate results. The results obtained are important to establish reference conditions for comparative study with other similar ecosystems in the region. Copyright © 2015 Elsevier Ltd. All rights reserved.
Yang, Heejung; Lee, Dong Young; Jeon, Minji; Suh, Youngbae; Sung, Sang Hyun
2014-05-01
Five active compounds, chlorogenic acid, 3,5-di-O-caffeoylquinic acid, 4,5-di-O-caffeoylquinic acid, jaceosidin, and eupatilin, in Artemisia princeps (Compositae) were simultaneously determined by ultra-performance liquid chromatography connected to a diode array detector. The morphological resemblance between A. princeps and A. capillaris makes it difficult to identify the species properly, which occasionally leads to misuse or misapplication in Korean traditional medicine. In this study, A. princeps and A. capillaris were discriminated using the developed and validated method, which showed a definite difference between the two species. In addition, the most reliable markers contributing to the discrimination of the two species were identified by multivariate analysis methods, such as principal component analysis and partial least squares discriminant analysis.
Teodoro, P E; Rodrigues, E V; Peixoto, L A; Silva, L A; Laviola, B G; Bhering, L L
2017-03-22
Jatropha is a research target worldwide, aimed at large-scale oil production for biodiesel and bio-kerosene. Its production potential is between 1200 and 1500 kg/ha of oil after the 4th year. This study aimed to estimate the combining ability of Jatropha genotypes by multivariate diallel analysis to select parents and crosses that allow gains in important agronomic traits. We performed crosses in a complete diallel genetic design (3 x 3) arranged in blocks with five replications and three plants per plot. The following traits were evaluated: plant height, stem diameter, canopy projection between rows, canopy projection on the line, number of branches, mass of hundred grains, and grain yield. Data were submitted to univariate and multivariate diallel analysis. Genotypes 107 and 190 can be used in crosses for establishing a base population of Jatropha, since they have favorable alleles for increasing the mass of hundred grains and grain yield and reducing the plant height. The cross 190 x 107 is the most promising to perform the selection of superior genotypes for the simultaneous breeding of these traits.
Bioprospecting Chemical Diversity and Bioactivity in a Marine Derived Aspergillus terreus.
Adpressa, Donovon A; Loesgen, Sandra
2016-02-01
A comparative metabolomic study of a marine derived fungus (Aspergillus terreus) grown under various culture conditions is presented. The fungus was grown in eleven different culture conditions using solid agar, broth cultures, or grain based media (OSMAC). Multivariate analysis of LC/MS data from the organic extracts revealed drastic differences in the metabolic profiles and guided our subsequent isolation efforts. The compound 7-desmethylcitreoviridin was isolated and identified, and is fully described for the first time. In addition, 16 known fungal metabolites were also isolated and identified. All compounds were elucidated by detailed spectroscopic analysis and tested for antibacterial activities against five human pathogens and tested for cytotoxicity. This study demonstrates that LC/MS based multivariate analysis provides a simple yet powerful tool to analyze the metabolome of a single fungal strain grown under various conditions. This approach allows environmentally-induced changes in metabolite expression to be rapidly visualized, and uses these differences to guide the discovery of new bioactive molecules. Copyright © 2016 Verlag Helvetica Chimica Acta AG, Zürich.
H, Maulidiani; Khatib, Alfi; Shaari, Khozirah; Abas, Faridah; Shitan, Mahendran; Kneer, Ralf; Neto, Victor; Lajis, Nordin H
2012-01-11
The metabolites of three species of Apiaceae, also known as Pegaga, were analyzed utilizing (1)H NMR spectroscopy and multivariate data analysis. Principal component analysis (PCA) and hierarchical cluster analysis (HCA) resolved the species, Centella asiatica, Hydrocotyle bonariensis, and Hydrocotyle sibthorpioides, into three clusters. The saponins, asiaticoside and madecassoside, along with chlorogenic acids were the metabolites that contributed most to the separation. Furthermore, the effects of growth-lighting condition to metabolite contents were also investigated. The extracts of C. asiatica grown in full-day light exposure exhibited a stronger radical scavenging activity and contained more triterpenes (asiaticoside and madecassoside), flavonoids, and chlorogenic acids as compared to plants grown in 50% shade. This study established the potential of using a combination of (1)H NMR spectroscopy and multivariate data analyses in differentiating three closely related species and the effects of growth lighting, based on their metabolite contents and identification of the markers contributing to their differences.
Amidžić Klarić, Daniela; Klarić, Ilija; Mornar, Ana; Velić, Darko; Velić, Natalija
2015-08-01
This study presents data on the content of 21 minerals and heavy metals in 15 blackberry wines made of conventionally and organically grown blackberries. The objective of this study was to classify the blackberry wine samples based on their mineral composition and the applied cultivation method of the starting raw material by using chemometric analysis. The metal content of Croatian blackberry wine samples was determined by AAS after dry ashing. The comparison between the organic and conventional groups of investigated blackberry wines showed a statistically significant difference in concentrations of Si and Li, where the organic group contained higher concentrations of these elements. According to multivariate data analysis, the model based on the original metal content data set finally included seven original variables (K, Fe, Mn, Cu, Ba, Cd and Cr) and gave a satisfactory separation of the two applied cultivation methods of the starting raw material.
Interpreting support vector machine models for multivariate group wise analysis in neuroimaging
Gaonkar, Bilwaj; Shinohara, Russell T; Davatzikos, Christos
2015-01-01
Machine learning based classification algorithms like support vector machines (SVMs) have shown great promise for turning high dimensional neuroimaging data into clinically useful decision criteria. However, tracing imaging based patterns that contribute significantly to classifier decisions remains an open problem. This is an issue of critical importance in imaging studies seeking to determine which anatomical or physiological imaging features contribute to the classifier’s decision, thereby allowing users to critically evaluate the findings of such machine learning methods and to understand disease mechanisms. The majority of published work addresses the question of statistical inference for support vector classification using permutation tests based on SVM weight vectors. Such permutation testing ignores the SVM margin, which is critical in SVM theory. In this work we emphasize the use of a statistic that explicitly accounts for the SVM margin and show that the null distributions associated with this statistic are asymptotically normal. Further, our experiments show that this statistic is much less conservative than weight based permutation tests and yet specific enough to tease out multivariate patterns in the data. Thus, we can better understand the multivariate patterns that the SVM uses for neuroimaging based classification. PMID:26210913
Smyczek-Gargya, B; Volz, B; Geppert, M; Dietl, J
1997-01-01
Clinical and histological data of 168 patients with squamous cell carcinoma of the vulva were analyzed with respect to survival. 151 patients underwent surgery, 12 patients were treated with primary radiation and in 5 patients no treatment was performed. Follow-up lasted from at least 2 up to 22 years posttreatment. In univariate analysis, the following factors were highly significant: presurgery lymph node status, tumor infiltration beyond the vulva, tumor grading, histological inguinal lymph node status, pre- and postsurgery tumor stage, depth of invasion and tumor diameter. In the multivariate analysis (Cox regression), the most powerful factors were shown to be histological inguinal lymph node status, tumor diameter and tumor grading. The multivariate logistic regression analysis identified the following main prognostic factors for metastases of inguinal lymph nodes: presurgery inguinal lymph node status, tumor size, depth of invasion and tumor grading. Based on these results, tumor biology seems to be the decisive factor concerning recurrence and survival. Therefore, we suggest a more conservative treatment of vulvar carcinoma. Patients with carcinoma confined to the vulva, with a tumor diameter up to 3 cm and without clinically suspected lymph nodes, should be treated by wide excision/partial vulvectomy with ipsilateral lymphadenectomy.
Using sperm morphometry and multivariate analysis to differentiate species of gray Mazama
Duarte, José Maurício Barbanti
2016-01-01
There is genetic evidence that the two species of Brazilian gray Mazama, Mazama gouazoubira and Mazama nemorivaga, belong to different genera. This study identified significant differences that separated them into distinct groups, based on characteristics of the spermatozoa and ejaculate of both species. The characteristics that most clearly differentiated between the species were ejaculate colour, white for M. gouazoubira and reddish for M. nemorivaga, and sperm head dimensions. Multivariate analysis of sperm head dimension and format data accurately discriminated three groups for species, with a total percentage of misclassified of 0.71. The individual analysis, by animal, and the multivariate analysis also correctly discriminated all five animals (total percentage of misclassified of 13.95%), and the canonical plot showed three different clusters: Cluster 1, including individuals of M. nemorivaga; Cluster 2, including two individuals of M. gouazoubira; and Cluster 3, including a single individual of M. gouazoubira. The results obtained in this work corroborate the hypothesis of the formation of new genera and species for gray Mazama. Moreover, the easily applied method described herein can be used as an auxiliary tool to identify sibling species of other taxonomic groups. PMID:28018612
Water quality analysis of the Rapur area, Andhra Pradesh, South India using multivariate techniques
NASA Astrophysics Data System (ADS)
Nagaraju, A.; Sreedhar, Y.; Thejaswi, A.; Sayadi, Mohammad Hossein
2017-10-01
The groundwater samples from Rapur area were collected from different sites to evaluate the major ion chemistry. The large volume of data can lead to difficulties in the integration, interpretation, and representation of the results. Two multivariate statistical methods, hierarchical cluster analysis (HCA) and factor analysis (FA), were applied to evaluate their usefulness to classify and identify geochemical processes controlling groundwater geochemistry. Four statistically significant clusters were obtained from 30 sampling stations. This resulted in two important clusters, viz., cluster 1 (pH, Si, CO3, Mg, SO4, Ca, K, HCO3, alkalinity, Na, Na + K, Cl, and hardness) and cluster 2 (EC and TDS), which are released to the study area from different sources. The application of different multivariate statistical techniques, such as principal component analysis (PCA), assists in the interpretation of complex data matrices for a better understanding of water quality of a study area. From PCA, it is clear that the first factor (factor 1), which accounted for 36.2% of the total variance, had high positive loadings on EC, Mg, Cl, TDS, and hardness. Based on the PCA scores, four significant cluster groups of sampling locations were detected on the basis of similarity of their water quality.
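As a rough illustration of how a first factor's share of total variance (such as the 36.2% quoted in the abstract above) is typically obtained, the sketch below standardizes a small table of water-quality measurements, builds the correlation matrix, and extracts the leading eigenvalue. The sample values, column choice, and function names are hypothetical and are not taken from the study; power iteration is used only to keep the example free of dependencies and is not claimed to be the authors' procedure.

```python
import math

def standardize(rows):
    """Z-score each column (sample standard deviation, n - 1 denominator)."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / (n - 1))
           for c, m in zip(cols, means)]
    return [[(r[j] - means[j]) / sds[j] for j in range(len(cols))] for r in rows]

def correlation_matrix(z):
    """Correlation matrix of already standardized data."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_component(corr, iters=500):
    """Leading eigenvalue and eigenvector of a symmetric matrix by power iteration.

    Assumes the start vector is not orthogonal to the leading eigenvector.
    """
    p = len(corr)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(p))
                     for i in range(p))
    return eigenvalue, v

# Hypothetical samples: columns are EC, TDS, pH (illustrative values only).
samples = [
    [500.0, 320.0, 7.1],
    [800.0, 510.0, 7.4],
    [1200.0, 770.0, 6.9],
    [650.0, 420.0, 7.8],
    [950.0, 610.0, 7.2],
]

corr = correlation_matrix(standardize(samples))
eigenvalue, loadings = first_component(corr)
# The trace of a correlation matrix equals the number of variables,
# so the share of total variance is eigenvalue / number of variables.
variance_share = eigenvalue / len(corr)
print(f"first component explains {variance_share:.1%} of total variance")
```

In practice a library routine would replace the hand-written eigen-solver; the point here is only that the reported percentage is the leading eigenvalue divided by the number of standardized variables.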
A CLIPS expert system for clinical flow cytometry data analysis
NASA Technical Reports Server (NTRS)
Salzman, G. C.; Duque, R. E.; Braylan, R. C.; Stewart, C. C.
1990-01-01
An expert system is being developed using CLIPS to assist clinicians in the analysis of multivariate flow cytometry data from cancer patients. Cluster analysis is used to find subpopulations representing various cell types in multiple datasets each consisting of four to five measurements on each of 5000 cells. CLIPS facts are derived from results of the clustering. CLIPS rules are based on the expertise of Drs. Stewart, Duque, and Braylan. The rules incorporate certainty factors based on case histories.
Yan, Yan; Zhang, Qianqian; Feng, Fang
2016-07-01
Sulfur fumigation has recently been used during the postharvest handling of rhubarb to reduce the drying duration and control pests. However, a few reports question the effect of sulfur fumigation on the bioactive components of rhubarb, which is crucial for the quality evaluation of the herbal medicine. The bottleneck limiting the study comes from the complex compounds that exist in herb samples with diverse structural features, wide concentration range and the difficulty to obtain all the reference standards. In this study, an integrated strategy based on the highly effective separation and analysis by liquid chromatography coupled with diode-array detection and time-of-flight/triple-quadrupole tandem mass spectrometry combined with multivariate analysis was established. 68 phenolic compounds that exist in nonfumigated and sulfur-fumigated herb samples of rhubarb were tentatively assigned based on their retention behavior, UV spectra, accurate molecular weight, and mass spectral fragments. Qualitative and semiquantitative comparison revealed a serious reduction of the majority of phenolic compounds in sulfur-fumigated rhubarb. Furthermore, multivariate analysis was applied to holistically discriminate nonfumigated from sulfur-fumigated rhubarb and explore the characteristic chemical markers. The established approach was specific and rapid for characterizing and screening sulfur-fumigated rhubarb among commercial samples and could be applied for the quality assessment of other sulfur-fumigated herbs. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Metabolic changes in different developmental stages of Vanilla planifolia pods.
Palama, Tony Lionel; Khatib, Alfi; Choi, Young Hae; Payet, Bertrand; Fock, Isabelle; Verpoorte, Robert; Kodja, Hippolyte
2009-09-09
The metabolomic analysis of developing Vanilla planifolia green pods (between 3 and 8 months after pollination) was carried out by nuclear magnetic resonance (NMR) spectroscopy and multivariate data analysis. Multivariate data analysis of the (1)H NMR spectra, such as principal component analysis (PCA) and partial least-squares-discriminant analysis (PLS-DA), showed a trend of separation of those samples based on the metabolites present in the methanol/water (1:1) extract. Older pods had a higher content of glucovanillin, vanillin, p-hydroxybenzaldehyde glucoside, p-hydroxybenzaldehyde, and sucrose, while younger pods had more bis[4-(beta-D-glucopyranosyloxy)-benzyl]-2-isopropyltartrate (glucoside A), bis[4-(beta-D-glucopyranosyloxy)-benzyl]-2-(2-butyl)tartrate (glucoside B), glucose, malic acid, and homocitric acid. A liquid chromatography-mass spectrometry (LC-MS) analysis targeted at phenolic compound content was also performed on the developing pods and confirmed the NMR results. Ratios of aglycones/glucosides were estimated and thus allowed for detection of more minor metabolites in the green vanilla pods. Quantification of compounds based on both LC-MS and NMR analyses showed that free vanillin can reach 24% of the total vanillin content after 8 months of development in the vanilla green pods.
Multivariate Radiological-Based Models for the Prediction of Future Knee Pain: Data from the OAI
Galván-Tejada, Jorge I.; Celaya-Padilla, José M.; Treviño, Victor; Tamez-Peña, José G.
2015-01-01
In this work, the potential of X-ray based multivariate prognostic models to predict the onset of chronic knee pain is presented. Using X-ray quantitative image assessments of joint-space-width (JSW) and paired semiquantitative central X-ray scores from the Osteoarthritis Initiative (OAI), a case-control study is presented. The pain assessments of the right knee at the baseline and the 60-month visits were used to screen for case/control subjects. Scores were analyzed at the time of pain incidence (T-0), the year prior to incidence (T-1), and two years before pain incidence (T-2). Multivariate models were created by a cross validated elastic-net regularized generalized linear models feature selection tool. Univariate differences between cases and controls were reported by AUC, C-statistics, and odds ratios. Univariate analysis indicated that the medial osteophytes were significantly more prevalent in cases than controls: C-stat 0.62, 0.62, and 0.61, at T-0, T-1, and T-2, respectively. The multivariate JSW models significantly predicted pain: AUC = 0.695, 0.623, and 0.620, at T-0, T-1, and T-2, respectively. Semiquantitative multivariate models predicted pain with C-stat = 0.671, 0.648, and 0.645 at T-0, T-1, and T-2, respectively. Multivariate models derived from plain X-ray radiography assessments may be used to predict subjects that are at risk of developing knee pain. PMID:26504490
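For readers unfamiliar with the AUC and C-statistic values quoted in this abstract, the sketch below shows one common way such a number is computed for a binary case/control outcome: as the fraction of case-control pairs in which the case receives the higher model score, with ties counted as half. The scores are made-up illustrative values and the function name is hypothetical; nothing here reproduces the OAI data or the authors' software.

```python
def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen case scores higher than a
    randomly chosen control (ties count as 0.5). For a binary outcome this
    is the same quantity as the C-statistic.
    """
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))

# Hypothetical model scores, for illustration only.
cases = [0.62, 0.48, 0.71, 0.55]
controls = [0.40, 0.58, 0.35, 0.50]
print(f"AUC = {auc(cases, controls):.3f}")
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, which is the scale on which the reported values of roughly 0.62 to 0.70 should be read.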
NASA Astrophysics Data System (ADS)
Yu, H.; Gu, H.
2017-12-01
A novel multivariate seismic formation pressure prediction methodology is presented, which incorporates high-resolution seismic velocity data from prestack AVO inversion, and petrophysical data (porosity and shale volume) derived from poststack seismic motion inversion. In contrast to traditional seismic formation prediction methods, the proposed methodology is based on a multivariate pressure prediction model and utilizes a trace-by-trace multivariate regression analysis on seismic-derived petrophysical properties to calibrate model parameters in order to make accurate predictions with higher resolution in both vertical and lateral directions. With prestack time migration velocity as initial velocity model, an AVO inversion was first applied to prestack dataset to obtain high-resolution seismic velocity with higher frequency that is to be used as the velocity input for seismic pressure prediction, and the density dataset to calculate accurate Overburden Pressure (OBP). Seismic Motion Inversion (SMI) is an inversion technique based on Markov Chain Monte Carlo simulation. Both structural variability and similarity of seismic waveform are used to incorporate well log data to characterize the variability of the property to be obtained. In this research, porosity and shale volume are first interpreted on well logs, and then combined with poststack seismic data using SMI to build porosity and shale volume datasets for seismic pressure prediction. A multivariate effective stress model is used to convert velocity, porosity and shale volume datasets to effective stress. After a thorough study of the regional stratigraphic and sedimentary characteristics, a regional normally compacted interval model is built, and then the coefficients in the multivariate prediction model are determined in a trace-by-trace multivariate regression analysis on the petrophysical data. 
The coefficients are used to convert velocity, porosity and shale volume datasets to effective stress and then to calculate formation pressure with OBP. Application of the proposed methodology to a research area in East China Sea has proved that the method can bridge the gap between seismic and well log pressure prediction and give predicted pressure values close to pressure measurements from well testing.
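The final step described in this abstract, going from effective stress to formation pressure using the overburden pressure, can be sketched as follows. The sketch assumes Terzaghi's relation (pore pressure equals overburden pressure minus effective stress) and, purely as a placeholder, a linear form for the multivariate effective stress model; the abstract does not state the model's functional form, so the linear form, the coefficient values, and all function names are assumptions made for illustration.

```python
def effective_stress_linear(velocity, porosity, shale_volume, coeffs):
    """Hypothetical linear stand-in for the multivariate effective stress model.

    coeffs = (a0, a1, a2, a3) would come from the trace-by-trace regression
    on a normally compacted interval; the linear form is an assumption.
    """
    a0, a1, a2, a3 = coeffs
    return a0 + a1 * velocity + a2 * porosity + a3 * shale_volume

def formation_pressure(overburden_pressure, effective_stress):
    """Terzaghi's relation: pore pressure = overburden pressure - effective stress."""
    return overburden_pressure - effective_stress

# Illustrative numbers only (velocity in m/s, pressures in MPa).
coeffs = (0.0, 0.01, -10.0, -5.0)
sigma_eff = effective_stress_linear(3000.0, 0.20, 0.30, coeffs)
pore_pressure = formation_pressure(60.0, sigma_eff)
print(f"effective stress = {sigma_eff:.1f} MPa, formation pressure = {pore_pressure:.1f} MPa")
```

Applied trace by trace with locally calibrated coefficients, the same two-step conversion would yield a pressure volume; the calibration itself is outside the scope of this sketch.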
The NLS-Based Nonlinear Grey Multivariate Model for Forecasting Pollutant Emissions in China
Pei, Ling-Ling; Li, Qin
2018-01-01
The relationship between pollutant discharge and economic growth has been a major research focus in environmental economics. To accurately estimate the nonlinear change law of China’s pollutant discharge with economic growth, this study establishes a transformed nonlinear grey multivariable (TNGM (1, N)) model based on the nonlinear least square (NLS) method. The Gauss–Seidel iterative algorithm was used to solve the parameters of the TNGM (1, N) model based on the NLS basic principle. This algorithm improves the precision of the model by continuous iteration and constantly approximating the optimal regression coefficient of the nonlinear model. In our empirical analysis, the traditional grey multivariate model GM (1, N) and the NLS-based TNGM (1, N) models were respectively adopted to forecast and analyze the relationship among wastewater discharge per capita (WDPC), and per capita emissions of SO2 and dust, alongside GDP per capita in China during the period 1996–2015. Results indicated that the NLS algorithm is able to effectively help the grey multivariable model identify the nonlinear relationship between pollutant discharge and economic growth. The results show that the NLS-based TNGM (1, N) model presents greater precision when forecasting WDPC, SO2 emissions and dust emissions per capita, compared to the traditional GM (1, N) model; WDPC indicates a growing tendency aligned with the growth of GDP, while the per capita emissions of SO2 and dust reduce accordingly. PMID:29517985
Multivariate Regression Analysis and Slaughter Livestock,
AGRICULTURE, *ECONOMICS), (*MEAT, PRODUCTION), MULTIVARIATE ANALYSIS, REGRESSION ANALYSIS , ANIMALS, WEIGHT, COSTS, PREDICTIONS, STABILITY, MATHEMATICAL MODELS, STORAGE, BEEF, PORK, FOOD, STATISTICAL DATA, ACCURACY
Network-Based Visual Analysis of Tabular Data
ERIC Educational Resources Information Center
Liu, Zhicheng
2012-01-01
Tabular data is pervasive in the form of spreadsheets and relational databases. Although tables often describe multivariate data without explicit network semantics, it may be advantageous to explore the data modeled as a graph or network for analysis. Even when a given table design conveys some static network semantics, analysts may want to look…
Introducing Undergraduate Students to Metabolomics Using a NMR-Based Analysis of Coffee Beans
ERIC Educational Resources Information Center
Sandusky, Peter Olaf
2017-01-01
Metabolomics applies multivariate statistical analysis to sets of high-resolution spectra taken over a population of biologically derived samples. The objective is to distinguish subpopulations within the overall sample population, and possibly also to identify biomarkers. While metabolomics has become part of the standard analytical toolbox in…
Using Time Series Analysis to Predict Cardiac Arrest in a PICU.
Kennedy, Curtis E; Aoki, Noriaki; Mariscalco, Michele; Turley, James P
2015-11-01
Objectives: To build and test cardiac arrest prediction models in a PICU, using time series analysis as input, and to measure changes in prediction accuracy attributable to different classes of time series data. Design: Retrospective cohort study. Setting: Thirty-one bed academic PICU that provides care for medical and general surgical (not congenital heart surgery) patients. Patients: Patients experiencing a cardiac arrest in the PICU and requiring external cardiac massage for at least 2 minutes. Interventions: None. Measurements and Main Results: One hundred three cases of cardiac arrest and 109 control cases were used to prepare a baseline dataset that consisted of 1,025 variables in four data classes: multivariate, raw time series, clinical calculations, and time series trend analysis. We trained 20 arrest prediction models using a matrix of five feature sets (combinations of data classes) with four modeling algorithms: linear regression, decision tree, neural network, and support vector machine. The reference model (multivariate data with regression algorithm) had an accuracy of 78% and 87% area under the receiver operating characteristic curve. The best model (multivariate + trend analysis data with support vector machine algorithm) had an accuracy of 94% and 98% area under the receiver operating characteristic curve. Cardiac arrest predictions based on a traditional model built with multivariate data and a regression algorithm misclassified cases 3.7 times more frequently than predictions that included time series trend analysis and built with a support vector machine algorithm. Conclusions: Although the final model lacks the specificity necessary for clinical application, we have demonstrated how information from time series data can be used to increase the accuracy of clinical prediction models.
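The "3.7 times" figure in this abstract follows from the two reported accuracies, on the assumption that the misclassification rate is simply one minus accuracy. A minimal check of that arithmetic (the function name is illustrative):

```python
def misclassification_ratio(reference_accuracy, improved_accuracy):
    """Ratio of error rates, taking error rate = 1 - accuracy."""
    return (1.0 - reference_accuracy) / (1.0 - improved_accuracy)

# Accuracies reported in the abstract: 78% (reference) and 94% (best model).
ratio = misclassification_ratio(0.78, 0.94)
print(f"reference model misclassifies about {ratio:.1f} times as often")
```

That is, an error rate of 22% against one of 6% gives a ratio of about 3.7.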
Kwon, Yong-Kook; Ahn, Myung Suk; Park, Jong Suk; Liu, Jang Ryol; In, Dong Su; Min, Byung Whan; Kim, Suk Weon
2013-01-01
To determine whether Fourier transform (FT)-IR spectral analysis combined with multivariate analysis of whole-cell extracts from ginseng leaves can be applied as a high-throughput discrimination system of cultivation ages and cultivars, a total of 480 leaf samples belonging to 12 categories corresponding to four different cultivars (Yunpung, Kumpung, Chunpung, and an open-pollinated variety) and three different cultivation ages (1 yr, 2 yr, and 3 yr) were subjected to FT-IR. The spectral data were analyzed by principal component analysis and partial least squares-discriminant analysis. A dendrogram based on hierarchical clustering analysis of the FT-IR spectral data on ginseng leaves showed that leaf samples were initially segregated into three groups in a cultivation age-dependent manner. Then, within the same cultivation age group, leaf samples were clustered into four subgroups in a cultivar-dependent manner. The overall prediction accuracy for discrimination of cultivars and cultivation ages was 94.8% in a cross-validation test. These results clearly show that the FT-IR spectra combined with multivariate analysis from ginseng leaves can be applied as an alternative tool for discriminating ginseng cultivars and cultivation ages. Therefore, we suggest that this result could be used as a rapid and reliable F1 hybrid seed-screening tool for accelerating the conventional breeding of ginseng. PMID:24558311
NASA Astrophysics Data System (ADS)
Ding, Hao; Cao, Ming; DuPont, Andrew W.; Scott, Larry D.; Guha, Sushovan; Singhal, Shashideep; Younes, Mamoun; Pence, Isaac; Herline, Alan; Schwartz, David; Xu, Hua; Mahadevan-Jansen, Anita; Bi, Xiaohong
2016-03-01
Inflammatory bowel disease (IBD) is an idiopathic disease that is typically characterized by chronic inflammation of the gastrointestinal tract. Recently much effort has been devoted to the development of novel diagnostic tools that can assist physicians for fast, accurate, and automated diagnosis of the disease. Previous research based on Raman spectroscopy has shown promising results in differentiating IBD patients from normal screening cases. In the current study, we examined IBD patients in vivo through a colonoscope-coupled Raman system. Optical diagnosis for IBD discrimination was conducted based on full-range spectra using multivariate statistical methods. Further, we incorporated several feature selection methods in machine learning into the classification model. The diagnostic performance for disease differentiation was significantly improved after feature selection. Our results showed that improved IBD diagnosis can be achieved using Raman spectroscopy in combination with multivariate analysis and feature selection.
Truu, Jaak; Heinaru, Eeva; Talpsep, Ene; Heinaru, Ain
2002-01-01
The oil-shale industry has created serious pollution problems in northeastern Estonia. Untreated, phenol-rich leachate from semi-coke mounds formed as a by-product of oil-shale processing is discharged into the Baltic Sea via channels and rivers. An exploratory analysis of water chemical and microbiological data sets from the low-flow period was carried out using different multivariate analysis techniques. Principal component analysis allowed us to distinguish different locations in the river system. The riverine microbial community response to water chemical parameters was assessed by co-inertia analysis. Water pH, COD and total nitrogen were negatively related to the number of biodegradative bacteria, while oxygen concentration promoted the abundance of these bacteria. The results demonstrate the utility of multivariate statistical techniques as tools for estimating the magnitude and extent of pollution based on river water chemical and microbiological parameters. An evaluation of river chemical and microbiological data suggests that the ambient natural attenuation mechanisms only partly eliminate pollutants from river water, and that a sufficient reduction of more recalcitrant compounds could be achieved through the reduction of wastewater discharge from the oil-shale chemical industry into the rivers.
Multivariate analysis of fears in dental phobic patients according to a reduced FSS-II scale.
Hakeberg, M; Gustafsson, J E; Berggren, U; Carlsson, S G
1995-10-01
This study analyzed and assessed dimensions of a questionnaire developed to measure general fears and phobias. A previous factor analysis among 109 dental phobics had revealed a five-factor structure with 22 items and an explained total variance of 54%. The present study analyzed the same material using a multivariate statistical procedure (LISREL) to reveal structural latent variables. The LISREL analysis, based on the correlation matrix, yielded a chi-square of 216.6 with 195 degrees of freedom (P = 0.138) and showed a model with seven latent variables. One was a general fear factor correlated to all 22 items. The other six factors concerned "Illness & Death" (5 items), "Failures & Embarrassment" (5 items), "Social situations" (5 items), "Physical injuries" (4 items), "Animals & Natural phenomena" (4 items). One item (opposite sex) was included in both "Failures & Embarrassment" and "Social situations". The last factor, "Social interaction", combined all the items in "Failures & Embarrassment" and "Social situations" (9 items). In conclusion, this multivariate statistical analysis (LISREL) revealed and confirmed a factor structure similar to our previous study, but added two important dimensions not shown with a traditional factor analysis. This reduced FSS-II version measures general fears and phobias and may be used on a routine clinical basis as well as in dental phobia research.
Liao, Weiqi; Long, Xiaojing; Jiang, Chunxiang; Diao, Yanjun; Liu, Xin; Zheng, Hairong; Zhang, Lijuan
2014-05-01
Differentiating mild cognitive impairment (MCI) and Alzheimer Disease (AD) from healthy aging remains challenging. This study aimed to explore the cerebral structural alterations of subjects with MCI or AD as compared to healthy elderly based on the individual and collective effects of cerebral morphologic indices using univariate and multivariate analyses. T1-weighted images (T1WIs) were retrieved from Alzheimer Disease Neuroimaging Initiative database for 116 subjects who were categorized into groups of healthy aging, MCI, and AD. Analysis of covariance (ANCOVA) and multivariate analysis of covariance (MANCOVA) were performed to explore the intergroup morphologic alterations indexed by surface area, curvature index, cortical thickness, and subjacent white matter volume with age and sex controlled as covariates, in 34 parcellated gyri regions of interest (ROIs) for both cerebral hemispheres based on the T1WI. Statistical parameters were mapped on the anatomic images to facilitate visual inspection. Global rather than region-specific structural alterations were revealed in groups of MCI and AD relative to healthy elderly using MANCOVA. ANCOVA revealed that the cortical thickness decreased more prominently in entorhinal, temporal, and cingulate cortices and was positively correlated with patients' cognitive performance in AD group but not in MCI. The temporal lobe features marked atrophy of white matter during the disease dynamics. Significant intercorrelations were observed among the morphologic indices with univariate analysis for given ROIs. Significant global structural alterations were identified in MCI and AD based on MANCOVA model with improved sensitivity. The intercorrelation among the morphologic indices may dampen the use of individual morphological parameter in featuring cerebral structural alterations. Decrease in cortical thickness is not reflective of the cognitive performance at the early stage of AD. Copyright © 2014 AUR. Published by Elsevier Inc. All rights reserved.
Quantum attack-resistent certificateless multi-receiver signcryption scheme.
Li, Huixian; Chen, Xubao; Pang, Liaojun; Shi, Weisong
2013-01-01
The existing certificateless signcryption schemes were designed mainly based on the traditional public key cryptography, in which the security relies on the hard problems, such as factor decomposition and discrete logarithm. However, these problems will be easily solved by the quantum computing. So the existing certificateless signcryption schemes are vulnerable to the quantum attack. Multivariate public key cryptography (MPKC), which can resist the quantum attack, is one of the alternative solutions to guarantee the security of communications in the post-quantum age. Motivated by these concerns, we proposed a new construction of the certificateless multi-receiver signcryption scheme (CLMSC) based on MPKC. The new scheme inherits the security of MPKC, which can withstand the quantum attack. Multivariate quadratic polynomial operations, which have lower computation complexity than bilinear pairing operations, are employed in signcrypting a message for a certain number of receivers in our scheme. Security analysis shows that our scheme is a secure MPKC-based scheme. We proved its security under the hardness of the Multivariate Quadratic (MQ) problem and its unforgeability under the Isomorphism of Polynomials (IP) assumption in the random oracle model. The analysis results show that our scheme also has the security properties of non-repudiation, perfect forward secrecy, perfect backward secrecy and public verifiability. Compared with the existing schemes in terms of computation complexity and ciphertext length, our scheme is more efficient, which makes it suitable for terminals with low computation capacity like smart cards.
Hemakom, Apit; Powezka, Katarzyna; Goverdovsky, Valentin; Jaffer, Usman; Mandic, Danilo P
2017-12-01
A highly localized data-association measure, termed intrinsic synchrosqueezing transform (ISC), is proposed for the analysis of coupled nonlinear and non-stationary multivariate signals. This is achieved based on a combination of noise-assisted multivariate empirical mode decomposition and short-time Fourier transform-based univariate and multivariate synchrosqueezing transforms. It is shown that the ISC outperforms six other combinations of algorithms in estimating degrees of synchrony in synthetic linear and nonlinear bivariate signals. Its advantage is further illustrated in the precise identification of the synchronized respiratory and heart rate variability frequencies among a subset of bass singers of a professional choir, where it distinctly exhibits better performance than the continuous wavelet transform-based ISC. We also introduce an extension to the intrinsic phase synchrony (IPS) measure, referred to as nested intrinsic phase synchrony (N-IPS), for the empirical quantification of physically meaningful and straightforward-to-interpret trends in phase synchrony. The N-IPS is employed to reveal physically meaningful variations in the levels of cooperation in choir singing and performing a surgical procedure. Both the proposed techniques successfully reveal degrees of synchronization of the physiological signals in two different aspects: (i) precise localization of synchrony in time and frequency (ISC), and (ii) large-scale analysis for the empirical quantification of physically meaningful trends in synchrony (N-IPS).
NASA Astrophysics Data System (ADS)
Mondal, A.; Zachariah, M.; Achutarao, K. M.; Otto, F. E. L.
2017-12-01
The Marathwada region in Maharashtra, India is known to suffer significantly from agrarian crisis including farmer suicides resulting from persistent droughts. Drought monitoring in India is commonly based on univariate indicators that consider the deficiency in precipitation alone. However, droughts may involve complex interplay of multiple physical variables, necessitating an integrated, multivariate approach to analyse their behaviour. In this study, we compare the behaviour of drought characteristics in Marathwada in the recent years as compared to the first half of the twentieth century, using a joint precipitation and temperature-based Multivariate Standardized Drought Index (MSDI). Drought events in the recent times are found to exhibit exceptional simultaneous anomalies of high temperature and precipitation deficits in this region, though studies on precipitation alone show that these events are within the range of historically observed variability. Additionally, we also develop multivariate copula-based Severity-Duration-Frequency (SDF) relationships for droughts in this region and compare their natures pre- and post- 1950. Based on multivariate return periods considering both temperature and precipitation anomalies, as well as the severity and duration of droughts, it is found that droughts have become more frequent in the post-1950 period. Based on precipitation alone, such an observation cannot be made. This emphasizes the sensitivity of droughts to temperature and underlines the importance of considering compound effects of temperature and precipitation in order to avoid an underestimation of drought risk. This observation-based analysis is the first step towards investigating the causal mechanisms of droughts, their evolutions and impacts in this region, particularly those influenced by anthropogenic climate change.
Impact of an Adlerian Based Pretrial Diversion Program: Self Concept and Dissociation
ERIC Educational Resources Information Center
Norvell, Jeanell J.
2010-01-01
Clients' self concepts and dissociative experiences were examined to determine the impact of an Adlerian based pretrial diversion program. Clients completing the program displayed a significant change in self concepts and dissociative experiences. A repeated measures multivariate analysis of variance indicated a 35% change, made up of the…
An Exploration of Adult Career Interests and Work Values in Taiwan
ERIC Educational Resources Information Center
Tien, Hsiu-Lan Shelley
2011-01-01
The purpose of the study was to investigate the relationship between vocational interests and work values among 206 adults in Taiwan. The instruments were the Career Interest Inventory developed based on Holland's RIASEC typology and the Work Value Inventory developed based on Super's theory. The results of multivariate analysis of variance…
Wang, Gang; Teng, Chaolin; Li, Kuo; Zhang, Zhonglin; Yan, Xiangguo
2016-09-01
The recorded electroencephalography (EEG) signals are usually contaminated by electrooculography (EOG) artifacts. In this paper, by using independent component analysis (ICA) and multivariate empirical mode decomposition (MEMD), the ICA-based MEMD method was proposed to remove EOG artifacts (EOAs) from multichannel EEG signals. First, the EEG signals were decomposed by the MEMD into multiple multivariate intrinsic mode functions (MIMFs). The EOG-related components were then extracted by reconstructing the MIMFs corresponding to EOAs. After performing the ICA of EOG-related signals, the EOG-linked independent components were distinguished and rejected. Finally, the clean EEG signals were reconstructed by implementing the inverse transform of ICA and MEMD. The results of simulated and real data suggested that the proposed method could successfully eliminate EOAs from EEG signals and preserve useful EEG information with little loss. By comparing with other existing techniques, the proposed method achieved much improvement in terms of the increase of signal-to-noise and the decrease of mean square error after removing EOAs.
Yan, Zhengbing; Kuang, Te-Hui; Yao, Yuan
2017-09-01
In recent years, multivariate statistical monitoring of batch processes has become a popular research topic, wherein multivariate fault isolation is an important step aiming at the identification of the faulty variables contributing most to the detected process abnormality. Although contribution plots have been commonly used in statistical fault isolation, such methods suffer from the smearing effect between correlated variables. In particular, in batch process monitoring, the high autocorrelations and cross-correlations that exist in variable trajectories make the smearing effect unavoidable. To address such a problem, a variable selection-based fault isolation method is proposed in this research, which transforms the fault isolation problem into a variable selection problem in partial least squares discriminant analysis and solves it by calculating a sparse partial least squares model. As different from the traditional methods, the proposed method emphasizes the relative importance of each process variable. Such information may help process engineers in conducting root-cause diagnosis. Copyright © 2017 ISA. Published by Elsevier Ltd. All rights reserved.
Fingeret, Abbey L; Martinez, Rebecca H; Hsieh, Christine; Downey, Peter; Nowygrod, Roman
2016-02-01
We aim to determine whether observed operations or internet-based video review predict improved performance in the surgery clerkship. A retrospective review of students' usage of surgical videos, observed operations, evaluations, and examination scores was used to construct an exploratory principal component analysis. Multivariate regression was used to determine factors predictive of clerkship performance. Case log data for 231 students revealed a median of 25 observed cases. Students accessed the web-based video platform a median of 15 times. Principal component analysis yielded 4 factors contributing 74% of the variability with a Kaiser-Meyer-Olkin coefficient of .83. Multivariate regression showed shelf score (P < .0001), internal clinical skills examination score (P < .0001), subjective evaluations (P < .001), and video website utilization (P < .001), but not observed cases, to be significantly associated with overall performance. Utilization of a web-based operative video platform during a surgical clerkship is independently associated with improved clinical reasoning, fund of knowledge, and overall evaluation. Thus, this modality can serve as a useful adjunct to live observation. Copyright © 2016 Elsevier Inc. All rights reserved.
Mathew, Boby; Holand, Anna Marie; Koistinen, Petri; Léon, Jens; Sillanpää, Mikko J
2016-02-01
A novel reparametrization-based INLA approach as a fast alternative to MCMC for the Bayesian estimation of genetic parameters in multivariate animal model is presented. Multi-trait genetic parameter estimation is a relevant topic in animal and plant breeding programs because multi-trait analysis can take into account the genetic correlation between different traits and that significantly improves the accuracy of the genetic parameter estimates. Generally, multi-trait analysis is computationally demanding and requires initial estimates of genetic and residual correlations among the traits, while those are difficult to obtain. In this study, we illustrate how to reparametrize covariance matrices of a multivariate animal model/animal models using modified Cholesky decompositions. This reparametrization-based approach is used in the Integrated Nested Laplace Approximation (INLA) methodology to estimate genetic parameters of multivariate animal model. Immediate benefits are: (1) to avoid difficulties of finding good starting values for analysis which can be a problem, for example in Restricted Maximum Likelihood (REML); (2) Bayesian estimation of (co)variance components using INLA is faster to execute than using Markov Chain Monte Carlo (MCMC) especially when realized relationship matrices are dense. The slight drawback is that priors for covariance matrices are assigned for elements of the Cholesky factor but not directly to the covariance matrix elements as in MCMC. Additionally, we illustrate the concordance of the INLA results with the traditional methods like MCMC and REML approaches. We also present results obtained from simulated data sets with replicates and field data in rice.
Multiple Hypothesis Testing for Experimental Gingivitis Based on Wilcoxon Signed Rank Statistics
Preisser, John S.; Sen, Pranab K.; Offenbacher, Steven
2011-01-01
Dental research often involves repeated multivariate outcomes on a small number of subjects for which there is interest in identifying outcomes that exhibit change in their levels over time as well as to characterize the nature of that change. In particular, periodontal research often involves the analysis of molecular mediators of inflammation for which multivariate parametric methods are highly sensitive to outliers and deviations from Gaussian assumptions. In such settings, nonparametric methods may be favored over parametric ones. Additionally, there is a need for statistical methods that control an overall error rate for multiple hypothesis testing. We review univariate and multivariate nonparametric hypothesis tests and apply them to longitudinal data to assess changes over time in 31 biomarkers measured from the gingival crevicular fluid in 22 subjects whereby gingivitis was induced by temporarily withholding tooth brushing. To identify biomarkers that can be induced to change, multivariate Wilcoxon signed rank tests for a set of four summary measures based upon area under the curve are applied for each biomarker and compared to their univariate counterparts. Multiple hypothesis testing methods with choice of control of the false discovery rate or strong control of the family-wise error rate are examined. PMID:21984957
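The abstract contrasts control of the false discovery rate with strong control of the family-wise error rate but does not say which procedures were used. As a minimal sketch, assuming the Benjamini-Hochberg step-up procedure for the former and the Holm step-down procedure for the latter (both are stand-ins chosen for illustration, and the p-values are made up), the difference between the two criteria looks like this:

```python
def benjamini_hochberg(p_values, q=0.05):
    """Flag hypotheses rejected by the Benjamini-Hochberg step-up rule (FDR level q)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * q.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            cutoff = rank
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= cutoff:
            rejected[idx] = True
    return rejected


def holm(p_values, alpha=0.05):
    """Flag hypotheses rejected by the Holm step-down rule (strong FWER control)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        # Compare the rank-th smallest p-value with alpha / (m - rank + 1);
        # stop at the first failure.
        if p_values[idx] <= alpha / (m - rank + 1):
            rejected[idx] = True
        else:
            break
    return rejected


# Illustrative p-values only; they are not taken from the study.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
bh_flags = benjamini_hochberg(p)
holm_flags = holm(p)
print(sum(bh_flags), sum(holm_flags))
```

On these ten made-up p-values the FDR rule rejects two hypotheses and the FWER rule rejects one, which illustrates why the choice of error criterion matters when screening many biomarkers.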
Zhou, Jinzhe; Zhou, Yanbing; Cao, Shougen; Li, Shikuan; Wang, Hao; Niu, Zhaojian; Chen, Dong; Wang, Dongsheng; Lv, Liang; Zhang, Jian; Li, Yu; Jiao, Xuelong; Tan, Xiaojie; Zhang, Jianli; Wang, Haibo; Zhang, Bingyuan; Lu, Yun; Sun, Zhenqing
2016-01-01
Reports of surgical complications are common, but few provide information about severity or estimate the risk factors for complications, and those that do lack specificity. We retrospectively analyzed data on 2795 gastric cancer patients who underwent surgery at the Affiliated Hospital of Qingdao University between June 2007 and June 2012 and established a multivariate logistic regression model to predict risk factors for postoperative complications graded according to the Clavien-Dindo classification system. Twenty-four of 86 variables were statistically significant in univariate logistic regression analysis, and the 11 significant variables that entered multivariate analysis were used to produce the risk model. Liver cirrhosis, diabetes mellitus, Child classification, invasion of neighboring organs, combined resection, intraoperative transfusion, Billroth II anastomosis for reconstruction, malnutrition, surgical volume of surgeons, operating time and age were independent risk factors for postoperative complications after gastrectomy. Based on the logistic regression equation $p = \exp(\sum B_i X_i) / (1 + \exp(\sum B_i X_i))$, a multivariate logistic regression predictive model that calculates the risk of postoperative morbidity was developed: $p = 1/(1 + e^{4.810 - 1.287X_1 - 0.504X_2 - 0.500X_3 - 0.474X_4 - 0.405X_5 - 0.318X_6 - 0.316X_7 - 0.305X_8 - 0.278X_9 - 0.255X_{10} - 0.138X_{11}})$. The accuracy, sensitivity and specificity of the model in predicting postoperative complications were 86.7%, 76.2% and 88.6%, respectively. This risk model, based on the Clavien-Dindo system for grading the severity of complications and on logistic regression analysis, can predict severe morbidity specific to an individual patient's risk factors, can estimate patients' risks and benefits of gastric surgery as an accurate decision-making tool, and may serve as a template for the development of risk models for other surgical groups.
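The risk equation in this abstract can be evaluated directly. The sketch below is only an illustration: the abstract does not say which risk factor corresponds to which of X1 to X11 or how each variable is coded, so the inputs here are treated as hypothetical 0/1 indicators, and the function and constant names are assumptions.

```python
import math

# Intercept and coefficients copied from the equation in the abstract,
# written in the form p = 1 / (1 + exp(-(b0 + sum(b_i * x_i)))).
INTERCEPT = -4.810
COEFFICIENTS = [1.287, 0.504, 0.500, 0.474, 0.405, 0.318,
                0.316, 0.305, 0.278, 0.255, 0.138]


def complication_risk(x):
    """Predicted probability of postoperative complications for inputs X1..X11.

    The coding of each X_i is not given in the abstract; 0/1 indicators
    are assumed here purely for illustration.
    """
    if len(x) != len(COEFFICIENTS):
        raise ValueError("expected 11 predictor values")
    linear = INTERCEPT + sum(b * xi for b, xi in zip(COEFFICIENTS, x))
    return 1.0 / (1.0 + math.exp(-linear))


baseline = complication_risk([0] * 11)   # no risk factor present
all_present = complication_risk([1] * 11)  # every indicator set to 1
print(round(baseline, 4), round(all_present, 4))
```

Under the 0/1 assumption, the predicted risk rises from below 1% with no factors present to just under 50% with all eleven present, because the coefficients sum to 4.780 against an intercept of -4.810.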
NASA Astrophysics Data System (ADS)
Belianinov, Alex; Ganesh, Panchapakesan; Lin, Wenzhi; Sales, Brian C.; Sefat, Athena S.; Jesse, Stephen; Pan, Minghu; Kalinin, Sergei V.
2014-12-01
Atomic level spatial variability of electronic structure in Fe-based superconductor FeTe0.55Se0.45 (Tc = 15 K) is explored using current-imaging tunneling-spectroscopy. Multivariate statistical analysis of the data differentiates regions of dissimilar electronic behavior that can be identified with the segregation of chalcogen atoms, as well as boundaries between terminations and near neighbor interactions. Subsequent clustering analysis allows identification of the spatial localization of these dissimilar regions. Similar statistical analysis of modeled calculated density of states of chemically inhomogeneous FeTe1-xSex structures further confirms that the two types of chalcogens, i.e., Te and Se, can be identified by their electronic signature and differentiated by their local chemical environment. This approach allows detailed chemical discrimination of the scanning tunneling microscopy data including separation of atomic identities, proximity, and local configuration effects and can be universally applicable to chemically and electronically inhomogeneous surfaces.
Classification of adulterated honeys by multivariate analysis.
Amiry, Saber; Esmaiili, Mohsen; Alizadeh, Mohammad
2017-06-01
In this research, honey samples were adulterated with date syrup (DS) and invert sugar syrup (IS) at three concentrations (7%, 15% and 30%). 102 adulterated samples were prepared in six batches with 17 replications for each batch. For each sample, 32 parameters including color indices, rheological, physical, and chemical parameters were determined. To classify the samples, based on type and concentrations of adulterant, a multivariate analysis was applied using principal component analysis (PCA) followed by a linear discriminant analysis (LDA). Then, 21 principal components (PCs) were selected in five sets. Approximately two-thirds were identified correctly using color indices (62.75%) or rheological properties (67.65%). Powerful discrimination was obtained using physical properties (97.06%), and the best separations were achieved using two sets of chemical properties (set 1: lactone, diastase activity, sucrose - 100%) (set 2: free acidity, HMF, ash - 95%). Copyright © 2016 Elsevier Ltd. All rights reserved.
Arm structure in normal spiral galaxies, 1: Multivariate data for 492 galaxies
NASA Technical Reports Server (NTRS)
Magri, Christopher
1994-01-01
Multivariate data have been collected as part of an effort to develop a new classification system for spiral galaxies, one which is not necessarily based on subjective morphological properties. A sample of 492 moderately bright northern Sa and Sc spirals was chosen for future statistical analysis. New observations were made at 20 and 21 cm; the latter data are described in detail here. Infrared Astronomy Satellite (IRAS) fluxes were obtained from archival data. Finally, new estimates of arm pattern randomness and of local environmental harshness were compiled for most sample objects.
Selvarasu, Suresh; Kim, Do Yun; Karimi, Iftekhar A; Lee, Dong-Yup
2010-10-01
We present an integrated framework for characterizing fed-batch cultures of mouse hybridoma cells producing monoclonal antibody (mAb). This framework systematically combines data preprocessing, elemental balancing and statistical analysis technique. Initially, specific rates of cell growth, glucose/amino acid consumptions and mAb/metabolite productions were calculated via curve fitting using logistic equations, with subsequent elemental balancing of the preprocessed data indicating the presence of experimental measurement errors. Multivariate statistical analysis was then employed to understand physiological characteristics of the cellular system. The results from principal component analysis (PCA) revealed three major clusters of amino acids with similar trends in their consumption profiles: (i) arginine, threonine and serine, (ii) glycine, tyrosine, phenylalanine, methionine, histidine and asparagine, and (iii) lysine, valine and isoleucine. Further analysis using partial least square (PLS) regression identified key amino acids which were positively or negatively correlated with the cell growth, mAb production and the generation of lactate and ammonia. Based on these results, the optimal concentrations of key amino acids in the feed medium can be inferred, potentially leading to an increase in cell viability and productivity, as well as a decrease in toxic waste production. The study demonstrated how the current methodological framework using multivariate statistical analysis techniques can serve as a potential tool for deriving rational medium design strategies. Copyright © 2010 Elsevier B.V. All rights reserved.
Spatial assessment of air quality patterns in Malaysia using multivariate analysis
NASA Astrophysics Data System (ADS)
Dominick, Doreena; Juahir, Hafizan; Latif, Mohd Talib; Zain, Sharifuddin M.; Aris, Ahmad Zaharin
2012-12-01
This study aims to investigate possible sources of air pollutants and the spatial patterns within the eight selected Malaysian air monitoring stations based on a two-year database (2008-2009). Multivariate analysis was applied to the dataset. It incorporated Hierarchical Agglomerative Cluster Analysis (HACA) to assess the spatial patterns, Principal Component Analysis (PCA) to determine the major sources of the air pollution and Multiple Linear Regression (MLR) to assess the percentage contribution of each air pollutant. The HACA results grouped the eight monitoring stations into three different clusters, based on the characteristics of the air pollutants and meteorological parameters. The PCA analysis showed that the major sources of air pollution were emissions from motor vehicles, aircraft, industries and areas of high population density. The MLR analysis demonstrated that the main pollutant contributing to variability in the Air Pollutant Index (API) at all stations was particulate matter with a diameter of less than 10 μm (PM10). Further MLR analysis showed that the main air pollutant influencing the high concentration of PM10 was carbon monoxide (CO). This was due to combustion processes, particularly originating from motor vehicles. Meteorological factors such as ambient temperature, wind speed and humidity were also noted to influence the concentration of PM10.
Bonetti, Jennifer; Quarino, Lawrence
2014-05-01
This study has shown that the combination of simple techniques with the use of multivariate statistics offers the potential for the comparative analysis of soil samples. Five samples were obtained from each of twelve state parks across New Jersey in both the summer and fall seasons. Each sample was examined using particle-size distribution, pH analysis in both water and 1 M CaCl2, and a loss on ignition technique. Data from each of the techniques were combined, and principal component analysis (PCA) and canonical discriminant analysis (CDA) were used for multivariate data transformation. Samples from different locations could be visually differentiated from one another using these multivariate plots. Hold-one-out cross-validation analysis showed error rates as low as 3.33%. Ten blind study samples were analyzed resulting in no misclassifications using Mahalanobis distance calculations and visual examinations of multivariate plots. Seasonal variation was minimal between corresponding samples, suggesting potential success in forensic applications. © 2014 American Academy of Forensic Sciences.
Quantifying the impact of between-study heterogeneity in multivariate meta-analyses
Jackson, Dan; White, Ian R; Riley, Richard D
2012-01-01
Measures that quantify the impact of heterogeneity in univariate meta-analysis, including the very popular I2 statistic, are now well established. Multivariate meta-analysis, where studies provide multiple outcomes that are pooled in a single analysis, is also becoming more commonly used. The question of how to quantify heterogeneity in the multivariate setting is therefore raised. It is the univariate R2 statistic, the ratio of the variance of the estimated treatment effect under the random and fixed effects models, that generalises most naturally, so this statistic provides our basis. This statistic is then used to derive a multivariate analogue of I2, which we call . We also provide a multivariate H2 statistic, the ratio of a generalisation of Cochran's heterogeneity statistic and its associated degrees of freedom, with an accompanying generalisation of the usual I2 statistic, . Our proposed heterogeneity statistics can be used alongside all the usual estimates and inferential procedures used in multivariate meta-analysis. We apply our methods to some real datasets and show how our statistics are equally appropriate in the context of multivariate meta-regression, where study level covariate effects are included in the model. Our heterogeneity statistics may be used when applying any procedure for fitting the multivariate random effects model. Copyright © 2012 John Wiley & Sons, Ltd. PMID:22763950
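For orientation, the univariate quantities that this abstract generalises can be computed in a few lines. The sketch below covers only the standard univariate case: Cochran's Q with inverse-variance weights, H2 as Q divided by its degrees of freedom, and I2 as (H2 - 1)/H2 truncated at zero. It does not implement the multivariate statistics proposed in the paper, and the effect estimates are made-up illustration values.

```python
def cochran_q(effects, variances):
    """Cochran's heterogeneity statistic using inverse-variance weights."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    return sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))


def h2_and_i2(q, df):
    """Univariate H^2 = Q / df and I^2 = (H^2 - 1) / H^2, truncated at 0."""
    h2 = q / df
    i2 = max(0.0, (h2 - 1.0) / h2) if h2 > 0 else 0.0
    return h2, i2


# Made-up study estimates and within-study variances, for illustration only.
effects = [0.1, 0.3, 0.5]
variances = [0.01, 0.01, 0.01]

q = cochran_q(effects, variances)
h2, i2 = h2_and_i2(q, df=len(effects) - 1)
print(round(q, 3), round(h2, 3), round(i2, 3))
```

With three equally precise studies spread 0.2 apart, Q is 8 on 2 degrees of freedom, so H2 is 4 and I2 is 0.75, meaning three quarters of the total variation is attributed to between-study heterogeneity.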
Are Canadian Adolescents Happy? A Gender-Based Analysis of a Nationally Representative Survey
ERIC Educational Resources Information Center
Weaver, Robert D.; Habibov, Nazim N.
2010-01-01
In this study, the authors analyzed data from a nationally representative survey of youth to study happiness amongst Canadian adolescents aged 12-17. Testing for differences in the level of happiness between female and male adolescents was conducted. Following this, multivariate analysis was employed to determine which factors were associated with…
Analyzing Multiple Outcomes in Clinical Research Using Multivariate Multilevel Models
Baldwin, Scott A.; Imel, Zac E.; Braithwaite, Scott R.; Atkins, David C.
2014-01-01
Objective: Multilevel models have become a standard data analysis approach in intervention research. Although the vast majority of intervention studies involve multiple outcome measures, few studies use multivariate analysis methods. The authors discuss multivariate extensions to the multilevel model that can be used by psychotherapy researchers. Method and Results: Using simulated longitudinal treatment data, the authors show how multivariate models extend common univariate growth models and how the multivariate model can be used to examine multivariate hypotheses involving fixed effects (e.g., does the size of the treatment effect differ across outcomes?) and random effects (e.g., is change in one outcome related to change in the other?). An online supplemental appendix provides annotated computer code and simulated example data for implementing a multivariate model. Conclusions: Multivariate multilevel models are flexible, powerful models that can enhance clinical research. PMID:24491071
Using a Grocery List Is Associated With a Healthier Diet and Lower BMI Among Very High-Risk Adults.
Dubowitz, Tamara; Cohen, Deborah A; Huang, Christina Y; Beckman, Robin A; Collins, Rebecca L
2015-01-01
Examine whether use of a grocery list is associated with healthier diet and weight among food desert residents. Cross-sectional analysis of in-person interview data from randomly selected household food shoppers in 2 low-income, primarily African American urban neighborhoods in Pittsburgh, PA with limited access to healthy foods. Multivariate ordinary least-square regressions conducted among 1,372 participants and controlling for sociodemographic factors and other potential confounding variables indicated that although most of the sample (78%) was overweight or obese, consistently using a list was associated with lower body mass index (based on measured height and weight) (adjusted multivariant coefficient = 0.095) and higher dietary quality (based on the Healthy Eating Index-2005) (adjusted multivariant coefficient = 0.103) (P < .05). Shopping with a list may be a useful tool for low-income individuals to improve diet or decrease body mass index. Copyright © 2015 Society for Nutrition Education and Behavior. Published by Elsevier Inc. All rights reserved.
Davatzikos, Christos
2016-10-01
The past 20 years have seen a mushrooming growth of the field of computational neuroanatomy. Much of this work has been enabled by the development and refinement of powerful, high-dimensional image warping methods, which have enabled detailed brain parcellation, voxel-based morphometric analyses, and multivariate pattern analyses using machine learning approaches. The evolution of these 3 types of analyses over the years has overcome many challenges. We present the evolution of our work in these 3 directions, which largely follows the evolution of this field. We discuss the progression from single-atlas, single-registration brain parcellation work to current ensemble-based parcellation; from relatively basic mass-univariate t-tests to optimized regional pattern analyses combining deformations and residuals; and from basic application of support vector machines to generative-discriminative formulations of multivariate pattern analyses, and to methods dealing with heterogeneity of neuroanatomical patterns. We conclude with discussion of some of the future directions and challenges. Copyright © 2016. Published by Elsevier B.V.
Davatzikos, Christos
2017-01-01
The past 20 years have seen a mushrooming growth of the field of computational neuroanatomy. Much of this work has been enabled by the development and refinement of powerful, high-dimensional image warping methods, which have enabled detailed brain parcellation, voxel-based morphometric analyses, and multivariate pattern analyses using machine learning approaches. The evolution of these 3 types of analyses over the years has overcome many challenges. We present the evolution of our work in these 3 directions, which largely follows the evolution of this field. We discuss the progression from single-atlas, single-registration brain parcellation work to current ensemble-based parcellation; from relatively basic mass-univariate t-tests to optimized regional pattern analyses combining deformations and residuals; and from basic application of support vector machines to generative-discriminative formulations of multivariate pattern analyses, and to methods dealing with heterogeneity of neuroanatomical patterns. We conclude with discussion of some of the future directions and challenges. PMID:27514582
De Luca, Michele; Ioele, Giuseppina; Mas, Sílvia; Tauler, Romà; Ragno, Gaetano
2012-11-21
Amiloride photostability at different pH values was studied in depth by applying Multivariate Curve Resolution Alternating Least Squares (MCR-ALS) to the UV spectrophotometric data from drug solutions exposed to stressing irradiation. Resolution of all degradation photoproducts was possible by simultaneous spectrophotometric analysis of kinetic photodegradation and acid-base titration experiments. Amiloride photodegradation was shown to be strongly dependent on pH. Two hard modelling constraints were sequentially used in MCR-ALS for the unambiguous resolution of all the species involved in the photodegradation process. An amiloride acid-base system was defined by using the equilibrium constraint, and the photodegradation pathway was modelled taking into account the kinetic constraint. The simultaneous analysis of photodegradation and titration experiments revealed the presence of eight different species, which were differently distributed according to pH and time. Concentration profiles of all the species as well as their pure spectra were resolved and kinetic rate constants were estimated. The values of rate constants changed with pH and under alkaline conditions the degradation pathway and photoproducts also changed. These results were compared to those obtained by LC-MS analysis from drug photodegradation experiments. MS analysis allowed the identification of up to five species and showed the simultaneous presence of more than one acid-base equilibrium.
Optimization of Interior Permanent Magnet Motor by Quality Engineering and Multivariate Analysis
NASA Astrophysics Data System (ADS)
Okada, Yukihiro; Kawase, Yoshihiro
This paper describes an optimization method based on the finite element method (FEM). Quality engineering and multivariable analysis are used as the optimization techniques. The method consists of two steps. In Step 1, the influence of each parameter on the output is obtained quantitatively; in Step 2, the number of FEM calculations is reduced. In this way, the optimal combination of design parameters that satisfies the required characteristic can be searched for efficiently. The method is applied to the design of an IPM motor to reduce the torque ripple. The final shape maintains the average torque and cuts the torque ripple by 65%. Furthermore, the amount of permanent magnet material can be reduced.
NASA Astrophysics Data System (ADS)
Sikora, Roman; Markiewicz, Przemysław; Pabjańczyk, Wiesława
2018-04-01
Power systems usually include a number of nonlinear receivers. Nonlinear receivers are a source of disturbances injected into the power system in the form of higher harmonics. The level of these disturbances is described by the total harmonic distortion (THD) coefficient. Its value depends on many factors, among them the deformation of the supply voltage and changes in its RMS value. A modern LED luminaire is also a nonlinear receiver. The paper presents the results of an analysis of the influence of changes in the RMS value of the supply voltage and of the dimming level of the tested luminaire on the value of the current THD. The analysis was made using a mathematical model based on multivariable polynomial fitting.
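The abstract does not spell out how THD is computed. As a minimal sketch, assuming the common definition relative to the fundamental (the RMS of all higher harmonics divided by the RMS of the fundamental), the coefficient could be evaluated as follows. The function name and the input layout are hypothetical and are not taken from the paper.

```python
import math


def total_harmonic_distortion(harmonic_rms):
    """THD relative to the fundamental.

    harmonic_rms -- RMS amplitudes ordered by harmonic number:
                    [fundamental, 2nd harmonic, 3rd harmonic, ...]
    """
    fundamental, *higher = harmonic_rms
    if fundamental <= 0:
        raise ValueError("fundamental RMS amplitude must be positive")
    # Root-sum-square of the higher harmonics over the fundamental.
    return math.sqrt(sum(h * h for h in higher)) / fundamental
```

For a current with a 10 A fundamental and higher harmonics of 3 A and 4 A, `total_harmonic_distortion([10.0, 3.0, 4.0])` returns 0.5, i.e. a THD of 50%.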
Rapid detection of bacterial pathogens using fluorescence spectroscopy and chemometrics
USDA-ARS's Scientific Manuscript database
This work presents the development of a method for rapid bacterial identification based on fluorescence spectroscopy combined with multivariate analysis. Fluorescence spectra of three different pure genera of bacteria (Escherichia coli, Salmonella, and Campylobacter) were collected from 200...
Augmented classical least squares multivariate spectral analysis
Haaland, David M.; Melgaard, David K.
2004-02-03
A method of multivariate spectral analysis, termed augmented classical least squares (ACLS), provides an improved CLS calibration model when unmodeled sources of spectral variation are contained in a calibration sample set. The ACLS methods use information derived from component or spectral residuals during the CLS calibration to provide an improved calibration-augmented CLS model. The ACLS methods are based on CLS so that they retain the qualitative benefits of CLS, yet they have the flexibility of PLS and other hybrid techniques in that they can define a prediction model even with unmodeled sources of spectral variation that are not explicitly included in the calibration model. The unmodeled sources of spectral variation may be unknown constituents, constituents with unknown concentrations, nonlinear responses, non-uniform and correlated errors, or other sources of spectral variation that are present in the calibration sample spectra. Also, since the various ACLS methods are based on CLS, they can incorporate the new prediction-augmented CLS (PACLS) method of updating the prediction model for new sources of spectral variation contained in the prediction sample set without having to return to the calibration process. The ACLS methods can also be applied to alternating least squares models. The ACLS methods can be applied to all types of multivariate data.
Augmented Classical Least Squares Multivariate Spectral Analysis
Haaland, David M.; Melgaard, David K.
2005-07-26
A method of multivariate spectral analysis, termed augmented classical least squares (ACLS), provides an improved CLS calibration model when unmodeled sources of spectral variation are contained in a calibration sample set. The ACLS methods use information derived from component or spectral residuals during the CLS calibration to provide an improved calibration-augmented CLS model. The ACLS methods are based on CLS so that they retain the qualitative benefits of CLS, yet they have the flexibility of PLS and other hybrid techniques in that they can define a prediction model even with unmodeled sources of spectral variation that are not explicitly included in the calibration model. The unmodeled sources of spectral variation may be unknown constituents, constituents with unknown concentrations, nonlinear responses, non-uniform and correlated errors, or other sources of spectral variation that are present in the calibration sample spectra. Also, since the various ACLS methods are based on CLS, they can incorporate the new prediction-augmented CLS (PACLS) method of updating the prediction model for new sources of spectral variation contained in the prediction sample set without having to return to the calibration process. The ACLS methods can also be applied to alternating least squares models. The ACLS methods can be applied to all types of multivariate data.
Augmented Classical Least Squares Multivariate Spectral Analysis
Haaland, David M.; Melgaard, David K.
2005-01-11
A method of multivariate spectral analysis, termed augmented classical least squares (ACLS), provides an improved CLS calibration model when unmodeled sources of spectral variation are contained in a calibration sample set. The ACLS methods use information derived from component or spectral residuals during the CLS calibration to provide an improved calibration-augmented CLS model. The ACLS methods are based on CLS so that they retain the qualitative benefits of CLS, yet they have the flexibility of PLS and other hybrid techniques in that they can define a prediction model even with unmodeled sources of spectral variation that are not explicitly included in the calibration model. The unmodeled sources of spectral variation may be unknown constituents, constituents with unknown concentrations, nonlinear responses, non-uniform and correlated errors, or other sources of spectral variation that are present in the calibration sample spectra. Also, since the various ACLS methods are based on CLS, they can incorporate the new prediction-augmented CLS (PACLS) method of updating the prediction model for new sources of spectral variation contained in the prediction sample set without having to return to the calibration process. The ACLS methods can also be applied to alternating least squares models. The ACLS methods can be applied to all types of multivariate data.
Analysis techniques for multivariate root loci. [a tool in linear control systems]
NASA Technical Reports Server (NTRS)
Thompson, P. M.; Stein, G.; Laub, A. J.
1980-01-01
Analysis and techniques are developed for the multivariable root locus and the multivariable optimal root locus. The generalized eigenvalue problem is used to compute angles and sensitivities for both types of loci, and an algorithm is presented that determines the asymptotic properties of the optimal root locus.
Methods for presentation and display of multivariate data
NASA Technical Reports Server (NTRS)
Myers, R. H.
1981-01-01
Methods for the presentation and display of multivariate data are discussed with emphasis placed on the multivariate analysis of variance problems and the Hotelling T² solution in the two-sample case. The methods utilize the concepts of stepwise discrimination analysis and the computation of partial correlation coefficients.
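As a reminder of what the two-sample Hotelling statistic looks like in practice, the sketch below computes T² from the pooled sample covariance and converts it to the usual F statistic with (p, n1 + n2 − p − 1) degrees of freedom. It illustrates the textbook formula only, not the display methods of the report; it assumes NumPy is available, and the function name `hotelling_t2` is hypothetical.

```python
import numpy as np


def hotelling_t2(x1, x2):
    """Two-sample Hotelling T^2 and the corresponding F statistic.

    x1, x2 -- arrays of shape (n1, p) and (n2, p), one row per observation.
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    n1, p = x1.shape
    n2 = x2.shape[0]
    diff = x1.mean(axis=0) - x2.mean(axis=0)
    # Pooled covariance; atleast_2d keeps the p == 1 case a matrix.
    s1 = np.atleast_2d(np.cov(x1, rowvar=False))
    s2 = np.atleast_2d(np.cov(x2, rowvar=False))
    s_pooled = ((n1 - 1) * s1 + (n2 - 1) * s2) / (n1 + n2 - 2)
    t2 = (n1 * n2 / (n1 + n2)) * diff @ np.linalg.solve(s_pooled, diff)
    # Under normality and equal covariances, f_stat ~ F(p, n1 + n2 - p - 1).
    f_stat = (n1 + n2 - p - 1) / (p * (n1 + n2 - 2)) * t2
    return float(t2), float(f_stat)
```

The F statistic can then be compared against an F distribution with p and n1 + n2 − p − 1 degrees of freedom to obtain a p-value.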
A Primer on Multivariate Analysis of Variance (MANOVA) for Behavioral Scientists
ERIC Educational Resources Information Center
Warne, Russell T.
2014-01-01
Reviews of statistical procedures (e.g., Bangert & Baumberger, 2005; Kieffer, Reese, & Thompson, 2001; Warne, Lazo, Ramos, & Ritter, 2012) show that one of the most common multivariate statistical methods in psychological research is multivariate analysis of variance (MANOVA). However, MANOVA and its associated procedures are often not…
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xiaoyan Tang; Min Shao; Yuanhang Zhang
1996-12-31
Ambient aerosol is one of the most important pollutants in China. This paper presents the sources of aerosol in the Beijing area, as revealed by a combination of multivariate analysis models and the 14C tracer measured by Accelerator Mass Spectrometry (AMS). The results indicated that the mass concentration of particulates (<100 µm) did not increase rapidly compared with economic development in Beijing city. The multivariate analysis showed that the predominant source was soil dust, which contributed more than 50% to atmospheric particles. However, it would be risky to conclude from this that aerosol pollution from anthropogenic sources was less important in Beijing city. Owing to the lack of reliable tracers, it was very hard to distinguish coal burning from the soil source. Thus, it was suspected that the soil source above might be a mixture of soil dust and coal burning. The 14C measurement showed that the carbonaceous species of the aerosol had quite different emission sources. For carbonaceous aerosols in Beijing, the contribution of fossil fuel to ambient particles was nearly 2/3; as man-made activities (coal burning, etc.) increased, the fossil part would contribute more to atmospheric carbonaceous particles. For example, in downtown Beijing during space-heating seasons, fossil fuel contributed more than 95% of carbonaceous particles, which would be potentially harmful to the population. By using multivariate analysis together with 14C data, two important sources of aerosols in Beijing (soil and coal combustion) were more reliably distinguished, which is critically important for the assessment of the aerosol problem in China.
NASA Astrophysics Data System (ADS)
Braga, Jez Willian Batista; Trevizan, Lilian Cristina; Nunes, Lidiane Cristina; Rufini, Iolanda Aparecida; Santos, Dário, Jr.; Krug, Francisco José
2010-01-01
The application of laser induced breakdown spectrometry (LIBS) to the direct analysis of plant materials is a great challenge that still needs efforts for its development and validation. To this end, a series of experimental approaches has been carried out in order to show that LIBS can be used as an alternative to methods based on wet acid digestion for the analysis of agricultural and environmental samples. The large amount of information provided by LIBS spectra for these complex samples increases the difficulty of selecting the most appropriate wavelengths for each analyte. Some applications have suggested that improvements in both accuracy and precision can be achieved by applying multivariate calibration to LIBS data when compared to univariate regression developed with line emission intensities. In the present work, the performance of univariate and multivariate calibration, based on partial least squares regression (PLSR), was compared for the analysis of pellets of plant materials made from an appropriate mixture of cryogenically ground samples with cellulose as the binding agent. The development of a specific PLSR model for each analyte and the selection of spectral regions containing only lines of the analyte of interest were the best conditions for the analysis. In this particular application, these models showed a similar performance, but PLSR seemed to be more robust due to a lower occurrence of outliers in comparison to the univariate method. The data suggest that efforts dealing with sample presentation and fitness of standards for LIBS analysis must be made in order to fulfill the boundary conditions for matrix independent development and validation.
Fazeli, Bahare; Ravari, Hassan; Assadi, Reza
2012-08-01
The aim of this study was first to describe the natural history of Buerger's disease (BD) and then to discuss a clinical approach to this disease based on multivariate analysis. One hundred eight patients who corresponded with Shionoya's criteria were selected from 2000 to 2007 for this study. Major amputation was considered the ultimate adverse event. Survival analyses were performed by Kaplan-Meier curves. Independent variables including gender, duration of smoking, number of cigarettes smoked per day, minor amputation events and type of treatments, were determined by multivariate Cox regression analysis. The recorded data demonstrated that BD may present in four forms, including relapsing-remitting (75%), secondary progressive (4.6%), primary progressive (14.2%) and benign BD (6.2%). Most of the amputations occurred due to relapses within the six years after diagnosis of BD. In multivariate analysis, duration of smoking of more than 20 years had a significant relationship with further major amputation among patients with BD. Smoking cessation programs with experienced psychotherapists are strongly recommended for those areas in which Buerger's disease is common. Patients who have smoked for more than 20 years should be encouraged to quit smoking, but should also be recommended for more advanced treatment for limb salvage.
Chromatography methods and chemometrics for determination of milk fat adulterants
NASA Astrophysics Data System (ADS)
Trbović, D.; Petronijević, R.; Đorđević, V.
2017-09-01
Milk and milk-based products are among the leading food categories according to reported cases of food adulteration. Although many authentication problems exist in all areas of the food industry, adequate control methods are required to evaluate the authenticity of milk and milk products in the dairy industry. Moreover, gas chromatography (GC) analysis of triacylglycerols (TAGs) or fatty acid (FA) profiles of milk fat (MF) in combination with multivariate statistical data processing have been used to detect adulterations of milk and dairy products with foreign fats. The adulteration of milk and butter is a major issue for the dairy industry. The major adulterants of MF are vegetable oils (soybean, sunflower, groundnut, coconut, palm and peanut oil) and animal fat (cow tallow and pork lard). Multivariate analysis enables adulterated MF to be distinguished from authentic MF, while taking into account many analytical factors. Various multivariate analysis methods have been proposed to quantitatively detect levels of adulterant non-MFs, with multiple linear regression (MLR) seemingly the most suitable. There is a need for increased use of chemometric data analyses to detect adulterated MF in foods and for their expanded use in routine quality assurance testing.
Ground-Based Telescope Parametric Cost Model
NASA Technical Reports Server (NTRS)
Stahl, H. Philip; Rowell, Ginger Holmes
2004-01-01
A parametric cost model for ground-based telescopes is developed using multi-variable statistical analysis. The model includes both engineering and performance parameters. While diameter continues to be the dominant cost driver, other significant factors include primary mirror radius of curvature and diffraction limited wavelength. The model includes an explicit factor for primary mirror segmentation and/or duplication (i.e., multi-telescope phased-array systems). Additionally, single variable models based on aperture diameter are derived. This analysis indicates that recent mirror technology advances have indeed reduced the historical telescope cost curve.
Preliminary Cost Model for Space Telescopes
NASA Technical Reports Server (NTRS)
Stahl, H. Philip; Prince, F. Andrew; Smart, Christian; Stephens, Kyle; Henrichs, Todd
2009-01-01
Parametric cost models are routinely used to plan missions, compare concepts and justify technology investments. However, great care is required. Some space telescope cost models, such as those based only on mass, lack sufficient detail to support such analysis and may lead to inaccurate conclusions. Similarly, using ground based telescope models which include the dome cost will also lead to inaccurate conclusions. This paper reviews current and historical models. Then, based on data from 22 different NASA space telescopes, this paper tests those models and presents preliminary analysis of single and multi-variable space telescope cost models.
Goudey, Benjamin; Abedini, Mani; Hopper, John L; Inouye, Michael; Makalic, Enes; Schmidt, Daniel F; Wagner, John; Zhou, Zeyu; Zobel, Justin; Reumann, Matthias
2015-01-01
Genome-wide association studies (GWAS) are a common approach for systematic discovery of single nucleotide polymorphisms (SNPs) which are associated with a given disease. Univariate analysis approaches commonly employed may miss important SNP associations that only appear through multivariate analysis in complex diseases. However, multivariate SNP analysis is currently limited by its inherent computational complexity. In this work, we present a computational framework that harnesses supercomputers. Based on our results, we estimate that a three-way interaction analysis on 1.1 million SNP GWAS data would require over 5.8 years on the full "Avoca" IBM Blue Gene/Q installation at the Victorian Life Sciences Computation Initiative. This is hundreds of times faster than estimates for other CPU based methods and four times faster than runtimes estimated for GPU methods, indicating how the improvement in the level of hardware applied to interaction analysis may alter the types of analysis that can be performed. Furthermore, the same analysis would take under 3 months on the currently largest IBM Blue Gene/Q supercomputer "Sequoia" at the Lawrence Livermore National Laboratory, assuming linear scaling is maintained as our results suggest. Given that the implementation used in this study can be further optimised, this runtime means it is becoming feasible to carry out exhaustive analysis of higher order interaction studies on large modern GWAS.
Multivariate Analysis and Machine Learning in Cerebral Palsy Research
Zhang, Jing
2017-01-01
Cerebral palsy (CP), a common pediatric movement disorder, causes the most severe physical disability in children. Early diagnosis in high-risk infants is critical for early intervention and possible early recovery. In recent years, multivariate analytic and machine learning (ML) approaches have been increasingly used in CP research. This paper aims to identify such multivariate studies and provide an overview of this relatively young field. Studies reviewed in this paper have demonstrated that multivariate analytic methods are useful in identification of risk factors, detection of CP, movement assessment for CP prediction, and outcome assessment, and ML approaches have made it possible to automatically identify movement impairments in high-risk infants. In addition, outcome predictors for surgical treatments have been identified by multivariate outcome studies. To make the multivariate and ML approaches useful in clinical settings, further research with large samples is needed to verify and improve these multivariate methods in risk factor identification, CP detection, movement assessment, and outcome evaluation or prediction. As multivariate analysis, ML and data processing technologies advance in the era of Big Data of this century, it is expected that multivariate analysis and ML will play a bigger role in improving the diagnosis and treatment of CP to reduce mortality and morbidity rates, and enhance patient care for children with CP. PMID:29312134
Multivariate Analysis and Prediction of Dioxin-Furan ...
Peer Review Draft of Regional Methods Initiative Final Report Dioxins, which are bioaccumulative and environmentally persistent, pose an ongoing risk to human and ecosystem health. Fish constitute a significant source of dioxin exposure for humans and fish-eating wildlife. Current dioxin analytical methods are costly, time-consuming, and produce hazardous by-products. A Danish team developed a novel, multivariate statistical methodology based on the covariance of dioxin-furan congener Toxic Equivalences (TEQs) and fatty acid methyl esters (FAMEs) and applied it to North Atlantic Ocean fishmeal samples. The goal of the current study was to attempt to extend this Danish methodology to 77 whole and composite fish samples from three trophic groups: predator (whole largemouth bass), benthic (whole flathead and channel catfish) and forage fish (composite bluegill, pumpkinseed and green sunfish) from two dioxin contaminated rivers (Pocatalico R. and Kanawha R.) in West Virginia, USA. Multivariate statistical analyses, including Principal Components Analysis (PCA), Hierarchical Clustering, and Partial Least Squares Regression (PLS), were used to assess the relationship between the FAMEs and TEQs in these dioxin contaminated freshwater fish from the Kanawha and Pocatalico Rivers. These three multivariate statistical methods all confirm that the pattern of Fatty Acid Methyl Esters (FAMEs) in these freshwater fish covaries with and is predictive of the WHO TE
Williams, L. Keoki; Buu, Anne
2017-01-01
We propose a multivariate genome-wide association test for mixed continuous, binary, and ordinal phenotypes. A latent response model is used to estimate the correlation between phenotypes with different measurement scales so that the empirical distribution of the Fisher’s combination statistic under the null hypothesis is estimated efficiently. The simulation study shows that our proposed correlation estimation methods have high levels of accuracy. More importantly, our approach conservatively estimates the variance of the test statistic so that the type I error rate is controlled. The simulation also shows that the proposed test maintains the power at the level very close to that of the ideal analysis based on known latent phenotypes while controlling the type I error. In contrast, conventional approaches–dichotomizing all observed phenotypes or treating them as continuous variables–could either reduce the power or employ a linear regression model unfit for the data. Furthermore, the statistical analysis on the database of the Study of Addiction: Genetics and Environment (SAGE) demonstrates that conducting a multivariate test on multiple phenotypes can increase the power of identifying markers that may not be, otherwise, chosen using marginal tests. The proposed method also offers a new approach to analyzing the Fagerström Test for Nicotine Dependence as multivariate phenotypes in genome-wide association studies. PMID:28081206
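The abstract builds on Fisher's combination statistic. A minimal sketch of the classical version is given below, under the assumption that the per-phenotype p-values are independent; in that case the statistic −2 Σ ln p follows a chi-square distribution with 2k degrees of freedom, whose upper tail has a closed form for even degrees of freedom. The abstract's point is precisely that phenotypes are correlated, so the paper estimates the null distribution empirically instead; this block does not reproduce that step, and the function name is hypothetical.

```python
import math


def fisher_combination(p_values):
    """Fisher's statistic and its p-value, assuming independent p-values."""
    k = len(p_values)
    if k == 0 or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    stat = -2.0 * sum(math.log(p) for p in p_values)
    # Chi-square survival function with 2k degrees of freedom:
    # exp(-x/2) * sum_{j=0}^{k-1} (x/2)^j / j!
    half = stat / 2.0
    term = 1.0
    tail = 1.0
    for j in range(1, k):
        term *= half / j
        tail += term
    return stat, math.exp(-half) * tail
```

With a single p-value the combined p-value equals the input, which is a quick sanity check on the closed-form tail.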
NASA Astrophysics Data System (ADS)
Padilla-Jiménez, Amira C.; Ortiz-Rivera, William; Rios-Velazquez, Carlos; Vazquez-Ayala, Iris; Hernández-Rivera, Samuel P.
2014-06-01
Investigations focused on devising rapid and accurate methods for developing signatures of microorganisms that could be used as biological warfare agents, for the purposes of detection, identification, and discrimination, have recently increased significantly. Quantum cascade laser (QCL)-based spectroscopic systems have revolutionized many areas of defense and security, including this area of research. In this contribution, infrared spectroscopy detection based on QCL was used to obtain the mid-infrared (MIR) spectral signatures of Bacillus thuringiensis, Escherichia coli, and Staphylococcus epidermidis. These bacteria were used as microorganisms that closely simulate biothreats (biosimulants). The experiments were conducted in reflection mode with biosimulants deposited on various substrates including cardboard, glass, travel bags, wood, and stainless steel. Chemometrics multivariate statistical routines, such as principal component analysis regression and partial least squares coupled to discriminant analysis, were used to analyze the MIR spectra. Overall, the investigated infrared vibrational techniques were useful for detecting target microorganisms on the studied substrates, and the multivariate data analysis techniques proved to be very efficient for classifying the bacteria and discriminating them in the presence of highly IR-interfering media.
Huang, Jun; Kaul, Goldi; Cai, Chunsheng; Chatlapalli, Ramarao; Hernandez-Abad, Pedro; Ghosh, Krishnendu; Nagi, Arwinder
2009-12-01
To facilitate an in-depth process understanding, and offer opportunities for developing control strategies to ensure product quality, a combination of experimental design, optimization and multivariate techniques was integrated into the process development of a drug product. A process DOE was used to evaluate effects of the design factors on manufacturability and final product CQAs, and establish design space to ensure desired CQAs. Two types of analyses were performed to extract maximal information: DOE effect and response surface analysis, and multivariate analysis (PCA and PLS). The DOE effect analysis was used to evaluate the interactions and effects of three design factors (water amount, wet massing time and lubrication time) on response variables (blend flow, compressibility and tablet dissolution). The design space was established by the combined use of DOE, optimization and multivariate analysis to ensure desired CQAs. Multivariate analysis of all variables from the DOE batches was conducted to study relationships between the variables and to evaluate the impact of material attributes/process parameters on manufacturability and final product CQAs. The integrated multivariate approach exemplifies application of QbD principles and tools to drug product and process development.
Application of multivariate statistical techniques in microbial ecology
Paliy, O.; Shankar, V.
2016-01-01
Recent advances in high-throughput methods of molecular analyses have led to an explosion of studies generating large scale ecological datasets. The effect has been especially noticeable in the field of microbial ecology, where new experimental approaches have provided in-depth assessments of the composition, functions, and dynamic changes of complex microbial communities. Because even a single high-throughput experiment produces large amounts of data, powerful statistical techniques of multivariate analysis are well suited to analyze and interpret these datasets. Many different multivariate techniques are available, and often it is not clear which method should be applied to a particular dataset. In this review we describe and compare the most widely used multivariate statistical techniques including exploratory, interpretive, and discriminatory procedures. We consider several important limitations and assumptions of these methods, and we present examples of how these approaches have been utilized in recent studies to provide insight into the ecology of the microbial world. Finally, we offer suggestions for the selection of appropriate methods based on the research question and dataset structure. PMID:26786791
Practical robustness measures in multivariable control system analysis. Ph.D. Thesis
NASA Technical Reports Server (NTRS)
Lehtomaki, N. A.
1981-01-01
The robustness of the stability of multivariable linear time invariant feedback control systems with respect to model uncertainty is considered using frequency domain criteria. Available robustness tests are unified under a common framework based on the nature and structure of model errors. These results are derived using a multivariable version of Nyquist's stability theorem in which the minimum singular value of the return difference transfer matrix is shown to be the multivariable generalization of the distance to the critical point on a single input, single output Nyquist diagram. Using the return difference transfer matrix, a very general robustness theorem is presented from which all of the robustness tests dealing with specific model errors may be derived. The robustness tests that explicitly utilize model error structure are able to guarantee feedback system stability in the face of model errors of larger magnitude than those robustness tests that do not. The robustness of linear quadratic Gaussian control systems is analyzed.
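The quantity named in this abstract, the minimum singular value of the return difference matrix I + L(jω), can be computed numerically over a frequency grid. The following is a minimal sketch, not the thesis's own procedure: the 2x2 loop transfer matrix `example_loop`, the gain matrix `K`, and the frequency grid are hypothetical choices made only for illustration.

```python
import numpy as np

def min_return_difference_sv(loop_tf, omegas):
    """Return sigma_min(I + L(jw)) for each frequency w in omegas.

    loop_tf is a callable mapping a complex frequency s to the square
    loop transfer matrix L(s) as a NumPy array.
    """
    margins = []
    for w in omegas:
        L = np.asarray(loop_tf(1j * w), dtype=complex)
        return_difference = np.eye(L.shape[0]) + L
        # Singular values come back sorted in descending order.
        sv = np.linalg.svd(return_difference, compute_uv=False)
        margins.append(sv[-1])
    return np.array(margins)

# Hypothetical 2x2 loop: a constant gain matrix K behind a first-order lag.
K = np.array([[2.0, 0.5],
              [0.0, 1.0]])

def example_loop(s):
    return K / (s + 1.0)

omegas = np.logspace(-2, 2, 200)
sigma_min = min_return_difference_sv(example_loop, omegas)
worst = sigma_min.argmin()
print(f"smallest sigma_min = {sigma_min[worst]:.3f} at w = {omegas[worst]:.3g} rad/s")
```

For a single-input, single-output loop the same function returns |1 + L(jω)|, the distance from the Nyquist curve to the critical point -1, which is the scalar case the abstract says the singular value generalizes.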
Mehta, Kedar G; Baxi, Rajendra; Chavda, Parag; Patel, Sangita; Mazumdar, Vihang
2016-01-01
As more and more people with human immunodeficiency virus (HIV) live longer and healthier lives because of antiretroviral therapy (ART), an increasing number of sexual transmissions of HIV may arise from these people living with HIV/AIDS (PLWHA). Hence, this study was conducted to assess the predictors of unsafe sexual behavior among PLWHA on ART in Western India. The current cross-sectional study was carried out among 175 PLWHA attending the ART center of a tertiary care hospital in Western India. Unsafe sex was defined as inconsistent and/or incorrect condom use. A total of 39 variables from four domains, viz. sociodemographic, relationship-related, medical and psycho-social factors, were studied for their relationship to unsafe sexual behavior. The variables found to be significantly associated with unsafe sex practices in bivariate analysis were explored by multivariate analysis using multiple logistic regression in SPSS version 17.0. Fifty-eight percent of PLWHA were practicing unsafe sex. Fifteen of the 39 variables showed significant association in bivariate analysis. Finally, 11 of them showed significant association in multivariate analysis. Young age group, illiteracy, lack of counseling, misbeliefs about condom use, nondisclosure to spouse and lack of partner communication were the major factors found to be independently associated with unsafe sex in multivariate analysis. Appropriate interventions like need-based counseling are required to address risk factors associated with unsafe sex.
Chemotaxonomy of Hawaiian Anthurium cultivars based on multivariate analysis of phenolic metabolites
USDA-ARS's Scientific Manuscript database
Thirty-six anthurium spathes, sampled from species and commercial cultivars, were extracted and profiled using liquid-chromatography-mass spectrometry (LC-MS). 315 compounds, including anthocyanins, flavonoid glycosides, and flavanols, were detected from these extracts and used in chemotaxonomic ana...
Dangers in Using Analysis of Covariance Procedures.
ERIC Educational Resources Information Center
Campbell, Kathleen T.
Problems associated with the use of analysis of covariance (ANCOVA) as a statistical control technique are explained. Three problems relate to the use of "OVA" methods (analysis of variance, analysis of covariance, multivariate analysis of variance, and multivariate analysis of covariance) in general. These are: (1) the wasting of information when…
Brito Lopes, Fernando; da Silva, Marcelo Corrêa; Magnabosco, Cláudio Ulhôa; Goncalves Narciso, Marcelo; Sainz, Roberto Daniel
2016-01-01
This research evaluated a multivariate approach as an alternative tool for the purpose of selection regarding expected progeny differences (EPDs). Data were fitted using a multi-trait model and consisted of growth traits (birth weight and weights at 120, 210, 365 and 450 days of age) and carcass traits (longissimus muscle area (LMA), back-fat thickness (BF), and rump fat thickness (RF)), registered over 21 years in extensive breeding systems of Polled Nellore cattle in Brazil. Multivariate analyses were performed using standardized (zero mean and unit variance) EPDs. The k mean method revealed that the best fit of data occurred using three clusters (k = 3) (P < 0.001). Estimates of genetic correlation among growth and carcass traits and the estimates of heritability were moderate to high, suggesting that a correlated response approach is suitable for practical decision making. Estimates of correlation between selection indices and the multivariate index (LD1) were moderate to high, ranging from 0.48 to 0.97. This reveals that both types of indices give similar results and that the multivariate approach is reliable for the purpose of selection. The alternative tool seems very handy when economic weights are not available or in cases where more rapid identification of the best animals is desired. Interestingly, multivariate analysis allowed forecasting information based on the relationships among breeding values (EPDs). Also, it enabled fine discrimination, rapid data summarization after genetic evaluation, and permitted accounting for maternal ability and the genetic direct potential of the animals. In addition, we recommend the use of longissimus muscle area and subcutaneous fat thickness as selection criteria, to allow estimation of breeding values before the first mating season in order to accelerate the response to individual selection. PMID:26789008
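The two preprocessing and clustering steps named in this abstract, standardizing the EPDs to zero mean and unit variance and then running k-means with k = 3, can be sketched as follows. This is an illustrative sketch only: the data are synthetic stand-ins for an EPD table, the farthest-point seeding is an assumption chosen to make the run deterministic, and k is fixed at 3 rather than selected by a fit criterion as in the study.

```python
import numpy as np

def standardize(X):
    """Scale each column to zero mean and unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def kmeans(X, k, n_iter=100):
    """Plain Lloyd's algorithm with deterministic farthest-point seeding."""
    centers = [X[0]]
    for _ in range(1, k):
        # Next seed: the point farthest from all seeds chosen so far.
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Recompute each center; keep the old one if a cluster is empty.
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Synthetic stand-in for a table of EPDs: 3 groups of 30 animals, 4 traits.
rng = np.random.default_rng(0)
group_means = np.array([[0, 0, 0, 0], [10, 10, 0, 0], [0, 10, 10, 10]], dtype=float)
epds = np.vstack([m + rng.normal(size=(30, 4)) for m in group_means])

Z = standardize(epds)
labels, centers = kmeans(Z, k=3)
print(np.bincount(labels))
```

Standardizing first matters because growth and carcass traits are on different scales; without it, the trait with the largest variance would dominate the Euclidean distances used by k-means.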
Barigye, Stephen J; Freitas, Matheus P; Ausina, Priscila; Zancan, Patricia; Sola-Penna, Mauro; Castillo-Garit, Juan A
2018-02-12
We recently generalized the formerly alignment-dependent multivariate image analysis applied to quantitative structure-activity relationships (MIA-QSAR) method through the application of the discrete Fourier transform (DFT), allowing for its application to noncongruent and structurally diverse chemical compound data sets. Here we report the first practical application of this method in the screening of molecular entities of therapeutic interest, with human aromatase inhibitory activity as the case study. We developed an ensemble classification model based on the two-dimensional (2D) DFT MIA-QSAR descriptors, with which we screened the NCI Diversity Set V (1593 compounds) and obtained 34 chemical compounds with possible aromatase inhibitory activity. These compounds were docked into the aromatase active site, and the 10 most promising compounds were selected for in vitro experimental validation. Of these compounds, 7419 (nonsteroidal) and 89 201 (steroidal) demonstrated satisfactory antiproliferative and aromatase inhibitory activities. The obtained results suggest that the 2D-DFT MIA-QSAR method may be useful in ligand-based virtual screening of new molecular entities of therapeutic utility.
Quantum Attack-Resistent Certificateless Multi-Receiver Signcryption Scheme
Li, Huixian; Chen, Xubao; Pang, Liaojun; Shi, Weisong
2013-01-01
The existing certificateless signcryption schemes were designed mainly based on the traditional public key cryptography, in which the security relies on the hard problems, such as factor decomposition and discrete logarithm. However, these problems will be easily solved by the quantum computing. So the existing certificateless signcryption schemes are vulnerable to the quantum attack. Multivariate public key cryptography (MPKC), which can resist the quantum attack, is one of the alternative solutions to guarantee the security of communications in the post-quantum age. Motivated by these concerns, we proposed a new construction of the certificateless multi-receiver signcryption scheme (CLMSC) based on MPKC. The new scheme inherits the security of MPKC, which can withstand the quantum attack. Multivariate quadratic polynomial operations, which have lower computation complexity than bilinear pairing operations, are employed in signcrypting a message for a certain number of receivers in our scheme. Security analysis shows that our scheme is a secure MPKC-based scheme. We proved its security under the hardness of the Multivariate Quadratic (MQ) problem and its unforgeability under the Isomorphism of Polynomials (IP) assumption in the random oracle model. The analysis results show that our scheme also has the security properties of non-repudiation, perfect forward secrecy, perfect backward secrecy and public verifiability. Compared with the existing schemes in terms of computation complexity and ciphertext length, our scheme is more efficient, which makes it suitable for terminals with low computation capacity like smart cards. PMID:23967037
Peikert, Tobias; Duan, Fenghai; Rajagopalan, Srinivasan; Karwoski, Ronald A; Clay, Ryan; Robb, Richard A; Qin, Ziling; Sicks, JoRean; Bartholmai, Brian J; Maldonado, Fabien
2018-01-01
Optimization of the clinical management of screen-detected lung nodules is needed to avoid unnecessary diagnostic interventions. Herein we demonstrate the potential value of a novel radiomics-based approach for the classification of screen-detected indeterminate nodules. Independent quantitative variables assessing various radiologic nodule features such as sphericity, flatness, elongation, spiculation, lobulation and curvature were developed from the NLST dataset using 726 indeterminate nodules (all ≥ 7 mm, benign, n = 318 and malignant, n = 408). Multivariate analysis was performed using the least absolute shrinkage and selection operator (LASSO) method for variable selection and regularization in order to enhance the prediction accuracy and interpretability of the multivariate model. The bootstrapping method was then applied for the internal validation and the optimism-corrected AUC was reported for the final model. Eight of the originally considered 57 quantitative radiologic features were selected by LASSO multivariate modeling. These 8 features include variables capturing Location: vertical location (Offset carina centroid z), Size: volume estimate (Minimum enclosing brick), Shape: flatness, Density: texture analysis (Score Indicative of Lesion/Lung Aggression/Abnormality (SILA) texture), and surface characteristics: surface complexity (Maximum shape index and Average shape index), and estimates of surface curvature (Average positive mean curvature and Minimum mean curvature), all with P<0.01. The optimism-corrected AUC for these 8 features is 0.939. Our novel radiomic LDCT-based approach for indeterminate screen-detected nodule characterization appears extremely promising; however, independent external validation is needed.
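The internal validation step described here, a bootstrap estimate of optimism subtracted from the apparent AUC, can be sketched in a few lines. This is a hedged illustration under stated assumptions: an ordinary least-squares linear scorer (`fit_linear_scorer`, a hypothetical helper) stands in for the LASSO model, the data are synthetic, and the number of resamples is arbitrary.

```python
import numpy as np

def auc(y, score):
    """Area under the ROC curve as the fraction of correctly ordered
    (positive, negative) pairs, counting ties as one half."""
    y = np.asarray(y)
    score = np.asarray(score, dtype=float)
    pos, neg = score[y == 1], score[y == 0]
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)

def fit_linear_scorer(X, y):
    """Least-squares linear score; a stand-in for the LASSO model."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y.astype(float), rcond=None)
    return lambda Xnew: np.column_stack([np.ones(len(Xnew)), Xnew]) @ coef

def optimism_corrected_auc(X, y, n_boot=200, seed=0):
    rng = np.random.default_rng(seed)
    apparent = auc(y, fit_linear_scorer(X, y)(X))
    optimism = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(y), size=len(y))
        if y[idx].min() == y[idx].max():
            continue  # resample holds a single class; AUC undefined
        model = fit_linear_scorer(X[idx], y[idx])
        # Optimism: performance on the resample minus performance on
        # the original data, for the model fitted on the resample.
        optimism.append(auc(y[idx], model(X[idx])) - auc(y, model(X)))
    return apparent, apparent - float(np.mean(optimism))

# Synthetic data: 120 cases, 8 features, only the first 2 informative.
rng = np.random.default_rng(1)
y = rng.integers(0, 2, size=120)
X = rng.normal(size=(120, 8))
X[:, :2] += y[:, None] * 1.0
apparent, corrected = optimism_corrected_auc(X, y)
print(f"apparent AUC = {apparent:.3f}, optimism-corrected AUC = {corrected:.3f}")
```

In a faithful reproduction, the variable selection itself would be repeated inside every bootstrap resample, so that the optimism estimate also accounts for the selection step.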
Jerlström, Tomas; Gårdmark, Truls; Carringer, Malcolm; Holmäng, Sten; Liedberg, Fredrik; Hosseini, Abolfazl; Malmström, Per-Uno; Ljungberg, Börje; Hagberg, Oskar; Jahnson, Staffan
2014-08-01
Cystectomy combined with pelvic lymph-node dissection and urinary diversion entails high morbidity and mortality. Improvements are needed, and a first step is to collect information on the current situation. In 2011, this group took the initiative to start a population-based database in Sweden (population 9.5 million in 2011) with prospective registration of patients and complications until 90 days after cystectomy. This article reports findings from the first year of registration. Participation was voluntary, and data were reported by local urologists or research nurses. Perioperative parameters and early complications classified according to the modified Clavien system were registered, and selected variables of possible importance for complications were analysed by univariate and multivariate logistic regression. During 2011, 285 (65%) of 435 cystectomies performed in Sweden were registered in the database, the majority reported by the seven academic centres. Median blood loss was 1000 ml, operating time 318 min, and length of hospital stay 15 days. Any complications were registered for 103 patients (36%). Clavien grades 1-2 and 3-5 were noted in 19% and 15%, respectively. Thirty-seven patients (13%) were reoperated on at least once. In logistic regression analysis elevated risk of complications was significantly associated with operating time exceeding 318 min in both univariate and multivariate analysis, and with age 76-89 years only in multivariate analysis. It was feasible to start a national population-based registry of radical cystectomies for bladder cancer. The evaluation of the first year shows an increased risk of complications in patients with longer operating time and higher age. The results agree with some previously published series but should be interpreted with caution considering the relatively low coverage, which is expected to be higher in the future.
NASA Astrophysics Data System (ADS)
Tsao, Sinchai; Gajawelli, Niharika; Zhou, Jiayu; Shi, Jie; Ye, Jieping; Wang, Yalin; Lepore, Natasha
2014-03-01
Prediction of Alzheimer's disease (AD) progression based on baseline measures allows us to understand disease progression and has implications in decisions concerning treatment strategy. To this end we combine a predictive multi-task machine learning method [1] with a novel MR-based multivariate morphometric surface map of the hippocampus [2] to predict future cognitive scores of patients. Previous work by Zhou et al. [1] has shown that a multi-task learning framework that performs prediction of all future time points (or tasks) simultaneously can be used to encode both sparsity as well as temporal smoothness. They showed that this can be used in predicting cognitive outcomes of Alzheimer's Disease Neuroimaging Initiative (ADNI) subjects based on FreeSurfer-based baseline MRI features, MMSE score, demographic information and ApoE status. Whilst volumetric information may hold generalized information on brain status, we hypothesized that hippocampus specific information may be more useful in predictive modeling of AD. To this end, we applied the recently developed multivariate tensor-based morphometry (mTBM) parametric surface analysis method of Shi et al. [2] to extract features from the hippocampal surface. We show that by combining the power of the multi-task framework with the sensitivity of mTBM features of the hippocampus surface, we are able to significantly improve predictive performance of ADAS cognitive scores 6, 12, 24, 36 and 48 months from baseline.
NASA Astrophysics Data System (ADS)
Slezak, Thomas Joseph; Radebaugh, Jani; Christiansen, Eric
2017-10-01
The shapes of craterform morphology on planetary surfaces provide rich information about their origins and evolution. While morphologic information provides visual clues to geologic processes and properties, communicating this information quantitatively is less easily accomplished. This study examines the morphology of craterforms using the quantitative outline-based shape methods of geometric morphometrics, commonly used in biology and paleontology. We examine and compare landforms on planetary surfaces using shape, a property of morphology that is invariant to translation, rotation, and size. We quantify the shapes of paterae on Io, martian calderas, terrestrial basaltic shield calderas, terrestrial ash-flow calderas, and lunar impact craters using elliptic Fourier analysis (EFA) and the Zahn and Roskies (Z-R) shape function, or tangent angle approach, to produce multivariate shape descriptors. These shape descriptors are subjected to multivariate statistical analysis including canonical variate analysis (CVA), a multiple-comparison variant of discriminant analysis, to investigate the link between craterform shape and classification. Paterae on Io are most similar in shape to terrestrial ash-flow calderas and the shapes of terrestrial basaltic shield volcanoes are most similar to martian calderas. The shapes of lunar impact craters, including simple, transitional, and complex morphology, are classified with a 100% rate of success in all models. Multiple CVA models effectively predict and classify different craterforms using shape-based identification and demonstrate significant potential for use in the analysis of planetary surfaces.
Drop coating deposition Raman spectroscopy of blood plasma for the detection of colorectal cancer
NASA Astrophysics Data System (ADS)
Li, Pengpeng; Chen, Changshui; Deng, Xiaoyuan; Mao, Hua; Jin, Shaoqin
2015-03-01
We have recently applied the technique of drop coating deposition Raman (DCDR) spectroscopy for colorectal cancer (CRC) detection using blood plasma. The aim of this study was to develop a more convenient and stable method based on blood plasma for noninvasive CRC detection. Significant differences are observed in DCDR spectra between healthy (n=105) and cancer (n=75) plasma from 15 CRC patients and 21 volunteers, particularly in the spectra that are related to proteins, nucleic acids, and β-carotene. Multivariate analysis, namely principal components analysis and linear discriminant analysis together with leave-one-out cross-validation, was applied to the DCDR spectra and yielded a sensitivity of 100% (75/75) and specificity of 98.1% (103/105) for detection of CRC. This study demonstrates that DCDR spectroscopy of blood plasma associated with multivariate statistical algorithms has the potential for the noninvasive detection of CRC.
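The analysis chain in this abstract (principal components analysis, then linear discriminant analysis, scored by leave-one-out cross-validation) can be sketched as below. This is an assumption-laden illustration rather than the authors' pipeline: the "spectra" are synthetic, the number of retained components (3) and the equal-prior decision threshold are arbitrary choices, and no spectral preprocessing is modeled.

```python
import numpy as np

def pca_lda_predict(X_train, y_train, x_test, n_components=3):
    """Fit PCA then two-class LDA on the training spectra and classify x_test."""
    mean = X_train.mean(axis=0)
    # Principal axes are the right singular vectors of the centred data.
    _, _, vt = np.linalg.svd(X_train - mean, full_matrices=False)
    axes = vt[:n_components].T
    scores = (X_train - mean) @ axes
    test_score = (x_test - mean) @ axes

    s0, s1 = scores[y_train == 0], scores[y_train == 1]
    m0, m1 = s0.mean(axis=0), s1.mean(axis=0)
    # Pooled within-class covariance of the PCA scores.
    pooled = ((s0 - m0).T @ (s0 - m0) + (s1 - m1).T @ (s1 - m1)) / (len(scores) - 2)
    w = np.linalg.solve(pooled, m1 - m0)
    threshold = w @ (m0 + m1) / 2.0  # equal priors assumed
    return int(test_score @ w > threshold)

def leave_one_out(X, y, n_components=3):
    """Classify each sample with a model fitted on all the others."""
    preds = np.empty(len(y), dtype=int)
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        preds[i] = pca_lda_predict(X[keep], y[keep], X[i], n_components)
    return preds

def sensitivity_specificity(y, preds):
    tp = np.sum((y == 1) & (preds == 1))
    tn = np.sum((y == 0) & (preds == 0))
    return tp / np.sum(y == 1), tn / np.sum(y == 0)

# Synthetic "spectra": 20 healthy (0) and 20 cancer (1) samples, 50 channels;
# the cancer class is shifted in the first five channels.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 20)
X = rng.normal(size=(40, 50))
X[y == 1, :5] += 3.0

preds = leave_one_out(X, y)
sens, spec = sensitivity_specificity(y, preds)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Refitting the PCA inside each fold, as done here, keeps the held-out spectrum from influencing the projection. The reported figures follow the same definitions: 75/75 gives 100% sensitivity and 103/105 gives 98.1% specificity.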
Darwish, Hany W; Bakheit, Ahmed H; Abdelhameed, Ali S
2016-03-01
Simultaneous spectrophotometric analysis of a multi-component dosage form of olmesartan, amlodipine and hydrochlorothiazide used for the treatment of hypertension has been carried out using various chemometric methods. Multivariate calibration methods include classical least squares (CLS) executed by net analyte processing (NAP-CLS), orthogonal signal correction (OSC-CLS) and direct orthogonal signal correction (DOSC-CLS) in addition to multivariate curve resolution-alternating least squares (MCR-ALS). Results demonstrated the efficiency of the proposed methods as quantitative tools of analysis as well as their qualitative capability. The three analytes were determined precisely using the aforementioned methods in an external data set and in a dosage form after optimization of experimental conditions. Finally, the efficiency of the models was validated via comparison with the partial least squares (PLS) method in terms of accuracy and precision.
NASA Astrophysics Data System (ADS)
Haq, Quazi M. I.; Mabood, Fazal; Naureen, Zakira; Al-Harrasi, Ahmed; Gilani, Sayed A.; Hussain, Javid; Jabeen, Farah; Khan, Ajmal; Al-Sabari, Ruqaya S. M.; Al-khanbashi, Fatema H. S.; Al-Fahdi, Amira A. M.; Al-Zaabi, Ahoud K. A.; Al-Shuraiqi, Fatma A. M.; Al-Bahaisi, Iman M.
2018-06-01
Nucleic acid and serology based methods have revolutionized plant disease detection; however, they are not very reliable at the asymptomatic stage, especially in the case of pathogens with systemic infection. In addition, they need at least 1-2 days for sample harvesting, processing, and analysis. In this study, two reflectance spectroscopies, i.e. Near Infrared reflectance spectroscopy (NIR) and Fourier-Transform-Infrared spectroscopy with Attenuated Total Reflection (FT-IR, ATR), coupled with multivariate exploratory methods such as Principal Component Analysis (PCA) and Partial Least Squares Discriminant Analysis (PLS-DA), have been deployed to detect begomovirus infection in papaya leaves. The application of those techniques demonstrates that they are very useful for robust in vivo detection of plant begomovirus infection. These methods are simple, sensitive, reproducible, precise, and do not require any lengthy sample preparation procedures.
NASA Astrophysics Data System (ADS)
Sheykhizadeh, Saheleh; Naseri, Abdolhossein
2018-04-01
Variable selection plays a key role in classification and multivariate calibration. Variable selection methods aim to choose, from a large pool of available predictors, a set of variables relevant to estimating analyte concentrations or to achieving better classification results. Many variable selection techniques have now been introduced; among these, methods based on swarm intelligence optimization have attracted particular attention over the last few decades, since they are mainly inspired by nature. In this work, a simple and new variable selection algorithm is proposed according to the invasive weed optimization (IWO) concept. IWO is considered a bio-inspired metaheuristic mimicking the ecological behavior of weeds in colonizing and finding an appropriate place for growth and reproduction; it has been shown to be powerful and highly adaptive to environmental changes. In this paper, the first application of IWO, as a very simple and powerful method, to variable selection is reported using different experimental datasets including FTIR and NIR data, so as to undertake classification and multivariate calibration tasks. Accordingly, invasive weed optimization-linear discriminant analysis (IWO-LDA) and invasive weed optimization-partial least squares (IWO-PLS) are introduced for multivariate classification and calibration, respectively.
NASA Astrophysics Data System (ADS)
Daftedar Abdelhadi, Raghda Mohamed
Although the Next Generation Science Standards (NGSS) present a detailed set of Science and Engineering Practices, a finer grained representation of the underlying skills is lacking in the standards document. Therefore, it has been reported that teachers are facing challenges deciphering and effectively implementing the standards, especially with regards to the Practices. This analytical study assessed the development of high school chemistry students' (N = 41) inquiry, multivariable causal reasoning skills, and metacognition as a mediator for their development. Inquiry tasks based on concepts of element properties of the periodic table as well as reaction kinetics required students to conduct controlled thought experiments, make inferences, and declare predictions of the level of the outcome variable by coordinating the effects of multiple variables. An embedded mixed methods design was utilized for depth and breadth of understanding. Various sources of data were collected including students' written artifacts, audio recordings of in-depth observational groups and interviews. Data analysis was informed by a conceptual framework formulated around the concepts of coordinating theory and evidence, metacognition, and mental models of multivariable causal reasoning. Results of the study indicated positive change towards conducting controlled experimentation, making valid inferences and justifications. Additionally, significant positive correlation between metastrategic and metacognitive competencies, and sophistication of experimental strategies, signified the central role metacognition played. Finally, lack of consistency in indicating effective variables during the multivariable prediction task pointed towards the fragile mental models of multivariable causal reasoning the students had. Implications for teacher education, science education policy as well as classroom research methods are discussed. Finally, recommendations for developing reform-based chemistry curricula based on the Practices are presented.
Multivariate Analysis of Income Inequality: Data from 32 Nations.
ERIC Educational Resources Information Center
Stack, Steven
To analyze income inequality in 32 nations, the research tested hypotheses based upon eight socioeconomic variables. The first seven variables, often tested in income research, were: political participation, industrial development, population growth, educational level, inflation rate, economic growth, and technological complexity. The eighth…
Li, Yan; Zhang, Ji; Zhao, Yanli; Liu, Honggao; Wang, Yuanzhong; Jin, Hang
2016-01-01
In this study the geographical differentiation of dried sclerotia of the medicinal mushroom Wolfiporia extensa, obtained from different regions in Yunnan Province, China, was explored using Fourier-transform infrared (FT-IR) spectroscopy coupled with multivariate data analysis. The FT-IR spectra of 97 samples were obtained for wave numbers ranging from 4000 to 400 cm⁻¹. Then, the fingerprint region of 1800-600 cm⁻¹ of the FT-IR spectrum, rather than the full spectrum, was analyzed. Different pretreatments were applied on the spectra, and a discriminant analysis model based on the Mahalanobis distance was developed to select an optimal pretreatment combination. Two unsupervised pattern recognition procedures, principal component analysis and hierarchical cluster analysis, were applied to enhance the authenticity of discrimination of the specimens. The results showed that excellent classification could be obtained after optimizing spectral pretreatment. The tested samples were successfully discriminated according to their geographical locations. The chemical properties of dried sclerotia of W. extensa were clearly dependent on the mushroom's geographical origins. Furthermore, an interesting finding implied that the elevations of collection areas may have effects on the chemical components of wild W. extensa sclerotia. Overall, this study highlights the feasibility of FT-IR spectroscopy combined with multivariate data analysis in particular for exploring the distinction of different regional W. extensa sclerotia samples. This research could also serve as a basis for the exploitation and utilization of medicinal mushrooms.
ERIC Educational Resources Information Center
Chang, Chi-Cheng; Chou, Pao-Nan; Liang, Chaoyan
2018-01-01
The purpose of the present study was to examine the effects of the ePortfolio-based learning approach (ePBLA) on knowledge sharing and creation with 92 college students majoring in electrical engineering as the participants. Multivariate analysis of covariance (MANCOVA) with a covariance of pretest on knowledge sharing and creation was conducted…
NASA Technical Reports Server (NTRS)
Achtemeier, Gary L.; Kidder, Stanley Q.; Scott, Robert W.
1988-01-01
The variational multivariate assimilation method described in a companion paper by Achtemeier and Ochs is applied to conventional and conventional plus satellite data. Ground-based and space-based meteorological data are weighted according to the respective measurement errors and blended into a data set that is a solution of numerical forms of the two nonlinear horizontal momentum equations, the hydrostatic equation, and an integrated continuity equation for a dry atmosphere. The analyses serve first, to evaluate the accuracy of the model, and second to contrast the analyses with and without satellite data. Evaluation criteria measure the extent to which: (1) the assimilated fields satisfy the dynamical constraints, (2) the assimilated fields depart from the observations, and (3) the assimilated fields are judged to be realistic through pattern analysis. The last criterion requires that the signs, magnitudes, and patterns of the hypersensitive vertical velocity and local tendencies of the horizontal velocity components be physically consistent with respect to the larger scale weather systems.
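The idea of weighting observations by their measurement errors before blending can be illustrated with the simplest possible case: two co-located measurements of one quantity combined by inverse-variance weights. This sketch is an assumption for illustration only; it omits the dynamical constraints (momentum, hydrostatic, and continuity equations) that the variational method in the abstract enforces, and the temperatures and error values are hypothetical.

```python
import numpy as np

def blend(observations, error_std):
    """Inverse-variance weighted blend of co-located observations.

    Minimises sum_i ((x - obs_i) / error_std_i)**2 over x.
    """
    obs = np.asarray(observations, dtype=float)
    weights = 1.0 / np.asarray(error_std, dtype=float) ** 2
    estimate = np.sum(weights * obs, axis=0) / np.sum(weights, axis=0)
    estimate_std = np.sqrt(1.0 / np.sum(weights, axis=0))
    return estimate, estimate_std

# Hypothetical temperatures (K) at one grid point: ground-based vs satellite.
value, std = blend([250.0, 253.0], [0.5, 1.5])
print(f"blended = {value:.2f} K, std = {std:.2f} K")
```

The blended value sits closer to the observation with the smaller error, and its uncertainty is lower than that of either input, which is the basic motivation for combining ground-based and space-based data.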
Cai, Li-mei; Ma, Jin; Zhou, Yong-zhang; Huang, Lan-chun; Dou, Lei; Zhang, Cheng-bo; Fu, Shan-ming
2008-12-01
One hundred and eighteen surface soil samples were collected from Dongguan City and analyzed for concentrations of Cu, Zn, Ni, Cr, Pb, Cd, As and Hg, as well as pH and OM. The spatial distribution and sources of soil heavy metals were studied using multivariate geostatistical methods and GIS techniques. The results indicated that concentrations of Cu, Zn, Ni, Pb, Cd and Hg exceeded the soil background content in Guangdong province, and that concentrations of Pb, Cd and Hg exceeded it greatly. Factor analysis grouped Cu, Zn, Ni, Cr and As in Factor 1, Pb and Hg in Factor 2, and Cd in Factor 3. The spatial maps based on geostatistical analysis show a definite association of Factor 1 with the soil parent material, whereas Factor 2 was mainly affected by industries. The spatial distribution of Factor 3 was attributed to anthropogenic influence.
Boosting Higgs pair production in the [Formula: see text] final state with multivariate techniques.
Behr, J Katharina; Bortoletto, Daniela; Frost, James A; Hartland, Nathan P; Issever, Cigdem; Rojo, Juan
2016-01-01
The measurement of Higgs pair production will be a cornerstone of the LHC program in the coming years. Double Higgs production provides a crucial window upon the mechanism of electroweak symmetry breaking and has a unique sensitivity to the Higgs trilinear coupling. We study the feasibility of a measurement of Higgs pair production in the [Formula: see text] final state at the LHC. Our analysis is based on a combination of traditional cut-based methods with state-of-the-art multivariate techniques. We account for all relevant backgrounds, including the contributions from light and charm jet mis-identification, which are ultimately comparable in size to the irreducible 4 b QCD background. We demonstrate the robustness of our analysis strategy in a high pileup environment. For an integrated luminosity of [Formula: see text] ab[Formula: see text], a signal significance of [Formula: see text] is obtained, indicating that the [Formula: see text] final state alone could allow for the observation of double Higgs production at the High Luminosity LHC.
Ritota, Mena; Casciani, Lorena; Valentini, Massimiliano
2013-05-01
Analytical traceability of PGI and PDO foods (Protected Geographical Indication and Protected Denomination Origin respectively) is one of the most challenging tasks of current applied research. Here we proposed a metabolomic approach based on the combination of (1)H high-resolution magic angle spinning-nuclear magnetic resonance (HRMAS-NMR) spectroscopy with multivariate analysis, i.e. PLS-DA, as a reliable tool for the traceability of Italian PGI chicories (Cichorium intybus L.), i.e. Radicchio Rosso di Treviso and Radicchio Variegato di Castelfranco, also known as red and red-spotted, respectively. The metabolic profile was gained by means of HRMAS-NMR, and multivariate data analysis allowed us to build statistical models capable of providing clear discrimination among the two varieties and classification according to the geographical origin. Based on Variable Importance in Projection values, the molecular markers for classifying the different types of red chicories analysed were found accounting for both the cultivar and the place of origin. © 2012 Society of Chemical Industry.
Wilson, Iain; Paul Barrett, Michael; Sinha, Ashish; Chan, Shirley
2014-11-01
Elderly patients are often judged to be fit for emergency surgery based on age alone. This study identified risk factors predictive of in-hospital mortality amongst octogenarians undergoing emergency general surgery. A retrospective review of octogenarians undergoing emergency general surgery over 3 years was performed. Parametric survival analysis using Cox multivariate regression model was used to identify risk factors predictive of in-hospital mortality. Hazard ratios (HR) and corresponding 95% confidence interval were calculated. Seventy-three patients with a median age of 84 years were identified. Twenty-eight (38%) patients died post-operatively. Multivariate analysis identified ASA grade (ASA 5 HR 23.4 95% CI 2.38-230, p = 0.007) and chronic obstructive pulmonary disease (COPD) (HR 3.35 95% CI 1.15-9.69, p = 0.026) to be the only significant predictors of in-hospital mortality. Identification of high risk surgical patients should be based on physiological fitness for surgery rather than chronological age. Crown Copyright © 2014. Published by Elsevier Ltd. All rights reserved.
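The relationship between a Cox regression coefficient, its hazard ratio and the 95% confidence interval can be checked against the COPD figures reported above (HR 3.35, 95% CI 1.15-9.69). The sketch below assumes a Wald-type interval, which is symmetric on the log scale; the abstract does not state how the interval was computed, and the standard error used here is back-derived from the reported bounds rather than taken from the study.

```python
from math import exp, log

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal

def hazard_ratio_ci(beta, se, z=Z_95):
    """Return (HR, lower, upper) for a log-hazard coefficient and its SE."""
    return exp(beta), exp(beta - z * se), exp(beta + z * se)

def se_from_ci(lower, upper, z=Z_95):
    """Back-derive the SE of the log hazard ratio from reported CI bounds."""
    return (log(upper) - log(lower)) / (2 * z)

# Reported values for COPD; the SE is an assumption derived from the bounds.
se_copd = se_from_ci(1.15, 9.69)
hr, lo, hi = hazard_ratio_ci(log(3.35), se_copd)
```

Under this assumption the reconstructed interval lands close to the reported one, which is consistent with the reported hazard ratio sitting near the geometric mean of its bounds.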
Wang, Jinjia; Liu, Yuan
2015-04-01
This paper presents a feature extraction method that combines multivariate empirical mode decomposition (MEMD) with power spectrum features, aimed at the non-stationary electroencephalogram (EEG) or magnetoencephalogram (MEG) signals in brain-computer interface (BCI) systems. First, we used the MEMD algorithm to decompose multichannel brain signals into a series of intrinsic mode functions (IMFs), which were approximately stationary and multi-scale. Then we extracted the power features from each IMF and reduced them to a lower dimension using principal component analysis (PCA). Finally, we classified the motor imagery tasks with a linear discriminant analysis classifier. The experiments showed that the correct recognition rates of the two-class and four-class tasks of BCI competition III and competition IV reached 92.0% and 46.2%, respectively, which were superior to those of the winner of the BCI competition. The experiments indicated that the proposed method was reasonably effective and stable and that it could provide a new approach to feature extraction.
Detecting spatial regimes in ecosystems
Research on early warning indicators has generally focused on assessing temporal transitions with limited application of these methods to detecting spatial regimes. Traditional spatial boundary detection procedures that result in ecoregion maps are typically based on ecological potential (i.e. potential vegetation), and often fail to account for ongoing changes due to stressors such as land use change and climate change and their effects on plant and animal communities. We use Fisher information, an information theory based method, on both terrestrial and aquatic animal data (US Breeding Bird Survey and marine zooplankton) to identify ecological boundaries, and compare our results to traditional early warning indicators, conventional ecoregion maps, and multivariate analysis such as nMDS (non-metric Multidimensional Scaling) and cluster analysis. We successfully detect spatial regimes and transitions in both terrestrial and aquatic systems using Fisher information. Furthermore, Fisher information provided explicit spatial information about community change that is absent from other multivariate approaches. Our results suggest that defining spatial regimes based on animal communities may better reflect ecological reality than do traditional ecoregion maps, especially in our current era of rapid and unpredictable ecological change.
Buttigieg, Pier Luigi; Ramette, Alban
2014-12-01
The application of multivariate statistical analyses has become a consistent feature in microbial ecology. However, many microbial ecologists are still in the process of developing a deep understanding of these methods and appreciating their limitations. As a consequence, staying abreast of progress and debate in this arena poses an additional challenge to many microbial ecologists. To address these issues, we present the GUide to STatistical Analysis in Microbial Ecology (GUSTA ME): a dynamic, web-based resource providing accessible descriptions of numerous multivariate techniques relevant to microbial ecologists. A combination of interactive elements allows users to discover and navigate between methods relevant to their needs and examine how they have been used by others in the field. We have designed GUSTA ME to become a community-led and -curated service, which we hope will provide a common reference and forum to discuss and disseminate analytical techniques relevant to the microbial ecology community. © 2014 The Authors. FEMS Microbiology Ecology published by John Wiley & Sons Ltd on behalf of Federation of European Microbiological Societies.
Liu, Zechang; Wang, Liping; Liu, Yumei
2018-01-18
Hops impart flavor to beer, with the volatile components characterizing the various hop varieties and qualities. Fingerprinting, especially flavor fingerprinting, is often used to identify 'flavor products' because inconsistencies in the description of flavor may lead to an incorrect definition of beer quality. Compared to flavor fingerprinting, volatile fingerprinting is simpler and easier. We performed volatile fingerprinting using head space-solid phase micro-extraction gas chromatography-mass spectrometry combined with similarity analysis and principal component analysis (PCA) for evaluating and distinguishing between three major Chinese hops. Eighty-four volatiles were identified, which were classified into seven categories. Volatile fingerprinting based on similarity analysis did not yield any obvious result. By contrast, hop varieties and qualities were identified using volatile fingerprinting based on PCA. The potential variables explained the variance in the three hop varieties. In addition, the dendrogram and principal component score plot described the differences and classifications of hops. Volatile fingerprinting plus multivariate statistical analysis can rapidly differentiate between the different varieties and qualities of the three major Chinese hops. Furthermore, this method can be used as a reference in other fields. © 2018 Society of Chemical Industry.
Yang, Yan-Qin; Yin, Hong-Xu; Yuan, Hai-Bo; Jiang, Yong-Wen; Dong, Chun-Wang; Deng, Yu-Liang
2018-01-01
In the present work, a novel infrared-assisted extraction coupled to headspace solid-phase microextraction (IRAE-HS-SPME) followed by gas chromatography-mass spectrometry (GC-MS) was developed for rapid determination of the volatile components in green tea. The extraction parameters such as fiber type, sample amount, infrared power, extraction time, and infrared lamp distance were optimized by orthogonal experimental design. Under optimum conditions, a total of 82 volatile compounds in 21 green tea samples from different geographical origins were identified. Compared with classical water-bath heating, the proposed technique has remarkable advantages of considerably reducing the analytical time and high efficiency. In addition, an effective classification of green teas based on their volatile profiles was achieved by partial least square-discriminant analysis (PLS-DA) and hierarchical clustering analysis (HCA). Furthermore, the application of a dual criterion based on the variable importance in the projection (VIP) values of the PLS-DA models and on the category from one-way univariate analysis (ANOVA) allowed the identification of 12 potential volatile markers, which were considered to make the most important contribution to the discrimination of the samples. The results suggest that IRAE-HS-SPME/GC-MS technique combined with multivariate analysis offers a valuable tool to assess geographical traceability of different tea varieties. PMID:29494626
De Luca, Michele; Ragno, Gaetano; Ioele, Giuseppina; Tauler, Romà
2014-07-21
An advanced and powerful chemometric approach is proposed for the analysis of incomplete multiset data obtained by fusion of hyphenated liquid chromatographic DAD/MS data with UV spectrophotometric data from acid-base titration and kinetic degradation experiments. Column- and row-wise augmented data blocks were combined and simultaneously processed by means of a new version of the multivariate curve resolution-alternating least squares (MCR-ALS) technique, including the simultaneous analysis of incomplete multiset data from different instrumental techniques. The proposed procedure was applied to the detailed study of the kinetic photodegradation process of the amiloride (AML) drug. All chemical species involved in the degradation and equilibrium reactions were resolved and the pH dependent kinetic pathway described. Copyright © 2014 Elsevier B.V. All rights reserved.
Bello, Alessandra; Bianchi, Federica; Careri, Maria; Giannetto, Marco; Mori, Giovanni; Musci, Marilena
2007-11-05
A new NIR method based on multivariate calibration for the determination of ethanol in industrially packed wholemeal bread was developed and validated. GC-FID was used as the reference method for determining the actual ethanol concentration of different samples of wholemeal bread with a known content of added ethanol, ranging from 0 to 3.5% (w/w). Stepwise discriminant analysis was carried out on the NIR dataset in order to reduce the number of original variables by selecting those that were able to discriminate between the samples of different ethanol concentrations. With the variables so selected, a multivariate calibration model was then obtained by multiple linear regression. The predictive power of the linear model was optimized by a new "leave one out" method, so that the number of original variables was further reduced.
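Leave-one-out validation of a calibration line can be sketched as follows. This is the standard leave-one-out procedure, not the new variant the abstract refers to (which is not described here), and it uses a single predictor rather than multiple linear regression to stay short. The `absorbance` values are hypothetical; only the 0 to 3.5% (w/w) ethanol range comes from the abstract.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def loo_rmse(xs, ys):
    """Root-mean-square error of prediction under leave-one-out validation."""
    squared_errors = []
    for i in range(len(xs)):
        # Refit the line without sample i, then predict sample i.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        prediction = slope * xs[i] + intercept
        squared_errors.append((prediction - ys[i]) ** 2)
    return (sum(squared_errors) / len(squared_errors)) ** 0.5

# Hypothetical single NIR-derived predictor against reference ethanol content.
absorbance = [0.10, 0.21, 0.29, 0.41, 0.50, 0.62, 0.69, 0.80]
ethanol = [0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5]  # % (w/w)
rmsecv = loo_rmse(absorbance, ethanol)
```

Comparing this error across candidate variable subsets is one common way to decide which variables to keep in a calibration model.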
Stewart analysis of apparently normal acid-base state in the critically ill.
Moviat, Miriam; van den Boogaard, Mark; Intven, Femke; van der Voort, Peter; van der Hoeven, Hans; Pickkers, Peter
2013-12-01
This study aimed to describe Stewart parameters in critically ill patients with an apparently normal acid-base state and to determine the incidence of mixed metabolic acid-base disorders in these patients. We conducted a prospective, observational multicenter study of 312 consecutive Dutch intensive care unit patients with normal pH (7.35 ≤ pH ≤ 7.45) on days 3 to 5. Apparent (SIDa) and effective strong ion difference (SIDe) and strong ion gap (SIG) were calculated from 3 consecutive arterial blood samples. Multivariate linear regression analysis was performed to analyze factors potentially associated with levels of SIDa and SIG. A total of 137 patients (44%) were identified with an apparently normal acid-base state (normal pH and -2 < base excess < 2 and 35 < PaCO2 < 45 mm Hg). In this group, SIDa values were 36.6 ± 3.6 mEq/L, resulting from hyperchloremia (109 ± 4.6 mEq/L, sodium-chloride difference 30.0 ± 3.6 mEq/L); SIDe values were 33.5 ± 2.3 mEq/L, resulting from hypoalbuminemia (24.0 ± 6.2 g/L); and SIG values were 3.1 ± 3.1 mEq/L. During admission, base excess increased secondary to a decrease in SIG levels and, subsequently, an increase in SIDa levels. Levels of SIDa were associated with positive cation load, chloride load, and admission SIDa (multivariate r(2) = 0.40, P < .001). Levels of SIG were associated with kidney function, sepsis, and SIG levels at intensive care unit admission (multivariate r(2) = 0.28, P < .001). Intensive care unit patients with an apparently normal acid-base state have an underlying mixed metabolic acid-base disorder characterized by acidifying effects of a low SIDa (caused by hyperchloremia) and high SIG combined with the alkalinizing effect of hypoalbuminemia. © 2013.
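The reported group means can be tied together with the usual Stewart definition of the strong ion gap as the difference between the apparent and effective strong ion differences. The sketch below only checks that arithmetic against the values quoted in the abstract; the function name is illustrative.

```python
def strong_ion_gap(sid_apparent, sid_effective):
    """Strong ion gap (mEq/L) as SIDa minus SIDe."""
    return sid_apparent - sid_effective

# Mean values reported for the apparently normal acid-base group (mEq/L).
sig = strong_ion_gap(36.6, 33.5)
```

The result agrees with the reported mean SIG of 3.1 mEq/L, as expected, since a difference of means equals the mean of the per-patient differences.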
Mokhtari, Mohammadreza; Narayanan, Balaji; Hamm, Jordan P; Soh, Pauline; Calhoun, Vince D; Ruaño, Gualberto; Kocherla, Mohan; Windemuth, Andreas; Clementz, Brett A; Tamminga, Carol A; Sweeney, John A; Keshavan, Matcheri S; Pearlson, Godfrey D
2016-05-01
The complex molecular etiology of psychosis in schizophrenia (SZ) and psychotic bipolar disorder (PBP) is not well defined, presumably due to their multifactorial genetic architecture. Neurobiological correlates of psychosis can be identified through genetic associations of intermediate phenotypes such as event-related potential (ERP) from auditory paired stimulus processing (APSP). Various ERP components of APSP are heritable and aberrant in SZ, PBP and their relatives, but their multivariate genetic factors are less explored. We investigated the multivariate polygenic association of ERP from 64-sensor auditory paired stimulus data in 149 SZ, 209 PBP probands, and 99 healthy individuals from the multisite Bipolar-Schizophrenia Network on Intermediate Phenotypes study. Multivariate association of 64-channel APSP waveforms with a subset of 16 999 single nucleotide polymorphisms (SNPs) (reduced from 1 million SNP array) was examined using parallel independent component analysis (Para-ICA). Biological pathways associated with the genes were assessed using enrichment-based analysis tools. Para-ICA identified 2 ERP components, of which one was significantly correlated with a genetic network comprising multiple linearly coupled gene variants that explained ~4% of the ERP phenotype variance. Enrichment analysis revealed epidermal growth factor, endocannabinoid signaling, glutamatergic synapse and maltohexaose transport associated with P2 component of the N1-P2 ERP waveform. This ERP component also showed deficits in SZ and PBP. Aberrant P2 component in psychosis was associated with gene networks regulating several fundamental biologic functions, either general or specific to nervous system development. The pathways and processes underlying the gene clusters play a crucial role in brain function, plausibly implicated in psychosis. © The Author 2015. Published by Oxford University Press on behalf of the Maryland Psychiatric Research Center. All rights reserved. 
A microcomputer-based whole-body counter for personnel routine monitoring.
Chou, H P; Tsai, T M; Lan, C Y
1993-05-01
The paper describes a cost-effective NaI(Tl) whole-body counter developed for routine examinations of worker intakes at an isotope production facility. Signal processing, data analysis and system operation are microcomputer-controlled to minimize human interaction. The pulse height analyzer is implemented as a microcomputer add-on card for easy manipulation. The radionuclide analysis scheme is designed to run quickly, drawing on a knowledge base established from background samples and phantom experiments in conjunction with a multivariate regression analysis. Long-term stability and calibration with standards and in vivo measurements are reported.
Belianinov, Alex; Panchapakesan, G.; Lin, Wenzhi; ...
2014-12-02
Atomic level spatial variability of electronic structure in Fe-based superconductor FeTe0.55Se0.45 (Tc = 15 K) is explored using current-imaging tunneling-spectroscopy. Multivariate statistical analysis of the data differentiates regions of dissimilar electronic behavior that can be identified with the segregation of chalcogen atoms, as well as boundaries between terminations and near neighbor interactions. Subsequent clustering analysis allows identification of the spatial localization of these dissimilar regions. Similar statistical analysis of modeled calculated density of states of chemically inhomogeneous FeTe1−xSex structures further confirms that the two types of chalcogens, i.e., Te and Se, can be identified by their electronic signature and differentiated by their local chemical environment. This approach allows detailed chemical discrimination of the scanning tunneling microscopy data including separation of atomic identities, proximity, and local configuration effects and can be universally applicable to chemically and electronically inhomogeneous surfaces.
Stone loaches of Choman River system, Kurdistan, Iran (Teleostei: Cypriniformes: Nemacheilidae).
Kamangar, Barzan Bahrami; Prokofiev, Artem M; Ghaderi, Edris; Nalbant, Theodore T
2014-01-20
For the first time, we present data on species composition and distributions of nemacheilid loaches in the Choman River basin of Kurdistan province, Iran. Two genera and four species are recorded from the area, of which three species are new to science: Oxynoemacheilus kurdistanicus, O. zagrosensis, O. chomanicus spp. nov., and Turcinoemacheilus kosswigi Băn. et Nalb. Detailed and illustrated morphological descriptions and univariate and multivariate analyses of morphometric and meristic features are provided for each of these species. Forty morphometric and eleven meristic characters were used in multivariate analysis to select characters that could discriminate between the four loach species. Discriminant Function Analysis revealed that sixteen morphometric measures and five meristic characters have the most variability between the loach species. The dendrograms based on cluster analysis of Mahalanobis distances of morphometrics and a combination of both characters confirmed two distinct groups: Oxynoemacheilus spp. and T. kosswigi. Within Oxynoemacheilus, O. zagrosensis and O. chomanicus are more similar to each other than either is to O. kurdistanicus.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Belianinov, Alex; Ganesh, Panchapakesan; Lin, Wenzhi
2014-12-01
Atomic level spatial variability of electronic structure in Fe-based superconductor FeTe0.55Se0.45 (Tc = 15 K) is explored using current-imaging tunneling-spectroscopy. Multivariate statistical analysis of the data differentiates regions of dissimilar electronic behavior that can be identified with the segregation of chalcogen atoms, as well as boundaries between terminations and near neighbor interactions. Subsequent clustering analysis allows identification of the spatial localization of these dissimilar regions. Similar statistical analysis of modeled calculated density of states of chemically inhomogeneous FeTe1−xSex structures further confirms that the two types of chalcogens, i.e., Te and Se, can be identified by their electronic signature and differentiated by their local chemical environment. This approach allows detailed chemical discrimination of the scanning tunneling microscopy data including separation of atomic identities, proximity, and local configuration effects and can be universally applicable to chemically and electronically inhomogeneous surfaces.
NASA Astrophysics Data System (ADS)
Oh, Han Bin; Leach, Franklin E.; Arungundram, Sailaja; Al-Mafraji, Kanar; Venot, Andre; Boons, Geert-Jan; Amster, I. Jonathan
2011-03-01
The structural characterization of glycosaminoglycan (GAG) carbohydrates by mass spectrometry has been a long-standing analytical challenge due to the inherent heterogeneity of these biomolecules, specifically polydispersity, variability in sulfation, and hexuronic acid stereochemistry. Recent advances in tandem mass spectrometry methods employing threshold and electron-based ion activation have resulted in the ability to determine the location of the labile sulfate modification as well as assign the stereochemistry of hexuronic acid residues. To facilitate the analysis of complex electron detachment dissociation (EDD) spectra, principal component analysis (PCA) is employed to differentiate the hexuronic acid stereochemistry of four synthetic GAG epimers whose EDD spectra are nearly identical upon visual inspection. For comparison, PCA is also applied to infrared multiphoton dissociation spectra (IRMPD) of the examined epimers. To assess the applicability of multivariate methods in GAG mixture analysis, PCA is utilized to identify the relative content of two epimers in a binary mixture.
Duarte, João V; Ribeiro, Maria J; Violante, Inês R; Cunha, Gil; Silva, Eduardo; Castelo-Branco, Miguel
2014-01-01
Neurofibromatosis Type 1 (NF1) is a common genetic condition associated with cognitive dysfunction. However, the pathophysiology of the NF1 cognitive deficits is not well understood. Abnormal brain structure, including increased total brain volume, white matter (WM) and grey matter (GM) abnormalities have been reported in the NF1 brain. These previous studies employed univariate model-driven methods preventing detection of subtle and spatially distributed differences in brain anatomy. Multivariate pattern analysis allows the combination of information from multiple spatial locations yielding a discriminative power beyond that of single voxels. Here we investigated for the first time subtle anomalies in the NF1 brain, using a multivariate data-driven classification approach. We used support vector machines (SVM) to classify whole-brain GM and WM segments of structural T1 -weighted MRI scans from 39 participants with NF1 and 60 non-affected individuals, divided in children/adolescents and adults groups. We also employed voxel-based morphometry (VBM) as a univariate gold standard to study brain structural differences. SVM classifiers correctly classified 94% of cases (sensitivity 92%; specificity 96%) revealing the existence of brain structural anomalies that discriminate NF1 individuals from controls. Accordingly, VBM analysis revealed structural differences in agreement with the SVM weight maps representing the most relevant brain regions for group discrimination. These included the hippocampus, basal ganglia, thalamus, and visual cortex. This multivariate data-driven analysis thus identified subtle anomalies in brain structure in the absence of visible pathology. Our results provide further insight into the neuroanatomical correlates of known features of the cognitive phenotype of NF1. Copyright © 2012 Wiley Periodicals, Inc.
ERIC Educational Resources Information Center
McArdle, John J.; Paskus, Thomas S.; Boker, Steven M.
2013-01-01
This is an application of contemporary multilevel regression modeling to the prediction of academic performances of 1st-year college students. At a first level of analysis, the data come from N greater than 16,000 students who were college freshman in 1994-1995 and who were also participants in high-level college athletics. At a second level of…
Using Interactive Graphics to Teach Multivariate Data Analysis to Psychology Students
ERIC Educational Resources Information Center
Valero-Mora, Pedro M.; Ledesma, Ruben D.
2011-01-01
This paper discusses the use of interactive graphics to teach multivariate data analysis to Psychology students. Three techniques are explored through separate activities: parallel coordinates/boxplots; principal components/exploratory factor analysis; and cluster analysis. With interactive graphics, students may perform important parts of the…
Vongsvivut, Jitraporn; Heraud, Philip; Gupta, Adarsha; Puri, Munish; McNaughton, Don; Barrow, Colin J
2013-10-21
The increase in polyunsaturated fatty acid (PUFA) consumption has prompted research into alternative resources other than fish oil. In this study, a new approach based on focal-plane-array Fourier transform infrared (FPA-FTIR) microspectroscopy and multivariate data analysis was developed for the characterisation of some marine microorganisms. Cell and lipid compositions in lipid-rich marine yeasts collected from the Australian coast were characterised in comparison to a commercially available PUFA-producing marine fungoid protist, thraustochytrid. Multivariate classification methods provided good discriminative accuracy evidenced from (i) separation of the yeasts from thraustochytrids and distinct spectral clusters among the yeasts that conformed well to their biological identities, and (ii) correct classification of yeasts from a totally independent set using cross-validation testing. The findings further indicated additional capability of the developed FPA-FTIR methodology, when combined with partial least squares regression (PLSR) analysis, for rapid monitoring of lipid production in one of the yeasts during the growth period, which was achieved at a high accuracy compared to the results obtained from the traditional lipid analysis based on gas chromatography. The developed FTIR-based approach when coupled to programmable withdrawal devices and a cytocentrifugation module would have strong potential as a novel online monitoring technology suited for bioprocessing applications and large-scale production.
Study of the propensity for hemorrhage in Hispanic Americans with stroke.
Frey, James L; Jahnke, Heidi K; Goslar, Pamela W
2008-01-01
Multiple sources document a higher proportion of intraparenchymal hemorrhage (HEM) in Hispanic (HIS) than white (WHI) patients with stroke. We sought an explanation for this phenomenon through analysis of multiple variables in our hospital-based stroke population. We performed univariate and multivariate analysis of risk factors in our HIS and WHI patients with stroke to identify differences that might account for a greater propensity for HEM in HIS patients. Multivariate analysis disclosed that the risk of HEM correlated significantly with untreated hypertension (HTN), HIS ethnicity, and heavy alcohol intake. A negative correlation was found for hyperlipidemia and diabetes. Our HIS patients with stroke had a greater prevalence of untreated HTN and heavy alcohol intake, with HIS men being at greatest risk. HIS patients with stroke in our hospital-based population appear relatively more prone to HEM than do WHI patients. This risk correlates with a greater likelihood of having untreated HTN and heavy alcohol intake, more so for HIS men. The explanation appears to be a relative lack of health awareness and involvement in our health care system. The possibility that HIS ethnicity itself constitutes a biological risk factor for HEM remains a matter of speculation. Validation of this work with community data should lead to remediation through a community-based effort.
A power analysis for multivariate tests of temporal trend in species composition.
Irvine, Kathryn M; Dinger, Eric C; Sarr, Daniel
2011-10-01
Long-term monitoring programs emphasize power analysis as a tool to determine the sampling effort necessary to effectively document ecologically significant changes in ecosystems. Programs that monitor entire multispecies assemblages require a method for determining the power of multivariate statistical models to detect trend. We provide a method to simulate presence-absence species assemblage data that are consistent with increasing or decreasing directional change in species composition within multiple sites. This step is the foundation for using Monte Carlo methods to approximate the power of any multivariate method for detecting temporal trends. We focus on comparing the power of the Mantel test, permutational multivariate analysis of variance, and constrained analysis of principal coordinates. We find that the power of the various methods we investigate is sensitive to the number of species in the community, univariate species patterns, and the number of sites sampled over time. For increasing directional change scenarios, constrained analysis of principal coordinates was as powerful as or more powerful than permutational multivariate analysis of variance, and the Mantel test was the least powerful. However, in our investigation of decreasing directional change, the Mantel test was typically as powerful as or more powerful than the other models.
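The Monte Carlo idea in this abstract (simulate assemblages with a known trend, test each one, and count rejections) can be sketched for the Mantel test alone. This is a simplified illustration, not the authors' simulation method: it uses a single site, a hypothetical turnover model in which half the species become rarer and half become more common, Jaccard dissimilarity, and arbitrary sizes and seed.

```python
import random
from math import sqrt

def jaccard_dissimilarity(a, b):
    """Jaccard dissimilarity between two presence-absence vectors."""
    union = sum(1 for x, y in zip(a, b) if x or y)
    if union == 0:
        return 0.0
    shared = sum(1 for x, y in zip(a, b) if x and y)
    return 1.0 - shared / union

def pearson(u, v):
    """Pearson correlation; returns 0.0 if either input is constant."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    if suu == 0 or svv == 0:
        return 0.0
    return suv / sqrt(suu * svv)

def simulate_site(n_years, n_species, shift, rng):
    """Hypothetical turnover model; shift=0 means no directional change."""
    samples = []
    for year in range(n_years):
        drift = shift * (1 - 2 * year / (n_years - 1))
        row = []
        for k in range(n_species):
            p = 0.5 + drift if k < n_species // 2 else 0.5 - drift
            row.append(1 if rng.random() < p else 0)
        samples.append(row)
    return samples

def mantel_p(samples, n_perm, rng):
    """One-sided permutation p-value: time lag versus dissimilarity."""
    t = len(samples)
    pairs = [(i, j) for i in range(t) for j in range(i + 1, t)]
    community = [jaccard_dissimilarity(samples[i], samples[j]) for i, j in pairs]
    observed = pearson([j - i for i, j in pairs], community)
    order = list(range(t))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(order)  # permute the time labels of the samples
        lags = [abs(order[j] - order[i]) for i, j in pairs]
        if pearson(lags, community) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

def estimate_power(shift, n_sims=40, n_perm=99, alpha=0.05, seed=1):
    """Fraction of simulated data sets in which the test rejects at alpha."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        samples = simulate_site(10, 30, shift, rng)
        if mantel_p(samples, n_perm, rng) <= alpha:
            rejections += 1
    return rejections / n_sims

power_with_trend = estimate_power(0.4)
rejection_rate_no_trend = estimate_power(0.0)
```

Swapping `mantel_p` for another test, or varying the number of species and years passed to `simulate_site`, gives the kind of power comparison the abstract describes; with no trend, the rejection rate should stay near the nominal alpha.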
Wind Turbine Load Mitigation based on Multivariable Robust Control and Blade Root Sensors
NASA Astrophysics Data System (ADS)
Díaz de Corcuera, A.; Pujana-Arrese, A.; Ezquerra, J. M.; Segurola, E.; Landaluze, J.
2014-12-01
This paper presents two H∞ multivariable robust controllers based on blade root sensors' information for individual pitch angle control. The wind turbine of 5 MW defined in the Upwind European project is the reference non-linear model used in this research work, which has been modelled in the GH Bladed 4.0 software package. The main objective of these controllers is load mitigation in different components of wind turbines during power production in the above rated control zone. The first proposed multi-input multi-output (MIMO) individual pitch H∞ controller mitigates the wind effect on the tower side-to-side acceleration and reduces the asymmetrical loads which appear in the rotor due to its misalignment. The second individual pitch H∞ multivariable controller mitigates the loads on the three blades, reducing the wind effect on the flapwise and edgewise bending moments in the blades. The designed H∞ controllers have been validated in GH Bladed and an exhaustive analysis has been carried out to calculate fatigue load reduction on wind turbine components, as well as to analyze load mitigation in some extreme cases.
DU, Juan; Yuan, Zhen-Gang; Zhang, Chun-Yang; Fu, Wei-Jun; Jiang, Hua; Chen, Bao-An; Hou, Jian
2009-10-01
To evaluate the effect of polymorphism at the -238 and -308 position of the TNF-alpha promotor region on the clinical outcome of thalidomide (Thal)-based regimens for the treatment of multiple myeloma (MM). The polymorphism at the -238 and -308 position of the TNF-alpha promotor region of 168 MM patients treated with Thal-based regimens were determined by polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP). Genotypes were tested for association with overall response by logistic regression, and survival was evaluated by univariate and multivariate analysis. In TNF-alpha -238 position, 11 (6.5%) patients had GA genotype and 1 (0.6%) AA genotype. In TNF-alpha -308 position, 19 (11.3%) had GA genotype and 1 (0.6%) AA genotype. In univariate analysis, the TNF-alpha -238 GA + AA genotypes were associated with a significantly prolonged progression free survival (PFS) (P = 0.017), and a better overall survival (OS) (P = 0.150). Multivariate COX regression analysis showed that TNF-alpha -238 polymorphic status was an independent prognostic factor for prolonged PFS (P = 0.049). The TNF-alpha -238 polymorphic status is associated with a favorable clinical outcome in MM patients treated with thalidomide-based regimen. The polymorphism status of TNF-alpha gene might be of promise for developing a more informative stratification system for MM.
Mueller, Daniela; Ferrão, Marco Flôres; Marder, Luciano; da Costa, Adilson Ben; de Cássia de Souza Schneider, Rosana
2013-01-01
The main objective of this study was to use infrared spectroscopy to identify vegetable oils used as raw material for biodiesel production and apply multivariate analysis to the data. Six different vegetable oil sources (canola, cotton, corn, palm, sunflower and soybeans) were used to produce biodiesel batches. The spectra were acquired by Fourier transform infrared spectroscopy using a universal attenuated total reflectance sensor (FTIR-UATR). For the multivariate analysis, principal component analysis (PCA), hierarchical cluster analysis (HCA), interval principal component analysis (iPCA) and soft independent modeling of class analogy (SIMCA) were used. The results indicate that it is possible to develop a methodology to identify vegetable oils used as raw material in the production of biodiesel by FTIR-UATR applying multivariate analysis. It was also observed that the iPCA found the best spectral range for separation of biodiesel batches using FTIR-UATR data, and with this result, the SIMCA method classified 100% of the soybean biodiesel samples. PMID:23539030
Multivariate meta-analysis for non-linear and other multi-parameter associations
Gasparrini, A; Armstrong, B; Kenward, M G
2012-01-01
In this paper, we formalize the application of multivariate meta-analysis and meta-regression to synthesize estimates of multi-parameter associations obtained from different studies. This modelling approach extends the standard two-stage analysis used to combine results across different sub-groups or populations. The most straightforward application is for the meta-analysis of non-linear relationships, described for example by regression coefficients of splines or other functions, but the methodology easily generalizes to any setting where complex associations are described by multiple correlated parameters. The modelling framework of multivariate meta-analysis is implemented in the package mvmeta within the statistical environment R. As an illustrative example, we propose a two-stage analysis for investigating the non-linear exposure–response relationship between temperature and non-accidental mortality using time-series data from multiple cities. Multivariate meta-analysis represents a useful analytical tool for studying complex associations through a two-stage procedure. Copyright © 2012 John Wiley & Sons, Ltd. PMID:22807043
Generating Nonnormal Multivariate Data Using Copulas: Applications to SEM.
Mair, Patrick; Satorra, Albert; Bentler, Peter M
2012-07-01
This article develops a procedure based on copulas to simulate multivariate nonnormal data that satisfy a prespecified variance-covariance matrix. The covariance matrix used can comply with a specific moment structure form (e.g., a factor analysis or a general structural equation model). Thus, the method is particularly useful for Monte Carlo evaluation of structural equation models within the context of nonnormal data. The new procedure for nonnormal data simulation is theoretically described and also implemented in the widely used R environment. The quality of the method is assessed by Monte Carlo simulations. A 1-sample test on the observed covariance matrix based on the copula methodology is proposed. This new test for evaluating the quality of a simulation is defined through a particular structural model specification and is robust against normality violations.
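As a rough illustration of copula-based simulation of nonnormal data (not the article's procedure, which additionally guarantees a prespecified covariance matrix), the sketch below draws two correlated standard normal variables, maps them to uniforms through the normal distribution function, and then applies the exponential quantile function. The choice of exponential margins, the function names, and the parameter values are assumptions made for the example.

```python
import math
import random


def gaussian_copula_exponential(n, rho, seed=0):
    """Draw n pairs with unit-exponential margins linked by a Gaussian copula.

    Note: rho is the correlation of the underlying normals; the correlation of
    the returned exponential variables is generally somewhat smaller.
    """
    rng = random.Random(seed)
    pairs = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        pair = []
        for z in (z1, z2):
            # 1 - Phi(z) = 0.5 * erfc(z / sqrt(2)); the exponential quantile is -log(1 - u).
            survival = 0.5 * math.erfc(z / math.sqrt(2.0))
            pair.append(-math.log(survival))
        pairs.append(tuple(pair))
    return pairs


def sample_correlation(pairs):
    """Pearson correlation of the two coordinates."""
    n = len(pairs)
    mx = sum(p[0] for p in pairs) / n
    my = sum(p[1] for p in pairs) / n
    sxy = sum((p[0] - mx) * (p[1] - my) for p in pairs)
    sxx = sum((p[0] - mx) ** 2 for p in pairs)
    syy = sum((p[1] - my) ** 2 for p in pairs)
    return sxy / math.sqrt(sxx * syy)
```

The generated margins are skewed (exponential), yet the dependence is still controlled by a single correlation parameter, which is what makes this family of generators convenient for Monte Carlo studies of methods that assume normality.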
The analysis of morphometric data on rocky mountain wolves and artic wolves using statistical method
NASA Astrophysics Data System (ADS)
Ammar Shafi, Muhammad; Saifullah Rusiman, Mohd; Hamzah, Nor Shamsidah Amir; Nor, Maria Elena; Ahmad, Noor’ani; Azia Hazida Mohamad Azmi, Nur; Latip, Muhammad Faez Ab; Hilmi Azman, Ahmad
2018-04-01
Morphometrics is a quantitative analysis depending on the shape and size of several specimens. Morphometric quantitative analyses are commonly used to analyse the fossil record, the shape and size of specimens and others. The aim of the study is to find the differences between rocky mountain wolves and arctic wolves based on gender. The sample utilised secondary data which included seven variables as independent variables and two dependent variables. Statistical modelling, namely analysis of variance (ANOVA) and multivariate analysis of variance (MANOVA), was used in the analysis. The results showed differences between arctic wolves and rocky mountain wolves based on the independent factors and gender.
Hierl, L.A.; Loftin, C.S.; Longcore, J.R.; McAuley, D.G.; Urban, D.L.
2007-01-01
We assessed changes in vegetative structure of 49 impoundments at Moosehorn National Wildlife Refuge (MNWR), Maine, USA, between 1984-1985 and 2002 with a multivariate, adaptive approach that may be useful in a variety of wetland and other habitat management situations. We used Mahalanobis Distance (MD) analysis to classify the refuge's wetlands as poor or good waterbird habitat based on five variables: percent emergent vegetation, percent shrub, percent open water, relative richness of vegetative types, and an interspersion juxtaposition index that measures adjacency of vegetation patches. Mahalanobis Distance is a multivariate statistic that examines whether a particular data point is an outlier or a member of a data cluster while accounting for correlations among inputs. For each wetland, we used MD analysis to quantify a distance from a reference condition defined a priori by habitat conditions measured in MNWR wetlands used by waterbirds. Twenty-five wetlands declined in quality between the two periods, whereas 23 wetlands improved. We identified specific wetland characteristics that may be modified to improve habitat conditions for waterbirds. The MD analysis seems ideal for instituting an adaptive wetland management approach because metrics can be easily added or removed, ranges of target habitat conditions can be defined by field-collected data, and the analysis can identify priorities for single or multiple management objectives.
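The distance-from-reference calculation described here can be sketched as follows. This is only a generic Mahalanobis distance computed from a set of reference observations, d = sqrt((x - m)' S^-1 (x - m)); the function names and the toy numbers are hypothetical and are not taken from the study.

```python
def mean_and_covariance(rows):
    """Column means and the sample covariance matrix (n - 1 denominator)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    cov = [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
            for j in range(p)] for i in range(p)]
    return means, cov


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        partial = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - partial) / m[i][i]
    return x


def mahalanobis(point, reference_rows):
    """Distance of `point` from the reference cloud, accounting for correlations."""
    means, cov = mean_and_covariance(reference_rows)
    diff = [p - m for p, m in zip(point, means)]
    weighted = solve(cov, diff)  # S^-1 (x - m) without forming the inverse
    return sum(d * w for d, w in zip(diff, weighted)) ** 0.5


# Toy reference set with two habitat variables (hypothetical values).
reference = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)]
distance = mahalanobis((3.0, 1.0), reference)
```

A site at the reference mean gets a distance of zero, and larger values indicate conditions further from the reference set; the covariance matrix must be non-singular, so the reference set needs more observations than variables.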
Zhou, Fei; Zhao, Yajing; Peng, Jiyu; Jiang, Yirong; Li, Maiquan; Jiang, Yuan; Lu, Baiyi
2017-07-01
Osmanthus fragrans flowers are used as folk medicine and additives for teas, beverages and foods. The metabolites of O. fragrans flowers from different geographical origins were inconsistent to some extent. Chromatography and mass spectrometry combined with multivariable analysis methods provide an approach for discriminating the origin of O. fragrans flowers. To discriminate the Osmanthus fragrans var. thunbergii flowers from different origins with the identified metabolites. GC-MS and UPLC-PDA were conducted to analyse the metabolites in O. fragrans var. thunbergii flowers (in total 150 samples). Principal component analysis (PCA), soft independent modelling of class analogy analysis (SIMCA) and random forest (RF) analysis were applied to group the GC-MS and UPLC-PDA data. GC-MS identified 32 compounds common to all samples while UPLC-PDA/QTOF-MS identified 16 common compounds. PCA of the UPLC-PDA data generated a better clustering than PCA of the GC-MS data. Ten metabolites (six from GC-MS and four from UPLC-PDA) were selected as effective compounds for discrimination by PCA loadings. SIMCA and RF analysis were used to build classification models, and the RF model, based on the four effective compounds (caffeic acid derivative, acteoside, ligustroside and compound 15), yielded better results with the classification rate of 100% in the calibration set and 97.8% in the prediction set. GC-MS and UPLC-PDA combined with multivariable analysis methods can discriminate the origin of Osmanthus fragrans var. thunbergii flowers. Copyright © 2017 John Wiley & Sons, Ltd.
Gao, Wen; Yang, Hua; Qi, Lian-Wen; Liu, E-Hu; Ren, Mei-Ting; Yan, Yu-Ting; Chen, Jun; Li, Ping
2012-07-06
Plant-based medicines become increasingly popular over the world. Authentication of herbal raw materials is important to ensure their safety and efficacy. Some herbs belonging to closely related species but differing in medicinal properties are difficult to be identified because of similar morphological and microscopic characteristics. Chromatographic fingerprinting is an alternative method to distinguish them. Existing approaches do not allow a comprehensive analysis for herbal authentication. We have now developed a strategy consisting of (1) full metabolic profiling of herbal medicines by rapid resolution liquid chromatography (RRLC) combined with quadrupole time-of-flight mass spectrometry (QTOF MS), (2) global analysis of non-targeted compounds by molecular feature extraction algorithm, (3) multivariate statistical analysis for classification and prediction, and (4) marker compounds characterization. This approach has provided a fast and unbiased comparative multivariate analysis of the metabolite composition of 33-batch samples covering seven Lonicera species. Individual metabolic profiles are performed at the level of molecular fragments without prior structural assignment. In the entire set, the obtained classifier for seven Lonicera species flower buds showed good prediction performance and a total of 82 statistically different components were rapidly obtained by the strategy. The elemental compositions of discriminative metabolites were characterized by the accurate mass measurement of the pseudomolecular ions and their chemical types were assigned by the MS/MS spectra. The high-resolution, comprehensive and unbiased strategy for metabolite data analysis presented here is powerful and opens the new direction of authentication in herbal analysis. Copyright © 2012 Elsevier B.V. All rights reserved.
Papadia, Andrea; Bellati, Filippo; Bogani, Giorgio; Ditto, Antonino; Martinelli, Fabio; Lorusso, Domenica; Donfrancesco, Cristina; Gasparri, Maria Luisa; Raspagliesi, Francesco
2015-12-01
The aim of this study was to identify clinical variables that may predict the need for adjuvant radiotherapy after neoadjuvant chemotherapy (NACT) and radical surgery in locally advanced cervical cancer patients. A retrospective series of cervical cancer patients with International Federation of Gynecology and Obstetrics (FIGO) stages IB2-IIB treated with NACT followed by radical surgery was analyzed. Clinical predictors of persistence of intermediate- and/or high-risk factors at final pathological analysis were investigated. Statistical analysis was performed using univariate and multivariate analysis and using a model based on artificial intelligence known as artificial neuronal network (ANN) analysis. Overall, 101 patients were available for the analyses. Fifty-two (51 %) patients were considered at high risk secondary to parametrial, resection margin and/or lymph node involvement. When disease was confined to the cervix, four (4 %) patients were considered at intermediate risk. At univariate analysis, FIGO grade 3, stage IIB disease at diagnosis and the presence of enlarged nodes before NACT predicted the presence of intermediate- and/or high-risk factors at final pathological analysis. At multivariate analysis, only FIGO grade 3 and tumor diameter maintained statistical significance. The specificity of ANN models in evaluating predictive variables was slightly superior to conventional multivariable models. FIGO grade, stage, tumor diameter, and histology are associated with persistence of pathological intermediate- and/or high-risk factors after NACT and radical surgery. This information is useful in counseling patients at the time of treatment planning with regard to the probability of being subjected to pelvic radiotherapy after completion of the initially planned treatment.
2013-01-01
Background The sea louse Lepeophtheirus salmonis is the most important ectoparasite of farmed Atlantic salmon (Salmo salar) in Norwegian aquaculture. Control of sea lice is primarily dependent on the use of delousing chemotherapeutants, which are both expensive and toxic to other wildlife. The method most commonly used for monitoring treatment effectiveness relies on measuring the percentage reduction in the mobile stages of Lepeophtheirus salmonis only. However, this does not account for changes in the other sea lice stages and may result in misleading or incomplete interpretation regarding the effectiveness of treatment. With the aim of improving the evaluation of delousing treatments, we explored multivariate analyses of bath treatments using the topical pyrethroid, cypermethrin, in salmon pens at five Norwegian production sites. Results Conventional univariate analysis indicated reductions of over 90% in mobile stages at all sites. In contrast, multivariate analyses indicated differing treatment effectiveness between sites (p-value < 0.01) based on changes in the proportion and abundance of the chalimus and PAAM (pre-adult and adult males) stages. Low water temperatures and shortened intervals between sampling after treatment may account for the differences in the composition of chalimus and PAAM stage groups following treatment. Using multivariate analysis, such factors could be separated from those which were attributable to inadequate treatment or chemotherapeutant failure. Conclusions Multivariate analyses for evaluation of treatment effectiveness against multiple life cycle stages of L. salmonis yield additional information beyond that derivable from univariate methods. This can aid in the identification of causes of apparent treatment failure in salmon aquaculture. PMID:24354936
Computational Visual Stress Level Analysis of Calcareous Algae Exposed to Sedimentation
Nilssen, Ingunn; Eide, Ingvar; de Oliveira Figueiredo, Marcia Abreu; de Souza Tâmega, Frederico Tapajós; Nattkemper, Tim W.
2016-01-01
This paper presents a machine learning based approach for analyses of photos collected from laboratory experiments conducted to assess the potential impact of water-based drill cuttings on deep-water rhodolith-forming calcareous algae. This pilot study uses imaging technology to quantify and monitor the stress levels of the calcareous algae Mesophyllum engelhartii (Foslie) Adey caused by various degrees of light exposure, flow intensity and amount of sediment. A machine learning based algorithm was applied to assess the temporal variation of the calcareous algae size (∼ mass) and color automatically. Measured size and color were correlated to the photosynthetic efficiency (maximum quantum yield of charge separation in photosystem II, ΦPSIImax) and degree of sediment coverage using multivariate regression. The multivariate regression showed correlations between time and calcareous algae sizes, as well as correlations between fluorescence and calcareous algae colors. PMID:27285611
The association between physical activity and social isolation in community-dwelling older adults.
Robins, Lauren M; Hill, Keith D; Finch, Caroline F; Clemson, Lindy; Haines, Terry
2018-02-01
Social isolation is an increasing concern in older community-dwelling adults. There is growing need to determine effective interventions addressing social isolation. This study aimed to determine whether a relationship exists between physical activity (recreational and/or household-based) and social isolation. An examination was conducted for whether group- or home-based falls prevention exercise was associated with social isolation. Cross-sectional analysis of telephone survey data was used to investigate relationships between physical activity, health, age, gender, living arrangements, ethnicity and participation in group- or home-based falls prevention exercise on social isolation. Univariable and multivariable ordered logistic regression analyses were conducted. Factors found to be significantly associated with reduced social isolation in multivariable analysis included living with a partner/spouse, reporting better general health, higher levels of household-based physical activity (OR = 1.03, CI = 1.01-1.05) and feeling less downhearted/depressed. Being more socially isolated was associated with symptoms of depression and a diagnosis of congestive heart failure (pseudo R² = 0.104). Findings suggest that household-based physical activity is related to social isolation in community-dwelling older adults. Further research is required to determine the nature of this relationship and to investigate the impact of group physical activity interventions on social isolation.
Two-sample tests and one-way MANOVA for multivariate biomarker data with nondetects.
Thulin, M
2016-09-10
Testing whether the mean vector of a multivariate set of biomarkers differs between several populations is an increasingly common problem in medical research. Biomarker data is often left censored because some measurements fall below the laboratory's detection limit. We investigate how such censoring affects multivariate two-sample and one-way multivariate analysis of variance tests. Type I error rates, power and robustness to increasing censoring are studied, under both normality and non-normality. Parametric tests are found to perform better than non-parametric alternatives, indicating that the current recommendations for analysis of censored multivariate data may have to be revised. Copyright © 2016 John Wiley & Sons, Ltd.
A non-iterative extension of the multivariate random effects meta-analysis.
Makambi, Kepher H; Seung, Hyunuk
2015-01-01
Multivariate methods in meta-analysis are becoming popular and more accepted in biomedical research despite computational issues in some of the techniques. A number of approaches, both iterative and non-iterative, have been proposed including the multivariate DerSimonian and Laird method by Jackson et al. (2010), which is non-iterative. In this study, we propose an extension of the method by Hartung and Makambi (2002) and Makambi (2001) to multivariate situations. A comparison of the bias and mean square error from a simulation study indicates that, in some circumstances, the proposed approach performs better than the multivariate DerSimonian-Laird approach. An example is presented to demonstrate the application of the proposed approach.
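For orientation, the non-iterative flavour of the DerSimonian-Laird approach can be seen in its familiar univariate form, sketched below. This is the standard single-outcome moment estimator, not the multivariate version of Jackson et al. nor the extension proposed in this paper; the function name and the example inputs are hypothetical.

```python
def dersimonian_laird(effects, variances):
    """Univariate random-effects meta-analysis with the DerSimonian-Laird estimator.

    Returns (pooled effect, standard error, between-study variance tau^2).
    Requires at least two studies.
    """
    k = len(effects)
    w = [1.0 / v for v in variances]  # fixed-effect (inverse-variance) weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sum_w - sum(wi * wi for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)  # moment estimate, truncated at zero
    w_star = [1.0 / (v + tau2) for v in variances]  # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se = (1.0 / sum(w_star)) ** 0.5
    return pooled, se, tau2


# Two hypothetical studies with effects 0 and 2 and unit within-study variances.
pooled, se, tau2 = dersimonian_laird([0.0, 2.0], [1.0, 1.0])
```

Everything is computed in closed form from the study estimates and their variances, with no iterative likelihood maximisation, which is the property the abstract refers to as non-iterative.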
Maric, Mark; Harvey, Lauren; Tomcsak, Maren; Solano, Angelique; Bridge, Candice
2017-06-30
In comparison to other violent crimes, sexual assaults suffer from very low prosecution and conviction rates especially in the absence of DNA evidence. As a result, the forensic community needs to utilize other forms of trace contact evidence, like lubricant evidence, in order to provide a link between the victim and the assailant. In this study, 90 personal bottled and condom lubricants from the three main marketing types, silicone-based, water-based and condoms, were characterized by direct analysis in real time time of flight mass spectrometry (DART-TOFMS). The instrumental data was analyzed by multivariate statistics including hierarchical cluster analysis, principal component analysis, and linear discriminant analysis. By interpreting the mass spectral data with multivariate statistics, 12 discrete groupings were identified, indicating inherent chemical diversity not only between but within the three main marketing groups. A number of unique chemical markers, both major and minor, were identified, other than the three main chemical components (i.e. PEG, PDMS and nonoxynol-9) currently used for lubricant classification. The data was validated by a stratified 20% withheld cross-validation which demonstrated that there was minimal overlap between the groupings. Based on the groupings identified and unique features of each group, a highly discriminating statistical model was then developed that aims to provide the foundation for the development of a forensic lubricant database that may eventually be applied to casework. Copyright © 2017 John Wiley & Sons, Ltd.
Multivariate Autoregressive Modeling and Granger Causality Analysis of Multiple Spike Trains
Krumin, Michael; Shoham, Shy
2010-01-01
Recent years have seen the emergence of microelectrode arrays and optical methods allowing simultaneous recording of spiking activity from populations of neurons in various parts of the nervous system. The analysis of multiple neural spike train data could benefit significantly from existing methods for multivariate time-series analysis which have proven to be very powerful in the modeling and analysis of continuous neural signals like EEG signals. However, those methods have not generally been well adapted to point processes. Here, we use our recent results on correlation distortions in multivariate Linear-Nonlinear-Poisson spiking neuron models to derive generalized Yule-Walker-type equations for fitting “hidden” Multivariate Autoregressive models. We use this new framework to perform Granger causality analysis in order to extract the directed information flow pattern in networks of simulated spiking neurons. We discuss the relative merits and limitations of the new method. PMID:20454705
A refined method for multivariate meta-analysis and meta-regression.
Jackson, Daniel; Riley, Richard D
2014-02-20
Making inferences about the average treatment effect using the random effects model for meta-analysis is problematic in the common situation where there is a small number of studies. This is because estimates of the between-study variance are not precise enough to accurately apply the conventional methods for testing and deriving a confidence interval for the average effect. We have found that a refined method for univariate meta-analysis, which applies a scaling factor to the estimated effects' standard error, provides more accurate inference. We explain how to extend this method to the multivariate scenario and show that our proposal for refined multivariate meta-analysis and meta-regression can provide more accurate inferences than the more conventional approach. We explain how our proposed approach can be implemented using standard output from multivariate meta-analysis software packages and apply our methodology to two real examples. Copyright © 2013 John Wiley & Sons, Ltd.
Masiá, M; Gutiérrez, F; Padilla, S; Soldán, B; Mirete, C; Shum, C; Hernández, I; Royo, G; Martin-Hidalgo, A
2007-02-01
The aim of this study was to characterise community-acquired pneumonia (CAP) caused by atypical pathogens by combining distinctive clinical and epidemiological features and novel biological markers. A population-based prospective study of consecutive patients with CAP included investigation of biomarkers of bacterial infection, e.g., procalcitonin, C-reactive protein and lipopolysaccharide-binding protein (LBP) levels. Clinical, radiological and laboratory data for patients with CAP caused by atypical pathogens were compared by univariate and multivariate analysis with data for patients with typical pathogens and patients from whom no organisms were identified. Two predictive scoring models were developed with the most discriminatory variables from multivariate analysis. Of 493 patients, 94 had CAP caused by atypical pathogens. According to multivariate analysis, patients with atypical pneumonia were more likely to have normal white blood cell counts, have repetitive air-conditioning exposure, be aged <65 years, have elevated aspartate aminotransferase levels, have been exposed to birds, and have lower serum levels of LBP. Two different scoring systems were developed that predicted atypical pathogens with sensitivities of 35.2% and 48.8%, and specificities of 93% and 91%, respectively. The combination of selected patient characteristics and laboratory data identified up to half of the cases of atypical pneumonia with high specificity, which should help clinicians to optimise initial empirical therapy for CAP.
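The sensitivity and specificity figures reported for the scoring systems follow from a simple two-by-two tabulation, sketched below. The score values, the threshold, and the function name are invented for illustration and do not reproduce the study's models or data.

```python
def sensitivity_specificity(scores, is_atypical, threshold):
    """Classify a case as atypical when score >= threshold, then tabulate.

    Returns (sensitivity, specificity). Assumes the data contain at least one
    atypical and one non-atypical case, otherwise a ratio is undefined.
    """
    tp = fn = tn = fp = 0
    for score, truth in zip(scores, is_atypical):
        predicted = score >= threshold
        if truth and predicted:
            tp += 1
        elif truth:
            fn += 1
        elif predicted:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)  # share of atypical cases that are flagged
    specificity = tn / (tn + fp)  # share of other cases that are not flagged
    return sensitivity, specificity


# Hypothetical scores for four atypical and four non-atypical cases.
scores = [5, 4, 1, 0, 3, 1, 0, 0]
truth = [True, True, True, True, False, False, False, False]
sens, spec = sensitivity_specificity(scores, truth, threshold=3)
```

Raising the threshold generally trades sensitivity for specificity, which is consistent with the pattern in the abstract of modest sensitivity paired with specificity above 90%.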
NASA Astrophysics Data System (ADS)
Gaudio, P.; Malizia, A.; Gelfusa, M.; Martinelli, E.; Di Natale, C.; Poggi, L. A.; Bellecci, C.
2017-01-01
Nowadays Toxic Industrial Components (TICs) and Toxic Industrial Materials (TIMs) are among the most dangerous and widespread vehicles of contamination in urban and industrial areas. The academic world, together with the industrial and military ones, is working on innovative solutions to monitor the diffusion of such pollutants in the atmosphere. At this stage the most common commercial sensors are based on “point detection” technology, but it is clear that such instruments cannot satisfy the needs of smart cities. The new challenge is developing stand-off systems to continuously monitor the atmosphere. The Quantum Electronics and Plasma Physics (QEP) research group has long experience in laser system development and has built two demonstrators based on DIAL (Differential Absorption of Light) technology that could identify chemical agents in the atmosphere. In this work the authors present one of those DIAL systems, the miniaturized one, together with the preliminary results of an experimental campaign conducted on TIC and TIM simulants in a cell, with the aim of using the absorption database for further atmospheric analysis with the same DIAL system. The experimental results are analysed with a standard multivariate data analysis technique, Principal Component Analysis (PCA), to develop a classification model aimed at identifying organic chemical compounds in the atmosphere. The preliminary results for the absorption coefficients of some chemical compounds are shown together with the pre-PCA analysis.
ERIC Educational Resources Information Center
Jinajai, Nattapong; Rattanavich, Saowalak
2015-01-01
This research aims to study the development of ninth grade students' reading and writing abilities and interests in learning English taught through computer-assisted instruction (CAI) based on the top-level structure (TLS) method. An experimental group time series design was used, and the data was analyzed by multivariate analysis of variance…
ERIC Educational Resources Information Center
Carpino, Rachel; Walker, Mary P.; Liu, Ying; Simmer-Beck, Melanie
2017-01-01
This program evaluation examines the effectiveness of a school-based dental clinic. A repeated-measures design was used to longitudinally examine secondary data from participants (N = 293). Encounter intensity was developed to normalize data. Multivariate analysis of variance and Kruskal-Wallis test were used to investigate the effect of encounter…
Ji, Hong; Petro, Nathan M; Chen, Badong; Yuan, Zejian; Wang, Jianji; Zheng, Nanning; Keil, Andreas
2018-02-06
Over the past decade, the simultaneous recording of electroencephalogram (EEG) and functional magnetic resonance imaging (fMRI) data has garnered growing interest because it may provide an avenue towards combining the strengths of both imaging modalities. Given their pronounced differences in temporal and spatial statistics, the combination of EEG and fMRI data is however methodologically challenging. Here, we propose a novel screening approach that relies on a Cross Multivariate Correlation Coefficient (xMCC) framework. This approach accomplishes three tasks: (1) It provides a measure for testing multivariate correlation and multivariate uncorrelation of the two modalities; (2) it provides criterion for the selection of EEG features; (3) it performs a screening of relevant EEG information by grouping the EEG channels into clusters to improve efficiency and to reduce computational load when searching for the best predictors of the BOLD signal. The present report applies this approach to a data set with concurrent recordings of steady-state-visual evoked potentials (ssVEPs) and fMRI, recorded while observers viewed phase-reversing Gabor patches. We test the hypothesis that fluctuations in visuo-cortical mass potentials systematically covary with BOLD fluctuations not only in visual cortical, but also in anterior temporal and prefrontal areas. Results supported the hypothesis and showed that the xMCC-based analysis provides straightforward identification of neurophysiological plausible brain regions with EEG-fMRI covariance. Furthermore xMCC converged with other extant methods for EEG-fMRI analysis. © 2018 The Authors Journal of Neuroscience Research Published by Wiley Periodicals, Inc.
1993-06-18
the exception. In the Standardized Aquatic Microcosm and the Mixed Flask Culture (MFC) microcosms, multivariate analysis and clustering methods...rule rather than the exception. In the Standardized Aquatic Microcosm and the Mixed Flask Culture (MFC) microcosms, multivariate analysis and...experiments using two microcosm protocols. We use nonmetric clustering, a multivariate pattern recognition technique developed by Matthews and Heame (1991
A nonparametric clustering technique which estimates the number of clusters
NASA Technical Reports Server (NTRS)
Ramey, D. B.
1983-01-01
In applications of cluster analysis, one usually needs to determine the number of clusters, K, and the assignment of observations to each cluster. A clustering technique based on recursive application of a multivariate test of bimodality which automatically estimates both K and the cluster assignments is presented.
Multivariate analysis for scanning tunneling spectroscopy data
NASA Astrophysics Data System (ADS)
Yamanishi, Junsuke; Iwase, Shigeru; Ishida, Nobuyuki; Fujita, Daisuke
2018-01-01
We applied principal component analysis (PCA) to two-dimensional tunneling spectroscopy (2DTS) data obtained on a Si(111)-(7 × 7) surface to explore the effectiveness of multivariate analysis for interpreting 2DTS data. We demonstrated that several components that originated mainly from specific atoms at the Si(111)-(7 × 7) surface can be extracted by PCA. Furthermore, we showed that hidden components in the tunneling spectra can be decomposed (peak separation), which is difficult to achieve with normal 2DTS analysis without the support of theoretical calculations. Our analysis showed that multivariate analysis can be an additional powerful way to analyze 2DTS data and extract hidden information from a large amount of spectroscopic data.
NASA Technical Reports Server (NTRS)
Hague, D. S.; Woodbury, N. W.
1975-01-01
The MARS system is a tool for rapid prediction of aircraft or engine characteristics based on correlation-regression analysis of past designs stored in the data bases. An example of output obtained from the MARS system is given, which involves derivation of an expression for the gross weight of subsonic transport aircraft in terms of nine independent variables. The need for careful selection of correlation variables and for continual review of the resulting estimation equations is illustrated. For Vol. 1, see N76-10089.
Application of multivariate statistical techniques in microbial ecology.
Paliy, O; Shankar, V
2016-03-01
Recent advances in high-throughput methods of molecular analyses have led to an explosion of studies generating large-scale ecological data sets. The effect has been particularly noticeable in the field of microbial ecology, where new experimental approaches have provided in-depth assessments of the composition, functions and dynamic changes of complex microbial communities. Because even a single high-throughput experiment produces a large amount of data, powerful statistical techniques of multivariate analysis are well suited to analyse and interpret these data sets. Many different multivariate techniques are available, and often it is not clear which method should be applied to a particular data set. In this review, we describe and compare the most widely used multivariate statistical techniques including exploratory, interpretive and discriminatory procedures. We consider several important limitations and assumptions of these methods, and we present examples of how these approaches have been utilized in recent studies to provide insight into the ecology of the microbial world. Finally, we offer suggestions for the selection of appropriate methods based on the research question and data set structure. © 2016 John Wiley & Sons Ltd.
Multivariate Analysis of Schools and Educational Policy.
ERIC Educational Resources Information Center
Kiesling, Herbert J.
This report describes a multivariate analysis technique that approaches the problems of educational production function analysis by (1) using comparable measures of output across large experiments, (2) accounting systematically for differences in socioeconomic background, and (3) treating the school as a complete system in which different…
Analysis of Big Data in Gait Biomechanics: Current Trends and Future Directions.
Phinyomark, Angkoon; Petri, Giovanni; Ibáñez-Marcelo, Esther; Osis, Sean T; Ferber, Reed
2018-01-01
The increasing amount of data in biomechanics research has greatly increased the importance of developing advanced multivariate analysis and machine learning techniques, which are better able to handle "big data". Consequently, advances in data science methods will expand the knowledge for testing new hypotheses about biomechanical risk factors associated with walking and running gait-related musculoskeletal injury. This paper begins with a brief introduction to an automated three-dimensional (3D) biomechanical gait data collection system: 3D GAIT, followed by how the studies in the field of gait biomechanics fit the quantities in the 5 V's definition of big data: volume, velocity, variety, veracity, and value. Next, we provide a review of recent research and development in multivariate and machine learning methods-based gait analysis that can be applied to big data analytics. These modern biomechanical gait analysis methods include several main modules such as initial input features, dimensionality reduction (feature selection and extraction), and learning algorithms (classification and clustering). Finally, a promising big data exploration tool called "topological data analysis" and directions for future research are outlined and discussed.
Multivariate η-μ fading distribution with arbitrary correlation model
NASA Astrophysics Data System (ADS)
Ghareeb, Ibrahim; Atiani, Amani
2018-03-01
An extensive analysis for the multivariate η-μ distribution with arbitrary correlation is presented, where novel analytical expressions for the multivariate probability density function, cumulative distribution function and moment generating function (MGF) of arbitrarily correlated and not necessarily identically distributed η-μ power random variables are derived. Also, this paper provides an exact-form expression for the MGF of the instantaneous signal-to-noise ratio at the combiner output in a diversity reception system with maximal-ratio combining and post-detection equal-gain combining operating in slow frequency nonselective arbitrarily correlated not necessarily identically distributed η-μ fading channels. The average bit error probability of differentially detected quadrature phase shift keying signals with post-detection diversity reception system over arbitrarily correlated and not necessarily identical fading parameters η-μ fading channels is determined by using the MGF-based approach. The effect of fading correlation between diversity branches, fading severity parameters and diversity level is studied.
Copula-based analysis of rhythm
NASA Astrophysics Data System (ADS)
García, J. E.; González-López, V. A.; Viola, M. L. Lanfredi
2016-06-01
In this paper we establish stochastic profiles of the rhythm for three languages: English, Japanese and Spanish. We model the increase or decrease of the acoustical energy, collected into three bands coming from the acoustic signal. The number of parameters needed to specify a discrete multivariate Markov chain grows exponentially with the order and dimension of the chain. In this case the size of the database is not large enough for a consistent estimation of the model. We apply a strategy to estimate a multivariate process with an order greater than the order achieved using standard procedures. The new strategy consists of obtaining a partition of the state space which is constructed from a combination of the partitions corresponding to the three marginal processes, one for each band of energy, and the partition coming from the multivariate Markov chain. Then, all the partitions are linked using a copula, in order to estimate the transition probabilities.
Evaluation and simplification of the occupational slip, trip and fall risk-assessment test
NAKAMURA, Takehiro; OYAMA, Ichiro; FUJINO, Yoshihisa; KUBO, Tatsuhiko; KADOWAKI, Koji; KUNIMOTO, Masamizu; ODOI, Haruka; TABATA, Hidetoshi; MATSUDA, Shinya
2016-01-01
Objective: The purpose of this investigation is to evaluate the efficacy of the occupational slip, trip and fall (STF) risk assessment test developed by the Japan Industrial Safety and Health Association (JISHA). We further intended to simplify the test to improve efficiency. Methods: A previous cohort study was performed using 540 employees aged ≥50 years who took the JISHA's STF risk assessment test. We conducted multivariate analysis using these previous results as baseline values and answers to questionnaire items or scores on physical fitness tests as variables. The screening efficiency of each model was evaluated based on the obtained receiver operating characteristic (ROC) curve. Results: The area under the ROC curve obtained in multivariate analysis was 0.79 when using all items. Six of the 25 questionnaire items were selected by stepwise analysis, giving an area under the ROC curve of 0.77. Conclusion: Based on the results of follow-up performed one year after the initial examination, we successfully determined the usefulness of the STF risk assessment test. Administering a questionnaire alone is sufficient for screening subjects at risk of STF during the subsequent one-year period. PMID:27021057
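The area under the ROC curve reported above can be read as the probability that a randomly chosen subject who fell receives a higher risk score than a randomly chosen subject who did not. A minimal sketch of that calculation follows, assuming binary outcomes and numeric risk scores; the function name `roc_auc` and the toy data are hypothetical and are not taken from the study.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity.

    scores: model risk scores, higher = more at risk.
    labels: 1 for subjects with the event (e.g. a fall), 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, for illustration only.
risk_scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
fell = [1, 1, 1, 0, 0, 0]
auc = roc_auc(risk_scores, fell)
print(round(auc, 3))
```

Comparing two models, such as one using all items and one using a reduced item set, then amounts to calling the function once per model on the same outcomes.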
Gauging Skills of Hospital Security Personnel: a Statistically-driven, Questionnaire-based Approach
Rinkoo, Arvind Vashishta; Mishra, Shubhra; Rahesuddin; Nabi, Tauqeer; Chandra, Vidha; Chandra, Hem
2013-01-01
Objectives: This study aims to gauge the technical and soft skills of the hospital security personnel so as to enable prioritization of their training needs. Methodology: A cross sectional questionnaire based study was conducted in December 2011. Two separate predesigned and pretested questionnaires were used for gauging soft skills and technical skills of the security personnel. Extensive statistical analysis, including Multivariate Analysis (Pillai-Bartlett trace along with Multi-factorial ANOVA) and Post-hoc Tests (Bonferroni Test) was applied. Results: The 143 participants performed better on the soft skills front with an average score of 6.43 and standard deviation of 1.40. The average technical skills score was 5.09 with a standard deviation of 1.44. The study avowed a need for formal hands on training with greater emphasis on technical skills. Multivariate analysis of the available data further helped in identifying 20 security personnel who should be prioritized for soft skills training and a group of 36 security personnel who should receive maximum attention during technical skills training. Conclusion: This statistically driven approach can be used as a prototype by healthcare delivery institutions worldwide, after situation specific customizations, to identify the training needs of any category of healthcare staff. PMID:23559904
A novel multivariate STeady-state index during general ANesthesia (STAN).
Castro, Ana; de Almeida, Fernando Gomes; Amorim, Pedro; Nunes, Catarina S
2017-08-01
The assessment of the adequacy of general anesthesia for surgery, namely the nociception/anti-nociception balance, has received wide attention from the scientific community. Monitoring systems based on the frontal EEG/EMG, or autonomic state reactions (e.g. heart rate and blood pressure) have been developed aiming to objectively assess this balance. In this study a new multivariate indicator of patients' steady-state during anesthesia (STAN) is proposed, based on wavelet analysis of signals linked to noxious activation. A clinical protocol was designed to analyze precise noxious stimuli (laryngoscopy/intubation, tetanic, and incision), under three different analgesic doses; patients were randomized to receive either remifentanil 2.0, 3.0 or 4.0 ng/ml. ECG, PPG, BP, BIS, EMG and [Formula: see text] were continuously recorded. ECG, PPG and BP were processed to extract beat-to-beat information, and [Formula: see text] curve used to estimate the respiration rate. A combined steady-state index based on wavelet analysis of these variables, was applied and compared between the three study groups and stimuli (Wilcoxon signed ranks, Kruskal-Wallis and Mann-Whitney tests). Following institutional approval and signing the informed consent thirty four patients were enrolled in this study (3 excluded due to signal loss during data collection). The BIS index of the EEG, frontal EMG, heart rate, BP, and PPG wave amplitude changed in response to different noxious stimuli. Laryngoscopy/intubation was the stimulus with the more pronounced response [Formula: see text]. These variables were used in the construction of the combined index STAN; STAN responded adequately to noxious stimuli, with a more pronounced response to laryngoscopy/intubation (18.5-43.1 %, [Formula: see text]), and the attenuation provided by the analgesic, detecting steady-state periods in the different physiological signals analyzed (approximately 50 % of the total study time). 
A new multivariate approach for the assessment of the patient steady-state during general anesthesia was developed. The proposed wavelet based multivariate index responds adequately to different noxious stimuli, and attenuation provided by the analgesic in a dose-dependent manner for each stimulus analyzed in this study.
Baratieri, Sabrina C; Barbosa, Juliana M; Freitas, Matheus P; Martins, José A
2006-01-23
A multivariate method of analysis of nystatin and metronidazole in a semi-solid matrix, based on diffuse reflectance NIR measurements and partial least squares regression, is reported. The product, a vaginal cream used in antifungal and antibacterial treatment, is usually analyzed quantitatively through microbiological tests (nystatin) and HPLC (metronidazole), according to pharmacopeial procedures. However, near infrared spectroscopy has proven to be a valuable tool for content determination, given the rapidity and scope of the method. In the present study, it was successfully applied in the prediction of nystatin (even in low concentrations, ca. 0.3-0.4%, w/w, which is around 100,000 IU/5g) and metronidazole contents, as demonstrated by some figures of merit, namely linearity, precision (mean and repeatability) and accuracy.
Can texture analysis of tooth microwear detect within guild niche partitioning in extinct species?
NASA Astrophysics Data System (ADS)
Purnell, Mark; Nedza, Christopher; Rychlik, Leszek
2017-04-01
Recent work shows that tooth microwear analysis can be applied further back in time and deeper into the phylogenetic history of vertebrate clades than previously thought (e.g. niche partitioning in early Jurassic insectivorous mammals; Gill et al., 2014, Nature). Furthermore, quantitative approaches to analysis based on parameterization of surface roughness are increasing the robustness and repeatability of this widely used dietary proxy. Discriminating between taxa within dietary guilds has the potential to significantly increase our ability to determine resource use and partitioning in fossil vertebrates, but how sensitive is the technique? To address this question we analysed tooth microwear texture in sympatric populations of shrew species (Neomys fodiens, Neomys anomalus, Sorex araneus, Sorex minutus) from Białowieża Forest, Poland. These populations are known to exhibit varying degrees of niche partitioning (Churchfield & Rychlik, 2006, J. Zool.) with greatest overlap between the Neomys species. Sorex araneus also exhibits some niche overlap with N. anomalus, while S. minutus is the most specialised. Multivariate analysis based only on tooth microwear textures recovers the same pattern of niche partitioning. Our results also suggest that tooth textures track seasonal differences in diet. Projecting data from fossils into the multivariate dietary space defined using microwear from extant taxa demonstrates that the technique is capable of subtle dietary discrimination in extinct insectivores.
Wager, M; Menei, P; Guilhot, J; Levillain, P; Michalak, S; Bataille, B; Blanc, J-L; Lapierre, F; Rigoard, P; Milin, S; Duthe, F; Bonneau, D; Larsen, C-J; Karayan-Tapon, L
2008-06-03
This study assessed the prognostic value of several markers involved in gliomagenesis, and compared it with that of other clinical and imaging markers already used. Four hundred and sixteen adult patients with newly diagnosed glioma were included over a 3-year period and tumour suppressor genes, oncogenes, MGMT and hTERT expressions, losses of heterozygosity, as well as relevant clinical and imaging information were recorded. This prospective study was based on all adult gliomas. Analyses were performed on patient groups selected according to World Health Organization histoprognostic criteria and on the entire cohort. The endpoint was overall survival, estimated by the Kaplan-Meier method. Univariate analysis was followed by multivariate analysis according to a Cox model. p14(ARF), p16(INK4A) and PTEN expressions, and 10p, 10q23, 10q26 and 13q LOH for the entire cohort, hTERT expression for high-grade tumours, EGFR for glioblastomas, and 10q26 LOH for grade III tumours and anaplastic oligodendrogliomas were found to be correlated with overall survival on univariate analysis, whereas only age and grade were correlated on multivariate analysis. This study confirms the prognostic value of several markers. However, the scattering of the values explained by tumour heterogeneity prevents their use in individual decision-making.
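The Kaplan-Meier method named above estimates overall survival as a running product over the observed event times, with censored patients leaving the risk set without counting as deaths. A minimal sketch follows, assuming follow-up times and a 0/1 event indicator; the function name `kaplan_meier` and the toy follow-up data are hypothetical and do not come from the study.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event.

    times:  follow-up duration per patient.
    events: 1 if death was observed at that time, 0 if censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored cases both leave the risk set
    return curve


# Hypothetical toy follow-up data (months), for illustration only.
months = [5, 8, 8, 12, 15, 20]
died = [1, 1, 0, 1, 0, 1]
curve = kaplan_meier(months, died)
for t, s in curve:
    print(t, round(s, 3))
```

Curves computed per marker group in this way are what a univariate comparison would contrast; the Cox model used for the multivariate step is a separate technique not sketched here.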
NASA Technical Reports Server (NTRS)
Wolf, S. F.; Lipschutz, M. E.
1993-01-01
Multivariate statistical analysis techniques (linear discriminant analysis and logistic regression) can provide powerful discrimination tools which are generally unfamiliar to the planetary science community. Fall parameters were used to identify a group of 17 H chondrites (Cluster 1) that were part of a coorbital stream which intersected Earth's orbit in May, from 1855 to 1895, and can be distinguished from all other H chondrite falls. Using multivariate statistical techniques, it was demonstrated that by a totally different criterion, labile trace element contents (hence thermal histories), 13 Cluster 1 meteorites are distinguishable from 45 non-Cluster 1 H chondrites. Here, we focus upon the principles of multivariate statistical techniques and illustrate their application using non-meteoritic and meteoritic examples.
Nojima, Masanori; Tokunaga, Mutsumi; Nagamura, Fumitaka
2018-05-05
To investigate under what circumstances inappropriate use of 'multivariate analysis' is likely to occur and to identify the population that needs more support with medical statistics. The frequency of inappropriate regression model construction in multivariate analysis and related factors were investigated in observational medical research publications. The inappropriate algorithm of using only variables that were significant in univariate analysis was estimated to occur at 6.4% (95% CI 4.8% to 8.5%). This was observed in 1.1% of the publications with a medical statistics expert (hereinafter 'expert') as the first author, 3.5% if an expert was included as coauthor and in 12.2% if experts were not involved. In the publications where the number of cases was 50 or less and the study did not include experts, inappropriate algorithm usage was observed in a high proportion (20.2%). The OR of the involvement of experts for this outcome was 0.28 (95% CI 0.15 to 0.53). A further nation-level analysis showed that the involvement of experts and the implementation of unfavourable multivariate analysis are associated (R=-0.652). Based on the results of this study, the benefit of participation of medical statistics experts is obvious. Experts should be involved for proper confounding adjustment and interpretation of statistical models. © Article author(s) (or their employer(s) unless otherwise stated in the text of the article) 2018. All rights reserved. No commercial use is permitted unless otherwise expressly granted.
Moseson, Heidi; Gerdts, Caitlin; Dehlendorf, Christine; Hiatt, Robert A; Vittinghoff, Eric
2017-12-21
The list experiment is a promising measurement tool for eliciting truthful responses to stigmatized or sensitive health behaviors. However, investigators may be hesitant to adopt the method due to previously untestable assumptions and the perceived inability to conduct multivariable analysis. With a recently developed statistical test that can detect the presence of a design effect - the absence of which is a central assumption of the list experiment method - we sought to test the validity of a list experiment conducted on self-reported abortion in Liberia. We also aim to introduce recently developed multivariable regression estimators for the analysis of list experiment data, to explore relationships between respondent characteristics and having had an abortion - an important component of understanding the experiences of women who have abortions. To test the null hypothesis of no design effect in the Liberian list experiment data, we calculated the percentage of each respondent "type," characterized by response to the control items, and compared these percentages across treatment and control groups with a Bonferroni-adjusted alpha criterion. We then implemented two least squares and two maximum likelihood models (four total), each representing different bias-variance trade-offs, to estimate the association between respondent characteristics and abortion. We find no clear evidence of a design effect in list experiment data from Liberia (p = 0.18), affirming the first key assumption of the method. Multivariable analyses suggest a negative association between education and history of abortion. The retrospective nature of measuring lifetime experience of abortion, however, complicates interpretation of results, as the timing and safety of a respondent's abortion may have influenced her ability to pursue an education. 
Our work demonstrates that multivariable analyses, as well as statistical testing of a key design assumption, are possible with list experiment data, although with important limitations when considering lifetime measures. We outline how to implement this methodology with list experiment data in future research.
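For orientation, the basic quantity behind a list experiment is a difference in means: the control group reports how many of J non-sensitive items apply, the treatment group reports a count over the same items plus the sensitive one, and the gap between the two mean counts estimates the prevalence of the sensitive behavior. The sketch below shows only this simple estimator, not the multivariable least squares or maximum likelihood models discussed in the abstract; the function name and the toy counts are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def list_experiment_prevalence(control_counts, treatment_counts):
    """Difference-in-means estimate of sensitive-item prevalence.

    Returns (estimate, standard_error), assuming independent random
    assignment to the control and treatment lists and no design effect.
    """
    estimate = mean(treatment_counts) - mean(control_counts)
    se = sqrt(
        variance(treatment_counts) / len(treatment_counts)
        + variance(control_counts) / len(control_counts)
    )
    return estimate, se


# Hypothetical toy item counts, for illustration only.
control = [1, 2, 2, 3, 1, 2, 2, 3]      # counts over the control items
treatment = [2, 2, 3, 3, 1, 3, 2, 2]    # same items plus the sensitive item
prevalence, std_err = list_experiment_prevalence(control, treatment)
print(prevalence, round(std_err, 3))
```

The estimate is only interpretable when adding the sensitive item does not change how respondents answer the control items, which is the design-effect assumption the abstract describes testing.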
Zafar, Raheel; Kamel, Nidal; Naufal, Mohamad; Malik, Aamir Saeed; Dass, Sarat C; Ahmad, Rana Fayyaz; Abdullah, Jafri M; Reza, Faruque
2017-01-01
Decoding of human brain activity has always been a primary goal in neuroscience, especially with functional magnetic resonance imaging (fMRI) data. In recent years, the convolutional neural network (CNN) has become a popular method for the extraction of features due to its higher accuracy; however, it requires a lot of computation and training data. In this study, an algorithm is developed using multivariate pattern analysis (MVPA) and a modified CNN to decode the behavior of the brain for different images with a limited data set. Selection of significant features is an important part of fMRI data analysis, since it reduces the computational burden and improves the prediction performance; significant features are selected using a t-test. MVPA uses machine learning algorithms to classify different brain states and helps in prediction during the task. A general linear model (GLM) is used to find the unknown parameters of every individual voxel, and the classification is done using a multi-class support vector machine (SVM). The proposed MVPA-CNN based algorithm is compared with a region of interest (ROI) based method and MVPA based estimated values. The proposed method showed better overall accuracy (68.6%) compared to ROI (61.88%) and estimation values (64.17%).
Ferreira, Ana P; Tobyn, Mike
2015-01-01
In the pharmaceutical industry, chemometrics is rapidly establishing itself as a tool that can be used at every step of product development and beyond: from early development to commercialization. This set of multivariate analysis methods allows the extraction of information contained in large, complex data sets thus contributing to increase product and process understanding which is at the core of the Food and Drug Administration's Process Analytical Tools (PAT) Guidance for Industry and the International Conference on Harmonisation's Pharmaceutical Development guideline (Q8). This review is aimed at providing pharmaceutical industry professionals an introduction to multivariate analysis and how it is being adopted and implemented by companies in the transition from "quality-by-testing" to "quality-by-design". It starts with an introduction to multivariate analysis and the two methods most commonly used: principal component analysis and partial least squares regression, their advantages, common pitfalls and requirements for their effective use. That is followed with an overview of the diverse areas of application of multivariate analysis in the pharmaceutical industry: from the development of real-time analytical methods to definition of the design space and control strategy, from formulation optimization during development to the application of quality-by-design principles to improve manufacture of existing commercial products.
Multi-variant study of obesity risk genes in African Americans: The Jackson Heart Study.
Liu, Shijian; Wilson, James G; Jiang, Fan; Griswold, Michael; Correa, Adolfo; Mei, Hao
2016-11-30
Genome-wide association studies (GWAS) have been successful in identifying obesity risk genes by single-variant association analysis. For this study, we designed a stepwise analysis strategy and aimed to identify multi-variant effects on obesity risk among candidate genes. Our analyses were focused on 2137 African American participants with body mass index measured in the Jackson Heart Study and 657 common single nucleotide polymorphisms (SNPs) genotyped at 8 GWAS-identified obesity risk genes. Single-variant association tests showed that no SNPs reached significance after multiple testing adjustment. The subsequent gene-gene interaction analysis, which was focused on SNPs with unadjusted p-value<0.10, identified 6 significant multi-variant associations. Logistic regression showed that SNPs in these associations did not have significant linear interactions; examination of genetic risk scores showed that 4 multi-variant associations had significant additive effects of risk SNPs; and haplotype association tests showed that all multi-variant associations contained one or several combinations of particular alleles or haplotypes associated with increased obesity risk. Our study provides evidence that obesity risk genes generate multi-variant effects, which can be additive or non-linear interactions, and that multi-variant study is an important supplement to existing GWAS for understanding genetic effects of obesity risk genes. Copyright © 2016 Elsevier B.V. All rights reserved.
NASA Technical Reports Server (NTRS)
Morrissey, L. A.; Weinstock, K. J.; Mouat, D. A.; Card, D. H.
1984-01-01
An evaluation of Thematic Mapper Simulator (TMS) data for the geobotanical discrimination of rock types based on vegetative cover characteristics is addressed in this research. A methodology for accomplishing this evaluation utilizing univariate and multivariate techniques is presented. TMS data acquired with a Daedalus DEI-1260 multispectral scanner were integrated with vegetation and geologic information for subsequent statistical analyses, which included a chi-square test, an analysis of variance, stepwise discriminant analysis, and Duncan's multiple range test. Results indicate that ultramafic rock types are spectrally separable from nonultramafics based on vegetative cover through the use of statistical analyses.
Fast Genome-Wide QTL Association Mapping on Pedigree and Population Data.
Zhou, Hua; Blangero, John; Dyer, Thomas D; Chan, Kei-Hang K; Lange, Kenneth; Sobel, Eric M
2017-04-01
Since most analysis software for genome-wide association studies (GWAS) currently exploit only unrelated individuals, there is a need for efficient applications that can handle general pedigree data or mixtures of both population and pedigree data. Even datasets thought to consist of only unrelated individuals may include cryptic relationships that can lead to false positives if not discovered and controlled for. In addition, family designs possess compelling advantages. They are better equipped to detect rare variants, control for population stratification, and facilitate the study of parent-of-origin effects. Pedigrees selected for extreme trait values often segregate a single gene with strong effect. Finally, many pedigrees are available as an important legacy from the era of linkage analysis. Unfortunately, pedigree likelihoods are notoriously hard to compute. In this paper, we reexamine the computational bottlenecks and implement ultra-fast pedigree-based GWAS analysis. Kinship coefficients can either be based on explicitly provided pedigrees or automatically estimated from dense markers. Our strategy (a) works for random sample data, pedigree data, or a mix of both; (b) entails no loss of power; (c) allows for any number of covariate adjustments, including correction for population stratification; (d) allows for testing SNPs under additive, dominant, and recessive models; and (e) accommodates both univariate and multivariate quantitative traits. On a typical personal computer (six CPU cores at 2.67 GHz), analyzing a univariate HDL (high-density lipoprotein) trait from the San Antonio Family Heart Study (935,392 SNPs on 1,388 individuals in 124 pedigrees) takes less than 2 min and 1.5 GB of memory. Complete multivariate QTL analysis of the three time-points of the longitudinal HDL multivariate trait takes less than 5 min and 1.5 GB of memory. 
The algorithm is implemented as the Ped-GWAS Analysis (Option 29) in the Mendel statistical genetics package, which is freely available for Macintosh, Linux, and Windows platforms from http://genetics.ucla.edu/software/mendel. © 2016 WILEY PERIODICALS, INC.
Lu, Tsui-Shan; Longnecker, Matthew P.; Zhou, Haibo
2016-01-01
The outcome-dependent sampling (ODS) scheme is a cost-effective sampling scheme where one observes the exposure with a probability that depends on the outcome. Well-known examples of such designs are the case-control design for binary response, the case-cohort design for failure time data and the general ODS design for a continuous response. While substantial work has been done for the univariate response case, statistical inference and design for the ODS with multivariate cases remain under-developed. Motivated by the need in biological studies for taking advantage of the available responses for subjects in a cluster, we propose a multivariate outcome dependent sampling (Multivariate-ODS) design that is based on a general selection of the continuous responses within a cluster. The proposed inference procedure for the Multivariate-ODS design is semiparametric, where all the underlying distributions of covariates are modeled nonparametrically using empirical likelihood methods. We show that the proposed estimator is consistent and develop its asymptotic normality properties. Simulation studies show that the proposed estimator is more efficient than the estimator obtained using only the simple-random-sample portion of the Multivariate-ODS or the estimator from a simple random sample with the same sample size. The Multivariate-ODS design together with the proposed estimator provides an approach to further improve study efficiency for a given fixed study budget. We illustrate the proposed design and estimator with an analysis of the association of PCB exposure with hearing loss in children born to the Collaborative Perinatal Study. PMID:27966260
Hu, Li-Xin; Ying, Guang-Guo; Chen, Xiao-Wen; Huang, Guo-Yong; Liu, You-Sheng; Jiang, Yu-Xia; Pan, Chang-Gui; Tian, Fei; Martin, Francis L
2017-02-01
Traditional duckweed toxicity tests only measure plant growth inhibition as an endpoint, with limited effects-based data. The present study aimed to investigate whether Fourier-transform infrared (FTIR) spectroscopy could enhance the duckweed (Lemna minor L.) toxicity test. Four chemicals (Cu, Cd, atrazine, and acetochlor) and 4 metal-containing industrial wastewater samples were tested. After exposure of duckweed to the chemicals, standard toxicity endpoints (frond number and chlorophyll content) were determined; the fronds were also interrogated using FTIR spectroscopy under optimized test conditions. Biochemical alterations associated with each treatment were assessed and further analyzed by multivariate analysis. The results showed that comparable x% of effective concentration (ECx) values could be achieved based on FTIR spectroscopy in comparison with those based on traditional toxicity endpoints. Biochemical alterations associated with different doses of toxicant were mainly attributed to lipid, protein, nucleic acid, and carbohydrate structural changes, which helped to explain toxic mechanisms. With the help of multivariate analysis, separation of clusters related to different exposure doses could be achieved. The present study is the first to show successful application of FTIR spectroscopy in standard duckweed toxicity tests with biochemical alterations as new endpoints. Environ Toxicol Chem 2017;36:346-353. © 2016 SETAC.
Instrumental Neutron Activation Analysis and Multivariate Statistics for Pottery Provenance
NASA Astrophysics Data System (ADS)
Glascock, M. D.; Neff, H.; Vaughn, K. J.
2004-06-01
The application of instrumental neutron activation analysis and multivariate statistics to archaeological studies of ceramics and clays is described. A small pottery data set from the Nasca culture in southern Peru is presented for illustration.
A Study of Effects of MultiCollinearity in the Multivariable Analysis
Yoo, Wonsuk; Mayberry, Robert; Bae, Sejong; Singh, Karan; (Peter) He, Qinghua; Lillard, James W.
2015-01-01
Multivariable analysis is the most popular approach when investigating associations between risk factors and disease. However, the efficiency of multivariable analysis depends heavily on the correlation structure among predictive variables. When the covariates in the model are not independent of one another, collinearity/multicollinearity problems arise in the analysis, which leads to biased estimation. This work performs a simulation study with various scenarios of different collinearity structures to investigate the effects of collinearity under various correlation structures amongst predictive and explanatory variables, and to compare these results with existing guidelines for deciding when collinearity is harmful. Three correlation scenarios among predictor variables are considered: (1) a bivariate collinear structure as the simplest collinearity case, (2) a multivariate collinear structure where an explanatory variable is correlated with two other covariates, and (3) a more realistic scenario in which an independent variable can be expressed by various functions including the other variables. PMID:25664257
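For the simplest of the three scenarios, bivariate collinearity, the effect can be made concrete with the variance inflation factor (VIF), a common collinearity diagnostic. The sketch below is illustrative only: the abstract does not say which diagnostics or thresholds the study examined, and the data and function names are made up. With exactly two predictors, the R² from regressing one on the other equals the squared Pearson correlation r², so VIF = 1 / (1 - r²).

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF for a model with exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical data: x2_near almost duplicates x1, x2_orth is uncorrelated.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x2_near = [1.1, 1.9, 3.2, 3.8, 5.1, 6.2]
x_a = [1.0, 2.0, 3.0, 4.0]
x2_orth = [1.0, -1.0, -1.0, 1.0]

print("VIF, nearly collinear pair:", vif_two_predictors(x1, x2_near))
print("VIF, uncorrelated pair:    ", vif_two_predictors(x_a, x2_orth))
```

A VIF of 1 means the predictor's coefficient variance is not inflated at all; widely quoted rules of thumb flag values above roughly 5 or 10, and guidelines of that kind are what a simulation study like this one is positioned to evaluate.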
A Study of Effects of MultiCollinearity in the Multivariable Analysis.
Yoo, Wonsuk; Mayberry, Robert; Bae, Sejong; Singh, Karan; Peter He, Qinghua; Lillard, James W
2014-10-01
Multivariable analysis is the most popular approach when investigating associations between risk factors and disease. However, the efficiency of multivariable analysis depends heavily on the correlation structure among predictive variables. When the covariates in the model are not independent of one another, collinearity/multicollinearity problems arise in the analysis, which leads to biased estimation. This work performs a simulation study with various scenarios of different collinearity structures to investigate the effects of collinearity under various correlation structures amongst predictive and explanatory variables, and to compare these results with existing guidelines for deciding when collinearity is harmful. Three correlation scenarios among predictor variables are considered: (1) a bivariate collinear structure as the simplest collinearity case, (2) a multivariate collinear structure where an explanatory variable is correlated with two other covariates, and (3) a more realistic scenario in which an independent variable can be expressed by various functions including the other variables.
Sun, Feifei; Zhu, Jia; Lu, Suying; Zhen, Zijun; Wang, Juan; Huang, Junting; Ding, Zonghui; Zeng, Musheng; Sun, Xiaofei
2018-01-02
Systemic inflammatory parameters are associated with poor outcomes in patients with malignancies. Several inflammation-based cumulative prognostic score systems have been established for various solid tumors. However, there are few inflammation-based cumulative prognostic score systems for patients with diffuse large B cell lymphoma (DLBCL). We retrospectively reviewed 564 adult DLBCL patients who had received rituximab, cyclophosphamide, doxorubicin, vincristine and prednisolone (R-CHOP) therapy between Nov 1 2006 and Dec 30 2013 and assessed, by univariate and multivariate analysis, the prognostic significance of six systemic inflammatory parameters evaluated in previous studies: C-reactive protein (CRP), albumin levels, the lymphocyte-monocyte ratio (LMR), the neutrophil-lymphocyte ratio (NLR), the platelet-lymphocyte ratio (PLR) and fibrinogen levels. Multivariate analysis identified CRP, albumin levels and the LMR as three independent prognostic parameters for overall survival (OS). Based on these three factors, we constructed a novel inflammation-based cumulative prognostic score (ICPS) system. Four risk groups were formed: group ICPS = 0, ICPS = 1, ICPS = 2 and ICPS = 3. Advanced multivariate analysis indicated that the ICPS model is a prognostic score system independent of the International Prognostic Index (IPI) for both progression-free survival (PFS) (p < 0.001) and OS (p < 0.001). The 3-year OS for patients with ICPS = 0, ICPS = 1, ICPS = 2 and ICPS = 3 was 95.6, 88.2, 76.0 and 62.2%, respectively (p < 0.001). The 3-year PFS for patients with ICPS = 0-1, ICPS = 2 and ICPS = 3 was 84.8, 71.6 and 54.5%, respectively (p < 0.001). The prognostic value of the ICPS model indicated that the degree of systemic inflammatory status was associated with clinical outcomes of patients with DLBCL in the rituximab era. The ICPS model was shown to classify risk groups more accurately than any single inflammatory prognostic parameter. These findings may be useful for identifying candidates for further inflammation-related mechanism research or novel anti-inflammation target therapies.
Demizu, Yusuke; Murakami, Masao; Miyawaki, Daisuke; Niwa, Yasue; Akagi, Takashi; Sasaki, Ryohei; Terashima, Kazuki; Suga, Daisaku; Kamae, Isao; Hishikawa, Yoshio
2009-12-01
To assess the incident rates of vision loss (VL; based on counting fingers or more severe) caused by radiation-induced optic neuropathy (RION) after particle therapy for tumors adjacent to optic nerves (ONs), and to evaluate factors that may contribute to VL. From August 2001 to August 2006, 104 patients with head-and-neck or skull-base tumors adjacent to ONs were treated with carbon ion or proton radiotherapy. Among them, 145 ONs of 75 patients were irradiated and followed for greater than 12 months. The incident rate of VL and the prognostic factors for occurrence of VL were evaluated. The late effects of carbon ion and proton beams were compared on the basis of a biologically effective dose at alpha/beta = 3 gray equivalent (GyE(3)). Eight patients (11%) experienced VL resulting from RION. The onset of VL ranged from 17 to 58 months. The median follow-up was 25 months. No significant difference was observed between the carbon ion and proton beam treatment groups. On univariate analysis, age (>60 years), diabetes mellitus, and maximum dose to the ON (>110 GyE(3)) were significant, whereas on multivariate analysis only diabetes mellitus was found to be significant for VL. The time to the onset of VL was highly variable. There was no statistically significant difference between carbon ion and proton beam treatments over the follow-up period. Based on multivariate analysis, diabetes mellitus correlated with the occurrence of VL. A larger study with longer follow-up is warranted.
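The comparison above is expressed as a biologically effective dose (BED) at alpha/beta = 3. As a hedged illustration, the standard linear-quadratic expression BED = n * d * (1 + d / (alpha/beta)) can be sketched as follows. The abstract does not spell out the exact formula or the fractionation schedules used, so the function name and the example schedule here are hypothetical.

```python
def biologically_effective_dose(n_fractions, dose_per_fraction, alpha_beta=3.0):
    """Linear-quadratic BED: total dose scaled by (1 + d / (alpha/beta)).

    dose_per_fraction is in the same units as the result (e.g. GyE);
    alpha_beta = 3 is the value quoted in the abstract for late effects.
    """
    if alpha_beta <= 0:
        raise ValueError("alpha_beta must be positive")
    total_dose = n_fractions * dose_per_fraction
    return total_dose * (1.0 + dose_per_fraction / alpha_beta)


# Hypothetical schedule, not taken from the study: 10 fractions of 3 GyE.
print(biologically_effective_dose(10, 3.0))
```

Putting both beam types on a BED scale lets schedules with different fraction sizes be compared against a single dose threshold, such as the 110 GyE(3) cut-off reported in the univariate analysis.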
Gastric cancer, nutritional status, and outcome.
Liu, Xuechao; Qiu, Haibo; Kong, Pengfei; Zhou, Zhiwei; Sun, Xiaowei
2017-01-01
We aim to investigate the prognostic value of several nutrition-based indices, including the prognostic nutritional index (PNI), performance status, body mass index, serum albumin, and preoperative body weight loss in patients with gastric cancer (GC). We retrospectively analyzed the records of 1,330 consecutive patients with GC undergoing curative surgery between October 2000 and September 2012. The relationship between nutrition-based indices and overall survival (OS) was examined using Kaplan-Meier analysis and a Cox regression model. Following multivariate analysis, the PNI and preoperative body weight loss were the only nutrition-based indices independently associated with OS (hazard ratio [HR]: 1.356, 95% confidence interval [CI]: 1.051-1.748, P = 0.019; HR: 1.152, 95% CI: 1.014-1.310, P = 0.030, respectively). In stage-stratified multivariate analysis, preoperative body weight loss was identified as an independent prognostic factor only in patients with stage III GC (HR: 1.223, 95% CI: 1.065-1.405, P = 0.004), while the PNI was not significant in any stage (all P > 0.05). In patients with stage III GC, preoperative body weight loss stratified 5-year OS from 41.1% to 26.5%. When stratified by adjuvant chemotherapy, the prognostic significance of preoperative body weight loss was maintained in patients treated with surgery plus adjuvant chemotherapy and in patients treated with surgery alone (P < 0.001; P = 0.003). Preoperative body weight loss is an independent prognostic factor for OS in patients with GC, especially in stage III disease. Preoperative body weight loss appears to be a superior predictor of outcome compared with other established nutrition-based indices.
NASA Astrophysics Data System (ADS)
Chen, Zhe; Qiu, Zurong; Huo, Xinming; Fan, Yuming; Li, Xinghua
2017-03-01
A fiber-capacitive drop analyzer is an instrument which monitors a growing droplet to produce a capacitive opto-tensiotrace (COT). Each COT is an integration of fiber light intensity signals and capacitance signals and can reflect the unique physicochemical property of a liquid. In this study, we propose a solution analytical and concentration quantitative method based on multivariate statistical methods. Eight characteristic values are extracted from each COT. A series of COT characteristic values of training solutions at different concentrations compose a data library of this kind of solution. A two-stage linear discriminant analysis is applied to analyze different solution libraries and establish discriminant functions. Test solutions can be discriminated by these functions. After determining the variety of test solutions, Spearman correlation test and principal components analysis are used to filter and reduce dimensions of eight characteristic values, producing a new representative parameter. A cubic spline interpolation function is built between the parameters and concentrations, based on which we can calculate the concentration of the test solution. Methanol, ethanol, n-propanol, and saline solutions are taken as experimental subjects in this paper. For each solution, nine or ten different concentrations are chosen to be the standard library, and the other two concentrations compose the test group. By using the methods mentioned above, all eight test solutions are correctly identified and the average relative error of quantitative analysis is 1.11%. The method proposed is feasible which enlarges the applicable scope of recognizing liquids based on the COT and improves the concentration quantitative precision, as well.
Localization of genes involved in the metabolic syndrome using multivariate linkage analysis.
Olswold, Curtis; de Andrade, Mariza
2003-12-31
There are no well accepted criteria for the diagnosis of the metabolic syndrome. However, the metabolic syndrome is identified clinically by the presence of three or more of these five variables: larger waist circumference, higher triglyceride levels, lower HDL-cholesterol concentrations, hypertension, and impaired fasting glucose. We use sets of two or three variables, which are available in the Framingham Heart Study data set, to localize genes responsible for this syndrome using multivariate quantitative linkage analysis. This analysis demonstrates the applicability of using multivariate linkage analysis and how its use increases the power to detect linkage when genes are involved in the same disease mechanism.
Laurens, L M L; Wolfrum, E J
2013-12-18
One of the challenges associated with microalgal biomass characterization and the comparison of microalgal strains and conversion processes is the rapid determination of the composition of algae. We have developed and applied a high-throughput screening technology based on near-infrared (NIR) spectroscopy for the rapid and accurate determination of algal biomass composition. We show that NIR spectroscopy can accurately predict the full composition using multivariate linear regression analysis of varying lipid, protein, and carbohydrate content of algal biomass samples from three strains. We also demonstrate a high quality of predictions of an independent validation set. A high-throughput 96-well configuration for spectroscopy gives equally good prediction relative to a ring-cup configuration, and thus, spectra can be obtained from as little as 10-20 mg of material. We found that lipids exhibit a dominant, distinct, and unique fingerprint in the NIR spectrum that allows for the use of single and multiple linear regression of respective wavelengths for the prediction of the biomass lipid content. This is not the case for carbohydrate and protein content, and thus, the use of multivariate statistical modeling approaches remains necessary.
A Unified Framework for Association Analysis with Multiple Related Phenotypes
Stephens, Matthew
2013-01-01
We consider the problem of assessing associations between multiple related outcome variables, and a single explanatory variable of interest. This problem arises in many settings, including genetic association studies, where the explanatory variable is genotype at a genetic variant. We outline a framework for conducting this type of analysis, based on Bayesian model comparison and model averaging for multivariate regressions. This framework unifies several common approaches to this problem, and includes both standard univariate and standard multivariate association tests as special cases. The framework also unifies the problems of testing for associations and explaining associations – that is, identifying which outcome variables are associated with genotype. This provides an alternative to the usual, but conceptually unsatisfying, approach of resorting to univariate tests when explaining and interpreting significant multivariate findings. The method is computationally tractable genome-wide for modest numbers of phenotypes (e.g. 5–10), and can be applied to summary data, without access to raw genotype and phenotype data. We illustrate the methods on both simulated examples, and to a genome-wide association study of blood lipid traits where we identify 18 potential novel genetic associations that were not identified by univariate analyses of the same data. PMID:23861737
Tolkoff, Max R; Alfaro, Michael E; Baele, Guy; Lemey, Philippe; Suchard, Marc A
2018-05-01
Phylogenetic comparative methods explore the relationships between quantitative traits adjusting for shared evolutionary history. This adjustment often occurs through a Brownian diffusion process along the branches of the phylogeny that generates model residuals or the traits themselves. For high-dimensional traits, inferring all pair-wise correlations within the multivariate diffusion is limiting. To circumvent this problem, we propose phylogenetic factor analysis (PFA) that assumes a small unknown number of independent evolutionary factors arise along the phylogeny and these factors generate clusters of dependent traits. Set in a Bayesian framework, PFA provides measures of uncertainty on the factor number and groupings, combines both continuous and discrete traits, integrates over missing measurements and incorporates phylogenetic uncertainty with the help of molecular sequences. We develop Gibbs samplers based on dynamic programming to estimate the PFA posterior distribution, over 3-fold faster than for multivariate diffusion and a further order-of-magnitude more efficiently in the presence of latent traits. We further propose a novel marginal likelihood estimator for previously impractical models with discrete data and find that PFA also provides a better fit than multivariate diffusion in evolutionary questions in columbine flower development, placental reproduction transitions and triggerfish fin morphometry.
Multivariate singular spectrum analysis and the road to phase synchronization
NASA Astrophysics Data System (ADS)
Groth, Andreas; Ghil, Michael
2010-05-01
Singular spectrum analysis (SSA) and multivariate SSA (M-SSA) are based on the classical work of Kosambi (1943), Loeve (1945) and Karhunen (1946) and are closely related to principal component analysis. They have been introduced into information theory by Bertero, Pike and co-workers (1982, 1984) and into dynamical systems analysis by Broomhead and King (1986a,b). Ghil, Vautard and associates have applied SSA and M-SSA to the temporal and spatio-temporal analysis of short and noisy time series in climate dynamics and other fields in the geosciences since the late 1980s. M-SSA provides insight into the unknown or partially known dynamics of the underlying system by decomposing the delay-coordinate phase space of a given multivariate time series into a set of data-adaptive orthonormal components. These components can be classified essentially into trends, oscillatory patterns and noise, and allow one to reconstruct a robust "skeleton" of the dynamical system's structure. For an overview we refer to Ghil et al. (Rev. Geophys., 2002). In this talk, we present M-SSA in the context of synchronization analysis and illustrate its ability to unveil information about the mechanisms behind the adjustment of rhythms in coupled dynamical systems. The focus of the talk is on the special case of phase synchronization between coupled chaotic oscillators (Rosenblum et al., PRL, 1996). Several ways of measuring phase synchronization are in use, and the robust definition of a reasonable phase for each oscillator is critical in each of them. We illustrate here the advantages of M-SSA in the automatic identification of oscillatory modes and in drawing conclusions about the transition to phase synchronization. Without using any a priori definition of a suitable phase, we show that M-SSA is able to detect phase synchronization in a chain of coupled chaotic oscillators (Osipov et al., PRE, 1996). Recently, Muller et al. (PRE, 2005) and Allefeld et al. (Intl. J. Bif. Chaos, 2007) have demonstrated the usefulness of principal component analysis in detecting phase synchronization from multivariate time series. The present talk provides a generalization of this idea and presents a robust implementation thereof via M-SSA.
Willis, Michael; Asseburg, Christian; Nilsson, Andreas; Johnsson, Kristina; Kartman, Bernt
2017-03-01
Type 2 diabetes mellitus (T2DM) is chronic and progressive and the cost-effectiveness of new treatment interventions must be established over long time horizons. Given the limited durability of drugs, assumptions regarding downstream rescue medication can drive results. Especially for insulin, for which treatment effects and adverse events are known to depend on patient characteristics, this can be problematic for health economic evaluation involving modeling. To estimate parsimonious multivariate equations of treatment effects and hypoglycemic event risks for use in parameterizing insulin rescue therapy in model-based cost-effectiveness analysis. Clinical evidence for insulin use in T2DM was identified in PubMed and from published reviews and meta-analyses. Study and patient characteristics and treatment effects and adverse event rates were extracted and the data used to estimate parsimonious treatment effect and hypoglycemic event risk equations using multivariate regression analysis. Data from 91 studies featuring 171 usable study arms were identified, mostly for premix and basal insulin types. Multivariate prediction equations for glycated hemoglobin A1c lowering and weight change were estimated separately for insulin-naive and insulin-experienced patients. Goodness of fit (R²) for both outcomes was generally good, ranging from 0.44 to 0.84. Multivariate prediction equations for symptomatic, nocturnal, and severe hypoglycemic events were also estimated, though considerable heterogeneity in definitions limits their usefulness. Parsimonious and robust multivariate prediction equations were estimated for glycated hemoglobin A1c and weight change, separately for insulin-naive and insulin-experienced patients. Using these in economic simulation modeling in T2DM can improve realism and flexibility in modeling insulin rescue medication. Copyright © 2017 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
Zhang, Xiuxiu; Li, Yubo; Zhou, Huifang; Fan, Simiao; Zhang, Zhenzhu; Wang, Lei; Zhang, Yanjun
2014-08-01
Acyclovir (ACV) is an antiviral agent. However, its use is limited by adverse side effect, particularly by its nephrotoxicity. Metabonomics technology can provide essential information on the metabolic profiles of biofluids and organs upon drug administration. Therefore, in this study, mass spectrometry-based metabonomics coupled with multivariate data analysis was used to identify the plasma metabolites and metabolic pathways related to nephrotoxicity caused by intraperitoneal injection of low (50mg/kg) and high (100mg/kg) doses of acyclovir. Sixteen biomarkers were identified by metabonomics and nephrotoxicity results revealed the dose-dependent effect of acyclovir on kidney tissues. The present study showed that the top four metabolic pathways interrupted by acyclovir included the metabolisms of arachidonic acid, tryptophan, arginine and proline, and glycerophospholipid. This research proves the established metabonomic approach can provide information on changes in metabolites and metabolic pathways, which can be applied to in-depth research on the mechanism of acyclovir-induced kidney injury. Copyright © 2014 Elsevier B.V. All rights reserved.
Time-varying nonstationary multivariate risk analysis using a dynamic Bayesian copula
NASA Astrophysics Data System (ADS)
Sarhadi, Ali; Burn, Donald H.; Concepción Ausín, María.; Wiper, Michael P.
2016-03-01
A time-varying risk analysis is proposed for an adaptive design framework in nonstationary conditions arising from climate change. A Bayesian, dynamic conditional copula is developed for modeling the time-varying dependence structure between mixed continuous and discrete multiattributes of multidimensional hydrometeorological phenomena. Joint Bayesian inference is carried out to fit the marginals and copula in an illustrative example using an adaptive, Gibbs Markov Chain Monte Carlo (MCMC) sampler. Posterior mean estimates and credible intervals are provided for the model parameters and the Deviance Information Criterion (DIC) is used to select the model that best captures different forms of nonstationarity over time. This study also introduces a fully Bayesian, time-varying joint return period for multivariate time-dependent risk analysis in nonstationary environments. The results demonstrate that the nature and the risk of extreme-climate multidimensional processes are changed over time under the impact of climate change, and accordingly the long-term decision making strategies should be updated based on the anomalies of the nonstationary environment.
A cross-species socio-emotional behaviour development revealed by a multivariate analysis.
Koshiba, Mamiko; Senoo, Aya; Mimura, Koki; Shirakawa, Yuka; Karino, Genta; Obara, Saya; Ozawa, Shinpei; Sekihara, Hitomi; Fukushima, Yuta; Ueda, Toyotoshi; Kishino, Hirohisa; Tanaka, Toshihisa; Ishibashi, Hidetoshi; Yamanouchi, Hideo; Yui, Kunio; Nakamura, Shun
2013-01-01
Recent progress in affective neuroscience and social neurobiology has been propelled by neuro-imaging technology and epigenetic approaches in the neurobiology of animal behaviour. However, quantitative measurements of socio-emotional development remain lacking, though sensory-motor development has been extensively studied in terms of digitised imaging analysis. Here, we developed a method for socio-emotional behaviour measurement that is based on video recordings under a well-defined social context, using animal models with various social sensory interactions during development. The behaviour features digitised from the video recordings were visualised in a multivariate statistical space using principal component analysis. The clustering of the behaviour parameters suggested the existence of species- and stage-specific as well as cross-species behaviour modules. These modules were used to characterise the behaviour of children with or without autism spectrum disorders (ASDs). We found that socio-emotional behaviour is highly dependent on social context and that the cross-species behaviour modules may predict the neurobiological basis of ASDs.
Handwriting Examination: Moving from Art to Science
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jarman, K.H.; Hanlen, R.C.; Manzolillo, P.A.
In this document, we present a method for validating the premises and methodology of forensic handwriting examination. This method is intuitively appealing because it relies on quantitative measurements currently used qualitatively by FDE's in making comparisons, and it is scientifically rigorous because it exploits the power of multivariate statistical analysis. This approach uses measures of both central tendency and variation to construct a profile for a given individual. (Central tendency and variation are important for characterizing an individual's writing and both are currently used by FDE's in comparative analyses). Once constructed, different profiles are then compared for individuality using cluster analysis; they are grouped so that profiles within a group cannot be differentiated from one another based on the measured characteristics, whereas profiles between groups can. The cluster analysis procedure used here exploits the power of multivariate hypothesis testing. The result is not only a profile grouping but also an indication of statistical significance of the groups generated.
Field, Nicholas; Konstantinidis, Spyridon; Velayudhan, Ajoy
2017-08-11
The combination of multi-well plates and automated liquid handling is well suited to the rapid measurement of the adsorption isotherms of proteins. Here, single and binary adsorption isotherms are reported for BSA, ovalbumin and conalbumin on a strong anion exchanger over a range of pH and salt levels. The impact of the main experimental factors at play on the accuracy and precision of the adsorbed protein concentrations is quantified theoretically and experimentally. In addition to the standard measurement of liquid concentrations before and after adsorption, the amounts eluted from the wells are measured directly. This additional measurement corroborates the calculation based on liquid concentration data, and improves precision especially under conditions of weak or moderate interaction strength. The traditional measurement of multicomponent isotherms is limited by the speed of HPLC analysis; this analytical bottleneck is alleviated by careful multivariate analysis of UV spectra. Copyright © 2017. Published by Elsevier B.V.
SPICE: exploration and analysis of post-cytometric complex multivariate datasets.
Roederer, Mario; Nozzi, Joshua L; Nason, Martha C
2011-02-01
Polychromatic flow cytometry results in complex, multivariate datasets. To date, tools for the aggregate analysis of these datasets across multiple specimens grouped by different categorical variables, such as demographic information, have not been optimized. Often, the exploration of such datasets is accomplished by visualization of patterns with pie charts or bar charts, without easy access to statistical comparisons of measurements that comprise multiple components. Here we report on algorithms and a graphical interface we developed for these purposes. In particular, we discuss thresholding necessary for accurate representation of data in pie charts, the implications for display and comparison of normalized versus unnormalized data, and the effects of averaging when samples with significant background noise are present. Finally, we define a statistic for the nonparametric comparison of complex distributions to test for difference between groups of samples based on multi-component measurements. While originally developed to support the analysis of T cell functional profiles, these techniques are amenable to a broad range of datatypes. Published 2011 Wiley-Liss, Inc.
Hypothyroidism among SLE patients: Case-control study.
Watad, Abdulla; Mahroum, Naim; Whitby, Aaron; Gertel, Smadar; Comaneshter, Doron; Cohen, Arnon D; Amital, Howard
2016-05-01
The prevalence of hypothyroidism in SLE patients varies considerably and early reports were mainly based on small cohorts. To investigate the association between SLE and hypothyroidism. Patients with SLE were compared with age and sex-matched controls regarding the proportion of hypothyroidism in a case-control study. Chi-square and t-tests were used for univariate analysis and a logistic regression model was used for multivariate analysis. The study was performed utilizing the medical database of Clalit Health Services. The study included 5018 patients with SLE and 25,090 age and sex-matched controls. The proportion of hypothyroidism in patients with SLE was increased compared with the prevalence in controls (15.58% and 5.75%, respectively, P<0.001). In a multivariate analysis, SLE was associated with hypothyroidism (odds ratio 2.644, 95% confidence interval 2.405-2.908). Patients with SLE have a greater proportion of hypothyroidism than matched controls. Therefore, physicians treating patients with SLE should be aware of the possibility of thyroid dysfunction. Copyright © 2016 Elsevier B.V. All rights reserved.
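As a rough check on the figures above, an unadjusted (crude) odds ratio can be computed from the reported group sizes and proportions. This is a sketch under stated assumptions: the cell counts are reconstructed by rounding the reported percentages, the interval uses the simple Woolf (log) method, and the result is not expected to equal the reported 2.644, which comes from a multivariate logistic regression.

```python
import math


def crude_odds_ratio(n_cases, p_cases, n_controls, p_controls, z=1.96):
    """Crude odds ratio with a Woolf (log-scale) confidence interval.

    Counts are reconstructed from group sizes and proportions, so they
    are approximations rather than the study's actual 2x2 table.
    """
    a = round(n_cases * p_cases)        # exposed group, with outcome
    b = n_cases - a                     # exposed group, without outcome
    c = round(n_controls * p_controls)  # control group, with outcome
    d = n_controls - c                  # control group, without outcome
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Group sizes and proportions as reported in the abstract.
or_, lo, hi = crude_odds_ratio(5018, 0.1558, 25090, 0.0575)
print(f"crude OR = {or_:.3f}, approximate 95% CI {lo:.3f}-{hi:.3f}")
```

The crude estimate comes out somewhat higher than the adjusted one, which is the usual pattern when the regression model accounts for covariates related to both the exposure and the outcome.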
Bourne, Roger; Himmelreich, Uwe; Sharma, Ansuiya; Mountford, Carolyn; Sorrell, Tania
2001-01-01
A new fingerprinting technique with the potential for rapid identification of bacteria was developed by combining proton magnetic resonance spectroscopy (1H MRS) with multivariate statistical analysis. This resulted in an objective identification strategy for common clinical isolates belonging to the bacterial species Staphylococcus aureus, Staphylococcus epidermidis, Enterococcus faecalis, Streptococcus pneumoniae, Streptococcus pyogenes, Streptococcus agalactiae, and the Streptococcus milleri group. Duplicate cultures of 104 different isolates were examined one or more times using 1H MRS. A total of 312 cultures were examined. An optimized classifier was developed using a bootstrapping process and a seven-group linear discriminant analysis to provide objective classification of the spectra. Identification of isolates was based on consistent high-probability classification of spectra from duplicate cultures and achieved 92% agreement with conventional methods of identification. Fewer than 1% of isolates were identified incorrectly. Identification of the remaining 7% of isolates was defined as indeterminate. PMID:11474013
Imaging of polysaccharides in the tomato cell wall with Raman microspectroscopy
2014-01-01
Background The primary cell wall of fruits and vegetables is a structure mainly composed of polysaccharides (pectins, hemicelluloses, cellulose). Polysaccharides are assembled into a network and linked together. It is thought that the percentage of components and of plant cell wall has an important influence on mechanical properties of fruits and vegetables. Results In this study the Raman microspectroscopy technique was introduced to the visualization of the distribution of polysaccharides in cell wall of fruit. The methodology of the sample preparation, the measurement using Raman microscope and multivariate image analysis are discussed. Single band imaging (for preliminary analysis) and multivariate image analysis methods (principal component analysis and multivariate curve resolution) were used for the identification and localization of the components in the primary cell wall. Conclusions Raman microspectroscopy supported by multivariate image analysis methods is useful in distinguishing cellulose and pectins in the cell wall in tomatoes. It presents how the localization of biopolymers was possible with minimally prepared samples. PMID:24917885
A refined method for multivariate meta-analysis and meta-regression
Jackson, Daniel; Riley, Richard D
2014-01-01
Making inferences about the average treatment effect using the random effects model for meta-analysis is problematic in the common situation where there is a small number of studies. This is because estimates of the between-study variance are not precise enough to accurately apply the conventional methods for testing and deriving a confidence interval for the average effect. We have found that a refined method for univariate meta-analysis, which applies a scaling factor to the estimated effects’ standard error, provides more accurate inference. We explain how to extend this method to the multivariate scenario and show that our proposal for refined multivariate meta-analysis and meta-regression can provide more accurate inferences than the more conventional approach. We explain how our proposed approach can be implemented using standard output from multivariate meta-analysis software packages and apply our methodology to two real examples. © 2013 The Authors. Statistics in Medicine published by John Wiley & Sons, Ltd. PMID:23996351
NASA Astrophysics Data System (ADS)
Das, Bappa; Sahoo, Rabi N.; Pargal, Sourabh; Krishna, Gopal; Verma, Rakesh; Chinnusamy, Viswanathan; Sehgal, Vinay K.; Gupta, Vinod K.; Dash, Sushanta K.; Swain, Padmini
2018-03-01
In the present investigation, the changes in sucrose, reducing and total sugar content due to water-deficit stress in rice leaves were modeled using visible, near infrared (VNIR) and shortwave infrared (SWIR) spectroscopy. The objectives of the study were to identify the best vegetation indices and suitable multivariate technique based on precise analysis of hyperspectral data (350 to 2500 nm) and sucrose, reducing sugar and total sugar content measured at different stress levels from 16 different rice genotypes. Spectral data analysis was done to identify suitable spectral indices and models for sucrose estimation. Novel spectral indices in near infrared (NIR) range viz. ratio spectral index (RSI) and normalised difference spectral indices (NDSI) sensitive to sucrose, reducing sugar and total sugar content were identified which were subsequently calibrated and validated. The RSI and NDSI models had R2 values of 0.65, 0.71 and 0.67; RPD values of 1.68, 1.95 and 1.66 for sucrose, reducing sugar and total sugar, respectively for validation dataset. Different multivariate spectral models such as artificial neural network (ANN), multivariate adaptive regression splines (MARS), multiple linear regression (MLR), partial least square regression (PLSR), random forest regression (RFR) and support vector machine regression (SVMR) were also evaluated. The best performing multivariate models for sucrose, reducing sugars and total sugars were found to be MARS, ANN and MARS, respectively, with RPD values of 2.08, 2.44, and 1.93. Results indicated that VNIR and SWIR spectroscopy combined with multivariate calibration can be used as a reliable alternative to conventional methods for measurement of sucrose, reducing sugars and total sugars of rice under water-deficit stress as this technique is fast, economic, and noninvasive.
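The abstract names the RSI, NDSI and RPD quantities without defining them. The sketch below uses the definitions commonly adopted in the spectroscopy literature (a band ratio, a normalised band difference, and the standard deviation of the reference values divided by the prediction RMSE); whether the authors used exactly these forms is an assumption, and all reflectance and sugar values are made up for illustration.

```python
import math
import statistics


def rsi(r_i, r_j):
    """Ratio spectral index of reflectances at two wavelengths."""
    return r_i / r_j


def ndsi(r_i, r_j):
    """Normalised difference spectral index of two reflectances."""
    return (r_i - r_j) / (r_i + r_j)


def rpd(observed, predicted):
    """Ratio of performance to deviation: SD of observations over RMSE."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    rmse = math.sqrt(sum(squared_errors) / len(squared_errors))
    return statistics.stdev(observed) / rmse


# Hypothetical reflectances at two NIR wavelengths.
print(rsi(0.40, 0.25), ndsi(0.40, 0.25))

# Hypothetical measured and predicted sugar contents.
measured = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 1.5, 3.5, 3.5, 5.0]
print(rpd(measured, predicted))
```

Under this definition a larger RPD means the prediction error is small relative to the natural spread of the measured values, which is why the abstract ranks models by it.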
R Hussein, Nawfal
2018-01-01
Hepatitis B virus (HBV) infection is a public health problem. The lack of information about the seroprevalence and risk factors is an obstacle for preventive public health plans to reduce the burden of viral hepatitis. Therefore, this study was conducted in Iraq, where no studies had been performed to determine the prevalence and risk factors of HBV infection. Blood samples were collected from 438 blood donors attending the blood bank in Duhok city. Serum samples were tested for HBV core-antibodies (HBcAb) and HBV surface-antigen (HBsAg) by ELISA. Various risk factors were recorded and multivariate analysis was performed. 5/438 (1.14%) of the subjects were HBsAg positive (HBsAg and HBcAb positive) and 36/438 (8.2%) were HBcAb positive. Hence, 41 cases were exposed to HBV and data analysis was based on that. Univariate analysis showed that there were significant associations between history of illegitimate sexual contact, history of alcohol or history of dental surgeries and HBV exposure (p<0.05 for all). Then, multivariate analysis was conducted to find HBV exposure predictive factors. It was found that history of dental surgery was a predictive factor for exposure to the virus (P=0.03, OR: 2.397). This study suggested that the history of dental surgery was predictive for HBV transmission in Duhok city. Further population-based study is needed to determine HBV risk factors in the society and public health plan based on that should be considered.
Wu, Wei; Sun, Le; Zhang, Zhe; Guo, Yingying; Liu, Shuying
2015-03-25
An ultra-high-performance liquid chromatography coupled with quadrupole-time-of-flight mass spectrometry (UHPLC-Q-TOF-MS) method was developed for the detection and structural analysis of ginsenosides in white ginseng and related processed products (red ginseng). Original neutral, malonyl, and chemically transformed ginsenosides were identified in white and red ginseng samples. The aglycone types of ginsenosides were determined by MS/MS as PPD (m/z 459), PPT (m/z 475), C-24, -25 hydrated-PPD or PPT (m/z 477 or m/z 493), and Δ20(21)-or Δ20(22)-dehydrated-PPD or PPT (m/z 441 or m/z 457). Following the structural determination, the UHPLC-Q-TOF-MS-based chemical profiling coupled with multivariate statistical analysis method was applied for global analysis of white and processed ginseng samples. The chemical markers present between the processed products red ginseng and white ginseng could be assigned. Process-mediated chemical changes were recognized as the hydrolysis of ginsenosides with large molecular weight, chemical transformations of ginsenosides, changes in malonyl-ginsenosides, and generation of 20-(R)-ginsenoside enantiomers. The relative contents of compounds classified as PPD, PPT, malonyl, and transformed ginsenosides were calculated based on peak areas in ginseng before and after processing. This study provides possibility to monitor multiple components for the quality control and global evaluation of ginseng products during processing. Copyright © 2014 Elsevier B.V. All rights reserved.
The Economic Value of Career Counseling Services for College Students in South Korea
ERIC Educational Resources Information Center
Choi, Bo Young; Lee, Ji Hee; Kim, Areum; Kim, Boram; Cho, Daeyeon; Lee, Sang Min
2013-01-01
This study investigated college students' perception of the monetary value of career counseling services by using the contingent valuation method. The results of a multivariate survival analysis based on interviews with a convenience sample of 291 undergraduate students in South Korea indicate that, on average, participants' expressed willingness…
Mimatsu, Kenji; Fukino, Nobutada; Ogasawara, Yasuo; Saino, Yoko; Oida, Takatsugu
2017-08-01
The present study aimed to compare the utility of various inflammatory marker- and nutritional status-based prognostic factors, including many previously established prognostic factors, for predicting the prognosis of stage IV gastric cancer patients undergoing non-curative surgery. A total of 33 patients with stage IV gastric cancer who had undergone palliative gastrectomy and gastrojejunostomy were included in the study. Univariate and multivariate analyses were performed to evaluate the relationships between the mGPS, PNI, NLR, PLR, the CONUT, various clinicopathological factors and cancer-specific survival (CS). Among patients who received non-curative surgery, univariate analysis of CS identified the following significant risk factors: chemotherapy, mGPS and NLR, and multivariate analysis revealed that the mGPS was independently associated with CS. The mGPS was a more useful prognostic factor than the PNI, NLR, PLR and CONUT in patients undergoing non-curative surgery for stage IV gastric cancer. Copyright© 2017, International Institute of Anticancer Research (Dr. George J. Delinasios), All rights reserved.
Descriptor selection for banana accessions based on univariate and multivariate analysis.
Brandão, L P; Souza, C P F; Pereira, V M; Silva, S O; Santos-Serejo, J A; Ledo, C A S; Amorim, E P
2013-05-14
Our objective was to establish a minimum number of morphological descriptors for the characterization of banana germplasm and evaluate the efficiency of removal of redundant characters, based on univariate and multivariate statistical analyses. Phenotypic characterization was made of 77 accessions from Bahia, Brazil, using 92 descriptors. The selection of the descriptors was carried out by principal components analysis (quantitative) and by entropy (multi-category). Efficiency of elimination was analyzed by a comparative study between the clusters formed, taking into consideration all 92 descriptors and smaller groups. The selected descriptors were analyzed with the Ward-MLM procedure and a combined matrix formed by the Gower algorithm. We were able to reduce the number of descriptors used for characterizing the banana germplasm (42%). The correlation between the matrices considering the 92 descriptors and the selected ones was 0.82, showing that the reduction in the number of descriptors did not influence estimation of genetic variability between the banana accessions. We conclude that removing these descriptors caused no loss of information, considering the groups formed from pre-established criteria, including subgroup/subspecies.
Biostatistics Series Module 10: Brief Overview of Multivariate Methods.
Hazra, Avijit; Gogtay, Nithya
2017-01-01
Multivariate analysis refers to statistical techniques that simultaneously look at three or more variables in relation to the subjects under investigation with the aim of identifying or clarifying the relationships between them. These techniques have been broadly classified as dependence techniques, which explore the relationship between one or more dependent variables and their independent predictors, and interdependence techniques, that make no such distinction but treat all variables equally in a search for underlying relationships. Multiple linear regression models a situation where a single numerical dependent variable is to be predicted from multiple numerical independent variables. Logistic regression is used when the outcome variable is dichotomous in nature. The log-linear technique models count type of data and can be used to analyze cross-tabulations where more than two variables are included. Analysis of covariance is an extension of analysis of variance (ANOVA), in which an additional independent variable of interest, the covariate, is brought into the analysis. It tries to examine whether a difference persists after "controlling" for the effect of the covariate that can impact the numerical dependent variable of interest. Multivariate analysis of variance (MANOVA) is a multivariate extension of ANOVA used when multiple numerical dependent variables have to be incorporated in the analysis. Interdependence techniques are more commonly applied to psychometrics, social sciences and market research. Exploratory factor analysis and principal component analysis are related techniques that seek to extract from a larger number of metric variables, a smaller number of composite factors or components, which are linearly related to the original variables. Cluster analysis aims to identify, in a large number of cases, relatively homogeneous groups called clusters, without prior information about the groups. 
The calculation intensive nature of multivariate analysis has so far precluded most researchers from using these techniques routinely. The situation is now changing with wider availability, and increasing sophistication of statistical software and researchers should no longer shy away from exploring the applications of multivariate methods to real-life data sets.
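To make the idea of components that are "linearly related to the original variables" concrete, here is a minimal principal component sketch for just two variables, where the eigenvalues of the 2 x 2 covariance matrix have a closed form. The data and the function name `pca_2d` are illustrative assumptions; real analyses would normally rely on a statistics package.

```python
import math
import statistics


def pca_2d(xs, ys):
    """Return ((lam1, lam2), first_component) for two variables."""
    a = statistics.variance(xs)                      # var(x)
    c = statistics.variance(ys)                      # var(y)
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

    # Eigenvalues of the symmetric matrix [[a, b], [b, c]].
    mid = (a + c) / 2.0
    half = math.hypot((a - c) / 2.0, b)
    lam1, lam2 = mid + half, mid - half

    # Unit eigenvector belonging to the larger eigenvalue.
    if b == 0:
        vec = (1.0, 0.0) if a >= c else (0.0, 1.0)
    else:
        norm = math.hypot(b, lam1 - a)
        vec = (b / norm, (lam1 - a) / norm)
    return (lam1, lam2), vec


# Two perfectly correlated, made-up variables.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 6.0, 8.0, 10.0]
(lam1, lam2), component = pca_2d(xs, ys)
print("share of variance on first component:", lam1 / (lam1 + lam2))
print("component weights:", component)
```

Because the two made-up variables move together exactly, a single component carries all of the variance; the weights show how that component is built as a linear combination of the originals.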
Bohn, Justin; Eddings, Wesley; Schneeweiss, Sebastian
2017-03-15
Distributed networks of health-care data sources are increasingly being utilized to conduct pharmacoepidemiologic database studies. Such networks may contain data that are not physically pooled but instead are distributed horizontally (separate patients within each data source) or vertically (separate measures within each data source) in order to preserve patient privacy. While multivariable methods for the analysis of horizontally distributed data are frequently employed, few practical approaches have been put forth to deal with vertically distributed health-care databases. In this paper, we propose 2 propensity score-based approaches to vertically distributed data analysis and test their performance using 5 example studies. We found that these approaches produced point estimates close to what could be achieved without partitioning. We further found a performance benefit (i.e., lower mean squared error) for sequentially passing a propensity score through each data domain (called the "sequential approach") as compared with fitting separate domain-specific propensity scores (called the "parallel approach"). These results were validated in a small simulation study. This proof-of-concept study suggests a new multivariable analysis approach to vertically distributed health-care databases that is practical, preserves patient privacy, and warrants further investigation for use in clinical research applications that rely on health-care databases. © The Author 2017. Published by Oxford University Press on behalf of the Johns Hopkins Bloomberg School of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Kroese, Leonard F; Kleinrensink, Gert-Jan; Lange, Johan F; Gillion, Jean-Francois
2018-03-01
Incisional hernia is a frequent complication after midline laparotomy. Surgical hernia repair is associated with complications, but no clear predictive risk factors have been identified. The European Hernia Society (EHS) classification offers a structured framework to describe hernias and to analyze postoperative complications. Because of its structured nature, it might prove to be useful for preoperative patient or treatment classification. The objective of this study was to investigate the EHS classification as a predictor for postoperative complications after incisional hernia surgery. An analysis was performed using a registry-based, large-scale, prospective cohort study, including all patients undergoing incisional hernia surgery between September 1, 2011 and February 29, 2016. Univariate analyses and multivariable logistic regression analysis were performed to identify risk factors for postoperative complications. A total of 2,191 patients were included, of whom 323 (15%) had 1 or more complications. Factors associated with complications in univariate analyses (p < 0.20) and clinically relevant factors were included in the multivariable analysis. In the multivariable analysis, EHS width class, incarceration, open surgery, duration of surgery, Altemeier wound class, and therapeutic antibiotic treatment were independent risk factors for postoperative complications. Third recurrence and emergency surgery were associated with fewer complications. Incisional hernia repair is associated with a 15% complication rate. The EHS width classification is associated with postoperative complications. To identify patients at risk for complications, the EHS classification is useful. Copyright © 2017. Published by Elsevier Inc.
Martínez-Ramos, David; Fortea-Sanchis, Carlos; Escrig-Sos, Javier; Prats-de Puig, Miguel; Queralt-Martín, Raquel; Salvador-Sanchis, José Luís
2014-01-01
Conservative surgery can be regarded as the standard treatment for most early stage breast tumors. However, a minority of patients treated with conservative surgery will present local or locoregional recurrence. Therefore, it is of interest to evaluate the possible factors associated with this recurrence. A population-based retrospective study using data from the Tumor Registry of Castellón (Valencia, Spain) of patients operated on for primary nonmetastatic breast cancer between January 2000 and December 2008 was designed. Kaplan-Meier curves and log-rank test to estimate 5-year local recurrence were used. Two groups of patients were defined, one with conservative surgery and another with nonconservative surgery. Cox multivariate analysis was conducted. The total number of patients was 410. Average local recurrence was 6.8%. In univariate analysis, only tumor size and lymph node involvement showed significant differences. On multivariate analysis, independent prognostic factors were conservative surgery (hazard ratio [HR] 4.62; 95% confidence interval [CI]: 1.12-16.82), number of positive lymph nodes (HR 1.07; 95% CI: 1.01-1.17) and tumor size (in mm) (HR 1.02; 95% CI: 1.01-1.06). Local recurrence after breast-conserving surgery is higher in tumors >2 cm. Although tumor size should not be a contraindication for conservative surgery, it should be a risk factor to be considered.
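The hazard ratios for tumor size and positive lymph nodes are small in absolute terms because they appear to be expressed per unit (per mm, per node). Assuming that reading, and the usual log-linear form of a Cox model, the hazard ratio for a larger difference is the per-unit value raised to the size of the difference. The sketch below only illustrates that arithmetic on the point estimates; it ignores the confidence intervals.

```python
def hazard_ratio_for_difference(hr_per_unit, difference):
    """Hazard ratio implied by a covariate difference of `difference` units."""
    return hr_per_unit ** difference


# Point estimates quoted in the abstract, assumed to be per-unit effects.
hr_10mm = hazard_ratio_for_difference(1.02, 10)    # 10 mm larger tumor
hr_3nodes = hazard_ratio_for_difference(1.07, 3)   # 3 more positive nodes

print(f"10 mm difference: HR = {hr_10mm:.2f}")
print(f"3 node difference: HR = {hr_3nodes:.2f}")
```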
EEMD-based multiscale ICA method for slewing bearing fault detection and diagnosis
NASA Astrophysics Data System (ADS)
Žvokelj, Matej; Zupan, Samo; Prebil, Ivan
2016-05-01
A novel multivariate and multiscale statistical process monitoring method is proposed with the aim of detecting incipient failures in large slewing bearings, where subjective influence plays a minor role. The proposed method integrates the strengths of the Independent Component Analysis (ICA) multivariate monitoring approach with the benefits of Ensemble Empirical Mode Decomposition (EEMD), which adaptively decomposes signals into different time scales and can thus cope with multiscale system dynamics. The method, which was named EEMD-based multiscale ICA (EEMD-MSICA), not only enables bearing fault detection but also offers a mechanism of multivariate signal denoising and, in combination with the Envelope Analysis (EA), a diagnostic tool. The multiscale nature of the proposed approach makes the method convenient to cope with data which emanate from bearings in complex real-world rotating machinery and frequently represent the cumulative effect of many underlying phenomena occupying different regions in the time-frequency plane. The efficiency of the proposed method was tested on simulated as well as real vibration and Acoustic Emission (AE) signals obtained through conducting an accelerated run-to-failure lifetime experiment on a purpose-built laboratory slewing bearing test stand. The ability to detect and locate the early-stage rolling-sliding contact fatigue failure of the bearing indicates that AE and vibration signals carry sufficient information on the bearing condition and that the developed EEMD-MSICA method is able to effectively extract it, thereby representing a reliable bearing fault detection and diagnosis strategy.
Support vector machine learning-based fMRI data group analysis.
Wang, Ze; Childress, Anna R; Wang, Jiongjiong; Detre, John A
2007-07-15
To explore the multivariate nature of fMRI data and to consider the inter-subject brain response discrepancies, a multivariate and brain response model-free method is fundamentally required. Two such methods are presented in this paper by integrating a machine learning algorithm, the support vector machine (SVM), and the random effect model. Without any brain response modeling, SVM was used to extract a whole brain spatial discriminance map (SDM), representing the brain response difference between the contrasted experimental conditions. Population inference was then obtained through the random effect analysis (RFX) or permutation testing (PMU) on the individual subjects' SDMs. Applied to arterial spin labeling (ASL) perfusion fMRI data, SDM RFX yielded lower false-positive rates in the null hypothesis test and higher detection sensitivity for synthetic activations with varying cluster size and activation strengths, compared to the univariate general linear model (GLM)-based RFX. For a sensory-motor ASL fMRI study, both SDM RFX and SDM PMU yielded similar activation patterns to GLM RFX and GLM PMU, respectively, but with higher t values and cluster extensions at the same significance level. Capitalizing on the absence of temporal noise correlation in ASL data, this study also incorporated PMU in the individual-level GLM and SVM analyses accompanied by group-level analysis through RFX or group-level PMU. Providing inferences on the probability of being activated or deactivated at each voxel, these individual-level PMU-based group analysis methods can be used to threshold the analysis results of GLM RFX, SDM RFX or SDM PMU.
Multivariate time series analysis of neuroscience data: some challenges and opportunities.
Pourahmadi, Mohsen; Noorbaloochi, Siamak
2016-04-01
Neuroimaging data may be viewed as high-dimensional multivariate time series, and analyzed using techniques from regression analysis, time series analysis and spatiotemporal analysis. We discuss issues related to data quality, model specification, estimation, interpretation, dimensionality and causality. Some recent research areas addressing aspects of some recurring challenges are introduced. Copyright © 2015 Elsevier Ltd. All rights reserved.
Multivariate image analysis of laser-induced photothermal imaging used for detection of caries tooth
NASA Astrophysics Data System (ADS)
El-Sherif, Ashraf F.; Abdel Aziz, Wessam M.; El-Sharkawy, Yasser H.
2010-08-01
Time-resolved photothermal imaging has been investigated to characterize tooth for the purpose of discriminating between normal and caries areas of the hard tissue using thermal camera. Ultrasonic thermoelastic waves were generated in hard tissue by the absorption of fiber-coupled Q-switched Nd:YAG laser pulses operating at 1064 nm in conjunction with a laser-induced photothermal technique used to detect the thermal radiation waves for diagnosis of human tooth. The concepts behind the use of photo-thermal techniques for off-line detection of caries tooth features were presented by our group in earlier work. This paper illustrates the application of multivariate image analysis (MIA) techniques to detect the presence of caries tooth. MIA is used to rapidly detect the presence and quantity of common caries tooth features as they are scanned by the high resolution color (RGB) thermal cameras. Multivariate principal component analysis is used to decompose the acquired three-channel tooth images into a two dimensional principal components (PC) space. Masking score point clusters in the score space and highlighting corresponding pixels in the image space of the two dominant PCs enables isolation of caries defect pixels based on contrast and color information. The technique provides a qualitative result that can be used for early stage caries tooth detection. The proposed technique can potentially be used on-line or real-time resolved to prescreen the existence of caries through vision based systems like real-time thermal camera. Experimental results on the large number of extracted teeth as well as one of the thermal image panoramas of the human teeth volunteer are investigated and presented.
NASA Astrophysics Data System (ADS)
Eilert, Tobias; Beckers, Maximilian; Drechsler, Florian; Michaelis, Jens
2017-10-01
The analysis tool and software package Fast-NPS can be used to analyse smFRET data to obtain quantitative structural information about macromolecules in their natural environment. In the algorithm a Bayesian model gives rise to a multivariate probability distribution describing the uncertainty of the structure determination. Since Fast-NPS aims to be an easy-to-use general-purpose analysis tool for a large variety of smFRET networks, we established an MCMC based sampling engine that approximates the target distribution and requires no parameter specification by the user at all. For an efficient local exploration we automatically adapt the multivariate proposal kernel according to the shape of the target distribution. In order to handle multimodality, the sampler is equipped with a parallel tempering scheme that is fully adaptive with respect to temperature spacing and number of chains. Since the molecular surrounding of a dye molecule affects its spatial mobility and thus the smFRET efficiency, we introduce dye models which can be selected for every dye molecule individually. These models allow the user to represent the smFRET network in great detail leading to an increased localisation precision. Finally, a tool to validate the chosen model combination is provided. Programme Files doi:http://dx.doi.org/10.17632/7ztzj63r68.1 Licencing provisions: Apache-2.0 Programming language: GUI in MATLAB (The MathWorks) and the core sampling engine in C++ Nature of problem: Sampling of highly diverse multivariate probability distributions in order to solve for macromolecular structures from smFRET data. Solution method: MCMC algorithm with fully adaptive proposal kernel and parallel tempering scheme.
Brinjikji, W; Rabinstein, A A; McDonald, J S; Cloft, H J
2014-03-01
Previous studies have demonstrated that socioeconomic disparities in the treatment of cerebrovascular diseases exist. We studied a large administrative data base to study disparities in the utilization of mechanical thrombectomy for acute ischemic stroke. With the utilization of the Perspective data base, we studied disparities in mechanical thrombectomy utilization between patient race and insurance status in 1) all patients presenting with acute ischemic stroke and 2) patients presenting with acute ischemic stroke at centers that performed mechanical thrombectomy. We examined utilization rates of mechanical thrombectomy by race/ethnicity (white, black, and Hispanic) and insurance status (Medicare, Medicaid, self-pay, and private). Multivariate logistic regression analysis adjusting for potential confounding variables was performed to study the association between race/insurance status and mechanical thrombectomy utilization. The overall mechanical thrombectomy utilization rate was 0.15% (371/249,336); utilization rate at centers that performed mechanical thrombectomy was 1.0% (371/35,376). In the sample of all patients with acute ischemic stroke, multivariate logistic regression analysis demonstrated that uninsured patients had significantly lower odds of mechanical thrombectomy utilization compared with privately insured patients (OR = 0.52, 95% CI = 0.25-0.95, P = .03), as did Medicare patients (OR = 0.53, 95% CI = 0.41-0.70, P < .0001). Blacks had significantly lower odds of mechanical thrombectomy utilization compared with whites (OR = 0.35, 95% CI = 0.23-0.51, P < .0001). 
When considering only patients treated at centers performing mechanical thrombectomy, multivariate logistic regression analysis demonstrated that insurance was not associated with significant disparities in mechanical thrombectomy utilization; however, black patients had significantly lower odds of mechanical thrombectomy utilization compared with whites (OR = 0.41, 95% CI = 0.27-0.60, P < .0001). Significant socioeconomic disparities exist in the utilization of mechanical thrombectomy in the United States.
NASA Technical Reports Server (NTRS)
Park, Steve
1990-01-01
A large and diverse number of computational techniques are routinely used to process and analyze remotely sensed data. These techniques include: univariate statistics; multivariate statistics; principal component analysis; pattern recognition and classification; other multivariate techniques; geometric correction; registration and resampling; radiometric correction; enhancement; restoration; Fourier analysis; and filtering. Each of these techniques will be considered, in order.
Chemical structure of wood charcoal by infrared spectroscopy and multivariate analysis
Nicole Labbe; David Harper; Timothy Rials; Thomas Elder
2006-01-01
In this work, the effect of temperature on charcoal structure and chemical composition is investigated for four tree species. Wood charcoal carbonized at various temperatures is analyzed by mid infrared spectroscopy coupled with multivariate analysis and by thermogravimetric analysis to characterize the chemical composition during the carbonization process. The...
Farseer-NMR: automatic treatment, analysis and plotting of large, multi-variable NMR data.
Teixeira, João M C; Skinner, Simon P; Arbesú, Miguel; Breeze, Alexander L; Pons, Miquel
2018-05-11
We present Farseer-NMR ( https://git.io/vAueU ), a software package to treat, evaluate and combine NMR spectroscopic data from sets of protein-derived peaklists covering a range of experimental conditions. The combined advances in NMR and molecular biology enable the study of complex biomolecular systems such as flexible proteins or large multibody complexes, which display a strong and functionally relevant response to their environmental conditions, e.g. the presence of ligands, site-directed mutations, post translational modifications, molecular crowders or the chemical composition of the solution. These advances have created a growing need to analyse those systems' responses to multiple variables. The combined analysis of NMR peaklists from large and multivariable datasets has become a new bottleneck in the NMR analysis pipeline, whereby information-rich NMR-derived parameters have to be manually generated, which can be tedious, repetitive and prone to human error, or even unfeasible for very large datasets. There is a persistent gap in the development and distribution of software focused on peaklist treatment, analysis and representation, and specifically able to handle large multivariable datasets, which are becoming more commonplace. In this regard, Farseer-NMR aims to close this longstanding gap in the automated NMR user pipeline and, altogether, reduce the time burden of analysis of large sets of peaklists from days/weeks to seconds/minutes. We have implemented some of the most common, as well as new, routines for calculation of NMR parameters and several publication-quality plotting templates to improve NMR data representation. Farseer-NMR has been written entirely in Python and its modular code base enables facile extension.
van Mierlo, Pieter; Lie, Octavian; Staljanssens, Willeke; Coito, Ana; Vulliémoz, Serge
2018-04-26
We investigated the influence of processing steps in the estimation of multivariate directed functional connectivity during seizures recorded with intracranial EEG (iEEG) on seizure-onset zone (SOZ) localization. We studied the effect of (i) the number of nodes, (ii) time-series normalization, (iii) the choice of multivariate time-varying connectivity measure: Adaptive Directed Transfer Function (ADTF) or Adaptive Partial Directed Coherence (APDC) and (iv) graph theory measure: outdegree or shortest path length. First, simulations were performed to quantify the influence of the various processing steps on the accuracy to localize the SOZ. Afterwards, the SOZ was estimated from a 113-electrodes iEEG seizure recording and compared with the resection that rendered the patient seizure-free. The simulations revealed that ADTF is preferred over APDC to localize the SOZ from ictal iEEG recordings. Normalizing the time series before analysis resulted in an increase of 25-35% of correctly localized SOZ, while adding more nodes to the connectivity analysis led to a moderate decrease of 10%, when comparing 128 with 32 input nodes. The real-seizure connectivity estimates localized the SOZ inside the resection area using the ADTF coupled to outdegree or shortest path length. Our study showed that normalizing the time-series is an important pre-processing step, while adding nodes to the analysis did only marginally affect the SOZ localization. The study shows that directed multivariate Granger-based connectivity analysis is feasible with many input nodes (> 100) and that normalization of the time-series before connectivity analysis is preferred.
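The two processing steps that the abstract singles out, normalizing each time series and ranking nodes by outdegree, can be sketched as follows. This is not the authors' pipeline: z-scoring is only one possible normalization, the connectivity values are invented, and the matrix convention (entry `[i][j]` holds the influence from node `j` to node `i`) is an assumption. Estimating the ADTF or APDC values themselves is outside the scope of the sketch.

```python
import statistics


def zscore(series):
    """Normalize one channel to zero mean and unit (population) SD."""
    mu = statistics.fmean(series)
    sd = statistics.pstdev(series)
    return [(x - mu) / sd for x in series]


def outdegree(conn):
    """Sum of outgoing connection strengths per node.

    conn[i][j] is taken as the directed influence from node j to node i.
    """
    n = len(conn)
    return [sum(conn[i][j] for i in range(n) if i != j) for j in range(n)]


# Hypothetical 4-node directed connectivity matrix.
conn = [
    [0.0, 0.1, 0.6, 0.0],
    [0.1, 0.0, 0.7, 0.1],
    [0.0, 0.1, 0.0, 0.0],
    [0.1, 0.0, 0.5, 0.0],
]

out = outdegree(conn)
soz_candidate = max(range(len(out)), key=out.__getitem__)
print("outdegree per node:", out)
print("node with highest outdegree:", soz_candidate)
```

In this toy matrix node 2 drives all other nodes most strongly, so it is the one an outdegree criterion would flag as the candidate onset zone.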
Byun, Bo-Ram; Kim, Yong-Il; Yamaguchi, Tetsutaro; Maki, Koutaro; Ko, Ching-Chang; Hwang, Dea-Seok; Park, Soo-Byung; Son, Woo-Sung
2015-11-01
The purpose of this study was to establish multivariable regression models for the estimation of skeletal maturation status in Japanese boys and girls using the cone-beam computed tomography (CBCT)-based cervical vertebral maturation (CVM) assessment method and hand-wrist radiography. The analyzed sample consisted of hand-wrist radiographs and CBCT images from 47 boys and 57 girls. To quantitatively evaluate the correlation between the skeletal maturation status and measurement ratios, a CBCT-based CVM assessment method was applied to the second, third, and fourth cervical vertebrae. Pearson's correlation coefficient analysis and multivariable regression analysis were used to determine the ratios for each of the cervical vertebrae (p < 0.05). Four characteristic parameters ((OH2 + PH2)/W2, (OH2 + AH2)/W2, D2, AH3/W3), as independent variables, were used to build the multivariable regression models: for the Japanese boys, the skeletal maturation status according to the CBCT-based quantitative cervical vertebral maturation (QCVM) assessment was 5.90 + 99.11 × AH3/W3 - 14.88 × (OH2 + AH2)/W2 + 13.24 × D2; for the Japanese girls, it was 41.39 + 59.52 × AH3/W3 - 15.88 × (OH2 + PH2)/W2 + 10.93 × D2. The CBCT-generated CVM images proved very useful to the definition of the cervical vertebral body and the odontoid process. The newly developed CBCT-based QCVM assessment method showed a high correlation between the derived ratios from the second cervical vertebral body and odontoid process. There are high correlations between the skeletal maturation status and the ratios of the second cervical vertebra based on the remnant of dentocentral synchondrosis.
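The two regression equations quoted in this abstract can be written as a small function. The coefficients below are copied from the abstract; the function name, argument names and dictionary layout are hypothetical, and the measurement units of D2 are not stated in the abstract, so the sketch makes no claim about realistic input ranges.

```python
# Coefficients as reported in the abstract. "h2_w2" stands for
# (OH2 + AH2)/W2 in the boys' model and (OH2 + PH2)/W2 in the girls' model.
QCVM_COEFFS = {
    "boy": {"intercept": 5.90, "ah3_w3": 99.11, "h2_w2": -14.88, "d2": 13.24},
    "girl": {"intercept": 41.39, "ah3_w3": 59.52, "h2_w2": -15.88, "d2": 10.93},
}

def qcvm_score(sex, ah3_w3, h2_w2, d2):
    """Evaluate the reported multivariable regression for 'boy' or 'girl'."""
    c = QCVM_COEFFS[sex]
    return (
        c["intercept"]
        + c["ah3_w3"] * ah3_w3
        + c["h2_w2"] * h2_w2
        + c["d2"] * d2
    )
```

Keeping the coefficients in a table rather than in two separate functions makes the sex-specific difference in the second predictor explicit.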
NASA Astrophysics Data System (ADS)
Pradhan, Biswajeet
2010-05-01
This paper presents the results of the cross-validation of a multivariate logistic regression model using remote sensing data and GIS for landslide hazard analysis on the Penang, Cameron, and Selangor areas in Malaysia. Landslide locations in the study areas were identified by interpreting aerial photographs and satellite images, supported by field surveys. SPOT 5 and Landsat TM satellite imagery were used to map landcover and vegetation index, respectively. Maps of topography, soil type, lineaments and land cover were constructed from the spatial datasets. Ten factors which influence landslide occurrence, i.e., slope, aspect, curvature, distance from drainage, lithology, distance from lineaments, soil type, landcover, rainfall precipitation, and normalized difference vegetation index (NDVI), were extracted from the spatial database and the logistic regression coefficient of each factor was computed. Then the landslide hazard was analysed using the multivariate logistic regression coefficients derived not only from the data for the respective area but also using the logistic regression coefficients calculated from each of the other two areas (nine hazard maps in all) as a cross-validation of the model. For verification of the model, the results of the analyses were then compared with the field-verified landslide locations. Among the three cases of the application of logistic regression coefficients in the same study area, the case of Selangor based on the Selangor logistic regression coefficients showed the highest accuracy (94%), whereas Penang based on the Penang coefficients showed the lowest accuracy (86%). Similarly, among the six cases from the cross application of logistic regression coefficients in the other two areas, the case of Selangor based on the logistic coefficients of Cameron showed the highest prediction accuracy (90%), whereas the case of Penang based on the Selangor logistic regression coefficients showed the lowest accuracy (79%). Qualitatively, the cross application model yields reasonable results which can be used for preliminary landslide hazard mapping.
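The cross-application scheme described here (coefficients fitted in one area, applied to the factor data of every area) can be sketched as below. This is an illustration under stated assumptions: the standard logistic link is assumed, the function and variable names are hypothetical, and no coefficient values from the paper are reproduced.

```python
import math
from itertools import product

AREAS = ["Penang", "Cameron", "Selangor"]

def landslide_probability(coeffs, intercept, factors):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum_i b_i * x_i)))."""
    z = intercept + sum(coeffs[name] * factors[name] for name in coeffs)
    return 1.0 / (1.0 + math.exp(-z))

# Every (coefficient source, target area) combination: three same-area
# cases plus six cross-applied cases, i.e. the nine hazard maps.
hazard_map_cases = list(product(AREAS, repeat=2))
same_area_cases = [(src, dst) for src, dst in hazard_map_cases if src == dst]
cross_cases = [(src, dst) for src, dst in hazard_map_cases if src != dst]
```

Each pair would correspond to one hazard map: the per-cell factor values of the target area are passed to `landslide_probability` together with the coefficients fitted in the source area.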
NASA Astrophysics Data System (ADS)
Valder, J.; Kenner, S.; Long, A.
2008-12-01
Portions of the Cheyenne River are characterized as impaired by the U.S. Environmental Protection Agency because of water-quality exceedances. The Cheyenne River watershed includes the Black Hills National Forest and part of the Badlands National Park. Preliminary analysis indicates that the Badlands National Park is a major contributor to the exceedances of the water-quality constituents for total dissolved solids and total suspended solids. Water-quality data have been collected continuously since 2007, and in the second year of collection (2008), monthly grab and passive sediment samplers are being used to collect total suspended sediment and total dissolved solids in both base-flow and runoff-event conditions. In addition, sediment samples from the river channel, including bed, bank, and floodplain, have been collected. These samples are being analyzed at the South Dakota School of Mines and Technology's X-Ray Diffraction Lab to quantify the mineralogy of the sediments. A multivariate statistical approach (including principal components, least squares, and maximum likelihood techniques) is applied to the mineral percentages that were characterized for each site to identify the contributing source areas that are causing exceedances of sediment transport in the Cheyenne River watershed. Results of the multivariate analysis demonstrate the likely sources of solids found in the Cheyenne River samples. A further refinement of the methods is in progress that utilizes a conceptual model which, when applied with the multivariate statistical approach, provides a better estimate for sediment sources.
NASA Astrophysics Data System (ADS)
Everard, Colm D.; Kim, Moon S.; Lee, Hoyoung
2014-05-01
The production of contaminant-free fresh fruit and vegetables is needed to reduce foodborne illnesses and related costs. Leafy greens grown in the field can be susceptible to fecal matter contamination from uncontrolled livestock and wild animals entering the field. Pathogenic bacteria can be transferred via fecal matter and several outbreaks of E. coli O157:H7 have been associated with the consumption of leafy greens. This study examines the use of hyperspectral fluorescence imaging coupled with multivariate image analysis to detect fecal contamination on spinach leaves (Spinacia oleracea). Hyperspectral fluorescence images from 464 to 800 nm were captured; ultraviolet excitation was supplied by two LED-based line light sources at 370 nm. Key wavelengths and algorithms useful for a contaminant screening optical imaging device were identified and developed, respectively. A non-invasive screening device has the potential to reduce the harmful consequences of foodborne illnesses.
Monitoring Quality of Biotherapeutic Products Using Multivariate Data Analysis.
Rathore, Anurag S; Pathak, Mili; Jain, Renu; Jadaun, Gaurav Pratap Singh
2016-07-01
Monitoring the quality of pharmaceutical products is a global challenge, heightened by the implications of letting subquality drugs come to the market on public safety. Regulatory agencies do their due diligence at the time of approval as per their prescribed regulations. However, product quality needs to be monitored post-approval as well to ensure patient safety throughout the product life cycle. This is particularly complicated for biotechnology-based therapeutics where seemingly minor changes in process and/or raw material attributes have been shown to have a significant effect on clinical safety and efficacy of the product. This article provides a perspective on the topic of monitoring the quality of biotech therapeutics. In the backdrop of challenges faced by the regulatory agencies, the potential use of multivariate data analysis as a tool for effective monitoring has been proposed. Case studies using data from several insulin biosimilars have been used to illustrate the key concepts.
Hemakom, Apit; Goverdovsky, Valentin; Looney, David; Mandic, Danilo P
2016-04-13
An extension to multivariate empirical mode decomposition (MEMD), termed adaptive-projection intrinsically transformed MEMD (APIT-MEMD), is proposed to cater for power imbalances and inter-channel correlations in real-world multichannel data. It is shown that the APIT-MEMD exhibits similar or better performance than MEMD for a large number of projection vectors, whereas it outperforms MEMD for the critical case of a small number of projection vectors within the sifting algorithm. We also employ the noise-assisted APIT-MEMD within our proposed intrinsic multiscale analysis framework and illustrate the advantages of such an approach in notoriously noise-dominated cooperative brain-computer interface (BCI) based on the steady-state visual evoked potentials and the P300 responses. Finally, we show that for a joint cognitive BCI task, the proposed intrinsic multiscale analysis framework improves system performance in terms of the information transfer rate. © 2016 The Author(s).
Jamshidi-Zanjani, Ahmad; Saeedi, Mohsen
2017-07-01
Vertical distribution of metals (Cu, Zn, Cr, Fe, Mn, Pb, Ni, Cd, and Li) in four sediment core samples (C1, C2, C3, and C4) from Anzali international wetland located southwest of the Caspian Sea was examined. Background concentration of each metal was calculated according to different statistical approaches. The results of multivariate statistical analysis showed that Fe and Mn might have a significant role in the fate of Ni and Zn in sediment core samples. Different sediment quality indexes were utilized to assess metal pollution in sediment cores. Moreover, a new sediment quality index named aggregative toxicity index (ATI) based on sediment quality guidelines (SQGs) was developed to assess the degree of metal toxicity in an aggregative manner. The increasing pattern of metal pollution and their toxicity degree in upper layers of core samples indicated increasing effects of anthropogenic sources in the study area.
NASA Astrophysics Data System (ADS)
Yamasaki, Hideki; Morita, Shigeaki
2018-05-01
Multivariate curve resolution (MCR) was applied to a hetero-spectrally combined dataset consisting of mid-infrared (MIR) and near-infrared (NIR) spectra collected during the isothermal curing reaction of an epoxy resin. An epoxy monomer, bisphenol A diglycidyl ether (BADGE), and a hardening agent, 4,4'-diaminodiphenyl methane (DDM), were used for the reaction. The fundamental modes of the N-H and O-H stretches were highly overlapped in the MIR region, while their first overtones could be independently identified in the NIR region. The concentration profiles obtained by MCR using the hetero-spectral combination showed good agreement with the results of calculations based on the Beer-Lambert law and the mass balance. The band assignments and absorption sites estimated by the analysis also showed good agreement with the results using two-dimensional (2D) hetero-correlation spectroscopy.
Yamasaki, Hideki; Morita, Shigeaki
2018-05-15
Multivariate curve resolution (MCR) was applied to a hetero-spectrally combined dataset consisting of mid-infrared (MIR) and near-infrared (NIR) spectra collected during the isothermal curing reaction of an epoxy resin. An epoxy monomer, bisphenol A diglycidyl ether (BADGE), and a hardening agent, 4,4'-diaminodiphenyl methane (DDM), were used for the reaction. The fundamental modes of the NH and OH stretches were highly overlapped in the MIR region, while their first overtones could be independently identified in the NIR region. The concentration profiles obtained by MCR using the hetero-spectral combination showed good agreement with the results of calculations based on the Beer-Lambert law and the mass balance. The band assignments and absorption sites estimated by the analysis also showed good agreement with the results using two-dimensional (2D) hetero-correlation spectroscopy. Copyright © 2017 Elsevier B.V. All rights reserved.
Multivariate analysis: greater insights into complex systems
USDA-ARS's Scientific Manuscript database
Many agronomic researchers measure and collect multiple response variables in an effort to understand the more complex nature of the system being studied. Multivariate (MV) statistical methods encompass the simultaneous analysis of all random variables (RV) measured on each experimental or sampling ...
DOE Office of Scientific and Technical Information (OSTI.GOV)
Van Benthem, Mark Hilary; Mowry, Curtis Dale; Kotula, Paul Gabriel
Thermal decomposition of polydimethylsiloxane compounds, Sylgard® 184 and 186, was examined using thermal desorption coupled gas chromatography-mass spectrometry (TD/GC-MS) and multivariate analysis. This work describes a method of producing multiway data using a stepped thermal desorption. The technique involves sequentially heating a sample of the material of interest with subsequent analysis in a commercial GC/MS system. The decomposition chromatograms were analyzed using multivariate analysis tools including principal component analysis (PCA), factor rotation employing the varimax criterion, and multivariate curve resolution. The results of the analysis show seven components related to offgassing of various fractions of siloxanes that vary as a function of temperature. Thermal desorption coupled with gas chromatography-mass spectrometry (TD/GC-MS) is a powerful analytical technique for analyzing chemical mixtures. It has great potential in numerous analytic areas including materials analysis, sports medicine, the detection of designer drugs, and biological research for metabolomics. Data analysis is complicated, far from automated and can result in high false positive or false negative rates. We have demonstrated a step-wise TD/GC-MS technique that removes more volatile compounds from a sample before extracting the less volatile compounds. This creates an additional dimension of separation before the GC column, while simultaneously generating three-way data. Sandia's proven multivariate analysis methods, when applied to these data, have several advantages over current commercial options. It also has demonstrated potential for success in finding and enabling identification of trace compounds.
Several challenges remain, however, including understanding the sources of noise in the data, outlier detection, improving the data pretreatment and analysis methods, developing a software tool for ease of use by the chemist, and demonstrating our belief that this multivariate analysis will enable superior differentiation capabilities. In addition, noise and system artifacts challenge the analysis of GC-MS data collected on lower cost equipment, ubiquitous in commercial laboratories. This research has the potential to affect many areas of analytical chemistry including materials analysis, medical testing, and environmental surveillance. It could also provide a method to measure adsorption parameters for chemical interactions on various surfaces by measuring desorption as a function of temperature for mixtures. We have presented results of a novel method for examining offgas products of a common PDMS material. Our method involves utilizing a stepped TD/GC-MS data acquisition scheme that may be almost totally automated, coupled with multivariate analysis schemes. This method of data generation and analysis can be applied to a number of materials aging and thermal degradation studies.
Mercuri, A; Pagliari, M; Baxevanis, F; Fares, R; Fotaki, N
2017-02-25
In this study the selection of in vivo predictive in vitro dissolution experimental set-ups using a multivariate analysis approach, in line with the Quality by Design (QbD) principles, is explored. The dissolution variables selected using a design of experiments (DoE) were the dissolution apparatus [USP1 apparatus (basket) and USP2 apparatus (paddle)], the rotational speed of the basket/or paddle, the operator conditions (dissolution apparatus brand and operator), the volume, the pH, and the ethanol content of the dissolution medium. The dissolution profiles of two nifedipine capsules (poorly soluble compound), under conditions mimicking the intake of the capsules with i. water, ii. orange juice and iii. an alcoholic drink (orange juice and ethanol) were analysed using multiple linear regression (MLR). Optimised dissolution set-ups, generated based on the mathematical model obtained via MLR, were used to build predicted in vitro-in vivo correlations (IVIVC). IVIVC could be achieved using physiologically relevant in vitro conditions mimicking the intake of the capsules with an alcoholic drink (orange juice and ethanol). The multivariate analysis revealed that the concentration of ethanol used in the in vitro dissolution experiments (47% v/v) can be lowered to less than 20% v/v, reflecting recently found physiological conditions. Copyright © 2016 Elsevier B.V. All rights reserved.
Risk prediction for myocardial infarction via generalized functional regression models.
Ieva, Francesca; Paganoni, Anna M
2016-08-01
In this paper, we propose a generalized functional linear regression model for a binary outcome indicating the presence/absence of a cardiac disease with multivariate functional data among the relevant predictors. In particular, the motivating aim is the analysis of electrocardiographic traces of patients whose pre-hospital electrocardiogram (ECG) has been sent to the 118 Dispatch Center of Milan (the Italian toll-free number for emergencies) by life support personnel of the basic rescue units. The statistical analysis starts with a preprocessing of ECGs treated as multivariate functional data. The signals are reconstructed from noisy observations. The biological variability is then removed by a nonlinear registration procedure based on landmarks. Then, in order to perform a data-driven dimensional reduction, a multivariate functional principal component analysis is carried out on the variance-covariance matrix of the reconstructed and registered ECGs and their first derivatives. We use the scores of the Principal Components decomposition as covariates in a generalized linear model to predict the presence of the disease in a new patient. Hence, a new semi-automatic diagnostic procedure is proposed to estimate the risk of infarction (in the case of interest, the probability of being affected by Left Bundle Branch Block). The performance of this classification method is evaluated and compared with other methods proposed in the literature. Finally, the robustness of the procedure is checked via leave-j-out techniques. © The Author(s) 2013.
Giacomino, Agnese; Abollino, Ornella; Malandrino, Mery; Mentasti, Edoardo
2011-03-04
Single and sequential extraction procedures are used for studying element mobility and availability in solid matrices, like soils, sediments, sludge, and airborne particulate matter. In the first part of this review we reported an overview on these procedures and described the applications of chemometric uni- and bivariate techniques and of multivariate pattern recognition techniques based on variable reduction to the experimental results obtained. The second part of the review deals with the use of chemometrics not only for the visualization and interpretation of data, but also for the investigation of the effects of experimental conditions on the response, the optimization of their values and the calculation of element fractionation. We will describe the principles of the multivariate chemometric techniques considered, the aims for which they were applied and the key findings obtained. The following topics will be critically addressed: pattern recognition by cluster analysis (CA), linear discriminant analysis (LDA) and other less common techniques; modelling by multiple linear regression (MLR); investigation of spatial distribution of variables by geostatistics; calculation of fractionation patterns by a mixture resolution method (Chemometric Identification of Substrates and Element Distributions, CISED); optimization and characterization of extraction procedures by experimental design; other multivariate techniques less commonly applied. Copyright © 2010 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
O'Brien, S. J.; Fitzpatrick, P. J.; Dzwonkowski, B.; Dykstra, S. L.; Wallace, D. J.; Church, I.; Wiggert, J. D.
2016-02-01
The Mississippi Sound is influenced by a high volume of sediment discharge from the Biloxi River, Mobile Bay via Pas aux Herons, Pascagoula River, Pearl River, Wolf River, and Lake Pontchartrain through the Rigolets. The river discharge, variable wind speed, wind direction and tides have a significant impact on the turbidity and transport of sediments in the Sound. Level 1 Moderate Resolution Imaging Spectroradiometer (MODIS) data is processed to extract the remote sensing reflectance at the wavelength of 645 nm and binned into an 8-day composite at a resolution of 500 m. The study uses a regional ocean color algorithm to compute suspended particulate matter (SPM) concentration based on these 8-day composite images. Multivariate analysis is applied between the SPM and time series of tides, wind, turbidity and river discharge measured at federal and academic institutions' stations and moorings. The multivariate analysis also includes in situ measurements of suspended sediment concentration and advective exchanges through the Mississippi Sound's tidal inlets between the coastal shelf and the nearshore estuarine waters. Mechanisms underlying the observed spatiotemporal distribution of SPM, including material exchange between the Sound and adjacent shelf waters, will be explored. The results of this study will contribute to current understanding of exchange mechanisms and pathways with the Mississippi Bight via the Mississippi Sound's tidal inlets.
Zhang, Jing; Liang, Lichen; Anderson, Jon R; Gatewood, Lael; Rottenberg, David A; Strother, Stephen C
2008-01-01
As functional magnetic resonance imaging (fMRI) becomes widely used, the demand for evaluation of fMRI processing pipelines and validation of fMRI analysis results is increasing rapidly. The current NPAIRS package, an IDL-based fMRI processing pipeline evaluation framework, lacks system interoperability and the ability to evaluate general linear model (GLM)-based pipelines using prediction metrics. Thus, it cannot fully evaluate fMRI analytical software modules such as FSL.FEAT and NPAIRS.GLM. In order to overcome these limitations, a Java-based fMRI processing pipeline evaluation system was developed. It integrated YALE (a machine learning environment) into Fiswidgets (an fMRI software environment) to obtain system interoperability and applied an algorithm to measure GLM prediction accuracy. The results demonstrated that the system can evaluate fMRI processing pipelines with univariate GLM and multivariate canonical variates analysis (CVA)-based models on real fMRI data based on prediction accuracy (classification accuracy) and statistical parametric image (SPI) reproducibility. In addition, a preliminary study was performed where four fMRI processing pipelines with GLM and CVA modules such as FSL.FEAT and NPAIRS.CVA were evaluated with the system. The results indicated that (1) the system can compare different fMRI processing pipelines with heterogeneous models (NPAIRS.GLM, NPAIRS.CVA and FSL.FEAT) and rank their performance by automatic performance scoring, and (2) the rank of pipeline performance is highly dependent on the preprocessing operations. These results suggest that the system will be of value for the comparison, validation, standardization and optimization of functional neuroimaging software packages and fMRI processing pipelines.
Muradian, Kh K; Utko, N O; Mozzhukhina, T H; Pishel', I M; Litoshenko, O Ia; Bezrukov, V V; Fraĭfel'd, V E
2002-01-01
Correlation and regression relations between gaseous exchange, thermoregulation and mitochondrial protein content were analyzed by two- and three-dimensional statistics in mice. Pairwise linear methods of analysis did not reveal any significant correlation between the parameters under study. However, correlations became evident with three-dimensional and non-linear plotting, for which the coefficients of multivariable correlation reached and even exceeded 0.7-0.8. Calculations based on partial differentiation of the multivariable regression equations allow us to conclude that, at certain values of VO2, VCO2 and body temperature, negative relations between the systems of gaseous exchange and thermoregulation become dominant.
Quantification of proportions of different water sources in a mining operation.
Scheiber, Laura; Ayora, Carlos; Vázquez-Suñé, Enric
2018-04-01
The water drained in mining operations (galleries, shafts, open pits) usually comes from different sources. Evaluating the contribution of these sources is very often necessary for water management. To determine mixing ratios, a conventional mass balance is often used. However, the presence of more than two sources creates uncertainties in mass balance applications. Moreover, the composition of the end-members is not commonly known with certainty and/or can vary in space and time. In this paper, we propose a powerful tool for solving such problems and managing groundwater in mining sites based on multivariate statistical analysis. This approach was applied to the Cobre Las Cruces mining complex, the largest copper mine in Europe. There, the open pit water is a mixture of three end-members: runoff (RO), basal Miocene (Mb) and Paleozoic (PZ) groundwater. The volume of water drained from the Miocene base aquifer must be determined and compensated via artificial recharging to comply with current regulations. Through multivariate statistical analysis of samples from a regional field campaign, the compositions of PZ and Mb end-members were firstly estimated, and then used for mixing calculations at the open pit scale. The runoff end-member was directly determined from samples collected in interception trenches inside the open pit. The application of multivariate statistical methods allowed the estimation of mixing ratios for the hydrological years 2014-2015 and 2015-2016. Open pit water proportions have changed from 15% to 7%, 41% to 36%, and 44% to 57% for runoff, Mb and PZ end-members, respectively. An independent estimation of runoff based on the curve method yielded comparable results. Copyright © 2017 Elsevier B.V. All rights reserved.
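For orientation, the conventional mass balance that this abstract contrasts with its multivariate approach can be sketched for three end-members and two conservative tracers. This is not the authors' method: it assumes end-member compositions are known exactly, and the tracer concentrations in the example are hypothetical. Only the target fractions (7%, 36%, 57%) echo the proportions quoted in the abstract.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (
        m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
        - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
        + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0])
    )

def mixing_fractions(end_members, sample):
    """Solve for three mixing fractions with Cramer's rule.

    end_members: three (tracer1, tracer2) concentration pairs.
    sample: (tracer1, tracer2) concentrations of the mixed water.
    The first equation enforces that the fractions sum to 1.
    """
    a = [
        [1.0, 1.0, 1.0],
        [em[0] for em in end_members],
        [em[1] for em in end_members],
    ]
    b = [1.0, sample[0], sample[1]]
    d = det3(a)
    if abs(d) < 1e-12:
        raise ValueError("end-members are not distinguishable with these tracers")
    fractions = []
    for k in range(3):
        # Replace column k with the right-hand side.
        ak = [[b[r] if c == k else a[r][c] for c in range(3)] for r in range(3)]
        fractions.append(det3(ak) / d)
    return fractions

# Hypothetical tracer concentrations for runoff, Mb and PZ end-members.
example_end_members = [(10.0, 1.0), (100.0, 5.0), (50.0, 20.0)]
# Mixed sample built from fractions 0.07, 0.36 and 0.57.
example_sample = (65.2, 13.27)
```

With exactly as many equations as unknowns, any error in an end-member composition propagates directly into the fractions, which is the weakness the abstract attributes to conventional mass balances when sources are numerous or uncertain.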
Composition Studies with the Telescope Array Surface Detector
NASA Astrophysics Data System (ADS)
Kuznetsov, Mikhail; Piskunov, Maxim; Rubtsov, Grigory; Troitsky, Sergey; Zhezher, Yana
The results on ultra-high-energy cosmic-ray chemical composition based on the data from the Telescope Array surface-detector are presented. The method is based on the multivariate boosted decision tree (BDT) analysis which uses surface-detector observables. The results on average atomic mass in the energy range 10^18.0 to 10^20.0 eV are presented. A comparison with the Telescope Array hybrid results and the Pierre Auger Observatory surface detector results is shown.
Delay differential analysis of time series.
Lainscsek, Claudia; Sejnowski, Terrence J
2015-03-01
Nonlinear dynamical system analysis based on embedding theory has been used for modeling and prediction, but it also has applications to signal detection and classification of time series. An embedding creates a multidimensional geometrical object from a single time series. Traditionally either delay or derivative embeddings have been used. The delay embedding is composed of delayed versions of the signal, and the derivative embedding is composed of successive derivatives of the signal. The delay embedding has been extended to nonuniform embeddings to take multiple timescales into account. Both embeddings provide information on the underlying dynamical system without having direct access to all the system variables. Delay differential analysis is based on functional embeddings, a combination of the derivative embedding with nonuniform delay embeddings. Small delay differential equation (DDE) models that best represent relevant dynamic features of time series data are selected from a pool of candidate models for detection or classification. We show that the properties of DDEs support spectral analysis in the time domain where nonlinear correlation functions are used to detect frequencies, frequency and phase couplings, and bispectra. These can be efficiently computed with short time windows and are robust to noise. For frequency analysis, this framework is a multivariate extension of discrete Fourier transform (DFT), and for higher-order spectra, it is a linear and multivariate alternative to multidimensional fast Fourier transform of multidimensional correlations. This method can be applied to short or sparse time series and can be extended to cross-trial and cross-channel spectra if multiple short data segments of the same experiment are available. 
Together, this time-domain toolbox provides higher temporal resolution, increased frequency and phase coupling information, and it allows an easy and straightforward implementation of higher-order spectra across time compared with frequency-based methods such as the DFT and cross-spectral analysis.
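The delay embedding described in this abstract, including its nonuniform variant, can be sketched in a few lines. This is a generic illustration of the embedding step only, not the authors' delay differential analysis; the function name and the example delays are hypothetical.

```python
def delay_embedding(signal, delays):
    """Build embedding vectors (x[t - d1], x[t - d2], ...) from one time series.

    delays may be unevenly spaced, which gives a nonuniform embedding that
    can capture several timescales at once.
    """
    if any(d < 0 for d in delays):
        raise ValueError("delays must be non-negative")
    max_delay = max(delays)
    return [
        tuple(signal[t - d] for d in delays)
        for t in range(max_delay, len(signal))
    ]

# A six-sample series embedded with nonuniform delays 0, 1 and 3.
example_vectors = delay_embedding([0, 1, 2, 3, 4, 5], (0, 1, 3))
```

Each tuple is one point of the multidimensional geometrical object that the abstract refers to; the first usable time index is the largest delay, so the embedding has `len(signal) - max(delays)` points.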
MULTIVARIATE CURVE RESOLUTION OF NMR SPECTROSCOPY METABONOMIC DATA
Sandia National Laboratories is working with the EPA to evaluate and develop mathematical tools for analysis of the collected NMR spectroscopy data. Initially, we have focused on the use of Multivariate Curve Resolution (MCR) also known as molecular factor analysis (MFA), a tech...
Pérez, Concepción; Navarro, Ana; Saldaña, María T; Wilson, Koo; Rejas, Javier
2015-03-01
The aim of the present analysis was to model the association and predictive value of pain intensity on cost and resource utilization in patients with chronic peripheral neuropathic pain (PNP) treated in routine clinical practice settings in Spain. We performed a secondary economic analysis based on data from a multicenter, observational, and prospective cost-of-illness study in patients with chronic PNP that is refractory to prior treatment. Pain intensity was measured using the Short-Form McGill Pain Questionnaire. Univariate and multivariate linear regression models were fitted to identify independent predictors of cost and health care/non-health care resource utilization. A total of 1703 patients were included in the current analysis. Pain intensity was an independent predictor of total costs ([total costs]=35.6 [pain intensity]+214.5; coefficient of determination [R²]=0.19, P<0.001), direct costs ([direct costs]=10.8 [pain intensity]+257.7; R²=0.06, P<0.001), and indirect costs ([indirect costs]=24.8 [pain intensity]-43.4; R²=0.20, P<0.001) related to chronic PNP in the univariate analysis. Pain intensity remained significantly associated with total costs, direct costs, and indirect costs after adjustment for other covariates in the multivariate analysis (P<0.001). None of the other variables considered in the multivariate analysis were predictors of resource utilization. Pain intensity predicts health care and non-health care resource utilization, and costs related to chronic PNP. Management of patients with drugs associated with a higher reduction of pain intensity may have a greater impact on the economic burden of that condition.
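The three univariate equations reported in this abstract can be evaluated directly. The coefficients below are copied from the abstract; the function name is hypothetical, and the abstract does not state the currency, the costing period, or the range of the pain-intensity score, so the sketch assumes none of them.

```python
def predicted_costs(pain_intensity):
    """Evaluate the reported univariate regressions for one pain-intensity score."""
    return {
        "total": 35.6 * pain_intensity + 214.5,
        "direct": 10.8 * pain_intensity + 257.7,
        "indirect": 24.8 * pain_intensity - 43.4,
    }

costs = predicted_costs(10.0)
# The direct and indirect slopes add up to the total slope (10.8 + 24.8 = 35.6),
# so the component predictions differ from the total only by the intercepts.
gap = costs["total"] - (costs["direct"] + costs["indirect"])
```

Because the three models were fitted separately, the small constant gap between the total and the sum of its components (214.5 versus 257.7 - 43.4 = 214.3) is expected and is most plausibly a rounding effect.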
Pleiotropy Analysis of Quantitative Traits at Gene Level by Multivariate Functional Linear Models
Wang, Yifan; Liu, Aiyi; Mills, James L.; Boehnke, Michael; Wilson, Alexander F.; Bailey-Wilson, Joan E.; Xiong, Momiao; Wu, Colin O.; Fan, Ruzong
2015-01-01
In genetics, pleiotropy describes the genetic effect of a single gene on multiple phenotypic traits. A common approach is to analyze the phenotypic traits separately using univariate analyses and combine the test results through multiple comparisons. This approach may lead to low power. Multivariate functional linear models are developed to connect genetic variant data to multiple quantitative traits adjusting for covariates for a unified analysis. Three types of approximate F-distribution tests based on Pillai–Bartlett trace, Hotelling–Lawley trace, and Wilks’s Lambda are introduced to test for association between multiple quantitative traits and multiple genetic variants in one genetic region. The approximate F-distribution tests provide much more significant results than those of F-tests of univariate analysis and optimal sequence kernel association test (SKAT-O). Extensive simulations were performed to evaluate the false positive rates and power performance of the proposed models and tests. We show that the approximate F-distribution tests control the type I error rates very well. Overall, simultaneous analysis of multiple traits can increase power performance compared to an individual test of each trait. The proposed methods were applied to analyze (1) four lipid traits in eight European cohorts, and (2) three biochemical traits in the Trinity Students Study. The approximate F-distribution tests provide much more significant results than those of F-tests of univariate analysis and SKAT-O for the three biochemical traits. The approximate F-distribution tests of the proposed functional linear models are more sensitive than those of the traditional multivariate linear models that in turn are more sensitive than SKAT-O in the univariate case. The analysis of the four lipid traits and the three biochemical traits detects more association than SKAT-O in the univariate case. PMID:25809955
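The three multivariate test statistics named in this abstract have standard textbook definitions in terms of a hypothesis sum-of-squares matrix H and an error matrix E. The sketch below computes them for the two-trait case only, so that plain 2x2 arithmetic suffices; the helper names and the example matrices are hypothetical, and the conversion of each statistic to its approximate F-distribution, which the paper relies on, is not shown.

```python
def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    d = det2(m)
    if d == 0:
        raise ValueError("matrix is singular")
    return [[m[1][1] / d, -m[0][1] / d], [-m[1][0] / d, m[0][0] / d]]

def matmul2(a, b):
    return [
        [sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]

def add2(a, b):
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]

def trace2(m):
    return m[0][0] + m[1][1]

def manova_statistics(h, e):
    """Wilks's Lambda, Pillai-Bartlett trace and Hotelling-Lawley trace."""
    total = add2(h, e)
    return {
        "wilks": det2(e) / det2(total),                    # det(E) / det(H + E)
        "pillai": trace2(matmul2(h, inv2(total))),         # tr(H (H + E)^-1)
        "hotelling_lawley": trace2(matmul2(h, inv2(e))),   # tr(H E^-1)
    }

# Hypothetical diagonal matrices chosen so the results are easy to check by hand.
example_stats = manova_statistics([[2.0, 0.0], [0.0, 1.0]], [[2.0, 0.0], [0.0, 3.0]])
```

A small Wilks's Lambda and large Pillai-Bartlett or Hotelling-Lawley traces all point toward association between the traits and the predictors; the three statistics summarize the same eigenvalues of H relative to E in different ways.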
Pleiotropy analysis of quantitative traits at gene level by multivariate functional linear models.
Wang, Yifan; Liu, Aiyi; Mills, James L; Boehnke, Michael; Wilson, Alexander F; Bailey-Wilson, Joan E; Xiong, Momiao; Wu, Colin O; Fan, Ruzong
2015-05-01
In genetics, pleiotropy describes the genetic effect of a single gene on multiple phenotypic traits. A common approach is to analyze the phenotypic traits separately using univariate analyses and combine the test results through multiple comparisons. This approach may lead to low power. Multivariate functional linear models are developed to connect genetic variant data to multiple quantitative traits adjusting for covariates for a unified analysis. Three types of approximate F-distribution tests based on Pillai-Bartlett trace, Hotelling-Lawley trace, and Wilks's Lambda are introduced to test for association between multiple quantitative traits and multiple genetic variants in one genetic region. The approximate F-distribution tests provide much more significant results than those of F-tests of univariate analysis and optimal sequence kernel association test (SKAT-O). Extensive simulations were performed to evaluate the false positive rates and power performance of the proposed models and tests. We show that the approximate F-distribution tests control the type I error rates very well. Overall, simultaneous analysis of multiple traits can increase power performance compared to an individual test of each trait. The proposed methods were applied to analyze (1) four lipid traits in eight European cohorts, and (2) three biochemical traits in the Trinity Students Study. The approximate F-distribution tests provide much more significant results than those of F-tests of univariate analysis and SKAT-O for the three biochemical traits. The approximate F-distribution tests of the proposed functional linear models are more sensitive than those of the traditional multivariate linear models that in turn are more sensitive than SKAT-O in the univariate case. The analysis of the four lipid traits and the three biochemical traits detects more association than SKAT-O in the univariate case. © 2015 WILEY PERIODICALS, INC.
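For reference, the three multivariate statistics named in this abstract are standard functions of the eigenvalues of a hypothesis-versus-error matrix product. The notation below (H, E, the eigenvalues, s) is assumed here for illustration and does not appear in the abstract; the paper's specific F approximations are not reproduced.

```latex
% Assumed notation: H = hypothesis SSCP matrix, E = error SSCP matrix,
% \lambda_1,\dots,\lambda_s = nonzero eigenvalues of E^{-1}H.
\begin{align}
  \Lambda_{\text{Wilks}} &= \frac{\det(E)}{\det(H+E)} = \prod_{i=1}^{s} \frac{1}{1+\lambda_i}, \\
  V_{\text{Pillai--Bartlett}} &= \operatorname{tr}\!\left[H(H+E)^{-1}\right] = \sum_{i=1}^{s} \frac{\lambda_i}{1+\lambda_i}, \\
  U_{\text{Hotelling--Lawley}} &= \operatorname{tr}\!\left(E^{-1}H\right) = \sum_{i=1}^{s} \lambda_i .
\end{align}
```

Small values of Wilks's Lambda and large values of the two traces indicate association; each is then converted to an approximate F statistic.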
Gemignani, Jessica; Middell, Eike; Barbour, Randall L; Graber, Harry L; Blankertz, Benjamin
2018-04-04
The statistical analysis of functional near infrared spectroscopy (fNIRS) data based on the general linear model (GLM) is often made difficult by serial correlations, high inter-subject variability of the hemodynamic response, and the presence of motion artifacts. In this work we propose to extract information on the pattern of hemodynamic activations without using any a priori model for the data, by classifying the channels as 'active' or 'not active' with a multivariate classifier based on linear discriminant analysis (LDA). This work is developed in two steps. First we compared the performance of the two analyses, using a synthetic approach in which simulated hemodynamic activations were combined with either simulated or real resting-state fNIRS data. This procedure allowed for exact quantification of the classification accuracies of GLM and LDA. In the case of real resting-state data, the correlations between classification accuracy and demographic characteristics were investigated by means of a Linear Mixed Model. In the second step, to further characterize the reliability of the newly proposed analysis method, we conducted an experiment in which participants had to perform a simple motor task and data were analyzed with the LDA-based classifier as well as with the standard GLM analysis. The results of the simulation study show that the LDA-based method achieves higher classification accuracies than the GLM analysis, and that the LDA results are more uniform across different subjects and, in contrast to the accuracies achieved by the GLM analysis, have no significant correlations with any of the demographic characteristics. Findings from the real-data experiment are consistent with the results of the real-plus-simulation study, in that the GLM-analysis results show greater inter-subject variability than do the corresponding LDA results. 
The results obtained suggest that the outcome of GLM analysis is highly vulnerable to violations of theoretical assumptions, and that therefore a data-driven approach such as that provided by the proposed LDA-based method is to be favored.
Hurtado Rúa, Sandra M; Mazumdar, Madhu; Strawderman, Robert L
2015-12-30
Bayesian meta-analysis is an increasingly important component of clinical research, with multivariate meta-analysis a promising tool for studies with multiple endpoints. Model assumptions, including the choice of priors, are crucial aspects of multivariate Bayesian meta-analysis (MBMA) models. In a given model, two different prior distributions can lead to different inferences about a particular parameter. A simulation study was performed in which the impact of families of prior distributions for the covariance matrix of a multivariate normal random effects MBMA model was analyzed. Inferences about effect sizes were not particularly sensitive to prior choice, but the related covariance estimates were. A few families of prior distributions with small relative biases, tight mean squared errors, and close to nominal coverage for the effect size estimates were identified. Our results demonstrate the need for sensitivity analysis and suggest some guidelines for choosing prior distributions in this class of problems. The MBMA models proposed here are illustrated in a small meta-analysis example from the periodontal field and a medium meta-analysis from the study of stroke. Copyright © 2015 John Wiley & Sons, Ltd.
Hebart, Martin N.; Görgen, Kai; Haynes, John-Dylan
2015-01-01
The multivariate analysis of brain signals has recently sparked a great amount of interest, yet accessible and versatile tools to carry out decoding analyses are scarce. Here we introduce The Decoding Toolbox (TDT) which represents a user-friendly, powerful and flexible package for multivariate analysis of functional brain imaging data. TDT is written in Matlab and equipped with an interface to the widely used brain data analysis package SPM. The toolbox allows running fast whole-brain analyses, region-of-interest analyses and searchlight analyses, using machine learning classifiers, pattern correlation analysis, or representational similarity analysis. It offers automatic creation and visualization of diverse cross-validation schemes, feature scaling, nested parameter selection, a variety of feature selection methods, multiclass capabilities, and pattern reconstruction from classifier weights. While basic users can implement a generic analysis in one line of code, advanced users can extend the toolbox to their needs or exploit the structure to combine it with external high-performance classification toolboxes. The toolbox comes with an example data set which can be used to try out the various analysis methods. Taken together, TDT offers a promising option for researchers who want to employ multivariate analyses of brain activity patterns. PMID:25610393
Application of multivariable statistical techniques in plant-wide WWTP control strategies analysis.
Flores, X; Comas, J; Roda, I R; Jiménez, L; Gernaey, K V
2007-01-01
The main objective of this paper is to present the application of selected multivariable statistical techniques in plant-wide wastewater treatment plant (WWTP) control strategies analysis. In this study, cluster analysis (CA), principal component analysis/factor analysis (PCA/FA) and discriminant analysis (DA) are applied to the evaluation matrix data set obtained by simulation of several control strategies applied to the plant-wide IWA Benchmark Simulation Model No 2 (BSM2). These techniques make it possible i) to determine natural groups or clusters of control strategies with similar behaviour, ii) to find and interpret hidden, complex and causal relationships in the data set and iii) to identify important discriminant variables within the groups found by the cluster analysis. This study illustrates the usefulness of multivariable statistical techniques for both analysis and interpretation of complex multicriteria data sets and allows an improved use of information for effective evaluation of control strategies.
NASA Astrophysics Data System (ADS)
Hao, Zengchao; Hao, Fanghua; Singh, Vijay P.
2016-08-01
Drought is among the costliest natural hazards worldwide and extreme drought events in recent years have caused huge losses to various sectors. Drought prediction is therefore critically important for providing early warning information to aid decision making to cope with drought. Due to the complicated nature of drought, it has been recognized that the univariate drought indicator may not be sufficient for drought characterization and hence multivariate drought indices have been developed for drought monitoring. Alongside the substantial effort in drought monitoring with multivariate drought indices, it is of equal importance to develop a drought prediction method with multivariate drought indices to integrate drought information from various sources. This study proposes a general framework for multivariate multi-index drought prediction that is capable of integrating complementary prediction skills from multiple drought indices. The Multivariate Ensemble Streamflow Prediction (MESP) is employed to sample from historical records for obtaining statistical prediction of multiple variables, which is then used as inputs to achieve multivariate prediction. The framework is illustrated with a linearly combined drought index (LDI), which is a commonly used multivariate drought index, based on climate division data in California and New York in the United States with different seasonality of precipitation. The predictive skill of LDI (represented with persistence) is assessed by comparison with the univariate drought index and results show that the LDI prediction skill is less affected by seasonality than the meteorological drought prediction based on SPI. Prediction results from the case study show that the proposed multivariate drought prediction outperforms the persistence prediction, implying a satisfactory performance of multivariate drought prediction. 
The proposed method would be useful for drought prediction to integrate drought information from various sources for early drought warning.
ERIC Educational Resources Information Center
Barton, Mitch; Yeatts, Paul E.; Henson, Robin K.; Martin, Scott B.
2016-01-01
There has been a recent call to improve data reporting in kinesiology journals, including the appropriate use of univariate and multivariate analysis techniques. For example, a multivariate analysis of variance (MANOVA) with univariate post hocs and a Bonferroni correction is frequently used to investigate group differences on multiple dependent…
Sun, Li-Li; Wang, Meng; Zhang, Hui-Jie; Liu, Ya-Nan; Ren, Xiao-Liang; Deng, Yan-Ru; Qi, Ai-Di
2018-01-01
Polygoni Multiflori Radix (PMR) is increasingly being used not just as a traditional herbal medicine but also as a popular functional food. In this study, multivariate chemometric methods and mass spectrometry were combined to analyze the ultra-high-performance liquid chromatograph (UPLC) fingerprints of PMR from six different geographical origins. A chemometric strategy based on multivariate curve resolution-alternating least squares (MCR-ALS) and three classification methods is proposed to analyze the UPLC fingerprints obtained. Common chromatographic problems, including the background contribution, baseline contribution, and peak overlap, were handled by the established MCR-ALS model. A total of 22 components were resolved. Moreover, relative species concentrations were obtained from the MCR-ALS model, which was used for multivariate classification analysis. Principal component analysis (PCA) and Ward's method have been applied to classify 72 PMR samples from six different geographical regions. The PCA score plot showed that the PMR samples fell into four clusters, which related to the geographical location and climate of the source areas. The results were then corroborated by Ward's method. In addition, according to the variance-weighted distance between cluster centers obtained from Ward's method, five components were identified as the most significant variables (chemical markers) for cluster discrimination. A counter-propagation artificial neural network has been applied to confirm and predict the effects of chemical markers on different samples. Finally, the five chemical markers were identified by UPLC-quadrupole time-of-flight mass spectrometer. Components 3, 12, 16, 18, and 19 were identified as 2,3,5,4'-tetrahydroxy-stilbene-2-O-β-d-glucoside, emodin-8-O-β-d-glucopyranoside, emodin-8-O-(6'-O-acetyl)-β-d-glucopyranoside, emodin, and physcion, respectively. In conclusion, the proposed method can be applied for the comprehensive analysis of natural samples. 
Copyright © 2016. Published by Elsevier B.V.
Tyagi, Neelam; Sutton, Elizabeth; Hunt, Margie; Zhang, Jing; Oh, Jung Hun; Apte, Aditya; Mechalakos, James; Wilgucki, Molly; Gelb, Emily; Mehrara, Babak; Matros, Evan; Ho, Alice
2017-02-01
Capsular contracture (CC) is a serious complication in patients receiving implant-based reconstruction for breast cancer. Currently, no objective methods are available for assessing CC. The goal of the present study was to identify image-based surrogates of CC using magnetic resonance imaging (MRI). We analyzed a retrospective data set of 50 patients who had undergone both a diagnostic MRI scan and a plastic surgeon's evaluation of the CC score (Baker's score) within a 6-month period after mastectomy and reconstructive surgery. The MRI scans were assessed for morphologic shape features of the implant and histogram features of the pectoralis muscle. The shape features, such as roundness, eccentricity, solidity, extent, and ratio length for the implant, were compared with the Baker score. For the pectoralis muscle, the muscle width and median, skewness, and kurtosis of the intensity were compared with the Baker score. Univariate analysis (UVA) using a Wilcoxon rank-sum test and multivariate analysis with the least absolute shrinkage and selection operator logistic regression was performed to determine significant differences in these features between the patient groups categorized according to their Baker's scores. UVA showed statistically significant differences between grade 1 and grade ≥2 for morphologic shape features and histogram features, except for volume and skewness. Only eccentricity, ratio length, and volume were borderline significant in differentiating grade ≤2 and grade ≥3. Features with P<.1 on UVA were used in the multivariate least absolute shrinkage and selection operator logistic regression analysis. Multivariate analysis showed a good level of predictive power for grade 1 versus grade ≥2 CC (area under the receiver operating characteristic curve 0.78, sensitivity 0.78, and specificity 0.82) and for grade ≤2 versus grade ≥3 CC (area under the receiver operating characteristic curve 0.75, sensitivity 0.75, and specificity 0.79). 
The morphologic shape features described on MR images were associated with the severity of CC. MRI has the potential to further improve the diagnostic ability of the Baker score in breast cancer patients who undergo implant reconstruction. Copyright © 2016 Elsevier Inc. All rights reserved.
A novel multi-variant epitope ensemble vaccine against avian leukosis virus subgroup J.
Wang, Xiaoyu; Zhou, Defang; Wang, Guihua; Huang, Libo; Zheng, Qiankun; Li, Chengui; Cheng, Ziqiang
2017-12-04
The hypervariable antigenicity and immunosuppressive features of avian leukosis virus subgroup J (ALV-J) have made it very challenging to develop effective vaccines. Epitope vaccines are a promising direction. Previously, we identified a variant antigenic neutralizing epitope in hypervariable region 1 (hr1) of ALV-J, N-LRDFIA/E/TKWKS/GDDL/HLIRPYVNQS-C. BLAST analysis showed that the A, E, T and H mutations in this epitope cover 79% of all ALV-J strains. Based on these data, we designed a multi-variant epitope ensemble vaccine comprising the four mutation variants linked with glycine and serine. The recombinant multi-variant epitope gene was expressed in Escherichia coli BL21. The expressed protein of the multi-variant epitope gene reacted with ALV-J-positive sera and monoclonal antibodies against ALV-J, but did not react with ALV-J-negative sera. The multi-variant epitope vaccine conjugated with complete/incomplete Freund's adjuvant showed high immunogenicity, reaching a titer of 1:64,000 at 42 days post immunization and maintaining the immune period for at least 126 days in SPF chickens. Further, we demonstrated that the antibody induced by the multi-variant ensemble epitope vaccine recognized and neutralized different ALV-J strains (NX0101, TA1, WS1, BZ1224 and BZ4). A protection experiment, evaluated by clinical symptoms, viral shedding, weight gain, and gross pathology and histopathology, showed that 100% of the chickens inoculated with the multi-epitope vaccine were well protected against ALV-J challenge. The results show a promising multi-variant epitope ensemble vaccine against hypervariable viruses in animals. Copyright © 2017 Elsevier Ltd. All rights reserved.
Lu, Tsui-Shan; Longnecker, Matthew P; Zhou, Haibo
2017-03-15
An outcome-dependent sampling (ODS) scheme is a cost-effective sampling scheme where one observes the exposure with a probability that depends on the outcome. Well-known designs of this kind are the case-control design for a binary response, the case-cohort design for failure time data, and the general ODS design for a continuous response. While substantial work has been carried out for the univariate response case, statistical inference and design for ODS with multivariate responses remain under-developed. Motivated by the need in biological studies to take advantage of the available responses for subjects in a cluster, we propose a multivariate outcome-dependent sampling (multivariate-ODS) design that is based on a general selection of the continuous responses within a cluster. The proposed inference procedure for the multivariate-ODS design is semiparametric, where all the underlying distributions of covariates are modeled nonparametrically using empirical likelihood methods. We show that the proposed estimator is consistent and develop its asymptotic normality properties. Simulation studies show that the proposed estimator is more efficient than the estimator obtained using only the simple-random-sample portion of the multivariate-ODS or the estimator from a simple random sample with the same sample size. The multivariate-ODS design together with the proposed estimator provides an approach to further improve study efficiency for a given fixed study budget. We illustrate the proposed design and estimator with an analysis of the association of polychlorinated biphenyl exposure with hearing loss in children born to the Collaborative Perinatal Study. Copyright © 2016 John Wiley & Sons, Ltd.
Study on rapid valid acidity evaluation of apple by fiber optic diffuse reflectance technique
NASA Astrophysics Data System (ADS)
Liu, Yande; Ying, Yibin; Fu, Xiaping; Jiang, Xuesong
2004-03-01
Some issues related to nondestructive evaluation of valid acidity in intact apples by means of the Fourier transform near infrared (FTNIR) (800-2631 nm) method were addressed. A relationship was established between the diffuse reflectance spectra recorded with a bifurcated optic fiber and the valid acidity. The data were analyzed by multivariate calibration techniques such as partial least squares (PLS) analysis and principal component regression (PCR). A total of 120 Fuji apples were tested and 80 of them were used to form a calibration data set. The influence of data preprocessing and different spectra treatments was also investigated. Models based on smoothing spectra were slightly worse than models based on derivative spectra, and the best result was obtained when the segment length was 5 and the gap size was 10. Depending on data preprocessing and multivariate calibration technique, the best prediction model, obtained by PLS analysis, had a correlation coefficient of 0.871, a low RMSEP (0.0677), a low RMSEC (0.056) and a small difference between RMSEP and RMSEC. The results point out the feasibility of FTNIR spectral analysis to predict the fruit valid acidity non-destructively. The ratio of data standard deviation to the root mean square error of prediction (SDR) is better to be less than 3 in calibration models; however, the results cannot meet the demand of actual application. Therefore, further study is required for better calibration and prediction.
Improved accuracy in quantitative laser-induced breakdown spectroscopy using sub-models
Anderson, Ryan; Clegg, Samuel M.; Frydenvang, Jens; Wiens, Roger C.; McLennan, Scott M.; Morris, Richard V.; Ehlmann, Bethany L.; Dyar, M. Darby
2017-01-01
Accurate quantitative analysis of diverse geologic materials is one of the primary challenges faced by the Laser-Induced Breakdown Spectroscopy (LIBS)-based ChemCam instrument on the Mars Science Laboratory (MSL) rover. The SuperCam instrument on the Mars 2020 rover, as well as other LIBS instruments developed for geochemical analysis on Earth or other planets, will face the same challenge. Consequently, part of the ChemCam science team has focused on the development of improved multivariate analysis calibrations methods. Developing a single regression model capable of accurately determining the composition of very different target materials is difficult because the response of an element’s emission lines in LIBS spectra can vary with the concentration of other elements. We demonstrate a conceptually simple “sub-model” method for improving the accuracy of quantitative LIBS analysis of diverse target materials. The method is based on training several regression models on sets of targets with limited composition ranges and then “blending” these “sub-models” into a single final result. Tests of the sub-model method show improvement in test set root mean squared error of prediction (RMSEP) for almost all cases. The sub-model method, using partial least squares regression (PLS), is being used as part of the current ChemCam quantitative calibration, but the sub-model method is applicable to any multivariate regression method and may yield similar improvements.
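The "sub-model" idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the ChemCam team's implementation: the function name, the use of a full-range model's prediction to pick the sub-model, and the linear weighting inside an overlap window are all hypothetical choices made for the example.

```python
def blend_submodels(full_pred, low_pred, high_pred, lo_edge, hi_edge):
    """Blend two sub-model predictions into one final value.

    full_pred : first-guess prediction from a model trained on the full range
    low_pred  : prediction from a sub-model trained on low-concentration targets
    high_pred : prediction from a sub-model trained on high-concentration targets
    lo_edge, hi_edge : bounds of the (assumed) overlap window between sub-models
    """
    if full_pred <= lo_edge:
        return low_pred            # clearly in the low range
    if full_pred >= hi_edge:
        return high_pred           # clearly in the high range
    # Inside the overlap: weight moves linearly from the low to the high sub-model.
    w = (full_pred - lo_edge) / (hi_edge - lo_edge)
    return (1.0 - w) * low_pred + w * high_pred


# Hypothetical numbers (e.g. wt% of an oxide), chosen only for illustration.
blended = blend_submodels(full_pred=40.0, low_pred=12.0, high_pred=20.0,
                          lo_edge=30.0, hi_edge=50.0)
print(blended)
```

Blending inside the overlap avoids a discontinuity in the final prediction when the first-guess value crosses from one sub-model's range into the next.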
Lambers, Kaj T A; van den Bekerom, Michel P J; Doornberg, Job N; Stufkens, Sjoerd A S; van Dijk, C Niek; Kloen, Peter
2013-09-04
There is sparse information in the literature on the outcome of Maisonneuve-type pronation-external rotation ankle fractures treated with syndesmotic screws. The primary aim of this study was to determine the long-term results of such treatment of these fractures as indicated by standardized patient-based and physician-based outcome measures. The secondary aim was to identify predictors of the outcome with use of bivariate and multivariate statistical analysis. Fifty patients with pronation-external rotation (predominantly Maisonneuve) fractures were treated with open reduction and internal fixation of the syndesmosis utilizing only one or two screws. The results were evaluated at a mean of twenty-one years after the fracture utilizing three standardized outcomes instruments: (1) the Foot and Ankle Ability Measure (FAAM), (2) the American Orthopaedic Foot & Ankle Society (AOFAS) ankle-hindfoot scale, and (3) the Center for Epidemiologic Studies-Depression (CES-D) Scale. Osteoarthritis was graded according to the van Dijk and revised Takakura radiographic scoring systems. Bivariate and multivariate analyses were performed to identify predictors of long-term outcome. Forty-four (92%) of forty-eight patients had good or excellent AOFAS scores, and forty-four (90%) of forty-nine had good or excellent FAAM scores. Arthrodesis for severe osteoarthritis was performed in two patients. Radiographic evidence of osteoarthritis was observed in twenty-four (49%) of forty-nine patients. Multivariate analysis identified pain as the most important independent predictor of long-term ankle function as indicated by the AOFAS and FAAM scores, explaining 91% and 53% of the variation in scores, respectively. Analysis of pain as the dependent variable in bivariate analyses revealed that depression, ankle range of motion, and a subsequent surgery were significantly correlated with higher pain scores. No firm conclusions could be drawn after multivariate analysis of predictors of pain. 
Long-term functional outcomes at a mean of twenty-one years after pronation-external rotation ankle fractures treated with one or two syndesmotic screws were good to excellent in the great majority of patients despite substantial radiographic evidence of osteoarthritis in one-half of the patients. The most important predictor of long-term functional outcome was patient-reported pain rather than physician-reported function or posttraumatic osteoarthritis. There was no significant association between radiographic signs of posttraumatic osteoarthritis and perceived pain in the present series.
Understanding and reaching family forest owners: lessons from social marketing research
Brett J. Butler; Mary Tyrrell; Geoff Feinberg; Scott VanManen; Larry Wiseman; Scott Wallinger
2007-01-01
Social marketing--the use of commercial marketing techniques to effect positive social change--is a promising means by which to develop more effective and efficient outreach, policies, and services for family forest owners. A hierarchical, multivariate analysis based on landowners' attitudes reveals four groups of owners to whom programs can be tailored: woodland...
A Multivariate Analysis of Secondary Students' Experience of Web-Based Language Acquisition
ERIC Educational Resources Information Center
Felix, Uschi
2004-01-01
This paper reports on a large-scale project designed to replicate an earlier investigation of tertiary students (Felix, 2001) in a secondary school environment. The new project was carried out in five settings, again investigating the potential of the Web as a medium of language instruction. Data was collected by questionnaires and observational…
NASA Astrophysics Data System (ADS)
Chen, Long; Wang, Yue; Liu, Nenrong; Lin, Duo; Weng, Cuncheng; Zhang, Jixue; Zhu, Lihuan; Chen, Weisheng; Chen, Rong; Feng, Shangyuan
2013-06-01
The diagnostic capability of using tissue intrinsic micro-Raman signals to obtain biochemical information from human esophageal tissue is presented in this paper. Near-infrared micro-Raman spectroscopy combined with multivariate analysis was applied for discrimination of esophageal cancer tissue from normal tissue samples. Micro-Raman spectroscopy measurements were performed on 54 esophageal cancer tissues and 55 normal tissues in the 400-1750 cm⁻¹ range. The mean Raman spectra showed significant differences between the two groups. Tentative assignments of the Raman bands in the measured tissue spectra suggested some changes in protein structure, a decrease in the relative amount of lactose, and increases in the percentages of tryptophan, collagen and phenylalanine content in esophageal cancer tissue as compared to those of a normal subject. The diagnostic algorithms based on principal component analysis (PCA) and linear discriminant analysis (LDA) achieved a diagnostic sensitivity of 87.0% and specificity of 70.9% for separating cancer from normal esophageal tissue samples. The result demonstrated that near-infrared micro-Raman spectroscopy combined with PCA-LDA analysis could be an effective and sensitive tool for identification of esophageal cancer.
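As a worked check of the reported figures, sensitivity and specificity follow directly from confusion-matrix counts. The counts used below (47 of 54 cancer samples and 39 of 55 normal samples classified correctly) are not stated in the abstract; they are inferred here as the whole-number counts consistent with the reported 87.0% and 70.9%, and should be read as an assumption.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # fraction of cancer samples flagged as cancer
    specificity = tn / (tn + fp)   # fraction of normal samples flagged as normal
    return sensitivity, specificity


# Inferred counts (assumption): 54 cancer tissues, 55 normal tissues.
sens, spec = sensitivity_specificity(tp=47, fn=7, tn=39, fp=16)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```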
Chen, Jiabo; Li, Fayun; Fan, Zhiping; Wang, Yanjie
2016-01-01
Source apportionment of river water pollution is critical in water resource management and aquatic conservation. Comprehensive application of various GIS-based multivariate statistical methods was performed to analyze datasets (2009–2011) on water quality in the Liao River system (China). Cluster analysis (CA) classified the 12 months of the year into three groups (May–October, February–April and November–January) and the 66 sampling sites into three groups (groups A, B and C) based on similarities in water quality characteristics. Discriminant analysis (DA) determined that temperature, dissolved oxygen (DO), pH, chemical oxygen demand (CODMn), 5-day biochemical oxygen demand (BOD5), NH4+–N, total phosphorus (TP) and volatile phenols were significant variables affecting temporal variations, with 81.2% correct assignments. Principal component analysis (PCA) and positive matrix factorization (PMF) identified eight potential pollution factors for each part of the data structure, explaining more than 61% of the total variance. Oxygen-consuming organics from cropland and woodland runoff were the main latent pollution factor for group A. For group B, the main pollutants were oxygen-consuming organics, oil, nutrients and fecal matter. For group C, the evaluated pollutants primarily included oxygen-consuming organics, oil and toxic organics. PMID:27775679
Velasco-Tapia, Fernando
2014-01-01
Magmatic processes have usually been identified and evaluated using qualitative or semiquantitative geochemical or isotopic tools based on a restricted number of variables. However, a more complete and quantitative view could be reached by applying multivariate analysis, mass balance techniques, and statistical tests. As an example, in this work a statistical and quantitative scheme is applied to analyze the geochemical features of the Sierra de las Cruces (SC) volcanic range (Mexican Volcanic Belt). In this locality, the volcanic activity (3.7 to 0.5 Ma) was dominantly dacitic, but the presence of spheroidal andesitic enclaves and/or diverse disequilibrium features in the majority of lavas confirms the operation of magma mixing/mingling. New discriminant-function-based multidimensional diagrams were used to discriminate tectonic setting. Statistical tests of discordancy and significance were applied to evaluate the influence of the subducting Cocos plate, which seems to be rather negligible for the SC magmas in relation to several major and trace elements. A cluster analysis following Ward's linkage rule was carried out to classify the SC volcanic rocks into geochemical groups. Finally, two mass-balance schemes were applied for the quantitative evaluation of the proportion of the end-member components (dacitic and andesitic magmas) in the comingled lavas (binary mixtures).
Multivariate meta-analysis using individual participant data
Riley, R. D.; Price, M. J.; Jackson, D.; Wardle, M.; Gueyffier, F.; Wang, J.; Staessen, J. A.; White, I. R.
2016-01-01
When combining results across related studies, a multivariate meta-analysis allows the joint synthesis of correlated effect estimates from multiple outcomes. Joint synthesis can improve efficiency over separate univariate syntheses, may reduce selective outcome reporting biases, and enables joint inferences across the outcomes. A common issue is that within-study correlations needed to fit the multivariate model are unknown from published reports. However, provision of individual participant data (IPD) allows them to be calculated directly. Here, we illustrate how to use IPD to estimate within-study correlations, using a joint linear regression for multiple continuous outcomes and bootstrapping methods for binary, survival and mixed outcomes. In a meta-analysis of 10 hypertension trials, we then show how these methods enable multivariate meta-analysis to address novel clinical questions about continuous, survival and binary outcomes; treatment–covariate interactions; adjusted risk/prognostic factor effects; longitudinal data; prognostic and multiparameter models; and multiple treatment comparisons. Both frequentist and Bayesian approaches are applied, with example software code provided to derive within-study correlations and to fit the models. PMID:26099484
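The bootstrap route to a within-study correlation can be sketched as below. This is an illustrative sketch only, not the software code the authors provide: it uses synthetic data, two continuous outcomes for simplicity (the abstract applies bootstrapping to binary, survival and mixed outcomes), a difference in means as the effect estimate, and resampling within each arm. All function and variable names are hypothetical.

```python
import random


def mean(xs):
    return sum(xs) / len(xs)


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def treatment_effects(treated, control):
    """Mean difference (treated minus control) for outcome 1 and outcome 2."""
    d1 = mean([r[0] for r in treated]) - mean([r[0] for r in control])
    d2 = mean([r[1] for r in treated]) - mean([r[1] for r in control])
    return d1, d2


def bootstrap_within_study_corr(treated, control, n_boot=300, seed=0):
    """Correlation between the two effect estimates across bootstrap resamples."""
    rng = random.Random(seed)
    eff1, eff2 = [], []
    for _ in range(n_boot):
        # Resample participants with replacement, separately within each arm.
        t = [rng.choice(treated) for _ in treated]
        c = [rng.choice(control) for _ in control]
        d1, d2 = treatment_effects(t, c)
        eff1.append(d1)
        eff2.append(d2)
    return pearson(eff1, eff2)


# Synthetic participant data for one study: (outcome1, outcome2) per participant,
# with outcome2 built to be correlated with outcome1.
gen = random.Random(42)


def make_arm(n, shift):
    rows = []
    for _ in range(n):
        y1 = gen.gauss(shift, 1.0)
        y2 = y1 + gen.gauss(0.0, 0.5)
        rows.append((y1, y2))
    return rows


treated_arm = make_arm(50, shift=-1.0)
control_arm = make_arm(50, shift=0.0)
r_within = bootstrap_within_study_corr(treated_arm, control_arm)
print(round(r_within, 2))
```

The resulting correlation would then be supplied, together with each effect estimate's variance, as the within-study covariance for that study in the multivariate model.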
Yue, Yong; Osipov, Arsen; Fraass, Benedick; Sandler, Howard; Zhang, Xiao; Nissen, Nicholas; Hendifar, Andrew; Tuli, Richard
2017-02-01
To stratify risks of pancreatic adenocarcinoma (PA) patients using pre- and post-radiotherapy (RT) PET/CT images, and to assess the prognostic value of texture variations in predicting therapy response of patients. Twenty-six PA patients treated with RT from 2011-2013 with pre- and post-treatment 18F-FDG-PET/CT scans were identified. Tumor locoregional texture was calculated using 3D kernel-based approach, and texture variations were identified by fitting discrepancies of texture maps of pre- and post-treatment images. A total of 48 texture and clinical variables were identified and evaluated for association with overall survival (OS). The prognostic heterogeneity features were selected using lasso/elastic net regression, and further were evaluated by multivariate Cox analysis. Median age was 69 y (range, 46-86 y). The texture map and temporal variations between pre- and post-treatment were well characterized by histograms and statistical fitting. The lasso analysis identified seven predictors (age, node stage, post-RT SUVmax, variations of homogeneity, variance, sum mean, and cluster tendency). The multivariate Cox analysis identified five significant variables: age, node stage, variations of homogeneity, variance, and cluster tendency (with P=0.020, 0.040, 0.065, 0.078, and 0.081, respectively). The patients were stratified into two groups based on the risk score of multivariate analysis with log-rank P=0.001: a low risk group (n=11) with a longer mean OS (29.3 months) and higher texture variation (>30%), and a high risk group (n=15) with a shorter mean OS (17.7 months) and lower texture variation (<15%). Locoregional metabolic texture response provides a feasible approach for evaluating and predicting clinical outcomes following treatment of PA with RT. The proposed method can be used to stratify patient risk and help select appropriate treatment strategies for individual patients toward implementing response-driven adaptive RT.
Universal Pressure Ulcer Prevention Bundle With WOC Nurse Support.
Anderson, Megan; Finch Guthrie, Patricia; Kraft, Wendy; Reicks, Patty; Skay, Carol; Beal, Alan L
2015-01-01
This study examined the effectiveness of a universal pressure ulcer prevention bundle (UPUPB) applied to intensive care unit (ICU) patients combined with proactive, semiweekly WOC nurse rounds. The UPUPB was compared to a standard guideline with referral-based WOC nurse involvement by measuring adherence to 5 evidence-based prevention interventions and incidence of pressure ulcers. The study used a quasi-experimental, pre- and postintervention design in which each phase included different subjects. Descriptive methods assisted in exploring the content of WOC nurse rounds. One hundred eighty-one pre- and 146 postintervention subjects who met inclusion criteria and were admitted to ICU for more than 24 hours participated in the study. The research setting was 3 ICUs located at North Memorial Medical Center in Minneapolis, Minnesota. Data collection included admission/discharge skin assessments, chart reviews for 5 evidence-based interventions and patient characteristics, and WOC nurse rounding logs. Subjects with intact skin on admission, as identified by an initial skin assessment, were enrolled; prephase subjects received standard care and postphase subjects received the UPUPB. Skin assessments on ICU discharge and chart reviews throughout the stay determined the presence of unit-acquired pressure ulcers and skin care received. Analysis included description of WOC nurse rounds, t-tests for guideline adherence, and multivariate analysis for intervention effect on pressure ulcer incidence. Unit assignment, Braden Scale score, and ICU length of stay were covariates for a multivariate model based on bivariate logistic regression screening. The incidence of unit-acquired pressure ulcers decreased from 15.5% to 2.1%. WOC nurses logged 204 rounds over 6 months, focusing primarily on early detection of pressure sources.
Data analysis revealed significantly increased adherence to heel elevation (t = -3.905, df = 325, P < .001) and repositioning (t = -2.441, df = 325, P < .015). Multivariate logistic regression modeling showed a significant reduction in unit-acquired pressure ulcers (P < .001). The intervention increased the Nagelkerke R-Square value by 0.099 (P < .001) more than 0.297 (P < .001) when including only covariates, for a final model value of 0.396 (P < .001). The UPUPB with WOC nurse rounds resulted in a statistically significant and clinically relevant reduction in the incidence of pressure ulcers.
Rixen, D; Raum, M; Bouillon, B; Schlosser, L E; Neugebauer, E
2001-03-01
On hospital admission, numerous variables are documented for multiple trauma patients. The value of these variables for predicting outcome is controversial. The aim was to determine, at admission, the probability of death of multiple trauma patients. Thus, a multivariate probability model was developed based on data obtained from the trauma registry of the Deutsche Gesellschaft für Unfallchirurgie (DGU). On hospital admission the DGU trauma registry collects more than 30 variables prospectively. In the first step of the analysis, variables assumed from the literature to be clinical predictors of outcome were selected. In a second step a univariate analysis of these variables was performed. For all primary variables with univariate significance in outcome prediction, a multivariate logistic regression was performed in the third step and a multivariate prognostic model was developed. 2069 patients from 20 hospitals were prospectively included in the trauma registry from 01.01.1993-31.12.1997 (age 39 +/- 19 years; 70.0% males; ISS 22 +/- 13; 18.6% lethality). From more than 30 initially documented variables, the age, the GCS, the ISS, the base excess (BE) and the prothrombin time were the most important prognostic factors to predict the probability of death (P(death)). The following prognostic model was developed: $P(\mathrm{death}) = \dfrac{1}{1 + e^{-[k + \beta_1(\mathrm{age}) + \beta_2(\mathrm{GCS}) + \beta_3(\mathrm{ISS}) + \beta_4(\mathrm{BE}) + \beta_5(\mathrm{prothrombin\ time})]}}$, where $k = -0.1551$, $\beta_1 = 0.0438$ with p < 0.0001, $\beta_2 = -0.2067$ with p < 0.0001, $\beta_3 = 0.0252$ with p = 0.0071, $\beta_4 = -0.0840$ with p < 0.0001 and $\beta_5 = -0.0359$ with p < 0.0001. Each of the five variables contributed significantly to the multifactorial model. These data show that the age, GCS, ISS, base excess and prothrombin time are potentially important predictors to initially identify multiple trauma patients with a high risk of lethality. Because base excess and prothrombin time are the only variables of this multifactorial model that can be therapeutically influenced, they might help to better guide early and aggressive therapy.
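The reported logistic model can be evaluated directly from its coefficients. The sketch below is only an illustration of the formula: the function and parameter names are hypothetical, the abstract does not state the measurement units of each variable (they are left to the caller and must match those used in the original registry analysis), and the example patient values are invented.

```python
import math

# Intercept and coefficients as reported in the abstract above.
INTERCEPT = -0.1551
COEFFICIENTS = {
    "age": 0.0438,
    "gcs": -0.2067,
    "iss": 0.0252,
    "base_excess": -0.0840,
    "prothrombin_time": -0.0359,
}


def p_death(age, gcs, iss, base_excess, prothrombin_time):
    """Logistic probability 1 / (1 + exp(-z)) with z the linear predictor.

    Units are an assumption of the caller; the abstract does not give them.
    """
    values = {
        "age": age,
        "gcs": gcs,
        "iss": iss,
        "base_excess": base_excess,
        "prothrombin_time": prothrombin_time,
    }
    z = INTERCEPT + sum(COEFFICIENTS[name] * values[name] for name in COEFFICIENTS)
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    # Invented example values, for illustration only.
    print(p_death(age=40, gcs=15, iss=22, base_excess=-2.0, prothrombin_time=80))
```

The signs behave as the abstract implies: higher age and ISS raise the predicted probability, while higher GCS, base excess and prothrombin time lower it.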
Predicting trauma patient mortality: ICD [or ICD-10-AM] versus AIS based approaches.
Willis, Cameron D; Gabbe, Belinda J; Jolley, Damien; Harrison, James E; Cameron, Peter A
2010-11-01
The International Classification of Diseases Injury Severity Score (ICISS) has been proposed as an International Classification of Diseases (ICD)-10-based alternative to mortality prediction tools that use Abbreviated Injury Scale (AIS) data, including the Trauma and Injury Severity Score (TRISS). To date, studies have not examined the performance of ICISS using Australian trauma registry data. This study aimed to compare the performance of ICISS with other mortality prediction tools in an Australian trauma registry. This was a retrospective review of prospectively collected data from the Victorian State Trauma Registry. A training dataset was created for model development and a validation dataset for evaluation. The multiplicative ICISS model was compared with a worst injury ICISS approach, Victorian TRISS (V-TRISS, using local coefficients), maximum AIS severity and a multivariable model including ICD-10-AM codes as predictors. Models were investigated for discrimination (C-statistic) and calibration (Hosmer-Lemeshow statistic). The multivariable approach had the highest level of discrimination (C-statistic 0.90) and calibration (H-L 7.65, P= 0.468). Worst injury ICISS, V-TRISS and maximum AIS had similar performance. The multiplicative ICISS produced the lowest level of discrimination (C-statistic 0.80) and poorest calibration (H-L 50.23, P < 0.001). The performance of ICISS may be affected by the data used to develop estimates, the ICD version employed, the methods for deriving estimates and the inclusion of covariates. In this analysis, a multivariable approach using ICD-10-AM codes was the best-performing method. A multivariable ICISS approach may therefore be a useful alternative to AIS-based methods and may have comparable predictive performance to locally derived TRISS models. © 2010 The Authors. ANZ Journal of Surgery © 2010 Royal Australasian College of Surgeons.
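Discrimination in the abstract above is summarised by the C-statistic. As a reminder of what that number measures, here is a minimal pairwise implementation; the function name and the toy data are assumptions for illustration and do not come from the registry study.

```python
def c_statistic(outcomes, risks):
    """Concordance statistic for a binary outcome.

    The fraction of (death, survivor) pairs in which the patient who died
    had the higher predicted risk; tied predictions count as half.
    """
    events = [r for y, r in zip(outcomes, risks) if y == 1]
    non_events = [r for y, r in zip(outcomes, risks) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    concordant = 0.0
    for event_risk in events:
        for non_event_risk in non_events:
            if event_risk > non_event_risk:
                concordant += 1.0
            elif event_risk == non_event_risk:
                concordant += 0.5
    return concordant / (len(events) * len(non_events))


if __name__ == "__main__":
    # Toy data: 1 = died, 0 = survived, with hypothetical predicted risks.
    print(c_statistic([0, 0, 1, 1], [0.1, 0.8, 0.2, 0.9]))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported 0.80 and 0.90 should be read.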
Hybrid least squares multivariate spectral analysis methods
Haaland, David M.
2002-01-01
A set of hybrid least squares multivariate spectral analysis methods in which spectral shapes of components or effects not present in the original calibration step are added in a following estimation or calibration step to improve the accuracy of the estimation of the amount of the original components in the sampled mixture. The "hybrid" method herein means a combination of an initial classical least squares analysis calibration step with subsequent analysis by an inverse multivariate analysis method. A "spectral shape" herein means normally the spectral shape of a non-calibrated chemical component in the sample mixture but can also mean the spectral shapes of other sources of spectral variation, including temperature drift, shifts between spectrometers, spectrometer drift, etc. The "shape" can be continuous, discontinuous, or even discrete points illustrative of the particular effect.
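The idea of adding a spectral shape after calibration can be illustrated with a small numerical sketch. It shows only the classical least squares part with one appended shape; it is not the patented hybrid procedure, and the synthetic spectra, the linear "drift" shape and all function names are assumptions made for the example.

```python
import numpy as np


def cls_calibrate(concentrations, spectra):
    """Classical least squares: estimate pure-component spectra K from A ~ C K."""
    k_hat, *_ = np.linalg.lstsq(concentrations, spectra, rcond=None)
    return k_hat  # shape (n_components, n_channels)


def cls_predict(k_hat, spectrum, extra_shapes=None):
    """Least-squares concentrations for one spectrum.

    Rows of extra_shapes (spectral shapes absent from calibration) are
    appended to k_hat before fitting; only the concentrations of the
    calibrated components are returned.
    """
    basis = k_hat if extra_shapes is None else np.vstack([k_hat, extra_shapes])
    coeffs, *_ = np.linalg.lstsq(basis.T, spectrum, rcond=None)
    return coeffs[: k_hat.shape[0]]


def demo(seed=0):
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, 50)
    pure = np.vstack(
        [np.exp(-((x - 0.3) / 0.1) ** 2), np.exp(-((x - 0.6) / 0.1) ** 2)]
    )
    drift = x.copy()  # hypothetical baseline-drift shape, unseen in calibration

    conc = rng.uniform(0.1, 1.0, size=(20, 2))
    spectra = conc @ pure + 0.001 * rng.normal(size=(20, 50))
    k_hat = cls_calibrate(conc, spectra)

    true_conc = np.array([0.3, 0.7])
    sample = true_conc @ pure + 0.5 * drift + 0.001 * rng.normal(size=50)
    plain = cls_predict(k_hat, sample)
    augmented = cls_predict(k_hat, sample, extra_shapes=drift[None, :])
    return true_conc, plain, augmented


if __name__ == "__main__":
    for name, value in zip(("true", "plain CLS", "augmented CLS"), demo()):
        print(name, value)
```

In this toy setting the plain fit absorbs the drift into the component concentrations, while the augmented fit assigns it to the added shape, which is the accuracy gain the abstract describes.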
Multivariate generalized multifactor dimensionality reduction to detect gene-gene interactions
2013-01-01
Background Recently, one of the greatest challenges in genome-wide association studies is to detect gene-gene and/or gene-environment interactions for common complex human diseases. Ritchie et al. (2001) proposed multifactor dimensionality reduction (MDR) method for interaction analysis. MDR is a combinatorial approach to reduce multi-locus genotypes into high-risk and low-risk groups. Although MDR has been widely used for case-control studies with binary phenotypes, several extensions have been proposed. One of these methods, a generalized MDR (GMDR) proposed by Lou et al. (2007), allows adjusting for covariates and applying to both dichotomous and continuous phenotypes. GMDR uses the residual score of a generalized linear model of phenotypes to assign either high-risk or low-risk group, while MDR uses the ratio of cases to controls. Methods In this study, we propose multivariate GMDR, an extension of GMDR for multivariate phenotypes. Jointly analysing correlated multivariate phenotypes may have more power to detect susceptible genes and gene-gene interactions. We construct generalized estimating equations (GEE) with multivariate phenotypes to extend generalized linear models. Using the score vectors from GEE we discriminate high-risk from low-risk groups. We applied the multivariate GMDR method to the blood pressure data of the 7,546 subjects from the Korean Association Resource study: systolic blood pressure (SBP) and diastolic blood pressure (DBP). We compare the results of multivariate GMDR for SBP and DBP to the results from separate univariate GMDR for SBP and DBP, respectively. We also applied the multivariate GMDR method to the repeatedly measured hypertension status from 5,466 subjects and compared its result with those of univariate GMDR at each time point. 
Results Results from the univariate GMDR and multivariate GMDR in two-locus model with both blood pressures and hypertension phenotypes indicate best combinations of SNPs whose interaction has significant association with risk for high blood pressures or hypertension. Although the test balanced accuracy (BA) of multivariate analysis was not always greater than that of univariate analysis, the multivariate BAs were more stable with smaller standard deviations. Conclusions In this study, we have developed multivariate GMDR method using GEE approach. It is useful to use multivariate GMDR with correlated multiple phenotypes of interests. PMID:24565370
D'Amico, E J; Neilands, T B; Zambarano, R
2001-11-01
Although power analysis is an important component in the planning and implementation of research designs, it is often ignored. Computer programs for performing power analysis are available, but most have limitations, particularly for complex multivariate designs. An SPSS procedure is presented that can be used for calculating power for univariate, multivariate, and repeated measures models with and without time-varying and time-constant covariates. Three examples provide a framework for calculating power via this method: an ANCOVA, a MANOVA, and a repeated measures ANOVA with two or more groups. The benefits and limitations of this procedure are discussed.
Probabilistic, meso-scale flood loss modelling
NASA Astrophysics Data System (ADS)
Kreibich, Heidi; Botto, Anna; Schröter, Kai; Merz, Bruno
2016-04-01
Flood risk analyses are an important basis for decisions on flood risk management and adaptation. However, such analyses are associated with significant uncertainty, even more so if changes in risk due to global change are expected. Although uncertainty analysis and probabilistic approaches have received increased attention in recent years, they are still not standard practice for flood risk assessments and even less so for flood loss modelling. The state of the art in flood loss modelling is still the use of simple, deterministic approaches like stage-damage functions. Novel probabilistic, multi-variate flood loss models have been developed and validated on the micro-scale using a data-mining approach, namely bagging decision trees (Merz et al. 2013). In this presentation we demonstrate and evaluate the upscaling of the approach to the meso-scale, namely on the basis of land-use units. The model is applied in 19 municipalities which were affected during the 2002 flood by the River Mulde in Saxony, Germany (Botto et al. submitted). The application of bagging decision tree based loss models provides a probability distribution of estimated loss per municipality. Validation is undertaken on the one hand via a comparison with eight deterministic loss models including stage-damage functions as well as multi-variate models. On the other hand the results are compared with official loss data provided by the Saxon Relief Bank (SAB). The results show that uncertainties of loss estimation remain high. Thus, the significant advantage of this probabilistic flood loss estimation approach is that it inherently provides quantitative information about the uncertainty of the prediction. References: Merz, B.; Kreibich, H.; Lall, U. (2013): Multi-variate flood damage assessment: a tree-based data-mining approach. NHESS, 13(1), 53-64. Botto A, Kreibich H, Merz B, Schröter K (submitted) Probabilistic, multi-variable flood loss modelling on the meso-scale with BT-FLEMO. Risk Analysis.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ladd-Lively, Jennifer L
2014-01-01
The objective of this work was to determine the feasibility of using on-line multivariate statistical process control (MSPC) for safeguards applications in natural uranium conversion plants. Multivariate statistical process control is commonly used throughout industry for the detection of faults. For safeguards applications in uranium conversion plants, faults could include the diversion of intermediate products such as uranium dioxide, uranium tetrafluoride, and uranium hexafluoride. This study was limited to a 100 metric ton of uranium (MTU) per year natural uranium conversion plant (NUCP) using the wet solvent extraction method for the purification of uranium ore concentrate. A key component in the multivariate statistical methodology is the Principal Component Analysis (PCA) approach for the analysis of data, development of the base case model, and evaluation of future operations. The PCA approach was implemented through the use of singular value decomposition of the data matrix where the data matrix represents normal operation of the plant. Component mole balances were used to model each of the process units in the NUCP. However, this approach could be applied to any data set. The monitoring framework developed in this research could be used to determine whether or not a diversion of material has occurred at an NUCP as part of an International Atomic Energy Agency (IAEA) safeguards system. This approach can be used to identify the key monitoring locations, as well as locations where monitoring is unimportant. Detection limits at the key monitoring locations can also be established using this technique. Several faulty scenarios were developed to test the monitoring framework after the base case or normal operating conditions of the PCA model were established. In all of the scenarios, the monitoring framework was able to detect the fault. Overall this study was successful at meeting the stated objective.
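PCA-based process monitoring built on a singular value decomposition of normal-operation data can be sketched in a few lines. The following is a generic illustration, not the report's model: the simulated six-variable process, the squared prediction error (SPE) statistic with an empirical 99% limit, and all names are assumptions chosen for the example.

```python
import numpy as np


def squared_prediction_error(model, samples):
    """SPE (Q statistic): squared residual after projection onto the PCA subspace."""
    z = (np.atleast_2d(samples) - model["mean"]) / model["scale"]
    p = model["loadings"]
    residual = z - z @ p @ p.T
    return (residual ** 2).sum(axis=1)


def fit_pca_monitor(normal_data, n_components, quantile=0.99):
    """Fit PCA to standardized normal-operation data via SVD and set an SPE limit."""
    mean = normal_data.mean(axis=0)
    scale = normal_data.std(axis=0, ddof=1)
    z = (normal_data - mean) / scale
    _, _, vt = np.linalg.svd(z, full_matrices=False)
    model = {"mean": mean, "scale": scale, "loadings": vt[:n_components].T}
    # Empirical control limit taken from the normal-operation data itself.
    spe = squared_prediction_error(model, normal_data)
    model["spe_limit"] = float(np.quantile(spe, quantile))
    return model


def simulate_normal_operation(n_samples=500, seed=0):
    """Hypothetical process: six measured variables driven by two latent factors."""
    rng = np.random.default_rng(seed)
    latent = rng.normal(size=(n_samples, 2))
    mixing = rng.normal(size=(2, 6))
    return latent @ mixing + 0.05 * rng.normal(size=(n_samples, 6))


if __name__ == "__main__":
    data = simulate_normal_operation()
    model = fit_pca_monitor(data, n_components=2)
    faulty = data[0].copy()
    faulty[0] += 3.0 * model["scale"][0]  # one sensor no longer consistent with the rest
    print(squared_prediction_error(model, faulty)[0] > model["spe_limit"])
```

A sample is flagged when its SPE exceeds the limit, meaning the usual relationships among the variables no longer hold; in a safeguards setting, the variables would be the plant measurements rather than simulated factors.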
Potyrailo, Radislav A
2016-10-12
Modern gas monitoring scenarios for medical diagnostics, environmental surveillance, industrial safety, and other applications demand new sensing capabilities. This Review analyzes the development of a new generation of gas sensors based on multivariable response principles. Design criteria of these individual sensors involve a sensing material with multiresponse mechanisms to different gases and a multivariable transducer with independent outputs to recognize these different gas responses. These new sensors quantify individual components in mixtures, reject interferences, and offer more stable response than sensor arrays. Such performance is attractive when selectivity advantages of classic gas chromatography, ion mobility, and mass spectrometry instruments are canceled by requirements for no consumables, low power, low cost, and unobtrusive form factors for Internet of Things, Industrial Internet, and other applications. The Review concludes with a perspective on future needs in fundamental and applied aspects of gas sensing and with the 2025 roadmap for ubiquitous gas monitoring.
Levine, Matthew E; Albers, David J; Hripcsak, George
2016-01-01
Time series analysis methods have been shown to reveal clinical and biological associations in data collected in the electronic health record. We wish to develop reliable high-throughput methods for identifying adverse drug effects that are easy to implement and produce readily interpretable results. To move toward this goal, we used univariate and multivariate lagged regression models to investigate associations between twenty pairs of drug orders and laboratory measurements. Multivariate lagged regression models exhibited higher sensitivity and specificity than univariate lagged regression in the 20 examples, and incorporating autoregressive terms for labs and drugs produced more robust signals in cases of known associations among the 20 example pairings. Moreover, including inpatient admission terms in the model attenuated the signals for some cases of unlikely associations, demonstrating how multivariate lagged regression models' explicit handling of context-based variables can provide a simple way to probe for health-care processes that confound analyses of EHR data.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Demizu, Yusuke, E-mail: y_demizu@nifty.co; Murakami, Masao; Miyawaki, Daisuke
2009-12-01
Purpose: To assess the incident rates of vision loss (VL; based on counting fingers or more severe) caused by radiation-induced optic neuropathy (RION) after particle therapy for tumors adjacent to optic nerves (ONs), and to evaluate factors that may contribute to VL. Methods and Materials: From August 2001 to August 2006, 104 patients with head-and-neck or skull-base tumors adjacent to ONs were treated with carbon ion or proton radiotherapy. Among them, 145 ONs of 75 patients were irradiated and followed for greater than 12 months. The incident rate of VL and the prognostic factors for occurrence of VL were evaluated. The late effects of carbon ion and proton beams were compared on the basis of a biologically effective dose at alpha/beta = 3 gray equivalent (GyE₃). Results: Eight patients (11%) experienced VL resulting from RION. The onset of VL ranged from 17 to 58 months. The median follow-up was 25 months. No significant difference was observed between the carbon ion and proton beam treatment groups. On univariate analysis, age (>60 years), diabetes mellitus, and maximum dose to the ON (>110 GyE₃) were significant, whereas on multivariate analysis only diabetes mellitus was found to be significant for VL. Conclusions: The time to the onset of VL was highly variable. There was no statistically significant difference between carbon ion and proton beam treatments over the follow-up period. Based on multivariate analysis, diabetes mellitus correlated with the occurrence of VL. A larger study with longer follow-up is warranted.
Multi-Sample Cluster Analysis Using Akaike’s Information Criterion.
1982-12-20
of Likelihood Criteria for Different Hypotheses," in P. R. Krishnaiah (Ed.), Multivariate Analysis-II, New York: Academic Press. [5] Fisher, R. A...Methods of Simultaneous Inference in MANOVA," in P. R. Krishnaiah (Ed.), Multivariate Analysis-II, New York: Academic Press. [8] Kendall, M. G. (1966...1982), Applied Multivariate Statistical Analysis, Englewood Cliffs: Prentice-Hall, Inc. [10] Krishnaiah, P. R. (1969), "Simultaneous Test
Use of proxy measures in estimating socioeconomic inequalities in malaria prevalence.
Somi, Masha F; Butler, James R; Vahid, Farshid; Njau, Joseph D; Kachur, S P; Abdulla, Salim
2008-03-01
To present and compare socioeconomic status (SES) rankings of households using consumption and an asset-based index as two alternative measures of SES; and to compare and evaluate the performance of these two measures in multivariate analyses of the socioeconomic gradient in malaria prevalence. Data for the study come from a survey of 557 households in 25 study villages in Tanzania in 2004. Household SES was determined using consumption and an asset-based index calculated using Principal Components Analysis on a set of household variables. In multivariate analyses of malaria prevalence, we also used two other measures of disease prevalence: parasitaemia and self-report of malaria or fever in the 2 weeks before interview. Household rankings based on the two measures of SES differ substantially. In multivariate analyses, there was a statistically significant negative association between both measures of SES and parasitaemia but not between either measure of SES and self-reported malaria. Age of individual, use of a mosquito net, and wall construction were negatively and significantly associated with parasitaemia, whilst roof construction was positively associated with parasitaemia. Only age remained significant when malaria self-report was used as the measure of disease prevalence. An asset index is an effective alternative to consumption in measuring the socioeconomic gradient in malaria parasitaemia, but self-report may be an unreliable measure of malaria prevalence for this purpose.
A Machine Learning Approach to Automated Gait Analysis for the Noldus Catwalk System.
Frohlich, Holger; Claes, Kasper; De Wolf, Catherine; Van Damme, Xavier; Michel, Anne
2018-05-01
Gait analysis of animal disease models can provide valuable insights into in vivo compound effects and thus help in preclinical drug development. The purpose of this paper is to establish a computational gait analysis approach for the Noldus Catwalk system, in which footprints are automatically captured and stored. We present what is, to our knowledge, the first machine learning based approach for the Catwalk system, which comprises a step decomposition, definition and extraction of meaningful features, multivariate step sequence alignment, feature selection, and training of different classifiers (gradient boosting machine, random forest, and elastic net). Using animal-wise leave-one-out cross validation we demonstrate that with our method we can reliably separate movement patterns of a putative Parkinson's disease animal model and several control groups. Furthermore, we show that we can predict the time point after, and the type of, different brain lesions and can even forecast the brain region where the intervention was applied. We provide an in-depth analysis of the features involved in our classifiers via statistical techniques for model interpretation. A machine learning method for automated analysis of data from the Noldus Catwalk system was established. Our work shows the ability of machine learning to discriminate pharmacologically relevant animal groups based on their walking behavior in a multivariate manner. Further interesting aspects of the approach include the ability to learn from past experiments, to improve as more data arrive, and to make predictions for single animals in future studies.
Moyer, Cheryl A; McLaren, Zoë M; Adanu, Richard M; Lantz, Paula M
2013-09-01
To determine the types of access to care most strongly associated with facility-based delivery among women in Ghana. Data relating to the "5 As of Access" framework were extracted from the 2008 Ghana Demographic Health Survey and analyzed using multivariate logistic regression. In all, 55.5% of a weighted sample of 1102 women delivered in a healthcare facility, whereas 45.5% delivered at home. Affordability was the strongest access factor associated with delivery location, with health insurance coverage tripling the odds of facility delivery. Availability, accessibility (except urban residence), acceptability, and social access variables were not significant factors in the final models. Social access variables, including needing permission to seek healthcare and not being involved in decisions regarding healthcare, were associated with a reduced likelihood of facility-based delivery when examined individually. Multivariate analysis suggested that these variables reflected maternal literacy, health insurance coverage, and household wealth, all of which attenuated the effects of social access. Affordability was an important determinant of facility delivery in Ghana, even among women with health insurance, but social access variables had a mediating role. Copyright © 2013 International Federation of Gynecology and Obstetrics. Published by Elsevier Ireland Ltd. All rights reserved.
Docking and multivariate methods to explore HIV-1 drug-resistance: a comparative analysis
NASA Astrophysics Data System (ADS)
Almerico, Anna Maria; Tutone, Marco; Lauria, Antonino
2008-05-01
In this paper we describe a comparative analysis between multivariate and docking methods in the study of drug resistance to reverse transcriptase and protease inhibitors. In our early papers we developed a simple but efficient method to evaluate the features of compounds that are less likely to trigger resistance or are effective against mutant HIV strains, using the multivariate statistical procedures PCA and DA. In an attempt to create a more solid background for the prediction of susceptibility or resistance, we carried out a comparative analysis between our previous multivariate approach and a molecular docking study. The intent of this paper is not only to find further support for the results obtained by the combined use of PCA and DA, but also to highlight the structural features, in terms of molecular descriptors, similarity, and energetic contributions derived from docking, which can account for the emergence of drug resistance against mutant strains.
SUGGESTIONS FOR OPTIMIZED PLANNING OF MULTIVARIATE MONITORING OF ATMOSPHERIC POLLUTION
Recent work in factor analysis of multivariate data sets has shown that variables with little signal should not be included in the factor analysis. Work also shows that rotational ambiguity is reduced if sources impacting a receptor have both large and small contributions. Thes...
Multivariate Meta-Analysis Using Individual Participant Data
ERIC Educational Resources Information Center
Riley, R. D.; Price, M. J.; Jackson, D.; Wardle, M.; Gueyffier, F.; Wang, J.; Staessen, J. A.; White, I. R.
2015-01-01
When combining results across related studies, a multivariate meta-analysis allows the joint synthesis of correlated effect estimates from multiple outcomes. Joint synthesis can improve efficiency over separate univariate syntheses, may reduce selective outcome reporting biases, and enables joint inferences across the outcomes. A common issue is…
Kim, Sungduk; Chen, Ming-Hui; Ibrahim, Joseph G.; Shah, Arvind K.; Lin, Jianxin
2013-01-01
In this paper, we propose a class of Box-Cox transformation regression models with multidimensional random effects for analyzing multivariate responses for individual patient data (IPD) in meta-analysis. Our modeling formulation uses a multivariate normal response meta-analysis model with multivariate random effects, in which each response is allowed to have its own Box-Cox transformation. Prior distributions are specified for the Box-Cox transformation parameters as well as the regression coefficients in this complex model, and the Deviance Information Criterion (DIC) is used to select the best transformation model. Since the model is quite complex, a novel Monte Carlo Markov chain (MCMC) sampling scheme is developed to sample from the joint posterior of the parameters. This model is motivated by a very rich dataset comprising 26 clinical trials involving cholesterol lowering drugs where the goal is to jointly model the three dimensional response consisting of Low Density Lipoprotein Cholesterol (LDL-C), High Density Lipoprotein Cholesterol (HDL-C), and Triglycerides (TG) (LDL-C, HDL-C, TG). Since the joint distribution of (LDL-C, HDL-C, TG) is not multivariate normal and in fact quite skewed, a Box-Cox transformation is needed to achieve normality. In the clinical literature, these three variables are usually analyzed univariately: however, a multivariate approach would be more appropriate since these variables are correlated with each other. A detailed analysis of these data is carried out using the proposed methodology. PMID:23580436
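For readers unfamiliar with the transformation, the per-response Box-Cox step can be sketched as below. This is a plain illustration of the standard transform with fixed parameters; in the paper the transformation parameters receive priors and are sampled by MCMC, and the function names and example lambdas here are hypothetical.

```python
import numpy as np


def box_cox(y, lam):
    """Box-Cox transform of positive values: (y**lam - 1) / lam, or log(y) if lam == 0."""
    y = np.asarray(y, dtype=float)
    if np.any(y <= 0):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return np.log(y)
    return (y ** lam - 1.0) / lam


def transform_responses(responses, lambdas):
    """Apply a separate Box-Cox parameter to each response column."""
    responses = np.asarray(responses, dtype=float)
    columns = [box_cox(responses[:, j], lam) for j, lam in enumerate(lambdas)]
    return np.column_stack(columns)


if __name__ == "__main__":
    # Invented (LDL-C, HDL-C, TG) values and lambdas, for illustration only.
    lipids = np.array([[130.0, 45.0, 150.0], [160.0, 38.0, 320.0]])
    print(transform_responses(lipids, lambdas=[0.5, 0.0, -0.25]))
```

Allowing one parameter per column is what lets a skewed response such as TG be transformed differently from the others before a multivariate normal model is fitted.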
Kim, Sungduk; Chen, Ming-Hui; Ibrahim, Joseph G; Shah, Arvind K; Lin, Jianxin
2013-10-15
In this paper, we propose a class of Box-Cox transformation regression models with multidimensional random effects for analyzing multivariate responses for individual patient data in meta-analysis. Our modeling formulation uses a multivariate normal response meta-analysis model with multivariate random effects, in which each response is allowed to have its own Box-Cox transformation. Prior distributions are specified for the Box-Cox transformation parameters as well as the regression coefficients in this complex model, and the deviance information criterion is used to select the best transformation model. Because the model is quite complex, we develop a novel Monte Carlo Markov chain sampling scheme to sample from the joint posterior of the parameters. This model is motivated by a very rich dataset comprising 26 clinical trials involving cholesterol-lowering drugs where the goal is to jointly model the three-dimensional response consisting of low density lipoprotein cholesterol (LDL-C), high density lipoprotein cholesterol (HDL-C), and triglycerides (TG) (LDL-C, HDL-C, TG). Because the joint distribution of (LDL-C, HDL-C, TG) is not multivariate normal and in fact quite skewed, a Box-Cox transformation is needed to achieve normality. In the clinical literature, these three variables are usually analyzed univariately; however, a multivariate approach would be more appropriate because these variables are correlated with each other. We carry out a detailed analysis of these data by using the proposed methodology. Copyright © 2013 John Wiley & Sons, Ltd.
Tyagi, Ritu; Rana, Poonam; Gupta, Mamta; Bhatnagar, Deepak; Srivastava, Shatakshi; Roy, Raja; Khushu, Subash
2014-03-25
Heavy metal tungsten alloys (HMTAs) have been found to be safer alternatives for making military munitions. Recently, some studies demonstrating the toxic potential of HMTAs have raised safety concerns and further propose that HMTA exposure may lead to physiological disturbances as well. To look for the systemic effects of acute toxicity of HMTA-based metal salts, (1)H nuclear magnetic resonance ((1)H NMR) spectroscopic profiling of rat urine was carried out. Male Sprague Dawley rats were administered (intraperitoneally) a low or high dose of a mixture of HMTA-based metal salts, and NMR spectroscopy was carried out on urine samples collected at 8, 24, 72 and 120 h post dosing (p.d.). Serum biochemical analysis and liver histopathology were also performed. The (1)H NMR spectra were analysed using multivariate analysis techniques to show the time- and dose-dependent biochemical variations after exposure to HMTA-based metal salts. Urine metabolomic analysis showed changes associated with energy metabolism, amino acids, N-methyl nicotinamide, membrane and gut flora metabolites. Multivariate analysis showed maximum variation, with the best classification of control and treated groups, at 24 h p.d. At the end of the study, most of the metabolite-level changes in the low dose group had reverted to control levels except for the energy metabolites, whereas in the high dose group some of the changes still persisted. The observations correlated well with histopathological and serum biochemical parameters. Further, metabolic pathway analysis showed that, among all the metabolic pathways analysed, the tricarboxylic acid cycle was most affected at all time points, indicating a switchover in energy metabolism from aerobic to anaerobic. These results suggest that exposure of rats to acute doses of HMTA-based metal salts disrupts physiological metabolism with moderate injury to the liver, which might indirectly result from heavy-metal-induced oxidative stress.
Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Maione, Camila; Barbosa, Rommel Melgaço
2018-01-24
Rice is one of the most important staple foods around the world. Authentication of rice is one of the most addressed concerns in the present literature, which includes recognition of its geographical origin and variety, certification of organic rice and many other issues. Good results have been achieved by multivariate data analysis and data mining techniques when combined with specific parameters for ascertaining authenticity and many other useful characteristics of rice, such as quality, yield and others. This paper brings a review of the recent research projects on discrimination and authentication of rice using multivariate data analysis and data mining techniques. We found that data obtained from image processing, molecular and atomic spectroscopy, elemental fingerprinting, genetic markers, molecular content and others are promising sources of information regarding geographical origin, variety and other aspects of rice, being widely used combined with multivariate data analysis techniques. Principal component analysis and linear discriminant analysis are the preferred methods, but several other data classification techniques such as support vector machines, artificial neural networks and others are also frequently present in some studies and show high performance for discrimination of rice.
Martin, P; Brown, M C; Espin-Garcia, O; Cuffe, S; Pringle, D; Mahler, M; Villeneuve, J; Niu, C; Charow, R; Lam, C; Shani, R M; Hon, H; Otsuka, M; Xu, W; Alibhai, S; Jenkinson, J; Liu, G
2016-03-01
In this study, we compared cancer patients' preference for computerised (tablet/web-based) surveys versus paper. We also assessed whether the understanding of a cancer-related topic, pharmacogenomics, is affected by the survey format, and examined differences in demographic and medical characteristics which may affect patient preference and understanding. Three hundred and four cancer patients completed a tablet-administered survey and another 153 patients completed a paper-based survey. Patients who participated in the tablet survey were questioned regarding their preference for survey format administration (paper, tablet and web-based). Understanding was assessed with a 'direct' method, by asking patients to assess their understanding of genetic testing, and with a 'composite' score. Patients preferred administration with tablet (71%) compared with web-based (12%) and paper (17%). Patients <65 years old, non-Caucasians and white-collar professionals significantly preferred the computerised format following multivariate analysis. There was no significant difference in understanding between the paper and tablet survey with direct questioning or composite score. Age (<65 years) and white-collar profession were associated with increased understanding (both P = 0.03). There was no significant difference in understanding between the tablet and print survey in a multivariate analysis. Patients overwhelmingly preferred computerised surveys and understanding of pharmacogenomics was not affected by survey format. © 2015 John Wiley & Sons Ltd.
Risk factors of hepatitis B virus infection among blood donors in Duhok city, Kurdistan Region, Iraq
R Hussein, Nawfal
2018-01-01
Background: Hepatitis B virus (HBV) infection is a public health problem. The lack of information about the seroprevalence and risk factors is an obstacle for preventive public health plans to reduce the burden of viral hepatitis. Therefore, this study was conducted in Iraq, where no studies had been performed to determine the prevalence and risk factors of HBV infection. Methods: Blood samples were collected from 438 blood donors attending the blood bank in Duhok city. Serum samples were tested for HBV core-antibodies (HBcAb) and HBV surface-antigen (HBsAg) by ELISA. Various risk factors were recorded and multivariate analysis was performed. Results: 5/438 (1.14%) of the subjects were HBsAg positive (HBsAg and HBcAb positive) and 36/438 (8.2%) were HBcAb positive. Hence, 41 cases were exposed to HBV and data analysis was based on that. Univariate analysis showed that there were significant associations between history of illegitimate sexual contact, history of alcohol or history of dental surgeries and HBV exposure (p<0.05 for all). Then, multivariate analysis was conducted to find predictive factors for HBV exposure. It was found that history of dental surgery was a predictive factor for exposure to the virus (P=0.03, OR: 2.397). Conclusions: This study suggested that a history of dental surgery was predictive of HBV transmission in Duhok city. Further population-based studies are needed to determine HBV risk factors in the community, and public health plans based on those findings should be considered. PMID:29387315
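The proportions quoted in the Results can be checked with a few lines of arithmetic. The sketch below uses only the counts given in the abstract and assumes, as the abstract's total of 41 exposed cases implies, that the 36 HBcAb-positive donors are counted separately from the 5 HBsAg-positive donors.

```python
def percent(count, total):
    """Return count/total expressed as a percentage."""
    return 100.0 * count / total

n_donors = 438
hbsag_positive = 5     # HBsAg and HBcAb positive
hbcab_positive = 36    # HBcAb positive, counted separately (assumed reading)

exposed = hbsag_positive + hbcab_positive

hbsag_pct = percent(hbsag_positive, n_donors)
hbcab_pct = percent(hbcab_positive, n_donors)
exposed_pct = percent(exposed, n_donors)

print(f"HBsAg positive: {hbsag_pct:.2f}%")
print(f"HBcAb positive: {hbcab_pct:.1f}%")
print(f"Exposed to HBV: {exposed} donors ({exposed_pct:.2f}%)")
```

The first two figures reproduce the 1.14% and 8.2% reported above; the combined exposure proportion is not stated in the abstract and follows only from the assumed reading of the counts.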
Tang, Yongqiang
2018-04-30
The controlled imputation method refers to a class of pattern mixture models that have been commonly used as sensitivity analyses of longitudinal clinical trials with nonignorable dropout in recent years. These pattern mixture models assume that participants in the experimental arm after dropout have similar response profiles to the control participants or have worse outcomes than otherwise similar participants who remain on the experimental treatment. In spite of its popularity, the controlled imputation has not been formally developed for longitudinal binary and ordinal outcomes partially due to the lack of a natural multivariate distribution for such endpoints. In this paper, we propose 2 approaches for implementing the controlled imputation for binary and ordinal data based respectively on the sequential logistic regression and the multivariate probit model. Efficient Markov chain Monte Carlo algorithms are developed for missing data imputation by using the monotone data augmentation technique for the sequential logistic regression and a parameter-expanded monotone data augmentation scheme for the multivariate probit model. We assess the performance of the proposed procedures by simulation and the analysis of a schizophrenia clinical trial and compare them with the fully conditional specification, last observation carried forward, and baseline observation carried forward imputation methods. Copyright © 2018 John Wiley & Sons, Ltd.
Liguori, Lucia; Bjørsvik, Hans-René
2012-12-01
The development of a multivariate study for a quantitative analysis of six different polybrominated diphenyl ethers (PBDEs) in tissue of Atlantic Salmo salar L. is reported. An extraction, isolation, and purification process based on an accelerated solvent extraction system was designed, investigated, and optimized by means of statistical experimental design and multivariate data analysis and regression. An accompanying gas chromatography-mass spectrometry analytical method was developed for the identification and quantification of the analytes, BDE 28, BDE 47, BDE 99, BDE 100, BDE 153, and BDE 154. These PBDEs have been used in commercial blends that were used as flame retardants for a variety of materials, including electronic devices, synthetic polymers and textiles. The present study revealed that an extracting solvent mixture composed of hexane and CH₂Cl₂ (10:90) provided excellent recoveries of all six PBDEs studied herein. A somewhat lower polarity in the extracting solvent, hexane and CH₂Cl₂ (40:60), decreased the analyte recoveries, which nevertheless remained acceptable and satisfactory. The study demonstrates the necessity of performing a thorough investigation of the extraction and purification process in order to achieve quantitative isolation of the analytes from the specific matrix. Copyright © 2012 Elsevier B.V. All rights reserved.
Differential use of fresh water environments by wintering waterfowl of coastal Texas
White, D.H.; James, D.
1978-01-01
A comparative study of the environmental relationships among 14 species of wintering waterfowl was conducted at the Welder Wildlife Foundation, San Patricio County, near Sinton, Texas during the fall and early winter of 1973. Measurements of 20 environmental factors (social, vegetational, physical, and chemical) were subjected to multivariate statistical methods to determine certain niche characteristics and environmental relationships of waterfowl wintering in the aquatic community. Each waterfowl species occupied a unique realized niche by responding to distinct combinations of environmental factors identified by principal component analysis. One percent confidence ellipses circumscribing the mean scores plotted for the first and second principal components gave an indication of relative niche width for each species. The waterfowl environments were significantly different interspecifically, and water depth at feeding site and % emergent vegetation were most important in the separation. This was shown by subjecting the transformed data to multivariate analysis of variance with an associated step-down procedure. The species were distributed along a community cline extending from shallow water with abundant emergent vegetation to open deep water with little emergent vegetation of any kind. Four waterfowl subgroups were significantly separated along the cline, as indicated by one-way analysis of variance with Duncan's multiple range test. Clumping of the bird species toward the middle of the available habitat hyperspace was shown in a plot of the principal component scores for the random samples and individual species. Naturally occurring relationships among waterfowl were clarified using principal component analysis and related multivariate procedures. These techniques may prove useful in wetland management for particular groups of waterfowl based on habitat preferences.
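To make the notion of a cline along a first principal component concrete, here is a minimal sketch that standardizes two variables, forms their correlation matrix, and extracts the leading component by power iteration. The five site measurements (water depth and % emergent vegetation) are invented for illustration and are not data from the study, which used 20 environmental factors and, presumably, a full statistical package.

```python
import math

def standardize(rows):
    """Center each column and scale it to unit sample standard deviation."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]

def correlation_matrix(z):
    """Correlation matrix of already standardized rows."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
            for i in range(p)]

def first_component(matrix, iterations=200):
    """Leading eigenvector and eigenvalue by power iteration."""
    p = len(matrix)
    v = [1.0 / (j + 1) for j in range(p)]   # asymmetric start vector
    for _ in range(iterations):
        w = [sum(matrix[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(matrix[i][j] * v[j] for j in range(p))
                     for i in range(p))
    return v, eigenvalue

# Hypothetical sites: (water depth in cm, % emergent vegetation).
sites = [(10.0, 80.0), (20.0, 65.0), (30.0, 45.0), (40.0, 30.0), (50.0, 10.0)]

z = standardize(sites)
corr = correlation_matrix(z)
loadings, eigenvalue = first_component(corr)
scores = [sum(l * x for l, x in zip(loadings, row)) for row in z]
explained = eigenvalue / len(corr)   # share of total standardized variance
```

In this toy data set the two loadings have opposite signs, so the scores order the sites from shallow, heavily vegetated water to deep, open water, which is the kind of gradient the abstract describes.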
NASA Astrophysics Data System (ADS)
DSouza, Adora M.; Abidin, Anas Z.; Leistritz, Lutz; Wismüller, Axel
2017-02-01
We investigate the applicability of large-scale Granger Causality (lsGC) for extracting a measure of multivariate information flow between pairs of regional brain activities from resting-state functional MRI (fMRI) and test the effectiveness of these measures for predicting a disease state. Such pairwise multivariate measures of interaction provide high-dimensional representations of connectivity profiles for each subject and are used in a machine learning task to distinguish between healthy controls and individuals presenting with symptoms of HIV Associated Neurocognitive Disorder (HAND). Cognitive impairment in several domains can occur as a result of HIV infection of the central nervous system. The current paradigm for assessing such impairment is through neuropsychological testing. With fMRI data analysis, we aim at non-invasively capturing differences in brain connectivity patterns between healthy subjects and subjects presenting with symptoms of HAND. To classify the extracted interaction patterns among brain regions, we use a prototype-based learning algorithm called Generalized Matrix Learning Vector Quantization (GMLVQ). Our approach to characterize connectivity using lsGC followed by GMLVQ for subsequent classification yields good prediction results with an accuracy of 87% and an area under the ROC curve (AUC) of up to 0.90. We obtain a statistically significant improvement (p<0.01) over a conventional Granger causality approach (accuracy = 0.76, AUC = 0.74). High accuracy and AUC values using our multivariate method to connectivity analysis suggests that our approach is able to better capture changes in interaction patterns between different brain regions when compared to conventional Granger causality analysis known from the literature.
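The two performance figures quoted above, accuracy and area under the ROC curve, measure different things: accuracy depends on a decision threshold, while AUC depends only on how the classifier ranks cases. The sketch below computes both for a small set of hypothetical labels and scores; it does not implement lsGC or GMLVQ, and the numbers are not from the study.

```python
def accuracy(labels, predictions):
    """Fraction of cases whose predicted label matches the true label."""
    return sum(l == p for l, p in zip(labels, predictions)) / len(labels)

def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical data: 1 = patient, 0 = healthy control.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
predictions = [1 if s >= 0.5 else 0 for s in scores]   # assumed threshold 0.5

acc = accuracy(labels, predictions)
auc = roc_auc(labels, scores)
```

Here two of six cases are misclassified at the chosen threshold, while eight of the nine positive-negative pairs are ranked correctly, so the AUC is higher than the accuracy.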
de Falco, Bruna; Incerti, Guido; Pepe, Rosa; Amato, Mariana; Lanzotti, Virginia
2016-09-01
Globe artichoke (Cynara cardunculus L. var. scolymus L. Fiori) and cardoon (Cynara cardunculus L. var. altilis DC) are sources of nutraceuticals and bioactive compounds. This study aimed to apply an NMR metabolomic fingerprinting approach to Cynara cardunculus heads to obtain simultaneous identification and quantitation of the major classes of organic compounds. The edible parts of 14 Globe artichoke populations, belonging to the Romaneschi varietal group, were extracted to obtain apolar and polar organic extracts. The analysis was also extended to one species of cultivated cardoon for comparison. The (1)H-NMR of the extracts allowed simultaneous identification of the bioactive metabolites, whose quantitation was obtained by spectral integration followed by principal component analysis (PCA). Apolar organic extracts were mainly based on highly unsaturated long chain lipids. Polar organic extracts contained organic acids, amino acids, sugars (mainly inulin), caffeoyl derivatives (mainly cynarin), flavonoids, and terpenes. The level of nutraceuticals was found to be highest in the Italian landraces Bianco di Pertosa zia E and Natalina, while cardoon showed the lowest content of all metabolites, thus confirming the genetic distance between artichokes and cardoon. A metabolomic approach coupling NMR spectroscopy with multivariate data analysis allowed a detailed metabolite profile of artichoke and cardoon varieties to be obtained. Relevant differences in the relative content of the metabolites were observed for the species analysed. This work is the first application of (1)H-NMR with multivariate statistics to provide a metabolomic fingerprinting of Cynara scolymus. Copyright © 2016 John Wiley & Sons, Ltd.
ERIC Educational Resources Information Center
Bejar, Isaac I.
1981-01-01
Effects of nutritional supplementation on physical development of malnourished children were analyzed by univariate and multivariate methods for the analysis of repeated measures. Results showed that the nutritional treatment was successful, but it was necessary to resort to the multivariate approach. (Author/GK)
ERIC Educational Resources Information Center
Grundmann, Matthias
Following the assumptions of ecological socialization research, adequate analysis of socialization conditions must take into account the multilevel and multivariate structure of social factors that impact on human development. This statement implies that complex models of family configurations or of socialization factors are needed to explain the…
Univariate Analysis of Multivariate Outcomes in Educational Psychology.
ERIC Educational Resources Information Center
Hubble, L. M.
1984-01-01
The author examined the prevalence of multiple operational definitions of outcome constructs and an estimate of the incidence of Type I error rates when univariate procedures were applied to multiple variables in educational psychology. Multiple operational definitions of constructs were advocated and wider use of multivariate analysis was…
Applied Statistics: From Bivariate through Multivariate Techniques [with CD-ROM]
ERIC Educational Resources Information Center
Warner, Rebecca M.
2007-01-01
This book provides a clear introduction to widely used topics in bivariate and multivariate statistics, including multiple regression, discriminant analysis, MANOVA, factor analysis, and binary logistic regression. The approach is applied and does not require formal mathematics; equations are accompanied by verbal explanations. Students are asked…
Evaluation of Meteorite Amino Acid Analysis Data Using Multivariate Techniques
NASA Technical Reports Server (NTRS)
McDonald, G.; Storrie-Lombardi, M.; Nealson, K.
1999-01-01
The amino acid distributions in the Murchison carbonaceous chondrite, Mars meteorite ALH84001, and ice from the Allan Hills region of Antarctica are shown, using a multivariate technique known as Principal Component Analysis (PCA), to be statistically distinct from the average amino acid composition of 101 terrestrial protein superfamilies.
Microenvironmental and biological/personal monitoring information were collected during the National Human Exposure Assessment Survey (NHEXAS), conducted in the six states comprising U.S. EPA Region Five. They have been analyzed by multivariate analysis techniques with general ...
Multivariate Cryptography Based on Clipped Hopfield Neural Network.
Wang, Jia; Cheng, Lee-Ming; Su, Tong
2018-02-01
Designing secure and efficient multivariate public key cryptosystems [multivariate cryptography (MVC)] to strengthen the security of RSA and ECC in conventional and quantum computational environments has remained a challenging research problem in recent years. In this paper, we describe multivariate public key cryptosystems based on an extended Clipped Hopfield Neural Network (CHNN) and implement them using the MVC (CHNN-MVC) framework operated in space. The Diffie-Hellman key exchange algorithm is extended into the matrix field, which illustrates the feasibility of its new applications in both classic and postquantum cryptography. The efficiency and security of our proposed new public key cryptosystem CHNN-MVC are simulated and found to be NP-hard. The proposed algorithm will strengthen multivariate public key cryptosystems and makes hardware realization practical.
NASA Astrophysics Data System (ADS)
Burns, R. G.; Meyer, R. W.; Cornwell, K.
2003-12-01
In-basin statistical relations allow for development of regional flood frequency and magnitude equations in the Cosumnes River and Mokelumne River drainage basins. Current equations were derived from data collected through 1975, and do not reflect newer data with some significant flooding. Physical basin characteristics (area, mean basin elevation, slope of longest reach, and mean annual precipitation) were correlated against predicted flood discharges for each of the 5, 10, 25, 50, 100, 200, and 500-year recurrence intervals in a multivariate analysis. Predicted maximum instantaneous flood discharges were determined using the PEAKFQ program with default settings, for 24 stream gages within the study area presumed not affected by flow management practices. For numerical comparisons, GIS-based methods using Spatial Analyst and the Arc Hydro Tools extension were applied to derive physical basin characteristics as predictor variables from a 30m digital elevation model (DEM) and a mean annual precipitation raster (PRISM). In a bivariate analysis, examination of Pearson correlation coefficients, F-statistic, and t & p thresholds show good correlation between area and flood discharges. Similar analyses show poor correlation for mean basin elevation, slope and precipitation, with flood discharge. Bivariate analysis suggests slope may not be an appropriate predictor term for use in the multivariate analysis. Precipitation and elevation correlate very well, demonstrating possible orographic effects. From the multivariate analysis, less than 6% of the variability in the correlation is not explained for flood recurrences up to 25 years. Longer term predictions up to 500 years accrue greater uncertainty with as much as 15% of the variability in the correlation left unexplained.
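The bivariate step described above, relating drainage area to flood discharge and reporting how much variability is left unexplained, can be sketched as an ordinary least-squares fit in log space. The areas and discharges below are hypothetical, generated from an assumed power law with small perturbations; they are not values from the Cosumnes or Mokelumne gages, and the abstract does not state that a log transform was used.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x, with R squared."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot

# Hypothetical basin areas and flood discharges (arbitrary consistent units).
areas = [10.0, 50.0, 100.0, 500.0, 1000.0]
perturbation = [1.10, 0.90, 1.05, 0.95, 1.00]
discharges = [2.0 * a ** 0.8 * f for a, f in zip(areas, perturbation)]

log_area = [math.log10(a) for a in areas]
log_q = [math.log10(q) for q in discharges]

slope, intercept, r_squared = fit_line(log_area, log_q)
unexplained = 1.0 - r_squared   # share of variability not explained by the fit
```

A statement such as "less than 6% of the variability is not explained" corresponds to `unexplained < 0.06`, that is, a coefficient of determination above 0.94.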
Chapat, Ludivine; Hilaire, Florence; Bouvet, Jérome; Pialot, Daniel; Philippe-Reversat, Corinne; Guiot, Anne-Laure; Remolue, Lydie; Lechenet, Jacques; Andreoni, Christine; Poulet, Hervé; Day, Michael J; De Luca, Karelle; Cariou, Carine; Cupillard, Lionel
2017-07-01
The assessment of vaccine combinations, or the evaluation of the impact of minor modifications of one component in well-established vaccines, requires animal challenges in the absence of previously validated correlates of protection. As an alternative, we propose conducting a multivariate analysis of the specific immune response to the vaccine. This approach is consistent with the principles of the 3Rs (Refinement, Reduction and Replacement) and avoids repeating efficacy studies based on infectious challenges in vivo. To validate this approach, a set of nine immunological parameters was selected in order to characterize B and T lymphocyte responses against canine rabies virus and to evaluate the compatibility between two canine vaccines, an inactivated rabies vaccine (RABISIN®) and a combined vaccine (EURICAN® DAPPi-Lmulti) injected at two different sites in the same animals. The analysis was focused on the magnitude and quality of the immune response. The multi-dimensional picture given by this 'immune fingerprint' was used to assess the impact of the concomitant injection of the combined vaccine on the immunogenicity of the rabies vaccine. A principal component analysis fully discriminated the control group from the groups vaccinated with RABISIN® alone or RABISIN® + EURICAN® DAPPi-Lmulti and confirmed the compatibility between the rabies vaccines. This study suggests that determining the immune fingerprint, combined with a multivariate statistical analysis, is a promising approach to characterizing the immunogenicity of a vaccine with an established record of efficacy. It may also avoid the need to repeat efficacy studies involving challenge infection in case of minor modifications of the vaccine or for compatibility studies. Copyright © 2017 Elsevier B.V. All rights reserved.
Classical least squares multivariate spectral analysis
Haaland, David M.
2002-01-01
An improved classical least squares multivariate spectral analysis method that adds spectral shapes describing non-calibrated components and system effects (other than baseline corrections) present in the analyzed mixture to the prediction phase of the method. These improvements decrease or eliminate many of the restrictions to the CLS-type methods and greatly extend their capabilities, accuracy, and precision. One new application of PACLS includes the ability to accurately predict unknown sample concentrations when new unmodeled spectral components are present in the unknown samples. Other applications of PACLS include the incorporation of spectrometer drift into the quantitative multivariate model and the maintenance of a calibration on a drifting spectrometer. Finally, the ability of PACLS to transfer a multivariate model between spectrometers is demonstrated.
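The benefit of adding a spectral shape for an uncalibrated effect during prediction can be illustrated with a small least-squares example. This is a simplified sketch of the general idea, not the patented PACLS procedure: the pure-component spectra, the linear drift shape, and the concentrations are all hypothetical, and the fit is done through the normal equations with a hand-written solver.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def cls_predict(shapes, spectrum):
    """Coefficients minimizing the squared residual of spectrum on shapes."""
    gram = [[sum(a * b for a, b in zip(si, sj)) for sj in shapes]
            for si in shapes]
    proj = [sum(a * b for a, b in zip(si, spectrum)) for si in shapes]
    return solve(gram, proj)

# Hypothetical pure-component spectra at five wavelengths.
pure_a = [1.0, 0.8, 0.3, 0.1, 0.0]
pure_b = [0.0, 0.2, 0.6, 0.9, 0.4]
# Assumed shape of an unmodeled effect (linear spectrometer drift).
drift = [0.0, 1.0, 2.0, 3.0, 4.0]

# Simulated sample: concentrations 0.5 and 2.0 plus a small drift term.
spectrum = [0.5 * a + 2.0 * b + 0.05 * d
            for a, b, d in zip(pure_a, pure_b, drift)]

plain = cls_predict([pure_a, pure_b], spectrum)             # drift ignored
augmented = cls_predict([pure_a, pure_b, drift], spectrum)  # drift shape added
```

In this toy case the plain fit absorbs the drift into the second concentration, while the augmented fit recovers both concentrations and the drift amplitude, which mirrors the claim that adding shapes for non-calibrated components improves prediction.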
Horta, Rogério Lessa; Horta, Bernardo Lessa; da Costa, Andre Wallace Nery; do Prado, Rogério Ruscitto; Oliveira-Campos, Maryane; Malta, Deborah Carvalho
2014-01-01
This study aimed to describe the prevalence of illicit drug use among 9th grade students in the morning period of public and private schools in Brazil, and to assess associated factors. The Brazilian survey PeNSE (National Adolescent School-based Health Survey) 2012 evaluated a representative sample of 9th grade students in the morning period, in Brazil and its five regions. The use of illicit drugs at least once in life was assessed for the most commonly used drugs, such as marijuana, cocaine, crack, solvent-based glue, general ether-based inhalants, ecstasy and oxy. Data were subjected to descriptive analysis and Pearson's χ² test, and logistic regression was used in the multivariate analysis. The use of illicit drugs at least once in life was reported by 7.3% (95%CI 5.3 - 9.4) of the respondents. The multivariate analysis suggests that illicit drug use is associated with social conditions of greater consumption power, the use of alcohol and tobacco, behaviors related to socialization, such as having friends or sexual activity, and also the perception of loneliness, loose contact between school and parents and experiences of abuse in the family environment. The outcome was inversely associated with close contact with parents and parental supervision. In addition to the association with the processes of socialization and consumption, the influence of family and school is expressed in a particularly protective manner in different records of direct supervision and care.
Havlicek, Martin; Jan, Jiri; Brazdil, Milan; Calhoun, Vince D.
2015-01-01
Increasing interest in understanding dynamic interactions of brain neural networks leads to formulation of sophisticated connectivity analysis methods. Recent studies have applied Granger causality based on standard multivariate autoregressive (MAR) modeling to assess the brain connectivity. Nevertheless, one important flaw of this commonly proposed method is that it requires the analyzed time series to be stationary, whereas such assumption is mostly violated due to the weakly nonstationary nature of functional magnetic resonance imaging (fMRI) time series. Therefore, we propose an approach to dynamic Granger causality in the frequency domain for evaluating functional network connectivity in fMRI data. The effectiveness and robustness of the dynamic approach was significantly improved by combining a forward and backward Kalman filter that improved estimates compared to the standard time-invariant MAR modeling. In our method, the functional networks were first detected by independent component analysis (ICA), a computational method for separating a multivariate signal into maximally independent components. Then the measure of Granger causality was evaluated using generalized partial directed coherence that is suitable for bivariate as well as multivariate data. Moreover, this metric provides identification of causal relation in frequency domain, which allows one to distinguish the frequency components related to the experimental paradigm. The procedure of evaluating Granger causality via dynamic MAR was demonstrated on simulated time series as well as on two sets of group fMRI data collected during an auditory sensorimotor (SM) or auditory oddball discrimination (AOD) tasks. Finally, a comparison with the results obtained from a standard time-invariant MAR model was provided. PMID:20561919
Pätzug, Konrad; Friedrich, Nele; Kische, Hanna; Hannemann, Anke; Völzke, Henry; Nauck, Matthias; Keevil, Brian G; Haring, Robin
2017-12-01
The present study investigates potential associations between liquid chromatography-mass spectrometry (LC-MS) measured sex hormones, dehydroepiandrosterone sulphate, sex hormone-binding globulin (SHBG) and bone ultrasound parameters at the heel in men and women from the general population. Data from 502 women and 425 men from the population-based Study of Health in Pomerania (SHIP-TREND) were used. Cross-sectional associations of sex hormones including testosterone (TT), calculated free testosterone (FT), dehydroepiandrosterone sulphate (DHEAS), androstenedione (ASD), estrone (E1) and SHBG with quantitative ultrasound (QUS) parameters at the heel, including broadband ultrasound attenuation (BUA), speed of sound (SOS) and stiffness index (SI) were examined by analysis of variance (ANOVA) and multivariable quantile regression models. Multivariable regression analysis showed a sex-specific inverse association of DHEAS with SI in men (Beta per SI unit = - 3.08, standard error (SE) = 0.88), but not in women (Beta = - 0.01, SE = 2.09). Furthermore, FT was positively associated with BUA in men (Beta per BUA unit = 29.0, SE = 10.1). None of the other sex hormones (ASD, E1) or SHBG was associated with QUS parameters after multivariable adjustment. This cross-sectional population-based study revealed independent associations of DHEAS and FT with QUS parameters in men, suggesting a potential influence on male bone metabolism. The predictive role of DHEAS and FT as a marker for osteoporosis in men warrants further investigation in clinical trials and large-scale observational studies.
Optical assay for biotechnology and clinical diagnosis.
Moczko, Ewa; Cauchi, Michael; Turner, Claire; Meglinski, Igor; Piletsky, Sergey
2011-08-01
In this paper, we present an optical diagnostic assay consisting of a mixture of environmental-sensitive fluorescent dyes combined with multivariate data analysis for quantitative and qualitative examination of biological and clinical samples. The performance of the assay is based on the analysis of spectrum of the selected fluorescent dyes with the operational principle similar to electronic nose and electronic tongue systems. This approach has been successfully applied for monitoring of growing cell cultures and identification of gastrointestinal diseases in humans.
Luan, Xiaoli; Chen, Qiang; Liu, Fei
2014-09-01
This article presents a new scheme for designing a full matrix controller for high dimensional multivariable processes based on the equivalent transfer function (ETF). Unlike the existing ETF method, the proposed ETF is derived directly by exploiting the relationship between the equivalent closed-loop transfer function and the inverse of the open-loop transfer function. Based on the obtained ETF, the full matrix controller is designed using existing PI tuning rules. The proposed ETF model represents the original processes more accurately. Furthermore, the full matrix centralized controller design method proposed in this paper is applicable to high dimensional multivariable systems with satisfactory performance. Comparison with other multivariable controllers shows that the designed ETF-based controller is superior with respect to design complexity and obtained performance. Copyright © 2014 ISA. Published by Elsevier Ltd. All rights reserved.
Cluster-based exposure variation analysis
2013-01-01
Background Static posture, repetitive movements and lack of physical variation are known risk factors for work-related musculoskeletal disorders, and thus need to be properly assessed in occupational studies. The aims of this study were (i) to investigate the effectiveness of a conventional exposure variation analysis (EVA) in discriminating exposure time lines and (ii) to compare it with a new cluster-based method for analysis of exposure variation. Methods For this purpose, we simulated a repeated cyclic exposure varying within each cycle between “low” and “high” exposure levels in a “near” or “far” range, and with “low” or “high” velocities (exposure change rates). The duration of each cycle was also manipulated by selecting a “small” or “large” standard deviation of the cycle time. These parameters reflected three dimensions of exposure variation, i.e. range, frequency and temporal similarity. Each simulation trace included two realizations of 100 concatenated cycles with either low (ρ = 0.1), medium (ρ = 0.5) or high (ρ = 0.9) correlation between the realizations. These traces were analyzed by conventional EVA, and a novel cluster-based EVA (C-EVA). Principal component analysis (PCA) was applied on the marginal distributions of 1) the EVA of each of the realizations (univariate approach), 2) a combination of the EVA of both realizations (multivariate approach) and 3) C-EVA. The least number of principal components describing more than 90% of variability in each case was selected and the projection of marginal distributions along the selected principal component was calculated. A linear classifier was then applied to these projections to discriminate between the simulated exposure patterns, and the accuracy of classified realizations was determined.
Results C-EVA classified exposures more correctly than univariate and multivariate EVA approaches; classification accuracy was 49%, 47% and 52% for EVA (univariate and multivariate), and C-EVA, respectively (p < 0.001). All three methods performed poorly in discriminating exposure patterns differing with respect to the variability in cycle time duration. Conclusion While C-EVA had a higher accuracy than conventional EVA, both failed to detect differences in temporal similarity. The data-driven optimality of data reduction and the capability of handling multiple exposure time lines in a single analysis are the advantages of the C-EVA. PMID:23557439
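The component-selection rule in the Methods above (keep the least number of principal components that together describe more than 90% of the variability) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `least_components` and the example eigenvalues are hypothetical, and the eigenvalues are assumed to come from a PCA that has already been run.

```python
def least_components(eigenvalues, threshold=0.90):
    """Smallest k such that the k largest eigenvalues explain MORE than
    `threshold` of the total variance (hypothetical helper)."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    cumulative = 0.0
    for k, value in enumerate(ordered, start=1):
        cumulative += value
        # Strict inequality: "more than 90%" as stated in the abstract.
        if cumulative / total > threshold:
            return k
    # Threshold never exceeded (e.g. threshold = 1.0): keep every component.
    return len(ordered)


# Made-up eigenvalues: cumulative shares are 0.50, 0.80, 0.95, 1.00.
print(least_components([5.0, 3.0, 1.5, 0.5]))
```

The projections onto the selected components would then be passed to whatever linear classifier is in use; that step is not shown here because the abstract does not specify the classifier.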
A data fusion-based drought index
NASA Astrophysics Data System (ADS)
Azmi, Mohammad; Rüdiger, Christoph; Walker, Jeffrey P.
2016-03-01
Drought and water stress monitoring plays an important role in the management of water resources, especially during periods of extreme climate conditions. Here, a data fusion-based drought index (DFDI) has been developed and analyzed for three different locations of varying land use and climate regimes in Australia. The proposed index comprehensively considers all types of drought through a selection of indices and proxies associated with each drought type. In deriving the proposed index, weekly data from three different data sources (OzFlux Network, Asia-Pacific Water Monitor, and MODIS-Terra satellite) were employed to first derive commonly used individual standardized drought indices (SDIs), which were then grouped using an advanced clustering method. Next, three different multivariate methods (principal component analysis, factor analysis, and independent component analysis) were utilized to aggregate the SDIs located within each group. For the two clusters in which the grouped SDIs best reflected the water availability and vegetation conditions, the variables were aggregated based on an averaging between the standardized first principal components of the different multivariate methods. Then, considering those two aggregated indices as well as the classifications of months (dry/wet months and active/non-active months), the proposed DFDI was developed. Finally, the symbolic regression method was used to derive mathematical equations for the proposed DFDI. The results presented here show that the proposed index has revealed new aspects in water stress monitoring which previous indices were not able to, by simultaneously considering both hydrometeorological and ecological concepts to define the real water stress of the study areas.
Race and Older Mothers’ Differentiation: A Sequential Quantitative and Qualitative Analysis
Sechrist, Jori; Suitor, J. Jill; Riffin, Catherine; Taylor-Watson, Kadari; Pillemer, Karl
2011-01-01
The goal of this paper is to demonstrate a process by which qualitative and quantitative approaches are combined to reveal patterns in the data that are unlikely to be detected and confirmed by either method alone. Specifically, we take a sequential approach to combining qualitative and quantitative data to explore race differences in how mothers differentiate among their adult children. We began with a standard multivariate analysis examining race differences in mothers’ differentiation among their adult children regarding emotional closeness and confiding. Finding no race differences in this analysis, we conducted an in-depth comparison of the Black and White mothers’ narratives to determine whether there were underlying patterns that we had been unable to detect in our first analysis. Using this method, we found that Black mothers were substantially more likely than White mothers to emphasize interpersonal relationships within the family when describing differences among their children. In our final step, we developed a measure of familism based on the qualitative data and conducted a multivariate analysis to confirm the patterns revealed by the in-depth comparison of the mothers’ narratives. We conclude that using such a sequential mixed methods approach to data analysis has the potential to shed new light on complex family relations. PMID:21967639
Li, Ning; Liu, Xueqin; Xie, Wei; Wu, Jidong; Zhang, Peng
2013-01-01
New features of natural disasters have been observed over the last several years. The factors that influence the disasters' formation mechanisms, regularity of occurrence and main characteristics have been revealed to be more complicated and diverse in nature than previously thought. As the uncertainty involved increases, the variables need to be examined further. This article discusses the importance and the scarcity of multivariate analysis of natural disasters and presents a method to estimate the joint probability of the return periods and perform a risk analysis. Severe dust storms from 1990 to 2008 in Inner Mongolia were used as a case study to test this new methodology, as they are normal and recurring climatic phenomena on Earth. Based on the 79 investigated events and a bivariate definition of dust storms, the joint probability distribution of severe dust storms was established using the observed data of maximum wind speed and duration. The joint return periods of severe dust storms were calculated, and the relevant risk was analyzed according to the joint probability. The copula function is able to simulate severe dust storm disasters accurately. The joint return periods generated are closer to those observed in reality than the univariate return periods and thus have more value in severe dust storm disaster mitigation, strategy making, program design, and improvement of risk management. This research may prove useful in risk-based decision making. The exploration of multivariate analysis methods can also lay the foundation for further applications in natural disaster risk analysis. © 2012 Society for Risk Analysis.
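To make the idea of a joint return period concrete, the sketch below computes the two usual bivariate return periods from a copula: the "OR" period (either variable exceeds its threshold) and the "AND" period (both exceed). It is an illustration under stated assumptions, not the authors' method: the abstract does not name the copula family, so the Gumbel-Hougaard copula is assumed here, and the dependence parameter `theta`, the marginal probabilities `u` and `v`, and all function names are hypothetical. The mean inter-arrival time of 19/79 years assumes the 79 events are spread over the 19 calendar years 1990 to 2008.

```python
import math


def gumbel_copula(u, v, theta):
    """Gumbel-Hougaard copula C(u, v); theta >= 1, theta = 1 is independence.
    u and v are marginal non-exceedance probabilities in (0, 1)."""
    if theta < 1:
        raise ValueError("theta must be >= 1")
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))


def joint_return_periods(u, v, theta, mean_interarrival):
    """Return (T_or, T_and) in the same time unit as mean_interarrival."""
    c = gumbel_copula(u, v, theta)
    t_or = mean_interarrival / (1.0 - c)            # X > x OR  Y > y
    t_and = mean_interarrival / (1.0 - u - v + c)   # X > x AND Y > y
    return t_or, t_and


# Assumed inputs: mean time between events (years), and hypothetical
# marginal probabilities for wind speed (u) and duration (v).
mu = 19 / 79
t_or, t_and = joint_return_periods(u=0.9, v=0.8, theta=2.0, mean_interarrival=mu)
print(t_or, t_and)
```

For any copula the OR period is no longer than the shorter univariate return period, and the AND period is no shorter than the longer one, which is one way to see why joint and univariate return periods can differ substantially.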
Time Series Model Identification by Estimating Information.
1982-11-01
principle, Applications of Statistics, P. R. Krishnaiah, ed., North-Holland: Amsterdam, 27-41. Anderson, T. W. (1971). The Statistical Analysis of Time Series...E. (1969). Multiple Time Series Modeling, Multivariate Analysis II, edited by P. Krishnaiah, Academic Press: New York, 389-409. Parzen, E. (1981...Newton, H. J. (1980). Multiple Time Series Modeling, II Multivariate Analysis - V, edited by P. Krishnaiah, North Holland: Amsterdam, 181-197. Shibata, R
Kohara, Norihito; Kaneko, Masayuki; Narukawa, Mamoru
2018-01-01
The concept of the risk-based approach has been introduced as an effort to secure the quality of clinical trials. In the risk-based approach, identification and evaluation of risk in advance are considered important. For recently completed clinical trials, we investigated the relationship between study characteristics and protocol deviations leading to the exclusion of subjects from Per Protocol Set (PPS) efficacy analysis. New drugs approved in Japan in the fiscal year 2014-2015 were targeted in the research. The reasons for excluding subjects from the PPS efficacy analysis were described in 102 trials out of 492 in the summary of new drug application documents, which was publicly disclosed after the drug's regulatory approval. The author extracted these reasons along with the numbers of the cases and the study characteristics of each clinical trial. Then, direct comparison, univariate regression analysis, and multivariate regression analysis were carried out based on the exclusion rate. The study characteristics for which exclusion of subjects from the PPS efficacy analysis was frequently observed were: multiregional clinical trials for study region; inhalant and external use for administration route; and Anti-infective for systemic use, Respiratory system, Dermatologicals, and Nervous system for therapeutic drug class under the Anatomical Therapeutic Chemical Classification. In the multivariate regression analysis, the clinical trial variables of inhalant, Respiratory system, or Dermatologicals were selected as study characteristics leading to a higher exclusion rate. The characteristics of clinical trials that are likely to cause protocol deviations affecting efficacy analysis were suggested.
These studies should be considered for specific attention and priority observation in the trial protocol or its monitoring plan and execution, such as a clear description of inclusion/exclusion criteria in the protocol, development of training materials to site staff, and/or trial subjects as specific risk-alleviating measures.
A Multivariate Model and Analysis of Competitive Strategy in the U.S. Hardwood Lumber Industry
Robert J. Bush; Steven A. Sinclair
1991-01-01
Business-level competitive strategy in the hardwood lumber industry was modeled through the identification of strategic groups among large U.S. hardwood lumber producers. Strategy was operationalized using a measure based on the variables developed by Dess and Davis (1984). Factor and cluster analyses were used to define strategic groups along the dimensions of cost...
Multivariate proteomic profiling identifies novel accessory proteins of coated vesicles
Antrobus, Robin; Hirst, Jennifer; Bhumbra, Gary S.; Kozik, Patrycja; Jackson, Lauren P.; Sahlender, Daniela A.
2012-01-01
Despite recent advances in mass spectrometry, proteomic characterization of transport vesicles remains challenging. Here, we describe a multivariate proteomics approach to analyzing clathrin-coated vesicles (CCVs) from HeLa cells. siRNA knockdown of coat components and different fractionation protocols were used to obtain modified coated vesicle-enriched fractions, which were compared by stable isotope labeling of amino acids in cell culture (SILAC)-based quantitative mass spectrometry. 10 datasets were combined through principal component analysis into a “profiling” cluster analysis. Overall, 136 CCV-associated proteins were predicted, including 36 new proteins. The method identified >93% of established CCV coat proteins and assigned >91% correctly to intracellular or endocytic CCVs. Furthermore, the profiling analysis extends to less well characterized types of coated vesicles, and we identify and characterize the first AP-4 accessory protein, which we have named tepsin. Finally, our data explain how sequestration of TACC3 in cytosolic clathrin cages causes the severe mitotic defects observed in auxilin-depleted cells. The profiling approach can be adapted to address related cell and systems biological questions. PMID:22472443
NASA Astrophysics Data System (ADS)
Hu, Chongqing; Li, Aihua; Zhao, Xingyang
2011-02-01
This paper proposes a multivariate statistical analysis approach to processing the instantaneous engine speed signal for the purpose of locating multiple misfire events in internal combustion engines. The state of each cylinder is described with a characteristic vector extracted from the instantaneous engine speed signal following a three-step procedure. These characteristic vectors are considered as the values of various procedure parameters of an engine cycle. Therefore, determination of occurrence of misfire events and identification of misfiring cylinders can be accomplished by a principal component analysis (PCA) based pattern recognition methodology. The proposed algorithm can be implemented easily in practice because the threshold can be defined adaptively without the information of operating conditions. Besides, the effect of torsional vibration on the engine speed waveform is interpreted as the presence of super powerful cylinder, which is also isolated by the algorithm. The misfiring cylinder and the super powerful cylinder are often adjacent in the firing sequence, thus missing detections and false alarms can be avoided effectively by checking the relationship between the cylinders.
Multivariate Welch t-test on distances
2016-01-01
Motivation: Permutational non-Euclidean analysis of variance, PERMANOVA, is routinely used in exploratory analysis of multivariate datasets to draw conclusions about the significance of patterns visualized through dimension reduction. This method recognizes that pairwise distance matrix between observations is sufficient to compute within and between group sums of squares necessary to form the (pseudo) F statistic. Moreover, not only Euclidean, but arbitrary distances can be used. This method, however, suffers from loss of power and type I error inflation in the presence of heteroscedasticity and sample size imbalances. Results: We develop a solution in the form of a distance-based Welch t-test, TW2, for two sample potentially unbalanced and heteroscedastic data. We demonstrate empirically the desirable type I error and power characteristics of the new test. We compare the performance of PERMANOVA and TW2 in reanalysis of two existing microbiome datasets, where the methodology has originated. Availability and Implementation: The source code for methods and analysis of this article is available at https://github.com/alekseyenko/Tw2. Further guidance on application of these methods can be obtained from the author. Contact: alekseye@musc.edu PMID:27515741
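The abstract's key observation is that a pairwise distance matrix is enough to compute the required sums of squares. The sketch below shows a Welch-type two-sample statistic built from pairwise distances only. It is written so that, for Euclidean distances on one-dimensional data, it reduces exactly to the squared classical Welch t statistic. The formula is reconstructed from that requirement and is an assumption, not copied from the article or from the linked repository; the function name `tw2_statistic` is hypothetical, and no significance test is included.

```python
from itertools import combinations


def tw2_statistic(dist, labels):
    """Welch-type statistic from a symmetric distance matrix (list of lists)
    and one group label per observation; exactly two groups are expected."""
    groups = sorted(set(labels))
    if len(groups) != 2:
        raise ValueError("exactly two groups are required")
    idx1 = [i for i, g in enumerate(labels) if g == groups[0]]
    idx2 = [i for i, g in enumerate(labels) if g == groups[1]]
    n1, n2 = len(idx1), len(idx2)
    if n1 < 2 or n2 < 2:
        raise ValueError("each group needs at least two observations")

    def sum_sq(indices):
        # Sum of squared distances over unordered pairs within `indices`.
        return sum(dist[i][j] ** 2 for i, j in combinations(indices, 2))

    n = n1 + n2
    ss_all, ss1, ss2 = sum_sq(range(n)), sum_sq(idx1), sum_sq(idx2)
    # Between-group part: generalizes the squared difference of group means.
    numerator = (n / (n1 * n2)) * (ss_all / n - ss1 / n1 - ss2 / n2)
    # Generalizes s1^2/n1 + s2^2/n2, so unequal variances are not pooled.
    denominator = ss1 / (n1 ** 2 * (n1 - 1)) + ss2 / (n2 ** 2 * (n2 - 1))
    return numerator / denominator


# One-dimensional example with absolute differences as distances.
x = [1.0, 2.0, 4.0, 3.0, 7.0, 8.0, 10.0]
labels = ["a", "a", "a", "b", "b", "b", "b"]
dist = [[abs(p - q) for q in x] for p in x]
print(tw2_statistic(dist, labels))
```

Because only `dist` enters the computation, any other distance (for example a microbiome dissimilarity) can be substituted without changing the code.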
Multivariate Welch t-test on distances.
Alekseyenko, Alexander V
2016-12-01
Permutational non-Euclidean analysis of variance, PERMANOVA, is routinely used in exploratory analysis of multivariate datasets to draw conclusions about the significance of patterns visualized through dimension reduction. This method recognizes that pairwise distance matrix between observations is sufficient to compute within and between group sums of squares necessary to form the (pseudo) F statistic. Moreover, not only Euclidean, but arbitrary distances can be used. This method, however, suffers from loss of power and type I error inflation in the presence of heteroscedasticity and sample size imbalances. We develop a solution in the form of a distance-based Welch t-test, TW2, for two sample potentially unbalanced and heteroscedastic data. We demonstrate empirically the desirable type I error and power characteristics of the new test. We compare the performance of PERMANOVA and TW2 in reanalysis of two existing microbiome datasets, where the methodology has originated. The source code for methods and analysis of this article is available at https://github.com/alekseyenko/Tw2. Further guidance on application of these methods can be obtained from the author. Contact: alekseye@musc.edu. © The Author 2016. Published by Oxford University Press.
Multivariate Analysis for Quantification of Plutonium(IV) in Nitric Acid Based on Absorption Spectra
DOE Office of Scientific and Technical Information (OSTI.GOV)
Lines, Amanda M.; Adami, Susan R.; Sinkov, Sergey I.
Development of more effective, reliable, and fast methods for monitoring process streams is a growing opportunity for analytical applications. Many fields can benefit from on-line monitoring, including the nuclear fuel cycle where improved methods for monitoring radioactive materials will facilitate maintenance of proper safeguards and ensure safe and efficient processing of materials. On-line process monitoring with a focus on optical spectroscopy can provide a fast, non-destructive method for monitoring chemical species. However, identification and quantification of species can be hindered by the complexity of the solutions if bands overlap or show condition-dependent spectral features. Plutonium (IV) is one example of a species which displays significant spectral variation with changing nitric acid concentration. Single variate analysis (i.e. Beer’s Law) is difficult to apply to the quantification of Pu(IV) unless the nitric acid concentration is known and separate calibration curves have been made for all possible acid strengths. Multivariate, or chemometric, analysis is an approach that allows for the accurate quantification of Pu(IV) without a priori knowledge of nitric acid concentration.
Tan, Guangguo; Lou, Ziyang; Jing, Jing; Li, Wuhong; Zhu, Zhenyu; Zhao, Liang; Zhang, Guoqing; Chai, Yifeng
2011-12-01
Aconite roots are popularly used in herbal medicines in China. Many cases of accidental and intentional intoxication with this plant have been reported; some of these are fatal because the toxicity of aconitum is very high. It is thus important to detect and identify aconitum alkaloids in biofluids. In this work, an improved method employing LC-TOFMS with multivariate data analysis was developed for screening and analysis of major aconitum alkaloids and their metabolites in rat urine following oral administration of aconite roots extract. Thirty-four signals highlighted by multivariate statistical analyses including 24 parent components and 10 metabolites were screened out and further identified by adjustment of the fragmentor voltage to produce structure-relevant fragment ions. It is helpful for studying aconite roots in toxicology, pharmacology and forensic medicine. This work also confirmed that the metabolomic approach provides effective tools for screening multiple absorbed and metabolic components of Chinese herbal medicines in vivo. Copyright © 2011 John Wiley & Sons, Ltd.
Marini, Federico; de Beer, Dalene; Walters, Nico A; de Villiers, André; Joubert, Elizabeth; Walczak, Beata
2017-03-17
An ultimate goal of investigations of rooibos plant material subjected to different stages of fermentation is to identify the chemical changes taking place in the phenolic composition, using an untargeted approach and chromatographic fingerprints. Realization of this goal requires, among other things, identification of the main components of the plant material involved in chemical reactions during the fermentation process. Quantitative chromatographic data for the compounds in extracts of green, semi-fermented and fermented rooibos form the basis of a preliminary study following a targeted approach. The aim is to estimate whether treatment has a significant effect based on all quantified compounds and to identify the compounds that contribute significantly to it. Analysis of variance is performed using modern multivariate methods such as ANOVA-Simultaneous Component Analysis, ANOVA-Target Projection and regularized MANOVA. This study is the first in which all three approaches are compared and evaluated. For the data studied, all three methods reveal the same significance of the fermentation effect on the extract compositions, but they lead to different interpretations of it. Copyright © 2017 Elsevier B.V. All rights reserved.
Boggia, Raffaella; Casolino, Maria Chiara; Hysenaj, Vilma; Oliveri, Paolo; Zunin, Paola
2013-10-15
Consumer demand for pomegranate juice has grown considerably in recent years because of its potential health benefits. Since it is an expensive functional food, it is deceptively adulterated by adding cheaper fruit juices (i.e., grape and apple juices), by simple dilution, or by polyphenol subtraction. At present, time-consuming analyses are used to control the quality of this product. Furthermore, these analyses are expensive and require well-trained analysts. Thus, the purpose of this study was to propose a high-speed and easy-to-use shortcut. Based on UV-VIS spectroscopy and chemometrics, a method is proposed to quickly screen for some common fillers of pomegranate juice that could decrease the antiradical scavenging capacity of pure products. The analytical method was applied to laboratory-prepared juices, to commercial juices and to representative experimental mixtures at different levels of water and filler juices. The outcomes were evaluated by means of multivariate exploratory analysis. The results indicate that the proposed strategy can be a useful screening tool to assess addition of filler juices and water to pomegranate juices. Copyright © 2012 Elsevier Ltd. All rights reserved.
Wang, Xiuquan; Huang, Guohe; Zhao, Shan; Guo, Junhong
2015-09-01
This paper presents an open-source software package, rSCA, which is developed based upon a stepwise cluster analysis method and serves as a statistical tool for modeling the relationships between multiple dependent and independent variables. The rSCA package is efficient in dealing with both continuous and discrete variables, as well as nonlinear relationships between the variables. It divides the sample sets of dependent variables into different subsets (or subclusters) through a series of cutting and merging operations based upon the theory of multivariate analysis of variance (MANOVA). The modeling results are given by a cluster tree, which includes both intermediate and leaf subclusters as well as the flow paths from the root of the tree to each leaf subcluster specified by a series of cutting and merging actions. The rSCA package is a handy and easy-to-use tool and is freely available at http://cran.r-project.org/package=rSCA . By applying the developed package to air quality management in an urban environment, we demonstrate its effectiveness in dealing with the complicated relationships among multiple variables in real-world problems.
Goodwin, Cody R; Sherrod, Stacy D; Marasco, Christina C; Bachmann, Brian O; Schramm-Sapyta, Nicole; Wikswo, John P; McLean, John A
2014-07-01
A metabolic system is composed of inherently interconnected metabolic precursors, intermediates, and products. The analysis of untargeted metabolomics data has conventionally been performed through the use of comparative statistics or multivariate statistical analysis-based approaches; however, each falls short in representing the related nature of metabolic perturbations. Herein, we describe a complementary method for the analysis of large metabolite inventories using a data-driven approach based upon a self-organizing map algorithm. This workflow allows for the unsupervised clustering, and subsequent prioritization of, correlated features through Gestalt comparisons of metabolic heat maps. We describe this methodology in detail, including a comparison to conventional metabolomics approaches, and demonstrate the application of this method to the analysis of the metabolic repercussions of prolonged cocaine exposure in rat sera profiles.
NASA Astrophysics Data System (ADS)
Kumar, S.; Jasinski, M. F.; Mocko, D. M.; Rodell, M.; Borak, J.; Li, B.; Beaudoing, H. K.; Peters-Lidard, C. D.
2017-12-01
This presentation will describe one of the first successful examples of multisensor, multivariate land data assimilation, encompassing a large suite of soil moisture, snow depth, snow cover and irrigation intensity environmental data records (EDRs) from Scanning Multi-channel Microwave Radiometer (SMMR), the Special Sensor Microwave Imager (SSM/I), the Advanced Scatterometer (ASCAT), the Moderate-Resolution Imaging Spectroradiometer (MODIS), the Advanced Microwave Scanning Radiometer (AMSR-E and AMSR2), the Soil Moisture Ocean Salinity (SMOS) mission and the Soil Moisture Active Passive (SMAP) mission. The analysis is performed using the NASA Land Information System (LIS) as an enabling tool for the U.S. National Climate Assessment (NCA). The performance of NCA Land Data Assimilation System (NCA-LDAS) is evaluated by comparing to a number of hydrological reference data products. Results indicate that multivariate assimilation provides systematic improvements in simulated soil moisture and snow depth, with marginal effects on the accuracy of simulated streamflow and ET. An important conclusion is that across all evaluated variables, assimilation of data from increasingly more modern sensors (e.g. SMOS, SMAP, AMSR2, ASCAT) produces more skillful results than assimilation of data from older sensors (e.g. SMMR, SSM/I, AMSR-E). The evaluation also indicates high skill of NCA-LDAS when compared with other land analysis products. Further, drought indicators based on NCA-LDAS output suggest a trend of longer and more severe droughts over parts of Western U.S. during 1979-2015, particularly in the Southwestern U.S.
Hayes, Don; Kopp, Benjamin T; Tobias, Joseph D; Woodley, Frederick W; Mansour, Heidi M; Tumin, Dmitry; Kirkby, Stephen E
2015-12-01
Survival in non-cystic fibrosis (CF) bronchiectasis is not well studied. The United Network for Organ Sharing database was queried from 1987 to 2013 to compare survival in adult patients with non-CF bronchiectasis to patients with CF listed for lung transplantation (LTx). Each subject was tracked from waitlist entry date until death or censoring to determine survival differences between the two groups. Of 2112 listed lung transplant candidates with bronchiectasis (180 non-CF, 1932 CF), 1617 were used for univariate Cox and Kaplan-Meier survival function analysis, 1173 for multivariate Cox models, and 182 for matched-pairs analysis based on propensity scores. Compared to CF, patients with non-CF bronchiectasis had a significantly lower mortality by univariate Cox analysis (HR 0.565; 95 % CI 0.424, 0.754; p < 0.001). Adjusting for potential confounders, multivariate Cox models identified a significant reduction in risk for death associated with non-CF bronchiectasis who were lung transplant candidates (HR 0.684; 95 % CI 0.475, 0.985; p = 0.041). Results were consistent in multivariate models adjusting for pulmonary hypertension and forced expiratory volume in one second. Non-CF bronchiectasis with advanced lung disease was associated with significantly lower mortality hazard compared to CF bronchiectasis on the waitlist for LTx. Separate referral and listing criteria for LTx in non-CF and CF populations should be considered.
Geurts, Brigitte P; Neerincx, Anne H; Bertrand, Samuel; Leemans, Manja A A P; Postma, Geert J; Wolfender, Jean-Luc; Cristescu, Simona M; Buydens, Lutgarde M C; Jansen, Jeroen J
2017-04-22
Revealing the biochemistry associated to micro-organismal interspecies interactions is highly relevant for many purposes. Each pathogen has a characteristic metabolic fingerprint that allows identification based on their unique multivariate biochemistry. When pathogen species come into mutual contact, their co-culture will display a chemistry that may be attributed both to mixing of the characteristic chemistries of the mono-cultures and to competition between the pathogens. Therefore, investigating pathogen development in a polymicrobial environment requires dedicated chemometric methods to untangle and focus upon these sources of variation. The multivariate data analysis method Projected Orthogonalised Chemical Encounter Monitoring (POCHEMON) is dedicated to highlight metabolites characteristic for the interaction of two micro-organisms in co-culture. However, this approach is currently limited to a single time-point, while development of polymicrobial interactions may be highly dynamic. A well-known multivariate implementation of Analysis of Variance (ANOVA) uses Principal Component Analysis (ANOVA-PCA). This allows the overall dynamics to be separated from the pathogen-specific chemistry to analyse the contributions of both aspects separately. For this reason, we propose to integrate ANOVA-PCA with the POCHEMON approach to disentangle the pathogen dynamics and the specific biochemistry in interspecies interactions. Two complementary case studies show great potential for both liquid and gas chromatography - mass spectrometry to reveal novel information on chemistry specific to interspecies interaction during pathogen development. Copyright © 2017 The Author(s). Published by Elsevier B.V. All rights reserved.
In situ X-ray diffraction analysis of (CFx)n batteries: signal extraction by multivariate analysis
Rodriguez, Mark A.; Keenan, Michael R.; Nagasubramanian, Ganesan
2007-11-10
In this study, the (CFx)n cathode reaction during discharge has been investigated using in situ X-ray diffraction (XRD). Mathematical treatment of the in situ XRD data set was performed using multivariate curve resolution with alternating least squares (MCR–ALS), a technique of multivariate analysis. MCR–ALS analysis successfully separated the relatively weak XRD signal intensity due to the chemical reaction from the other inert cell component signals. The resulting dynamic reaction component revealed the loss of (CFx)n cathode signal together with the simultaneous appearance of LiF by-product intensity. Careful examination of the XRD data set revealed an additional dynamic component which may be associated with the formation of an intermediate compound during the discharge process.
Hybrid least squares multivariate spectral analysis methods
Haaland, David M.
2004-03-23
A set of hybrid least squares multivariate spectral analysis methods in which spectral shapes of components or effects not present in the original calibration step are added in a following prediction or calibration step to improve the accuracy of the estimation of the amount of the original components in the sampled mixture. The hybrid method herein means a combination of an initial calibration step with subsequent analysis by an inverse multivariate analysis method. A spectral shape herein means normally the spectral shape of a non-calibrated chemical component in the sample mixture but can also mean the spectral shapes of other sources of spectral variation, including temperature drift, shifts between spectrometers, spectrometer drift, etc. The shape can be continuous, discontinuous, or even discrete points illustrative of the particular effect.
Buttini, Francesca; Pasquali, Irene; Brambilla, Gaetano; Copelli, Diego; Alberi, Massimiliano Dagli; Balducci, Anna Giulia; Bettini, Ruggero; Sisti, Viviana
2016-03-01
The aim of this work was to evaluate the effect of two different dry powder inhalers, of the NGI induction port and Alberta throat, and of the actual inspiratory profiles of asthmatic patients on in-vitro drug inhalation performance. The two devices considered were a reservoir multidose inhaler and a capsule-based inhaler. The formulation used to test the inhalers was a combination of formoterol fumarate and beclomethasone dipropionate. A breath simulator was used to mimic inspiratory patterns previously determined in vivo. A multivariate approach was adopted to estimate the significance of the effect of the investigated variables in the explored domain. The breath simulator was a useful tool to mimic in vitro the in vivo inspiratory profiles of asthmatic patients. The type of throat coupled with the impactor did not affect the aerodynamic distribution of the investigated formulation. However, the type of inhaler and the inspiratory profiles affected the respirable dose of drugs. The multivariate statistical approach demonstrated that the multidose inhaler efficiently released a high fine particle mass independently of the inspiratory profiles adopted. In contrast, the single-dose capsule inhaler showed a significant decrease in the fine particle mass of both drugs when the device was activated using the minimum inspiratory volume (592 mL).
A Review of Multivariate Methods for Multimodal Fusion of Brain Imaging Data
Adali, Tülay; Yu, Qingbao; Calhoun, Vince D.
2011-01-01
The development of various neuroimaging techniques is rapidly improving the measurements of brain function/structure. However, despite improvements in individual modalities, it is becoming increasingly clear that the most effective research approaches will utilize multi-modal fusion, which takes advantage of the fact that each modality provides a limited view of the brain. The goal of multimodal fusion is to capitalize on the strength of each modality in a joint analysis, rather than a separate analysis of each. This is a more complicated endeavor that must be approached more carefully and efficient methods should be developed to draw generalized and valid conclusions from high dimensional data with a limited number of subjects. Numerous research efforts have been reported in the field based on various statistical approaches, e.g. independent component analysis (ICA), canonical correlation analysis (CCA) and partial least squares (PLS). In this review paper, we survey a number of multivariate methods appearing in previous reports, which are performed with or without prior information and may have utility for identifying potential brain illness biomarkers. We also discuss the possible strengths and limitations of each method, and review their applications to brain imaging data. PMID:22108139
Valverde-Som, Lucia; Ruiz-Samblás, Cristina; Rodríguez-García, Francisco P; Cuadros-Rodríguez, Luis
2018-02-09
Virgin olive oil is the only food product for which sensory analysis is regulated to classify it in different quality categories. To harmonize the results of the sensorial method, the use of standards or reference materials is crucial. The stability of sensory reference materials is required to enable their suitable control, aiming to confirm that their specific target values are maintained on an ongoing basis. Currently, such stability is monitored by means of sensory analysis and the sensory panels are in the paradoxical situation of controlling the standards that are devoted to controlling the panels. In the present study, several approaches based on similarity analysis are exploited. For each approach, the specific methodology to build a proper multivariate control chart to monitor the stability of the sensory properties is explained and discussed. The normalized Euclidean and Mahalanobis distances, the so-called nearness and hardiness indices respectively, have been defined as new similarity indices whose values range from 0 to 1. Also, the squared mean from Hotelling's T²-statistic and Q²-statistic has been proposed as another similarity index. © 2018 Society of Chemical Industry.
Genetic association of impulsivity in young adults: a multivariate study
Khadka, S; Narayanan, B; Meda, S A; Gelernter, J; Han, S; Sawyer, B; Aslanzadeh, F; Stevens, M C; Hawkins, K A; Anticevic, A; Potenza, M N; Pearlson, G D
2014-01-01
Impulsivity is a heritable, multifaceted construct with clinically relevant links to multiple psychopathologies. We assessed impulsivity in young adult (N~2100) participants in a longitudinal study, using self-report questionnaires and computer-based behavioral tasks. Analysis was restricted to the subset (N=426) who underwent genotyping. Multivariate association between impulsivity measures and single-nucleotide polymorphism data was implemented using parallel independent component analysis (Para-ICA). Pathways associated with multiple genes in components that correlated significantly with impulsivity phenotypes were then identified using a pathway enrichment analysis. Para-ICA revealed two significantly correlated genotype–phenotype component pairs. One impulsivity component included the reward responsiveness subscale and behavioral inhibition scale of the Behavioral-Inhibition System/Behavioral-Activation System scale, and the second impulsivity component included the non-planning subscale of the Barratt Impulsiveness Scale and the Experiential Discounting Task. Pathway analysis identified processes related to neurogenesis, nervous system signal generation/amplification, neurotransmission and immune response. We identified various genes and gene regulatory pathways associated with empirically derived impulsivity components. Our study suggests that gene networks implicated previously in brain development, neurotransmission and immune response are related to impulsive tendencies and behaviors. PMID:25268255
An error bound for a discrete reduced order model of a linear multivariable system
NASA Technical Reports Server (NTRS)
Al-Saggaf, Ubaid M.; Franklin, Gene F.
1987-01-01
The design of feasible controllers for high dimension multivariable systems can be greatly aided by a method of model reduction. In order for the design based on the order reduction to include a guarantee of stability, it is sufficient to have a bound on the model error. Previous work has provided such a bound for continuous-time systems for algorithms based on balancing. In this note an L-infinity bound is derived for model error for a method of order reduction of discrete linear multivariable systems based on balancing.
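As a point of reference for the "previous work" on continuous-time systems mentioned above, the commonly cited balanced-truncation bound can be written as follows. This is a sketch of the standard continuous-time result only; the abstract does not state the explicit form of the discrete-time bound, so none is reproduced here. The symbols G (full model of order n), G_r (reduced model of order r) and σ_i (Hankel singular values in decreasing order) are notation assumed for illustration.

```latex
% Continuous-time balanced truncation (standard result, notation assumed):
% G   : full-order stable transfer matrix of order n
% G_r : reduced model keeping the r largest Hankel singular values
% \sigma_1 \ge \sigma_2 \ge \dots \ge \sigma_n : Hankel singular values
\[
  \| G - G_r \|_{\infty} \;\le\; 2 \sum_{i=r+1}^{n} \sigma_i
\]
```

A bound of this kind is what lets a controller designed on the reduced model carry a stability guarantee: the neglected dynamics are treated as a perturbation whose size is known in advance.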
Zhu, Guangxu; Guo, Qingjun; Xiao, Huayun; Chen, Tongbin; Yang, Jun
2017-06-01
Heavy metals are considered toxic to humans and ecosystems. In the present study, heavy metal concentrations in soil were investigated using the single pollution index (PIi), the integrated Nemerow pollution index (PIN), and the geoaccumulation index (Igeo) to determine metal accumulation and pollution status at the abandoned site of the Capital Iron and Steel Factory in Beijing and its surrounding area. Multivariate statistical analysis (principal component analysis and correlation analysis) and geostatistical analysis (ArcGIS tool), combined with stable Pb isotopic ratios, were applied to explore the characteristics of heavy metal pollution and the possible sources of pollutants. The results indicated that heavy metal elements show different degrees of accumulation in the study area; the observed trend of the enrichment factors and the geoaccumulation index was Hg > Cd > Zn > Cr > Pb > Cu ≈ As > Ni. Hg, Cd, Zn, and Cr were the dominant elements that influenced soil quality in the study area. The Nemerow index method indicated that all of the heavy metals caused serious pollution except Ni. Multivariate statistical analysis indicated that Cd, Zn, Cu, and Pb show obvious correlation and have higher loads on the same principal component, suggesting that they had the same sources, which are related to industrial activities and vehicle emissions. The spatial distribution maps based on ordinary kriging showed that high concentrations of heavy metals were located in the local factory area and in the southeast-northwest part of the study region, corresponding with the predominant wind directions. Analyses of lead isotopes confirmed that Pb in the study soils is predominantly derived from three Pb sources: dust generated during steel production, coal combustion, and the natural background. Moreover, the ternary mixture model based on lead isotope analysis indicates that lead in the study soils originates mainly from anthropogenic sources, which contribute much more than the natural sources. Our study not only reveals the overall situation of heavy metal contamination but also identifies the specific pollution sources.
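The three indices named in this abstract have widely used textbook definitions, sketched below in Python. The abstract itself does not give the formulas, reference standards, or background values the authors used, so the function names, the 1.5 background-correction factor in Igeo, and all example numbers are assumptions for illustration only.

```python
import math


def single_pollution_index(concentration, reference):
    """PIi: measured concentration divided by a reference (standard) value."""
    return concentration / reference


def nemerow_index(pi_values):
    """PIN: combines the mean and the maximum of the single pollution indices."""
    mean_pi = sum(pi_values) / len(pi_values)
    max_pi = max(pi_values)
    return math.sqrt((mean_pi ** 2 + max_pi ** 2) / 2)


def geoaccumulation_index(concentration, background):
    """Igeo: log2 of concentration over 1.5 times the geochemical background."""
    return math.log2(concentration / (1.5 * background))


if __name__ == "__main__":
    # Hypothetical concentrations and reference values (mg/kg), not from the study.
    measured = {"Cd": 0.9, "Zn": 250.0, "Ni": 20.0}
    reference = {"Cd": 0.3, "Zn": 100.0, "Ni": 40.0}
    pis = [single_pollution_index(measured[m], reference[m]) for m in measured]
    print("PIi:", pis)
    print("PIN:", nemerow_index(pis))
    print("Igeo(Cd):", geoaccumulation_index(measured["Cd"], reference["Cd"]))
```

Because PIN weights the maximum single index as heavily as the mean, one strongly enriched element can push the integrated index into a polluted class even when most elements are near background.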
Richard D. Wood-Smith; John M. Buffington
1996-01-01
Multivariate statistical analyses of geomorphic variables from 23 forest stream reaches in southeast Alaska result in successful discrimination between pristine streams and those disturbed by land management, specifically timber harvesting and associated road building. Results of discriminant function analysis indicate that a three-variable model discriminates 10...
ERIC Educational Resources Information Center
Tchumtchoua, Sylvie; Dey, Dipak K.
2012-01-01
This paper proposes a semiparametric Bayesian framework for the analysis of associations among multivariate longitudinal categorical variables in high-dimensional data settings. This type of data is frequent, especially in the social and behavioral sciences. A semiparametric hierarchical factor analysis model is developed in which the…
Dor, Avi; Luo, Qian; Gerstein, Maya Tuchman; Malveaux, Floyd; Mitchell, Herman; Markus, Anne Rossier
We present an incremental cost-effectiveness analysis comparing an evidence-based childhood asthma intervention (Community Healthcare for Asthma Management and Prevention of Symptoms [CHAMPS]) with usual management of childhood asthma in community health centers. Data used in the analysis include household surveys, Medicaid insurance claims, and community health center expenditure reports. We combined our incremental cost-effectiveness analysis with a difference-in-differences multivariate regression framework. We found that CHAMPS reduced symptom days by 29.75 days per child-year and was cost-effective (incremental cost-effectiveness ratio: $28.76 per symptom-free day). Most of the benefits were due to reductions in direct medical costs. Indirect benefits from increased household productivity were relatively small.
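The two quantities combined in this analysis can be illustrated with a minimal Python sketch: a simple difference-in-differences contrast of group means and an incremental cost-effectiveness ratio. This is not the authors' regression model, which adjusts for covariates; the function names and the numbers in the demo are hypothetical and are not taken from the study.

```python
def diff_in_diff(treat_pre, treat_post, ctrl_pre, ctrl_post):
    """Change in the treated group minus change in the comparison group."""
    return (treat_post - treat_pre) - (ctrl_post - ctrl_pre)


def icer(delta_cost, delta_effect):
    """Incremental cost per unit of incremental effect."""
    if delta_effect == 0:
        raise ValueError("incremental effect is zero; ICER is undefined")
    return delta_cost / delta_effect


if __name__ == "__main__":
    # Hypothetical mean symptom days per child-year, before and after.
    symptom_day_change = diff_in_diff(
        treat_pre=100.0, treat_post=60.0, ctrl_pre=100.0, ctrl_post=90.0
    )
    symptom_free_days_gained = -symptom_day_change  # fewer symptom days = gain
    # Hypothetical incremental cost per child-year of the intervention.
    print("DiD estimate:", symptom_day_change)
    print("ICER:", icer(delta_cost=600.0, delta_effect=symptom_free_days_gained))
```

Subtracting the comparison group's change removes trends that would have occurred without the intervention, so the ratio is computed on the effect attributable to the program.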
Use of Multivariate Linkage Analysis for Dissection of a Complex Cognitive Trait
Marlow, Angela J.; Fisher, Simon E.; Francks, Clyde; MacPhie, I. Laurence; Cherny, Stacey S.; Richardson, Alex J.; Talcott, Joel B.; Stein, John F.; Monaco, Anthony P.; Cardon, Lon R.
2003-01-01
Replication of linkage results for complex traits has been exceedingly difficult, owing in part to the inability to measure the precise underlying phenotype, small sample sizes, genetic heterogeneity, and statistical methods employed in analysis. Often, in any particular study, multiple correlated traits have been collected, yet these have been analyzed independently or, at most, in bivariate analyses. Theoretical arguments suggest that full multivariate analysis of all available traits should offer more power to detect linkage; however, this has not yet been evaluated on a genomewide scale. Here, we conduct multivariate genomewide analyses of quantitative-trait loci that influence reading- and language-related measures in families affected with developmental dyslexia. The results of these analyses are substantially clearer than those of previous univariate analyses of the same data set, helping to resolve a number of key issues. These outcomes highlight the relevance of multivariate analysis for complex disorders for dissection of linkage results in correlated traits. The approach employed here may aid positional cloning of susceptibility genes in a wide spectrum of complex traits. PMID:12587094
The association between body mass index and severe biliary infections: a multivariate analysis.
Stewart, Lygia; Griffiss, J McLeod; Jarvis, Gary A; Way, Lawrence W
2012-11-01
Obesity has been associated with worse infectious disease outcomes. It is a risk factor for cholesterol gallstones, but little is known about associations between body mass index (BMI) and biliary infections. We studied this using factors associated with biliary infections. A total of 427 patients with gallstones were studied. Gallstones, bile, and blood (as applicable) were cultured. Illness severity was classified as follows: none (no infection or inflammation), systemic inflammatory response syndrome (fever, leukocytosis), severe (abscess, cholangitis, empyema), or multi-organ dysfunction syndrome (bacteremia, hypotension, organ failure). Associations between BMI and biliary bacteria, bacteremia, gallstone type, and illness severity were examined using bivariate and multivariate analysis. BMI inversely correlated with pigment stones, biliary bacteria, bacteremia, and increased illness severity on bivariate and multivariate analysis. Obesity correlated with less severe biliary infections. BMI inversely correlated with pigment stones and biliary bacteria; multivariate analysis showed an independent correlation between lower BMI and illness severity. Most patients with severe biliary infections had a normal BMI, suggesting that obesity may be protective in biliary infections. This study examined the correlation between BMI and biliary infection severity. Published by Elsevier Inc.
Multivariate meta-analysis using individual participant data.
Riley, R D; Price, M J; Jackson, D; Wardle, M; Gueyffier, F; Wang, J; Staessen, J A; White, I R
2015-06-01
When combining results across related studies, a multivariate meta-analysis allows the joint synthesis of correlated effect estimates from multiple outcomes. Joint synthesis can improve efficiency over separate univariate syntheses, may reduce selective outcome reporting biases, and enables joint inferences across the outcomes. A common issue is that within-study correlations needed to fit the multivariate model are unknown from published reports. However, provision of individual participant data (IPD) allows them to be calculated directly. Here, we illustrate how to use IPD to estimate within-study correlations, using a joint linear regression for multiple continuous outcomes and bootstrapping methods for binary, survival and mixed outcomes. In a meta-analysis of 10 hypertension trials, we then show how these methods enable multivariate meta-analysis to address novel clinical questions about continuous, survival and binary outcomes; treatment-covariate interactions; adjusted risk/prognostic factor effects; longitudinal data; prognostic and multiparameter models; and multiple treatment comparisons. Both frequentist and Bayesian approaches are applied, with example software code provided to derive within-study correlations and to fit the models. © 2014 The Authors. Research Synthesis Methods published by John Wiley & Sons, Ltd.
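The bootstrap idea described above can be sketched in Python for a single two-arm trial: resample participants within each arm, re-estimate the treatment effect on two outcomes each time, and take the correlation of the paired estimates across replicates. This is an illustrative sketch, not the example code the authors provide; the function name, the use of simple mean differences as effect estimates, and the simulated data are all assumptions.

```python
import numpy as np


def bootstrap_within_study_corr(treat, y1, y2, n_boot=2000, seed=0):
    """Bootstrap the correlation between two treatment-effect estimates.

    treat  : 0/1 treatment indicator per participant
    y1, y2 : two outcomes measured on the same participants
    Effects are mean differences (treated minus control); for a 0/1 outcome
    this is a risk difference.
    """
    rng = np.random.default_rng(seed)
    treat = np.asarray(treat, dtype=bool)
    y1 = np.asarray(y1, dtype=float)
    y2 = np.asarray(y2, dtype=float)
    treated = np.flatnonzero(treat)
    control = np.flatnonzero(~treat)
    effects = np.empty((n_boot, 2))
    for b in range(n_boot):
        # Resample participants with replacement, separately within each arm,
        # keeping each participant's two outcomes together.
        ti = rng.choice(treated, size=treated.size, replace=True)
        ci = rng.choice(control, size=control.size, replace=True)
        effects[b, 0] = y1[ti].mean() - y1[ci].mean()
        effects[b, 1] = y2[ti].mean() - y2[ci].mean()
    return float(np.corrcoef(effects[:, 0], effects[:, 1])[0, 1])


if __name__ == "__main__":
    # Simulated individual participant data for one hypothetical trial.
    rng = np.random.default_rng(1)
    n = 200
    treat = np.arange(n) % 2
    y1 = rng.normal(size=n) - 0.5 * treat
    y2 = y1 + 0.5 * rng.normal(size=n)  # second outcome correlated with first
    print(bootstrap_within_study_corr(treat, y1, y2))
```

Resampling whole participants, rather than each outcome separately, is what preserves the dependence between outcomes and makes the resulting correlation usable as the within-study correlation in a multivariate model.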