NASA Astrophysics Data System (ADS)
de Oliveira, Isadora R. N.; Roque, Jussara V.; Maia, Mariza P.; Stringheta, Paulo C.; Teófilo, Reinaldo F.
2018-04-01
A new method was developed to determine the antioxidant properties of red cabbage (Brassica oleracea) extract by mid (MID) and near (NIR) infrared spectroscopy and partial least squares (PLS) regression. A 70% (v/v) ethanolic extract of red cabbage was concentrated to 9 °Brix and further diluted (12 to 100%) in water. The dilutions were used as external standards for building the PLS models. This is the first time this strategy has been applied to building multivariate regression models. Reference analyses and spectral data were obtained from the diluted extracts. The properties determined were total and monomeric anthocyanins, total polyphenols, and antioxidant capacity by the ABTS (2,2'-azino-bis(3-ethylbenzothiazoline-6-sulfonate)) and DPPH (2,2-diphenyl-1-picrylhydrazyl) methods. Ordered predictors selection (OPS) and a genetic algorithm (GA) were used for feature selection before PLS regression (PLS-1). In addition, a PLS-2 regression was applied to all properties simultaneously. The PLS-1 models were more predictive than the PLS-2 regression. The PLS-OPS and PLS-GA models gave excellent prediction results, with correlation coefficients higher than 0.98. However, the best models were obtained using PLS with variable selection by the OPS algorithm, and the models based on NIR spectra were considered more predictive for all properties. These models thus provide a simple, rapid and accurate method for determining the antioxidant properties of red cabbage extract and assessing its suitability for use in the food industry.
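The PLS-1 step named in this abstract (one response variable per model) can be illustrated with a minimal sketch. It assumes a NIPALS-style deflation loop in plain Python; the function names `pls1_fit` and `pls1_predict` and the toy numbers are hypothetical and are not taken from the paper, which does not state its implementation.

```python
from math import sqrt


def _dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model by successive deflation (illustrative only)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centred predictors
    f = [v - y_mean for v in y]                                  # centred response
    weights, loadings, q = [], [], []
    for _ in range(n_components):
        # Weight vector: direction of maximal covariance between E and f.
        w = [_dot([E[i][j] for i in range(n)], f) for j in range(p)]
        norm = sqrt(_dot(w, w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [_dot(E[i], w) for i in range(n)]  # scores
        tt = _dot(t, t)
        p_a = [_dot([E[i][j] for i in range(n)], t) / tt for j in range(p)]
        q_a = _dot(f, t) / tt
        # Deflate predictors and response before extracting the next component.
        E = [[E[i][j] - t[i] * p_a[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q_a * t[i] for i in range(n)]
        weights.append(w)
        loadings.append(p_a)
        q.append(q_a)
    return {"x_mean": x_mean, "y_mean": y_mean,
            "weights": weights, "loadings": loadings, "q": q}


def pls1_predict(model, x):
    """Predict the response for one sample by replaying the deflation steps."""
    e = [xj - m for xj, m in zip(x, model["x_mean"])]
    y_hat = model["y_mean"]
    for w, p_a, q_a in zip(model["weights"], model["loadings"], model["q"]):
        t = _dot(e, w)
        y_hat += q_a * t
        e = [ej - t * pj for ej, pj in zip(e, p_a)]
    return y_hat


# Toy data (invented): two "spectral" variables, response exactly linear in them.
X = [[1, 2], [2, 1], [3, 5], [4, 3], [5, 4]]
y = [1 + 2 * a - b for a, b in X]
model = pls1_fit(X, y, n_components=2)
```

With as many components as the predictor matrix has rank, this sketch reproduces the ordinary least-squares fit; in spectroscopic practice far fewer components than variables would be retained, usually chosen by cross-validation.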
Xu, Yun; Muhamadali, Howbeer; Sayqal, Ali; Dixon, Neil; Goodacre, Royston
2016-10-28
Partial least squares (PLS) is one of the most commonly used supervised modelling approaches for analysing multivariate metabolomics data. PLS is typically employed as either a regression model (PLS-R) or a classification model (PLS-DA). However, in metabolomics studies it is common to investigate multiple, potentially interacting, factors simultaneously following a specific experimental design. Such data often cannot be considered as a "pure" regression or a classification problem. Nevertheless, these data have often still been treated as a regression or classification problem and this could lead to ambiguous results. In this study, we investigated the feasibility of designing a hybrid target matrix Y that better reflects the experimental design than simple regression or binary class membership coding commonly used in PLS modelling. The new design of Y coding was based on the same principle used by structural modelling in machine learning techniques. Two real metabolomics datasets were used as examples to illustrate how the new Y coding can improve the interpretability of the PLS model compared to classic regression/classification coding.
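For orientation, the classic binary class-membership coding that this abstract contrasts with its hybrid design can be sketched as below. The helper name `dummy_code` and the labels are hypothetical; the hybrid Y coding proposed by the authors is not reproduced here because the abstract does not specify it.

```python
def dummy_code(labels):
    """Return (classes, Y) where Y has one 0/1 column per class (PLS-DA style)."""
    classes = sorted(set(labels))
    Y = [[1 if label == c else 0 for c in classes] for label in labels]
    return classes, Y


# Invented example: three samples from two experimental groups.
classes, Y = dummy_code(["control", "treated", "control"])
```

Each row of `Y` marks the single class a sample belongs to, which is why designs with several interacting factors are awkward to express in this form.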
Kernel Partial Least Squares for Nonlinear Regression and Discrimination
NASA Technical Reports Server (NTRS)
Rosipal, Roman; Clancy, Daniel (Technical Monitor)
2002-01-01
This paper summarizes recent results on applying the method of partial least squares (PLS) in a reproducing kernel Hilbert space (RKHS). A previously proposed kernel PLS regression model was proven to be competitive with other regularized regression methods in RKHS. The family of nonlinear kernel-based PLS models is extended by considering the kernel PLS method for discrimination. Theoretical and experimental results on a two-class discrimination problem indicate usefulness of the method.
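Kernel PLS operates on a Gram matrix of pairwise kernel evaluations rather than on the raw predictors. The sketch below assumes a Gaussian kernel and shows the feature-space centring of the Gram matrix that kernel methods of this kind typically apply; the function names and the `width` parameter are illustrative choices, not the report's notation.

```python
from math import exp


def gaussian_gram(X, width):
    """Gram matrix K[i][j] = exp(-||x_i - x_j||^2 / (2 * width^2))."""
    return [[exp(-sum((a - b) ** 2 for a, b in zip(xi, xj)) / (2 * width ** 2))
             for xj in X] for xi in X]


def center_gram(K):
    """Centre K in feature space: K - 1K/n - K1/n + 1K1/n^2."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]
    col_mean = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - col_mean[j] + total_mean
             for j in range(n)] for i in range(n)]


# Invented one-dimensional samples.
K = gaussian_gram([[0.0], [1.0], [3.0]], width=1.0)
Kc = center_gram(K)
```

After centring, every row and column of `Kc` sums to zero, which plays the same role as mean-centring the predictors in linear PLS.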
Mechanisms behind the estimation of photosynthesis traits from leaf reflectance observations
NASA Astrophysics Data System (ADS)
Dechant, Benjamin; Cuntz, Matthias; Doktor, Daniel; Vohland, Michael
2016-04-01
Many studies have investigated the reflectance-based estimation of leaf chlorophyll, water and dry matter contents of plants. Only a few studies, however, have focused on photosynthesis traits. The maximum potential uptake of carbon dioxide under given environmental conditions is determined mainly by RuBisCO activity, limiting carboxylation, or the speed of photosynthetic electron transport. These two main limitations are represented by the maximum carboxylation capacity, Vcmax,25, and the maximum electron transport rate, Jmax,25. These traits have been estimated from leaf reflectance before, but the mechanisms underlying the estimation remain rather speculative. The aim of this study was therefore to reveal the mechanisms behind reflectance-based estimation of Vcmax,25 and Jmax,25. Leaf reflectance, photosynthetic response curves as well as nitrogen content per area, Narea, and leaf mass per area, LMA, were measured on 37 deciduous tree species. Vcmax,25 and Jmax,25 were determined from the response curves. Partial Least Squares (PLS) regression models for the two photosynthesis traits Vcmax,25 and Jmax,25 as well as Narea and LMA were studied using a cross-validation approach. Analyses of linear regression models based on Narea and other leaf traits estimated via PROSPECT inversion, PLS regression coefficients and model residuals were conducted in order to reveal the mechanisms behind the reflectance-based estimation. We found that Vcmax,25 and Jmax,25 can be estimated from leaf reflectance with good to moderate accuracy for a large number of species and different light conditions. The dominant mechanism behind the estimations was the strong relationship between photosynthesis traits and leaf nitrogen content. This was concluded from very strong relationships between PLS regression coefficients, the model residuals as well as the prediction performance of Narea-based linear regression models compared to PLS regression models.
While the PLS regression model for Vcmax,25 was fully based on the correlation to Narea, the PLS regression model for Jmax,25 was not entirely based on it. Analyses of the contributions of different parts of the reflectance spectrum revealed that the information contributing to the Jmax,25 PLS regression model in addition to the main source of information, Narea, was mainly located in the visible part of the spectrum (500-900 nm). Estimated chlorophyll content could be excluded as a potential source of this extra information. The PLS regression coefficients of the Jmax,25 model indicated possible contributions from chlorophyll fluorescence and cytochrome f content. In summary, we found that the main mechanism behind the estimation of Vcmax,25 and Jmax,25 from leaf reflectance observations is the correlation to Narea but that there is additional information related to Jmax,25 mainly in the visible part of the spectrum.
Siebers, Nina; Kruse, Jens; Eckhardt, Kai-Uwe; Hu, Yongfeng; Leinweber, Peter
2012-07-01
Cadmium (Cd) has a high toxicity and resolving its speciation in soil is challenging but essential for estimating the environmental risk. In this study partial least-square (PLS) regression was tested for its capability to deconvolute Cd L3-edge X-ray absorption near-edge structure (XANES) spectra of multi-compound mixtures. For this, a library of Cd reference compound spectra and a spectrum of a soil sample were acquired. A good coefficient of determination (R2) of Cd compounds in mixtures was obtained for the PLS model using binary and ternary mixtures of various Cd reference compounds proving the validity of this approach. In order to describe complex systems like soil, multi-compound mixtures of a variety of Cd compounds must be included in the PLS model. The obtained PLS regression model was then applied to a highly Cd-contaminated soil revealing Cd3(PO4)2 (36.1%), Cd(NO3)2·4H2O (24.5%), Cd(OH)2 (21.7%), CdCO3 (17.1%) and CdCl2 (0.4%). These preliminary results proved that PLS regression is a promising approach for a direct determination of Cd speciation in the solid phase of a soil sample.
Shan, Jiajia; Wang, Xue; Zhou, Hao; Han, Shuqing; Riza, Dimas Firmanda Al; Kondo, Naoshi
2018-03-13
Synchronous fluorescence spectra, combined with multivariate analysis were used to predict flavonoids content in green tea rapidly and nondestructively. This paper presented a new and efficient spectral intervals selection method called clustering based partial least square (CL-PLS), which selected informative wavelengths by combining clustering concept and partial least square (PLS) methods to improve models' performance by synchronous fluorescence spectra. The fluorescence spectra of tea samples were obtained and k-means and kohonen-self organizing map clustering algorithms were carried out to cluster full spectra into several clusters, and sub-PLS regression model was developed on each cluster. Finally, CL-PLS models consisting of gradually selected clusters were built. Correlation coefficient (R) was used to evaluate the effect on prediction performance of PLS models. In addition, variable influence on projection partial least square (VIP-PLS), selectivity ratio partial least square (SR-PLS), interval partial least square (iPLS) models and full spectra PLS model were investigated and the results were compared. The results showed that CL-PLS presented the best result for flavonoids prediction using synchronous fluorescence spectra.
NASA Astrophysics Data System (ADS)
Yuniarto, Budi; Kurniawan, Robert
2017-03-01
PLS Path Modeling (PLS-PM) differs from covariance-based structural equation modeling (SEM) in that PLS-PM uses a variance-based, or component-based, approach; PLS-PM is therefore also known as component-based SEM. Multiblock Partial Least Squares (MBPLS) is a PLS regression method that can be used in PLS Path Modeling, where it is known as Multiblock PLS Path Modeling (MBPLS-PM). This method uses an iterative procedure in its algorithm. This research aims to modify MBPLS-PM with a Back Propagation Neural Network approach. The result is that the MBPLS-PM algorithm can be modified using the Back Propagation Neural Network approach to replace the iterative process in the backward and forward steps that yields the matrix t and the matrix u. With this modification, the model parameters obtained do not differ significantly from those obtained by the original MBPLS-PM algorithm.
NASA Astrophysics Data System (ADS)
Luna, Aderval S.; Gonzaga, Fabiano B.; da Rocha, Werickson F. C.; Lima, Igor C. A.
2018-01-01
Laser-induced breakdown spectroscopy (LIBS) analysis was carried out on eleven steel samples to quantify the concentrations of chromium, nickel, and manganese. LIBS spectral data were correlated to known concentrations of the samples using different strategies in partial least squares (PLS) regression models. For the PLS analysis, one predictive model was separately generated for each element, while different approaches were used for the selection of variables (VIP: variable importance in projection and iPLS: interval partial least squares) in the PLS model to quantify the contents of the elements. The comparison of the performance of the models showed that there was no significant statistical difference using the Wilcoxon signed rank test. The elliptical joint confidence region (EJCR) did not detect systematic errors in these proposed methodologies for each metal.
NASA Astrophysics Data System (ADS)
Shan, Jiajia; Wang, Xue; Zhou, Hao; Han, Shuqing; Riza, Dimas Firmanda Al; Kondo, Naoshi
2018-04-01
Synchronous fluorescence spectra, combined with multivariate analysis were used to predict flavonoids content in green tea rapidly and nondestructively. This paper presented a new and efficient spectral intervals selection method called clustering based partial least square (CL-PLS), which selected informative wavelengths by combining clustering concept and partial least square (PLS) methods to improve models’ performance by synchronous fluorescence spectra. The fluorescence spectra of tea samples were obtained and k-means and kohonen-self organizing map clustering algorithms were carried out to cluster full spectra into several clusters, and sub-PLS regression model was developed on each cluster. Finally, CL-PLS models consisting of gradually selected clusters were built. Correlation coefficient (R) was used to evaluate the effect on prediction performance of PLS models. In addition, variable influence on projection partial least square (VIP-PLS), selectivity ratio partial least square (SR-PLS), interval partial least square (iPLS) models and full spectra PLS model were investigated and the results were compared. The results showed that CL-PLS presented the best result for flavonoids prediction using synchronous fluorescence spectra.
Zhang, Yan; Zou, Hong-Yan; Shi, Pei; Yang, Qin; Tang, Li-Juan; Jiang, Jian-Hui; Wu, Hai-Long; Yu, Ru-Qin
2016-01-01
Determination of benzo[a]pyrene (BaP) in cigarette smoke can be very important for the tobacco quality control and the assessment of its harm to human health. In this study, mid-infrared spectroscopy (MIR) coupled to chemometric algorithm (DPSO-WPT-PLS), which was based on the wavelet packet transform (WPT), discrete particle swarm optimization algorithm (DPSO) and partial least squares regression (PLS), was used to quantify harmful ingredient benzo[a]pyrene in the cigarette mainstream smoke with promising result. Furthermore, the proposed method provided better performance compared to several other chemometric models, i.e., PLS, radial basis function-based PLS (RBF-PLS), PLS with stepwise regression variable selection (Stepwise-PLS) as well as WPT-PLS with informative wavelet coefficients selected by correlation coefficient test (rtest-WPT-PLS). It can be expected that the proposed strategy could become a new effective, rapid quantitative analysis technique in analyzing the harmful ingredient BaP in cigarette mainstream smoke. Copyright © 2015 Elsevier B.V. All rights reserved.
Kernel analysis of partial least squares (PLS) regression models.
Shinzawa, Hideyuki; Ritthiruangdej, Pitiporn; Ozaki, Yukihiro
2011-05-01
An analytical technique based on kernel matrix representation is demonstrated to provide further chemically meaningful insight into partial least squares (PLS) regression models. The kernel matrix condenses essential information about scores derived from PLS or principal component analysis (PCA). Thus, it becomes possible to establish the proper interpretation of the scores. A PLS model for the total nitrogen (TN) content in multiple Thai fish sauces is built with a set of near-infrared (NIR) transmittance spectra of the fish sauce samples. The kernel analysis of the scores effectively reveals that the variation of the spectral feature induced by the change in protein content is substantially associated with the total water content and the protein hydration. Kernel analysis is also carried out on a set of time-dependent infrared (IR) spectra representing transient evaporation of ethanol from a binary mixture solution of ethanol and oleic acid. A PLS model to predict the elapsed time is built with the IR spectra and the kernel matrix is derived from the scores. The detailed analysis of the kernel matrix provides penetrating insight into the interaction between the ethanol and the oleic acid.
Akimoto, Yuki; Yugi, Katsuyuki; Uda, Shinsuke; Kudo, Takamasa; Komori, Yasunori; Kubota, Hiroyuki; Kuroda, Shinya
2013-01-01
Cells use common signaling molecules for the selective control of downstream gene expression and cell-fate decisions. The relationship between signaling molecules and downstream gene expression and cellular phenotypes is a multiple-input and multiple-output (MIMO) system and is difficult to understand due to its complexity. For example, it has been reported that, in PC12 cells, different types of growth factors activate MAP kinases (MAPKs) including ERK, JNK, and p38, and CREB, for selective protein expression of immediate early genes (IEGs) such as c-FOS, c-JUN, EGR1, JUNB, and FOSB, leading to cell differentiation, proliferation and cell death; however, how multiple-inputs such as MAPKs and CREB regulate multiple-outputs such as expression of the IEGs and cellular phenotypes remains unclear. To address this issue, we employed a statistical method called partial least squares (PLS) regression, which involves a reduction of the dimensionality of the inputs and outputs into latent variables and a linear regression between these latent variables. We measured 1,200 data points for MAPKs and CREB as the inputs and 1,900 data points for IEGs and cellular phenotypes as the outputs, and we constructed the PLS model from these data. The PLS model highlighted the complexity of the MIMO system and growth factor-specific input-output relationships of cell-fate decisions in PC12 cells. Furthermore, to reduce the complexity, we applied a backward elimination method to the PLS regression, in which 60 input variables were reduced to 5 variables, including the phosphorylation of ERK at 10 min, CREB at 5 min and 60 min, AKT at 5 min and JNK at 30 min. The simple PLS model with only 5 input variables demonstrated a predictive ability comparable to that of the full PLS model. The 5 input variables effectively extracted the growth factor-specific simple relationships within the MIMO system in cell-fate decisions in PC12 cells.
Teoh, Shao Thing; Kitamura, Miki; Nakayama, Yasumune; Putri, Sastia; Mukai, Yukio; Fukusaki, Eiichiro
2016-08-01
In recent years, the advent of high-throughput omics technology has made possible a new class of strain engineering approaches, based on identification of possible gene targets for phenotype improvement from omic-level comparison of different strains or growth conditions. Metabolomics, with its focus on the omic level closest to the phenotype, lends itself naturally to this semi-rational methodology. When a quantitative phenotype such as growth rate under stress is considered, regression modeling using multivariate techniques such as partial least squares (PLS) is often used to identify metabolites correlated with the target phenotype. However, linear modeling techniques such as PLS require a consistent metabolite-phenotype trend across the samples, which may not be the case when outliers or multiple conflicting trends are present in the data. To address this, we proposed a data-mining strategy that utilizes random sample consensus (RANSAC) to select subsets of samples with consistent trends for construction of better regression models. By applying a combination of RANSAC and PLS (RANSAC-PLS) to a dataset from a previous study (gas chromatography/mass spectrometry metabolomics data and 1-butanol tolerance of 19 yeast mutant strains), new metabolites were indicated to be correlated with tolerance within certain subsets of the samples. The relevance of these metabolites to 1-butanol tolerance were then validated from single-deletion strains of corresponding metabolic genes. The results showed that RANSAC-PLS is a promising strategy to identify unique metabolites that provide additional hints for phenotype improvement, which could not be detected by traditional PLS modeling using the entire dataset. Copyright © 2016 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.
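The random sample consensus idea described in this abstract can be sketched as follows. This is a minimal illustration that swaps the PLS model for a straight-line fit on invented data; the function names, the threshold and the iteration count are all assumptions, and the authors' RANSAC-PLS procedure is not reproduced.

```python
import random


def fit_line(points):
    """Ordinary least-squares line through (x, y) points; returns (slope, intercept)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    sxy = sum((x - mx) * (y - my) for x, y in points)
    slope = sxy / sxx
    return slope, my - slope * mx


def ransac_line(points, threshold, n_iter=200, seed=0):
    """Keep the largest subset of points consistent with one trend, then refit on it."""
    rng = random.Random(seed)
    best = []
    for _ in range(n_iter):
        (x1, y1), (x2, y2) = rng.sample(points, 2)
        if x1 == x2:  # a vertical pair cannot define y = slope * x + intercept
            continue
        slope = (y2 - y1) / (x2 - x1)
        intercept = y1 - slope * x1
        consensus = [(x, y) for x, y in points
                     if abs(y - (slope * x + intercept)) <= threshold]
        if len(consensus) > len(best):
            best = consensus
    return fit_line(best), best


# Invented data: eight samples follow y = 2x + 1, two are gross outliers.
points = [(x, 2 * x + 1) for x in range(10)]
points[3] = (3, 30)
points[7] = (7, -20)
(slope, intercept), kept = ransac_line(points, threshold=0.5)
```

The final model is fitted only on the consensus subset, which is the property the abstract relies on when outliers or conflicting trends would otherwise distort a regression over all samples.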
NASA Astrophysics Data System (ADS)
Ying, Yibin; Liu, Yande; Fu, Xiaping; Lu, Huishan
2005-11-01
Artificial neural networks (ANNs) have been used successfully in applications such as pattern recognition, image processing, automation and control. However, the majority of today's ANN applications use back-propagation feed-forward ANNs (BP-ANN). In this paper, BP-ANNs were applied to modeling the soluble solid content (SSC) of intact pears from their Fourier transform near-infrared (FT-NIR) spectra. One hundred and sixty-four pear samples were used to build the calibration models and evaluate the models' predictive ability. The results are compared to the classical calibration approaches, i.e. principal component regression (PCR), partial least squares (PLS) and non-linear PLS (NPLS). The effects of the choice of training parameters on the prediction model were also investigated. BP-ANN combined with principal component regression (PCR) always gave better predictive ability than the classical PCR, PLS and Weight-PLS methods. Based on the results, it can be concluded that FT-NIR spectroscopy and BP-ANN models can be properly employed for rapid and nondestructive determination of fruit internal quality.
Impact of multicollinearity on small sample hydrologic regression models
NASA Astrophysics Data System (ADS)
Kroll, Charles N.; Song, Peter
2013-06-01
Often hydrologic regression models are developed with ordinary least squares (OLS) procedures. The use of OLS with highly correlated explanatory variables produces multicollinearity, which creates highly sensitive parameter estimators with inflated variances and improper model selection. It is not clear how to best address multicollinearity in hydrologic regression models. Here a Monte Carlo simulation is developed to compare four techniques to address multicollinearity: OLS, OLS with variance inflation factor screening (VIF), principal component regression (PCR), and partial least squares regression (PLS). The performance of these four techniques was observed for varying sample sizes, correlation coefficients between the explanatory variables, and model error variances consistent with hydrologic regional regression models. The negative effects of multicollinearity are magnified at smaller sample sizes, higher correlations between the variables, and larger model error variances (smaller R2). The Monte Carlo simulation indicates that if the true model is known, multicollinearity is present, and the estimation and statistical testing of regression parameters are of interest, then PCR or PLS should be employed. If the model is unknown, or if the interest is solely in model predictions, it is recommended that OLS be employed, since using more complicated techniques did not produce any improvement in model performance. A leave-one-out cross-validation case study was also performed using low-streamflow data sets from the eastern United States. Results indicate that OLS with stepwise selection generally produces models across study regions with varying levels of multicollinearity that are as good as biased regression techniques such as PCR and PLS.
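The variance inflation factor used for screening can be written as VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on the remaining predictors. The sketch below assumes the simplest case of two predictors, where R_j^2 reduces to the squared Pearson correlation between them; the function names and the data are invented for illustration.

```python
from math import sqrt


def pearson_r(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Invented explanatory variables with correlation 0.8.
x1 = [1, 2, 3, 4, 5]
x2 = [2, 1, 4, 3, 5]
vif = vif_two_predictors(x1, x2)  # 1 / (1 - 0.64), about 2.78
```

A screening rule would drop or combine predictors whose VIF exceeds a chosen cut-off; the cut-off used in the study is not given in the abstract.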
Newman, J; Egan, T; Harbourne, N; O'Riordan, D; Jacquier, J C; O'Sullivan, M
2014-08-01
Sensory evaluation can be problematic for ingredients with a bitter taste during the research and development phase of new food products. In this study, 19 dairy protein hydrolysates (DPH) were analysed by an electronic tongue and by their physicochemical characteristics. The data obtained from these methods were correlated with bitterness intensity as scored by a trained sensory panel, and each model was also assessed for its predictive capability. The physicochemical characteristics of the DPHs investigated were degree of hydrolysis (DH%), and data relating to peptide size and relative hydrophobicity from size exclusion chromatography (SEC) and reverse phase (RP) HPLC. Partial least squares regression (PLS) was used to construct the prediction models. All PLS regressions had good correlations (0.78 to 0.93), with the strongest being the combination of data obtained from SEC and RP HPLC. However, the PLS with the strongest predictive power was based on the e-tongue, which had the PLS regression with the lowest root mean predicted residual error sum of squares (PRESS) in the study. The results show that the PLS models constructed with the e-tongue and with the combination of SEC and RP-HPLC have the potential to be used for predicting bitterness, thus reducing the reliance on sensory analysis of DPHs in future food research. Copyright © 2014 Elsevier B.V. All rights reserved.
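The PRESS statistic mentioned above sums the squared errors made when each sample is predicted by a model fitted without it. The sketch below assumes leave-one-out resampling and uses a one-predictor least-squares line in place of a PLS model to keep it short; `loo_press` and `root_mean_press` are hypothetical names and the data are invented.

```python
from math import sqrt


def loo_press(x, y):
    """Leave-one-out PRESS for a simple least-squares line y = a + b * x."""
    press = 0.0
    for i in range(len(x)):
        xs = x[:i] + x[i + 1:]  # training data without sample i
        ys = y[:i] + y[i + 1:]
        mx = sum(xs) / len(xs)
        my = sum(ys) / len(ys)
        sxx = sum((a - mx) ** 2 for a in xs)
        slope = sum((a - mx) * (b - my) for a, b in zip(xs, ys)) / sxx
        predicted = my + slope * (x[i] - mx)
        press += (y[i] - predicted) ** 2
    return press


def root_mean_press(x, y):
    """Square root of PRESS divided by the number of samples."""
    return sqrt(loo_press(x, y) / len(x))


# Invented data that a straight line cannot fit.
press = loo_press([0, 1, 2], [0, 1, 0])  # held-out errors 2, 1 and 2, so PRESS = 9
```

A lower value indicates better prediction of samples the model has not seen, which is the sense in which the abstract ranks the e-tongue model first.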
Filgueiras, Paulo R; Terra, Luciana A; Castro, Eustáquio V R; Oliveira, Lize M S L; Dias, Júlio C M; Poppi, Ronei J
2015-09-01
This paper aims to estimate the temperature equivalent to 10% (T10%), 50% (T50%) and 90% (T90%) of distilled volume in crude oils using 1H NMR and support vector regression (SVR). Confidence intervals for the predicted values were calculated using a boosting-type ensemble method in a procedure called ensemble support vector regression (eSVR). The estimated confidence intervals obtained by eSVR were compared with previously accepted calculations from partial least squares (PLS) models and a boosting-type ensemble applied in the PLS method (ePLS). By using the proposed boosting strategy, it was possible to identify outliers in the T10% property dataset. The eSVR procedure improved the accuracy of the distillation temperature predictions in relation to standard PLS, ePLS and SVR. For T10%, a root mean square error of prediction (RMSEP) of 11.6°C was obtained in comparison with 15.6°C for PLS, 15.1°C for ePLS and 28.4°C for SVR. The RMSEPs for T50% were 24.2°C, 23.4°C, 22.8°C and 14.4°C for PLS, ePLS, SVR and eSVR, respectively. For T90%, the values of RMSEP were 39.0°C, 39.9°C and 39.9°C for PLS, ePLS, SVR and eSVR, respectively. The confidence intervals calculated by the proposed boosting methodology presented acceptable values for the three properties analyzed; however, they were lower than those calculated by the standard methodology for PLS. Copyright © 2015 Elsevier B.V. All rights reserved.
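The root mean square error of prediction used to compare the models is the square root of the mean squared difference between reference and predicted values on a test set. A minimal sketch follows; the function name `rmsep` and the temperatures are invented and do not come from the paper.

```python
from math import sqrt


def rmsep(reference, predicted):
    """Root mean square error of prediction over paired test-set values."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return sqrt(sum(squared) / len(squared))


# Invented distillation temperatures in degrees Celsius.
error = rmsep([100.0, 200.0], [103.0, 204.0])  # sqrt((9 + 16) / 2)
```

Because the error is expressed in the units of the property itself, values for different methods on the same test set can be compared directly, as the abstract does.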
NASA Astrophysics Data System (ADS)
Li, Lin
2008-12-01
Partial least squares (PLS) regressions were applied to lunar highland and mare soil data characterized by the Lunar Soil Characterization Consortium (LSCC) for spectral estimation of the abundance of lunar soil chemical constituents FeO and Al2O3. The LSCC data set was split into a number of subsets including the total highland, Apollo 16, Apollo 14, and total mare soils, and then PLS was applied to each to investigate the effect of nonlinearity on the performance of the PLS method. The weight-loading vectors resulting from PLS were analyzed to identify mineral species responsible for spectral estimation of the soil chemicals. The results from PLS modeling indicate that the PLS performance depends on the correlation of constituents of interest to their major mineral carriers, and the Apollo 16 soils are responsible for the large errors of FeO and Al2O3 estimates when the soils were modeled along with other types of soils. These large errors are primarily attributed to the degraded correlation of FeO to pyroxene for the relatively mature Apollo 16 soils as a result of space weathering and secondarily to the interference of olivine. PLS consistently yields very accurate fits to the two soil chemicals when applied to mare soils. Although Al2O3 has no spectrally diagnostic characteristics, this chemical can be predicted for all subset data by PLS modeling at high accuracies because of its correlation to FeO. This correlation is reflected in the symmetry of the PLS weight-loading vectors for FeO and Al2O3, which prove to be very useful for qualitative interpretation of the PLS results. However, this qualitative interpretation of PLS modeling cannot be achieved using principal component regression loading vectors.
Balabin, Roman M; Smirnov, Sergey V
2011-04-29
During the past several years, near-infrared (near-IR/NIR) spectroscopy has increasingly been adopted as an analytical tool in various fields from petroleum to biomedical sectors. The NIR spectrum (above 4000 cm-1) of a sample is typically measured by modern instruments at a few hundred wavelengths. Recently, considerable effort has been directed towards developing procedures to identify variables (wavelengths) that contribute useful information. Variable selection (VS) or feature selection, also called frequency selection or wavelength selection, is a critical step in data analysis for vibrational spectroscopy (infrared, Raman, or NIRS). In this paper, we compare the performance of 16 different feature selection methods for the prediction of properties of biodiesel fuel, including density, viscosity, methanol content, and water concentration. The feature selection algorithms tested include stepwise multiple linear regression (MLR-step), interval partial least squares regression (iPLS), backward iPLS (BiPLS), forward iPLS (FiPLS), moving window partial least squares regression (MWPLS), (modified) changeable size moving window partial least squares (CSMWPLS/MCSMWPLSR), searching combination moving window partial least squares (SCMWPLS), successive projections algorithm (SPA), uninformative variable elimination (UVE, including UVE-SPA), simulated annealing (SA), back-propagation artificial neural networks (BP-ANN), Kohonen artificial neural network (K-ANN), and genetic algorithms (GAs, including GA-iPLS). Two linear techniques for calibration model building, namely multiple linear regression (MLR) and partial least squares regression/projection to latent structures (PLS/PLSR), are used for the evaluation of biofuel properties. A comparison with a non-linear calibration model, artificial neural networks (ANN-MLP), is also provided. Discussion of gasoline, ethanol-gasoline (bioethanol), and diesel fuel data is presented.
The results of applying other spectroscopic techniques, such as Raman, ultraviolet-visible (UV-vis), or nuclear magnetic resonance (NMR) spectroscopy, can also be greatly improved by an appropriate choice of feature selection method. Copyright © 2011 Elsevier B.V. All rights reserved.
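Interval-based methods such as iPLS start by partitioning the wavelength axis into contiguous windows and building one local model per window. The sketch below shows only that partitioning step, assuming near-equal interval widths with any remainder given to the first intervals; `split_intervals` is a hypothetical helper and the paper's own interval settings are not stated in the abstract.

```python
def split_intervals(n_variables, n_intervals):
    """Return (start, stop) index pairs covering range(n_variables) contiguously."""
    if not 1 <= n_intervals <= n_variables:
        raise ValueError("need 1 <= n_intervals <= n_variables")
    base, extra = divmod(n_variables, n_intervals)
    bounds, start = [], 0
    for k in range(n_intervals):
        width = base + (1 if k < extra else 0)  # spread the remainder over early windows
        bounds.append((start, start + width))
        start += width
    return bounds


# Ten spectral variables split into three windows.
windows = split_intervals(10, 3)  # [(0, 4), (4, 7), (7, 10)]
```

Each `(start, stop)` pair can be used to slice the columns of a spectral matrix; the local models are then compared, typically by cross-validated error, to decide which windows to keep.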
NASA Astrophysics Data System (ADS)
Talebpour, Zahra; Tavallaie, Roya; Ahmadi, Seyyed Hamid; Abdollahpour, Assem
2010-09-01
In this study, a new method for the simultaneous determination of penicillin G salts in a pharmaceutical mixture via FT-IR spectroscopy combined with chemometrics was investigated. The mixture of penicillin G salts is a complex system because of the similar analytical characteristics of its components. Partial least squares (PLS) and radial basis function-partial least squares (RBF-PLS) were used to develop the linear and nonlinear relations between spectra and components, respectively. The orthogonal signal correction (OSC) preprocessing method was used to correct unexpected information, such as spectral overlapping and scattering effects. In order to compare the influence of OSC on PLS and RBF-PLS models, the optimal linear (PLS) and nonlinear (RBF-PLS) models based on conventional and OSC-preprocessed spectra were established and compared. The results demonstrated that OSC clearly enhanced the performance of both the RBF-PLS and PLS calibration models. In addition, where the relation between spectra and component was somewhat nonlinear, the OSC-RBF-PLS model gave more satisfactory results than the OSC-PLS model, which indicated that OSC helped to remove extrinsic deviations from linearity without eliminating nonlinear information related to the component. The chemometric models were tested on an external dataset and finally applied to the analysis of a commercial injection product of penicillin G salts.
Seasonal forecasting of high wind speeds over Western Europe
NASA Astrophysics Data System (ADS)
Palutikof, J. P.; Holt, T.
2003-04-01
As financial losses associated with extreme weather events escalate, there is interest from end users in the forestry and insurance industries, for example, in the development of seasonal forecasting models with a long lead time. This study uses exceedances of the 90th, 95th, and 99th percentiles of daily maximum wind speed over the period 1958 to present to derive predictands of winter wind extremes. The source data is the 6-hourly NCEP Reanalysis gridded surface wind field. Predictor variables include principal components of Atlantic sea surface temperature and several indices of climate variability, including the NAO and SOI. Lead times of up to a year are considered, in monthly increments. Three regression techniques are evaluated: multiple linear regression (MLR), principal component regression (PCR), and partial least squares regression (PLS). PCR and PLS proved considerably superior to MLR, with much lower standard errors. PLS was chosen to formulate the predictive model since it offers more flexibility in experimental design and gave slightly better results than PCR. The results indicate that winter windiness can be predicted with considerable skill one year ahead for much of coastal Europe, but that this deteriorates rapidly in the hinterland. The experiment succeeded in highlighting PLS as a very useful method for developing more precise forecasting models, and in identifying areas of high predictability.
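Counting exceedances of a percentile threshold, the quantity from which the predictands are derived, can be sketched as below. The example assumes the nearest-rank definition of a percentile and integer percentile values; the study's exact percentile convention is not given in the abstract, and the function names and wind speeds are invented.

```python
def nearest_rank_percentile(values, percent):
    """Nearest-rank percentile for an integer `percent` between 1 and 100."""
    ordered = sorted(values)
    # Ceiling of percent * n / 100 in integer arithmetic avoids rounding surprises.
    rank = (percent * len(ordered) + 99) // 100
    return ordered[rank - 1]


def count_exceedances(values, percent):
    """Number of values strictly above the chosen percentile threshold."""
    threshold = nearest_rank_percentile(values, percent)
    return sum(1 for v in values if v > threshold)


# Invented daily maximum wind speeds: 1, 2, ..., 100 (arbitrary units).
speeds = list(range(1, 101))
counts = {p: count_exceedances(speeds, p) for p in (90, 95, 99)}
```

In a seasonal setting the threshold would be fixed from a long reference period and the exceedances counted per winter, giving one predictand value per year and grid point.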
Xie, Chuanqi; He, Yong
2016-01-01
This study was carried out to use hyperspectral imaging technique for determining color (L*, a* and b*) and eggshell strength and identifying cracked chicken eggs. Partial least squares (PLS) models based on full and selected wavelengths suggested by regression coefficient (RC) method were established to predict the four parameters, respectively. Partial least squares-discriminant analysis (PLS-DA) and RC-partial least squares-discriminant analysis (RC-PLS-DA) models were applied to identify cracked eggs. PLS models performed well with the correlation coefficient (rp) of 0.788 for L*, 0.810 for a*, 0.766 for b* and 0.835 for eggshell strength. RC-PLS models also obtained the rp of 0.771 for L*, 0.806 for a*, 0.767 for b* and 0.841 for eggshell strength. The classification results were 97.06% in PLS-DA model and 88.24% in RC-PLS-DA model. It demonstrated that hyperspectral imaging technique has the potential to be used to detect color and eggshell strength values and identify cracked chicken eggs. PMID:26882990
Yu, Peigen; Low, Mei Yin; Zhou, Weibiao
2018-01-01
In order to develop products that would be preferred by consumers, the effects of the chemical compositions of ready-to-drink green tea beverages on consumer liking were studied through regression analyses. Green tea model systems were prepared by dosing solutions of 0.1% green tea extract with differing concentrations of eight flavour keys deemed to be important for green tea aroma and taste, based on a D-optimal experimental design, before undergoing commercial sterilisation. Sensory evaluation of the green tea model system was carried out using an untrained consumer panel to obtain hedonic liking scores of the samples. Regression models were subsequently trained to objectively predict the consumer liking scores of the green tea model systems. A linear partial least squares (PLS) regression model was developed to describe the effects of the eight flavour keys on consumer liking, with a coefficient of determination (R²) of 0.733, and a root-mean-square error (RMSE) of 3.53%. The PLS model was further augmented with an artificial neural network (ANN) to establish a PLS-ANN hybrid model. The established hybrid model was found to give a better prediction of consumer liking scores, based on its R² (0.875) and RMSE (2.41%). Copyright © 2017 Elsevier Ltd. All rights reserved.
Koch, Cosima; Posch, Andreas E; Goicoechea, Héctor C; Herwig, Christoph; Lendl, Bernhard
2014-01-07
This paper presents the quantification of Penicillin V and phenoxyacetic acid, a precursor, inline during Penicillium chrysogenum fermentations by FTIR spectroscopy and partial least squares (PLS) regression and multivariate curve resolution - alternating least squares (MCR-ALS). First, the applicability of an attenuated total reflection FTIR fiber optic probe was assessed offline by measuring standards of the analytes of interest and investigating matrix effects of the fermentation broth. Then measurements were performed inline during four fed-batch fermentations with online HPLC for the determination of Penicillin V and phenoxyacetic acid as reference analysis. PLS and MCR-ALS models were built using these data and validated by comparison of single analyte spectra with the selectivity ratio of the PLS models and the extracted spectral traces of the MCR-ALS models, respectively. The achieved root mean square errors of cross-validation for the PLS regressions were 0.22 g L⁻¹ for Penicillin V and 0.32 g L⁻¹ for phenoxyacetic acid and the root mean square errors of prediction for MCR-ALS were 0.23 g L⁻¹ for Penicillin V and 0.15 g L⁻¹ for phenoxyacetic acid. A general work-flow for building and assessing chemometric regression models for the quantification of multiple analytes in bioprocesses by FTIR spectroscopy is given. Copyright © 2013 The Authors. Published by Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Gholizadeh, H.; Robeson, S. M.
2015-12-01
Empirical models have been widely used to estimate global chlorophyll content from remotely sensed data. Here, we focus on the standard NASA empirical models that use blue-green band ratios. These band ratio ocean color (OC) algorithms are in the form of fourth-order polynomials and the parameters of these polynomials (i.e. coefficients) are estimated from the NASA bio-Optical Marine Algorithm Data set (NOMAD). Most of the points in this data set have been sampled from tropical and temperate regions. However, polynomial coefficients obtained from this data set are used to estimate chlorophyll content in all ocean regions with different properties such as sea-surface temperature, salinity, and downwelling/upwelling patterns. Further, the polynomial terms in these models are highly correlated. In sum, the limitations of these empirical models are as follows: 1) the independent variables within the empirical models, in their current form, are correlated (multicollinear), and 2) current algorithms are global approaches and are based on the spatial stationarity assumption, so they are independent of location. The multicollinearity problem is resolved by using partial least squares (PLS). PLS, which transforms the data into a set of independent components, can be considered as a combined form of principal component regression (PCR) and multiple regression. Geographically weighted regression (GWR) is also used to investigate the validity of the spatial stationarity assumption. GWR solves a regression model over each sample point by using the observations within its neighbourhood. PLS results show that the empirical method underestimates chlorophyll content in high latitudes, including the Southern Ocean region, when compared to PLS (see Figure 1). Cluster analysis of GWR coefficients also shows that the spatial stationarity assumption in empirical models is not likely a valid assumption.
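The description of PLS as a projection onto components followed by regression can be made concrete with a minimal sketch of the textbook single-response NIPALS algorithm (PLS1). This is not the authors' implementation; the function names `pls1_fit` and `pls1_predict` and the toy data are assumptions made for illustration.

```python
def pls1_fit(X, y, n_components):
    """Fit PLS1 by NIPALS. X is a list of rows, y a list of floats."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centred predictors
    f = [v - y_mean for v in y]                                # centred response
    components = []
    for _ in range(n_components):
        # Weight vector: direction in predictor space of maximal covariance with f.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = sum(v * v for v in w) ** 0.5
        if norm < 1e-12:
            break  # no covariance left to explain
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]       # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]  # X loadings
        q = sum(f[i] * t[i] for i in range(n)) / tt                         # y loading
        # Deflate predictors and response so the next component is orthogonal.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
    return x_mean, y_mean, components

def pls1_predict(model, x):
    """Predict one sample by repeating the score and deflation steps."""
    x_mean, y_mean, components = model
    e = [x[j] - x_mean[j] for j in range(len(x))]
    y_hat = y_mean
    for w, p, q in components:
        t = sum(e[j] * w[j] for j in range(len(e)))
        y_hat += q * t
        e = [e[j] - t * p[j] for j in range(len(e))]
    return y_hat

# Toy data (hypothetical): y = 2*x1 - x2 + 1 exactly.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 7.0]]
y = [1.0, 4.0, 3.0, 6.0, 4.0]
model = pls1_fit(X, y, n_components=2)
print(pls1_predict(model, [6.0, 5.0]))
```

Because the scores of successive components are mutually orthogonal, the regression on them stays stable even when the original predictors (for example, powers of the same band ratio) are strongly correlated. With as many components as independent predictors, the fit coincides with ordinary least squares; fewer components give a regularised model.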
da Silva, Fabiana E B; Flores, Érico M M; Parisotto, Graciele; Müller, Edson I; Ferrão, Marco F
2016-03-01
An alternative method for the quantification of sulphamethoxazole (SMZ) and trimethoprim (TMP) using diffuse reflectance infrared Fourier-transform spectroscopy (DRIFTS) and partial least square regression (PLS) was developed. Interval Partial Least Square (iPLS) and Synergy Partial Least Square (siPLS) were applied to select a spectral range that provided the lowest prediction error in comparison to the full-spectrum model. Fifteen commercial tablet formulations and forty-nine synthetic samples were used. The ranges of concentration considered were 400 to 900 mg g⁻¹ SMZ and 80 to 240 mg g⁻¹ TMP. Spectral data were recorded between 600 and 4000 cm⁻¹ with a 4 cm⁻¹ resolution by DRIFTS. The proposed procedure was compared to high performance liquid chromatography (HPLC). The results obtained from the root mean square error of prediction (RMSEP), during the validation of the models for samples of SMZ and TMP using siPLS, demonstrate that this approach is a valid technique for use in quantitative analysis of pharmaceutical formulations. The selected interval algorithm allowed building regression models with minor errors when compared to the full spectrum PLS model. An RMSEP of 13.03 mg g⁻¹ for SMZ and 4.88 mg g⁻¹ for TMP was obtained after selection of the best spectral regions by siPLS.
Wang, Yonghua; Li, Yan; Wang, Bin
2007-01-01
Nicotine and a variety of other drugs and toxins are metabolized by cytochrome P450 (CYP) 2A6. The aim of the present study was to build a quantitative structure-activity relationship (QSAR) model to predict the activities of nicotine analogues on CYP2A6. Kernel partial least squares (K-PLS) regression was employed with the electro-topological descriptors to build the computational models. Both the internal and external predictabilities of the models were evaluated with test sets to ensure their validity and reliability. As a comparison to K-PLS, a standard PLS algorithm was also applied on the same training and test sets. Our results show that the K-PLS produced reasonable results that outperformed the PLS model on the datasets. The obtained K-PLS model will be helpful for the design of novel nicotine-like selective CYP2A6 inhibitors.
Balss, Karin M; Long, Frederick H; Veselov, Vladimir; Orana, Argjenta; Akerman-Revis, Eugena; Papandreou, George; Maryanoff, Cynthia A
2008-07-01
Multivariate data analysis was applied to confocal Raman measurements on stents coated with the polymers and drug used in the CYPHER Sirolimus-eluting Coronary Stents. Partial least-squares (PLS) regression was used to establish three independent calibration curves for the coating constituents: sirolimus, poly(n-butyl methacrylate) [PBMA], and poly(ethylene-co-vinyl acetate) [PEVA]. The PLS calibrations were based on average spectra generated from each spatial location profiled. The PLS models were tested on six unknown stent samples to assess accuracy and precision. The wt % difference between PLS predictions and laboratory assay values for sirolimus was less than 1 wt % for the composite of the six unknowns, while the polymer models were estimated to be less than 0.5 wt % difference for the combined samples. The linearity and specificity of the three PLS models were also demonstrated. In contrast to earlier univariate models, the PLS models achieved mass balance with better accuracy. This analysis was extended to evaluate the spatial distribution of the three constituents. Quantitative bitmap images of drug-eluting stent coatings are presented for the first time to assess the local distribution of components.
Francisco, Fabiane Lacerda; Saviano, Alessandro Morais; Almeida, Túlia de Souza Botelho; Lourenço, Felipe Rebello
2016-05-01
Microbiological assays are widely used to estimate the relative potencies of antibiotics in order to guarantee the efficacy, safety, and quality of drug products. Despite the advantages of turbidimetric bioassays when compared to other methods, they have limitations concerning the linearity and range of the dose-response curve determination. Here, we proposed to use partial least squares (PLS) regression to solve these limitations and to improve the prediction of relative potencies of antibiotics. Kinetic-reading microplate turbidimetric bioassays for apramycin and vancomycin were performed using Escherichia coli (ATCC 8739) and Bacillus subtilis (ATCC 6633), respectively. Microbial growths were measured as absorbance up to 180 and 300 min for apramycin and vancomycin turbidimetric bioassays, respectively. Conventional dose-response curves (absorbances or area under the microbial growth curve vs. log of antibiotic concentration) showed significant regression; however, there was significant deviation from linearity. Thus, they could not be used for relative potency estimations. PLS regression allowed us to construct a predictive model for estimating the relative potencies of apramycin and vancomycin without over-fitting and it improved the linear range of turbidimetric bioassay. In addition, PLS regression provided predictions of relative potencies equivalent to those obtained from agar diffusion official methods. Therefore, we conclude that PLS regression may be used to estimate the relative potencies of antibiotics with significant advantages when compared to conventional dose-response curve determination. Copyright © 2016 Elsevier B.V. All rights reserved.
Li, Yi; Tseng, Yufeng J.; Pan, Dahua; Liu, Jianzhong; Kern, Petra S.; Gerberick, G. Frank; Hopfinger, Anton J.
2008-01-01
Currently, the only validated methods to identify skin sensitization effects are in vivo models, such as the Local Lymph Node Assay (LLNA) and guinea pig studies. There is a tremendous need, in particular due to novel legislation, to develop animal alternatives, e.g. Quantitative Structure-Activity Relationship (QSAR) models. Here, QSAR models for skin sensitization using LLNA data have been constructed. The descriptors used to generate these models are derived from the 4D-molecular similarity paradigm and are referred to as universal 4D-fingerprints. A training set of 132 structurally diverse compounds and a test set of 15 structurally diverse compounds were used in this study. The statistical methodologies used to build the models are logistic regression (LR) and partial least square coupled logistic regression (PLS-LR), which prove to be effective tools for studying skin sensitization measures expressed in the two categorical terms of sensitizer and non-sensitizer. QSAR models with low values of the Hosmer-Lemeshow goodness-of-fit statistic, χ²HL, are significant and predictive. For the training set, the cross-validated prediction accuracy of the logistic regression models ranges from 77.3% to 78.0%, while that of PLS-logistic regression models ranges from 87.1% to 89.4%. For the test set, the prediction accuracy of logistic regression models ranges from 80.0% to 86.7%, while that of PLS-logistic regression models ranges from 73.3% to 80.0%. The QSAR models are made up of 4D-fingerprints related to aromatic atoms, hydrogen bond acceptors and negatively partially charged atoms. PMID:17226934
Divya, O; Mishra, Ashok K
2007-05-29
Quantitative determination of kerosene fraction present in diesel has been carried out based on excitation emission matrix fluorescence (EEMF) along with parallel factor analysis (PARAFAC) and N-way partial least squares regression (N-PLS). EEMF is a simple, sensitive and nondestructive method suitable for the analysis of multifluorophoric mixtures. Calibration models consisting of varying compositions of diesel and kerosene were constructed and their validation was carried out using leave-one-out cross validation method. The accuracy of the model was evaluated through the root mean square error of prediction (RMSEP) for the PARAFAC, N-PLS and unfold PLS methods. N-PLS was found to be a better method compared to PARAFAC and unfold PLS method because of its low RMSEP values.
Lee, Soo Yee; Mediani, Ahmed; Maulidiani, Maulidiani; Khatib, Alfi; Ismail, Intan Safinar; Zawawi, Norhasnida; Abas, Faridah
2018-01-01
Neptunia oleracea is a plant consumed as a vegetable and which has been used as a folk remedy for several diseases. Herein, two regression models (partial least squares, PLS; and random forest, RF) in a metabolomics approach were compared and applied to the evaluation of the relationship between phenolics and bioactivities of N. oleracea. In addition, the effects of different extraction conditions on the phenolic constituents were assessed by pattern recognition analysis. Comparison of the PLS and RF showed that RF exhibited poorer generalization and hence poorer predictive performance. Both the regression coefficient of PLS and the variable importance of RF revealed that quercetin and kaempferol derivatives, caffeic acid and vitexin-2-O-rhamnoside were significant towards the tested bioactivities. Furthermore, principal component analysis (PCA) and partial least squares-discriminant analysis (PLS-DA) results showed that sonication and absolute ethanol are the preferable extraction method and ethanol ratio, respectively, to produce N. oleracea extracts with high phenolic levels and therefore high DPPH scavenging and α-glucosidase inhibitory activities. Both PLS and RF are useful regression models in metabolomics studies. This work provides insight into the performance of different multivariate data analysis tools and the effects of different extraction conditions on the extraction of desired phenolics from plants. © 2017 Society of Chemical Industry.
NASA Astrophysics Data System (ADS)
Liu, Yande; Ying, Yibin; Lu, Huishan; Fu, Xiaping
2004-12-01
This work evaluates the feasibility of Fourier transform near infrared (FT-NIR) spectrometry for rapidly determining the total soluble solids content and acidity of apple fruit. Intact apple fruit were measured by reflectance FT-NIR in the 800-2500 nm range. FT-NIR models were developed based on partial least square (PLS) regression and principal component regression (PCR) with respect to the reflectance and its first derivative, the logarithms of the reflectance reciprocal and its second derivative. The above regression models related the FT-NIR spectra to soluble solids content (SSC), titratable acidity (TA) and available acidity (pH). The best combination, based on the prediction results, was PLS models with respect to the logarithms of the reflectance reciprocal. Predictions with PLS models resulted in standard errors of prediction (SEP) of 0.455, 0.044 and 0.068, and correlation coefficients of 0.968, 0.728 and 0.831 for SSC, TA and pH, respectively. It was concluded that by using the FT-NIR spectrometry measurement system, in the appropriate spectral range, it is possible to nondestructively assess the maturity factors of apple fruit.
Noninvasive and fast measurement of blood glucose in vivo by near infrared (NIR) spectroscopy
NASA Astrophysics Data System (ADS)
Jintao, Xue; Liming, Ye; Yufei, Liu; Chunyan, Li; Han, Chen
2017-05-01
This research aimed to develop a method for noninvasive and fast blood glucose assay in vivo. Near-infrared (NIR) spectroscopy, a more promising technique compared to other methods, was investigated in rats with diabetes and normal rats. Calibration models were generated by two different multivariate strategies: partial least squares (PLS) as a linear regression method and artificial neural networks (ANN) as a non-linear regression method. The PLS model was optimized individually by considering spectral range, spectral pretreatment methods and number of model factors, while the ANN model was studied individually by selecting spectral pretreatment methods, parameters of network topology, number of hidden neurons, and number of epochs. The results of the validation showed the two models were robust, accurate and repeatable. Compared to the ANN model, the performance of the PLS model was much better, with a lower root mean square error of prediction (RMSEP) of 0.419 and a higher correlation coefficient (R) of 96.22%.
NASA Astrophysics Data System (ADS)
Jiang, Hui; Liu, Guohai; Mei, Congli; Yu, Shuang; Xiao, Xiahong; Ding, Yuhan
2012-11-01
The feasibility of rapid determination of the process variables (i.e. pH and moisture content) in solid-state fermentation (SSF) of wheat straw using Fourier transform near infrared (FT-NIR) spectroscopy was studied. Synergy interval partial least squares (siPLS) algorithm was implemented to calibrate regression model. The number of PLS factors and the number of subintervals were optimized simultaneously by cross-validation. The performance of the prediction model was evaluated according to the root mean square error of cross-validation (RMSECV), the root mean square error of prediction (RMSEP) and the correlation coefficient (R). The measurement results of the optimal model were obtained as follows: RMSECV = 0.0776, Rc = 0.9777, RMSEP = 0.0963, and Rp = 0.9686 for pH model; RMSECV = 1.3544% w/w, Rc = 0.8871, RMSEP = 1.4946% w/w, and Rp = 0.8684 for moisture content model. Finally, compared with classic PLS and iPLS models, the siPLS model revealed its superior performance. The overall results demonstrate that FT-NIR spectroscopy combined with siPLS algorithm can be used to measure process variables in solid-state fermentation of wheat straw, and NIR spectroscopy technique has a potential to be utilized in SSF industry.
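The figures of merit quoted above share one formula each; RMSECV and RMSEP differ only in which predictions are supplied (cross-validated predictions on calibration samples versus predictions on an independent set). The sketch below is a minimal illustration that assumes division by the number of samples n; some texts apply a degrees-of-freedom or bias correction instead. The function names and numbers are hypothetical.

```python
import math

def rmse(reference, predicted):
    """Root mean square error between reference and predicted values."""
    n = len(reference)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)

def pearson_r(reference, predicted):
    """Pearson correlation coefficient between reference and predicted values."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_p = sum(predicted) / n
    cov = sum((r - mean_r) * (p - mean_p) for r, p in zip(reference, predicted))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_r * var_p)

# Hypothetical pH reference values and model predictions, not data from the study.
ph_reference = [4.8, 5.1, 5.6, 6.0, 6.3]
ph_predicted = [4.9, 5.0, 5.7, 5.9, 6.4]
print(rmse(ph_reference, ph_predicted), pearson_r(ph_reference, ph_predicted))
```

Passing cross-validated predictions yields RMSECV and Rc; passing predictions for a held-out set yields RMSEP and Rp.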
Borràs, Eva; Ferré, Joan; Boqué, Ricard; Mestres, Montserrat; Aceña, Laura; Calvo, Angels; Busto, Olga
2016-08-01
Headspace-Mass Spectrometry (HS-MS), Fourier Transform Mid-Infrared spectroscopy (FT-MIR) and UV-Visible spectrophotometry (UV-vis) instrumental responses have been combined to predict virgin olive oil sensory descriptors. 343 olive oil samples analyzed during four consecutive harvests (2010-2014) were used to build multivariate calibration models using partial least squares (PLS) regression. The reference values of the sensory attributes were provided by expert assessors from an official taste panel. The instrumental data were modeled individually and also using data fusion approaches. The use of fused data with both low- and mid-level of abstraction improved PLS predictions for all the olive oil descriptors. The best PLS models were obtained for two positive attributes (fruity and bitter) and two defective descriptors (fusty and musty), all of them using data fusion of MS and MIR spectral fingerprints. Although good predictions were not obtained for some sensory descriptors, the results are encouraging, especially considering that the legal categorization of virgin olive oils only requires the determination of fruity and defective descriptors. Copyright © 2016 Elsevier B.V. All rights reserved.
Martelo-Vidal, M J; Vázquez, M
2014-09-01
Spectral analysis is a quick and non-destructive method to analyse wine. In this work, trans-resveratrol, oenin, malvin, catechin, epicatechin, quercetin and syringic acid were determined in commercial red wines from DO Rías Baixas and DO Ribeira Sacra (Spain) by UV-VIS-NIR spectroscopy. Calibration models were developed using principal component regression (PCR) or partial least squares (PLS) regression. HPLC was used as the reference method. The results showed that reliable PLS models were obtained to quantify all polyphenols for Rías Baixas wines. For Ribeira Sacra, feasible models were obtained to determine quercetin, epicatechin, oenin and syringic acid. PCR calibration models showed worse prediction reliability than PLS models. For red wines from mencía grapes, feasible models were obtained for catechin and oenin, regardless of the geographical origin. The results obtained demonstrate that UV-VIS-NIR spectroscopy can be used to determine individual polyphenolic compounds in red wines. Copyright © 2014 Elsevier Ltd. All rights reserved.
Kuligowski, Julia; Carrión, David; Quintás, Guillermo; Garrigues, Salvador; de la Guardia, Miguel
2011-01-01
The selection of an appropriate calibration set is a critical step in multivariate method development. In this work, the effect of using different calibration sets, based on a previous classification of unknown samples, on the partial least squares (PLS) regression model performance has been discussed. As an example, attenuated total reflection (ATR) mid-infrared spectra of deep-fried vegetable oil samples from three botanical origins (olive, sunflower, and corn oil), with increasing polymerized triacylglyceride (PTG) content induced by a deep-frying process, were employed. The use of a one-class-classifier partial least squares-discriminant analysis (PLS-DA) and a rooted binary directed acyclic graph tree provided accurate oil classification. Oil samples fried without foodstuff could be classified correctly, independent of their PTG content. However, class separation of oil samples fried with foodstuff was less evident. The combined use of double-cross model validation with permutation testing was used to validate the obtained PLS-DA classification models, confirming the results. To discuss the usefulness of the selection of an appropriate PLS calibration set, the PTG content was determined by calculating a PLS model based on the previously selected classes. In comparison to a PLS model calculated using a pooled calibration set containing samples from all classes, the root mean square error of prediction could be improved significantly using PLS models based on the selected calibration sets using PLS-DA, ranging between 1.06 and 2.91% (w/w).
Jiang, Hui; Liu, Guohai; Mei, Congli; Yu, Shuang; Xiao, Xiahong; Ding, Yuhan
2012-11-01
The feasibility of rapid determination of the process variables (i.e. pH and moisture content) in solid-state fermentation (SSF) of wheat straw using Fourier transform near infrared (FT-NIR) spectroscopy was studied. Synergy interval partial least squares (siPLS) algorithm was implemented to calibrate regression model. The number of PLS factors and the number of subintervals were optimized simultaneously by cross-validation. The performance of the prediction model was evaluated according to the root mean square error of cross-validation (RMSECV), the root mean square error of prediction (RMSEP) and the correlation coefficient (R). The measurement results of the optimal model were obtained as follows: RMSECV=0.0776, R(c)=0.9777, RMSEP=0.0963, and R(p)=0.9686 for pH model; RMSECV=1.3544% w/w, R(c)=0.8871, RMSEP=1.4946% w/w, and R(p)=0.8684 for moisture content model. Finally, compared with classic PLS and iPLS models, the siPLS model revealed its superior performance. The overall results demonstrate that FT-NIR spectroscopy combined with siPLS algorithm can be used to measure process variables in solid-state fermentation of wheat straw, and NIR spectroscopy technique has a potential to be utilized in SSF industry. Copyright © 2012 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Anderson, R. B.; Clegg, S. M.; Frydenvang, J.
2015-12-01
One of the primary challenges faced by the ChemCam instrument on the Curiosity Mars rover is developing a regression model that can accurately predict the composition of the wide range of target types encountered (basalts, calcium sulfate, feldspar, oxides, etc.). The original calibration used 69 rock standards to train a partial least squares (PLS) model for each major element. By expanding the suite of calibration samples to >400 targets spanning a wider range of compositions, the accuracy of the model was improved, but some targets with "extreme" compositions (e.g. pure minerals) were still poorly predicted. We have therefore developed a simple method, referred to as "submodel PLS", to improve the performance of PLS across a wide range of target compositions. In addition to generating a "full" (0-100 wt.%) PLS model for the element of interest, we also generate several overlapping submodels (e.g. for SiO2, we generate "low" (0-50 wt.%), "mid" (30-70 wt.%), and "high" (60-100 wt.%) models). The submodels are generally more accurate than the "full" model for samples within their range because they are able to adjust for matrix effects that are specific to that range. To predict the composition of an unknown target, we first predict the composition with the submodels and the "full" model. Then, based on the predicted composition from the "full" model, the appropriate submodel prediction can be used (e.g. if the full model predicts a low composition, use the "low" model result, which is likely to be more accurate). For samples with "full" predictions that occur in a region of overlap between submodels, the submodel predictions are "blended" using a simple linear weighted sum. The submodel PLS method shows improvements in most of the major elements predicted by ChemCam and reduces the occurrence of negative predictions for low wt.% targets. Submodel PLS is currently being used in conjunction with ICA regression for the major element compositions of ChemCam data.
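The selection and blending step described above can be sketched as follows, using the SiO2 ranges quoted in the abstract ("low" 0-50, "mid" 30-70, "high" 60-100 wt.%). The abstract states only that overlapping predictions are blended by "a simple linear weighted sum"; the specific weights below, which ramp linearly across each overlap according to the "full" model prediction, are an assumption, as are the function names.

```python
def blend(lower_pred, upper_pred, full_pred, overlap_start, overlap_end):
    """Linearly weight two submodel predictions across an overlap region."""
    weight_upper = (full_pred - overlap_start) / (overlap_end - overlap_start)
    weight_upper = min(1.0, max(0.0, weight_upper))  # clamp to [0, 1]
    return (1.0 - weight_upper) * lower_pred + weight_upper * upper_pred

def submodel_prediction(full_pred, low_pred, mid_pred, high_pred):
    """Pick or blend submodel predictions based on the full-model prediction.

    Ranges follow the SiO2 example: low 0-50, mid 30-70, high 60-100 wt.%.
    """
    if full_pred < 30:
        return low_pred                 # only the low submodel covers this range
    if full_pred <= 50:
        return blend(low_pred, mid_pred, full_pred, 30, 50)   # low/mid overlap
    if full_pred < 60:
        return mid_pred                 # only the mid submodel covers this range
    if full_pred <= 70:
        return blend(mid_pred, high_pred, full_pred, 60, 70)  # mid/high overlap
    return high_pred                    # only the high submodel covers this range

# Hypothetical predictions (wt.% SiO2) from the full, low, mid and high models.
print(submodel_prediction(full_pred=40.0, low_pred=38.0, mid_pred=42.0, high_pred=99.0))
```

Using the "full" prediction only to choose weights, rather than as the reported value, keeps the final estimate tied to the submodel trained on the most relevant composition range.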
Luoma, Pekka; Natschläger, Thomas; Malli, Birgit; Pawliczek, Marcin; Brandstetter, Markus
2018-05-12
A model recalibration method based on additive Partial Least Squares (PLS) regression is generalized for multi-adjustment scenarios of independent variance sources (referred to as additive PLS - aPLS). aPLS allows for effortless model readjustment under changing measurement conditions and the combination of independent variance sources with the initial model by means of additive modelling. We demonstrate these distinguishing features on two NIR spectroscopic case-studies. In case study 1 aPLS was used as a readjustment method for an emerging offset. The achieved RMS error of prediction (1.91 a.u.) was of similar level as before the offset occurred (2.11 a.u.). In case-study 2 a calibration combining different variance sources was conducted. The achieved performance was of sufficient level with an absolute error being better than 0.8% of the mean concentration, therefore being able to compensate negative effects of two independent variance sources. The presented results show the applicability of the aPLS approach. The main advantages of the method are that the original model stays unadjusted and that the modelling is conducted on concrete changes in the spectra thus supporting efficient (in most cases straightforward) modelling. Additionally, the method is put into context of existing machine learning algorithms. Copyright © 2018 Elsevier B.V. All rights reserved.
Hacisalihoglu, Gokhan; Larbi, Bismark; Settles, A Mark
2010-01-27
The objective of this study was to explore the potential of near-infrared reflectance (NIR) spectroscopy to determine individual seed composition in common bean (Phaseolus vulgaris L.). NIR spectra and analytical measurements of seed weight, protein, and starch were collected from 267 individual bean seeds representing 91 diverse genotypes. Partial least-squares (PLS) regression models were developed with 61 bean accessions randomly assigned to a calibration data set and 30 accessions assigned to an external validation set. Protein gave the most accurate PLS regression, with the external validation set having a standard error of prediction (SEP) = 1.6%. PLS regressions for seed weight and starch had sufficient accuracy for seed sorting applications, with SEP = 41.2 mg and 4.9%, respectively. Seed color had a clear effect on the NIR spectra, with black beans having a distinct spectral type. Seed coat color did not impact the accuracy of PLS predictions. This research demonstrates that NIR is a promising technique for simultaneous sorting of multiple seed traits in single bean seeds with no sample preparation.
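Assigning whole accessions, rather than individual seeds, to the calibration and validation sets keeps seeds of one genotype from appearing on both sides of the split. A minimal sketch of such a grouped random split follows; the function name, the seed value, the accession labels, and the three-seeds-per-accession layout are hypothetical and are not taken from the study.

```python
import random

def split_by_accession(accession_ids, n_validation, seed=0):
    """Randomly assign whole accessions to calibration or validation."""
    unique = sorted(set(accession_ids))   # sort first so the shuffle is reproducible
    rng = random.Random(seed)
    rng.shuffle(unique)
    validation = set(unique[:n_validation])
    calibration = set(unique[n_validation:])
    return calibration, validation

# Hypothetical labels: 91 accessions with three seeds each (the study had 267 seeds).
accessions = [f"acc{i:02d}" for i in range(91)]
seed_labels = [a for a in accessions for _ in range(3)]

cal, val = split_by_accession(seed_labels, n_validation=30)
cal_rows = [i for i, a in enumerate(seed_labels) if a in cal]
val_rows = [i for i, a in enumerate(seed_labels) if a in val]
print(len(cal), len(val), len(cal_rows), len(val_rows))
```

The row indices can then be used to slice the spectra and reference values so that the external validation error reflects genotypes the model has never seen.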
Towards molecular design using 2D-molecular contour maps obtained from PLS regression coefficients
NASA Astrophysics Data System (ADS)
Borges, Cleber N.; Barigye, Stephen J.; Freitas, Matheus P.
2017-12-01
The multivariate image analysis (MIA) descriptors used in quantitative structure-activity relationships (QSAR) are direct representations of chemical structures as they are simply numerical decodifications of pixels forming the 2D chemical images. These MIA descriptors (MDs) have found great utility in the modeling of diverse properties of organic molecules. Given the multicollinearity and high dimensionality of the data matrices generated with the MIA-QSAR approach, modeling techniques that involve the projection of the data space onto orthogonal components, e.g. Partial Least Squares (PLS), have been generally used. However, the chemical interpretation of the PLS-based MIA-QSAR models, in terms of the structural moieties affecting the modeled bioactivity, has not been straightforward. This work describes the 2D-contour maps based on the PLS regression coefficients, as a means of assessing the relevance of single MIA predictors to the response variable, and thus allowing for the structural, electronic and physicochemical interpretation of the MIA-QSAR models. A sample study to demonstrate the utility of the 2D-contour maps to design novel drug-like molecules is performed using a dataset of some anti-HIV-1 2-amino-6-arylsulfonylbenzonitriles and derivatives, and the inferences obtained are consistent with other reports in the literature. In addition, the different schemes for encoding atomic properties in molecules are discussed and evaluated.
Dinç, Erdal; Ertekin, Zehra Ceren
2016-01-01
An application of parallel factor analysis (PARAFAC) and three-way partial least squares (3W-PLS1) regression models to ultra-performance liquid chromatography-photodiode array detection (UPLC-PDA) data with co-eluted peaks in the same wavelength and time regions was described for the multicomponent quantitation of hydrochlorothiazide (HCT) and olmesartan medoxomil (OLM) in tablets. Three-way dataset of HCT and OLM in their binary mixtures containing telmisartan (IS) as an internal standard was recorded with a UPLC-PDA instrument. Firstly, the PARAFAC algorithm was applied for the decomposition of three-way UPLC-PDA data into the chromatographic, spectral and concentration profiles to quantify the concerned compounds. Secondly, 3W-PLS1 approach was subjected to the decomposition of a tensor consisting of three-way UPLC-PDA data into a set of triads to build 3W-PLS1 regression for the analysis of the same compounds in samples. For the proposed three-way analysis methods in the regression and prediction steps, the applicability and validity of PARAFAC and 3W-PLS1 models were checked by analyzing the synthetic mixture samples, inter-day and intra-day samples, and standard addition samples containing HCT and OLM. Two different three-way analysis methods, PARAFAC and 3W-PLS1, were successfully applied to the quantitative estimation of the solid dosage form containing HCT and OLM. Regression and prediction results provided from three-way analysis were compared with those obtained by traditional UPLC method. Copyright © 2015 Elsevier B.V. All rights reserved.
Kabir, Alamgir; Rahman, Md Jahanur; Shamim, Abu Ahmed; Klemm, Rolf D W; Labrique, Alain B; Rashid, Mahbubur; Christian, Parul; West, Keith P
2017-01-01
Birth weight, length and circumferences of the head, chest and arm are key measures of newborn size and health in developing countries. We assessed maternal socio-demographic factors associated with multiple measures of newborn size in a large rural population in Bangladesh using the partial least squares (PLS) regression method. PLS regression, combining features from principal component analysis and multiple linear regression, is a multivariate technique with an ability to handle multicollinearity while simultaneously handling multiple dependent variables. We analyzed maternal and infant data from singletons (n = 14,506) born during a double-masked, cluster-randomized, placebo-controlled maternal vitamin A or β-carotene supplementation trial in rural northwest Bangladesh. PLS regression results identified numerous maternal factors (parity, age, early pregnancy MUAC, living standard index, years of education, number of antenatal care visits, preterm delivery and infant sex) significantly (p<0.001) associated with newborn size. Among them, preterm delivery had the largest negative influence on newborn size (standardized β from -0.29 to -0.19; p<0.001). Scatter plots of the scores of the first two PLS components also revealed an interaction between newborn sex and preterm delivery on birth size. PLS regression was found to be more parsimonious than both ordinary least squares regression and principal component regression. It also provided more stable estimates than ordinary least squares regression and measured the effects of the covariates with greater accuracy, as it accounts for the correlation among the covariates and outcomes. Therefore, PLS regression is recommended when there are multiple outcome measurements in the same study, when the covariates are correlated, or when both situations exist in a dataset. PMID:29261760
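The way PLS regression combines component extraction with least-squares fitting can be sketched in a few lines. The following is a minimal single-response (PLS1) illustration using sequential deflation, written in plain Python; it is not the authors' analysis, and the function names and the small demo data set (two strongly correlated predictors) are hypothetical.

```python
def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model by sequential deflation (illustrative sketch)."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centered predictors
    f = [v - y_mean for v in y]                                # centered response
    components = []
    for _ in range(n_components):
        # Weight vector: direction in predictor space with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = sum(v * v for v in w) ** 0.5
        if norm < 1e-12:
            break  # no covariance left to model
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]       # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]  # X loadings
        q = sum(f[i] * t[i] for i in range(n)) / tt                         # y loading
        # Deflate: remove what this component explains from X and y.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict the response for one new sample by replaying the deflation steps."""
    x_mean, y_mean, components = model
    e = [x[j] - x_mean[j] for j in range(len(x))]
    y_hat = y_mean
    for w, p, q in components:
        t = sum(e[j] * w[j] for j in range(len(e)))
        y_hat += q * t
        e = [e[j] - t * p[j] for j in range(len(e))]
    return y_hat


# Hypothetical demo: the second predictor is nearly twice the first.
X_demo = [[1.0, 2.0], [2.0, 4.1], [3.0, 5.9], [4.0, 8.2], [5.0, 9.8]]
y_demo = [3 * a - 2 * b + 1 for a, b in X_demo]
model = pls1_fit(X_demo, y_demo, n_components=2)
```

With as many components as the rank of the predictor matrix, PLS1 reproduces the ordinary least squares fit; using fewer components is what stabilizes the estimates when predictors are correlated.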
NASA Astrophysics Data System (ADS)
Zhang, Xuexi; Xiao, Zhi-Yan; Yin, Jianhua; Xia, Yang
2014-09-01
Fourier transform infrared imaging (FTIRI) combined with chemometrics can be used to probe the structure of biomacromolecules and to measure the concentrations of individual components. In this study, FTIRI with partial least squares (PLS) regression was applied to determine the concentrations of the two main components of bovine nasal cartilage (BNC), collagen and proteoglycan. An infrared spectral library was built by mixing collagen and chondroitin 6-sulfate (the main constituent of proteoglycan) at different ratios. Some spectral pretreatment was needed before building the PLS model. FTIR images were collected from BNC sections at pixel sizes of 6.25 μm and 25 μm. The spectra extracted from the BNC FTIR images were imported into the PLS regression program to predict the concentrations of collagen and proteoglycan. The PLS-determined concentrations agree with the results of our previous work and with biochemical analysis. The predictions show that the concentrations of collagen and proteoglycan in BNC are comparable overall, although the concentration of proteoglycan is slightly higher than that of collagen.
Domain-Invariant Partial-Least-Squares Regression.
Nikzad-Langerodi, Ramin; Zellinger, Werner; Lughofer, Edwin; Saminger-Platz, Susanne
2018-05-11
Multivariate calibration models often fail to extrapolate beyond the calibration samples because of changes associated with the instrumental response, environmental condition, or sample matrix. Most of the current methods used to adapt a source calibration model to a target domain exclusively apply to calibration transfer between similar analytical devices, while generic methods for calibration-model adaptation are largely missing. To fill this gap, we here introduce domain-invariant partial-least-squares (di-PLS) regression, which extends ordinary PLS by a domain regularizer in order to align the source and target distributions in the latent-variable space. We show that a domain-invariant weight vector can be derived in closed form, which allows the integration of (partially) labeled data from the source and target domains as well as entirely unlabeled data from the latter. We test our approach on a simulated data set where the aim is to desensitize a source calibration model to an unknown interfering agent in the target domain (i.e., unsupervised model adaptation). In addition, we demonstrate unsupervised, semisupervised, and supervised model adaptation by di-PLS on two real-world near-infrared (NIR) spectroscopic data sets.
NASA Astrophysics Data System (ADS)
Al-Harrasi, Ahmed; Rehman, Najeeb Ur; Mabood, Fazal; Albroumi, Muhammaed; Ali, Liaqat; Hussain, Javid; Hussain, Hidayat; Csuk, René; Khan, Abdul Latif; Alam, Tanveer; Alameri, Saif
2017-09-01
In the present study, for the first time, NIR spectroscopy coupled with PLS regression was developed as a rapid alternative method to quantify the amount of Keto-β-Boswellic Acid (KBA) in different plant parts of Boswellia sacra and in the resin exudates of the trunk. NIR spectroscopy was used to measure KBA standards and B. sacra samples in absorption mode over the wavelength range 700-2500 nm. A PLS regression model was built from the spectral data using 70% of the KBA standards (training set) in the range from 0.1 ppm to 100 ppm. The resulting model had an R-square value of 98% with a correlation of 0.99, and predicted well, with an RMSEP of 3.2 and a correlation of 0.99. It was then used to quantify the amount of KBA in the samples of B. sacra. The results indicated that the MeOH extract of the resin has the highest concentration of KBA (0.6%), followed by the essential oil (0.1%). However, no KBA was found in the aqueous extract. The MeOH extract of the resin was subjected to column chromatography to obtain various sub-fractions with organic solvents of different polarity. The sub-fraction at 4% MeOH/CHCl3 was found to contain the highest percentage of KBA (4.1%), followed by another sub-fraction at 2% MeOH/CHCl3 (2.2% of KBA). The present results also indicated that KBA is present only in the gum-resin of the trunk and not in all parts of the plant. These results were further confirmed through HPLC analysis, and it is therefore concluded that NIRS coupled with PLS regression is a rapid alternative method for the quantification of KBA in Boswellia sacra. It is non-destructive, rapid and sensitive, and uses simple methods of sample preparation.
Locally-Based Kernel PLS Smoothing to Non-Parametric Regression Curve Fitting
NASA Technical Reports Server (NTRS)
Rosipal, Roman; Trejo, Leonard J.; Wheeler, Kevin; Korsmeyer, David (Technical Monitor)
2002-01-01
We present a novel smoothing approach to non-parametric regression curve fitting. This is based on kernel partial least squares (PLS) regression in reproducing kernel Hilbert space. It is our concern to apply the methodology for smoothing experimental data where some level of knowledge about the approximate shape, local inhomogeneities or points where the desired function changes its curvature is known a priori or can be derived based on the observed noisy data. We propose locally-based kernel PLS regression that extends the previous kernel PLS methodology by incorporating this knowledge. We compare our approach with existing smoothing splines, hybrid adaptive splines and wavelet shrinkage techniques on two generated data sets.
Rapid and Simultaneous Prediction of Eight Diesel Quality Parameters through ATR-FTIR Analysis.
Nespeca, Maurilio Gustavo; Hatanaka, Rafael Rodrigues; Flumignan, Danilo Luiz; de Oliveira, José Eduardo
2018-01-01
Quality assessment of diesel fuel is highly necessary for society, but standard methods are costly and time-consuming. Therefore, this study aimed to develop an analytical method capable of simultaneously determining eight diesel quality parameters (density; flash point; total sulfur content; distillation temperatures at 10% (T10), 50% (T50), and 85% (T85) recovery; cetane index; and biodiesel content) through attenuated total reflection Fourier transform infrared (ATR-FTIR) spectroscopy and the multivariate regression method partial least squares (PLS). For this purpose, the quality parameters of 409 samples were determined using standard methods, and their spectra were acquired in the range 4000-650 cm−1. The use of the multivariate filters generalized least squares weighting (GLSW) and orthogonal signal correction (OSC) was evaluated to improve the signal-to-noise ratio of the models. Likewise, four variable selection approaches were tested: manual exclusion, forward interval PLS (FiPLS), backward interval PLS (BiPLS), and genetic algorithm (GA). The multivariate filters and variable selection algorithms yielded better-fitting and more accurate PLS models. According to the validation, the FTIR/PLS models presented accuracy comparable to the reference methods and, therefore, the proposed method can be applied in routine diesel monitoring to significantly reduce costs and analysis time. PMID:29629209
NASA Astrophysics Data System (ADS)
Kang, Qian; Ru, Qingguo; Liu, Yan; Xu, Lingyan; Liu, Jia; Wang, Yifei; Zhang, Yewen; Li, Hui; Zhang, Qing; Wu, Qing
2016-01-01
An on-line near infrared (NIR) spectroscopy monitoring method with an appropriate multivariate calibration method was developed for the extraction process of Fu-fang Shuanghua oral solution (FSOS). On-line NIR spectra were collected through two fiber optic probes, which were designed to transmit NIR radiation by a 2 mm flange. Partial least squares (PLS), interval PLS (iPLS) and synergy interval PLS (siPLS) algorithms were compared for building the calibration regression models. During the extraction process, NIR spectroscopy was used to determine the chlorogenic acid (CA) content, total phenolic acid content (TPC), total flavonoid content (TFC) and soluble solid content (SSC). High performance liquid chromatography (HPLC), ultraviolet spectrophotometry (UV) and loss on drying were employed as reference methods. The experimental results showed that the siPLS model performed best compared with PLS and iPLS. The calibration models for CA, TPC, TFC and SSC had high determination coefficients (R2) (0.9948, 0.9992, 0.9950 and 0.9832) and low root mean square errors of cross validation (RMSECV) (0.0113, 0.0341, 0.1787 and 1.2158), which indicate a good correlation between reference values and NIR predicted values. The overall results show that the on-line detection method is feasible in real applications and would be of great value for monitoring the mixed decoction process of FSOS and other Chinese patent medicines.
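Interval-based PLS variants differ from full-spectrum PLS mainly in which spectral variables are offered to the model: iPLS fits one model per spectral interval, and siPLS fits models on combinations of intervals. The sketch below only enumerates those candidate variable subsets; the number of intervals, the number combined, and the function names are assumptions for illustration, since the abstract does not state them. Each candidate would then be scored, for example by RMSECV, which is omitted here.

```python
from itertools import combinations


def split_intervals(n_variables, n_intervals):
    """Return (start, stop) index pairs that cover range(n_variables) in near-equal blocks."""
    base, extra = divmod(n_variables, n_intervals)
    bounds, start = [], 0
    for k in range(n_intervals):
        stop = start + base + (1 if k < extra else 0)  # first blocks absorb the remainder
        bounds.append((start, stop))
        start = stop
    return bounds


def synergy_subsets(intervals, n_combined):
    """Yield the variable indices for every combination of n_combined intervals."""
    for combo in combinations(intervals, n_combined):
        yield [j for start, stop in combo for j in range(start, stop)]


# Hypothetical example: 10 spectral variables, 4 intervals, combined two at a time.
intervals = split_intervals(n_variables=10, n_intervals=4)
candidates = list(synergy_subsets(intervals, n_combined=2))
```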
Cao, Hui; Li, Yao-Jiang; Zhou, Yan; Wang, Yan-Xia
2014-11-01
To deal with the nonlinear characteristics of spectral data for thermal power plant flue gas, this paper adopts a nonlinear partial least squares (PLS) analysis method whose internal model is based on a neural network. The latent variables of the independent and dependent variables are first extracted by PLS regression; they are then used as the inputs and outputs, respectively, of a neural network that is trained to form the nonlinear internal model. For spectral data of thermal power plant flue gases, four methods are compared: PLS, nonlinear PLS with a back propagation neural network internal model (BP-NPLS), nonlinear PLS with a radial basis function neural network internal model (RBF-NPLS), and nonlinear PLS with an adaptive fuzzy inference system internal model (ANFIS-NPLS). Relative to PLS, the root mean square error of prediction (RMSEP) for sulfur dioxide is reduced by 16.96%, 16.60% and 19.55% with BP-NPLS, RBF-NPLS and ANFIS-NPLS, respectively. The RMSEP for nitric oxide is reduced by 8.60%, 8.47% and 10.09%, and the RMSEP for nitrogen dioxide by 2.11%, 3.91% and 3.97%, respectively. These experimental results show that nonlinear PLS is more suitable than PLS for the quantitative analysis of flue gas. Moreover, because a neural network can closely approximate nonlinear characteristics, the nonlinear PLS method with the internal models described in this paper has good predictive capability and robustness, and can to some extent overcome the limitations of nonlinear PLS methods whose internal models are polynomial or spline functions. ANFIS-NPLS performs best, as its adaptive fuzzy inference system internal model learns more and reduces the residuals effectively. Hence, ANFIS-NPLS is an accurate and useful method for the quantitative analysis of thermal power plant flue gas.
Sills, Deborah L; Gossett, James M
2012-04-01
Fourier transform infrared, attenuated total reflectance (FTIR-ATR) spectroscopy, combined with partial least squares (PLS) regression, accurately predicted solubilization of plant cell wall constituents and NaOH consumption through pretreatment, and overall sugar productions from combined pretreatment and enzymatic hydrolysis. PLS regression models were constructed by correlating FTIR spectra of six raw biomasses (two switchgrass cultivars, big bluestem grass, a low-impact, high-diversity mixture of prairie biomasses, mixed hardwood, and corn stover), plus alkali loading in pretreatment, to nine dependent variables: glucose, xylose, lignin, and total solids solubilized in pretreatment; NaOH consumed in pretreatment; and overall glucose and xylose conversions and yields from combined pretreatment and enzymatic hydrolysis. PLS models predicted the dependent variables with the following values of coefficient of determination for cross-validation (Q²): 0.86 for glucose, 0.90 for xylose, 0.79 for lignin, and 0.85 for total solids solubilized in pretreatment; 0.83 for alkali consumption; 0.93 for glucose conversion, 0.94 for xylose conversion, and 0.88 for glucose and xylose yields. The sugar yield models are noteworthy for their ability to predict overall saccharification through combined pretreatment and enzymatic hydrolysis per mass dry untreated solids without a priori knowledge of the composition of solids. All wavenumbers with significant variable-important-for-projection (VIP) scores have been attributed to chemical features of lignocellulose, demonstrating the models were based on real chemical information. These models suggest that PLS regression can be applied to FTIR-ATR spectra of raw biomasses to rapidly predict effects of pretreatment on solids and on subsequent enzymatic hydrolysis. Copyright © 2011 Wiley Periodicals, Inc.
Random forest models to predict aqueous solubility.
Palmer, David S; O'Boyle, Noel M; Glen, Robert C; Mitchell, John B O
2007-01-01
Random Forest regression (RF), Partial-Least-Squares (PLS) regression, Support Vector Machines (SVM), and Artificial Neural Networks (ANN) were used to develop QSPR models for the prediction of aqueous solubility, based on experimental data for 988 organic molecules. The Random Forest regression model predicted aqueous solubility more accurately than those created by PLS, SVM, and ANN and offered methods for automatic descriptor selection, an assessment of descriptor importance, and an in-parallel measure of predictive ability, all of which serve to recommend its use. The prediction of log molar solubility for an external test set of 330 molecules that are solid at 25 degrees C gave an r2 = 0.89 and RMSE = 0.69 log S units. For a standard data set selected from the literature, the model performed well with respect to other documented methods. Finally, the diversity of the training and test sets are compared to the chemical space occupied by molecules in the MDL drug data report, on the basis of molecular descriptors selected by the regression analysis.
Application of near-infrared spectroscopy for the rapid quality assessment of Radix Paeoniae Rubra
NASA Astrophysics Data System (ADS)
Zhan, Hao; Fang, Jing; Tang, Liying; Yang, Hongjun; Li, Hua; Wang, Zhuju; Yang, Bin; Wu, Hongwei; Fu, Meihong
2017-08-01
Near-infrared (NIR) spectroscopy with multivariate analysis was used to quantify gallic acid, catechin, albiflorin, and paeoniflorin in Radix Paeoniae Rubra, and the feasibility of classifying samples originating from different areas was investigated. A new high-performance liquid chromatography method was developed and validated to analyze gallic acid, catechin, albiflorin, and paeoniflorin in Radix Paeoniae Rubra as the reference. Partial least squares (PLS), principal component regression (PCR), and stepwise multivariate linear regression (SMLR) were used to calibrate the regression model. Different data pretreatments such as derivatives (1st and 2nd), multiplicative scatter correction, standard normal variate, Savitzky-Golay filter, and Norris derivative filter were applied to remove systematic errors. The performance of the model was evaluated according to the root mean square error of calibration (RMSEC), root mean square error of prediction (RMSEP), root mean square error of cross-validation (RMSECV), and correlation coefficient (r). The results show that, compared to PCR and SMLR, PLS had a lower RMSEC, RMSECV, and RMSEP and a higher r for all four analytes. PLS coupled with proper pretreatments showed good performance in both fitting and prediction. Furthermore, the areas of origin of the Radix Paeoniae Rubra samples were partly distinguished by principal component analysis. This study shows that NIR with PLS is a reliable, inexpensive, and rapid tool for the quality assessment of Radix Paeoniae Rubra.
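The figures of merit named above share one formula applied to different sample sets: RMSEC uses the calibration samples, RMSECV the cross-validation predictions, and RMSEP an independent prediction set. A minimal sketch follows, assuming the plain root mean square of the residuals (some software divides by degrees of freedom instead of the sample count); the function names and numbers are illustrative only.

```python
import math


def rmse(reference, predicted):
    """Root mean square error between reference values and model predictions."""
    n = len(reference)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)


def correlation(reference, predicted):
    """Pearson correlation coefficient r between reference and predicted values."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_p = sum(predicted) / n
    cov = sum((r - mean_r) * (p - mean_p) for r, p in zip(reference, predicted))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_r * var_p)


# Hypothetical reference and predicted concentrations for a small prediction set.
reference_values = [1.0, 2.0, 3.0, 4.0]
predicted_values = [1.1, 1.9, 3.2, 3.8]
rmsep = rmse(reference_values, predicted_values)
r_value = correlation(reference_values, predicted_values)
```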
Cole-Cole, linear and multivariate modeling of capacitance data for on-line monitoring of biomass.
Dabros, Michal; Dennewald, Danielle; Currie, David J; Lee, Mark H; Todd, Robert W; Marison, Ian W; von Stockar, Urs
2009-02-01
This work evaluates three techniques of calibrating capacitance (dielectric) spectrometers used for on-line monitoring of biomass: modeling of cell properties using the theoretical Cole-Cole equation, linear regression of dual-frequency capacitance measurements on biomass concentration, and multivariate (PLS) modeling of scanning dielectric spectra. The performance and robustness of each technique is assessed during a sequence of validation batches in two experimental settings of differing signal noise. In more noisy conditions, the Cole-Cole model had significantly higher biomass concentration prediction errors than the linear and multivariate models. The PLS model was the most robust in handling signal noise. In less noisy conditions, the three models performed similarly. Estimates of the mean cell size were done additionally using the Cole-Cole and PLS models, the latter technique giving more satisfactory results.
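The abstract does not write out the Cole-Cole equation. For orientation, the sketch below uses a commonly cited form of its real part, expressed as capacitance against frequency; the parameter names (capacitance increment `delta_c`, characteristic frequency `f_c`, distribution parameter `alpha`, high-frequency capacitance `c_inf`) are assumptions for illustration and may not match the authors' notation.

```python
import math


def cole_cole_capacitance(f, delta_c, f_c, alpha, c_inf):
    """Real part of the Cole-Cole dispersion, written for capacitance versus frequency.

    f       : measurement frequency
    delta_c : capacitance increment (low-frequency minus high-frequency plateau)
    f_c     : characteristic frequency of the dispersion
    alpha   : Cole-Cole distribution parameter, 0 <= alpha < 1 (0 gives a Debye curve)
    c_inf   : high-frequency capacitance plateau
    """
    ratio = (f / f_c) ** (1.0 - alpha)
    s = math.sin(math.pi * alpha / 2.0)
    return delta_c * (1.0 + ratio * s) / (1.0 + 2.0 * ratio * s + ratio ** 2) + c_inf


# Hypothetical values: with alpha = 0 and f = f_c the curve sits halfway down the step.
midpoint = cole_cole_capacitance(f=1e6, delta_c=10.0, f_c=1e6, alpha=0.0, c_inf=2.0)
```

Fitting these parameters to a scanned spectrum is a nonlinear problem, which is one plausible reason such a model is more sensitive to signal noise than the linear and PLS calibrations compared in the study.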
Partial least squares for efficient models of fecal indicator bacteria on Great Lakes beaches
Brooks, Wesley R.; Fienen, Michael N.; Corsi, Steven R.
2013-01-01
At public beaches, it is now common to mitigate the impact of water-borne pathogens by posting a swimmer's advisory when the concentration of fecal indicator bacteria (FIB) exceeds an action threshold. Since culturing the bacteria delays public notification when dangerous conditions exist, regression models are sometimes used to predict the FIB concentration based on readily-available environmental measurements. It is hard to know which environmental parameters are relevant to predicting FIB concentration, and the parameters are usually correlated, which can hurt the predictive power of a regression model. Here the method of partial least squares (PLS) is introduced to automate the regression modeling process. Model selection is reduced to the process of setting a tuning parameter to control the decision threshold that separates predicted exceedances of the standard from predicted non-exceedances. The method is validated by application to four Great Lakes beaches during the summer of 2010. Performance of the PLS models compares favorably to that of the existing state-of-the-art regression models at these four sites.
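The role of the decision threshold can be illustrated independently of the regression model itself. In the sketch below, a prediction above `decision_threshold` triggers an advisory, while an observation above `action_threshold` counts as a true exceedance; lowering the decision threshold trades specificity for sensitivity. The function names and all numbers are hypothetical and are not taken from the study.

```python
def exceedance_counts(predicted, observed, decision_threshold, action_threshold):
    """Count true/false positives and negatives for advisory decisions."""
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for p, o in zip(predicted, observed):
        flagged = p > decision_threshold   # model says: post an advisory
        exceeded = o > action_threshold    # measurement says: standard was exceeded
        if flagged and exceeded:
            counts["tp"] += 1
        elif flagged:
            counts["fp"] += 1
        elif exceeded:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts


def sensitivity_specificity(counts):
    """Return (sensitivity, specificity) from a counts dictionary."""
    sensitivity = counts["tp"] / (counts["tp"] + counts["fn"])
    specificity = counts["tn"] / (counts["tn"] + counts["fp"])
    return sensitivity, specificity


# Hypothetical log-scale predictions and observations for five beach days.
predicted_fib = [1.0, 2.5, 3.0, 1.8, 2.2]
observed_fib = [1.2, 2.6, 2.9, 2.4, 1.9]
strict = exceedance_counts(predicted_fib, observed_fib, 2.0, 2.3)
cautious = exceedance_counts(predicted_fib, observed_fib, 1.5, 2.3)
```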
Vajargah, Kianoush Fathi; Sadeghi-Bazargani, Homayoun; Mehdizadeh-Esfanjani, Robab; Savadi-Oskouei, Daryoush; Farhoudi, Mehdi
2012-01-01
The objective of the present study was to compare the applicability of the orthogonal projections to latent structures (OPLS) statistical model with that of traditional linear regression in investigating the role of transcranial Doppler (TCD) sonography in predicting ischemic stroke prognosis. The study was conducted on 116 ischemic stroke patients admitted to a specialty neurology ward. The Unified Neurological Stroke Scale was used once for clinical evaluation in the first week of admission and again six months later. All data were first analyzed using simple linear regression and then considered for multivariate analysis using PLS/OPLS models through the SIMCA P+12 statistical software package. The linear regression analysis results used for the identification of TCD predictors of stroke prognosis were confirmed through the OPLS modeling technique. Moreover, in comparison to linear regression, the OPLS model appeared to have higher sensitivity in detecting the predictors of ischemic stroke prognosis and detected several more predictors. Applying the OPLS model made it possible to use both single TCD measures/indicators and arbitrarily dichotomized measures of TCD single vessel involvement as well as the overall TCD result. In conclusion, the authors recommend PLS/OPLS methods as complementary rather than alternative to the available classical regression models such as linear regression.
Fernández-Novales, Juan; López, María-Isabel; González-Caballero, Virginia; Ramírez, Pilar; Sánchez, María-Teresa
2011-06-01
Volumic mass, a key component of must quality control tests during alcoholic fermentation, is of great interest to the winemaking industry. Transmittance near-infrared (NIR) spectra of 124 must samples over the range of 200-1,100 nm were obtained using a miniature spectrometer. The performance of this instrument in predicting volumic mass was evaluated using partial least squares (PLS) regression and multiple linear regression (MLR). The validation statistics, coefficient of determination (r2) and standard error of prediction (SEP), were r2 = 0.98 (n = 31) and SEP = 5.85 g/dm3 for the PLS equation and r2 = 0.96 (n = 31) and SEP = 7.49 g/dm3 for the MLR equation, both developed to fit reference data for volumic mass to spectral data. Comparison of the results from MLR and PLS demonstrates that an MLR model with six significant wavelengths (P < 0.05) fit volumic mass data to transmittance (1/T) data slightly worse than a more sophisticated PLS model using the full scanning range. The results suggest that NIR spectroscopy is a suitable technique for predicting volumic mass during alcoholic fermentation, and that a low-cost NIR instrument can be used for this purpose.
D'Archivio, Angelo Antonio; Incani, Angela; Ruggieri, Fabrizio
2011-01-01
In this paper, we use a quantitative structure-retention relationship (QSRR) method to predict the retention times of polychlorinated biphenyls (PCBs) in comprehensive two-dimensional gas chromatography (GC×GC). We analyse the GC×GC retention data taken from the literature by comparing predictive capability of different regression methods. The various models are generated using 70 out of 209 PCB congeners in the calibration stage, while their predictive performance is evaluated on the remaining 139 compounds. The two-dimensional chromatogram is initially estimated by separately modelling retention times of PCBs in the first and in the second column ($^{1}t_{R}$ and $^{2}t_{R}$, respectively). In particular, multilinear regression (MLR) combined with genetic algorithm (GA) variable selection is performed to extract two small subsets of predictors for $^{1}t_{R}$ and $^{2}t_{R}$ from a large set of theoretical molecular descriptors provided by the popular software Dragon, which after removal of highly correlated or almost constant variables consists of 237 structure-related quantities. Based on GA-MLR analysis, a four-dimensional and a five-dimensional relationship modelling $^{1}t_{R}$ and $^{2}t_{R}$, respectively, are identified. Single-response partial least square (PLS-1) regression is alternatively applied to independently model $^{1}t_{R}$ and $^{2}t_{R}$ without the need for preliminary GA variable selection. Further, we explore the possibility of predicting the two-dimensional chromatogram of PCBs in a single calibration procedure by using a two-response PLS (PLS-2) model or a feed-forward artificial neural network (ANN) with two output neurons. In the first case, regression is carried out on the full set of 237 descriptors, while the variables previously selected by GA-MLR are initially considered as ANN inputs and subjected to a sensitivity analysis to remove the redundant ones. Results show PLS-1 regression exhibits a noticeably better descriptive and predictive performance than the other investigated approaches. The observed values of determination coefficients for $^{1}t_{R}$ and $^{2}t_{R}$ in calibration (0.9999 and 0.9993, respectively) and prediction (0.9987 and 0.9793, respectively) provided by PLS-1 demonstrate that GC×GC behaviour of PCBs is properly modelled. In particular, the predicted two-dimensional GC×GC chromatogram of 139 PCBs not involved in the calibration stage closely resembles the experimental one. Based on the above lines of evidence, the proposed approach ensures accurate simulation of the whole GC×GC chromatogram of PCBs using experimental determination of only 1/3 retention data of representative congeners.
Linear and nonlinear methods in modeling the aqueous solubility of organic compounds.
Catana, Cornel; Gao, Hua; Orrenius, Christian; Stouten, Pieter F W
2005-01-01
Solubility data for 930 diverse compounds have been analyzed using linear Partial Least Square (PLS) and nonlinear PLS methods, Continuum Regression (CR), and Neural Networks (NN). 1D and 2D descriptors from the MOE package, in combination with E-state or ISIS keys, have been used. The best model was obtained using linear PLS for a combination of 22 MOE descriptors and 65 ISIS keys. It has a correlation coefficient (r2) of 0.935 and a root-mean-square error (RMSE) of 0.468 log molar solubility (log S(w)). The model, validated on a test set of 177 compounds not included in the training set, has r2 0.911 and RMSE 0.475 log S(w). The descriptors were ranked according to their importance, and the 22 MOE descriptors were found at the top of the list. The CR model produced results as good as PLS and, because of the way in which cross-validation was done, it is expected to be a valuable prediction tool alongside the PLS model. The statistics obtained using nonlinear methods did not surpass those obtained with linear ones. The good statistics obtained for linear PLS and CR recommend these models for prediction when it is difficult or impossible to make experimental measurements, for virtual screening, combinatorial library design, and efficient lead optimization.
NASA Astrophysics Data System (ADS)
Alaoui, G.; Leger, M.; Gagne, J.; Tremblay, L.
2009-05-01
The goal of this work was to evaluate the capability of infrared reflectance spectroscopy for fast quantification of the elemental and molecular compositions of sedimentary and particulate organic matter (OM). A partial least-squares (PLS) regression model was used for analysis and values were compared to those obtained by traditional methods (i.e., elemental, humic and HPLC analyses). PLS tools are readily accessible from software such as GRAMS (Thermo-Fisher) used in spectroscopy. This spectroscopic-chemometric approach has several advantages including its rapidity and use of whole unaltered samples. To predict properties, a set of infrared spectra from representative samples must first be fitted to form a PLS calibration model. In this study, a large set (180) of sediments and particles on GFF filters from the St. Lawrence estuarine system was used. These samples are very heterogeneous (e.g., various tributaries, terrigenous vs. marine, events such as landslides and floods) and thus represent a challenging test for PLS prediction. For sediments, the infrared spectra were obtained with a diffuse reflectance, or DRIFT, accessory. Sedimentary carbon, nitrogen, humic substance contents as well as humic substance proportions in OM and N:C ratios were predicted by PLS. The relative root mean square errors of prediction (%RMSEP) for these properties were between 5.7% (humin content) and 14.1% (total humic substance yield) using the cross-validation, or leave-one-out, approach. The %RMSEP for carbon content was lower with the PLS model (7.6%) than with an external calibration method (11.7%) (Tremblay and Gagné, 2002, Anal. Chem., 74, 2985). Moreover, the PLS approach does not require the extraction of POM needed in external calibration. Results highlighted the importance of using a PLS calibration set representative of the unknown samples (e.g., same area).
For filtered particles, the infrared spectra were obtained using a novel approach based on attenuated total reflectance, or ATR, allowing the direct analysis of the filters. In addition to carbon and nitrogen contents, amino acid and muramic acid (a bacterial biomarker) yields were predicted using PLS. Calculated %RMSEP varied from 6.4% (total amino acid content) to 18.6% (muramic acid content) with cross-validation. PLS regression modeling does not require a priori knowledge of the spectral bands associated with the properties to be predicted. In turn, the spectral regions that give good PLS predictions provided valuable information on band assignment and geochemical processes. For instance, nitrogen and humin contents were greatly determined by an absorption band caused by aluminosilicate OH group. This supports the idea that OM-clay interactions, important in humin formation and OM preservation, are mediated by nitrogen-containing groups.
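The leave-one-out procedure behind the cross-validated %RMSEP values above can be sketched as follows. This is a minimal illustration, not the authors' workflow: an ordinary least-squares line stands in for the PLS model, the function names are hypothetical, and normalizing the RMSE by the mean reference value is an assumption, since the abstract does not define %RMSEP.

```python
import math

def fit_line(xs, ys):
    """Least-squares line y = a + b*x; a simple stand-in for a PLS calibration."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = my - b * mx
    return lambda x: a + b * x

def loo_relative_rmse(xs, ys, fit=fit_line):
    """Leave-one-out cross-validation.

    Each sample is predicted by a model fitted on all other samples.
    Returns the RMSE as a percentage of the mean reference value
    (assumed normalization).
    """
    sq_errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]   # drop sample i
        train_y = ys[:i] + ys[i + 1:]
        model = fit(train_x, train_y)
        sq_errors.append((model(xs[i]) - ys[i]) ** 2)
    rmse = math.sqrt(sum(sq_errors) / len(sq_errors))
    return 100.0 * rmse / (sum(ys) / len(ys))
```

Any calibration routine that takes training data and returns a prediction function can be passed as `fit`, so the same loop would apply to a PLS model.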
Li, Wen-bing; Yao, Lin-tao; Liu, Mu-hua; Huang, Lin; Yao, Ming-yin; Chen, Tian-bing; He, Xiu-wen; Yang, Ping; Hu, Hui-qin; Nie, Jiang-hui
2015-05-01
Cu in navel orange was detected rapidly by laser-induced breakdown spectroscopy (LIBS) combined with partial least squares (PLS) for quantitative analysis, and the effect of different spectral data pretreatment methods on the detection accuracy of the model was explored. Spectral data for the 52 Gannan navel orange samples were pretreated by different data smoothing, mean centering and standard normal variate transformation. The 319~338 nm wavelength section containing characteristic spectral lines of Cu was then selected to build PLS models, and the main evaluation indexes of the models, such as the regression coefficient (r), root mean square error of cross validation (RMSECV) and root mean square error of prediction (RMSEP), were compared and analyzed. For the PLS model built after 13-point smoothing and mean centering, these three indicators reached 0.9928, 3.43 and 3.4, respectively, and the average relative error of prediction was only 5.55%; this model gave the best calibration and prediction quality. The results show that selecting an appropriate data pre-processing method can effectively improve the prediction accuracy of PLS quantitative models for fruits and vegetables detected by LIBS, providing a new method for fast and accurate detection of fruits and vegetables by LIBS.
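The three pretreatments named above (smoothing, mean centering, standard normal variate) could look roughly like this. It is a sketch under assumptions: the abstract does not say which smoothing filter was used, so a plain moving average with a default 13-point window is assumed, and all function names are illustrative.

```python
import math

def moving_average(spectrum, window=13):
    """Moving-average smoothing; the window is truncated at the spectrum edges."""
    half = window // 2
    n = len(spectrum)
    out = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        out.append(sum(spectrum[lo:hi]) / (hi - lo))
    return out

def mean_center(spectra):
    """Subtract the per-wavelength (column) mean from every spectrum."""
    n = len(spectra)
    means = [sum(col) / n for col in zip(*spectra)]
    return [[v - m for v, m in zip(row, means)] for row in spectra]

def snv(spectrum):
    """Standard normal variate: scale one spectrum by its own mean and SD."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / sd for v in spectrum]
```

Smoothing and SNV act on each spectrum separately, while mean centering needs the whole calibration set, so the column means from calibration would have to be reused for new samples.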
Andrade, Letícia; Farhat, Imad A; Aeberhardt, Kasia; Bro, Rasmus; Engelsen, Søren Balling
2009-02-01
The influence of temperature on near-infrared (NIR) and nuclear magnetic resonance (NMR) spectroscopy complicates the industrial applications of both spectroscopic methods. The focus of this study is to analyze and model the effect of temperature variation on NIR spectra and NMR relaxation data. Different multivariate methods were tested for constructing robust prediction models based on NIR and NMR data acquired at various temperatures. Data were acquired on model spray-dried limonene systems at five temperatures in the range from 20 °C to 60 °C and partial least squares (PLS) regression models were computed for limonene and water predictions. The predictive ability of the models computed on the NIR spectra (acquired at various temperatures) improved significantly when data were preprocessed using extended inverted signal correction (EISC). The average PLS regression prediction error was reduced to 0.2%, corresponding to 1.9% and 3.4% of the full range of limonene and water reference values, respectively. The removal of variation induced by temperature prior to calibration, by direct orthogonalization (DO), slightly enhanced the predictive ability of the models based on NMR data. Bilinear PLS models, with implicit inclusion of the temperature, enabled limonene and water predictions by NMR with an error of 0.3% (corresponding to 2.8% and 7.0% of the full range of limonene and water). For NMR, and in contrast to the NIR results, modeling the data using multi-way N-PLS improved the models' performance. N-PLS models, in which temperature was included as an extra variable, enabled more accurate prediction, especially for limonene (prediction error was reduced to 0.2%). Overall, this study proved that it is possible to develop models for limonene and water content prediction based on NIR and NMR data, independent of the measurement temperature.
NASA Astrophysics Data System (ADS)
Chen, Hua-cai; Chen, Xing-dan; Lu, Yong-jun; Cao, Zhi-qiang
2006-01-01
Near infrared (NIR) reflectance spectroscopy was used to develop a fast determination method for total ginsenosides in Ginseng (Panax Ginseng) powder. The spectra were analyzed with the multiplicative signal correction (MSC) correlation method. The spectral regions best correlated with the total ginsenosides content were 1660 nm~1880 nm and 2230 nm~2380 nm. The NIR calibration models of ginsenosides were built with multiple linear regression (MLR), principal component regression (PCR) and partial least squares (PLS) regression, respectively. The results showed that the calibration model built with PLS combined with MSC and the optimal spectrum region was the best one. The correlation coefficient and the root mean square error of calibration (RMSEC) of the best calibration model were 0.98 and 0.15%, respectively. The optimal spectrum region for calibration was 1204 nm~2014 nm. The results suggest that using NIR to rapidly determine the total ginsenosides content in ginseng powder is feasible.
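Multiplicative signal correction, as mentioned above, is commonly implemented by regressing each spectrum on a reference spectrum and removing the fitted offset and slope. The sketch below assumes the mean spectrum of the set as reference (a common choice that the abstract does not confirm); the function name is illustrative.

```python
def msc(spectra):
    """Multiplicative signal (scatter) correction against the mean spectrum.

    Each spectrum x is modelled as x = intercept + slope * reference,
    and the corrected spectrum is (x - intercept) / slope.
    """
    n = len(spectra)
    ref = [sum(col) / n for col in zip(*spectra)]   # assumed reference: mean spectrum
    ref_mean = sum(ref) / len(ref)
    ref_ss = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for row in spectra:
        row_mean = sum(row) / len(row)
        slope = sum((r - ref_mean) * (v - row_mean)
                    for r, v in zip(ref, row)) / ref_ss
        intercept = row_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in row])
    return corrected
```

Spectra that differ only by an additive offset and a multiplicative factor collapse onto the reference after this correction, which is the behaviour the test data below exercises.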
NASA Astrophysics Data System (ADS)
Solimun
2017-05-01
The aim of this research is to model survival data from kidney-transplant patients using the partial least squares (PLS)-Cox regression, which can both meet and not meet the no-multicollinearity assumption. The secondary data were obtained from research entitled "Factors affecting the survival of kidney-transplant patients". The research subjects comprised 250 patients. The predictor variables consisted of: age (X1), sex (X2); two categories, prior hemodialysis duration (X3), diabetes (X4); two categories, prior transplantation number (X5), number of blood transfusions (X6), discrepancy score (X7), use of antilymphocyte globulin(ALG) (X8); two categories, while the response variable was patient survival time (in months). Partial least squares regression is a model that connects the predictor variables X and the response variable y and it initially aims to determine the relationship between them. Results of the above analyses suggest that the survival of kidney transplant recipients ranged from 0 to 55 months, with 62% of the patients surviving until they received treatment that lasted for 55 months. The PLS-Cox regression analysis results revealed that patients' age and the use of ALG significantly affected the survival time of patients. The factor of patients' age (X1) in the PLS-Cox regression model merely affected the failure probability by 1.201. This indicates that the probability of dying for elderly patients with a kidney transplant is 1.152 times higher than that for younger patients.
Zhang, Hong-guang; Lu, Jian-gang
2016-02-01
To overcome the problems of significant differences among samples and nonlinearity between the property and spectra of samples in spectral quantitative analysis, a local regression algorithm is proposed in this paper. In this algorithm, the net analyte signal (NAS) method was first used to obtain the net analyte signal of the calibration samples and unknown samples; the Euclidean distance between the net analyte signal of each unknown sample and that of the calibration samples was then calculated and used as a similarity index. According to this similarity index, a local calibration set was selected individually for each unknown sample. Finally, a local PLS regression model was built on the local calibration set of each unknown sample. The proposed method was applied to a set of near infrared spectra of meat samples. The results demonstrate that the prediction precision and model complexity of the proposed method are superior to those of the global PLS regression method and the conventional local regression algorithm based on spectral Euclidean distance.
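The selection step of such a local regression scheme can be sketched as a nearest-neighbour search. The example assumes the input vectors are net analyte signals that have already been computed elsewhere; the NAS calculation and the local PLS fit are not shown, and the function names and the parameter `k` are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def select_local_set(query, calibration, k):
    """Return the indices of the k calibration vectors closest to the query.

    The returned indices define the local calibration set on which a
    local model would then be fitted for this one unknown sample.
    """
    order = sorted(range(len(calibration)),
                   key=lambda i: euclidean(query, calibration[i]))
    return order[:k]
```

The search is repeated for every unknown sample, so each prediction uses its own calibration subset.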
NASA Astrophysics Data System (ADS)
Suhandy, D.; Yulia, M.; Ogawa, Y.; Kondo, N.
2018-05-01
In the present research, the use of near infrared (NIR) spectroscopy in tandem with full spectrum partial least squares (FS-PLS) regression for quantification of the degree of adulteration in civet coffee was evaluated. A total of 126 ground roasted coffee samples with a degree of adulteration of 0-51% were prepared. Spectral data were acquired using a NIR spectrometer equipped with an integrating sphere for diffuse reflectance measurement in the range of 1300-2500 nm. The samples were divided into two groups: a calibration sample set (84 samples) and a prediction sample set (42 samples). The calibration model was developed on original spectra using FS-PLS regression with the full cross-validation method. The calibration model exhibited a determination coefficient of R2=0.96 for calibration and R2=0.92 for validation. The prediction resulted in a low root mean square error of prediction (RMSEP) (4.67%) and a high ratio of prediction to deviation (RPD) (3.75). In conclusion, the degree of adulteration in civet coffee has been quantified successfully by using NIR spectroscopy and FS-PLS regression in a non-destructive, economical, precise, and highly sensitive method, which uses very simple sample preparation.
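The figures of merit quoted above can be computed from reference and predicted values as sketched below. The definitions are the commonly used ones and are assumptions here: RPD is taken as the sample standard deviation of the reference values divided by RMSEP, although some authors divide by a bias-corrected standard error instead.

```python
import math

def rmsep(reference, predicted):
    """Root mean square error of prediction."""
    n = len(reference)
    return math.sqrt(sum((p - r) ** 2 for r, p in zip(reference, predicted)) / n)

def rpd(reference, predicted):
    """Ratio of prediction to deviation: SD of reference values / RMSEP (assumed definition)."""
    n = len(reference)
    mean = sum(reference) / n
    sd = math.sqrt(sum((r - mean) ** 2 for r in reference) / (n - 1))
    return sd / rmsep(reference, predicted)

def r_squared(reference, predicted):
    """Coefficient of determination, 1 - SS_residual / SS_total."""
    mean = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot
```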
Katsarov, Plamen; Gergov, Georgi; Alin, Aylin; Pilicheva, Bissera; Al-Degs, Yahya; Simeonov, Vasil; Kassarova, Margarita
2018-03-01
The prediction power of partial least squares (PLS) and multivariate curve resolution-alternating least squares (MCR-ALS) methods has been studied for simultaneous quantitative analysis of the binary drug combination doxylamine succinate and pyridoxine hydrochloride. Analysis of first-order UV overlapped spectra was performed using different PLS models: classical PLS1 and PLS2 as well as partial robust M-regression (PRM). These linear models were compared to MCR-ALS with equality and correlation constraints (MCR-ALS-CC). All techniques operated within the full spectral region and extracted maximum information for the drugs analysed. The developed chemometric methods were validated on external sample sets and were applied to the analyses of pharmaceutical formulations. The obtained statistical parameters were satisfactory for calibration and validation sets. All developed methods can be successfully applied for simultaneous spectrophotometric determination of doxylamine and pyridoxine both in laboratory-prepared mixtures and commercial dosage forms.
NASA Astrophysics Data System (ADS)
Tewari, Jagdish; Strong, Richard; Boulas, Pierre
2017-02-01
This article summarizes the development and validation of a Fourier transform near infrared spectroscopy (FT-NIR) method for the rapid at-line prediction of active pharmaceutical ingredient (API) in a powder blend to optimize small molecule formulations. The method was used to determine the blend uniformity end-point for a pharmaceutical solid dosage formulation containing a range of API concentrations. A set of calibration spectra from samples with concentrations ranging from 1% to 15% of API (w/w) were collected at-line from 4000 to 12,500 cm⁻¹. The ability of the FT-NIR method to predict API concentration in the blend samples was validated against a reference high performance liquid chromatography (HPLC) method. The prediction efficiency of four multivariate data modeling methods, namely partial least-squares 1 (PLS1), partial least-squares 2 (PLS2), principal component regression (PCR) and artificial neural network (ANN), was compared using relevant multivariate figures of merit. The prediction ability of the regression models was cross validated against results generated with the reference HPLC method. PLS1 and ANN showed excellent and superior prediction abilities when compared to PLS2 and PCR. Based upon these results and because of its decreased complexity compared to ANN, PLS1 was selected as the best chemometric method to predict blend uniformity at-line. The FT-NIR measurement and the associated chemometric analysis were implemented in the production environment for rapid at-line determination of the end-point of the small molecule blending operation.
FIGURE 1: Correlation coefficient vs Rank plot. FIGURE 2: FT-NIR spectra of different steps of Blend and final blend. FIGURE 3: Prediction ability of PCR. FIGURE 4: Blend uniformity prediction ability of PLS2. FIGURE 5: Prediction efficiency of blend uniformity using ANN. FIGURE 6: Comparison of prediction efficiency of chemometric models. TABLE 1: Order of Addition for Blending Steps.
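For readers unfamiliar with PLS1 (one response variable per model), a minimal NIPALS-style sketch in plain Python is given below. It is an illustration only, not the software used in the study: the function names are hypothetical, data are mean-centred without scaling, and prediction is done by repeating the deflation steps rather than by forming a regression vector.

```python
import math

def pls1_fit(X, y, n_components):
    """Fit a PLS1 model. X is a list of rows (spectra), y a list of floats."""
    n, m = len(X), len(X[0])
    x_mean = [sum(col) / n for col in zip(*X)]
    y_mean = sum(y) / n
    E = [[v - mu for v, mu in zip(row, x_mean)] for row in X]  # centred X residuals
    f = [v - y_mean for v in y]                                # centred y residuals
    components = []
    for _ in range(n_components):
        # Weight vector: direction in X with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # no covariance left between X and y residuals
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]       # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]  # X loadings
        q = sum(f[i] * t[i] for i in range(n)) / tt                         # y loading
        # Deflate X and y by the part explained by this component.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
    return x_mean, y_mean, components

def pls1_predict(model, x):
    """Predict y for one new sample x by replaying the deflation steps."""
    x_mean, y_mean, components = model
    e = [v - mu for v, mu in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in components:
        t = sum(a * b for a, b in zip(e, w))
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, p)]
    return y_hat
```

PLS2 differs in that several responses share one set of scores; fitting one PLS1 model per response, as sketched here, lets each response choose its own number of components.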
NASA Astrophysics Data System (ADS)
Hegazy, Maha A.; Lotfy, Hayam M.; Mowaka, Shereen; Mohamed, Ekram Hany
2016-07-01
Wavelets have been adapted for a vast number of signal-processing applications due to the amount of information that can be extracted from a signal. In this work, a comparative study on the efficiency of continuous wavelet transform (CWT) as a signal processing tool in univariate regression and a pre-processing tool in multivariate analysis using partial least squares (CWT-PLS) was conducted. These were applied to complex spectral signals of ternary and quaternary mixtures. The CWT-PLS method succeeded in the simultaneous determination of a quaternary mixture of drotaverine (DRO), caffeine (CAF), paracetamol (PAR) and p-aminophenol (PAP, the major impurity of paracetamol). In contrast, the univariate CWT failed to determine the quaternary mixture components simultaneously; it was able to determine only PAR and PAP, and the ternary mixtures of DRO, CAF, and PAR and of CAF, PAR, and PAP. During the calculations of CWT, different wavelet families were tested. The univariate CWT method was validated according to the ICH guidelines. For the development of the CWT-PLS model, a calibration set was prepared by means of an orthogonal experimental design, and the absorption spectra were recorded and processed by CWT. The CWT-PLS model was constructed by regression between the wavelet coefficients and concentration matrices, and validation was performed by both cross validation and external validation sets. Both methods were successfully applied for determination of the studied drugs in pharmaceutical formulations.
Nondestructive evaluation of soluble solid content in strawberry by near infrared spectroscopy
NASA Astrophysics Data System (ADS)
Guo, Zhiming; Huang, Wenqian; Chen, Liping; Wang, Xiu; Peng, Yankun
This paper demonstrates the feasibility of using near infrared (NIR) spectroscopy combined with the synergy interval partial least squares (siPLS) algorithm as a rapid nondestructive method to estimate the soluble solid content (SSC) in strawberry. Spectral preprocessing methods were selected by cross-validation in the model calibration. The partial least squares (PLS) algorithm was used to calibrate the regression model. The performance of the final model was back-evaluated according to the root mean square error of calibration (RMSEC) and correlation coefficient (R²c) in the calibration set, and tested by the root mean square error of prediction (RMSEP) and correlation coefficient (R²p) in the prediction set. The optimal siPLS model was obtained after first-derivative spectral preprocessing. The results of the best model were as follows: RMSEC = 0.2259, R²c = 0.9590 in the calibration set; and RMSEP = 0.2892, R²p = 0.9390 in the prediction set. This work demonstrated that NIR spectroscopy and siPLS with efficient spectral preprocessing is a useful tool for nondestructive evaluation of SSC in strawberry.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tripathi, Markandey M.; Krishnan, Sundar R.; Srinivasan, Kalyan K.
Chemiluminescence emissions from OH*, CH*, C2, and CO2 formed within the reaction zone of premixed flames depend upon the fuel-air equivalence ratio in the burning mixture. In the present paper, a new partial least square regression (PLS-R) based multivariate sensing methodology is investigated and compared with an OH*/CH* intensity ratio-based calibration model for sensing equivalence ratio in atmospheric methane-air premixed flames. Five replications of spectral data at nine different equivalence ratios ranging from 0.73 to 1.48 were used in the calibration of both models. During model development, the PLS-R model was initially validated with the calibration data set using the leave-one-out cross validation technique. Since the PLS-R model used the entire raw spectral intensities, it did not need the nonlinear background subtraction of CO2 emission that is required for typical OH*/CH* intensity ratio calibrations. An unbiased spectral data set (not used in the PLS-R model development), for 28 different equivalence ratio conditions ranging from 0.71 to 1.67, was used to predict equivalence ratios using the PLS-R and the intensity ratio calibration models. It was found that the equivalence ratios predicted with the PLS-R based multivariate calibration model matched the experimentally measured equivalence ratios within 7%; whereas, the OH*/CH* intensity ratio calibration grossly underpredicted equivalence ratios in comparison to measured equivalence ratios, especially under rich conditions (equivalence ratio > 1.2). The practical implications of the chemiluminescence-based multivariate equivalence ratio sensing methodology are also discussed.
Igne, Benoît; Drennen, James K; Anderson, Carl A
2014-01-01
Changes in raw materials and process wear and tear can have significant effects on the prediction error of near-infrared calibration models. When the variability that is present during routine manufacturing is not included in the calibration, test, and validation sets, the long-term performance and robustness of the model will be limited. Nonlinearity is a major source of interference. In near-infrared spectroscopy, nonlinearity can arise from light path-length differences that can come from differences in particle size or density. The usefulness of support vector machine (SVM) regression to handle nonlinearity and improve the robustness of calibration models in scenarios where the calibration set did not include all the variability present in test was evaluated. Compared to partial least squares (PLS) regression, SVM regression was less affected by physical (particle size) and chemical (moisture) differences. The linearity of the SVM predicted values was also improved. Nevertheless, although visualization and interpretation tools have been developed to enhance the usability of SVM-based methods, work is yet to be done to provide chemometricians in the pharmaceutical industry with a regression method that can supplement PLS-based methods.
Lopes, Marta B; Calado, Cecília R C; Figueiredo, Mário A T; Bioucas-Dias, José M
2017-06-01
The monitoring of biopharmaceutical products using Fourier transform infrared (FT-IR) spectroscopy relies on calibration techniques involving the acquisition of spectra of bioprocess samples along the process. The most commonly used method for that purpose is partial least squares (PLS) regression, under the assumption that a linear model is valid. Despite being successful in the presence of small nonlinearities, linear methods may fail in the presence of strong nonlinearities. This paper studies the potential usefulness of nonlinear regression methods for predicting, from in situ near-infrared (NIR) and mid-infrared (MIR) spectra acquired in high-throughput mode, biomass and plasmid concentrations in Escherichia coli DH5-α cultures producing the plasmid model pVAX-LacZ. The linear methods PLS and ridge regression (RR) are compared with their kernel (nonlinear) versions, kPLS and kRR, as well as with the (also nonlinear) relevance vector machine (RVM) and Gaussian process regression (GPR). For the systems studied, RR provided better predictive performances compared to the remaining methods. Moreover, the results point to further investigation based on larger data sets whenever differences in predictive accuracy between a linear method and its kernelized version could not be found. The use of nonlinear methods, however, shall be judged regarding the additional computational cost required to tune their additional parameters, especially when the less computationally demanding linear methods herein studied are able to successfully monitor the variables under study.
Quantification of brain lipids by FTIR spectroscopy and partial least squares regression
NASA Astrophysics Data System (ADS)
Dreissig, Isabell; Machill, Susanne; Salzer, Reiner; Krafft, Christoph
2009-01-01
Brain tissue is characterized by high lipid content. The lipid content decreases and the lipid composition changes during transformation from normal brain tissue to tumors. Therefore, the analysis of brain lipids might complement the existing diagnostic tools to determine the tumor type and tumor grade. The objective of this work is to extract lipids from gray matter and white matter of porcine brain tissue, record infrared (IR) spectra of these extracts and develop a quantification model for the main lipids based on partial least squares (PLS) regression. IR spectra of the pure lipids cholesterol, cholesterol ester, phosphatidic acid, phosphatidylcholine, phosphatidylethanolamine, phosphatidylserine, phosphatidylinositol, sphingomyelin, galactocerebroside and sulfatide were used as references. Two lipid mixtures were prepared for training and validation of the quantification model. The composition of lipid extracts that was predicted by the PLS regression of IR spectra was compared with lipid quantification by thin layer chromatography.
Nazari, Seyed Saeed Hashemi; Mokhayeri, Yaser; Mansournia, Mohammad Ali; Khodakarim, Soheila; Soori, Hamid
2018-05-21
Some studies have shed light on the association between dietary patterns and stroke, but none of them applied reduced rank regression (RRR). Therefore, we sought to extract dietary patterns using RRR, and to show how well the scores extracted by RRR predict stroke in comparison to the scores produced by partial least squares (PLS) and principal components regression (PCR). Diet data at baseline with four response variables including body mass index (BMI), fibrinogen, IL-6, and low-density lipoprotein (LDL) cholesterol were used to extract dietary patterns. Analyses were based on 5468 men and women aged 45-84 y who had no clinical cardiovascular diseases (CVD) from the Multi-Ethnic Study of Atherosclerosis (MESA). Dietary patterns were created by three methods: RRR, PLS, and PCR. The RRR1 was positively associated with stroke incidence in both models (for model 1 hazard ratio (HR): 7.49; 95% CI: 1.66, 33.69 P for trend = 0.01 and for model 2 HR: 6.83; 95% CI: 1.51, 30.87 for quintile 5 compared with the reference category P for trend = 0.02). The RRR1, PLS1, and PCR1 were high in fats and oils, poultry, tomatoes, fried potato and processed meat. Additionally, RRR1 and PLS1 were high in dark-yellow and cruciferous vegetables which were negatively correlated with the first dietary pattern. Mainly according to the RRR, we identified that a dietary pattern high in fats and oil, poultry, non-diet soda, processed meat, tomatoes, legumes, chicken, tuna and egg salad, fried potato and low in dark-yellow and cruciferous vegetables may increase the incidence of stroke.
Bricklemyer, Ross S; Brown, David J; Turk, Philip J; Clegg, Sam M
2013-10-01
Laser-induced breakdown spectroscopy (LIBS) provides a potential method for rapid, in situ soil C measurement. In previous research on the application of LIBS to intact soil cores, we hypothesized that ultraviolet (UV) spectrum LIBS (200-300 nm) might not provide sufficient elemental information to reliably discriminate between soil organic C (SOC) and inorganic C (IC). In this study, using a custom complete spectrum (245-925 nm) core-scanning LIBS instrument, we analyzed 60 intact soil cores from six wheat fields. Predictive multi-response partial least squares (PLS2) models using full and reduced spectrum LIBS were compared for directly determining soil total C (TC), IC, and SOC. Two regression shrinkage and variable selection approaches, the least absolute shrinkage and selection operator (LASSO) and sparse multivariate regression with covariance estimation (MRCE), were tested for soil C predictions and the identification of wavelengths important for soil C prediction. Using complete spectrum LIBS for PLS2 modeling reduced the calibration standard error of prediction (SEP) 15 and 19% for TC and IC, respectively, compared to UV spectrum LIBS. The LASSO and MRCE approaches provided significantly improved calibration accuracy and reduced SEP 32-55% over UV spectrum PLS2 models. We conclude that (1) complete spectrum LIBS is superior to UV spectrum LIBS for predicting soil C for intact soil cores without pretreatment; (2) LASSO and MRCE approaches provide improved calibration prediction accuracy over PLS2 but require additional testing with increased soil and target analyte diversity; and (3) measurement errors associated with analyzing intact cores (e.g., sample density and surface roughness) require further study and quantification.
Malegori, Cristina; Nascimento Marques, Emanuel José; de Freitas, Sergio Tonetto; Pimentel, Maria Fernanda; Pasquini, Celio; Casiraghi, Ernestina
2017-04-01
The main goal of this study was to investigate the analytical performances of a state-of-the-art device, one of the smallest dispersion NIR spectrometers on the market (MicroNIR 1700), making a critical comparison with a benchtop FT-NIR spectrometer in the evaluation of the prediction accuracy. In particular, the aim of this study was to estimate in a non-destructive manner, titratable acidity and ascorbic acid content in acerola fruit during ripening, in a view of direct applicability in field of this new miniaturised handheld device. Acerola (Malpighia emarginata DC.) is a super-fruit characterised by a considerable amount of ascorbic acid, ranging from 1.0% to 4.5%. However, during ripening, acerola colour changes and the fruit may lose as much as half of its ascorbic acid content. Because the variability of chemical parameters followed a non-strictly linear profile, two different regression algorithms were compared: PLS and SVM. Regression models obtained with Micro-NIR spectra give better results using SVM algorithm, for both ascorbic acid and titratable acidity estimation. FT-NIR data give comparable results using both SVM and PLS algorithms, with lower errors for SVM regression. The prediction ability of the two instruments was statistically compared using the Passing-Bablok regression algorithm; the outcomes are critically discussed together with the regression models, showing the suitability of the portable Micro-NIR for in field monitoring of chemical parameters of interest in acerola fruits. Copyright © 2016 Elsevier B.V. All rights reserved.
Jović, Ozren; Smrečki, Neven; Popović, Zora
2016-04-01
A novel quantitative prediction and variable selection method called interval ridge regression (iRR) is studied in this work. The method is performed on six data sets of FTIR, two data sets of UV-vis and one data set of DSC. The obtained results show that models built with ridge regression on optimal variables selected with iRR significantly outperform models built with ridge regression on all variables in both calibration (6 out of 9 cases) and validation (2 out of 9 cases). In this study, iRR is also compared with interval partial least squares regression (iPLS). iRR outperformed iPLS in validation (insignificantly in 6 out of 9 cases and significantly in one out of 9 cases for p<0.05). Also, iRR can be a fast alternative to iPLS, especially when the degree of complexity of the analyzed system is unknown, i.e. if the upper limit of the number of latent variables is not easily estimated for iPLS. Adulteration of hempseed (H) oil, a well-known health-beneficial nutrient, is studied in this work by mixing it with cheap and widely used oils such as soybean (So) oil, rapeseed (R) oil and sunflower (Su) oil. Binary mixture sets of hempseed oil with these three oils (HSo, HR and HSu) and a ternary mixture set of H oil, R oil and Su oil (HRSu) were considered. The obtained accuracy indicates that using iRR on FTIR and UV-vis data, each particular oil can be very successfully quantified (in all 8 cases RMSEP<1.2%). This means that FTIR-ATR coupled with iRR can very rapidly and effectively determine the level of adulteration in the adulterated hempseed oil (R² > 0.99). Copyright © 2015 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Chen, Hui; Tan, Chao; Lin, Zan; Wu, Tong
2018-01-01
Milk is among the most popular nutrient sources worldwide and is of great interest due to its beneficial medicinal properties. The feasibility of classifying milk powder samples with respect to their brands and of determining protein concentration is investigated by NIR spectroscopy along with chemometrics. Two datasets were prepared for the experiments. One contains 179 samples of four brands for classification and the other contains 30 samples for quantitative analysis. Principal component analysis (PCA) was used for exploratory analysis. Based on an effective model-independent variable selection method, i.e., minimal-redundancy maximal-relevance (MRMR), only 18 variables were selected to construct a partial least-square discriminant analysis (PLS-DA) model. On the test set, the PLS-DA model based on the selected variable set was compared with the full-spectrum PLS-DA model, both of which achieved 100% accuracy. In quantitative analysis, the partial least-square regression (PLSR) model constructed from the selected subset of 260 variables significantly outperforms the full-spectrum model. It seems that the combination of NIR spectroscopy, MRMR and PLS-DA or PLSR is a powerful tool for classifying different brands of milk and determining the protein content.
Nishii, Takashi; Genkawa, Takuma; Watari, Masahiro; Ozaki, Yukihiro
2012-01-01
A new selection procedure of an informative near-infrared (NIR) region for regression model building is proposed that uses an online NIR/mid-infrared (mid-IR) dual-region spectrometer in conjunction with two-dimensional (2D) NIR/mid-IR heterospectral correlation spectroscopy. In this procedure, both NIR and mid-IR spectra of a liquid sample are acquired sequentially during a reaction process using the NIR/mid-IR dual-region spectrometer; the 2D NIR/mid-IR heterospectral correlation spectrum is subsequently calculated from the obtained spectral data set. From the calculated 2D spectrum, a NIR region is selected that includes bands of high positive correlation intensity with mid-IR bands assigned to the analyte, and used for the construction of a regression model. To evaluate the performance of this procedure, a partial least-squares (PLS) regression model of the ethanol concentration in a fermentation process was constructed. During fermentation, NIR/mid-IR spectra in the 10000 - 1200 cm⁻¹ region were acquired every 3 min, and a 2D NIR/mid-IR heterospectral correlation spectrum was calculated to investigate the correlation intensity between the NIR and mid-IR bands. NIR regions that include bands at 4343, 4416, 5778, 5904, and 5955 cm⁻¹, which result from the combinations and overtones of the C-H group of ethanol, were selected for use in the PLS regression models, by taking the correlation intensity of a mid-IR band at 2985 cm⁻¹ arising from the CH₃ asymmetric stretching vibration mode of ethanol as a reference. The predicted results indicate that the ethanol concentrations calculated from the PLS regression models fit well to those obtained by high-performance liquid chromatography. Thus, it can be concluded that the selection procedure using the NIR/mid-IR dual-region spectrometer combined with 2D NIR/mid-IR heterospectral correlation spectroscopy is a powerful method for the construction of a reliable regression model.
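The band-selection idea above can be illustrated with a synchronous heterospectral correlation map. The sketch assumes the usual generalized 2D correlation definition with the time-averaged spectrum as reference; the abstract does not state the exact computation, and the function names and threshold parameter are hypothetical.

```python
def synchronous_correlation(nir, mir):
    """Synchronous 2D heterospectral correlation intensities.

    nir and mir are lists of spectra (one row per time point) recorded at
    the same time points. Returns phi, where phi[i][j] relates NIR
    variable i to mid-IR variable j.
    """
    m = len(nir)  # number of time points
    nir_mean = [sum(col) / m for col in zip(*nir)]
    mir_mean = [sum(col) / m for col in zip(*mir)]
    # Dynamic spectra: deviation from the time-averaged spectrum.
    nir_dyn = [[v - mu for v, mu in zip(row, nir_mean)] for row in nir]
    mir_dyn = [[v - mu for v, mu in zip(row, mir_mean)] for row in mir]
    return [[sum(nir_dyn[t][i] * mir_dyn[t][j] for t in range(m)) / (m - 1)
             for j in range(len(mir_mean))]
            for i in range(len(nir_mean))]

def positively_correlated(phi, mir_index, threshold):
    """Indices of NIR variables whose correlation with one mid-IR band exceeds threshold."""
    return [i for i, row in enumerate(phi) if row[mir_index] > threshold]
```

NIR variables that rise and fall together with the chosen mid-IR analyte band give positive intensities and would be kept for the regression model; variables that move in the opposite direction give negative intensities.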
Determination of total phenolic compounds in compost by infrared spectroscopy.
Cascant, M M; Sisouane, M; Tahiri, S; Krati, M El; Cervera, M L; Garrigues, S; de la Guardia, M
2016-06-01
Middle and near infrared (MIR and NIR) spectroscopy were applied to determine the total phenolic compounds (TPC) content in compost samples, based on models built using partial least squares (PLS) regression. Multiplicative scatter correction, standard normal variate and first derivative were employed as spectral pretreatments, and the number of latent variables was optimized by leave-one-out cross-validation. The performance of the PLS-ATR-MIR and PLS-DR-NIR models was evaluated according to the root mean square errors of cross-validation and prediction (RMSECV and RMSEP), the coefficient of determination for prediction (Rpred(2)) and the residual predictive deviation (RPD); RPD values of 5.83 and 8.26 were obtained for MIR and NIR, respectively. Copyright © 2016 Elsevier B.V. All rights reserved.
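The figures of merit named in this abstract can be computed as in the minimal sketch below. It assumes the usual definition of RPD as the standard deviation of the reference values divided by RMSEP; the function names and the numbers are illustrative and are not taken from the study.

```python
import math
import statistics


def rmsep(reference, predicted):
    """Root mean square error of prediction over a validation set."""
    n = len(reference)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)


def rpd(reference, predicted):
    """Residual predictive deviation: SD of the reference values / RMSEP (assumed definition)."""
    return statistics.stdev(reference) / rmsep(reference, predicted)


# Hypothetical reference and predicted values for five validation samples.
ref = [10.0, 20.0, 30.0, 40.0, 50.0]
pred = [12.0, 18.0, 31.0, 39.0, 52.0]
error = rmsep(ref, pred)
ratio = rpd(ref, pred)
```

A larger RPD means the prediction error is small relative to the natural spread of the reference values.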
Whelan, Jessica; Craven, Stephen; Glennon, Brian
2012-01-01
In this study, the application of Raman spectroscopy to the simultaneous quantitative determination of glucose, glutamine, lactate, ammonia, glutamate, total cell density (TCD), and viable cell density (VCD) in a CHO fed-batch process was demonstrated in situ in 3 L and 15 L bioreactors. Spectral preprocessing and partial least squares (PLS) regression were used to correlate spectral data with off-line reference data. Separate PLS calibration models were developed for each analyte at the 3 L laboratory bioreactor scale before assessing its transferability to the same bioprocess conducted at the 15 L pilot scale. PLS calibration models were successfully developed for all analytes bar VCD and transferred to the 15 L scale. Copyright © 2012 American Institute of Chemical Engineers (AIChE).
Yoo, Kwangsun; Rosenberg, Monica D; Hsu, Wei-Ting; Zhang, Sheng; Li, Chiang-Shan R; Scheinost, Dustin; Constable, R Todd; Chun, Marvin M
2018-02-15
Connectome-based predictive modeling (CPM; Finn et al., 2015; Shen et al., 2017) was recently developed to predict individual differences in traits and behaviors, including fluid intelligence (Finn et al., 2015) and sustained attention (Rosenberg et al., 2016a), from functional brain connectivity (FC) measured with fMRI. Here, using the CPM framework, we compared the predictive power of three different measures of FC (Pearson's correlation, accordance, and discordance) and two different prediction algorithms (linear and partial least square [PLS] regression) for attention function. Accordance and discordance are recently proposed FC measures that respectively track in-phase synchronization and out-of-phase anti-correlation (Meskaldji et al., 2015). We defined connectome-based models using task-based or resting-state FC data, and tested the effects of (1) functional connectivity measure and (2) feature-selection/prediction algorithm on individualized attention predictions. Models were internally validated in a training dataset using leave-one-subject-out cross-validation, and externally validated with three independent datasets. The training dataset included fMRI data collected while participants performed a sustained attention task and rested (N = 25; Rosenberg et al., 2016a). The validation datasets included: 1) data collected during performance of a stop-signal task and at rest (N = 83, including 19 participants who were administered methylphenidate prior to scanning; Farr et al., 2014a; Rosenberg et al., 2016b), 2) data collected during Attention Network Task performance and rest (N = 41, Rosenberg et al., in press), and 3) resting-state data and ADHD symptom severity from the ADHD-200 Consortium (N = 113; Rosenberg et al., 2016a). 
Models defined using all combinations of functional connectivity measure (Pearson's correlation, accordance, and discordance) and prediction algorithm (linear and PLS regression) predicted attentional abilities, with correlations between predicted and observed measures of attention as high as 0.9 for internal validation, and 0.6 for external validation (all p's < 0.05). Models trained on task data outperformed models trained on rest data. Pearson's correlation and accordance features generally showed a small numerical advantage over discordance features, while PLS regression models were usually better than linear regression models. Overall, in addition to correlation features combined with linear models (Rosenberg et al., 2016a), it is useful to consider accordance features and PLS regression for CPM. Copyright © 2017 Elsevier Inc. All rights reserved.
Allegrini, Franco; Braga, Jez W B; Moreira, Alessandro C O; Olivieri, Alejandro C
2018-06-29
A new multivariate regression model, named Error Covariance Penalized Regression (ECPR) is presented. Following a penalized regression strategy, the proposed model incorporates information about the measurement error structure of the system, using the error covariance matrix (ECM) as a penalization term. Results are reported from both simulations and experimental data based on replicate mid and near infrared (MIR and NIR) spectral measurements. The results for ECPR are better under non-iid conditions when compared with traditional first-order multivariate methods such as ridge regression (RR), principal component regression (PCR) and partial least-squares regression (PLS). Copyright © 2018 Elsevier B.V. All rights reserved.
Thermal-to-visible face recognition using partial least squares.
Hu, Shuowen; Choi, Jonghyun; Chan, Alex L; Schwartz, William Robson
2015-03-01
Although visible face recognition has been an active area of research for several decades, cross-modal face recognition has only been explored by the biometrics community relatively recently. Thermal-to-visible face recognition is one of the most difficult cross-modal face recognition challenges, because of the difference in phenomenology between the thermal and visible imaging modalities. We address the cross-modal recognition problem using a partial least squares (PLS) regression-based approach consisting of preprocessing, feature extraction, and PLS model building. The preprocessing and feature extraction stages are designed to reduce the modality gap between the thermal and visible facial signatures, and facilitate the subsequent one-vs-all PLS-based model building. We incorporate multi-modal information into the PLS model building stage to enhance cross-modal recognition. The performance of the proposed recognition algorithm is evaluated on three challenging datasets containing visible and thermal imagery acquired under different experimental scenarios: time-lapse, physical tasks, mental tasks, and subject-to-camera range. These scenarios represent difficult challenges relevant to real-world applications. We demonstrate that the proposed method performs robustly for the examined scenarios.
Ramírez, J; Górriz, J M; Segovia, F; Chaves, R; Salas-Gonzalez, D; López, M; Alvarez, I; Padilla, P
2010-03-19
This letter presents a computer aided diagnosis (CAD) technique for the early detection of Alzheimer's disease (AD) by means of single photon emission computed tomography (SPECT) image classification. The proposed method is based on a partial least squares (PLS) regression model and a random forest (RF) predictor. The curse of dimensionality is addressed by reducing the large dimensionality of the input data, downscaling the SPECT images and extracting score features using PLS. A RF predictor then forms an ensemble of classification and regression tree (CART)-like classifiers, its output being determined by a majority vote of the trees in the forest. A baseline principal component analysis (PCA) system is also developed for reference. The experimental results show that the combined PLS-RF system yields a generalization error that converges to a limit when increasing the number of trees in the forest. Thus, the generalization error is reduced when using PLS and depends on the strength of the individual trees in the forest and the correlation between them. Moreover, PLS feature extraction is found to be more effective than PCA for extracting discriminative information from the data, yielding peak sensitivity, specificity and accuracy values of 100%, 92.7%, and 96.9%, respectively. In addition, the proposed CAD system outperformed several other recently developed AD CAD systems. Copyright 2010 Elsevier Ireland Ltd. All rights reserved.
He, Yan-Lin; Xu, Yuan; Geng, Zhi-Qiang; Zhu, Qun-Xiong
2016-03-01
In this paper, a hybrid robust model based on an improved functional link neural network integrated with partial least squares (IFLNN-PLS) is proposed. Firstly, an improved functional link neural network with small norm of expanded weights and high input-output correlation (SNEWHIOC-FLNN) was proposed for enhancing the generalization performance of FLNN. Unlike the traditional FLNN, the expanded variables of the original inputs are not directly used as the inputs in the proposed SNEWHIOC-FLNN model. The original inputs are attached to some small norm of expanded weights. As a result, the correlation coefficient between some of the expanded variables and the outputs is enhanced. The larger the correlation coefficient is, the more relevant the expanded variables tend to be. In the end, the expanded variables with larger correlation coefficients are selected as the inputs to improve the performance of the traditional FLNN. In order to test the proposed SNEWHIOC-FLNN model, three UCI (University of California, Irvine) regression datasets named Housing, Concrete Compressive Strength (CCS), and Yacht Hydro Dynamics (YHD) are selected. Then a hybrid model based on the improved FLNN integrated with partial least squares (IFLNN-PLS) was built. In the IFLNN-PLS model, the connection weights are calculated using the partial least squares method rather than the error back-propagation algorithm. Lastly, IFLNN-PLS was developed as an intelligent measurement model for accurately predicting the key variables in the Purified Terephthalic Acid (PTA) process and the High Density Polyethylene (HDPE) process. Simulation results illustrated that the IFLNN-PLS could significantly improve the prediction performance. Copyright © 2015 ISA. Published by Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Ahmed, Shamim; Miorelli, Roberto; Calmon, Pierre; Anselmi, Nicola; Salucci, Marco
2018-04-01
This paper describes a Learning-By-Examples (LBE) technique for performing quasi-real-time flaw localization and characterization within a conductive tube based on Eddy Current Testing (ECT) signals. Within the framework of LBE, the combination of full-factorial (i.e., GRID) sampling and Partial Least Squares (PLS) feature extraction (i.e., GRID-PLS) is applied to generate a suitable training set in the offline phase. Support Vector Regression (SVR) is utilized for model development and inversion during the offline and online phases, respectively. The performance and robustness of the proposed GRID-PLS/SVR strategy on a noisy test set are evaluated and compared with the standard GRID/SVR approach.
Hordge, LaQuana N; McDaniel, Kiara L; Jones, Derick D; Fakayode, Sayo O
2016-05-15
The endocrine disruption property of estrogens necessitates the immediate need for effective monitoring and development of analytical protocols for their analyses in biological and human specimens. This study explores the first combined utility of a steady-state fluorescence spectroscopy and multivariate partial-least-square (PLS) regression analysis for the simultaneous determination of two estrogens (17α-ethinylestradiol (EE) and norgestimate (NOR)) concentrations in bovine serum albumin (BSA) and human serum albumin (HSA) samples. The influence of EE and NOR concentrations and temperature on the emission spectra of EE-HSA EE-BSA, NOR-HSA, and NOR-BSA complexes was also investigated. The binding of EE with HSA and BSA resulted in increase in emission characteristics of HSA and BSA and a significant blue spectra shift. In contrast, the interaction of NOR with HSA and BSA quenched the emission characteristics of HSA and BSA. The observed emission spectral shifts preclude the effective use of traditional univariate regression analysis of fluorescent data for the determination of EE and NOR concentrations in HSA and BSA samples. Multivariate partial-least-squares (PLS) regression analysis was utilized to correlate the changes in emission spectra with EE and NOR concentrations in HSA and BSA samples. The figures-of-merit of the developed PLS regression models were excellent, with limits of detection as low as 1.6×10(-8) M for EE and 2.4×10(-7) M for NOR and good linearity (R(2)>0.994985). The PLS models correctly predicted EE and NOR concentrations in independent validation HSA and BSA samples with a root-mean-square-percent-relative-error (RMS%RE) of less than 6.0% at physiological condition. On the contrary, the use of univariate regression resulted in poor predictions of EE and NOR in HSA and BSA samples, with RMS%RE larger than 40% at physiological conditions. 
High accuracy, low sensitivity, simplicity, and low cost, with no prior analyte extraction or separation required, make this method a promising, compelling, and attractive alternative for the rapid determination of estrogen concentrations in biomedical and biological specimens, pharmaceuticals, or environmental samples. Published by Elsevier B.V.
NASA Astrophysics Data System (ADS)
Hart, Brian K.; Griffiths, Peter R.
1998-06-01
Partial least squares (PLS) regression has been evaluated as a robust calibration technique for over 100 hazardous air pollutants (HAPs) measured by open path Fourier transform infrared (OP/FT-IR) spectrometry. PLS has the advantage over the current recommended calibration method of classical least squares (CLS), in that it can look at the whole useable spectrum (700-1300 cm-1, 2000-2150 cm-1, and 2400-3000 cm-1), and detect several analytes simultaneously. Up to one hundred HAPs synthetically added to OP/FT-IR backgrounds have been simultaneously calibrated and detected using PLS. PLS also has the advantage in requiring less preprocessing of spectra than that which is required in CLS calibration schemes, allowing PLS to provide user independent real-time analysis of OP/FT-IR spectra.
Partial least squares (PLS) analysis offers a number of advantages over the more traditionally used regression analyses applied in landscape ecology, particularly for determining the associations among multiple constituents of surface water and landscape configuration. Common dat...
Gómez-Carracedo, M P; Andrade, J M; Rutledge, D N; Faber, N M
2007-03-07
Selecting the correct dimensionality is critical for obtaining partial least squares (PLS) regression models with good predictive ability. Although calibration and validation sets are best established using experimental designs, industrial laboratories cannot afford such an approach. Typically, samples are collected in a (formally) undesigned way, spread over time, and their measurements are included in routine measurement processes. This makes it hard to evaluate PLS model dimensionality. In this paper, classical criteria (leave-one-out cross-validation and adjusted Wold's criterion) are compared to recently proposed alternatives (smoothed PLS-PoLiSh and a randomization test) to seek out the optimum dimensionality of PLS models. Kerosene (jet fuel) samples were measured by attenuated total reflectance-mid-IR spectrometry and their spectra were used to predict eight important properties determined using reference methods that are time-consuming and prone to analytical errors. The alternative methods were shown to give reliable dimensionality predictions when compared to external validation. By contrast, the simpler methods seemed to be largely affected by the largest changes in the modeling capabilities of the first components.
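The classical leave-one-out criterion mentioned in this abstract can be sketched as follows. This is a minimal pure-Python PLS1 (single response, NIPALS-style deflation) written for illustration only; it is not the authors' implementation, the function names are hypothetical, and the toy data are noise-free so that two components reproduce the response exactly.

```python
import math


def pls1_fit(X, y, n_components):
    """Fit a PLS1 model by sequential deflation; returns (x_mean, y_mean, components)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centred X residual
    f = [v - y_mean for v in y]                                # centred y residual
    components = []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left in y that X can explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict one sample by repeating the same deflation on the new row."""
    x_mean, y_mean, components = model
    e = [x[j] - x_mean[j] for j in range(len(x))]
    y_hat = y_mean
    for w, load, q in components:
        t = sum(e[j] * w[j] for j in range(len(e)))
        y_hat += q * t
        e = [e[j] - t * load[j] for j in range(len(e))]
    return y_hat


def loo_rmsecv(X, y, n_components):
    """Leave-one-out root mean square error of cross-validation."""
    press = 0.0
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_components)
        press += (y[i] - pls1_predict(model, X[i])) ** 2
    return math.sqrt(press / len(X))


# Toy data: y = 2*x1 - x2 + 1 exactly, so two latent variables suffice.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0], [5.0, 7.0]]
y = [2 * a - b + 1 for a, b in X]
rmsecv_by_dim = {a: loo_rmsecv(X, y, a) for a in (1, 2)}
```

With real, noisy spectra the RMSECV curve rarely has such a sharp minimum, which is the difficulty the abstract describes.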
Burgués, Javier; Marco, Santiago
2018-08-17
Metal oxide semiconductor (MOX) sensors are usually temperature-modulated and calibrated with multivariate models such as partial least squares (PLS) to increase the inherent low selectivity of this technology. The multivariate sensor response patterns exhibit heteroscedastic and correlated noise, which suggests that maximum likelihood methods should outperform PLS. One contribution of this paper is the comparison between PLS and maximum likelihood principal components regression (MLPCR) in MOX sensors. PLS is often criticized by the lack of interpretability when the model complexity increases beyond the chemical rank of the problem. This happens in MOX sensors due to cross-sensitivities to interferences, such as temperature or humidity and non-linearity. Additionally, the estimation of fundamental figures of merit, such as the limit of detection (LOD), is still not standardized in multivariate models. Orthogonalization methods, such as orthogonal projection to latent structures (O-PLS), have been successfully applied in other fields to reduce the complexity of PLS models. In this work, we propose a LOD estimation method based on applying the well-accepted univariate LOD formulas to the scores of the first component of an orthogonal PLS model. The resulting LOD is compared to the multivariate LOD range derived from error-propagation. The methodology is applied to data extracted from temperature-modulated MOX sensors (FIS SB-500-12 and Figaro TGS 3870-A04), aiming at the detection of low concentrations of carbon monoxide in the presence of uncontrolled humidity (chemical noise). We found that PLS models were simpler and more accurate than MLPCR models. Average LOD values of 0.79 ppm (FIS) and 1.06 ppm (Figaro) were found using the approach described in this paper. These values were contained within the LOD ranges obtained with the error-propagation approach. 
The mean LOD increased to 1.13 ppm (FIS) and 1.59 ppm (Figaro) when considering validation samples collected two weeks after calibration, which represents a 43% and 46% degradation, respectively. The orthogonal score-plot was a very convenient tool to visualize MOX sensor data and to validate the LOD estimates. Copyright © 2018 Elsevier B.V. All rights reserved.
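The abstract does not state which univariate LOD formula was applied to the first-component scores. As an assumption, the sketch below uses one widely used expression, LOD = 3.3 s / |b|, where b is the slope of a straight calibration line and s is the residual standard deviation of the fit; the variable names and the numbers are hypothetical.

```python
import math


def univariate_lod(concentration, signal, k=3.3):
    """LOD = k * s / |slope| from an ordinary least squares calibration line.

    'signal' could be any univariate response, for example a first-component
    score; k = 3.3 is an assumed convention, not a value from the abstract.
    """
    n = len(concentration)
    x_mean = sum(concentration) / n
    y_mean = sum(signal) / n
    sxx = sum((x - x_mean) ** 2 for x in concentration)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(concentration, signal))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    residuals = [y - (intercept + slope * x) for x, y in zip(concentration, signal)]
    s = math.sqrt(sum(r * r for r in residuals) / (n - 2))  # residual standard deviation
    return k * s / abs(slope)


# Hypothetical calibration: concentration in ppm versus a score-like response.
ppm = [0.0, 1.0, 2.0, 3.0, 4.0]
score = [0.1, 1.9, 4.1, 5.9, 8.1]
lod = univariate_lod(ppm, score)
```

The result is expressed in the concentration units of the calibration, here ppm.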
Partial least squares (PLS) analysis offers a number of advantages over the more traditionally used regression analyses applied in landscape ecology to study the associations among constituents of surface water and landscapes. Common data problems in ecological studies include: s...
Lakshmi, KS; Lakshmi, S
2010-01-01
Two chemometric methods were developed for the simultaneous determination of telmisartan and hydrochlorothiazide. The chemometric methods applied were principal component regression (PCR) and partial least square (PLS-1). These approaches were successfully applied to quantify the two drugs in the mixture using the information included in the UV absorption spectra of appropriate solutions in the range of 200-350 nm with the intervals Δλ = 1 nm. The calibration of PCR and PLS-1 models was evaluated by internal validation (prediction of compounds in its own designed training set of calibration) and by external validation over laboratory prepared mixtures and pharmaceutical preparations. The PCR and PLS-1 methods require neither any separation step, nor any prior graphical treatment of the overlapping spectra of the two drugs in a mixture. The results of PCR and PLS-1 methods were compared with each other and a good agreement was found. PMID:21331198
Passos, Cláudia P; Cardoso, Susana M; Barros, António S; Silva, Carlos M; Coimbra, Manuel A
2010-02-28
Fourier transform infrared (FTIR) spectroscopy has been emphasised as a widespread technique for the quick assessment of food components. In this work, procyanidins were extracted with methanol and acetone/water from the seeds of white and red grape varieties. Fractionation by graded methanol/chloroform precipitations yielded 26 samples that were characterised using thiolysis as pre-treatment followed by HPLC-UV and MS detection. The average degree of polymerisation (DPn) of the procyanidins in the samples ranged from 2 to 11 flavan-3-ol residues. FTIR spectroscopy within the wavenumber region of 1800-700 cm(-1) allowed a partial least squares (PLS1) regression model with 8 latent variables (LVs) to be built for the estimation of the DPn, giving a RMSECV of 11.7%, with a R(2) of 0.91 and a RMSEP of 2.58. The application of orthogonal projection to latent structures (O-PLS1) clarifies the interpretation of the regression model vectors. Moreover, the O-PLS procedure removed 88% of the variation not correlated with the DPn, making it possible to relate the increase of the absorbance peaks at 1203 and 1099 cm(-1) to the increase of the DPn due to the higher proportion of substitutions in the aromatic ring of the polymerised procyanidin molecules. Copyright 2009 Elsevier B.V. All rights reserved.
Lafuente, Victoria; Herrera, Luis J; Pérez, María del Mar; Val, Jesús; Negueruela, Ignacio
2015-08-15
In this work, near infrared spectroscopy (NIR) and an acoustic measure (AWETA) (two non-destructive methods) were applied in Prunus persica fruit 'Calrico' (n = 260) to predict Magness-Taylor (MT) firmness. Separate and combined use of these measures was evaluated and compared using partial least squares (PLS) and least squares support vector machine (LS-SVM) regression methods. Also, a mutual-information-based variable selection method, seeking to find the most significant variables to produce optimal accuracy of the regression models, was applied to a joint set of variables (NIR wavelengths and AWETA measure). The newly proposed combined NIR-AWETA model gave good values of the determination coefficient (R(2)) for PLS and LS-SVM methods (0.77 and 0.78, respectively), improving the reliability of MT firmness prediction in comparison with separate NIR and AWETA predictions. The three variables selected by the variable selection method (AWETA measure plus NIR wavelengths 675 and 697 nm) achieved R(2) values of 0.76 and 0.77 for PLS and LS-SVM, respectively. These results indicated that the proposed mutual-information-based variable selection algorithm was a powerful tool for the selection of the most relevant variables. © 2014 Society of Chemical Industry.
Rapid detection of talcum powder in tea using FT-IR spectroscopy coupled with chemometrics
Li, Xiaoli; Zhang, Yuying; He, Yong
2016-01-01
This paper investigated the feasibility of Fourier transform infrared transmission (FT-IR) spectroscopy to detect talcum powder illegally added in tea based on chemometric methods. Firstly, 210 samples of tea powder with 13 dose levels of talcum powder were prepared for FT-IR spectra acquisition. In order to highlight the slight variations in FT-IR spectra, smoothing, normalization and standard normal variate (SNV) were employed to preprocess the raw spectra. Among them, SNV preprocessing had the best performance, with a high correlation coefficient of prediction (RP = 0.948) and a low root mean square error of prediction (RMSEP = 0.108) for the partial least squares (PLS) model. Then 18 characteristic wavenumbers were selected based on a hybrid of backward interval partial least squares (biPLS) regression, competitive adaptive reweighted sampling (CARS) algorithm and successive projections algorithm (SPA). These characteristic wavenumbers accounted for only 0.64% of the full set of wavenumbers. Following that, the 18 characteristic wavenumbers were used to build linear and nonlinear determination models by PLS regression and extreme learning machine (ELM), respectively. The optimal model with RP = 0.963 and RMSEP = 0.137 was achieved by ELM algorithm. These results demonstrated that FT-IR spectroscopy with chemometrics could be used successfully to detect talcum powder in tea. PMID:27468701
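For readers unfamiliar with the SNV pretreatment named in this abstract, the sketch below shows its usual form: each spectrum is centred on its own mean and scaled by its own standard deviation. The function name and the toy intensities are illustrative, and the use of the sample (n - 1) standard deviation is an assumption.

```python
import statistics


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    mean = sum(spectrum) / len(spectrum)
    sd = statistics.stdev(spectrum)  # sample standard deviation (n - 1), assumed here
    if sd == 0:
        raise ValueError("a flat spectrum cannot be SNV-scaled")
    return [(v - mean) / sd for v in spectrum]


# Two hypothetical spectra that differ only by a multiplicative scatter factor.
raw = [1.0, 2.0, 3.0, 4.0]
scaled = [2.0, 4.0, 6.0, 8.0]
z_raw = snv(raw)
z_scaled = snv(scaled)
```

Because the two toy spectra differ only by a constant factor, SNV maps them onto the same corrected spectrum, which is why it is used to suppress scatter effects.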
NASA Astrophysics Data System (ADS)
He, Anhua; Singh, Ramesh P.; Sun, Zhaohua; Ye, Qing; Zhao, Gang
2016-07-01
The earth tide, atmospheric pressure, precipitation and earthquakes all cause fluctuations in water well levels; earthquakes in particular have a strong impact, and anomalous co-seismic changes in ground water levels have been observed. In this paper, we have used four different models, simple linear regression (SLR), multiple linear regression (MLR), principal component analysis (PCA) and partial least squares (PLS), to compute the atmospheric pressure and earth tidal effects on water level. Furthermore, we have used the Akaike information criterion (AIC) to study the performance of the various models. Based on the lowest AIC and sum of squares for error values, the best estimate of the effects of atmospheric pressure and earth tide on water level is found using the MLR model. However, the MLR model does not account for multicollinearity between inputs; as a result, the atmospheric pressure and earth tidal response coefficients fail to reflect the mechanisms associated with the groundwater level fluctuations. Among the models that resolve the serious multicollinearity of the inputs, the PLS model shows the minimum AIC value. The atmospheric pressure and earth tidal response coefficients obtained with the PLS model agree closely with the observations. The atmospheric pressure and the earth tidal response coefficients are found to be sensitive to the stress-strain state using the observed data for the period 1 April-8 June 2008 of Chuan 03# well. The transient enhancement of porosity of the rock mass around the Chuan 03# well associated with the Wenchuan earthquake (Mw = 7.9, 12 May 2008), which returned to its original pre-seismic level after 13 days, indicates that the co-seismic sharp rise of the water level in the well could be induced by static stress change, rather than the development of new fractures.
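The model comparison described here can be illustrated with the least-squares form of the AIC, n ln(SSE/n) + 2k, which holds up to an additive constant under Gaussian errors. Whether the authors used exactly this form, and how they counted parameters for PCA and PLS, is not stated in the abstract; the numbers below are invented for illustration.

```python
import math


def aic_least_squares(sse, n_obs, n_params):
    """AIC for a least-squares fit with Gaussian errors, up to an additive constant."""
    return n_obs * math.log(sse / n_obs) + 2 * n_params


# Hypothetical fits to the same 50 observations.
n = 50
aic_small = aic_least_squares(sse=12.0, n_obs=n, n_params=2)  # fewer parameters
aic_large = aic_least_squares(sse=11.5, n_obs=n, n_params=5)  # slightly lower SSE
preferred = "small" if aic_small < aic_large else "large"
```

In this toy comparison the smaller model is preferred: the modest reduction in the sum of squared errors does not offset the penalty for three extra parameters.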
Moscetti, Roberto; Sturm, Barbara; Crichton, Stuart Oj; Amjad, Waseem; Massantini, Riccardo
2018-05-01
The potential of hyperspectral imaging (500-1010 nm) was evaluated for monitoring the quality of potato slices (var. Anuschka) of 5, 7 and 9 mm thickness subjected to air drying at 50 °C. The study investigated three different feature selection methods for the prediction of dry basis moisture content and colour of potato slices using partial least squares regression (PLS). The feature selection strategies tested include interval PLS regression (iPLS), and differences and ratios between raw reflectance values for each possible pair of wavelengths (R[λ1]-R[λ2] and R[λ1]:R[λ2], respectively). Moreover, the combination of spectral and spatial domains was tested. Excellent results were obtained using the iPLS algorithm. However, features from both datasets of raw reflectance differences and ratios represent suitable alternatives for the development of low-complexity prediction models. Finally, the dry basis moisture content was predicted with high accuracy by combining spectral data (i.e. R[511 nm]-R[994 nm]) and the spatial domain (i.e. relative area shrinkage of the slice). Modelling the data acquired during drying through hyperspectral imaging can provide useful information concerning the chemical and physicochemical changes of the product. With all this information, the proposed approach lays the foundations for a more efficient smart dryer that can be designed and its process optimized for drying of potato slices. © 2017 Society of Chemical Industry.
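The difference and ratio features described in this abstract can be generated as in the sketch below. It enumerates each unordered pair of wavelengths once, which is an assumption (the reversed pair only changes the sign of the difference or inverts the ratio), and the reflectance values are made up; a zero reflectance in the denominator would need separate handling.

```python
from itertools import combinations


def pairwise_features(wavelengths, reflectance):
    """Differences and ratios of raw reflectance for every pair of wavelengths."""
    differences, ratios = {}, {}
    for i, j in combinations(range(len(wavelengths)), 2):
        key = (wavelengths[i], wavelengths[j])
        differences[key] = reflectance[i] - reflectance[j]
        ratios[key] = reflectance[i] / reflectance[j]
    return differences, ratios


# Hypothetical reflectance of one pixel at three wavelengths (nm).
bands = [511, 750, 994]
refl = [0.40, 0.50, 0.20]
diffs, rats = pairwise_features(bands, refl)
moisture_feature = diffs[(511, 994)]  # the kind of single feature the abstract mentions
```

Each resulting feature uses only two wavelengths, which is what makes such models attractive for simple, low-cost sensors.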
Determination of butter adulteration with margarine using Raman spectroscopy.
Uysal, Reyhan Selin; Boyaci, Ismail Hakki; Genis, Hüseyin Efe; Tamer, Ugur
2013-12-15
In this study, adulteration of butter with margarine was analysed using Raman spectroscopy combined with chemometric methods (principal component analysis (PCA), principal component regression (PCR), partial least squares (PLS)) and artificial neural networks (ANNs). Different butter and margarine samples were mixed at various concentrations ranging from 0% to 100% w/w. PCA analysis was applied for the classification of butters, margarines and mixtures. PCR, PLS and ANN were used for the detection of adulteration ratios of butter. Models were created using a calibration data set and developed models were evaluated using a validation data set. The coefficient of determination (R(2)) values between actual and predicted values obtained for PCR, PLS and ANN for the validation data set were 0.968, 0.987 and 0.978, respectively. In conclusion, a combination of Raman spectroscopy with chemometrics and ANN methods can be applied for testing butter adulteration. Copyright © 2013 Elsevier Ltd. All rights reserved.
Determination of cellulose I crystallinity by FT-Raman spectroscopy
Umesh P. Agarwal; Richard S. Reiner; Sally A. Ralph
2009-01-01
Two new methods based on FT-Raman spectroscopy, one simple, based on band intensity ratio, and the other, using a partial least-squares (PLS) regression model, are proposed to determine cellulose I crystallinity. In the simple method, crystallinity in semicrystalline cellulose I samples was determined based on univariate regression that was first developed using the...
We present here the application of PLS regression to predicting surface water total phosphorous, total ammonia and Escherichia coli from landscape metrics. The amount of variability in surface water constituents explained by each model reflects the composition of the contributi...
Ferragina, A.; de los Campos, G.; Vazquez, A. I.; Cecchinato, A.; Bittante, G.
2017-01-01
The aim of this study was to assess the performance of Bayesian models commonly used for genomic selection to predict “difficult-to-predict” dairy traits, such as milk fatty acid (FA) expressed as percentage of total fatty acids, and technological properties, such as fresh cheese yield and protein recovery, using Fourier-transform infrared (FTIR) spectral data. Our main hypothesis was that Bayesian models that can estimate shrinkage and perform variable selection may improve our ability to predict FA traits and technological traits above and beyond what can be achieved using the current calibration models (e.g., partial least squares, PLS). To this end, we assessed a series of Bayesian methods and compared their prediction performance with that of PLS. The comparison between models was done using the same sets of data (i.e., same samples, same variability, same spectral treatment) for each trait. Data consisted of 1,264 individual milk samples collected from Brown Swiss cows for which gas chromatographic FA composition, milk coagulation properties, and cheese-yield traits were available. For each sample, 2 spectra in the infrared region from 5,011 to 925 cm−1 were available and averaged before data analysis. Three Bayesian models: Bayesian ridge regression (Bayes RR), Bayes A, and Bayes B, and 2 reference models: PLS and modified PLS (MPLS) procedures, were used to calibrate equations for each of the traits. The Bayesian models used were implemented in the R package BGLR (http://cran.r-project.org/web/packages/BGLR/index.html), whereas the PLS and MPLS were those implemented in the WinISI II software (Infrasoft International LLC, State College, PA). Prediction accuracy was estimated for each trait and model using 25 replicates of a training-testing validation procedure. Compared with PLS, which is currently the most widely used calibration method, MPLS and the 3 Bayesian methods showed significantly greater prediction accuracy. 
Accuracy increased in moving from calibration to external validation methods, and in moving from PLS and MPLS to Bayesian methods, particularly Bayes A and Bayes B. The maximum R2 value of validation was obtained with Bayes B and Bayes A. For the FA, C10:0 (% of each FA on total FA basis) had the highest R2 (0.75, achieved with Bayes A and Bayes B), and among the technological traits, fresh cheese yield R2 of 0.82 (achieved with Bayes B). These 2 methods have proven to be useful instruments in shrinking and selecting very informative wavelengths and inferring the structure and functions of the analyzed traits. We conclude that Bayesian models are powerful tools for deriving calibration equations, and, importantly, these equations can be easily developed using existing open-source software. As part of our study, we provide scripts based on the open source R software BGLR, which can be used to train customized prediction equations for other traits or populations. PMID:26387015
Li, Yuanpeng; Li, Fucui; Yang, Xinhao; Guo, Liu; Huang, Furong; Chen, Zhenqiang; Chen, Xingdan; Zheng, Shifu
2018-08-05
A rapid quantitative analysis model for determining glycated albumin (GA) content, based on attenuated total reflectance (ATR) Fourier transform infrared (FTIR) spectroscopy combined with linear SiPLS and nonlinear SVM, has been developed. First, the true GA content in human serum was determined by the GA enzymatic method, and the ATR-FTIR spectra of serum samples from a health-examination population were acquired. The spectral data of the whole mid-infrared region (4000-600 cm−1) and of GA's characteristic region (1800-800 cm−1) were used for quantitative analysis. Second, several preprocessing steps, including first derivative, second derivative, variable standardization and spectral normalization, were performed. Last, quantitative regression models were established using SiPLS and SVM, respectively. The SiPLS modeling results are as follows: root mean square error of cross-validation (RMSECV_T) = 0.523 g/L, calibration coefficient (R_C) = 0.937, root mean square error of prediction (RMSEP_T) = 0.787 g/L, and prediction coefficient (R_P) = 0.938. The SVM modeling results are as follows: RMSECV_T = 0.0048 g/L, R_C = 0.998, RMSEP_T = 0.442 g/L, and R_P = 0.916. The results indicated that model performance improved significantly after preprocessing and optimization of the characteristic regions, and that the modeling performance of nonlinear SVM was considerably better than that of linear SiPLS. Hence, the quantitative analysis model for GA in human serum based on ATR-FTIR combined with SiPLS and SVM is effective. It requires no sample preprocessing and is characterized by simple operation and high time efficiency, providing a rapid and accurate method for GA content determination. Copyright © 2018 Elsevier B.V. All rights reserved.
Optical scatterometry of quarter-micron patterns using neural regression
NASA Astrophysics Data System (ADS)
Bischoff, Joerg; Bauer, Joachim J.; Haak, Ulrich; Hutschenreuther, Lutz; Truckenbrodt, Horst
1998-06-01
With shrinking dimensions and increasing chip areas, rapid and non-destructive full-wafer characterization after every patterning cycle is an inevitable necessity. In former publications it was shown that Optical Scatterometry (OS) has the potential to push the attainable feature limits of optical techniques from 0.8 to 0.5 microns for imaging methods down to 0.1 micron and below. Thus the demands of future metrology can be met. Basically being a nonimaging method, OS combines light scatter (or diffraction) measurements with modern data analysis schemes to solve the inverse scatter issue. For very fine patterns with lambda-to-pitch ratios greater than one, the specular reflected light versus the incidence angle is recorded. Usually, the data analysis comprises two steps -- a training cycle connected to a rigorous forward modeling and the prediction itself. Until now, two data analysis schemes have usually been applied -- the multivariate regression based Partial Least Squares method (PLS) and a look-up-table technique which is also referred to as the Minimum Mean Square Error approach (MMSE). Both methods are afflicted with serious drawbacks. On the one hand, the prediction accuracy of multivariate regression schemes degrades with larger parameter ranges due to the linearization properties of the method. On the other hand, look-up-table methods are rather time consuming during prediction, thus prolonging the processing time and reducing the throughput. An alternative method is an Artificial Neural Network (ANN) based regression, which combines the advantages of multivariate regression and MMSE. Due to the versatility of a neural network, not only can its structure be adapted more properly to the scatter problem, but also the nonlinearity of the neuronal transfer functions mimics the nonlinear behavior of optical diffraction processes more adequately. In spite of these pleasant properties, the prediction speed of ANN regression is comparable with that of the PLS method.
In this paper, the viability and performance of ANN regression will be demonstrated with the example of sub-quarter-micron resist metrology. To this end, 0.25 micrometer line/space patterns have been printed in positive photoresist by means of DUV projection lithography. In order to evaluate the total metrology chain from light scatter measurement through data analysis, a thorough modeling has been performed. Assuming a trapezoidal shape of the developed resist profile, a training data set was generated by means of the Rigorous Coupled Wave Approach (RCWA). After training the model, a second data set was computed and deteriorated by Gaussian noise to imitate real measuring conditions. Then, these data were fed into the models established before, resulting in a Standard Error of Prediction (SEP) which corresponds to the measuring accuracy. Even with only little effort put into the design of a back-propagation network, the ANN is clearly superior to the PLS method. Depending on whether a network with one or two hidden layers was used, accuracy gains between 2 and 5 can be achieved compared with PLS regression. Furthermore, the ANN is less noise sensitive, for there is only a doubling of the SEP at 5% noise for ANN, whereas for PLS the accuracy degrades rapidly with increasing noise. The accuracy gain also depends on the light polarization and on the measured parameters. Finally, these results have been confirmed experimentally, where the OS results are in good accordance with the profiles obtained from cross-sectioning micrographs.
Barimani, Shirin; Kleinebudde, Peter
2017-10-01
A multivariate analysis method, Science-Based Calibration (SBC), was used for the first time for endpoint determination of a tablet coating process using Raman data. Two types of tablet cores, placebo and caffeine cores, received a coating suspension comprising a polyvinyl alcohol-polyethylene glycol graft copolymer and titanium dioxide to a maximum coating thickness of 80 µm. Raman spectroscopy was used as an in-line PAT tool. The spectra were acquired every minute and correlated to the amount of applied aqueous coating suspension. SBC was compared to another well-known multivariate analysis method, Partial Least Squares regression (PLS), and a simpler approach, Univariate Data Analysis (UVDA). All developed calibration models had coefficient of determination (R2) values higher than 0.99. The coating endpoints could be predicted with root mean square errors of prediction (RMSEP) of less than 3.1% of the applied coating suspensions. Compared to PLS and UVDA, SBC proved to be an alternative multivariate calibration method with high predictive power. Copyright © 2017 Elsevier B.V. All rights reserved.
Umesh P. Agarwal; Richard S. Reiner; Sally A. Ralph
2010-01-01
Two new methods based on FT-Raman spectroscopy, one simple, based on band intensity ratio, and the other using a partial least squares (PLS) regression model, are proposed to determine cellulose I crystallinity. In the simple method, crystallinity in cellulose I samples was determined based on univariate regression that was first developed using the Raman band...
Guo, Canyong; Luo, Xuefang; Zhou, Xiaohua; Shi, Beijia; Wang, Juanjuan; Zhao, Jinqi; Zhang, Xiaoxia
2017-06-05
Vibrational spectroscopic techniques such as infrared, near-infrared and Raman spectroscopy have become popular for detecting and quantifying polymorphism of pharmaceuticals since they are fast and non-destructive. This study assessed the ability of three vibrational spectroscopic techniques combined with multivariate analysis to quantify a low-content undesired polymorph within a binary polymorphic mixture. Partial least squares (PLS) regression and support vector machine (SVM) regression were employed to build quantitative models. Fusidic acid, a steroidal antibiotic, was used as the model compound. It was found that PLS regression performed slightly better than SVM regression for all three spectroscopic techniques. Root mean square errors of prediction (RMSEP) ranged from 0.48% to 1.17% for diffuse reflectance FTIR spectroscopy, from 1.60% to 1.93% for diffuse reflectance FT-NIR spectroscopy, and from 1.62% to 2.31% for Raman spectroscopy. The results indicate that diffuse reflectance FTIR spectroscopy offers significant advantages in providing accurate measurement of polymorphic content in the fusidic acid binary mixtures, while Raman spectroscopy is the least accurate technique for quantitative analysis of polymorphs. Copyright © 2017 Elsevier B.V. All rights reserved.
Párta, László; Zalai, Dénes; Borbély, Sándor; Putics, Akos
2014-02-01
The application of dielectric spectroscopy was frequently investigated as an on-line cell culture monitoring tool; however, it still requires supportive data and experience in order to become a robust technique. In this study, dielectric spectroscopy was used to predict viable cell density (VCD) at industrially relevant high levels in concentrated fed-batch culture of Chinese hamster ovary cells producing a monoclonal antibody for pharmaceutical purposes. For on-line dielectric spectroscopy measurements, capacitance was scanned within a wide range of frequency values (100-19,490 kHz) in six parallel cell cultivation batches. Prior to detailed mathematical analysis of the collected data, principal component analysis (PCA) was applied to compare dielectric behavior of the cultivations. PCA analysis resulted in detecting measurement disturbances. By using the measured spectroscopic data, partial least squares regression (PLS), Cole-Cole, and linear modeling were applied and compared in order to predict VCD. The Cole-Cole and the PLS model provided reliable prediction over the entire cultivation including both the early and decline phases of cell growth, while the linear model failed to estimate VCD in the later, declining cultivation phase. In regards to the measurement error sensitivity, remarkable differences were shown among PLS, Cole-Cole, and linear modeling. VCD prediction accuracy could be improved in the runs with measurement disturbances by first derivative pre-treatment in PLS and by parameter optimization of the Cole-Cole modeling.
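The Cole-Cole modeling mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not give the authors' parametrization, so the function below assumes the textbook single-dispersion Cole-Cole form and evaluates its real part as a capacitance; the function name and all parameter values (`delta_c`, `f_c_khz`, `alpha`, `c_inf`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def cole_cole_capacitance(freq_khz, delta_c, f_c_khz, alpha, c_inf):
    """Real part of a single Cole-Cole dispersion, expressed as capacitance.

    Assumed form: C(f) = c_inf + delta_c * Re[1 / (1 + (j f / f_c)^(1 - alpha))].
    """
    x = np.asarray(freq_khz, dtype=float) / f_c_khz
    dispersion = 1.0 / (1.0 + (1j * x) ** (1.0 - alpha))
    return c_inf + delta_c * dispersion.real

# Hypothetical parameter values; only the frequency range (100-19,490 kHz)
# is taken from the abstract.
freqs = np.array([100.0, 1000.0, 19490.0])
caps = cole_cole_capacitance(freqs, delta_c=50.0, f_c_khz=1000.0,
                             alpha=0.1, c_inf=2.0)
print(caps)  # capacitance falls as frequency rises through the dispersion
```

In a fitting workflow, these parameters would be estimated from each measured spectrum, and a quantity such as `delta_c` would then be related to viable cell density; that last step is an assumption here, not something stated in the abstract.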
Quantitative determination of wool in textile by near-infrared spectroscopy and multivariate models.
Chen, Hui; Tan, Chao; Lin, Zan
2018-08-05
The wool content in textiles is a key quality index, and its quantitative analysis is important because adulteration is common in both raw and finished textiles. Conventional methods can be complicated, destructive, time-consuming and environmentally unfriendly, so developing a quick, easy-to-use and green alternative method is of interest. This work explores the feasibility of combining near-infrared (NIR) spectroscopy with several partial least squares (PLS)-based algorithms and elastic component regression (ECR) algorithms for measuring wool content in textiles. A total of 108 cloth samples with wool content ranging from 0% to 100% (w/w) were collected, and all the compositions actually exist in the market. The dataset was divided equally into training and test sets for developing and validating calibration models. When using local PLS, the original spectrum axis was split into 20 sub-intervals. No obvious difference in performance was seen among the local PLS models. The ECR model is comparable or superior to the other models owing to its flexibility, i.e., it is a transition state between PCR and PLS. It seems that ECR combined with the NIR technique may be a potential method for determining wool content in textile products. In addition, it might have regulatory advantages by avoiding time-consuming and environmentally unfriendly chemical analysis. Copyright © 2018 Elsevier B.V. All rights reserved.
Enhancement of partial robust M-regression (PRM) performance using Bisquare weight function
NASA Astrophysics Data System (ADS)
Mohamad, Mazni; Ramli, Norazan Mohamed; Ghani@Mamat, Nor Azura Md; Ahmad, Sanizah
2014-09-01
Partial Least Squares (PLS) regression is a popular regression technique for handling multicollinearity in low- and high-dimensional data; it fits a linear relationship between sets of explanatory and response variables. Several robust PLS methods have been proposed as alternatives to the classical PLS algorithms, which are easily affected by the presence of outliers. A recent one is partial robust M-regression (PRM). Unfortunately, the monotone weighting function used in the PRM algorithm fails to assign appropriate weights to large outliers according to their severity. Thus, in this paper, a modified partial robust M-regression is introduced to enhance the performance of the original PRM. A re-descending weight function, known as the Bisquare weight function, is recommended to replace the fair function in the PRM. A simulation study is done to assess the performance of the modified PRM, and its efficiency is also tested on both contaminated and uncontaminated simulated data under various percentages of outliers, sample sizes and numbers of predictors.
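To make the contrast between the two weight functions concrete, here is a small sketch. The Tukey bisquare weight is the standard re-descending form; the "fair" weight is written in the squared-denominator form that is often associated with PRM, which is an assumption here since the abstract does not state it. The tuning constants (`c=4.0` and `c=4.685`) are conventional defaults, not values taken from the paper.

```python
import numpy as np

def fair_weight(r, c=4.0):
    """Monotone 'fair' weight (assumed form): never reaches zero."""
    r = np.asarray(r, dtype=float)
    return 1.0 / (1.0 + np.abs(r / c)) ** 2

def bisquare_weight(r, c=4.685):
    """Tukey bisquare weight: re-descends to exactly zero beyond |r| = c."""
    r = np.asarray(r, dtype=float)
    w = (1.0 - (r / c) ** 2) ** 2
    return np.where(np.abs(r) <= c, w, 0.0)

# Scaled residuals: a clean point, a mild outlier, and a gross outlier.
residuals = np.array([0.0, 2.0, 10.0])
print(fair_weight(residuals))      # gross outlier keeps a small positive weight
print(bisquare_weight(residuals))  # gross outlier is given zero weight
```

The difference at the gross outlier is the behaviour the abstract refers to: a monotone function only down-weights extreme points, while a re-descending one removes their influence entirely.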
NASA Astrophysics Data System (ADS)
Liu, Fei; He, Yong
2008-02-01
Visible and near infrared (Vis/NIR) transmission spectroscopy and chemometric methods were utilized to predict the pH values of cola beverages. Five varieties of cola were prepared; 225 samples (45 samples for each variety) were selected for the calibration set and 75 samples (15 samples for each variety) for the validation set. Savitzky-Golay smoothing and standard normal variate (SNV) followed by first derivative were used as the pre-processing methods. Partial least squares (PLS) analysis was employed to extract the principal components (PCs), which were used as the inputs of the least squares-support vector machine (LS-SVM) model according to their accumulative reliabilities. Then LS-SVM with a radial basis function (RBF) kernel and a two-step grid search technique was applied to build the regression model, which was compared with PLS regression. The correlation coefficient (r), root mean square error of prediction (RMSEP) and bias were 0.961, 0.040 and 0.012 for PLS, and 0.975, 0.031 and 4.697 × 10−3 for LS-SVM, respectively. Both methods achieved satisfactory precision. The results indicated that Vis/NIR spectroscopy combined with chemometric methods could be applied as an alternative way to predict the pH of cola beverages.
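The SNV pre-processing named in the abstract above is simple enough to sketch directly. The example below is illustrative only: the spectra are made-up numbers, and the derivative step uses a plain finite difference as a stand-in for the Savitzky-Golay derivative, which is a deliberate simplification rather than the authors' procedure.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) on its own."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def first_derivative(spectra, step=1.0):
    """Finite-difference derivative along the wavelength axis (stand-in for Savitzky-Golay)."""
    return np.gradient(np.asarray(spectra, dtype=float), step, axis=1)

# Two made-up spectra; the second is the first with a multiplicative
# and an additive offset, as scatter effects are often modelled.
raw = np.array([[1.0, 2.0, 4.0, 3.0],
                [5.0, 7.0, 11.0, 9.0]])
corrected = snv(raw)
derivative = first_derivative(corrected)
print(np.allclose(corrected[0], corrected[1]))  # True: SNV removes both offsets
```

Because SNV works row by row, it needs no calibration set and can be applied identically to calibration and validation spectra.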
Fadzillah, Nurrulhidayah Ahmad; Man, Yaakob bin Che; Rohman, Abdul; Rosman, Arieff Salleh; Ismail, Amin; Mustafa, Shuhaimi; Khatib, Alfi
2015-01-01
Authenticating food products with respect to components that are not allowed by certain religions, such as lard, is very important. In this study, we used proton nuclear magnetic resonance ((1)H-NMR) spectroscopy to analyze butter adulterated with lard by simultaneous quantification of all proton-bearing compounds, and consequently all relevant sample classes. Since the spectra obtained were too complex to be analyzed visually, classification of the spectra was carried out. Multivariate calibration by partial least squares (PLS) regression was used to model the relationship between the actual and predicted values of lard. The model yielded the highest coefficient of determination (R(2)) of 0.998, the lowest root mean square error of calibration (RMSEC) of 0.0091%, and a root mean square error of prediction (RMSEP) of 0.0090. Cross-validation testing evaluates the predictive power of the model. The PLS model was shown to be a good model, as the intercepts of R(2)Y and Q(2)Y were 0.0853 and -0.309, respectively.
Chang, Wen-Qi; Zhou, Jian-Liang; Li, Yi; Shi, Zi-Qi; Wang, Li; Yang, Jie; Li, Ping; Liu, Li-Fang; Xin, Gui-Zhong
2017-01-15
The elevation of free fatty acids (FFAs) has been regarded as a universal metabolic signature of excessive adipocyte lipolysis. Nowadays, an in vitro lipolysis assay is generally essential for drug screening prior to animal studies. Here, we present a novel in vitro approach for lipolysis measurement combining UHPLC-Orbitrap and partial least squares (PLS) based analysis. Firstly, the calibration matrix was constructed from serial proportions of mixed samples (blends of control and model samples). Then, lipidome profiling was performed by UHPLC-Orbitrap, and 403 variables were extracted and aligned as the dataset. Owing to the high resolution of the Orbitrap analyzer and open-source lipid identification software, 28 FFAs were further screened and identified. Based on the relative intensity of the screened FFAs, a PLS regression model was constructed for lipolysis measurement. After leave-one-out cross-validation, ten principal components were selected to build the final PLS model, which showed excellent performance (RMSECV, 0.0268; RMSEC, 0.0173; R2, 0.9977). In addition, the high predictive accuracy (R2 = 0.9907 and RMSEP = 0.0345) of the trained PLS model was demonstrated using test samples. Finally, taking curcumin as a model compound, its antilipolytic effect on palmitic acid-induced lipolysis was successfully predicted as 31.78% by the proposed approach. In addition, supplementary evidence of curcumin-induced modification of FFA composition as well as the lipidome was provided by PLS extended methods. Unlike general biological assays, high-resolution MS-based methods provide more detailed information about biological events. Thus, the novel biological evaluation model proposed here shows promise for drug evaluation or disease diagnosis. Copyright © 2016 Elsevier B.V. All rights reserved.
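The combination of PLS regression and leave-one-out cross-validation described above can be sketched in a few lines. This is a generic single-response PLS (PLS-1, NIPALS-style deflation) on synthetic data, not the authors' pipeline; the function names `pls1_fit` and `rmsecv_loo` and the data are hypothetical.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit single-response PLS; return regression coefficients and intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xc.T @ yc
        w /= np.linalg.norm(w)           # weight vector
        t = Xc @ w                       # scores
        tt = t @ t
        p = Xc.T @ t / tt                # X loadings
        qk = yc @ t / tt                 # y loading
        Xc = Xc - np.outer(t, p)         # deflate X
        yc = yc - qk * t                 # deflate y
        W.append(w); P.append(p); q.append(qk)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, y_mean - x_mean @ coef

def rmsecv_loo(X, y, n_components):
    """Root mean square error of leave-one-out cross-validation."""
    n = len(y)
    errors = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        coef, intercept = pls1_fit(X[keep], y[keep], n_components)
        errors[i] = y[i] - (X[i] @ coef + intercept)
    return np.sqrt(np.mean(errors ** 2))

# Synthetic, noise-free data: 20 samples, 3 predictors.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 4.0
rmsecv_one = rmsecv_loo(X, y, 1)
rmsecv_full = rmsecv_loo(X, y, 3)
print(rmsecv_one, rmsecv_full)
```

In practice the number of components is chosen by scanning `n_components` and picking the value where RMSECV stops improving; on this noise-free toy data the error collapses once all three components are used.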
NASA Astrophysics Data System (ADS)
Palou, Anna; Miró, Aira; Blanco, Marcelo; Larraz, Rafael; Gómez, José Francisco; Martínez, Teresa; González, Josep Maria; Alcalà, Manel
2017-06-01
Although the feasibility of using near infrared (NIR) spectroscopy combined with partial least squares (PLS) regression for prediction of physico-chemical properties of biodiesel/diesel blends has been widely demonstrated, including the whole variability of diesel samples from diverse production origins in the calibration sets remains an important challenge when constructing the models. This work presents a useful strategy for the systematic selection of calibration sets of samples of biodiesel/diesel blends from diverse origins, based on a binary code, principal component analysis (PCA) and the Kennard-Stone algorithm. Results show that with this methodology the models can keep their robustness over time. PLS calculations were done using specialized chemometric software as well as the software of the NIR instrument installed in the plant, and both produced RMSEP values below the reproducibility values of the reference methods. The models have been proven for on-line simultaneous determination of seven properties: density, cetane index, fatty acid methyl esters (FAME) content, cloud point, boiling point at 95% of recovery, flash point and sulphur.
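The Kennard-Stone selection step can be illustrated with a short sketch. It starts from the two most distant samples and then repeatedly adds the sample whose nearest already-selected neighbour is farthest away. The function name and toy data are hypothetical; in the setting described above the input would presumably be PCA scores rather than raw values.

```python
import numpy as np

def kennard_stone(X, n_select):
    """Return indices of n_select samples chosen by the Kennard-Stone rule."""
    X = np.asarray(X, dtype=float)
    # Full pairwise Euclidean distance matrix.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    first, second = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(first), int(second)]
    remaining = [i for i in range(len(X)) if i not in selected]
    while len(selected) < n_select and remaining:
        # Distance from each candidate to its closest selected sample.
        min_dist = dist[np.ix_(remaining, selected)].min(axis=1)
        pick = remaining[int(np.argmax(min_dist))]
        selected.append(pick)
        remaining.remove(pick)
    return selected

# Toy one-dimensional "scores" for five samples.
scores = np.array([[0.0], [1.0], [2.0], [10.0], [5.0]])
chosen = kennard_stone(scores, 3)
print(chosen)  # [0, 3, 4]: the two extremes, then the point midway between them
```

The quadratic distance matrix keeps the sketch short; for large sample sets a running minimum-distance vector would avoid recomputing the submatrix at every step.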
NASA Astrophysics Data System (ADS)
Liu, Fei; He, Yong
2008-03-01
Three different chemometric methods were applied to determine the sugar content of cola soft drinks using visible and near infrared spectroscopy (Vis/NIRS). Four varieties of cola were prepared; 180 samples (45 samples for each variety) were selected for the calibration set and 60 samples (15 samples for each variety) for the validation set. Savitzky-Golay smoothing, standard normal variate (SNV) and Savitzky-Golay first derivative transformation were applied for the pre-processing of spectral data. The first eleven principal components (PCs) extracted by partial least squares (PLS) analysis were employed as the inputs of the BP neural network (BPNN) and least squares-support vector machine (LS-SVM) models. Then the BPNN model with the optimal structural parameters and the LS-SVM model with a radial basis function (RBF) kernel were applied to build the regression models, which were compared with PLS regression. The correlation coefficient (r), root mean square error of prediction (RMSEP) and bias for prediction were 0.971, 1.259 and -0.335 for PLS, 0.986, 0.763 and -0.042 for BPNN, and 0.978, 0.995 and -0.227 for LS-SVM, respectively. All three methods achieved high and satisfactory precision. The results indicated that Vis/NIR spectroscopy combined with chemometric methods could be used as a high-precision way to determine the sugar content of cola soft drinks.
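The three figures of merit reported above (r, RMSEP and bias) are computed from reference and predicted values of the validation set. A minimal sketch follows; the numbers are invented for illustration, and the sign convention for bias (predicted minus reference) is an assumption, since the abstract does not define it.

```python
import numpy as np

def prediction_metrics(y_ref, y_pred):
    """Return (r, RMSEP, bias) for a validation set."""
    y_ref = np.asarray(y_ref, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_pred - y_ref                # assumed sign convention
    r = np.corrcoef(y_ref, y_pred)[0, 1]      # Pearson correlation coefficient
    rmsep = np.sqrt(np.mean(residuals ** 2))  # root mean square error of prediction
    bias = residuals.mean()                   # systematic offset
    return r, rmsep, bias

# Invented values: every prediction is 0.5 too high.
r, rmsep, bias = prediction_metrics([1.0, 2.0, 3.0, 4.0],
                                    [1.5, 2.5, 3.5, 4.5])
print(r, rmsep, bias)
```

The toy case shows why the three numbers are reported together: a constant offset leaves r at 1 while RMSEP and bias both reveal the error.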
USDA-ARS's Scientific Manuscript database
A technique of using multiple calibration sets in partial least squares regression (PLS) was proposed to improve the quantitative determination of ammonia from open-path Fourier transform infrared spectra. The spectra were measured near animal farms, and the path-integrated concentration of ammonia...
NASA Astrophysics Data System (ADS)
Bilal, Maria; Bilal, Muhammad; Saleem, Muhammad; Khurram, Muhammad; Khan, Saranjam; Ullah, Rahat; Ali, Hina; Ahmed, Mushtaq; Shahzada, Shaista; Ullah Khan, Ehsan
2017-04-01
Raman spectroscopy-based investigations of the molecular changes associated with an early stage of dengue virus (DENV) infection using a partial least squares (PLS) regression model are presented. This study is based on non-structural protein 1 (NS1), which appears after three days of DENV infection. In total, 39 blood sera samples were collected and divided into two groups. The control group contained samples that were negative for NS1 and antibodies, and the positive group contained samples in which NS1 was positive and antibodies were negative. Out of 39 samples, 29 Raman spectra were used for model development, while the remaining 10 were kept hidden for blind testing of the model. PLS regression yielded a vector of regression coefficients as a function of Raman shift, which was analyzed. Cytokines in the region 775-875 cm−1, lectins at 1003, 1238, 1340, 1449 and 1672 cm−1, DNA in the region 1040-1140 cm−1, and alpha and beta structures of proteins in the region 933-967 cm−1 have been identified in the regression vector for their role in an early stage of DENV infection. The validity of the model was established by its R-square value of 0.891. Sensitivity, specificity and accuracy were 100% each, and the area under the receiver operating characteristic curve was found to be 1.
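Sensitivity, specificity and accuracy, as reported above, follow directly from the confusion counts of a blind test. The sketch below uses invented labels (1 = positive, 0 = control), not the study's data, and assumes the continuous PLS output has already been thresholded into class labels.

```python
def classification_summary(y_true, y_pred):
    """Compute sensitivity, specificity and accuracy from binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)  # true positives
    tn = sum(t == 0 and p == 0 for t, p in pairs)  # true negatives
    fp = sum(t == 0 and p == 1 for t, p in pairs)  # false positives
    fn = sum(t == 1 and p == 0 for t, p in pairs)  # false negatives
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }

# Invented example with one missed positive.
summary = classification_summary([1, 1, 1, 0, 0], [1, 1, 0, 0, 0])
print(summary)
```

With a blind set of only ten samples, as in the abstract, each misclassification shifts these percentages by a large step, which is worth keeping in mind when reading a 100% result.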
NASA Astrophysics Data System (ADS)
Pérez-Rodríguez, Marta; Horák-Terra, Ingrid; Rodríguez-Lado, Luis; Martínez Cortizas, Antonio
2016-11-01
Despite its potential, infrared spectroscopy combined with multivariate statistics has seldom been used to model peat properties with environmental value, such as the concentration of potentially toxic metals. In this research, we applied attenuated total reflectance (ATR) Fourier-transform infrared (FTIR) spectroscopy to evaluate the ability of the technique to predict mercury concentrations in late-Pleistocene/Holocene peat from a minerogenic peatland in Minas Gerais (Brazil). Mercury concentrations were analysed using a Milestone DMA-80 analyzer, and FTIR-ATR was performed using a Gladi-ATR (Pike Technologies) in the mid-IR range (4000-400 cm−1). Concentrations were modelled using principal components regression (PCR) and partial least squares regression (PLS). The performance of the models varied between moderate and very good (R2 0.67-0.90), with low RMSD values (0.35-1.06). A PLS model based on three latent vectors (LV1 to LV3) provided the best results (R2 0.90, RMSD 0.35). LV1 reflected total organic matter content versus mineral matter (mainly quartz from local fluxes), LV2 was related to dust deposition from regional sources, and LV3 reflected peat organic matter decomposition. Compared to a previous investigation based on geochemical data, the spectroscopy-based PLS model performed better, but it has to be complemented with additional data (such as δ13C ratios) to reliably reproduce the changes of the factors controlling mercury accumulation over time. This time- and cost-effective methodology may help to develop multi-core approaches to study the within- and between-mire (of a similar type and area) variability in mercury accumulation, and probably also other peat properties.
Fig. S2 Loading weights of the three and two significant components from the direct (dPCR) and transposed (trPCR) PCR models.
Fig. S3 Depth records of the cumulative effects of the factors involved in the variation of mercury concentrations. Left, MIR-PLS model; centre, MIR-PLS + δ13C data model; right, geochemical model from Pérez-Rodríguez et al. [44].
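Principal components regression, one of the two modelling approaches in the abstract above, can be sketched with a singular value decomposition. This is a generic illustration on synthetic data, not the authors' direct or transposed PCR variants; the function name `pcr_fit` and the data are hypothetical.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Regress y on the first n_components principal components of X."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    U, s, Vt = np.linalg.svd(X - x_mean, full_matrices=False)
    U_k, s_k, V_k = U[:, :n_components], s[:n_components], Vt[:n_components].T
    # Least squares on the scores, mapped back to the original variables.
    coef = V_k @ ((U_k.T @ (y - y_mean)) / s_k)
    intercept = y_mean - x_mean @ coef
    return coef, intercept

# Synthetic, noise-free data: 30 samples, 4 predictors.
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 4))
true_beta = np.array([0.5, -1.0, 2.0, 0.0])
y = X @ true_beta + 1.5
coef, intercept = pcr_fit(X, y, n_components=4)
print(coef, intercept)
```

Unlike PLS, the components here are chosen without looking at `y`, so a low-variance direction that matters for the response may only enter the model late; with all components retained, both methods reduce to ordinary least squares.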
Golmohammadi, Hassan
2009-11-30
A quantitative structure-property relationship (QSPR) study was performed to develop models that relate the structure of 141 organic compounds to their octanol-water partition coefficients (log P(o/w)). A genetic algorithm was applied as a variable selection tool. Modeling of log P(o/w) of these compounds as a function of theoretically derived descriptors was established by multiple linear regression (MLR), partial least squares (PLS), and artificial neural network (ANN). The best selected descriptors that appear in the models are: atomic charge weighted partial positively charged surface area (PPSA-3), fractional atomic charge weighted partial positive surface area (FPSA-3), minimum atomic partial charge (Qmin), molecular volume (MV), total dipole moment of the molecule (mu), maximum antibonding contribution of a molecular orbital in the molecule (MAC), and maximum free valency of a C atom in the molecule (MFV). The results showed the ability of the developed artificial neural network to predict partition coefficients of organic compounds. The results also revealed the superiority of ANN over the MLR and PLS models. Copyright 2009 Wiley Periodicals, Inc.
Cai, Rui; Wang, Shisheng; Tang, Bo; Li, Yueqing; Zhao, Weijie
2018-01-01
Sea cucumber is the major tonic seafood worldwide, and geographical origin traceability is an important part of its quality and safety control. In this work, a non-destructive method for origin traceability of sea cucumber (Apostichopus japonicus) from northern China Sea and East China Sea using near infrared spectroscopy (NIRS) and multivariate analysis methods was proposed. Total fat contents of 189 fresh sea cucumber samples were determined and partial least-squares (PLS) regression was used to establish the quantitative NIRS model. The ordered predictor selection algorithm was performed to select feasible wavelength regions for the construction of PLS and identification models. The identification model was developed by principal component analysis combined with Mahalanobis distance and scaling to the first range algorithms. In the test set of the optimum PLS models, the root mean square error of prediction was 0.45, and correlation coefficient was 0.90. The correct classification rates of 100% were obtained in both identification calibration model and test model. The overall results indicated that NIRS method combined with chemometric analysis was a suitable tool for origin traceability and identification of fresh sea cucumber samples from nine origins in China. PMID:29410795
Guo, Xiuhan; Cai, Rui; Wang, Shisheng; Tang, Bo; Li, Yueqing; Zhao, Weijie
2018-01-01
Sea cucumber is the major tonic seafood worldwide, and geographical origin traceability is an important part of its quality and safety control. In this work, a non-destructive method for origin traceability of sea cucumber (Apostichopus japonicus) from northern China Sea and East China Sea using near infrared spectroscopy (NIRS) and multivariate analysis methods was proposed. Total fat contents of 189 fresh sea cucumber samples were determined and partial least-squares (PLS) regression was used to establish the quantitative NIRS model. The ordered predictor selection algorithm was performed to select feasible wavelength regions for the construction of PLS and identification models. The identification model was developed by principal component analysis combined with Mahalanobis distance and scaling to the first range algorithms. In the test set of the optimum PLS models, the root mean square error of prediction was 0.45, and correlation coefficient was 0.90. The correct classification rates of 100% were obtained in both identification calibration model and test model. The overall results indicated that NIRS method combined with chemometric analysis was a suitable tool for origin traceability and identification of fresh sea cucumber samples from nine origins in China.
Ferragina, A; de los Campos, G; Vazquez, A I; Cecchinato, A; Bittante, G
2015-11-01
The aim of this study was to assess the performance of Bayesian models commonly used for genomic selection to predict "difficult-to-predict" dairy traits, such as milk fatty acid (FA) expressed as percentage of total fatty acids, and technological properties, such as fresh cheese yield and protein recovery, using Fourier-transform infrared (FTIR) spectral data. Our main hypothesis was that Bayesian models that can estimate shrinkage and perform variable selection may improve our ability to predict FA traits and technological traits above and beyond what can be achieved using the current calibration models (e.g., partial least squares, PLS). To this end, we assessed a series of Bayesian methods and compared their prediction performance with that of PLS. The comparison between models was done using the same sets of data (i.e., same samples, same variability, same spectral treatment) for each trait. Data consisted of 1,264 individual milk samples collected from Brown Swiss cows for which gas chromatographic FA composition, milk coagulation properties, and cheese-yield traits were available. For each sample, 2 spectra in the infrared region from 5,011 to 925 cm(-1) were available and averaged before data analysis. Three Bayesian models: Bayesian ridge regression (Bayes RR), Bayes A, and Bayes B, and 2 reference models: PLS and modified PLS (MPLS) procedures, were used to calibrate equations for each of the traits. The Bayesian models used were implemented in the R package BGLR (http://cran.r-project.org/web/packages/BGLR/index.html), whereas the PLS and MPLS were those implemented in the WinISI II software (Infrasoft International LLC, State College, PA). Prediction accuracy was estimated for each trait and model using 25 replicates of a training-testing validation procedure. Compared with PLS, which is currently the most widely used calibration method, MPLS and the 3 Bayesian methods showed significantly greater prediction accuracy. 
Accuracy increased in moving from calibration to external validation methods, and in moving from PLS and MPLS to Bayesian methods, particularly Bayes A and Bayes B. The maximum R(2) value of validation was obtained with Bayes B and Bayes A. For the FA, C10:0 (% of each FA on total FA basis) had the highest R(2) (0.75, achieved with Bayes A and Bayes B), and among the technological traits, fresh cheese yield R(2) of 0.82 (achieved with Bayes B). These 2 methods have proven to be useful instruments in shrinking and selecting very informative wavelengths and inferring the structure and functions of the analyzed traits. We conclude that Bayesian models are powerful tools for deriving calibration equations, and, importantly, these equations can be easily developed using existing open-source software. As part of our study, we provide scripts based on the open source R software BGLR, which can be used to train customized prediction equations for other traits or populations. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Boucher, Thomas F.; Ozanne, Marie V.; Carmosino, Marco L.; Dyar, M. Darby; Mahadevan, Sridhar; Breves, Elly A.; Lepore, Kate H.; Clegg, Samuel M.
2015-05-01
The ChemCam instrument on the Mars Curiosity rover is generating thousands of LIBS spectra and bringing interest in this technique to public attention. The key to interpreting Mars or any other types of LIBS data are calibrations that relate laboratory standards to unknowns examined in other settings and enable predictions of chemical composition. Here, LIBS spectral data are analyzed using linear regression methods including partial least squares (PLS-1 and PLS-2), principal component regression (PCR), least absolute shrinkage and selection operator (lasso), elastic net, and linear support vector regression (SVR-Lin). These were compared against results from nonlinear regression methods including kernel principal component regression (K-PCR), polynomial kernel support vector regression (SVR-Py) and k-nearest neighbor (kNN) regression to discern the most effective models for interpreting chemical abundances from LIBS spectra of geological samples. The results were evaluated for 100 samples analyzed with 50 laser pulses at each of five locations averaged together. Wilcoxon signed-rank tests were employed to evaluate the statistical significance of differences among the nine models using their predicted residual sum of squares (PRESS) to make comparisons. For MgO, SiO2, Fe2O3, CaO, and MnO, the sparse models outperform all the others except for linear SVR, while for Na2O, K2O, TiO2, and P2O5, the sparse methods produce inferior results, likely because their emission lines in this energy range have lower transition probabilities. The strong performance of the sparse methods in this study suggests that use of dimensionality-reduction techniques as a preprocessing step may improve the performance of the linear models. Nonlinear methods tend to overfit the data and predict less accurately, while the linear methods proved to be more generalizable with better predictive performance. 
These results are attributed to the high dimensionality of the data (6144 channels) relative to the small number of samples studied. The best-performing models were SVR-Lin for SiO2, MgO, Fe2O3, and Na2O, lasso for Al2O3, elastic net for MnO, and PLS-1 for CaO, TiO2, and K2O. Although these differences in model performance between methods were identified, most of the models produce comparable results when p ≤ 0.05 and all techniques except kNN produced statistically-indistinguishable results. It is likely that a combination of models could be used together to yield a lower total error of prediction, depending on the requirements of the user.
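As a rough illustration of the PRESS statistic used above to compare models, the sketch below sums squared residuals over held-out predictions. The reference values, predictions, and model labels are invented for the example and are not data from the study.

```python
def press(y_true, y_pred):
    """Predicted residual sum of squares over held-out predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred))

# Hypothetical held-out oxide abundances (wt%) and predictions from two models.
y_true = [50.0, 62.0, 45.0, 71.0]
pred_a = [51.0, 60.0, 46.0, 70.0]
pred_b = [53.0, 58.0, 48.0, 68.0]

press_a = press(y_true, pred_a)
press_b = press(y_true, pred_b)
# The model with the lower PRESS predicts the held-out samples more closely.
best = "A" if press_a < press_b else "B"
```

A paired test such as the Wilcoxon signed-rank test would then be applied to the per-sample errors to judge whether the difference is statistically meaningful.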
Liu, Fei; Feng, Lei; Lou, Bing-gan; Sun, Guang-ming; Wang, Lian-ping; He, Yong
2010-07-01
The combinational-stimulated bands were used to develop linear and nonlinear calibrations for the early detection of sclerotinia of oilseed rape (Brassica napus L.). Eighty healthy and 100 Sclerotinia leaf samples were scanned, and different preprocessing methods combined with successive projections algorithm (SPA) were applied to develop partial least squares (PLS) discriminant models, multiple linear regression (MLR) and least squares-support vector machine (LS-SVM) models. The results indicated that the optimal full-spectrum PLS model was achieved by direct orthogonal signal correction (DOSC), then De-trending and Raw spectra with correct recognition ratio of 100%, 95.7% and 95.7%, respectively. When using combinational-stimulated bands, the optimal linear models were SPA-MLR (DOSC) and SPA-PLS (DOSC) with correct recognition ratio of 100%. All SPA-LSSVM models using DOSC, De-trending and Raw spectra achieved perfect results with recognition of 100%. The overall results demonstrated that it was feasible to use combinational-stimulated bands for the early detection of Sclerotinia of oilseed rape, and DOSC-SPA was a powerful way for informative wavelength selection. This method supplied a new approach to the early detection and portable monitoring instrument of sclerotinia.
Melquiades, Fábio L; Thomaz, Edivaldo L
2016-05-01
An important aspect for the evaluation of fire effects in slash-and-burn agricultural system, as well as in wildfire, is the soil burn severity. The objective of this study is to estimate the maximum temperature reached in real soil burn events using energy dispersive X-ray fluorescence (EDXRF) as an analytical tool, combined with partial least square (PLS) regression. Muffle-heated soil samples were used for PLS regression model calibration and two real slash-and-burn soils were tested as external samples in the model. It was possible to associate EDXRF spectra alterations to the maximum temperature reached in the heat affected soils with about 17% relative standard deviation. The results are promising since the analysis is fast, nondestructive, and conducted after the burn event, although local calibration for each type of burned soil is necessary. Copyright © by the American Society of Agronomy, Crop Science Society of America, and Soil Science Society of America, Inc.
Klein-Júnior, Luiz C; Viaene, Johan; Tuenter, Emmy; Salton, Juliana; Gasper, André L; Apers, Sandra; Andries, Jan P M; Pieters, Luc; Henriques, Amélia T; Vander Heyden, Yvan
2016-09-09
Psychotria nemorosa is chemically characterized by indole alkaloids and displays significant inhibitory activity on butyrylcholinesterase (BChE) and monoamine oxidase-A (MAO-A), both enzymes related to neurodegenerative disorders. In the present study, 43 samples of P. nemorosa leaves were extracted and fractionated in accordance with previously optimized methods (see Part I). These fractions were analyzed by means of UPLC-DAD and assayed for their BChE and MAO-A inhibitory potencies. The chromatographic fingerprint data were first aligned using correlation optimized warping, and Principal Component Analysis was performed to explore the data structure. Multivariate calibration techniques, namely Partial Least Squares (PLS1), PLS2 and Orthogonal Projections to Latent Structure (O-PLS1), were evaluated for modelling the activities as a function of the fingerprints. Since the best results were obtained with the O-PLS1 model (RMSECV=9.3 and 3.3 for BChE and MAO-A, respectively), the regression coefficients of the model were analyzed and plotted relative to the original fingerprints. Four peaks were indicated as multifunctional compounds, with the capacity to impair both BChE and MAO-A activities. In order to confirm these results, a semi-prep HPLC technique was used and a fraction containing the four peaks was purified and evaluated in vitro. It was observed that the fraction exhibited an IC50 of 2.12 μg mL(-1) for BChE and 1.07 μg mL(-1) for MAO-A. These results reinforce the prediction obtained by O-PLS1 modelling. Copyright © 2016 Elsevier B.V. All rights reserved.
Kehimkar, Benjamin; Parsons, Brendon A; Hoggard, Jamin C; Billingsley, Matthew C; Bruno, Thomas J; Synovec, Robert E
2015-01-01
Recent efforts in predicting rocket propulsion (RP-1) fuel performance through modeling put greater emphasis on obtaining detailed and accurate fuel properties, as well as elucidating the relationships between fuel compositions and their properties. Herein, we study multidimensional chromatographic data obtained by comprehensive two-dimensional gas chromatography combined with time-of-flight mass spectrometry (GC × GC-TOFMS) to analyze RP-1 fuels. For GC × GC separations, RTX-Wax (polar stationary phase) and RTX-1 (non-polar stationary phase) columns were implemented for the primary and secondary dimensions, respectively, to separate the chemical compound classes (alkanes, cycloalkanes, aromatics, etc.), providing a significant level of chemical compositional information. The GC × GC-TOFMS data were analyzed using partial least squares regression (PLS) chemometric analysis to model and predict advanced distillation curve (ADC) data for ten RP-1 fuels that were previously analyzed using the ADC method. The PLS modeling provides insight into the chemical species that impact the ADC data. The PLS modeling correlates compositional information found in the GC × GC-TOFMS chromatograms of each RP-1 fuel, and their respective ADC, and allows prediction of the ADC for each RP-1 fuel with good precision and accuracy. The root-mean-square error of calibration (RMSEC) ranged from 0.1 to 0.5 °C, and was typically below ∼0.2 °C, for the PLS calibration of the ADC modeling with GC × GC-TOFMS data, indicating a good fit of the model to the calibration data. Likewise, the predictive power of the overall method via PLS modeling was assessed using leave-one-out cross-validation (LOOCV) yielding root-mean-square error of cross-validation (RMSECV) ranging from 1.4 to 2.6 °C, and was typically below ∼2.0 °C, at each % distilled measurement point during the ADC analysis.
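The difference between RMSEC and leave-one-out RMSECV reported above can be illustrated with a small sketch. To keep it self-contained, a one-predictor least-squares line stands in for the PLS model, which is a deliberate simplification; the data values are hypothetical and the function names are illustrative.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for a single predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def rmse(y_true, y_pred):
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def rmsec(x, y):
    """Calibration error: fit on all samples, evaluate on the same samples."""
    slope, intercept = fit_line(x, y)
    return rmse(y, [slope * xi + intercept for xi in x])

def rmsecv_loo(x, y):
    """Cross-validation error: leave each sample out, refit, predict it."""
    preds = []
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        preds.append(slope * x[i] + intercept)
    return rmse(y, preds)

# Hypothetical data: one compositional score vs. a distillation temperature (deg C).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [201.2, 203.9, 206.1, 207.8, 210.3, 211.9]
calibration_error = rmsec(x, y)
cross_validation_error = rmsecv_loo(x, y)
```

Because each left-out sample is predicted by a model that never saw it, the cross-validated error is at least as large as the calibration error for a least-squares fit, which matches the pattern of RMSECV exceeding RMSEC in the abstract.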
Analysis of pork adulteration in beef meatball using Fourier transform infrared (FTIR) spectroscopy.
Rohman, A; Sismindari; Erwanto, Y; Che Man, Yaakob B
2011-05-01
Meatball is one of the favorite foods in Indonesia. The adulteration of beef meatball with pork occurs frequently. This study aimed to develop a fast and non-destructive technique for the detection and quantification of pork in beef meatball using Fourier transform infrared (FTIR) spectroscopy and partial least squares (PLS) calibration. The spectral bands associated with pork fat (PF), beef fat (BF), and their mixtures in meatball formulation were scanned, interpreted, and identified by relating them to those spectroscopically representative of pure PF and BF. For quantitative analysis, PLS regression was used to develop a calibration model at the selected fingerprint region of 1200-1000 cm(-1). The equation obtained for the relationship between actual PF values and FTIR-predicted values in the PLS calibration model was y = 0.999x + 0.004, with a coefficient of determination (R(2)) and root mean square error of calibration of 0.999 and 0.442, respectively. The PLS calibration model was subsequently used for the prediction of independent samples using laboratory-made meatball samples containing mixtures of BF and PF. Using 4 principal components, the root mean square error of prediction was 0.742. The results showed that FTIR spectroscopy can be used for the detection and quantification of pork in beef meatball formulation for Halal verification purposes. Copyright © 2010 The American Meat Science Association. Published by Elsevier Ltd. All rights reserved.
Lee, Byeong-Ju; Kim, Hye-Youn; Lim, Sa Rang; Huang, Linfang; Choi, Hyung-Kyoon
2017-01-01
Panax ginseng C.A. Meyer is a herb used for medicinal purposes, and its discrimination according to cultivation age has been an important and practical issue. This study employed Fourier-transform infrared (FT-IR) spectroscopy with multivariate statistical analysis to obtain a prediction model for discriminating cultivation ages (5 and 6 years) and three different parts (rhizome, tap root, and lateral root) of P. ginseng. The optimal partial-least-squares regression (PLSR) models for discriminating ginseng samples were determined by selecting normalization methods, number of partial-least-squares (PLS) components, and variable influence on projection (VIP) cutoff values. The best prediction model for discriminating 5- and 6-year-old ginseng was developed using tap root, vector normalization applied after the second differentiation, one PLS component, and a VIP cutoff of 1.0 (based on the lowest root-mean-square error of prediction value). In addition, for discriminating among the three parts of P. ginseng, optimized PLSR models were established using data sets obtained from vector normalization, two PLS components, and VIP cutoff values of 1.5 (for 5-year-old ginseng) and 1.3 (for 6-year-old ginseng). To our knowledge, this is the first study to provide a novel strategy for rapidly discriminating the cultivation ages and parts of P. ginseng using FT-IR by selected normalization methods, number of PLS components, and VIP cutoff values.
PMID:29049369
Sun, Zhongyu; Li, Can; Li, Lian; Nie, Lei; Dong, Qin; Li, Danyang; Gao, Lingling; Zang, Hengchang
2018-08-05
N-acetyl-d-glucosamine (GlcNAc) is a microbial fermentation product, and NIR spectroscopy is an effective process analytical technology (PAT) tool for detecting the key quality attribute: the GlcNAc content. Meanwhile, the design of NIR spectrometers is trending toward miniaturization, portability and low cost. The aim of this study was to explore the use of a portable micro NIR spectrometer in the fermentation process. First, an FT-NIR spectrometer and a Micro-NIR 1700 spectrometer were compared using simulated fermentation process solutions. The Rc², Rp², RMSECV and RMSEP of the optimal FT-NIR and Micro-NIR 1700 models were 0.999, 0.999, 3.226 g/L, 1.388 g/L and 0.999, 0.999, 1.821 g/L, 0.967 g/L, respectively. Passing-Bablok regression and paired t-test results showed there were no significant differences between the two instruments. Then the Micro-NIR 1700 was selected for the practical fermentation process, and 135 samples from 10 batches were collected. Spectral pretreatment methods and variable selection methods (BiPLS, FiPLS, MWPLS and CARS-PLS) for PLS modeling were discussed. The Rc², Rp², RMSECV and RMSEP of the optimal GlcNAc content PLS model of the practical fermentation process were 0.994, 0.995, 2.792 g/L and 1.946 g/L, respectively. The results provide a positive reference for application of the Micro-NIR spectrometer. To some extent, it could provide theoretical support in guiding microbial fermentation or the further assessment of bioprocesses. Copyright © 2018. Published by Elsevier B.V.
Quantification of trace metals in infant formula premixes using laser-induced breakdown spectroscopy
NASA Astrophysics Data System (ADS)
Cama-Moncunill, Raquel; Casado-Gavalda, Maria P.; Cama-Moncunill, Xavier; Markiewicz-Keszycka, Maria; Dixit, Yash; Cullen, Patrick J.; Sullivan, Carl
2017-09-01
Infant formula is a human milk substitute generally based upon fortified cow milk components. In order to mimic the composition of breast milk, trace elements such as copper, iron and zinc are usually added in a single operation using a premix. The correct addition of premixes must be verified to ensure that the target levels in infant formulae are achieved. In this study, a laser-induced breakdown spectroscopy (LIBS) system was assessed as a fast validation tool for trace element premixes. LIBS is a promising emission spectroscopic technique for elemental analysis, which offers real-time analyses, little to no sample preparation and ease of use. LIBS was employed for copper and iron determinations of premix samples ranging approximately from 0 to 120 mg/kg Cu/1640 mg/kg Fe. LIBS spectra are affected by several parameters, hindering subsequent quantitative analyses. This work aimed at testing three matrix-matched calibration approaches (simple-linear regression, multi-linear regression and partial least squares regression (PLS)) as means for precision and accuracy enhancement of LIBS quantitative analysis. All calibration models were first developed using a training set and then validated with an independent test set. PLS yielded the best results. For instance, the PLS model for copper provided a coefficient of determination (R2) of 0.995 and a root mean square error of prediction (RMSEP) of 14 mg/kg. Furthermore, LIBS was employed to penetrate through the samples by repetitively measuring the same spot. Consequently, LIBS spectra can be obtained as a function of sample layers. This information was used to explore whether measuring deeper into the sample could reduce possible surface-contaminant effects and provide better quantifications.
Lim, Jongguk; Kim, Giyoung; Mo, Changyeun; Kim, Moon S; Chao, Kuanglin; Qin, Jianwei; Fu, Xiaping; Baek, Insuck; Cho, Byoung-Kwan
2016-05-01
Illegal use of nitrogen-rich melamine (C3H6N6) to boost perceived protein content of food products such as milk, infant formula, frozen yogurt, pet food, biscuits, and coffee drinks has caused serious food safety problems. Conventional methods to detect melamine in foods, such as Enzyme-linked immunosorbent assay (ELISA), High-performance liquid chromatography (HPLC), and Gas chromatography-mass spectrometry (GC-MS), are sensitive but they are time-consuming, expensive, and labor-intensive. In this research, near-infrared (NIR) hyperspectral imaging technique combined with regression coefficient of partial least squares regression (PLSR) model was used to detect melamine particles in milk powders easily and quickly. NIR hyperspectral reflectance imaging data in the spectral range of 990-1700nm were acquired from melamine-milk powder mixture samples prepared at various concentrations ranging from 0.02% to 1%. PLSR models were developed to correlate the spectral data (independent variables) with melamine concentration (dependent variables) in melamine-milk powder mixture samples. PLSR models applying various pretreatment methods were used to reconstruct the two-dimensional PLS images. PLS images were converted to the binary images to detect the suspected melamine pixels in milk powder. As the melamine concentration was increased, the numbers of suspected melamine pixels of binary images were also increased. These results suggested that NIR hyperspectral imaging technique and the PLSR model can be regarded as an effective tool to detect melamine particles in milk powders. Copyright © 2016 Elsevier B.V. All rights reserved.
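The conversion of a PLS image into a binary image, as described above, can be sketched in a few lines. The pixel values, the 3x4 image size, and the 0.5 threshold are assumptions chosen for illustration; the abstract does not state how the threshold was set.

```python
def to_binary(image, threshold):
    """Mark pixels whose predicted value exceeds the threshold as 1, others as 0."""
    return [[1 if value > threshold else 0 for value in row] for row in image]

def count_suspect_pixels(binary):
    """Count pixels flagged as suspected melamine."""
    return sum(sum(row) for row in binary)

# Hypothetical "PLS image": one model prediction per pixel.
pls_image = [
    [0.02, 0.10, 0.01, 0.00],
    [0.03, 0.85, 0.90, 0.05],
    [0.00, 0.04, 0.70, 0.02],
]
binary = to_binary(pls_image, threshold=0.5)  # threshold is an assumed value
n_suspect = count_suspect_pixels(binary)
```

Counting flagged pixels per image gives the quantity that the abstract reports as increasing with melamine concentration.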
Quantitative analysis of red wine tannins using Fourier-transform mid-infrared spectrometry.
Fernandez, Katherina; Agosin, Eduardo
2007-09-05
Tannin content and composition are critical quality components of red wines. No spectroscopic method assessing these phenols in wine has been described so far. We report here a new method using Fourier transform mid-infrared (FT-MIR) spectroscopy and chemometric techniques for the quantitative analysis of red wine tannins. Calibration models were developed using protein precipitation and phloroglucinolysis as analytical reference methods. After spectra preprocessing, six different predictive partial least-squares (PLS) models were evaluated, including the use of interval selection procedures such as iPLS and CSMWPLS. PLS regression with full-range (650-4000 cm(-1)), second derivative of the spectra and phloroglucinolysis as the reference method gave the most accurate determination for tannin concentration (RMSEC = 2.6%, RMSEP = 9.4%, r = 0.995). The prediction of the mean degree of polymerization (mDP) of the tannins also gave a reasonable prediction (RMSEC = 6.7%, RMSEP = 10.3%, r = 0.958). These results represent the first step in the development of a spectroscopic methodology for the quantification of several phenolic compounds that are critical for wine quality.
Baum, Andreas; Hansen, Per Waaben; Meyer, Anne S; Mikkelsen, Jørn Dalgaard
2013-08-06
Enzymes are used in many processes to release fermentable sugars for green production of biofuel, or the refinery of biomass for extraction of functional food ingredients such as pectin or prebiotic oligosaccharides. The complex biomasses may, however, require a multitude of specific enzymes which are active on specific substrates generating a multitude of products. In this paper we use the plant polymer, pectin, to present a method to quantify enzyme activity of two pectolytic enzymes by monitoring their superimposed spectral evolutions simultaneously. The data is analyzed by three chemometric multiway methods, namely PARAFAC, TUCKER3 and N-PLS, to establish simultaneous enzyme activity assays for pectin lyase and pectin methyl esterase. Correlation coefficients Rpred(2) for prediction test sets are 0.48, 0.96 and 0.96 for pectin lyase and 0.70, 0.89 and 0.89 for pectin methyl esterase, respectively. The retrieved models are compared and prediction test sets show that especially TUCKER3 performs well, even in comparison to the supervised regression method N-PLS. Copyright © 2013 Elsevier B.V. All rights reserved.
Ciura, Krzesimir; Belka, Mariusz; Kawczak, Piotr; Bączek, Tomasz; Markuszewski, Michał J; Nowakowska, Joanna
2017-09-05
The objective of this paper is to build QSRR/QSAR models for predicting blood-brain barrier (BBB) permeability. The obtained models are based on salting-out thin layer chromatography (SOTLC) constants and calculated molecular descriptors. Among chromatographic methods SOTLC was chosen, since the mobile phases are free of organic solvent. As a consequence, they are less toxic and have a lower environmental impact compared to classical reversed-phase liquid chromatography (RPLC). During the study three stationary phases (silica gel, cellulose plates and neutral aluminum oxide) were examined. The model set of solutes presents a wide range of log BB values, containing compounds which cross the BBB readily and molecules poorly distributed to the brain, including drugs acting on the nervous system as well as peripherally acting drugs. Additionally, a comparison of three regression models, multiple linear regression (MLR), partial least-squares (PLS) and orthogonal partial least squares (OPLS), was performed. The designed QSRR/QSAR models could be useful to predict BBB permeability of systematically synthesized new compounds in the drug development pipeline and are attractive alternatives to time-consuming and demanding direct methods for log BB measurement. The study also showed that among several regression techniques, significant differences can be obtained in model performance, measured by R² and Q²; hence it is strongly suggested to evaluate all available options such as MLR, PLS and OPLS. Copyright © 2017 Elsevier B.V. All rights reserved.
Zhang, Chu; Liu, Fei; Kong, Wenwen; He, Yong
2015-01-01
Visible and near-infrared hyperspectral imaging covering spectral range of 380–1030 nm as a rapid and non-destructive method was applied to estimate the soluble protein content of oilseed rape leaves. Average spectrum (500–900 nm) of the region of interest (ROI) of each sample was extracted, and four samples out of 128 samples were defined as outliers by Monte Carlo-partial least squares (MCPLS). Partial least squares (PLS) model using full spectra obtained dependable performance with the correlation coefficient (rp) of 0.9441, root mean square error of prediction (RMSEP) of 0.1658 mg/g and residual prediction deviation (RPD) of 2.98. The weighted regression coefficient (Bw), successive projections algorithm (SPA) and genetic algorithm-partial least squares (GAPLS) selected 18, 15, and 16 sensitive wavelengths, respectively. SPA-PLS model obtained the best performance with rp of 0.9554, RMSEP of 0.1538 mg/g and RPD of 3.25. Distribution of protein content within the rape leaves were visualized and mapped on the basis of the SPA-PLS model. The overall results indicated that hyperspectral imaging could be used to determine and visualize the soluble protein content of rape leaves. PMID:26184198
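The three figures of merit quoted above (rp, RMSEP and RPD) can be computed as in the following sketch. It assumes one common convention, in which RPD is the sample standard deviation of the reference values divided by RMSEP; other definitions exist, and the abstract does not say which was used. The data are hypothetical.

```python
import math
import statistics

def rmsep(y_true, y_pred):
    """Root mean square error of prediction."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def rpd(y_true, y_pred):
    """Residual prediction deviation: SD of reference values over RMSEP (assumed convention)."""
    return statistics.stdev(y_true) / rmsep(y_true, y_pred)

def pearson_r(a, b):
    """Correlation coefficient between reference and predicted values."""
    ma, mb = statistics.fmean(a), statistics.fmean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = math.sqrt(sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b))
    return num / den

# Hypothetical reference and predicted protein contents (mg/g).
y_true = [1.0, 2.0, 3.0, 4.0, 5.0]
y_pred = [1.1, 1.9, 3.2, 3.8, 5.0]
metrics = (pearson_r(y_true, y_pred), rmsep(y_true, y_pred), rpd(y_true, y_pred))
```

A higher RPD means the prediction error is small relative to the natural spread of the reference values, which is why the SPA-PLS model (RPD 3.25) is ranked above the full-spectrum model (RPD 2.98).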
Monitoring of chicken meat freshness by means of a colorimetric sensor array.
Salinas, Yolanda; Ros-Lis, José V; Vivancos, José-L; Martínez-Máñez, Ramón; Marcos, M Dolores; Aucejo, Susana; Herranz, Nuria; Lorente, Inmaculada
2012-08-21
A new optoelectronic nose to monitor chicken meat ageing has been developed. It is based on 16 pigments prepared by the incorporation of different dyes (pH indicators, Lewis acids, hydrogen-bonding derivatives, selective probes and natural dyes) into inorganic materials (UVM-7, silica and alumina). The colour changes of the sensor array were characteristic of chicken ageing in a modified packaging atmosphere (30% CO(2)-70% N(2)). The chromogenic array data were processed with qualitative (PCA) and quantitative (PLS) tools. The PCA statistical analysis showed a high degree of dispersion, with nine dimensions required to explain 95% of variance. Despite this high dimensionality, a tridimensional representation of the three principal components was able to differentiate ageing with 2-day intervals. Moreover, the PLS statistical analysis allows the creation of a model to correlate the chromogenic data with chicken meat ageing. The model offers a PLS prediction model for ageing with values of 0.9937, 0.0389 and 0.994 for the slope, the intercept and the regression coefficient, respectively, and is in agreement with the perfect fit between the predicted and measured values observed. The results suggest the feasibility of this system to help develop optoelectronic noses that monitor food freshness.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Muyang; Williams, Daniel L.; Heckwolf, Marlies; ...
2016-10-04
In this paper, we explore the ability of several characterization approaches for phenotyping to extract information about plant cell wall properties in diverse maize genotypes with the goal of identifying approaches that could be used to predict the plant's response to deconstruction in a biomass-to-biofuel process. Specifically, a maize diversity panel was subjected to two high-throughput biomass characterization approaches, pyrolysis molecular beam mass spectrometry (py-MBMS) and near-infrared (NIR) spectroscopy, and chemometric models to predict a number of plant cell wall properties as well as enzymatic hydrolysis yields of glucose following either no pretreatment or with mild alkaline pretreatment. These were compared to multiple linear regression (MLR) models developed from quantified properties. We were able to demonstrate that direct correlations to specific mass spectrometry ions from pyrolysis as well as characteristic regions of the second derivative of the NIR spectrum regions were comparable in their predictive capability to partial least squares (PLS) models for p-coumarate content, while the direct correlation to the spectral data was superior to the PLS for Klason lignin content and guaiacyl monomer release by thioacidolysis as assessed by cross-validation. The PLS models for prediction of hydrolysis yields using either py-MBMS or NIR spectra were superior to MLR models based on quantified properties for unpretreated biomass. However, the PLS models using the two high-throughput characterization approaches could not predict hydrolysis following alkaline pretreatment while MLR models based on quantified properties could. This is likely a consequence of quantified properties including some assessments of pretreated biomass, while the py-MBMS and NIR only utilized untreated biomass.
Dinç, Erdal; Ustündağ, Ozgür; Baleanu, Dumitru
2010-08-01
The sole use of pyridoxine hydrochloride during treatment of tuberculosis gives rise to pyridoxine deficiency. Therefore, a combination of pyridoxine hydrochloride and isoniazid is used in pharmaceutical dosage form in tuberculosis treatment to reduce this side effect. In this study, two chemometric methods, partial least squares (PLS) and principal component regression (PCR), were applied to the simultaneous determination of pyridoxine (PYR) and isoniazid (ISO) in their tablets. A concentration training set comprising 20 different binary mixtures of PYR and ISO was randomly prepared in 0.1 M HCl. Both multivariate calibration models were constructed using the relationships between the concentration data set (concentration data matrix) and the absorbance data matrix in the spectral region 200-330 nm. The accuracy and the precision of the proposed chemometric methods were validated by analyzing synthetic mixtures containing the investigated drugs. The recovery results obtained by applying PCR and PLS calibrations to the artificial mixtures were found to be between 100.0 and 100.7%. Satisfactory results were obtained by applying the PLS and PCR methods to both artificial and commercial samples. The results obtained in this study strongly encourage the use of these methods for the quality control and routine analysis of marketed tablets containing PYR and ISO. Copyright © 2010 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Dyar, M. D.; Carmosino, M. L.; Breves, E. A.; Ozanne, M. V.; Clegg, S. M.; Wiens, R. C.
2012-04-01
A remote laser-induced breakdown spectrometer (LIBS) designed to simulate the ChemCam instrument on the Mars Science Laboratory Rover Curiosity was used to probe 100 geologic samples at a 9-m standoff distance. ChemCam consists of an integrated remote LIBS instrument that will probe samples up to 7 m from the mast of the rover and a remote micro-imager (RMI) that will record context images. The elemental compositions of 100 igneous and highly-metamorphosed rocks are determined with LIBS using three variations of multivariate analysis, with a goal of improving the analytical accuracy. Two forms of partial least squares (PLS) regression are employed with finely-tuned parameters: PLS-1 regresses a single response variable (elemental concentration) against the observation variables (spectra, or intensity at each of 6144 spectrometer channels), while PLS-2 simultaneously regresses multiple response variables (concentrations of the ten major elements in rocks) against the observation predictor variables, taking advantage of natural correlations between elements. Those results are contrasted with those from the multivariate regression technique of the least absolute shrinkage and selection operator (lasso), which is a penalized shrunken regression method that selects the specific channels for each element that explain the most variance in the concentration of that element. To make this comparison, we use results of cross-validation and of held-out testing, and employ unscaled and uncentered spectral intensity data because all of the input variables are already in the same units. Results demonstrate that the lasso, PLS-1, and PLS-2 all yield comparable results in terms of accuracy for this dataset. However, the interpretability of these methods differs greatly in terms of fundamental understanding of LIBS emissions. 
PLS techniques generate principal components, linear combinations of intensities at any number of spectrometer channels, which explain as much variance in the response variables as possible while avoiding multicollinearity between principal components. When the selected number of principal components is projected back into the original feature space of the spectra, 6144 correlation coefficients are generated, a small fraction of which are mathematically significant to the regression. In contrast, the lasso models require only a small number (< 24) of non-zero correlation coefficients (β values) to determine the concentration of each of the ten major elements. Causality between the positively-correlated emission lines chosen by the lasso and the elemental concentration was examined. In general, the higher the lasso coefficient (β), the greater the likelihood that the selected line results from an emission of that element. Emission lines with negative β values should arise from elements that are anti-correlated with the element being predicted. For elements except Fe, Al, Ti, and P, the lasso-selected wavelength with the highest β value corresponds to the element being predicted, e.g. 559.8 nm for neutral Ca. However, the specific lines chosen by the lasso with positive β values are not always those from the element being predicted. Other wavelengths and the elements that most strongly correlate with them to predict concentration are obviously related to known geochemical correlations or close overlap of emission lines, while others must result from matrix effects. Use of the lasso technique thus directly informs our understanding of the underlying physical processes that give rise to LIBS emissions by determining which lines can best represent concentration, and which lines from other elements are causing matrix effects.
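The sparsity that the abstract attributes to the lasso comes from soft-thresholding, which sets small coefficients exactly to zero. The following is a minimal coordinate-descent sketch under stated assumptions: the objective is (1/2n)·||y − Xβ||² + λ·||β||₁ with no intercept, and the function names and the tiny orthogonal design are made up for illustration rather than drawn from the LIBS data.

```python
def soft_threshold(rho, lam):
    """Shrink rho toward zero by lam; values within [-lam, lam] become 0."""
    if rho > lam:
        return rho - lam
    if rho < -lam:
        return rho + lam
    return 0.0

def lasso_cd(X, y, lam, n_sweeps=200, tol=1e-10):
    """Cyclic coordinate descent for the lasso (no intercept, hypothetical helper)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)                      # residual y - X @ beta, with beta = 0
    z = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        max_change = 0.0
        for j in range(p):
            if z[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (column j added back).
            rho = sum(X[i][j] * (resid[i] + X[i][j] * beta[j]) for i in range(n)) / n
            new = soft_threshold(rho, lam) / z[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return beta

# Toy design with three mutually orthogonal "channels".
X = [[1, 1, 1], [1, -1, 1], [1, 1, -1], [1, -1, -1]]
y = [3.1, 2.9, 3.1, 2.9]                 # 3*channel0 + 0.1*channel1
beta = lasso_cd(X, y, lam=0.5)
print(beta)
```

In this toy case only the first channel survives the penalty, which mirrors how a lasso model can retain a handful of channels out of thousands.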
Lu, Yuzhen; Du, Changwen; Yu, Changbing; Zhou, Jianmin
2014-08-01
Fast and non-destructive determination of rapeseed protein content carries significant implications in rapeseed production. This study presented the first attempt at using Fourier transform mid-infrared photoacoustic spectroscopy (FTIR-PAS) to quantify the protein content of rapeseed. The full-spectrum model was first built using partial least squares (PLS). Interval selection methods including interval partial least squares (iPLS), synergy interval partial least squares (siPLS), backward elimination interval partial least squares (biPLS) and dynamic backward elimination interval partial least squares (dyn-biPLS) were then employed to select the relevant band or band combination for PLS modeling. The full-spectrum PLS model achieved a ratio of prediction to deviation (RPD) of 2.047. In comparison, all interval selection methods produced better results than full-spectrum modeling. siPLS achieved the best predictive accuracy with an RPD of 3.215 when the spectrum was sectioned into 25 intervals, and two intervals (1198-1335 and 1614-1753 cm⁻¹) were selected. iPLS outperformed biPLS and dyn-biPLS, and dyn-biPLS performed slightly better than biPLS. FTIR-PAS was verified as a promising analytical tool to quantify rapeseed protein content. Interval selection could extract the relevant individual band or synergy band associated with the sample constituent of interest, and then improve the prediction accuracy of the full-spectrum model. © 2013 Society of Chemical Industry.
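Two ingredients of the interval-selection workflow above are easy to make concrete: the RPD figure of merit and the sectioning of a spectrum into intervals. The sketch below assumes the common definition RPD = (standard deviation of the reference values) / RMSEP, which the abstract does not spell out; the helper names and the numbers are illustrative only.

```python
import math

def rmse(reference, predicted):
    """Root mean square error between reference and predicted values."""
    n = len(reference)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)

def rpd(reference, predicted):
    """Assumed definition: sample SD of the reference values divided by RMSEP."""
    n = len(reference)
    mean = sum(reference) / n
    sd = math.sqrt(sum((r - mean) ** 2 for r in reference) / (n - 1))
    return sd / rmse(reference, predicted)

def split_intervals(n_variables, n_intervals):
    """Return (start, stop) index pairs covering the spectrum in near-equal blocks."""
    base, extra = divmod(n_variables, n_intervals)
    bounds, start = [], 0
    for k in range(n_intervals):
        stop = start + base + (1 if k < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds

reference = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 1.5, 3.5, 3.5, 5.5]
print(round(rpd(reference, predicted), 3))
print(split_intervals(10, 3))
```

In an iPLS-style search, one PLS model would be fitted per interval returned by `split_intervals` and the intervals ranked by a validation error such as RMSECV; that outer loop is omitted here.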
Predicting heavy metal concentrations in soils and plants using field spectrophotometry
NASA Astrophysics Data System (ADS)
Muradyan, V.; Tepanosyan, G.; Asmaryan, Sh.; Sahakyan, L.; Saghatelyan, A.; Warner, T. A.
2017-09-01
The aim of this study is to predict heavy metal (HM) concentrations in soils and plants using field remote sensing methods. The studied sites were the industrial town of Kajaran and the city of Yerevan. The research also included sampling of soils and leaves of two tree species exposed to different pollution levels and determination of HM contents under laboratory conditions. The obtained spectral values were then collated with the HM contents of Kajaran soils and of the tree leaves sampled in Yerevan, and statistical analysis was performed. Zn and Pb showed a negative correlation coefficient (p < 0.01) in the 2498 nm spectral range for soils. Pb showed a significantly higher correlation at the red edge for plants. Regression models and an artificial neural network (ANN) for HM prediction were developed. Good results were obtained for the best stress-sensitive spectral band ANN (R² 0.9, RPD 2.0), Simple Linear Regression (SLR) and Partial Least Squares Regression (PLSR) (R² 0.7, RPD 1.4) models. The Multiple Linear Regression (MLR) model was not applicable for predicting Pb and Zn concentrations in soils in this research. Almost all full-spectrum PLS models provided good calibration and validation results (RPD > 1.4). Full-spectrum ANN models were characterized by excellent calibration R², rRMSE and RPD (0.9, 0.1 and > 2.5, respectively). For prediction of Pb and Ni contents in plants, SLR and PLS models were used; the two provided almost the same results. Our findings indicate that it is possible to make a coarse direct estimation of HM content in soils and plants using rapid and economical reflectance spectroscopy.
Hemmila, April; McGill, Jim; Ritter, David
2008-03-01
To determine if changes in fingerprint infrared spectra linear with age can be found, partial least squares (PLS1) regression of 155 fingerprint infrared spectra against the person's age was constructed. The regression produced a linear model of age as a function of spectrum with a root mean square error of calibration of less than 4 years, showing an inflection at about 25 years of age. The spectral ranges emphasized by the regression do not correspond to the highest concentration constituents of the fingerprints. Separate linear regression models for old and young people can be constructed with even more statistical rigor. The success of the regression demonstrates that a combination of constituents can be found that changes linearly with age, with a significant shift around puberty.
Statistical variation in progressive scrambling
NASA Astrophysics Data System (ADS)
Clark, Robert D.; Fox, Peter C.
2004-07-01
The two methods most often used to evaluate the robustness and predictivity of partial least squares (PLS) models are cross-validation and response randomization. Both methods may be overly optimistic for data sets that contain redundant observations, however. The kinds of perturbation analysis widely used for evaluating model stability in the context of ordinary least squares regression are only applicable when the descriptors are independent of each other and errors are independent and normally distributed; neither assumption holds for QSAR in general and for PLS in particular. Progressive scrambling is a novel, non-parametric approach to perturbing models in the response space in a way that does not disturb the underlying covariance structure of the data. Here, we introduce adjustments for two of the characteristic values produced by a progressive scrambling analysis - the deprecated predictivity ($Q_s^{*2}$) and standard error of prediction ($\mathrm{SDEP}_s^{*}$) - that correct for the effect of introduced perturbation. We also explore the statistical behavior of the adjusted values ($Q_0^{*2}$ and $\mathrm{SDEP}_0^{*}$) and the sensitivity to perturbation ($dq^2/dr_{yy'}^2$). It is shown that the three statistics are all robust for stable PLS models, in terms of the stochastic component of their determination and of their variation due to sampling effects involved in training set selection.
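The two baseline checks named at the start of this abstract, cross-validation and response randomization, can be sketched in a few lines. This is not progressive scrambling itself; it is only the plain full-shuffle randomization the authors contrast it with, applied to a one-descriptor straight-line model instead of PLS to keep the example short. Function names and data are hypothetical.

```python
import random

def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sxx
    return slope, my - slope * mx

def loo_q2(x, y):
    """Leave-one-out cross-validated q2 = 1 - PRESS / SS_total."""
    press = 0.0
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    my = sum(y) / len(y)
    ss_total = sum((b - my) ** 2 for b in y)
    return 1.0 - press / ss_total

def randomized_q2(x, y, n_trials=20, seed=0):
    """q2 values obtained after shuffling the responses (response randomization)."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_trials):
        shuffled = list(y)
        rng.shuffle(shuffled)
        results.append(loo_q2(x, shuffled))
    return results

x = [float(i) for i in range(8)]
y = [2.0 * a + 1.0 for a in x]           # noise-free toy response
q2_true = loo_q2(x, y)
q2_random = randomized_q2(x, y)
print(round(q2_true, 3), round(max(q2_random), 3))
```

A model is usually considered suspect when the q2 values from shuffled responses approach the q2 of the real responses.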
Adedipe, Oluwatosin E; Johanningsmeier, Suzanne D; Truong, Van-Den; Yencho, G Craig
2016-03-02
This study investigated the ability of near-infrared spectroscopy (NIRS) to predict acrylamide content in French-fried potato. Potato flour spiked with acrylamide (50-8000 μg/kg) was used to determine if acrylamide could be accurately predicted in a potato matrix. French fries produced with various pretreatments and cook times (n = 84) and obtained from quick-service restaurants (n = 64) were used for model development and validation. Acrylamide was quantified using gas chromatography-mass spectrometry, and reflectance spectra (400-2500 nm) of each freeze-dried sample were captured on a Foss XDS Rapid Content Analyzer-NIR spectrometer. Partial least-squares (PLS) discriminant analysis and PLS regression modeling demonstrated that NIRS could accurately detect acrylamide content as low as 50 μg/kg in the model potato matrix. Prediction errors of 135 μg/kg (R² = 0.98) and 255 μg/kg (R² = 0.93) were achieved with the best PLS models for acrylamide prediction in Russet Norkotah French-fried potato and multiple samples of unknown varieties, respectively. The findings indicate that NIRS can be used as a screening tool in potato breeding and potato processing research to reduce acrylamide in the food supply.
Identification of chilling and heat requirements of cherry trees--a statistical approach.
Luedeling, Eike; Kunz, Achim; Blanke, Michael M
2013-09-01
Most trees from temperate climates require the accumulation of winter chill and subsequent heat during their dormant phase to resume growth and initiate flowering in the following spring. Global warming could reduce chill and hence hamper the cultivation of high-chill species such as cherries. Yet determining chilling and heat requirements requires large-scale controlled-forcing experiments, and estimates are thus often unavailable. Where long-term phenology datasets exist, partial least squares (PLS) regression can be used as an alternative, to determine climatic requirements statistically. Bloom dates of cherry cv. 'Schneiders späte Knorpelkirsche' trees in Klein-Altendorf, Germany, from 24 growing seasons were correlated with 11-day running means of daily mean temperature. Based on the output of the PLS regression, five candidate chilling periods ranging in length from 17 to 102 days, and one forcing phase of 66 days were delineated. Among three common chill models used to quantify chill, the Dynamic Model showed the lowest variation in chill, indicating that it may be more accurate than the Utah and Chilling Hours Models. Based on the longest candidate chilling phase with the earliest starting date, cv. 'Schneiders späte Knorpelkirsche' cherries at Bonn exhibited a chilling requirement of 68.6 ± 5.7 chill portions (or 1,375 ± 178 chilling hours or 1,410 ± 238 Utah chill units) and a heat requirement of 3,473 ± 1,236 growing degree hours. Closer investigation of the distinct chilling phases detected by PLS regression could contribute to our understanding of dormancy processes and thus help fruit and nut growers identify suitable tree cultivars for a future in which static climatic conditions can no longer be assumed. All procedures used in this study were bundled in an R package ('chillR') and are provided as Supplementary materials. The procedure was also applied to leaf emergence dates of walnut (cv. 'Payne') at Davis, California.
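The 11-day running means of daily mean temperature used as predictors in this analysis can be computed as in the sketch below. It is written in plain Python and is not the chillR implementation; the centred window and the shortened windows at the series edges are assumptions made for illustration.

```python
def running_mean(values, window=11):
    """Centred running mean; windows are truncated at the start and end of the series."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        chunk = values[lo:hi]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed

# Invented daily mean temperatures (deg C) rising by one degree per day.
daily_temps = [float(d) for d in range(21)]
smoothed = running_mean(daily_temps, window=11)
print(smoothed[0], smoothed[10], smoothed[20])
```

Each smoothed day then serves as one predictor column in the PLS regression against bloom date, so a season contributes one row of such values.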
NASA Astrophysics Data System (ADS)
Isingizwe Nturambirwe, J. Frédéric; Perold, Willem J.; Opara, Umezuruike L.
2016-02-01
Near infrared (NIR) spectroscopy has gained extensive use in quality evaluation. It is arguably one of the most advanced spectroscopic tools in non-destructive quality testing of foodstuffs, from measurement to data analysis and interpretation. NIR spectral data are interpreted through means often involving multivariate statistical analysis, sometimes associated with optimisation techniques for model improvement. The objective of this research was to explore the extent to which genetic algorithms (GA) can be used to enhance model development for predicting fruit quality. Apple fruits were used, and NIR spectra in the range from 12000 to 4000 cm⁻¹ were acquired on both bruised and healthy tissues, with different degrees of mechanical damage. GAs were used in combination with partial least squares regression methods to develop bruise severity prediction models, which were compared to PLS models developed using the full NIR spectrum. A classification model was developed, which clearly separated bruised from unbruised apple tissue. GAs helped improve prediction models by over 10% in comparison with full-spectrum models, as evaluated in terms of error of prediction (Root Mean Square Error of Cross-validation). PLS models to predict internal quality, such as sugar content and acidity, were developed and compared to the versions optimized by genetic algorithm. Overall, the results highlighted the potential of the GA method to improve the speed and accuracy of fruit quality prediction.
Year-class formation of upper St. Lawrence River northern pike
Smith, B.M.; Farrell, J.M.; Underwood, H.B.; Smith, S.J.
2007-01-01
Variables associated with year-class formation in upper St. Lawrence River northern pike Esox lucius were examined to explore population trends. A partial least-squares (PLS) regression model (PLS 1) was used to relate a year-class strength index (YCSI; 1974-1997) to explanatory variables associated with spawning and nursery areas (seasonal water level and temperature and their variability, number of ice days, and last day of ice presence). A second model (PLS 2) incorporated four additional ecological variables: potential predators (abundance of double-crested cormorants Phalacrocorax auritus and yellow perch Perca flavescens), female northern pike biomass (as a measure of stock-recruitment effects), and total phosphorus (productivity). Trends in adult northern pike catch revealed a decline (1981-2005), and year-class strength was positively related to catch per unit effort (CPUE; R² = 0.58). The YCSI exceeded the 23-year mean in only 2 of the last 10 years. Cyclic patterns in the YCSI time series (along with strong year-classes every 4-6 years) were apparent, as was a dampening effect of amplitude beginning around 1990. The PLS 1 model explained over 50% of variation in both explanatory variables and the dependent variable, YCSI first-order moving-average residuals. Variables retained (N = 10; Wold's statistic ≥ 0.8) included negative YCSI associations with high summer water levels, high variability in spring and fall water levels, and variability in fall water temperature. The YCSI exhibited positive associations with high spring, summer, and fall water temperature, variability in spring temperature, and high winter and spring water level. The PLS 2 model led to positive YCSI associations with phosphorus and yellow perch CPUE and a negative correlation with double-crested cormorant abundance. 
Environmental variables (water level and temperature) are hypothesized to regulate northern pike YCSI cycles, and dampening in YCSI magnitude may be related to a combination of factors, including wetland habitat changes, reduced nutrient loading, and increased predation by double-crested cormorants. © Copyright by the American Fisheries Society 2007.
Ghasemi, Jahan B; Safavi-Sohi, Reihaneh; Barbosa, Euzébio G
2012-02-01
A quasi 4D-QSAR has been carried out on a series of potent Gram-negative LpxC inhibitors. This approach makes use of the molecular dynamics (MD) trajectories and topology information retrieved from the GROMACS package. This new methodology is based on the generation of a conformational ensemble profile (CEP) for each compound instead of only one conformation, followed by the calculation of intermolecular interaction energies at each grid point considering probes and all aligned conformations resulting from MD simulations. These interaction energies are the independent variables employed in a QSAR analysis. The proposed methodology was compared to the comparative molecular field analysis (CoMFA) formalism. This methodology jointly explores the main features of CoMFA and 4D-QSAR models. Step-wise multiple linear regression was used for the selection of the most informative variables. After variable selection, multiple linear regression (MLR) and partial least squares (PLS) methods were used for building the regression models. Leave-N-out cross-validation (LNO) and Y-randomization were performed in order to confirm the robustness of the model, in addition to analysis of the independent test set. The best models provided the following statistics: [Formula in text] (PLS) and [Formula in text] (MLR). A docking study was applied to investigate the major interactions in the protein-ligand complex with the CDOCKER algorithm. Visualization of the descriptors of the best model helps us to interpret the model from the chemical point of view, supporting the applicability of this new approach in rational drug design.
Computerized pigment design based on property hypersurfaces
NASA Astrophysics Data System (ADS)
Halova, Jaroslava; Sulcova, Petra; Kupka, Karel
2007-05-01
Competition is tough in the pigment market. Rational pigment design therefore offers a competitive advantage, saving time and money. The aim of this work is to provide methods that can assist in designing pigments with defined properties. These methods include partial least squares regression (PLSR), neural network (NN) and a generalized regression ANOVA model. The authors show how a PLS bi-plot can be used to identify market gaps poorly covered by pigment manufacturers, thus giving an opportunity to develop pigments with potentially profitable properties.
Intrinsic Raman spectroscopy for quantitative biological spectroscopy Part II
Bechtel, Kate L.; Shih, Wei-Chuan; Feld, Michael S.
2009-01-01
We demonstrate the effectiveness of intrinsic Raman spectroscopy (IRS) at reducing errors caused by absorption and scattering. Physical tissue models, solutions of varying absorption and scattering coefficients with known concentrations of Raman scatterers, are studied. We show significant improvement in prediction error by implementing IRS to predict concentrations of Raman scatterers using both ordinary least squares regression (OLS) and partial least squares regression (PLS). In particular, we show that IRS provides a robust calibration model that does not increase in error when applied to samples with optical properties outside the range of calibration. PMID:18711512
Fernandes, David Douglas Sousa; Gomes, Adriano A; Costa, Gean Bezerra da; Silva, Gildo William B da; Véras, Germano
2011-12-15
This work evaluates the use of the visible and near-infrared (NIR) ranges, separately and combined, to determine the biodiesel content in biodiesel/diesel blends using Multiple Linear Regression (MLR) with variable selection by the Successive Projections Algorithm (SPA). Full-spectrum models employing Partial Least Squares (PLS), MLR with variable selection by Stepwise (SW) regression, and PLS models with variable selection by Jack-Knife (Jk) were compared with the proposed methodology. Several preprocessing methods were evaluated; the Savitzky-Golay derivative with a second-order polynomial and a 17-point window was chosen for the NIR and visible-NIR ranges, with offset correction. A total of 100 blends with biodiesel content between 5 and 50% (v/v) were prepared from ten biodiesel samples. In the NIR and visible regions the best model was SPA-MLR, using only two and eight wavelengths with RMSEP of 0.6439% (v/v) and 0.5741, respectively, while in the visible-NIR region the best model was SW-MLR, using five wavelengths with an RMSEP of 0.9533% (v/v). Results indicate that both spectral ranges evaluated showed potential for developing a rapid and nondestructive method to quantify biodiesel in blends with mineral diesel. Finally, the improvement in prediction error obtained with the variable selection procedure was significant. Copyright © 2011 Elsevier B.V. All rights reserved.
Statistical process control of cocrystallization processes: A comparison between OPLS and PLS.
Silva, Ana F T; Sarraguça, Mafalda Cruz; Ribeiro, Paulo R; Santos, Adenilson O; De Beer, Thomas; Lopes, João Almeida
2017-03-30
Orthogonal partial least squares regression (OPLS) is being increasingly adopted as an alternative to partial least squares (PLS) regression due to the better generalization that can be achieved. Particularly in multivariate batch statistical process control (BSPC), the use of OPLS for estimating nominal trajectories is advantageous. In OPLS, the nominal process trajectories are expected to be captured in a single predictive principal component while uncorrelated variations are filtered out to orthogonal principal components. In theory, OPLS will yield a better estimation of the Hotelling's T² statistic and corresponding control limits, thus lowering the number of false positives and false negatives when assessing process disturbances. Although the advantages of OPLS have been demonstrated in the context of regression, its use in BSPC has seldom been reported. This study proposes an OPLS-based approach for BSPC of a cocrystallization process between hydrochlorothiazide and p-aminobenzoic acid monitored on-line with near infrared spectroscopy and compares the fault detection performance with the same approach based on PLS. A series of cocrystallization batches with imposed disturbances were used to test the ability to detect abnormal situations by OPLS and PLS-based BSPC methods. Results demonstrated that OPLS was generally superior in terms of sensitivity and specificity in most situations. In some abnormal batches, it was found that the imposed disturbances were only detected with OPLS. Copyright © 2017 Elsevier B.V. All rights reserved.
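The Hotelling's T² statistic mentioned above reduces to a simple sum when it is computed from latent-variable scores. The sketch below assumes the scores are centred at zero and mutually uncorrelated, so that T² = Σ t_a² / s_a², with s_a² the variance of component a over the reference batches. The score values and helper names are invented, and the control limit (normally taken from an F distribution) is not computed here.

```python
def score_variances(reference_scores):
    """Sample variance of each score column over the reference (nominal) observations."""
    n = len(reference_scores)
    k = len(reference_scores[0])
    variances = []
    for a in range(k):
        column = [row[a] for row in reference_scores]
        mean = sum(column) / n
        variances.append(sum((v - mean) ** 2 for v in column) / (n - 1))
    return variances

def hotelling_t2(scores, variances):
    """T2 of one observation, assuming zero-mean, uncorrelated score components."""
    return sum(t * t / s2 for t, s2 in zip(scores, variances))

# Invented two-component scores for three nominal observations.
reference_scores = [[-2.0, -1.0], [0.0, 1.0], [2.0, 0.0]]
variances = score_variances(reference_scores)
t2_new = hotelling_t2([2.0, 1.0], variances)
print(variances, t2_new)
```

A new observation would be flagged when its T² exceeds the chosen control limit; fewer, cleaner predictive components (as OPLS aims for) change which variances enter this sum.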
NASA Astrophysics Data System (ADS)
Jintao, Xue; Yufei, Liu; Liming, Ye; Chunyan, Li; Quanwei, Yang; Weiying, Wang; Yun, Jing; Minxiang, Zhang; Peng, Li
2018-01-01
Near-Infrared Spectroscopy (NIRS) was used for the first time to develop a method for rapid and simultaneous determination of 5 active alkaloids (berberine, coptisine, palmatine, epiberberine and jatrorrhizine) in 4 parts (rhizome, fibrous root, stem and leaf) of Coptidis Rhizoma. A total of 100 samples from 4 main places of origin were collected and studied. With HPLC analysis values as calibration reference, the quantitative analysis of the 5 marker components was performed by two different modeling methods, partial least-squares (PLS) regression as linear regression and artificial neural networks (ANN) as non-linear regression. The results indicated that the 2 types of models established were robust, accurate and repeatable for the five active alkaloids; the ANN models were more suitable for the determination of berberine, coptisine and palmatine, while the PLS model was more suitable for the analysis of epiberberine and jatrorrhizine. The performance of the optimal models was as follows: the correlation coefficient (R) for berberine, coptisine, palmatine, epiberberine and jatrorrhizine was 0.9958, 0.9956, 0.9959, 0.9963 and 0.9923, respectively; the root mean square error of prediction (RMSEP) was 0.5093, 0.0578, 0.0443, 0.0563 and 0.0090, respectively. Furthermore, for the comprehensive exploitation and utilization of the plant resource of Coptidis Rhizoma, the established NIR models were used to analyse the content of the 5 active alkaloids in the 4 parts of Coptidis Rhizoma and from the 4 main places of origin. This work demonstrated that NIRS may be a promising method for routine screening in off-line fast analysis or on-line quality assessment of traditional Chinese medicine (TCM).
Delwiche, Stephen R; Reeves, James B
2010-01-01
In multivariate regression analysis of spectroscopy data, spectral preprocessing is often performed to reduce unwanted background information (offsets, sloped baselines) or accentuate absorption features in intrinsically overlapping bands. These procedures, also known as pretreatments, are commonly smoothing operations or derivatives. While such operations are often useful in reducing the number of latent variables of the actual decomposition and lowering residual error, they also run the risk of misleading the practitioner into accepting calibration equations that are poorly adapted to samples outside of the calibration. The current study developed a graphical method to examine this effect on partial least squares (PLS) regression calibrations of near-infrared (NIR) reflection spectra of ground wheat meal with two analytes, protein content and sodium dodecyl sulfate sedimentation (SDS) volume (an indicator of the quantity of the gluten proteins that contribute to strong doughs). These two properties were chosen because of their differing abilities to be modeled by NIR spectroscopy: excellent for protein content, fair for SDS sedimentation volume. To further demonstrate the potential pitfalls of preprocessing, an artificial component, a randomly generated value, was included in PLS regression trials. Savitzky-Golay (digital filter) smoothing, first-derivative, and second-derivative preprocess functions (5 to 25 centrally symmetric convolution points, derived from quadratic polynomials) were applied to PLS calibrations of 1 to 15 factors. The results demonstrated the danger of an overreliance on preprocessing when (1) the number of samples used in a multivariate calibration is low (<50), (2) the spectral response of the analyte is weak, and (3) the goodness of the calibration is based on the coefficient of determination (R²) rather than a term based on residual error.
The graphical method has application to the evaluation of other preprocess functions and various types of spectroscopy data.
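The Savitzky-Golay preprocess functions discussed in this abstract are fixed convolution filters. As a minimal sketch, the block below applies the standard 5-point quadratic coefficients for smoothing, first derivative and second derivative to the interior points of a toy spectrum; the function name `savgol5` and the choice to drop the two edge points on each side are illustrative assumptions, not the study's implementation.

```python
# Standard 5-point Savitzky-Golay convolution weights for a quadratic polynomial:
# (integer coefficients, normalising divisor, derivative order).
SG5 = {
    "smooth": ([-3, 12, 17, 12, -3], 35.0, 0),
    "d1": ([-2, -1, 0, 1, 2], 10.0, 1),
    "d2": ([2, -1, -2, -1, 2], 7.0, 2),
}

def savgol5(y, kind="smooth", spacing=1.0):
    """Apply a 5-point quadratic Savitzky-Golay filter to interior points only.

    The output has len(y) - 4 values because the two points at each edge are skipped.
    """
    coeffs, divisor, order = SG5[kind]
    scale = divisor * spacing ** order
    return [
        sum(c * y[i + k - 2] for k, c in enumerate(coeffs)) / scale
        for i in range(2, len(y) - 2)
    ]

# Toy "spectrum": an exact quadratic, which these filters reproduce without error.
y = [float(x * x) for x in range(7)]
print(savgol5(y, "smooth"))   # interior values of x**2
print(savgol5(y, "d1"))       # slope 2x at x = 2, 3, 4
print(savgol5(y, "d2"))       # constant curvature
```

Wider windows (up to the 25 points mentioned above) use longer coefficient sets and smooth more aggressively, which is where the risk of fitting calibrations to preprocessing artefacts grows.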
Wang, L; Qin, X C; Lin, H C; Deng, K F; Luo, Y W; Sun, Q R; Du, Q X; Wang, Z Y; Tuo, Y; Sun, J H
2018-02-01
To analyse the relationship between Fourier transform infrared (FTIR) spectrum of rat's spleen tissue and postmortem interval (PMI) for PMI estimation using FTIR spectroscopy combined with data mining method. Rats were sacrificed by cervical dislocation, and the cadavers were placed at 20 ℃. The FTIR spectrum data of rats' spleen tissues were taken and measured at different time points. After pretreatment, the data was analysed by data mining method. The absorption peak intensity of rat's spleen tissue spectrum changed with the PMI, while the absorption peak position was unchanged. The results of principal component analysis (PCA) showed that the cumulative contribution rate of the first three principal components was 96%. There was an obvious clustering tendency for the spectrum sample at each time point. The methods of partial least squares discriminant analysis (PLS-DA) and support vector machine classification (SVMC) effectively divided the spectrum samples with different PMI into four categories (0-24 h, 48-72 h, 96-120 h and 144-168 h). The determination coefficient (R²) of the PMI estimation model established by PLS regression analysis was 0.96, and the root mean square error of calibration (RMSEC) and root mean square error of cross validation (RMSECV) were 9.90 h and 11.39 h respectively. In prediction set, the R² was 0.97, and the root mean square error of prediction (RMSEP) was 10.49 h. The FTIR spectrum of the rat's spleen tissue can be effectively analyzed qualitatively and quantitatively by the combination of FTIR spectroscopy and data mining method, and the classification and PLS regression models can be established for PMI estimation. Copyright © by the Editorial Department of Journal of Forensic Medicine.
Prediction of ethanol in bottled Chinese rice wine by NIR spectroscopy
NASA Astrophysics Data System (ADS)
Ying, Yibin; Yu, Haiyan; Pan, Xingxiang; Lin, Tao
2006-10-01
To evaluate the applicability of non-invasive visible and near infrared (VIS-NIR) spectroscopy for determining the ethanol concentration of Chinese rice wine in square brown glass bottles, transmission spectra of 100 bottled Chinese rice wine samples were collected in the spectral range of 350-1200 nm. Statistical equations were established between the reference data and VIS-NIR spectra by the partial least squares (PLS) regression method. The performance of three kinds of mathematical treatment of spectra (original spectra, first derivative spectra and second derivative spectra) was also discussed. The PLS models of original spectra gave better results, with a higher correlation coefficient in calibration (R cal) of 0.89, a lower root mean square error of calibration (RMSEC) of 0.165, and a lower root mean square error of cross validation (RMSECV) of 0.179. Using original spectra, PLS models for ethanol concentration prediction were developed. The R cal and the correlation coefficient in validation (R val) were 0.928 and 0.875, respectively, and the RMSEC and the root mean square error of prediction (RMSEP) were 0.135% (v/v) and 0.177% (v/v), respectively. The results demonstrated that VIS-NIR spectroscopy could be used to predict ethanol concentration in bottled Chinese rice wine.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Coble, Jamie; Orton, Christopher; Schwantes, Jon
The Multi-Isotope Process (MIP) Monitor provides an efficient approach to monitoring the process conditions in used nuclear fuel reprocessing facilities to support process verification and validation. The MIP Monitor applies multivariate analysis to gamma spectroscopy of reprocessing streams in order to detect small changes in the gamma spectrum, which may indicate changes in process conditions. This research extends the MIP Monitor by characterizing a used fuel sample after initial dissolution according to the type of reactor of origin (pressurized or boiling water reactor), initial enrichment, burn up, and cooling time. Simulated gamma spectra were used to develop and test three fuel characterization algorithms. The classification and estimation models employed are based on the partial least squares regression (PLS) algorithm. A PLS discriminant analysis model was developed which perfectly classified reactor type. Locally weighted PLS models were fitted on-the-fly to estimate continuous fuel characteristics. Burn up was predicted within 0.1% root mean squared percent error (RMSPE) and both cooling time and initial enrichment within approximately 2% RMSPE. This automated fuel characterization can be used to independently verify operator declarations of used fuel characteristics and inform the MIP Monitor anomaly detection routines at later stages of the fuel reprocessing stream to improve sensitivity to changes in operational parameters and material diversions.
Burggraeve, A; Van den Kerkhof, T; Hellings, M; Remon, J P; Vervaet, C; De Beer, T
2011-04-18
Fluid bed granulation is a batch process, which is characterized by the processing of raw materials for a predefined period of time, consisting of a fixed spraying phase and a subsequent drying period. The present study shows the multivariate statistical modeling and control of a fluid bed granulation process based on in-line particle size distribution (PSD) measurements (using spatial filter velocimetry) combined with continuous product temperature registration using a partial least squares (PLS) approach. Via the continuous in-line monitoring of the PSD and product temperature during granulation of various reference batches, a statistical batch model was developed allowing the real-time evaluation and acceptance or rejection of future batches. Continuously monitored PSD and product temperature process data of 10 reference batches (X-data) were used to develop a reference batch PLS model, regressing the X-data versus the batch process time (Y-data). Two PLS components captured 98.8% of the variation in the X-data block. Score control charts in which the average batch trajectory and upper and lower control limits are displayed were developed. Next, these control charts were used to monitor 4 new test batches in real-time and to immediately detect any deviations from the expected batch trajectory. By real-time evaluation of new batches using the developed control charts and by computation of contribution plots of deviating process behavior at a certain time point, batch losses or reprocessing can be prevented. Immediately after batch completion, all PSD and product temperature information (i.e., a batch progress fingerprint) was used to estimate some granule properties (density and flowability) at an early stage, which can improve batch release time. Individual PLS models were developed relating the computed scores (X) of the reference PLS model (based on the 10 reference batches) to the density and the flowability, respectively, as the Y-matrix. 
The scores of the 4 test batches were used to examine the predictive ability of the model. Copyright © 2011 Elsevier B.V. All rights reserved.
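The score control charts described in this abstract can be sketched as an average trajectory with upper and lower limits computed per time point from the reference batches. The sketch below is illustrative only: the limit width of three standard deviations, the function names and the score values are assumptions, since the abstract does not state how the limits were set.

```python
import statistics

def control_limits(reference_scores, k=3.0):
    """Return (lower, mean, upper) per time point.

    reference_scores: list of batches, each a list of PLS score values
    sampled at the same (aligned) time points. k is an assumed limit width.
    """
    limits = []
    for values in zip(*reference_scores):
        mean = statistics.mean(values)
        sd = statistics.stdev(values)
        limits.append((mean - k * sd, mean, mean + k * sd))
    return limits

def deviating_points(batch_scores, limits):
    """Indices of time points where a new batch leaves the control band."""
    return [
        i
        for i, (score, (lower, _, upper)) in enumerate(zip(batch_scores, limits))
        if not lower <= score <= upper
    ]

# Made-up score trajectories for three reference batches and one test batch.
reference = [[0.0, 1.0, 2.0], [0.2, 1.2, 2.2], [-0.2, 0.8, 1.8]]
limits = control_limits(reference)
print(deviating_points([0.1, 2.0, 2.1], limits))
```

A time point flagged this way would be the place to compute a contribution plot, as the abstract describes.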
NASA Astrophysics Data System (ADS)
Yang, Yue; Wang, Lei; Wu, Yongjiang; Liu, Xuesong; Bi, Yuan; Xiao, Wei; Chen, Yong
2017-07-01
There is a growing need for effective on-line process monitoring during the manufacture of traditional Chinese medicine (TCM) to ensure quality consistency. In this study, the potential of the near infrared (NIR) spectroscopy technique to monitor the extraction process of Flos Lonicerae Japonicae was investigated. A new algorithm of synergy interval PLS with genetic algorithm (Si-GA-PLS) was proposed for modeling. Four different PLS models, namely Full-PLS, Si-PLS, GA-PLS, and Si-GA-PLS, were established, and their performances in predicting two quality parameters (viz. total acid and soluble solid contents) were compared. The Si-GA-PLS model gave the best results because it combines the strengths of Si-PLS and GA. For Si-GA-PLS, the determination coefficient (Rp²) and root-mean-square error for the prediction set (RMSEP) were 0.9561 and 147.6544 μg/ml for total acid, and 0.9062 and 0.1078% for soluble solid contents, respectively. The overall results demonstrated that the NIR spectroscopy technique combined with Si-GA-PLS calibration is a reliable and non-destructive alternative method for on-line monitoring of the extraction process of TCM on the production scale.
NASA Astrophysics Data System (ADS)
Qiu, Peng; D'Souza, Warren D.; McAvoy, Thomas J.; Liu, K. J. Ray
2007-09-01
Tumor motion induced by respiration presents a challenge to the reliable delivery of conformal radiation treatments. Real-time motion compensation represents the technologically most challenging clinical solution but has the potential to overcome the limitations of existing methods. The performance of a real-time couch-based motion compensation system is mainly dependent on two aspects: the ability to infer the internal anatomical position and the performance of the feedback control system. In this paper, we propose two novel methods for the two aspects respectively, and then combine the proposed methods into one system. To accurately estimate the internal tumor position, we present partial-least squares (PLS) regression to predict the position of the diaphragm using skin-based motion surrogates. Four radio-opaque markers were placed on the abdomen of patients who underwent fluoroscopic imaging of the diaphragm. The coordinates of the markers served as input variables and the position of the diaphragm served as the output variable. PLS resulted in lower prediction errors compared with standard multiple linear regression (MLR). The performance of the feedback control system depends on the system dynamics and dead time (delay between the initiation and execution of the control action). While the dynamics of the system can be inverted in a feedback control system, the dead time cannot be inverted. To overcome the dead time of the system, we propose a predictive feedback control system by incorporating forward prediction using least-mean-square (LMS) and recursive least square (RLS) filtering into the couch-based control system. Motion data were obtained using a skin-based marker. The proposed predictive feedback control system was benchmarked against pure feedback control (no forward prediction) and resulted in a significant performance gain. 
Finally, we combined the PLS inference model and the predictive feedback control to evaluate the overall performance of the feedback control system. Our results show that, with the tumor motion unknown but inferred by skin-based markers through the PLS model, the predictive feedback control system was able to effectively compensate intra-fraction motion.
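The forward prediction used to overcome dead time can be illustrated with a least-mean-square (LMS) adaptive filter that predicts a sample `horizon` steps ahead from the most recent samples. This is a minimal sketch under stated assumptions: the filter order, step size, horizon and the sinusoidal test signal are hypothetical choices, not the settings of the study, and the loop runs offline for simplicity (in a real-time system each weight update would be applied only once the future sample has been observed).

```python
import math

def lms_predict(signal, order=2, horizon=1, mu=0.1):
    """Predict signal[n + horizon] from the last `order` samples with an LMS filter.

    Returns the list of predictions and the list of prediction errors.
    """
    weights = [0.0] * order
    predictions, errors = [], []
    for n in range(order - 1, len(signal) - horizon):
        window = [signal[n - k] for k in range(order)]  # most recent sample first
        pred = sum(w * x for w, x in zip(weights, window))
        err = signal[n + horizon] - pred
        # LMS weight update: move the weights along the error gradient.
        weights = [w + mu * err * x for w, x in zip(weights, window)]
        predictions.append(pred)
        errors.append(err)
    return predictions, errors

# Hypothetical breathing-like trace: a slow sinusoid sampled 200 times.
trace = [math.sin(2 * math.pi * i / 40) for i in range(200)]
preds, errs = lms_predict(trace, order=4, horizon=3, mu=0.05)
print(len(preds), abs(errs[-1]))
```

The recursive least squares (RLS) variant mentioned in the abstract replaces the fixed step size with a recursively updated inverse correlation matrix; it is not shown here.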
Wood, Clive; Alwati, Abdolati; Halsey, Sheelagh; Gough, Tim; Brown, Elaine; Kelly, Adrian; Paradkar, Anant
2016-09-10
The use of near-infrared (NIR) spectroscopy to predict the concentration of two pharmaceutical co-crystals, 1:1 ibuprofen-nicotinamide (IBU-NIC) and 1:1 carbamazepine-nicotinamide (CBZ-NIC), has been evaluated. A partial least squares (PLS) regression model was developed for both co-crystal pairs using sets of standard samples to create calibration and validation data sets with which to build and validate the models. Parameters such as the root mean square error of calibration (RMSEC), root mean square error of prediction (RMSEP) and correlation coefficient were used to assess the accuracy and linearity of the models. Accurate PLS regression models were created for both co-crystal pairs which can be used to predict the co-crystal concentration in a powder mixture of the co-crystal and the active pharmaceutical ingredient (API). The IBU-NIC model had smaller errors than the CBZ-NIC model, possibly due to the complex CBZ-NIC spectra which could reflect the different arrangement of hydrogen bonding associated with the co-crystal compared to the IBU-NIC co-crystal. These results suggest that NIR spectroscopy can be used as a PAT tool during a variety of pharmaceutical co-crystal manufacturing methods and the presented data will facilitate future offline and in-line NIR studies involving pharmaceutical co-crystals. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
Improved Quantitative Analysis of Ion Mobility Spectrometry by Chemometric Multivariate Calibration
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fraga, Carlos G.; Kerr, Dayle; Atkinson, David A.
2009-09-01
Traditional peak-area calibration and the multivariate calibration methods of principal component regression (PCR) and partial least squares (PLS), including unfolded PLS (U-PLS) and multi-way PLS (N-PLS), were evaluated for the quantification of 2,4,6-trinitrotoluene (TNT) and cyclo-1,3,5-trimethylene-2,4,6-trinitramine (RDX) in Composition B samples analyzed by temperature step desorption ion mobility spectrometry (TSD-IMS). The true TNT and RDX concentrations of eight Composition B samples were determined by high performance liquid chromatography with UV absorbance detection. Most of the Composition B samples were found to have distinct TNT and RDX concentrations. Applying PCR and PLS on the exact same IMS spectra used for the peak-area study improved quantitative accuracy and precision approximately 3 to 5 fold and 2 to 4 fold, respectively. This in turn improved the probability of correctly identifying Composition B samples based upon the estimated RDX and TNT concentrations from 11% with peak area to 44% and 89% with PLS. This improvement increases the potential of obtaining forensic information from IMS analyzers by providing some ability to differentiate or match Composition B samples based on their TNT and RDX concentrations.
Zhang, Mengliang; Harrington, Peter de B
2015-01-01
The multivariate partial least-squares (PLS) method was applied to the quantification of two complex polychlorinated biphenyl (PCB) commercial mixtures, Aroclor 1254 and 1260, in a soil matrix. PCBs in soil samples were extracted by headspace solid phase microextraction (SPME) and determined by gas chromatography/mass spectrometry (GC/MS). Decachlorinated biphenyl (deca-CB) was used as internal standard. After the baseline correction was applied, four data representations including extracted ion chromatograms (EIC) for Aroclor 1254, EIC for Aroclor 1260, EIC for both Aroclors and two-way data sets were constructed for PLS-1 and PLS-2 calibrations and evaluated with respect to quantitative prediction accuracy. The PLS model was optimized with respect to the number of latent variables using cross validation of the calibration data set. The validation of the method was performed with certified soil samples and real field soil samples and the predicted concentrations for both Aroclors using EIC data sets agreed with the certified values. The linear range of the method was from 10 μg kg⁻¹ to 1000 μg kg⁻¹ for both Aroclor 1254 and 1260 in soil matrices and the detection limit was 4 μg kg⁻¹ for Aroclor 1254 and 6 μg kg⁻¹ for Aroclor 1260. This holistic approach for the determination of mixtures of complex samples has broad application to environmental forensics and modeling. Copyright © 2014 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Samadi-Maybodi, Abdolraouf; Darzi, S. K. Hassani Nejad
2008-10-01
Resolution of binary mixtures of vitamin B12, methylcobalamin and B12 coenzyme with minimum sample pre-treatment and without analyte separation has been successfully achieved by the partial least squares algorithm with one dependent variable (PLS1), orthogonal signal correction/partial least squares (OSC/PLS), principal component regression (PCR) and hybrid linear analysis (HLA). The analytical data were obtained from UV-vis spectra. The UV-vis spectra of vitamin B12, methylcobalamin and B12 coenzyme were recorded under the same spectral conditions. A central composite design was used in the ranges of 10-80 mg L⁻¹ for vitamin B12 and methylcobalamin and 20-130 mg L⁻¹ for B12 coenzyme. Model refinement and validation were performed by cross-validation. The minimum root mean square error of prediction (RMSEP) was 2.26 mg L⁻¹ for vitamin B12 with PLS1, 1.33 mg L⁻¹ for methylcobalamin with OSC/PLS and 3.24 mg L⁻¹ for B12 coenzyme with HLA. Figures of merit such as selectivity, sensitivity, analytical sensitivity and LOD were determined for the three compounds. The procedure was successfully applied to the simultaneous determination of the three compounds in synthetic mixtures and in a pharmaceutical formulation.
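A central composite design, as named in this abstract, combines factorial corner points, axial points and a centre point in coded units. The sketch below generates such a design and maps coded levels onto a concentration range. It is an illustration under assumptions: the rotatable choice of alpha and the mapping of the axial points onto the range limits are not taken from the study, and the function names are hypothetical.

```python
import itertools

def central_composite(n_factors, alpha=None):
    """Coded design points: 2^k factorial corners, 2k axial points, one centre point."""
    if alpha is None:
        alpha = (2 ** n_factors) ** 0.25  # rotatable design (assumed choice)
    corners = [list(p) for p in itertools.product((-1.0, 1.0), repeat=n_factors)]
    axial = []
    for i in range(n_factors):
        for sign in (-1.0, 1.0):
            point = [0.0] * n_factors
            point[i] = sign * alpha
            axial.append(point)
    return corners + axial + [[0.0] * n_factors]

def to_concentration(coded, low, high, alpha):
    """Map a coded level to a concentration so that +/-alpha hit the range limits."""
    centre = (low + high) / 2
    half_range = (high - low) / 2
    return centre + coded / alpha * half_range

alpha = (2 ** 2) ** 0.25
design = central_composite(2)  # 4 corners + 4 axial + 1 centre = 9 points
for point in design:
    # Example mapping using the 10-80 mg/L range quoted for two of the analytes.
    print([round(to_concentration(level, 10.0, 80.0, alpha), 1) for level in point])
```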
Analytics of Radioactive Materials Released in the Fukushima Daiichi Nuclear Accident
DOE Office of Scientific and Technical Information (OSTI.GOV)
Egarievwe, Stephen U.; Nuclear Engineering Department, University of Tennessee, Knoxville, TN; Coble, Jamie B.
The 2011 Fukushima Daiichi nuclear accident in Japan resulted in the release of radioactive materials into the atmosphere, the nearby sea, and the surrounding land. Following the accident, several meteorological models were used to predict the transport of the radioactive materials to other continents such as North America and Europe. Also of high importance is the dispersion of radioactive materials locally and within Japan. Based on the International Atomic Energy Agency (IAEA) Convention on Early Notification of a nuclear accident, several radiological data sets were collected on the accident by the Japanese authorities. Among the radioactive materials monitored are I-131 and Cs-137, which form the major contributions to the contamination of drinking water. The radiation dose in the atmosphere was also measured. It is impractical to measure contamination and radiation dose in every place of interest. Therefore, modeling helps to predict contamination and radiation dose. Some modeling studies that have been reported in the literature include the simulation of transport and deposition of I-131 and Cs-137 from the accident, Cs-137 deposition and contamination of Japanese soils, and preliminary estimates of I-131 and Cs-137 discharged from the plant into the atmosphere. In this paper, we present statistical analytics of I-131 and Cs-137 with the goal of predicting gamma dose from the Fukushima Daiichi nuclear accident. The data sets used in our study were collected from the IAEA Fukushima Monitoring Database. As part of this study, we investigated several regression models to find the best algorithm for modeling the gamma dose. The modeling techniques used in our study include linear regression, principal component regression (PCR), partial least squares (PLS) regression, and ridge regression. 
Our preliminary results on the first set of data showed that the linear regression model with one variable was the best with a root mean square error of 0.0133 μSv/h, compared to 0.0210 μSv/h for PCR, 0.231 μSv/h for ridge regression L-curve, 0.0856 μSv/h for PLS, and 0.0860 μSv/h for ridge regression cross validation. Complete results using the full datasets for these models will also be presented. (authors)
Multimodal Classification of Mild Cognitive Impairment Based on Partial Least Squares.
Wang, Pingyue; Chen, Kewei; Yao, Li; Hu, Bin; Wu, Xia; Zhang, Jiacai; Ye, Qing; Guo, Xiaojuan
2016-08-10
In recent years, increasing attention has been given to the identification of the conversion of mild cognitive impairment (MCI) to Alzheimer's disease (AD). Brain neuroimaging techniques have been widely used to support the classification or prediction of MCI. The present study combined magnetic resonance imaging (MRI), 18F-fluorodeoxyglucose PET (FDG-PET), and 18F-florbetapir PET (florbetapir-PET) to discriminate MCI converters (MCI-c, individuals with MCI who convert to AD) from MCI non-converters (MCI-nc, individuals with MCI who have not converted to AD in the follow-up period) based on the partial least squares (PLS) method. Two types of PLS models (informed PLS and agnostic PLS) were built based on 64 MCI-c and 65 MCI-nc from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. The results showed that the three-modality informed PLS model achieved better classification accuracy of 81.40%, sensitivity of 79.69%, and specificity of 83.08% compared with the single-modality model, and the three-modality agnostic PLS model also achieved better classification compared with the two-modality model. Moreover, combining the three modalities with clinical test score (ADAS-cog), the agnostic PLS model (independent data: florbetapir-PET; dependent data: FDG-PET and MRI) achieved optimal accuracy of 86.05%, sensitivity of 81.25%, and specificity of 90.77%. In addition, the comparison of PLS, support vector machine (SVM), and random forest (RF) showed greater diagnostic power of PLS. These results suggested that our multimodal PLS model has the potential to discriminate MCI-c from the MCI-nc and may therefore be helpful in the early diagnosis of AD.
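The accuracy, sensitivity and specificity quoted in this abstract follow from a 2x2 confusion table. The sketch below treats MCI-c as the positive class; the counts are inferred here from the reported percentages and the group sizes (64 MCI-c, 65 MCI-nc) and are not stated in the abstract, so they should be read as an assumption.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) as percentages."""
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    accuracy = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity

# Inferred counts (assumption): three-modality informed PLS model.
informed = classification_metrics(tp=51, fn=13, tn=54, fp=11)
# Inferred counts (assumption): agnostic PLS model with ADAS-cog added.
agnostic = classification_metrics(tp=52, fn=12, tn=59, fp=6)
print(informed)
print(agnostic)
```

With these counts the computed values agree with the reported 81.40/79.69/83.08% and 86.05/81.25/90.77% to within rounding.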
NASA Astrophysics Data System (ADS)
Hemmateenejad, Bahram; Rezaei, Zahra; Khabnadideh, Soghra; Saffari, Maryam
2007-11-01
Carbamazepine (CBZ) undergoes enzyme biotransformation through epoxidation with the formation of its metabolite, carbamazepine-10,11-epoxide (CBZE). A simple chemometrics-assisted spectrophotometric method has been proposed for simultaneous determination of CBZ and CBZE in plasma. A liquid extraction procedure was used to separate the analytes from plasma, and the UV absorbance spectra of the resultant solutions were subjected to partial least squares (PLS) regression. The optimum number of PLS latent variables was selected according to the PRESS values of leave-one-out cross-validation. An HPLC method was also employed for comparison. The respective mean recoveries for analysis of CBZ and CBZE in synthetic mixtures were 102.57 (±0.25)% and 103.00 (±0.09)% for PLS and 99.40 (±0.15)% and 102.20 (±0.02)% for HPLC. The concentrations of CBZ and CBZE were also determined in five patients using the PLS and HPLC methods. The results showed that the data obtained by PLS were comparable with those obtained by the HPLC method.
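Selecting the number of latent variables from leave-one-out PRESS values, as mentioned in this abstract, can be sketched with a small single-response PLS implementation in NumPy. Everything below is illustrative: the synthetic data, the function names and the rule of simply taking the minimum PRESS are assumptions, and the abstract does not say which selection rule or software was used.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit single-response PLS by successive deflation; return (coef, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr
        w = w / np.linalg.norm(w)      # weight vector
        t = Xr @ w                     # scores
        tt = t @ t
        p = Xr.T @ t / tt              # X loadings
        c = (yr @ t) / tt              # y loading
        Xr = Xr - np.outer(t, p)       # deflate X
        yr = yr - c * t                # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, x_mean, y_mean

def pls1_predict(model, X):
    coef, x_mean, y_mean = model
    return (X - x_mean) @ coef + y_mean

def loo_press(X, y, max_components):
    """PRESS (sum of squared leave-one-out errors) for 1..max_components."""
    press = []
    for a in range(1, max_components + 1):
        total = 0.0
        for i in range(len(y)):
            keep = np.arange(len(y)) != i
            model = pls1_fit(X[keep], y[keep], a)
            total += (y[i] - pls1_predict(model, X[i:i + 1])[0]) ** 2
        press.append(total)
    return press

# Synthetic stand-in for spectra (20 samples, 5 variables); not study data.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + rng.normal(scale=0.05, size=20)
press = loo_press(X, y, max_components=5)
best = int(np.argmin(press)) + 1
print(press, best)
```

In practice a more parsimonious rule, such as the smallest number of components whose PRESS is not significantly larger than the minimum, is often preferred over the plain minimum.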
Villanger, Gro D; Jenssen, Bjørn M; Fjeldberg, Rita R; Letcher, Robert J; Muir, Derek C G; Kirkegaard, Maja; Sonne, Christian; Dietz, Rune
2011-05-01
We investigated the multivariate relationships between adipose tissue residue levels of 48 individual organohalogen contaminants (OHCs) and circulating thyroid hormone (TH) levels in polar bears (Ursus maritimus) from East Greenland (1999-2001, n=62), using projection to latent structure (PLS) regression for four groupings of polar bears; subadults (SubA), adult females with cubs (AdF_N), adult females without cubs (AdF_S) and adult males (AdM). In the resulting significant PLS models for SubA, AdF_N and AdF_S, some OHCs were especially important in explaining variations in circulating TH levels: polybrominated diphenylether (PBDE)-99, PBDE-100, PBDE-153, polychlorinated biphenyl (PCB)-52, PCB-118, cis-nonachlor, trans-nonachlor, trichlorobenzene (TCB) and pentachlorobenzene (QCB), and both negative and positive relationships with THs were found. In addition, the models revealed that DDTs had a positive influence on total 3,5,3'-triiodothyronine (TT3) in AdF_S, and that a group of 17 higher chlorinated ortho-PCBs had a positive influence on total 3,5,3',5'-tetraiodothyronine (thyroxine, TT4) in AdF_N. TH levels in AdM seemed less influenced by OHCs because of non-significant PLS models. TH levels were also influenced by biological factors such as age, sex, body size, lipid content of adipose tissue and sampling date. When controlling for biological variables, the major relationships from the PLS models for SubA, AdF_N and AdF_S were found significant in partial correlations. The most important OHCs that influenced TH levels in the significant PLS models may potentially act through similar mechanisms on the hypothalamic-pituitary-thyroid (HPT) axis, suggesting that both combined effects by dose and response addition and perhaps synergistic potentiation may be a possibility in these polar bears. Statistical associations are not evidence per se of biological cause-effect relationships. 
Still, the results of the present study indicate that OHCs may affect circulating TH levels in East Greenland polar bears, adding to the "weight of evidence" suggesting that OHCs might interfere with thyroid homeostasis in polar bears. Copyright © 2011 Elsevier Ltd. All rights reserved.
Quantification of amine functional groups and their influence on OM/OC in the IMPROVE network
NASA Astrophysics Data System (ADS)
Kamruzzaman, Mohammed; Takahama, Satoshi; Dillner, Ann M.
2018-01-01
Recently, we developed a method using FT-IR spectroscopy coupled with partial least squares (PLS) regression to measure the four most abundant organic functional groups, aliphatic C-H, alcohol OH, carboxylic acid OH and carbonyl C=O, in atmospheric particulate matter. These functional groups are summed to estimate organic matter (OM) while the carbon from the functional groups is summed to estimate organic carbon (OC). With this method, OM and OM/OC can be estimated for each sample rather than relying on one assumed value to convert OC measurements to OM. This study continues the development of the FT-IR and PLS method for estimating OM and OM/OC by including the amine functional group. Amines are ubiquitous in the atmosphere and come from motor vehicle exhaust, animal husbandry, biomass burning, and vegetation among other sources. In this study, calibration standards for amines are produced by aerosolizing individual amine compounds and collecting them on PTFE filters using an IMPROVE sampler, thereby mimicking the filter media and collection geometry of ambient standards. The moles of amine functional group on each standard and a narrow range of amine-specific wavenumbers in the FT-IR spectra (wavenumber range 1550-1500 cm⁻¹) are used to develop a PLS calibration model. The PLS model is validated using three methods: prediction of a set of laboratory standards not included in the model, a peak height analysis and a PLS model with a broader wavenumber range. The model is then applied to the ambient samples collected throughout 2013 from 16 IMPROVE sites in the USA. Urban sites have higher amine concentrations than most rural sites, but amine functional groups account for a lower fraction of OM at urban sites. Amine concentrations, contributions to OM and seasonality vary by site and sample. Amine has a small impact on the annual average OM/OC for urban sites, but for some rural sites including amine in the OM/OC calculations increased OM/OC by 0.1 or more.
Oliveira, Flavia C C; Brandão, Christian R R; Ramalho, Hugo F; da Costa, Leonardo A F; Suarez, Paulo A Z; Rubim, Joel C
2007-03-28
In this work it has been shown that the routine ASTM methods (ASTM 4052, ASTM D 445, ASTM D 4737, ASTM D 93, and ASTM D 86) recommended by the ANP (the Brazilian National Agency for Petroleum, Natural Gas and Biofuels) to determine the quality of diesel/biodiesel blends are not suitable to prevent the adulteration of B2 or B5 blends with vegetable oils. Considering the past and current problems with fuel adulteration in Brazil, we have investigated the application of vibrational spectroscopy (Fourier transform (FT) near infrared spectrometry and FT-Raman) to identify adulteration of B2 and B5 blends with vegetable oils. Partial least squares regression (PLS), principal component regression (PCR), and artificial neural network (ANN) calibration models were designed and their relative performances were evaluated by external validation using the F-test. The PCR, PLS, and ANN calibration models based on FT near infrared spectrometry and FT-Raman spectroscopy were designed using 120 samples. Another 62 samples were used in the validation and external validation, for a total of 182 samples. The results have shown that among the designed calibration models, the ANN/FT-Raman presented the best accuracy (0.028%, w/w) for samples used in the external validation.
Error propagation of partial least squares for parameters optimization in NIR modeling.
Du, Chenzhao; Dai, Shengyun; Qiao, Yanjiang; Wu, Zhisheng
2018-03-05
A novel methodology is proposed to determine the error propagation of partial least squares (PLS) for parameter optimization in near-infrared (NIR) modeling. The parameters include spectral pretreatment, latent variables and variable selection. In this paper, an open source dataset (corn) and a complicated dataset (Gardenia) were used to establish PLS models under different modeling parameters. The error propagation of the modeling parameters for water quantity in corn and geniposide quantity in Gardenia was presented in terms of both type I and type II error. For example, when the variable importance in the projection (VIP), interval partial least squares (iPLS) and backward interval partial least squares (BiPLS) variable selection algorithms were used for geniposide in Gardenia, compared with synergy interval partial least squares (SiPLS), the error weight varied from 5% to 65%, 55% and 15%. The results demonstrated how and to what extent the different modeling parameters affect the error propagation of PLS for parameter optimization in NIR modeling. The larger the error weight, the worse the model. Finally, our trials established an effective process for developing robust PLS models for corn and Gardenia under the optimal modeling parameters. Furthermore, the approach could provide significant guidance for the selection of modeling parameters of other multivariate calibration models. Copyright © 2017. Published by Elsevier B.V.
Error propagation of partial least squares for parameters optimization in NIR modeling
NASA Astrophysics Data System (ADS)
Du, Chenzhao; Dai, Shengyun; Qiao, Yanjiang; Wu, Zhisheng
2018-03-01
A novel methodology is proposed to determine the error propagation of partial least squares (PLS) for parameter optimization in near-infrared (NIR) modeling. The parameters include spectral pretreatment, latent variables and variable selection. In this paper, an open source dataset (corn) and a complicated dataset (Gardenia) were used to establish PLS models under different modeling parameters. The error propagation of the modeling parameters for water quantity in corn and geniposide quantity in Gardenia was presented in terms of both type I and type II error. For example, when the variable importance in the projection (VIP), interval partial least squares (iPLS) and backward interval partial least squares (BiPLS) variable selection algorithms were used for geniposide in Gardenia, compared with synergy interval partial least squares (SiPLS), the error weight varied from 5% to 65%, 55% and 15%. The results demonstrated how and to what extent the different modeling parameters affect the error propagation of PLS for parameter optimization in NIR modeling. The larger the error weight, the worse the model. Finally, our trials established an effective process for developing robust PLS models for corn and Gardenia under the optimal modeling parameters. Furthermore, the approach could provide significant guidance for the selection of modeling parameters of other multivariate calibration models.
On-line milk spectrometry: analysis of bovine milk composition
NASA Astrophysics Data System (ADS)
Spitzer, Kyle; Kuennemeyer, Rainer; Woolford, Murray; Claycomb, Rod
2005-04-01
We present partial least squares (PLS) regressions to predict the composition of raw, unhomogenised milk using visible to near infrared spectroscopy. A total of 370 milk samples from individual quarters were collected and analysed on-line by two low cost spectrometers in the wavelength ranges 380-1100 nm and 900-1700 nm. Samples were collected from 22 Friesian, 17 Jersey, 2 Ayrshire and 3 Friesian-Jersey crossbred cows over a period of 7 consecutive days. Transmission spectra were recorded in an inline flowcell through a 0.5 mm thick milk sample. PLS models, where wavelength selection was performed using iterative PLS, were developed for fat, protein, lactose, and somatic cell content. The root mean square error of prediction (and correlation coefficient) for the NIR and visible spectrometers respectively were 0.70% (0.93) and 0.91% (0.91) for fat, 0.65% (0.5) and 0.47% (0.79) for protein, 0.36% (0.49) and 0.45% (0.43) for lactose, and 0.50 (0.54) and 0.48 (0.51) for log10 somatic cells.
Li, Shuifang; Zhang, Xin; Shan, Yang; Su, Donglin; Ma, Qiang; Wen, Ruizhi; Li, Jiaojuan
2017-03-01
Near-infrared spectroscopy (NIR) was used for qualitative and quantitative detection of honey adulterated with high-fructose corn syrup (HFCS) or maltose syrup (MS). Competitive adaptive reweighted sampling (CARS) was employed to select key variables. Partial least squares linear discriminant analysis (PLS-LDA) was adopted to classify the adulterated honey samples. The CARS-PLS-LDA models showed an accuracy of 86.3% (honey vs. honey adulterated with HFCS) and 96.1% (honey vs. honey adulterated with MS), respectively. PLS regression (PLSR) was used to predict the extent of adulteration in the honeys. The results showed that NIR combined with PLSR could not be used to quantify adulteration with HFCS, but could be used to quantify adulteration with MS: the coefficient of determination (R²p) and root mean square error of prediction (RMSEP) were 0.901 and 4.041 for MS-adulterated samples from different floral origins, and 0.981 and 1.786 for MS-adulterated samples from the same floral origin (Brassica spp.), respectively. Copyright © 2016. Published by Elsevier Ltd.
Wang, Jun; Kliks, Michael M; Jun, Soojin; Jackson, Mel; Li, Qing X
2010-03-01
Quantitative analysis of glucose, fructose, sucrose, and maltose in honey samples of different geographic origins worldwide was studied using Fourier transform infrared (FTIR) spectroscopy and chemometric methods such as partial least squares (PLS) and principal component regression. The calibration series consisted of 45 standard mixtures made up of glucose, fructose, sucrose, and maltose. There were distinct peak variations of all sugar mixtures in the spectral "fingerprint" region between 1500 and 800 cm⁻¹. The calibration model was successfully validated using 7 synthetic blend sets of sugars. The PLS 2nd-derivative model showed the highest prediction accuracy, with the highest R² value of 0.999. Together with canonical variate analysis, the calibration model, further validated by high-performance liquid chromatography measurements of commercial honey samples, demonstrates that FTIR can qualitatively and quantitatively determine the presence of glucose, fructose, sucrose, and maltose in honey samples from multiple regions.
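Derivative pretreatment of spectra, such as the 2nd-derivative model mentioned in this abstract, can be illustrated with plain central finite differences. This is a simplified sketch: in practice a smoothing derivative such as Savitzky-Golay is commonly used, and the abstract does not say which algorithm was applied. The function names and the example spectrum are hypothetical.

```python
def first_derivative(spectrum, step=1.0):
    """Central-difference first derivative; drops the two end points."""
    return [
        (spectrum[i + 1] - spectrum[i - 1]) / (2 * step)
        for i in range(1, len(spectrum) - 1)
    ]

def second_derivative(spectrum, step=1.0):
    """Central-difference second derivative; drops the two end points."""
    return [
        (spectrum[i - 1] - 2 * spectrum[i] + spectrum[i + 1]) / step ** 2
        for i in range(1, len(spectrum) - 1)
    ]

# Made-up absorbance values on an evenly spaced wavenumber axis.
absorbance = [0.10, 0.12, 0.18, 0.30, 0.22, 0.14, 0.11]
print(first_derivative(absorbance))
print(second_derivative(absorbance))
```

Derivatives remove constant and sloping baselines but amplify noise, which is why smoothing is usually combined with them.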
Bechshøft, TØ; Sonne, C; Dietz, R; Born, EW; Muir, DCG; Letcher, RJ; Novak, MA; Henchey, E; Meyer, JS; Jenssen, BM; Villanger, GD
2012-01-01
The multivariate relationship between hair cortisol, whole blood thyroid hormones, and the complex mixtures of organohalogen contaminant (OHC) levels measured in subcutaneous adipose of 23 East Greenland polar bears (eight males and 15 females, all sampled between the years 1999 and 2001) was analyzed using projection to latent structure (PLS) regression modeling. In the resulting PLS model, most important variables with a negative influence on cortisol levels were particularly BDE-99, but also CB-180, -201, BDE-153, and CB-170/190. The most important variables with a positive influence on cortisol were CB-66/95, α-HCH, TT3, as well as heptachlor epoxide, dieldrin, BDE-47, p,p′-DDD. Although statistical modeling does not necessarily fully explain biological cause-effect relationships, relationships indicate that (1) the hypothalamic-pituitary-adrenal (HPA) axis in East Greenland polar bears is likely to be affected by OHC-contaminants and (2) the association between OHCs and cortisol may be linked with the hypothalamus-pituitary-thyroid (HPT) axis. PMID:22575327
NASA Astrophysics Data System (ADS)
Yulia, M.; Suhandy, D.
2018-03-01
NIR spectra obtained from a spectral data acquisition system contain both chemical information and physical information about the samples, such as particle size and bulk density. Several methods have been established for developing calibration models that can compensate for variations in sample physical information. One common approach is to include the physical information variation in the calibration model both explicitly and implicitly. The objective of this study was to evaluate the feasibility of using the explicit method to compensate for the influence of different coffee powder particle sizes on NIR calibration model performance. A total of 220 coffee powder samples of two types of coffee (civet and non-civet) and two particle sizes (212 and 500 µm) were prepared. Spectral data were acquired using an NIR spectrometer equipped with an integrating sphere for diffuse reflectance measurement. A discrimination method based on PLS-DA was applied and the influence of particle size on the performance of PLS-DA was investigated. In the explicit method, the particle size is added directly as a predicted variable, which results in an X block containing only the NIR spectra and a Y block containing the particle size and the type of coffee. The explicit inclusion of the particle size in the calibration model is expected to improve the accuracy of coffee type determination. The results show that with the explicit method the quality of the calibration model for coffee type determination is slightly superior, with a coefficient of determination (R2) of 0.99 and a root mean square error of cross-validation (RMSECV) of 0.041. The performance of the PLS2 calibration model for coffee type determination with particle size compensation was quite good, and the model was able to predict the type of coffee at two different particle sizes with relatively high R2 pred values. The prediction also resulted in low bias and RMSEP values.
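The explicit method described in this abstract puts both the coffee type and the particle size into the Y block of a PLS2 model. A minimal sketch of building such a Y block follows; the 0/1 coding of coffee type, the function name and the example samples are assumptions, since the abstract does not give the coding used.

```python
def build_y_block(coffee_types, particle_sizes_um, positive_class="civet"):
    """Return one Y row per sample: [class code (1 = civet, 0 = other), particle size in µm]."""
    return [
        [1.0 if coffee == positive_class else 0.0, float(size)]
        for coffee, size in zip(coffee_types, particle_sizes_um)
    ]

# Four hypothetical samples covering both coffee types and both particle sizes.
types = ["civet", "civet", "non-civet", "non-civet"]
sizes = [212, 500, 212, 500]
for row in build_y_block(types, sizes):
    print(row)
```

Because the two Y columns differ greatly in magnitude, they would normally be scaled to comparable variance before fitting a PLS2 model.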
Boiret, Mathieu; Meunier, Loïc; Ginot, Yves-Michel
2011-02-20
A near infrared (NIR) method was developed for determination of tablet potency of active pharmaceutical ingredient (API) in a complex coated tablet matrix. The calibration set contained samples from laboratory and production scale batches. The reference values were obtained by high performance liquid chromatography (HPLC) and partial least squares (PLS) regression was used to establish a model. The model was challenged by calculating tablet potency of two external test sets. Root mean square errors of prediction were respectively equal to 2.0% and 2.7%. To use this model with a second spectrometer from the production field, a calibration transfer method called piecewise direct standardisation (PDS) was used. After the transfer, the root mean square error of prediction of the first test set was 2.4% compared to 4.0% without transferring the spectra. A statistical technique using bootstrap of PLS residuals was used to estimate confidence intervals of tablet potency calculations. This method requires an optimised PLS model, selection of the bootstrap number and determination of the risk. In the case of a chemical analysis, the tablet potency value will be included within the confidence interval calculated by the bootstrap method. An easy to use graphical interface was developed to easily determine if the predictions, surrounded by minimum and maximum values, are within the specifications defined by the regulatory organisation. Copyright © 2010 Elsevier B.V. All rights reserved.
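The bootstrap idea mentioned above can be sketched in a deliberately simplified form: calibration residuals are resampled and added to a single predicted potency, and a percentile interval is read off for a chosen risk level. This is an assumption-laden illustration, not the authors' procedure (which presumably involves the optimised PLS model itself); the function name, the residual values, and the parameters `n_boot` and `risk` are hypothetical.

```python
import random


def bootstrap_interval(prediction, residuals, n_boot=1000, risk=0.05, seed=0):
    """Percentile interval around a prediction from resampled residuals."""
    rng = random.Random(seed)
    # Each bootstrap draw adds one randomly chosen calibration residual.
    draws = sorted(prediction + rng.choice(residuals) for _ in range(n_boot))
    lo_idx = int((risk / 2) * (n_boot - 1))
    hi_idx = int((1 - risk / 2) * (n_boot - 1))
    return draws[lo_idx], draws[hi_idx]


# Toy residuals in % of label claim (hypothetical values)
residuals = [-2.1, -0.8, -0.3, 0.0, 0.4, 0.9, 1.7]
low, high = bootstrap_interval(100.0, residuals)
```

The resulting `(low, high)` pair plays the role of the minimum and maximum values that the abstract describes comparing against the regulatory specifications.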
NASA Astrophysics Data System (ADS)
Liu, Jianzhong; Kern, Petra S.; Gerberick, G. Frank; Santos-Filho, Osvaldo A.; Esposito, Emilio X.; Hopfinger, Anton J.; Tseng, Yufeng J.
2008-06-01
In previous studies we developed categorical QSAR models for predicting skin-sensitization potency based on 4D-fingerprint (4D-FP) descriptors and in vivo murine local lymph node assay (LLNA) measures. Only 4D-FP derived from the ground state (GMAX) structures of the molecules were used to build the QSAR models. In this study we have generated 4D-FP descriptors from the first excited state (EMAX) structures of the molecules. The GMAX, EMAX and the combined ground and excited state 4D-FP descriptors (GEMAX) were employed in building categorical QSAR models. Logistic regression (LR) and partial least squares coupled logistic regression (PLS-CLR), found to be effective model-building methods for the LLNA skin-sensitization measures in our previous studies, were used again in this study. This also permitted comparison of the prior ground state models to those involving first excited state 4D-FP descriptors. Three types of categorical QSAR models were constructed for each of the GMAX, EMAX and GEMAX datasets: a binary model (2-state), an ordinal model (3-state) and a binary-binary model (two-2-state). No significant differences exist among the LR 2-state models constructed for each of the three datasets. However, the PLS-CLR 3-state and 2-state models based on the EMAX and GEMAX datasets have higher predictivity than those constructed using only the GMAX dataset. These EMAX and GMAX categorical models are also more significant and predictive than corresponding models built in our previous QSAR studies of LLNA skin-sensitization measures.
Application of visible and near-infrared spectroscopy to classification of Miscanthus species.
Jin, Xiaoli; Chen, Xiaoling; Xiao, Liang; Shi, Chunhai; Chen, Liang; Yu, Bin; Yi, Zili; Yoo, Ji Hye; Heo, Kweon; Yu, Chang Yeon; Yamada, Toshihiko; Sacks, Erik J; Peng, Junhua
2017-01-01
The feasibility of visible and near infrared (NIR) spectroscopy as a tool to classify Miscanthus samples was explored in this study. Three types of Miscanthus plants, namely M. sinensis, M. sacchariflorus and M. floridulus, were analyzed using a NIR spectrophotometer. Several classification models based on the NIR spectral data were developed using linear discriminant analysis (LDA), partial least squares (PLS), least squares support vector machine regression (LSSVR), radial basis function (RBF) and neural network (NN). The principal component analysis (PCA) presented a rough classification with overlapping samples, while the Lin_LSSVR, RBF_LSSVR and RBF_NN models presented almost the same calibration and validation results. Because Lin_LSSVR is faster than RBF_LSSVR and RBF_NN, we selected the Lin_LSSVR model as a representative. In our study, the model based on Lin_LSSVR showed higher accuracy than the LDA and PLS models. Total correct classification rates of 87.79 and 96.51% were observed for the LDA and PLS models in the testing set, respectively, while Lin_LSSVR showed a total correct classification rate of 99.42%. The Lin_LSSVR model in the testing set showed correct classification rates of 100, 100 and 96.77% for M. sinensis, M. sacchariflorus and M. floridulus, respectively; it assigned 99.42% of samples to the right groups, misclassifying only one M. floridulus sample. The results demonstrated that NIR spectra combined with a preliminary morphological classification could be an effective and reliable procedure for the classification of Miscanthus species. PMID:28369059
Hashimoto, Ryu-Ichiro; Itahashi, Takashi; Okada, Rieko; Hasegawa, Sayaka; Tani, Masayuki; Kato, Nobumasa; Mimura, Masaru
2018-01-01
Abnormalities in functional brain networks in schizophrenia have been studied by examining intrinsic and extrinsic brain activity under various experimental paradigms. However, the identified patterns of abnormal functional connectivity (FC) vary depending on the adopted paradigms. Thus, it is unclear whether and how these patterns are inter-related. In order to assess relationships between abnormal patterns of FC during intrinsic activity and those during extrinsic activity, we adopted a data-fusion approach and applied partial least square (PLS) analyses to FC datasets from 25 patients with chronic schizophrenia and 25 age- and sex-matched normal controls. For the input to the PLS analyses, we generated a pair of FC maps during the resting state (REST) and the auditory deviance response (ADR) from each participant using the common seed region in the left middle temporal gyrus, which is a focus of activity associated with auditory verbal hallucinations (AVHs). PLS correlation (PLS-C) analysis revealed that patients with schizophrenia have significantly lower loadings of a component containing positive FCs in default-mode network regions during REST and a component containing positive FCs in the auditory and attention-related networks during ADR. Specifically, loadings of the REST component were significantly correlated with the severities of positive symptoms and AVH in patients with schizophrenia. The co-occurrence of such altered FC patterns during REST and ADR was replicated using PLS regression, wherein FC patterns during REST are modeled to predict patterns during ADR. These findings provide an integrative understanding of altered FCs during intrinsic and extrinsic activity underlying core schizophrenia symptoms.
Mabood, F; Boqué, R; Folcarelli, R; Busto, O; Jabeen, F; Al-Harrasi, Ahmed; Hussain, J
2016-05-15
In this study the effect of thermal treatment on the enhancement of synchronous fluorescence spectroscopic method for discrimination and quantification of pure extra virgin olive oil (EVOO) samples from EVOO samples adulterated with refined oil was investigated. Two groups of samples were used. One group was analyzed at room temperature (25 °C) and the other group was thermally treated in a thermostatic water bath at 75 °C for 8h, in contact with air and with light exposure, to favor oxidation. All the samples were then measured with synchronous fluorescence spectroscopy. Synchronous fluorescence spectra were acquired by varying the wavelength in the region from 250 to 720 nm at 20 nm wavelength differential interval of excitation and emission. Pure and adulterated olive oils were discriminated by using partial least-squares discriminant analysis (PLS-DA). It was found that the best PLS-DA models were those built with the difference spectra (75 °C-25 °C), which were able to discriminate pure from adulterated oils at a 2% level of adulteration of refined olive oils. Furthermore, PLS regression models were also built to quantify the level of adulteration. Again, the best model was the one built with the difference spectra, with a prediction error of 3.18% of adulteration. Copyright © 2016 Elsevier B.V. All rights reserved.
Kim, So-Hyun; Cho, Somi K; Hyun, Sun-Hee; Park, Hae-Eun; Kim, Young-Suk; Choi, Hyung-Kyoon
2011-01-01
Guava leaves were classified and the free radical scavenging activity (FRSA) evaluated according to different harvest times by using the ¹H-NMR-based metabolomic technique. A principal component analysis (PCA) of ¹H-NMR data from the guava leaves provided clear clusters according to the harvesting time. A partial least squares (PLS) analysis indicated a correlation between the metabolic profile and FRSA. FRSA levels of the guava leaves harvested during May and August were high, and those leaves contained higher amounts of 3-hydroxybutyric acid, acetic acid, glutamic acid, asparagine, citric acid, malonic acid, trans-aconitic acid, ascorbic acid, maleic acid, cis-aconitic acid, epicatechin, protocatechuic acid, and xanthine than the leaves harvested during October and December. Epicatechin and protocatechuic acid among those compounds seem to have enhanced FRSA of the guava leaf samples harvested in May and August. A PLS regression model was established to predict guava leaf FRSA at different harvesting times by using a ¹H-NMR data set. The predictability of the PLS model was then tested by internal and external validation. The results of this study indicate that ¹H-NMR-based metabolomic data could usefully characterize guava leaves according to their time of harvesting.
Orthogonal decomposition of left ventricular remodeling in myocardial infarction
Zhang, Xingyu; Medrano-Gracia, Pau; Ambale-Venkatesh, Bharath; Bluemke, David A.; Cowan, Brett R; Finn, J. Paul; Kadish, Alan H.; Lee, Daniel C.; Lima, Joao A. C.; Young, Alistair A.; Suinesiaputra, Avan
2017-01-01
Left ventricular size and shape are important for quantifying cardiac remodeling in response to cardiovascular disease. Geometric remodeling indices have been shown to have prognostic value in predicting adverse events in the clinical literature, but these often describe interrelated shape changes. We developed a novel method for deriving orthogonal remodeling components directly from any (moderately independent) set of clinical remodeling indices. Results: Six clinical remodeling indices (end-diastolic volume index, sphericity, relative wall thickness, ejection fraction, apical conicity, and longitudinal shortening) were evaluated using cardiac magnetic resonance images of 300 patients with myocardial infarction, and 1991 asymptomatic subjects, obtained from the Cardiac Atlas Project. Partial least squares (PLS) regression of left ventricular shape models resulted in remodeling components that were optimally associated with each remodeling index. A Gram–Schmidt orthogonalization process, by which remodeling components were successively removed from the shape space in the order of shape variance explained, resulted in a set of orthonormal remodeling components. Remodeling scores could then be calculated that quantify the amount of each remodeling component present in each case. A one-factor PLS regression led to more decoupling between scores from the different remodeling components across the entire cohort, and zero correlation between clinical indices and subsequent scores. Conclusions: The PLS orthogonal remodeling components had similar power to describe differences between myocardial infarction patients and asymptomatic subjects as principal component analysis, but were better associated with well-understood clinical indices of cardiac remodeling. The data and analyses are available from www.cardiacatlas.org. PMID:28327972
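The Gram–Schmidt step described above can be illustrated with a short sketch: vectors are processed in a fixed order, each one has its projections onto the already accepted components removed, and the remainder is normalised. The three-dimensional toy vectors, the function names, and the tolerance are assumptions for illustration; the study works with PLS-derived components in a much larger shape space. The loop uses the modified Gram–Schmidt variant, which projects the partially reduced vector at each step.

```python
import math


def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalise vectors in the given order, dropping dependent ones."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            # Remove the part of w that lies along an accepted component.
            proj = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - proj * bi for wi, bi in zip(w, b)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis


def component_scores(shape, basis):
    """Score of one shape vector on each orthonormal component."""
    return [sum(s * b for s, b in zip(shape, comp)) for comp in basis]


# Toy, non-orthogonal "components" (hypothetical values)
components = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
case_scores = component_scores([2.0, 0.5, -1.0], components)
```

Because the basis is orthonormal, each entry of `case_scores` measures one component without overlap with the others, which is the sense in which remodeling scores become decoupled.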
ERIC Educational Resources Information Center
Qi, Cathy Huaqing; Marley, Scott C.
2009-01-01
The study examined whether item bias is present in the "Preschool Language Scale-4" (PLS-4). Participants were 440 children (3-5 years old; 86% English-speaking Hispanic and 14% European American) who were enrolled in Head Start programs. The PLS-4 items were analyzed for differential item functioning (DIF) using logistic regression and…
Optimizing methods for linking cinematic features to fMRI data.
Kauttonen, Janne; Hlushchuk, Yevhen; Tikka, Pia
2015-04-15
One of the challenges of naturalistic neurosciences using movie-viewing experiments is how to interpret observed brain activations in relation to the multiplicity of time-locked stimulus features. As previous studies have shown less inter-subject synchronization across viewers of random video footage than story-driven films, new methods need to be developed for analysis of less story-driven content. To optimize the linkage between our fMRI data collected during viewing of a deliberately non-narrative silent film 'At Land' by Maya Deren (1944) and its annotated content, we combined the method of elastic-net regularization with the model-driven linear regression and the well-established data-driven independent component analysis (ICA) and inter-subject correlation (ISC) methods. In the linear regression analysis, both IC and region-of-interest (ROI) time-series were fitted with time-series of a total of 36 binary-valued and one real-valued tactile annotation of film features. The elastic-net regularization and cross-validation were applied in the ordinary least-squares linear regression in order to avoid over-fitting due to the multicollinearity of regressors; the results were compared against both the partial least-squares (PLS) regression and the un-regularized full-model regression. A non-parametric permutation testing scheme was applied to evaluate the statistical significance of regression. We found a statistically significant correlation between the annotation model and 9 ICs out of 40 ICs. Regression analysis was also repeated for a large set of cubic ROIs covering the grey matter. Both IC- and ROI-based regression analyses revealed activations in parietal and occipital regions, with additional smaller clusters in the frontal lobe. Furthermore, we found elastic-net based regression more sensitive than PLS and un-regularized regression since it detected a larger number of significant ICs and ROIs.
Along with the ISC ranking methods, our regression analysis proved a feasible method for ordering the ICs based on their functional relevance to the annotated cinematic features. The novelty of our method is - in comparison to the hypothesis-driven manual pre-selection and observation of some individual regressors biased by choice - in applying data-driven approach to all content features simultaneously. We found especially the combination of regularized regression and ICA useful when analyzing fMRI data obtained using non-narrative movie stimulus with a large set of complex and correlated features. Copyright © 2015. Published by Elsevier Inc.
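A non-parametric permutation test of the kind mentioned above can be sketched as follows: the observed correlation between a model prediction and a measured series is compared with correlations obtained after shuffling one of the series. This is a generic illustration under stated assumptions, not the authors' scheme; in particular, plain shuffling ignores the temporal autocorrelation of fMRI time-series, and the function names and toy values are hypothetical.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def permutation_p_value(x, y, n_perm=999, seed=0):
    """Two-sided p-value for the correlation of x and y by shuffling y."""
    rng = random.Random(seed)
    observed = abs(pearson_r(x, y))
    y_perm = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        if abs(pearson_r(x, y_perm)) >= observed - 1e-12:
            hits += 1
    # The +1 terms count the observed arrangement itself.
    return (hits + 1) / (n_perm + 1)


# Toy series (hypothetical values)
x = [0.1, 0.4, 0.5, 0.9, 1.3, 1.8, 2.0, 2.6, 3.1, 3.3]
y = [0.3, 0.9, 1.2, 1.7, 2.9, 3.4, 4.1, 5.0, 6.3, 6.5]
p_value = permutation_p_value(x, y)
```

With many components or ROIs tested at once, the resulting p-values would additionally need a multiple-comparison correction, which this sketch leaves out.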
NASA Astrophysics Data System (ADS)
Mabood, Fazal; Boqué, Ricard; Folcarelli, Rita; Busto, Olga; Al-Harrasi, Ahmed; Hussain, Javid
2015-05-01
We have investigated the effect of thermal treatment on the discrimination of pure extra virgin olive oil (EVOO) samples from EVOO samples adulterated with sunflower oil. Two groups of samples were used. One group was analyzed at room temperature (25 °C) and the other group was thermally treated in a thermostatic water bath at 75 °C for 8 h, in contact with air and with light exposure, to favor oxidation. All samples were then measured with synchronous fluorescence spectroscopy. Fluorescence spectra were acquired by varying the excitation wavelength in the region from 250 to 720 nm. In order to optimize the differences between excitation and emission wavelengths, four constant differential wavelengths, i.e., 20 nm, 40 nm, 60 nm and 80 nm, were tried. Partial least-squares discriminant analysis (PLS-DA) was used to discriminate between pure and adulterated oils. It was found that the 20 nm difference was the optimal, at which the discrimination models showed the best results. The best PLS-DA models were those built with the difference spectra (75-25 °C), which were able to discriminate pure from adulterated oils at a 2% level of adulteration. Furthermore, PLS regression models were built to quantify the level of adulteration. Again, the best model was the one built with the difference spectra, with a prediction error of 1.75% of adulteration.
Fakayode, Sayo O; Mitchell, Breanna S; Pollard, David A
2014-08-01
Accurate understanding of analyte boiling points (BP) is of critical importance in gas chromatographic (GC) separation and crude oil refinery operation in petrochemical industries. This study reported the first combined use of GC separation and partial-least-square (PLS1) multivariate regression analysis of petrochemical structural activity relationship (SAR) for accurate BP determination of two commercially available (D3710 and MA VHP) calibration gas mix samples. The results of the BP determination using PLS1 multivariate regression were further compared with the results of traditional simulated distillation method of BP determination. The developed PLS1 regression was able to correctly predict analytes BP in D3710 and MA VHP calibration gas mix samples, with a root-mean-square-%-relative-error (RMS%RE) of 6.4%, and 10.8% respectively. In contrast, the overall RMS%RE of 32.9% and 40.4%, respectively obtained for BP determination in D3710 and MA VHP using a traditional simulated distillation method were approximately four times larger than the corresponding RMS%RE of BP prediction using MRA, demonstrating the better predictive ability of MRA. The reported method is rapid, robust, and promising, and can be potentially used routinely for fast analysis, pattern recognition, and analyte BP determination in petrochemical industries. Copyright © 2014 Elsevier B.V. All rights reserved.
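The abstract above reports errors as a root-mean-square percent relative error (RMS%RE) without giving the formula. A minimal sketch, assuming the common definition (the square root of the mean of the squared percent relative errors), is shown below; the function name and the toy boiling points are hypothetical.

```python
import math


def rms_percent_relative_error(predicted, actual):
    """Root-mean-square of the percent relative errors (assumed definition)."""
    pct_errors = [100.0 * (p - a) / a for p, a in zip(predicted, actual)]
    return math.sqrt(sum(e * e for e in pct_errors) / len(pct_errors))


# Toy boiling points in °C (hypothetical): +10 % and -5 % relative errors
value = rms_percent_relative_error([110.0, 190.0], [100.0, 200.0])
```

Because each error is scaled by the true value, the metric treats a 5 °C miss on a low-boiling analyte as more serious than the same miss on a high-boiling one.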
NASA Astrophysics Data System (ADS)
Saad, Ahmed S.; Hamdy, Abdallah M.; Salama, Fathy M.; Abdelkawy, Mohamed
2016-10-01
The effect of data manipulation in the preprocessing step preceding construction of chemometric models was assessed. The same set of UV spectral data was used for construction of PLS and PCR models, both directly and after mathematical manipulation according to well-known spectrophotometric methods: first and second derivatives of the absorption spectra, ratio spectra, and first and second derivatives of the ratio spectra. The optimal working wavelength ranges were carefully selected for each model before the models were constructed. Unexpectedly, the number of latent variables used for model construction varied among the different methods. The predictive power of the different models was compared using a validation set of 8 mixtures prepared according to the multilevel multifactor design, and results were statistically compared using a two-way ANOVA test. Root mean square error of prediction (RMSEP) was used for further comparison of predictability among the constructed models. Although no significant difference was found between results obtained using Partial Least Squares (PLS) and Principal Component Regression (PCR) models, discrepancies among results were attributed to the variation in the discrimination power of the adopted spectrophotometric methods on spectral data.
Li, Yankun; Shao, Xueguang; Cai, Wensheng
2007-04-15
Consensus modeling, which combines the results of multiple independent models to produce a single prediction, avoids the instability of a single model. Based on the principle of consensus modeling, a consensus least squares support vector regression (LS-SVR) method for calibrating near-infrared (NIR) spectra was proposed. In the proposed approach, NIR spectra of plant samples were first preprocessed using the discrete wavelet transform (DWT) to filter the spectral background and noise; then the consensus LS-SVR technique was used to build the calibration model. With optimization of the parameters involved in the modeling, a satisfactory model was achieved for predicting the content of reducing sugar in plant samples. The predicted results show that the consensus LS-SVR model is more robust and reliable than the conventional partial least squares (PLS) and LS-SVR methods.
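The consensus principle above can be shown in a few lines: several member models each predict the same sample and their outputs are combined into one value. The unweighted mean used here is an assumption, as the abstract does not say how member predictions are combined, and the member models are stand-in linear functions rather than LS-SVR models.

```python
def consensus_predict(member_models, spectrum):
    """Average the predictions of all member models for one spectrum."""
    predictions = [model(spectrum) for model in member_models]
    return sum(predictions) / len(predictions)


# Hypothetical member models standing in for separately trained calibrations
members = [
    lambda s: 0.9 * s[0] + 0.1,
    lambda s: 1.1 * s[0] - 0.1,
    lambda s: 1.0 * s[0],
]
result = consensus_predict(members, [2.0])
```

Individual members over- and under-shoot in different directions, so the averaged prediction varies less from one training subset to another than any single member does.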
Elkhoudary, Mahmoud M; Naguib, Ibrahim A; Abdel Salam, Randa A; Hadad, Ghada M
2017-05-01
Four accurate, sensitive and reliable stability-indicating chemometric methods were developed for the quantitative determination of Agomelatine (AGM), whether in pure form or in pharmaceutical formulations. Two supervised learning machine methods, linear artificial neural networks preceded by principal component analysis (PC-linANN) and linear support vector regression (linSVR), were compared with two principal component based methods, principal component regression (PCR) and partial least squares (PLS), for the spectrofluorimetric determination of AGM and its degradants. The results showed the benefits of using linear learning machine methods and the inherent merits of their algorithms in handling overlapped noisy spectral data, especially during the challenging determination of AGM alkaline and acidic degradants (DG1 and DG2). Relative mean squared error of prediction (RMSEP) for the proposed models in the determination of AGM were 1.68, 1.72, 0.68 and 0.22 for PCR, PLS, SVR and PC-linANN, respectively. The results showed the superiority of supervised learning machine methods over principal component based methods. In addition, the results suggested that linANN is the method of choice for determination of components in low amounts with similar overlapped spectra and a narrow linearity range. Comparison between the proposed chemometric models and a reported HPLC method revealed the comparable performance and quantification power of the proposed models.
Real‐time monitoring and control of the load phase of a protein A capture step
Rüdt, Matthias; Brestrich, Nina; Rolinger, Laura
2016-01-01
The load phase in preparative Protein A capture steps is commonly not controlled in real‐time. The load volume is generally based on an offline quantification of the monoclonal antibody (mAb) prior to loading and on a conservative column capacity determined by resin‐life time studies. While this results in a reduced productivity in batch mode, the bottleneck of suitable real‐time analytics has to be overcome in order to enable continuous mAb purification. In this study, Partial Least Squares Regression (PLS) modeling on UV/Vis absorption spectra was applied to quantify mAb in the effluent of a Protein A capture step during the load phase. A PLS model based on several breakthrough curves with variable mAb titers in the HCCF was successfully calibrated. The PLS model predicted the mAb concentrations in the effluent of a validation experiment with a root mean square error (RMSE) of 0.06 mg/mL. The information was applied to automatically terminate the load phase, when a product breakthrough of 1.5 mg/mL was reached. In a second part of the study, the sensitivity of the method was further increased by only considering small mAb concentrations in the calibration and by subtracting an impurity background signal. The resulting PLS model exhibited a RMSE of prediction of 0.01 mg/mL and was successfully applied to terminate the load phase, when a product breakthrough of 0.15 mg/mL was achieved. The proposed method has hence potential for the real‐time monitoring and control of capture steps at large scale production. This might enhance the resin capacity utilization, eliminate time‐consuming offline analytics, and contribute to the realization of continuous processing. Biotechnol. Bioeng. 2017;114: 368–373. © 2016 The Authors. Biotechnology and Bioengineering published by Wiley Periodicals, Inc. PMID:27543789
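The control rule described above (stop loading once the predicted mAb concentration in the effluent reaches a breakthrough threshold) can be sketched as a simple scan over successive predictions. The function name and the toy concentration trace are hypothetical; only the 1.5 mg/mL and 0.15 mg/mL thresholds come from the abstract, and in practice each value would come from a PLS prediction on a freshly acquired UV/Vis spectrum.

```python
def load_stop_index(predicted_conc_mg_ml, breakthrough_mg_ml=1.5):
    """Index of the first prediction at or above the threshold, else None."""
    for i, conc in enumerate(predicted_conc_mg_ml):
        if conc >= breakthrough_mg_ml:
            return i
    return None


# Toy effluent trace in mg/mL (hypothetical values)
effluent = [0.00, 0.02, 0.10, 0.45, 1.10, 1.62, 2.30]
stop_at = load_stop_index(effluent)
stop_at_sensitive = load_stop_index(effluent, breakthrough_mg_ml=0.15)
```

Lowering the threshold stops the load earlier on the same trace, which is why the second, more sensitive model with its smaller prediction error matters for the 0.15 mg/mL case.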
Aleixandre-Tudo, José Luis; Nieuwoudt, Helené; Aleixandre, José Luis; Du Toit, Wessel J
2015-02-04
The validation of ultraviolet-visible (UV-vis) spectroscopy combined with partial least-squares (PLS) regression to quantify red wine tannins is reported. The methylcellulose precipitable (MCP) tannin assay and the bovine serum albumin (BSA) tannin assay were used as reference methods. To take the high variability of wine tannins into account when the calibration models were built, a diverse data set was collected from samples of South African red wines that consisted of 18 different cultivars, from regions spanning the wine grape-growing areas of South Africa with their various sites, climates, and soils, ranging in vintage from 2000 to 2012. A total of 240 wine samples were analyzed, and these were divided into a calibration set (n = 120) and a validation set (n = 120) to evaluate the predictive ability of the models. To test the robustness of the PLS calibration models, the predictive ability of the classifying variables cultivar, vintage year, and experimental versus commercial wines was also tested. In general, the statistics obtained when BSA was used as a reference method were slightly better than those obtained with MCP. Despite this, the MCP tannin assay should also be considered as a valid reference method for developing PLS calibrations. The best calibration statistics for the prediction of new samples were coefficient of correlation (R2 val) = 0.89, root mean standard error of prediction (RMSEP) = 0.16, and residual predictive deviation (RPD) = 3.49 for MCP and R2 val = 0.93, RMSEP = 0.08, and RPD = 4.07 for BSA, when only the UV region (260-310 nm) was selected, which also led to a faster analysis time. In addition, a difference in the results obtained when the predictive ability of the classifying variables vintage, cultivar, or commercial versus experimental wines was studied suggests that tannin composition is highly affected by many factors.
This study also discusses the correlations in tannin values between the methylcellulose and protein precipitation methods.
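The validation statistics quoted in this record (RMSEP and RPD) can be illustrated with a minimal sketch. It assumes one common definition of RPD, the standard deviation of the reference values divided by RMSEP; the record does not state which definition the authors used, and the numbers below are hypothetical, not data from the study.

```python
import math
import statistics


def rmsep(reference, predicted):
    """Root mean square error of prediction over paired values."""
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rpd(reference, predicted):
    """Residual predictive deviation, assumed here to be SD(reference) / RMSEP."""
    return statistics.stdev(reference) / rmsep(reference, predicted)


# Hypothetical reference and predicted values, for illustration only.
reference = [0.50, 1.00, 1.50, 2.00, 2.50]
predicted = [0.60, 0.90, 1.60, 1.90, 2.60]

print(rmsep(reference, predicted))
print(rpd(reference, predicted))
```

A larger RPD means the prediction error is small relative to the natural spread of the reference values, which is why it is reported alongside RMSEP.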
Cheheltani, Rabee; McGoverin, Cushla M; Rao, Jayashree; Vorp, David A; Kiani, Mohammad F; Pleshko, Nancy
2014-06-21
Extracellular matrix (ECM) is a key component and regulator of many biological tissues including aorta. Several aortic pathologies are associated with significant changes in the composition of the matrix, especially in the content, quality and type of aortic structural proteins, collagen and elastin. The purpose of this study was to develop an infrared spectroscopic methodology that is comparable to biochemical assays to quantify collagen and elastin in aorta. Enzymatically degraded porcine aorta samples were used as a model of ECM degradation in abdominal aortic aneurysm (AAA). After enzymatic treatment, Fourier transform infrared (FTIR) spectra of the aortic tissue were acquired by an infrared fiber optic probe (IFOP) and FTIR imaging spectroscopy (FT-IRIS). Collagen and elastin content were quantified biochemically and partial least squares (PLS) models were developed to predict collagen and elastin content in aorta based on FTIR spectra. PLS models developed from FT-IRIS spectra were able to predict elastin and collagen content of the samples with strong correlations (RMSE of validation = 8.4% and 11.1% of the range respectively), and IFOP spectra were successfully used to predict elastin content (RMSE = 11.3% of the range). The PLS regression coefficients from the FT-IRIS models were used to map collagen and elastin in tissue sections of degraded porcine aortic tissue as well as a human AAA biopsy tissue, creating a similar map of each component compared to histology. These results support further application of FTIR spectroscopic techniques for evaluation of AAA tissues.
Cheheltani, Rabee; McGoverin, Cushla M.; Rao, Jayashree; Vorp, David A.; Kiani, Mohammad F.; Pleshko, N.
2014-01-01
Extracellular matrix (ECM) is a key component and regulator of many biological tissues including aorta. Several aortic pathologies are associated with significant changes in the composition of the matrix, especially in the content, quality and type of aortic structural proteins, collagen and elastin. The purpose of this study was to develop an infrared spectroscopic methodology that is comparable to biochemical assays to quantify collagen and elastin in aorta. Enzymatically degraded porcine aorta samples were used as a model of ECM degradation in abdominal aortic aneurysm (AAA). After enzymatic treatment, Fourier transform infrared (FTIR) spectra of the aortic tissue were acquired by an infrared fiber optic probe (IFOP) and FTIR imaging spectroscopy (FT-IRIS). Collagen and elastin content were quantified biochemically and partial least squares (PLS) models were developed to predict collagen and elastin content in aorta based on FTIR spectra. PLS models developed from FT-IRIS spectra were able to predict elastin and collagen content of the samples with strong correlations (RMSE of validation = 8.4% and 11.1% of the range respectively), and IFOP spectra were successfully used to predict elastin content (RMSE = 11.3% of the range). The PLS regression coefficients from the FT-IRIS models were used to map collagen and elastin in tissue sections of degraded porcine aortic tissue as well as a human AAA biopsy tissue, creating a similar map of each component compared to histology. These results support further application of FTIR spectroscopic techniques for evaluation of AAA tissues. PMID:24761431
Riahi, Siavash; Hadiloo, Farshad; Milani, Seyed Mohammad R; Davarkhah, Nazila; Ganjali, Mohammad R; Norouzi, Parviz; Seyfi, Payam
2011-05-01
The predictive accuracy of different chemometric methods was compared when applied to ordinary UV spectra and first-order derivative spectra. Principal component regression (PCR) and partial least squares with one dependent variable (PLS1) and two dependent variables (PLS2) were applied to spectral data of a pharmaceutical formulation containing pseudoephedrine (PDP) and guaifenesin (GFN). The ability of the derivative to resolve overlapping spectra (chloropheniramine maleate) was evaluated when multivariate methods were adopted for the analysis of two-component mixtures without any chemical pretreatment. The chemometric models were tested on an external validation dataset and finally applied to the analysis of pharmaceuticals. Significant advantages were found in the analysis of real samples when the calibration models from derivative spectra were used. It should also be mentioned that the proposed method is simple and rapid, requires no preliminary separation steps, and can be used easily for the analysis of these compounds, especially in quality control laboratories. Copyright © 2011 John Wiley & Sons, Ltd.
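As a rough illustration of why first-order derivative spectra can help, the sketch below uses a plain forward finite difference and shows that a constant baseline offset disappears after differentiation. The record does not say how the derivative was computed (smoothed derivatives are common in practice), so the function and the spectrum values are assumptions for illustration.

```python
def first_derivative(absorbance, step=1.0):
    """Forward finite difference of a spectrum sampled at a constant step."""
    return [(b - a) / step for a, b in zip(absorbance, absorbance[1:])]


# Hypothetical spectrum and the same spectrum with a constant baseline offset.
spectrum = [0.10, 0.25, 0.60, 0.40, 0.15]
shifted = [a + 0.05 for a in spectrum]

d_spectrum = first_derivative(spectrum)
d_shifted = first_derivative(shifted)

# The two derivative spectra agree up to floating-point rounding.
print(d_spectrum)
print(d_shifted)
```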
Rébufa, Catherine; Pany, Inès; Bombarda, Isabelle
2018-09-30
A rapid methodology was developed to simultaneously predict water content and activity values (a_w) of Moringa oleifera leaf powders (MOLP) using near infrared (NIR) signatures and experimental sorption isotherms. NIR spectra of MOLP samples (n = 181) were recorded. A Partial Least Square Regression model (PLS2) was obtained with low standard errors of prediction (SEP of 1.8% and 0.07 for water content and a_w respectively). Experimental sorption isotherms obtained at 20, 30 and 40 °C showed similar profiles. This result is particularly important for the use of MOLP in the food industry. In fact, a temperature variation of the drying process will not affect their available water content (shelf life). Nutrient contents based on protein and selected minerals (Ca, Fe, K) were also predicted from PLS1 models. Protein contents were well predicted (SEP of 2.3%). This methodology allowed for an improvement in MOLP safety, quality control and traceability. Published by Elsevier Ltd.
Genisheva, Z; Quintelas, C; Mesquita, D P; Ferreira, E C; Oliveira, J M; Amaral, A L
2018-04-25
This work aims to explore the potential of near infrared (NIR) spectroscopy to quantify volatile compounds in Vinho Verde wines, commonly determined by gas chromatography. For this purpose, 105 Vinho Verde wine samples were analyzed using Fourier transform near infrared (FT-NIR) transmission spectroscopy in the range of 5435 cm⁻¹ to 6357 cm⁻¹. Boxplot and principal components analysis (PCA) were performed for cluster identification and outlier removal. A partial least square (PLS) regression was then applied to develop the calibration models, by a new iterative approach. The predictive ability of the models was confirmed by an external validation procedure with an independent sample set. The obtained results could be considered as quite good with coefficients of determination (R²) varying from 0.94 to 0.97. The current methodology, using NIR spectroscopy and chemometrics, can be seen as a promising rapid tool to determine volatile compounds in Vinho Verde wines. Copyright © 2017 Elsevier Ltd. All rights reserved.
Locally Based Kernel PLS Regression De-noising with Application to Event-Related Potentials
NASA Technical Reports Server (NTRS)
Rosipal, Roman; Trejo, Leonard J.; Wheeler, Kevin; Tino, Peter
2002-01-01
The close relation of signal de-noising and regression problems dealing with the estimation of functions reflecting dependency between a set of inputs and dependent outputs corrupted with some level of noise has been employed in our approach.
Meoded, Avner; Kwan, Justin Y.; Peters, Tracy L.; Huey, Edward D.; Danielian, Laura E.; Wiggs, Edythe; Morrissette, Arthur; Wu, Tianxia; Russell, James W.; Bayat, Elham; Grafman, Jordan; Floeter, Mary Kay
2013-01-01
Introduction Executive dysfunction occurs in many patients with amyotrophic lateral sclerosis (ALS), but it has not been well studied in primary lateral sclerosis (PLS). The aims of this study were to (1) compare cognitive function in PLS to that in ALS patients, (2) explore the relationship between performance on specific cognitive tests and diffusion tensor imaging (DTI) metrics of white matter tracts and gray matter volumes, and (3) compare DTI metrics in patients with and without cognitive and behavioral changes. Methods The Delis-Kaplan Executive Function System (D-KEFS), the Mattis Dementia Rating Scale (DRS-2), and other behavior and mood scales were administered to 25 ALS patients and 25 PLS patients. Seventeen of the PLS patients, 13 of the ALS patients, and 17 healthy controls underwent structural magnetic resonance imaging (MRI) and DTI. Atlas-based analysis using MRI Studio software was used to measure fractional anisotropy, and axial and radial diffusivity of selected white matter tracts. Voxel-based morphometry was used to assess gray matter volumes. The relationship between diffusion properties of selected association and commissural white matter and performance on executive function and memory tests was explored using a linear regression model. Results More ALS than PLS patients had abnormal scores on the DRS-2. DRS-2 and D-KEFS scores were related to DTI metrics in several long association tracts and the callosum. Reduced gray matter volumes in motor and perirolandic areas were not associated with cognitive scores. Conclusion The changes in diffusion metrics of white matter long association tracts suggest that the loss of integrity of the networks connecting fronto-temporal areas to parietal and occipital areas contributes to cognitive impairment. PMID:24052798
Comparison of 3 Methods for Identifying Dietary Patterns Associated With Risk of Disease
DiBello, Julia R.; Kraft, Peter; McGarvey, Stephen T.; Goldberg, Robert; Campos, Hannia
2008-01-01
Reduced rank regression and partial least-squares regression (PLS) are proposed alternatives to principal component analysis (PCA). Using all 3 methods, the authors derived dietary patterns in Costa Rican data collected on 3,574 cases and controls in 1994–2004 and related the resulting patterns to risk of first incident myocardial infarction. Four dietary patterns associated with myocardial infarction were identified. Factor 1, characterized by high intakes of lean chicken, vegetables, fruit, and polyunsaturated oil, was generated by all 3 dietary pattern methods and was associated with a significantly decreased adjusted risk of myocardial infarction (28%–46%, depending on the method used). PCA and PLS also each yielded a pattern associated with a significantly decreased risk of myocardial infarction (31% and 23%, respectively); this pattern was characterized by moderate intake of alcohol and polyunsaturated oil and low intake of high-fat dairy products. The fourth factor derived from PCA was significantly associated with a 38% increased risk of myocardial infarction and was characterized by high intakes of coffee and palm oil. Contrary to previous studies, the authors found PCA and PLS to produce more patterns associated with cardiovascular disease than reduced rank regression. The most effective method for deriving dietary patterns related to disease may vary depending on the study goals. PMID:18945692
Brouckaert, D; Uyttersprot, J-S; Broeckx, W; De Beer, T
2018-03-01
Calibration transfer or standardisation aims at creating a uniform spectral response on different spectroscopic instruments or under varying conditions, without requiring a full recalibration for each situation. In the current study, this strategy is applied to construct at-line multivariate calibration models and consequently employ them in-line in a continuous industrial production line, using the same spectrometer. Firstly, quantitative multivariate models are constructed at-line at laboratory scale for predicting the concentration of two main ingredients in hard surface cleaners. By regressing the Raman spectra of a set of small-scale calibration samples against their reference concentration values, partial least squares (PLS) models are developed to quantify the surfactant levels in the liquid detergent compositions under investigation. After evaluating the models' performance with a set of independent validation samples, a univariate slope/bias correction is applied in view of transporting these at-line calibration models to an in-line manufacturing set-up. This standardisation technique allows a fast and easy transfer of the PLS regression models, by simply correcting the model predictions on the in-line set-up, without adjusting anything to the original multivariate calibration models. An extensive statistical analysis is performed in order to assess the predictive quality of the transferred regression models. Before and after transfer, the R² and RMSEP of both models are compared for evaluating if their magnitude is similar. T-tests are then performed to investigate whether the slope and intercept of the transferred regression line are not statistically different from 1 and 0, respectively. Furthermore, it is inspected whether no significant bias can be noted. F-tests are executed as well, for assessing the linearity of the transfer regression line and for investigating the statistical coincidence of the transfer and validation regression line.
Finally, a paired t-test is performed to compare the original at-line model to the slope/bias corrected in-line model, using interval hypotheses. It is shown that the calibration models of Surfactant 1 and Surfactant 2 yield satisfactory in-line predictions after slope/bias correction. While Surfactant 1 passes seven out of eight statistical tests, the recommended validation parameters are 100% successful for Surfactant 2. It is hence concluded that the proposed strategy for transferring at-line calibration models to an in-line industrial environment via a univariate slope/bias correction of the predicted values offers a successful standardisation approach. Copyright © 2017 Elsevier B.V. All rights reserved.
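The univariate slope/bias correction described in this record can be sketched as an ordinary least-squares line fitted between the transferred model's predictions and the reference values, which is then applied to new predictions while the multivariate model itself stays untouched. This is a minimal sketch under that assumption; the function names and the concentration values are hypothetical and are not taken from the study.

```python
def fit_slope_bias(predicted, reference):
    """Least-squares fit of reference = slope * predicted + bias."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_r = sum(reference) / n
    sxy = sum((p - mean_p) * (r - mean_r) for p, r in zip(predicted, reference))
    sxx = sum((p - mean_p) ** 2 for p in predicted)
    slope = sxy / sxx
    bias = mean_r - slope * mean_p
    return slope, bias


def apply_correction(predictions, slope, bias):
    """Correct raw model predictions; the underlying model is not modified."""
    return [slope * p + bias for p in predictions]


# Hypothetical in-line predictions of the at-line model and matching reference values.
inline_predictions = [5.0, 10.0, 15.0, 20.0]
reference_values = [5.0, 10.5, 16.0, 21.5]

slope, bias = fit_slope_bias(inline_predictions, reference_values)
corrected = apply_correction(inline_predictions, slope, bias)
print(slope, bias)
print(corrected)
```

Because only two numbers are estimated, a small set of transfer samples is enough, which is the practical appeal of this kind of correction over a full recalibration.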
Relationship between Composition and Toxicity of Motor Vehicle Emission Samples
McDonald, Jacob D.; Eide, Ingvar; Seagrave, JeanClare; Zielinska, Barbara; Whitney, Kevin; Lawson, Douglas R.; Mauderly, Joe L.
2004-01-01
In this study we investigated the statistical relationship between particle and semivolatile organic chemical constituents in gasoline and diesel vehicle exhaust samples, and toxicity as measured by inflammation and tissue damage in rat lungs and mutagenicity in bacteria. Exhaust samples were collected from “normal” and “high-emitting” gasoline and diesel light-duty vehicles. We employed a combination of principal component analysis (PCA) and partial least-squares regression (PLS; also known as projection to latent structures) to evaluate the relationships between chemical composition of vehicle exhaust and toxicity. The PLS analysis revealed the chemical constituents covarying most strongly with toxicity and produced models predicting the relative toxicity of the samples with good accuracy. The specific nitro-polycyclic aromatic hydrocarbons important for mutagenicity were the same chemicals that have been implicated by decades of bioassay-directed fractionation. These chemicals were not related to lung toxicity, which was associated with organic carbon and select organic compounds that are present in lubricating oil. The results demonstrate the utility of the PCA/PLS approach for evaluating composition–response relationships in complex mixture exposures and also provide a starting point for confirming causality and determining the mechanisms of the lung effects. PMID:15531438
Sato, Takako; Zaitsu, Kei; Tsuboi, Kento; Nomura, Masakatsu; Kusano, Maiko; Shima, Noriaki; Abe, Shuntaro; Ishii, Akira; Tsuchihashi, Hitoshi; Suzuki, Koichi
2015-05-01
Estimation of postmortem interval (PMI) is an important goal in judicial autopsy. Although many approaches can estimate PMI through physical findings and biochemical tests, accurate PMI calculation by these conventional methods remains difficult because PMI is readily affected by surrounding conditions, such as ambient temperature and humidity. In this study, Sprague-Dawley (SD) rats (10 weeks) were sacrificed by suffocation, and blood was collected by dissection at various time intervals (0, 3, 6, 12, 24, and 48 h; n = 6) after death. A total of 70 endogenous metabolites were detected in plasma by gas chromatography-tandem mass spectrometry (GC-MS/MS). Each time group was separated from each other on the principal component analysis (PCA) score plot, suggesting that the various endogenous metabolites changed with time after death. To prepare a prediction model of a PMI, a partial least squares (or projection to latent structure, PLS) regression model was constructed using the levels of significantly different metabolites determined by variable importance in the projection (VIP) score and the Kruskal-Wallis test (P < 0.05). Because the constructed PLS regression model could successfully predict each PMI, this model was validated with another validation set (n = 3). In conclusion, plasma metabolic profiling demonstrated its ability to successfully estimate PMI under a certain condition. This result can be considered to be the first step for using the metabolomics method in future forensic casework.
Quantitative analysis of bayberry juice acidity based on visible and near-infrared spectroscopy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shao Yongni; He Yong; Mao Jingyuan
Visible and near-infrared (Vis/NIR) reflectance spectroscopy has been investigated for its ability to nondestructively detect acidity in bayberry juice. What we believe to be a new, better mathematical model is put forward, which we have named principal component analysis-stepwise regression analysis-backpropagation neural network (PCA-SRA-BPNN), to build a correlation between the spectral reflectivity data and the acidity of bayberry juice. In this model, the optimum network parameters, such as the number of input nodes, hidden nodes, learning rate, and momentum, are chosen by the value of root-mean-square (rms) error. The results show that its prediction statistical parameters are correlation coefficient (r) of 0.9451 and root-mean-square error of prediction (RMSEP) of 0.1168. Partial least-squares (PLS) regression is also established to compare with this model. Before doing this, the influences of various spectral pretreatments (standard normal variate, multiplicative scatter correction, Savitzky-Golay first derivative, and wavelet package transform) are compared. The PLS approach with wavelet package transform preprocessing spectra is found to provide the best results, and its prediction statistical parameters are correlation coefficient (r) of 0.9061 and RMSEP of 0.1564. Hence, these two models are both desirable to analyze the data from Vis/NIR spectroscopy and to solve the problem of the acidity prediction of bayberry juice. This supplies basal research to ultimately realize the online measurements of the juice's internal quality through this Vis/NIR spectroscopy technique.
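Of the pretreatments compared in this record, standard normal variate (SNV) is the simplest to show: each spectrum is centred by its own mean and scaled by its own standard deviation, which removes additive offsets and multiplicative scaling between spectra. The sketch below assumes the usual sample standard deviation; the reflectance values are hypothetical.

```python
import statistics


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    mean = statistics.mean(spectrum)
    sd = statistics.stdev(spectrum)
    return [(x - mean) / sd for x in spectrum]


# Hypothetical spectrum, and the same spectrum with a gain and an offset applied.
raw = [0.20, 0.35, 0.50, 0.30]
distorted = [2.0 * x + 0.1 for x in raw]

snv_raw = snv(raw)
snv_distorted = snv(distorted)

# After SNV the two spectra coincide up to floating-point rounding.
print(snv_raw)
print(snv_distorted)
```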
NASA Astrophysics Data System (ADS)
Aimran, Ahmad Nazim; Ahmad, Sabri; Afthanorhan, Asyraf; Awang, Zainudin
2017-05-01
Structural equation modeling (SEM) is the second generation statistical analysis technique developed for analyzing the inter-relationships among multiple variables in a model. Previous studies have shown that there seemed to be at least an implicit agreement about the factors that should drive the choice between covariance-based structural equation modeling (CB-SEM) and partial least square path modeling (PLS-PM). PLS-PM appears to be the preferred method by previous scholars because of its less stringent assumption and the need to avoid the perceived difficulties in CB-SEM. Along with this issue has been the increasing debate among researchers on the use of CB-SEM and PLS-PM in studies. The present study intends to assess the performance of CB-SEM and PLS-PM as a confirmatory study in which the findings will contribute to the body of knowledge of SEM. Maximum likelihood (ML) was chosen as the estimator for CB-SEM and was expected to be more powerful than PLS-PM. Based on the balanced experimental design, the multivariate normal data with specified population parameter and sample sizes were generated using Pro-Active Monte Carlo simulation, and the data were analyzed using AMOS for CB-SEM and SmartPLS for PLS-PM. Comparative Bias Index (CBI), construct relationship, average variance extracted (AVE), composite reliability (CR), and Fornell-Larcker criterion were used to study the consequence of each estimator. The findings conclude that CB-SEM performed notably better than PLS-PM in estimation for large sample size (100 and above), particularly in terms of estimations accuracy and consistency.
Orthogonal decomposition of left ventricular remodeling in myocardial infarction.
Zhang, Xingyu; Medrano-Gracia, Pau; Ambale-Venkatesh, Bharath; Bluemke, David A; Cowan, Brett R; Finn, J Paul; Kadish, Alan H; Lee, Daniel C; Lima, Joao A C; Young, Alistair A; Suinesiaputra, Avan
2017-03-01
Left ventricular size and shape are important for quantifying cardiac remodeling in response to cardiovascular disease. Geometric remodeling indices have been shown to have prognostic value in predicting adverse events in the clinical literature, but these often describe interrelated shape changes. We developed a novel method for deriving orthogonal remodeling components directly from any (moderately independent) set of clinical remodeling indices. Six clinical remodeling indices (end-diastolic volume index, sphericity, relative wall thickness, ejection fraction, apical conicity, and longitudinal shortening) were evaluated using cardiac magnetic resonance images of 300 patients with myocardial infarction, and 1991 asymptomatic subjects, obtained from the Cardiac Atlas Project. Partial least squares (PLS) regression of left ventricular shape models resulted in remodeling components that were optimally associated with each remodeling index. A Gram-Schmidt orthogonalization process, by which remodeling components were successively removed from the shape space in the order of shape variance explained, resulted in a set of orthonormal remodeling components. Remodeling scores could then be calculated that quantify the amount of each remodeling component present in each case. A one-factor PLS regression led to more decoupling between scores from the different remodeling components across the entire cohort, and zero correlation between clinical indices and subsequent scores. The PLS orthogonal remodeling components had similar power to describe differences between myocardial infarction patients and asymptomatic subjects as principal component analysis, but were better associated with well-understood clinical indices of cardiac remodeling. The data and analyses are available from www.cardiacatlas.org. © The Author 2017. Published by Oxford University Press.
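The orthogonalization step mentioned in this record can be illustrated with a generic Gram-Schmidt routine. This is only a sketch of the textbook procedure (in its modified, numerically more stable form) on hypothetical three-dimensional vectors; it does not reproduce the authors' shape-space pipeline or their ordering by variance explained.

```python
import math


def dot(u, v):
    """Euclidean inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(vectors, tol=1e-12):
    """Modified Gram-Schmidt: orthonormal vectors spanning the inputs, in input order."""
    basis = []
    for v in vectors:
        w = list(v)
        # Remove the part of w that lies along each direction already accepted.
        for q in basis:
            coeff = dot(w, q)
            w = [wi - coeff * qi for wi, qi in zip(w, q)]
        norm = math.sqrt(dot(w, w))
        if norm > tol:  # skip vectors that are linearly dependent on earlier ones
            basis.append([wi / norm for wi in w])
    return basis


# Hypothetical, mutually correlated "component" vectors.
components = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
orthonormal = gram_schmidt(components)
print(orthonormal)
```

The order of the input vectors matters: the first vector keeps its direction, and each later one retains only what the earlier ones do not already explain.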
NASA Astrophysics Data System (ADS)
Glavanović, Siniša; Glavanović, Marija; Tomišić, Vladislav
2016-03-01
The UV spectrophotometric methods for simultaneous quantitative determination of paracetamol and tramadol in paracetamol-tramadol tablets were developed. The spectrophotometric data obtained were processed by means of partial least squares (PLS) and genetic algorithm coupled with PLS (GA-PLS) methods in order to determine the content of active substances in the tablets. The results gained by chemometric processing of the spectroscopic data were statistically compared with those obtained by means of validated ultra-high performance liquid chromatographic (UHPLC) method. The accuracy and precision of data obtained by the developed chemometric models were verified by analysing the synthetic mixture of drugs, and by calculating recovery as well as relative standard error (RSE). A statistically good agreement was found between the amounts of paracetamol determined using PLS and GA-PLS algorithms, and that obtained by UHPLC analysis, whereas for tramadol GA-PLS results were proven to be more reliable compared to those of PLS. The simplest and the most accurate and precise models were constructed by using the PLS method for paracetamol (mean recovery 99.5%, RSE 0.89%) and the GA-PLS method for tramadol (mean recovery 99.4%, RSE 1.69%).
Vindimian, Éric; Garric, Jeanne; Flammarion, Patrick; Thybaud, Éric; Babut, Marc
1999-10-01
The evaluation of the ecotoxicity of effluents requires a battery of biological tests on several species. In order to derive a summary parameter from such a battery, a single endpoint was calculated for all the tests: the EC10, obtained by nonlinear regression, with bootstrap evaluation of the confidence intervals. Principal component analysis was used to characterize and visualize the correlation between the tests. The table of the toxicity of the effluents was then submitted to a panel of experts, who classified the effluents according to the test results. Partial least squares (PLS) regression was used to fit the average value of the experts' judgements to the toxicity data, using a simple equation. Furthermore, PLS regression on partial data sets and other considerations resulted in an optimum battery, with two chronic tests and one acute test. The index is intended to be used for the classification of effluents based on their toxicity to aquatic species. Copyright © 1999 SETAC.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vindimian, E.; Garric, J.; Flammarion, P.
1999-10-01
The evaluation of the ecotoxicity of effluents requires a battery of biological tests on several species. In order to derive a summary parameter from such a battery, a single endpoint was calculated for all the tests: the EC10, obtained by nonlinear regression, with bootstrap evaluation of the confidence intervals. Principal component analysis was used to characterize and visualize the correlation between the tests. The table of the toxicity of the effluents was then submitted to a panel of experts, who classified the effluents according to the test results. Partial least squares (PLS) regression was used to fit the average value of the experts' judgments to the toxicity data, using a simple equation. Furthermore, PLS regression on partial data sets and other considerations resulted in an optimum battery, with two chronic tests and one acute test. The index is intended to be used for the classification of effluents based on their toxicity to aquatic species.
Yulia, Meinilwita
2017-01-01
Asian palm civet coffee or kopi luwak (Indonesian words for coffee and palm civet) is well known as the world's priciest and rarest coffee. To protect the authenticity of luwak coffee and protect consumers from luwak coffee adulteration, it is very important to develop a robust and simple method for determining the adulteration of luwak coffee. In this research, the use of UV-Visible spectra combined with PLSR was evaluated to establish rapid and simple methods for quantification of adulteration in luwak-arabica coffee blend. Several preprocessing methods were tested and the results show that most of the preprocessing spectra were effective in improving the quality of calibration models, with the best PLS calibration model selected for Savitzky-Golay smoothing spectra, which had the lowest RMSECV (0.039) and highest RPDcal value (4.64). Using this PLS model, a prediction for quantification of luwak content was calculated and resulted in satisfactory prediction performance with high values of both RPDp and RER. PMID:28913348
Bechshøft, T Ø; Sonne, C; Dietz, R; Born, E W; Muir, D C G; Letcher, R J; Novak, M A; Henchey, E; Meyer, J S; Jenssen, B M; Villanger, G D
2012-07-01
The multivariate relationship between hair cortisol, whole blood thyroid hormones, and the complex mixtures of organohalogen contaminant (OHC) levels measured in subcutaneous adipose of 23 East Greenland polar bears (eight males and 15 females, all sampled between the years 1999 and 2001) was analyzed using projection to latent structure (PLS) regression modeling. In the resulting PLS model, most important variables with a negative influence on cortisol levels were particularly BDE-99, but also CB-180, -201, BDE-153, and CB-170/190. The most important variables with a positive influence on cortisol were CB-66/95, α-HCH, TT3, as well as heptachlor epoxide, dieldrin, BDE-47, p,p'-DDD. Although statistical modeling does not necessarily fully explain biological cause-effect relationships, relationships indicate that (1) the hypothalamic-pituitary-adrenal (HPA) axis in East Greenland polar bears is likely to be affected by OHC-contaminants and (2) the association between OHCs and cortisol may be linked with the hypothalamus-pituitary-thyroid (HPT) axis. Copyright © 2012 Elsevier Inc. All rights reserved.
NASA Technical Reports Server (NTRS)
Soller, Babs R.; Favreau, Janice; Idwasi, Patrick O.
2003-01-01
The feasibility of using near-infrared (NIR) spectroscopy in combination with partial least-squares (PLS) regression was explored to measure electrolyte concentration in whole blood samples. Spectra were collected from diluted blood samples containing randomized, clinically relevant concentrations of Na+, K+, and Ca2+. Sodium was also studied in lysed blood. Reference measurements were made from the same samples using a standard clinical chemistry instrument. Partial least squares (PLS) was used to develop calibration models for each ion with acceptable results (Na+, R2 = 0.86, CVSEP = 9.5 mmol/L; K+, R2 = 0.54, CVSEP = 1.4 mmol/L; Ca2+, R2 = 0.56, CVSEP = 0.18 mmol/L). Slightly improved results were obtained using a narrower wavelength region (470-925 nm) where hemoglobin, but not water, absorbed indicating that ionic interaction with hemoglobin is as effective as water in causing measurable spectral variation. Good models were also achieved for sodium in lysed blood, illustrating that cell swelling, which is correlated with sodium concentration, is not required for calibration model development.
Dong, Yanhong; Li, Juan; Zhong, Xiaoxiao; Cao, Liya; Luo, Yang; Fan, Qi
2016-04-15
This paper establishes a novel method to simultaneously predict the tablet weight (TW) and trimethoprim (TMP) content of compound sulfamethoxazole tablets (SMZCO) by near infrared (NIR) spectroscopy with partial least squares (PLS) regression for controlling the uniformity of dosage units (UODU). The NIR spectra for 257 samples were measured using the optimized parameter values and pretreated using the optimized chemometric techniques. After the outliers were ignored, two PLS models for predicting TW and TMP content were respectively established by using the selected spectral sub-ranges and the reference values. The TW model reaches the correlation coefficient of calibration (R(c)) 0.9543 and the TMP content model has the R(c) 0.9205. The experimental results indicate that this strategy expands the NIR application in controlling UODU, especially in the high-throughput and rapid analysis of TWs and contents of the compound pharmaceutical tablets, and may be an important complement to the common NIR on-line analytical method for pharmaceutical tablets. Copyright © 2016 Elsevier B.V. All rights reserved.
An improved partial least-squares regression method for Raman spectroscopy
NASA Astrophysics Data System (ADS)
Momenpour Tehran Monfared, Ali; Anis, Hanan
2017-10-01
It is known that the performance of partial least-squares (PLS) regression analysis can be improved using the backward variable selection method (BVSPLS). In this paper, we further improve the BVSPLS based on a novel selection mechanism. The proposed method is based on sorting the weighted regression coefficients, and then the importance of each variable of the sorted list is evaluated using the root mean square error of prediction (RMSEP) criterion in each iteration step. Our Improved BVSPLS (IBVSPLS) method has been applied to leukemia and heparin data sets and led to an improvement in the limit of detection of Raman biosensing ranging from 10% to 43% compared to PLS. Our IBVSPLS was also compared to the jack-knifing (simpler) and Genetic Algorithm (more complex) methods. Our method was consistently better than the jack-knifing method and showed either a similar or a better performance compared to the genetic algorithm.
Shimizu, Yu; Yoshimoto, Junichiro; Takamura, Masahiro; Okada, Go; Okamoto, Yasumasa; Yamawaki, Shigeto; Doya, Kenji
2017-01-01
In diagnostic applications of statistical machine learning methods to brain imaging data, common problems include data high-dimensionality and co-linearity, which often cause over-fitting and instability. To overcome these problems, we applied partial least squares (PLS) regression to resting-state functional magnetic resonance imaging (rs-fMRI) data, creating a low-dimensional representation that relates symptoms to brain activity and that predicts clinical measures. Our experimental results, based upon data from clinically depressed patients and healthy controls, demonstrated that PLS and its kernel variants provided significantly better prediction of clinical measures than ordinary linear regression. Subsequent classification using predicted clinical scores distinguished depressed patients from healthy controls with 80% accuracy. Moreover, loading vectors for latent variables enabled us to identify brain regions relevant to depression, including the default mode network, the right superior frontal gyrus, and the superior motor area. PMID:28700672
Rahman, Anisur; Faqeerzada, Mohammad A; Cho, Byoung-Kwan
2018-03-14
Allicin and soluble solid content (SSC) are responsible for the pungent flavor and odor of garlic. However, current conventional methods, such as high-pressure liquid chromatography and refractometry, have critical drawbacks in that they are time-consuming, labor-intensive and destructive. The present study aimed to predict allicin and SSC in garlic using hyperspectral imaging in combination with variable selection algorithms and calibration models. Hyperspectral images of 100 garlic cloves were acquired over two spectral ranges, and the mean spectrum of each clove was extracted. The calibration models included partial least squares (PLS) and least squares-support vector machine (LS-SVM) regression, combined with different spectral pre-processing techniques, from which the best-performing pre-processing technique and spectral range were selected. Then, variable selection methods, such as regression coefficients, variable importance in projection (VIP) and the successive projections algorithm (SPA), were evaluated for the selection of effective wavelengths (EWs). Furthermore, PLS and LS-SVM regression methods were applied to quantitatively predict the quality attributes of garlic using the selected EWs. Of the established models, the SPA-LS-SVM model obtained an R²pred of 0.90 and a standard error of prediction (SEP) of 1.01% for SSC prediction, whereas the VIP-LS-SVM model produced the best result for allicin prediction, with an R²pred of 0.83 and an SEP of 0.19 mg g⁻¹, in the range 1000-1700 nm. Furthermore, chemical images of garlic were developed using the best predictive model to facilitate visualization of the spatial distributions of allicin and SSC. The present study clearly demonstrates that hyperspectral imaging combined with an appropriate chemometrics method can potentially be employed as a fast, non-invasive method to predict allicin and SSC in garlic. © 2018 Society of Chemical Industry.
Zhang, Xuan; Li, Wei; Yin, Bin; Chen, Weizhong; Kelly, Declan P; Wang, Xiaoxin; Zheng, Kaiyi; Du, Yiping
2013-10-01
Coffee is the most heavily consumed beverage in the world after water, and its quality is a key consideration in commercial trade. Caffeine content, which has a significant effect on the final quality of coffee products, therefore needs to be determined quickly and reliably by new analytical techniques. The main purpose of this work was to establish a powerful and practical analytical method based on near infrared spectroscopy (NIRS) and chemometrics for quantitative determination of caffeine content in roasted Arabica coffees. Ground coffee samples covering a wide range of roast levels were analyzed by NIR, and their caffeine contents were determined quantitatively by the most commonly used HPLC-UV method to provide reference values. Calibration models were then developed from chemometric analyses of the NIR spectral data and the reference concentrations of the coffee samples. Partial least squares (PLS) regression was used to construct the models. Furthermore, diverse spectral pretreatment and variable selection techniques were applied in order to obtain robust and reliable reduced-spectrum regression models. Comparing the quality of the different models constructed, the application of second derivative pretreatment and stability competitive adaptive reweighted sampling (SCARS) variable selection provided a notably improved regression model, with a root mean square error of cross validation (RMSECV) of 0.375 mg/g and a correlation coefficient (R) of 0.918 at 7 PLS factors. An independent test set was used to assess the model, giving a root mean square error of prediction (RMSEP) of 0.378 mg/g, a mean relative error of 1.976% and a mean relative standard deviation (RSD) of 1.707%. 
Thus, the results provided by the high-quality calibration model revealed the feasibility of NIR spectroscopy for at-line application to predict the caffeine content of unknown roasted coffee samples, thanks to the short analysis time of a few seconds and non-destructive advantages of NIRS. Copyright © 2013 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Krepper, Gabriela; Romeo, Florencia; Fernandes, David Douglas de Sousa; Diniz, Paulo Henrique Gonçalves Dias; de Araújo, Mário César Ugulino; Di Nezio, María Susana; Pistonesi, Marcelo Fabián; Centurión, María Eugenia
2018-01-01
Determining fat content in hamburgers is very important to minimize or control the negative effects of fat on human health, such as cardiovascular diseases and obesity, which are caused by the high consumption of saturated fatty acids and cholesterol. This study proposed an alternative analytical method based on Near Infrared Spectroscopy (NIR) and the Successive Projections Algorithm for interval selection in Partial Least Squares regression (iSPA-PLS) for fat content determination in commercial chicken hamburgers. For this, 70 hamburger samples with a fat content ranging from 14.27 to 32.12 mg kg⁻¹ were prepared based on the upper limit recommended by the Argentinean Food Codex, which is 20% (w w⁻¹). NIR spectra were then recorded and preprocessed by applying different approaches: baseline correction, SNV, MSC, and Savitzky-Golay smoothing. For comparison, full-spectrum PLS and Interval PLS were also used. The best performance for the prediction set was obtained for the first derivative Savitzky-Golay smoothing with a second-order polynomial and window size of 19 points, achieving a coefficient of correlation of 0.94, RMSEP of 1.59 mg kg⁻¹, REP of 7.69% and RPD of 3.02. The proposed methodology represents an excellent alternative to the conventional Soxhlet extraction method, since it avoids waste generation and uses neither chemical reagents nor solvents, which follows the primary principles of Green Chemistry. The new method was successfully applied to chicken hamburger analysis, and the results agreed with the reference values at a 95% confidence level, making it very attractive for routine analysis.
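Two of the pretreatments named above, SNV and the Savitzky-Golay first derivative (second-order polynomial, 19-point window), are simple enough to sketch in plain Python. This is an illustration under assumptions, not the authors' processing chain: the spectra are assumed to be sampled on an evenly spaced axis, edge points are left undefined rather than extrapolated, and the function names are hypothetical.

```python
import math

def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / sd for v in spectrum]

def sg_first_derivative(y, window=19, step=1.0):
    """Savitzky-Golay first derivative for a 1st- or 2nd-order polynomial fit.

    For these polynomial orders the centre-point derivative weights reduce to
    i / sum(i^2), with i running over the symmetric window. Points without a
    full window on both sides are returned as None.
    """
    if window % 2 == 0 or window < 3:
        raise ValueError("window must be an odd integer >= 3")
    m = window // 2
    denom = sum(i * i for i in range(-m, m + 1)) * step
    out = [None] * len(y)
    for c in range(m, len(y) - m):
        out[c] = sum(i * y[c + i] for i in range(-m, m + 1)) / denom
    return out
```

A quick sanity check is that a quadratic signal such as y = x² yields a derivative of 2x at every interior point, because the filter is exact for polynomials up to the fitted order.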
Krepper, Gabriela; Romeo, Florencia; Fernandes, David Douglas de Sousa; Diniz, Paulo Henrique Gonçalves Dias; de Araújo, Mário César Ugulino; Di Nezio, María Susana; Pistonesi, Marcelo Fabián; Centurión, María Eugenia
2018-01-15
Determining fat content in hamburgers is very important to minimize or control the negative effects of fat on human health, such as cardiovascular diseases and obesity, which are caused by the high consumption of saturated fatty acids and cholesterol. This study proposed an alternative analytical method based on Near Infrared Spectroscopy (NIR) and the Successive Projections Algorithm for interval selection in Partial Least Squares regression (iSPA-PLS) for fat content determination in commercial chicken hamburgers. For this, 70 hamburger samples with a fat content ranging from 14.27 to 32.12 mg kg⁻¹ were prepared based on the upper limit recommended by the Argentinean Food Codex, which is 20% (w w⁻¹). NIR spectra were then recorded and preprocessed by applying different approaches: baseline correction, SNV, MSC, and Savitzky-Golay smoothing. For comparison, full-spectrum PLS and Interval PLS were also used. The best performance for the prediction set was obtained for the first derivative Savitzky-Golay smoothing with a second-order polynomial and window size of 19 points, achieving a coefficient of correlation of 0.94, RMSEP of 1.59 mg kg⁻¹, REP of 7.69% and RPD of 3.02. The proposed methodology represents an excellent alternative to the conventional Soxhlet extraction method, since it avoids waste generation and uses neither chemical reagents nor solvents, which follows the primary principles of Green Chemistry. The new method was successfully applied to chicken hamburger analysis, and the results agreed with the reference values at a 95% confidence level, making it very attractive for routine analysis. Copyright © 2017 Elsevier B.V. All rights reserved.
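The figures of merit quoted above (RMSEP, REP, RPD) are not defined in the abstract. The sketch below uses definitions that are common in the chemometrics literature and should be read as assumptions: REP as RMSEP relative to the mean reference value, and RPD as the sample standard deviation of the reference values divided by RMSEP. The numbers in the usage lines are made up.

```python
import math
import statistics

def rmsep(y_ref, y_pred):
    """Root mean square error of prediction over a prediction set."""
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(y_ref, y_pred)) / len(y_ref))

def rep_percent(y_ref, y_pred):
    """Relative error of prediction: RMSEP as a percentage of the mean reference value."""
    return 100.0 * rmsep(y_ref, y_pred) / statistics.mean(y_ref)

def rpd(y_ref, y_pred):
    """Ratio of performance to deviation: SD of the reference values over RMSEP."""
    return statistics.stdev(y_ref) / rmsep(y_ref, y_pred)

# Made-up reference and predicted values, for illustration only.
reference = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 32.0, 38.0]
print(rmsep(reference, predicted), rep_percent(reference, predicted))
```

If the authors used the same REP definition, the reported RMSEP of 1.59 and REP of 7.69% would correspond to a mean reference value of roughly 20.7 in the prediction set, which lies inside the stated range of 14.27 to 32.12.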
The development of comparative bias index
NASA Astrophysics Data System (ADS)
Aimran, Ahmad Nazim; Ahmad, Sabri; Afthanorhan, Asyraf; Awang, Zainudin
2017-08-01
Structural Equation Modeling (SEM) is a second-generation statistical analysis technique developed for analyzing the inter-relationships among multiple variables in a model simultaneously. The two most commonly used methods in SEM are Covariance-Based Structural Equation Modeling (CB-SEM) and Partial Least Squares Path Modeling (PLS-PM). Researchers have continuously debated the use of PLS-PM over CB-SEM, yet few studies have tested the bias of CB-SEM and PLS-PM in estimating simulated data. This study intends to address this gap by a) developing the Comparative Bias Index and b) testing the performance of CB-SEM and PLS-PM using the developed index. Based on a balanced experimental design, two multivariate normal simulated data sets with distinct specifications, at sample sizes of 50, 100, 200 and 500, are generated and analyzed using CB-SEM and PLS-PM.
De Girolamo, A; Lippolis, V; Nordkvist, E; Visconti, A
2009-06-01
Fourier transform near-infrared spectroscopy (FT-NIR) was used for rapid and non-invasive analysis of deoxynivalenol (DON) in durum and common wheat. The relevance of using ground wheat samples with a homogeneous particle size distribution to minimize measurement variations and avoid DON segregation among particles of different sizes was established. Calibration models for durum wheat, common wheat and durum + common wheat samples, with particle size <500 µm, were obtained by using partial least squares (PLS) regression with an external validation technique. Values of root mean square error of prediction (RMSEP, 306-379 µg kg⁻¹) were comparable and not too far from values of root mean square error of cross-validation (RMSECV, 470-555 µg kg⁻¹). Coefficients of determination (r²) indicated an "approximate to good" level of prediction of the DON content by FT-NIR spectroscopy in the PLS calibration models (r² = 0.71-0.83), and a "good" discrimination between low and high DON contents in the PLS validation models (r² = 0.58-0.63). A "limited to good" practical utility of the models was ascertained by range error ratio (RER) values higher than 6. A qualitative model, based on 197 calibration samples, was developed to discriminate between blank and naturally contaminated wheat samples by setting a cut-off at 300 µg kg⁻¹ DON to separate the two classes. The model correctly classified 69% of the 65 validation samples with most misclassified samples (16 of 20) showing DON contamination levels quite close to the cut-off level. These findings suggest that FT-NIR analysis is suitable for the determination of DON in unprocessed wheat at levels far below the maximum permitted limits set by the European Commission.
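Two quantities in the abstract above lend themselves to a short worked example: the range error ratio (RER) and the classification rate at the 300 microgram per kilogram cut-off. The sketch below assumes the usual RER definition (range of the reference values divided by the prediction error) and assumes that a value exactly at the cut-off counts as contaminated; neither detail is stated in the abstract, and the function names are hypothetical.

```python
def range_error_ratio(y_ref, prediction_error):
    """RER: range of the reference values divided by the prediction error."""
    return (max(y_ref) - min(y_ref)) / prediction_error

def classify_by_cutoff(values, cutoff=300.0):
    """Label each value 'blank' below the cut-off, 'contaminated' at or above it."""
    return ["contaminated" if v >= cutoff else "blank" for v in values]

def accuracy_percent(true_labels, predicted_labels):
    """Share of samples whose predicted label matches the true label, in percent."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * hits / len(true_labels)

# Arithmetic from the abstract: 20 of 65 validation samples were misclassified.
correct = 65 - 20
reported_rate = round(100.0 * correct / 65)
```

The last two lines simply confirm that 45 correct samples out of 65 round to the reported 69%.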
Wei, Zhenbo; Wang, Jun; Ye, Linshuang
2011-08-15
A voltammetric electronic tongue (VE-tongue) was developed to discriminate the difference between Chinese rice wines in this research. Three types of Chinese rice wine with different marked ages (1, 3, and 5 years) were classified by the VE-tongue by principal component analysis (PCA) and cluster analysis (CA). The VE-tongue consisted of six working electrodes (gold, silver, platinum, palladium, tungsten, and titanium) in a standard three-electrode configuration. The multi-frequency large amplitude pulse voltammetry (MLAPV), which consisted of four segments of 1 Hz, 10 Hz, 100 Hz, and 1000 Hz, was applied as the potential waveform. The three types of Chinese rice wine could be classified accurately by PCA and CA, and some interesting regularity is shown in the score plots with the help of PCA. Two regression models, partial least squares (PLS) and back-error propagation-artificial neural network (BP-ANN), were used for wine age prediction. The regression results showed that the marked ages of the three types of Chinese rice wine were successfully predicted using PLS and BP-ANN. Copyright © 2011 Elsevier B.V. All rights reserved.
PLS modelling of structure—activity relationships of catechol O-methyltransferase inhibitors
NASA Astrophysics Data System (ADS)
Lotta, Timo; Taskinen, Jyrki; Bäckström, Reijo; Nissinen, Erkki
1992-06-01
Quantitative structure-activity analysis was carried out for in vitro inhibition of rat brain soluble catechol O-methyltransferase by a series (N=99) of 1,5-substituted-3,4-dihydroxybenzenes using computational chemistry and multivariate PLS modelling of data sets. The molecular structural descriptors (N=19) associated with the electronics of the catecholic ring and sizes of substituents were derived theoretically. For the whole set of molecules two separate PLS models have to be used. A PLS model with two significant (crossvalidated) model dimensions describing 82.2% of the variance in inhibition activity data was capable of predicting all molecules except those having the largest R1 substituent or having a large R5 substituent compared to the NO2 group. The other PLS model with three significant (crossvalidated) model dimensions described 83.3% of the variance in inhibition activity data. This model could not handle compounds having a small R5 substituent, compared to the NO2 group, or the largest R1 substituent. The predictive capability of these PLS models was good. The models reveal that inhibition activity is nonlinearly related to the size of the R5 substituent. The analysis of the PLS models also shows that the binding affinity is greatly dependent on the electronic nature of both R1 and R5 substituents. The electron-withdrawing nature of the substituents enhances inhibition activity. In addition, the size of the R1 substituent and its lipophilicity are important in the binding of inhibitors. The size of the R1 substituent has an upper limit. On the other hand, ionized R1 substituents decrease inhibition activity.
A Cultural Diffusion Model for the Rise and Fall of Programming Languages.
Valverde, Sergi; Solé, Ricard V
2015-07-01
Our interaction with complex computing machines is mediated by programming languages (PLs), which constitute one of the major innovations in the evolution of technology. PLs allow flexible, scalable, and fast use of hardware and are largely responsible for shaping the history of information technology since the rise of computers in the 1950s. The rapid growth and impact of computers were followed closely by the development of PLs. As occurs with natural, human languages, PLs have emerged and gone extinct. There has been always a diversity of coexisting PLs that compete somewhat while occupying special niches. Here we show that the statistical patterns of language adoption, rise, and fall can be accounted for by a simple model in which a set of programmers can use several PLs, decide to use existing PLs used by other programmers, or decide not to use them. Our results highlight the influence of strong communities of practice in the diffusion of PL innovations.
Jiang, Hui; Zhang, Hang; Chen, Quansheng; Mei, Congli; Liu, Guohai
2015-01-01
This study investigated the use of wavelength variable selection before partial least squares discriminant analysis (PLS-DA) for qualitative identification of the degree of solid-state fermentation by FT-NIR spectroscopy. Two wavelength variable selection methods, competitive adaptive reweighted sampling (CARS) and stability competitive adaptive reweighted sampling (SCARS), were employed to select the important wavelengths. PLS-DA was then used to calibrate identification models on the wavelength variables selected by CARS and SCARS. Experimental results showed that CARS and SCARS selected 58 and 47 wavelength variables, respectively, from the 1557 original wavelength variables. Compared with full-spectrum PLS-DA, both wavelength variable selection methods enhanced the performance of the identification models. Compared with the CARS-PLS-DA model, the SCARS-PLS-DA model achieved better results, with an identification rate of 91.43% in the validation process. Overall, the results demonstrate that a PLS-DA model built on wavelength variables selected by a suitable variable selection method can identify the degree of solid-state fermentation more accurately. Copyright © 2015 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Jiang, Hui; Zhang, Hang; Chen, Quansheng; Mei, Congli; Liu, Guohai
2015-10-01
This study investigated the use of wavelength variable selection before partial least squares discriminant analysis (PLS-DA) for qualitative identification of the degree of solid-state fermentation by FT-NIR spectroscopy. Two wavelength variable selection methods, competitive adaptive reweighted sampling (CARS) and stability competitive adaptive reweighted sampling (SCARS), were employed to select the important wavelengths. PLS-DA was then used to calibrate identification models on the wavelength variables selected by CARS and SCARS. Experimental results showed that CARS and SCARS selected 58 and 47 wavelength variables, respectively, from the 1557 original wavelength variables. Compared with full-spectrum PLS-DA, both wavelength variable selection methods enhanced the performance of the identification models. Compared with the CARS-PLS-DA model, the SCARS-PLS-DA model achieved better results, with an identification rate of 91.43% in the validation process. Overall, the results demonstrate that a PLS-DA model built on wavelength variables selected by a suitable variable selection method can identify the degree of solid-state fermentation more accurately.
Casale, M; Oliveri, P; Casolino, C; Sinelli, N; Zunin, P; Armanino, C; Forina, M; Lanteri, S
2012-01-27
An authentication study of the Italian PDO (protected designation of origin) extra virgin olive oil Chianti Classico was performed; UV-visible (UV-vis), Near-Infrared (NIR) and Mid-Infrared (MIR) spectroscopies were applied to a set of samples representative of the whole Chianti Classico production area. The non-selective signals (fingerprints) provided by the three spectroscopic techniques were utilised both individually and jointly, after fusion of the respective profile vectors, in order to build a model for the Chianti Classico PDO olive oil. Moreover, these results were compared with those obtained by the gas chromatographic determination of the fatty acids composition. In order to characterise the olive oils produced in the Chianti Classico PDO area, UNEQ (unequal class models) and SIMCA (soft independent modelling of class analogy) were employed both on the MIR, NIR and UV-vis spectra, individually and jointly, and on the fatty acid composition. Finally, PLS (partial least square) regression was applied on the UV-vis, NIR and MIR spectra, in order to predict the content of oleic and linoleic acids in the extra virgin olive oils. UNEQ, SIMCA and PLS were performed after selection of the relevant predictors, in order to increase the efficiency of both classification and regression models. The non-selective information obtained from UV-vis, NIR and MIR spectroscopy allowed to build reliable models for checking the authenticity of the Italian PDO extra virgin olive oil Chianti Classico. Copyright © 2011 Elsevier B.V. All rights reserved.
Dealing with gene expression missing data.
Brás, L P; Menezes, J C
2006-05-01
A comparative evaluation of different methods for estimating missing values in microarray data is presented: weighted K-nearest neighbours imputation (KNNimpute), regression-based methods such as local least squares imputation (LLSimpute) and partial least squares imputation (PLSimpute), and Bayesian principal component analysis (BPCA). The influence on prediction accuracy of several factors is elucidated, such as the methods' parameters, the type of data relationships used in the estimation process (i.e. row-wise, column-wise or both), the missing rate and pattern, and the type of experiment [time series (TS), non-time series (NTS) or mixed (MIX) experiments]. Improvements are proposed based on the iterative use of data (iterative LLS and PLS imputation--ILLSimpute and IPLSimpute), the need to perform initial imputations (modified PLS and Helland PLS imputation--MPLSimpute and HPLSimpute) and the type of relationships employed (KNNarray, LLSarray, HPLSarray and alternating PLS--APLSimpute). Overall, it is shown that data set properties (type of experiment, missing rate and pattern) affect the data similarity structure, therefore influencing the methods' performance. LLSimpute and ILLSimpute are preferable in the presence of data with a stronger similarity structure (TS and MIX experiments), whereas PLS-based methods (MPLSimpute, IPLSimpute and APLSimpute) are preferable when estimating NTS missing data.
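Of the methods compared above, weighted K-nearest neighbours imputation is the easiest to illustrate. The sketch below is a generic row-wise version written for this note, not the KNNimpute implementation evaluated in the paper: the distance (root mean squared difference over shared observed columns), the inverse-distance weighting and the function name knn_impute are all assumptions.

```python
import math

def knn_impute(rows, k=2):
    """Fill None entries with a distance-weighted mean of the k most similar rows."""
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None:
                continue
            candidates = []
            for other_i, other in enumerate(rows):
                # A neighbour must be a different row with an observed value in column j.
                if other_i == i or other[j] is None:
                    continue
                shared = [c for c in range(len(row))
                          if row[c] is not None and other[c] is not None]
                if not shared:
                    continue
                dist = math.sqrt(
                    sum((row[c] - other[c]) ** 2 for c in shared) / len(shared))
                candidates.append((dist, other[j]))
            if not candidates:
                continue  # leave the gap if no row can inform it
            nearest = sorted(candidates)[:k]
            # Inverse-distance weights; the small constant avoids division by zero.
            weights = [1.0 / (d + 1e-9) for d, _ in nearest]
            filled[i][j] = sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)
    return filled
```

For expression data, rows would typically be genes and columns arrays; the column-wise variants mentioned in the abstract correspond to running the same idea on the transposed matrix.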
[Effect of near infrared spectrum on the precision of PLS model for oil yield from oil shale].
Wang, Zhi-Hong; Liu, Jie; Chen, Xiao-Chao; Sun, Yu-Yang; Yu, Yang; Lin, Jun
2012-10-01
Present methods for measuring the oil yield of oil shale cannot be used for in-situ detection and are unable to meet the requirements of oil shale resource exploration and exploitation. In-situ oil yield analysis of oil shale can, however, be achieved with portable near infrared spectroscopy. Different NIR spectral data formats correlate differently with the contents of sample components, and the absorption characteristics of sample components appear in different NIR spectral regions. Therefore, using proportioned samples, PLS modeling experiments were carried out with 3 data formats (reflectance, absorbance and K-M function) and 4 modeling spectral regions, and the effect of NIR spectral format and region on the precision of the PLS model for oil yield from oil shale was studied. The results show that, for PLS modeling with the proportioned samples, the best data format is reflectance and the best modeling region is the combination spectral range. Therefore, an appropriate data format and a proper characteristic spectral region can increase the precision of the PLS model for oil yield from oil shale.
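The three data formats compared above are related by simple transformations of the measured reflectance R. The sketch below uses the standard textbook forms, apparent absorbance A = log10(1/R) and the Kubelka-Munk function F(R) = (1 - R)² / (2R); the abstract does not state which exact expressions the authors used, so these are assumptions, as are the function names.

```python
import math

def absorbance(reflectance):
    """Apparent absorbance, A = log10(1 / R), valid for 0 < R <= 1."""
    return math.log10(1.0 / reflectance)

def kubelka_munk(reflectance):
    """Kubelka-Munk function, F(R) = (1 - R)^2 / (2 R), valid for 0 < R <= 1."""
    return (1.0 - reflectance) ** 2 / (2.0 * reflectance)

def convert_spectrum(reflectances, fmt):
    """Return one spectrum in the requested format: 'reflectance', 'absorbance' or 'km'."""
    if fmt == "reflectance":
        return list(reflectances)
    if fmt == "absorbance":
        return [absorbance(r) for r in reflectances]
    if fmt == "km":
        return [kubelka_munk(r) for r in reflectances]
    raise ValueError("unknown format: " + fmt)
```

Because both transformations are nonlinear in R, the same samples can give PLS models of different precision depending on the format, which is the effect the study examines.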
Explaining and modeling the concentration and loading of Escherichia coli in a stream-A case study.
Wang, Chaozi; Schneider, Rebecca L; Parlange, Jean-Yves; Dahlke, Helen E; Walter, M Todd
2018-09-01
The Escherichia coli (E. coli) level in streams is a public health indicator. Therefore, being able to explain why E. coli levels are sometimes high and sometimes low is important. Using citizen science data from Fall Creek in central NY, we found that the complementary use of principal component analysis (PCA) and partial least squares (PLS) regression provided insights into the drivers of E. coli and a mechanism for predicting E. coli levels, respectively. We found that stormwater, temperature/season and shallow subsurface flow are the three dominant processes driving the fate and transport of E. coli. PLS regression modeling provided very good predictions under stormwater conditions (R² = 0.85 for log (E. coli concentration) and R² = 0.90 for log (E. coli loading)); predictions under baseflow conditions were less robust. However, in our case, both E. coli concentration and E. coli loading were significantly higher under stormwater conditions, so it is probably more important to predict high-flow E. coli hazards than low-flow conditions. In addition to previously reported good indicators of in-stream E. coli level, nitrate-/nitrite-nitrogen and soluble reactive phosphorus were also found to be good indicators of in-stream E. coli levels. These findings suggest management practices to reduce E. coli concentrations and loads in streams and, eventually, reduce the risk of waterborne disease outbreaks. Copyright © 2018. Published by Elsevier B.V.
Ono, Daiki; Bamba, Takeshi; Oku, Yuichi; Yonetani, Tsutomu; Fukusaki, Eiichiro
2011-09-01
In this study, we constructed prediction models by metabolic fingerprinting of fresh green tea leaves using Fourier transform near-infrared (FT-NIR) spectroscopy and partial least squares (PLS) regression analysis to objectively optimize the steaming process conditions in green tea manufacture. The steaming process is the most important step for manufacturing high quality green tea products. However, the parameter setting of the steamer is currently determined subjectively by the manufacturer. Therefore, a simple and robust system that can be used to objectively set the steaming process parameters is necessary. We focused on FT-NIR spectroscopy because of its simple operation, quick measurement, and low running costs. After removal of noise in the spectral data by principal component analysis (PCA), PLS regression analysis was performed using spectral information as independent variables, and the steaming parameters set by experienced manufacturers as dependent variables. The prediction models were successfully constructed with satisfactory accuracy. Moreover, the results of the demonstration experiment suggested that the green tea steaming process parameters could be predicted on a larger manufacturing scale. This technique will contribute to improving the quality and productivity of green tea because it can objectively optimize the complicated green tea steaming process and will be suitable for practical use in green tea manufacture. Copyright © 2011 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.
González-Sáiz, J M; Esteban-Díez, I; Sánchez-Gallardo, C; Pizarro, C
2008-08-01
Wastes and by-products of the onion-processing industry pose an increasing disposal and environmental problem and represent a loss of valuable sources of nutrients. The present study focused on the production of vinegar from worthless onions as a potential valorisation route which could provide a viable solution to multiple disposal and environmental problems, simultaneously offering the possibility of converting waste materials into a useful food-grade product and of exploiting the unique properties and health benefits of onions. This study deals specifically with the second and definitive step of the onion vinegar production process: the efficient production of vinegar from onion waste by transforming onion ethanol, previously produced by alcoholic fermentation, into acetic acid via acetic fermentation. Near-infrared spectroscopy (NIRS), coupled with multivariate calibration methods, has been used to monitor the concentrations of both substrates and products in acetic fermentation. Separate partial least squares (PLS) regression models, correlating NIR spectral data of fermentation samples with each kinetic parameter studied, were developed. Wavelength selection was also performed applying the iterative predictor weighting-PLS (IPW-PLS) method in order to only consider significant spectral features in each model development to improve the quality of the final models constructed. Biomass, substrate (ethanol) and product (acetic acid) concentration were predicted in the acetic fermentation of onion alcohol with high accuracy using IPW-PLS models with a root-mean-square error of the residuals in external prediction (RMSEP) lower than 2.5% for both ethanol and acetic acid, and an RMSEP of 6.1% for total biomass concentration (a very satisfactory result considering the relatively low precision and accuracy associated with the reference method used for determining the latter). 
Thus, the simple and reliable calibration models proposed in this study suggest that they could be implemented in routine applications to monitor and predict the key species involved in the acetic fermentation of onion alcohol, allowing the onion vinegar production process to be controlled in real time.
Garriga, Miguel; Romero-Bravo, Sebastián; Estrada, Félix; Escobar, Alejandro; Matus, Iván A.; del Pozo, Alejandro; Astudillo, Cesar A.; Lobos, Gustavo A.
2017-01-01
Phenotyping, via remote and proximal sensing techniques, of the agronomic and physiological traits associated with yield potential and drought adaptation could contribute to improvements in breeding programs. In the present study, 384 genotypes of wheat (Triticum aestivum L.) were tested under fully irrigated (FI) and water stress (WS) conditions. The following traits were evaluated and assessed via spectral reflectance: Grain yield (GY), spikes per square meter (SM2), kernels per spike (KPS), thousand-kernel weight (TKW), chlorophyll content (SPAD), stem water soluble carbohydrate concentration and content (WSC and WSCC, respectively), carbon isotope discrimination (Δ13C), and leaf area index (LAI). The performances of spectral reflectance indices (SRIs), four regression algorithms (PCR, PLSR, ridge regression RR, and SVR), and three classification methods (PCA-LDA, PLS-DA, and kNN) were evaluated for the prediction of each trait. For the classification approaches, two classes were established for each trait: The lower 80% of the trait variability range (Class 1) and the remaining 20% (Class 2 or elite genotypes). Both the SRIs and regression methods performed better when data from FI and WS were combined. The traits that were best estimated by SRIs and regression methods were GY and Δ13C. For most traits and conditions, the estimations provided by RR and SVR were the same, or better than, those provided by the SRIs. PLS-DA showed the best performance among the categorical methods and, unlike the SRI and regression models, most traits were relatively well-classified within a specific hydric condition (FI or WS), proving that classification approach is an effective tool to be explored in future studies related to genotype selection. PMID:28337210
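The two-class scheme described above can be read as a threshold placed at 80% of the way from the minimum to the maximum observed trait value. The sketch below follows that reading, which is an interpretation of "the lower 80% of the trait variability range" rather than a confirmed detail of the study; a percentile-based split would give different classes. The function name and the example values are hypothetical.

```python
def split_by_range(values, lower_fraction=0.8):
    """Assign Class 1 to values in the lower part of the range and Class 2 to the rest.

    The threshold sits at min + lower_fraction * (max - min); values strictly
    above it are labelled 2 (the 'elite' class), all others 1.
    """
    lo, hi = min(values), max(values)
    threshold = lo + lower_fraction * (hi - lo)
    return threshold, [2 if v > threshold else 1 for v in values]

# Made-up grain yield values, for illustration only.
threshold, classes = split_by_range([2.0, 4.0, 6.0, 9.0, 12.0])
```

A range-based threshold usually produces unbalanced classes, with few samples in Class 2, which matters when comparing classifiers such as PCA-LDA, PLS-DA and kNN on overall accuracy alone.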
Balabin, Roman M; Lomakina, Ekaterina I
2011-04-21
In this study, we make a general comparison of the accuracy and robustness of five multivariate calibration models: partial least squares (PLS) regression or projection to latent structures, polynomial partial least squares (Poly-PLS) regression, artificial neural networks (ANNs), and two novel techniques based on support vector machines (SVMs) for multivariate data analysis: support vector regression (SVR) and least-squares support vector machines (LS-SVMs). The comparison is based on fourteen (14) different datasets: seven sets of gasoline data (density, benzene content, and fractional composition/boiling points), two sets of ethanol gasoline fuel data (density and ethanol content), one set of diesel fuel data (total sulfur content), three sets of petroleum (crude oil) macromolecules data (weight percentages of asphaltenes, resins, and paraffins), and one set of petroleum resins data (resins content). Vibrational (near-infrared, NIR) spectroscopic data are used to predict the properties and quality coefficients of gasoline, biofuel/biodiesel, diesel fuel, and other samples of interest. The four systems presented here range greatly in composition, properties, strength of intermolecular interactions (e.g., van der Waals forces, H-bonds), colloid structure, and phase behavior. Due to the high diversity of chemical systems studied, general conclusions about SVM regression methods can be made. We try to answer the following question: to what extent can SVM-based techniques replace ANN-based approaches in real-world (industrial/scientific) applications? The results show that both SVR and LS-SVM methods are comparable to ANNs in accuracy. Due to the much higher robustness of the former, the SVM-based approaches are recommended for practical (industrial) application. This has been shown to be especially true for complicated, highly nonlinear objects.
NASA Astrophysics Data System (ADS)
Müller, Aline Lima Hermes; Picoloto, Rochele Sogari; Mello, Paola de Azevedo; Ferrão, Marco Flores; dos Santos, Maria de Fátima Pereira; Guimarães, Regina Célia Lourenço; Müller, Edson Irineu; Flores, Erico Marlon Moraes
2012-04-01
Total sulfur concentration was determined in atmospheric residue (AR) and vacuum residue (VR) samples obtained from the petroleum distillation process by Fourier transform infrared spectroscopy with attenuated total reflectance (FT-IR/ATR) in association with chemometric methods. The calibration and prediction sets consisted of 40 and 20 samples, respectively. Calibration models were developed using two variable selection methods: interval partial least squares (iPLS) and synergy interval partial least squares (siPLS). Different treatments and pre-processing steps were also evaluated for the development of models. Pre-treatment by multiplicative scatter correction (MSC) together with mean centering was selected for model construction. The use of siPLS as the variable selection method provided a model with root mean square error of prediction (RMSEP) values significantly better than those obtained by the PLS model using all variables. The best model was obtained using the siPLS algorithm with spectra divided into 20 intervals and combinations of 3 intervals (911-824, 823-736 and 737-650 cm⁻¹). This model produced an RMSECV of 400 mg kg⁻¹ S and an RMSEP of 420 mg kg⁻¹ S, with a correlation coefficient of 0.990.
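The interval search behind siPLS can be illustrated by enumerating the candidate variable subsets: the spectrum is cut into contiguous intervals and every combination of a few intervals is a candidate model. The sketch below only builds the candidates. The PLS fit and RMSECV ranking of each subset are left out, and the number of spectral variables (1750) and the helper names are assumptions.

```python
from itertools import combinations


def split_intervals(n_variables, n_intervals):
    """Split indices 0..n_variables-1 into contiguous, nearly equal intervals."""
    base, extra = divmod(n_variables, n_intervals)
    intervals, start = [], 0
    for i in range(n_intervals):
        size = base + (1 if i < extra else 0)  # first `extra` intervals get one more
        intervals.append(range(start, start + size))
        start += size
    return intervals


def candidate_subsets(intervals, k):
    """Yield (interval ids, variable indices) for every combination of k intervals."""
    for combo in combinations(range(len(intervals)), k):
        yield combo, [j for i in combo for j in intervals[i]]


# 20 intervals combined 3 at a time, as in the abstract; 1750 variables is hypothetical.
intervals = split_intervals(n_variables=1750, n_intervals=20)
subsets = list(candidate_subsets(intervals, 3))
print(len(subsets))  # C(20, 3) = 1140 candidate models
```

In a full procedure each candidate would be fitted with PLS and the subset with the lowest cross-validated error kept, so the cost grows quickly with the number of intervals and the combination size.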
Multivariate analysis of gamma spectra to characterize used nuclear fuel
DOE Office of Scientific and Technical Information (OSTI.GOV)
Coble, Jamie; Orton, Christopher; Schwantes, Jon
2017-01-17
The Multi-Isotope Process (MIP) Monitor provides an efficient means to monitor the process conditions in used nuclear fuel reprocessing facilities to support process verification and validation. The MIP Monitor applies multivariate analysis to gamma spectroscopy of key stages in the reprocessing stream in order to detect small changes in the gamma spectrum, which may indicate changes in process conditions. This research extends the MIP Monitor by characterizing a used fuel sample after initial dissolution according to the type of reactor of origin (pressurized or boiling water reactor; PWR and BWR, respectively), initial enrichment, burn up, and cooling time. Simulated gamma spectra were used in this paper to develop and test three fuel characterization algorithms. The classification and estimation models employed are based on the partial least squares regression (PLS) algorithm. A PLS discriminant analysis model was developed which perfectly classified reactor type for the three PWR and three BWR reactor designs studied. Locally weighted PLS models were fitted on-the-fly to estimate the remaining fuel characteristics. For the simulated gamma spectra considered, burn up was predicted with 0.1% root mean squared percent error (RMSPE) and both cooling time and initial enrichment with approximately 2% RMSPE. Finally, this approach to automated fuel characterization can be used to independently verify operator declarations of used fuel characteristics and to inform the MIP Monitor anomaly detection routines at later stages of the fuel reprocessing stream to improve sensitivity to changes in operational parameters that may indicate issues with operational control or malicious activities.
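The abstract reports errors as root mean squared percent error (RMSPE) without giving a formula. A common definition, assumed here, expresses each error relative to the true value before squaring and averaging. The function name and the example numbers are illustrative only.

```python
import math


def rmspe(y_true, y_pred):
    """Root mean squared percent error, in percent.

    Assumed definition: 100 * sqrt(mean(((pred - true) / true) ** 2)).
    True values must be non-zero.
    """
    rel = [(p - t) / t for t, p in zip(y_true, y_pred)]
    return 100.0 * math.sqrt(sum(r * r for r in rel) / len(rel))


# Hypothetical values: both predictions are off by 1% of the true value.
print(rmspe([100.0, 200.0], [101.0, 198.0]))
```

Because each error is scaled by its own true value, RMSPE lets quantities with very different magnitudes (burn up, cooling time, enrichment) be compared on one scale.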
NASA Astrophysics Data System (ADS)
Pullanagari, R. R.; Kereszturi, G.; Yule, I. J.
2017-06-01
New Zealand farming relies heavily on grazed pasture for feeding livestock; therefore it is important to provide high quality palatable grass in order to maintain profitable and sustainable grassland management. The presence of non-photosynthetic vegetation (NPV) such as dead vegetation in pastures severely limits the quality and productivity of pastures. Quantifying the fraction of dead vegetation in mixed pastures is a great challenge even with remote sensing approaches. In this study, imaging spectroscopy data from AisaFENIX (380-2500 nm), with a pixel resolution of 1 m and a spectral resolution of 3.5-5.6 nm, were used to assess the fraction of the dead vegetation component in mixed pastures on a hill country farm in New Zealand. We used different methods to retrieve the dead vegetation fraction from the spectra: narrow band vegetation indices, full spectrum based partial least squares (PLS) regression and feature selection based PLS regression. Among all approaches, the feature selection based PLS model exhibited the best performance in terms of prediction accuracy (R²CV = 0.73, RMSECV = 6.05, RPDCV = 2.25). The results were consistent with validation data, and the model also performed well on the external test data (R² = 0.62, RMSE = 8.06, RPD = 2.06). In addition, statistical tests were conducted to ascertain the effect of topographical variables such as slope and aspect on the accumulation of the dead vegetation fraction. Steep slopes (>25°) had a significantly (p < 0.05) higher amount of dead vegetation. In contrast, aspect showed no significant impact on dead vegetation accumulation. The results from the study indicate that AisaFENIX imaging spectroscopy data could be a useful tool for mapping the dead vegetation fraction accurately.
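The three figures of merit quoted above (R², RMSE and RPD) are related, and a short sketch makes the relationship explicit. RPD is taken here as the standard deviation of the reference values divided by the RMSE. Using the sample standard deviation is an assumption, since conventions differ, and the data are invented.

```python
import math
import statistics


def rmse(y_true, y_pred):
    """Root mean squared error of the predictions."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean = statistics.mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rpd(y_true, y_pred):
    """Ratio of performance to deviation: SD of reference values over RMSE."""
    return statistics.stdev(y_true) / rmse(y_true, y_pred)


# Hypothetical reference and predicted dead-vegetation fractions (%).
reference = [10.0, 20.0, 30.0, 40.0, 50.0]
predicted = [12.0, 18.0, 33.0, 37.0, 50.0]
print(rmse(reference, predicted), r_squared(reference, predicted), rpd(reference, predicted))
```

Under this definition, the cross-validation values in the abstract (RPD of 2.25 with an RMSECV of 6.05) would correspond to a reference standard deviation of roughly 13.6, which is a quick way to sanity-check reported statistics.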
NASA Astrophysics Data System (ADS)
de Santana, Felipe Bachion; de Souza, André Marcelo; Poppi, Ronei Jesus
2018-02-01
This study evaluates the use of visible and near infrared spectroscopy (Vis-NIRS) combined with multivariate regression based on random forest to quantify several soil quality parameters. The parameters analyzed were soil cation exchange capacity (CEC), sum of exchange bases (SB), organic matter (OM), clay and sand present in the soils of several regions of Brazil. Current methods for evaluating these parameters are laborious and time-consuming, and require various wet analytical methods that are not adequate for use in precision agriculture, where faster and automatic responses are required. The random forest regression models were statistically better than PLS regression models for CEC, OM, clay and sand, demonstrating resistance to overfitting, attenuating the effect of outlier samples and indicating the most important variables for the model. The methodology demonstrates the potential of Vis-NIR as an alternative for determination of CEC, SB, OM, sand and clay, making it possible to develop a fast and automatic analytical procedure.
Žuvela, Petar; Liu, J Jay; Macur, Katarzyna; Bączek, Tomasz
2015-10-06
In this work, performance of five nature-inspired optimization algorithms, genetic algorithm (GA), particle swarm optimization (PSO), artificial bee colony (ABC), firefly algorithm (FA), and flower pollination algorithm (FPA), was compared in molecular descriptor selection for development of quantitative structure-retention relationship (QSRR) models for 83 peptides that originate from eight model proteins. The matrix with 423 descriptors was used as input, and QSRR models based on selected descriptors were built using partial least squares (PLS), whereas root mean square error of prediction (RMSEP) was used as a fitness function for their selection. Three performance criteria, prediction accuracy, computational cost, and the number of selected descriptors, were used to evaluate the developed QSRR models. The results show that all five variable selection methods outperform interval PLS (iPLS), sparse PLS (sPLS), and the full PLS model, whereas GA is superior because of its lowest computational cost and higher accuracy (RMSEP of 5.534%) with a smaller number of variables (nine descriptors). The GA-QSRR model was validated initially through Y-randomization. In addition, it was successfully validated with an external testing set out of 102 peptides originating from Bacillus subtilis proteomes (RMSEP of 22.030%). Its applicability domain was defined, from which it was evident that the developed GA-QSRR exhibited strong robustness. All the sources of the model's error were identified, thus allowing for further application of the developed methodology in proteomics.
An Improved Incremental Learning Approach for KPI Prognosis of Dynamic Fuel Cell System.
Yin, Shen; Xie, Xiaochen; Lam, James; Cheung, Kie Chung; Gao, Huijun
2016-12-01
The key performance indicator (KPI) has an important practical value with respect to the product quality and economic benefits for modern industry. To cope with the KPI prognosis issue under nonlinear conditions, this paper presents an improved incremental learning approach based on available process measurements. The proposed approach takes advantage of the algorithm overlapping of locally weighted projection regression (LWPR) and partial least squares (PLS), implementing the PLS-based prognosis in each locally linear model produced by the incremental learning process of LWPR. The global prognosis results including KPI prediction and process monitoring are obtained from the corresponding normalized weighted means of all the local models. The statistical indicators for prognosis are enhanced as well by the design of novel KPI-related and KPI-unrelated statistics with suitable control limits for non-Gaussian data. For application-oriented purpose, the process measurements from real datasets of a proton exchange membrane fuel cell system are employed to demonstrate the effectiveness of KPI prognosis. The proposed approach is finally extended to a long-term voltage prediction for potential reference of further fuel cell applications.
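The step in which global results are obtained as normalized weighted means of the local models can be sketched as below. The Gaussian weighting kernel, the tuple layout of a local model and the two toy linear models are assumptions made for illustration. The paper's PLS-based local models and monitoring statistics are not reproduced.

```python
import math


def gaussian_weight(x, center, width):
    """Activation of a local model: Gaussian in the distance from its center."""
    d2 = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
    return math.exp(-0.5 * d2 / width ** 2)


def global_prediction(x, local_models):
    """Normalized weighted mean of the local predictions.

    `local_models` is a list of (center, width, predict_fn) tuples, where
    predict_fn maps an input vector to a scalar prediction.
    """
    weights = [gaussian_weight(x, c, w) for c, w, _ in local_models]
    preds = [f(x) for _, _, f in local_models]
    total = sum(weights)  # assumed non-zero: x lies within reach of some model
    return sum(w * p for w, p in zip(weights, preds)) / total


# Two hypothetical locally linear models centered at 0 and 2.
models = [
    ([0.0], 1.0, lambda x: 1.0 + x[0]),
    ([2.0], 1.0, lambda x: 5.0 - x[0]),
]
# The query point is equidistant from both centers, so the result is the plain mean.
result = global_prediction([1.0], models)
print(result)
```

Normalizing by the sum of weights keeps the blended output on the scale of the local predictions, so adding or removing a distant local model barely changes the result near the query point.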
Chemometric studies on potential larvicidal compounds against Aedes aegypti.
Scotti, Luciana; Scotti, Marcus Tullius; Silva, Viviane Barros; Santos, Sandra Regina Lima; Cavalcanti, Sócrates C H; Mendonça, Francisco J B
2014-03-01
The mosquito Aedes aegypti (Diptera, Culicidae) is the vector of yellow and dengue fever. In this study, chemometric tools such as Principal Component Analysis (PCA), Consensus PCA (CPCA), and Partial Least Squares Regression (PLS) were applied to a set of fifty-five compounds active against Ae. aegypti larvae, including terpenes, cyclic alcohols, phenolic compounds, and their synthetic derivatives. The calculations were performed using the VolSurf+ program. CPCA analysis suggests that the higher weight blocks of descriptors were SIZE/SHAPE, DRY, and H2O. The PCA was generated with 48 descriptors selected from the previous blocks. The scores plot showed good separation between more and less potent compounds. The first two PCs accounted for over 60% of the data variance. The best PLS model, after leave-one-out validation, exhibited q² = 0.679 and r² = 0.714. External prediction gave R² = 0.623. The independent variables having a hydrophobic profile were strongly correlated to the biological data. The interaction maps generated with the GRID force field showed that the most active compounds exhibit more interaction with the DRY probe.
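The leave-one-out q² quoted above is commonly computed as 1 - PRESS/SS, where PRESS sums the squared errors of predictions made for each compound by a model fitted without it. The sketch below shows that calculation with a one-descriptor straight-line fit standing in for PLS. The stand-in model, the function names and the data are assumptions, and the use of the overall mean in SS is one of several conventions.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def q2_loo(x, y):
    """Leave-one-out cross-validated q2 = 1 - PRESS / SS_total."""
    press = 0.0
    for i in range(len(x)):
        # Refit without sample i, then predict the held-out sample.
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    my = sum(y) / len(y)
    ss_total = sum((yi - my) ** 2 for yi in y)
    return 1.0 - press / ss_total


# Hypothetical descriptor values and activities.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
activity = [1.2, 1.9, 3.2, 3.8, 5.1, 6.1]
q2 = q2_loo(descriptor, activity)
print(q2)
```

Since every prediction comes from a model that never saw the held-out compound, q² is normally lower than the fitted r², as in the abstract (0.679 versus 0.714).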
Pérez-Castaño, Estefanía; Sánchez-Viñas, Mercedes; Gázquez-Evangelista, Domingo; Bagur-González, M Gracia
2018-01-15
This paper describes and discusses the application of chromatographic fingerprints of trimethylsilyl (TMS)-4,4'-desmethylsterol derivatives (obtained from an off-line HPLC-GC-FID system) to the quantification of extra virgin olive oil in commercial vinaigrettes, salad dressings and in-house reference materials (i-HRM) using two different Partial Least Squares Regression (PLS-R) multivariate quantification methods. Different data pre-processing strategies were evaluated, the complete sequence being: (i) internal normalization; (ii) sampling based on the Nyquist theorem; (iii) interval correlation optimized shifting (icoshift); (iv) baseline correction; (v) mean centering; and (vi) zone selection. The first model corresponds to a matrix of dimensions 'n×911' variables and the second one to a matrix of dimensions 'n×431' variables. Notably, the two proposed PLS-R models allow the quantification of extra virgin olive oil in binary blends, foodstuffs, etc., when the provided percentage is greater than 25%. Copyright © 2017 Elsevier Ltd. All rights reserved.
Wang, Xiao; Esquerre, Carlos; Downey, Gerard; Henihan, Lisa; O'Callaghan, Donal; O'Donnell, Colm
2018-06-01
In this study, visible and near-infrared (Vis-NIR), mid-infrared (MIR) and Raman process analytical technologies were investigated for assessment of infant formula quality and compositional parameters namely preheat temperature, storage temperature, storage time, fluorescence of advanced Maillard products and soluble tryptophan (FAST) index, soluble protein, fat and surface free fat (SFF) content. PLS-DA models developed using spectral data with appropriate data pre-treatment and significant variables selected using Martens' uncertainty test had good accuracy for the discrimination of preheat temperature (92.3-100%) and storage temperature (91.7-100%). The best PLS regression models developed yielded values for the ratio of prediction error to deviation (RPD) of 3.6-6.1, 2.1-2.7, 1.7-2.9, 1.6-2.6 and 2.5-3.0 for storage time, FAST index, soluble protein, fat and SFF content prediction respectively. Vis-NIR, MIR and Raman were demonstrated to be potential PAT tools for process control and quality assurance applications in infant formula and dairy ingredient manufacture. Copyright © 2018 Elsevier B.V. All rights reserved.
Sandoval, S; Torres, A; Pawlowsky-Reusing, E; Riechel, M; Caradot, N
2013-01-01
The present study aims to explore the relationship between rainfall variables and water quality/quantity characteristics of combined sewer overflows (CSOs), by the use of multivariate statistical methods and online measurements at a principal CSO outlet in Berlin (Germany). Canonical correlation results showed that the maximum and average rainfall intensities are the most influential variables to describe CSO water quantity and pollutant loads whereas the duration of the rainfall event and the rain depth seem to be the most influential variables to describe CSO pollutant concentrations. The analysis of partial least squares (PLS) regression models confirms the findings of the canonical correlation and highlights three main influences of rainfall on CSO characteristics: (i) CSO water quantity characteristics are mainly influenced by the maximal rainfall intensities, (ii) CSO pollutant concentrations were found to be mostly associated with duration of the rainfall and (iii) pollutant loads seemed to be principally influenced by dry weather duration before the rainfall event. The prediction quality of PLS models is rather low (R² < 0.6) but results can be useful to explore qualitatively the influence of rainfall on CSO characteristics.
Jović, Ozren
2016-12-15
A novel method for quantitative prediction and variable selection on spectroscopic data, called Durbin-Watson partial least-squares regression (dwPLS), is proposed in this paper. The idea is to inspect serial correlation in infrared data that is known to consist of highly correlated neighbouring variables. The method selects only those variables whose intervals have a lower Durbin-Watson statistic (dw) than a certain optimal cutoff. For each interval, dw is calculated on a vector of regression coefficients. Adulteration of cold-pressed linseed oil (L), a well-known nutrient beneficial to health, is studied in this work by mixing it with cheaper oils: rapeseed oil (R), sesame oil (Se) and sunflower oil (Su). The samples for each botanical origin of oil vary with respect to producer, content and geographic origin. The results obtained indicate that MIR-ATR combined with dwPLS could be used for quantitative determination of edible-oil adulteration. Copyright © 2016 Elsevier Ltd. All rights reserved.
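The selection rule described above can be sketched with the textbook Durbin-Watson formula applied to the regression coefficients of each interval: smooth, serially correlated coefficients give a value near 0, while rapidly alternating ones approach 4. How the paper forms the intervals and chooses the cutoff is not stated in the abstract, so the fixed interval size, the cutoff of 1.0 and the coefficient values below are assumptions.

```python
def durbin_watson(values):
    """Durbin-Watson statistic: sum of squared successive differences
    divided by the sum of squares of the values."""
    num = sum((b - a) ** 2 for a, b in zip(values, values[1:]))
    den = sum(v ** 2 for v in values)
    return num / den


def select_variables(coefficients, interval_size, cutoff):
    """Keep the indices of every interval whose dw is below the cutoff."""
    selected = []
    for start in range(0, len(coefficients), interval_size):
        block = coefficients[start:start + interval_size]
        if durbin_watson(block) < cutoff:
            selected.extend(range(start, start + len(block)))
    return selected


# Hypothetical regression coefficients: a smooth band followed by a noisy one.
smooth = [1.0, 1.1, 1.2, 1.1, 1.0]
noisy = [1.0, -1.0, 1.0, -1.0, 1.0]
selected = select_variables(smooth + noisy, interval_size=5, cutoff=1.0)
print(durbin_watson(smooth), durbin_watson(noisy), selected)
```

The low-dw criterion favours spectral regions where neighbouring coefficients vary smoothly, which is the behaviour expected of genuine absorption bands rather than noise.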
Fassihi, Afshin; Sabet, Razieh
2008-01-01
Quantitative relationships between molecular structure and p56lck protein tyrosine kinase inhibitory activity of 50 flavonoid derivatives are discovered by MLR and GA-PLS methods. Different QSAR models revealed that substituent electronic descriptor (SED) parameters have significant impact on protein tyrosine kinase inhibitory activity of the compounds. Between the two statistical methods employed, GA-PLS gave superior results. The resultant GA-PLS model had a high statistical quality (R² = 0.74 and Q² = 0.61) for predicting the activity of the inhibitors. The models proposed in the present work are more useful in describing QSAR of flavonoid derivatives as p56lck protein tyrosine kinase inhibitors than those provided previously. PMID:19325836
Marsillas, Sara; De Donder, Liesbeth; Kardol, Tinie; van Regenmortel, Sofie; Dury, Sarah; Brosens, Dorien; Smetcoren, An-Sofie; Braña, Teresa; Varela, Jesús
2017-09-01
Several debates have emerged across the literature about the conceptualisation of active ageing. The aim of this study is to develop a model of the construct that is focused on the individual, including different elements of people's lives that have the potential to be modified by intervention programs. Moreover, the paper examines the contributions of active ageing to life satisfaction, as well as the possible predictive role of coping styles on active ageing. For this purpose, a representative sample of 404 Galician (Spain) community-dwelling older adults (aged ≥60 years) were interviewed using a structured survey. The results demonstrate that the proposed model composed of two broad categories is valid. The model comprises status variables (related to physical, psychological, and social health) as well as different types of activities, called processual variables. This model is tested using partial least squares (PLS) regression. The findings show that active ageing is a fourth-order, formative construct. In addition, PLS analyses indicate that active ageing has a moderate and positive path on life satisfaction and that coping styles may predict active ageing. The discussion highlights the potential of active ageing as a relevant concept for people's lives, drawing out policy implications and suggestions for further research.
Comparison of three chemometrics methods for near-infrared spectra of glucose in the whole blood
NASA Astrophysics Data System (ADS)
Zhang, Hongyan; Ding, Dong; Li, Xin; Chen, Yu; Tang, Yuguo
2005-01-01
Principal Component Regression (PCR), Partial Least Squares (PLS) and Artificial Neural Network (ANN) methods are used to analyze the near infrared (NIR) spectra of glucose in whole blood. With each method, the calibration model is built in the spectral band where glucose absorbs much more strongly than water, fat, and protein, and the correlation coefficients of the models are reported in this paper. By comparing these results, a suitable method for analyzing the NIR spectrum of glucose in whole blood is identified.
NASA Astrophysics Data System (ADS)
Shi, Ji-yong; Zou, Xiao-bo; Zhao, Jie-wen; Mel, Holmes; Wang, Kai-liang; Wang, Xue; Chen, Hong
Total flavonoids content is often considered an important quality index of Ginkgo biloba leaf. The feasibility of using near infrared (NIR) spectra in the range 10,000-4000 cm⁻¹ for rapid and nondestructive determination of total flavonoids content in G. biloba leaf was investigated. 120 fresh G. biloba leaves of different colors (green, green-yellowish and yellow) were used for spectra acquisition and total flavonoids determination. Partial least squares (PLS), interval partial least squares (iPLS) and synergy interval partial least squares (SiPLS) were used to develop calibration models for total flavonoids content in leaves of two colors (green-yellowish and yellow) and of three colors (green, green-yellowish and yellow), respectively. Total flavonoids content increased in the order green, green-yellowish, yellow. Two characteristic wavelength regions (5840-6090 cm⁻¹ and 6620-6880 cm⁻¹), which corresponded to the absorptions of two aromatic rings in the basic flavonoid structure, were selected by SiPLS. The optimal SiPLS model for total flavonoids content in the two-color leaves (r² = 0.82, RMSEP = 2.62 mg g⁻¹) performed better than the PLS and iPLS models. It could be concluded that NIR spectroscopy has significant potential in the nondestructive determination of total flavonoids content in fresh G. biloba leaf.
Kinoshita, Kodzue; Kuze, Noko; Kobayashi, Toshio; Miyakawa, Etsuko; Narita, Hiromitsu; Inoue-Murayama, Miho; Idani, Gen'ichi; Tsenkova, Roumiana
2016-01-01
For promoting in situ conservation, it is important to estimate the density distribution of fertile individuals, and there is a need for developing an easy monitoring method to discriminate between physiological states. To date, physiological state has generally been determined by measuring hormone concentration using radioimmunoassay or enzyme immunoassay (EIA) methods. However, these methods have rarely been applied in situ because of the requirements for a large amount of reagent, instruments, and a radioactive isotope. In addition, the proper storage of the sample (including urine and feces) on site until analysis is difficult. On the other hand, near infrared (NIR) spectroscopy requires no reagent and enables rapid measurement. In the present study, we attempted urinary NIR spectroscopy to determine the estrogen levels of orangutans in Japanese zoos and in the Danum Valley Conservation Area, Sabah, Malaysia. Reflectance NIR spectra were obtained from urine stored using a filter paper. Filter paper is easy to use to store dried urine, even in the wild. Urinary estrogen and creatinine concentrations measured by EIA were used as the reference data for partial least squares (PLS) regression of urinary NIR spectra. High accuracies (R² > 0.68) were obtained in both estrogen and creatinine regression models. In addition, the PLS regressions in both standards showed higher accuracies (R² > 0.70). Therefore, the present study demonstrates that urinary NIR spectra have the potential to estimate the estrogen and creatinine concentrations.
Infrared microspectroscopic determination of collagen cross-links in articular cartilage
NASA Astrophysics Data System (ADS)
Rieppo, Lassi; Kokkonen, Harri T.; Kulmala, Katariina A. M.; Kovanen, Vuokko; Lammi, Mikko J.; Töyräs, Juha; Saarakkala, Simo
2017-03-01
Collagen forms an organized network in articular cartilage to give tensile stiffness to the tissue. Due to its long half-life, collagen is susceptible to cross-links caused by advanced glycation end-products. The current standard method for determination of cross-link concentrations in tissues is the destructive high-performance liquid chromatography (HPLC). The aim of this study was to analyze the cross-link concentrations nondestructively from standard unstained histological articular cartilage sections by using Fourier transform infrared (FTIR) microspectroscopy. Half of the bovine articular cartilage samples (n=27) were treated with threose to increase the collagen cross-linking while the other half (n=27) served as a control group. Partial least squares (PLS) regression with variable selection algorithms was used to predict the cross-link concentrations from the measured average FTIR spectra of the samples, and HPLC was used as the reference method for cross-link concentrations. The correlation coefficients between the PLS regression models and the biochemical reference values were r=0.84 (p<0.001), r=0.87 (p<0.001) and r=0.92 (p<0.001) for hydroxylysyl pyridinoline (HP), lysyl pyridinoline (LP), and pentosidine (Pent) cross-links, respectively. The study demonstrated that FTIR microspectroscopy is a feasible method for investigating cross-link concentrations in articular cartilage.
NASA Astrophysics Data System (ADS)
Yeganeh, B.; Motlagh, M. Shafie Pour; Rashidi, Y.; Kamalan, H.
2012-08-01
Due to the health impacts caused by exposure to air pollutants in urban areas, monitoring and forecasting of air quality parameters have become an important topic in atmospheric and environmental research. The dynamics and complexity of air pollutant behavior have made artificial intelligence models a useful tool for more accurate pollutant concentration prediction. This paper focuses on an innovative method of daily air pollution prediction that combines a Support Vector Machine (SVM) as predictor with Partial Least Squares (PLS) as a data selection tool, based on measured CO concentrations. The CO concentrations of the Rey monitoring station in the south of Tehran, from Jan. 2007 to Feb. 2011, have been used to test the effectiveness of this method. The hourly CO concentrations have been predicted using the SVM and the hybrid PLS-SVM models. Similarly, daily CO concentrations have been predicted based on the aforementioned four years of measured data. Results demonstrated that both models have good prediction ability; however, the hybrid PLS-SVM has better accuracy. In the analysis presented in this paper, statistical estimators including relative mean errors, root mean squared errors and the mean absolute relative error have been employed to compare the performance of the models. It has been concluded that the errors decrease after size reduction and that the coefficient of determination increases from 56-81% for the SVM model to 65-85% for the hybrid PLS-SVM model. It was also found that the hybrid PLS-SVM model required less computational time than the SVM model, as expected, supporting the more accurate and faster prediction ability of the hybrid PLS-SVM model.
NASA Astrophysics Data System (ADS)
Liu, Ronghua; Sun, Qiaofeng; Hu, Tian; Li, Lian; Nie, Lei; Wang, Jiayue; Zhou, Wanhui; Zang, Hengchang
2018-03-01
As a powerful process analytical technology (PAT) tool, near infrared (NIR) spectroscopy has been widely used in real-time monitoring. In this study, NIR spectroscopy was applied to monitor multiple parameters of the traditional Chinese medicine (TCM) Shenzhiling oral liquid during the concentration process to guarantee product quality. Five lab-scale batches were employed to construct quantitative models to determine five chemical ingredients and a physical change (sample density) during the concentration process. Paeoniflorin, albiflorin, liquiritin and sample density were modeled by partial least squares regression (PLSR), while the contents of glycyrrhizic acid and cinnamic acid were modeled by support vector machine regression (SVMR). Standard normal variate (SNV) and/or Savitzky-Golay (SG) smoothing with derivative methods were adopted for spectral pretreatment. Variable selection methods including correlation coefficient (CC), competitive adaptive reweighted sampling (CARS) and interval partial least squares regression (iPLS) were used to optimize the models. The results indicated that NIR spectroscopy is an effective tool for monitoring the concentration process of Shenzhiling oral liquid.
Partial least squares based identification of Duchenne muscular dystrophy specific genes.
An, Hui-bo; Zheng, Hua-cheng; Zhang, Li; Ma, Lin; Liu, Zheng-yan
2013-11-01
Large-scale parallel gene expression analysis has made it easier to investigate the underlying mechanisms of Duchenne muscular dystrophy (DMD). Previous studies typically implemented variance/regression analysis, which would be fundamentally flawed when unaccounted sources of variability in the arrays existed. Here we aim to identify genes that contribute to the pathology of DMD using partial least squares (PLS) based analysis, carried out on two datasets downloaded from the Gene Expression Omnibus (GEO) database. In addition to the genes related to inflammation, muscle regeneration and extracellular matrix (ECM) modeling, we found some genes with high fold change that have not been identified by previous studies, such as SRPX, GPNMB, SAT1, and LYZ. Downregulation of the fatty acid metabolism pathway was also found, which may be related to the progressive muscle wasting process. Our results provide a better understanding of the downstream mechanisms of DMD.
NASA Astrophysics Data System (ADS)
Wu, Di; He, Yong
2007-11-01
The aim of this study is to investigate the potential of the visible and near infrared spectroscopy (Vis/NIRS) technique for non-destructive measurement of soluble solids content (SSC) in grape juice beverage. A total of 380 samples were studied. Savitzky-Golay smoothing and standard normal variate were applied for the pre-processing of spectral data. Least-squares support vector machines (LS-SVM) with an RBF kernel function was applied to develop the SSC prediction model based on the Vis/NIRS absorbance data. The determination coefficient for prediction (Rp2) of the LS-SVM model was 0.962 and the root mean square error of prediction (RMSEP) was 0.434137. It is concluded that the Vis/NIRS technique can quantify the SSC of grape juice beverage quickly and non-destructively. The LS-SVM model was also compared with PLS and back propagation neural network (BP-NN) methods. The results showed that LS-SVM was superior to the conventional linear and non-linear methods in predicting the SSC of grape juice beverage. The generalization ability of the LS-SVM, PLS and BP-NN models was also investigated. It is concluded that the LS-SVM regression method is a promising technique for chemometrics in quantitative prediction.
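Standard normal variate, one of the pre-processing steps named in the abstract above, can be sketched in a few lines. This is a minimal illustration assuming each spectrum is a plain list of absorbance values and that the sample standard deviation (n - 1 denominator) is used; the function name `snv` and the example values are hypothetical and not taken from the study.

```python
import math


def snv(spectrum):
    """Standard normal variate: center one spectrum on its own mean and
    scale it by its own sample standard deviation (n - 1 denominator)."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in spectrum) / (n - 1))
    return [(x - mean) / std for x in spectrum]


# Two hypothetical spectra that differ only by a multiplicative factor
# (a simple stand-in for a scatter effect) map to the same SNV result.
raw = [1.0, 2.0, 3.0]
scaled = [2.0, 4.0, 6.0]
print(snv(raw))
print(snv(scaled))
```

Because the correction is applied to each spectrum independently, it needs no reference spectrum, which is the main practical difference from multiplicative scatter correction.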
Fadzlillah, Nurrulhidayah Ahmad; Rohman, Abdul; Ismail, Amin; Mustafa, Shuhaimi; Khatib, Alfi
2013-01-01
In the dairy product sector, butter is a potential source of the fat-soluble vitamins A, D, E and K; consequently, butter commands a higher price than other dairy products. This has attracted unscrupulous market players to blend butter with other animal fats for economic gain. Animal fats such as mutton fat (MF) can readily be mixed with butter because of their similar fatty acid composition. This study focused on the application of FTIR-ATR spectroscopy in conjunction with chemometrics for the classification and quantification of MF as an adulterant in butter. The FTIR spectral region of 3910-710 cm⁻¹ was used for classification between butter and butter blended with MF at various concentrations with the aid of discriminant analysis (DA). DA was able to classify butter and adulterated butter without any misclassification. For quantitative analysis, partial least squares (PLS) regression was used to develop a calibration model in the frequency region of 3910-710 cm⁻¹. The equation obtained for the relationship between the actual value of MF and the FTIR-predicted value of MF in the PLS calibration model was y = 0.998x + 1.033, with a coefficient of determination (R²) of 0.998 and a root mean square error of calibration of 0.046% (v/v). The PLS calibration model was subsequently used for the prediction of independent samples containing butter in binary mixtures with MF. Using 9 principal components, the root mean square error of prediction (RMSEP) was 1.68% (v/v). The results showed that FTIR spectroscopy can be used for the classification and quantification of MF in butter formulations for verification purposes.
Fourier transform infrared spectroscopy for Kona coffee authentication.
Wang, Jun; Jun, Soojin; Bittenbender, H C; Gautz, Loren; Li, Qing X
2009-06-01
Kona coffee, the variety of "Kona typica" grown in the north and south districts of Kona-Island, carries a unique stamp of the region of Big Island of Hawaii, U.S.A. The excellent quality of Kona coffee makes it among the best coffee products in the world. Fourier transform infrared (FTIR) spectroscopy integrated with an attenuated total reflectance (ATR) accessory and multivariate analysis was used for qualitative and quantitative analysis of ground and brewed Kona coffee and blends made with Kona coffee. The calibration set of Kona coffee consisted of 10 different blends of Kona-grown original coffee mixture from 14 different farms in Hawaii and a non-Kona-grown original coffee mixture from 3 different sampling sites in Hawaii. Derivative transformations (1st and 2nd), mathematical enhancements such as mean centering and variance scaling, multivariate regressions by partial least square (PLS), and principal components regression (PCR) were implemented to develop and enhance the calibration model. The calibration model was successfully validated using 9 synthetic blend sets of 100% Kona coffee mixture and its adulterant, 100% non-Kona coffee mixture. There were distinct peak variations of ground and brewed coffee blends in the spectral "fingerprint" region between 800 and 1900 cm(-1). The PLS-2nd derivative calibration model based on brewed Kona coffee with mean centering data processing showed the highest degree of accuracy with the lowest standard error of calibration value of 0.81 and the highest R(2) value of 0.999. The model was further validated by quantitative analysis of commercial Kona coffee blends. Results demonstrate that FTIR can be a rapid alternative to authenticate Kona coffee, which only needs very quick and simple sample preparations.
Oliveri, Paolo; López, M Isabel; Casolino, M Chiara; Ruisánchez, Itziar; Callao, M Pilar; Medini, Luca; Lanteri, Silvia
2014-12-03
A new class-modeling method, referred to as partial least squares density modeling (PLS-DM), is presented. The method is based on partial least squares (PLS), using a distance-based sample density measurement as the response variable. Potential function probability density is subsequently calculated on PLS scores and used, jointly with residual Q statistics, to develop efficient class models. The influence of adjustable model parameters on the resulting performance has been critically studied by means of cross-validation and application of the Pareto optimality criterion. The method has been applied to verify the authenticity of olives in brine from cultivar Taggiasca, based on near-infrared (NIR) spectra recorded on homogenized solid samples. Two independent test sets were used for model validation. The final optimal model was characterized by high efficiency and a good balance between sensitivity and specificity when compared with the values obtained by well-established class-modeling methods, such as soft independent modeling of class analogy (SIMCA) and unequal dispersed classes (UNEQ). Copyright © 2014 Elsevier B.V. All rights reserved.
Shao, Limin; Griffiths, Peter R; Leytem, April B
2010-10-01
The automated quantification of three greenhouse gases, ammonia, methane, and nitrous oxide, in the vicinity of a large dairy farm by open-path Fourier transform infrared (OP/FT-IR) spectrometry at intervals of 5 min is demonstrated. Spectral pretreatment, including the automated detection and correction of the effect of interruption of the infrared beam by a moving object and the automated correction for the nonlinear detector response, is applied to the measured interferograms. Two ways of obtaining quantitative data from OP/FT-IR data are described. The first, which is installed in a recently acquired commercial OP/FT-IR spectrometer, is based on classical least-squares (CLS) regression, and the second is based on partial least-squares (PLS) regression. It is shown that CLS regression only gives accurate results if the absorption features of the analytes are located in very short spectral intervals where lines due to atmospheric water vapor are absent or very weak; of the three analytes examined, only ammonia fell into this category. On the other hand, PLS regression allowed what appeared to be accurate results to be obtained for all three analytes.
Prediction models for Arabica coffee beverage quality based on aroma analyses and chemometrics.
Ribeiro, J S; Augusto, F; Salva, T J G; Ferreira, M M C
2012-11-15
In this work, soft modeling based on chemometric analyses of coffee beverage sensory data and the chromatographic profiles of volatile roasted coffee compounds is proposed to predict the scores of acidity, bitterness, flavor, cleanliness, body, and overall quality of the coffee beverage. A partial least squares (PLS) regression method was used to construct the models. The ordered predictor selection (OPS) algorithm was applied to select the compounds for the regression model of each sensory attribute in order to take only significant chromatographic peaks into account. The prediction errors of these models, using 4 or 5 latent variables, were equal to 0.28, 0.33, 0.35, 0.33, 0.34 and 0.41 for the respective attributes and were compatible with the errors of the mean scores of the experts. Thus, the results proved the feasibility of using a similar methodology in on-line or routine applications to predict the sensory quality of Brazilian Arabica coffee. Copyright © 2012 Elsevier B.V. All rights reserved.
Jin, Xiaoli; Shi, Chunhai; Yu, Chang Yeon; ...
2017-05-19
Leaf water content is one of the most common physiological parameters limiting efficiency of photosynthesis and biomass productivity in plants including Miscanthus. Therefore, it is of great significance to determine or predict the water content quickly and non-destructively. In this study, we explored the relationship between leaf water content and diffuse reflectance spectra in Miscanthus. Three multivariate calibrations including partial least squares (PLS), least squares support vector machine regression (LSSVR), and radial basis function (RBF) neural network (NN) were developed for the models of leaf water content determination. The non-linear models including RBF_LSSVR and RBF_NN showed higher accuracy than the PLS and Lin_LSSVR models. Moreover, 75 sensitive wavelengths were identified to be closely associated with the leaf water content in Miscanthus. The RBF_LSSVR and RBF_NN models for predicting leaf water content, based on 75 characteristic wavelengths, obtained the high determination coefficients of 0.9838 and 0.9899, respectively. The results indicated the non-linear models were more accurate than the linear models using both wavelength intervals. These results demonstrated that visible and near-infrared (VIS/NIR) spectroscopy combined with RBF_LSSVR or RBF_NN is a useful, non-destructive tool for determinations of the leaf water content in Miscanthus, and thus very helpful for development of drought-resistant varieties in Miscanthus.
Quantification of adulterations in extra virgin flaxseed oil using MIR and PLS.
de Souza, Letícia Maria; de Santana, Felipe Bachion; Gontijo, Lucas Caixeta; Mazivila, Sarmento Júnior; Borges Neto, Waldomiro
2015-09-01
This paper proposes a new method for the quantitative analysis of soybean oil (SO) and sunflower oil (SFO) as adulterants in extra virgin flaxseed oil (EFO) by applying Mid Infrared Spectroscopy (MIR) combined with the chemometric technique of Partial Least Squares (PLS). The PLS models were built in accordance with standard method ASTM E1655-05 and showed good correlation between the reference values and those calculated using the PLS models, with low error values and R = 0.998 for SFO and R = 0.999 for SO in EFO. These models were validated analytically in accordance with Brazilian and international guidelines through the estimation of figures of merit, thus providing an effective and feasible method to control the quality of extra virgin flaxseed oil. Copyright © 2015 Elsevier Ltd. All rights reserved.
Li, Min; Zhang, Lu; Yao, Xiaolong; Jiang, Xingyu
2017-01-01
The emerging membrane introduction mass spectrometry technique has been successfully used to detect benzene, toluene, ethyl benzene and xylene (BTEX), while overlapped spectra have unfortunately hindered its further application to the analysis of mixtures. Multivariate calibration, an efficient method to analyze mixtures, has been widely applied. In this paper, we compared univariate and multivariate analyses for quantification of the individual components of mixture samples. The results showed that the univariate analysis creates poor models with regression coefficients of 0.912, 0.867, 0.440 and 0.351 for BTEX, respectively. For multivariate analysis, a comparison to the partial-least squares (PLS) model shows that the orthogonal partial-least squares (OPLS) regression exhibits an optimal performance with regression coefficients of 0.995, 0.999, 0.980 and 0.976, favorable calibration parameters (RMSEC and RMSECV) and a favorable validation parameter (RMSEP). Furthermore, the OPLS exhibits a good recovery of 73.86 - 122.20% and relative standard deviation (RSD) of the repeatability of 1.14 - 4.87%. Thus, MIMS coupled with the OPLS regression provides an optimal approach for a quantitative BTEX mixture analysis in monitoring and predicting water pollution.
Payne, Courtney E; Wolfrum, Edward J
2015-01-01
Obtaining accurate chemical composition and reactivity (measures of carbohydrate release and yield) information for biomass feedstocks in a timely manner is necessary for the commercialization of biofuels. Our objective was to use near-infrared (NIR) spectroscopy and partial least squares (PLS) multivariate analysis to develop calibration models to predict the feedstock composition and the release and yield of soluble carbohydrates generated by a bench-scale dilute acid pretreatment and enzymatic hydrolysis assay. Major feedstocks included in the calibration models are corn stover, sorghum, switchgrass, perennial cool season grasses, rice straw, and miscanthus. We present individual model statistics to demonstrate model performance and validation samples to more accurately measure predictive quality of the models. The PLS-2 model for composition predicts glucan, xylan, lignin, and ash (wt%) with uncertainties similar to primary measurement methods. A PLS-2 model was developed to predict glucose and xylose release following pretreatment and enzymatic hydrolysis. An additional PLS-2 model was developed to predict glucan and xylan yield. PLS-1 models were developed to predict the sum of glucose/glucan and xylose/xylan for release and yield (grams per gram). The release and yield models have higher uncertainties than the primary methods used to develop the models. It is possible to build effective multispecies feedstock models for composition, as well as carbohydrate release and yield. The model for composition is useful for predicting glucan, xylan, lignin, and ash with good uncertainties. The release and yield models have higher uncertainties; however, these models are useful for rapidly screening sample populations to identify unusual samples.
A spectral-spatial-dynamic hierarchical Bayesian (SSD-HB) model for estimating soybean yield
NASA Astrophysics Data System (ADS)
Kazama, Yoriko; Kujirai, Toshihiro
2014-10-01
A method called a "spectral-spatial-dynamic hierarchical-Bayesian (SSD-HB) model," which can deal with many parameters (such as spectral and weather information all together) by reducing the occurrence of multicollinearity, is proposed. Experiments conducted on soybean yields in Brazil fields with a RapidEye satellite image indicate that the proposed SSD-HB model can predict soybean yield with a higher degree of accuracy than other estimation methods commonly used in remote-sensing applications. In the case of the SSD-HB model, the mean absolute error between estimated yield of the target area and actual yield is 0.28 t/ha, compared to 0.34 t/ha when conventional PLS regression was applied, showing the potential effectiveness of the proposed model.
Linking landscape variables to cold water refugia in rivers.
Monk, Wendy A; Wilbur, Nathan M; Curry, R Allen; Gagnon, Rolland; Faux, Russell N
2013-03-30
The protection of coldwater refugia within aquatic systems requires the identification of thermal habitats in rivers. These refugia provide critical thermal habitats for brook trout (Salvelinus fontinalis) and Atlantic salmon (Salmo salar) during periods of thermal stress, for example during summer high temperature events. This study aims to model these refugia using georeferenced thermal infrared images collected during late July 2008 and 2009 for a reach of the Cains River, New Brunswick, Canada. These images were paired with geospatial catchment variables to identify the driving factors for coldwater refugia located within tributaries to the main channel. Using Partial Least Square (PLS) Regression, results suggest that median temperatures of tributary catchments are driven by their position within the landscape including slope in addition to the density of wetlands and mixed forest within the upstream catchment. Similar results are presented when PLS models were developed to predict the magnitude of the cold water refugia (i.e. the difference between the mainstem water temperature and the thermal refugia). These results suggest that thermal infrared images can be used to predict critical summer habitats for coldwater fishes. Copyright © 2013 Elsevier Ltd. All rights reserved.
Sheng, Kui-Chuan; Shen, Ying-Ying; Yang, Hai-Qing; Wang, Wen-Jin; Luo, Wei-Qiang
2012-10-01
Rapid determination of biomass feedstock properties is of value for the production of high-quality densified biomass briquette fuel. In the present study, visible and near-infrared (Vis-NIR) spectroscopy was employed to build prediction models of componential contents, i.e. moisture, ash, volatile matter and fixed carbon, and calorific value of three selected species of agricultural biomass feedstock, i.e. pine wood, cedar wood, and cotton stalk. The partial least squares (PLS) cross-validation results showed that, compared with the original reflection spectra, PLS regression models developed for first derivative spectra produced higher prediction accuracy, with coefficients of determination (R2) of 0.97, 0.94 and 0.90 and residual prediction deviation (RPD) of 6.57, 4.00 and 3.01 for ash, volatile matter and moisture, respectively. Good prediction accuracy was achieved with R2 of 0.85 and RPD of 2.55 for fixed carbon, and R2 of 0.87 and RPD of 2.73 for calorific value. It is concluded that Vis-NIR spectroscopy is promising as an alternative to traditional proximate analysis for rapid determination of componential contents and calorific value of agricultural biomass feedstock.
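The figures of merit quoted in the abstract above (R2 and RPD) can be computed from reference and predicted values as in the following minimal sketch. It assumes the commonly used definitions, with RPD taken as the sample standard deviation of the reference values divided by the root mean square error; the abstract does not state which variant the authors used, and the function names and example values are hypothetical.

```python
import math


def rmse(reference, predicted):
    """Root mean square error between reference and predicted values."""
    n = len(reference)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)


def sample_std(values):
    """Sample standard deviation (n - 1 denominator)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


def r_squared(reference, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


def rpd(reference, predicted):
    """Residual prediction deviation, assumed here to be SD / RMSE."""
    return sample_std(reference) / rmse(reference, predicted)


# Hypothetical reference values and model predictions.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.1, 3.9, 5.1]
print(rmse(reference, predicted), r_squared(reference, predicted), rpd(reference, predicted))
```

Under this definition a larger RPD means the prediction error is small relative to the natural spread of the property, which is why it is often reported alongside R2.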
Zhou, Yan; Cao, Hui
2013-01-01
We propose an augmented classical least squares (ACLS) calibration method for quantitative Raman spectral analysis that is robust against component information loss. The Raman spectral signals with low analyte concentration correlations were selected and used as substitutes for unknown quantitative component information during the CLS calibration procedure. The number of selected signals was determined by using the leave-one-out root-mean-square error of cross-validation (RMSECV) curve. An ACLS model was built based on the augmented concentration matrix and the reference spectral signal matrix. The proposed method was compared with partial least squares (PLS) and principal component regression (PCR) using one example: a data set recorded from an experiment of analyte concentration determination using Raman spectroscopy. A 2-fold cross-validation with the Venetian blinds strategy was used to evaluate the predictive power of the proposed method. One-way analysis of variance (ANOVA) was used to assess the difference in predictive power between the proposed method and existing methods. Results indicated that the proposed method is effective at increasing the robust predictive power of the traditional CLS model against component information loss and that its predictive power is comparable to that of PLS or PCR.
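The Venetian blinds strategy mentioned in the abstract above assigns samples to folds in an interleaved pattern. The sketch below is a minimal illustration under the assumption that sample i goes to fold i modulo the number of folds; the function names are hypothetical and the model fitting inside each split is omitted.

```python
def venetian_blinds_folds(n_samples, n_folds):
    """Assign sample indices to folds in an interleaved (Venetian blinds)
    pattern: sample i goes to fold i % n_folds."""
    folds = [[] for _ in range(n_folds)]
    for i in range(n_samples):
        folds[i % n_folds].append(i)
    return folds


def train_test_splits(n_samples, n_folds):
    """Return (train_indices, test_indices) pairs, one per fold."""
    folds = venetian_blinds_folds(n_samples, n_folds)
    splits = []
    for k, test in enumerate(folds):
        train = sorted(i for j, fold in enumerate(folds) if j != k for i in fold)
        splits.append((train, test))
    return splits


# 2-fold example with six hypothetical samples.
for train, test in train_test_splits(6, 2):
    print("train:", train, "test:", test)
```

With samples sorted by concentration, this interleaving keeps both folds spread over the whole concentration range, which is one common reason for choosing it over contiguous blocks.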
Prediction of valid acidity in intact apples with Fourier transform near infrared spectroscopy.
Liu, Yan-De; Ying, Yi-Bin; Fu, Xia-Ping
2005-03-01
To develop nondestructive acidity prediction for intact Fuji apples, the potential of the Fourier transform near infrared (FT-NIR) method with fiber optics in interactance mode was investigated. Interactance in the 800 nm to 2619 nm region was measured for intact apples, harvested from early to late maturity stages. Spectral data were analyzed by two multivariate calibration techniques, partial least squares (PLS) and principal component regression (PCR). A total of 120 Fuji apples were tested and 80 of them were used to form a calibration data set. The influences of different data preprocessing and spectra treatments were also quantified. Calibration models based on smoothed spectra were slightly worse than those based on derivative spectra, and the best result was obtained when the segment length was 5 nm and the gap size was 10 points. With data preprocessing and the PLS method, the best prediction model yielded a coefficient of determination (r²) of 0.759, a low root mean square error of prediction (RMSEP) of 0.0677, and a low root mean square error of calibration (RMSEC) of 0.0562. The results indicated the feasibility of FT-NIR spectral analysis for predicting apple valid acidity in a nondestructive way.
Prediction of valid acidity in intact apples with Fourier transform near infrared spectroscopy*
Liu, Yan-de; Ying, Yi-bin; Fu, Xia-ping
2005-01-01
To develop nondestructive acidity prediction for intact Fuji apples, the potential of the Fourier transform near infrared (FT-NIR) method with fiber optics in interactance mode was investigated. Interactance in the 800 nm to 2619 nm region was measured for intact apples, harvested from early to late maturity stages. Spectral data were analyzed by two multivariate calibration techniques, partial least squares (PLS) and principal component regression (PCR). A total of 120 Fuji apples were tested and 80 of them were used to form a calibration data set. The influences of different data preprocessing and spectra treatments were also quantified. Calibration models based on smoothed spectra were slightly worse than those based on derivative spectra, and the best result was obtained when the segment length was 5 nm and the gap size was 10 points. With data preprocessing and the PLS method, the best prediction model yielded a coefficient of determination (r²) of 0.759, a low root mean square error of prediction (RMSEP) of 0.0677, and a low root mean square error of calibration (RMSEC) of 0.0562. The results indicated the feasibility of FT-NIR spectral analysis for predicting apple valid acidity in a nondestructive way. PMID:15682498
Gomes, Adriano de Araújo; Alcaraz, Mirta Raquel; Goicoechea, Hector C; Araújo, Mario Cesar U
2014-02-06
In this work the Successive Projection Algorithm (SPA) is presented for interval selection in N-PLS for three-way data modeling. The proposed algorithm combines the noise-reduction properties of PLS with the possibility of discarding uninformative variables in SPA. In addition, the second-order advantage can be achieved by the residual bilinearization (RBL) procedure when an unexpected constituent is present in a test sample. For this purpose, SPA was modified in order to select intervals for use in trilinear PLS. The ability of the proposed algorithm, namely iSPA-N-PLS, was evaluated on one simulated and two experimental data sets, comparing the results to those obtained by N-PLS. In the simulated system, two analytes were quantitated in two test sets, with and without an unexpected constituent. In the first experimental system, the determination of four fluorophores (l-phenylalanine; l-3,4-dihydroxyphenylalanine; 1,4-dihydroxybenzene and l-tryptophan) was conducted with excitation-emission data matrices. In the second experimental system, quantitation of ofloxacin was performed in water samples containing two other uncalibrated quinolones (ciprofloxacin and danofloxacin) by high performance liquid chromatography with a UV-vis diode array detector. For comparison purposes, a GA algorithm coupled with N-PLS/RBL was also used in this work. In most of the studied cases iSPA-N-PLS proved to be a promising tool for selection of variables in second-order calibration, generating models with smaller RMSEP when compared to both the global model using all of the sensors in two dimensions and GA-NPLS/RBL. Copyright © 2013 Elsevier B.V. All rights reserved.
Raman spectroscopy: in vivo quick response code of skin physiological status
NASA Astrophysics Data System (ADS)
Vyumvuhore, Raoul; Tfayli, Ali; Piot, Olivier; Le Guillou, Maud; Guichard, Nathalie; Manfait, Michel; Baillet-Guffroy, Arlette
2014-11-01
Dermatologists need to combine different clinically relevant characteristics for a better understanding of skin health. These characteristics are usually measured by different techniques, and some of them are highly time consuming. Therefore, a predictive model based on Raman spectroscopy and partial least squares (PLS) regression was developed as a rapid multiparametric method. The Raman spectra collected from the five uppermost micrometers of the skin of 11 healthy volunteers were fitted to different skin characteristics measured by independent appropriate methods (transepidermal water loss, hydration, pH, relative amount of ceramides, fatty acids, and cholesterol). For each parameter, the obtained PLS model presented correlation coefficients higher than R2=0.9. This model enables us to obtain all the aforementioned parameters directly from the unique Raman signature. In addition, in-depth Raman analyses down to 20 μm showed different balances between partially bound water and unbound water with depth. In parallel, the increase in depth was accompanied by an unfolding process of the proteins. The combination of all this information led to a multiparametric investigation, which better characterizes the skin status. The Raman signal can thus be used as a quick response code (QR code). This could help dermatologic diagnosis of physiological variations and presents a possible extension to pathological characterization.
Basatnia, Nabee; Hossein, Seyed Abbas; Rodrigo-Comino, Jesús; Khaledian, Yones; Brevik, Eric C; Aitkenhead-Peterson, Jacqueline; Natesan, Usha
2018-04-29
Coastal lagoon ecosystems are vulnerable to eutrophication, which leads to the accumulation of nutrients from the surrounding watershed over the long term. However, there is a lack of information about methods that could accurately quantify this problem in rapidly developing countries. Therefore, various statistical methods such as cluster analysis (CA), principal component analysis (PCA), partial least squares (PLS), principal component regression (PCR), and ordinary least squares regression (OLS) were used in this study to estimate total organic matter content in sediments (TOM) using other parameters such as temperature, dissolved oxygen (DO), pH, electrical conductivity (EC), nitrite (NO2), nitrate (NO3), biological oxygen demand (BOD), phosphate (PO4), total phosphorus (TP), salinity, and water depth along a 3-km transect in the Gomishan Lagoon (Iran). Results indicated that nutrient concentration and the dissolved oxygen gradient were the most significant parameters in the lagoon water quality heterogeneity. Additionally, anoxia at the bottom of the lagoon in sediments and re-suspension of the sediments were the main factors affecting internal nutrient loading. To validate the models, R2, RMSECV, and RPDCV were used. The PLS model was stronger than the other models. Also, classification analysis of the Gomishan Lagoon identified two hydrological zones: (i) a North Zone characterized by higher water exchange, higher dissolved oxygen and lower salinity and nutrients, and (ii) a Central and South Zone with high residence time, higher nutrient concentrations, lower dissolved oxygen, and higher salinity. A recommendation for the management of coastal lagoons, specifically the Gomishan Lagoon, to decrease or eliminate nutrient loadings is discussed and should be transferred to policy makers, the scientific community, and local inhabitants.
Netchacovitch, L; Dumont, E; Cailletaud, J; Thiry, J; De Bleye, C; Sacré, P-Y; Boiret, M; Evrard, B; Hubert, Ph; Ziemons, E
2017-09-15
The development of a quantitative method determining the crystalline percentage in an amorphous solid dispersion is of great interest in the pharmaceutical field. Indeed, the crystalline Active Pharmaceutical Ingredient transformation into its amorphous state is increasingly used as it enhances the solubility and bioavailability of Biopharmaceutical Classification System class II drugs. One way to produce amorphous solid dispersions is the Hot-Melt Extrusion (HME) process. This study reported the development and the comparison of the analytical performances of two techniques, based on backscattering and transmission Raman spectroscopy, determining the crystalline remaining content in amorphous solid dispersions produced by HME. Principal Component Analysis (PCA) and Partial Least Squares (PLS) regression were performed on preprocessed data and tended towards the same conclusions: for the backscattering Raman results, the use of the DuoScan™ mode improved the PCA and PLS results, due to a larger analyzed sampling volume. For the transmission Raman results, the determination of low crystalline percentages was possible and the best regression model was obtained using this technique. Indeed, the latter acquired spectra through the whole sample volume, in contrast with the previous surface analyses performed using the backscattering mode. This study consequently highlighted the importance of the analyzed sampling volume. Copyright © 2017 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Darwish, Hany W.; Hassan, Said A.; Salem, Maissa Y.; El-Zeany, Badr A.
2014-03-01
Different chemometric models were applied for the quantitative analysis of Amlodipine (AML), Valsartan (VAL) and Hydrochlorothiazide (HCT) in ternary mixture, namely, Partial Least Squares (PLS) as traditional chemometric model and Artificial Neural Networks (ANN) as advanced model. PLS and ANN were applied with and without variable selection procedure (Genetic Algorithm GA) and data compression procedure (Principal Component Analysis PCA). The chemometric methods applied are PLS-1, GA-PLS, ANN, GA-ANN and PCA-ANN. The methods were used for the quantitative analysis of the drugs in raw materials and pharmaceutical dosage form via handling the UV spectral data. A 3-factor 5-level experimental design was established resulting in 25 mixtures containing different ratios of the drugs. Fifteen mixtures were used as a calibration set and the other ten mixtures were used as validation set to validate the prediction ability of the suggested methods. The validity of the proposed methods was assessed using the standard addition technique.
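As an illustration of the PLS-1 calibration step mentioned above, the following is a minimal sketch of the NIPALS algorithm for a single response, written in plain Python. It is not the authors' implementation: the function names `fit_pls1` and `predict_pls1` and the toy data are assumptions made for this example, and real spectral work would normally use an established chemometrics library.

```python
from math import sqrt

def fit_pls1(X, y, n_components):
    """Fit a single-response PLS model with NIPALS (illustrative sketch)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                 # centered y
    components = []
    for _ in range(n_components):
        # Weight vector: X-direction with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components

def predict_pls1(model, x):
    """Predict one sample by repeating the deflation steps on the new row."""
    x_mean, y_mean, components = model
    e = [x[j] - x_mean[j] for j in range(len(x))]
    y_hat = y_mean
    for w, load, q in components:
        t = sum(e[j] * w[j] for j in range(len(e)))
        y_hat += q * t
        e = [e[j] - t * load[j] for j in range(len(e))]
    return y_hat

# Hypothetical calibration set: two "absorbances" per mixture, one concentration.
X_cal = [[1, 2], [2, 1], [3, 5], [4, 3], [5, 6]]
y_cal = [2 * a + 3 * b + 1 for a, b in X_cal]
model = fit_pls1(X_cal, y_cal, n_components=2)
print(predict_pls1(model, [6, 4]))
```

With as many components as predictors and noise-free data, the sketch reproduces the ordinary least squares fit; in practice the number of components is chosen on a validation set, as in the calibration/validation split described above.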
Rodríguez-Entrena, Macario; Schuberth, Florian; Gelhard, Carsten
2018-01-01
Structural equation modeling using partial least squares (PLS-SEM) has become a mainstream modeling approach in various disciplines. Nevertheless, prior literature still lacks practical guidance on how to properly test for differences between parameter estimates. Whereas existing techniques such as parametric and non-parametric approaches in PLS multi-group analysis only allow the assessment of differences between parameters that are estimated for different subpopulations, the study at hand introduces a technique that also allows assessing whether two parameter estimates that are derived from the same sample are statistically different. To illustrate this advancement to PLS-SEM, we particularly refer to a reduced version of the well-established technology acceptance model.
de Almeida, Valber Elias; de Araújo Gomes, Adriano; de Sousa Fernandes, David Douglas; Goicoechea, Héctor Casimiro; Galvão, Roberto Kawakami Harrop; Araújo, Mario Cesar Ugulino
2018-05-01
This paper proposes a new variable selection method for nonlinear multivariate calibration, combining the Successive Projections Algorithm for interval selection (iSPA) with the Kernel Partial Least Squares (Kernel-PLS) modelling technique. The proposed iSPA-Kernel-PLS algorithm is employed in a case study involving a Vis-NIR spectrometric dataset with complex nonlinear features. The analytical problem consists of determining Brix and sucrose content in samples from a sugar production system, on the basis of transflectance spectra. As compared to full-spectrum Kernel-PLS, the iSPA-Kernel-PLS models involve a smaller number of variables and display statistically significant superiority in terms of accuracy and/or bias in the predictions. Published by Elsevier B.V.
Payne, Courtney E.; Wolfrum, Edward J.
2015-03-12
Obtaining accurate chemical composition and reactivity (measures of carbohydrate release and yield) information for biomass feedstocks in a timely manner is necessary for the commercialization of biofuels. Our objective was to use near-infrared (NIR) spectroscopy and partial least squares (PLS) multivariate analysis to develop calibration models to predict the feedstock composition and the release and yield of soluble carbohydrates generated by a bench-scale dilute acid pretreatment and enzymatic hydrolysis assay. Major feedstocks included in the calibration models are corn stover, sorghum, switchgrass, perennial cool season grasses, rice straw, and miscanthus. We present individual model statistics to demonstrate model performance and validation samples to more accurately measure predictive quality of the models. The PLS-2 model for composition predicts glucan, xylan, lignin, and ash (wt%) with uncertainties similar to primary measurement methods. A PLS-2 model was developed to predict glucose and xylose release following pretreatment and enzymatic hydrolysis. An additional PLS-2 model was developed to predict glucan and xylan yield. PLS-1 models were developed to predict the sum of glucose/glucan and xylose/xylan for release and yield (grams per gram). The release and yield models have higher uncertainties than the primary methods used to develop the models. In conclusion, it is possible to build effective multispecies feedstock models for composition, as well as carbohydrate release and yield. The model for composition is useful for predicting glucan, xylan, lignin, and ash with good uncertainties. The release and yield models have higher uncertainties; however, these models are useful for rapidly screening sample populations to identify unusual samples.
Assi, Nada; Fages, Anne; Vineis, Paolo; Chadeau-Hyam, Marc; Stepien, Magdalena; Duarte-Salles, Talita; Byrnes, Graham; Boumaza, Houda; Knüppel, Sven; Kühn, Tilman; Palli, Domenico; Bamia, Christina; Boshuizen, Hendriek; Bonet, Catalina; Overvad, Kim; Johansson, Mattias; Travis, Ruth; Gunter, Marc J.; Lund, Eiliv; Dossus, Laure; Elena-Herrmann, Bénédicte; Riboli, Elio; Jenab, Mazda; Viallon, Vivian; Ferrari, Pietro
2015-01-01
Metabolomics is a potentially powerful tool for identification of biomarkers associated with lifestyle exposures and risk of various diseases. This is the rationale of the ‘meeting-in-the-middle’ concept, for which an analytical framework was developed in this study. In a nested case–control study on hepatocellular carcinoma (HCC) within the European Prospective Investigation into Cancer and Nutrition (EPIC), serum 1H nuclear magnetic resonance (NMR) spectra (800 MHz) were acquired for 114 cases and 222 matched controls. Through partial least squares (PLS) analysis, 21 lifestyle variables (the ‘predictors’, including information on diet, anthropometry and clinical characteristics) were linked to a set of 285 metabolic variables (the ‘responses’). The three resulting scores were related to HCC risk by means of conditional logistic regressions. The first PLS factor was not associated with HCC risk. The second PLS metabolomic factor was positively associated with tyrosine and glucose, and was related to a significantly increased HCC risk with OR = 1.11 (95% CI: 1.02, 1.22, P = 0.02) for a 1SD change in the responses score, and a similar association was found for the corresponding lifestyle component of the factor. The third PLS lifestyle factor was associated with lifetime alcohol consumption, hepatitis and smoking, and had negative loadings on vegetables intake. Its metabolomic counterpart displayed positive loadings on ethanol, glutamate and phenylalanine. These factors were positively and statistically significantly associated with HCC risk, with 1.37 (1.05, 1.79, P = 0.02) and 1.22 (1.04, 1.44, P = 0.01), respectively. Evidence of mediation was found in both the second and third PLS factors, where the metabolomic signals mediated the relation between the lifestyle component and HCC outcome. This study devised a way to bridge lifestyle variables to HCC risk through NMR metabolomics data.
This implementation of the ‘meeting-in-the-middle’ approach finds natural applications in settings characterised by high-dimensional data, increasingly frequent in the omics generation. PMID:26130468
Robust PLS approach for KPI-related prediction and diagnosis against outliers and missing data
NASA Astrophysics Data System (ADS)
Yin, Shen; Wang, Guang; Yang, Xu
2014-07-01
In practical industrial applications, the key performance indicator (KPI)-related prediction and diagnosis are quite important for the product quality and economic benefits. To meet these requirements, many advanced prediction and monitoring approaches have been developed which can be classified into model-based or data-driven techniques. Among these approaches, partial least squares (PLS) is one of the most popular data-driven methods due to its simplicity and easy implementation in large-scale industrial process. As PLS is totally based on the measured process data, the characteristics of the process data are critical for the success of PLS. Outliers and missing values are two common characteristics of the measured data which can severely affect the effectiveness of PLS. To ensure the applicability of PLS in practical industrial applications, this paper introduces a robust version of PLS to deal with outliers and missing values, simultaneously. The effectiveness of the proposed method is finally demonstrated by the application results of the KPI-related prediction and diagnosis on an industrial benchmark of Tennessee Eastman process.
Alladio, E; Giacomelli, L; Biosa, G; Corcia, D Di; Gerace, E; Salomone, A; Vincenti, M
2018-01-01
The chronic intake of an excessive amount of alcohol is currently ascertained by determining the concentration of direct alcohol metabolites in the hair samples of the alleged abusers, including ethyl glucuronide (EtG) and, less frequently, fatty acid ethyl esters (FAEEs). Indirect blood biomarkers of alcohol abuse are still determined to support hair EtG results and diagnose a consequent liver impairment. In the present study, the supporting role of hair FAEEs is compared with indirect blood biomarkers with respect to the contexts in which hair EtG interpretation is uncertain. Receiver Operating Characteristics (ROC) curves and multivariate Principal Component Analysis (PCA) demonstrated much stronger correlation of EtG results with FAEEs than with any single indirect biomarker or their combinations. Partial Least Squares Discriminant Analysis (PLS-DA) models based on hair EtG and FAEEs were developed to maximize the biomarkers' information content on a multivariate background. The final PLS-DA model yielded 100% correct classification on a training/evaluation dataset of 155 subjects, including both chronic alcohol abusers and social drinkers. Then, the PLS-DA model was validated on an external dataset of 81 individuals, providing optimal discrimination ability between chronic alcohol abusers and social drinkers, in terms of specificity and sensitivity. The PLS-DA scores obtained for each subject, with respect to the PLS-DA model threshold that separates the probabilistic distributions for the two classes, furnished a likelihood ratio value, which in turn conveys the strength of the experimental data support to the classification decision, within a Bayesian logic. Typical boundary real cases from daily work are discussed, too. Copyright © 2017 Elsevier B.V. All rights reserved.
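To make the likelihood-ratio idea concrete, the sketch below computes the ratio of two class-conditional densities at a given PLS-DA score. It assumes, purely for illustration, that the scores of each class are normally distributed; the abstract does not state how the authors modelled the distributions, and the function name and toy scores are hypothetical.

```python
from statistics import NormalDist

def likelihood_ratio(score, abuser_scores, social_scores):
    """LR = p(score | chronic abuser) / p(score | social drinker).

    Assumption: each class is summarised by a normal distribution fitted
    to its training scores (sample mean and sample standard deviation).
    """
    abuser = NormalDist.from_samples(abuser_scores)
    social = NormalDist.from_samples(social_scores)
    return abuser.pdf(score) / social.pdf(score)

# Hypothetical training scores for the two classes.
abusers = [1.0, 2.0, 3.0]
social = [-3.0, -2.0, -1.0]
print(likelihood_ratio(1.0, abusers, social))  # > 1 supports "chronic abuser"
```

A ratio well above 1 supports the chronic-abuser hypothesis, a ratio well below 1 supports the social-drinker hypothesis, and values near 1 correspond to the boundary cases the abstract mentions.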
Improved accuracy in quantitative laser-induced breakdown spectroscopy using sub-models
Anderson, Ryan; Clegg, Samuel M.; Frydenvang, Jens; Wiens, Roger C.; McLennan, Scott M.; Morris, Richard V.; Ehlmann, Bethany L.; Dyar, M. Darby
2017-01-01
Accurate quantitative analysis of diverse geologic materials is one of the primary challenges faced by the Laser-Induced Breakdown Spectroscopy (LIBS)-based ChemCam instrument on the Mars Science Laboratory (MSL) rover. The SuperCam instrument on the Mars 2020 rover, as well as other LIBS instruments developed for geochemical analysis on Earth or other planets, will face the same challenge. Consequently, part of the ChemCam science team has focused on the development of improved multivariate analysis calibration methods. Developing a single regression model capable of accurately determining the composition of very different target materials is difficult because the response of an element’s emission lines in LIBS spectra can vary with the concentration of other elements. We demonstrate a conceptually simple “sub-model” method for improving the accuracy of quantitative LIBS analysis of diverse target materials. The method is based on training several regression models on sets of targets with limited composition ranges and then “blending” these “sub-models” into a single final result. Tests of the sub-model method show improvement in test set root mean squared error of prediction (RMSEP) for almost all cases. The sub-model method, using partial least squares regression (PLS), is being used as part of the current ChemCam quantitative calibration, but the sub-model method is applicable to any multivariate regression method and may yield similar improvements.
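The blending step can be pictured with a small sketch. The abstract does not give the blending rule, so the scheme below is an assumption: a reference estimate (for example from a full-range model) decides which sub-model applies, and inside a hypothetical overlap window the two sub-model predictions are mixed linearly. All names and numbers are illustrative.

```python
def blend_submodels(reference, low_pred, high_pred, blend_lo, blend_hi):
    """Combine two sub-model predictions into one value.

    reference : rough composition estimate used only to pick the weights
    low_pred  : prediction of the sub-model trained on low concentrations
    high_pred : prediction of the sub-model trained on high concentrations
    blend_lo, blend_hi : bounds of the (hypothetical) overlap window
    """
    if reference <= blend_lo:
        return low_pred
    if reference >= blend_hi:
        return high_pred
    # Linear ramp: weight of the high sub-model grows across the window.
    w_high = (reference - blend_lo) / (blend_hi - blend_lo)
    return (1.0 - w_high) * low_pred + w_high * high_pred

# Hypothetical SiO2 predictions (wt%) with an overlap window of 30-50 wt%.
print(blend_submodels(40.0, 18.0, 26.0, 30.0, 50.0))
```

The linear ramp avoids a discontinuity where one sub-model hands over to the next, which is one plausible reason for blending rather than switching abruptly.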
Study on rapid valid acidity evaluation of apple by fiber optic diffuse reflectance technique
NASA Astrophysics Data System (ADS)
Liu, Yande; Ying, Yibin; Fu, Xiaping; Jiang, Xuesong
2004-03-01
Some issues related to nondestructive evaluation of valid acidity in intact apples by means of Fourier transform near infrared (FTNIR) (800-2631nm) method were addressed. A relationship was established between the diffuse reflectance spectra recorded with a bifurcated optic fiber and the valid acidity. The data were analyzed by multivariate calibration analysis such as partial least squares (PLS) analysis and principal component regression (PCR) technique. A total of 120 Fuji apples were tested and 80 of them were used to form a calibration data set. The influence of data preprocessing and different spectra treatments was also investigated. Models based on smoothing spectra were slightly worse than models based on derivative spectra and the best result was obtained when the segment length was 5 and the gap size was 10. Depending on data preprocessing and multivariate calibration technique, the best prediction model had a correlation coefficient (0.871), a low RMSEP (0.0677), a low RMSEC (0.056) and a small difference between RMSEP and RMSEC by PLS analysis. The results point out the feasibility of FTNIR spectral analysis to predict the fruit valid acidity non-destructively. The ratio of data standard deviation to the root mean square error of prediction (SDR) is better to be less than 3 in calibration models, however, the results cannot meet the demand of actual application. Therefore, further study is required for better calibration and prediction.
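The figures of merit quoted above follow directly from reference and predicted values. The sketch below shows the usual definitions: the root mean square error (RMSEC when computed on the calibration set, RMSEP on the prediction set) and SDR as the standard deviation of the reference values divided by RMSEP. The function names and numbers are illustrative and are not data from the study.

```python
from math import sqrt
from statistics import stdev

def rmse(reference, predicted):
    """Root mean square error between reference and predicted values."""
    n = len(reference)
    return sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)

def sdr(reference, predicted):
    """Standard deviation of the reference data divided by the RMSE."""
    return stdev(reference) / rmse(reference, predicted)

# Hypothetical pH values for four apples in a prediction set.
measured = [3.0, 3.5, 4.0, 4.5]
predicted = [3.1, 3.4, 4.1, 4.4]
print(rmse(measured, predicted), sdr(measured, predicted))
```

Because SDR scales the error by the spread of the reference data, it allows models built on samples with different ranges to be compared on a common footing.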
NIR spectroscopic measurement of moisture content in Scots pine seeds.
Lestander, Torbjörn A; Geladi, Paul
2003-04-01
When tree seeds are used for seedling production it is important that they are of high quality in order to be viable. One of the factors influencing viability is moisture content and an ideal quality control system should be able to measure this factor quickly for each seed. Seed moisture content within the range 3-34% was determined by near-infrared (NIR) spectroscopy on Scots pine (Pinus sylvestris L.) single seeds and on bulk seed samples consisting of 40-50 seeds. The models for predicting water content from the spectra were made by partial least squares (PLS) and ordinary least squares (OLS) regression. Different conditions were simulated involving both using fewer wavelengths and going from samples to single seeds. Reflectance and transmission measurements were used. Different spectral pretreatment methods were tested on the spectra. Including bias, the lowest prediction errors for PLS models based on reflectance within 780-2280 nm from bulk samples and single seeds were 0.8% and 1.9%, respectively. Reduction of the single seed reflectance spectrum to 850-1048 nm gave higher biases and prediction errors in the test set. In transmission (850-1048 nm) the prediction error was 2.7% for single seeds. OLS models based on a simulated 4-sensor single seed system consisting of optical filters with Gaussian transmission indicated more than 3.4% error in prediction. A practical F-test based on test sets to differentiate models is introduced.
[On-site evaluation of raw milk qualities by portable Vis/NIR transmittance technique].
Wang, Jia-Hua; Zhang, Xiao-Wei; Wang, Jun; Han, Dong-Hai
2014-10-01
To ensure the material safety of dairy products, visible (Vis)/near infrared (NIR) spectroscopy combined with chemometrics methods was used to develop models for on-site evaluation of fat, protein, dry matter (DM) and lactose. A total of 88 raw milk samples were collected from individual livestock in different years. The spectra of raw milk were measured by a portable Vis/NIR spectrometer with a diffused transmittance accessory. To remove the scatter effect and baseline drift, the diffused transmittance spectra were preprocessed by a 2nd order derivative with Savitzky-Golay (polynomial order 2, data point 25). Changeable size moving window partial least squares (CSMWPLS) and genetic algorithms partial least squares (GAPLS) methods were suggested to select informative regions for PLS calibration. The PLS and multiple linear regression (MLR) methods were used to develop models for predicting quality index of raw milk. The prediction performance of CSMWPLS models was similar to that of GAPLS models for fat, protein, DM and lactose evaluation; the root mean square errors of prediction (RMSEP) were 0.1156/0.1033, 0.0962/0.1137, 0.2013/0.1237 and 0.0774/0.0668, and the relative standard deviations of prediction (RPD) were 8.99/10.06, 3.53/2.99, 5.76/9.38 and 1.81/2.10, respectively. Meanwhile, the MLR models were also calibrated with 8, 10, 9 and 7 variables for fat, protein, DM and lactose, respectively. The prediction performance of MLR models was better than or close to PLS models. The MLR models to predict fat, protein, DM and lactose yielded the RMSEP of 0.1070, 0.0930, 0.1360 and 0.0658, and the RPD of 9.72, 3.66, 8.53 and 2.13, respectively. The results demonstrated the usefulness of Vis/NIR spectra combined with multivariate calibration methods as an objective and rapid method for the quality evaluation of complicated raw milks. The results obtained also highlight the potential of portable Vis/NIR instruments for on-site assessment of the quality indexes of raw milk.
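The Savitzky-Golay second-derivative preprocessing mentioned above (polynomial order 2, 25 points) can be sketched without any external library. The code fits a quadratic by least squares to each 25-point window and returns twice its quadratic coefficient; it assumes equally spaced points with unit spacing (divide by the squared step for real wavelength axes) and simply drops the 12 points at each edge. The function names are illustrative.

```python
def savgol_second_derivative_weights(half_width):
    """Convolution weights for the 2nd derivative of a quadratic LS fit
    over a window of 2*half_width + 1 equally spaced points."""
    idx = range(-half_width, half_width + 1)
    n = 2 * half_width + 1
    s2 = sum(i ** 2 for i in idx)
    s4 = sum(i ** 4 for i in idx)
    denom = n * s4 - s2 ** 2
    # From the normal equations, a2 = (n*sum(i^2*y) - s2*sum(y)) / denom,
    # and the second derivative of a0 + a1*i + a2*i^2 is 2*a2.
    return [2.0 * (n * i ** 2 - s2) / denom for i in idx]

def second_derivative(spectrum, half_width=12):
    """Second-derivative spectrum; edge points without a full window are omitted."""
    w = savgol_second_derivative_weights(half_width)
    out = []
    for c in range(half_width, len(spectrum) - half_width):
        window = spectrum[c - half_width:c + half_width + 1]
        out.append(sum(wi * yi for wi, yi in zip(w, window)))
    return out

# A purely quadratic "spectrum" has a constant second derivative.
toy = [0.5 * x * x + 3.0 * x + 1.0 for x in range(40)]
print(second_derivative(toy)[:3])
```

Constant offsets and linear slopes are annihilated by these weights, which is why a second derivative removes the baseline drift referred to in the abstract.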
Song, Jingwei; He, Jiaying; Zhu, Menghua; Tan, Debao; Zhang, Yu; Ye, Song; Shen, Dingtao; Zou, Pengfei
2014-01-01
A simulated annealing (SA) based variable weighted forecast model is proposed to combine and weigh local chaotic model, artificial neural network (ANN), and partial least square support vector machine (PLS-SVM) to build a more accurate forecast model. The hybrid model was built and multistep ahead prediction ability was tested based on daily MSW generation data from Seattle, Washington, the United States. The hybrid forecast model was proved to produce more accurate and reliable results and to degrade less in longer predictions than three individual models. The average one-week step ahead prediction has been raised from 11.21% (chaotic model), 12.93% (ANN), and 12.94% (PLS-SVM) to 9.38%. Five-week average has been raised from 13.02% (chaotic model), 15.69% (ANN), and 15.92% (PLS-SVM) to 11.27%. PMID:25301508
Lakshmi, Karunanidhi Santhana; Lakshmi, Sivasubramanian
2011-03-01
Simultaneous determination of valsartan and hydrochlorothiazide by the H-point standard additions method (HPSAM) and partial least squares (PLS) calibration is described. Absorbances at a pair of wavelengths, 216 and 228 nm, were monitored with the addition of standard solutions of valsartan. Results of applying HPSAM showed that valsartan and hydrochlorothiazide can be determined simultaneously at concentration ratios varying from 20:1 to 1:15 in a mixed sample. The proposed PLS method does not require chemical separation and spectral graphical procedures for quantitative resolution of mixtures containing the titled compounds. The calibration model was based on absorption spectra in the 200-350 nm range for 25 different mixtures of valsartan and hydrochlorothiazide. Calibration matrices contained 0.5-3 μg mL-1 of both valsartan and hydrochlorothiazide. The standard error of prediction (SEP) for valsartan and hydrochlorothiazide was 0.020 and 0.038 μg mL-1, respectively. Both proposed methods were successfully applied to the determination of valsartan and hydrochlorothiazide in several synthetic and real matrix samples.
Lu, Shao Hua; Li, Bao Qiong; Zhai, Hong Lin; Zhang, Xin; Zhang, Zhuo Yong
2018-04-25
Terahertz time-domain spectroscopy (THz-TDS) has been applied in many fields; however, it still encounters drawbacks in the analysis of multicomponent mixtures due to serious spectral overlapping. Here, an effective approach to quantitative analysis was proposed and applied to the determination of ternary amino acids in a foxtail millet substrate. Utilizing three parameters derived from the THz-TDS, the images were constructed and the Tchebichef image moments were used to extract the information of target components. Then the quantitative models were obtained by stepwise regression. The correlation coefficients of leave-one-out cross-validation (R²loo-cv) were more than 0.9595. For the external test set, the predictive correlation coefficients (R²p) were more than 0.8026 and the root mean square errors of prediction (RMSEp) were less than 1.2601. Compared with the traditional methods (PLS and N-PLS methods), our approach is more accurate, robust and reliable, and can be a potential excellent approach to quantify multicomponent mixtures with THz-TDS. Copyright © 2017 Elsevier Ltd. All rights reserved.
Song, Seung Yeob; Lee, Young Koung; Kim, In-Jung
2016-01-01
A high-throughput screening system for Citrus lines with higher sugar and acid contents was established using Fourier transform infrared (FT-IR) spectroscopy in combination with multivariate analysis. FT-IR spectra confirmed typical spectral differences between the frequency regions of 950-1100 cm(-1), 1300-1500 cm(-1), and 1500-1700 cm(-1). Principal component analysis (PCA) and subsequent partial least squares-discriminant analysis (PLS-DA) were able to discriminate five Citrus lines into three separate clusters corresponding to their taxonomic relationships. The quantitative predictive modeling of sugar and acid contents from Citrus fruits was established using partial least squares regression algorithms from FT-IR spectra. The regression coefficients (R(2)) between predicted values and estimated sugar and acid content values were 0.99. These results demonstrate that by using FT-IR spectra and applying quantitative prediction modeling to Citrus sugar and acid contents, excellent Citrus lines can be detected early and with greater accuracy. Copyright © 2015 Elsevier Ltd. All rights reserved.
Pinto, Susana; de Carvalho, Mamede
2017-02-01
Slow vital capacity (SVC) and forced vital capacity (FVC) are the most frequently used tests for evaluating respiratory function in amyotrophic lateral sclerosis (ALS). No previous study has determined their interchangeability. The aim was to evaluate the SVC-FVC correlation in ALS. Consecutive definite/probable ALS and primary lateral sclerosis (PLS) patients (2000-2014) in whom respiratory tests were performed at baseline and 4-6 months later were included. All were evaluated with the revised ALS functional rating scale, the ALSFRS respiratory (R-subscore) and bulbar subscores, SVC, FVC, maximal inspiratory (MIP) and expiratory (MEP) pressures. SVC-FVC correlation was analysed by the Pearson product-moment correlation test. A paired t-test compared baseline/follow-up values. Multilinear regression analysis modelled the relationship between tested variables. We included 592 ALS (332 men, mean onset age 62.6 ± 11.8 years, mean disease duration 15.4 ± 15 months) and 19 PLS (11 men, median age 54 years, median disease duration 5.5 years) patients. SVC and FVC predicted values decreased 2.15%/month and 2.08%/month, respectively. FVC and SVC were strongly correlated. Both were strongly correlated with MIP and MEP and moderately correlated with R-subscore for the whole population and spinal-onset patients, but weakly correlated for bulbar-onset patients. FVC and SVC were strongly correlated and declined similarly. This correlation was preserved in bulbar-onset ALS and in spastic PLS patients.
Henrique, C M; Teófilo, R F; Sabino, L; Ferreira, M M C; Cereda, M P
2007-05-01
Cassava starches are widely used in the production of biodegradable films, but their resistance to humidity migration is very low. In this work, commercial cassava starch films were studied and classified according to their physicochemical properties. A nondestructive method for water vapor permeability determination, which combines with infrared spectroscopy and multivariate calibration, is also presented. The following commercial cassava starches were studied: pregelatinized (amidomax 3550), carboxymethylated starch (CMA) of low and high viscosities, and esterified starches. To make the films, 2 different starch concentrations were evaluated, consisting of water suspensions with 3% and 5% starch. The filmogenic solutions were dried and characterized for their thickness, grammage, water vapor permeability, water activity, tensile strength (deformation force), water solubility, and puncture strength (deformation). The minimum thicknesses were 0.5 to 0.6 mm in pregelatinized starch films. The results were treated by means of the following chemometric methods: principal component analysis (PCA) and partial least squares (PLS) regression. PCA analysis on the physicochemical properties of the films showed that the differences in concentration of the dried material (3% and 5% starch) and also in the type of starch modification were mainly related to the following properties: permeability, solubility, and thickness. IR spectra collected in the region of 4000 to 600 cm(-1) were used to build a PLS model with good predictive power for water vapor permeability determination, with mean relative errors of 10.0% for cross-validation and 7.8% for the prediction set.
Greene, LaVana; Elzey, Brianda; Franklin, Mariah; Fakayode, Sayo O
2017-03-05
The negative health impact of polycyclic aromatic hydrocarbons (PAHs) and differences in pharmacological activity of enantiomers of chiral molecules in humans highlight the need for analysis of PAHs and their chiral analogue molecules in humans. Herein, the first use of cyclodextrin guest-host inclusion complexation, fluorescence spectrophotometry, and a chemometric approach to PAH (anthracene) and chiral-PAH analogue derivative (1-(9-anthryl)-2,2,2-trifluoroethanol (TFE)) analyses is reported. The binding constants (K_b), stoichiometry (n), and thermodynamic properties (Gibbs free energy (ΔG), enthalpy (ΔH), and entropy (ΔS)) of anthracene and enantiomers of TFE-methyl-β-cyclodextrin (Me-β-CD) guest-host complexes were also determined. Chemometric partial-least-square (PLS) regression analysis of emission spectra data of Me-β-CD-guest-host inclusion complexes was used for the determination of anthracene and TFE enantiomer concentrations in Me-β-CD-guest-host inclusion complex samples. The values of calculated K_b and negative ΔG suggest the thermodynamic favorability of anthracene-Me-β-CD and enantiomeric TFE-Me-β-CD inclusion complexation reactions. However, anthracene-Me-β-CD and enantiomer TFE-Me-β-CD inclusion complexations showed notable differences in the binding affinity behaviors and thermodynamic properties. The PLS regression analysis resulted in square-correlation-coefficients of 0.997530 or better and a low LOD of 3.81×10⁻⁷ M for anthracene and 3.48×10⁻⁸ M for TFE enantiomers at physiological conditions. Most importantly, PLS regression accurately determined the anthracene and TFE enantiomer concentrations with an average low error of 2.31% for anthracene, 4.44% for R-TFE and 3.60% for S-TFE. The results of the study are highly significant because of its high sensitivity and accuracy for analysis of PAH and chiral PAH analogue derivatives without the need of an expensive chiral column, enantiomeric resolution, or use of a polarized light.
Published by Elsevier B.V.
Balabin, Roman M; Smirnov, Sergey V
2011-07-15
Melamine (2,4,6-triamino-1,3,5-triazine) is a nitrogen-rich chemical implicated in the pet and human food recalls and in the global food safety scares involving milk products. Due to the serious health concerns associated with melamine consumption and the extensive scope of affected products, rapid and sensitive methods to detect melamine's presence are essential. We propose the use of spectroscopy data, produced by near-infrared (near-IR/NIR) and mid-infrared (mid-IR/MIR) spectroscopies in particular, for melamine detection in complex dairy matrixes. None of the up-to-date reported IR-based methods for melamine detection has unambiguously shown its wide applicability to different dairy products as well as a limit of detection (LOD) below 1 ppm on an independent sample set. It was found that infrared spectroscopy is an effective tool to detect melamine in dairy products, such as infant formula, milk powder, or liquid milk. A LOD below 1 ppm (0.76±0.11 ppm) can be reached if a correct spectrum preprocessing (pretreatment) technique and a correct multivariate (MDA) algorithm, namely partial least squares regression (PLS), polynomial PLS (Poly-PLS), artificial neural network (ANN), support vector regression (SVR), or least squares support vector machine (LS-SVM), are used for spectrum analysis. The relationship between MIR/NIR spectrum of milk products and melamine content is nonlinear. Thus, nonlinear regression methods are needed to correctly predict the triazine-derivative content of milk products. It can be concluded that mid- and near-infrared spectroscopy can be regarded as a quick, sensitive, robust, and low-cost method for liquid milk, infant formula, and milk powder analysis. Copyright © 2011 Elsevier B.V. All rights reserved.
Otsuka, Eri; Abe, Hiroyuki; Aburada, Masaki; Otsuka, Makoto
2010-07-01
A suppository dosage form has a rapid effect on therapeutics, because it dissolves in the rectum, is absorbed in the bloodstream, and passes the hepatic metabolism. However, the dosage form is unstable, because a suppository is made in a semisolid form, and so it is not easy to mix the bulk drug powder in the base. This article describes a nondestructive method of determining the drug content of suppositories using near-infrared spectrometry (NIR) combined with chemometrics. Suppositories (aspirin content: 1.8, 2.7, 4.5, 7.3, and 9.1%, w/w) were produced by mixing an aspirin bulk powder with hard fat at 50 °C and pouring the molten mixture into a plastic mold (2.25 mL). NIR spectra of 12 calibration and 12 validation sample sets were recorded 5 times. A total of 60 spectra were used as a calibration set to establish a calibration model to predict drug content with a partial least-squares (PLS) regression analysis. NIR data of the suppository samples were divided into two wavenumber ranges, 4000-12500 cm⁻¹ (LR) and 5900-6300 cm⁻¹ (SR). Calibration models for the aspirin content of the suppositories were calculated based on the LR and SR ranges of second-derivative NIR spectra using PLS. The models for LR and SR consisted of five and one principal components (PC), respectively. The plots of predicted values against actual values gave a straight line with regression coefficient constants of 0.9531 and 0.9749, respectively. The mean bias and mean accuracy of the calibration models were calculated based on the SR of variation data sets, and were lower than those of LR, respectively. Limiting the wavenumber range of spectral data sets is useful to help understand the calibration model because of noise cancellation and to measure objective functions.
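The two preprocessing ideas in the abstract above, second-derivative spectra and restriction to a narrow wavenumber range, can be sketched as below. This is a minimal illustration that assumes equally spaced points and uses a plain central finite difference; the abstract does not say which derivative algorithm was used (a smoothing filter is common in practice), and the function names are hypothetical.

```python
def second_derivative(absorbances):
    """Central-difference second derivative for equally spaced points.

    The point spacing is taken as 1, so the result is in units per index step.
    The output is two points shorter than the input (the end points are dropped).
    """
    return [
        absorbances[i - 1] - 2.0 * absorbances[i] + absorbances[i + 1]
        for i in range(1, len(absorbances) - 1)
    ]

def restrict_range(wavenumbers, values, low, high):
    """Keep only the (wavenumber, value) pairs with low <= wavenumber <= high."""
    return [(w, v) for w, v in zip(wavenumbers, values) if low <= w <= high]

def second_derivative_in_range(wavenumbers, absorbances, low, high):
    """Differentiate the full spectrum first, then cut out the range of interest."""
    derivative = second_derivative(absorbances)
    inner_wavenumbers = wavenumbers[1:-1]  # align with the shortened derivative
    return restrict_range(inner_wavenumbers, derivative, low, high)
```

Differentiating before cutting the range avoids losing the two edge points of the narrow window, which matters when the window (such as 5900-6300 cm⁻¹) contains few points.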
Miaw, Carolina Sheng Whei; Assis, Camila; Silva, Alessandro Rangel Carolino Sales; Cunha, Maria Luísa; Sena, Marcelo Martins; de Souza, Scheilla Vitorino Carvalho
2018-07-15
Grape, orange, peach and passion fruit nectars were formulated and adulterated by dilution with syrup, apple and cashew juices at 10 levels for each adulterant. Attenuated total reflectance Fourier transform mid infrared (ATR-FTIR) spectra were obtained. Partial least squares (PLS) multivariate calibration models allied to different variable selection methods, such as interval partial least squares (iPLS), ordered predictors selection (OPS) and genetic algorithm (GA), were used to quantify the main fruits. PLS improved by iPLS-OPS variable selection showed the highest predictive capacity to quantify the main fruit contents. The selected variables in the final models varied from 72 to 100; the root mean square errors of prediction were estimated from 0.5 to 2.6%; the correlation coefficients of prediction ranged from 0.948 to 0.990; and, the mean relative errors of prediction varied from 3.0 to 6.7%. All of the developed models were validated. Copyright © 2018 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Goudarzi, Nasser
2016-04-01
In this work, two new and powerful chemometrics methods are applied for the modeling and prediction of the 19F chemical shift values of some fluorinated organic compounds. The radial basis function-partial least squares (RBF-PLS) and random forest (RF) methods are employed to construct the models to predict the 19F chemical shifts. In this study, no separate variable selection method was used, because the RF method can serve as both a variable selection and a modeling technique. The effects of the important parameters affecting the prediction power of RF, such as the number of trees (nt) and the number of randomly selected variables to split each node (m), were investigated. The root-mean-square errors of prediction (RMSEP) for the training set and the prediction set for the RBF-PLS and RF models were 44.70, 23.86, 29.77, and 23.69, respectively. Also, the correlation coefficients of the prediction set for the RBF-PLS and RF models were 0.8684 and 0.9313, respectively. The results obtained reveal that the RF model can be used as a powerful chemometrics tool for quantitative structure-property relationship (QSPR) studies.
Tao, Lingyan; Lin, Zhonglin; Chen, Jiashan; Wu, Yongjiang; Liu, Xuesong
2017-10-25
Gardeniae Fructus is widely used in the pharmaceutical industry, and many studies have confirmed its medical and economic value. In this study, samples collected from different liquid-liquid extraction batches of Gardeniae Fructus were detected by mid-infrared (MIR) and near-infrared (NIR) spectroscopy. Seven analytes, neochlorogenic acid (5-CQA), cryptochlorogenic acid (4-CQA), chlorogenic acid (3-CQA), geniposidic acid (GEA), deacetyl-asperulosidic acid methyl ester (DAAME), genipin-gentiobioside (GGB), and gardenoside (GA), were chosen as quality property indexes of Gardeniae Fructus. The two kinds of spectra were each used to build models by single partial least squares (PLS). Additionally, both spectral data were combined and modeled by multiblock PLS. For single spectroscopy modeling results, NIR had a better prediction for high-concentration analytes (3-CQA, DAAME, GGB, and GA) whereas MIR performed better for low-concentration analytes (5-CQA, 4-CQA, and GEA). The multiblock methodology was found to be better compared to single spectroscopy models for all seven analytes. Specifically, the coefficients of determination (R²) of the NIR, MIR, and multiblock PLS calibration models of all seven components were higher than 0.95. Relative standard errors of prediction (RSEP) were all less than 7%, except for models of GGB, which were 10.36%, 13.24%, and 8.15% for the NIR-PLS, MIR-PLS, and multiblock models, respectively. These results indicate that MIR and NIR spectrographic techniques could provide a new choice for quality control in industrial production of Gardeniae Fructus. Copyright © 2017 Elsevier B.V. All rights reserved.
Grisales, Jaiver Osorio; Arancibia, Juan A; Castells, Cecilia B; Olivieri, Alejandro C
2012-12-01
In this report, we demonstrate how chiral liquid chromatography combined with multivariate chemometric techniques, specifically unfolded-partial least-squares regression (U-PLS), provides a powerful analytical methodology. Using U-PLS, strongly overlapped enantiomer profiles in a sample could be successfully processed and enantiomeric purity could be accurately determined without requiring baseline enantioresolution between peaks. The samples were partially enantioseparated with a permethyl-β-cyclodextrin chiral column under reversed-phase conditions. Signals detected with a diode-array detector within a wavelength range from 198 to 241 nm were recorded, and the data were processed by a second-order multivariate algorithm to decrease detection limits. The R-(-)-enantiomer of ibuprofen in tablet formulation samples could be determined at the level of 0.5 mg L⁻¹ in the presence of 99.9% of the S-(+)-enantiomorph with relative prediction error within ±3%. Copyright © 2012 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Jiang, Junjun; Hu, Ruimin; Han, Zhen; Wang, Zhongyuan; Chen, Jun
2013-10-01
Face superresolution (SR), or face hallucination, refers to the technique of generating a high-resolution (HR) face image from a low-resolution (LR) one with the help of a set of training examples. It aims at transcending the limitations of electronic imaging systems. Applications of face SR include video surveillance, in which the individual of interest is often far from cameras. A two-step method is proposed to infer a high-quality and HR face image from a low-quality and LR observation. First, we establish the nonlinear relationship between LR face images and HR ones, according to radial basis function and partial least squares (RBF-PLS) regression, to transform the LR face into the global face space. Then, a locality-induced sparse representation (LiSR) approach is presented to enhance the local facial details once all the global faces for each LR training face are constructed. A comparison of some state-of-the-art SR methods shows the superiority of the proposed two-step approach, RBF-PLS global face regression followed by LiSR-based local patch reconstruction. Experiments also demonstrate the effectiveness under both simulation conditions and some real conditions.
Talpur, M Younis; Kara, Huseyin; Sherazi, S T H; Ayyildiz, H Filiz; Topkafa, Mustafa; Arslan, Fatma Nur; Naz, Saba; Durmaz, Fatih; Sirajuddin
2014-11-01
Single bounce attenuated total reflectance (SB-ATR) Fourier transform infrared (FTIR) spectroscopy in conjunction with chemometrics was used for accurate determination of free fatty acid (FFA), peroxide value (PV), iodine value (IV), conjugated diene (CD) and conjugated triene (CT) of cottonseed oil (CSO) during potato chips frying. Partial least squares (PLS), stepwise multiple linear regression (SMLR), principal component regression (PCR) and simple Beer's law (SBL) were applied to develop the calibrations for simultaneous evaluation of the five stated parameters of CSO during frying of French frozen potato chips at 170°C. Good regression coefficients (R²) were achieved for FFA, PV, IV, CD and CT, with values of >0.992 by PLS, SMLR, PCR, and SBL. The root mean square error of prediction (RMSEP) was found to be less than 1.95% for all determinations. The results of the study indicated that SB-ATR FTIR in combination with multivariate chemometrics could be used for accurate and simultaneous determination of different parameters during the frying process without using any toxic organic solvent. Copyright © 2014 Elsevier B.V. All rights reserved.
PLS Road surface temperature forecast for susceptibility of ice occurrence
NASA Astrophysics Data System (ADS)
Marchetti, Mario; Khalifa, Abderrhamen; Bues, Michel
2014-05-01
Winter maintenance relies on many operational tools that monitor atmospheric and pavement physical parameters. Among them, road weather information systems (RWIS) and thermal mapping are the ones most used by the services in charge of managing infrastructure networks. Data from RWIS and thermal mapping are used as inputs for physical numerical forecasting models, which have commonly been in place since the 1980s. These numerical models need an accurate description of the infrastructure, such as pavement layers and sub-layers, along with many meteorological parameters, such as air temperature and global and infrared radiation. The description is sometimes only partially known, and meteorological data is only monitored at specific spots. On the other hand, thermal mapping is now an easy, reliable and cost-effective way to monitor road surface temperature (RST) and many meteorological parameters all along the routes of infrastructure networks, including with a whole fleet of vehicles in the specific cases of roads or airports. The technique uses infrared thermometry to measure RST and atmospheric probes for air temperature, relative humidity, wind speed and global radiation, both at a high-resolution interval, to identify sections of the road network prone to ice occurrence. However, measurements are time-consuming, and the data from thermal mapping is one input among others to establish the forecast. The idea was to build a reliable forecast on the sole data from thermal mapping. Previous work has established the interest of using principal component analysis (PCA) on the basis of a reduced number of thermal fingerprints. The work presented here focuses on the use of partial least-squares regression (PLS) to build an RST forecast with air temperature measurements. Roads with various environments, weather conditions (mainly clear or cloudy) and seasons were monitored over several months to generate an appropriate number of samples.
The study was conducted to determine the minimum number of samples needed to get a reliable forecast, considering that inputs for numerical models do not exceed five thermal fingerprints. Results have shown that the PLS model could have an R² of 0.9562, an RMSEP of 1.34 and a bias of -0.66. The same model applied to establish a forecast on past events indicates an average difference between measurements and forecasts of 0.20 °C. The advantage of such an approach is its potential application not only to winter events, but also to extreme summer ones for urban heat islands.
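The three figures of merit quoted above (R², RMSEP and bias) can be computed from paired measured and forecast values as sketched below. This is an illustrative sketch, not the authors' code: the sign convention for bias (forecast minus measurement) and the use of the coefficient of determination for R² are assumptions, since the abstract does not define either.

```python
import math

def rmsep(measured, predicted):
    """Root mean square error of prediction."""
    n = len(measured)
    return math.sqrt(sum((p - m) ** 2 for m, p in zip(measured, predicted)) / n)

def bias(measured, predicted):
    """Mean signed error, assumed here to be predicted minus measured."""
    return sum(p - m for m, p in zip(measured, predicted)) / len(measured)

def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot
```

A negative bias under this convention would mean the forecast runs colder than the measurements on average, while RMSEP also captures the scatter around that offset.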
A preliminary MTD-PLS study for androgen receptor binding of steroid compounds
NASA Astrophysics Data System (ADS)
Bora, Alina; Seclaman, E.; Kurunczi, L.; Funar-Timofei, Simona
The relative binding affinities (RBA) of a series of 30 steroids for the Human Androgen Receptor (AR) were used to initiate an MTD-PLS study. The 3D structures of all the compounds were obtained through geometry optimization in the framework of the AM1 semiempirical quantum chemical method. The MTD hypermolecule (HM) was constructed by superposing these structures on the AR-bonded dihydrotestosterone (DHT) skeleton obtained from PDB (AR complex, ID 1I37). The parameters characterizing the HM vertices were collected using AM1 charges, XlogP fragmental values, calculated fragmental polarizabilities (from refractivities), volumes, and H-bond parameters (Raevsky's thermodynamic originated scale). The resulting QSAR data matrix was submitted to the PCA (Principal Component Analysis) and PLS (Projections in Latent Structures) procedures (SIMCA P 9.0); five compounds were selected as the test set, and the remaining 25 molecules were used as the training set. In the PLS procedure supplementary chemical information was introduced, i.e. the steric effect was always considered detrimental, and the hydrophobic and van der Waals interactions were imposed to be beneficial. The initial PLS model using the entire training set has the following characteristics: R2Y = 0.584, Q2 = 0.344. Based on the distance-to-model criteria (DMODX and DMODY), five compounds were eliminated and the final model obtained had the following characteristics: R2Y = 0.891, Q2 = 0.591. For this model, the external predictivity on the test set was unsatisfactory. A tentative explanation for this behavior is the weak information content of the input QSAR matrix for the present series compared with other successful MTD-PLS modeling published elsewhere.
Wang, Qi; He, Haijun; Li, Bing; Lin, Hancheng; Zhang, Yinming; Zhang, Ji
2017-01-01
Estimating the postmortem interval (PMI) is of great importance in forensic investigations. Although many methods are used to estimate the PMI, few investigations focus on postmortem redistribution. In this study, ultraviolet–visible (UV–Vis) measurement combined with visual inspection indicated a regular diffusion of hemoglobin into plasma after death, showing the redistribution of postmortem components in blood. Thereafter, attenuated total reflection–Fourier transform infrared (ATR–FTIR) spectroscopy was used to confirm the variations caused by this phenomenon. First, full-spectrum partial least-squares (PLS) and genetic algorithm combined with PLS (GA-PLS) models were constructed to predict the PMI. The performance of the GA-PLS model was better than that of the full-spectrum PLS model based on its root mean square error (RMSE) of cross-validation of 3.46 h (R2 = 0.95) and its RMSE of prediction of 3.46 h (R2 = 0.94). The investigation of the similarity of spectra between blood plasma and formed elements also supported the role of redistribution of components in spectral changes in postmortem plasma. These results demonstrated that ATR-FTIR spectroscopy coupled with advanced mathematical methods could serve as a convenient and reliable tool to study the redistribution of postmortem components and estimate the PMI. PMID:28753641
Masili, Alice; Puligheddu, Sonia; Sassu, Lorenzo; Scano, Paola; Lai, Adolfo
2012-11-01
In this work, we report a feasibility study on predicting the properties of neat crude oil samples from 300-MHz NMR spectral data and partial least squares (PLS) regression models. The study was carried out on 64 crude oil samples obtained from 28 different extraction fields and aims at developing a rapid, reliable and cost-effective method for characterizing crude oil. The main properties generally employed for evaluating crudes' quality and behavior during refining were measured and used for calibration and testing of the PLS models. Among these, the UOP characterization factor K (K(UOP)) used to classify crude oils in terms of composition, density (D), total acidity number (TAN), sulfur content (S), and true boiling point (TBP) distillation yields were investigated. Test set validation with an independent set of data was used to evaluate model performance on the basis of standard error of prediction (SEP) statistics. Model performances are particularly good for the K(UOP) factor, TAN, and TBP distillation yields, whose standard error of calibration and SEP values match the analytical method precision, while the results obtained for D and S are less accurate but still useful for predictions. Furthermore, a strategy that reduces spectral data preprocessing and sample preparation procedures has been adopted. The models developed with such an ample crude oil set demonstrate that this methodology can be applied with success to modern refining process requirements. Copyright © 2012 John Wiley & Sons, Ltd.
Zhang, Xue-Xi; Yin, Jian-Hua; Mao, Zhi-Hua; Xia, Yang
2015-01-01
Fourier transform infrared imaging (FTIRI) combined with chemometric algorithms has strong potential to obtain complex chemical information from biological tissues. FTIRI and partial least squares-discriminant analysis (PLS-DA) were used to differentiate healthy and osteoarthritic (OA) cartilages for the first time. A PLS model was built on the calibration matrix of spectra that was randomly selected from the FTIRI spectral datasets of healthy and lesioned cartilage. Leave-one-out cross-validation was performed in the PLS model, and the fitting coefficient between actual and predicted categorical values of the calibration matrix reached 0.95. In the calibration and prediction matrices, the percentages of healthy and lesioned cartilage spectra successfully identified were 100% and 90.24%, respectively. These results demonstrated that FTIRI combined with PLS-DA could provide a promising approach for the categorical identification of healthy and OA cartilage specimens. PMID:26057029
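To make the idea of PLS-DA concrete, the sketch below fits a single-latent-variable PLS1 model to a 0/1 class code and assigns a class by thresholding the predicted value. It is a toy illustration under stated assumptions: the number of components, the 0/1 coding, the 0.5 threshold and all names are hypothetical, and the abstract does not report these details.

```python
import math

def fit_pls1_one_component(X, y):
    """Fit a one-component PLS1 model on mean-centred data.

    Returns (x_mean, y_mean, w, q): the column means, the response mean,
    the unit-length weight vector w (proportional to X^T y) and the
    regression coefficient q of y on the score t = X w.
    """
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)
    return x_mean, y_mean, w, q

def predict_value(model, x):
    """Predicted class code for one spectrum x."""
    x_mean, y_mean, w, q = model
    score = sum((x[j] - x_mean[j]) * w[j] for j in range(len(w)))
    return y_mean + q * score

def classify(model, x, threshold=0.5):
    """Assign class 1 if the predicted code reaches the threshold, else class 0."""
    return 1 if predict_value(model, x) >= threshold else 0

# Hypothetical two-variable "spectra": class 0 and class 1.
toy_X = [[1.0, 0.2], [1.1, 0.1], [0.9, 0.3], [0.2, 1.0], [0.1, 1.1], [0.3, 0.9]]
toy_y = [0, 0, 0, 1, 1, 1]
toy_model = fit_pls1_one_component(toy_X, toy_y)
```

The percentage of correctly identified spectra is then simply the share of samples for which `classify` returns the known class.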
Scott Andersson, Asa; Tysklind, Mats; Fängmark, Ingrid
2007-08-17
The environment consists of a variety of different compartments and processes that act together in a complex system, which complicates the environmental risk assessment after a chemical accident. The Environment-Accident Index (EAI) is an example of a tool based on a strategy that joins the properties of a chemical with site-specific properties to facilitate this assessment and to be used in the planning process. In the development of the EAI it is necessary to make an unbiased judgement of the relevant variables to include in the formula and to estimate their relative importance. The development of EAI has so far included the assimilation of chemical accidents, the selection of a representative set of chemical accidents, and the development of response values (representing effects in the environment after a chemical accident) by means of an expert panel. The developed responses were then related to the chemical and site-specific properties, through a mathematical model based on multivariate modelling (PLS), to create an improved EAI model. This resulted in EAI(new), a PLS-based EAI model connected to a new classification scale. The advantages of EAI(new) compared to the old EAI (EAI(old)) are that it can be calculated without the use of tables, it can estimate the effects for all included responses, and it can make a rough classification of chemical accidents according to the new classification scale. Finally, EAI(new) is a more stable model than EAI(old), built on a valid base of accident scenarios, which makes it more reliable to use for a variety of chemicals and situations as it covers a broader spectrum of accident scenarios. EAI(new) can be expressed as a regression model to facilitate the calculation of the index for persons who do not have access to PLS. Future work could include an external validation of EAI(new), completion of the formula structure, adjustment of the classification scale, and a real-life evaluation of EAI(new).
Local classification: Locally weighted-partial least squares-discriminant analysis (LW-PLS-DA).
Bevilacqua, Marta; Marini, Federico
2014-08-01
The possibility of devising a simple, flexible and accurate non-linear classification method, by extending the locally weighted partial least squares (LW-PLS) approach to the cases where the algorithm is used in a discriminant way (partial least squares discriminant analysis, PLS-DA), is presented. In particular, to assess which category an unknown sample belongs to, the proposed algorithm operates by identifying which training objects are most similar to the one to be predicted and building a PLS-DA model using these calibration samples only. Moreover, the influence of the selected training samples on the local model can be further modulated by adopting a non-uniform distance-based weighting scheme which allows the farthest calibration objects to have less impact than the closest ones. The performance of the proposed locally weighted-partial least squares-discriminant analysis (LW-PLS-DA) algorithm has been tested on three simulated data sets characterized by a varying degree of non-linearity: in all cases, a classification accuracy higher than 99% on external validation samples was achieved. Moreover, when also applied to a real data set (classification of rice varieties), characterized by a high extent of non-linearity, the proposed method provided an average correct classification rate of about 93% on the test set. According to the preliminary results shown in this paper, the performance of the proposed LW-PLS-DA approach has proved to be comparable to, and in some cases better than, that obtained by other non-linear methods (k nearest neighbors, kernel-PLS-DA and, in the case of rice, counterpropagation neural networks). Copyright © 2014 Elsevier B.V. All rights reserved.
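The local step described above, picking the training objects most similar to the unknown sample and down-weighting the farther ones, can be sketched as follows. Only the neighbour selection and weighting are shown; fitting the local PLS-DA model is omitted. The Euclidean distance and the tricube weight function are assumptions made for illustration, as the abstract does not name the similarity measure or the weighting function, and the function names are hypothetical.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def local_weights(train_X, query, k):
    """Return [(index, weight), ...] for the k training objects nearest to query.

    Pairs are ordered from nearest to farthest. Weights follow an assumed
    tricube kernel scaled by the distance of the k-th neighbour, so they fall
    from 1 for a zero-distance object to 0 for the farthest selected object.
    """
    nearest = sorted((euclidean(x, query), i) for i, x in enumerate(train_X))[:k]
    d_max = nearest[-1][0]
    if d_max == 0.0:
        # All selected objects coincide with the query: weight them equally.
        return [(i, 1.0) for _, i in nearest]
    return [(i, (1.0 - (d / d_max) ** 3) ** 3) for d, i in nearest]
```

A local model would then be fitted on the selected objects with these weights, and refitted for every new sample to be classified.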
Fischer, Katharina E
2012-08-02
Decision-making in healthcare is complex. Research on coverage decision-making has focused on comparative studies for several countries, statistical analyses for single decision-makers, the decision outcome and appraisal criteria. Accounting for decision processes extends the complexity, as they are multidimensional and process elements need to be regarded as latent constructs (composites) that are not observed directly. The objective of this study was to present a practical application of partial least square path modelling (PLS-PM) to evaluate how it offers a method for empirical analysis of decision-making in healthcare. Empirical approaches that applied PLS-PM to decision-making in healthcare were identified through a systematic literature search. PLS-PM was used as an estimation technique for a structural equation model that specified hypotheses between the components of decision processes and the reasonableness of decision-making in terms of medical, economic and other ethical criteria. The model was estimated for a sample of 55 coverage decisions on the extension of newborn screening programmes in Europe. Results were evaluated by standard reliability and validity measures for PLS-PM. After modification by dropping two indicators that showed poor measures in the measurement models' quality assessment and were not meaningful for newborn screening, the structural equation model estimation produced plausible results. The presence of three influences was supported: the links between both stakeholder participation or transparency and the reasonableness of decision-making; and the effect of transparency on the degree of scientific rigour of assessment. Reliable and valid measurement models were obtained to describe the composites of 'transparency', 'participation', 'scientific rigour' and 'reasonableness'. The structural equation model was among the first applications of PLS-PM to coverage decision-making. 
It allowed testing of hypotheses in situations where there are links between several non-observable constructs. PLS-PM was compatible in accounting for the complexity of coverage decisions to obtain a more realistic perspective for empirical analysis. The model specification can be used for hypothesis testing by using larger sample sizes and for data in the full domain of health technologies.
Ouyang, Qin; Zhao, Jiewen; Pan, Wenxiu; Chen, Quansheng
2016-01-01
A portable and low-cost spectral analytical system was developed and used to monitor real-time process parameters, i.e. total sugar content (TSC), alcohol content (AC) and pH during rice wine fermentation. Various partial least square (PLS) algorithms were implemented to construct models. The performance of a model was evaluated by the correlation coefficient (Rp) and the root mean square error (RMSEP) in the prediction set. Among the models used, the synergy interval PLS (Si-PLS) was found to be superior. The optimal performance by the Si-PLS model for the TSC was Rp = 0.8694, RMSEP = 0.438; the AC was Rp = 0.8097, RMSEP = 0.617; and the pH was Rp = 0.9039, RMSEP = 0.0805. The stability and reliability of the system, as well as the optimal models, were verified using coefficients of variation, most of which were found to be less than 5%. The results suggest this portable system is a promising tool that could be used as an alternative method for rapid monitoring of process parameters during rice wine fermentation. Copyright © 2015 Elsevier Ltd. All rights reserved.
Tu, Yu-Kang; Davey Smith, George; Gilthorpe, Mark S.
2011-01-01
Due to a problem of identification, how to estimate the distinct effects of age, time period and cohort has been a controversial issue in the analysis of trends in health outcomes in epidemiology. In this study, we propose a novel approach, partial least squares (PLS) analysis, to separate the effects of age, period, and cohort. Our example for illustration is taken from the Glasgow Alumni cohort. A total of 15,322 students (11,755 men and 3,567 women) received medical screening at the Glasgow University between 1948 and 1968. The aim is to investigate the secular trends in blood pressure over 1925 and 1950 while taking into account the year of examination and age at examination. We excluded students born before 1925 or aged over 25 years at examination and those with missing values in confounders from the analyses, resulting in 12,546 and 12,516 students for analysis of systolic and diastolic blood pressure, respectively. PLS analysis shows that both systolic and diastolic blood pressure increased with students' age, and students born later had on average lower blood pressure (SBP: −0.17 mmHg/per year [95% confidence intervals: −0.19 to −0.15] for men and −0.25 [−0.28 to −0.22] for women; DBP: −0.14 [−0.15 to −0.13] for men; −0.09 [−0.11 to −0.07] for women). PLS also shows a decreasing trend in blood pressure over the examination period. As identification is not a problem for PLS, it provides a flexible modelling strategy for age-period-cohort analysis. More emphasis is then required to clarify the substantive and conceptual issues surrounding the definitions and interpretations of age, period and cohort effects. PMID:21556329
Yue, Peijian; Gao, Lin; Wang, Xuejing; Ding, Xuebing; Teng, Junfang
2018-06-01
The purpose of this study was to investigate ultrasound-triggered effects of the glial cell line-derived neurotrophic factor (GDNF) + nuclear receptor-related factor 1 (Nurr1)-polyethylene glycol (PEG)ylated liposomes-coupled microbubbles (PLs-GDNF + Nurr1-MBs) on behavioral impairment and neuron loss in a rat model of Parkinson's disease (PD). The unloaded PEGylated liposomes-coupled microbubbles (PLs-MBs) were characterized for zeta potential, particle size, and concentration. 6-hydroxydopamine (6-OHDA) was used to establish the PD rat model. Rotational, climbing pole, and suspension tests were used to detect behavioral impairment. Immunohistochemical staining of tyrosine hydroxylase (TH) and dopamine transporter (DAT) was used to assess neuron loss. Western blot and quantitative real-time PCR (qRT-PCR) analyses were used to measure the expression levels of GDNF and Nurr1. The particle size of PLs-MBs gradually increased, while the concentration and absolute zeta potential gradually decreased over time. 6-OHDA increased amphetamine-induced rotations and loss of dopaminergic neurons as compared to the sham group. Interestingly, PLs-GDNF-MBs or PLs-Nurr1-MBs decreased rotations and increased the TH and DAT immunoreactivity. The combination of both genes resulted in a robust reduction in rotations and a greater increase in dopaminergic neurons. The delivery of PLs-GDNF + Nurr1-MBs into the brain using magnetic resonance imaging (MRI)-guided focused ultrasound may be more efficacious for the treatment of PD than either single treatment. © 2017 Wiley Periodicals, Inc.
NASA Astrophysics Data System (ADS)
Belal, F.; Ibrahim, F.; Sheribah, Z. A.; Alaa, H.
2018-06-01
In this paper, novel univariate and multivariate regression methods along with a model-updating technique were developed and validated for the simultaneous determination of a quaternary mixture of imatinib (IMB), gemifloxacin (GMI), nalbuphine (NLP) and naproxen (NAP). The univariate method is the extended derivative ratio (EDR), which depends on measuring every drug in the quaternary mixture by using a ternary mixture of the other three drugs as divisor. Peak amplitudes were measured at 294 nm, 250 nm, 283 nm and 239 nm within linear concentration ranges of 4.0-17.0, 3.0-15.0, 4.0-80.0 and 1.0-6.0 μg mL⁻¹ for IMB, GMI, NLP and NAP, respectively. The multivariate methods adopted are partial least squares (PLS) in original and derivative mode. These models were constructed for simultaneous determination of the studied drugs in the ranges of 4.0-8.0, 3.0-11.0, 10.0-18.0 and 1.0-3.0 μg mL⁻¹ for IMB, GMI, NLP and NAP, respectively, by using eighteen mixtures as a calibration set and seven mixtures as a validation set. The root mean square errors of prediction (RMSEP) were 0.09 and 0.06 for IMB, 0.14 and 0.13 for GMI, 0.07 and 0.02 for NLP and 0.64 and 0.27 for NAP by PLS in original and derivative mode, respectively. Both models were successfully applied for analysis of IMB, GMI, NLP and NAP in their dosage forms. Updated PLS in derivative mode and EDR were applied for determination of the studied drugs in spiked human urine. The obtained results were statistically compared with those obtained by the reported methods, leading to the conclusion that there is no significant difference regarding accuracy and precision.
Belal, F; Ibrahim, F; Sheribah, Z A; Alaa, H
2018-06-05
In this paper, novel univariate and multivariate regression methods along with a model-updating technique were developed and validated for the simultaneous determination of a quaternary mixture of imatinib (IMB), gemifloxacin (GMI), nalbuphine (NLP) and naproxen (NAP). The univariate method is the extended derivative ratio (EDR), which depends on measuring every drug in the quaternary mixture by using a ternary mixture of the other three drugs as divisor. Peak amplitudes were measured at 294 nm, 250 nm, 283 nm and 239 nm within linear concentration ranges of 4.0-17.0, 3.0-15.0, 4.0-80.0 and 1.0-6.0 μg mL⁻¹ for IMB, GMI, NLP and NAP, respectively. The multivariate methods adopted are partial least squares (PLS) in original and derivative mode. These models were constructed for simultaneous determination of the studied drugs in the ranges of 4.0-8.0, 3.0-11.0, 10.0-18.0 and 1.0-3.0 μg mL⁻¹ for IMB, GMI, NLP and NAP, respectively, by using eighteen mixtures as a calibration set and seven mixtures as a validation set. The root mean square errors of prediction (RMSEP) were 0.09 and 0.06 for IMB, 0.14 and 0.13 for GMI, 0.07 and 0.02 for NLP and 0.64 and 0.27 for NAP by PLS in original and derivative mode, respectively. Both models were successfully applied for analysis of IMB, GMI, NLP and NAP in their dosage forms. Updated PLS in derivative mode and EDR were applied for determination of the studied drugs in spiked human urine. The obtained results were statistically compared with those obtained by the reported methods, leading to the conclusion that there is no significant difference regarding accuracy and precision. Copyright © 2018 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Tan, Chao; Chen, Hui; Wang, Chao; Zhu, Wanping; Wu, Tong; Diao, Yuanbo
2013-03-01
Near and mid-infrared (NIR/MIR) spectroscopy techniques have gained great acceptance in industry due to their multiple applications and versatility. However, successful application often depends heavily on the construction of accurate and stable calibration models. For this purpose, a simple multi-model fusion strategy is proposed. It combines the Kohonen self-organizing map (KSOM), mutual information (MI) and partial least squares (PLS) and is therefore named KMICPLS. It works as follows: First, the original training set is fed into a KSOM for unsupervised clustering of samples, on which a series of training subsets are constructed. Thereafter, on each of the training subsets, an MI spectrum is calculated and only the variables with MI values higher than the mean value are retained, based on which a candidate PLS model is constructed. Finally, a fixed number of PLS models are selected to produce a consensus model. Two NIR/MIR spectral datasets from the brewing industry are used for experiments. The results confirm its superior performance to two reference algorithms, i.e., the conventional PLS and genetic algorithm-PLS (GAPLS). It can build more accurate and stable calibration models without increasing the complexity, and can be generalized to other NIR/MIR applications.
Ali, Hina; Saleem, Muhammad; Anser, Muhammad Ramzan; Khan, Saranjam; Ullah, Rahat; Bilal, Muhammad
2018-01-01
Because of its high price and nutritional value, extra virgin olive oil (EVOO) is vulnerable to adulteration internationally. Refined oil or other vegetable oils are commonly blended with EVOO, and a quick and reliable technique needs to be developed and standardized to unmask such fraud. Therefore, in this study, pure EVOO was adulterated with an edible oil (sunflower oil) and analyzed using fluorescence spectroscopy (excitation wavelength at 350 nm) in conjunction with principal component analysis (PCA) and partial least squares (PLS) regression. The fluorescence spectra contain fingerprints of chlorophyll and carotenoids that are characteristic of EVOO and differentiate it from sunflower oil. A broad intense hump corresponding to conjugated hydroperoxides is seen in sunflower oil in the range of 441-489 nm with the maximum at 469 nm, whereas pure EVOO has low-intensity doublet peaks in this region at 441 nm and 469 nm. Visible changes in the spectra of adulterated EVOO are observed with increasing concentration of sunflower oil, with an increase in the doublet peak and a corresponding decrease in chlorophyll peak intensity. Principal component analysis showed a distinct clustering of adulterated samples of different concentrations. Subsequently, the PLS regression model was fitted over the complete data set and assessed by the coefficient of determination (R²), standard error of calibration (SEC), and standard error of prediction (SEP), with values of 0.99, 0.617, and 0.623, respectively. In addition to the adulterant, test samples and imported commercial brands of EVOO were also used for prediction and validation of the models. Fluorescence spectroscopy combined with chemometrics proved robust for identifying and quantifying the specified adulterant in pure EVOO.
2012-01-01
Background Decision-making in healthcare is complex. Research on coverage decision-making has focused on comparative studies for several countries, statistical analyses for single decision-makers, the decision outcome and appraisal criteria. Accounting for decision processes extends the complexity, as they are multidimensional and process elements need to be regarded as latent constructs (composites) that are not observed directly. The objective of this study was to present a practical application of partial least square path modelling (PLS-PM) to evaluate how it offers a method for empirical analysis of decision-making in healthcare. Methods Empirical approaches that applied PLS-PM to decision-making in healthcare were identified through a systematic literature search. PLS-PM was used as an estimation technique for a structural equation model that specified hypotheses between the components of decision processes and the reasonableness of decision-making in terms of medical, economic and other ethical criteria. The model was estimated for a sample of 55 coverage decisions on the extension of newborn screening programmes in Europe. Results were evaluated by standard reliability and validity measures for PLS-PM. Results After modification by dropping two indicators that showed poor measures in the measurement models’ quality assessment and were not meaningful for newborn screening, the structural equation model estimation produced plausible results. The presence of three influences was supported: the links between both stakeholder participation or transparency and the reasonableness of decision-making; and the effect of transparency on the degree of scientific rigour of assessment. Reliable and valid measurement models were obtained to describe the composites of ‘transparency’, ‘participation’, ‘scientific rigour’ and ‘reasonableness’. Conclusions The structural equation model was among the first applications of PLS-PM to coverage decision-making. 
It allowed testing of hypotheses in situations where there are links between several non-observable constructs. PLS-PM was compatible in accounting for the complexity of coverage decisions to obtain a more realistic perspective for empirical analysis. The model specification can be used for hypothesis testing by using larger sample sizes and for data in the full domain of health technologies. PMID:22856325
Belay, T K; Dagnachew, B S; Boison, S A; Ådnøy, T
2018-03-28
Milk infrared spectra are routinely used for phenotyping traits of interest through links developed between the traits and spectra. Predicted individual traits are then used in genetic analyses for estimated breeding value (EBV) or for phenotypic predictions using a single-trait mixed model; this approach is referred to as indirect prediction (IP). An alternative approach [direct prediction (DP)] is a direct genetic analysis of (a reduced dimension of) the spectra using a multitrait model to predict multivariate EBV of the spectral components and, ultimately, also to predict the univariate EBV or phenotype for the traits of interest. We simulated 3 traits under different genetic (low: 0.10 to high: 0.90) and residual (zero to high: ±0.90) correlation scenarios between the 3 traits and assumed the first trait is a linear combination of the other 2 traits. The aim was to compare the IP and DP approaches for predictions of EBV and phenotypes under the different correlation scenarios. We also evaluated relationships between performances of the 2 approaches and the accuracy of calibration equations. Moreover, the effect of using different regression coefficients estimated from simulated phenotypes (β_p), true breeding values (β_g), and residuals (β_r) on performance of the 2 approaches were evaluated. The simulated data contained 2,100 parents (100 sires and 2,000 cows) and 8,000 offspring (4 offspring per cow). Of the 8,000 observations, 2,000 were randomly selected and used to develop links between the first and the other 2 traits using partial least square (PLS) regression analysis. The different PLS regression coefficients, such as β_p, β_g, and β_r, were used in subsequent predictions following the IP and DP approaches. We used BLUP analyses for the remaining 6,000 observations using the true (co)variance components that had been used for the simulation. 
Accuracy of prediction (of EBV and phenotype) was calculated as a correlation between predicted and true values from the simulations. The results showed that accuracies of EBV prediction were higher in the DP than in the IP approach. The reverse was true for accuracy of phenotypic prediction when using β_p but not when using β_g and β_r, where accuracy of phenotypic prediction in the DP was slightly higher than in the IP approach. Within the DP approach, accuracies of EBV when using β_g were higher than when using β_p only at the low genetic correlation scenario. However, we found no differences in EBV prediction accuracy between the β_p and β_g in the IP approach. Accuracy of the calibration models increased with an increase in genetic and residual correlations between the traits. Performance of both approaches increased with an increase in accuracy of the calibration models. In conclusion, the DP approach is a good strategy for EBV prediction but not for phenotypic prediction, where the classical PLS regression-based equations or the IP approach provided better results. The Authors. Published by FASS Inc. and Elsevier Inc. on behalf of the American Dairy Science Association®. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/3.0/).
Kumar, Keshav
2018-03-01
Excitation-emission matrix fluorescence (EEMF) and total synchronous fluorescence spectroscopy (TSFS) are the 2 fluorescence techniques that are commonly used for the analysis of multifluorophoric mixtures. These 2 fluorescence techniques are conceptually different and provide certain advantages over each other. Manual analysis of such large volumes of highly correlated EEMF and TSFS data towards developing a calibration model is difficult. Partial least square (PLS) analysis can handle large EEMF and TSFS data sets by finding important factors that maximize the correlation between the spectral and concentration information for each fluorophore. However, applying PLS analysis to entire data sets often does not provide a robust calibration model and requires a suitable pre-processing step. The present work evaluates the application of genetic algorithm (GA) analysis prior to PLS analysis on EEMF and TSFS data sets towards improving the precision and accuracy of the calibration model. The GA essentially combines the advantages provided by stochastic methods with those provided by deterministic approaches and can find the set of EEMF and TSFS variables that correlate well with the concentration of each of the fluorophores present in the multifluorophoric mixtures. The utility of the GA-assisted PLS analysis is successfully validated using (i) EEMF data sets acquired for dilute aqueous mixtures of four biomolecules and (ii) TSFS data sets acquired for dilute aqueous mixtures of four carcinogenic polycyclic aromatic hydrocarbons (PAHs). In the present work, it is shown that by using the GA it is possible to significantly improve the accuracy and precision of the PLS calibration model developed for both EEMF and TSFS data sets. Hence, GA must be considered as a useful pre-processing technique while developing an EEMF and TSFS calibration model.
Analyses of direct and indirect impacts of a positive list system on pharmaceutical R&D investments.
Han, Euna; Kim, Tae Hyun; Jeung, Myung Jin; Lee, Eui-Kyung
2013-07-01
The South Korean government recently enacted a Positive List System (PLS) as a major change of the national formulary listing system and reimbursed prices for pharmaceutical products. Regardless of the primary goal of the PLS, its implementation might have spillover effects by influencing the pharmaceutical industry's research and development (R&D), potentially leading to a variety of responses by firms in relation to their R&D activities. We investigated the spillover effect of the PLS on R&D investments of the pharmaceutical industry in Korea through both direct and indirect channels, examining the influence of the PLS on sales profit and cash flow. Data from 9 years (5 before and 4 after PLS implementation) were drawn from the financial statements of firms whose stocks were exchanged in 2 official stock markets in Korea (526 firms) and additional pharmaceutical firms whose financial performance was officially audited by external reviewers (263 firms). Longitudinal analyses were conducted, using the panel nature of the data to control for permanent unobserved firm heterogeneity. Our results showed that the PLS was directly associated with R&D investments. In contrast, its indirect impacts stemming from the influence on sales profit and cash flow were minimal and statistically nonsignificant. The gross impact of the PLS on R&D investments increased moving further from the enactment year; R&D investments were reduced by 18.3% to 25.8% in 2009-2010 (compared with before PLS implementation) in the firm fixed-effects model. We also found that such negative direct and gross impacts of the PLS on R&D investments were significant only in firms without newly developed chemical entities. Considering the gross negative impact of the PLS on R&D investments of pharmaceutical firms and the heterogeneous response of these firms by the R&D activities, governmental efforts of cost-containment may need to consider the spillover impact of the PLS on pharmaceutical innovation. 
Copyright © 2013 Elsevier HS Journals, Inc. All rights reserved.
Tøndel, Kristin; Indahl, Ulf G; Gjuvsland, Arne B; Vik, Jon Olav; Hunter, Peter; Omholt, Stig W; Martens, Harald
2011-06-01
Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback loops. 
HC-PLSR is a promising approach for metamodelling in systems biology, especially for highly nonlinear or non-monotone parameter to phenotype maps. The algorithm can be flexibly adjusted to suit the complexity of the dynamic model behaviour, inviting automation in the metamodelling of complex systems.
2011-01-01
Background Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Results Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback loops. 
Conclusions HC-PLSR is a promising approach for metamodelling in systems biology, especially for highly nonlinear or non-monotone parameter to phenotype maps. The algorithm can be flexibly adjusted to suit the complexity of the dynamic model behaviour, inviting automation in the metamodelling of complex systems. PMID:21627852
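The fuzzy C-means step named in the abstract can be illustrated with the standard membership formula, u_i = 1 / Σ_j (d_i / d_j)^(2/(m-1)), where d_i is the distance from a point to cluster centre i and m > 1 is the fuzzifier. The sketch below is only an illustration under assumptions: the function names, the example centres and the idea of weighting local model predictions by membership are hypothetical and are not taken from the HC-PLSR paper.

```python
import math


def fcm_memberships(point, centers, m=2.0):
    """Fuzzy C-means membership of one point in each cluster (fuzzifier m > 1)."""
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on a centre belongs fully to that cluster.
    if any(d == 0.0 for d in dists):
        hits = [1.0 if d == 0.0 else 0.0 for d in dists]
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_j) ** power for d_j in dists) for d_i in dists]


def membership_weighted(local_predictions, memberships):
    """Assumed combination rule: membership-weighted mean of local model outputs."""
    return sum(p * u for p, u in zip(local_predictions, memberships))


# Hypothetical example: one input point and two cluster centres.
u = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (4.0, 0.0)])
print(u)
print(membership_weighted([10.0, 20.0], u))
```

Memberships always sum to one, so a point near a cluster boundary draws on several local models at once, while a point close to one centre is dominated by that cluster's model.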
Improved accuracy in quantitative laser-induced breakdown spectroscopy using sub-models
DOE Office of Scientific and Technical Information (OSTI.GOV)
Anderson, Ryan B.; Clegg, Samuel M.; Frydenvang, Jens
We report that accurate quantitative analysis of diverse geologic materials is one of the primary challenges faced by the Laser-Induced Breakdown Spectroscopy (LIBS)-based ChemCam instrument on the Mars Science Laboratory (MSL) rover. The SuperCam instrument on the Mars 2020 rover, as well as other LIBS instruments developed for geochemical analysis on Earth or other planets, will face the same challenge. Consequently, part of the ChemCam science team has focused on the development of improved multivariate analysis calibration methods. Developing a single regression model capable of accurately determining the composition of very different target materials is difficult because the response of an element’s emission lines in LIBS spectra can vary with the concentration of other elements. We demonstrate a conceptually simple “sub-model” method for improving the accuracy of quantitative LIBS analysis of diverse target materials. The method is based on training several regression models on sets of targets with limited composition ranges and then “blending” these “sub-models” into a single final result. Tests of the sub-model method show improvement in test set root mean squared error of prediction (RMSEP) for almost all cases. Lastly, the sub-model method, using partial least squares regression (PLS), is being used as part of the current ChemCam quantitative calibration, but the sub-model method is applicable to any multivariate regression method and may yield similar improvements.
Improved accuracy in quantitative laser-induced breakdown spectroscopy using sub-models
Anderson, Ryan B.; Clegg, Samuel M.; Frydenvang, Jens; ...
2016-12-15
We report that accurate quantitative analysis of diverse geologic materials is one of the primary challenges faced by the Laser-Induced Breakdown Spectroscopy (LIBS)-based ChemCam instrument on the Mars Science Laboratory (MSL) rover. The SuperCam instrument on the Mars 2020 rover, as well as other LIBS instruments developed for geochemical analysis on Earth or other planets, will face the same challenge. Consequently, part of the ChemCam science team has focused on the development of improved multivariate analysis calibration methods. Developing a single regression model capable of accurately determining the composition of very different target materials is difficult because the response of an element’s emission lines in LIBS spectra can vary with the concentration of other elements. We demonstrate a conceptually simple “sub-model” method for improving the accuracy of quantitative LIBS analysis of diverse target materials. The method is based on training several regression models on sets of targets with limited composition ranges and then “blending” these “sub-models” into a single final result. Tests of the sub-model method show improvement in test set root mean squared error of prediction (RMSEP) for almost all cases. Lastly, the sub-model method, using partial least squares regression (PLS), is being used as part of the current ChemCam quantitative calibration, but the sub-model method is applicable to any multivariate regression method and may yield similar improvements.
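The "blending" idea can be sketched as follows. The abstract does not give the blending rule, so everything here is an assumption for illustration: a full-range model's prediction picks which sub-model to trust, and inside a hypothetical overlap window (`low_max` to `high_min`) the two sub-model predictions are mixed linearly. The thresholds and function name are invented, not the ChemCam calibration values.

```python
def blend_submodels(full_pred, low_pred, high_pred, low_max=30.0, high_min=50.0):
    """Blend two sub-model predictions, guided by a full-range prediction.

    full_pred: prediction from a model trained on the whole composition range
    low_pred:  prediction from the sub-model trained on low concentrations
    high_pred: prediction from the sub-model trained on high concentrations
    low_max, high_min: hypothetical edges of the overlap window
    """
    if full_pred <= low_max:
        return low_pred
    if full_pred >= high_min:
        return high_pred
    # Inside the overlap the weight ramps linearly from 0 to 1.
    w = (full_pred - low_max) / (high_min - low_max)
    return (1.0 - w) * low_pred + w * high_pred


print(blend_submodels(20.0, 18.5, 25.0))  # low range: low sub-model only
print(blend_submodels(40.0, 38.0, 42.0))  # overlap: equal mix of both
print(blend_submodels(70.0, 55.0, 68.0))  # high range: high sub-model only
```

A linear ramp avoids a discontinuity in the final prediction at the boundary between sub-models; other weighting schemes would serve the same purpose.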
USDA-ARS's Scientific Manuscript database
Hyperspectral scattering is a promising technique for rapid and noninvasive measurement of multiple quality attributes of apple fruit. A hierarchical evolutionary algorithm (HEA) approach, in combination with subspace decomposition and partial least squares (PLS) regression, was proposed to select o...
Experiences of stigma and discrimination of people with schizophrenia in India
Koschorke, Mirja; Padmavati, R.; Kumar, Shuba; Cohen, Alex; Weiss, Helen A.; Chatterjee, Sudipto; Pereira, Jesina; Naik, Smita; John, Sujit; Dabholkar, Hamid; Balaji, Madhumitha; Chavan, Animish; Varghese, Mathew; Thara, R.; Thornicroft, Graham; Patel, Vikram
2014-01-01
Stigma contributes greatly to the burden of schizophrenia and is a major obstacle to recovery, yet, little is known about the subjective experiences of those directly affected in low and middle income countries. This paper aims to describe the experiences of stigma and discrimination of people living with schizophrenia (PLS) in three sites in India and to identify factors influencing negative discrimination. The study used mixed methods and was nested in a randomised controlled trial of community care for schizophrenia. Between November 2009 and October 2010, data on four aspects of stigma experienced by PLS and several clinical variables were collected from 282 PLS and 282 caregivers and analysed using multivariate regression. In addition, in-depth-interviews with PLS and caregivers (36 each) were carried out and analysed using thematic analysis. Quantitative findings indicate that experiences of negative discrimination were reported less commonly (42%) than more internalised forms of stigma experience such as a sense of alienation (79%) and significantly less often than in studies carried out elsewhere. Experiences of negative discrimination were independently predicted by higher levels of positive symptoms of schizophrenia, lower levels of negative symptoms of schizophrenia, higher caregiver knowledge about symptomatology, lower PLS age and not having a source of drinking water in the home. Qualitative findings illustrate the major impact of stigma on ‘what matters most’ in the lives of PLS and highlight three key domains influencing the themes of 'negative reactions' and ‘negative views and feelings about the self’, i.e., ‘others finding out’, ‘behaviours and manifestations of the illness’ and ‘reduced ability to meet role expectations’. Findings have implications for conceptualising and measuring stigma and add to the rationale for enhancing psycho-social interventions to support those facing discrimination. 
Findings also highlight the importance of addressing public stigma and achieving higher level social and political structural change. PMID:25462616
Urban pavement surface temperature. Comparison of numerical and statistical approach
NASA Astrophysics Data System (ADS)
Marchetti, Mario; Khalifa, Abderrahmen; Bues, Michel; Bouilloud, Ludovic; Martin, Eric; Chancibaut, Katia
2015-04-01
The forecast of pavement surface temperature is very specific in the context of urban winter maintenance, where it is used to manage snow plowing and salting of roads. Such forecasts mainly rely on numerical models based on a description of the energy balance between the atmosphere, the buildings and the pavement, with a canyon configuration. Nevertheless, there is a specific need in the physical description and the numerical implementation of the traffic in the energy flux balance. This traffic was originally considered as a constant. Many changes were made in a numerical model to describe as accurately as possible the traffic effects on this urban energy balance, such as tire friction, the pavement-air exchange coefficient, and the infrared flux net balance. Some experiments based on infrared thermography and radiometry were then conducted to quantify the effect of traffic on the urban pavement surface. Based on meteorological data, corresponding pavement temperature forecasts were calculated and compared with field measurements. Results indicated good agreement between the measurements and the forecast from the numerical model based on this energy balance approach. A complementary forecast approach based on principal component analysis (PCA) and partial least-square regression (PLS) was also developed, with data from thermal mapping using infrared radiometry. The forecast of pavement surface temperature with air temperature was obtained in the specific case of urban configuration, with traffic included in the measurements used for the statistical analysis. A comparison between results from the numerical model based on energy balance and PCA/PLS was then conducted, indicating the advantages and limits of each approach.
Marques Junior, Jucelino Medeiros; Muller, Aline Lima Hermes; Foletto, Edson Luiz; da Costa, Adilson Ben; Bizzi, Cezar Augusto; Irineu Muller, Edson
2015-01-01
A method for determination of propranolol hydrochloride in pharmaceutical preparation using near infrared spectrometry with fiber optic probe (FTNIR/PROBE) and combined with chemometric methods was developed. Calibration models were developed using two variable selection models: interval partial least squares (iPLS) and synergy interval partial least squares (siPLS). The treatments based on the mean centered data and multiplicative scatter correction (MSC) were selected for models construction. A root mean square error of prediction (RMSEP) of 8.2 mg g(-1) was achieved using siPLS (s2i20PLS) algorithm with spectra divided into 20 intervals and combination of 2 intervals (8501 to 8801 and 5201 to 5501 cm(-1)). Results obtained by the proposed method were compared with those using the pharmacopoeia reference method and significant difference was not observed. Therefore, proposed method allowed a fast, precise, and accurate determination of propranolol hydrochloride in pharmaceutical preparations. Furthermore, it is possible to carry out on-line analysis of this active principle in pharmaceutical formulations with use of fiber optic probe.
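The search space behind an siPLS setting such as "20 intervals, combinations of 2" can be sketched as below. This only enumerates the candidate interval pairs; fitting a PLS model on each candidate and ranking by cross-validation error is not shown. The function names and the 1000-point spectrum length are hypothetical and not taken from the paper.

```python
from itertools import combinations


def split_into_intervals(n_points, n_intervals):
    """Return (start, stop) index pairs covering range(n_points) as evenly as possible."""
    base, extra = divmod(n_points, n_intervals)
    bounds = []
    start = 0
    for i in range(n_intervals):
        # The first `extra` intervals take one additional point each.
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def interval_combinations(n_intervals, k):
    """All ways of choosing k intervals out of n_intervals, as index tuples."""
    return list(combinations(range(n_intervals), k))


# Hypothetical spectrum of 1000 wavenumber points split into 20 intervals.
intervals = split_into_intervals(n_points=1000, n_intervals=20)
candidates = interval_combinations(20, 2)
print(len(intervals), len(candidates))  # 20 190
```

With 20 intervals there are 190 two-interval candidates, which is why siPLS remains tractable as an exhaustive search where selecting individual wavenumbers would not be.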
Bleiziffer, Isabelle; Eikmeier, Julian; Pohlentz, Gottfried; McAulay, Kathryn; Xia, Guoqing; Hussain, Muzaffar; Peschel, Andreas; Foster, Simon; Peters, Georg; Heilmann, Christine
2017-01-01
Most bacterial glycoproteins identified to date are virulence factors of pathogenic bacteria, i.e. adhesins and invasins. However, the impact of protein glycosylation on the major human pathogen Staphylococcus aureus remains incompletely understood. To study protein glycosylation in staphylococci, we analyzed lysostaphin lysates of methicillin-resistant Staphylococcus aureus (MRSA) strains by SDS-PAGE and subsequent periodic acid-Schiff's staining. We detected four (>300, ∼250, ∼165, and ∼120 kDa) and two (>300 and ∼175 kDa) glycosylated surface proteins with strain COL and strain 1061, respectively. The ∼250, ∼165, and ∼175 kDa proteins were identified as plasmin-sensitive protein (Pls) by mass spectrometry. Previously, Pls has been demonstrated to be a virulence factor in a mouse septic arthritis model. The pls gene is encoded by the staphylococcal cassette chromosome (SCC)mec type I in MRSA that also encodes the methicillin resistance-conferring mecA and further genes. In a search for glycosyltransferases, we identified two open reading frames encoded downstream of pls on the SCCmec element, which we termed gtfC and gtfD. Expression and deletion analysis revealed that both gtfC and gtfD mediate glycosylation of Pls. Additionally, the recently reported glycosyltransferases SdgA and SdgB are involved in Pls glycosylation. Glycosylation occurs at serine residues in the Pls SD-repeat region and modifying carbohydrates are N-acetylhexosaminyl residues. Functional characterization revealed that Pls can confer increased biofilm formation, which seems to involve two distinct mechanisms. The first mechanism depends on glycosylation of the SD-repeat region by GtfC/GtfD and probably also involves eDNA, while the second seems to be independent of glycosylation as well as eDNA and may involve the centrally located G5 domains. Other previously known Pls properties are not related to the sugar modifications. 
In conclusion, Pls is a glycoprotein and Pls glycosyl residues can stimulate biofilm formation. Thus, sugar modifications may represent promising new targets for novel therapeutic or prophylactic measures against life-threatening S. aureus infections.
Esteki, M; Nouroozi, S; Shahsavari, Z
2016-02-01
To develop a simple and efficient spectrophotometric technique combined with chemometrics for the simultaneous determination of methyl paraben (MP) and hydroquinone (HQ) in cosmetic products, and specifically, to: (i) evaluate the potential use of successive projections algorithm (SPA) to derivative spectrophotometric data in order to provide sufficient accuracy and model robustness and (ii) determine MP and HQ concentration in cosmetics without tedious pre-treatments such as derivatization or extraction techniques which are time-consuming and require hazardous solvents. The absorption spectra were measured in the wavelength range of 200-350 nm. Prior to performing chemometric models, the original and first-derivative absorption spectra of binary mixtures were used as calibration matrices. Variable selected by successive projections algorithm was used to obtain multiple linear regression (MLR) models based on a small subset of wavelengths. The number of wavelengths and the starting vector were optimized, and the comparison of the root mean square error of calibration (RMSEC) and cross-validation (RMSECV) was applied to select effective wavelengths with the least collinearity and redundancy. Principal component regression (PCR) and partial least squares (PLS) were also developed for comparison. The concentrations of the calibration matrix ranged from 0.1 to 20 μg mL(-1) for MP, and from 0.1 to 25 μg mL(-1) for HQ. The constructed models were tested on an external validation data set and finally cosmetic samples. The results indicated that successive projections algorithm-multiple linear regression (SPA-MLR), applied on the first-derivative spectra, achieved the optimal performance for two compounds when compared with the full-spectrum PCR and PLS. The root mean square error of prediction (RMSEP) was 0.083, 0.314 for MP and HQ, respectively. 
To verify the accuracy of the proposed method, a recovery study on real cosmetic samples was carried out with satisfactory results (84-112%). The proposed method, which is an environmentally friendly approach, using minimum amount of solvent, is a simple, fast and low-cost analysis method that can provide high accuracy and robust models. The suggested method does not need any complex extraction procedure which is time-consuming and requires hazardous solvents. © 2015 Society of Cosmetic Scientists and the Société Française de Cosmétologie.
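The wavelength-selection step named in this abstract can be illustrated with a minimal sketch of the successive projections algorithm in its commonly described form: starting from one column, every remaining column is projected onto the subspace orthogonal to the last selected column, and the column with the largest remaining norm is chosen next. The function name `spa_select`, the toy matrix and the absence of mean-centering are assumptions made for illustration; the abstract does not give the authors' implementation or settings.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def spa_select(X, start, n_select):
    """Pick n_select column indices of X (list of rows) with low collinearity.

    Hypothetical helper: a plain successive-projections loop, not the
    implementation used in the cited study.
    """
    n_vars = len(X[0])
    if not 1 <= n_select <= n_vars:
        raise ValueError("n_select must be between 1 and the number of columns")
    # Work on column copies so the caller's data is left untouched.
    cols = [[row[j] for row in X] for j in range(n_vars)]
    selected = [start]
    while len(selected) < n_select:
        ref = cols[selected[-1]]
        ref_norm2 = dot(ref, ref)
        if ref_norm2 == 0.0:
            raise ValueError("reference column has zero norm; request fewer variables")
        best, best_norm2 = None, -1.0
        for j in range(n_vars):
            if j in selected:
                continue
            # Remove the component of column j that lies along the reference column.
            coef = dot(cols[j], ref) / ref_norm2
            cols[j] = [v - coef * r for v, r in zip(cols[j], ref)]
            norm2 = dot(cols[j], cols[j])
            if norm2 > best_norm2:
                best, best_norm2 = j, norm2
        selected.append(best)
    return selected


# Toy data (made up): column 1 is an exact multiple of column 0,
# so it carries no new information once column 0 is selected.
X = [[1.0, 2.0, 0.0, 0.5],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 0.1]]
chosen = spa_select(X, start=0, n_select=3)
```

In practice the starting column and the number of selected wavelengths are tuned, as the abstract notes, by comparing calibration and cross-validation errors of the resulting MLR models.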
Kuriakose, Saji; Joe, I Hubert
2013-11-01
Determination of the authenticity of essential oils has become more significant, in recent years, following some illegal adulteration and contamination scandals. The present investigative study focuses on the application of near infrared spectroscopy to detect sample authenticity and quantify economic adulteration of sandalwood oils. Several data pre-treatments are investigated for calibration and prediction using partial least squares regression (PLSR). The quantitative data analysis is done using a new spectral approach - full spectrum or sequential spectrum. The optimum number of PLS components is obtained according to the lowest root mean square error of calibration (RMSEC = 0.00009% v/v). The lowest root mean square error of prediction (RMSEP = 0.00016% v/v) in the test set and the highest coefficient of determination (R² = 0.99989) are used as the evaluation tools for the best model. A nonlinear method, locally weighted regression (LWR), is added to extract nonlinear information and to compare with the linear PLSR model. Copyright © 2013 Elsevier B.V. All rights reserved.
Eliseyev, Andrey; Aksenova, Tetiana
2016-01-01
In the current paper the decoding algorithms for motor-related BCI systems for continuous upper limb trajectory prediction are considered. Two methods for the smooth prediction, namely Sobolev and Polynomial Penalized Multi-Way Partial Least Squares (PLS) regressions, are proposed. The methods are compared to the Multi-Way Partial Least Squares and Kalman Filter approaches. The comparison demonstrated that the proposed methods combined the prediction accuracy of the algorithms of the PLS family and trajectory smoothness of the Kalman Filter. In addition, the prediction delay is significantly lower for the proposed algorithms than for the Kalman Filter approach. The proposed methods could be applied in a wide range of applications beyond neuroscience. PMID:27196417
Tan, Chao; Chen, Hui; Wang, Chao; Zhu, Wanping; Wu, Tong; Diao, Yuanbo
2013-03-15
Near and mid-infrared (NIR/MIR) spectroscopy techniques have gained great acceptance in industry due to their multiple applications and versatility. However, successful application often depends heavily on the construction of accurate and stable calibration models. For this purpose, a simple multi-model fusion strategy is proposed. It combines the Kohonen self-organizing map (KSOM), mutual information (MI) and partial least squares (PLS), and is therefore named KMICPLS. It works as follows: First, the original training set is fed into a KSOM for unsupervised clustering of samples, on which a series of training subsets are constructed. Thereafter, on each of the training subsets, a MI spectrum is calculated and only the variables with higher MI values than the mean value are retained, based on which a candidate PLS model is constructed. Finally, a fixed number of PLS models are selected to produce a consensus model. Two NIR/MIR spectral datasets from the brewing industry are used for experiments. The results confirm its superior performance over two reference algorithms, i.e., the conventional PLS and genetic algorithm-PLS (GAPLS). It can build more accurate and stable calibration models without increasing the complexity, and can be generalized to other NIR/MIR applications. Copyright © 2012 Elsevier B.V. All rights reserved.
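The variable-retention rule described in this abstract (keep only variables whose MI with the response exceeds the mean MI) can be sketched as follows. The histogram-based MI estimator, the bin count of 4, and the names `discretize`, `mutual_information` and `select_above_mean_mi` are assumptions for illustration only; the abstract does not state how MI was estimated, and the KSOM clustering and PLS fitting steps are not shown.

```python
import math
from collections import Counter


def discretize(values, bins=4):
    """Map continuous values to equal-width bin indices 0..bins-1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)  # a constant variable falls into a single bin
    width = (hi - lo) / bins
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y, bins=4):
    """Histogram estimate of MI (in nats) between two variables."""
    xb, yb = discretize(x, bins), discretize(y, bins)
    n = len(xb)
    px, py, pxy = Counter(xb), Counter(yb), Counter(zip(xb, yb))
    return sum((c / n) * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())


def select_above_mean_mi(spectra, y, bins=4):
    """Return indices of variables whose MI with y exceeds the mean MI, plus all scores."""
    n_vars = len(spectra[0])
    scores = [mutual_information([row[j] for row in spectra], y, bins)
              for j in range(n_vars)]
    mean_score = sum(scores) / n_vars
    kept = [j for j, s in enumerate(scores) if s > mean_score]
    return kept, scores


# Toy training subset (made up): variable 0 tracks y, variable 1 is constant,
# variable 2 alternates independently of y.
y = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
spectra = [[2.0 * v, 5.0, float(i % 2)] for i, v in enumerate(y)]
kept, scores = select_above_mean_mi(spectra, y)
```

In the strategy the abstract outlines, a candidate PLS model would then be fitted on the retained variables of each training subset, and a fixed number of such models combined into the consensus prediction.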
Liu, Xiu-ying; Wang, Li; Chang, Qing-rui; Wang, Xiao-xing; Shang, Yan
2015-07-01
Wuqi County of Shaanxi Province, where the vegetation recovering measures have been carried out for years, was taken as the study area. A total of 100 loess samples from 24 different profiles were collected. Total nitrogen (TN) and alkali hydrolysable nitrogen (AHN) contents of the soil samples were analyzed, and the soil samples were scanned in the visible/near-infrared (VNIR) region of 350-2500 nm in the laboratory. The calibration models were developed between TN and AHN contents and VNIR values based on correlation analysis (CA) and partial least squares regression (PLS). Independent samples validated the calibration models. The results indicated that the optimum model for predicting TN of loess was established by using first derivative of reflectance. The best model for predicting AHN of loess was established by using normal derivative spectra. The optimum TN model could effectively predict TN in loess from 0 to 40 cm, but the optimum AHN model could only roughly predict AHN at the same depth. This study provided a good method for rapidly predicting TN of loess where vegetation recovering measures have been adopted, but prediction of AHN needs to be further studied.
Wu, Yan-Wen; Sun, Su-Qin; Zhou, Qun; Leung, Hei-Wun
2008-02-13
Honghua Oil (HHO), a traditional Chinese medicine (TCM) oil preparation, is a mixture of several plant essential oils. In this work, the extended ranges of Fourier transform mid-infrared (FT-MIR) and near infrared (FT-NIR) were recorded for 48 commercially available HHOs of different batches from nine manufacturers. The qualitative and quantitative analysis of three marker components, alpha-pinene, methyl salicylate and eugenol, in different HHO products was performed rapidly by the two vibrational spectroscopic methods, i.e. MIR with a horizontal attenuated total reflection (HATR) accessory and NIR with a direct sampling technique, followed by partial least squares (PLS) regression treatment of the set of spectra obtained. The results indicated that alpha-pinene, methyl salicylate and eugenol could be identified in all of the samples by simple inspection of the MIR-HATR spectra. Both PLS models established with MIR-HATR and NIR spectral data, using gas chromatography (GC) peak areas as the calibration reference, showed a good linear correlation for each of the three target substances in HHO samples. These spectroscopic techniques may be promising methods for the rapid quality assessment/quality control (QA/QC) of TCM oil preparations.
Cao, Hui; Yan, Xingyu; Li, Yaojiang; Wang, Yanxia; Zhou, Yan; Yang, Sanchun
2014-01-01
Quantitative analysis of the flue gas of natural gas-fired generators is significant for energy conservation and emission reduction. The traditional partial least squares method may not deal with nonlinear problems effectively. In this paper, a nonlinear partial least squares method with extended input based on a radial basis function neural network (RBFNN) is used for component prediction of flue gas. For the proposed method, the original independent input matrix is the input of the RBFNN, and the outputs of the hidden layer nodes of the RBFNN are the extension term of the original independent input matrix. Then, partial least squares regression is performed on the extended input matrix and the output matrix to establish the component prediction model of flue gas. A near-infrared spectral dataset of flue gas from natural gas combustion is used for estimating the effectiveness of the proposed method compared with PLS. The experimental results show that the root-mean-square errors of prediction of the proposed method for methane, carbon monoxide, and carbon dioxide are, respectively, reduced by 4.74%, 21.76%, and 5.32% compared to those of PLS. Hence, the proposed method has higher predictive capabilities and better robustness.
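The input-extension idea in this abstract (append the RBFNN hidden-layer outputs to the original inputs before running PLS) can be sketched as below. Gaussian basis functions, the fixed centers and the single shared width are assumptions made for illustration; the abstract does not say how the hidden layer was parameterised, and the PLS regression on the extended matrix is not shown.

```python
import math


def rbf_features(x, centers, width):
    """Gaussian hidden-node outputs for one input vector (assumed basis function)."""
    return [math.exp(-sum((a - c) ** 2 for a, c in zip(x, center)) / (2.0 * width ** 2))
            for center in centers]


def extend_inputs(X, centers, width):
    """Append the hidden-node outputs to each original input row."""
    return [list(row) + rbf_features(row, centers, width) for row in X]


# Toy example (made-up numbers): two input variables, two hypothetical centers.
X = [[0.0, 0.0],
     [1.0, 0.0]]
centers = [[0.0, 0.0], [1.0, 0.0]]
X_ext = extend_inputs(X, centers, width=1.0)
```

Each row of `X_ext` now has the original variables followed by one column per hidden node, so a linear PLS model fitted on it can capture some nonlinear structure in the original inputs.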
Paradowska, Katarzyna; Jamróz, Marta Katarzyna; Kobyłka, Mariola; Gowin, Ewelina; Maczka, Paulina; Skibiński, Robert; Komsta, Łukasz
2012-01-01
This paper presents a preliminary study in building discriminant models from solid-state NMR spectrometry data to detect the presence of acetaminophen in over-the-counter pharmaceutical formulations. The dataset, containing 11 spectra of pure substances and 21 spectra of various formulations, was processed by partial least squares discriminant analysis (PLS-DA). The model found coped with the discrimination, and its quality parameters were acceptable. It was found that standard normal variate preprocessing had almost no influence on unsupervised investigation of the dataset. The influence of variable selection with the uninformative variable elimination by PLS method was studied, reducing the dataset from 7601 variables to around 300 informative variables, but not improving the model performance. The results showed the possibility to construct well-working PLS-DA models from such small datasets without a full experimental design.
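For readers unfamiliar with the preprocessing mentioned in this abstract, standard normal variate (SNV) scaling in its usual form centers each spectrum on its own mean and divides by its own standard deviation. The sketch below assumes the sample standard deviation and uses made-up intensities; it is not taken from the cited study.

```python
import statistics


def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own statistics."""
    mean = statistics.mean(spectrum)
    sd = statistics.stdev(spectrum)  # sample standard deviation (assumed convention)
    if sd == 0:
        raise ValueError("spectrum is constant; SNV is undefined")
    return [(v - mean) / sd for v in spectrum]


# Two made-up spectra that differ only by an offset and a scale factor
# end up identical after SNV.
a = snv([1.0, 2.0, 3.0])
b = snv([12.0, 14.0, 16.0])
```

Because SNV removes per-spectrum offset and scale, it changes little when the spectra already share a common baseline and intensity.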
Li, Juan; Jiang, Yue; Fan, Qi; Chen, Yang; Wu, Ruanqi
2014-05-05
This paper establishes a high-throughput and highly selective method to determine the impurity oxidized glutathione (GSSG) and the radial tensile strength (RTS) of reduced glutathione (GSH) tablets based on near infrared (NIR) spectroscopy and partial least squares (PLS). In order to build and evaluate the calibration models, the NIR diffuse reflectance spectra (DRS) and transmittance spectra (TS) for 330 GSH tablets were accurately measured by using the optimized parameter values. For analyzing GSSG or RTS of GSH tablets, the NIR-DRS or NIR-TS were selected, subdivided reasonably into calibration and prediction sets, and processed appropriately with chemometric techniques. After selecting spectral sub-ranges and neglecting spectrum outliers, the PLS calibration models were built and the factor numbers were optimized. Then, the PLS models were evaluated by the root mean square errors of calibration (RMSEC), cross-validation (RMSECV) and prediction (RMSEP), and by the correlation coefficients of calibration (Rc) and prediction (Rp). The results indicate that the proposed models perform well. It is thus clear that NIR-PLS can simultaneously, selectively, nondestructively and rapidly analyze the GSSG and RTS of GSH tablets, although the contents of the GSSG impurity were quite low while those of the GSH active pharmaceutical ingredient (API) were quite high. This strategy can be an important complement to the common NIR methods used in the on-line analysis of API in pharmaceutical preparations. This work also expands NIR applications to high-throughput and extraordinarily selective analysis. Copyright © 2014 Elsevier B.V. All rights reserved.
Kona, Ravikanth; Fahmy, Raafat M; Claycamp, Gregg; Polli, James E; Martinez, Marilyn; Hoag, Stephen W
2015-02-01
The objective of this study is to use near-infrared spectroscopy (NIRS) coupled with multivariate chemometric models to monitor granule and tablet quality attributes in the formulation development and manufacturing of ciprofloxacin hydrochloride (CIP) immediate release tablets. Critical roller compaction process parameters, compression force (CFt), and formulation variables identified from our earlier studies were evaluated in more detail. Multivariate principal component analysis (PCA) and partial least squares (PLS) models were developed during the development stage and used as a control tool to predict the quality of granules and tablets. Validated models were used to monitor and control batches manufactured at different sites to assess their robustness to change. The results showed that roll pressure (RP) and CFt played a critical role in the quality of the granules and the finished product within the range tested. Replacing the binder source did not statistically influence the quality attributes of the granules and tablets. However, lubricant type significantly affected the granule size. Blend uniformity, crushing force, and disintegration time during manufacturing were predicted using validated PLS regression models with acceptable standard error of prediction (SEP) values, whereas the models resulted in higher SEP for batches obtained from a different manufacturing site. From this study, we were able to identify critical factors which could impact the quality attributes of the CIP IR tablets. In summary, we demonstrated the ability of near-infrared spectroscopy coupled with chemometrics as a powerful tool to monitor critical quality attributes (CQA) identified during formulation development.
Giovenzana, Valentina; Beghi, Roberto; Parisi, Simone; Brancadoro, Lucio; Guidetti, Riccardo
2018-03-01
Increasing attention is being paid to non-destructive methods for real-time monitoring of water status as a potential replacement for the tedious conventional techniques, which are time consuming and not easy to perform directly in the field. The objective of this study was to test the potential effectiveness of two portable optical devices (visible/near infrared (vis/NIR) and near infrared (NIR) spectrophotometers) for the rapid and non-destructive evaluation of the water status of grapevine leaves. Moreover, a variable selection methodology was proposed to determine a set of candidate variables for the prediction of water potential (Ψ, MPa) related to leaf water status in view of a simplified optical device. The statistics of the partial least squares (PLS) models showed validation R² between 0.67 and 0.77 for models arising from vis/NIR spectra, and R² from 0.77 to 0.85 for the NIR region. The overall performance of the multiple linear regression (MLR) models from selected wavelengths was slightly worse than that of the PLS models. Regarding the NIR range, acceptable MLR models were obtained using only 14 effective variables (R² range 0.63-0.69). To address the market demand for portable optical devices and the trend towards miniaturization and low-cost devices, individual wavelengths could be useful for the design of a simplified and low-cost handheld system providing useful information for better irrigation scheduling. © 2017 Society of Chemical Industry.
Data Mining Methods for Omics and Knowledge of Crude Medicinal Plants toward Big Data Biology
Afendi, Farit M.; Ono, Naoaki; Nakamura, Yukiko; Nakamura, Kensuke; Darusman, Latifah K.; Kibinge, Nelson; Morita, Aki Hirai; Tanaka, Ken; Horai, Hisayuki; Altaf-Ul-Amin, Md.; Kanaya, Shigehiko
2013-01-01
Molecular biological data has rapidly increased with the recent progress of the Omics fields, e.g., genomics, transcriptomics, proteomics and metabolomics that necessitates the development of databases and methods for efficient storage, retrieval, integration and analysis of massive data. The present study reviews the usage of KNApSAcK Family DB in metabolomics and related area, discusses several statistical methods for handling multivariate data and shows their application on Indonesian blended herbal medicines (Jamu) as a case study. Exploration using Biplot reveals many plants are rarely utilized while some plants are highly utilized toward specific efficacy. Furthermore, the ingredients of Jamu formulas are modeled using Partial Least Squares Discriminant Analysis (PLS-DA) in order to predict their efficacy. The plants used in each Jamu medicine served as the predictors, whereas the efficacy of each Jamu provided the responses. This model produces 71.6% correct classification in predicting efficacy. Permutation test then is used to determine plants that serve as main ingredients in Jamu formula by evaluating the significance of the PLS-DA coefficients. Next, in order to explain the role of plants that serve as main ingredients in Jamu medicines, information of pharmacological activity of the plants is added to the predictor block. Then N-PLS-DA model, multiway version of PLS-DA, is utilized to handle the three-dimensional array of the predictor block. The resulting N-PLS-DA model reveals that the effects of some pharmacological activities are specific for certain efficacy and the other activities are diverse toward many efficacies. Mathematical modeling introduced in the present study can be utilized in global analysis of big data targeting to reveal the underlying biology. PMID:24688691
Gottfried, Jennifer L
2011-07-01
The potential of laser-induced breakdown spectroscopy (LIBS) to discriminate biological and chemical threat simulant residues prepared on multiple substrates and in the presence of interferents has been explored. The simulant samples tested include Bacillus atrophaeus spores, Escherichia coli, MS-2 bacteriophage, α-hemolysin from Staphylococcus aureus, 2-chloroethyl ethyl sulfide, and dimethyl methylphosphonate. The residue samples were prepared on polycarbonate, stainless steel and aluminum foil substrates by Battelle Eastern Science and Technology Center. LIBS spectra were collected by Battelle on a portable LIBS instrument developed by A3 Technologies. This paper presents the chemometric analysis of the LIBS spectra using partial least-squares discriminant analysis (PLS-DA). The performance of PLS-DA models developed based on the full LIBS spectra, and selected emission intensities and ratios have been compared. The full-spectra models generally provided better classification results based on the inclusion of substrate emission features; however, the intensity/ratio models were able to correctly identify more types of simulant residues in the presence of interferents. The fusion of the two types of PLS-DA models resulted in a significant improvement in classification performance for models built using multiple substrates. In addition to identifying the major components of residue mixtures, minor components such as growth media and solvents can be identified with an appropriately designed PLS-DA model.
NASA Astrophysics Data System (ADS)
De Lucia, Frank C., Jr.; Gottfried, Jennifer L.
2011-02-01
Using a series of thirteen organic materials that includes novel high-nitrogen energetic materials, conventional organic military explosives, and benign organic materials, we have demonstrated the importance of variable selection for maximizing residue discrimination with partial least squares discriminant analysis (PLS-DA). We built several PLS-DA models using different variable sets based on laser induced breakdown spectroscopy (LIBS) spectra of the organic residues on an aluminum substrate under an argon atmosphere. The model classification results for each sample are presented and the influence of the variables on these results is discussed. We found that using the whole spectra as the data input for the PLS-DA model gave the best results. However, variables due to the surrounding atmosphere and the substrate contribute to discrimination when the whole spectra are used, indicating this may not be the most robust model. Further iterative testing with additional validation data sets is necessary to determine the most robust model.
Microstructural Modeling of Brittle Materials for Enhanced Performance and Reliability.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Teague, Melissa Christine; Rodgers, Theron
Brittle failure is often influenced by difficult to measure and variable microstructure-scale stresses. Recent advances in photoluminescence spectroscopy (PLS), including improved confocal laser measurement and rapid spectroscopic data collection, have established the potential to map stresses with microscale spatial resolution (<2 microns). Advanced PLS was successfully used to investigate both residual and externally applied stresses in polycrystalline alumina at the microstructure scale. The measured average stresses matched those estimated from beam theory to within one standard deviation, validating the technique. Modeling the residual stresses within the microstructure produced general agreement in comparison with the experimentally measured results. Microstructure scale modeling is primed to take advantage of advanced PLS to enable its refinement and validation, eventually enabling microstructure modeling to become a predictive tool for brittle materials.
Wu, Jing-zhu; Wang, Feng-zhu; Wang, Li-li; Zhang, Xiao-chao; Mao, Wen-hua
2015-01-01
In order to improve the accuracy and robustness of detecting the nitrogen content of tomato seedlings by near-infrared (NIR) spectroscopy, four characteristic-spectrum selection methods were studied in the present paper, i.e. competitive adaptive reweighted sampling (CARS), Monte Carlo uninformative variables elimination (MCUVE), backward interval partial least squares (BiPLS) and synergy interval partial least squares (SiPLS). A total of 60 tomato seedlings were cultivated at 10 different nitrogen-treatment levels (urea concentration from 0 to 120 mg·L⁻¹), with 6 samples at each nitrogen-treatment level. They covered excess nitrogen, moderate nitrogen, nitrogen deficiency and no nitrogen. The leaves of each sample were collected and scanned by near-infrared spectroscopy from 12500 to 3600 cm⁻¹. Quantitative models based on the above four methods were established. According to the experimental results, the calibration models based on the CARS and MCUVE selection methods show better calibration performance than those based on the BiPLS and SiPLS selection methods, but their prediction ability is much lower than that of the latter. Among them, the model built by BiPLS has the best prediction performance. The correlation coefficient (r), root mean square error of prediction (RMSEP) and ratio of performance to standard deviation (RPD) are 0.9527, 0.1183 and 3.291, respectively. Therefore, NIR technology combined with characteristic-spectrum selection methods can improve model performance. But the characteristic-spectrum selection methods are not universal. A model built on single-wavelength variable selection is more sensitive and is more suitable for uniform objects, while a model built on wavelength interval selection has much stronger anti-interference ability and is more suitable for uneven objects with poor reproducibility.
Therefore, characteristic-spectrum selection improves model building only when the sample state and the model indexes are taken into account.
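The three figures of merit quoted in this abstract (r, RMSEP and RPD) can be computed from reference and predicted values as in the sketch below. It assumes the common definitions: RMSEP as the root of the mean squared prediction error, r as the Pearson correlation, and RPD as the sample standard deviation of the reference values divided by RMSEP. The function names and the numbers are made up for illustration; they are not the study's data.

```python
import math
import statistics


def rmsep(reference, predicted):
    """Root mean square error of prediction."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted))
                     / len(reference))


def pearson_r(reference, predicted):
    """Pearson correlation coefficient between reference and predicted values."""
    mr, mp = statistics.mean(reference), statistics.mean(predicted)
    cov = sum((r - mr) * (p - mp) for r, p in zip(reference, predicted))
    var_r = sum((r - mr) ** 2 for r in reference)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov / math.sqrt(var_r * var_p)


def rpd(reference, predicted):
    """Ratio of the reference standard deviation to RMSEP (assumed definition)."""
    return statistics.stdev(reference) / rmsep(reference, predicted)


# Made-up prediction-set values, for illustration only.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.1, 3.9, 5.1]
error = rmsep(reference, predicted)
r_value = pearson_r(reference, predicted)
rpd_value = rpd(reference, predicted)
```

A larger RPD means the prediction error is small relative to the natural spread of the reference values, which is why it is reported alongside r and RMSEP.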
NASA Astrophysics Data System (ADS)
Liu, Wen; Zhang, Yuying; Yang, Si; Han, Donghai
2018-05-01
A new technique to identify the floral sources of honeys is needed. Terahertz time-domain attenuated total reflection spectroscopy combined with chemometric methods was applied to discriminate different categories of honey (Medlar honey, Vitex honey, and Acacia honey). Principal component analysis (PCA), cluster analysis (CA) and partial least squares-discriminant analysis (PLS-DA) were used to find information on the botanical origins of the honeys. The spectral range was also examined to increase the precision of the PLS-DA model. An accuracy of 88.46% for the validation set was obtained using the PLS-DA model in the 0.5-1.5 THz range. This work indicated that terahertz time-domain attenuated total reflection spectroscopy is a viable approach to evaluate the quality of honey rapidly.
Schwab, Karen; Lauber, Jennifer; Hesse, Friedemann
2016-01-01
The glycosyltransferase HisDapGalNAcT2 is the key protein of the Escherichia coli (E. coli) SHuffle® T7 cell factory which was genetically engineered to allow glycosylation of a protein substrate in vivo. The specific activity of the glycosyltransferase requires time-intensive analytics, but is a critical process parameter. Therefore, it has to be monitored closely. This study evaluates fluorometric in situ monitoring as an option to access this critical process parameter during complex E. coli fermentations. Partial least squares regression (PLS) models were built based on the fluorometric data recorded during the EnPresso® B fermentations. Capable models for the prediction of glucose and acetate concentrations were built for these fermentations with root mean squared errors of prediction (RMSEP) of 0.19 g·L−1 and 0.08 g·L−1, as well as for the prediction of the optical density (RMSEP 0.24). In situ monitoring of soluble enzyme to cell dry weight ratios (RMSEP 5.5 × 10−4 µg w/w) and specific activity of the glycosyltransferase (RMSEP 33.5 pmol·min−1·µg−1) proved to be challenging, since HisDapGalNAcT2 had to be extracted from the cells and purified. However, fluorescence spectroscopy, in combination with PLS modeling, proved to be feasible for in situ monitoring of complex expression systems. PMID:28952595
NASA Astrophysics Data System (ADS)
Scafutto, Rebecca Del'Papa Moreira; Souza Filho, Carlos Roberto de
2016-08-01
The near and shortwave infrared spectral reflectance properties of several mineral substrates impregnated with crude oils (°APIs 19.2, 27.5 and 43.2), diesel, gasoline and ethanol were measured and assembled in a spectral library. These data were examined using Principal Component Analysis (PCA) and Partial Least Squares (PLS) Regression. Unique and characteristic absorption features were identified in the mixtures, besides variations of the spectral signatures related to the compositional difference of the crude oils and fuels. These features were used for qualitative and quantitative determination of the contaminant impregnated in the substrates. Specific wavelengths, where key absorption bands occur, were used for the individual characterization of oils and fuels. The intensity of these features can be correlated to the abundance of the contaminant in the mixtures. Grain size and composition of the impregnated substrate directly influence the variation of the spectral signatures. PCA models applied to the spectral library proved able to differentiate the type and density of the hydrocarbons. The calibration models generated by PLS are robust, of high quality and can also be used to predict the concentration of oils and fuels in mixtures with mineral substrates. Such data and models are employable as a reference for classifying unknown samples of contaminated substrates. The results of this study have important implications for onshore exploration and environmental monitoring of oil and fuel leaks using proximal and far range multispectral, hyperspectral and ultraspectral remote sensing.
Gonzalez Viejo, Claudia; Fuentes, Sigfredo; Torrico, Damir; Howell, Kate; Dunshea, Frank R
2018-01-01
Beer quality is mainly defined by its colour, foamability and foam stability, which are influenced by the chemical composition of the product such as proteins, carbohydrates, pH and alcohol. Traditional methods to assess specific chemical compounds are usually time-consuming and costly. This study used rapid methods to evaluate 15 foam and colour-related parameters using a robotic pourer (RoboBEER) and chemical fingerprinting using near infrared spectroscopy (NIR) from six replicates of 21 beers from three types of fermentation. Results from NIR were used to create partial least squares regression (PLS) and artificial neural network (ANN) models to predict four parameters: pH, alcohol, Brix and maximum volume of foam. The ANN method was able to create more accurate models (R² = 0.95) compared to PLS. Principal components analysis using RoboBEER parameters and NIR overtones related to protein explained 67% of total data variability. Additionally, a sub-space discriminant model using the absorbance values from NIR wavelengths resulted in the successful classification of 85% of beers according to fermentation type. The proposed method proved to be a rapid system based on NIR spectroscopy and RoboBEER outputs of foamability that can be used to infer the quality, production method and chemical parameters of beer with minimal laboratory equipment. © 2017 Society of Chemical Industry.
Boksa, Kevin; Otte, Andrew; Pinal, Rodolfo
2014-09-01
A novel method for the simultaneous production and formulation of pharmaceutical cocrystals, matrix-assisted cocrystallization (MAC), is presented. Hot-melt extrusion (HME) is used to create cocrystals by coprocessing the drug and coformer in the presence of a matrix material. Carbamazepine (CBZ), nicotinamide (NCT), and Soluplus were used as a model drug, coformer, and matrix, respectively. The MAC product containing 80:20 (w/w) cocrystal:matrix was characterized by differential scanning calorimetry, Fourier transform infrared spectroscopy, and powder X-ray diffraction. A partial least squares (PLS) regression model was developed for quantifying the efficiency of cocrystal formation. The MAC product was estimated to be 78% (w/w) cocrystal (theoretical 80%), with approximately 0.3% mixture of free (unreacted) CBZ and NCT, and 21.6% Soluplus (theoretical 20%) with the PLS model. A physical mixture (PM) of a reference cocrystal (RCC), prepared by precipitation from solution, and Soluplus resulted in faster dissolution relative to the pure RCC. However, the MAC product with the exact same composition resulted in considerably faster dissolution and higher maximum concentration (∼five-fold) than those of the PM. The MAC product consists of high-quality cocrystals embedded in a matrix. The processing aspect of MAC plays a major role on the faster dissolution observed. The MAC approach offers a scalable process, suitable for the continuous manufacturing and formulation of pharmaceutical cocrystals. © 2014 Wiley Periodicals, Inc. and the American Pharmacists Association.
Navy Fuel Composition and Screening Tool (FCAST) v2.8
2016-05-10
allowed us to develop partial least squares (PLS) models based on gas chromatography–mass spectrometry (GC-MS) data that predict fuel properties. The...Chemometric property modeling Partial least squares PLS Compositional profiler Naval Air Systems Command Air-4.4.5 Patuxent River Naval Air Station Patuxent...Cumulative predicted residual error sum of squares DiEGME Diethylene glycol monomethyl ether FCAST Fuel Composition and Screening Tool FFP Fit for
ATR-FTIR spectroscopy for the determination of Na4EDTA in detergent aqueous solutions.
Suárez, Leticia; García, Roberto; Riera, Francisco A; Diez, María A
2013-10-15
Fourier transform infrared spectroscopy in the attenuated total reflectance mode (ATR-FTIR) combined with partial least squares (PLS) algorithms was used to design calibration and prediction models for a wide range of tetrasodium ethylenediaminetetraacetate (Na4EDTA) concentrations (0.1 to 28% w/w) in aqueous solutions. The spectra obtained using air and water as a background medium were tested for the best fit. The PLS models designed afforded a sufficient level of precision and accuracy to allow even very small amounts of Na4EDTA to be determined. A root mean square error of nearly 0.37 for the validation set was obtained. Over a concentration range below 5% w/w, the values estimated from a combination of ATR-FTIR spectroscopy and a PLS algorithm model were similar to those obtained from an HPLC analysis of NaFeEDTA complexes and subsequent detection by UV absorbance. However, the lowest detection limit for Na4EDTA concentrations afforded by this spectroscopic/chemometric method was 0.3% w/w. The PLS model was successfully used as a rapid and simple method to quantify Na4EDTA in aqueous solutions of industrial detergents as an alternative to HPLC-UV analysis, which involves time-consuming dilution and complexation processes. © 2013 Elsevier B.V. All rights reserved.
Farias, Marco Antônio Dos Santos; Soares, Frederico Luis Felipe; Carneiro, Renato Lajarim
2016-03-20
Ezetimibe (EZT), in its anhydrous form, is a drug used to reduce cholesterol and lipids in blood plasma. The presence of EZT monohydrate in commercial tablets can change the solubility rate of the API, decreasing its activity. The objective of this work was to verify whether the humidity present in the excipients could promote the phase transition from anhydrous EZT to the hydrate. Initially, the stability of the pure anhydrous form was monitored by Raman spectroscopy at room temperature (23°C) and 75% relative humidity. The MCR-ALS method showed that almost all EZT changed to the hydrated form in 30 min. Then tablets of ezetimibe in the presence of its excipients were prepared and vacuum packed using a polyethylene film. One such tablet was monitored by Raman spectroscopy for 24 h in order to quantify the mixture of the crystalline forms. A multivariate calibration model using Raman spectroscopy and partial least squares (PLS) regression was built, with validation and cross-validation errors around 0.6% (wt/wt) for both crystalline forms and R(2) higher than 0.96. The PLS model was used to quantify the crystalline mixture of ezetimibe in the monitored tablet; after 24 h, more than 70% of the ezetimibe had changed to the hydrated form. Copyright © 2016 Elsevier B.V. All rights reserved.
Feng, Yao-Ze; Elmasry, Gamal; Sun, Da-Wen; Scannell, Amalia G M; Walsh, Des; Morcy, Noha
2013-06-01
Bacterial pathogens are the main culprits for outbreaks of food-borne illnesses. This study aimed to use the hyperspectral imaging technique as a non-destructive tool for quantitative and direct determination of Enterobacteriaceae loads on chicken fillets. Partial least squares regression (PLSR) models were established and the best model using full wavelengths was obtained in the spectral range 930-1450 nm with coefficients of determination R(2)≥ 0.82 and root mean squared errors (RMSEs) ≤ 0.47 log(10)CFUg(-1). In further development of simplified models, second derivative spectra and weighted PLS regression coefficients (BW) were utilised to select important wavelengths. However, the three wavelengths (930, 1121 and 1345 nm) selected from BW were competent and more preferred for predicting Enterobacteriaceae loads with R(2) of 0.89, 0.86 and 0.87 and RMSEs of 0.33, 0.40 and 0.45 log(10)CFUg(-1) for calibration, cross-validation and prediction, respectively. Besides, the constructed prediction map provided the distribution of Enterobacteriaceae bacteria on chicken fillets, which cannot be achieved by conventional methods. It was demonstrated that hyperspectral imaging is a potential tool for determining food sanitation and detecting bacterial pathogens on food matrix without using complicated laboratory regimes. Copyright © 2012 Elsevier Ltd. All rights reserved.
Metabolomics Tools for Describing Complex Pesticide Exposure in Pregnant Women in Brittany (France)
Bonvallot, Nathalie; Tremblay-Franco, Marie; Chevrier, Cécile; Canlet, Cécile; Warembourg, Charline; Cravedi, Jean-Pierre; Cordier, Sylvaine
2013-01-01
Background The use of pesticides and the related environmental contaminations can lead to human exposure to various molecules. In early-life, such exposures could be responsible for adverse developmental effects. However, human health risks associated with exposure to complex mixtures are currently under-explored. Objective This project aims at answering the following questions: What is the influence of exposures to multiple pesticides on the metabolome? What mechanistic pathways could be involved in the metabolic changes observed? Methods Based on the PELAGIE cohort (Brittany, France), 83 pregnant women who provided a urine sample in early pregnancy, were classified in 3 groups according to the surface of land dedicated to agricultural cereal activities in their town of residence. Nuclear magnetic resonance-based metabolomics analyses were performed on urine samples. Partial Least Squares Regression-Discriminant Analysis (PLS-DA) and polytomous regressions were used to separate the urinary metabolic profiles from the 3 exposure groups after adjusting for potential confounders. Results The 3 groups of exposure were correctly separated with a PLS-DA model after implementing an orthogonal signal correction with pareto standardizations (R2 = 90.7% and Q2 = 0.53). After adjusting for maternal age, parity, body mass index and smoking habits, the most statistically significant changes were observed for glycine, threonine, lactate and glycerophosphocholine (upward trend), and for citrate (downward trend). Conclusion This work suggests that an exposure to complex pesticide mixtures induces modifications of metabolic fingerprints. It can be hypothesized from identified discriminating metabolites that the pesticide mixtures could increase oxidative stress and disturb energy metabolism. PMID:23704985
de Oliveira, Rodrigo Rocha; de Lima, Kássio Michell Gomes; Tauler, Romà; de Juan, Anna
2014-07-01
This study describes two applications of a variant of the multivariate curve resolution alternating least squares (MCR-ALS) method with a correlation constraint. The first application describes the use of MCR-ALS for the determination of biodiesel concentrations in biodiesel blends using near infrared (NIR) spectroscopic data. In the second application, the proposed method allowed the determination of the synthetic antioxidant N,N'-Di-sec-butyl-p-phenylenediamine (PDA) present in biodiesel mixtures from different vegetable sources using UV-visible spectroscopy. A well-established multivariate regression algorithm, partial least squares (PLS), was also applied to compare quantification performance in the models developed in both applications. The correlation constraint has been adapted to handle the presence of batch-to-batch matrix effects due to ageing effects, which might occur when different groups of samples were used to build a calibration model in the first application. Different data set configurations and diverse modes of application of the correlation constraint are explored and guidelines are given to cope with different types of analytical problems, such as the correction of matrix effects among biodiesel samples, where MCR-ALS outperformed PLS, reducing the relative error of prediction RE (%) from 9.82% to 4.85% in the first application, or the determination of a minor compound with overlapped weak spectroscopic signals, where MCR-ALS gave a higher error (RE (%)=3.16%) for prediction of PDA than PLS (RE (%)=1.99%), but with the advantage of recovering the related pure spectral profiles of analytes and interferences. The obtained results show the potential of the MCR-ALS method with correlation constraint to be adapted to diverse data set configurations and analytical problems related to the determination of biodiesel mixtures and added compounds therein. Copyright © 2014 Elsevier B.V. All rights reserved.
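The abstract above reports a relative error of prediction, RE (%), without stating its formula. A minimal sketch follows, assuming the definition commonly used in the multivariate calibration literature: the square root of the summed squared prediction errors divided by the summed squared reference values, times 100. The function name and the sample values are illustrative only.

```python
from math import sqrt

def relative_error_percent(reference, predicted):
    """RE (%) = 100 * sqrt( sum((c_ref - c_pred)^2) / sum(c_ref^2) ).

    Assumed definition; the abstract itself does not give the formula.
    """
    squared_error = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    squared_reference = sum(r ** 2 for r in reference)
    return 100.0 * sqrt(squared_error / squared_reference)

# Invented reference and predicted concentrations for three samples.
reference = [10.0, 20.0, 30.0]
predicted = [11.0, 19.0, 30.0]
re_value = relative_error_percent(reference, predicted)
```

Because the denominator uses the reference values themselves rather than their spread, RE (%) is scale-dependent: the same absolute errors give a smaller RE (%) on a higher concentration range.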
Fu, Chunjiang; Wu, Gang; Lv, Fenglin; Tian, Feifei
2012-05-01
Many protein-protein interactions are mediated by a peptide-recognizing domain, such as WW, PDZ, or SH3. In the present study, we describe a new method called position-dependent noncovalent potential analysis (PDNPA), which can accurately characterize the nonbonding profile between the human endophilin-1 Src homology 3 (hEndo1 SH3) domain and its peptide ligands and quantitatively predict the binding affinity of peptide to hEndo1 SH3. In this procedure, structure models of diverse peptides in complex with the hEndo1 SH3 domain are constructed by molecular dynamics simulation and a virtual mutagenesis protocol. Subsequently, three noncovalent interactions associated with each position of the peptide ligand in the complexed state are analyzed using empirical potential functions, and the resulting potential descriptors are then correlated with the experimentally measured affinity on the basis of 1997 hEndo1 SH3-binding peptides with known activities, using linear partial least squares regression (PLS) and the nonlinear support vector machine (SVM). The results suggest that: (i) the electrostatics appears to be more important than steric properties and hydrophobicity in the formation of the hEndo1 SH3-peptide complex; (ii) P(-4) of the core decapeptide ligand with the sequence pattern P(-6)P(-5)P(-4)P(-3)P(-2)P(-1)P(0)P(1)P(2)P(3) is the most important position in terms of determining both the stability and specificity of the architecture of the complex, and; (iii) nonlinear SVM appears to be more effective than linear PLS for accurately predicting the binding affinity of a peptide ligand to hEndo1 SH3, whereas PLS models are straightforward and easy to interpret as compared to those built by SVM.
Ahmad, Iftikhar; Ahmad, Manzoor; Khan, Karim; Ikram, Masroor
2016-06-01
Optical polarimetry was employed for assessment of ex vivo healthy and basal cell carcinoma (BCC) tissue samples from human skin. Polarimetric analyses revealed that depolarization and retardance for healthy tissue group were significantly higher (p<0.001) compared to BCC tissue group. Histopathology indicated that these differences partially arise from BCC-related characteristic changes in tissue morphology. Wilks lambda statistics demonstrated the potential of all investigated polarimetric properties for computer assisted classification of the two tissue groups. Based on differences in polarimetric properties, partial least square (PLS) regression classified the samples with 100% accuracy, sensitivity and specificity. These findings indicate that optical polarimetry together with PLS statistics hold promise for automated pathology classification. Copyright © 2016 Elsevier B.V. All rights reserved.
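The accuracy, sensitivity and specificity quoted above are standard confusion-matrix ratios. The sketch below shows how they could be computed from true and predicted tissue labels; the label strings, the choice of BCC as the positive class and the sample lists are assumptions made for illustration, not data from the study.

```python
def classification_metrics(true_labels, predicted_labels, positive="BCC"):
    """Return accuracy, sensitivity and specificity for a two-class problem.

    Assumes both classes occur in true_labels; otherwise a ratio is undefined.
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),   # fraction of positives detected
        "specificity": tn / (tn + fp),   # fraction of negatives detected
    }

truth = ["BCC", "BCC", "healthy", "healthy"]
perfect = classification_metrics(truth, ["BCC", "BCC", "healthy", "healthy"])
one_miss = classification_metrics(truth, ["BCC", "healthy", "healthy", "healthy"])
```

In the second call one BCC sample is labelled healthy, so sensitivity drops to 0.5 while specificity stays at 1.0; a result of 100% on all three figures, as reported, means no sample of either class was misassigned.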
NASA Astrophysics Data System (ADS)
Sindt, Nathan M.; Robison, Faith; Brick, Mark A.; Schwartz, Howard F.; Heuberger, Adam L.; Prenni, Jessica E.
2018-02-01
Matrix-assisted desorption/ionization time of flight mass spectrometry (MALDI-TOF-MS) is a fast and effective tool for microbial species identification. However, current approaches are limited to species-level identification even when genetic differences are known. Here, we present a novel workflow that applies the statistical method of partial least squares discriminant analysis (PLS-DA) to MALDI-TOF-MS protein fingerprint data of Xanthomonas axonopodis, an important bacterial plant pathogen of fruit and vegetable crops. Mass spectra of 32 X. axonopodis strains were used to create a mass spectral library and PLS-DA was employed to model the closely related strains. A robust workflow was designed to optimize the PLS-DA model by assessing the model performance over a range of signal-to-noise ratios (s/n) and mass filter (MF) thresholds. The optimized parameters were observed to be s/n = 3 and MF = 0.7. The model correctly classified 83% of spectra withheld from the model as a test set. A new decision rule was developed, termed the rolled-up Maximum Decision Rule (ruMDR), and this method improved identification rates to 92%. These results demonstrate that MALDI-TOF-MS protein fingerprints of bacterial isolates can be utilized to enable identification at the strain level. Furthermore, the open-source framework of this workflow allows for broad implementation across various instrument platforms as well as integration with alternative modeling and classification algorithms.
Rapid Analysis of Deoxynivalenol in Durum Wheat by FT-NIR Spectroscopy
De Girolamo, Annalisa; Cervellieri, Salvatore; Visconti, Angelo; Pascale, Michelangelo
2014-01-01
Fourier-transform-near infrared (FT-NIR) spectroscopy has been used to develop quantitative and classification models for the prediction of deoxynivalenol (DON) levels in durum wheat samples. Partial least-squares (PLS) regression analysis was used to determine DON in wheat samples in the range of <50–16,000 µg/kg DON. The model displayed a large root mean square error of prediction value (1,977 µg/kg) as compared to the EU maximum limit for DON in unprocessed durum wheat (i.e., 1,750 µg/kg), thus making the PLS approach unsuitable for quantitative prediction of DON in durum wheat. Linear discriminant analysis (LDA) was successfully used to differentiate wheat samples based on their DON content. A first approach used LDA to group wheat samples into three classes: A (DON ≤ 1,000 µg/kg), B (1,000 < DON ≤ 2,500 µg/kg), and C (DON > 2,500 µg/kg) (LDA I). A second approach was used to discriminate highly contaminated wheat samples based on three different cut-off limits, namely 1,000 (LDA II), 1,200 (LDA III) and 1,400 µg/kg DON (LDA IV). The overall classification and false compliant rates for the three models were 75%–90% and 3%–7%, respectively, with model LDA IV using a cut-off of 1,400 µg/kg fulfilling the requirement of the European official guidelines for screening methods. These findings confirmed the suitability of FT-NIR to screen a large number of wheat samples for DON contamination and to verify the compliance with EU regulation. PMID:25384107
Rapid analysis of deoxynivalenol in durum wheat by FT-NIR spectroscopy.
De Girolamo, Annalisa; Cervellieri, Salvatore; Visconti, Angelo; Pascale, Michelangelo
2014-11-06
Fourier-transform-near infrared (FT-NIR) spectroscopy has been used to develop quantitative and classification models for the prediction of deoxynivalenol (DON) levels in durum wheat samples. Partial least-squares (PLS) regression analysis was used to determine DON in wheat samples in the range of <50-16,000 µg/kg DON. The model displayed a large root mean square error of prediction value (1,977 µg/kg) as compared to the EU maximum limit for DON in unprocessed durum wheat (i.e., 1,750 µg/kg), thus making the PLS approach unsuitable for quantitative prediction of DON in durum wheat. Linear discriminant analysis (LDA) was successfully used to differentiate wheat samples based on their DON content. A first approach used LDA to group wheat samples into three classes: A (DON ≤ 1,000 µg/kg), B (1,000 < DON ≤ 2,500 µg/kg), and C (DON > 2,500 µg/kg) (LDA I). A second approach was used to discriminate highly contaminated wheat samples based on three different cut-off limits, namely 1,000 (LDA II), 1,200 (LDA III) and 1,400 µg/kg DON (LDA IV). The overall classification and false compliant rates for the three models were 75%-90% and 3%-7%, respectively, with model LDA IV using a cut-off of 1,400 µg/kg fulfilling the requirement of the European official guidelines for screening methods. These findings confirmed the suitability of FT-NIR to screen a large number of wheat samples for DON contamination and to verify the compliance with EU regulation.
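The class boundaries and cut-off limits listed in the abstract above can be written down directly. The sketch below only labels samples from their reference DON content, which is the step that would produce the targets for a discriminant model; it does not reproduce the LDA on FT-NIR spectra. The function names are hypothetical, and treating a value exactly at a cut-off as not exceeding it is an assumption.

```python
def don_class(don_ug_per_kg):
    """Three-class labelling (LDA I): A <= 1,000 < B <= 2,500 < C, in µg/kg."""
    if don_ug_per_kg <= 1000:
        return "A"
    if don_ug_per_kg <= 2500:
        return "B"
    return "C"

def exceeds_cutoff(don_ug_per_kg, cutoff_ug_per_kg=1400):
    """Two-class labelling against one cut-off (1,400 µg/kg as in LDA IV)."""
    return don_ug_per_kg > cutoff_ug_per_kg

# Invented reference values spanning the reported range.
samples = [50, 1000, 1750, 2500, 16000]
classes = [don_class(v) for v in samples]
flags = [exceeds_cutoff(v) for v in samples]
```

Setting the screening cut-off below the 1,750 µg/kg regulatory limit, as the 1,400 µg/kg model does, trades more flagged samples for fewer contaminated samples passed as compliant.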
Philip Ye, X; Liu, Lu; Hayes, Douglas; Womac, Alvin; Hong, Kunlun; Sokhansanj, Shahab
2008-10-01
The objectives of this research were to determine the variation of chemical composition across botanical fractions of cornstover, and to probe the potential of Fourier transform near-infrared (FT-NIR) techniques in qualitatively classifying separated cornstover fractions and in quantitatively analyzing chemical compositions of cornstover by developing calibration models to predict chemical compositions of cornstover based on FT-NIR spectra. Large variations of cornstover chemical composition for wide calibration ranges, which is required by a reliable calibration model, were achieved by manually separating the cornstover samples into six botanical fractions, and their chemical compositions were determined by conventional wet chemical analyses, which proved that chemical composition varies significantly among different botanical fractions of cornstover. Different botanic fractions, having total saccharide content in descending order, are husk, sheath, pith, rind, leaf, and node. Based on FT-NIR spectra acquired on the biomass, classification by Soft Independent Modeling of Class Analogy (SIMCA) was employed to conduct qualitative classification of cornstover fractions, and partial least square (PLS) regression was used for quantitative chemical composition analysis. SIMCA was successfully demonstrated in classifying botanical fractions of cornstover. The developed PLS model yielded root mean square error of prediction (RMSEP %w/w) of 0.92, 1.03, 0.17, 0.27, 0.21, 1.12, and 0.57 for glucan, xylan, galactan, arabinan, mannan, lignin, and ash, respectively. The results showed the potential of FT-NIR techniques in combination with multivariate analysis to be utilized by biomass feedstock suppliers, bioethanol manufacturers, and bio-power producers in order to better manage bioenergy feedstocks and enhance bioconversion.
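The RMSEP figures above summarise how far predicted compositions fall from the wet-chemistry reference values on samples held out of calibration. A minimal sketch of the calculation, with invented glucan values in % w/w:

```python
from math import sqrt

def rmsep(reference, predicted):
    """Root mean square error of prediction over an independent test set."""
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return sqrt(sum(squared) / len(squared))

# Hypothetical reference and predicted glucan contents (% w/w).
reference = [35.0, 36.0, 38.0, 40.0]
predicted = [36.0, 35.0, 39.0, 39.0]
glucan_rmsep = rmsep(reference, predicted)
```

RMSEP carries the units of the property, so a value of 0.92 for glucan and 0.17 for galactan cannot be compared without reference to the concentration range of each component.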
DOE Office of Scientific and Technical Information (OSTI.GOV)
De Lucia, Frank C. Jr.; Gottfried, Jennifer L.; Munson, Chase A.
2008-11-01
A technique being evaluated for standoff explosives detection is laser-induced breakdown spectroscopy (LIBS). LIBS is a real-time sensor technology that uses components that can be configured into a ruggedized standoff instrument. The U.S. Army Research Laboratory has been coupling standoff LIBS spectra with chemometrics for several years now in order to discriminate between explosives and nonexplosives. We have investigated the use of partial least squares discriminant analysis (PLS-DA) for explosives detection. We have extended our study of PLS-DA to more complex sample types, including binary mixtures, different types of explosives, and samples not included in the model. We demonstrate the importance of building the PLS-DA model by iteratively testing it against sample test sets. Independent test sets are used to test the robustness of the final model.
Analysis of spreadable cheese by Raman spectroscopy and chemometric tools.
Oliveira, Kamila de Sá; Callegaro, Layce de Souza; Stephani, Rodrigo; Almeida, Mariana Ramos; de Oliveira, Luiz Fernando Cappa
2016-03-01
In this work, FT-Raman spectroscopy was explored to evaluate spreadable cheese samples. A partial least squares discriminant analysis was employed to identify the spreadable cheese samples containing starch. To build the models, two types of samples were used: commercial samples and samples manufactured in local industries. The method of supervised classification PLS-DA was employed to classify the samples as adulterated or without starch. Multivariate regression was performed using the partial least squares method to quantify the starch in the spreadable cheese. The limit of detection obtained for the model was 0.34% (w/w) and the limit of quantification was 1.14% (w/w). The reliability of the models was evaluated by determining the confidence interval, which was calculated using the bootstrap re-sampling technique. The results show that the classification models can be used to complement classical analysis and as screening methods. Copyright © 2015 Elsevier Ltd. All rights reserved.
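The abstract above mentions confidence intervals obtained by bootstrap resampling but does not say which statistic was resampled or which interval type was used. The sketch below assumes a percentile bootstrap of the mean of a small set of values; the data, the function name and the fixed seed are all illustrative.

```python
import random
from statistics import mean

def bootstrap_percentile_ci(values, statistic, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for statistic(values).

    Resamples the data with replacement, recomputes the statistic each time,
    and returns the alpha/2 and 1 - alpha/2 quantiles of those estimates.
    """
    rng = random.Random(seed)          # fixed seed so the sketch is repeatable
    n = len(values)
    estimates = sorted(
        statistic([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    lower = estimates[int((alpha / 2) * n_resamples)]
    upper = estimates[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Invented replicate results, e.g. predicted starch content in % w/w.
replicates = [0.9, 1.1, 1.0, 1.2, 0.8, 1.05, 0.95, 1.0]
ci_low, ci_high = bootstrap_percentile_ci(replicates, mean)
```

The same function accepts any statistic that maps a list of values to a number, so it could equally wrap a prediction error; resampling whole calibration sets to refit a model would need a different, heavier loop.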
Indrehus, Oddny; Aralt, Tor Tybring
2005-04-01
Aerosol, NO and CO concentration, temperature, air humidity, air flow and number of running ventilation fans were measured by continuous analysers every minute for a whole week for six different one-week periods spread over ten months in 2001 and 2002 at measuring stations in the 7860 m long tunnel. The ventilation control system was mainly based on aerosol measurements taken by optical scatter sensors. The ventilation turned out to be satisfactory according to Norwegian air quality standards for road tunnels; however, there was some uncertainty concerning the NO2 levels. The air humidity and temperature inside the tunnel were highly influenced by the outside meteorological conditions. Statistical models for NO concentration were developed and tested; correlations between predicted and measured NO were 0.81 for a partial least squares regression (PLS1) model based on CO and aerosol, and 0.77 for a linear regression model based only on aerosol. Hence, the ventilation control system should not solely be based on aerosol measurements. Since NO2 is the hazardous pollutant, modelling NO2 concentration rather than NO should be preferred in any further optimising of the ventilation control.
Teixeira, Kelly Sivocy Sampaio; da Cruz Fonseca, Said Gonçalves; de Moura, Luís Carlos Brigido; de Moura, Mario Luís Ribeiro; Borges, Márcia Herminia Pinheiro; Barbosa, Euzébio Guimaraes; De Lima E Moura, Túlio Flávio Accioly
2018-02-05
The World Health Organization recommends that TB treatment be administered using combination therapy. The methodologies for quantifying simultaneously associated drugs are highly complex, being costly, extremely time consuming and producing chemical residues harmful to the environment. The need to seek alternative techniques that minimize these drawbacks is widely discussed in the pharmaceutical industry. Therefore, the objective of this study was to develop and validate a multivariate calibration model in association with the near infrared spectroscopy technique (NIR) for the simultaneous determination of rifampicin, isoniazid, pyrazinamide and ethambutol. These models allow the quality control of these medicines to be optimized using simple, fast, low-cost techniques that produce no chemical waste. In the NIR-PLS method, spectra readings were acquired in the 10,000-4000 cm⁻¹ range using an infrared spectrophotometer (IRPrestige-21, Shimadzu) with a resolution of 4 cm⁻¹, 20 sweeps, under controlled temperature and humidity. For construction of the model, the central composite experimental design was employed on the program Statistica 13 (StatSoft Inc.). All spectra were treated by computational tools for multivariate analysis using partial least squares regression (PLS) on the software program Pirouette 3.11 (Infometrix, Inc.). Variable selections were performed by the QSAR modeling program. The models developed by NIR in association with multivariate analysis provided good prediction of the APIs for the external samples and were therefore validated. For the tablets, however, the slightly different quantitative compositions of excipients compared to the mixtures prepared for building the models led to results that were not statistically similar, despite having prediction errors considered acceptable in the literature. Copyright © 2017 Elsevier B.V. All rights reserved.
Hara, Yoshinori; Seki, Masahide; Matsuoka, Satoshi; Hara, Hiroshi; Yamashita, Atsushi; Matsumoto, Kouji
2008-12-01
The gene responsible for the first acylation of sn-glycerol-3-phosphate (G3P) in Bacillus subtilis has not yet been determined with certainty. The product of this first acylation, lysophosphatidic acid (LPA), is subsequently acylated again to form phosphatidic acid (PA), the primary precursor to membrane glycerolipids. A novel G3P acyltransferase (GPAT), the gene product of plsY, which uses acyl-phosphate formed by the plsX gene product, has recently been found to synthesize LPA in Streptococcus pneumoniae. We found that in B. subtilis, growth arrests after repression of either a plsY homologue or a plsX homologue were overcome by expression of E. coli plsB, which encodes an acyl-acyl carrier protein (acyl-ACP)-dependent GPAT, although in the case of plsX repression a high level of plsB expression was required. B. subtilis has, therefore, a capability to use the acyl-ACP dependent GPAT of PlsB. Simultaneous expression of plsY and plsX suppressed the glycerol requirement of a strict glycerol auxotrophic derivative of the E. coli plsB26 mutant, although either one alone did not. Membrane fractions from B. subtilis cells catalyzed palmitoylphosphate-dependent acylation of [14C]-labeled G3P to synthesize [14C]-labeled LPA, whereas those from ΔplsY cells did not. The results indicate unequivocally that PlsY is an acyl-phosphate dependent GPAT. Expression of plsX corrected the glycerol auxotrophy of a ΔygiH (the deleted allele of an E. coli homologue of plsY) derivative of BB26-36 (plsB26 plsX50), suggesting an essential role of plsX other than substrate supply for acyl-phosphate dependent LPA synthesis. Two-hybrid examinations suggested that PlsY is associated with PlsX and that each may exist in multimeric form.
Ding, Ning; Li, Xitao; Shi, Yunfei; Ping, Lingyan; Wu, Lina; Fu, Kai; Feng, Lixia; Zheng, Xiaohui; Song, Yuqin; Pan, Zhengying; Zhu, Jun
2015-06-20
The B-cell receptor (BCR) signaling pathway has gained significant attention as a therapeutic target in B-cell malignancies. Recently, several drugs that target the BCR signaling pathway, especially the Btk inhibitor ibrutinib, have demonstrated notable therapeutic effects in relapsed/refractory patients, which indicates that pharmacological inhibition of BCR pathway holds promise in B-cell lymphoma treatment. Here we present a novel covalent irreversible Btk inhibitor PLS-123 with more potent anti-proliferative activity compared with ibrutinib in multiple cellular and in vivo models through effective apoptosis induction and dual-action inhibitory mode of Btk activation. The phosphorylation of BCR downstream activating AKT/mTOR and MAPK signal pathways was also more significantly reduced after treatment with PLS-123 than ibrutinib. Gene expression profile analysis further suggested that the different selectivity profile of PLS-123 led to significant downregulation of oncogenic gene PTPN11 expression, which might also offer new opportunities beyond what ibrutinib has achieved. In addition, PLS-123 dose-dependently attenuated BCR- and chemokine-mediated lymphoma cell adhesion and migration. Taken together, Btk inhibitor PLS-123 suggested a new direction to pharmacologically modulate Btk function and develop novel therapeutic drug for B-cell lymphoma treatment.
Ding, Ning; Li, Xitao; Shi, Yunfei; Ping, Lingyan; Wu, Lina; Fu, Kai; Feng, Lixia; Zheng, Xiaohui; Song, Yuqin; Pan, Zhengying; Zhu, Jun
2015-01-01
The B-cell receptor (BCR) signaling pathway has gained significant attention as a therapeutic target in B-cell malignancies. Recently, several drugs that target the BCR signaling pathway, especially the Btk inhibitor ibrutinib, have demonstrated notable therapeutic effects in relapsed/refractory patients, which indicates that pharmacological inhibition of BCR pathway holds promise in B-cell lymphoma treatment. Here we present a novel covalent irreversible Btk inhibitor PLS-123 with more potent anti-proliferative activity compared with ibrutinib in multiple cellular and in vivo models through effective apoptosis induction and dual-action inhibitory mode of Btk activation. The phosphorylation of BCR downstream activating AKT/mTOR and MAPK signal pathways was also more significantly reduced after treatment with PLS-123 than ibrutinib. Gene expression profile analysis further suggested that the different selectivity profile of PLS-123 led to significant downregulation of oncogenic gene PTPN11 expression, which might also offer new opportunities beyond what ibrutinib has achieved. In addition, PLS-123 dose-dependently attenuated BCR- and chemokine-mediated lymphoma cell adhesion and migration. Taken together, Btk inhibitor PLS-123 suggested a new direction to pharmacologically modulate Btk function and develop novel therapeutic drug for B-cell lymphoma treatment. PMID:25944695
Recombinant plasmids for encoding restriction enzymes DpnI and DpnII of Streptococcus pneumoniae
Lacks, Sanford A.
1990-01-01
Chromosomal DNA cassettes containing genes encoding either the DpnI or DpnII restriction endonucleases from Streptococcus pneumoniae are cloned into a streptococcal vector, pLS101. Large amounts of the restriction enzymes are produced by cells containing the multicopy plasmids, pLS202 and pLS207, and their derivatives pLS201, pLS211, pLS217, pLS251 and pLS252.
Recombinant plasmids for encoding restriction enzymes DpnI and DpnII of Streptococcus pneumoniae
Lacks, S.A.
1990-10-02
Chromosomal DNA cassettes containing genes encoding either the DpnI or DpnII restriction endonucleases from Streptococcus pneumoniae are cloned into a streptococcal vector, pLS101. Large amounts of the restriction enzymes are produced by cells containing the multicopy plasmids, pLS202 and pLS207, and their derivatives pLS201, pLS211, pLS217, pLS251 and pLS252. 9 figs.
Remote quantification of phycocyanin in potable water sources through an adaptive model
NASA Astrophysics Data System (ADS)
Song, Kaishan; Li, Lin; Tedesco, Lenore P.; Li, Shuai; Hall, Bob E.; Du, Jia
2014-09-01
Cyanobacterial blooms in water supply sources in both central Indiana USA (CIN) and South Australia (SA) are a cause of great concerns for toxin production and water quality deterioration. Remote sensing provides an effective approach for quick assessment of cyanobacteria through quantification of phycocyanin (PC) concentration. In total, 363 samples spanning a large variation of optically active constituents (OACs) in CIN and SA waters were collected during 24 field surveys. Concurrently, remote sensing reflectance spectra (Rrs) were measured. A partial least squares-artificial neural network (PLS-ANN) model, artificial neural network (ANN) and three-band model (TBM) were developed or tuned by relating the Rrs with PC concentration. Our results indicate that the PLS-ANN model outperformed the ANN and TBM with both the original spectra and simulated ESA/Sentinel-3/Ocean and Land Color Instrument (OLCI) and EO-1/Hyperion spectra. The PLS-ANN model resulted in a high coefficient of determination (R2) for CIN dataset (R2 = 0.92, R: 0.3-220.7 μg/L) and SA (R2 = 0.98, R: 0.2-13.2 μg/L). In comparison, the TBM model yielded an R2 = 0.77 and 0.94 for the CIN and SA datasets, respectively; while the ANN obtained an intermediate modeling accuracy (CIN: R2 = 0.86; SA: R2 = 0.95). Applying the simulated OLCI and Hyperion aggregated datasets, the PLS-ANN model still achieved good performance (OLCI: R2 = 0.84; Hyperion: R2 = 0.90); the TBM also presented acceptable performance for PC estimations (OLCI: R2 = 0.65, Hyperion: R2 = 0.70). Based on the results, the PLS-ANN is an effective modeling approach for the quantification of PC in productive water supplies based on its effectiveness in solving the non-linearity of PC with other OACs. 
Furthermore, our investigation indicates that the ratio of inorganic suspended matter (ISM) to PC concentration is closely related to the relative modeling errors (CIN: R2 = 0.81; SA: R2 = 0.92), indicating that ISM concentration exerts a significant impact on PC estimation accuracy.
Miller, Arthur L.; Weakley, Andrew Todd; Griffiths, Peter R.; Cauda, Emanuele G.; Bayman, Sean
2017-01-01
In order to help reduce silicosis in miners, the National Institute for Occupational Safety and Health (NIOSH) is developing field-portable methods for measuring airborne respirable crystalline silica (RCS), specifically the polymorph α-quartz, in mine dusts. In this study we demonstrate the feasibility of end-of-shift measurement of α-quartz using a direct-on-filter (DoF) method to analyze coal mine dust samples deposited onto polyvinyl chloride filters. The DoF method is potentially amenable for on-site analyses, but deviates from the current regulatory determination of RCS for coal mines by eliminating two sample preparation steps: ashing the sampling filter and redepositing the ash prior to quantification by Fourier transform infrared (FT-IR) spectrometry. In this study, the FT-IR spectra of 66 coal dust samples from active mines were used, and the RCS was quantified by using: (1) an ordinary least squares (OLS) calibration approach that utilizes standard silica material as done in the Mine Safety and Health Administration's P7 method; and (2) a partial least squares (PLS) regression approach. Both were capable of accounting for kaolinite, which can confound the IR analysis of silica. The OLS method utilized analytical standards for silica calibration and kaolin correction, resulting in a good linear correlation with P7 results and minimal bias but with the accuracy limited by the presence of kaolinite. The PLS approach also produced predictions well-correlated to the P7 method, as well as better accuracy in RCS prediction, and no bias due to variable kaolinite mass. Besides decreased sensitivity to mineral or substrate confounders, PLS has the advantage that the analyst is not required to correct for the presence of kaolinite or background interferences related to the substrate, making the method potentially viable for automated RCS prediction in the field. 
This study demonstrated the efficacy of FT-IR transmission spectrometry for silica determination in coal mine dusts, using both OLS and PLS analyses, when kaolinite was present. PMID:27645724
Bhatt, Chet R; Jain, Jinesh C; Goueguel, Christian L; McIntyre, Dustin L; Singh, Jagdish P
2018-01-01
Laser-induced breakdown spectroscopy (LIBS) was used to detect rare earth elements (REEs) in natural geological samples. Low and high intensity emission lines of Ce, La, Nd, Y, Pr, Sm, Eu, Gd, and Dy were identified in the spectra recorded from the samples to claim the presence of these REEs. Multivariate analysis was executed by developing partial least squares regression (PLS-R) models for the quantification of Ce, La, and Nd. Analysis of unknown samples indicated that the prediction results of these samples were found comparable to those obtained by inductively coupled plasma mass spectrometry analysis. Data support that LIBS has potential to quantify REEs in geological minerals/ores.
NASA Technical Reports Server (NTRS)
Anderson, R. B.; Morris, R. V.; Clegg, S. M.; Bell, J. F., III; Humphries, S. D.; Wiens, R. C.
2011-01-01
The ChemCam instrument selected for the Curiosity rover is capable of remote laser-induced breakdown spectroscopy (LIBS).[1] We used a remote LIBS instrument similar to ChemCam to analyze 197 geologic slab samples and 32 pressed-powder geostandards. The slab samples are well-characterized and have been used to validate the calibration of previous instruments on Mars missions, including CRISM [2], OMEGA [3], the MER Pancam [4], Mini-TES [5], and Moessbauer [6] instruments and the Phoenix SSI [7]. The resulting dataset was used to compare multivariate methods for quantitative LIBS and to determine the effect of grain size on calculations. Three multivariate methods - partial least squares (PLS), multilayer perceptron artificial neural networks (MLP ANNs) and cascade correlation (CC) ANNs - were used to generate models and extract the quantitative composition of unknown samples. PLS can be used to predict one element (PLS1) or multiple elements (PLS2) at a time, as can the neural network methods. Although MLP and CC ANNs were successful in some cases, PLS generally produced the most accurate and precise results.
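The PLS1 case mentioned above (one predicted element per model) can be sketched with the classical NIPALS algorithm. This is a minimal, standard-library illustration under stated assumptions, not the authors' implementation: the function names `pls1_fit` and `pls1_predict`, the dictionary layout, and the choice of mean-centring without scaling are all hypothetical.

```python
import math

def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model (PLS1) with NIPALS on mean-centred data."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    # Work on centred copies so the caller's data are left untouched.
    Xc = [[row[j] - x_mean[j] for j in range(m)] for row in X]
    yc = [v - y_mean for v in y]
    weights, loadings, q_coefs = [], [], []
    for _ in range(n_components):
        # Weight vector: X-direction with maximal covariance with y.
        w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # y is already fully explained; stop early
            break
        w = [v / norm for v in w]
        # Scores, X-loadings and the y-loading for this component.
        t = [sum(Xc[i][j] * w[j] for j in range(m)) for i in range(n)]
        tt = sum(v * v for v in t)
        p = [sum(Xc[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = sum(yc[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next component.
        for i in range(n):
            for j in range(m):
                Xc[i][j] -= t[i] * p[j]
            yc[i] -= q * t[i]
        weights.append(w)
        loadings.append(p)
        q_coefs.append(q)
    return {"x_mean": x_mean, "y_mean": y_mean,
            "weights": weights, "loadings": loadings, "q": q_coefs}

def pls1_predict(model, x):
    """Predict y for one new sample by repeating the deflation steps."""
    xc = [v - mu for v, mu in zip(x, model["x_mean"])]
    y_hat = model["y_mean"]
    for w, p, q in zip(model["weights"], model["loadings"], model["q"]):
        t = sum(a * b for a, b in zip(xc, w))
        y_hat += q * t
        xc = [a - t * b for a, b in zip(xc, p)]
    return y_hat
```

A PLS2 model would replace the single response vector with a matrix of responses and iterate between X- and Y-scores. In practice the number of components is usually chosen by cross-validation rather than fixed in advance.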
NASA Astrophysics Data System (ADS)
Barbeira, Paulo J. S.; Paganotti, Rosilene S. N.; Ássimos, Ariane A.
2013-10-01
This study aimed to determine the dry extract content of commercial alcoholic extracts of bee propolis by partial least squares (PLS) multivariate calibration and electronic spectroscopy. The PLS model provided a good prediction of dry extract content in commercial alcoholic extracts of bee propolis in the range of 2.7 to 16.8% (m/v), with the advantage of being less laborious and faster than the traditional gravimetric method. The PLS model was optimized with outlier detection tests according to ASTM E 1655-05. The study showed that a centrifugation stage is extremely important to avoid the presence of waxes, resulting in a more accurate model. Around 50% of the analyzed samples had a dry extract content lower than the value established by Brazilian legislation; in most cases, the values found differed from those claimed on the product label.
Ishikawa, Daitaro; Nishii, Takashi; Mizuno, Fumiaki; Sato, Harumi; Kazarian, Sergei G; Ozaki, Yukihiro
2013-12-01
This study was carried out to evaluate a new high-speed hyperspectral near-infrared (NIR) camera named Compovision. The crystallinity and crystal evolution of a biodegradable polymer, polylactic acid (PLA), and its concentration in PLA/poly-(R)-3-hydroxybutyrate (PHB) blends were analyzed quantitatively using NIR imaging. This NIR camera can measure two-dimensional NIR spectral data in the 1000-2350 nm region, obtaining images with a wide field of view of 150 × 250 mm² (approximately 100 000 pixels) at high speed (in less than 5 s). PLA samples with crystallinities between 0 and 50%, PLA/PHB blends in ratios of 80/20, 60/40, 40/60, and 20/80, and pure films of 100% PLA and PHB were prepared. Compovision was used to collect the respective NIR spectra in the 1000-2350 nm region and to investigate the crystallinity of PLA and its concentration in the blends. Partial least squares (PLS) regression models for the crystallinity of PLA were developed using absorbance, second derivative, and standard normal variate (SNV) spectra from the most informative region of the spectra, between 1600 and 2000 nm. The predictions of the PLS models built on the absorbance and second derivative spectra were fairly good, with a root mean square error (RMSE) of less than 6.1% and a coefficient of determination (R²) of more than 0.88 for PLS factor 1. The SNV spectra yielded the best prediction, with the smallest RMSE of 2.93% and the highest R² of 0.976. Moreover, PLS models developed for estimating the concentration of PLA in the blend polymers using SNV spectra gave good predictions, with an RMSE of 4.94% and an R² of 0.98. The SNV-based models provided the best predictions, since SNV can reduce the effects of the spectral changes induced by the inhomogeneity and the thickness of the samples.
Wide area crystal evolution of PLA on a plate where a temperature slope of 70-105 °C had occurred was also monitored using NIR imaging. An SNV-based image gave an obvious contrast of the crystallinity around the crystal growth area according to slight temperature change. Moreover, it clarified the inhomogeneity of crystal evolution over the significant wide area. These results have proved that the newly developed hyperspectral NIR camera, Compovision, can be successfully used to study polymers for industrial processes, such as monitoring the crystallinity of PLA and the different composition of PLA/PHB blends.
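The SNV pre-treatment credited above with reducing thickness and inhomogeneity effects can be illustrated in a few lines. This is a generic sketch of the standard normal variate transform applied to one spectrum at a time, not the processing pipeline used with Compovision; the function name `snv` and the use of the sample standard deviation (n - 1 denominator) are assumptions.

```python
import math

def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own statistics."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    # Sample standard deviation across the wavelengths of this spectrum.
    sd = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / sd for v in spectrum]
```

Because each spectrum is normalized by its own mean and standard deviation, a spectrum that differs from another only by a constant offset and a positive multiplicative factor (a crude stand-in for a thicker sample) maps to the same SNV result.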
Ouyang, Qin; Zhao, Jiewen; Chen, Quansheng
2015-01-01
The non-sugar solids (NSS) content is one of the most important nutrition indicators of Chinese rice wine. This study proposed a rapid method for the measurement of NSS content in Chinese rice wine using near infrared (NIR) spectroscopy. We also systematically studied efficient spectral variable selection algorithms for use in modeling. A new algorithm, synergy interval partial least squares with competitive adaptive reweighted sampling (Si-CARS-PLS), was proposed for modeling. The performance of the final model was back-evaluated using the root mean square error of calibration (RMSEC) and correlation coefficient (Rc) in the calibration set, and similarly tested using the root mean square error of prediction (RMSEP) and correlation coefficient (Rp) in the prediction set. The optimum Si-CARS-PLS model was achieved when 7 PLS factors and 18 variables were included, and the results were as follows: Rc=0.95 and RMSEC=1.12 in the calibration set, Rp=0.95 and RMSEP=1.22 in the prediction set. In addition, the Si-CARS-PLS algorithm showed its superiority when compared with the commonly used algorithms in multivariate calibration. This work demonstrated that the NIR spectroscopy technique combined with a suitable multivariate calibration algorithm has a high potential for rapid measurement of NSS content in Chinese rice wine. Copyright © 2015 Elsevier B.V. All rights reserved.
Simultaneous determination of specific alpha and beta emitters by LSC-PLS in water samples.
Fons-Castells, J; Tent-Petrus, J; Llauradó, M
2017-01-01
Liquid scintillation counting (LSC) is a commonly used technique for the determination of alpha and beta emitters. However, LSC has poor resolution and the continuous spectra for beta emitters hinder the simultaneous determination of several alpha and beta emitters from the same spectrum. In this paper, the feasibility of multivariate calibration by partial least squares (PLS) models for the determination of several alpha (ⁿᵃᵗU, ²⁴¹Am and ²²⁶Ra) and beta emitters (⁴⁰K, ⁶⁰Co, ⁹⁰Sr/⁹⁰Y, ¹³⁴Cs and ¹³⁷Cs) in water samples is reported. A set of alpha and beta spectra from radionuclide calibration standards were used to construct three PLS models. Experimentally mixed radionuclides and intercomparison materials were used to validate the models. The results had a maximum relative bias of 25% when all the radionuclides in the sample were included in the calibration set; otherwise the relative bias was over 100% for some radionuclides. The results obtained show that LSC-PLS is a useful approach for the simultaneous determination of alpha and beta emitters in multi-radionuclide samples. However, to obtain useful results, it is important to include all the radionuclides expected in the studied scenario in the calibration set. Copyright © 2016 Elsevier Ltd. All rights reserved.
Hattotuwagama, Channa K; Doytchinova, Irini A; Flower, Darren R
2007-01-01
Quantitative structure-activity relationship (QSAR) analysis is a cornerstone of modern informatics. Predictive computational models of peptide-major histocompatibility complex (MHC)-binding affinity based on QSAR technology have now become important components of modern computational immunovaccinology. Historically, such approaches have been built around semiqualitative, classification methods, but these are now giving way to quantitative regression methods. We review three methods. Two of them, a 2D-QSAR additive-partial least squares (PLS) method and a 3D-QSAR comparative molecular similarity index analysis (CoMSIA) method, can identify the sequence dependence of peptide-binding specificity for various class I MHC alleles from the reported binding affinities (IC50) of peptide sets. The third method is an iterative self-consistent (ISC) PLS-based additive method, which is a recently developed extension to the additive method for the affinity prediction of class II peptides. The QSAR methods presented here have established themselves as immunoinformatic techniques complementary to existing methodology, useful in the quantitative prediction of binding affinity: current methods for the in silico identification of T-cell epitopes (which form the basis of many vaccines, diagnostics, and reagents) rely on the accurate computational prediction of peptide-MHC affinity. We have reviewed various human and mouse class I and class II allele models. Studied alleles comprise HLA-A*0101, HLA-A*0201, HLA-A*0202, HLA-A*0203, HLA-A*0206, HLA-A*0301, HLA-A*1101, HLA-A*3101, HLA-A*6801, HLA-A*6802, HLA-B*3501, H2-K(k), H2-K(b), H2-D(b), HLA-DRB1*0101, HLA-DRB1*0401, HLA-DRB1*0701, I-A(b), I-A(d), I-A(k), I-A(S), I-E(d), and I-E(k). In this chapter we present a step-by-step guide to the methods and to assessing the reliability of the resulting models, which represent an advance on existing methods. The peptides used in this study are available from the AntiJen database (http://www.jenner.ac.uk/AntiJen).
The PLS method is available commercially in the SYBYL molecular modeling software package. The resulting models, which can be used for accurate T-cell epitope prediction, are freely available online at the URL http://www.jenner.ac.uk/MHCPred.
Estimation of soil sorption coefficients of veterinary pharmaceuticals from soil properties.
ter Laak, Thomas L; Gebbink, Wouter A; Tolls, Johannes
2006-04-01
Environmental exposure assessment of veterinary pharmaceuticals requires estimating the sorption to soil. Soil sorption coefficients of three common, ionizable, antimicrobial agents (oxytetracycline [OTC], tylosin [TYL], and sulfachloropyridazine [SCP]) were studied in relation to the soil properties of 11 different soils. The soil sorption coefficient at natural pH varied from 950 to 7,200, 10 to 370, and 0.4 to 35 L/kg for OTC, TYL, and SCP, respectively. The variation increased by almost two orders of magnitude for OTC and TYL when pH was artificially adjusted. Separate soil properties (pH, organic carbon content, clay content, cation-exchange capacity, aluminum oxyhydroxide content, and iron oxyhydroxide content) were not able to explain more than half the variation observed in soil sorption coefficients. This reflects the complexity of the sorbent-sorbate interactions. Partial-least-squares (PLS) models, integrating all the soil properties listed above, were able to explain as much as 78% of the variation in sorption coefficients. The PLS model was able to predict the sorption coefficient with an accuracy of a factor of six. Considering the pH-dependent speciation, species-specific PLS models were developed. These models were able to predict species-specific sorption coefficients with an accuracy of a factor of three to four. However, the species-specific sorption models did not improve the estimation of sorption coefficients of species mixtures, because these models were developed with a reduced data set at standardized aqueous concentrations. In conclusion, pragmatic approaches like PLS modeling might be suitable to estimate soil sorption for risk assessment purposes.
Kehimkar, Benjamin; Hoggard, Jamin C; Marney, Luke C; Billingsley, Matthew C; Fraga, Carlos G; Bruno, Thomas J; Synovec, Robert E
2014-01-31
There is an increased need to more fully assess and control the composition of kerosene-based rocket propulsion fuels such as RP-1. In particular, it is critical to make better quantitative connections among the following three attributes: fuel performance (thermal stability, sooting propensity, engine specific impulse, etc.), fuel properties (such as flash point, density, kinematic viscosity, net heat of combustion, and hydrogen content), and the chemical composition of a given fuel, i.e., amounts of specific chemical compounds and compound classes present in a fuel as a result of feedstock blending and/or processing. Recent efforts in predicting fuel chemical and physical behavior through modeling put greater emphasis on attaining detailed and accurate fuel properties and fuel composition information. Often, one-dimensional gas chromatography (GC) combined with mass spectrometry (MS) is employed to provide chemical composition information. Building on approaches that used GC-MS, but to glean substantially more chemical information from these complex fuels, we recently studied the use of comprehensive two dimensional (2D) gas chromatography combined with time-of-flight mass spectrometry (GC×GC-TOFMS) using a "reversed column" format: RTX-wax column for the first dimension, and a RTX-1 column for the second dimension. In this report, by applying chemometric data analysis, specifically partial least-squares (PLS) regression analysis, we are able to readily model (and correlate) the chemical compositional information provided by use of GC×GC-TOFMS to RP-1 fuel property information such as density, kinematic viscosity, net heat of combustion, and so on. Furthermore, we readily identified compounds that contribute significantly to measured differences in fuel properties based on results from the PLS models. 
We anticipate this new chemical analysis strategy will have broad implications for the development of high fidelity composition-property models, leading to an improved approach to fuel formulation and specification for advanced engine cycles. Copyright © 2014 Elsevier B.V. All rights reserved.
Freye, Chris E; Fitz, Brian D; Billingsley, Matthew C; Synovec, Robert E
2016-06-01
The chemical composition and several physical properties of RP-1 fuels were studied using comprehensive two-dimensional (2D) gas chromatography (GC×GC) coupled with flame ionization detection (FID). A "reversed column" GC×GC configuration was implemented with a RTX-wax column as the first dimension (¹D) and a RTX-1 column as the second dimension (²D). Modulation was achieved using a high temperature diaphragm valve mounted directly in the oven. Using leave-one-out cross-validation (LOOCV), the summed GC×GC-FID signal of three compound-class selective 2D regions (alkanes, cycloalkanes, and aromatics) was regressed against previously measured ASTM derived values for these compound classes, yielding root mean square errors of cross validation (RMSECV) of 0.855, 0.734, and 0.530 mass%, respectively. For comparison, using partial least squares (PLS) analysis with LOOCV, the GC×GC-FID signal of the entire 2D separations was regressed against the same ASTM values, giving a linear trend for the three compound classes (alkanes, cycloalkanes, and aromatics) with RMSECV values of 1.52, 2.76, and 0.945 mass%, respectively. Additionally, a more detailed PLS analysis was undertaken of the compound classes (n-alkanes, iso-alkanes, mono-, di-, and tri-cycloalkanes, and aromatics), and of physical properties previously determined by ASTM methods (such as net heat of combustion, hydrogen content, density, kinematic viscosity, sustained boiling temperature and vapor rise temperature). Results from these PLS studies using the relatively simple to use and inexpensive GC×GC-FID instrumental platform are compared to previously reported results using the GC×GC-TOFMS instrumental platform. Copyright © 2016 Elsevier B.V. All rights reserved.
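The first regression described above (a summed class signal regressed against reference mass% values and scored by leave-one-out cross-validation) can be sketched as follows. This is an illustration under assumptions, not the authors' code: a single-predictor least-squares line is assumed, and the function names `fit_line` and `rmsecv_loo` are hypothetical.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def rmsecv_loo(x, y):
    """Root mean square error of leave-one-out cross-validation (RMSECV)."""
    squared_errors = 0.0
    for i in range(len(x)):
        # Refit the calibration line without sample i ...
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        # ... then predict the held-out sample and accumulate its error.
        squared_errors += (y[i] - (slope * x[i] + intercept)) ** 2
    return math.sqrt(squared_errors / len(x))
```

The same loop applies to a PLS model: only the fit and predict steps inside the loop change, while the held-out error accumulation stays the same.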
GTM-Based QSAR Models and Their Applicability Domains.
Gaspar, H A; Baskin, I I; Marcou, G; Horvath, D; Varnek, A
2015-06-01
In this paper we demonstrate that Generative Topographic Mapping (GTM), a machine learning method traditionally used for data visualisation, can be efficiently applied to QSAR modelling using probability distribution functions (PDF) computed in the latent 2-dimensional space. Several different scenarios of the activity assessment were considered: (i) the "activity landscape" approach based on direct use of PDF, (ii) QSAR models involving GTM-generated descriptors derived from PDF, and (iii) the k-Nearest Neighbours approach in 2D latent space. Benchmarking calculations were performed on five different datasets: stability constants of metal cations Ca²⁺, Gd³⁺ and Lu³⁺ complexes with organic ligands in water, aqueous solubility and activity of thrombin inhibitors. It has been shown that the performance of GTM-based regression models is similar to that obtained with some popular machine-learning methods (random forest, k-NN, M5P regression tree and PLS) and ISIDA fragment descriptors. By comparing GTM activity landscapes built both on predicted and experimental activities, we may visually assess the model's performance and identify the areas in the chemical space corresponding to reliable predictions. The applicability domain used in this work is based on data likelihood. Its application has significantly improved the model performances for 4 out of 5 datasets. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Edinger, Magnus; Knopp, Matthias Manne; Kerdoncuff, Hugo; Rantanen, Jukka; Rades, Thomas; Löbmann, Korbinian
2018-05-30
In this study, the influence of drug load on the microwave-induced amorphization of celecoxib (CCX) in polyvinylpyrrolidone (PVP) tablets was investigated using quantitative transmission Raman spectroscopy. A design of experiments (DoE) setup was applied for developing the quantitative model using two factors: drug load (10, 30, and 50% w/w) and amorphous fraction (0, 25, 50, 75 and 100%). The data were modeled using partial least-squares (PLS) regression, resulting in a robust model with a root mean-square error of prediction of 2.5%. The PLS model was used to study the amorphization kinetics of CCX-PVP tablets with different drug content (10, 20, 30, 40 and 50% w/w). For this purpose, transmission Raman spectra were collected in 60 s intervals over a total microwave time of 10 min with an energy input of 1000 W. Using the quantitative model it was possible to measure the amorphous fraction of the tablets and follow the amorphization as a function of microwaving time. The relative amorphous fraction of CCX increased with increasing microwaving time and decreasing drug load, hence 90 ± 7% of the drug was amorphized in the tablets with 10% drug load whereas only 31 ± 7% of the drug was amorphized in the 50% CCX tablets. It is suggested that the degree of amorphization depends on drug loading. Direct contact between drug particles and the polymer PVP is a requirement for the dissolution of the drug into the polymer upon microwaving, and the likelihood of such contact decreases with increasing drug load. This was further supported by polarized light microscopy that revealed evidence of crystalline particles and clusters in all the microwaved tablets. Copyright © 2018 Elsevier B.V. All rights reserved.
Process analytical technology in continuous manufacturing of a commercial pharmaceutical product.
Vargas, Jenny M; Nielsen, Sarah; Cárdenas, Vanessa; Gonzalez, Anthony; Aymat, Efrain Y; Almodovar, Elvin; Classe, Gustavo; Colón, Yleana; Sanchez, Eric; Romañach, Rodolfo J
2018-03-01
The implementation of process analytical technology and continuous manufacturing at an FDA-approved commercial manufacturing site is described. In this direct compaction process the blends produced were monitored with a Near Infrared (NIR) spectroscopic calibration model developed with partial least squares (PLS) regression. The authors understand that this is the first study where the continuous manufacturing (CM) equipment was used as a gravimetric reference method for the calibration model. A principal component analysis (PCA) model was also developed to identify the powder blend, and determine whether it was similar to the calibration blends. An air diagnostic test was developed to assure that powder was present within the interface when the NIR spectra were obtained. The air diagnostic test as well as the PCA and PLS calibration models were integrated into an industrial software platform that collects the real time NIR spectra and applies the calibration models. The PCA test successfully detected an equipment malfunction. Variographic analysis was also performed to estimate the sampling and analytical errors that affect the results from the NIR spectroscopic method during commercial production. The system was used to monitor and control a 28 h continuous manufacturing run, where the average drug concentration determined by the NIR method was 101.17% of label claim with a standard deviation of 2.17%, based on 12,633 spectra collected. The average drug concentration for the tablets produced from these blends was 100.86% of label claim with a standard deviation of 0.4%, for 500 tablets analyzed by Fourier Transform Near Infrared (FT-NIR) transmission spectroscopy. The excellent agreement between the mean drug concentration values in the blends and tablets produced provides further evidence of the suitability of the validation strategy that was followed. Copyright © 2018 Elsevier B.V. All rights reserved.
Escuder-Gilabert, L; Martín-Biosca, Y; Sagrado, S; Medina-Hernández, M J
2014-10-10
The design of experiments (DOE) is a good option for rationally limiting the number of experiments required to achieve the enantioresolution (Rs) of a chiral compound in capillary electrophoresis. In some cases, the modeled Rs after DOE analysis can be unsatisfactory, possibly because the range of the explored factors (DOE domain) was not adequate. In these cases, anticipative strategies can be an alternative to repeating the process (e.g. a new DOE), saving time and money. In this work, multiple linear regression (MLR)-steepest ascent and a new anticipative strategy based on a multiple response-partial least squares model (called PLS2-prediction) are examined as post-DOE strategies to anticipate new experimental conditions providing satisfactory Rs values. The new anticipative strategy allows the analysis time (At) and uncertainty limits to be included in the decision making process. To demonstrate their efficiency, the chiral separation of hexaconazole and penconazole, as model compounds, is studied using highly sulfated-β-cyclodextrin (HS-β-CD) in electrokinetic chromatography (EKC). A Box-Behnken DOE for three factors (background electrolyte pH, separation temperature and HS-β-CD concentration) and two responses (Rs and At) is used. Using commercially available software, the whole modeling and anticipative process is automatic, simple and requires minimal skills from the researcher. Both strategies studied successfully anticipated Rs values close to the experimental ones for EKC conditions outside the DOE domain for the two model compounds. The results in this work suggest that the PLS2-prediction approach could be the strategy of choice to obtain secure anticipations in EKC. Copyright © 2014 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Yu, Jiajia; He, Yong
Mango is a popular tropical fruit, and the soluble solid content is an important ... In this study, the visible and short-wave near-infrared spectroscopy (VIS/SWNIR) technique was applied. To investigate the feasibility of using VIS/SWNIR spectroscopy to measure the soluble solid content in mango, and to validate the performance of selected sensitive bands, the calibration set was formed from 135 mango samples, with the remaining 45 mango samples forming the prediction set. The combination of partial least squares and backpropagation artificial neural networks (PLS-BP) was used to calculate the prediction model based on raw spectrum data. Based on PLS-BP, the determination coefficient for prediction (Rp) was 0.757 and root mean square ... The process is simple and easy to operate. Compared with the partial least squares (PLS) result, the performance of PLS-BP is better.
Bunaciu, Andrei A.; Udristioiu, Gabriela Elena; Ruţă, Lavinia L.; Fleschin, Şerban; Aboul-Enein, Hassan Y.
2009-01-01
A Fourier transform infrared (FT-IR) spectrometric method was developed for the rapid, direct measurement of diosmin in different pharmaceutical drugs. Conventional KBr-spectra were compared for best determination of active substance in commercial preparations. The Beer–Lambert law and two chemometric approaches, partial least squares (PLS) and principal component regression (PCR+) methods, were tried in data processing. PMID:23960715
Iterative random vs. Kennard-Stone sampling for IR spectrum-based classification task using PLS2-DA
NASA Astrophysics Data System (ADS)
Lee, Loong Chuen; Liong, Choong-Yeun; Jemain, Abdul Aziz
2018-04-01
External testing (ET) is preferred over auto-prediction (AP) or k-fold cross-validation for estimating a more realistic predictive ability of a statistical model. With IR spectra, the Kennard-Stone (KS) sampling algorithm is often used to split the data into training and test sets, i.e. for model construction and for model testing, respectively. On the other hand, iterative random sampling (IRS) has not been the favored choice though it is theoretically more likely to produce reliable estimation. The aim of this preliminary work is to compare the performances of KS and IRS in sampling a representative training set from an attenuated total reflectance - Fourier transform infrared spectral dataset (of four varieties of blue gel pen inks) for PLS2-DA modeling. The 'best' performance achievable from the dataset is estimated with AP on the full dataset (APF, error). Both IRS (n = 200) and KS were used to split the dataset in the ratio of 7:3. The classic decision rule (i.e. maximum value-based) is employed for new sample prediction via partial least squares - discriminant analysis (PLS2-DA). The error rate of each model was estimated repeatedly via: (a) AP on full data (APF, error); (b) AP on training set (APS, error); and (c) ET on the respective test set (ETS, error). A good PLS2-DA model is expected to produce APS, error and ETS, error that are similar to the APF, error. Bearing that in mind, the similarities between (a) APS, error vs. APF, error; (b) ETS, error vs. APF, error; and (c) APS, error vs. ETS, error were evaluated using correlation tests (i.e. Pearson and Spearman's rank test), using series of PLS2-DA models computed from the KS-set and IRS-set, respectively. Overall, models constructed from the IRS-set exhibit more similarities between the internal and external error rates than those from the respective KS-set, i.e. less risk of overfitting. In conclusion, IRS is more reliable than KS in sampling a representative training set.
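For readers unfamiliar with the Kennard-Stone split discussed above, the usual formulation can be sketched as follows. This is a generic illustration with Euclidean distances, not the code used in the study; the function name `kennard_stone` is hypothetical and `n_select` is assumed to be at least 2.

```python
import math

def kennard_stone(X, n_select):
    """Return (selected, remaining) sample indices chosen by Kennard-Stone."""
    n = len(X)
    # Pairwise Euclidean distances between all samples.
    dist = [[math.dist(X[i], X[j]) for j in range(n)] for i in range(n)]
    # Seed the training set with the two mutually most distant samples.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = [k for k in range(n) if k not in selected]
    while len(selected) < n_select and remaining:
        # Add the candidate whose nearest selected neighbour is farthest away.
        nxt = max(remaining, key=lambda k: min(dist[k][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected, remaining
```

For a 7:3 split, `n_select` would be about 70% of the sample count, with the remaining indices forming the test set. Unlike iterative random sampling, the procedure is deterministic, so repeating it on the same data always yields the same split.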
Rapid monitoring of grape withering using visible near-infrared spectroscopy.
Beghi, Roberto; Giovenzana, Valentina; Marai, Simone; Guidetti, Riccardo
2015-12-01
Wineries need new practical, quick and non-destructive instruments able to quantitatively evaluate, during withering, the parameters that impact product quality. The aim of the work was to test an optical portable system (visible near-infrared (NIR) spectrophotometer) in a wavelength range of 400-1000 nm for the prediction of quality parameters of grape berries during withering. A total of 300 red grape samples (Vitis vinifera L., Corvina cultivar) harvested in vintage year 2012 from the Valpolicella area (Verona, Italy) were analyzed. Qualitative (principal component analysis, PCA) and quantitative (partial least squares regression algorithm, PLS) evaluations were performed on grape spectra. PCA showed a clear sample grouping for the different withering stages. PLS models gave encouraging predictive capabilities for soluble solids content (R² val = 0.62 and ratio of performance to deviation, RPD = 1.87) and firmness (R² val = 0.56 and RPD = 1.79). The work demonstrated the applicability of visible NIR spectroscopy as a rapid technique for the analysis of grape quality directly in barns, during withering. The sector could be provided with simple and inexpensive optical systems that could be used to monitor the withering degree of grape for better management of the wine production process. © 2014 Society of Chemical Industry.
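The RPD figure quoted above is commonly computed as the standard deviation of the reference values divided by the root mean square error of prediction. The sketch below assumes that common definition, which may differ in detail from the one used by the authors (for example, in which sample set the standard deviation is taken over); the function names `rmse` and `rpd` are hypothetical.

```python
import math
import statistics

def rmse(y_true, y_pred):
    """Root mean square error between reference and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n)

def rpd(y_true, y_pred):
    """Ratio of performance to deviation: spread of the references over the RMSE."""
    return statistics.stdev(y_true) / rmse(y_true, y_pred)
```

Higher values indicate that the prediction error is small relative to the natural spread of the property being measured.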
Rationalizing context-dependent performance of dynamic RNA regulatory devices.
Kent, Ross; Halliwell, Samantha; Young, Kate; Swainston, Neil; Dixon, Neil
2018-06-21
The ability of RNA to sense, regulate and store information is an attractive attribute for a variety of functional applications including the development of regulatory control devices for synthetic biology. RNA folding and function are known to be highly context sensitive, which limits the modularity and reuse of RNA regulatory devices to control different heterologous sequences and genes. We explored the cause and effect of sequence context sensitivity for translational ON riboswitches located in the 5' UTR, by constructing and screening a library of N-terminal synonymous codon variants. By altering the N-terminal codon usage we were able to obtain RNA devices with a broad range of functional performance properties (ON, OFF, fold-change). Linear regression and calculated metrics were used to rationalize the major determining features leading to optimal riboswitch performance, and to identify multiple interactions between the explanatory metrics. Finally, partial least squares (PLS) analysis was employed in order to understand the metrics and their respective effect on performance. This PLS model was shown to provide a good explanation of our library. This study provides a novel multi-variant analysis framework by which to rationalize the codon context performance of allosteric RNA-devices. The framework will also serve as a platform for future riboswitch context engineering endeavors.
NASA Astrophysics Data System (ADS)
Abdel Hameed, Eman A.; Abdel Salam, Randa A.; Hadad, Ghada M.
2015-04-01
Chemometric-assisted spectrophotometric methods and high performance liquid chromatography (HPLC) were developed for the simultaneous determination of the seven most commonly prescribed β-blockers (atenolol, sotalol, metoprolol, bisoprolol, propranolol, carvedilol and nebivolol). Principal component regression (PCR), partial least squares (PLS) and PLS with previous wavelength selection by genetic algorithm (GA-PLS) were used for chemometric analysis of spectral data of these drugs. The compositions of the mixtures used in the calibration set were varied to cover the linearity ranges 0.7-10 μg ml-1 for AT, 1-15 μg ml-1 for ST, 1-15 μg ml-1 for MT, 0.3-5 μg ml-1 for BS, 0.1-3 μg ml-1 for PR, 0.1-3 μg ml-1 for CV and 0.7-5 μg ml-1 for NB. The analytical performances of these chemometric methods were characterized by relative prediction errors and were compared with each other. GA-PLS showed superiority over the other applied multivariate methods due to the wavelength selection. A new gradient HPLC method was developed using statistical experimental design. Optimum conditions of separation were determined with the aid of central composite design. The developed HPLC method was found to be linear in the range of 0.2-20 μg ml-1 for AT, 0.2-20 μg ml-1 for ST, 0.1-15 μg ml-1 for MT, 0.1-15 μg ml-1 for BS, 0.1-13 μg ml-1 for PR, 0.1-13 μg ml-1 for CV and 0.4-20 μg ml-1 for NB. No significant difference was found between the results of the proposed GA-PLS and HPLC methods with respect to accuracy and precision. The proposed analytical methods did not show any interference of the excipients when applied to pharmaceutical products.
Estimating Forest Species Composition Using a Multi-Sensor Approach
NASA Astrophysics Data System (ADS)
Wolter, P. T.
2009-12-01
The magnitude, duration, and frequency of forest disturbance caused by the spruce budworm and forest tent caterpillar has increased over the last century due to a shift in forest species composition linked to historical fire suppression, forest management, and pesticide application that has fostered the increase in dominance of host tree species. Modeling approaches are currently being used to understand and forecast potential management effects in changing insect disturbance trends. However, detailed forest composition data needed for these efforts is often lacking. Here, we used partial least squares (PLS) regression to integrate satellite sensor data from Landsat, Radarsat-1, and PALSAR, as well as pixel-wise forest structure information derived from SPOT-5 sensor data (Wolter et al. 2009), to estimate species-level forest composition of 12 species required for modeling efforts. C-band Radarsat-1 data and L-band PALSAR data were frequently among the strongest predictors of forest composition. Pixel-level forest structure data were more important for estimating conifer rather than hardwood forest composition. The coefficients of determination for species relative basal area (RBA) ranged from 0.57 (white cedar) to 0.94 (maple) with RMSE of 8.88 to 6.44 % RBA, respectively. Receiver operating characteristic (ROC) curves were used to determine the effective lower limits of usefulness of species RBA estimates which ranged from 5.94 % (jack pine) to 39.41 % (black ash). These estimates were then used to produce a dominant forest species map for the study region with an overall accuracy of 78 %. Most notably, this approach facilitated discrimination of aspen from birch as well as spruce and fir from other conifer species which is crucial for the study of forest tent caterpillar and spruce budworm dynamics, respectively, in the Upper Midwest. 
Thus, use of PLS regression as a data fusion strategy has proven to be an effective tool for regional characterization of forest composition within spatially heterogeneous forests using large-format satellite sensor data.
Collell, Carles; Gou, Pere; Arnau, Jacint; Muñoz, Israel; Comaposada, Josep
2012-12-01
Three different NIR instruments were evaluated based on their ability to predict superficial water activity (a(w)) and moisture content in two types of fermented sausages (with and without moulds on the surface), using partial least squares (PLS) regression models. The instruments differed mainly in wavelength range, resolution and measurement configuration. The most accurate instrument was used in a new experiment to achieve robust models in sausages with different salt contents and submitted to different drying conditions. The models developed showed determination coefficient (R(2)(P)) values of 0.990, 0.910 and 0.984, and RMSEP values of 1.560%, 0.220% and 0.007% for moisture, salt and a(w), respectively. It was demonstrated that NIR spectroscopy could be a suitable non-destructive method for on-line monitoring and control of the drying process in fermented sausages. Copyright © 2012 Elsevier Ltd. All rights reserved.
Hou, Siyuan; Riley, Christopher B; Mitchell, Cynthia A; Shaw, R Anthony; Bryanton, Janet; Bigsby, Kathryn; McClure, J Trenton
2015-09-01
Immunoglobulin G (IgG) is crucial for the protection of the host from invasive pathogens. Due to its importance for human health, tools that enable the monitoring of IgG levels are highly desired. Consequently there is a need for methods to determine the IgG concentration that are simple, rapid, and inexpensive. This work explored the potential of attenuated total reflectance (ATR) infrared spectroscopy as a method to determine IgG concentrations in human serum samples. Venous blood samples were collected from adults and children, and from the umbilical cord of newborns. The serum was harvested and tested using ATR infrared spectroscopy. Partial least squares (PLS) regression provided the basis to develop the new analytical methods. Three PLS calibrations were determined: one for the combined set of the venous and umbilical cord serum samples, the second for only the umbilical cord samples, and the third for only the venous samples. The number of PLS factors was chosen by critical evaluation of Monte Carlo-based cross validation results. The predictive performance for each PLS calibration was evaluated using the Pearson correlation coefficient, scatter plot and Bland-Altman plot, and percent deviations for independent prediction sets. The repeatability was evaluated by standard deviation and relative standard deviation. The results showed that ATR infrared spectroscopy is potentially a simple, quick, and inexpensive method to measure IgG concentrations in human serum samples. The results also showed that it is possible to build a united calibration curve for the umbilical cord and the venous samples. Copyright © 2015 Elsevier B.V. All rights reserved.
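The agreement statistics mentioned above (Bland-Altman analysis and percent deviations) reduce to simple arithmetic on paired values. The sketch below assumes the conventional 1.96 standard deviation limits of agreement; the IgG values and function names are hypothetical placeholders, not data or code from the study.

```python
import statistics


def bland_altman(reference, predicted):
    """Return (bias, lower limit, upper limit) of agreement."""
    diffs = [p - r for r, p in zip(reference, predicted)]
    bias = statistics.fmean(diffs)
    spread = 1.96 * statistics.stdev(diffs)  # conventional 95% limits
    return bias, bias - spread, bias + spread


def percent_deviation(reference, predicted):
    """Signed deviation of each prediction, as a percentage of the reference."""
    return [100.0 * (p - r) / r for r, p in zip(reference, predicted)]


# Hypothetical IgG concentrations: reference assay vs. PLS prediction.
reference_igg = [10.0, 12.0, 14.0, 16.0]
predicted_igg = [10.5, 11.5, 14.5, 15.5]
bias, lower, upper = bland_altman(reference_igg, predicted_igg)
deviations = percent_deviation(reference_igg, predicted_igg)
```

A bias near zero with narrow limits suggests that the spectroscopic prediction and the reference assay could be used interchangeably within that tolerance.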
Cheng, Mingyu; Wang, Hao; Yoshida, Ryu
2010-01-01
Collagen–platelet (PL)-rich plasma composites have shown in vivo potential to stimulate anterior cruciate ligament (ACL) healing at early time points in large animal models. However, little is known about the cellular mechanisms by which the plasma component of these composites may stimulate healing. We hypothesized that the components of PL-rich plasma (PRP), namely the PLs and PL-poor plasma (PPP), would independently significantly influence ACL cell viability and metabolic activity, including collagen gene expression. To test this hypothesis, ACL cells were cultured in a collagen type I hydrogel with PLs, PPP, or the combination of the two (PRP) for 14 days. The inclusion of PLs, PPP, and PRP all significantly reduced the rate of cell apoptosis and enhanced the metabolic activity of fibroblasts in the collagen hydrogel. PLs promoted fibroblast-mediated collagen scaffold contraction, whereas PPP inhibited this contraction. PPP and PRP both promoted cell elongation and the formation of wavy fibrous structure in the scaffolds. The addition of only PLs or only plasma proteins did not significantly enhance gene expression of collagen types I and III but the combination, as PRP, did. Our findings suggest that the addition of both PLs and plasma proteins to collagen hydrogel may be useful in stimulating ACL healing by enhancing ACL cell viability, metabolic activity, and collagen synthesis. PMID:19958169
Liu, Xue-Mei; Zhang, Hai-Liang
2014-10-01
Ultraviolet/visible (UV/Vis) spectroscopy was studied for the rapid determination of chemical oxygen demand (COD), an indicator of the concentration of organic matter in aquaculture water. In order to reduce the influence of noise in the spectra, the 135 extracted absorbance spectra were preprocessed by Savitzky-Golay smoothing (SG), EMD, and wavelet transform (WT) methods. The preprocessed spectra were then used to select latent variables (LVs) by partial least squares (PLS). PLS was used to build models with the full spectra, and back-propagation neural network (BPNN) and least squares support vector machine (LS-SVM) were applied to build models with the selected LVs. The overall results showed that BPNN and LS-SVM models performed better than PLS models, and the LS-SVM models with LVs based on WT-preprocessed spectra obtained the best results, with the determination coefficient (r2) and RMSE being 0.83 and 14.78 mg·L(-1) for the calibration set, and 0.82 and 14.82 mg·L(-1) for the prediction set, respectively. The results indicated that UV/Vis spectroscopy with LVs obtained by PLS, combined with LS-SVM calibration, could be applied to the rapid and accurate determination of COD in aquaculture water. Moreover, this study laid the foundation for further implementation of online analysis of aquaculture water and rapid determination of other water quality parameters.
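As an illustration of the Savitzky-Golay preprocessing step, the sketch below applies the classic 5-point quadratic smoothing filter, whose convolution weights are (-3, 12, 17, 12, -3)/35. The window size and polynomial order are assumptions made for the example; the abstract does not report the settings used, and the absorbance values are invented.

```python
# Classic 5-point, second-order Savitzky-Golay smoothing weights.
SG_COEFFS = (-3, 12, 17, 12, -3)
SG_NORM = 35


def savgol5(signal):
    """Smooth a sequence with the 5-point quadratic Savitzky-Golay filter.

    The two points at each edge are left unchanged for simplicity.
    """
    half = len(SG_COEFFS) // 2
    smoothed = list(signal)
    for i in range(half, len(signal) - half):
        window = signal[i - half:i + half + 1]
        smoothed[i] = sum(c * v for c, v in zip(SG_COEFFS, window)) / SG_NORM
    return smoothed


# Hypothetical absorbance trace with a single noise spike at index 3.
noisy = [1.0, 1.0, 1.0, 6.0, 1.0, 1.0, 1.0]
print(savgol5(noisy))
```

Because the filter fits a local quadratic, it leaves genuinely quadratic stretches of a spectrum untouched while damping isolated spikes, which is why it is a common first preprocessing step before PLS.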
Investigation of Drug–Polymer Compatibility Using Chemometric-Assisted UV-Spectrophotometry
Mohamed, Amir Ibrahim; Abd-Motagaly, Amr Mohamed Elsayed; Ahmed, Osama A. A.; Amin, Suzan; Mohamed Ali, Alaa Ibrahim
2017-01-01
A simple chemometric-assisted UV-spectrophotometric method was used to study the compatibility of clindamycin hydrochloride (HCl) with two commonly used natural controlled-release polymers, alginate (Ag) and chitosan (Ch). Standard mixtures containing 1:1, 1:2, and 1:0.5 w/w drug–polymer ratios were prepared and UV scanned. A calibration model was developed with partial least square (PLS) regression analysis for each polymer separately. Then, test mixtures containing 1:1 w/w drug–polymer ratios with different sets of drug concentrations were prepared. These were UV scanned initially and after three and seven days of storage at 25 °C. Using the calibration model, the drug recovery percent was estimated and a decrease in concentration of 10% or more from initial concentration was considered to indicate instability. PLS models with PC3 (for Ag) and PC2 (for Ch) showed a good correlation between actual and found values with root mean square error of cross validation (RMSECV) of 0.00284 and 0.01228, and calibration coefficient (R2) values of 0.996 and 0.942, respectively. The average drug recovery percent after three and seven days was 98.1 ± 2.9 and 95.4 ± 4.0 (for Ag), and 97.3 ± 2.1 and 91.4 ± 3.8 (for Ch), which suggests more drug compatibility with an Ag than a Ch polymer. Conventional techniques including DSC, XRD, FTIR, and in vitro minimum inhibitory concentration (MIC) for (1:1) drug–polymer mixtures were also performed to confirm clindamycin compatibility with Ag and Ch polymers. PMID:28275214
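The stability criterion described above (a drop of 10% or more from the initial concentration) is straightforward to express in code. The function names and the example concentrations below are hypothetical; only the 10% threshold comes from the abstract.

```python
def recovery_percent(found, initial):
    """Drug recovery as a percentage of the initial concentration."""
    return 100.0 * found / initial


def is_unstable(found, initial, threshold_percent=10.0):
    """Flag instability when the decrease from initial reaches the threshold."""
    return 100.0 - recovery_percent(found, initial) >= threshold_percent


# Hypothetical concentrations predicted by the PLS calibration model.
initial_conc = 10.0
day7_conc = 9.54
print(recovery_percent(day7_conc, initial_conc), is_unstable(day7_conc, initial_conc))
```

Applied to the mean recoveries reported in the abstract, neither polymer reaches the 10% threshold on average, although chitosan at seven days (91.4 ± 3.8) comes close.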
Monakhova, Yulia B; Diehl, Bernd W K; Do, Tung X; Schulze, Margit; Witzleben, Steffen
2018-02-05
Apart from the characterization of impurities, the full characterization of heparin and low molecular weight heparin (LMWH) also requires the determination of average molecular weight, which is closely related to the pharmaceutical properties of anticoagulant drugs. To determine the average molecular weight of these animal-derived polymer products, partial least squares regression (PLS) was utilized for modelling of diffusion-ordered spectroscopy NMR data (DOSY) of a representative set of heparin (n=32) and LMWH (n=30) samples. The same sets of samples were measured by gel permeation chromatography (GPC) to obtain reference data. The application of PLS to the data led to calibration models with root mean square errors of prediction of 498 Da and 179 Da for heparin and LMWH, respectively. The average coefficients of variation (CVs) did not exceed 2.1% excluding sample preparation (by successively measuring one solution, n=5) and 2.5% including sample preparation (by preparing and analyzing separate samples, n=5). An advantage of the method is that the sample after standard 1D NMR characterization can be used for the molecular weight determination without further manipulation. The accuracy of the multivariate models is better than the previous results for other matrices employing internal standards. Therefore, the DOSY experiment is recommended for the calculation of the molecular weight of heparin products as a complementary measurement to standard 1D NMR quality control. The method can be easily transferred to other matrices as well. Copyright © 2017 Elsevier B.V. All rights reserved.
Detection of crop water status in mature olive orchards using vegetation spectral measurements
NASA Astrophysics Data System (ADS)
Rallo, Giovanni; Ciraolo, Giuseppe; Farina, Giuseppe; Minacapilli, Mario; Provenzano, Giuseppe
2013-04-01
Leaf/stem water potentials are generally considered the most accurate indicators of crop water status (CWS) and they are quite often used for irrigation scheduling, even if costly and time-consuming. For this reason, in the last decade vegetation spectral measurements have been proposed, not only for environmental monitoring, but also in precision agriculture, to evaluate crop parameters and consequently for irrigation scheduling. The objective of the study was to assess the potential of hyperspectral reflectance (450-2400 nm) data to predict the crop water status (CWS) of a Mediterranean olive orchard. Different approaches were tested, in particular: (i) several standard broad- and narrow-band vegetation indices (VIs), (ii) specific VIs computed on the basis of some key wavelengths, predetermined by simple correlations, and finally (iii) the partial least squares (PLS) regression technique. To this aim, an intensive experimental campaign was carried out in 2010 and a total of 201 reflectance spectra, at leaf and canopy level, were collected with an ASD FieldSpec Pro (Analytical Spectral Devices, Inc.) handheld field spectroradiometer. CWS was determined at the same time by measuring leaf and stem water potentials with the Scholander chamber. The results indicated that the considered standard vegetation indices were weakly correlated with CWS. On the other hand, the prediction of CWS can be improved using VIs based on key wavelengths, predetermined with a correlation analysis. The best prediction accuracy, however, can be achieved with models based on PLS regressions. The results confirmed the dependence of leaf/canopy optical features on CWS so that, for the examined crop, the proposed methodology can be considered a promising tool that could also be extended for operational applications using multispectral aerial sensors.
Kong, Yu; Wu, Qun; Zhang, Yan
2014-01-01
The in situ metabolic characteristics of the yeasts involved in spontaneous fermentation process of Chinese light-style liquor are poorly understood. The covariation between metabolic profiles and yeast communities in Chinese light-style liquor was modeled using the partial least square (PLS) regression method. The diversity of yeast species was evaluated by sequence analysis of the 26S ribosomal DNA (rDNA) D1/D2 domains of cultivable yeasts, and the volatile compounds in fermented grains were analyzed by gas chromatography (GC)-mass spectrometry (MS). Eight yeast species and 58 volatile compounds were identified, respectively. The modulation of 16 of these volatile compounds was associated with variations in the yeast population (goodness of prediction [Q2] > 20%). The results showed that Pichia anomala was responsible for the characteristic aroma of Chinese liquor, through the regulation of several important volatile compounds, such as ethyl lactate, octanoic acid, and ethyl tetradecanoate. Correspondingly, almost all of the compounds associated with P. anomala were detected in a pure culture of this yeast. In contrast to the PLS regression results, however, ethyl lactate and ethyl isobutyrate were not detected in the same pure culture, which indicated that some metabolites could be generated by P. anomala only when it existed in a community with other yeast species. Furthermore, different yeast communities provided different volatile patterns in the fermented grains, which resulted in distinct flavor profiles in the resulting liquors. This study could help identify the key yeast species involved in spontaneous fermentation and provide a deeper understanding of the role of individual yeast species in the community. PMID:24727269
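The goodness-of-prediction criterion (Q2 > 20%) can be illustrated with a leave-one-out calculation, Q2 = 1 - PRESS/TSS. In the sketch below a one-variable least-squares line stands in for the PLS model purely to keep the example short; the data are invented and the choice of leave-one-out (rather than another cross-validation scheme) is an assumption.

```python
import statistics


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    return slope, mean_y - slope * mean_x


def q2_leave_one_out(x, y):
    """Q2 = 1 - PRESS / TSS, with PRESS from leave-one-out predictions."""
    press = 0.0
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    mean_y = statistics.fmean(y)
    total = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press / total


# Hypothetical yeast abundance vs. two volatile compound levels.
yeast_abundance = [1.0, 2.0, 3.0, 4.0, 5.0]
compound_linked = [2.0, 4.0, 6.0, 8.0, 10.0]    # tracks the yeast population
compound_unlinked = [5.0, 1.0, 4.0, 2.0, 3.0]   # essentially unrelated
q2_linked = q2_leave_one_out(yeast_abundance, compound_linked)
q2_unlinked = q2_leave_one_out(yeast_abundance, compound_unlinked)
```

Under a 20% cut-off, only the first compound would be retained as associated with the yeast population.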
Wu, Xia; Zhu, Jian-Cheng; Zhang, Yu; Li, Wei-Min; Rong, Xiang-Lu; Feng, Yi-Fan
2016-08-25
The potential impact of lipid research has been increasingly realized both in disease treatment and prevention. An effective metabolomics approach based on ultra-performance liquid chromatography/quadrupole-time-of-flight mass spectrometry (UPLC/Q-TOF-MS) along with multivariate statistical analysis has been applied for investigating the dynamic change of plasma phospholipid compositions in early type 2 diabetic rats after treatment with Huang-Qi-San, an ancient prescription of Chinese Medicine. The exported UPLC/Q-TOF-MS data of plasma samples were subjected to SIMCA-P and processed by the bioMark, mixOmics and Rcomdr packages with R software. Clear score plots of the plasma sample groups, including the normal control group (NC), model group (MC), positive medicine control group (Flu) and Huang-Qi-San group (HQS), were achieved by principal-components analysis (PCA), partial least-squares discriminant analysis (PLS-DA) and orthogonal partial least-squares discriminant analysis (OPLS-DA). Biomarkers were screened out using Student's t-test, principal component regression (PCR), partial least-squares regression (PLS) and the important variable method (variable influence on projection, VIP). Structures of metabolites were identified and metabolic pathways were deduced by correlation coefficient. The relationship between compounds was explained by the correlation coefficient diagram, and the metabolic differences between similar compounds were illustrated. Based on the KEGG database, the biological significance of the identified biomarkers was described. The correlation coefficient was applied for the first time to identify the structure and deduce the metabolic pathways of phospholipid metabolites, and the study provided a new methodological cue for further understanding the molecular mechanisms of metabolites in the process of regulating Huang-Qi-San for treating early type 2 diabetes. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
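The VIP (variable influence on projection) screening mentioned above can be sketched with a small single-response PLS fit. The code below is a bare-bones NIPALS PLS1 written in plain Python under stated assumptions: the data are centered but not scaled, the response is a single numeric column, and the toy matrix is invented. It is not the SIMCA-P or R implementation used in the study.

```python
import math


def pls1_vip(X, y, n_components):
    """Fit PLS1 by NIPALS on centered data and return one VIP score per variable."""
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - col_means[j] for j in range(p)] for row in X]  # X residuals
    f = [yi - y_mean for yi in y]                                  # y residuals
    weights, explained = [], []
    for _ in range(n_components):
        # Weight vector: direction in X with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(wj * wj for wj in w))
        w = [wj / norm for wj in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(ti * ti for ti in t)
        loadings = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * loadings[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        weights.append(w)
        explained.append(q * q * tt)  # y sum of squares explained by this component
    total = sum(explained)
    return [
        math.sqrt(p * sum(ss * w[j] ** 2 for ss, w in zip(explained, weights)) / total)
        for j in range(p)
    ]


# Hypothetical data: three metabolite intensities, one response per sample.
X = [[1, 2, 0.5], [2, 1, 0.1], [3, 4, 0.4], [4, 3, 0.2], [5, 6, 0.3], [6, 5, 0.6]]
y = [1.1, 2.0, 2.9, 4.2, 5.1, 5.9]
vip = pls1_vip(X, y, n_components=2)
```

By construction the squared VIP scores average to 1, which is why a cut-off of VIP > 1 is a widely used rule of thumb for flagging influential variables.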
New consensus multivariate models based on PLS and ANN studies of sigma-1 receptor antagonists.
Oliveira, Aline A; Lipinski, Célio F; Pereira, Estevão B; Honorio, Kathia M; Oliveira, Patrícia R; Weber, Karen C; Romero, Roseli A F; de Sousa, Alexsandro G; da Silva, Albérico B F
2017-10-02
The treatment of neuropathic pain is very complex and there are few drugs approved for this purpose. Among the studied compounds in the literature, sigma-1 receptor antagonists have shown to be promising. In order to develop QSAR studies applied to the compounds of 1-arylpyrazole derivatives, multivariate analyses have been performed in this work using partial least square (PLS) and artificial neural network (ANN) methods. A PLS model has been obtained and validated with 45 compounds in the training set and 13 compounds in the test set (r 2 training = 0.761, q 2 = 0.656, r 2 test = 0.746, MSE test = 0.132 and MAE test = 0.258). Additionally, multi-layer perceptron ANNs (MLP-ANNs) were employed in order to propose non-linear models trained by gradient descent with momentum backpropagation function. Based on MSE test values, the best MLP-ANN models were combined in a MLP-ANN consensus model (MLP-ANN-CM; r 2 test = 0.824, MSE test = 0.088 and MAE test = 0.197). In the end, a general consensus model (GCM) has been obtained using PLS and MLP-ANN-CM models (r 2 test = 0.811, MSE test = 0.100 and MAE test = 0.218). Besides, the selected descriptors (GGI6, Mor23m, SRW06, H7m, MLOGP, and μ) revealed important features that should be considered when one is planning new compounds of the 1-arylpyrazole class. The multivariate models proposed in this work are definitely a powerful tool for the rational drug design of new compounds for neuropathic pain treatment. Graphical abstract Main scaffold of the 1-arylpyrazole derivatives and the selected descriptors.
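The idea of a consensus model can be illustrated by combining the test-set predictions of several models and recomputing the error metrics. The abstract does not say how the individual models were combined, so the arithmetic mean used below is an assumption, and all numbers are invented for the example.

```python
def consensus(predictions_by_model):
    """Average the predictions of several models, compound by compound."""
    return [sum(vals) / len(vals) for vals in zip(*predictions_by_model)]


def mse(observed, predicted):
    """Mean squared error."""
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def mae(observed, predicted):
    """Mean absolute error."""
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Hypothetical test-set activities and predictions from two models.
observed = [6.0, 7.0, 8.0, 9.0]
pls_pred = [6.4, 6.8, 8.4, 8.8]
ann_pred = [5.8, 7.4, 7.8, 9.4]
combined = consensus([pls_pred, ann_pred])
print(mse(observed, pls_pred), mse(observed, ann_pred), mse(observed, combined))
```

Averaging helps most when the individual models make errors in different directions, as in this toy case. It does not guarantee an improvement, which is consistent with the general consensus model in the abstract scoring slightly below the MLP-ANN consensus model.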
Li, Weiyong; Worosila, Gregory D
2005-05-13
This research note demonstrates the simultaneous quantitation of a pharmaceutical active ingredient and three excipients in a simulated powder blend containing acetaminophen, Prosolv and Crospovidone. An experimental design approach was used in generating a 5-level (%, w/w) calibration sample set that included 125 samples. The samples were prepared by weighing suitable amount of powders into separate 20-mL scintillation vials and were mixed manually. Partial least squares (PLS) regression was used in calibration model development. The models generated accurate results for quantitation of Crospovidone (at 5%, w/w) and magnesium stearate (at 0.5%, w/w). Further testing of the models demonstrated that the 2-level models were as effective as the 5-level ones, which reduced the calibration sample number to 50. The models had a small bias for quantitation of acetaminophen (at 30%, w/w) and Prosolv (at 64.5%, w/w) in the blend. The implication of the bias is discussed.
Feng, Jianyuan; Turksoy, Kamuran; Samadi, Sediqeh; Hajizadeh, Iman; Littlejohn, Elizabeth; Cinar, Ali
2017-12-01
Supervision and control systems rely on signals from sensors to receive information to monitor the operation of a system and adjust manipulated variables to achieve the control objective. However, sensor performance is often limited by working conditions, and sensors may also be subjected to interference by other devices. Many different types of sensor errors such as outliers, missing values, drifts and corruption with noise may occur during process operation. A hybrid online sensor error detection and functional redundancy system is developed to detect errors in online signals, and replace erroneous or missing values detected with model-based estimates. The proposed hybrid system relies on two techniques, an outlier-robust Kalman filter (ORKF) and a locally-weighted partial least squares (LW-PLS) regression model, which leverage the advantages of automatic measurement error elimination with ORKF and data-driven prediction with LW-PLS. The system includes a nominal angle analysis (NAA) method to distinguish between signal faults and large changes in sensor values caused by real dynamic changes in process operation. The performance of the system is illustrated with clinical data from continuous glucose monitoring (CGM) sensors from people with type 1 diabetes. More than 50,000 CGM sensor errors were added to original CGM signals from 25 clinical experiments, then the performance of the error detection and functional redundancy algorithms was analyzed. The results indicate that the proposed system can successfully detect most of the erroneous signals and substitute them with reasonable estimated values computed by the functional redundancy system.
Sankar, A S Kamatchi; Vetrichelvan, Thangarasu; Venkappaya, Devashya
2011-09-01
In the present work, three different spectrophotometric methods for simultaneous estimation of ramipril, aspirin and atorvastatin calcium in raw materials and in formulations are described. Overlapped data was quantitatively resolved by using chemometric methods, viz. inverse least squares (ILS), principal component regression (PCR) and partial least squares (PLS). Calibrations were constructed using the absorption data matrix corresponding to the concentration data matrix. The linearity range was found to be 1-5, 10-50 and 2-10 μg mL-1 for ramipril, aspirin and atorvastatin calcium, respectively. The absorbance matrix was obtained by measuring the zero-order absorbance in the wavelength range between 210 and 320 nm. A training set design of the concentration data corresponding to the ramipril, aspirin and atorvastatin calcium mixtures was organized statistically to maximize the information content from the spectra and to minimize the error of multivariate calibrations. By applying the respective algorithms for PLS 1, PCR and ILS to the measured spectra of the calibration set, a suitable model was obtained. This model was selected on the basis of RMSECV and RMSEP values. The same was applied to the prediction set and capsule formulation. Mean recoveries of the commercial formulation set together with the figures of merit (calibration sensitivity, selectivity, limit of detection, limit of quantification and analytical sensitivity) were estimated. Validity of the proposed approaches was successfully assessed for analyses of drugs in the various prepared physical mixtures and formulations.
Liao, Xiang; Wang, Qing; Fu, Ji-hong; Tang, Jun
2015-09-01
This work was undertaken to establish a quantitative analysis model for the rapid determination of the linalool and linalyl acetate content of Xinjiang lavender essential oil. A total of 165 lavender essential oil samples were measured by near infrared (NIR) absorption spectroscopy. Analysis of the NIR absorption peaks of all samples showed that lavender essential oil carries abundant chemical information, and that the interference of random noise may be relatively low, in the spectral interval of 7100~4500 cm(-1). Thus, the PLS models were constructed using this interval for further analysis. Eight abnormal samples were eliminated. Through the clustering method, the remaining 157 lavender essential oil samples were divided into 105 calibration set samples and 52 validation set samples. Gas chromatography-mass spectrometry (GC-MS) was used to determine the content of linalool and linalyl acetate in lavender essential oil. The data matrix was then established from the GC-MS raw data of the two compounds in combination with the original NIR data. In order to optimize the model, different pretreatment methods were used to preprocess the raw NIR spectra and their spectral filtering effects were compared. Based on the quantitative model results for linalool and linalyl acetate, orthogonal signal correction (OSC) gave root mean square errors of prediction (RMSEP) of 0.226 and 0.558, respectively, and was the optimum pretreatment method. In addition, the forward interval partial least squares (FiPLS) method was used to exclude the wavelength points that are unrelated to the components being determined or that show a nonlinear correlation; finally, 8 spectral intervals totaling 160 wavelength points were obtained as the dataset.
The data sets optimized by OSC-FiPLS were combined with partial least squares (PLS) to establish a rapid quantitative analysis model for determining the content of linalool and linalyl acetate in Xinjiang lavender essential oil; the number of latent variables for both components was 8 in the model. The performance of the model was evaluated according to the root mean square error of cross-validation (RMSECV) and the root mean square error of prediction (RMSEP). In the model, the RMSECV values of linalool and linalyl acetate were 0.170 and 0.416, respectively, and the RMSEP values were 0.188 and 0.364. The results indicated that, with raw data pretreated by OSC and FiPLS, the NIR-PLS quantitative analysis model has good robustness and high measurement precision, and can quickly determine the content of linalool and linalyl acetate in lavender essential oil. In addition, the model has a favorable prediction ability. The study also provides a new, effective method for the rapid quantitative analysis of the major components of Xinjiang lavender essential oil.
de Groot, P J; Swierenga, H; Postma, G J; Melssen, W J; Buydens, L M C
2003-06-01
The combination of Raman and infrared spectroscopy on the one hand and wavelength selection on the other hand is used to improve the partial least-squares (PLS) prediction of seven selected yarn properties. These properties are important for on-line quality control during production. From 71 yarn samples, the Raman and infrared spectra are measured and reference methods are used to determine the selected properties. Making separate PLS models for all yarn properties using the Raman and infrared spectra, prior to wavelength selection, reveals that Raman spectroscopy outperforms infrared spectroscopy. If wavelength selection is applied, the PLS prediction error decreases and the correlation coefficient increases for all properties. However, a substantial wavelength selection effect is present for the infrared spectra compared to the Raman spectra. For the infrared spectra, wavelength selection results in PLS prediction errors comparable with the prediction performance of the Raman spectra prior to wavelength selection. Concatenating the Raman and infrared spectra does not enhance the PLS prediction performance, not even after wavelength selection. It is concluded that an infrared spectrometer, combined with a wavelength selection procedure, can be used if no (suitable) Raman instrument is available.
Hyperspectral sensing to detect the impact of herbicide drift on cotton growth and yield
NASA Astrophysics Data System (ADS)
Suarez, L. A.; Apan, A.; Werth, J.
2016-10-01
Yield loss in crops is often associated with plant disease or external factors such as environment, water supply and nutrient availability. Improper agricultural practices can also introduce risks into the equation. Herbicide drift can be a combination of improper practices and environmental conditions which can create a potential yield loss. As traditional assessment of plant damage is often imprecise and time consuming, the ability of remote and proximal sensing techniques to monitor various bio-chemical alterations in the plant may offer a faster, non-destructive and reliable approach to predict yield loss caused by herbicide drift. This paper examines the prediction capabilities of partial least squares regression (PLS-R) models for estimating yield. Models were constructed with hyperspectral data of a cotton crop sprayed with three simulated doses of the phenoxy herbicide 2,4-D at three different growth stages. Fibre quality, photosynthesis, conductance, and two main hormones, indole acetic acid (IAA) and abscisic acid (ABA) were also analysed. Except for fibre quality and ABA, Spearman correlations have shown that these variables were highly affected by the chemical. Four PLS-R models for predicting yield were developed according to four timings of data collection: 2, 7, 14 and 28 days after the exposure (DAE). As indicated by the model performance, the analysis revealed that 7 DAE was the best time for data collection purposes (RMSEP = 2.6 and R2 = 0.88), followed by 28 DAE (RMSEP = 3.2 and R2 = 0.84). In summary, the results of this study show that it is possible to accurately predict yield after a simulated herbicide drift of 2,4-D on a cotton crop, through the analysis of hyperspectral data, thereby providing a reliable, effective and non-destructive alternative based on the internal response of the cotton leaves.
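The Spearman correlations used above to relate herbicide exposure to plant variables amount to a Pearson correlation computed on ranks. The sketch below uses average ranks for ties; the dose and yield numbers are invented for illustration and the function names are ours.

```python
import math


def ranks(values):
    """Rank values from 1, assigning the average rank to tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical herbicide dose levels and the resulting yield.
dose = [0.0, 1.0, 2.0, 3.0]
crop_yield = [10.0, 9.0, 5.0, 1.0]
rho = spearman(dose, crop_yield)
```

Because only ranks matter, a monotonic but nonlinear dose response still gives a coefficient of magnitude 1, which makes the statistic a reasonable first screen before building PLS-R models.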
The Role of Safety Culture in Influencing Provider Perceptions of Patient Safety.
Bishop, Andrea C; Boyle, Todd A
2016-12-01
To determine how provider perceptions of safety culture influence their involvement in patient safety practices. Health-care providers were surveyed in 2 tertiary hospitals located in Atlantic Canada, composed of 4 units in total. The partial least squares (PLS) approach to structural equation modeling was used to analyze the data. The latent-variable provider PLS model encompassed the hypothesized relationships between provider characteristics, safety culture, perceptions of patient safety practices, and actual performance of patient safety practices, using the Health Belief Model (HBM) as a guide. Data analysis was conducted using SmartPLS. A total of 113 health-care providers completed a survey out of an eligible 318, representing a response rate of 35.5%. The final PLS model showed acceptable internal consistency with all four latent variables having a composite reliability score above the recommended 0.70 cutoff value (safety culture = 0.86, threat = 0.76, expectations = 0.83, PS practices = 0.75). Discriminant validity was established, and all path coefficients were found to be significant at the α = 0.05 level using nonparametric bootstrapping. The survey results show that safety culture accounted for 34% of the variance in perceptions of threat and 42% of the variance in expectations. This research supports the role that safety culture plays in the promotion and maintenance of patient safety activities for health-care providers. As such, it is recommended that the introduction of new patient safety strategies follow a thorough exploration of an organization's safety culture.
Gouvinhas, Irene; Machado, Nelson; Carvalho, Teresa; de Almeida, José M M M; Barros, Ana I R N A
2015-01-01
Extra virgin olive oils produced from three cultivars on different maturation stages were characterized using Raman spectroscopy. Chemometric methods (principal component analysis, discriminant analysis, principal component regression and partial least squares regression) applied to Raman spectral data were utilized to evaluate and quantify the statistical differences between cultivars and their ripening process. The models for predicting the peroxide value and free acidity of olive oils showed good calibration and prediction values and presented high coefficients of determination (>0.933). Both the R(2), and the correlation equations between the measured chemical parameters, and the values predicted by each approach are presented; these comprehend both PCR and PLS, used to assess SNV normalized Raman data, as well as first and second derivative of the spectra. This study demonstrates that a combination of Raman spectroscopy with multivariate analysis methods can be useful to predict rapidly olive oil chemical characteristics during the maturation process. Copyright © 2014 Elsevier B.V. All rights reserved.
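The pre-processing named above (SNV normalisation followed by spectral derivatives) can be sketched as follows. This is a minimal illustration, assuming a plain finite-difference derivative; the abstract does not say how the derivatives were computed, and the function names and the toy spectrum are hypothetical.

```python
import statistics

def snv(spectrum):
    """Standard normal variate: centre one spectrum and scale it by its own
    sample standard deviation."""
    mean = statistics.mean(spectrum)
    sd = statistics.stdev(spectrum)
    return [(x - mean) / sd for x in spectrum]

def first_derivative(spectrum):
    """Finite-difference first derivative (assumed here; other schemes exist)."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]

def second_derivative(spectrum):
    """Second derivative obtained by differencing twice."""
    return first_derivative(first_derivative(spectrum))

# Toy three-point "spectrum", for illustration only.
spectrum = [1.0, 2.0, 3.0]
print(snv(spectrum))
print(first_derivative([1.0, 4.0, 9.0]))
```

SNV is applied to each spectrum separately, so it removes per-sample offset and scale differences before any regression model is fitted.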
Han, Zhigang; Cai, Shengguan; Zhang, Xuelei; Qian, Qiufeng; Huang, Yuqing; Dai, Fei; Zhang, Guoping
2017-07-15
Barley grains are rich in phenolic compounds, which are associated with reduced risk of chronic diseases. Development of barley cultivars with high phenolic acid content has become one of the main objectives in breeding programs. A rapid and accurate method for measuring phenolic compounds would be helpful for crop breeding. We developed predictive models for both total phenolics (TPC) and p-coumaric acid (PA), based on near-infrared spectroscopy (NIRS) analysis. Regressions of partial least squares (PLS) and least squares support vector machine (LS-SVM) were compared for improving the models, and Monte Carlo-Uninformative Variable Elimination (MC-UVE) was applied to select informative wavelengths. The optimal calibration models generated high coefficients of correlation (r_pre) and ratio performance deviation (RPD) for TPC and PA. These results indicated the models are suitable for rapid determination of phenolic compounds in barley grains. Copyright © 2017 Elsevier Ltd. All rights reserved.
Üstündağ, Özgür; Dinç, Erdal; Özdemir, Nurten; Tilkan, M Günseli
2015-01-01
In the development strategies of new drug products and generic drug products, the simultaneous in-vitro dissolution behavior of oral dosage formulations is the most important indication for the quantitative estimation of efficiency and biopharmaceutical characteristics of drug substances. This compels scientists in the field to develop powerful analytical methods that give more reliable, precise and accurate results in the quantitative analysis and dissolution testing of drug formulations. In this context, two new chemometric tools, partial least squares (PLS) and principal component regression (PCR), were developed for the simultaneous quantitative estimation and dissolution testing of zidovudine (ZID) and lamivudine (LAM) in a tablet dosage form. The results obtained in this study strongly encourage us to use them for the quality control, the routine analysis and the dissolution test of marketed tablets containing ZID and LAM drugs.
Devos, Olivier; Downey, Gerard; Duponchel, Ludovic
2014-04-01
Classification is an important task in chemometrics. For several years now, support vector machines (SVMs) have proven to be powerful for infrared spectral data classification. However such methods require optimisation of parameters in order to control the risk of overfitting and the complexity of the boundary. Furthermore, it is established that the prediction ability of classification models can be improved using pre-processing in order to remove unwanted variance in the spectra. In this paper we propose a new methodology based on genetic algorithm (GA) for the simultaneous optimisation of SVM parameters and pre-processing (GENOPT-SVM). The method has been tested for the discrimination of the geographical origin of Italian olive oil (Ligurian and non-Ligurian) on the basis of near infrared (NIR) or mid infrared (FTIR) spectra. Different classification models (PLS-DA, SVM with mean centre data, GENOPT-SVM) have been tested and statistically compared using McNemar's statistical test. For the two datasets, SVM with optimised pre-processing give models with higher accuracy than the one obtained with PLS-DA on pre-processed data. In the case of the NIR dataset, most of this accuracy improvement (86.3% compared with 82.8% for PLS-DA) occurred using only a single pre-processing step. For the FTIR dataset, three optimised pre-processing steps are required to obtain SVM model with significant accuracy improvement (82.2%) compared to the one obtained with PLS-DA (78.6%). Furthermore, this study demonstrates that even SVM models have to be developed on the basis of well-corrected spectral data in order to obtain higher classification rates. Copyright © 2013 Elsevier Ltd. All rights reserved.
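The statistical comparison mentioned above can be illustrated with a minimal sketch of McNemar's test for two classifiers evaluated on the same samples. The continuity-corrected chi-square form is assumed here (the abstract does not state which variant was used), and the correctness flags are invented for illustration.

```python
def mcnemar_statistic(correct_a, correct_b):
    """Continuity-corrected McNemar statistic for paired classifier results.

    correct_a / correct_b: booleans, True where each model classified the
    corresponding sample correctly. Only discordant pairs matter.
    """
    only_a = sum(1 for a, b in zip(correct_a, correct_b) if a and not b)
    only_b = sum(1 for a, b in zip(correct_a, correct_b) if b and not a)
    if only_a + only_b == 0:
        return 0.0
    return (abs(only_a - only_b) - 1) ** 2 / (only_a + only_b)

# Hypothetical outcome flags for 100 test samples (not data from the paper).
plsda_correct = [True] * 80 + [True] * 2 + [False] * 12 + [False] * 6
svm_correct = [True] * 80 + [False] * 2 + [True] * 12 + [False] * 6

stat = mcnemar_statistic(plsda_correct, svm_correct)
# 3.841 is the chi-square critical value for 1 degree of freedom at alpha = 0.05.
print(stat, stat > 3.841)
```

Because the test uses only the samples on which the two models disagree, two models with similar overall accuracy can still differ significantly, and vice versa.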
NASA Astrophysics Data System (ADS)
Peerbhay, Kabir Yunus; Mutanga, Onisimo; Ismail, Riyad
2013-05-01
Discriminating commercial tree species using hyperspectral remote sensing techniques is critical in monitoring the spatial distributions and compositions of commercial forests. However, issues related to data dimensionality and multicollinearity limit the successful application of the technology. The aim of this study was to examine the utility of the partial least squares discriminant analysis (PLS-DA) technique in accurately classifying six exotic commercial forest species (Eucalyptus grandis, Eucalyptus nitens, Eucalyptus smithii, Pinus patula, Pinus elliotii and Acacia mearnsii) using airborne AISA Eagle hyperspectral imagery (393-900 nm). Additionally, the variable importance in the projection (VIP) method was used to identify subsets of bands that could successfully discriminate the forest species. Results indicated that the PLS-DA model that used all the AISA Eagle bands (n = 230) produced an overall accuracy of 80.61% and a kappa value of 0.77, with user's and producer's accuracies ranging from 50% to 100%. In comparison, incorporating the optimal subset of VIP selected wavebands (n = 78) in the PLS-DA model resulted in an improved overall accuracy of 88.78% and a kappa value of 0.87, with user's and producer's accuracies ranging from 70% to 100%. Bands located predominantly within the visible region of the electromagnetic spectrum (393-723 nm) showed the most capability in terms of discriminating between the six commercial forest species. Overall, the research has demonstrated the potential of using PLS-DA for reducing the dimensionality of hyperspectral datasets as well as determining the optimal subset of bands to produce the highest classification accuracies.
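The accuracy measures reported above (overall accuracy, kappa, user's and producer's accuracies) all derive from a confusion matrix. The sketch below shows the arithmetic on a made-up two-class matrix; the layout (rows as reference classes, columns as predicted classes) and the function names are assumptions for illustration.

```python
def overall_accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total

def cohen_kappa(cm):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    p_observed = sum(cm[i][i] for i in range(n)) / total
    row_tot = [sum(cm[i]) for i in range(n)]
    col_tot = [sum(cm[i][j] for i in range(n)) for j in range(n)]
    p_chance = sum(row_tot[i] * col_tot[i] for i in range(n)) / total ** 2
    return (p_observed - p_chance) / (1 - p_chance)

def producers_users(cm):
    """Producer's accuracy (per reference row) and user's accuracy (per predicted column)."""
    n = len(cm)
    producers = [cm[i][i] / sum(cm[i]) for i in range(n)]
    users = [cm[j][j] / sum(cm[i][j] for i in range(n)) for j in range(n)]
    return producers, users

# Hypothetical 2-class confusion matrix: rows = reference, columns = predicted.
cm = [[8, 2],
      [1, 9]]
print(overall_accuracy(cm), cohen_kappa(cm), producers_users(cm))
```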
Identification and topographical characterisation of microbial nanowires in Nostoc punctiforme.
Sure, Sandeep; Torriero, Angel A J; Gaur, Aditya; Li, Lu Hua; Chen, Ying; Tripathi, Chandrakant; Adholeya, Alok; Ackland, M Leigh; Kochar, Mandira
2016-03-01
Extracellular pili-like structures (PLS) produced by cyanobacteria have been poorly explored. We have done detailed topographical and electrical characterisation of PLS in Nostoc punctiforme PCC 73120 using transmission electron microscopy (TEM) and conductive atomic force microscopy (CAFM). TEM analysis showed that N. punctiforme produces two separate types of PLS differing in their length and diameter. The first type of PLS are 6-7.5 nm in diameter and 0.5-2 µm in length (short/thin PLS) while the second type of PLS are ~20-40 nm in diameter and more than 10 µm long (long/thick PLS). This is the first study to report long/thick PLS in N. punctiforme. Electrical characterisation of these two different PLS by CAFM showed that both are electrically conductive and can act as microbial nanowires. This is the first report to show two distinct PLS and also identifies microbial nanowires in N. punctiforme. This study paves the way for more detailed investigation of N. punctiforme nanowires and their potential role in cell physiology and symbiosis with plants.
NASA Astrophysics Data System (ADS)
Shao, Yongni; Xie, Chuanqi; Jiang, Linjun; Shi, Jiahui; Zhu, Jiajin; He, Yong
2015-04-01
Visible/near infrared spectroscopy (Vis/NIR) based on sensitive wavelengths (SWs) and chemometrics was proposed to discriminate different tomatoes bred by spaceflight mutagenesis from their leaves or fruits (green or mature). The tomato breeds were mutant M1, M2 and their parent. Partial least squares (PLS) analysis and least squares-support vector machine (LS-SVM) were implemented for calibration models. PLS analysis was implemented for calibration models with different wavebands including the visible region (400-700 nm) and the near infrared region (700-1000 nm). The best PLS models were achieved in the visible region for the leaf and green fruit samples and in the near infrared region for the mature fruit samples. Furthermore, different latent variables (4-8 LVs for leaves, 5-9 LVs for green fruits, and 4-9 LVs for mature fruits) were used as inputs of LS-SVM to develop the LV-LS-SVM models with the grid search technique and radial basis function (RBF) kernel. The optimal LV-LS-SVM models were achieved with six LVs for the leaf samples, seven LVs for green fruits, and six LVs for mature fruits, respectively, and they outperformed the PLS models. Moreover, independent component analysis (ICA) was executed to select several SWs based on loading weights. The optimal LS-SVM model was achieved with SWs of 550-560 nm, 562-574 nm, 670-680 nm and 705-715 nm for the leaf samples; 548-556 nm, 559-564 nm, 678-685 nm and 962-974 nm for the green fruit samples; and 712-718 nm, 720-729 nm, 968-978 nm and 820-830 nm for the mature fruit samples. All of them had better performance than PLS and LV-LS-SVM, with the parameters of correlation coefficient (rp), root mean square error of prediction (RMSEP) and bias of 0.9792, 0.2632 and 0.0901 based on leaf discrimination, 0.9837, 0.2783 and 0.1758 based on green fruit discrimination, 0.9804, 0.2215 and -0.0035 based on mature fruit discrimination, respectively. The overall results indicated that ICA was an effective way for the selection of SWs, and the Vis/NIR combined with LS-SVM models had the capability to predict the different breeds (mutant M1, mutant M2 and their parent) of tomatoes from leaves and fruits.
NASA Astrophysics Data System (ADS)
Wu, W.; Chen, G. Y.; Kang, R.; Xia, J. C.; Huang, Y. P.; Chen, K. J.
2017-07-01
During slaughtering and further processing, chicken carcasses are inevitably contaminated by microbial pathogen contaminants. Due to food safety concerns, many countries implement a zero-tolerance policy that forbids the placement of visibly contaminated carcasses in ice-water chiller tanks during processing. Manual detection of contaminants is labor-intensive and imprecise. Here, a successive projections algorithm (SPA)-multivariable linear regression (MLR) classifier based on an optimal performance threshold was developed for automatic detection of contaminants on chicken carcasses. Hyperspectral images were obtained using a hyperspectral imaging system. A regression model of the classifier was established by MLR based on twelve characteristic wavelengths (505, 537, 561, 562, 564, 575, 604, 627, 656, 665, 670, and 689 nm) selected by SPA, and the optimal threshold T = 1 was obtained from the receiver operating characteristic (ROC) analysis. The SPA-MLR classifier provided the best detection results when compared with the SPA-partial least squares (PLS) regression classifier and the SPA-least squares support vector machine (LS-SVM) classifier. The true positive rate (TPR) of 100% and the false positive rate (FPR) of 0.392% indicate that the SPA-MLR classifier can utilize spatial and spectral information to effectively detect contaminants on chicken carcasses.
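Turning a regression output into a classifier by thresholding, with the threshold chosen from an ROC sweep, can be sketched as below. The selection rule (maximising TPR minus FPR, i.e. Youden's J) is an assumption, since the abstract does not state the criterion behind its "optimal performance threshold"; the scores and labels are invented.

```python
def tpr_fpr(scores, labels, threshold):
    """True and false positive rates when score >= threshold means 'contaminated'."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    tn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 0)
    return tp / (tp + fn), fp / (fp + tn)

def best_threshold(scores, labels):
    """Sweep candidate thresholds and keep the one with the largest TPR - FPR."""
    best = None
    for t in sorted(set(scores)):
        tpr, fpr = tpr_fpr(scores, labels, t)
        if best is None or tpr - fpr > best[0]:
            best = (tpr - fpr, t, tpr, fpr)
    return best[1], best[2], best[3]

# Hypothetical regression outputs and ground truth (1 = contaminated pixel).
scores = [0.2, 0.4, 0.9, 1.1, 1.3, 1.6]
labels = [0, 0, 0, 1, 1, 1]
print(best_threshold(scores, labels))
```

On real data the two classes overlap, so the chosen threshold trades missed contaminants against false alarms rather than separating them perfectly as in this toy case.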
Physicochemical characterization of Lavandula spp. honey with FT-Raman spectroscopy.
Anjos, Ofélia; Santos, António J A; Paixão, Vasco; Estevinho, Letícia M
2018-02-01
This study aimed to evaluate the potential of FT-Raman spectroscopy in the prediction of the chemical composition of Lavandula spp. monofloral honey. Partial Least Squares (PLS) regression models were performed for the quantitative estimation and the results were correlated with those obtained using reference methods. Good calibration models were obtained for electrical conductivity, ash, total acidity, pH, reducing sugars, hydroxymethylfurfural (HMF), proline, diastase index, apparent sucrose, total flavonoids content and total phenol content. On the other hand, the model was less accurate for pH determination. The calibration models had high r2 (ranging between 92.8% and 99.9%), high residual prediction deviation - RPD (ranging between 4.2 and 26.8) and low root mean square errors. These results confirm the hypothesis that FT-Raman is a useful technique for the quality control and chemical properties' evaluation of Lavandula spp. honey. Its application may allow improving the efficiency, speed and cost of the current laboratory analysis. Copyright © 2017 Elsevier B.V. All rights reserved.
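The residual prediction deviation (RPD) quoted above is commonly taken as the standard deviation of the reference values divided by the root mean square error of the model. The sketch below assumes that definition (variants using the standard error of prediction also exist) and uses invented numbers.

```python
import math
import statistics

def rpd(reference, predicted):
    """Residual prediction deviation: SD of reference values / RMSE of predictions."""
    n = len(reference)
    rmse = math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)
    return statistics.stdev(reference) / rmse

# Hypothetical reference and predicted values (not data from the paper).
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.1, 3.9, 5.1]
print(rpd(reference, predicted))
```

A larger RPD means the prediction error is small relative to the natural spread of the property being measured.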
Calvano, C D; van der Werf, I D; Palmisano, F; Sabbatini, L
2011-06-01
A matrix-assisted laser desorption ionization time-of-flight mass spectrometry-based approach was applied for the detection of various lipid classes, such as triacylglycerols (TAGs) and phospholipids (PLs), and their oxidation by-products in extracts of small (50-100 μg) samples obtained from painted artworks. Ageing of test specimens under various conditions, including the presence of different pigments, was preliminarily investigated. During ageing, the TAGs and PLs content decreased, whereas the amount of diglycerides, short-chain oxidative products arising from TAGs and PLs, and oxidized TAGs and PLs components increased. The examination of a series of model paint samples gave a clear indication that specific ions produced by oxidative cleavage of PLs and/or TAGs may be used as markers for egg and drying oil-based binders. Their elemental composition and hypothetical structure are also tentatively proposed. Moreover, the simultaneous presence of egg and oil binders can be easily and unambiguously ascertained through the simultaneous occurrence of the relevant specific markers. The potential of the proposed approach was demonstrated for the first time by the analysis of real samples from a polyptych of Bartolomeo Vivarini (fifteenth century) and a "French school" canvas painting (seventeenth century).
Hertrampf, A; Sousa, R M; Menezes, J C; Herdling, T
2016-05-30
Quality control (QC) in the pharmaceutical industry is a key activity in ensuring medicines have the required quality, safety and efficacy for their intended use. QC departments at pharmaceutical companies are responsible for all release testing of final products but also all incoming raw materials. Near-infrared spectroscopy (NIRS) and Raman spectroscopy are important techniques for fast and accurate identification and qualification of pharmaceutical samples. Tablets containing two different active pharmaceutical ingredients (API) [bisoprolol, hydrochlorothiazide] in different commercially available dosages were analysed using Raman- and NIR Spectroscopy. The goal was to define multivariate models based on each vibrational spectroscopy to discriminate between different dosages (identity) and predict their dosage (semi-quantitative). Furthermore the combination of spectroscopic techniques was investigated. Therefore, two different multiblock techniques based on PLS have been applied: multiblock PLS (MB-PLS) and sequential-orthogonalised PLS (SO-PLS). NIRS showed better results compared to Raman spectroscopy for both identification and quantitation. The multiblock techniques investigated showed that each spectroscopy contains information not present or captured with the other spectroscopic technique, thus demonstrating that there is a potential benefit in their combined use for both identification and quantitation purposes. Copyright © 2016 Elsevier B.V. All rights reserved.
Wan, Jian; Chen, Yi-Chieh; Morris, A Julian; Thennadil, Suresh N
2017-07-01
Near-infrared (NIR) spectroscopy is being widely used in various fields ranging from pharmaceutics to the food industry for analyzing chemical and physical properties of the substances concerned. Its advantages over other analytical techniques include available physical interpretation of spectral data, nondestructive nature and high speed of measurements, and little or no need for sample preparation. The successful application of NIR spectroscopy relies on three main aspects: pre-processing of spectral data to eliminate nonlinear variations due to temperature, light scattering effects and many others, selection of those wavelengths that contribute useful information, and identification of suitable calibration models using linear/nonlinear regression. Several methods have been developed for each of these three aspects and many comparative studies of different methods exist for an individual aspect or some combinations. However, there is still a lack of comparative studies for the interactions among these three aspects, which can shed light on what role each aspect plays in the calibration and how to combine various methods of each aspect together to obtain the best calibration model. This paper aims to provide such a comparative study based on four benchmark data sets using three typical pre-processing methods, namely, orthogonal signal correction (OSC), extended multiplicative signal correction (EMSC) and optical path-length estimation and correction (OPLEC); two existing wavelength selection methods, namely, stepwise forward selection (SFS) and genetic algorithm optimization combined with partial least squares regression for spectral data (GAPLSSP); four popular regression methods, namely, partial least squares (PLS), least absolute shrinkage and selection operator (LASSO), least squares support vector machine (LS-SVM), and Gaussian process regression (GPR). The comparative study indicates that, in general, pre-processing of spectral data can play a significant role in the calibration while wavelength selection plays a marginal role and the combination of certain pre-processing, wavelength selection, and nonlinear regression methods can achieve superior performance over traditional linear regression-based calibration.
Differences in chewing sounds of dry-crisp snacks by multivariate data analysis
NASA Astrophysics Data System (ADS)
De Belie, N.; Sivertsvik, M.; De Baerdemaeker, J.
2003-09-01
Chewing sounds of different types of dry-crisp snacks (two types of potato chips, prawn crackers, cornflakes and low calorie snacks from extruded starch) were analysed to assess differences in sound emission patterns. The emitted sounds were recorded by a microphone placed over the ear canal. The first bite and the first subsequent chew were selected from the time signal and a fast Fourier transformation provided the power spectra. Different multivariate analysis techniques were used for classification of the snack groups. This included principal component analysis (PCA) and unfold partial least-squares (PLS) algorithms, as well as multi-way techniques such as three-way PLS, three-way PCA (Tucker3), and parallel factor analysis (PARAFAC) on the first bite and subsequent chew. The models were evaluated by calculating the classification errors and the root mean square error of prediction (RMSEP) for independent validation sets. It appeared that the logarithm of the power spectra obtained from the chewing sounds could be used successfully to distinguish the different snack groups. When different chewers were used, recalibration of the models was necessary. Multi-way models distinguished better between chewing sounds of different snack groups than PCA on bite or chew separately and than unfold PLS. From all three-way models applied, N-PLS with three components showed the best classification capabilities, resulting in classification errors of 14-18%. The major amount of incorrect classifications was due to one type of potato chips that had a very irregular shape, resulting in a wide variation of the emitted sounds.
A chemometric approach to the characterisation of historical mortars
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rampazzi, L.; Pozzi, A.; Sansonetti, A.
2006-06-15
The compositional knowledge of historical mortars is of great concern in case of provenance and dating investigations and of conservation works since the nature of the raw materials suggests the most compatible conservation products. The classic characterisation usually goes through various analytical determinations, while conservation laboratories call for simple and quick analyses able to enlighten the nature of mortars, usually in terms of the binder fraction. A chemometric approach to the matter is here undertaken. Specimens of mortars were prepared with calcitic and dolomitic binders and analysed by Atomic Spectroscopy. Principal Components Analysis (PCA) was used to investigate the features of specimens and samples. A Partial Least Squares (PLS1) regression was done in order to predict the binder/aggregate ratio. The model was applied to historical mortars from the churches of St. Lorenzo (Milan) and St. Abbondio (Como). The accordance between the predictive model and the real samples is discussed.
Facial Age Synthesis Using Sparse Partial Least Squares (The Case of Ben Needham).
Bukar, Ali M; Ugail, Hassan
2017-09-01
Automatic facial age progression (AFAP) has been an active area of research in recent years. This is due to its numerous applications which include searching for missing people. This study presents a new method of AFAP. Here, we use an active appearance model (AAM) to extract facial features from available images. An aging function is then modelled using sparse partial least squares regression (sPLS). Thereafter, the aging function is used to render new faces at different ages. To test the accuracy of our algorithm, extensive evaluation is conducted using a database of 500 face images with known ages. Furthermore, the algorithm is used to progress Ben Needham's facial image that was taken when he was 21 months old to the ages of 6, 14, and 22 years. The algorithm presented in this study could potentially be used to enhance the search for missing people worldwide. © 2017 American Academy of Forensic Sciences.
Simultaneous determination of three herbicides by differential pulse voltammetry and chemometrics.
Ni, Yongnian; Wang, Lin; Kokot, Serge
2011-01-01
A novel differential pulse voltammetry method (DPV) was researched and developed for the simultaneous determination of Pendimethalin, Dinoseb and sodium 5-nitroguaiacolate (5NG) with the aid of chemometrics. The voltammograms of these three compounds overlapped significantly, and to facilitate the simultaneous determination of the three analytes, chemometrics methods were applied. These included classical least squares (CLS), principal component regression (PCR), partial least squares (PLS) and radial basis function-artificial neural networks (RBF-ANN). A separately prepared verification data set was used to confirm the calibrations, which were built from the original and first derivative data matrices of the voltammograms. On the basis of relative prediction errors and recoveries of the analytes, the RBF-ANN and the DPLS (D - first derivative spectra) models performed best and are particularly recommended for application. The DPLS calibration model was applied satisfactorily for the prediction of the three analytes from market vegetables and lake water samples.
Following a trend with an exponential moving average: Analytical results for a Gaussian model
NASA Astrophysics Data System (ADS)
Grebenkov, Denis S.; Serror, Jeremy
2014-01-01
We investigate how price variations of a stock are transformed into profits and losses (P&Ls) of a trend following strategy. In the frame of a Gaussian model, we derive the probability distribution of P&Ls and analyze its moments (mean, variance, skewness and kurtosis) and asymptotic behavior (quantiles). We show that the asymmetry of the distribution (with often small losses and less frequent but significant profits) is reminiscent of trend following strategies and less dependent on peculiarities of price variations. At short times, trend following strategies admit larger losses than one may anticipate from standard Gaussian estimates, while smaller losses are ensured at longer times. Simple explicit formulas characterizing the distribution of P&Ls illustrate the basic mechanisms of momentum trading, while general matrix representations can be applied to arbitrary Gaussian models. We also compute explicitly annualized risk adjusted P&L and strategy turnover to account for transaction costs. We deduce the trend following optimal timescale and its dependence on both auto-correlation level and transaction costs. Theoretical results are illustrated on the Dow Jones index.
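The mechanism described above, in which an exponential moving average of past price variations sets the position and the next price variation produces the P&L increment, can be sketched in a few lines. This is a simplified illustration, not the paper's exact model: the smoothing weight `eta`, the unit position scaling and the absence of transaction costs are all assumptions, and the price variations are invented.

```python
def ema_trend_pnl(price_variations, eta):
    """Per-step P&L of a toy EMA trend follower.

    eta in (0, 1] is the EMA smoothing weight (hypothetical parameter name).
    The position held over each step is the EMA of variations seen so far,
    so it never uses the variation it is about to earn.
    """
    signal = 0.0
    pnl = []
    for r in price_variations:
        pnl.append(signal * r)                 # earn position * next variation
        signal = (1 - eta) * signal + eta * r  # then update the EMA
    return pnl

# Hypothetical price variations: a short up-trend, a reversal, then a rebound.
steps = ema_trend_pnl([1.0, 1.0, -1.0, 1.0], eta=0.5)
print(steps, sum(steps))
```

The toy run shows the asymmetry in miniature: the strategy gains while the trend persists and loses when the sign of the variation flips against the accumulated signal.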
Monitoring multiple components in vinegar fermentation using Raman spectroscopy.
Uysal, Reyhan Selin; Soykut, Esra Acar; Boyaci, Ismail Hakki; Topcu, Ali
2013-12-15
In this study, the utility of Raman spectroscopy (RS) with chemometric methods for quantification of multiple components in the fermentation process was investigated. Vinegar, the product of a two stage fermentation, was used as a model and glucose and fructose consumption, ethanol production and consumption and acetic acid production were followed using RS and the partial least squares (PLS) method. Calibration of the PLS method was performed using model solutions. The prediction capability of the method was then investigated with both model and real samples. HPLC was used as a reference method. The results from comparing RS-PLS and HPLC with each other showed good correlations were obtained between predicted and actual sample values for glucose (R(2)=0.973), fructose (R(2)=0.988), ethanol (R(2)=0.996) and acetic acid (R(2)=0.983). In conclusion, a combination of RS with chemometric methods can be applied to monitor multiple components of the fermentation process from start to finish with a single measurement in a short time. Copyright © 2013 Elsevier Ltd. All rights reserved.
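The calibrate-then-predict workflow described above can be illustrated with a minimal single-response PLS (PLS1) sketch using the NIPALS-style deflation steps. It uses plain Python lists, invented "spectra" with two perfectly collinear channels, and hypothetical function names; it is an illustration of the technique under those assumptions, not the authors' implementation.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def fit_pls1(X, y, n_components):
    """Fit PLS1: returns column means, response mean and per-component (w, p, q)."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(m)] for row in X]
    yc = [v - y_mean for v in y]
    comps = []
    for _ in range(n_components):
        w = [dot([row[j] for row in Xc], yc) for j in range(m)]  # weights ~ X'y
        norm = dot(w, w) ** 0.5
        if norm == 0:
            break  # nothing left in X that covaries with y
        w = [v / norm for v in w]
        t = [dot(row, w) for row in Xc]                          # scores
        tt = dot(t, t)
        p = [dot([row[j] for row in Xc], t) / tt for j in range(m)]  # X loadings
        q = dot(yc, t) / tt                                      # y loading
        Xc = [[row[j] - t[i] * p[j] for j in range(m)] for i, row in enumerate(Xc)]
        yc = [yc[i] - q * t[i] for i in range(n)]
        comps.append((w, p, q))
    return x_mean, y_mean, comps

def predict_pls1(model, x):
    """Predict one sample by repeating the same projection and deflation."""
    x_mean, y_mean, comps = model
    xc = [a - b for a, b in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in comps:
        t = dot(xc, w)
        y_hat += q * t
        xc = [a - t * b for a, b in zip(xc, p)]
    return y_hat

# Hypothetical calibration set: two collinear channels, concentration 1..4.
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
y = [1.0, 2.0, 3.0, 4.0]
model = fit_pls1(X, y, n_components=1)
print(predict_pls1(model, [5.0, 10.0]))
```

Ordinary least squares would be ill-posed on these collinear channels, whereas a single latent variable captures the shared variation; that is the main reason PLS is favoured for overlapping spectral bands.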
Characterization of the biosolids composting process by hyperspectral analysis.
Ilani, Talli; Herrmann, Ittai; Karnieli, Arnon; Arye, Gilboa
2016-02-01
Composted biosolids are widely used as a soil supplement to improve soil quality. However, the application of immature or unstable compost can cause the opposite effect. To date, compost maturation determination is time consuming and cannot be done at the composting site. Hyperspectral spectroscopy was suggested as a simple tool for assessing compost maturity and quality. Nevertheless, there is still a gap in knowledge regarding several compost maturation characteristics, such as dissolved organic carbon, NO3, and NH4 contents. In addition, this approach has not yet been tested on a sample at its natural water content. Therefore, in the current study, hyperspectral analysis was employed in order to characterize the biosolids composting process as a function of composting time. This goal was achieved by correlating the reflectance spectra in the range of 400-2400nm, using the partial least squares-regression (PLS-R) model, with the chemical properties of wet and oven-dried biosolid samples. The results showed that the proposed method can be used as a reliable means to evaluate compost maturity and stability. Specifically, the PLS-R model was found to be an adequate tool to evaluate the biosolids' total carbon and dissolved organic carbon, total nitrogen and dissolved nitrogen, and nitrate content, as well as the absorbance ratio of 254/365nm (E2/E3) and C/N ratios in the dry and wet samples. It failed, however, to predict the ammonium content in the dry samples since the ammonium evaporated during the drying process. It was found that in contrast to what is commonly assumed, the spectral analysis of the wet samples can also be successfully used to build a model for predicting the biosolids' compost maturity. Copyright © 2015 Elsevier Ltd. All rights reserved.
Winning, Hanne; Roldán-Marín, Eduvigis; Dragsted, Lars O; Viereck, Nanna; Poulsen, Morten; Sánchez-Moreno, Concepción; Cano, M Pilar; Engelsen, Søren B
2009-11-01
The metabolome following intake of onion by-products is evaluated. Thirty-two rats were fed a diet containing an onion by-product or one of the two derived onion by-product fractions: an ethanol extract and the residue. A 24 hour urine sample was analyzed using (1)H NMR spectroscopy in order to investigate the effects of onion intake on the rat metabolism. Application of interval extended canonical variates analysis (ECVA) proved to be able to distinguish between the metabolomic profiles from rats consuming normal feed and rats fed with an onion diet. Two dietary biomarkers for onion intake were identified as dimethyl sulfone and 3-hydroxyphenylacetic acid. The same two dietary biomarkers were subsequently revealed by interval partial least squares regression (PLS) to be perfect quantitative markers for onion intake. The best PLS calibration model yielded a root mean square error of cross-validation (RMSECV) of 0.97% (w/w) with only 1 latent variable and a squared correlation coefficient of 0.94. This indicates that urine from rats on the by-product diet, the extract diet, and the residue diet all contain the same dietary biomarkers and it is concluded that dimethyl sulfone and 3-hydroxyphenylacetic acid are dietary biomarkers for onion intake. Being able to detect specific dietary biomarkers is highly beneficial in the control of nutritionally enhanced functional foods.
Aguado, Daniel; Barat, Ramón; Soto, Juan; Martínez-Mañez, Ramón
2016-10-01
This study demonstrates the feasibility of using a voltammetric electronic tongue to monitor effluent dissolved orthophosphate concentration in a struvite precipitation reactor. The electrochemical response of the electronic tongue to the presence of orthophosphate in samples collected from the effluent of the precipitation reactor is used to predict orthophosphate concentration via a statistical model based on Partial Least Squares (PLS) Regression. PLS predictions were suitable for this monitoring application in which precipitation efficiencies higher than 80% (i.e., effluent dissolved orthophosphate concentrations lower than 40 mg P-PO4(3-) L(-1)) could be considered as an indicator of good process performance. The electronic tongue consisted of a set of metallic (noble and non-noble) electrodes housed inside a stainless steel cylinder which was used as the body of the electronic tongue system. Fouling problems were prevented via a simple mechanical polishing of the electrodes. The measurement of each sample with the electronic tongue was done in less than 3 s. Conductivity of the samples affected the electronic tongue only marginally; the main electrochemical response was due to the orthophosphate concentration in the samples. Copper, silver, iridium and rhodium were the electrodes that exhibited a noticeable response correlated with the dissolved orthophosphate concentration variations, while gold, platinum and especially cobalt and nickel were the less useful electrodes for this application. Copyright © 2016 Elsevier B.V. All rights reserved.
Malzert-Fréon, A; Hennequin, D; Rault, S
2010-11-01
Lipidic nanoparticles (NP), formulated from a phase inversion temperature process, have been studied with chemometric techniques to emphasize the influence of the four major components (Solutol®, Labrasol®, Labrafac®, water) on their average diameter and their distribution in size. Typically, these NP present a monodisperse size lower than 200 nm, as determined by dynamic light scattering measurements. From the application of the partial least squares (PLS) regression technique to the experimental data collected during definition of the feasibility zone, it was established that NP present a core-shell structure where Labrasol® is well encapsulated and contributes to the structuring of the NP. Even if this solubility enhancer is regarded as a pure surfactant in the literature, it appears that the oil moieties of this macrogolglyceride mixture significantly influence its properties. Furthermore, results have shown that PLS technique can be also used for predictions of sizes for given relative proportions of components and it was established that from a mixture design, the quantitative mixture composition to use in order to reach a targeted size and a targeted polydispersity index (PDI) can be easily predicted. Hence, statistical models can be a useful tool to control and optimize the characteristics in size of NP. © 2010 Wiley-Liss, Inc. and the American Pharmacists Association
Lipiäinen, Tiina; Pessi, Jenni; Movahedi, Parisa; Koivistoinen, Juha; Kurki, Lauri; Tenhunen, Mari; Yliruusi, Jouko; Juppo, Anne M; Heikkonen, Jukka; Pahikkala, Tapio; Strachan, Clare J
2018-04-03
Raman spectroscopy is widely used for quantitative pharmaceutical analysis, but a common obstacle to its use is sample fluorescence masking the Raman signal. Time-gating provides an instrument-based method for rejecting fluorescence through temporal resolution of the spectral signal and allows Raman spectra of fluorescent materials to be obtained. An additional practical advantage is that analysis is possible in ambient lighting. This study assesses the efficacy of time-gated Raman spectroscopy for the quantitative measurement of fluorescent pharmaceuticals. Time-gated Raman spectroscopy with a 128 × (2) × 4 CMOS SPAD detector was applied for quantitative analysis of ternary mixtures of solid-state forms of the model drug, piroxicam (PRX). Partial least-squares (PLS) regression allowed quantification, with Raman-active time domain selection (based on visual inspection) improving performance. Model performance was further improved by using kernel-based regularized least-squares (RLS) regression with greedy feature selection in which the data use in both the Raman shift and time dimensions was statistically optimized. Overall, time-gated Raman spectroscopy, especially with optimized data analysis in both the spectral and time dimensions, shows potential for sensitive and relatively routine quantitative analysis of photoluminescent pharmaceuticals during drug development and manufacturing.
A SAR and QSAR study of new artemisinin compounds with antimalarial activity.
Santos, Cleydson Breno R; Vieira, Josinete B; Lobato, Cleison C; Hage-Melim, Lorane I S; Souto, Raimundo N P; Lima, Clarissa S; Costa, Elizabeth V M; Brasil, Davi S B; Macêdo, Williams Jorge C; Carvalho, José Carlos T
2013-12-30
The Hartree-Fock method and the 6-31G** basis set were employed to calculate the molecular properties of artemisinin and 20 derivatives with antimalarial activity. Maps of molecular electrostatic potential (MEPs) and molecular docking were used to investigate the interaction between ligands and the receptor (heme). Principal component analysis and hierarchical cluster analysis were employed to select the most important descriptors related to activity. The correlation between biological activity and molecular properties was obtained using the partial least squares and principal component regression methods. The regression PLS and PCR models built in this study were also used to predict the antimalarial activity of 30 new artemisinin compounds with unknown activity. The models obtained showed not only statistical significance but also predictive ability. The significant molecular descriptors related to the compounds with antimalarial activity were the hydration energy (HE), the charge on the O11 oxygen atom (QO11), the torsion angle O1-O2-Fe-N2 (D2) and the maximum rate of R/Sanderson Electronegativity (RTe+). These variables led to a physical and structural explanation of the molecular properties that should be selected for when designing new ligands to be used as antimalarial agents.
[Detection of Hawthorn Fruit Defects Using Hyperspectral Imaging].
Liu, De-hua; Zhang, Shu-juan; Wang, Bin; Yu, Ke-qiang; Zhao, Yan-ru; He, Yong
2015-11-01
Hyperspectral imaging technology covering the range of 380-1000 nm was employed to detect defects (bruise and insect damage) of hawthorn fruit. A total of 134 samples were collected, which included 46 bruised fruits, 30 insect-damaged fruits, 10 fruits with both bruise and insect damage, and 48 intact fruits. Calyx, stem-end and bruise/insect damage regions have a similar appearance in RGB images and are therefore easily confused. Hence, five types of regions (bruise, insect damage, sound, calyx, and stem-end) were collected from 230 hawthorn fruits. After acquiring hyperspectral images of hawthorn fruits, the spectral data were extracted from the region of interest (ROI). Then, several pretreatment methods, namely standard normal variate (SNV), Savitzky-Golay (SG), median filter (MF) and multiplicative scatter correction (MSC), were applied, and partial least squares (PLS) models were built to compare their performance. According to the results, SNV assessed by PLS was the best pretreatment method, and it was chosen for the subsequent analysis. Spectral features of the five different regions were combined, and the regression coefficients (RCs) of a partial least squares-discriminant analysis (PLS-DA) model were used to identify the important wavelengths; ten wavebands at 483, 563, 645, 671, 686, 722, 777, 819, 837 and 942 nm were selected from all of the wavebands. Using the Kennard-Stone algorithm, the samples of each kind were divided into a training set (173) and a test set (57) at a ratio of 3:1. A least squares-support vector machine (LS-SVM) discrimination model was then established using the selected wavebands. The results showed that the discrimination accuracy of the method was 91.23%. In addition, images at the ten important wavebands were subjected to principal component analysis (PCA). Using the "Sobel" operator and the region growing algorithm "Regiongrow", the edge and defect features of 86 hawthorn fruits could be recognized. Lastly, the detection accuracy for bruised, insect-damaged and two-defect samples was 95.65%, 86.67% and 100%, respectively. This investigation demonstrated that hyperspectral imaging technology could detect bruise, insect damage, calyx, and stem-end regions in hawthorn fruit through qualitative analysis and feature detection, which provides a theoretical reference for the nondestructive detection of defects in hawthorn fruit.
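The SNV pretreatment selected in this abstract can be sketched as follows. This is a minimal illustration of the standard definition (each spectrum is centered on its own mean and scaled by its own standard deviation); the function name and the example reflectance values are hypothetical and are not taken from the study.

```python
import math

def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own statistics."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    # Sample standard deviation over the wavelengths of this single spectrum.
    sd = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / sd for v in spectrum]

# Hypothetical reflectance values at four wavebands.
raw = [1.0, 2.0, 4.0, 3.0]
# The same spectrum with a multiplicative scatter factor and a baseline offset.
distorted = [2.0 * v + 5.0 for v in raw]

corrected_raw = snv(raw)
corrected_distorted = snv(distorted)
```

After SNV the two versions coincide, which is why the transform is commonly used to reduce scatter-related scale and offset differences between spectra before modeling.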
Nuclear Forensic Inferences Using Iterative Multidimensional Statistics
DOE Office of Scientific and Technical Information (OSTI.GOV)
Robel, M; Kristo, M J; Heller, M A
2009-06-09
Nuclear forensics involves the analysis of interdicted nuclear material for specific material characteristics (referred to as 'signatures') that imply specific geographical locations, production processes, culprit intentions, etc. Predictive signatures rely on expert knowledge of physics, chemistry, and engineering to develop inferences from these material characteristics. Comparative signatures, on the other hand, rely on comparison of the material characteristics of the interdicted sample (the 'questioned sample' in FBI parlance) with those of a set of known samples. In the ideal case, the set of known samples would be a comprehensive nuclear forensics database, a database which does not currently exist. In fact, our ability to analyze interdicted samples and produce an extensive list of precise materials characteristics far exceeds our ability to interpret the results. Therefore, as we seek to develop the extensive databases necessary for nuclear forensics, we must also develop the methods necessary to produce the necessary inferences from comparison of our analytical results with these large, multidimensional sets of data. In the work reported here, we used a large, multidimensional dataset of results from quality control analyses of uranium ore concentrate (UOC, sometimes called 'yellowcake'). We have found that traditional multidimensional techniques, such as principal components analysis (PCA), are especially useful for understanding such datasets and drawing relevant conclusions. In particular, we have developed an iterative partial least squares-discriminant analysis (PLS-DA) procedure that has proven especially adept at identifying the production location of unknown UOC samples. By removing classes which fell far outside the initial decision boundary, and then rebuilding the PLS-DA model, we have consistently produced better and more definitive attributions than with a single pass classification approach. Performance of the iterative PLS-DA method compared favorably to that of classification and regression tree (CART) and k nearest neighbor (KNN) algorithms, with the best combination of accuracy and robustness, as tested by classifying samples measured independently in our laboratories against the vendor QC based reference set.
NASA Astrophysics Data System (ADS)
Wong, David W. C.; Choy, K. L.; Chow, Harry K. H.; Lin, Canhong
2014-06-01
For the most rapidly growing economic entity in the world, China, a new logistics operation called the indirect cross-border supply chain model has recently emerged. The primary idea of this model is to reduce logistics costs by storing goods at a bonded warehouse with low storage cost in certain Chinese regions, such as the Pearl River Delta (PRD). This research proposes a performance measurement system (PMS) framework to assess the direct and indirect cross-border supply chain models. The PMS covers four categories (cost, time, quality and flexibility) in the assessment of the performance of direct and indirect models. Furthermore, a survey was conducted to investigate the logistics performance of third party logistics providers (3PLs) in the PRD region, including Guangzhou, Shenzhen and Hong Kong. The proposed PMS framework allows 3PLs to accurately pinpoint the weaknesses and strengths of their current operations policy in the four major performance measurement categories. Hence, it helps 3PLs further enhance their competitiveness and operational efficiency through better resource allocation in warehousing and transportation.
Mo, Changyeun; Kim, Giyoung; Lee, Kangjin; Kim, Moon S.; Cho, Byoung-Kwan; Lim, Jongguk; Kang, Sukwon
2014-01-01
In this study, we developed a viability evaluation method for pepper (Capsicum annuum L.) seeds based on hyperspectral reflectance imaging. The reflectance spectra of pepper seeds in the 400–700 nm range are collected from hyperspectral reflectance images obtained using blue, green, and red LED illumination. A partial least squares–discriminant analysis (PLS-DA) model is developed to classify viable and non-viable seeds. Four spectral ranges generated with four types of LEDs (blue, green, red, and RGB), which were pretreated using various methods, are investigated to develop the classification models. The optimal PLS-DA model based on the standard normal variate for RGB LED illumination (400–700 nm) yields discrimination accuracies of 96.7% and 99.4% for viable seeds and nonviable seeds, respectively. The use of images based on the PLS-DA model with the first-order derivative of a 31.5-nm gap for red LED illumination (600–700 nm) yields 100% discrimination accuracy for both viable and nonviable seeds. The results indicate that a hyperspectral imaging technique based on LED light can be potentially applied to high-quality pepper seed sorting. PMID:24763251
Mo, Changyeun; Kim, Giyoung; Lee, Kangjin; Kim, Moon S; Cho, Byoung-Kwan; Lim, Jongguk; Kang, Sukwon
2014-04-24
In this study, we developed a viability evaluation method for pepper (Capsicum annuum L.) seeds based on hyperspectral reflectance imaging. The reflectance spectra of pepper seeds in the 400-700 nm range are collected from hyperspectral reflectance images obtained using blue, green, and red LED illumination. A partial least squares-discriminant analysis (PLS-DA) model is developed to classify viable and non-viable seeds. Four spectral ranges generated with four types of LEDs (blue, green, red, and RGB), which were pretreated using various methods, are investigated to develop the classification models. The optimal PLS-DA model based on the standard normal variate for RGB LED illumination (400-700 nm) yields discrimination accuracies of 96.7% and 99.4% for viable seeds and nonviable seeds, respectively. The use of images based on the PLS-DA model with the first-order derivative of a 31.5-nm gap for red LED illumination (600-700 nm) yields 100% discrimination accuracy for both viable and nonviable seeds. The results indicate that a hyperspectral imaging technique based on LED light can be potentially applied to high-quality pepper seed sorting.
Tsopelas, Fotios; Konstantopoulos, Dimitris; Kakoulidou, Anna Tsantili
2018-07-26
In the present work, two approaches for the voltammetric fingerprinting of oils and their combination with chemometrics were investigated in order to detect the adulteration of extra virgin olive oil with olive pomace oil as well as the most common seed oils, namely sunflower, soybean and corn oil. In particular, cyclic voltammograms of diluted extra virgin olive oils, regular (pure) olive oils (blends of refined olive oils with virgin olive oils), olive pomace oils and seed oils in the presence of dichloromethane and 0.1 M LiClO₄ in EtOH as electrolyte were recorded at a glassy carbon working electrode. Cyclic voltammetry was also applied to methanolic extracts of olive and seed oils. Datapoints of cyclic voltammograms were exported and submitted to Principal Component Analysis (PCA), Partial Least Squares-Discriminant Analysis (PLS-DA) and soft independent modeling of class analogy (SIMCA). In diluted oils, PLS-DA provided a clear discrimination between olive oils (extra virgin and regular) and olive pomace/seed oils, while SIMCA showed a clear discrimination of extra virgin olive oil from all other samples. Using methanolic extracts and considering datapoints recorded between 0.6 and 1.3 V, PLS-DA provided more information, resulting in three clusters (extra virgin olive oils, regular olive oils and seed/olive pomace oils), while SIMCA showed inferior performance. For the quantification of extra virgin olive oil adulteration with olive pomace oil or seed oils, a model based on Partial Least Squares (PLS) analysis was developed. The detection limit of adulteration in olive oil was found to be 2% (v/v) and the linearity range extended up to 33% (v/v). The validity and applicability of all models were demonstrated using a suitable test set. In the case of PLS, synthetic oil mixtures with 4 known adulteration levels in the range of 4-26% were also employed as a blind test set. Copyright © 2018 Elsevier B.V. All rights reserved.
Clementi, Catia; Nowik, Witold; Romani, Aldo; Cardon, Dominique; Trojanowicz, Marek; Davantès, Athénaïs; Chaminade, Pierre
2016-07-05
In this paper, partial least squares (PLS) regression is innovatively applied for a semi-quantitative non-invasive study of the most precious dye of Antiquity: Tyrian purple. This original approach for the study of organic dyes in the cultural heritage field is based on the correlation of spectrophotometric (UV-Visible) and chromatographic (Fast-HT-HPLC-PDA) data from an extensive set of textiles prepared with different snail species according to historical recipes. A cross-validated PLS model, based on the quantity of 6,6'-dibromoindigotin, displays an excellent correlation factor (R²Y = 0.987) between values determined by chromatography and those predicted from reflectance spectra. This indicates that the spectral features of Tyrian purple on textile fibre are strictly related to the amount of this indigoid component, whose content may be non-invasively predicted from the reflectance spectrum. The studied correlation also highlights that, independently of the dyeing method and nature of the textile fibre used, the relative content of 6,6'-dibromoindigotin may be used as a parameter to distinguish samples prepared with Hexaplex trunculus L. snails from those prepared with other mollusc species. To validate this model, archaeological textile fragments dating from the Roman period were successfully examined. The results achieved open an entirely new way of analysing Tyrian purple in cultural heritage by non-invasive spectroscopic techniques, attesting their convergence with HPLC and giving them a semi-quantitative value. Copyright © 2016 Elsevier B.V. All rights reserved.
Tamburini, Elena; Costa, Stefania; Rugiero, Irene; Pedrini, Paola; Marchetti, Maria Gabriella
2017-04-11
A great interest has recently been focused on lycopene and β-carotene, because of their antioxidant action in the organism. Red-flesh watermelon is one of the main sources of lycopene as the most abundant carotenoid. The use of near-infrared spectroscopy (NIRS) in post-harvesting has permitted us to rapidly quantify lycopene, β-carotene, and total soluble solids (TSS) on single intact fruits. Watermelons, harvested in 2013-2015, were submitted to near-infrared (NIR) radiation while being transported along a conveyor belt system, stationary and in movement, and at different positions on the belt. Eight hundred spectra from 100 samples were collected as calibration set in the 900-1700 nm interval. Calibration models were performed using partial least squares (PLS) regression on pre-treated spectra (derivatives and SNV) in the ranges 2.65-151.75 mg/kg (lycopene), 0.19-9.39 mg/kg (β-carotene), and 5.3%-13.7% (TSS). External validation was carried out with 35 new samples and on 35 spectra. The PLS models for intact watermelon could predict lycopene with R² = 0.877 and SECV = 15.68 mg/kg, β-carotene with R² = 0.822 and SECV = 0.81 mg/kg, and TSS with R² = 0.836 and SECV = 0.8%. External validation has confirmed predictive ability with R² = 0.805 and RMSEP = 16.19 mg/kg for lycopene, R² = 0.737 and RMSEP = 0.96 mg/kg for β-carotene, and R² = 0.707 and RMSEP = 1.4% for TSS. The results allow for the market valorization of fruits.
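For readers unfamiliar with the validation figures quoted in this abstract, the sketch below shows one common way to compute RMSEP and R² from reference and predicted values. It assumes that R² is the squared Pearson correlation between the two (the abstract does not state which definition was used), and the lycopene-like values are invented for illustration only.

```python
import math

def rmsep(reference, predicted):
    """Root mean square error of prediction on an external validation set."""
    n = len(reference)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)

def r_squared(reference, predicted):
    """Squared Pearson correlation between reference and predicted values."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_p = sum(predicted) / n
    cov = sum((r - mean_r) * (p - mean_p) for r, p in zip(reference, predicted))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov * cov / (var_r * var_p)

# Hypothetical reference (laboratory) and NIR-predicted values in mg/kg.
reference = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

error = rmsep(reference, predicted)
fit = r_squared(reference, predicted)
```

RMSEP is expressed in the units of the property (mg/kg here), so it can be compared directly with the calibration range, whereas R² is unitless.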
CIEL*a*b* color space predictive models for colorimetry devices--analysis of perfume quality.
Korifi, Rabia; Le Dréau, Yveline; Antinelli, Jean-François; Valls, Robert; Dupuy, Nathalie
2013-01-30
Color perception plays a major role in the consumer evaluation of perfume quality. Consumers need first to be entirely satisfied with the sensory properties of products, before other quality dimensions become relevant. Evaluating the color of complex mixtures presents a challenge even for modern analytical techniques. A variety of instruments are available for color measurement. They can be classified as tristimulus colorimeters and spectrophotometers. The electronics of old tristimulus colorimeters become obsolete because repair parts are difficult to find, which leads to their replacement by more modern instruments. High quality levels in color measurement, i.e., accuracy and reliability in color control, are the major advantages of the new generation of color instrumentation, the integrating sphere spectrophotometer. Two models of spectrophotometer were tested in transmittance mode, employing the d/0° geometry. The CIEL*a*b* color space parameters were measured with each instrument for 380 samples of raw materials and bases used in the perfume compositions. The results were graphically compared between the colorimeter device and the spectrophotometer devices. All color space parameters obtained with the colorimeter were used as dependent variables to generate regression equations with values obtained from the spectrophotometers. The data were statistically analyzed to create a predictive model between the reference and the target instruments through two methods. The first method uses linear regression analysis and the second method consists of partial least squares (PLS) regression on each component. Copyright © 2012 Elsevier B.V. All rights reserved.
El Alami El Hassani, Nadia; Tahri, Khalid; Llobet, Eduard; Bouchikhi, Benachir; Errachid, Abdelhamid; Zine, Nadia; El Bari, Nezha
2018-03-15
Moroccan and French honeys from different geographical areas were classified and characterized by applying a voltammetric electronic tongue (VE-tongue) coupled to analytical methods. The studied parameters include color intensity, free lactonic and total acidity, proteins, phenols, hydroxymethylfurfural content (HMF), sucrose, reducing and total sugars. The geographical classification of different honeys was developed through three pattern recognition techniques: principal component analysis (PCA), support vector machines (SVMs) and hierarchical cluster analysis (HCA). Honey characterization was achieved by partial least squares modeling (PLS). All the PLS models developed were able to accurately estimate the correct values of the parameters analyzed using the voltammetric experimental data as input (i.e., r > 0.9). This confirms the potential of the VE-tongue for rapid characterization of honeys via PLS, with an uncomplicated, cost-effective sample preparation process that does not require the use of additional chemicals. Copyright © 2017 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Yan, Wen-juan; Yang, Ming; He, Guo-quan; Qin, Lin; Li, Gang
2014-11-01
To identify diabetic patients from the tongue near-infrared (NIR) spectrum, a spectral classification model of the NIR reflectivity of the tongue tip is proposed, based on the partial least squares (PLS) method. Thirty-nine tongue-tip NIR spectra were collected from healthy people and from diabetic patients, respectively (78 samples in total). After pretreatment of the reflectivity, the spectral data are set as the independent variable matrix and the classification information as the dependent variable matrix. The samples were divided into two groups, i.e., 53 samples as the calibration set and 25 as the prediction set, and PLS was then used to build the classification model. The model constructed from the 53 samples has a correlation coefficient of 0.9614 and a root mean square error of cross-validation (RMSECV) of 0.1387. The predictions for the 25 samples have a correlation coefficient of 0.9146 and an RMSECV of 0.2122. The experimental results show that the PLS method can achieve good classification of healthy people and diabetic patients.
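When PLS is used for classification as in this abstract, the class information is usually coded numerically and the continuous model output is converted back into a class label. The sketch below assumes a 0/1 coding (0 = healthy, 1 = diabetic) and a decision threshold of 0.5; neither is stated in the abstract, and the outputs shown are invented.

```python
def classify(predictions, threshold=0.5):
    """Map continuous PLS outputs to class labels using a fixed threshold."""
    return [1 if p >= threshold else 0 for p in predictions]

def accuracy(true_labels, predictions, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the true label."""
    assigned = classify(predictions, threshold)
    hits = sum(1 for t, a in zip(true_labels, assigned) if t == a)
    return hits / len(true_labels)

# Hypothetical coded labels and continuous PLS outputs for five samples.
labels = [0, 0, 1, 1, 1]
pls_outputs = [0.08, 0.41, 0.93, 0.62, 0.47]

assigned = classify(pls_outputs)
acc = accuracy(labels, pls_outputs)
```

Under this coding, an RMSE of the order reported in the abstract (well below 0.5) would mean that most outputs fall on the correct side of the threshold.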
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zhao, Kaiguang; Valle, Denis; Popescu, Sorin
2013-05-15
Model specification remains challenging in spectroscopy of plant biochemistry, as exemplified by the availability of various spectral indices or band combinations for estimating the same biochemical. This lack of consensus in model choice across applications argues for a paradigm shift in hyperspectral methods to address model uncertainty and misspecification. We demonstrated one such method using Bayesian model averaging (BMA), which performs variable/band selection and quantifies the relative merits of many candidate models to synthesize a weighted average model with improved predictive performances. The utility of BMA was examined using a portfolio of 27 foliage spectral–chemical datasets representing over 80 species across the globe to estimate multiple biochemical properties, including nitrogen, hydrogen, carbon, cellulose, lignin, chlorophyll (a or b), carotenoid, polar and nonpolar extractives, leaf mass per area, and equivalent water thickness. We also compared BMA with partial least squares (PLS) and stepwise multiple regression (SMR). Results showed that all the biochemicals except carotenoid were accurately estimated from hyperspectral data with R² values > 0.80.
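The idea of averaging candidate models by their relative merit can be sketched as follows. This minimal example approximates posterior model probabilities from BIC values, a common shortcut that is assumed here for illustration and is not necessarily the procedure used in the study; the BIC values and per-model predictions are hypothetical.

```python
import math

def bma_weights(bics):
    """Approximate posterior model weights from BIC values (lower BIC, higher weight)."""
    best = min(bics)
    # Subtracting the best BIC keeps the exponentials numerically stable.
    raw = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(raw)
    return [r / total for r in raw]

def bma_predict(predictions, weights):
    """Weighted average of the candidate models' predictions for one sample."""
    return sum(p * w for p, w in zip(predictions, weights))

# Hypothetical BIC values for three band-combination models.
bics = [100.0, 102.0, 110.0]
weights = bma_weights(bics)

# Hypothetical leaf nitrogen predictions (%) from the same three models.
averaged = bma_predict([2.1, 2.4, 3.0], weights)
```

The averaged prediction is dominated by the better-supported models, while weaker models still contribute in proportion to their weight instead of being discarded outright.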
Li, Lin; Xu, Shuo; An, Xin; Zhang, Lu-Da
2011-10-01
In near infrared spectral quantitative analysis, the precision of the measured chemical values of the samples sets the theoretical limit on the precision of quantitative analysis with mathematical models. However, few samples have accurately measured chemical values. Many models exclude the samples without chemical values and consider only the samples with chemical values when modeling the contents of sample components. To address this problem, a semi-supervised LS-SVR (S2 LS-SVR) model is proposed on the basis of LS-SVR, which can use samples without chemical values as well as those with chemical values. As with LS-SVR, training this model is equivalent to solving a linear system. Finally, samples of flue-cured tobacco were taken as experimental material, and quantitative analysis models were constructed for the contents of four components (total sugar, reducing sugar, total nitrogen and nicotine) with PLS regression, LS-SVR and S2 LS-SVR. For the S2 LS-SVR model, the average relative errors between actual and predicted values for the four components are 6.62%, 7.56%, 6.11% and 8.20%, respectively, and the correlation coefficients are 0.9741, 0.9733, 0.9230 and 0.9486, respectively. Experimental results show that the S2 LS-SVR model outperforms the other two, which verifies the feasibility and efficiency of the S2 LS-SVR model.
Martins, Z E; Pinho, O; Ferreira, I M P L V O
2017-09-01
The use of agroindustry by-products (BP) for fortification of wheat bread can be an alternative to waste disposal because BP are appealing sources of dietary fiber. Moreover, it may also contribute to indirect income generation. In this study, sensory, color, and crumb structure properties of breads fortified with fiber rich fraction recovered from four types of agroindustry BP were tested, namely orange (OE), pomegranate (PE), elderberry (EE), and spent yeast (YE). Statistical models for sensory preference evaluation and correlation with color and crumb structure were developed. External preference mapping indicated consumer preferences and enabled selection of the concentrations of BP fibre-rich fraction with best acceptance, namely 7.0% EE, 2.5% OE, 5.0% PE, and 2.5% YE. Data collected from image analysis complemented sensory profile information, whereas multivariate PLS regression provided information on the relationship between "crust color" and "crumb color" and instrumental data. Regression models developed for both sensory attributes presented good fitting (R²Y > 0.700) and predictive ability (Q² > 0.500), with low RMSE. Crust and crumb a* parameters had a positive influence on "crust color" and "crumb color" models, while crust L* and b* had a negative influence. © 2017 Institute of Food Technologists®.
Ghanem, Eman; Hopfer, Helene; Navarro, Andrea; Ritzer, Maxwell S; Mahmood, Lina; Fredell, Morgan; Cubley, Ashley; Bolen, Jessica; Fattah, Rabia; Teasdale, Katherine; Lieu, Linh; Chua, Tedmund; Marini, Federico; Heymann, Hildegarde; Anslyn, Eric V
2015-05-20
Differential sensing using synthetic receptors as mimics of the mammalian senses of taste and smell is a powerful approach for the analysis of complex mixtures. Herein, we report on the effectiveness of a cross-reactive, supramolecular, peptide-based sensing array in differentiating and predicting the composition of red wine blends. Fifteen blends of Cabernet Sauvignon, Merlot and Cabernet Franc, in addition to the mono varietals, were used in this investigation. Linear Discriminant Analysis (LDA) showed a clear differentiation of blends based on tannin concentration and composition where certain mono varietals like Cabernet Sauvignon seemed to contribute less to the overall characteristics of the blend. Partial Least Squares (PLS) Regression and cross validation were used to build a predictive model for the responses of the receptors to eleven binary blends and the three mono varietals. The optimized model was later used to predict the percentage of each mono varietal in an independent test set composed of four tri-blends with a 15% average error. A partial least squares regression model using the mouth-feel and taste descriptive sensory attributes of the wine blends revealed a strong correlation of the receptors to perceived astringency, which is indicative of selective binding to polyphenols in wine.
Pistonesi, Marcelo F; Di Nezio, María S; Centurión, María E; Lista, Adriana G; Fragoso, Wallace D; Pontes, Márcio J C; Araújo, Mário C U; Band, Beatriz S Fernández
2010-12-15
In this study, a novel, simple, and efficient spectrofluorimetric method to determine directly and simultaneously five phenolic compounds (hydroquinone, resorcinol, phenol, m-cresol and p-cresol) in air samples is presented. For this purpose, variable selection by the successive projections algorithm (SPA) is used in order to obtain simple multiple linear regression (MLR) models based on a small subset of wavelengths. For comparison, full-spectrum partial least squares (PLS) regression is also employed. The concentrations of the calibration matrix ranged from 0.02 to 0.2 mg L⁻¹ for hydroquinone, from 0.05 to 0.6 mg L⁻¹ for resorcinol, and from 0.05 to 0.4 mg L⁻¹ for phenol, m-cresol and p-cresol; incidentally, such ranges are in accordance with the Argentinean environmental legislation. To verify the accuracy of the proposed method, a recovery study on real air samples from a smoking environment was carried out with satisfactory results (94-104%). The advantage of the proposed method is that it requires only spectrofluorimetric measurements of samples and chemometric modeling for simultaneous determination of five phenols. Air is simply sampled and no sample pre-treatment is needed (i.e., separation steps and derivatization reagents are avoided), which saves a great deal of time. Copyright © 2010 Elsevier B.V. All rights reserved.
Transmission versus reflectance spectroscopy for quantitation
NASA Astrophysics Data System (ADS)
Gardner, Craig M.
2018-01-01
The objective of this work was to compare the accuracy of analyte concentration estimation when using transmission versus diffuse reflectance spectroscopy of a scattering medium. Monte Carlo ray tracing of light through the medium was used in conjunction with pure component absorption spectra and Beer-Lambert absorption along each ray's pathlength to generate matched sets of pseudoabsorbance spectra, containing water and six analytes present in skin. PLS regression models revealed an improvement in accuracy when using transmission compared to reflectance for a range of medium thicknesses and instrument noise levels. An analytical expression revealed the source of the accuracy degradation with reflectance was due both to the reduced collection efficiency for a fixed instrument etendue and to the broad pathlength distribution that detected light travels in the medium before exiting from the incident side.
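The effect of a broad pathlength distribution described in this abstract can be illustrated with a small calculation. The sketch below applies Beer-Lambert absorption along each detected ray and averages the transmitted fractions with equal weight per ray; the equal weighting, the absorptivities, the concentrations and the pathlengths are all assumptions made for illustration and do not come from the paper's Monte Carlo simulation.

```python
import math

def pseudoabsorbance(pathlengths, absorptivities, concentrations):
    """Pseudoabsorbance of light detected over a set of ray pathlengths."""
    # Decadic absorption coefficient of the mixture (Beer-Lambert, per unit length).
    mu = sum(e * c for e, c in zip(absorptivities, concentrations))
    # Fraction of light surviving along each ray, averaged with equal weights.
    transmitted = [10 ** (-mu * length) for length in pathlengths]
    return -math.log10(sum(transmitted) / len(transmitted))

# Hypothetical two-analyte mixture.
absorptivities = [0.5, 0.2]
concentrations = [1.0, 2.0]

# Transmission-like case: every ray travels the same distance.
narrow = pseudoabsorbance([1.0, 1.0], absorptivities, concentrations)
# Reflectance-like case: same mean pathlength, but a broad distribution.
broad = pseudoabsorbance([0.5, 1.5], absorptivities, concentrations)
```

With a single pathlength the pseudoabsorbance is exactly linear in concentration, whereas a spread of pathlengths lowers the apparent absorbance and makes the response nonlinear, which is consistent with the accuracy loss the abstract attributes to reflectance.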
Güssregen, Stefan; Matter, Hans; Hessler, Gerhard; Müller, Marco; Schmidt, Friedemann; Clark, Timothy
2012-09-24
Current 3D-QSAR methods such as CoMFA or CoMSIA make use of classical force-field approaches for calculating molecular fields. Thus, they cannot adequately account for noncovalent interactions involving halogen atoms like halogen bonds or halogen-π interactions. These deficiencies in the underlying force fields result from the lack of treatment of the anisotropy of the electron density distribution of those atoms, known as the "σ-hole", although recent developments have begun to take specific interactions such as halogen bonding into account. We have now replaced classical force field derived molecular fields by local properties such as the local ionization energy, local electron affinity, or local polarizability, calculated using quantum-mechanical (QM) techniques that do not suffer from the above limitation for 3D-QSAR. We first investigate the characteristics of QM-based local property fields to show that they are suitable for statistical analyses after suitable pretreatment. We then analyze these property fields with partial least-squares (PLS) regression to predict biological affinities of two data sets comprising factor Xa and GABA-A/benzodiazepine receptor ligands. While the resulting models perform equally well or even slightly better in terms of consistency and predictivity than the classical CoMFA fields, the most important aspect of these augmented field-types is that the chemical interpretation of resulting QM-based property field models reveals unique SAR trends driven by electrostatic and polarizability effects, which cannot be extracted directly from CoMFA electrostatic maps. Within the factor Xa set, the interaction of chlorine and bromine atoms with a tyrosine side chain in the protease S1 pocket is correctly predicted. 
Within the GABA-A/benzodiazepine ligand data set, PLS models of high predictivity resulted for our QM-based property fields, providing novel insights into key features of the SAR for two receptor subtypes and cross-receptor selectivity of the ligands. The detailed interpretation of regression models derived using improved QM-derived property fields thus provides a significant advantage by revealing chemically meaningful correlations with biological activity and helps in understanding novel structure-activity relationship features. This will allow such knowledge to be used to design novel molecules on the basis of interactions additional to steric and hydrogen-bonding features.
Liu, Xiaona; Zhang, Qiao; Wu, Zhisheng; Shi, Xinyuan; Zhao, Na; Qiao, Yanjiang
2015-01-01
Laser-induced breakdown spectroscopy (LIBS) was applied to perform a rapid elemental analysis and provenance study of Blumea balsamifera DC. Principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) were implemented to exploit the multivariate nature of the LIBS data. Scores and loadings of computed principal components visually illustrated the differing spectral data. The PLS-DA algorithm showed good classification performance. The PLS-DA model using complete spectra as input variables had similar discrimination performance to using selected spectral lines as input variables. The down-selection of spectral lines was specifically focused on the major elements of B. balsamifera samples. Results indicated that LIBS could be used to rapidly analyze elements and to perform provenance study of B. balsamifera. PMID:25558999
Application of FTIR-ATR spectroscopy to the quantification of sugar in honey.
Anjos, Ofélia; Campos, Maria Graça; Ruiz, Pablo Contreras; Antunes, Paulo
2015-02-15
A Fourier transform infrared spectroscopic method with attenuated total reflectance (FTIR-ATR) was combined with a partial least squares (PLS) regression model to predict sugar content in honey samples. Standards of trehalose, glucose, fructose, sucrose, melezitose, turanose and maltose were used to identify and quantify the individual sugar components in 63 honey samples by HPAEC-IPAD. Fructose and glucose are the most abundant sugars in honey, with average values of 36% and 26%, respectively. First-derivative (1stDer) spectra with MSC or SLS in the wavenumber range from 1500 to 750 cm(-1) provided the best calibration models, with r(2) of 86.60 and 86.01 and RPD of 2.6 and 2.55 for fructose and glucose, respectively. Good models were also found for turanose and melezitose. FTIR-ATR proved to be a good methodology for quantifying the main sugar content in honey and is easily adapted to routine analysis. Copyright © 2014 Elsevier Ltd. All rights reserved.
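The RPD figure quoted above is commonly defined as the standard deviation of the reference values divided by the prediction error. The sketch below assumes that definition with RMSEP as the error term; the abstract does not state which error estimate was used, and the sugar values are invented for illustration.

```python
import math
import statistics


def rmsep(reference, predicted):
    """Root mean square error of prediction."""
    n = len(reference)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)


def rpd(reference, predicted):
    """Ratio of performance to deviation: SD of reference values / RMSEP."""
    return statistics.stdev(reference) / rmsep(reference, predicted)


# Hypothetical fructose contents (%) and model predictions.
reference = [30.0, 34.0, 36.0, 38.0, 42.0]
predicted = [31.0, 33.0, 36.0, 39.0, 41.0]

rpd_value = rpd(reference, predicted)
print(round(rpd_value, 2))
```

A larger RPD means the prediction error is small relative to the natural spread of the reference values.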
Crop/weed discrimination using near-infrared reflectance spectroscopy (NIRS)
NASA Astrophysics Data System (ADS)
Zhang, Yun; He, Yong
2006-09-01
Traditional uniform herbicide application often leaves excess chemical residues on soil, crop plants and agricultural produce, which imperil the environment and food security. Near-infrared reflectance spectroscopy (NIRS) offers a promising means for weed detection and site-specific herbicide application. In the laboratory, a total of 90 samples (30 for each species) of detached leaves of two weeds, i.e., threeseeded mercury (Acalypha australis L.) and fourleafed duckweed (Marsilea quadrifolia L.), and one crop, soybean (Glycine max), were investigated by NIRS at 325-1075 nm using a field spectroradiometer. Absorbance spectra of 20 samples of each species were exported after pretreatment, and the missing Y variables were assigned independent values for partial least squares (PLS) analysis. In the combined principal component analysis (PCA) on 400-1000 nm, PC1 and PC2 together explained over 91% of the total variance and distinguished the three plant species with 98.3% accuracy. The full cross-validation results of PLS, i.e., standard error of prediction (SEP) 0.247, correlation coefficient (r) 0.954 and root mean square error of prediction (RMSEP) 0.245, indicated an optimum model for weed identification. When the remaining 10 samples of each species were predicted with the PLS model, the results with deviation presented a 100% crop/weed detection rate. Thus, it could be concluded that PLS is a viable alternative for qualitative weed discrimination with NIRS.
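The validation statistics reported here (SEP, r and RMSEP) are related through the prediction bias. The following sketch uses the usual chemometric definitions, which are assumed rather than taken from the abstract, and applies them to invented class codes and predictions.

```python
import math


def prediction_stats(reference, predicted):
    """Return bias, SEP (bias-corrected), RMSEP and Pearson r."""
    n = len(reference)
    errors = [p - r for r, p in zip(reference, predicted)]
    bias = sum(errors) / n
    rmsep = math.sqrt(sum(e * e for e in errors) / n)
    sep = math.sqrt(sum((e - bias) ** 2 for e in errors) / (n - 1))
    mean_ref = sum(reference) / n
    mean_pred = sum(predicted) / n
    s_rp = sum((r - mean_ref) * (p - mean_pred) for r, p in zip(reference, predicted))
    s_rr = sum((r - mean_ref) ** 2 for r in reference)
    s_pp = sum((p - mean_pred) ** 2 for p in predicted)
    return {"bias": bias, "sep": sep, "rmsep": rmsep, "r": s_rp / math.sqrt(s_rr * s_pp)}


# Hypothetical Y codes (1, 2, 3 for three species) and PLS predictions.
reference = [1.0, 1.0, 2.0, 2.0, 3.0, 3.0]
predicted = [1.1, 0.8, 2.3, 1.9, 3.2, 2.6]

stats = prediction_stats(reference, predicted)
print(stats)
```

Under these definitions RMSEP² = bias² + (n − 1)/n · SEP², so SEP and RMSEP are close whenever the bias is small, as with the values 0.247 and 0.245 quoted above.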
Arakaki, Xianghong; Galbraith, Gary; Pikov, Victor; Fonteh, Alfred N.; Harrington, Michael G.
2014-01-01
Migraine symptoms often include auditory discomfort. Nitroglycerin (NTG)-triggered central sensitization (CS) provides a rodent model of migraine, but auditory brainstem pathways have not yet been studied in this example. Our objective was to examine brainstem auditory evoked potentials (BAEPs) in rat CS as a measure of possible auditory abnormalities. We used four subdermal electrodes to record horizontal (h) and vertical (v) dipole channel BAEPs before and after injection of NTG or saline. We measured the peak latencies (PLs), interpeak latencies (IPLs), and amplitudes for detectable waveforms evoked by 8, 16, or 32 KHz auditory stimulation. At 8 KHz stimulation, vertical channel positive PLs of waves 4, 5, and 6 (vP4, vP5, and vP6), and related IPLs from earlier negative or positive peaks (vN1-vP4, vN1-vP5, vN1-vP6; vP3-vP4, vP3-vP6) increased significantly 2 hours after NTG injection compared to the saline group. However, BAEP peak amplitudes at all frequencies, PLs and IPLs from the horizontal channel at all frequencies, and the vertical channel stimulated at 16 and 32 KHz showed no significant/consistent change. For the first time in the rat CS model, we show that BAEP PLs and IPLs ranging from putative bilateral medial superior olivary nuclei (P4) to the more rostral structures such as the medial geniculate body (P6) were prolonged 2 hours after NTG administration. These BAEP alterations could reflect changes in neurotransmitters and/or hypoperfusion in the midbrain. The similarity of our results with previous human studies further validates the rodent CS model for future migraine research. PMID:24680742
Geographical provenance of palm oil by fatty acid and volatile compound fingerprinting techniques.
Tres, A; Ruiz-Samblas, C; van der Veer, G; van Ruth, S M
2013-04-15
Analytical methods are required in addition to administrative controls to verify the geographical origin of vegetable oils such as palm oil in an objective manner. In this study, fatty acid and volatile organic compound fingerprinting in combination with chemometrics was applied to verify the geographical origin of crude palm oil (continental scale). For this purpose 94 crude palm oil samples were collected from South East Asia (55), South America (11) and Africa (28). Partial least squares discriminant analysis (PLS-DA) was used to develop a hierarchical classification model by combining two consecutive binary PLS-DA models. First, a PLS-DA model was built to distinguish South East Asian from non-South East Asian palm oil samples. Then a second model was developed, only for the non-Asian samples, to discriminate African from South American crude palm oil. Models were externally validated by using them to predict the identity of new authentic samples. The fatty acid fingerprinting model revealed three misclassified samples. The volatile compound fingerprinting models showed 88%, 100% and 100% accuracy for the South East Asian, African and American classes, respectively. The verification of the geographical origin of crude palm oil is feasible by fatty acid and volatile compound fingerprinting. Further research is required to validate the approach and to increase its spatial specificity to country/province scale. Copyright © 2012 Elsevier Ltd. All rights reserved.
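The two-step hierarchy described above can be sketched as a cascade of binary decisions. The two "models" below are hypothetical stand-ins (simple threshold rules on made-up features), not fitted PLS-DA models, and the 0.5 decision threshold is an assumption.

```python
from typing import Callable, Sequence

Sample = Sequence[float]
BinaryModel = Callable[[Sample], float]  # score near 1 means "positive class"


def classify_origin(sample: Sample,
                    asia_model: BinaryModel,
                    africa_model: BinaryModel,
                    threshold: float = 0.5) -> str:
    """Cascade: Asia vs non-Asia first, then Africa vs South America."""
    if asia_model(sample) >= threshold:
        return "South East Asia"
    # Only samples rejected by the first model reach the second one.
    if africa_model(sample) >= threshold:
        return "Africa"
    return "South America"


# Hypothetical stand-ins for the two fitted binary models.
def toy_asia_model(sample: Sample) -> float:
    return 1.0 if sample[0] > 0.5 else 0.0


def toy_africa_model(sample: Sample) -> float:
    return 1.0 if sample[1] > 0.5 else 0.0


for s in ([0.9, 0.1], [0.2, 0.8], [0.2, 0.3]):
    print(s, classify_origin(s, toy_asia_model, toy_africa_model))
```

One consequence of this design is that an error in the first model cannot be corrected by the second, so the first split is usually chosen to be the easiest one.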
Human Milk Plasmalogens Are Highly Enriched in Long-Chain PUFAs.
Moukarzel, Sara; Dyer, Roger A; Keller, Bernd O; Elango, Rajavel; Innis, Sheila M
2016-11-01
Human milk contains unique glycerophospholipids, including ethanolamine-containing plasmalogens (Pls-PEs) in the milk fat globule membrane, which have been implicated in infant brain development. Brain Pls-PEs accumulate postnatally and are enriched in long-chain polyunsaturated fatty acids (LC-PUFAs), particularly docosahexaenoic acid (DHA). Fatty acid (FA) composition of Pls-PEs in milk is poorly understood because of the analytical challenges in separating Pls-PEs from other phospholipids in the predominating presence of triacylglycerols. The variability of Pls-PE FAs and the potential role of maternal diet remain unknown. Our primary objectives were to establish improved methodology for extracting Pls-PEs from human milk, enabling FA analysis, and to compare FA composition between Pls-PEs and 2 major milk phospholipids, phosphatidylcholine and phosphatidylethanolamine. Our secondary objective was to explore associations between maternal DHA intake and DHA in milk phospholipids and variability in phospholipid-DHA within a woman. Mature milk was collected from 25 women, with 4 providing 3 milk samples on 3 separate days. Lipids were extracted, and phospholipids were removed by solid phase extraction. Pls-PEs were separated by using normal-phase HPLC, recovered and analyzed for FAs by GLC. Diet was assessed by using a validated food-frequency questionnaire. Pls-PE concentration in human milk was significantly higher in LC-PUFAs than phosphatidylethanolamine and phosphatidylcholine, including arachidonic acid (AA) and DHA. The mean ± SD concentration of AAs in Pls-PEs was ∼2.5-fold higher than in phosphatidylethanolamine (10.5 ± 1.71 and 3.82 ± 0.92 g/100 g, respectively). DHA in Pls-PEs varied across women (0.95-6.51 g/100 g), likely independent of maternal DHA intake. Pls-PE DHA also varied within a woman across days (CV ranged from 9.8% to 28%). Human milk provides the infant with LC-PUFAs from multiple lipid pools, including a source from Pls-PEs. 
The biological determinants of Pls-PE FAs and physiological relevance to the breastfed infant remain to be elucidated. © 2016 American Society for Nutrition.
Kritikos, Nikolaos; Tsantili-Kakoulidou, Anna; Loukas, Yannis L; Dotsikas, Yannis
2015-07-17
In the current study, quantitative structure-retention relationships (QSRR) were constructed based on data obtained by a LC-(ESI)-QTOF-MS/MS method for the determination of amino acid analogues, following their derivatization via chloroformate esters. Molecules were derivatized via an n-propyl chloroformate/n-propanol mediated reaction. Derivatives were recovered through a liquid-liquid extraction procedure. Chromatographic separation was based on gradient elution using methanol/water mixtures from a 70/30% composition to a final 85/15% composition, maintaining a constant rate of change. The group of examined molecules was diverse, including mainly α-amino acids, yet also β- and γ-amino acids, γ-amino acid analogues, decarboxylated and phosphorylated analogues and dipeptides. The projection to latent structures (PLS) method was selected for the formation of QSRRs, resulting in a total of three PLS models with high cross-validated coefficients of determination Q(2)Y. For this purpose, molecular structures were first described through the use of descriptors. Through stratified random sampling procedures, 57 compounds were split into a training set and a test set. Model creation was based on multiple criteria including principal component significance and eigenvalue, variable importance, form of residuals, etc. Validation was based on the statistical metrics Rpred(2), QextF2(2) and QextF3(2) for the test set and Roy's metrics rm(Av)(2) and rm(δ)(2), assessing both predictive stability and internal validity. Based on the aforementioned models, simplified equivalents were then created using a multiple linear regression (MLR) method. MLR models were also validated with the same metrics. The suggested models are considered useful for the estimation of retention times of amino acid analogues for a series of applications. Copyright © 2015 Elsevier B.V. All rights reserved.
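The external validation metrics named above can be sketched as follows. The formulas are the definitions commonly used in the QSAR literature, taken here as an assumption (with R²pred equated to Q²F1), and the retention values are invented; Roy's rm² metrics are not covered.

```python
def external_q2(y_train, y_test, y_pred):
    """Return (Q2_F1, Q2_F2, Q2_F3) for an external test set.

    Q2_F1 references the training mean, Q2_F2 the test mean, and
    Q2_F3 scales the test error by the training-set variance.
    """
    press = sum((y - p) ** 2 for y, p in zip(y_test, y_pred))
    mean_train = sum(y_train) / len(y_train)
    mean_test = sum(y_test) / len(y_test)
    q2_f1 = 1.0 - press / sum((y - mean_train) ** 2 for y in y_test)
    q2_f2 = 1.0 - press / sum((y - mean_test) ** 2 for y in y_test)
    tss_train = sum((y - mean_train) ** 2 for y in y_train)
    q2_f3 = 1.0 - (press / len(y_test)) / (tss_train / len(y_train))
    return q2_f1, q2_f2, q2_f3


# Hypothetical retention times (min) for training and test compounds.
y_train = [1.0, 2.0, 3.0, 4.0, 5.0]
y_test = [2.0, 4.0]
y_pred = [2.5, 3.5]

q2_f1, q2_f2, q2_f3 = external_q2(y_train, y_test, y_pred)
print(q2_f1, q2_f2, q2_f3)
```

Because Q²F3 is normalized by the training-set variance, it does not depend on how widely the test compounds happen to be spread, which is one reason several of these metrics are usually reported together.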
Predicting tropical plant physiology from leaf and canopy spectroscopy
NASA Astrophysics Data System (ADS)
Doughty, C.; Asner, G. P.; Martin, R.
2009-12-01
A broad understanding of tropical forest leaf photosynthesis has long been a goal for tropical forest ecologists but has remained elusive because of difficult canopy access and great species diversity. In this paper, we develop an empirical model to predict light saturated sunlit tropical leaf photosynthesis based on leaf and canopy spectra with the goal of developing a high resolution remote sensing technique to measure canopy photosynthesis. To develop this model, we used the partial least squares (PLS) regression technique on three tropical forest datasets (~168 species), two in Hawaii and one in the tropical rainforest module of Biosphere 2 (B2L). For each species, we measured light saturated photosynthesis (A), light and CO2 saturated photosynthesis (Amax), day respiration (R), leaf spectra (400-2500 nm with 1 nm sampling), leaf nitrogen (N), chlorophyll A and B, carotenoids, and specific leaf area (SLA). On a subset of species we measured Jmax and Vcmax based on light and Aci curves. The model best predicted A (r2 = 0.74, root mean square error (RMSE) = 2.85 µmol m-2 s-1) and R (r2 = 0.48, RMSE = -0.52 µmol m-2 s-1), followed by Amax (r2 = 0.47, RMSE = 5.1 µmol m-2 s-1), Jmax (R2 = 0.52, RMSE = 39) and Vcmax (R2 = 0.39, RMSE = 36). The PLS weightings, which indicate which wavelengths contribute most to the model, indicated that physiology weightings were most similar to nitrogen weightings, followed by chlorophyll and SLA. We combined leaf-level reflectance and transmittance with a canopy radiative transfer model to simulate top-of-canopy reflectance, and found that canopy spectra predict light saturated photosynthesis more accurately (RMSE = 2.4 µmol m-2 s-1) than do leaf spectra (RMSE = 2.85 µmol m-2 s-1). The results suggest that there is potential for this technique to be used with high fidelity imaging spectrometers to remotely sense tropical forest canopy photosynthesis.
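How PLS weightings point at the informative wavelengths can be seen in a minimal one-component PLS1 sketch. This is a generic textbook formulation in plain Python, not the authors' implementation; the three-band "spectra" and the response values are synthetic, and a real model would extract several components.

```python
import math


def fit_pls1_one_component(X, y):
    """Fit a single-component PLS1 model; returns means, weights, coefficients."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction in X with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores, then the regression of y on the scores.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    coef = [wj * q for wj in w]
    return x_mean, y_mean, w, coef


def predict(model, x):
    x_mean, y_mean, _, coef = model
    return y_mean + sum((xj - mj) * cj for xj, mj, cj in zip(x, x_mean, coef))


# Synthetic data: one latent factor drives both the spectra and y.
latent = [0.0, 1.0, 2.0, 3.0]
band_profile = [1.0, 2.0, -1.0]            # hypothetical spectral signature
X = [[5.0 + t * b for b in band_profile] for t in latent]
y = [10.0 + 3.0 * t for t in latent]

model = fit_pls1_one_component(X, y)
weights = model[2]
fitted = [predict(model, row) for row in X]
print(weights, fitted)
```

In this synthetic case the weight vector recovers the normalized band profile, so the band with the largest absolute weight is the one that varies most strongly with the response.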
Prediction of specialty coffee cup quality based on near infrared spectra of green coffee beans.
Tolessa, Kassaye; Rademaker, Michael; De Baets, Bernard; Boeckx, Pascal
2016-04-01
The growing global demand for specialty coffee increases the need for improved coffee quality assessment methods. Green bean coffee quality analysis is usually carried out by physical (e.g. black beans, immature beans) and cup quality (e.g. acidity, flavour) evaluation. However, these evaluation methods are subjective, costly, time consuming, require sample preparation and may end up in poor grading systems. This calls for the development of a rapid, low-cost, reliable and reproducible analytical method to evaluate coffee quality attributes and eventually chemical compounds of interest (e.g. chlorogenic acid) in coffee beans. The aim of this study was to develop a model able to predict coffee cup quality based on NIR spectra of green coffee beans. NIR spectra of 86 samples of green Arabica beans of varying quality were analysed. The partial least squares (PLS) regression method was used to develop a model correlating spectral data to cupping score data (cup quality). The selected PLS model had good predictive power for total specialty cup quality and its individual quality attributes (overall cup preference, acidity, body and aftertaste), showing high correlation coefficients with r-values of 90, 90, 78, 72 and 72, respectively, between measured and predicted cupping scores for 20 out of 86 samples. The corresponding root mean square error of prediction (RMSEP) was 1.04, 0.22, 0.27, 0.24 and 0.27 for total specialty cup quality, overall cup preference, acidity, body and aftertaste, respectively. The results obtained suggest that NIR spectra of green coffee beans are a promising tool for fast and accurate prediction of coffee quality and for classifying green coffee beans into different specialty grades. However, the model should be further tested on coffee samples from different regions in Ethiopia to determine whether one generic model or region-specific models should be developed. Copyright © 2015 Elsevier B.V. All rights reserved.
Rodriguez-Florez, Naiara; Bruse, Jan L; Borghi, Alessandro; Vercruysse, Herman; Ong, Juling; James, Greg; Pennec, Xavier; Dunaway, David J; Jeelani, N U Owase; Schievano, Silvia
2017-10-01
Spring-assisted cranioplasty is performed to correct the long and narrow head shape of children with sagittal synostosis. Such corrective surgery involves osteotomies and the placement of spring-like distractors, which gradually expand to widen the skull until removal about 4 months later. Due to its dynamic nature, associations between surgical parameters and post-operative 3D head shape features are difficult to comprehend. The current study aimed at applying population-based statistical shape modelling to gain insight into how the choice of surgical parameters such as craniotomy size and spring positioning affects post-surgical head shape. Twenty consecutive patients with sagittal synostosis who underwent spring-assisted cranioplasty at Great Ormond Street Hospital for Children (London, UK) were prospectively recruited. Using a nonparametric statistical modelling technique based on mathematical currents, a 3D head shape template was computed from surface head scans of sagittal patients after spring removal. Partial least squares (PLS) regression was employed to quantify and visualise trends of localised head shape changes associated with the surgical parameters recorded during spring insertion: anterior-posterior and lateral craniotomy dimensions, anterior spring position and distance between anterior and posterior springs. Bivariate correlations between surgical parameters and corresponding PLS shape vectors demonstrated that anterior-posterior (Pearson's [Formula: see text]) and lateral craniotomy dimensions (Spearman's [Formula: see text]), as well as the position of the anterior spring ([Formula: see text]) and the distance between both springs ([Formula: see text]) on average had significant effects on head shapes at the time of spring removal. Such effects were visualised on 3D models. 
Population-based analysis of 3D post-operative medical images via computational statistical modelling tools allowed for detection of novel associations between surgical parameters and head shape features achieved following spring-assisted cranioplasty. The techniques described here could be extended to other cranio-maxillofacial procedures in order to assess post-operative outcomes and ultimately facilitate surgical decision making.
Nguyen, Phuc Nghia; Trinh Dang, Thuan Thao; Waton, Gilles; Vandamme, Thierry; Krafft, Marie Pierre
2011-10-04
The adsorption dynamics of a series of phospholipids (PLs) at the interface between an aqueous solution or dispersion of the PL and a gas phase containing the nonpolar, nonamphiphilic linear perfluorocarbon perfluorohexane (PFH) was studied by bubble profile analysis tensiometry. The PLs investigated were dioctanoylphosphatidylcholine (DiC(8)-PC), dilaurylphosphatidylcholine, dimyristoylphosphatidylcholine, and dipalmitoylphosphatidylcholine. The gas phase consisted of air or air saturated with PFH. The perfluorocarbon gas was found to have an unexpected, strong effect on both the adsorption rate and the equilibrium interfacial tension (γ(eq)) of the PLs. First, for all of the PLs, and at all concentrations investigated, the γ(eq) values were significantly lower (by up to 10 mN m(-1)) when PFH was present in the gas phase. The efficacy of PFH in decreasing γ(eq) depends on the ability of PLs to form micelles or vesicles in water. For vesicles, it also depends on the gel or fluid state of the membranes. Second, the adsorption rates of all the PLs at the interface (as assessed by the time required for the initial interfacial tension to be reduced by 30%) are significantly accelerated (by up to fivefold) by the presence of PFH for the lower PL concentrations. Both the surface-tension reducing effect and the adsorption rate increasing effect establish that PFH has a strong interaction with the PL monolayer and acts as a cosurfactant at the interface, despite the absence of any amphiphilic character. Fitting the adsorption profiles of DiC(8)-PC at the PFH-saturated air/aqueous solution interface with the modified Frumkin model indicated that the PFH molecule lay horizontally at the interface. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Yuan, Cheng; Lazarowitz, Sondra G; Citovsky, Vitaly
2016-01-19
Our fundamental knowledge of the protein-sorting pathways required for plant cell-to-cell trafficking and communication via the intercellular connections termed plasmodesmata has been severely limited by the paucity of plasmodesmal targeting sequences that have been identified to date. To address this limitation, we have identified the plasmodesmal localization signal (PLS) in the Tobacco mosaic virus (TMV) cell-to-cell-movement protein (MP), which has emerged as the paradigm for dissecting the molecular details of cell-to-cell transport through plasmodesmata. We report here the identification of a bona fide functional TMV MP PLS, which encompasses amino acid residues between positions 1 and 50, with residues Val-4 and Phe-14 potentially representing critical sites for PLS function that most likely affect protein conformation or protein interactions. We then demonstrated that this PLS is both necessary and sufficient for protein targeting to plasmodesmata. Importantly, as TMV MP traffics to plasmodesmata by a mechanism that is distinct from those of the three plant cell proteins in which PLSs have been reported, our findings provide important new insights to expand our understanding of protein-sorting pathways to plasmodesmata. The science of virology began with the discovery of Tobacco mosaic virus (TMV). Since then, TMV has served as an experimental and conceptual model for studies of viruses and dissection of virus-host interactions. Indeed, the TMV cell-to-cell-movement protein (MP) has emerged as the paradigm for dissecting the molecular details of cell-to-cell transport through the plant intercellular connections termed plasmodesmata. However, one of the most fundamental and key functional features of TMV MP, its putative plasmodesmal localization signal (PLS), has not been identified. Here, we fill this gap in our knowledge and identify the TMV MP PLS. Copyright © 2016 Yuan et al.
Purified human MDR 1 modulates membrane potential in reconstituted proteoliposomes.
Howard, Ellen M; Roepe, Paul D
2003-04-01
Human multidrug resistance (hu MDR 1) cDNA was fused to a P. shermanii transcarboxylase biotin acceptor domain (TCBD), and the fusion protein was heterologously overexpressed at high yield in K(+)-uptake deficient Saccharomyces cerevisiae yeast strain 9.3, purified by avidin-biotin chromatography, and reconstituted into proteoliposomes (PLs) formed with Escherichia coli lipid. As measured by pH- dependent ATPase activity, purified, reconstituted, biotinylated MDR-TCBD protein is fully functional. Dodecyl maltoside proved to be the most effective detergent for the membrane solubilization of MDR-TCBD, and various salts were found to significantly affect reconstitution into PLs. After extensive analysis, we find that purified reconstituted MDR-TCBD protein does not catalyze measurable H(+) pumping in the presence of ATP. In the presence of physiologic [ATP], K(+)/Na(+) diffusion potentials monitored by either anionic oxonol or cationic carbocyanine are easily established upon addition of valinomycin to either control or MDR-TCBD PLs. However, in the absence of ATP, although control PLs still maintain easily measurable K(+)/Na(+) diffusion potentials upon addition of valinomycin, MDR-TCBD PLs do not. Dissipation of potential by MDR-TCBD is clearly [ATP] dependent and also appears to be Cl(-) dependent, since replacing Cl(-) with equimolar glutamate restores the ability of MDR-TCBD PLs to form a membrane potential in the absence of physiologic [ATP]. The data are difficult to reconcile with models that might propose ATP-catalyzed "pumping" of the fluorescent probes we use and are more consistent with electrically passive anion transport via MDR-TCBD protein, but only at low [ATP]. These observations may help to resolve the confusing array of data related to putative ion transport by hu MDR 1 protein.
[NIR Assignment of Magnolol by 2D-COS Technology and Model Application Huoxiangzhengqi Oral Liquid].
Pei, Yan-ling; Wu, Zhi-sheng; Shi, Xin-yuan; Pan, Xiao-ning; Peng, Yan-fang; Qiao, Yan-jiang
2015-08-01
Near infrared (NIR) spectral assignment of Magnolol was performed using deuterated chloroform as solvent and two-dimensional correlation spectroscopy (2D-COS) technology. According to the synchronous spectra of deuterated chloroform solvent and Magnolol, 1365~1455, 1600~1720, 2000~2181 and 2275~2465 nm were the characteristic absorption bands of Magnolol. In relation to the structure of Magnolol, 1440 nm was assigned to the stretching vibration of the phenolic O-H group; 1679 nm to the stretching vibration of aryl and of methyl connected to aryl; 2117, 2304, 2339 and 2370 nm to combinations of the stretching, bending and deformation vibrations of aryl C-H; and 2445 nm to the bending vibration of methyl linked to the aryl group. These bands are attributed to the characteristics of Magnolol. Huoxiangzhengqi Oral Liquid was adopted to study Magnolol: the characteristic bands from spectral assignment and the bands selected by interval Partial Least Squares (iPLS) and Synergy interval Partial Least Squares (SiPLS) were used to establish Partial Least Squares (PLS) quantitative models. The coefficients of determination Rcal(2) and Rpre(2) were greater than 0.99, and the Root Mean Square Error of Calibration (RMSEC), Root Mean Square Error of Cross Validation (RMSECV) and Root Mean Square Error of Prediction (RMSEP) were very small. This indicated that the characteristic bands from spectral assignment gave the same results as the chemometric band selection in the PLS model. The work provides a reference for NIR spectral assignment of chemical compositions in Chinese Materia Medica, and the band filters of NIR were interpreted.
Characterization of human breast cancer tissues by infrared imaging.
Verdonck, M; Denayer, A; Delvaux, B; Garaud, S; De Wind, R; Desmedt, C; Sotiriou, C; Willard-Gallo, K; Goormaghtigh, E
2016-01-21
Fourier Transform InfraRed (FTIR) spectroscopy coupled to microscopy (IR imaging) has shown unique advantages in detecting morphological and molecular pathologic alterations in biological tissues. The aim of this study was to evaluate the potential of IR imaging as a diagnostic tool to identify characteristics of breast epithelial cells and the stroma. In this study a total of 19 breast tissue samples were obtained from 13 patients. For 6 of the patients, we also obtained Non-Adjacent Non-Tumor tissue samples. Infrared images were recorded on the main cell/tissue types identified in all breast tissue samples. Unsupervised Principal Component Analyses and supervised Partial Least Square Discriminant Analyses (PLS-DA) were used to discriminate spectra. Leave-one-out cross-validation was used to evaluate the performance of PLS-DA models. Our results show that IR imaging coupled with PLS-DA can efficiently identify the main cell types present in FFPE breast tissue sections, i.e. epithelial cells, lymphocytes, connective tissue, vascular tissue and erythrocytes. A second PLS-DA model could distinguish normal and tumor breast epithelial cells in the breast tissue sections. A patient-specific model reached particularly high sensitivity, specificity and MCC rates. Finally, we showed that the stroma located close or at distance from the tumor exhibits distinct spectral characteristics. In conclusion FTIR imaging combined with computational algorithms could be an accurate, rapid and objective tool to identify/quantify breast epithelial cells and differentiate tumor from normal breast tissue as well as normal from tumor-associated stroma, paving the way to the establishment of a potential complementary tool to ensure safe tumor margins.
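Leave-one-out cross-validation and the reported performance measures (sensitivity, specificity and MCC) can be sketched as below. A nearest-class-mean rule on a single made-up feature stands in for the PLS-DA model, so the numbers are purely illustrative.

```python
import math


def leave_one_out(samples, labels, fit, predict):
    """Predict each sample with a model fitted on all the other samples."""
    predictions = []
    for i in range(len(samples)):
        model = fit(samples[:i] + samples[i + 1:], labels[:i] + labels[i + 1:])
        predictions.append(predict(model, samples[i]))
    return predictions


def binary_metrics(labels, predictions, positive):
    """Return (sensitivity, specificity, Matthews correlation coefficient)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == positive and p == positive)
    fn = sum(1 for y, p in pairs if y == positive and p != positive)
    tn = sum(1 for y, p in pairs if y != positive and p != positive)
    fp = sum(1 for y, p in pairs if y != positive and p == positive)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return tp / (tp + fn), tn / (tn + fp), mcc


# Stand-in classifier (NOT PLS-DA): assign to the nearest class mean.
def fit_class_means(xs, ys):
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return {label: sum(v) / len(v) for label, v in groups.items()}


def predict_nearest_mean(model, x):
    return min(model, key=lambda label: abs(x - model[label]))


# Hypothetical one-dimensional spectral feature per sample.
samples = [0.9, 1.1, 1.0, 1.6, 2.0, 2.2, 1.9, 1.3]
labels = ["normal"] * 4 + ["tumor"] * 4

loo_predictions = leave_one_out(samples, labels, fit_class_means, predict_nearest_mean)
sensitivity, specificity, mcc = binary_metrics(labels, loo_predictions, "tumor")
print(loo_predictions, sensitivity, specificity, mcc)
```

When several spectra come from the same patient, leaving out one spectrum at a time can be optimistic; leaving out all spectra of one patient at a time is a common alternative.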
Posa, Mihalj; Pilipović, Ana; Lalić, Mladena; Popović, Jovan
2011-02-15
A linear dependence between temperature (t) and the retention coefficient (k, reversed phase HPLC) of bile acids was obtained. The parameters (a, intercept and b, slope) of the linear function k=f(t) correlate highly with the bile acids' structures. On a score plot of the principal component calculated from k=f(t), the investigated bile acids form linear congeneric groups that are in accordance with the conformations of the hydroxyl and oxo groups in the bile acid steroid skeleton. The partition coefficient (K(p)) of nitrazepam in bile acid micelles was investigated. Nitrazepam molecules incorporated in micelles show modified bioavailability (depot effect, higher permeability, etc.). Using the multiple linear regression (MLR) method, QSAR models of the nitrazepam partition coefficient, K(p), were derived at temperatures of 25°C and 37°C. The linear regression models at both temperatures include experimentally obtained lipophilicity parameters (PC1 from data k=f(t)) and in silico descriptors of molecular shape, while at the higher temperature molecular polarisation is also introduced. This indicates that the incorporation mechanism of nitrazepam in bile acid (BA) micelles changes at the higher temperature. QSAR models were derived using the partial least squares (PLS) method as well. The experimental parameters k=f(t) are shown to be significant predictive variables. Both QSAR models were validated using cross validation and an internal validation method. The PLS models have slightly higher predictive capability than the MLR models. Copyright © 2010 Elsevier B.V. All rights reserved.
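The intercept a and slope b of the linear function k=f(t) can be obtained by ordinary least squares, as in this minimal sketch. The temperatures and retention coefficients are invented for illustration and do not come from the study.

```python
def fit_line(t, k):
    """Least-squares fit of k = a + b * t; returns (a, b)."""
    n = len(t)
    t_mean = sum(t) / n
    k_mean = sum(k) / n
    slope = (sum((ti - t_mean) * (ki - k_mean) for ti, ki in zip(t, k))
             / sum((ti - t_mean) ** 2 for ti in t))
    intercept = k_mean - slope * t_mean
    return intercept, slope


# Hypothetical retention coefficients of one bile acid at four temperatures.
temperatures = [25.0, 30.0, 35.0, 40.0]   # degrees Celsius
retention = [4.0, 3.5, 3.0, 2.5]

a, b = fit_line(temperatures, retention)
print(a, b)
```

Fitting each bile acid separately gives one (a, b) pair per compound, and those pairs can then serve as descriptors in a principal component or regression analysis.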
Radioecological modelling of Polonium-210 and Caesium-137 in lichen-reindeer-man and top predators.
Persson, Bertil R R; Gjelsvik, Runhild; Holm, Elis
2018-06-01
This work deals with analysis and modelling of the radionuclides 210 Pb and 210 Po in the food-chain lichen-reindeer-man in addition to 210 Po and 137 Cs in top predators. By using the methods of Partial Least Square Regression (PLSR) the atmospheric deposition of 210 Pb and 210 Po is predicted at the sample locations. Dynamic modelling of the activity concentration with differential equations is fitted to the sample data. Reindeer lichen consumption, gastrointestinal absorption, organ distribution and elimination is derived from information in the literature. Dynamic modelling of transfer of 210 Pb and 210 Po to reindeer meat, liver and bone from lichen consumption, fitted well with data from Sweden and Finland from 1966 to 1971. The activity concentration of 210 Pb in the skeleton in man is modelled by using the results of studying the kinetics of lead in skeleton and blood in lead-workers after end of occupational exposure. The result of modelling 210 Pb and 210 Po activity in skeleton matched well with concentrations of 210 Pb and 210 Po in teeth from reindeer-breeders and autopsy bone samples in Finland. The results of 210 Po and 137 Cs in different tissues of wolf, wolverine and lynx previously published, are analysed with multivariate data processing methods such as Principal Component Analysis PCA, and modelled with the method of Projection to Latent Structures, PLS, or Partial Least Square Regression PLSR. Copyright © 2017 Elsevier Ltd. All rights reserved.
Shao, Yongni; Xie, Chuanqi; Jiang, Linjun; Shi, Jiahui; Zhu, Jiajin; He, Yong
2015-04-05
Visible/near infrared spectroscopy (Vis/NIR) based on sensitive wavelengths (SWs) and chemometrics was proposed to discriminate different tomatoes bred by spaceflight mutagenesis from their leafs or fruits (green or mature). The tomato breeds were mutant M1, M2 and their parent. Partial least squares (PLS) analysis and least squares-support vector machine (LS-SVM) were implemented for calibration models. PLS analysis was implemented for calibration models with different wavebands including the visible region (400-700 nm) and the near infrared region (700-1000 nm). The best PLS models were achieved in the visible region for the leaf and green fruit samples and in the near infrared region for the mature fruit samples. Furthermore, different latent variables (4-8 LVs for leafs, 5-9 LVs for green fruits, and 4-9 LVs for mature fruits) were used as inputs of LS-SVM to develop the LV-LS-SVM models with the grid search technique and radial basis function (RBF) kernel. The optimal LV-LS-SVM models were achieved with six LVs for the leaf samples, seven LVs for green fruits, and six LVs for mature fruits, respectively, and they outperformed the PLS models. Moreover, independent component analysis (ICA) was executed to select several SWs based on loading weights. The optimal LS-SVM model was achieved with SWs of 550-560 nm, 562-574 nm, 670-680 nm and 705-71 5 nm for the leaf samples; 548-556 nm, 559-564 nm, 678-685 nm and 962-974 nm for the green fruit samples; and 712-718 nm, 720-729 nm, 968-978 nm and 820-830 nm for the mature fruit samples. All of them had better performance than PLS and LV-LS-SVM, with the parameters of correlation coefficient (rp), root mean square error of prediction (RMSEP) and bias of 0.9792, 0.2632 and 0.0901 based on leaf discrimination, 0.9837, 0.2783 and 0.1758 based on green fruit discrimination, 0.9804, 0.2215 and -0.0035 based on mature fruit discrimination, respectively. 
The overall results indicated that ICA was an effective way to select SWs, and that Vis/NIR combined with LS-SVM models was able to predict the different breeds (mutant M1, mutant M2 and their parent) of tomatoes from leaves and fruits. Copyright © 2015 Elsevier B.V. All rights reserved.
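The three figures of merit quoted in the abstract above (rp, RMSEP and bias) can be illustrated with a minimal sketch. The function name `prediction_metrics`, the sign convention for bias (mean of predicted minus reference) and the divisor n in RMSEP are assumptions chosen for illustration; the abstract does not state which conventions the authors used, and the breed codes and predictions in the usage lines are invented placeholders, not data from the study.

```python
import math


def prediction_metrics(y_ref, y_pred):
    """Return (r_p, RMSEP, bias) for reference and predicted values.

    Assumed conventions (not taken from the abstract):
      bias  = mean(predicted - reference)
      RMSEP = sqrt(mean((predicted - reference)^2)), divisor n
      r_p   = Pearson correlation between reference and predicted
    """
    n = len(y_ref)
    if n != len(y_pred) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")

    residuals = [p - t for t, p in zip(y_ref, y_pred)]
    bias = sum(residuals) / n
    rmsep = math.sqrt(sum(r * r for r in residuals) / n)

    mean_ref = sum(y_ref) / n
    mean_pred = sum(y_pred) / n
    cov = sum((t - mean_ref) * (p - mean_pred) for t, p in zip(y_ref, y_pred))
    var_ref = sum((t - mean_ref) ** 2 for t in y_ref)
    var_pred = sum((p - mean_pred) ** 2 for p in y_pred)
    if var_ref == 0 or var_pred == 0:
        raise ValueError("correlation is undefined for constant input")
    r_p = cov / math.sqrt(var_ref * var_pred)

    return r_p, rmsep, bias


# Hypothetical usage: breeds coded 1, 2, 3 and invented model outputs.
reference = [1, 1, 2, 2, 3, 3]
predicted = [1.1, 0.8, 2.3, 1.9, 2.7, 3.2]
r_p, rmsep, bias = prediction_metrics(reference, predicted)
print(f"rp={r_p:.4f}  RMSEP={rmsep:.4f}  bias={bias:.4f}")
```

A bias close to zero with a non-zero RMSEP, as reported for the mature fruit samples, would indicate that the prediction errors are scattered around the reference values rather than shifted systematically in one direction.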
An Investigation into the Relationship between Human Cranial and Pelvic Sexual Dimorphism.
Best, Kaleigh C; Garvin, Heather M; Cabo, Luis L
2017-10-16
When faced with commingled remains, it might be assumed that a more "masculine" pelvis is associated with a more "masculine" cranium, but this relationship has not been specifically tested. This study uses geometric morphometric analyses of pelvic and cranial landmarks to assess whether there is an intra-individual relationship between the degrees of sexual expression in these two skeletal regions. Principal component and discriminant function scores were used to assess sexual dimorphism in 113 U.S. Black individuals. Correlation values and partial least squares regression (PLS) were used to evaluate intra-individual relationships. Results indicate that the os coxae is more sexually dimorphic than the cranium, with element shape being more sexually dimorphic than size. PLS and correlation results suggest no significant intra-individual relationship between pelvic and cranial sexual size or shape expression. Thus, in commingled situations, associations between these skeletal elements cannot be inferred based on degree of "masculinity." © 2017 American Academy of Forensic Sciences.
The Extent and Prediction of Heavy Metal Pollution in Soils of Shahrood and Damghan, Iran.
Sakizadeh, Mohamad; Mirzaei, Rouhollah; Ghorbani, Hadi
2015-12-01
The levels of 12 heavy metals (Ag, Ba, Be, Cd, Co, Cr, Cu, Ni, Pb, Tl, V, Zn) were considered in 229 soil samples in Semnan Province, Iran. Factor analysis was used to discriminate between natural and anthropogenic inputs of heavy metals. Seven factors accounting for 90.5 % of the total variance were extracted. Mining and agricultural activities, along with geogenic sources, were identified as the main causes of the levels of heavy metals in the study area. Partial least squares (PLS) regression was used to predict the soil pollution index (SPI) from the concentrations of the 12 heavy metals. The eigenvectors from the first three PLS components represented more than 98 % of the overall variance. The correlation coefficient between the observed and predicted SPI was 0.99, indicating the high efficiency of this method. The resultant coefficient of determination for three PLS components was 0.984, confirming the predictive ability of this method.
Carranco, Núria; Farrés-Cebrián, Mireia; Saurina, Javier
2018-01-01
High-performance liquid chromatography with ultraviolet detection (HPLC-UV) fingerprinting was applied to the analysis and characterization of olive oils. Separation was performed on a Zorbax Eclipse XDB-C8 reversed-phase column under gradient elution, with 0.1% formic acid aqueous solution and methanol as the mobile phase. More than 130 edible oils, including monovarietal extra-virgin olive oils (EVOOs) and other vegetable oils, were analyzed. Principal component analysis showed a noticeable discrimination between olive oils and other vegetable oils using raw HPLC-UV chromatographic profiles as data descriptors. However, selected HPLC-UV chromatographic time-window segments were necessary to achieve discrimination among monovarietal EVOOs. Partial least squares (PLS) regression was employed to tackle the authentication of Arbequina EVOO adulterated with Picual EVOO, a refined olive oil, and sunflower oil. Highly satisfactory results were obtained after PLS analysis, with overall errors in the quantitation of adulteration in the Arbequina EVOO (minimum 2.5% adulterant) below 2.9%. PMID:29561820