NASA Astrophysics Data System (ADS)
Shan, Jiajia; Wang, Xue; Zhou, Hao; Han, Shuqing; Riza, Dimas Firmanda Al; Kondo, Naoshi
2018-04-01
Synchronous fluorescence spectra combined with multivariate analysis were used to predict the flavonoid content of green tea rapidly and nondestructively. This paper presented a new and efficient spectral interval selection method, clustering-based partial least squares (CL-PLS), which selects informative wavelengths by combining clustering with partial least squares (PLS) to improve the performance of models built on synchronous fluorescence spectra. The fluorescence spectra of tea samples were obtained, k-means and Kohonen self-organizing map clustering algorithms were used to group the full spectra into several clusters, and a sub-PLS regression model was developed on each cluster. Finally, CL-PLS models consisting of gradually selected clusters were built. The correlation coefficient (R) was used to evaluate the prediction performance of the PLS models. In addition, variable influence on projection PLS (VIP-PLS), selectivity ratio PLS (SR-PLS), interval PLS (iPLS) and full-spectrum PLS models were investigated and the results were compared. CL-PLS gave the best flavonoid predictions from synchronous fluorescence spectra.
Peng, Jiangtao; Peng, Silong; Xie, Qiong; Wei, Jiping
2011-04-01
In order to eliminate the lower order polynomial interferences, a new quantitative calibration algorithm "Baseline Correction Combined Partial Least Squares (BCC-PLS)", which combines baseline correction and conventional PLS, is proposed. By embedding baseline correction constraints into PLS weights selection, the proposed calibration algorithm overcomes the uncertainty in baseline correction and can meet the requirement of on-line attenuated total reflectance Fourier transform infrared (ATR-FTIR) quantitative analysis. The effectiveness of the algorithm is evaluated by the analysis of glucose and marzipan ATR-FTIR spectra. BCC-PLS algorithm shows improved prediction performance over PLS. The root mean square error of cross-validation (RMSECV) on marzipan spectra for the prediction of the moisture is found to be 0.53%, w/w (range 7-19%). The sugar content is predicted with a RMSECV of 2.04%, w/w (range 33-68%). Copyright © 2011 Elsevier B.V. All rights reserved.
Cao, Hui; Li, Yao-Jiang; Zhou, Yan; Wang, Yan-Xia
2014-11-01
To deal with the nonlinear characteristics of spectral data from thermal power plant flue gas, a nonlinear partial least squares (PLS) method with a neural-network-based internal model is adopted in this paper. The latent variables of the independent and dependent variables are first extracted by PLS regression and then used as the inputs and outputs, respectively, of a neural network that is trained to form the nonlinear internal model. For flue gas spectra from the thermal power plant, PLS, nonlinear PLS with a back propagation neural network internal model (BP-NPLS), nonlinear PLS with a radial basis function neural network internal model (RBF-NPLS) and nonlinear PLS with an adaptive fuzzy inference system internal model (ANFIS-NPLS) are compared. Relative to PLS, the root mean square error of prediction (RMSEP) of BP-NPLS, RBF-NPLS and ANFIS-NPLS is reduced by 16.96%, 16.60% and 19.55% for sulfur dioxide, by 8.60%, 8.47% and 10.09% for nitric oxide, and by 2.11%, 3.91% and 3.97% for nitrogen dioxide, respectively. The experimental results show that nonlinear PLS is more suitable than PLS for the quantitative analysis of flue gas. Moreover, because the neural network can closely approximate nonlinear characteristics, the nonlinear PLS method with the internal model described in this paper has good predictive capability and robustness, and can to some extent overcome the limitations of nonlinear PLS methods with other internal models such as polynomial and spline functions. ANFIS-NPLS performs best, since its adaptive fuzzy inference system internal model can learn more and reduce the residuals effectively. Hence, ANFIS-NPLS is an accurate and useful method for the quantitative analysis of thermal power plant flue gas.
Xie, Chuanqi; He, Yong
2016-01-01
This study was carried out to use hyperspectral imaging technique for determining color (L*, a* and b*) and eggshell strength and identifying cracked chicken eggs. Partial least squares (PLS) models based on full and selected wavelengths suggested by regression coefficient (RC) method were established to predict the four parameters, respectively. Partial least squares-discriminant analysis (PLS-DA) and RC-partial least squares-discriminant analysis (RC-PLS-DA) models were applied to identify cracked eggs. PLS models performed well with the correlation coefficient (rp) of 0.788 for L*, 0.810 for a*, 0.766 for b* and 0.835 for eggshell strength. RC-PLS models also obtained the rp of 0.771 for L*, 0.806 for a*, 0.767 for b* and 0.841 for eggshell strength. The classification results were 97.06% in PLS-DA model and 88.24% in RC-PLS-DA model. It demonstrated that hyperspectral imaging technique has the potential to be used to detect color and eggshell strength values and identify cracked chicken eggs. PMID:26882990
Divya, O; Mishra, Ashok K
2007-05-29
Quantitative determination of kerosene fraction present in diesel has been carried out based on excitation emission matrix fluorescence (EEMF) along with parallel factor analysis (PARAFAC) and N-way partial least squares regression (N-PLS). EEMF is a simple, sensitive and nondestructive method suitable for the analysis of multifluorophoric mixtures. Calibration models consisting of varying compositions of diesel and kerosene were constructed and their validation was carried out using leave-one-out cross validation method. The accuracy of the model was evaluated through the root mean square error of prediction (RMSEP) for the PARAFAC, N-PLS and unfold PLS methods. N-PLS was found to be a better method compared to PARAFAC and unfold PLS method because of its low RMSEP values.
Local classification: Locally weighted-partial least squares-discriminant analysis (LW-PLS-DA).
Bevilacqua, Marta; Marini, Federico
2014-08-01
The possibility of devising a simple, flexible and accurate non-linear classification method, by extending the locally weighted partial least squares (LW-PLS) approach to the cases where the algorithm is used in a discriminant way (partial least squares discriminant analysis, PLS-DA), is presented. In particular, to assess which category an unknown sample belongs to, the proposed algorithm operates by identifying which training objects are most similar to the one to be predicted and building a PLS-DA model using these calibration samples only. Moreover, the influence of the selected training samples on the local model can be further modulated by adopting a not uniform distance-based weighting scheme which allows the farthest calibration objects to have less impact than the closest ones. The performances of the proposed locally weighted-partial least squares-discriminant analysis (LW-PLS-DA) algorithm have been tested on three simulated data sets characterized by a varying degree of non-linearity: in all cases, a classification accuracy higher than 99% on external validation samples was achieved. Moreover, when also applied to a real data set (classification of rice varieties), characterized by a high extent of non-linearity, the proposed method provided an average correct classification rate of about 93% on the test set. By the preliminary results, showed in this paper, the performances of the proposed LW-PLS-DA approach have proved to be comparable and in some cases better than those obtained by other non-linear methods (k nearest neighbors, kernel-PLS-DA and, in the case of rice, counterpropagation neural networks). Copyright © 2014 Elsevier B.V. All rights reserved.
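The local step described in this abstract (pick the training objects most similar to the query, then down-weight the farther ones) can be sketched as follows. This is only an illustration under stated assumptions: Euclidean distance and a tricube weighting function are choices made here, not details taken from the paper, and `local_neighbours` is a hypothetical helper name. The PLS-DA model that would then be fitted on the selected, weighted samples is omitted.

```python
import numpy as np

def local_neighbours(X_train, x_query, k):
    """Return indices of the k nearest training objects and their weights.

    Assumptions (not from the paper): Euclidean distance, tricube weights.
    """
    X_train = np.asarray(X_train, dtype=float)
    x_query = np.asarray(x_query, dtype=float)
    # Distance from the query to every training object
    dist = np.linalg.norm(X_train - x_query, axis=1)
    # Keep only the k closest objects for the local model
    idx = np.argsort(dist)[:k]
    d_max = dist[idx].max()
    if d_max == 0:
        weights = np.ones(k)
    else:
        # Tricube kernel: weight 1 at distance 0, falling to 0 at d_max
        weights = (1.0 - (dist[idx] / d_max) ** 3) ** 3
    return idx, weights
```

With tricube weights the farthest selected object receives weight zero, so in practice one might scale by a distance slightly larger than `d_max`; that choice is left open here.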
NASA Astrophysics Data System (ADS)
Hart, Brian K.; Griffiths, Peter R.
1998-06-01
Partial least squares (PLS) regression has been evaluated as a robust calibration technique for over 100 hazardous air pollutants (HAPs) measured by open path Fourier transform infrared (OP/FT-IR) spectrometry. PLS has the advantage over the current recommended calibration method of classical least squares (CLS), in that it can look at the whole useable spectrum (700-1300 cm-1, 2000-2150 cm-1, and 2400-3000 cm-1), and detect several analytes simultaneously. Up to one hundred HAPs synthetically added to OP/FT-IR backgrounds have been simultaneously calibrated and detected using PLS. PLS also has the advantage in requiring less preprocessing of spectra than that which is required in CLS calibration schemes, allowing PLS to provide user independent real-time analysis of OP/FT-IR spectra.
Partial least squares (PLS) analysis offers a number of advantages over the more traditionally used regression analyses applied in landscape ecology, particularly for determining the associations among multiple constituents of surface water and landscape configuration. Common dat...
NASA Astrophysics Data System (ADS)
Talebpour, Zahra; Tavallaie, Roya; Ahmadi, Seyyed Hamid; Abdollahpour, Assem
2010-09-01
In this study, a new method for the simultaneous determination of penicillin G salts in a pharmaceutical mixture by FT-IR spectroscopy combined with chemometrics was investigated. The mixture of penicillin G salts is a complex system because its components have similar analytical characteristics. Partial least squares (PLS) and radial basis function-partial least squares (RBF-PLS) were used to model the linear and nonlinear relations, respectively, between spectra and components. The orthogonal signal correction (OSC) preprocessing method was used to correct unwanted information, such as spectral overlapping and scattering effects. To compare the influence of OSC on the PLS and RBF-PLS models, the optimal linear (PLS) and nonlinear (RBF-PLS) models based on conventional and OSC-preprocessed spectra were established and compared. The results demonstrated that OSC clearly enhanced the performance of both the RBF-PLS and PLS calibration models. Where the relation between spectra and component was somewhat nonlinear, OSC-RBF-PLS gave more satisfactory results than the OSC-PLS model, which indicated that OSC helped to remove extrinsic deviations from linearity without eliminating the nonlinear information related to the component. The chemometric models were tested on an external dataset and finally applied to the analysis of a commercial injection product of penicillin G salts.
Song, Weiran; Wang, Hui; Maguire, Paul; Nibouche, Omar
2018-06-07
Partial Least Squares Discriminant Analysis (PLS-DA) is one of the most effective multivariate analysis methods for spectral data analysis, which extracts latent variables and uses them to predict responses. In particular, it is an effective method for handling high-dimensional and collinear spectral data. However, PLS-DA does not explicitly address data multimodality, i.e., within-class multimodal distribution of data. In this paper, we present a novel method termed nearest clusters based PLS-DA (NCPLS-DA) for addressing the multimodality and nonlinearity issues explicitly and improving the performance of PLS-DA on spectral data classification. The new method applies hierarchical clustering to divide samples into clusters and calculates the corresponding centre of every cluster. For a given query point, only clusters whose centres are nearest to such a query point are used for PLS-DA. Such a method can provide a simple and effective tool for separating multimodal and nonlinear classes into clusters which are locally linear and unimodal. Experimental results on 17 datasets, including 12 UCI and 5 spectral datasets, show that NCPLS-DA can outperform 4 baseline methods, namely, PLS-DA, kernel PLS-DA, local PLS-DA and k-NN, achieving the highest classification accuracy most of the time. Copyright © 2018 Elsevier B.V. All rights reserved.
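The query-time selection described above (compute a centre per cluster, then keep only the clusters whose centres are nearest to the query) can be sketched as below. The cluster labels are assumed to have been produced already by some hierarchical clustering routine; the function name `nearest_cluster_samples`, the Euclidean distance and the parameter `n_clusters_used` are illustrative assumptions, not the authors' implementation. Fitting PLS-DA on the returned samples is not shown.

```python
import numpy as np

def nearest_cluster_samples(X, labels, x_query, n_clusters_used):
    """Select the training samples belonging to the clusters nearest a query.

    labels: precomputed cluster label per row of X (assumed given).
    Returns (sample indices, chosen cluster ids ordered nearest first).
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    x_query = np.asarray(x_query, dtype=float)
    cluster_ids = np.unique(labels)
    # Centre of each cluster = mean of its member samples
    centres = np.array([X[labels == c].mean(axis=0) for c in cluster_ids])
    # Rank clusters by distance from their centre to the query point
    dist = np.linalg.norm(centres - x_query, axis=1)
    chosen = cluster_ids[np.argsort(dist)[:n_clusters_used]]
    # Samples of the chosen clusters form the local training set
    sample_idx = np.flatnonzero(np.isin(labels, chosen))
    return sample_idx, chosen
```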
Partial least squares (PLS) analysis offers a number of advantages over the more traditionally used regression analyses applied in landscape ecology to study the associations among constituents of surface water and landscapes. Common data problems in ecological studies include: s...
NASA Astrophysics Data System (ADS)
Meksiarun, Phiranuphon; Ishigaki, Mika; Huck-Pezzei, Verena A. C.; Huck, Christian W.; Wongravee, Kanet; Sato, Hidetoshi; Ozaki, Yukihiro
2017-03-01
This study aimed to extract the paraffin component from paraffin-embedded oral cancer tissue spectra using three multivariate analysis (MVA) methods; Independent Component Analysis (ICA), Partial Least Squares (PLS) and Independent Component - Partial Least Square (IC-PLS). The estimated paraffin components were used for removing the contribution of paraffin from the tissue spectra. These three methods were compared in terms of the efficiency of paraffin removal and the ability to retain the tissue information. It was found that ICA, PLS and IC-PLS could remove the paraffin component from the spectra at almost the same level while Principal Component Analysis (PCA) was incapable. In terms of retaining cancer tissue spectral integrity, effects of PLS and IC-PLS on the non-paraffin region were significantly less than that of ICA where cancer tissue spectral areas were deteriorated. The paraffin-removed spectra were used for constructing Raman images of oral cancer tissue and compared with Hematoxylin and Eosin (H&E) stained tissues for verification. This study has demonstrated the capability of Raman spectroscopy together with multivariate analysis methods as a diagnostic tool for the paraffin-embedded tissue section.
Kernel analysis of partial least squares (PLS) regression models.
Shinzawa, Hideyuki; Ritthiruangdej, Pitiporn; Ozaki, Yukihiro
2011-05-01
An analytical technique based on kernel matrix representation is demonstrated to provide further chemically meaningful insight into partial least squares (PLS) regression models. The kernel matrix condenses essential information about scores derived from PLS or principal component analysis (PCA). Thus, it becomes possible to establish the proper interpretation of the scores. A PLS model for the total nitrogen (TN) content in multiple Thai fish sauces is built with a set of near-infrared (NIR) transmittance spectra of the fish sauce samples. The kernel analysis of the scores effectively reveals that the variation of the spectral feature induced by the change in protein content is substantially associated with the total water content and the protein hydration. Kernel analysis is also carried out on a set of time-dependent infrared (IR) spectra representing transient evaporation of ethanol from a binary mixture solution of ethanol and oleic acid. A PLS model to predict the elapsed time is built with the IR spectra and the kernel matrix is derived from the scores. The detailed analysis of the kernel matrix provides penetrating insight into the interaction between the ethanol and the oleic acid.
NASA Astrophysics Data System (ADS)
Chen, Hui; Tan, Chao; Lin, Zan; Wu, Tong
2018-01-01
Milk is among the most popular nutrient sources worldwide and is of great interest because of its beneficial medicinal properties. The feasibility of classifying milk powder samples by brand and of determining their protein concentration was investigated using NIR spectroscopy along with chemometrics. Two datasets were prepared for the experiment: one contains 179 samples of four brands for classification, and the other contains 30 samples for quantitative analysis. Principal component analysis (PCA) was used for exploratory analysis. Based on an effective model-independent variable selection method, minimal-redundancy maximal-relevance (MRMR), only 18 variables were selected to construct a partial least squares discriminant analysis (PLS-DA) model. On the test set, the PLS-DA model based on the selected variable set was compared with the full-spectrum PLS-DA model, and both achieved 100% accuracy. In quantitative analysis, the partial least squares regression (PLSR) model built on the selected subset of 260 variables significantly outperforms the full-spectrum model. The combination of NIR spectroscopy, MRMR and PLS-DA or PLSR appears to be a powerful tool for classifying different brands of milk powder and determining the protein content.
Kumar, Keshav; Mishra, Ashok Kumar
2015-07-01
The fluorescence characteristics of 8-anilinonaphthalene-1-sulfonic acid (ANS) in ethanol-water mixtures, in combination with partial least squares (PLS) analysis, were used to propose a simple and sensitive analytical procedure for monitoring the adulteration of ethanol by water. The proposed procedure was found to be capable of detecting even small levels of adulteration of ethanol by water. The robustness of the procedure is evident from statistical parameters such as the square of the correlation coefficient (R²), the root mean square error of calibration (RMSEC) and the root mean square error of prediction (RMSEP), which were found to be well within the acceptable limits.
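The figures of merit named in this abstract can be computed as in the sketch below. It assumes the common definitions: RMSEC is the root mean square error on the calibration samples, RMSEP is the same quantity on independent prediction samples, and R² is the squared Pearson correlation between reference and predicted values. Some software divides RMSEC by degrees of freedom rather than by the sample count; that variant is not shown. The function names are illustrative.

```python
import math

def rmse(y_ref, y_pred):
    """Root mean square error; call it RMSEC on calibration data
    and RMSEP on independent prediction data."""
    n = len(y_ref)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_ref, y_pred)) / n)

def squared_correlation(y_ref, y_pred):
    """Square of the Pearson correlation between reference and predicted values."""
    n = len(y_ref)
    mean_ref = sum(y_ref) / n
    mean_pred = sum(y_pred) / n
    s_rp = sum((a - mean_ref) * (b - mean_pred) for a, b in zip(y_ref, y_pred))
    s_rr = sum((a - mean_ref) ** 2 for a in y_ref)
    s_pp = sum((b - mean_pred) ** 2 for b in y_pred)
    r = s_rp / math.sqrt(s_rr * s_pp)
    return r * r
```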
Kabir, Alamgir; Rahman, Md. Jahanur; Shamim, Abu Ahmed; Klemm, Rolf D. W.; Labrique, Alain B.; Rashid, Mahbubur; Christian, Parul; West, Keith P.
2017-01-01
Birth weight, length and circumferences of the head, chest and arm are key measures of newborn size and health in developing countries. We assessed maternal socio-demographic factors associated with multiple measures of newborn size in a large rural population in Bangladesh using partial least squares (PLS) regression method. PLS regression, combining features from principal component analysis and multiple linear regression, is a multivariate technique with an ability to handle multicollinearity while simultaneously handling multiple dependent variables. We analyzed maternal and infant data from singletons (n = 14,506) born during a double-masked, cluster-randomized, placebo-controlled maternal vitamin A or β-carotene supplementation trial in rural northwest Bangladesh. PLS regression results identified numerous maternal factors (parity, age, early pregnancy MUAC, living standard index, years of education, number of antenatal care visits, preterm delivery and infant sex) significantly (p<0.001) associated with newborn size. Among them, preterm delivery had the largest negative influence on newborn size (Standardized β = -0.29 − -0.19; p<0.001). Scatter plots of the scores of first two PLS components also revealed an interaction between newborn sex and preterm delivery on birth size. PLS regression was found to be more parsimonious than both ordinary least squares regression and principal component regression. It also provided more stable estimates than the ordinary least squares regression and provided the effect measure of the covariates with greater accuracy as it accounts for the correlation among the covariates and outcomes. Therefore, PLS regression is recommended when either there are multiple outcome measurements in the same study, or the covariates are correlated, or both situations exist in a dataset. PMID:29261760
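To make concrete how PLS regression copes with correlated covariates, here is a minimal single-response PLS (PLS1) sketch using the standard NIPALS-style deflation. It is an illustration only, not the software used in the study (which also handled several outcomes at once); the function names `pls1_fit` and `pls1_predict` are hypothetical.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model; returns (beta, x_mean, y_mean)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                 # direction of maximal covariance with y
        w = w / np.linalg.norm(w)
        t = Xk @ w                    # scores (latent variable)
        tt = t @ t
        p = Xk.T @ t / tt             # X loadings
        c = yk @ t / tt               # regression of y on the scores
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - c * t               # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    # Regression coefficients expressed in the original (centred) variables
    beta = W @ np.linalg.solve(P.T @ W, q)
    return beta, x_mean, y_mean

def pls1_predict(X, beta, x_mean, y_mean):
    """Predict the response for new rows of X."""
    return (np.asarray(X, dtype=float) - x_mean) @ beta + y_mean
```

Because each component is a linear combination of all covariates, the fit stays well defined even when the columns of `X` are highly collinear, which is where ordinary least squares becomes unstable.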
Lee, Soo Yee; Mediani, Ahmed; Maulidiani, Maulidiani; Khatib, Alfi; Ismail, Intan Safinar; Zawawi, Norhasnida; Abas, Faridah
2018-01-01
Neptunia oleracea is a plant consumed as a vegetable that has also been used as a folk remedy for several diseases. Herein, two regression models (partial least squares, PLS; and random forest, RF) in a metabolomics approach were compared and applied to the evaluation of the relationship between phenolics and bioactivities of N. oleracea. In addition, the effects of different extraction conditions on the phenolic constituents were assessed by pattern recognition analysis. Comparison of PLS and RF showed that RF exhibited poorer generalization and hence poorer predictive performance. Both the regression coefficients of PLS and the variable importance of RF revealed that quercetin and kaempferol derivatives, caffeic acid and vitexin-2-O-rhamnoside were significant towards the tested bioactivities. Furthermore, principal component analysis (PCA) and partial least squares-discriminant analysis (PLS-DA) results showed that sonication and absolute ethanol are the preferable extraction method and ethanol ratio, respectively, to produce N. oleracea extracts with high phenolic levels and therefore high DPPH scavenging and α-glucosidase inhibitory activities. Both PLS and RF are useful regression models in metabolomics studies. This work provides insight into the performance of different multivariate data analysis tools and the effects of different extraction conditions on the extraction of desired phenolics from plants. © 2017 Society of Chemical Industry.
NASA Astrophysics Data System (ADS)
Darwish, Hany W.; Hassan, Said A.; Salem, Maissa Y.; El-Zeany, Badr A.
2014-03-01
Different chemometric models were applied for the quantitative analysis of Amlodipine (AML), Valsartan (VAL) and Hydrochlorothiazide (HCT) in ternary mixture, namely, Partial Least Squares (PLS) as traditional chemometric model and Artificial Neural Networks (ANN) as advanced model. PLS and ANN were applied with and without variable selection procedure (Genetic Algorithm GA) and data compression procedure (Principal Component Analysis PCA). The chemometric methods applied are PLS-1, GA-PLS, ANN, GA-ANN and PCA-ANN. The methods were used for the quantitative analysis of the drugs in raw materials and pharmaceutical dosage form via handling the UV spectral data. A 3-factor 5-level experimental design was established resulting in 25 mixtures containing different ratios of the drugs. Fifteen mixtures were used as a calibration set and the other ten mixtures were used as validation set to validate the prediction ability of the suggested methods. The validity of the proposed methods was assessed using the standard addition technique.
Lu, Yuzhen; Du, Changwen; Yu, Changbing; Zhou, Jianmin
2014-08-01
Fast and non-destructive determination of rapeseed protein content carries significant implications for rapeseed production. This study presented the first attempt to use Fourier transform mid-infrared photoacoustic spectroscopy (FTIR-PAS) to quantify the protein content of rapeseed. The full-spectrum model was first built using partial least squares (PLS). Interval selection methods including interval partial least squares (iPLS), synergy interval partial least squares (siPLS), backward elimination interval partial least squares (biPLS) and dynamic backward elimination interval partial least squares (dyn-biPLS) were then employed to select the relevant band or band combination for PLS modeling. The full-spectrum PLS model achieved a ratio of prediction to deviation (RPD) of 2.047. In comparison, all interval selection methods produced better results than full-spectrum modeling. siPLS achieved the best predictive accuracy, with an RPD of 3.215, when the spectrum was sectioned into 25 intervals and two intervals (1198-1335 and 1614-1753 cm-1) were selected. iPLS outperformed biPLS and dyn-biPLS, and dyn-biPLS performed slightly better than biPLS. FTIR-PAS was verified as a promising analytical tool to quantify rapeseed protein content. Interval selection could extract the relevant individual band or synergy band associated with the sample constituent of interest, and thereby improve on the prediction accuracy of the full-spectrum model. © 2013 Society of Chemical Industry.
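The basic interval idea behind iPLS can be sketched as follows: split the spectral axis into equal-width intervals, build a PLS model on each interval alone, and keep the interval with the lowest cross-validated error. The sketch assumes leave-one-out cross-validation and a compact PLS1 routine; the function names and defaults are illustrative, and the combinatorial (siPLS) and backward-elimination (biPLS, dyn-biPLS) variants are not shown.

```python
import numpy as np

def pls1_coefficients(Xc, yc, n_components):
    """PLS1 regression coefficients for already-centred Xc and yc."""
    Xk, yk = Xc.copy(), yc.copy()
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w = w / np.linalg.norm(w)
        t = Xk @ w
        tt = t @ t
        p = Xk.T @ t / tt
        c = yk @ t / tt
        Xk = Xk - np.outer(t, p)
        yk = yk - c * t
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return W @ np.linalg.solve(P.T @ W, q)

def loo_rmsecv(X, y, n_components):
    """Leave-one-out root mean square error of cross-validation."""
    errors = []
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        x_mean, y_mean = X[keep].mean(axis=0), y[keep].mean()
        beta = pls1_coefficients(X[keep] - x_mean, y[keep] - y_mean, n_components)
        errors.append(y[i] - ((X[i] - x_mean) @ beta + y_mean))
    return float(np.sqrt(np.mean(np.square(errors))))

def ipls_select(X, y, n_intervals, n_components=1):
    """Return (best interval number, its column indices, RMSECV per interval)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Contiguous blocks of columns, as equal in width as possible
    intervals = np.array_split(np.arange(X.shape[1]), n_intervals)
    scores = [loo_rmsecv(X[:, idx], y, n_components) for idx in intervals]
    best = int(np.argmin(scores))
    return best, intervals[best], scores
```

In practice the number of components would be tuned per interval rather than fixed, which this sketch leaves out for brevity.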
Kernel PLS-SVC for Linear and Nonlinear Discrimination
NASA Technical Reports Server (NTRS)
Rosipal, Roman; Trejo, Leonard J.; Matthews, Bryan
2003-01-01
A new methodology for discrimination is proposed. It is based on kernel orthonormalized partial least squares (PLS) dimensionality reduction of the original data space, followed by support vector machines for classification. The close connection of orthonormalized PLS with Fisher's approach to linear discrimination, or equivalently with canonical correlation analysis, is described. This favors the use of orthonormalized PLS over principal component analysis. Good behavior of the proposed method is demonstrated on 13 different benchmark data sets and on the real-world problem of classifying finger movement periods versus non-movement periods based on electroencephalograms.
da Silva, Fabiana E B; Flores, Érico M M; Parisotto, Graciele; Müller, Edson I; Ferrão, Marco F
2016-03-01
An alternative method for the quantification of sulphamethoxazole (SMZ) and trimethoprim (TMP) using diffuse reflectance infrared Fourier-transform spectroscopy (DRIFTS) and partial least squares regression (PLS) was developed. Interval partial least squares (iPLS) and synergy interval partial least squares (siPLS) were applied to select a spectral range that provided the lowest prediction error in comparison to the full-spectrum model. Fifteen commercial tablet formulations and forty-nine synthetic samples were used. The concentration ranges considered were 400 to 900 mg g-1 SMZ and 80 to 240 mg g-1 TMP. Spectral data were recorded by DRIFTS between 600 and 4000 cm-1 with a 4 cm-1 resolution. The proposed procedure was compared to high performance liquid chromatography (HPLC). The root mean square error of prediction (RMSEP) obtained during validation of the siPLS models for SMZ and TMP demonstrates that this approach is a valid technique for the quantitative analysis of pharmaceutical formulations. The interval selection algorithm allowed regression models to be built with smaller errors than the full-spectrum PLS model. An RMSEP of 13.03 mg g-1 for SMZ and 4.88 mg g-1 for TMP was obtained after selection of the best spectral regions by siPLS.
Zhang, Xue-Xi; Yin, Jian-Hua; Mao, Zhi-Hua; Xia, Yang
2015-01-01
Fourier transform infrared imaging (FTIRI) combined with chemometric algorithms has strong potential to obtain complex chemical information from biological tissues. FTIRI and partial least squares-discriminant analysis (PLS-DA) were used to differentiate healthy and osteoarthritic (OA) cartilages for the first time. A PLS model was built on a calibration matrix of spectra randomly selected from the FTIRI spectral datasets of healthy and lesioned cartilage. Leave-one-out cross-validation was performed on the PLS model, and the fitting coefficient between actual and predicted categorical values of the calibration matrix reached 0.95. In the calibration and prediction matrices, the percentages of healthy and lesioned cartilage spectra correctly identified were 100% and 90.24%, respectively. These results demonstrated that FTIRI combined with PLS-DA could provide a promising approach for the categorical identification of healthy and OA cartilage specimens. PMID:26057029
Katsarov, Plamen; Gergov, Georgi; Alin, Aylin; Pilicheva, Bissera; Al-Degs, Yahya; Simeonov, Vasil; Kassarova, Margarita
2018-03-01
The prediction power of partial least squares (PLS) and multivariate curve resolution-alternating least squares (MCR-ALS) methods have been studied for simultaneous quantitative analysis of the binary drug combination - doxylamine succinate and pyridoxine hydrochloride. Analysis of first-order UV overlapped spectra was performed using different PLS models - classical PLS1 and PLS2 as well as partial robust M-regression (PRM). These linear models were compared to MCR-ALS with equality and correlation constraints (MCR-ALS-CC). All techniques operated within the full spectral region and extracted maximum information for the drugs analysed. The developed chemometric methods were validated on external sample sets and were applied to the analyses of pharmaceutical formulations. The obtained statistical parameters were satisfactory for calibration and validation sets. All developed methods can be successfully applied for simultaneous spectrophotometric determination of doxylamine and pyridoxine both in laboratory-prepared mixtures and commercial dosage forms.
NASA Astrophysics Data System (ADS)
Luna, Aderval S.; Gonzaga, Fabiano B.; da Rocha, Werickson F. C.; Lima, Igor C. A.
2018-01-01
Laser-induced breakdown spectroscopy (LIBS) analysis was carried out on eleven steel samples to quantify the concentrations of chromium, nickel, and manganese. LIBS spectral data were correlated to known concentrations of the samples using different strategies in partial least squares (PLS) regression models. For the PLS analysis, one predictive model was separately generated for each element, while different approaches were used for the selection of variables (VIP: variable importance in projection and iPLS: interval partial least squares) in the PLS model to quantify the contents of the elements. The comparison of the performance of the models showed that there was no significant statistical difference using the Wilcoxon signed rank test. The elliptical joint confidence region (EJCR) did not detect systematic errors in these proposed methodologies for each metal.
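For readers unfamiliar with the VIP criterion mentioned above, one commonly used definition for a single-response PLS model is given below. The notation is an assumption introduced here, not taken from the abstract: p is the number of spectral variables, A the number of latent variables, w_a the weight vector, t_a the score vector, and q_a the y-loading of component a.

```latex
\mathrm{VIP}_j
  = \sqrt{\,p \;
      \frac{\sum_{a=1}^{A} q_a^{2}\,\mathbf{t}_a^{\top}\mathbf{t}_a
            \left( w_{ja} / \lVert \mathbf{w}_a \rVert \right)^{2}}
           {\sum_{a=1}^{A} q_a^{2}\,\mathbf{t}_a^{\top}\mathbf{t}_a}}
```

Because the squared VIP scores average to one over all variables, a threshold near 1 is a frequent rule of thumb for retaining a variable; the abstract does not state which cutoff the authors used.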
Rapid and Simultaneous Prediction of Eight Diesel Quality Parameters through ATR-FTIR Analysis.
Nespeca, Maurilio Gustavo; Hatanaka, Rafael Rodrigues; Flumignan, Danilo Luiz; de Oliveira, José Eduardo
2018-01-01
Quality assessment of diesel fuel is highly necessary for society, but the costs and time spent are very high while using standard methods. Therefore, this study aimed to develop an analytical method capable of simultaneously determining eight diesel quality parameters (density; flash point; total sulfur content; distillation temperatures at 10% (T10), 50% (T50), and 85% (T85) recovery; cetane index; and biodiesel content) through attenuated total reflection Fourier transform infrared (ATR-FTIR) spectroscopy and the multivariate regression method, partial least square (PLS). For this purpose, the quality parameters of 409 samples were determined using standard methods, and their spectra were acquired in ranges of 4000–650 cm⁻¹. The use of the multivariate filters, generalized least squares weighting (GLSW) and orthogonal signal correction (OSC), was evaluated to improve the signal-to-noise ratio of the models. Likewise, four variable selection approaches were tested: manual exclusion, forward interval PLS (FiPLS), backward interval PLS (BiPLS), and genetic algorithm (GA). The multivariate filters and variables selection algorithms generated more fitted and accurate PLS models. According to the validation, the FTIR/PLS models presented accuracy comparable to the reference methods and, therefore, the proposed method can be applied in the diesel routine monitoring to significantly reduce costs and analysis time.
Rapid and Simultaneous Prediction of Eight Diesel Quality Parameters through ATR-FTIR Analysis
Hatanaka, Rafael Rodrigues; Flumignan, Danilo Luiz; de Oliveira, José Eduardo
2018-01-01
Quality assessment of diesel fuel is highly necessary for society, but the costs and time spent are very high while using standard methods. Therefore, this study aimed to develop an analytical method capable of simultaneously determining eight diesel quality parameters (density; flash point; total sulfur content; distillation temperatures at 10% (T10), 50% (T50), and 85% (T85) recovery; cetane index; and biodiesel content) through attenuated total reflection Fourier transform infrared (ATR-FTIR) spectroscopy and the multivariate regression method, partial least square (PLS). For this purpose, the quality parameters of 409 samples were determined using standard methods, and their spectra were acquired in ranges of 4000–650 cm−1. The use of the multivariate filters, generalized least squares weighting (GLSW) and orthogonal signal correction (OSC), was evaluated to improve the signal-to-noise ratio of the models. Likewise, four variable selection approaches were tested: manual exclusion, forward interval PLS (FiPLS), backward interval PLS (BiPLS), and genetic algorithm (GA). The multivariate filters and variables selection algorithms generated more fitted and accurate PLS models. According to the validation, the FTIR/PLS models presented accuracy comparable to the reference methods and, therefore, the proposed method can be applied in the diesel routine monitoring to significantly reduce costs and analysis time. PMID:29629209
Kuligowski, Julia; Carrión, David; Quintás, Guillermo; Garrigues, Salvador; de la Guardia, Miguel
2011-01-01
The selection of an appropriate calibration set is a critical step in multivariate method development. In this work, the effect of using different calibration sets, based on a previous classification of unknown samples, on the partial least squares (PLS) regression model performance has been discussed. As an example, attenuated total reflection (ATR) mid-infrared spectra of deep-fried vegetable oil samples from three botanical origins (olive, sunflower, and corn oil), with increasing polymerized triacylglyceride (PTG) content induced by a deep-frying process were employed. The use of a one-class-classifier partial least squares-discriminant analysis (PLS-DA) and a rooted binary directed acyclic graph tree provided accurate oil classification. Oil samples fried without foodstuff could be classified correctly, independent of their PTG content. However, class separation of oil samples fried with foodstuff, was less evident. The combined use of double-cross model validation with permutation testing was used to validate the obtained PLS-DA classification models, confirming the results. To discuss the usefulness of the selection of an appropriate PLS calibration set, the PTG content was determined by calculating a PLS model based on the previously selected classes. In comparison to a PLS model calculated using a pooled calibration set containing samples from all classes, the root mean square error of prediction could be improved significantly using PLS models based on the selected calibration sets using PLS-DA, ranging between 1.06 and 2.91% (w/w).
ERIC Educational Resources Information Center
Pierce, Karisa M.; Schale, Stephen P.; Le, Trang M.; Larson, Joel C.
2011-01-01
We present a laboratory experiment for an advanced analytical chemistry course where we first focus on the chemometric technique partial least-squares (PLS) analysis applied to one-dimensional (1D) total-ion-current gas chromatography-mass spectrometry (GC-TIC) separations of biodiesel blends. Then, we focus on n-way PLS (n-PLS) applied to…
Application of near-infrared spectroscopy for the rapid quality assessment of Radix Paeoniae Rubra
NASA Astrophysics Data System (ADS)
Zhan, Hao; Fang, Jing; Tang, Liying; Yang, Hongjun; Li, Hua; Wang, Zhuju; Yang, Bin; Wu, Hongwei; Fu, Meihong
2017-08-01
Near-infrared (NIR) spectroscopy with multivariate analysis was used to quantify gallic acid, catechin, albiflorin, and paeoniflorin in Radix Paeoniae Rubra, and the feasibility of classifying samples originating from different areas was investigated. A new high-performance liquid chromatography method was developed and validated to analyze gallic acid, catechin, albiflorin, and paeoniflorin in Radix Paeoniae Rubra as the reference. Partial least squares (PLS), principal component regression (PCR), and stepwise multivariate linear regression (SMLR) were performed to calibrate the regression model. Different data pretreatments such as derivatives (1st and 2nd), multiplicative scatter correction, standard normal variate, Savitzky-Golay filter, and Norris derivative filter were applied to remove the systematic errors. The performance of the model was evaluated according to the root mean square error of calibration (RMSEC), root mean square error of prediction (RMSEP), root mean square error of cross-validation (RMSECV), and correlation coefficient (r). The results show that compared to PCR and SMLR, PLS had a lower RMSEC, RMSECV, and RMSEP and a higher r for all four analytes. PLS coupled with proper pretreatments showed good performance in both the fitting and predicting results. Furthermore, the areas of origin of Radix Paeoniae Rubra samples were partly distinguished by principal component analysis. This study shows that NIR with PLS is a reliable, inexpensive, and rapid tool for the quality assessment of Radix Paeoniae Rubra.
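The figures of merit named in this abstract share one formula and differ only in which predictions are fed in: calibration-set fits give RMSEC, cross-validated predictions give RMSECV, and independent test-set predictions give RMSEP. The sketch below shows that calculation together with the correlation coefficient r. The numbers are made up for illustration, and dividing by n (rather than by degrees of freedom, as some texts do for RMSEC) is an assumption.

```python
import math


def rmse(reference, predicted):
    """Root mean square error between reference values and model predictions."""
    n = len(reference)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)


def pearson_r(reference, predicted):
    """Pearson correlation coefficient between reference and predicted values."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_p = sum(predicted) / n
    cov = sum((r - mean_r) * (p - mean_p) for r, p in zip(reference, predicted))
    ss_r = sum((r - mean_r) ** 2 for r in reference)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(ss_r * ss_p)


# Illustrative values only (not from the paper): HPLC reference vs. model output.
reference = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]

rmsep = rmse(reference, predicted)
r = pearson_r(reference, predicted)
print(f"RMSEP = {rmsep:.4f}, r = {r:.4f}")
```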
Balabin, Roman M; Smirnov, Sergey V
2011-04-29
During the past several years, near-infrared (near-IR/NIR) spectroscopy has increasingly been adopted as an analytical tool in various fields from petroleum to biomedical sectors. The NIR spectrum (above 4000 cm(-1)) of a sample is typically measured by modern instruments at a few hundred wavelengths. Recently, considerable effort has been directed towards developing procedures to identify variables (wavelengths) that contribute useful information. Variable selection (VS) or feature selection, also called frequency selection or wavelength selection, is a critical step in data analysis for vibrational spectroscopy (infrared, Raman, or NIRS). In this paper, we compare the performance of 16 different feature selection methods for the prediction of properties of biodiesel fuel, including density, viscosity, methanol content, and water concentration. The feature selection algorithms tested include stepwise multiple linear regression (MLR-step), interval partial least squares regression (iPLS), backward iPLS (BiPLS), forward iPLS (FiPLS), moving window partial least squares regression (MWPLS), (modified) changeable size moving window partial least squares (CSMWPLS/MCSMWPLSR), searching combination moving window partial least squares (SCMWPLS), successive projections algorithm (SPA), uninformative variable elimination (UVE, including UVE-SPA), simulated annealing (SA), back-propagation artificial neural networks (BP-ANN), Kohonen artificial neural network (K-ANN), and genetic algorithms (GAs, including GA-iPLS). Two linear techniques for calibration model building, namely multiple linear regression (MLR) and partial least squares regression/projection to latent structures (PLS/PLSR), are used for the evaluation of biofuel properties. A comparison with a non-linear calibration model, artificial neural networks (ANN-MLP), is also provided. Discussion of gasoline, ethanol-gasoline (bioethanol), and diesel fuel data is presented. The results obtained with other spectroscopic techniques, such as Raman, ultraviolet-visible (UV-vis), or nuclear magnetic resonance (NMR) spectroscopies, can be greatly improved by an appropriate choice of feature selection method. Copyright © 2011 Elsevier B.V. All rights reserved.
Rodríguez-Entrena, Macario; Schuberth, Florian; Gelhard, Carsten
2018-01-01
Structural equation modeling using partial least squares (PLS-SEM) has become a mainstream modeling approach in various disciplines. Nevertheless, prior literature still lacks practical guidance on how to properly test for differences between parameter estimates. Whereas existing techniques such as parametric and non-parametric approaches in PLS multi-group analysis only allow differences to be assessed between parameters that are estimated for different subpopulations, the study at hand introduces a technique that also allows assessing whether two parameter estimates derived from the same sample are statistically different. To illustrate this advancement to PLS-SEM, we particularly refer to a reduced version of the well-established technology acceptance model.
Angeyo, K H; Gari, S; Mustapha, A O; Mangala, J M
2012-11-01
The greatest challenge to material characterization by XRF technique is encountered in direct trace analysis of complex matrices. We exploited partial least squares (PLS) in conjunction with energy dispersive X-ray fluorescence and scattering (EDXRFS) spectrometry to rapidly (200 s) analyze lubricating oils. The PLS-EDXRFS method affords non-invasive quality assurance (QA) analysis of complex matrix liquids as it gave optimistic results for both heavy- and low-Z metal additives. Scatter peaks may further be used for QA characterization via the light elements. Copyright © 2012 Elsevier Ltd. All rights reserved.
Marques Junior, Jucelino Medeiros; Muller, Aline Lima Hermes; Foletto, Edson Luiz; da Costa, Adilson Ben; Bizzi, Cezar Augusto; Irineu Muller, Edson
2015-01-01
A method for the determination of propranolol hydrochloride in pharmaceutical preparations using near-infrared spectrometry with a fiber optic probe (FTNIR/PROBE) combined with chemometric methods was developed. Calibration models were developed using two variable selection methods: interval partial least squares (iPLS) and synergy interval partial least squares (siPLS). Treatments based on mean-centered data and multiplicative scatter correction (MSC) were selected for model construction. A root mean square error of prediction (RMSEP) of 8.2 mg g(-1) was achieved using the siPLS (s2i20PLS) algorithm with spectra divided into 20 intervals and a combination of 2 intervals (8501 to 8801 and 5201 to 5501 cm(-1)). Results obtained by the proposed method were compared with those obtained using the pharmacopoeia reference method, and no significant difference was observed. Therefore, the proposed method allowed a fast, precise, and accurate determination of propranolol hydrochloride in pharmaceutical preparations. Furthermore, it is possible to carry out on-line analysis of this active principle in pharmaceutical formulations with the use of a fiber optic probe.
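The "20 intervals, combination of 2" search described in this abstract amounts to scoring every pair of spectral intervals and keeping the pair with the lowest error. The sketch below shows only that enumeration. The `score` callable is a stand-in: in practice it would be something like the cross-validated RMSE of a PLS model built on the chosen columns, whereas the `toy_score` used here is purely hypothetical so that the example runs on its own. The function names and the 200-variable grid are likewise assumptions.

```python
from itertools import combinations


def split_into_intervals(n_variables, n_intervals):
    """Return (start, stop) index pairs for near-equal contiguous intervals."""
    base, extra = divmod(n_variables, n_intervals)
    bounds, start = [], 0
    for k in range(n_intervals):
        stop = start + base + (1 if k < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def best_interval_combination(n_variables, n_intervals, n_combined, score):
    """Exhaustively score every combination of intervals; lower score is better."""
    bounds = split_into_intervals(n_variables, n_intervals)
    best = None
    for combo in combinations(range(n_intervals), n_combined):
        columns = [j for k in combo for j in range(*bounds[k])]
        error = score(columns)
        if best is None or error < best[0]:
            best = (error, combo)
    return best


# Hypothetical stand-in for a cross-validated PLS error: pretend that only
# variables 30-39 and 150-159 carry information about the analyte.
informative = set(range(30, 40)) | set(range(150, 160))


def toy_score(columns):
    chosen = set(columns)
    return len(informative - chosen) + 0.01 * len(chosen - informative)


best = best_interval_combination(200, 20, 2, toy_score)
print(best)  # (score, (interval_a, interval_b))
```

With 20 intervals and 2 combined there are 190 candidate models, which is why synergy-interval searches are usually limited to combinations of two to four intervals.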
Error propagation of partial least squares for parameters optimization in NIR modeling.
Du, Chenzhao; Dai, Shengyun; Qiao, Yanjiang; Wu, Zhisheng
2018-03-05
A novel methodology is proposed to determine the error propagation of partial least-square (PLS) for parameters optimization in near-infrared (NIR) modeling. The parameters include spectral pretreatment, latent variables and variable selection. In this paper, an open source dataset (corn) and a complicated dataset (Gardenia) were used to establish PLS models under different modeling parameters. Error propagation of the modeling parameters for water quantity in corn and geniposide quantity in Gardenia was presented by both type I and type II error. For example, when variable importance in the projection (VIP), interval partial least square (iPLS) and backward interval partial least square (BiPLS) variable selection algorithms were used for geniposide in Gardenia, compared with synergy interval partial least squares (SiPLS), the error weight varied from 5% to 65%, 55% and 15%. The results demonstrated how and to what extent the different modeling parameters affect error propagation of PLS for parameters optimization in NIR modeling. The larger the error weight, the worse the model. Finally, our trials established an effective process for developing robust PLS models for corn and Gardenia under the optimal modeling parameters. Furthermore, it could provide significant guidance for the selection of modeling parameters of other multivariate calibration models. Copyright © 2017. Published by Elsevier B.V.
Partial Least Squares for Discrimination in fMRI Data
Andersen, Anders H.; Rayens, William S.; Liu, Yushu; Smith, Charles D.
2011-01-01
Multivariate methods for discrimination were used in the comparison of brain activation patterns between groups of cognitively normal women who are at either high or low Alzheimer's disease risk based on family history and apolipoprotein-E4 status. Linear discriminant analysis (LDA) was preceded by dimension reduction using either principal component analysis (PCA), partial least squares (PLS), or a new oriented partial least squares (OrPLS) method. The aim was to identify a spatial pattern of functionally connected brain regions that was differentially expressed by the risk groups and yielded optimal classification accuracy. Multivariate dimension reduction is required prior to LDA when the data contains more feature variables than there are observations on individual subjects. Whereas PCA has been commonly used to identify covariance patterns in neuroimaging data, this approach only identifies gross variability and is not capable of distinguishing among-groups from within-groups variability. PLS and OrPLS provide a more focused dimension reduction by incorporating information on class structure and therefore lead to more parsimonious models for discrimination. Performance was evaluated in terms of the cross-validated misclassification rates. The results support the potential of using fMRI as an imaging biomarker or diagnostic tool to discriminate individuals with disease or high risk. PMID:22227352
NASA Astrophysics Data System (ADS)
Samadi-Maybodi, Abdolraouf; Darzi, S. K. Hassani Nejad
2008-10-01
Resolution of binary mixtures of vitamin B12, methylcobalamin and B12 coenzyme with minimum sample pre-treatment and without analyte separation has been successfully achieved by methods of partial least squares algorithm with one dependent variable (PLS1), orthogonal signal correction/partial least squares (OSC/PLS), principal component regression (PCR) and hybrid linear analysis (HLA). Data of analysis were obtained from UV-vis spectra. The UV-vis spectra of the vitamin B12, methylcobalamin and B12 coenzyme were recorded in the same spectral conditions. The method of central composite design was used in the ranges of 10-80 mg L⁻¹ for vitamin B12 and methylcobalamin and 20-130 mg L⁻¹ for B12 coenzyme. The models refinement procedure and validation were performed by cross-validation. The minimum root mean square error of prediction (RMSEP) was 2.26 mg L⁻¹ for vitamin B12 with PLS1, 1.33 mg L⁻¹ for methylcobalamin with OSC/PLS and 3.24 mg L⁻¹ for B12 coenzyme with HLA techniques. Figures of merit such as selectivity, sensitivity, analytical sensitivity and LOD were determined for three compounds. The procedure was successfully applied to simultaneous determination of three compounds in synthetic mixtures and in a pharmaceutical formulation.
Li, Wen-bing; Yao, Lin-tao; Liu, Mu-hua; Huang, Lin; Yao, Ming-yin; Chen, Tian-bing; He, Xiu-wen; Yang, Ping; Hu, Hui-qin; Nie, Jiang-hui
2015-05-01
Cu in navel orange was detected rapidly by laser-induced breakdown spectroscopy (LIBS) combined with partial least squares (PLS) for quantitative analysis, and the effect of different spectral data pretreatment methods on the detection accuracy of the model was explored. Spectral data for the 52 Gannan navel orange samples were pretreated by different data smoothing, mean centering, and standard normal variate transformation methods. Then the 319-338 nm wavelength section containing the characteristic spectral lines of Cu was selected to build PLS models, and the main evaluation indexes of the models, namely regression coefficient (r), root mean square error of cross validation (RMSECV), and root mean square error of prediction (RMSEP), were compared and analyzed. After 13-point smoothing and mean centering, these three indicators of the PLS model reached 0.9928, 3.43, and 3.4, respectively, and the average relative error of the prediction model was only 5.55%; this model gave the best calibration and prediction quality. The results show that, by selecting an appropriate data pre-processing method, the prediction accuracy of PLS quantitative models for fruits and vegetables analyzed by LIBS can be improved effectively, providing a new method for fast and accurate detection in fruits and vegetables by LIBS.
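The three pretreatments compared in this abstract can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' procedure: the abstract does not say which kind of 13-point smoothing was used, so a plain moving average with shrinking windows at the edges is assumed here, and the function names are hypothetical.

```python
import math


def moving_average(spectrum, window=13):
    """Smooth one spectrum with a centered moving average (window shrinks at the edges)."""
    half = window // 2
    smoothed = []
    for i in range(len(spectrum)):
        lo, hi = max(0, i - half), min(len(spectrum), i + half + 1)
        smoothed.append(sum(spectrum[lo:hi]) / (hi - lo))
    return smoothed


def mean_center(spectra):
    """Subtract the column (per-wavelength) mean across all samples."""
    n = len(spectra)
    means = [sum(col) / n for col in zip(*spectra)]
    return [[v - m for v, m in zip(row, means)] for row in spectra]


def snv(spectrum):
    """Standard normal variate: center and scale each spectrum by its own mean and SD."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / sd for v in spectrum]


# Made-up intensities for two short "spectra", for illustration only.
raw = [[float((i * 3) % 7) for i in range(20)],
       [float((i * 5) % 7) + 1.0 for i in range(20)]]

pretreated = mean_center([moving_average(s) for s in raw])
print([round(v, 3) for v in pretreated[0][:5]])
```

Note that smoothing and SNV act on each spectrum separately, whereas mean centering uses statistics of the whole calibration set, so the calibration means must be reused when pretreating new samples.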
Koch, Cosima; Posch, Andreas E; Goicoechea, Héctor C; Herwig, Christoph; Lendl, Bernhard
2014-01-07
This paper presents the quantification of Penicillin V and phenoxyacetic acid, a precursor, inline during Penicillium chrysogenum fermentations by FTIR spectroscopy and partial least squares (PLS) regression and multivariate curve resolution - alternating least squares (MCR-ALS). First, the applicability of an attenuated total reflection FTIR fiber optic probe was assessed offline by measuring standards of the analytes of interest and investigating matrix effects of the fermentation broth. Then measurements were performed inline during four fed-batch fermentations with online HPLC for the determination of Penicillin V and phenoxyacetic acid as reference analysis. PLS and MCR-ALS models were built using these data and validated by comparison of single analyte spectra with the selectivity ratio of the PLS models and the extracted spectral traces of the MCR-ALS models, respectively. The achieved root mean square errors of cross-validation for the PLS regressions were 0.22 g L(-1) for Penicillin V and 0.32 g L(-1) for phenoxyacetic acid and the root mean square errors of prediction for MCR-ALS were 0.23 g L(-1) for Penicillin V and 0.15 g L(-1) for phenoxyacetic acid. A general work-flow for building and assessing chemometric regression models for the quantification of multiple analytes in bioprocesses by FTIR spectroscopy is given. Copyright © 2013 The Authors. Published by Elsevier B.V. All rights reserved.
Liao, Xiang; Wang, Qing; Fu, Ji-hong; Tang, Jun
2015-09-01
This work was undertaken to establish a quantitative analysis model that can rapidly determine the content of linalool and linalyl acetate in Xinjiang lavender essential oil. In total, 165 lavender essential oil samples were measured by near-infrared (NIR) absorption spectroscopy. Analysis of the NIR absorption peaks of all samples showed that, in the spectral interval of 7100~4500 cm(-1), lavender essential oil carries abundant chemical information and the interference of random noise may be relatively low. Thus, the PLS models were constructed using this interval for further analysis. Eight abnormal samples were eliminated. Through a clustering method, the remaining 157 lavender essential oil samples were divided into 105 calibration set samples and 52 validation set samples. Gas chromatography-mass spectrometry (GC-MS) was used to determine the content of linalool and linalyl acetate in lavender essential oil. The data matrix was then established from the GC-MS raw data of the two compounds in combination with the original NIR data. To optimize the model, different pretreatment methods were used to preprocess the raw NIR spectra and their spectral filtering effects were compared. Based on the quantitative model results for linalool and linalyl acetate, orthogonal signal correction (OSC) gave root mean square errors of prediction (RMSEP) of 0.226 and 0.558, respectively, and was the optimum pretreatment method. In addition, the forward interval partial least squares (FiPLS) method was used to exclude wavelength points that are unrelated to the determined composition or that show nonlinear correlation; finally, 8 spectral intervals with a total of 160 wavelength points were obtained as the dataset. The data sets optimized by OSC-FiPLS were combined with partial least squares (PLS) to establish a rapid quantitative analysis model for determining the content of linalool and linalyl acetate in Xinjiang lavender essential oil; the number of latent variables for each of the two components was 8. The performance of the model was evaluated according to the root mean square error of cross-validation (RMSECV) and the root mean square error of prediction (RMSEP). In the model, the RMSECV values of linalool and linalyl acetate were 0.170 and 0.416, respectively; the RMSEP values were 0.188 and 0.364. The results indicated that, with the raw data pretreated by OSC and FiPLS, the NIR-PLS quantitative analysis model had good robustness and high measurement precision and could quickly determine the content of linalool and linalyl acetate in lavender essential oil. In addition, the model has favorable prediction ability. The study also provides a new, effective method for rapid quantitative analysis of the major components of Xinjiang lavender essential oil.
Kehimkar, Benjamin; Parsons, Brendon A; Hoggard, Jamin C; Billingsley, Matthew C; Bruno, Thomas J; Synovec, Robert E
2015-01-01
Recent efforts in predicting rocket propulsion (RP-1) fuel performance through modeling put greater emphasis on obtaining detailed and accurate fuel properties, as well as elucidating the relationships between fuel compositions and their properties. Herein, we study multidimensional chromatographic data obtained by comprehensive two-dimensional gas chromatography combined with time-of-flight mass spectrometry (GC × GC-TOFMS) to analyze RP-1 fuels. For GC × GC separations, RTX-Wax (polar stationary phase) and RTX-1 (non-polar stationary phase) columns were implemented for the primary and secondary dimensions, respectively, to separate the chemical compound classes (alkanes, cycloalkanes, aromatics, etc.), providing a significant level of chemical compositional information. The GC × GC-TOFMS data were analyzed using partial least squares regression (PLS) chemometric analysis to model and predict advanced distillation curve (ADC) data for ten RP-1 fuels that were previously analyzed using the ADC method. The PLS modeling provides insight into the chemical species that impact the ADC data. The PLS modeling correlates compositional information found in the GC × GC-TOFMS chromatograms of each RP-1 fuel, and their respective ADC, and allows prediction of the ADC for each RP-1 fuel with good precision and accuracy. The root-mean-square error of calibration (RMSEC) ranged from 0.1 to 0.5 °C, and was typically below ∼0.2 °C, for the PLS calibration of the ADC modeling with GC × GC-TOFMS data, indicating a good fit of the model to the calibration data. Likewise, the predictive power of the overall method via PLS modeling was assessed using leave-one-out cross-validation (LOOCV) yielding root-mean-square error of cross-validation (RMSECV) ranging from 1.4 to 2.6 °C, and was typically below ∼2.0 °C, at each % distilled measurement point during the ADC analysis.
Kernel Partial Least Squares for Nonlinear Regression and Discrimination
NASA Technical Reports Server (NTRS)
Rosipal, Roman; Clancy, Daniel (Technical Monitor)
2002-01-01
This paper summarizes recent results on applying the method of partial least squares (PLS) in a reproducing kernel Hilbert space (RKHS). A previously proposed kernel PLS regression model was proven to be competitive with other regularized regression methods in RKHS. The family of nonlinear kernel-based PLS models is extended by considering the kernel PLS method for discrimination. Theoretical and experimental results on a two-class discrimination problem indicate usefulness of the method.
De Luca, Michele; Restuccia, Donatella; Clodoveo, Maria Lisa; Puoci, Francesco; Ragno, Gaetano
2016-07-01
Chemometric discrimination of extra virgin olive oils (EVOO) from whole and stoned olive pastes was carried out by using Fourier transform infrared (FTIR) data and partial least squares-discriminant analysis (PLS1-DA) approach. Four Italian commercial EVOO brands, all in both whole and stoned version, were considered in this study. The adopted chemometric methodologies were able to describe the different chemical features in phenolic and volatile compounds contained in the two types of oil by using unspecific IR spectral information. Principal component analysis (PCA) was employed in cluster analysis to capture data patterns and to highlight differences between technological processes and EVOO brands. The PLS1-DA algorithm was used as supervised discriminant analysis to identify the different oil extraction procedures. Discriminant analysis was extended to the evaluation of possible adulteration by addition of aliquots of oil from whole paste to the most valuable oil from stoned olives. The statistical parameters from external validation of all the PLS models were very satisfactory, with low root mean square error of prediction (RMSEP) and relative error (RE%). Copyright © 2016 Elsevier Ltd. All rights reserved.
Analysis of pork adulteration in beef meatball using Fourier transform infrared (FTIR) spectroscopy.
Rohman, A; Sismindari; Erwanto, Y; Che Man, Yaakob B
2011-05-01
Meatball is one of the favorite foods in Indonesia. The adulteration of beef meatball with pork occurs frequently. This study aimed to develop a fast and nondestructive technique for the detection and quantification of pork in beef meatball using Fourier transform infrared (FTIR) spectroscopy and partial least square (PLS) calibration. The spectral bands associated with pork fat (PF), beef fat (BF), and their mixtures in meatball formulation were scanned, interpreted, and identified by relating them to those spectroscopically representative of pure PF and BF. For quantitative analysis, PLS regression was used to develop a calibration model at the selected fingerprint regions of 1200-1000 cm(-1). The equation obtained for the relationship between actual PF value and FTIR predicted values in the PLS calibration model was y = 0.999x + 0.004, with a coefficient of determination (R(2)) and root mean square error of calibration of 0.999 and 0.442, respectively. The PLS calibration model was subsequently used for the prediction of independent samples using laboratory-made meatball samples containing mixtures of BF and PF. Using 4 principal components, the root mean square error of prediction was 0.742. The results showed that FTIR spectroscopy can be used for the detection and quantification of pork in beef meatball formulation for Halal verification purposes. Copyright © 2010 The American Meat Science Association. Published by Elsevier Ltd. All rights reserved.
Balss, Karin M; Long, Frederick H; Veselov, Vladimir; Orana, Argjenta; Akerman-Revis, Eugena; Papandreou, George; Maryanoff, Cynthia A
2008-07-01
Multivariate data analysis was applied to confocal Raman measurements on stents coated with the polymers and drug used in the CYPHER Sirolimus-eluting Coronary Stents. Partial least-squares (PLS) regression was used to establish three independent calibration curves for the coating constituents: sirolimus, poly(n-butyl methacrylate) [PBMA], and poly(ethylene-co-vinyl acetate) [PEVA]. The PLS calibrations were based on average spectra generated from each spatial location profiled. The PLS models were tested on six unknown stent samples to assess accuracy and precision. The wt % difference between PLS predictions and laboratory assay values for sirolimus was less than 1 wt % for the composite of the six unknowns, while the polymer models were estimated to be less than 0.5 wt % difference for the combined samples. The linearity and specificity of the three PLS models were also demonstrated. In contrast to earlier univariate models, the PLS models achieved mass balance with better accuracy. This analysis was extended to evaluate the spatial distribution of the three constituents. Quantitative bitmap images of drug-eluting stent coatings are presented for the first time to assess the local distribution of components.
Kumar, Keshav
2018-03-01
Excitation-emission matrix fluorescence (EEMF) and total synchronous fluorescence spectroscopy (TSFS) are the 2 fluorescence techniques that are commonly used for the analysis of multifluorophoric mixtures. These 2 fluorescence techniques are conceptually different and provide certain advantages over each other. The manual analysis of such large volumes of highly correlated EEMF and TSFS data towards developing a calibration model is difficult. Partial least square (PLS) analysis can analyze large EEMF and TSFS data sets by finding important factors that maximize the correlation between the spectral and concentration information for each fluorophore. However, the application of PLS analysis to entire data sets often does not provide a robust calibration model and requires a suitable pre-processing step. The present work evaluates the application of genetic algorithm (GA) analysis prior to PLS analysis on EEMF and TSFS data sets towards improving the precision and accuracy of the calibration model. The GA essentially combines the advantages provided by stochastic methods with those provided by deterministic approaches and can find the set of EEMF and TSFS variables that correlate well with the concentration of each of the fluorophores present in the multifluorophoric mixtures. The utility of the GA assisted PLS analysis is successfully validated using (i) EEMF data sets acquired for dilute aqueous mixtures of four biomolecules and (ii) TSFS data sets acquired for dilute aqueous mixtures of four carcinogenic polycyclic aromatic hydrocarbons (PAHs). In the present work, it is shown that by using the GA it is possible to significantly improve the accuracy and precision of the PLS calibration model developed for both EEMF and TSFS data sets. Hence, GA must be considered as a useful pre-processing technique while developing an EEMF and TSFS calibration model.
Barimani, Shirin; Kleinebudde, Peter
2017-10-01
A multivariate analysis method, Science-Based Calibration (SBC), was used for the first time for endpoint determination of a tablet coating process using Raman data. Two types of tablet cores, placebo and caffeine cores, received a coating suspension comprising a polyvinyl alcohol-polyethylene glycol graft-copolymer and titanium dioxide to a maximum coating thickness of 80 µm. Raman spectroscopy was used as an in-line PAT tool. The spectra were acquired every minute and correlated to the amount of applied aqueous coating suspension. SBC was compared to another well-known multivariate analysis method, Partial Least Squares-regression (PLS) and a simpler approach, Univariate Data Analysis (UVDA). All developed calibration models had coefficient of determination values (R(2)) higher than 0.99. The coating endpoints could be predicted with root mean square errors (RMSEP) less than 3.1% of the applied coating suspensions. Compared to PLS and UVDA, SBC proved to be an alternative multivariate calibration method with high predictive power. Copyright © 2017 Elsevier B.V. All rights reserved.
Dinç, Erdal; Ertekin, Zehra Ceren
2016-01-01
An application of parallel factor analysis (PARAFAC) and three-way partial least squares (3W-PLS1) regression models to ultra-performance liquid chromatography-photodiode array detection (UPLC-PDA) data with co-eluted peaks in the same wavelength and time regions was described for the multicomponent quantitation of hydrochlorothiazide (HCT) and olmesartan medoxomil (OLM) in tablets. Three-way dataset of HCT and OLM in their binary mixtures containing telmisartan (IS) as an internal standard was recorded with a UPLC-PDA instrument. Firstly, the PARAFAC algorithm was applied for the decomposition of three-way UPLC-PDA data into the chromatographic, spectral and concentration profiles to quantify the concerned compounds. Secondly, 3W-PLS1 approach was subjected to the decomposition of a tensor consisting of three-way UPLC-PDA data into a set of triads to build 3W-PLS1 regression for the analysis of the same compounds in samples. For the proposed three-way analysis methods in the regression and prediction steps, the applicability and validity of PARAFAC and 3W-PLS1 models were checked by analyzing the synthetic mixture samples, inter-day and intra-day samples, and standard addition samples containing HCT and OLM. Two different three-way analysis methods, PARAFAC and 3W-PLS1, were successfully applied to the quantitative estimation of the solid dosage form containing HCT and OLM. Regression and prediction results provided from three-way analysis were compared with those obtained by traditional UPLC method. Copyright © 2015 Elsevier B.V. All rights reserved.
Dealing with gene expression missing data.
Brás, L P; Menezes, J C
2006-05-01
A comparative evaluation of different methods is presented for estimating missing values in microarray data: weighted K-nearest neighbours imputation (KNNimpute), regression-based methods such as local least squares imputation (LLSimpute) and partial least squares imputation (PLSimpute) and Bayesian principal component analysis (BPCA). The influence on prediction accuracy of some factors, such as methods' parameters, type of data relationships used in the estimation process (i.e. row-wise, column-wise or both), missing rate and pattern and type of experiment [time series (TS), non-time series (NTS) or mixed (MIX) experiments] is elucidated. Improvements based on the iterative use of data (iterative LLS and PLS imputation--ILLSimpute and IPLSimpute), the need to perform initial imputations (modified PLS and Helland PLS imputation--MPLSimpute and HPLSimpute) and the type of relationships employed (KNNarray, LLSarray, HPLSarray and alternating PLS--APLSimpute) are proposed. Overall, it is shown that data set properties (type of experiment, missing rate and pattern) affect the data similarity structure, therefore influencing the methods' performance. LLSimpute and ILLSimpute are preferable in the presence of data with a stronger similarity structure (TS and MIX experiments), whereas PLS-based methods (MPLSimpute, IPLSimpute and APLSimpute) are preferable when estimating NTS missing data.
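The weighted K-nearest neighbours idea mentioned in this abstract can be sketched as follows. This is a simplified illustration rather than the KNNimpute implementation evaluated by the authors: the function name `knn_impute`, the root-mean-square distance over shared columns, and the inverse-distance weighting with a small `eps` are all assumptions chosen for the example.

```python
import math

def knn_impute(rows, target, col, k=2, eps=1e-12):
    """Estimate rows[target][col] from the k most similar rows.

    rows: list of equal-length lists, with None marking missing values.
    Distance is the root mean square difference over columns observed in
    both rows (excluding `col`); neighbours are weighted by 1 / distance.
    """
    candidates = []
    for i, row in enumerate(rows):
        if i == target or row[col] is None:
            continue  # a neighbour must have a value in the target column
        shared = [c for c in range(len(row))
                  if c != col and row[c] is not None
                  and rows[target][c] is not None]
        if not shared:
            continue
        dist = math.sqrt(
            sum((row[c] - rows[target][c]) ** 2 for c in shared) / len(shared))
        candidates.append((dist, row[col]))
    if not candidates:
        raise ValueError("no usable neighbours for this entry")
    nearest = sorted(candidates)[:k]
    weights = [1.0 / (d + eps) for d, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)

# Hypothetical expression matrix: genes in rows, arrays in columns.
data = [[0.0, 1.0], [2.0, 5.0], [1.0, None]]
print(knn_impute(data, target=2, col=1, k=2))
```

In the toy matrix both neighbours are equally distant from the incomplete row, so the estimate is simply the mean of their values in the missing column.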
NASA Astrophysics Data System (ADS)
De Lucia, Frank C., Jr.; Gottfried, Jennifer L.
2011-02-01
Using a series of thirteen organic materials that includes novel high-nitrogen energetic materials, conventional organic military explosives, and benign organic materials, we have demonstrated the importance of variable selection for maximizing residue discrimination with partial least squares discriminant analysis (PLS-DA). We built several PLS-DA models using different variable sets based on laser induced breakdown spectroscopy (LIBS) spectra of the organic residues on an aluminum substrate under an argon atmosphere. The model classification results for each sample are presented and the influence of the variables on these results is discussed. We found that using the whole spectra as the data input for the PLS-DA model gave the best results. However, variables due to the surrounding atmosphere and the substrate contribute to discrimination when the whole spectra are used, indicating this may not be the most robust model. Further iterative testing with additional validation data sets is necessary to determine the most robust model.
Alladio, E; Giacomelli, L; Biosa, G; Corcia, D Di; Gerace, E; Salomone, A; Vincenti, M
2018-01-01
The chronic intake of an excessive amount of alcohol is currently ascertained by determining the concentration of direct alcohol metabolites in the hair samples of the alleged abusers, including ethyl glucuronide (EtG) and, less frequently, fatty acid ethyl esters (FAEEs). Indirect blood biomarkers of alcohol abuse are still determined to support hair EtG results and diagnose a consequent liver impairment. In the present study, the supporting role of hair FAEEs is compared with indirect blood biomarkers with respect to the contexts in which hair EtG interpretation is uncertain. Receiver Operating Characteristics (ROC) curves and multivariate Principal Component Analysis (PCA) demonstrated much stronger correlation of EtG results with FAEEs than with any single indirect biomarker or their combinations. Partial Least Squares Discriminant Analysis (PLS-DA) models based on hair EtG and FAEEs were developed to maximize the biomarkers' information content on a multivariate background. The final PLS-DA model yielded 100% correct classification on a training/evaluation dataset of 155 subjects, including both chronic alcohol abusers and social drinkers. Then, the PLS-DA model was validated on an external dataset of 81 individuals, providing optimal discrimination ability between chronic alcohol abusers and social drinkers, in terms of specificity and sensitivity. The PLS-DA scores obtained for each subject, with respect to the PLS-DA model threshold that separates the probabilistic distributions for the two classes, furnished a likelihood ratio value, which in turn conveys the strength of the experimental data support to the classification decision, within a Bayesian logic. Typical boundary real cases from daily work are discussed, too. Copyright © 2017 Elsevier B.V. All rights reserved.
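One way to picture how a discriminant score can be turned into a likelihood ratio is sketched below. The abstract does not say how the class score distributions were modelled, so fitting a normal distribution to each class is purely an assumption made here, as are the function name `likelihood_ratio` and the toy scores.

```python
from statistics import NormalDist

def likelihood_ratio(score, class_a_scores, class_b_scores):
    """Ratio of the class A density to the class B density at `score`.

    Assumes (for illustration only) that the scores of each class are
    normally distributed; the distributions are fitted from the samples.
    """
    dist_a = NormalDist.from_samples(class_a_scores)
    dist_b = NormalDist.from_samples(class_b_scores)
    return dist_a.pdf(score) / dist_b.pdf(score)

# Hypothetical scores for two classes on one discriminant axis.
abusers = [0.5, 1.0, 1.5]
social = [-1.5, -1.0, -0.5]
print(likelihood_ratio(0.5, abusers, social))
```

A ratio above 1 supports the first class and a ratio below 1 supports the second; a score midway between two symmetric distributions gives a ratio of 1, meaning the data favour neither class.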
Zhang, Yan; Zou, Hong-Yan; Shi, Pei; Yang, Qin; Tang, Li-Juan; Jiang, Jian-Hui; Wu, Hai-Long; Yu, Ru-Qin
2016-01-01
Determination of benzo[a]pyrene (BaP) in cigarette smoke can be very important for the tobacco quality control and the assessment of its harm to human health. In this study, mid-infrared spectroscopy (MIR) coupled to chemometric algorithm (DPSO-WPT-PLS), which was based on the wavelet packet transform (WPT), discrete particle swarm optimization algorithm (DPSO) and partial least squares regression (PLS), was used to quantify harmful ingredient benzo[a]pyrene in the cigarette mainstream smoke with promising result. Furthermore, the proposed method provided better performance compared to several other chemometric models, i.e., PLS, radial basis function-based PLS (RBF-PLS), PLS with stepwise regression variable selection (Stepwise-PLS) as well as WPT-PLS with informative wavelet coefficients selected by correlation coefficient test (rtest-WPT-PLS). It can be expected that the proposed strategy could become a new effective, rapid quantitative analysis technique in analyzing the harmful ingredient BaP in cigarette mainstream smoke. Copyright © 2015 Elsevier B.V. All rights reserved.
Lee, Byeong-Ju; Kim, Hye-Youn; Lim, Sa Rang; Huang, Linfang; Choi, Hyung-Kyoon
2017-01-01
Panax ginseng C.A. Meyer is a herb used for medicinal purposes, and its discrimination according to cultivation age has been an important and practical issue. This study employed Fourier-transform infrared (FT-IR) spectroscopy with multivariate statistical analysis to obtain a prediction model for discriminating cultivation ages (5 and 6 years) and three different parts (rhizome, tap root, and lateral root) of P. ginseng. The optimal partial-least-squares regression (PLSR) models for discriminating ginseng samples were determined by selecting normalization methods, number of partial-least-squares (PLS) components, and variable influence on projection (VIP) cutoff values. The best prediction model for discriminating 5- and 6-year-old ginseng was developed using tap root, vector normalization applied after the second differentiation, one PLS component, and a VIP cutoff of 1.0 (based on the lowest root-mean-square error of prediction value). In addition, for discriminating among the three parts of P. ginseng, optimized PLSR models were established using data sets obtained from vector normalization, two PLS components, and VIP cutoff values of 1.5 (for 5-year-old ginseng) and 1.3 (for 6-year-old ginseng). To our knowledge, this is the first study to provide a novel strategy for rapidly discriminating the cultivation ages and parts of P. ginseng using FT-IR by selected normalization methods, number of PLS components, and VIP cutoff values. PMID:29049369
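The variable influence on projection (VIP) cutoff used in this study can be illustrated with the commonly used VIP formula for a single-response PLS model. The sketch below is an assumption-laden illustration, not the authors' procedure: the function names, the input layout (per-component weights, y-loadings and score sums of squares), and the choice to keep variables with VIP greater than or equal to the cutoff are all hypothetical.

```python
import math

def vip_scores(weights, q, t_sumsq):
    """VIP score of each variable for a PLS1 model.

    weights[a][j]: weight of variable j on component a
    q[a]:          y-loading of component a
    t_sumsq[a]:    sum of squared scores (t't) of component a
    """
    m = len(weights[0])
    # Amount of y-variance explained by each component.
    ss = [q[a] ** 2 * t_sumsq[a] for a in range(len(q))]
    total = sum(ss)
    vip = []
    for j in range(m):
        acc = 0.0
        for a, w in enumerate(weights):
            norm_sq = sum(v * v for v in w)
            acc += ss[a] * (w[j] ** 2) / norm_sq
        vip.append(math.sqrt(m * acc / total))
    return vip

def select_by_vip(vip, cutoff=1.0):
    """Indices of variables whose VIP reaches the cutoff (assumed rule)."""
    return [j for j, v in enumerate(vip) if v >= cutoff]

# Hypothetical one-component model with three variables.
vip = vip_scores(weights=[[0.6, 0.8, 0.0]], q=[2.0], t_sumsq=[5.0])
print(vip, select_by_vip(vip, cutoff=1.0))
```

By construction the squared VIP scores average to 1 across variables, which is why cutoffs near 1.0, such as the 1.0, 1.3 and 1.5 reported in the abstract, are natural thresholds.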
NASA Astrophysics Data System (ADS)
Aimran, Ahmad Nazim; Ahmad, Sabri; Afthanorhan, Asyraf; Awang, Zainudin
2017-05-01
Structural equation modeling (SEM) is the second generation statistical analysis technique developed for analyzing the inter-relationships among multiple variables in a model. Previous studies have shown that there seemed to be at least an implicit agreement about the factors that should drive the choice between covariance-based structural equation modeling (CB-SEM) and partial least square path modeling (PLS-PM). PLS-PM appears to be the preferred method by previous scholars because of its less stringent assumption and the need to avoid the perceived difficulties in CB-SEM. Along with this issue has been the increasing debate among researchers on the use of CB-SEM and PLS-PM in studies. The present study intends to assess the performance of CB-SEM and PLS-PM as a confirmatory study in which the findings will contribute to the body of knowledge of SEM. Maximum likelihood (ML) was chosen as the estimator for CB-SEM and was expected to be more powerful than PLS-PM. Based on the balanced experimental design, the multivariate normal data with specified population parameter and sample sizes were generated using Pro-Active Monte Carlo simulation, and the data were analyzed using AMOS for CB-SEM and SmartPLS for PLS-PM. Comparative Bias Index (CBI), construct relationship, average variance extracted (AVE), composite reliability (CR), and Fornell-Larcker criterion were used to study the consequence of each estimator. The findings conclude that CB-SEM performed notably better than PLS-PM in estimation for large sample size (100 and above), particularly in terms of estimations accuracy and consistency.
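Two of the criteria named in this abstract, average variance extracted (AVE) and composite reliability (CR), have simple standard formulas in terms of standardized indicator loadings. The sketch below uses those textbook formulas; the function names and the example loadings are assumptions, and the Comparative Bias Index is not shown because the abstract does not define it.

```python
def average_variance_extracted(loadings):
    """AVE: mean of the squared standardized loadings of one construct."""
    return sum(l * l for l in loadings) / len(loadings)

def composite_reliability(loadings):
    """CR: (sum of loadings)^2 over itself plus the summed error variances.

    With standardized loadings the error variance of an indicator is
    1 - loading^2.
    """
    total = sum(loadings) ** 2
    error = sum(1 - l * l for l in loadings)
    return total / (total + error)

# Hypothetical standardized loadings for a three-indicator construct.
loadings = [0.7, 0.8, 0.9]
print(average_variance_extracted(loadings), composite_reliability(loadings))
```

The Fornell-Larcker criterion mentioned alongside them then compares the square root of each construct's AVE with that construct's correlations with the other constructs.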
Szymanska-Chargot, M; Chylinska, M; Kruk, B; Zdunek, A
2015-01-22
The aim of this work was to quantitatively and qualitatively determine the composition of the cell wall material from apples during development by means of Fourier transform infrared (FT-IR) spectroscopy. The FT-IR region of 1500-800 cm(-1), containing characteristic bands for galacturonic acid, hemicellulose and cellulose, was examined using principal component analysis (PCA), k-means clustering and partial least squares (PLS). The samples were differentiated by development stage and cultivar using PCA and k-means clustering. PLS calibration models for galacturonic acid, hemicellulose and cellulose content from FT-IR spectra were developed and validated with the reference data. PLS models were tested using the root-mean-square errors of cross-validation for contents of galacturonic acid, hemicellulose and cellulose, which were 8.30 mg/g, 4.08% and 1.74%, respectively. It was proven that FT-IR spectroscopy combined with chemometric methods has potential for fast and reliable determination of the main constituents of fruit cell walls. Copyright © 2014 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Yuniarto, Budi; Kurniawan, Robert
2017-03-01
PLS Path Modeling (PLS-PM) differs from covariance-based SEM in that PLS-PM uses an approach based on variance or components; therefore, PLS-PM is also known as component-based SEM. Multiblock Partial Least Squares (MBPLS) is a PLS regression method that can be used in PLS Path Modeling, where it is known as Multiblock PLS Path Modeling (MBPLS-PM). This method uses an iterative procedure in its algorithm. This research aims to modify MBPLS-PM with a Back Propagation Neural Network approach. The result is that the MBPLS-PM algorithm can be modified using the Back Propagation Neural Network approach to replace the iterative process in the backward and forward steps to get the matrix t and the matrix u in the algorithm. With this modification, the model parameters obtained are not significantly different from those obtained by the original MBPLS-PM algorithm.
NASA Astrophysics Data System (ADS)
Liu, Wen; Zhang, Yuying; Yang, Si; Han, Donghai
2018-05-01
A new technique to identify the floral sources of honeys is needed. Terahertz time-domain attenuated total reflection spectroscopy combined with chemometric methods was applied to discriminate different categories (Medlar honey, Vitex honey, and Acacia honey). Principal component analysis (PCA), cluster analysis (CA) and partial least squares-discriminant analysis (PLS-DA) were used to extract information on the botanical origins of honeys. The spectral range was also examined to increase the precision of the PLS-DA model. An accuracy of 88.46% for the validation set was obtained using the PLS-DA model in the 0.5-1.5 THz range. This work indicated that terahertz time-domain attenuated total reflection spectroscopy is a viable approach to evaluate the quality of honey rapidly.
Li, Yuanpeng; Li, Fucui; Yang, Xinhao; Guo, Liu; Huang, Furong; Chen, Zhenqiang; Chen, Xingdan; Zheng, Shifu
2018-08-05
A rapid quantitative analysis model for determining the glycated albumin (GA) content, based on attenuated total reflectance (ATR)-Fourier transform infrared spectroscopy (FTIR) combined with linear SiPLS and nonlinear SVM, has been developed. Firstly, the real GA content in human serum was determined by the GA enzymatic method, while the ATR-FTIR spectra of serum samples from a health examination population were obtained. The spectral data of the whole mid-infrared region (4000-600 cm(-1)) and GA's characteristic region (1800-800 cm(-1)) were used for the quantitative analysis. Secondly, several preprocessing steps, including first derivative, second derivative, variable standardization and spectral normalization, were performed. Lastly, quantitative analysis regression models were established by using SiPLS and SVM respectively. The SiPLS modeling results are as follows: root mean square error of cross validation (RMSECV_T) = 0.523 g/L, calibration coefficient (Rc) = 0.937, root mean square error of prediction (RMSEP_T) = 0.787 g/L, and prediction coefficient (Rp) = 0.938. The SVM modeling results are as follows: RMSECV_T = 0.0048 g/L, Rc = 0.998, RMSEP_T = 0.442 g/L, and Rp = 0.916. The results indicated that the model performance was improved significantly after preprocessing and optimization of characteristic regions, while the modeling performance of nonlinear SVM was considerably better than that of linear SiPLS. Hence, the quantitative analysis model for GA in human serum based on ATR-FTIR combined with SiPLS and SVM is effective. It does not need sample preprocessing and is characterized by simple operations and high time efficiency, providing a rapid and accurate method for GA content determination. Copyright © 2018 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Yeganeh, B.; Motlagh, M. Shafie Pour; Rashidi, Y.; Kamalan, H.
2012-08-01
Due to the health impacts caused by exposure to air pollutants in urban areas, monitoring and forecasting of air quality parameters have become an important topic in atmospheric and environmental research. Knowledge of the dynamics and complexity of air pollutant behavior has made artificial intelligence models a useful tool for more accurate pollutant concentration prediction. This paper focuses on an innovative method of daily air pollution prediction using a combination of Support Vector Machine (SVM) as predictor and Partial Least Square (PLS) as a data selection tool, based on the measured values of CO concentrations. The CO concentrations of Rey monitoring station in the south of Tehran, from Jan. 2007 to Feb. 2011, have been used to test the effectiveness of this method. The hourly CO concentrations have been predicted using the SVM and the hybrid PLS-SVM models. Similarly, daily CO concentrations have been predicted based on the aforementioned four years of measured data. Results demonstrated that both models have good prediction ability; however, the hybrid PLS-SVM has better accuracy. In the analysis presented in this paper, statistical estimators including relative mean errors, root mean squared errors and the mean absolute relative error have been employed to compare the performance of the models. It has been concluded that the errors decrease after size reduction and that coefficients of determination increase from 56-81% for the SVM model to 65-85% for the hybrid PLS-SVM model. It was also found that the hybrid PLS-SVM model required less computational time than the SVM model, as expected, hence supporting the more accurate and faster prediction ability of the hybrid PLS-SVM model.
USDA-ARS's Scientific Manuscript database
Two simple fingerprinting methods, flow-injection UV spectroscopy (FIUV) and 1H nuclear magnetic resonance (NMR), for discrimination of Aurantii Fructus Immaturus and Fructus Poniciri Trifoliatae Immaturus were described. Both methods were combined with partial least-squares discriminant analysis...
Liu, Xiaona; Zhang, Qiao; Wu, Zhisheng; Shi, Xinyuan; Zhao, Na; Qiao, Yanjiang
2015-01-01
Laser-induced breakdown spectroscopy (LIBS) was applied to perform a rapid elemental analysis and provenance study of Blumea balsamifera DC. Principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) were implemented to exploit the multivariate nature of the LIBS data. Scores and loadings of computed principal components visually illustrated the differing spectral data. The PLS-DA algorithm showed good classification performance. The PLS-DA model using complete spectra as input variables had similar discrimination performance to using selected spectral lines as input variables. The down-selection of spectral lines was specifically focused on the major elements of B. balsamifera samples. Results indicated that LIBS could be used to rapidly analyze elements and to perform provenance study of B. balsamifera. PMID:25558999
ATR-FTIR spectroscopy for the determination of Na4EDTA in detergent aqueous solutions.
Suárez, Leticia; García, Roberto; Riera, Francisco A; Diez, María A
2013-10-15
Fourier transform infrared spectroscopy in the attenuated total reflectance mode (ATR-FTIR) combined with partial least squares (PLS) algorithms was used to design calibration and prediction models for a wide range of tetrasodium ethylenediaminetetraacetate (Na4EDTA) concentrations (0.1 to 28% w/w) in aqueous solutions. The spectra obtained using air and water as a background medium were tested for the best fit. The PLS models designed afforded a sufficient level of precision and accuracy to allow even very small amounts of Na4EDTA to be determined. A root mean square error of nearly 0.37 for the validation set was obtained. Over a concentration range below 5% w/w, the values estimated from a combination of ATR-FTIR spectroscopy and a PLS algorithm model were similar to those obtained from an HPLC analysis of NaFeEDTA complexes and subsequent detection by UV absorbance. However, the lowest detection limit for Na4EDTA concentrations afforded by this spectroscopic/chemometric method was 0.3% w/w. The PLS model was successfully used as a rapid and simple method to quantify Na4EDTA in aqueous solutions of industrial detergents as an alternative to HPLC-UV analysis, which involves time-consuming dilution and complexation processes. © 2013 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Jiang, Hui; Liu, Guohai; Mei, Congli; Yu, Shuang; Xiao, Xiahong; Ding, Yuhan
2012-11-01
The feasibility of rapid determination of the process variables (i.e. pH and moisture content) in solid-state fermentation (SSF) of wheat straw using Fourier transform near infrared (FT-NIR) spectroscopy was studied. Synergy interval partial least squares (siPLS) algorithm was implemented to calibrate regression model. The number of PLS factors and the number of subintervals were optimized simultaneously by cross-validation. The performance of the prediction model was evaluated according to the root mean square error of cross-validation (RMSECV), the root mean square error of prediction (RMSEP) and the correlation coefficient (R). The measurement results of the optimal model were obtained as follows: RMSECV = 0.0776, Rc = 0.9777, RMSEP = 0.0963, and Rp = 0.9686 for pH model; RMSECV = 1.3544% w/w, Rc = 0.8871, RMSEP = 1.4946% w/w, and Rp = 0.8684 for moisture content model. Finally, compared with classic PLS and iPLS models, the siPLS model revealed its superior performance. The overall results demonstrate that FT-NIR spectroscopy combined with siPLS algorithm can be used to measure process variables in solid-state fermentation of wheat straw, and NIR spectroscopy technique has a potential to be utilized in SSF industry.
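The figures of merit quoted in this abstract (RMSECV, RMSEP and the correlation coefficient R) are computed from paired reference and predicted values. The sketch below shows the standard definitions; the function names and the toy numbers are assumptions for illustration and are unrelated to the reported results.

```python
import math

def rmse(actual, predicted):
    """Root mean square error between reference and predicted values.

    Called RMSEP on an independent prediction set and RMSECV when the
    predictions come from cross-validation.
    """
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def pearson_r(actual, predicted):
    """Pearson correlation coefficient between the two series."""
    n = len(actual)
    ma = sum(actual) / n
    mp = sum(predicted) / n
    cov = sum((a - ma) * (p - mp) for a, p in zip(actual, predicted))
    var_a = sum((a - ma) ** 2 for a in actual)
    var_p = sum((p - mp) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)

# Hypothetical reference pH values and model predictions.
reference = [4.1, 4.5, 5.0, 5.6]
predicted = [4.0, 4.6, 5.1, 5.5]
print(rmse(reference, predicted), pearson_r(reference, predicted))
```

Note that a high R can coexist with a large RMSE when predictions are systematically offset, which is why abstracts such as this one report both.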
Waskitho, Dri; Lukitaningsih, Endang; Sudjadi; Rohman, Abdul
2016-01-01
Analysis of lard extracted from lipstick formulation containing castor oil has been performed using an FTIR spectroscopic method combined with multivariate calibration. Three different extraction methods were compared, namely a saponification method followed by liquid/liquid extraction with hexane/dichloromethane/ethanol/water, a saponification method followed by liquid/liquid extraction with dichloromethane/ethanol/water, and the Bligh & Dyer method using chloroform/methanol/water as extracting solvent. Qualitative and quantitative analyses of lard were performed using principal component analysis (PCA) and partial least square (PLS) analysis, respectively. The results showed that, in all samples prepared by the three extraction methods, PCA was capable of identifying lard in the wavenumber region of 1200-800 cm(-1), with the best result obtained by the Bligh & Dyer method. Furthermore, PLS analysis in the same wavenumber region used for qualification showed that Bligh & Dyer was the most suitable extraction method, with the highest determination coefficient (R(2)) and the lowest root mean square error of calibration (RMSEC) as well as root mean square error of prediction (RMSEP) values.
ERIC Educational Resources Information Center
Henseler, Jorg; Chin, Wynne W.
2010-01-01
In social and business sciences, the importance of the analysis of interaction effects between manifest as well as latent variables steadily increases. Researchers using partial least squares (PLS) to analyze interaction effects between latent variables need an overview of the available approaches as well as their suitability. This article…
NASA Astrophysics Data System (ADS)
Glavanović, Siniša; Glavanović, Marija; Tomišić, Vladislav
2016-03-01
The UV spectrophotometric methods for simultaneous quantitative determination of paracetamol and tramadol in paracetamol-tramadol tablets were developed. The spectrophotometric data obtained were processed by means of partial least squares (PLS) and genetic algorithm coupled with PLS (GA-PLS) methods in order to determine the content of active substances in the tablets. The results gained by chemometric processing of the spectroscopic data were statistically compared with those obtained by means of validated ultra-high performance liquid chromatographic (UHPLC) method. The accuracy and precision of data obtained by the developed chemometric models were verified by analysing the synthetic mixture of drugs, and by calculating recovery as well as relative standard error (RSE). A statistically good agreement was found between the amounts of paracetamol determined using PLS and GA-PLS algorithms, and that obtained by UHPLC analysis, whereas for tramadol GA-PLS results were proven to be more reliable compared to those of PLS. The simplest and the most accurate and precise models were constructed by using the PLS method for paracetamol (mean recovery 99.5%, RSE 0.89%) and the GA-PLS method for tramadol (mean recovery 99.4%, RSE 1.69%).
The development of comparative bias index
NASA Astrophysics Data System (ADS)
Aimran, Ahmad Nazim; Ahmad, Sabri; Afthanorhan, Asyraf; Awang, Zainudin
2017-08-01
Structural Equation Modeling (SEM) is a second generation statistical analysis technique developed for analyzing the inter-relationships among multiple variables in a model simultaneously. The two most commonly used methods in SEM are Covariance-Based Structural Equation Modeling (CB-SEM) and Partial Least Square Path Modeling (PLS-PM). There have been continuous debates among researchers about the use of PLS-PM over CB-SEM, yet few studies have been conducted to test the bias of CB-SEM and PLS-PM in estimating simulation data. This study intends to address this problem by a) developing the Comparative Bias Index and b) testing the performance of CB-SEM and PLS-PM using the developed index. Based on a balanced experimental design, two multivariate normal simulation data sets with distinct specifications, of size 50, 100, 200 and 500, are generated and analyzed using CB-SEM and PLS-PM.
NASA Astrophysics Data System (ADS)
Liu, Fei; He, Yong
2008-02-01
Visible and near infrared (Vis/NIR) transmission spectroscopy and chemometric methods were utilized to predict the pH values of cola beverages. Five varieties of cola were prepared and 225 samples (45 samples for each variety) were selected for the calibration set, while 75 samples (15 samples for each variety) were used for the validation set. Savitzky-Golay smoothing and standard normal variate (SNV) followed by first-derivative were used as the pre-processing methods. Partial least squares (PLS) analysis was employed to extract the principal components (PCs) which were used as the inputs of the least squares-support vector machine (LS-SVM) model according to their accumulative reliabilities. Then LS-SVM with a radial basis function (RBF) kernel and a two-step grid search technique were applied to build the regression model, which was compared with PLS regression. The correlation coefficient (r), root mean square error of prediction (RMSEP) and bias were 0.961, 0.040 and 0.012 for PLS, and 0.975, 0.031 and 4.697 x 10(-3) for LS-SVM, respectively. Both methods obtained a satisfying precision. The results indicated that Vis/NIR spectroscopy combined with chemometric methods could be applied as an alternative way for the prediction of pH of cola beverages.
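Two of the pre-processing steps named in this abstract can be sketched in a few lines. Standard normal variate (SNV) centers and scales each spectrum by its own mean and standard deviation. For the derivative, the sketch uses a plain first difference between adjacent points, which is a simplification assumed here; the abstract does not specify how the derivative was computed, and the Savitzky-Golay smoothing step is omitted. The function names and the toy spectrum are hypothetical.

```python
import math

def snv(spectrum):
    """Standard normal variate: scale one spectrum to zero mean, unit std.

    Uses the sample standard deviation (n - 1 in the denominator).
    """
    n = len(spectrum)
    mean = sum(spectrum) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / std for v in spectrum]

def first_difference(spectrum):
    """Crude first derivative: difference between adjacent points."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]

# Hypothetical absorbance values at equally spaced wavelengths.
raw = [0.10, 0.20, 0.40, 0.30]
processed = first_difference(snv(raw))
print(processed)
```

SNV is applied to each spectrum independently, so it removes multiplicative scatter differences between samples without needing a reference spectrum.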
Cai, Rui; Wang, Shisheng; Tang, Bo; Li, Yueqing; Zhao, Weijie
2018-01-01
Sea cucumber is the major tonic seafood worldwide, and geographical origin traceability is an important part of its quality and safety control. In this work, a non-destructive method for origin traceability of sea cucumber (Apostichopus japonicus) from northern China Sea and East China Sea using near infrared spectroscopy (NIRS) and multivariate analysis methods was proposed. Total fat contents of 189 fresh sea cucumber samples were determined and partial least-squares (PLS) regression was used to establish the quantitative NIRS model. The ordered predictor selection algorithm was performed to select feasible wavelength regions for the construction of PLS and identification models. The identification model was developed by principal component analysis combined with Mahalanobis distance and scaling to the first range algorithms. In the test set of the optimum PLS models, the root mean square error of prediction was 0.45, and correlation coefficient was 0.90. The correct classification rates of 100% were obtained in both identification calibration model and test model. The overall results indicated that NIRS method combined with chemometric analysis was a suitable tool for origin traceability and identification of fresh sea cucumber samples from nine origins in China. PMID:29410795
Guo, Xiuhan; Cai, Rui; Wang, Shisheng; Tang, Bo; Li, Yueqing; Zhao, Weijie
2018-01-01
Sea cucumber is the major tonic seafood worldwide, and geographical origin traceability is an important part of its quality and safety control. In this work, a non-destructive method for origin traceability of sea cucumber (Apostichopus japonicus) from northern China Sea and East China Sea using near infrared spectroscopy (NIRS) and multivariate analysis methods was proposed. Total fat contents of 189 fresh sea cucumber samples were determined and partial least-squares (PLS) regression was used to establish the quantitative NIRS model. The ordered predictor selection algorithm was performed to select feasible wavelength regions for the construction of PLS and identification models. The identification model was developed by principal component analysis combined with Mahalanobis distance and scaling to the first range algorithms. In the test set of the optimum PLS models, the root mean square error of prediction was 0.45, and correlation coefficient was 0.90. The correct classification rates of 100% were obtained in both identification calibration model and test model. The overall results indicated that NIRS method combined with chemometric analysis was a suitable tool for origin traceability and identification of fresh sea cucumber samples from nine origins in China.
USDA-ARS's Scientific Manuscript database
Several partial least squares (PLS) models were created correlating various properties and chemical composition measurements with the 1H and 13C NMR spectra of 73 different pyrolysis bio-oil samples from various biomass sources (crude and intermediate products), finished oils and small molecule s...
NASA Astrophysics Data System (ADS)
Yu, Jiajia; He, Yong
Mango is a kind of popular tropical fruit, and the soluble solid content is an important in this study visible and short-wave near-infrared spectroscopy (VIS/SWNIR) technique was applied. For sake of investigating the feasibility of using VIS/SWNIR spectroscopy to measure the soluble solid content in mango, and validating the performance of selected sensitive bands, for the calibration set was formed by 135 mango samples, while the remaining 45 mango samples for the prediction set. The combination of partial least squares and backpropagation artificial neural networks (PLS-BP) was used to calculate the prediction model based on raw spectrum data. Based on PLS-BP, the determination coefficient for prediction (Rp) was 0.757 and root mean square and the process is simple and easy to operate. Compared with the Partial least squares (PLS) result, the performance of PLS-BP is better.
NASA Astrophysics Data System (ADS)
Shi, Ji-yong; Zou, Xiao-bo; Zhao, Jie-wen; Mel, Holmes; Wang, Kai-liang; Wang, Xue; Chen, Hong
Total flavonoids content is often considered an important quality index of Ginkgo biloba leaf. The feasibility of using near infrared (NIR) spectra at the wavelength range of 10,000-4000 cm⁻¹ for rapid and nondestructive determination of total flavonoids content in G. biloba leaf was investigated. 120 fresh G. biloba leaves in different colors (green, green-yellowish and yellow) were used for spectra acquisition and total flavonoids determination. Partial least squares (PLS), interval partial least squares (iPLS) and synergy interval partial least squares (SiPLS) were used to develop calibration models for total flavonoids content in leaves of two colors (green-yellowish and yellow) and leaves of three colors (green, green-yellowish and yellow), respectively. The level of total flavonoids content for green, green-yellowish and yellow leaves was in an increasing order. Two characteristic wavelength regions (5840-6090 cm⁻¹ and 6620-6880 cm⁻¹), which corresponded to the absorptions of two aromatic rings in basic flavonoid structure, were selected by SiPLS. The optimal SiPLS model for total flavonoids content in the leaves of two colors (r² = 0.82, RMSEP = 2.62 mg g⁻¹) had better performance than PLS and iPLS models. It could be concluded that NIR spectroscopy has significant potential in the nondestructive determination of total flavonoids content in fresh G. biloba leaf.
Jiang, Hui; Liu, Guohai; Mei, Congli; Yu, Shuang; Xiao, Xiahong; Ding, Yuhan
2012-11-01
The feasibility of rapid determination of the process variables (i.e. pH and moisture content) in solid-state fermentation (SSF) of wheat straw using Fourier transform near infrared (FT-NIR) spectroscopy was studied. Synergy interval partial least squares (siPLS) algorithm was implemented to calibrate regression model. The number of PLS factors and the number of subintervals were optimized simultaneously by cross-validation. The performance of the prediction model was evaluated according to the root mean square error of cross-validation (RMSECV), the root mean square error of prediction (RMSEP) and the correlation coefficient (R). The measurement results of the optimal model were obtained as follows: RMSECV=0.0776, R(c)=0.9777, RMSEP=0.0963, and R(p)=0.9686 for pH model; RMSECV=1.3544% w/w, R(c)=0.8871, RMSEP=1.4946% w/w, and R(p)=0.8684 for moisture content model. Finally, compared with classic PLS and iPLS models, the siPLS model revealed its superior performance. The overall results demonstrate that FT-NIR spectroscopy combined with siPLS algorithm can be used to measure process variables in solid-state fermentation of wheat straw, and NIR spectroscopy technique has a potential to be utilized in SSF industry. Copyright © 2012 Elsevier B.V. All rights reserved.
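The figures of merit quoted in this abstract can be computed as sketched below. This is an illustrative example with made-up pH values, not data from the study. RMSECV and RMSEP share the same formula and differ only in whether the predictions come from cross-validation or from an independent prediction set; some authors use a different denominator, so the division by n here is an assumption.

```python
import math

def rmse(reference, predicted):
    """Root mean square error between reference values and model predictions."""
    n = len(reference)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)

def pearson_r(reference, predicted):
    """Pearson correlation coefficient between reference values and predictions."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_p = sum(predicted) / n
    cov = sum((r - mean_r) * (p - mean_p) for r, p in zip(reference, predicted))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_r * var_p)

# Hypothetical reference and predicted pH values for a small prediction set.
ph_reference = [4.10, 4.35, 4.80, 5.20, 5.65]
ph_predicted = [4.18, 4.30, 4.71, 5.29, 5.60]

rmsep = rmse(ph_reference, ph_predicted)
r_p = pearson_r(ph_reference, ph_predicted)
```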
Zhou, Yan; Cao, Hui
2013-01-01
We propose an augmented classical least squares (ACLS) calibration method for quantitative Raman spectral analysis against component information loss. The Raman spectral signals with low analyte concentration correlations were selected and used as the substitutes for unknown quantitative component information during the CLS calibration procedure. The number of selected signals was determined by using the leave-one-out root-mean-square error of cross-validation (RMSECV) curve. An ACLS model was built based on the augmented concentration matrix and the reference spectral signal matrix. The proposed method was compared with partial least squares (PLS) and principal component regression (PCR) using one example: a data set recorded from an experiment of analyte concentration determination using Raman spectroscopy. A 2-fold cross-validation with Venetian blinds strategy was exploited to evaluate the predictive power of the proposed method. The one-way variance analysis (ANOVA) was used to assess the predictive power difference between the proposed method and existing methods. Results indicated that the proposed method is effective at increasing the robust predictive power of traditional CLS model against component information loss and its predictive power is comparable to that of PLS or PCR.
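A Venetian blinds split, as mentioned in this abstract, assigns every k-th sample to the same fold, so each fold samples the whole run of measurements evenly. The sketch below shows the index bookkeeping only; the function name and the sample count are assumptions chosen for illustration.

```python
def venetian_blinds(n_samples, n_folds):
    """Return (train_indices, test_indices) pairs; sample i goes to test fold i % n_folds."""
    splits = []
    for fold in range(n_folds):
        test = [i for i in range(n_samples) if i % n_folds == fold]
        train = [i for i in range(n_samples) if i % n_folds != fold]
        splits.append((train, test))
    return splits

# 2-fold split of six hypothetical samples: even indices form one fold, odd indices the other.
splits = venetian_blinds(6, 2)
```

With two folds, each model is calibrated on one half of the samples and evaluated on the interleaved other half.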
Detection of Genetically Modified Sugarcane by Using Terahertz Spectroscopy and Chemometrics
NASA Astrophysics Data System (ADS)
Liu, J.; Xie, H.; Zha, B.; Ding, W.; Luo, J.; Hu, C.
2018-03-01
A methodology is proposed to identify genetically modified sugarcane from non-genetically modified sugarcane by using terahertz spectroscopy and chemometrics techniques, including linear discriminant analysis (LDA), support vector machine-discriminant analysis (SVM-DA), and partial least squares-discriminant analysis (PLS-DA). The classification rate of the above mentioned methods is compared, and different types of preprocessing are considered. According to the experimental results, the best option is PLS-DA, with an identification rate of 98%. The results indicated that THz spectroscopy and chemometrics techniques are a powerful tool to identify genetically modified and non-genetically modified sugarcane.
Navy Fuel Composition and Screening Tool (FCAST) v2.8
2016-05-10
allowed us to develop partial least squares (PLS) models based on gas chromatography–mass spectrometry (GC-MS) data that predict fuel properties. The...Chemometric property modeling Partial least squares PLS Compositional profiler Naval Air Systems Command Air-4.4.5 Patuxent River Naval Air Station Patuxent...Cumulative predicted residual error sum of squares DiEGME Diethylene glycol monomethyl ether FCAST Fuel Composition and Screening Tool FFP Fit for
Ouyang, Qin; Zhao, Jiewen; Pan, Wenxiu; Chen, Quansheng
2016-01-01
A portable and low-cost spectral analytical system was developed and used to monitor real-time process parameters, i.e. total sugar content (TSC), alcohol content (AC) and pH during rice wine fermentation. Various partial least square (PLS) algorithms were implemented to construct models. The performance of a model was evaluated by the correlation coefficient (Rp) and the root mean square error (RMSEP) in the prediction set. Among the models used, the synergy interval PLS (Si-PLS) was found to be superior. The optimal performance by the Si-PLS model for the TSC was Rp = 0.8694, RMSEP = 0.438; the AC was Rp = 0.8097, RMSEP = 0.617; and the pH was Rp = 0.9039, RMSEP = 0.0805. The stability and reliability of the system, as well as the optimal models, were verified using coefficients of variation, most of which were found to be less than 5%. The results suggest this portable system is a promising tool that could be used as an alternative method for rapid monitoring of process parameters during rice wine fermentation. Copyright © 2015 Elsevier Ltd. All rights reserved.
Li, Juan; Jiang, Yue; Fan, Qi; Chen, Yang; Wu, Ruanqi
2014-05-05
This paper establishes a high-throughput and highly selective method to determine the impurity named oxidized glutathione (GSSG) and radial tensile strength (RTS) of reduced glutathione (GSH) tablets based on near infrared (NIR) spectroscopy and partial least squares (PLS). In order to build and evaluate the calibration models, the NIR diffuse reflectance spectra (DRS) and transmittance spectra (TS) for 330 GSH tablets were accurately measured by using the optimized parameter values. For analyzing GSSG or RTS of GSH tablets, the NIR-DRS or NIR-TS were selected, subdivided reasonably into calibration and prediction sets, and processed appropriately with chemometric techniques. After selecting spectral sub-ranges and neglecting spectrum outliers, the PLS calibration models were built and the factor numbers were optimized. Then, the PLS models were evaluated by the root mean square errors of calibration (RMSEC), cross-validation (RMSECV) and prediction (RMSEP), and by the correlation coefficients of calibration (R(c)) and prediction (R(p)). The results indicate that the proposed models have good performances. It is thus clear that the NIR-PLS can simultaneously, selectively, nondestructively and rapidly analyze the GSSG and RTS of GSH tablets, although the contents of GSSG impurity were quite low while those of GSH active pharmaceutical ingredient (API) were quite high. This strategy can be an important complement to the common NIR methods used in the on-line analysis of API in pharmaceutical preparations. This work also expands the NIR applications in high-throughput and extraordinarily selective analysis. Copyright © 2014 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Peerbhay, Kabir Yunus; Mutanga, Onisimo; Ismail, Riyad
2013-05-01
Discriminating commercial tree species using hyperspectral remote sensing techniques is critical in monitoring the spatial distributions and compositions of commercial forests. However, issues related to data dimensionality and multicollinearity limit the successful application of the technology. The aim of this study was to examine the utility of the partial least squares discriminant analysis (PLS-DA) technique in accurately classifying six exotic commercial forest species (Eucalyptus grandis, Eucalyptus nitens, Eucalyptus smithii, Pinus patula, Pinus elliotii and Acacia mearnsii) using airborne AISA Eagle hyperspectral imagery (393-900 nm). Additionally, the variable importance in the projection (VIP) method was used to identify subsets of bands that could successfully discriminate the forest species. Results indicated that the PLS-DA model that used all the AISA Eagle bands (n = 230) produced an overall accuracy of 80.61% and a kappa value of 0.77, with user's and producer's accuracies ranging from 50% to 100%. In comparison, incorporating the optimal subset of VIP selected wavebands (n = 78) in the PLS-DA model resulted in an improved overall accuracy of 88.78% and a kappa value of 0.87, with user's and producer's accuracies ranging from 70% to 100%. Bands located predominantly within the visible region of the electromagnetic spectrum (393-723 nm) showed the most capability in terms of discriminating between the six commercial forest species. Overall, the research has demonstrated the potential of using PLS-DA for reducing the dimensionality of hyperspectral datasets as well as determining the optimal subset of bands to produce the highest classification accuracies.
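The accuracy measures reported in this abstract are all derived from a confusion matrix. The sketch below uses a made-up two-class matrix, not the study's six-species results, and assumes rows hold the reference classes and columns the predicted classes.

```python
def overall_accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total

def cohen_kappa(cm):
    """Cohen's kappa: agreement corrected for the agreement expected by chance."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    row_totals = [sum(row) for row in cm]
    col_totals = [sum(cm[i][j] for i in range(k)) for j in range(k)]
    p_observed = overall_accuracy(cm)
    p_chance = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (p_observed - p_chance) / (1 - p_chance)

def producers_accuracy(cm):
    """Per class: correct / reference total (rows are reference classes here)."""
    return [cm[i][i] / sum(cm[i]) for i in range(len(cm))]

def users_accuracy(cm):
    """Per class: correct / predicted total (columns are predicted classes here)."""
    k = len(cm)
    return [cm[j][j] / sum(cm[i][j] for i in range(k)) for j in range(k)]

# Hypothetical confusion matrix: rows = reference class, columns = predicted class.
cm = [[45, 5],
      [10, 40]]
accuracy = overall_accuracy(cm)   # 0.85
kappa = cohen_kappa(cm)           # 0.70
```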
Crop/weed discrimination using near-infrared reflectance spectroscopy (NIRS)
NASA Astrophysics Data System (ADS)
Zhang, Yun; He, Yong
2006-09-01
The traditional uniform herbicide application often results in excessive chemical residues on soil, crop plants and agricultural produce, which have imperiled the environment and food security. Near-infrared reflectance spectroscopy (NIRS) offers a promising means for weed detection and site-specific herbicide application. In the laboratory, a total of 90 samples (30 for each species) of the detached leaves of two weeds, i.e., threeseeded mercury (Acalypha australis L.) and fourleafed duckweed (Marsilea quadrifolia L.), and one crop, soybean (Glycine max), were investigated by NIRS over 325-1075 nm using a field spectroradiometer. 20 absorbance samples of each species after pretreatment were exported and the missing Y variables were assigned independent values for partial least squares (PLS) analysis. During the combined principal component analysis (PCA) on 400-1000 nm, PC1 and PC2 could together explain over 91% of the total variance and detect the three plant species with 98.3% accuracy. The full-cross validation results of PLS, i.e., standard error of prediction (SEP) 0.247, correlation coefficient (r) 0.954 and root mean square error of prediction (RMSEP) 0.245, indicated an optimum model for weed identification. By predicting the remaining 10 samples of each species in the PLS model, the results with deviation presented a 100% crop/weed detection rate. Thus, it could be concluded that PLS was an available alternative for qualitative weed discrimination by NIRS.
Nondestructive evaluation of soluble solid content in strawberry by near infrared spectroscopy
NASA Astrophysics Data System (ADS)
Guo, Zhiming; Huang, Wenqian; Chen, Liping; Wang, Xiu; Peng, Yankun
This paper indicates the feasibility of using near infrared (NIR) spectroscopy combined with synergy interval partial least squares (siPLS) algorithms as a rapid nondestructive method to estimate the soluble solid content (SSC) in strawberry. Spectral preprocessing methods were optimized and selected by cross-validation in the model calibration. Partial least squares (PLS) algorithm was conducted on the calibration of the regression model. The performance of the final model was back-evaluated according to root mean square error of calibration (RMSEC) and correlation coefficient (R²c) in the calibration set, and tested by root mean square error of prediction (RMSEP) and correlation coefficient (R²p) in the prediction set. The optimal siPLS model was obtained after first-derivative spectral preprocessing. The measurement results of the best model were as follows: RMSEC = 0.2259, R²c = 0.9590 in the calibration set; and RMSEP = 0.2892, R²p = 0.9390 in the prediction set. This work demonstrated that NIR spectroscopy and siPLS with efficient spectral preprocessing is a useful tool for nondestructive evaluation of SSC in strawberry.
Newman, J; Egan, T; Harbourne, N; O'Riordan, D; Jacquier, J C; O'Sullivan, M
2014-08-01
Sensory evaluation can be problematic for ingredients with a bitter taste during the research and development phase of new food products. In this study, 19 dairy protein hydrolysates (DPH) were analysed by an electronic tongue and by their physicochemical characteristics. The data obtained from these methods were correlated with their bitterness intensity as scored by a trained sensory panel, and each model was also assessed by its predictive capabilities. The physicochemical characteristics of the DPHs investigated were degree of hydrolysis (DH%), and data relating to peptide size and relative hydrophobicity from size exclusion chromatography (SEC) and reverse phase (RP) HPLC. Partial least square regression (PLS) was used to construct the prediction models. All PLS regressions had good correlations (0.78 to 0.93) with the strongest being the combination of data obtained from SEC and RP HPLC. However, the PLS with the strongest predictive power was based on the e-tongue which had the PLS regression with the lowest root mean predicted residual error sum of squares (PRESS) in the study. The results show that the PLS models constructed with the e-tongue and the combination of SEC and RP-HPLC have potential to be used for prediction of bitterness, thus reducing the reliance on sensory analysis in DPHs for future food research. Copyright © 2014 Elsevier B.V. All rights reserved.
Ramírez, J; Górriz, J M; Segovia, F; Chaves, R; Salas-Gonzalez, D; López, M; Alvarez, I; Padilla, P
2010-03-19
This letter shows a computer aided diagnosis (CAD) technique for the early detection of the Alzheimer's disease (AD) by means of single photon emission computed tomography (SPECT) image classification. The proposed method is based on partial least squares (PLS) regression model and a random forest (RF) predictor. The challenge of the curse of dimensionality is addressed by reducing the large dimensionality of the input data by downscaling the SPECT images and extracting score features using PLS. A RF predictor then forms an ensemble of classification and regression tree (CART)-like classifiers being its output determined by a majority vote of the trees in the forest. A baseline principal component analysis (PCA) system is also developed for reference. The experimental results show that the combined PLS-RF system yields a generalization error that converges to a limit when increasing the number of trees in the forest. Thus, the generalization error is reduced when using PLS and depends on the strength of the individual trees in the forest and the correlation between them. Moreover, PLS feature extraction is found to be more effective for extracting discriminative information from the data than PCA yielding peak sensitivity, specificity and accuracy values of 100%, 92.7%, and 96.9%, respectively. Moreover, the proposed CAD system outperformed several other recently developed AD CAD systems. Copyright 2010 Elsevier Ireland Ltd. All rights reserved.
Piccirilli, Gisela N; Escandar, Graciela M
2006-09-01
This paper demonstrates for the first time the power of a chemometric second-order algorithm for predicting, in a simple way and using spectrofluorimetric data, the concentration of analytes in the presence of both the inner-filter effect and unsuspected species. The simultaneous determination of the systemic fungicides carbendazim and thiabendazole was achieved and employed for the discussion of the scopes of the applied second-order chemometric tools: parallel factor analysis (PARAFAC) and partial least-squares with residual bilinearization (PLS/RBL). The chemometric study was performed using fluorescence excitation-emission matrices obtained after the extraction of the analytes over a C18-membrane surface. The ability of PLS/RBL to recognize and overcome the significant changes produced by thiabendazole in both the excitation and emission spectra of carbendazim is demonstrated. The high performance of the selected PLS/RBL method was established with the determination of both pesticides in artificial and real samples.
Wang, Yan-peng; Gong, Qi; Yu, Sheng-rong; Liu, You-yan
2012-04-01
A method for detecting trace impurities in high concentration matrix by ICP-AES based on partial least squares (PLS) was established. The research showed that PLS could effectively correct the interference caused by high level of matrix concentration error and could withstand higher concentrations of matrix than multicomponent spectral fitting (MSF). When the mass ratios of matrix to impurities were from 1 000 : 1 to 20 000 : 1, the recoveries of standard addition were between 95% and 105% by PLS. For the system in which interference effect has nonlinear correlation with the matrix concentrations, the prediction accuracy of normal PLS method was poor, but it can be improved greatly by using LIN-PPLS, which was based on matrix transformation of sample concentration. The contents of Co, Pb and Ga in stream sediment (GBW07312) were detected by MSF, PLS and LIN-PPLS respectively. The results showed that the prediction accuracy of LIN-PPLS was better than PLS, and the prediction accuracy of PLS was better than MSF.
Differences in chewing sounds of dry-crisp snacks by multivariate data analysis
NASA Astrophysics Data System (ADS)
De Belie, N.; Sivertsvik, M.; De Baerdemaeker, J.
2003-09-01
Chewing sounds of different types of dry-crisp snacks (two types of potato chips, prawn crackers, cornflakes and low calorie snacks from extruded starch) were analysed to assess differences in sound emission patterns. The emitted sounds were recorded by a microphone placed over the ear canal. The first bite and the first subsequent chew were selected from the time signal and a fast Fourier transformation provided the power spectra. Different multivariate analysis techniques were used for classification of the snack groups. This included principal component analysis (PCA) and unfold partial least-squares (PLS) algorithms, as well as multi-way techniques such as three-way PLS, three-way PCA (Tucker3), and parallel factor analysis (PARAFAC) on the first bite and subsequent chew. The models were evaluated by calculating the classification errors and the root mean square error of prediction (RMSEP) for independent validation sets. It appeared that the logarithm of the power spectra obtained from the chewing sounds could be used successfully to distinguish the different snack groups. When different chewers were used, recalibration of the models was necessary. Multi-way models distinguished better between chewing sounds of different snack groups than PCA on bite or chew separately and than unfold PLS. From all three-way models applied, N-PLS with three components showed the best classification capabilities, resulting in classification errors of 14-18%. The major amount of incorrect classifications was due to one type of potato chips that had a very irregular shape, resulting in a wide variation of the emitted sounds.
Wang, Yonghua; Li, Yan; Wang, Bin
2007-01-01
Nicotine and a variety of other drugs and toxins are metabolized by cytochrome P450 (CYP) 2A6. The aim of the present study was to build a quantitative structure-activity relationship (QSAR) model to predict the activities of nicotine analogues on CYP2A6. Kernel partial least squares (K-PLS) regression was employed with the electro-topological descriptors to build the computational models. Both the internal and external predictabilities of the models were evaluated with test sets to ensure their validity and reliability. As a comparison to K-PLS, a standard PLS algorithm was also applied on the same training and test sets. Our results show that the K-PLS produced reasonable results that outperformed the PLS model on the datasets. The obtained K-PLS model will be helpful for the design of novel nicotine-like selective CYP2A6 inhibitors.
ERIC Educational Resources Information Center
Huang, Jie-Tsuen; Hsieh, Hui-Hsien
2011-01-01
The purpose of this study was to investigate the contributions of socioeconomic status (SES) in predicting social cognitive career theory (SCCT) factors. Data were collected from 738 college students in Taiwan. The results of the partial least squares (PLS) analyses indicated that SES significantly predicted career decision self-efficacy (CDSE);…
Laser-Induced Breakdown Spectroscopy (LIBS) Measurement of Uranium in Molten Salt.
Williams, Ammon; Phongikaroon, Supathorn
2018-01-01
In this current study, the molten salt aerosol-laser-induced breakdown spectroscopy (LIBS) system was used to measure the uranium (U) content in a ternary UCl3-LiCl-KCl salt to investigate and assess a near real-time analytical approach for material safeguards and accountability. Experiments were conducted using five different U concentrations to determine the analytical figures of merit for the system with respect to U. In the analysis, three U lines were used to develop univariate calibration curves at the 367.01 nm, 385.96 nm, and 387.10 nm lines. The 367.01 nm line had the lowest limit of detection (LOD) of 0.065 wt% U. The 385.96 nm line had the best root mean square error of cross-validation (RMSECV) of 0.20 wt% U. In addition to the univariate calibration approach, a multivariate partial least squares (PLS) model was developed to further analyze the data. Using partial least squares (PLS) modeling, an RMSECV of 0.085 wt% U was determined. The RMSECV from the multivariate approach was significantly better than the univariate case and the PLS model is recommended for future LIBS analysis. Overall, the aerosol-LIBS system performed well in monitoring the U concentration and it is expected that the system could be used to quantitatively determine the U compositions within the normal operational concentrations of U in pyroprocessing molten salts.
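A univariate calibration curve of the kind described here is a straight-line fit of line intensity against concentration. The sketch below is illustrative only: the five concentration and intensity pairs and the blank standard deviation are made up, and the limit of detection uses the common 3σ/slope convention, which is an assumption since the abstract does not state how its LOD was calculated.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def limit_of_detection(blank_sd, slope, k=3.0):
    """LOD = k * (standard deviation of the blank signal) / (calibration slope); k = 3 is assumed."""
    return k * blank_sd / slope

# Hypothetical calibration data: U concentration (wt%) and emission line intensity (a.u.).
concentration = [1.0, 2.0, 3.0, 4.0, 5.0]
intensity = [15.0, 25.0, 35.0, 45.0, 55.0]

slope, intercept = fit_line(concentration, intensity)
lod = limit_of_detection(blank_sd=0.2, slope=slope)
```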
Wang, L; Qin, X C; Lin, H C; Deng, K F; Luo, Y W; Sun, Q R; Du, Q X; Wang, Z Y; Tuo, Y; Sun, J H
2018-02-01
To analyse the relationship between Fourier transform infrared (FTIR) spectrum of rat's spleen tissue and postmortem interval (PMI) for PMI estimation using FTIR spectroscopy combined with data mining method. Rats were sacrificed by cervical dislocation, and the cadavers were placed at 20 ℃. The FTIR spectrum data of rats' spleen tissues were taken and measured at different time points. After pretreatment, the data was analysed by data mining method. The absorption peak intensity of rat's spleen tissue spectrum changed with the PMI, while the absorption peak position was unchanged. The results of principal component analysis (PCA) showed that the cumulative contribution rate of the first three principal components was 96%. There was an obvious clustering tendency for the spectrum sample at each time point. The methods of partial least squares discriminant analysis (PLS-DA) and support vector machine classification (SVMC) effectively divided the spectrum samples with different PMI into four categories (0-24 h, 48-72 h, 96-120 h and 144-168 h). The determination coefficient (R²) of the PMI estimation model established by PLS regression analysis was 0.96, and the root mean square error of calibration (RMSEC) and root mean square error of cross validation (RMSECV) were 9.90 h and 11.39 h respectively. In prediction set, the R² was 0.97, and the root mean square error of prediction (RMSEP) was 10.49 h. The FTIR spectrum of the rat's spleen tissue can be effectively analyzed qualitatively and quantitatively by the combination of FTIR spectroscopy and data mining method, and the classification and PLS regression models can be established for PMI estimation. Copyright© by the Editorial Department of Journal of Forensic Medicine.
Souza, Beatriz C C; De Oliveira, Tiago B; Aquino, Thiago M; de Lima, Maria C A; Pitta, Ivan R; Galdino, Suely L; Lima, Edeltrudes O; Gonçalves-Silva, Teresinha; Militão, Gardênia C G; Scotti, Luciana; Scotti, Marcus T; Mendonça, Francisco J B
2012-06-01
A series of 2-[(arylidene)amino]-cycloalkyl[b]thiophene-3-carbonitriles (2a-x) was synthesized by incorporation of substituted aromatic aldehydes in Gewald adducts (1a-c). The title compounds were screened for their antifungal activity against Candida krusei and Cryptococcus neoformans and for their antiproliferative activity against a panel of 3 human cancer cell lines (HT29, NCI H-292 and HEP). For antiproliferative activity, the partial least squares (PLS) methodology was applied. Some of the prepared compounds exhibited promising antifungal and antiproliferative properties. The most active compounds for antifungal activity were cyclohexyl[b]thiophene derivatives, and for antiproliferative activity cycloheptyl[b]thiophene derivatives, especially 2-[(1H-indol-2-yl-methylidene)amino]-5,6,7,8-tetrahydro-4H-cyclohepta[b]thiophene-3-carbonitrile (2r), which inhibited more than 97% growth of the three cell lines. The PLS discriminant analysis (PLS-DA) applied generated good exploratory and predictive results and showed that the descriptors having shape characteristics were strongly correlated with the biological data.
Quantification of adulterations in extra virgin flaxseed oil using MIR and PLS.
de Souza, Letícia Maria; de Santana, Felipe Bachion; Gontijo, Lucas Caixeta; Mazivila, Sarmento Júnior; Borges Neto, Waldomiro
2015-09-01
This paper proposes a new method for the quantitative analysis of soybean oil (SO) and sunflower oil (SFO) as adulterants in extra virgin flaxseed oil (EFO) by applying Mid Infrared Spectroscopy (MIR) associated with chemometric technique of Partial Least Squares (PLS). The PLS models were built in accordance with standard method ASTM E1655-05 and these showed good correlation between the reference values and those calculated using the PLS models with low error values, with R = 0.998 for SFO and R = 0.999 for SO in EFO. These models were validated analytically in accordance with Brazilian and international guidelines through the estimate of figures of merit parameters, thus showing an effective and feasible method to control the quality of extra virgin flaxseed oil. Copyright © 2015 Elsevier Ltd. All rights reserved.
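For readers unfamiliar with what a PLS calibration computes, the following is a minimal single-response PLS (PLS1) sketch using the NIPALS-style deflation scheme. It is a generic textbook formulation written in plain Python, not the ASTM E1655-05 procedure or the authors' software, and the tiny data set is made up so that the response is an exact linear function of the two predictors.

```python
import math

def pls1_fit(X, y, n_components):
    """Fit PLS1: extract weight vectors, loadings and y-coefficients one component at a time."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centred X, deflated in place
    f = [v - y_mean for v in y]                                # centred y, deflated in place
    W, P, Q = [], [], []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]  # covariance direction
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:          # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        W.append(w); P.append(load); Q.append(q)
    return {"x_mean": x_mean, "y_mean": y_mean, "W": W, "P": P, "Q": Q}

def pls1_predict(model, x):
    """Predict one sample by repeating the same deflation steps on the new spectrum."""
    e = [xj - mj for xj, mj in zip(x, model["x_mean"])]
    y_hat = model["y_mean"]
    for w, load, q in zip(model["W"], model["P"], model["Q"]):
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_hat += q * t
        e = [ej - t * lj for ej, lj in zip(e, load)]
    return y_hat

# Hypothetical data: y = 2*x1 - x2 + 1 exactly.
X = [[1, 2], [2, 1], [3, 5], [4, 3], [5, 6]]
y = [1, 4, 2, 6, 5]
model = pls1_fit(X, y, n_components=2)
prediction = pls1_predict(model, [6, 2])
```

With as many components as there are independent predictors, PLS1 reproduces the ordinary least squares fit; in spectroscopy far fewer components than wavelengths are kept, and that number is usually chosen by cross-validation.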
An improved partial least-squares regression method for Raman spectroscopy
NASA Astrophysics Data System (ADS)
Momenpour Tehran Monfared, Ali; Anis, Hanan
2017-10-01
It is known that the performance of partial least-squares (PLS) regression analysis can be improved using the backward variable selection method (BVSPLS). In this paper, we further improve the BVSPLS based on a novel selection mechanism. The proposed method is based on sorting the weighted regression coefficients, and then the importance of each variable of the sorted list is evaluated using the root mean square error of prediction (RMSEP) criterion in each iteration step. Our Improved BVSPLS (IBVSPLS) method has been applied to leukemia and heparin data sets and led to an improvement in limit of detection of Raman biosensing ranging from 10% to 43% compared to PLS. Our IBVSPLS was also compared to the jack-knifing (simpler) and Genetic Algorithm (more complex) methods. Our method was consistently better than the jack-knifing method and showed either a similar or a better performance compared to the genetic algorithm.
NASA Astrophysics Data System (ADS)
Hemmateenejad, Bahram; Rezaei, Zahra; Khabnadideh, Soghra; Saffari, Maryam
2007-11-01
Carbamazepine (CBZ) undergoes enzyme biotransformation through epoxidation with the formation of its metabolite, carbamazepine-10,11-epoxide (CBZE). A simple chemometrics-assisted spectrophotometric method has been proposed for simultaneous determination of CBZ and CBZE in plasma. A liquid extraction procedure was operated to separate the analytes from plasma, and the UV absorbance spectra of the resultant solutions were subjected to partial least squares (PLS) regression. The optimum number of PLS latent variables was selected according to the PRESS values of leave-one-out cross-validation. An HPLC method was also employed for comparison. The respective mean recoveries for analysis of CBZ and CBZE in synthetic mixtures were 102.57 (±0.25)% and 103.00 (±0.09)% for PLS and 99.40 (±0.15)% and 102.20 (±0.02)% for HPLC. The concentrations of CBZ and CBZE were also determined in five patients using the PLS and HPLC methods. The results showed that the data obtained by PLS were comparable with those obtained by the HPLC method.
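Selecting model complexity by the PRESS of leave-one-out cross-validation can be sketched as follows. To keep the example short, two stand-in candidate models (a constant mean and a straight line) take the place of PLS models with different numbers of latent variables; the helper names and the data are assumptions made for illustration.

```python
def press_loo(x, y, fit, predict):
    """Predicted residual error sum of squares: leave each sample out, refit, predict it."""
    press = 0.0
    for i in range(len(x)):
        model = fit(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(model, x[i])) ** 2
    return press

def fit_mean(x, y):
    return sum(y) / len(y)               # candidate 1: always predict the mean

def predict_mean(model, xi):
    return model

def fit_line(x, y):                      # candidate 2: least squares straight line
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx

def predict_line(model, xi):
    slope, intercept = model
    return slope * xi + intercept

# Hypothetical concentration/response pairs.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

scores = {
    "mean": press_loo(x, y, fit_mean, predict_mean),
    "line": press_loo(x, y, fit_line, predict_line),
}
best = min(scores, key=scores.get)       # candidate with the lowest PRESS
```

In a PLS setting the same loop would be run once per candidate number of latent variables, and the number giving the lowest (or first near-minimal) PRESS would be kept.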
Improved Quantitative Analysis of Ion Mobility Spectrometry by Chemometric Multivariate Calibration
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fraga, Carlos G.; Kerr, Dayle; Atkinson, David A.
2009-09-01
Traditional peak-area calibration and the multivariate calibration methods of principal component regression (PCR) and partial least squares (PLS), including unfolded PLS (U-PLS) and multi-way PLS (N-PLS), were evaluated for the quantification of 2,4,6-trinitrotoluene (TNT) and cyclo-1,3,5-trimethylene-2,4,6-trinitramine (RDX) in Composition B samples analyzed by temperature step desorption ion mobility spectrometry (TSD-IMS). The true TNT and RDX concentrations of eight Composition B samples were determined by high performance liquid chromatography with UV absorbance detection. Most of the Composition B samples were found to have distinct TNT and RDX concentrations. Applying PCR and PLS on the exact same IMS spectra used for the peak-area study improved quantitative accuracy and precision approximately 3 to 5 fold and 2 to 4 fold, respectively. This in turn improved the probability of correctly identifying Composition B samples based upon the estimated RDX and TNT concentrations from 11% with peak area to 44% and 89% with PLS. This improvement increases the potential of obtaining forensic information from IMS analyzers by providing some ability to differentiate or match Composition B samples based on their TNT and RDX concentrations.
NASA Astrophysics Data System (ADS)
Müller, Aline Lima Hermes; Picoloto, Rochele Sogari; Mello, Paola de Azevedo; Ferrão, Marco Flores; dos Santos, Maria de Fátima Pereira; Guimarães, Regina Célia Lourenço; Müller, Edson Irineu; Flores, Erico Marlon Moraes
2012-04-01
Total sulfur concentration was determined in atmospheric residue (AR) and vacuum residue (VR) samples obtained from petroleum distillation process by Fourier transform infrared spectroscopy with attenuated total reflectance (FT-IR/ATR) in association with chemometric methods. Calibration and prediction set consisted of 40 and 20 samples, respectively. Calibration models were developed using two variable selection models: interval partial least squares (iPLS) and synergy interval partial least squares (siPLS). Different treatments and pre-processing steps were also evaluated for the development of models. The pre-treatment based on multiplicative scatter correction (MSC) and the mean centered data were selected for models construction. The use of siPLS as variable selection method provided a model with root mean square error of prediction (RMSEP) values significantly better than those obtained by PLS model using all variables. The best model was obtained using siPLS algorithm with spectra divided in 20 intervals and combinations of 3 intervals (911-824, 823-736 and 737-650 cm-1). This model produced a RMSECV of 400 mg kg-1 S and RMSEP of 420 mg kg-1 S, showing a correlation coefficient of 0.990.
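The search space behind the synergy-interval approach (a spectrum split into 20 intervals, combined 3 at a time) can be enumerated with a short sketch. The spectrum length of 1000 variables and the helper names are hypothetical. Scoring each candidate by cross-validated error is left to whatever PLS routine is in use.

```python
from itertools import combinations

def split_intervals(n_variables, n_intervals):
    """Return (start, stop) index pairs covering the variables as evenly as possible."""
    base, extra = divmod(n_variables, n_intervals)
    bounds, start = [], 0
    for i in range(n_intervals):
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds

def interval_subsets(bounds, k):
    """Yield each combination of k intervals with its flat list of variable indices."""
    for combo in combinations(bounds, k):
        yield combo, [j for start, stop in combo for j in range(start, stop)]

# Hypothetical spectrum of 1000 variables, 20 intervals, combinations of 3.
bounds = split_intervals(n_variables=1000, n_intervals=20)
candidates = list(interval_subsets(bounds, 3))
```

With 20 intervals taken 3 at a time there are 1140 candidate models, which is why the number of intervals and the combination size are usually kept small.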
Miaw, Carolina Sheng Whei; Assis, Camila; Silva, Alessandro Rangel Carolino Sales; Cunha, Maria Luísa; Sena, Marcelo Martins; de Souza, Scheilla Vitorino Carvalho
2018-07-15
Grape, orange, peach and passion fruit nectars were formulated and adulterated by dilution with syrup, apple and cashew juices at 10 levels for each adulterant. Attenuated total reflectance Fourier transform mid infrared (ATR-FTIR) spectra were obtained. Partial least squares (PLS) multivariate calibration models allied to different variable selection methods, such as interval partial least squares (iPLS), ordered predictors selection (OPS) and genetic algorithm (GA), were used to quantify the main fruits. PLS improved by iPLS-OPS variable selection showed the highest predictive capacity to quantify the main fruit contents. The selected variables in the final models varied from 72 to 100; the root mean square errors of prediction were estimated from 0.5 to 2.6%; the correlation coefficients of prediction ranged from 0.948 to 0.990; and, the mean relative errors of prediction varied from 3.0 to 6.7%. All of the developed models were validated. Copyright © 2018 Elsevier Ltd. All rights reserved.
Lim, Jongguk; Kim, Giyoung; Mo, Changyeun; Oh, Kyoungmin; Yoo, Hyeonchae; Ham, Hyeonheui; Kim, Moon S.
2017-01-01
The purpose of this study is to use near-infrared reflectance (NIR) spectroscopy equipment to nondestructively and rapidly discriminate Fusarium-infected hulled barley. Both normal hulled barley and Fusarium-infected hulled barley were scanned by using a NIR spectrometer with a wavelength range of 1175 to 2170 nm. Multiple mathematical pretreatments were applied to the reflectance spectra obtained for Fusarium discrimination and the multivariate analysis method of partial least squares discriminant analysis (PLS-DA) was used for discriminant prediction. The PLS-DA prediction model developed by applying the second-order derivative pretreatment to the reflectance spectra obtained from the side of hulled barley without crease achieved 100% accuracy in discriminating the normal hulled barley and the Fusarium-infected hulled barley. These results demonstrated the feasibility of rapid discrimination of the Fusarium-infected hulled barley by combining multivariate analysis with the NIR spectroscopic technique, which is utilized as a nondestructive detection method. PMID:28974012
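A second-order derivative pretreatment of the kind mentioned above is commonly computed with a Savitzky-Golay filter. The sketch below assumes that choice, which the abstract does not confirm, and uses a window of 5 points with a quadratic fit. These settings, the function name, and the test curve are illustrative only.

```python
import numpy as np

def savgol_second_derivative(spectrum, window=5, polyorder=2, spacing=1.0):
    """Savitzky-Golay second derivative; window must be odd, edges are left as NaN."""
    spectrum = np.asarray(spectrum, dtype=float)
    half = window // 2
    offsets = np.arange(-half, half + 1) * spacing
    out = np.full(len(spectrum), np.nan)
    for i in range(half, len(spectrum) - half):
        # Fit a local polynomial centered on point i.
        coeffs = np.polyfit(offsets, spectrum[i - half:i + half + 1], polyorder)
        # For p(x) = ... + c2*x^2 + c1*x + c0, the second derivative at 0 is 2*c2.
        out[i] = 2.0 * coeffs[-3]
    return out

# Check on a curve with a known, constant second derivative of 1.0.
x = np.arange(20.0)
spectrum = 0.5 * x ** 2 + 3.0 * x + 1.0
d2 = savgol_second_derivative(spectrum)
```

The derivative removes constant and linear baseline terms, which is one reason it is a common pretreatment before PLS-DA on reflectance spectra.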
Partial least squares based identification of Duchenne muscular dystrophy specific genes.
An, Hui-bo; Zheng, Hua-cheng; Zhang, Li; Ma, Lin; Liu, Zheng-yan
2013-11-01
Large-scale parallel gene expression analysis has provided a greater ease for investigating the underlying mechanisms of Duchenne muscular dystrophy (DMD). Previous studies typically implemented variance/regression analysis, which would be fundamentally flawed when unaccounted sources of variability in the arrays existed. Here we aim to identify genes that contribute to the pathology of DMD using partial least squares (PLS) based analysis. We carried out PLS-based analysis with two datasets downloaded from the Gene Expression Omnibus (GEO) database to identify genes contributing to the pathology of DMD. In addition to the genes related to inflammation, muscle regeneration and extracellular matrix (ECM) modeling, we found some genes with high fold change that have not been identified by previous studies, such as SRPX, GPNMB, SAT1, and LYZ. Downregulation of the fatty acid metabolism pathway was also found, which may be related to the progressive muscle wasting process. Our results provide a better understanding of the downstream mechanisms of DMD.
NASA Astrophysics Data System (ADS)
Alaoui, G.; Leger, M.; Gagne, J.; Tremblay, L.
2009-05-01
The goal of this work was to evaluate the capability of infrared reflectance spectroscopy for a fast quantification of the elemental and molecular compositions of sedimentary and particulate organic matter (OM). A partial least-squares (PLS) regression model was used for analysis and values were compared to those obtained by traditional methods (i.e., elemental, humic and HPLC analyses). PLS tools are readily accessible from software such as GRAMS (Thermo-Fisher) used in spectroscopy. This spectroscopic-chemometric approach has several advantages including its rapidity and use of whole unaltered samples. To predict properties, a set of infrared spectra from representative samples must first be fitted to form a PLS calibration model. In this study, a large set (180) of sediments and particles on GFF filters from the St. Lawrence estuarine system were used. These samples are very heterogenous (e.g., various tributaries, terrigenous vs. marine, events such as landslides and floods) and thus represent a challenging test for PLS prediction. For sediments, the infrared spectra were obtained with a diffuse reflectance, or DRIFT, accessory. Sedimentary carbon, nitrogen, humic substance contents as well as humic substance proportions in OM and N:C ratios were predicted by PLS. The relative root mean square error of prediction (%RMSEP) for these properties were between 5.7% (humin content) and 14.1% (total humic substance yield) using the cross-validation, or leave-one out, approach. The %RMSEP calculated by PLS for carbon content was lower with the PLS model (7.6%) than with an external calibration method (11.7%) (Tremblay and Gagné, 2002, Anal. Chem., 74, 2985). Moreover, the PLS approach does not require the extraction of POM needed in external calibration. Results highlighted the importance of using a PLS calibration set representative of the unknown samples (e.g., same area). 
For filtered particles, the infrared spectra were obtained using a novel approach based on attenuated total reflectance, or ATR, allowing the direct analysis of the filters. In addition to carbon and nitrogen contents, amino acid and muramic acid (a bacterial biomarker) yields were predicted using PLS. Calculated %RMSEP varied from 6.4% (total amino acid content) to 18.6% (muramic acid content) with cross-validation. PLS regression modeling does not require a priori knowledge of the spectral bands associated with the properties to be predicted. In turn, the spectral regions that give good PLS predictions provided valuable information on band assignment and geochemical processes. For instance, nitrogen and humin contents were greatly determined by an absorption band caused by aluminosilicate OH group. This supports the idea that OM-clay interactions, important in humin formation and OM preservation, are mediated by nitrogen-containing groups.
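The relative error figures quoted in this abstract can be made concrete with a small worked example. The abstract does not say how %RMSEP is normalized, so the sketch assumes RMSEP divided by the mean reference value. The numbers are invented for illustration.

```python
from math import sqrt

def rmsep(reference, predicted):
    """Root mean square error of prediction."""
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return sqrt(sum(squared) / len(squared))

def relative_rmsep(reference, predicted):
    """RMSEP as a percentage of the mean reference value (assumed normalization)."""
    mean_reference = sum(reference) / len(reference)
    return 100.0 * rmsep(reference, predicted) / mean_reference

# Invented values: every prediction is off by 0.5, and the mean reference is 5.0.
reference = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
percent_error = relative_rmsep(reference, predicted)  # 0.5 / 5.0 -> 10 %
```

Other normalizations, such as dividing by the range of the reference values, give different percentages, so the choice should be stated alongside any reported %RMSEP.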
Chang, Wen-Qi; Zhou, Jian-Liang; Li, Yi; Shi, Zi-Qi; Wang, Li; Yang, Jie; Li, Ping; Liu, Li-Fang; Xin, Gui-Zhong
2017-01-15
The elevation of free fatty acids (FFAs) has been regarded as a universal metabolic signature of excessive adipocyte lipolysis. Nowadays, in vitro lipolysis assay is generally essential for drug screening prior to the animal study. Here, we present a novel in vitro approach for lipolysis measurement combining UHPLC-Orbitrap and partial least squares (PLS) based analysis. Firstly, the calibration matrix was constructed by serial proportions of mixed samples (blended with control and model samples). Then, lipidome profiling was performed by UHPLC-Orbitrap, and 403 variables were extracted and aligned as dataset. Owing to the high resolution of Orbitrap analyzer and open source lipid identification software, 28 FFAs were further screened and identified. Based on the relative intensity of the screened FFAs, PLS regression model was constructed for lipolysis measurement. After leave-one-out cross-validation, ten principal components have been designated to build the final PLS model with excellent performances (RMSECV, 0.0268; RMSEC, 0.0173; R², 0.9977). In addition, the high predictive accuracy (R² = 0.9907 and RMSEP = 0.0345) of the trained PLS model was also demonstrated using test samples. Finally, taking curcumin as a model compound, its antilipolytic effect on palmitic acid-induced lipolysis was successfully predicted as 31.78% by the proposed approach. Besides, supplementary evidences of curcumin induced modification in FFAs compositions as well as lipidome were given by PLS extended methods. Different from general biological assays, high resolution MS-based methods provide more sophisticated information included in biological events. Thus, the novel biological evaluation model proposed here showed promising perspectives for drug evaluation or disease diagnosis. Copyright © 2016 Elsevier B.V. All rights reserved.
Mixture quantification using PLS in plastic scintillation measurements.
Bagán, H; Tarancón, A; Rauret, G; García, J F
2011-06-01
This article reports the capability of plastic scintillation (PS) combined with multivariate calibration (Partial least squares; PLS) to detect and quantify alpha and beta emitters in mixtures. While several attempts have been made with this purpose in mind using liquid scintillation (LS), no attempt was done using PS that has the great advantage of not producing mixed waste after the measurements are performed. Following this objective, ternary mixtures of alpha and beta emitters ((241)Am, (137)Cs and (90)Sr/(90)Y) have been quantified. Procedure optimisation has evaluated the use of the net spectra or the sample spectra, the inclusion of different spectra obtained at different values of the Pulse Shape Analysis parameter and the application of the PLS1 or PLS2 algorithms. The conclusions show that the use of PS+PLS2 applied to the sample spectra, without the use of any pulse shape discrimination, allows quantification of the activities with relative errors less than 10% in most of the cases. This procedure not only allows quantification of mixtures but also reduces measurement time (no blanks are required) and the application of this procedure does not require detectors that include the pulse shape analysis parameter. Copyright © 2011 Elsevier Ltd. All rights reserved.
de Almeida, Valber Elias; de Araújo Gomes, Adriano; de Sousa Fernandes, David Douglas; Goicoechea, Héctor Casimiro; Galvão, Roberto Kawakami Harrop; Araújo, Mario Cesar Ugulino
2018-05-01
This paper proposes a new variable selection method for nonlinear multivariate calibration, combining the Successive Projections Algorithm for interval selection (iSPA) with the Kernel Partial Least Squares (Kernel-PLS) modelling technique. The proposed iSPA-Kernel-PLS algorithm is employed in a case study involving a Vis-NIR spectrometric dataset with complex nonlinear features. The analytical problem consists of determining Brix and sucrose content in samples from a sugar production system, on the basis of transflectance spectra. As compared to full-spectrum Kernel-PLS, the iSPA-Kernel-PLS models involve a smaller number of variables and display statistically significant superiority in terms of accuracy and/or bias in the predictions. Published by Elsevier B.V.
Xu, Yun; Muhamadali, Howbeer; Sayqal, Ali; Dixon, Neil; Goodacre, Royston
2016-10-28
Partial least squares (PLS) is one of the most commonly used supervised modelling approaches for analysing multivariate metabolomics data. PLS is typically employed as either a regression model (PLS-R) or a classification model (PLS-DA). However, in metabolomics studies it is common to investigate multiple, potentially interacting, factors simultaneously following a specific experimental design. Such data often cannot be considered as a "pure" regression or a classification problem. Nevertheless, these data have often still been treated as a regression or classification problem and this could lead to ambiguous results. In this study, we investigated the feasibility of designing a hybrid target matrix Y that better reflects the experimental design than simple regression or binary class membership coding commonly used in PLS modelling. The new design of Y coding was based on the same principle used by structural modelling in machine learning techniques. Two real metabolomics datasets were used as examples to illustrate how the new Y coding can improve the interpretability of the PLS model compared to classic regression/classification coding.
Alarcón, Francis; Báez, María E; Bravo, Manuel; Richter, Pablo; Escandar, Graciela M; Olivieri, Alejandro C; Fuentes, Edwar
2013-01-15
The possibility of simultaneously determining seven concerned heavy polycyclic aromatic hydrocarbons (PAHs) of the US-EPA priority pollutant list, in extra virgin olive and sunflower oils was examined using unfolded partial least-squares with residual bilinearization (U-PLS/RBL) and parallel factor analysis (PARAFAC). Both of these methods were applied to fluorescence excitation emission matrices. The compounds studied were benzo[a]anthracene, benzo[b]fluoranthene, benzo[k]fluoranthene, benzo[a]pyrene, dibenz[a,h]anthracene, benzo[g,h,i]perylene and indeno[1,2,3-c,d]-pyrene. The analysis was performed using fluorescence spectroscopy after a microwave assisted liquid-liquid extraction and solid-phase extraction on silica. The U-PLS/RBL algorithm exhibited the best performance for resolving the heavy PAH mixture in the presence of both the highly complex oil matrix and other unpredicted PAHs of the US-EPA list. The obtained limit of detection for the proposed method ranged from 0.07 to 2 μg kg(-1). The predicted U-PLS/RBL concentrations were satisfactorily compared with those obtained using high-performance liquid chromatography with fluorescence detection. A simple analysis with a considerable reduction in time and solvent consumption in comparison with chromatography are the principal advantages of the proposed method. Copyright © 2012 Elsevier B.V. All rights reserved.
Fadzillah, Nurrulhidayah Ahmad; Man, Yaakob bin Che; Rohman, Abdul; Rosman, Arieff Salleh; Ismail, Amin; Mustafa, Shuhaimi; Khatib, Alfi
2015-01-01
Authenticating food products for the presence of components prohibited by certain religions, such as lard, is very important. In this study, we used proton Nuclear Magnetic Resonance ((1)H-NMR) spectroscopy for the analysis of butter adulterated with lard by simultaneous quantification of all proton bearing compounds, and consequently all relevant sample classes. Since the spectra obtained were too complex to be analyzed visually, the classification of spectra was carried out. The multivariate calibration of partial least square (PLS) regression was used for modelling the relationship between actual value of lard and predicted value. The model yielded the highest regression coefficient (R(2)) of 0.998 and the lowest root mean square error calibration (RMSEC) of 0.0091% and root mean square error prediction (RMSEP) of 0.0090, respectively. Cross validation testing evaluates the predictive power of the model. The PLS model was shown to be a good model, as the intercepts of R(2)Y and Q(2)Y were 0.0853 and -0.309, respectively.
He, Yan-Lin; Xu, Yuan; Geng, Zhi-Qiang; Zhu, Qun-Xiong
2016-03-01
In this paper, a hybrid robust model based on an improved functional link neural network integrating with partial least square (IFLNN-PLS) is proposed. Firstly, an improved functional link neural network with small norm of expanded weights and high input-output correlation (SNEWHIOC-FLNN) was proposed for enhancing the generalization performance of FLNN. Unlike the traditional FLNN, the expanded variables of the original inputs are not directly used as the inputs in the proposed SNEWHIOC-FLNN model. The original inputs are attached to some small norm of expanded weights. As a result, the correlation coefficient between some of the expanded variables and the outputs is enhanced. The larger the correlation coefficient is, the more relevant the expanded variables tend to be. In the end, the expanded variables with larger correlation coefficient are selected as the inputs to improve the performance of the traditional FLNN. In order to test the proposed SNEWHIOC-FLNN model, three UCI (University of California, Irvine) regression datasets named Housing, Concrete Compressive Strength (CCS), and Yacht Hydro Dynamics (YHD) are selected. Then a hybrid model based on the improved FLNN integrating with partial least square (IFLNN-PLS) was built. In IFLNN-PLS model, the connection weights are calculated using the partial least square method but not the error back propagation algorithm. Lastly, IFLNN-PLS was developed as an intelligent measurement model for accurately predicting the key variables in the Purified Terephthalic Acid (PTA) process and the High Density Polyethylene (HDPE) process. Simulation results illustrated that the IFLNN-PLS could significantly improve the prediction performance. Copyright © 2015 ISA. Published by Elsevier Ltd. All rights reserved.
Filgueiras, Paulo R; Terra, Luciana A; Castro, Eustáquio V R; Oliveira, Lize M S L; Dias, Júlio C M; Poppi, Ronei J
2015-09-01
This paper aims to estimate the temperature equivalent to 10% (T10%), 50% (T50%) and 90% (T90%) of distilled volume in crude oils using (1)H NMR and support vector regression (SVR). Confidence intervals for the predicted values were calculated using a boosting-type ensemble method in a procedure called ensemble support vector regression (eSVR). The estimated confidence intervals obtained by eSVR were compared with previously accepted calculations from partial least squares (PLS) models and a boosting-type ensemble applied in the PLS method (ePLS). By using the proposed boosting strategy, it was possible to identify outliers in the T10% property dataset. The eSVR procedure improved the accuracy of the distillation temperature predictions in relation to standard PLS, ePLS and SVR. For T10%, a root mean square error of prediction (RMSEP) of 11.6°C was obtained in comparison with 15.6°C for PLS, 15.1°C for ePLS and 28.4°C for SVR. The RMSEPs for T50% were 24.2°C, 23.4°C, 22.8°C and 14.4°C for PLS, ePLS, SVR and eSVR, respectively. For T90%, the values of RMSEP were 39.0°C, 39.9°C and 39.9°C for PLS, ePLS, SVR and eSVR, respectively. The confidence intervals calculated by the proposed boosting methodology presented acceptable values for the three properties analyzed; however, they were lower than those calculated by the standard methodology for PLS. Copyright © 2015 Elsevier B.V. All rights reserved.
Liu, Xue-Mei; Zhang, Hai-Liang
2014-10-01
Ultraviolet/visible (UV/Vis) spectroscopy was studied for the rapid determination of chemical oxygen demand (COD), an indicator of the concentration of organic matter in aquaculture water. In order to reduce the influence of the absolute noises of the spectra, the extracted 135 absorbance spectra were preprocessed by Savitzky-Golay smoothing (SG), EMD, and wavelet transform (WT) methods. The preprocessed spectra were then used to select latent variables (LVs) by partial least squares (PLS) methods. PLS was used to build models with the full spectra, and back-propagation neural network (BPNN) and least square support vector machine (LS-SVM) were applied to build models with the selected LVs. The overall results showed that BPNN and LS-SVM models performed better than PLS models, and the LS-SVM models with LVs based on WT preprocessed spectra obtained the best results, with the determination coefficient (r2) and RMSE being 0.83 and 14.78 mg · L(-1) for the calibration set, and 0.82 and 14.82 mg · L(-1) for the prediction set, respectively. The results indicated that UV/Vis spectroscopy with LVs obtained by the PLS method, combined with LS-SVM calibration, could be applied to the rapid and accurate determination of COD in aquaculture water. Moreover, this study laid the foundation for further implementation of online analysis of aquaculture water and rapid determination of other water quality parameters.
Wu, Sa; Zhang, Xin; Li, Zhi-Ming; Shi, Yan-Xia; Huang, Jia-Jia; Xia, Yi; Yang, Hang; Jiang, Wen-Qi
2013-01-01
Post-transplant lymphoproliferative disorder (PTLD) is a common complication of therapeutic immunosuppression after organ transplantation. Gene expression profile facilitates the identification of biological difference between Epstein-Barr virus (EBV) positive and negative PTLDs. Previous studies mainly implemented variance/regression analysis without considering unaccounted array specific factors. The aim of this study is to investigate the gene expression difference between EBV positive and negative PTLDs through partial least squares (PLS) based analysis. With a microarray data set from the Gene Expression Omnibus database, we performed PLS based analysis. We acquired 1188 differentially expressed genes. Pathway and Gene Ontology enrichment analysis identified significantly over-representation of dysregulated genes in immune response and cancer related biological processes. Network analysis identified three hub genes with degrees higher than 15, including CREBBP, ATXN1, and PML. Proteins encoded by CREBBP and PML have been reported to be interact with EBV before. Our findings shed light on expression distinction of EBV positive and negative PTLDs with the hope to offer theoretical support for future therapeutic study.
USDA-ARS's Scientific Manuscript database
A technique of using multiple calibration sets in partial least squares regression (PLS) was proposed to improve the quantitative determination of ammonia from open-path Fourier transform infrared spectra. The spectra were measured near animal farms, and the path-integrated concentration of ammonia...
Lascola, Robert; O'Rourke, Patrick E.; Kyser, Edward A.
2017-10-05
Here, we have developed a piecewise local (PL) partial least squares (PLS) analysis method for total plutonium measurements by absorption spectroscopy in nitric acid-based nuclear material processing streams. Instead of using a single PLS model that covers all expected solution conditions, the method selects one of several local models based on an assessment of solution absorbance, acidity, and Pu oxidation state distribution. The local models match the global model for accuracy against the calibration set, but were observed in several instances to be more robust to variations associated with measurements in the process. The improvements are attributed to the relative parsimony of the local models. Not all of the sources of spectral variation are uniformly present at each part of the calibration range. Thus, the global model is locally overfitting and susceptible to increased variance when presented with new samples. A second set of models quantifies the relative concentrations of Pu(III), (IV), and (VI). Standards containing a mixture of these species were not at equilibrium due to a disproportionation reaction. Therefore, a separate principal component analysis is used to estimate the concentrations of the individual oxidation states in these standards in the absence of independent confirmatory analysis. The PL analysis approach is generalizable to other systems where the analysis of chemically complicated systems can be aided by rational division of the overall range of solution conditions into simpler sub-regions.
DOE Office of Scientific and Technical Information (OSTI.GOV)
On-line milk spectrometry: analysis of bovine milk composition
NASA Astrophysics Data System (ADS)
Spitzer, Kyle; Kuennemeyer, Rainer; Woolford, Murray; Claycomb, Rod
2005-04-01
We present partial least squares (PLS) regressions to predict the composition of raw, unhomogenised milk using visible to near infrared spectroscopy. A total of 370 milk samples from individual quarters were collected and analysed on-line by two low cost spectrometers in the wavelength ranges 380-1100 nm and 900-1700 nm. Samples were collected from 22 Friesian, 17 Jersey, 2 Ayrshire and 3 Friesian-Jersey crossbred cows over a period of 7 consecutive days. Transmission spectra were recorded in an inline flowcell through a 0.5 mm thick milk sample. PLS models, where wavelength selection was performed using iterative PLS, were developed for fat, protein, lactose, and somatic cell content. The root mean square error of prediction (and correlation coefficient) for the NIR and visible spectrometers respectively were 0.70%(0.93) and 0.91%(0.91) for fat, 0.65%(0.5) and 0.47%(0.79) for protein, 0.36%(0.49) and 0.45%(0.43) for lactose, and 0.50(0.54) and 0.48(0.51) for log10 somatic cells.
Li, Shuifang; Zhang, Xin; Shan, Yang; Su, Donglin; Ma, Qiang; Wen, Ruizhi; Li, Jiaojuan
2017-03-01
Near-infrared spectroscopy (NIR) was used for qualitative and quantitative detection of honey adulterated with high-fructose corn syrup (HFCS) or maltose syrup (MS). Competitive adaptive reweighted sampling (CARS) was employed to select key variables. Partial least squares linear discriminant analysis (PLS-LDA) was adopted to classify the adulterated honey samples. The CARS-PLS-LDA models showed an accuracy of 86.3% (honey vs. adulterated honey with HFCS) and 96.1% (honey vs. adulterated honey with MS), respectively. PLS regression (PLSR) was used to predict the extent of adulteration in the honeys. The results showed that NIR combined with PLSR could not be used to quantify adulteration with HFCS, but could be used to quantify adulteration with MS: coefficient (Rp²) and root mean square error of prediction (RMSEP) were 0.901 and 4.041 for MS-adulterated samples from different floral origins, and 0.981 and 1.786 for MS-adulterated samples from the same floral origin (Brassica spp.), respectively. Copyright © 2016. Published by Elsevier Ltd.
A Partial Least Squares Based Procedure for Upstream Sequence Classification in Prokaryotes.
Mehmood, Tahir; Bohlin, Jon; Snipen, Lars
2015-01-01
The upstream region of coding genes is important for several reasons, for instance locating transcription factor, binding sites, and start site initiation in genomic DNA. Motivated by a recently conducted study, where a multivariate approach was successfully applied to coding sequence modeling, we have introduced a partial least squares (PLS) based procedure for the classification of true upstream prokaryotic sequence from background upstream sequence. The upstream sequences of conserved coding genes over genomes were considered in the analysis, where conserved coding genes were found by using the pan-genomics concept for each considered prokaryotic species. PLS uses a position specific scoring matrix (PSSM) to study the characteristics of the upstream region. Results obtained by the PLS based method were compared with Gini importance of random forest (RF) and support vector machine (SVM), which is a much-used method for sequence classification. The upstream sequence classification performance was evaluated by using cross validation, and the suggested approach identifies the prokaryotic upstream region significantly better than RF (p-value < 0.01) and SVM (p-value < 0.01). Further, the proposed method also produced results that concurred with known biological characteristics of the upstream region.
NASA Astrophysics Data System (ADS)
Bai, Xue-Mei; Liu, Tie; Liu, De-Long; Wei, Yong-Ju
2018-02-01
A chemometrics-assisted excitation-emission matrix (EEM) fluorescence method was proposed for simultaneous determination of α-asarone and β-asarone in Acorus tatarinowii. Using the strategy of combining EEM data with chemometrics methods, the simultaneous determination of α-asarone and β-asarone in the complex Traditional Chinese medicine system was achieved successfully, even in the presence of unexpected interferents. The physical or chemical separation step was avoided due to the use of "mathematical separation". Six second-order calibration methods were used including parallel factor analysis (PARAFAC), alternating trilinear decomposition (ATLD), alternating penalty trilinear decomposition (APTLD), self-weighted alternating trilinear decomposition (SWATLD), the unfolded partial least-squares (U-PLS) and multidimensional partial least-squares (N-PLS) with residual bilinearization (RBL). In addition, HPLC method was developed to further validate the presented strategy. Consequently, for the validation samples, the analytical results obtained by six second-order calibration methods were almost accurate. But for the Acorus tatarinowii samples, the results indicated a slightly better predictive ability of N-PLS/RBL procedure over other methods.
Tsopelas, Fotios; Konstantopoulos, Dimitris; Kakoulidou, Anna Tsantili
2018-07-26
In the present work, two approaches for the voltammetric fingerprinting of oils and their combination with chemometrics were investigated in order to detect the adulteration of extra virgin olive oil with olive pomace oil as well as the most common seed oils, namely sunflower, soybean and corn oil. In particular, cyclic voltammograms of diluted extra virgin olive oils, regular (pure) olive oils (blends of refined olive oils with virgin olive oils), olive pomace oils and seed oils in the presence of dichloromethane and 0.1 M of LiClO4 in EtOH as electrolyte were recorded at a glassy carbon working electrode. Cyclic voltammetry was also employed in methanolic extracts of olive and seed oils. Datapoints of cyclic voltammograms were exported and submitted to Principal Component Analysis (PCA), Partial Least Square-Discriminant Analysis (PLS-DA) and soft independent modeling of class analogy (SIMCA). In diluted oils, PLS-DA provided a clear discrimination between olive oils (extra virgin and regular) and olive pomace/seed oils, while SIMCA showed a clear discrimination of extra virgin olive oil in regard to all other samples. Using methanolic extracts and considering datapoints recorded between 0.6 and 1.3 V, PLS-DA provided more information, resulting in three clusters (extra virgin olive oils, regular olive oils and seed/olive pomace oils), while SIMCA showed inferior performance. For the quantification of extra virgin olive oil adulteration with olive pomace oil or seed oils, a model based on Partial Least Square (PLS) analysis was developed. Detection limit of adulteration in olive oil was found to be 2% (v/v) and the linearity range up to 33% (v/v). Validation and applicability of all models was proved using a suitable test set. In the case of PLS, synthetic oil mixtures with 4 known adulteration levels in the range of 4-26% were also employed as a blind test set. Copyright © 2018 Elsevier B.V. All rights reserved.
Müller, Aline Lima Hermes; Picoloto, Rochele Sogari; de Azevedo Mello, Paola; Ferrão, Marco Flores; de Fátima Pereira dos Santos, Maria; Guimarães, Regina Célia Lourenço; Müller, Edson Irineu; Flores, Erico Marlon Moraes
2012-04-01
Total sulfur concentration was determined in atmospheric residue (AR) and vacuum residue (VR) samples obtained from petroleum distillation process by Fourier transform infrared spectroscopy with attenuated total reflectance (FT-IR/ATR) in association with chemometric methods. Calibration and prediction sets consisted of 40 and 20 samples, respectively. Calibration models were developed using two variable selection models: interval partial least squares (iPLS) and synergy interval partial least squares (siPLS). Different treatments and pre-processing steps were also evaluated for the development of models. The pre-treatment based on multiplicative scatter correction (MSC) and the mean centered data were selected for models construction. The use of siPLS as variable selection method provided a model with root mean square error of prediction (RMSEP) values significantly better than those obtained by PLS model using all variables. The best model was obtained using siPLS algorithm with spectra divided in 20 intervals and combinations of 3 intervals (911-824, 823-736 and 737-650 cm⁻¹). This model produced a RMSECV of 400 mg kg⁻¹ S and RMSEP of 420 mg kg⁻¹ S, showing a correlation coefficient of 0.990. Copyright © 2011 Elsevier B.V. All rights reserved.
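The interval selection idea in the abstract above (split the spectrum into equal intervals, fit a PLS model per interval or per combination of intervals, keep the one with the lowest RMSECV) can be sketched roughly as follows. This is only an illustrative sketch, not the authors' implementation: the function names `pls1_fit`, `pls1_predict`, `rmsecv` and `interval_pls`, the contiguous k-fold split, and the default of two latent variables are all assumptions made here, and no MSC pre-treatment is included.

```python
from itertools import combinations

import numpy as np


def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model with NIPALS; return (x_mean, y_mean, coef)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        norm = np.linalg.norm(w)
        if norm < 1e-12:  # no covariance left to model
            break
        w = w / norm
        t = Xk @ w                    # scores
        p = Xk.T @ t / (t @ t)        # X loadings
        c = (yk @ t) / (t @ t)        # y loading
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - c * t               # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    if not W:
        return x_mean, y_mean, np.zeros(X.shape[1])
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return x_mean, y_mean, coef


def pls1_predict(model, X):
    x_mean, y_mean, coef = model
    return (X - x_mean) @ coef + y_mean


def rmsecv(X, y, n_components, n_folds=5):
    """Root mean square error of cross validation (contiguous folds, an assumption)."""
    idx = np.arange(len(y))
    sq_err = np.empty(len(y))
    for test in np.array_split(idx, n_folds):
        train = np.setdiff1d(idx, test)
        model = pls1_fit(X[train], y[train], n_components)
        sq_err[test] = (pls1_predict(model, X[test]) - y[test]) ** 2
    return float(np.sqrt(sq_err.mean()))


def interval_pls(X, y, n_intervals, combo_size=1, n_components=2):
    """combo_size=1 behaves like iPLS; combo_size>1 searches interval combinations (siPLS-like)."""
    intervals = np.array_split(np.arange(X.shape[1]), n_intervals)
    best_combo, best_score = None, np.inf
    for combo in combinations(range(n_intervals), combo_size):
        cols = np.concatenate([intervals[i] for i in combo])
        score = rmsecv(X[:, cols], y, min(n_components, len(cols)))
        if score < best_score:
            best_combo, best_score = combo, score
    return best_combo, best_score
```

Under these assumptions, a call such as `interval_pls(X, y, n_intervals=20, combo_size=3)` would mirror the "20 intervals, combinations of 3" search described in the abstract, returning the indices of the selected intervals and their RMSECV.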
NASA Astrophysics Data System (ADS)
Kang, Qian; Ru, Qingguo; Liu, Yan; Xu, Lingyan; Liu, Jia; Wang, Yifei; Zhang, Yewen; Li, Hui; Zhang, Qing; Wu, Qing
2016-01-01
An on-line near infrared (NIR) spectroscopy monitoring method with an appropriate multivariate calibration method was developed for the extraction process of Fu-fang Shuanghua oral solution (FSOS). On-line NIR spectra were collected through two fiber optic probes, which were designed to transmit NIR radiation by a 2 mm flange. Partial least squares (PLS), interval PLS (iPLS) and synergy interval PLS (siPLS) algorithms were used comparatively for building the calibration regression models. During the extraction process, NIR spectroscopy was employed to determine the chlorogenic acid (CA) content, total phenolic acids content (TPC), total flavonoids content (TFC) and soluble solid content (SSC). High performance liquid chromatography (HPLC), ultraviolet spectrophotometric method (UV) and loss on drying methods were employed as reference methods. Experimental results showed that the siPLS model performed best compared with PLS and iPLS. The calibration models for CA, TPC, TFC and SSC had high determination coefficients (R²) (0.9948, 0.9992, 0.9950 and 0.9832) and low root mean square errors of cross validation (RMSECV) (0.0113, 0.0341, 0.1787 and 1.2158), which indicate a good correlation between reference values and NIR predicted values. The overall results show that the on-line detection method could be feasible in real application and would be of great value for monitoring the mixed decoction process of FSOS and other Chinese patent medicines.
Multimodal Classification of Mild Cognitive Impairment Based on Partial Least Squares.
Wang, Pingyue; Chen, Kewei; Yao, Li; Hu, Bin; Wu, Xia; Zhang, Jiacai; Ye, Qing; Guo, Xiaojuan
2016-08-10
In recent years, increasing attention has been given to the identification of the conversion of mild cognitive impairment (MCI) to Alzheimer's disease (AD). Brain neuroimaging techniques have been widely used to support the classification or prediction of MCI. The present study combined magnetic resonance imaging (MRI), 18F-fluorodeoxyglucose PET (FDG-PET), and 18F-florbetapir PET (florbetapir-PET) to discriminate MCI converters (MCI-c, individuals with MCI who convert to AD) from MCI non-converters (MCI-nc, individuals with MCI who have not converted to AD in the follow-up period) based on the partial least squares (PLS) method. Two types of PLS models (informed PLS and agnostic PLS) were built based on 64 MCI-c and 65 MCI-nc from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. The results showed that the three-modality informed PLS model achieved better classification accuracy of 81.40%, sensitivity of 79.69%, and specificity of 83.08% compared with the single-modality model, and the three-modality agnostic PLS model also achieved better classification compared with the two-modality model. Moreover, combining the three modalities with clinical test score (ADAS-cog), the agnostic PLS model (independent data: florbetapir-PET; dependent data: FDG-PET and MRI) achieved optimal accuracy of 86.05%, sensitivity of 81.25%, and specificity of 90.77%. In addition, the comparison of PLS, support vector machine (SVM), and random forest (RF) showed greater diagnostic power of PLS. These results suggested that our multimodal PLS model has the potential to discriminate MCI-c from the MCI-nc and may therefore be helpful in the early diagnosis of AD.
Oliveri, Paolo; López, M Isabel; Casolino, M Chiara; Ruisánchez, Itziar; Callao, M Pilar; Medini, Luca; Lanteri, Silvia
2014-12-03
A new class-modeling method, referred to as partial least squares density modeling (PLS-DM), is presented. The method is based on partial least squares (PLS), using a distance-based sample density measurement as the response variable. Potential function probability density is subsequently calculated on PLS scores and used, jointly with residual Q statistics, to develop efficient class models. The influence of adjustable model parameters on the resulting performances has been critically studied by means of cross-validation and application of the Pareto optimality criterion. The method has been applied to verify the authenticity of olives in brine from cultivar Taggiasca, based on near-infrared (NIR) spectra recorded on homogenized solid samples. Two independent test sets were used for model validation. The final optimal model was characterized by high efficiency and an even balance between sensitivity and specificity values when compared with those obtained by application of well-established class-modeling methods, such as soft independent modeling of class analogy (SIMCA) and unequal dispersed classes (UNEQ). Copyright © 2014 Elsevier B.V. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
De Lucia, Frank C. Jr.; Gottfried, Jennifer L.; Munson, Chase A.
2008-11-01
A technique being evaluated for standoff explosives detection is laser-induced breakdown spectroscopy (LIBS). LIBS is a real-time sensor technology that uses components that can be configured into a ruggedized standoff instrument. The U.S. Army Research Laboratory has been coupling standoff LIBS spectra with chemometrics for several years now in order to discriminate between explosives and nonexplosives. We have investigated the use of partial least squares discriminant analysis (PLS-DA) for explosives detection. We have extended our study of PLS-DA to more complex sample types, including binary mixtures, different types of explosives, and samples not included in the model. We demonstrate the importance of building the PLS-DA model by iteratively testing it against sample test sets. Independent test sets are used to test the robustness of the final model.
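As a rough illustration of the PLS-DA workflow mentioned above (fit on a training set, then check the model against an independent test set), a two-class sketch is given below. It is not the laboratory's actual procedure: the 0/1 class coding, the 0.5 decision threshold, the two latent variables and the function names `plsda_fit` and `plsda_predict` are assumptions made here for illustration.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model with NIPALS; return (x_mean, y_mean, coef)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        norm = np.linalg.norm(w)
        if norm < 1e-12:  # no covariance left to model
            break
        w = w / norm
        t = Xk @ w
        p = Xk.T @ t / (t @ t)
        c = (yk @ t) / (t @ t)
        Xk = Xk - np.outer(t, p)
        yk = yk - c * t
        W.append(w)
        P.append(p)
        q.append(c)
    if not W:
        return x_mean, y_mean, np.zeros(X.shape[1])
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return x_mean, y_mean, W @ np.linalg.solve(P.T @ W, q)


def plsda_fit(X, labels, n_components=2):
    """Two-class PLS-DA: regress a 0/1 dummy response on the spectra."""
    y = np.asarray(labels, dtype=float)  # 1.0 = target class, 0.0 = other class
    return pls1_fit(X, y, n_components)


def plsda_predict(model, X, threshold=0.5):
    """Assign class 1 when the predicted dummy response reaches the threshold."""
    x_mean, y_mean, coef = model
    return ((X - x_mean) @ coef + y_mean >= threshold).astype(int)
```

In this sketch the test spectra are never seen during fitting, so the fraction of correct assignments returned by comparing `plsda_predict(model, X_test)` with the known test labels gives a simple robustness check of the kind the abstract describes.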
Determination of butter adulteration with margarine using Raman spectroscopy.
Uysal, Reyhan Selin; Boyaci, Ismail Hakki; Genis, Hüseyin Efe; Tamer, Ugur
2013-12-15
In this study, adulteration of butter with margarine was analysed using Raman spectroscopy combined with chemometric methods (principal component analysis (PCA), principal component regression (PCR), partial least squares (PLS)) and artificial neural networks (ANNs). Different butter and margarine samples were mixed at various concentrations ranging from 0% to 100% w/w. PCA analysis was applied for the classification of butters, margarines and mixtures. PCR, PLS and ANN were used for the detection of adulteration ratios of butter. Models were created using a calibration data set and developed models were evaluated using a validation data set. The coefficient of determination (R²) values between actual and predicted values obtained for PCR, PLS and ANN for the validation data set were 0.968, 0.987 and 0.978, respectively. In conclusion, a combination of Raman spectroscopy with chemometrics and ANN methods can be applied for testing butter adulteration. Copyright © 2013 Elsevier Ltd. All rights reserved.
Jiang, Hui; Zhang, Hang; Chen, Quansheng; Mei, Congli; Liu, Guohai
2015-01-01
The use of wavelength variable selection before partial least squares discriminant analysis (PLS-DA) for qualitative identification of solid state fermentation degree by FT-NIR spectroscopy technique was investigated in this study. Two wavelength variable selection methods, competitive adaptive reweighted sampling (CARS) and stability competitive adaptive reweighted sampling (SCARS), were employed to select the important wavelengths. PLS-DA was applied to calibrate the identification model using the wavelength variables selected by CARS and SCARS for identification of solid state fermentation degree. Experimental results showed that the numbers of wavelength variables selected by CARS and SCARS were 58 and 47, respectively, from the 1557 original wavelength variables. Compared with the results of full-spectrum PLS-DA, both wavelength variable selection methods could enhance the performance of the identification models. Meanwhile, compared with the CARS-PLS-DA model, the SCARS-PLS-DA model achieved better results, with an identification rate of 91.43% in the validation process. The overall results sufficiently demonstrate that a PLS-DA model constructed using wavelength variables selected by a proper wavelength variable selection method can identify the solid state fermentation degree more accurately. Copyright © 2015 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Shao, Yongni; Xie, Chuanqi; Jiang, Linjun; Shi, Jiahui; Zhu, Jiajin; He, Yong
2015-04-01
Visible/near infrared spectroscopy (Vis/NIR) based on sensitive wavelengths (SWs) and chemometrics was proposed to discriminate different tomatoes bred by spaceflight mutagenesis from their leaves or fruits (green or mature). The tomato breeds were mutant M1, M2 and their parent. Partial least squares (PLS) analysis and least squares-support vector machine (LS-SVM) were implemented for calibration models. PLS analysis was implemented for calibration models with different wavebands including the visible region (400-700 nm) and the near infrared region (700-1000 nm). The best PLS models were achieved in the visible region for the leaf and green fruit samples and in the near infrared region for the mature fruit samples. Furthermore, different latent variables (4-8 LVs for leaves, 5-9 LVs for green fruits, and 4-9 LVs for mature fruits) were used as inputs of LS-SVM to develop the LV-LS-SVM models with the grid search technique and radial basis function (RBF) kernel. The optimal LV-LS-SVM models were achieved with six LVs for the leaf samples, seven LVs for green fruits, and six LVs for mature fruits, respectively, and they outperformed the PLS models. Moreover, independent component analysis (ICA) was executed to select several SWs based on loading weights. The optimal LS-SVM model was achieved with SWs of 550-560 nm, 562-574 nm, 670-680 nm and 705-715 nm for the leaf samples; 548-556 nm, 559-564 nm, 678-685 nm and 962-974 nm for the green fruit samples; and 712-718 nm, 720-729 nm, 968-978 nm and 820-830 nm for the mature fruit samples. All of them had better performance than PLS and LV-LS-SVM, with the parameters of correlation coefficient (rp), root mean square error of prediction (RMSEP) and bias of 0.9792, 0.2632 and 0.0901 based on leaf discrimination, 0.9837, 0.2783 and 0.1758 based on green fruit discrimination, 0.9804, 0.2215 and -0.0035 based on mature fruit discrimination, respectively. The overall results indicated that ICA was an effective way for the selection of SWs, and the Vis/NIR combined with LS-SVM models had the capability to predict the different breeds (mutant M1, mutant M2 and their parent) of tomatoes from leaves and fruits.
Hordge, LaQuana N; McDaniel, Kiara L; Jones, Derick D; Fakayode, Sayo O
2016-05-15
The endocrine disruption property of estrogens necessitates the immediate need for effective monitoring and development of analytical protocols for their analyses in biological and human specimens. This study explores the first combined use of steady-state fluorescence spectroscopy and multivariate partial-least-square (PLS) regression analysis for the simultaneous determination of the concentrations of two estrogens (17α-ethinylestradiol (EE) and norgestimate (NOR)) in bovine serum albumin (BSA) and human serum albumin (HSA) samples. The influence of EE and NOR concentrations and temperature on the emission spectra of EE-HSA, EE-BSA, NOR-HSA, and NOR-BSA complexes was also investigated. The binding of EE with HSA and BSA resulted in an increase in the emission characteristics of HSA and BSA and a significant blue spectral shift. In contrast, the interaction of NOR with HSA and BSA quenched the emission characteristics of HSA and BSA. The observed emission spectral shifts preclude the effective use of traditional univariate regression analysis of fluorescence data for the determination of EE and NOR concentrations in HSA and BSA samples. Multivariate partial-least-squares (PLS) regression analysis was utilized to correlate the changes in emission spectra with EE and NOR concentrations in HSA and BSA samples. The figures-of-merit of the developed PLS regression models were excellent, with limits of detection as low as 1.6×10⁻⁸ M for EE and 2.4×10⁻⁷ M for NOR and good linearity (R² > 0.994985). The PLS models correctly predicted EE and NOR concentrations in independent validation HSA and BSA samples with a root-mean-square-percent-relative-error (RMS%RE) of less than 6.0% at physiological conditions. On the contrary, the use of univariate regression resulted in poor predictions of EE and NOR in HSA and BSA samples, with RMS%RE larger than 40% at physiological conditions. High accuracy, low sensitivity, simplicity, and low cost, with no prior analyte extraction or separation required, make this method a promising, compelling, and attractive alternative for the rapid determination of estrogen concentrations in biomedical and biological specimens, pharmaceuticals, or environmental samples. Published by Elsevier B.V.
Cao, Hui; Yan, Xingyu; Li, Yaojiang; Wang, Yanxia; Zhou, Yan; Yang, Sanchun
2014-01-01
Quantitative analysis of the flue gas of natural gas-fired generators is significant for energy conservation and emission reduction. The traditional partial least squares method may not deal with nonlinear problems effectively. In this paper, a nonlinear partial least squares method with extended input based on a radial basis function neural network (RBFNN) is used for component prediction of flue gas. In the proposed method, the original independent input matrix is the input of the RBFNN, and the outputs of the hidden layer nodes of the RBFNN are the extension term of the original independent input matrix. Then, partial least squares regression is performed on the extended input matrix and the output matrix to establish the component prediction model of flue gas. A near-infrared spectral dataset of flue gas from natural gas combustion is used to estimate the effectiveness of the proposed method compared with PLS. The experimental results show that the root-mean-square errors of prediction of the proposed method for methane, carbon monoxide, and carbon dioxide are reduced by 4.74%, 21.76%, and 5.32%, respectively, compared to those of PLS. Hence, the proposed method has higher predictive capabilities and better robustness.
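The "extended input" construction described above can be sketched as follows: Gaussian hidden-node outputs are computed from the original inputs, appended as extra columns, and an ordinary PLS regression is then fitted on the widened matrix. This is a simplified sketch under stated assumptions, not the paper's algorithm: the abstract does not say how the RBFNN centers and widths are obtained, so here the centers are simply taken from training samples and a single shared `width` is used; the names `rbf_hidden_outputs` and `extend_input` are likewise hypothetical.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model with NIPALS; return (x_mean, y_mean, coef)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        norm = np.linalg.norm(w)
        if norm < 1e-12:  # no covariance left to model
            break
        w = w / norm
        t = Xk @ w
        p = Xk.T @ t / (t @ t)
        c = (yk @ t) / (t @ t)
        Xk = Xk - np.outer(t, p)
        yk = yk - c * t
        W.append(w)
        P.append(p)
        q.append(c)
    if not W:
        return x_mean, y_mean, np.zeros(X.shape[1])
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    return x_mean, y_mean, W @ np.linalg.solve(P.T @ W, q)


def pls1_predict(model, X):
    x_mean, y_mean, coef = model
    return (X - x_mean) @ coef + y_mean


def rbf_hidden_outputs(X, centers, width):
    """Gaussian hidden-node outputs exp(-||x - c||^2 / (2 * width^2)) for every sample and center."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * width ** 2))


def extend_input(X, centers, width):
    """Original inputs followed by the hidden-node outputs (the extension term)."""
    return np.hstack([X, rbf_hidden_outputs(X, centers, width)])
```

With these helpers, `pls1_fit(extend_input(X_train, centers, width), y_train, n_components)` gives the model, and new samples must be passed through `extend_input` with the same centers and width before calling `pls1_predict`. One model per gas component would be fitted in this single-response sketch.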
Dönmez, Ozlem Aksu; Aşçi, Bürge; Bozdoğan, Abdürrezzak; Sungur, Sidika
2011-02-15
A simple and rapid analytical procedure was proposed for the determination of chromatographic peaks by means of partial least squares multivariate calibration (PLS) of high-performance liquid chromatography with diode array detection (HPLC-DAD). The method is exemplified with analysis of quaternary mixtures of potassium guaiacolsulfonate (PG), guaifenesin (GU), diphenhydramine HCl (DP) and carbetapentane citrate (CP) in syrup preparations. In this method, the area does not need to be directly measured and predictions are more accurate. Though the chromatographic and spectral peaks of the analytes were heavily overlapped and interferents coeluted with the compounds studied, good recoveries of analytes could be obtained with HPLC-DAD coupled with PLS calibration. This method was tested by analyzing the synthetic mixture of PG, GU, DP and CP. As a comparison method, a classical HPLC method was used. The proposed methods were applied to syrup samples containing the four drugs and the obtained results were statistically compared with each other. Finally, the main advantages of the HPLC-PLS method over the classical HPLC method are emphasized: a simple mobile phase, shorter analysis time, and no use of an internal standard or gradient elution. Copyright © 2010 Elsevier B.V. All rights reserved.
Statistical variation in progressive scrambling
NASA Astrophysics Data System (ADS)
Clark, Robert D.; Fox, Peter C.
2004-07-01
The two methods most often used to evaluate the robustness and predictivity of partial least squares (PLS) models are cross-validation and response randomization. Both methods may be overly optimistic for data sets that contain redundant observations, however. The kinds of perturbation analysis widely used for evaluating model stability in the context of ordinary least squares regression are only applicable when the descriptors are independent of each other and errors are independent and normally distributed; neither assumption holds for QSAR in general and for PLS in particular. Progressive scrambling is a novel, non-parametric approach to perturbing models in the response space in a way that does not disturb the underlying covariance structure of the data. Here, we introduce adjustments for two of the characteristic values produced by a progressive scrambling analysis, the deprecated predictivity ($Q_s^{*2}$) and standard error of prediction ($\mathrm{SDEP}_s^{*}$), that correct for the effect of introduced perturbation. We also explore the statistical behavior of the adjusted values ($Q_0^{*2}$ and $\mathrm{SDEP}_0^{*}$) and the sensitivity to perturbation ($dq^2/dr_{yy'}^2$). It is shown that the three statistics are all robust for stable PLS models, in terms of the stochastic component of their determination and of their variation due to sampling effects involved in training set selection.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Coble, Jamie; Orton, Christopher; Schwantes, Jon
The Multi-Isotope Process (MIP) Monitor provides an efficient approach to monitoring the process conditions in used nuclear fuel reprocessing facilities to support process verification and validation. The MIP Monitor applies multivariate analysis to gamma spectroscopy of reprocessing streams in order to detect small changes in the gamma spectrum, which may indicate changes in process conditions. This research extends the MIP Monitor by characterizing a used fuel sample after initial dissolution according to the type of reactor of origin (pressurized or boiling water reactor), initial enrichment, burn up, and cooling time. Simulated gamma spectra were used to develop and test three fuel characterization algorithms. The classification and estimation models employed are based on the partial least squares regression (PLS) algorithm. A PLS discriminant analysis model was developed which perfectly classified reactor type. Locally weighted PLS models were fitted on-the-fly to estimate continuous fuel characteristics. Burn up was predicted within 0.1% root mean squared percent error (RMSPE) and both cooling time and initial enrichment within approximately 2% RMSPE. This automated fuel characterization can be used to independently verify operator declarations of used fuel characteristics and inform the MIP Monitor anomaly detection routines at later stages of the fuel reprocessing stream to improve sensitivity to changes in operational parameters and material diversions.
Fakayode, Sayo O; Mitchell, Breanna S; Pollard, David A
2014-08-01
Accurate understanding of analyte boiling points (BP) is of critical importance in gas chromatographic (GC) separation and crude oil refinery operation in petrochemical industries. This study reported the first combined use of GC separation and partial-least-square (PLS1) multivariate regression analysis (MRA) of petrochemical structural activity relationship (SAR) for accurate BP determination of two commercially available (D3710 and MA VHP) calibration gas mix samples. The results of the BP determination using PLS1 multivariate regression were further compared with the results of the traditional simulated distillation method of BP determination. The developed PLS1 regression was able to correctly predict analyte BPs in D3710 and MA VHP calibration gas mix samples, with a root-mean-square-%-relative-error (RMS%RE) of 6.4% and 10.8%, respectively. In contrast, the overall RMS%RE of 32.9% and 40.4%, respectively, obtained for BP determination in D3710 and MA VHP using a traditional simulated distillation method were approximately four times larger than the corresponding RMS%RE of BP prediction using MRA, demonstrating the better predictive ability of MRA. The reported method is rapid, robust, and promising, and can be potentially used routinely for fast analysis, pattern recognition, and analyte BP determination in petrochemical industries. Copyright © 2014 Elsevier B.V. All rights reserved.
Mass Spectrometry and Fourier Transform Infrared Spectroscopy for Analysis of Biological Materials
DOE Office of Scientific and Technical Information (OSTI.GOV)
Anderson, Timothy J.
Time-of-flight mass spectrometry along with statistical analysis was utilized to study metabolic profiles among rats fed resistant starch (RS) diets. Fischer 344 rats were fed four starch diets consisting of 55% (w/w, dbs) starch. A control starch diet consisting of corn starch was compared against three RS diets. The RS diets were high-amylose corn starch (HA7), HA7 chemically modified with octenyl succinic anhydride, and stearic-acid-complexed HA7 starch. A subgroup received antibiotic treatment to determine if perturbations in the gut microbiome were long lasting. A second subgroup was treated with azoxymethane (AOM), a carcinogen. At the end of the eight-week study, cecal and distal-colon contents samples were collected from the sacrificed rats. Metabolites were extracted from cecal and distal colon samples into acetonitrile. The extracts were then analyzed on an accurate-mass time-of-flight mass spectrometer to obtain their metabolic profile. The data were analyzed using partial least-squares discriminant analysis (PLS-DA). The PLS-DA analysis utilized a training set and verification set to classify samples within diet and treatment groups. PLS-DA could reliably differentiate the diet treatments for both cecal and distal colon samples. The PLS-DA analyses of the antibiotic and no antibiotic treated subgroups were well classified for cecal samples and modestly separated for distal-colon samples. PLS-DA analysis had limited success separating distal colon samples for rats given AOM from those not treated; the cecal samples from AOM had very poor classification. Mass spectrometry profiling coupled with PLS-DA can readily classify metabolite differences among rats given RS diets.
Tu, Yu-Kang; Davey Smith, George; Gilthorpe, Mark S.
2011-01-01
Due to a problem of identification, how to estimate the distinct effects of age, time period and cohort has been a controversial issue in the analysis of trends in health outcomes in epidemiology. In this study, we propose a novel approach, partial least squares (PLS) analysis, to separate the effects of age, period, and cohort. Our example for illustration is taken from the Glasgow Alumni cohort. A total of 15,322 students (11,755 men and 3,567 women) received medical screening at the Glasgow University between 1948 and 1968. The aim is to investigate the secular trends in blood pressure over 1925 and 1950 while taking into account the year of examination and age at examination. We excluded students born before 1925 or aged over 25 years at examination and those with missing values in confounders from the analyses, resulting in 12,546 and 12,516 students for analysis of systolic and diastolic blood pressure, respectively. PLS analysis shows that both systolic and diastolic blood pressure increased with students' age, and students born later had on average lower blood pressure (SBP: −0.17 mmHg/per year [95% confidence intervals: −0.19 to −0.15] for men and −0.25 [−0.28 to −0.22] for women; DBP: −0.14 [−0.15 to −0.13] for men; −0.09 [−0.11 to −0.07] for women). PLS also shows a decreasing trend in blood pressure over the examination period. As identification is not a problem for PLS, it provides a flexible modelling strategy for age-period-cohort analysis. More emphasis is then required to clarify the substantive and conceptual issues surrounding the definitions and interpretations of age, period and cohort effects. PMID:21556329
[NIR Assignment of Magnolol by 2D-COS Technology and Model Application Huoxiangzhengqi Oral Liquid].
Pei, Yan-ling; Wu, Zhi-sheng; Shi, Xin-yuan; Pan, Xiao-ning; Peng, Yan-fang; Qiao, Yan-jiang
2015-08-01
Near infrared (NIR) spectroscopy assignment of Magnolol was performed using deuterated chloroform solvent and two-dimensional correlation spectroscopy (2D-COS) technology. According to the synchronous spectra of deuterated chloroform solvent and Magnolol, 1365~1455, 1600~1720, 2000~2181 and 2275~2465 nm were the characteristic absorption bands of Magnolol. In relation to the structure of Magnolol, 1440 nm was the stretching vibration of the phenolic group O-H, 1679 nm was the stretching vibration of aryl and of methyl connected with aryl, 2117, 2304, 2339 and 2370 nm were the combination of the stretching vibration, bending vibration and deformation vibration of aryl C-H, and 2445 nm was the bending vibration of methyl linked with the aryl group; these bands are attributed to the characteristics of Magnolol. Huoxiangzhengqi Oral Liquid was adopted to study Magnolol: the characteristic band from spectral assignment and the bands selected by interval Partial Least Squares (iPLS) and Synergy interval Partial Least Squares (SiPLS) were used to establish Partial Least Squares (PLS) quantitative models. The coefficients of determination R²cal and R²pre were greater than 0.99, and the Root Mean Square Error of Calibration (RMSEC), Root Mean Square Error of Cross Validation (RMSECV) and Root Mean Square Error of Prediction (RMSEP) were very small. This indicated that the characteristic band from spectral assignment gives the same results as the chemometric band selection in the PLS model. It provided a reference for NIR spectral assignment of chemical compositions in Chinese Materia Medica, and the band filters of NIR were interpreted.
Freye, Chris E; Fitz, Brian D; Billingsley, Matthew C; Synovec, Robert E
2016-06-01
The chemical composition and several physical properties of RP-1 fuels were studied using comprehensive two-dimensional (2D) gas chromatography (GC×GC) coupled with flame ionization detection (FID). A "reversed column" GC×GC configuration was implemented with a RTX-wax column on the first dimension (¹D), and a RTX-1 as the second dimension (²D). Modulation was achieved using a high temperature diaphragm valve mounted directly in the oven. Using leave-one-out cross-validation (LOOCV), the summed GC×GC-FID signal of three compound-class selective 2D regions (alkanes, cycloalkanes, and aromatics) was regressed against previously measured ASTM derived values for these compound classes, yielding root mean square errors of cross validation (RMSECV) of 0.855, 0.734, and 0.530 mass%, respectively. For comparison, using partial least squares (PLS) analysis with LOOCV, the GC×GC-FID signal of the entire 2D separations was regressed against the same ASTM values, yielding a linear trend for the three compound classes (alkanes, cycloalkanes, and aromatics), with RMSECV values of 1.52, 2.76, and 0.945 mass%, respectively. Additionally, a more detailed PLS analysis was undertaken of the compound classes (n-alkanes, iso-alkanes, mono-, di-, and tri-cycloalkanes, and aromatics), and of physical properties previously determined by ASTM methods (such as net heat of combustion, hydrogen content, density, kinematic viscosity, sustained boiling temperature and vapor rise temperature). Results from these PLS studies using the relatively simple to use and inexpensive GC×GC-FID instrumental platform are compared to previously reported results using the GC×GC-TOFMS instrumental platform. Copyright © 2016 Elsevier B.V. All rights reserved.
De Girolamo, A; Lippolis, V; Nordkvist, E; Visconti, A
2009-06-01
Fourier transform near-infrared spectroscopy (FT-NIR) was used for rapid and non-invasive analysis of deoxynivalenol (DON) in durum and common wheat. The relevance of using ground wheat samples with a homogeneous particle size distribution to minimize measurement variations and avoid DON segregation among particles of different sizes was established. Calibration models for durum wheat, common wheat and durum + common wheat samples, with particle size <500 microm, were obtained by using partial least squares (PLS) regression with an external validation technique. Values of root mean square error of prediction (RMSEP, 306-379 microg kg(-1)) were comparable and not too far from values of root mean square error of cross-validation (RMSECV, 470-555 microg kg(-1)). Coefficients of determination (r(2)) indicated an "approximate to good" level of prediction of the DON content by FT-NIR spectroscopy in the PLS calibration models (r(2) = 0.71-0.83), and a "good" discrimination between low and high DON contents in the PLS validation models (r(2) = 0.58-0.63). A "limited to good" practical utility of the models was ascertained by range error ratio (RER) values higher than 6. A qualitative model, based on 197 calibration samples, was developed to discriminate between blank and naturally contaminated wheat samples by setting a cut-off at 300 microg kg(-1) DON to separate the two classes. The model correctly classified 69% of the 65 validation samples with most misclassified samples (16 of 20) showing DON contamination levels quite close to the cut-off level. These findings suggest that FT-NIR analysis is suitable for the determination of DON in unprocessed wheat at levels far below the maximum permitted limits set by the European Commission.
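As an illustration of the figures of merit quoted above, the following sketch computes RMSEP and a range error ratio, assuming the common definition RER = (range of reference values) / RMSEP. Some authors use the bias-corrected SEP in the denominator instead. The function names and the data are hypothetical.

```python
from math import sqrt


def rmsep(reference, predicted):
    """Root mean square error of prediction for a validation set."""
    n = len(reference)
    return sqrt(sum((p - r) ** 2 for r, p in zip(reference, predicted)) / n)


def range_error_ratio(reference, predicted):
    """Assumed definition: spread of the reference values divided by RMSEP."""
    return (max(reference) - min(reference)) / rmsep(reference, predicted)


# Hypothetical reference and predicted contents in microg/kg.
ref = [100.0, 500.0, 1200.0, 2500.0, 3000.0]
pred = [300.0, 300.0, 1400.0, 2300.0, 3200.0]

print(rmsep(ref, pred))              # 200.0
print(range_error_ratio(ref, pred))  # 14.5
```

A larger RER means the prediction error is small relative to the concentration range covered by the samples.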
Fadzlillah, Nurrulhidayah Ahmad; Rohman, Abdul; Ismail, Amin; Mustafa, Shuhaimi; Khatib, Alfi
2013-01-01
In the dairy product sector, butter is one of the potential sources of fat-soluble vitamins, namely vitamins A, D, E and K; consequently, butter commands a higher price than other dairy products. This fact has attracted unscrupulous market players to blend butter with other animal fats to gain economic profit. Animal fats like mutton fat (MF) have the potential to be mixed with butter due to the similarity in fatty acid composition. This study focused on the application of FTIR-ATR spectroscopy in conjunction with chemometrics for classification and quantification of MF as an adulterant in butter. The FTIR spectral region of 3910-710 cm⁻¹ was used for classification between butter and butter blended with MF at various concentrations with the aid of discriminant analysis (DA). DA was able to classify butter and adulterated butter without any misclassification. For quantitative analysis, partial least square (PLS) regression was used to develop a calibration model in the frequency region of 3910-710 cm⁻¹. The equation obtained for the relationship between the actual value of MF and the FTIR predicted value of MF in the PLS calibration model was y = 0.998x + 1.033, with a coefficient of determination (R²) of 0.998 and a root mean square error of calibration of 0.046% (v/v). The PLS calibration model was subsequently used for the prediction of independent samples containing butter in binary mixtures with MF. Using 9 principal components, the root mean square error of prediction (RMSEP) was 1.68% (v/v). The results showed that FTIR spectroscopy can be used for the classification and quantification of MF in butter formulations for verification purposes.
NASA Astrophysics Data System (ADS)
Ahmed, Shamim; Miorelli, Roberto; Calmon, Pierre; Anselmi, Nicola; Salucci, Marco
2018-04-01
This paper describes a Learning-By-Examples (LBE) technique for performing quasi real time flaw localization and characterization within a conductive tube based on Eddy Current Testing (ECT) signals. Within the framework of LBE, the combination of full-factorial (i.e., GRID) sampling and Partial Least Squares (PLS) feature extraction (i.e., GRID-PLS) is applied for generating a suitable training set in the offline phase. Support Vector Regression (SVR) is utilized for model development and inversion during the offline and online phases, respectively. The performance and robustness of the proposed GRID-PLS/SVR strategy on a noisy test set are evaluated and compared with the standard GRID/SVR approach.
Yang, Yuan-Gui; Zhang, Ji; Zhao, Yan-Li; Zhang, Jin-Yu; Wang, Yuan-Zhong
2017-07-01
A rapid method was developed and validated by ultra-performance liquid chromatography-triple quadrupole mass spectroscopy with ultraviolet detection (UPLC-UV-MS) for simultaneous determination of paris saponin I, paris saponin II, paris saponin VI and paris saponin VII. Partial least squares discriminant analysis (PLS-DA) based on UPLC and Fourier transform infrared (FT-IR) spectroscopy was employed to evaluate Paris polyphylla var. yunnanensis (PPY) at different harvesting times. Quantitative determination implied that the various contents of bioactive compounds with different harvesting times may lead to different pharmacological effects; the average content of total saponins for PPY harvested at 8 years was higher than that from other samples. The PLS-DA of FT-IR spectra had a better performance than that of UPLC for discrimination of PPY from different harvesting times. Copyright © 2016 John Wiley & Sons, Ltd.
NASA Astrophysics Data System (ADS)
Chen, Quansheng; Qi, Shuai; Li, Huanhuan; Han, Xiaoyan; Ouyang, Qin; Zhao, Jiewen
2014-10-01
To rapidly and efficiently detect the presence of adulterants in honey, the three-dimensional fluorescence spectroscopy (3DFS) technique was employed with the help of multivariate calibration. The 3D fluorescence spectral data were compressed using characteristic extraction and principal component analysis (PCA). Then, partial least squares (PLS) and back propagation neural network (BP-ANN) algorithms were used for modeling. The model was optimized by cross validation, and its performance was evaluated according to the root mean square error of prediction (RMSEP) and correlation coefficient (R) in the prediction set. The results showed that the BP-ANN model was superior to the PLS models, and the optimum prediction results of the mixed group (sunflower + longan + buckwheat + rape) model were achieved as follows: RMSEP = 0.0235 and R = 0.9787 in the prediction set. The study demonstrated that the 3D fluorescence spectroscopy technique combined with multivariate calibration has high potential for rapid, nondestructive, and accurate quantitative analysis of honey adulteration.
NASA Astrophysics Data System (ADS)
Duan, Fajie; Fu, Xiao; Jiang, Jiajia; Huang, Tingting; Ma, Ling; Zhang, Cong
2018-05-01
In this work, an automatic variable selection method for quantitative analysis of soil samples using laser-induced breakdown spectroscopy (LIBS) is proposed, which is based on full spectrum correction (FSC) and modified iterative predictor weighting-partial least squares (mIPW-PLS). The method features automatic selection without manual intervention. To illustrate the feasibility and effectiveness of the method, a comparison with the genetic algorithm (GA) and successive projections algorithm (SPA) for the detection of different elements (copper, barium and chromium) in soil was carried out. The experimental results showed that all three methods could accomplish variable selection effectively, among which FSC-mIPW-PLS required significantly shorter computation time (approximately 12 s for 40,000 initial variables) than the others. Moreover, improved quantification models were obtained with the variable selection approaches. The root mean square errors of prediction (RMSEP) of models utilizing the new method were 27.47 (copper), 37.15 (barium) and 39.70 (chromium) mg/kg, which is comparable prediction performance to GA and SPA.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Anderson, Timothy J.; Jones, Roger W.; Ai, Yongfeng
Time-of-flight mass spectrometry along with statistical analysis was utilized to study metabolic profiles among rats fed resistant starch (RS) diets. Fischer 344 rats were fed four starch diets consisting of 55 % (w/w, dbs) starch. A control starch diet consisting of corn starch was compared against three RS diets. The RS diets were high-amylose corn starch (HA7), HA7 chemically modified with octenyl succinic anhydride, and stearic-acid-complexed HA7 starch. A subgroup received antibiotic treatment to determine if perturbations in the gut microbiome were long lasting. A second subgroup was treated with azoxymethane (AOM), a carcinogen. At the end of the 8-week study, cecal and distal colon content samples were collected from the sacrificed rats. Metabolites were extracted from cecal and distal colon samples into acetonitrile. The extracts were then analyzed on an accurate-mass time-of-flight mass spectrometer to obtain their metabolic profile. The data were analyzed using partial least-squares discriminant analysis (PLS-DA). The PLS-DA analysis utilized a training set and verification set to classify samples within diet and treatment groups. PLS-DA could reliably differentiate the diet treatments for both cecal and distal colon samples. The PLS-DA analyses of the antibiotic and no antibiotic-treated subgroups were well classified for cecal samples and modestly separated for distal colon samples. PLS-DA analysis had limited success separating distal colon samples for rats given AOM from those not treated; the cecal samples from AOM had very poor classification. Mass spectrometry profiling coupled with PLS-DA can readily classify metabolite differences among rats given RS diets.
NASA Astrophysics Data System (ADS)
Hashim, Noor Haslinda Noor; Latip, Jalifah; Khatib, Alfi
2016-11-01
The metabolites of Clinacanthus nutans leaf extracts and their dependence on the drying process were systematically characterized using 1H nuclear magnetic resonance (NMR) spectroscopy and multivariate data analysis. Principal component analysis (PCA) and partial least square-discriminant analysis (PLS-DA) were able to distinguish the leaf extracts obtained from different drying methods. The identified metabolites were carbohydrates, amino acids, flavonoids and sulfur glucoside compounds. The major metabolites responsible for the separation in PLS-DA loading plots were lupeol, cycloclinacosides, betulin, cerebrosides and choline. The results showed that the combination of 1H NMR spectroscopy and multivariate data analysis could act as an efficient technique to understand the C. nutans composition and its variation.
Wu, Yanwei; Guo, Pan; Chen, Siying; Chen, He; Zhang, Yinchao
2017-04-01
Auto-adaptive background subtraction (AABS) is proposed as a denoising method for data processing of the coherent Doppler lidar (CDL). The method is proposed specifically for a low-signal-to-noise-ratio regime, in which the drifting power spectral density of CDL data occurs. Unlike the periodogram maximum (PM) and adaptive iteratively reweighted penalized least squares (airPLS), the proposed method presents reliable peaks and is thus advantageous in identifying peak locations. According to the analysis results of simulated and actually measured data, the proposed method outperforms the airPLS method and the PM algorithm in the furthest detectable range. The proposed method improves the detection range approximately up to 16.7% and 40% when compared to the airPLS method and the PM method, respectively. It also has smaller mean wind velocity and standard error values than the airPLS and PM methods. The AABS approach improves the quality of Doppler shift estimates and can be applied to obtain the whole wind profiling by the CDL.
Lozano, Valeria A; Ibañez, Gabriela A; Olivieri, Alejandro C
2009-10-05
In the presence of analyte-background interactions and a significant background signal, both second-order multivariate calibration and standard addition are required for successful analyte quantitation achieving the second-order advantage. This report discusses a modified second-order standard addition method, in which the test data matrix is subtracted from the standard addition matrices, and quantitation proceeds via the classical external calibration procedure. It is shown that this novel data processing method allows one to apply not only parallel factor analysis (PARAFAC) and multivariate curve resolution-alternating least-squares (MCR-ALS), but also the recently introduced and more flexible partial least-squares (PLS) models coupled to residual bilinearization (RBL). In particular, the multidimensional variant N-PLS/RBL is shown to produce the best analytical results. The comparison is carried out with the aid of a set of simulated data, as well as two experimental data sets: one aimed at the determination of salicylate in human serum in the presence of naproxen as an additional interferent, and the second one devoted to the analysis of danofloxacin in human serum in the presence of salicylate.
2012-01-01
Background Decision-making in healthcare is complex. Research on coverage decision-making has focused on comparative studies for several countries, statistical analyses for single decision-makers, the decision outcome and appraisal criteria. Accounting for decision processes extends the complexity, as they are multidimensional and process elements need to be regarded as latent constructs (composites) that are not observed directly. The objective of this study was to present a practical application of partial least square path modelling (PLS-PM) to evaluate how it offers a method for empirical analysis of decision-making in healthcare. Methods Empirical approaches that applied PLS-PM to decision-making in healthcare were identified through a systematic literature search. PLS-PM was used as an estimation technique for a structural equation model that specified hypotheses between the components of decision processes and the reasonableness of decision-making in terms of medical, economic and other ethical criteria. The model was estimated for a sample of 55 coverage decisions on the extension of newborn screening programmes in Europe. Results were evaluated by standard reliability and validity measures for PLS-PM. Results After modification by dropping two indicators that showed poor measures in the measurement models’ quality assessment and were not meaningful for newborn screening, the structural equation model estimation produced plausible results. The presence of three influences was supported: the links between both stakeholder participation or transparency and the reasonableness of decision-making; and the effect of transparency on the degree of scientific rigour of assessment. Reliable and valid measurement models were obtained to describe the composites of ‘transparency’, ‘participation’, ‘scientific rigour’ and ‘reasonableness’. Conclusions The structural equation model was among the first applications of PLS-PM to coverage decision-making. 
It allowed testing of hypotheses in situations where there are links between several non-observable constructs. PLS-PM was compatible in accounting for the complexity of coverage decisions to obtain a more realistic perspective for empirical analysis. The model specification can be used for hypothesis testing by using larger sample sizes and for data in the full domain of health technologies. PMID:22856325
Fischer, Katharina E
2012-08-02
Decision-making in healthcare is complex. Research on coverage decision-making has focused on comparative studies for several countries, statistical analyses for single decision-makers, the decision outcome and appraisal criteria. Accounting for decision processes extends the complexity, as they are multidimensional and process elements need to be regarded as latent constructs (composites) that are not observed directly. The objective of this study was to present a practical application of partial least square path modelling (PLS-PM) to evaluate how it offers a method for empirical analysis of decision-making in healthcare. Empirical approaches that applied PLS-PM to decision-making in healthcare were identified through a systematic literature search. PLS-PM was used as an estimation technique for a structural equation model that specified hypotheses between the components of decision processes and the reasonableness of decision-making in terms of medical, economic and other ethical criteria. The model was estimated for a sample of 55 coverage decisions on the extension of newborn screening programmes in Europe. Results were evaluated by standard reliability and validity measures for PLS-PM. After modification by dropping two indicators that showed poor measures in the measurement models' quality assessment and were not meaningful for newborn screening, the structural equation model estimation produced plausible results. The presence of three influences was supported: the links between both stakeholder participation or transparency and the reasonableness of decision-making; and the effect of transparency on the degree of scientific rigour of assessment. Reliable and valid measurement models were obtained to describe the composites of 'transparency', 'participation', 'scientific rigour' and 'reasonableness'. The structural equation model was among the first applications of PLS-PM to coverage decision-making. 
It allowed testing of hypotheses in situations where there are links between several non-observable constructs. PLS-PM was compatible in accounting for the complexity of coverage decisions to obtain a more realistic perspective for empirical analysis. The model specification can be used for hypothesis testing by using larger sample sizes and for data in the full domain of health technologies.
Goicoechea, H C; Olivieri, A C
2001-07-01
A newly developed multivariate method involving net analyte preprocessing (NAP) was tested using central composite calibration designs of progressively decreasing size regarding the multivariate simultaneous spectrophotometric determination of three active components (phenylephrine, diphenhydramine and naphazoline) and one excipient (methylparaben) in nasal solutions. Its performance was evaluated and compared with that of partial least-squares (PLS-1). Minimisation of the calibration predicted error sum of squares (PRESS) as a function of a moving spectral window helped to select appropriate working spectral ranges for both methods. The comparison of NAP and PLS results was carried out using two tests: (1) the elliptical joint confidence region for the slope and intercept of a predicted versus actual concentrations plot for a large validation set of samples and (2) the D-optimality criterion concerning the information content of the calibration data matrix. Extensive simulations and experimental validation showed that, unlike PLS, the NAP method is able to furnish highly satisfactory results when the calibration set is reduced from a full four-component central composite to a fractional central composite, as expected from the modelling requirements of net analyte based methods.
Fang, Guihua; Goh, Jing Yeen; Tay, Manjun; Lau, Hiu Fung; Li, Sam Fong Yau
2013-06-01
The correct identification of oils and fats is important to consumers from both commercial and health perspectives. Proton nuclear magnetic resonance ((1)H NMR) spectroscopy, gas chromatography-mass spectrometry (GC/MS) fingerprinting and chemometrics were employed successfully for the quality control of oils and fats. Principal component analysis (PCA) of both techniques showed group clustering of 14 types of oils and fats. Partial least squares discriminant analysis (PLS-DA) and orthogonal projections to latent structures discriminant analysis (OPLS-DA) using GC/MS data had excellent classification sensitivity and specificity compared to models using NMR data. Depending on the availability of the instruments, data from either technique can effectively be applied for the establishment of an oils and fats database to identify unknown samples. Partial least squares (PLS) models were successfully established for the detection of as low as 5% of lard and beef tallow spiked into canola oil, thus illustrating possible applications in Islamic and Jewish countries. Copyright © 2012 Elsevier Ltd. All rights reserved.
Luoma, Pekka; Natschläger, Thomas; Malli, Birgit; Pawliczek, Marcin; Brandstetter, Markus
2018-05-12
A model recalibration method based on additive Partial Least Squares (PLS) regression is generalized for multi-adjustment scenarios of independent variance sources (referred to as additive PLS - aPLS). aPLS allows for effortless model readjustment under changing measurement conditions and the combination of independent variance sources with the initial model by means of additive modelling. We demonstrate these distinguishing features on two NIR spectroscopic case-studies. In case study 1 aPLS was used as a readjustment method for an emerging offset. The achieved RMS error of prediction (1.91 a.u.) was of similar level as before the offset occurred (2.11 a.u.). In case-study 2 a calibration combining different variance sources was conducted. The achieved performance was of sufficient level with an absolute error being better than 0.8% of the mean concentration, therefore being able to compensate negative effects of two independent variance sources. The presented results show the applicability of the aPLS approach. The main advantages of the method are that the original model stays unadjusted and that the modelling is conducted on concrete changes in the spectra thus supporting efficient (in most cases straightforward) modelling. Additionally, the method is put into context of existing machine learning algorithms. Copyright © 2018 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Li, Lin
2008-12-01
Partial least squares (PLS) regressions were applied to lunar highland and mare soil data characterized by the Lunar Soil Characterization Consortium (LSCC) for spectral estimation of the abundance of lunar soil chemical constituents FeO and Al2O3. The LSCC data set was split into a number of subsets including the total highland, Apollo 16, Apollo 14, and total mare soils, and then PLS was applied to each to investigate the effect of nonlinearity on the performance of the PLS method. The weight-loading vectors resulting from PLS were analyzed to identify mineral species responsible for spectral estimation of the soil chemicals. The results from PLS modeling indicate that the PLS performance depends on the correlation of constituents of interest to their major mineral carriers, and the Apollo 16 soils are responsible for the large errors of FeO and Al2O3 estimates when the soils were modeled along with other types of soils. These large errors are attributed primarily to the degraded correlation of FeO to pyroxene for the relatively mature Apollo 16 soils as a result of space weathering, and secondarily to the interference of olivine. PLS consistently yields very accurate fits to the two soil chemicals when applied to mare soils. Although Al2O3 has no spectrally diagnostic characteristics, this chemical can be predicted for all subset data by PLS modeling at high accuracies because of its correlation to FeO. This correlation is reflected in the symmetry of the PLS weight-loading vectors for FeO and Al2O3, which prove to be very useful for qualitative interpretation of the PLS results. However, this qualitative interpretation of PLS modeling cannot be achieved using principal component regression loading vectors.
Fulcher, Yan G.; Fotso, Martial; Chang, Chee-Hoon; Rindt, Hans; Reinero, Carol R.
2016-01-01
Asthma is prevalent in children and cats, and needs means of noninvasive diagnosis. We sought to distinguish noninvasively the differences in 53 cats before and soon after induction of allergic asthma, using NMR spectra of exhaled breath condensate (EBC). Statistical pattern recognition was improved considerably by preprocessing the spectra with probabilistic quotient normalization and glog transformation. Classification of the 106 preprocessed spectra by principal component analysis and partial least squares with discriminant analysis (PLS-DA) appears to be impaired by variances unrelated to eosinophilic asthma. By filtering out confounding variances, orthogonal signal correction (OSC) PLS-DA greatly improved the separation of the healthy and early asthmatic states, attaining 94% specificity and 94% sensitivity in predictions. OSC enhancement of multi-level PLS-DA boosted the specificity of the prediction to 100%. OSC-PLS-DA of the normalized spectra suggest the most promising biomarkers of allergic asthma in cats to include increased acetone, metabolite(s) with overlapped NMR peaks near 5.8 ppm, and a hydroxyphenyl-containing metabolite, as well as decreased phthalate. Acetone is elevated in the EBC of 74% of the cats with early asthma. The noninvasive detection of early experimental asthma, biomarkers in EBC, and metabolic perturbation invite further investigation of the diagnostic potential in humans. PMID:27764146
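The probabilistic quotient normalization step mentioned in the record above can be sketched as follows. This is a generic version of the technique (total-area scaling, a median reference spectrum, then division by the median quotient), not the authors' implementation, whose details the abstract does not give. The function name `pqn` and the toy spectra are hypothetical.

```python
from statistics import median


def pqn(spectra):
    """Probabilistic quotient normalization of a list of spectra
    (each spectrum is a list of non-negative intensities)."""
    # Step 1: scale every spectrum to unit total area.
    area_norm = [[v / sum(s) for v in s] for s in spectra]
    # Step 2: reference spectrum = variable-wise median over all samples.
    n_vars = len(area_norm[0])
    reference = [median(s[j] for s in area_norm) for j in range(n_vars)]
    normalized = []
    for s in area_norm:
        # Step 3: quotient of each variable against the reference
        # (variables with a zero reference are skipped).
        quotients = [s[j] / reference[j] for j in range(n_vars) if reference[j] > 0]
        # Step 4: the median quotient is taken as the dilution factor.
        factor = median(quotients)
        # Step 5: divide the spectrum by its dilution factor.
        normalized.append([v / factor for v in s])
    return normalized


# Hypothetical spectra: the second is a 2x dilution of the first.
spectra = [[1.0, 2.0, 3.0, 4.0], [0.5, 1.0, 1.5, 2.0], [2.0, 4.0, 6.0, 8.0]]
for row in pqn(spectra):
    print(row)
```

Using the median quotient rather than the total area alone makes the estimated dilution factor less sensitive to a few peaks that change strongly between samples.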
Quantification of brain lipids by FTIR spectroscopy and partial least squares regression
NASA Astrophysics Data System (ADS)
Dreissig, Isabell; Machill, Susanne; Salzer, Reiner; Krafft, Christoph
2009-01-01
Brain tissue is characterized by high lipid content. The lipid content decreases and the lipid composition changes during transformation from normal brain tissue to tumors. Therefore, the analysis of brain lipids might complement the existing diagnostic tools to determine the tumor type and tumor grade. The objective of this work is to extract lipids from gray matter and white matter of porcine brain tissue, record infrared (IR) spectra of these extracts and develop a quantification model for the main lipids based on partial least squares (PLS) regression. IR spectra of the pure lipids cholesterol, cholesterol ester, phosphatidic acid, phosphatidylcholine, phosphatidylethanolamine, phosphatidylserine, phosphatidylinositol, sphingomyelin, galactocerebroside and sulfatide were used as references. Two lipid mixtures were prepared for training and validation of the quantification model. The composition of lipid extracts that were predicted by the PLS regression of IR spectra was compared with lipid quantification by thin layer chromatography.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cen Haiyan; Bao Yidan; He Yong
2006-10-10
Visible and near-infrared reflectance (visible-NIR) spectroscopy is applied to discriminate different varieties of bayberry juices. The discrimination of visible-NIR spectra from samples is a matter of pattern recognition. By partial least squares (PLS), the spectrum is reduced to certain factors, which are then taken as the input of the backpropagation neural network (BPNN). Through training and prediction, three different varieties of bayberry juice are classified based on the output of the BPNN. In addition, a mathematical model is built and the algorithm is optimized. With proper parameters in the training set, 100% accuracy is obtained by the BPNN. Thus it is concluded that the PLS analysis combined with the BPNN is an alternative for pattern recognition based on visible and NIR spectroscopy.
Zuo, Yamin; Deng, Xuehua; Wu, Qing
2018-05-04
Discrimination of the geographical origin of Gastrodia elata (G. elata) is of great importance to pharmaceutical companies and consumers in China. This paper focuses on the feasibility of near infrared spectroscopy (NIRS) combined with multivariate analysis as a rapid and non-destructive method for this purpose. Firstly, 16 batches of G. elata samples from four main cultivation regions in China were quantified by a traditional HPLC method. It showed that samples from different origins could not be efficiently differentiated by the contents of the four phenolic compounds in this study. Secondly, the raw near infrared (NIR) spectra of those samples were acquired and two different pattern recognition techniques were used to classify the geographical origins. The results showed that with optimized spectral transformation, discriminant analysis (DA) provided 97% and 99% correct classification for the calibration and validation sets of samples from the four main cultivation regions, and 98% and 99% correct classification for the calibration and validation sets of samples from eight different cities, respectively, in all cases performing better than the principal component analysis (PCA) method. Thirdly, as phenolic compounds content (PCC) is highly related to the quality of G. elata, synergy interval partial least squares (Si-PLS) was applied to build the PCC prediction model. The coefficient of determination for prediction (Rp²) of the Si-PLS model was 0.9209, and the root mean square error for prediction (RMSEP) was 0.338. The two regions (4800-5200 cm⁻¹ and 5600-6000 cm⁻¹) selected by Si-PLS corresponded to the absorptions of the aromatic ring in the basic phenolic structure. It can be concluded that NIR spectroscopy combined with PCA, DA and Si-PLS would be a potential tool to provide a reference for the quality control of G. elata.
2013-01-01
Background Given the serious threats posed to terrestrial ecosystems by industrial contamination, environmental monitoring is a standard procedure used for assessing the current status of an environment or trends in environmental parameters. Measurement of metal concentrations at different trophic levels followed by their statistical analysis using exploratory multivariate methods can provide meaningful information on the status of environmental quality. In this context, the present paper proposes a novel chemometric approach to standard statistical methods by combining Block clustering with Partial least square (PLS) analysis to investigate the accumulation patterns of metals in anthropized terrestrial ecosystems. The present study focused on copper, zinc, manganese, iron, cobalt, cadmium, nickel, and lead transfer along a soil-plant-snail food chain, and the hepatopancreas of the Roman snail (Helix pomatia) was used as a biological end-point of metal accumulation. Results Block clustering delineates between the areas exposed to industrial and vehicular contamination. The toxic metals have similar distributions in the nettle leaves and snail hepatopancreas. PLS analysis showed that (1) zinc and copper concentrations at the lower trophic levels are the most important latent factors that contribute to metal accumulation in land snails; (2) cadmium and lead are the main determinants of the pollution pattern in areas exposed to industrial contamination; (3) at the sites located near roads, lead is the most threatening metal for terrestrial ecosystems. Conclusion There were three major benefits of applying block clustering with PLS for processing the obtained data: firstly, it helped in grouping sites depending on the type of contamination. Secondly, it was valuable for identifying the latent factors that contribute the most to metal accumulation in land snails. 
Finally, it optimized the number and type of data that are best for monitoring the status of metallic contamination in terrestrial ecosystems exposed to different kinds of anthropic pollution. PMID:23987502
NASA Astrophysics Data System (ADS)
Tewari, Jagdish; Strong, Richard; Boulas, Pierre
2017-02-01
This article summarizes the development and validation of a Fourier transform near infrared spectroscopy (FT-NIR) method for the rapid at-line prediction of active pharmaceutical ingredient (API) in a powder blend to optimize small molecule formulations. The method was used to determine the blend uniformity end-point for a pharmaceutical solid dosage formulation containing a range of API concentrations. A set of calibration spectra from samples with concentrations ranging from 1% to 15% of API (w/w) was collected at-line from 4000 to 12,500 cm⁻¹. The ability of the FT-NIR method to predict API concentration in the blend samples was validated against a reference high performance liquid chromatography (HPLC) method. The prediction efficiency of four multivariate data modeling methods, namely partial least-squares 1 (PLS1), partial least-squares 2 (PLS2), principal component regression (PCR) and artificial neural network (ANN), was compared using relevant multivariate figures of merit. The prediction ability of the regression models was cross validated against results generated with the reference HPLC method. PLS1 and ANN showed superior prediction abilities when compared to PLS2 and PCR. Based upon these results and because of its decreased complexity compared to ANN, PLS1 was selected as the best chemometric method to predict blend uniformity at-line. The FT-NIR measurement and the associated chemometric analysis were implemented in the production environment for rapid at-line determination of the end-point of the small molecule blending operation. FIGURE 1: Correlation coefficient vs Rank plot FIGURE 2: FT-NIR spectra of different steps of Blend and final blend FIGURE 3: Prediction ability of PCR FIGURE 4: Blend uniformity prediction ability of PLS2 FIGURE 5: Prediction efficiency of blend uniformity using ANN FIGURE 6: Comparison of prediction efficiency of chemometric models TABLE 1: Order of Addition for Blending Steps
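For readers unfamiliar with PLS1 (one response variable per model, as opposed to PLS2, which models several responses jointly), a minimal single-component sketch is given below. It is an illustration under simplifying assumptions (one latent variable, mean-centering only, no cross-validation) and is not the model used in the study; all names and data are hypothetical.

```python
from math import sqrt


def pls1_one_component(X, y):
    """Fit a one-latent-variable PLS1 model; returns (coefficients, intercept)."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(m)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction in X-space with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(m)]
    norm = sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores: projection of each centered sample onto the weight vector.
    t = [sum(Xc[i][j] * w[j] for j in range(m)) for i in range(n)]
    # Inner regression of y on the scores.
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    # With a single component the regression vector is simply q * w.
    coef = [q * wj for wj in w]
    intercept = y_mean - sum(c * mu for c, mu in zip(coef, x_mean))
    return coef, intercept


def predict(model, row):
    coef, intercept = model
    return intercept + sum(c * v for c, v in zip(coef, row))


# Hypothetical spectra: each row is concentration * pure-component spectrum.
pure = [1.0, 2.0, 3.0]
conc = [1.0, 2.0, 3.0, 4.0]
X = [[c * p for p in pure] for c in conc]
model = pls1_one_component(X, conc)
print(predict(model, [5.0, 10.0, 15.0]))
```

Real calibrations normally need several latent variables, chosen by cross-validation, because measured spectra contain more than one source of variation.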
El Alami El Hassani, Nadia; Tahri, Khalid; Llobet, Eduard; Bouchikhi, Benachir; Errachid, Abdelhamid; Zine, Nadia; El Bari, Nezha
2018-03-15
Moroccan and French honeys from different geographical areas were classified and characterized by applying a voltammetric electronic tongue (VE-tongue) coupled to analytical methods. The studied parameters include color intensity, free, lactonic and total acidity, proteins, phenols, hydroxymethylfurfural content (HMF), sucrose, reducing and total sugars. The geographical classification of different honeys was developed through three pattern recognition techniques: principal component analysis (PCA), support vector machines (SVMs) and hierarchical cluster analysis (HCA). Honey characterization was achieved by partial least squares modeling (PLS). All the PLS models developed were able to accurately estimate the values of the parameters analyzed using the voltammetric experimental data as input (i.e. r>0.9). This confirms the potential of the VE-tongue for rapid characterization of honeys via PLS, with an uncomplicated, cost-effective sample preparation process that does not require additional chemicals. Copyright © 2017 Elsevier Ltd. All rights reserved.
Visible/near-infrared spectroscopy to predict water holding capacity in broiler breast meat
USDA-ARS's Scientific Manuscript database
Visible/Near-infrared spectroscopy (Vis/NIRS) was examined as a tool for rapidly determining water holding capacity (WHC) in broiler breast meat. Both partial least squares (PLS) and principal component analysis (PCA) models were developed to relate Vis/NIRS spectra of 85 broiler breast meat sample...
Organizational Commitment, Knowledge Management Interventions, and Learning Organization Capacity
ERIC Educational Resources Information Center
Massingham, Peter; Diment, Kieren
2009-01-01
Purpose: The purpose of this paper is to examine the relationship between organizational commitment and knowledge management initiatives in developing learning organization capacity (LOC). Design/methodology/approach: This is an empirical study based on a single case study, using partial least squares (PLS) analysis. Findings: The strategic…
Shao, Yongni; Xie, Chuanqi; Jiang, Linjun; Shi, Jiahui; Zhu, Jiajin; He, Yong
2015-04-05
Visible/near infrared spectroscopy (Vis/NIR) based on sensitive wavelengths (SWs) and chemometrics was proposed to discriminate different tomatoes bred by spaceflight mutagenesis from their leaves or fruits (green or mature). The tomato breeds were mutant M1, M2 and their parent. Partial least squares (PLS) analysis and least squares-support vector machine (LS-SVM) were implemented for calibration models. PLS analysis was implemented for calibration models with different wavebands including the visible region (400-700 nm) and the near infrared region (700-1000 nm). The best PLS models were achieved in the visible region for the leaf and green fruit samples and in the near infrared region for the mature fruit samples. Furthermore, different latent variables (4-8 LVs for leaves, 5-9 LVs for green fruits, and 4-9 LVs for mature fruits) were used as inputs of LS-SVM to develop the LV-LS-SVM models with the grid search technique and radial basis function (RBF) kernel. The optimal LV-LS-SVM models were achieved with six LVs for the leaf samples, seven LVs for green fruits, and six LVs for mature fruits, respectively, and they outperformed the PLS models. Moreover, independent component analysis (ICA) was executed to select several SWs based on loading weights. The optimal LS-SVM model was achieved with SWs of 550-560 nm, 562-574 nm, 670-680 nm and 705-715 nm for the leaf samples; 548-556 nm, 559-564 nm, 678-685 nm and 962-974 nm for the green fruit samples; and 712-718 nm, 720-729 nm, 968-978 nm and 820-830 nm for the mature fruit samples. All of them had better performance than PLS and LV-LS-SVM, with the parameters of correlation coefficient (rp), root mean square error of prediction (RMSEP) and bias of 0.9792, 0.2632 and 0.0901 based on leaf discrimination, 0.9837, 0.2783 and 0.1758 based on green fruit discrimination, 0.9804, 0.2215 and -0.0035 based on mature fruit discrimination, respectively. The overall results indicated that ICA was an effective way for the selection of SWs, and the Vis/NIR combined with LS-SVM models had the capability to predict the different breeds (mutant M1, mutant M2 and their parent) of tomatoes from leaves and fruits. Copyright © 2015 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Yan, Wen-juan; Yang, Ming; He, Guo-quan; Qin, Lin; Li, Gang
2014-11-01
To identify diabetic patients using the tongue near-infrared (NIR) spectrum, a spectral classification model of the NIR reflectivity of the tongue tip is proposed, based on the partial least squares (PLS) method. Thirty-nine NIR spectra of the tongue tip were collected from healthy people and from diabetic patients, respectively. After pretreatment of the reflectivity, the spectral data were set as the independent variable matrix and the classification information as the dependent variable matrix. The samples were divided into two groups, 53 samples as the calibration set and 25 as the prediction set, and PLS was used to build the classification model. The model constructed from the 53 samples has a correlation of 0.9614 and a root mean square error of cross-validation (RMSECV) of 0.1387. The predictions for the 25 samples have a correlation of 0.9146 and an RMSECV of 0.2122. The experimental results show that the PLS method can achieve good classification of healthy people and diabetic patients.
de Groot, P J; Swierenga, H; Postma, G J; Melssen, W J; Buydens, L M C
2003-06-01
The combination of Raman and infrared spectroscopy on the one hand and wavelength selection on the other hand is used to improve the partial least-squares (PLS) prediction of seven selected yarn properties. These properties are important for on-line quality control during production. From 71 yarn samples, the Raman and infrared spectra are measured and reference methods are used to determine the selected properties. Making separate PLS models for all yarn properties using the Raman and infrared spectra, prior to wavelength selection, reveals that Raman spectroscopy outperforms infrared spectroscopy. If wavelength selection is applied, the PLS prediction error decreases and the correlation coefficient increases for all properties. However, a substantial wavelength selection effect is present for the infrared spectra compared to the Raman spectra. For the infrared spectra, wavelength selection results in PLS prediction errors comparable with the prediction performance of the Raman spectra prior to wavelength selection. Concatenating the Raman and infrared spectra does not enhance the PLS prediction performance, not even after wavelength selection. It is concluded that an infrared spectrometer, combined with a wavelength selection procedure, can be used if no (suitable) Raman instrument is available.
NASA Astrophysics Data System (ADS)
Yang, Renjie; Dong, Guimei; Sun, Xueshan; Yang, Yanrong; Yu, Yaping; Liu, Haixue; Zhang, Weiyu
2018-02-01
A new approach for quantitative determination of polycyclic aromatic hydrocarbons (PAHs) in the environment was proposed based on two-dimensional (2D) fluorescence correlation spectroscopy in conjunction with a multivariate method. Forty mixture solutions of anthracene and pyrene were prepared in the laboratory. Excitation-emission matrix (EEM) fluorescence spectra of all samples were collected, and 2D fluorescence correlation spectra were calculated under the excitation perturbation. The N-way partial least squares (N-PLS) models were developed based on 2D fluorescence correlation spectra, showing a root mean square error of calibration (RMSEC) of 3.50 μg L⁻¹ and root mean square error of prediction (RMSEP) of 4.42 μg L⁻¹ for anthracene and of 3.61 μg L⁻¹ and 4.29 μg L⁻¹ for pyrene, respectively. N-PLS models were also developed for quantitative analysis of anthracene and pyrene using EEM fluorescence spectra. The RMSEC and RMSEP were 3.97 μg L⁻¹ and 4.63 μg L⁻¹ for anthracene, 4.46 μg L⁻¹ and 4.52 μg L⁻¹ for pyrene, respectively. It was found that the N-PLS model using 2D fluorescence correlation spectra could provide better results compared with EEM fluorescence spectra because of its lower RMSEC and RMSEP. The methodology proposed has the potential to be an alternative method for detection of PAHs in the environment.
NASA Astrophysics Data System (ADS)
Liu, Fei; He, Yong
2008-03-01
Three different chemometric methods were applied to the determination of sugar content of cola soft drinks using visible and near infrared spectroscopy (Vis/NIRS). Four varieties of colas were prepared and 180 samples (45 samples for each variety) were selected for the calibration set, while 60 samples (15 samples for each variety) were used for the validation set. Savitzky-Golay smoothing, standard normal variate (SNV) and Savitzky-Golay first derivative transformation were applied for the pre-processing of spectral data. The first eleven principal components (PCs) extracted by partial least squares (PLS) analysis were employed as the inputs of BP neural network (BPNN) and least squares-support vector machine (LS-SVM) models. Then the BPNN model with the optimal structural parameters and the LS-SVM model with radial basis function (RBF) kernel were applied to build the regression models and compared with PLS regression. The correlation coefficient (r), root mean square error of prediction (RMSEP) and bias for prediction were 0.971, 1.259 and -0.335 for PLS, 0.986, 0.763 and -0.042 for BPNN, and 0.978, 0.995 and -0.227 for LS-SVM, respectively. All three methods supplied a high and satisfying precision. The results indicated that Vis/NIR spectroscopy combined with chemometric methods could be utilized as a high precision way for the determination of sugar content of cola soft drinks.
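The three figures of merit reported in this abstract (correlation coefficient r, RMSEP and bias) can be computed from paired reference and predicted values as sketched below. This is a minimal illustration, not the authors' code: the function name `prediction_metrics` and the sample numbers are hypothetical, and bias is taken here as the mean of predicted minus reference, a sign convention that varies between papers.

```python
import math


def prediction_metrics(reference, predicted):
    """Return (r, RMSEP, bias) for paired reference and predicted values."""
    n = len(reference)
    residuals = [p - r for p, r in zip(predicted, reference)]
    # Root mean square error of prediction.
    rmsep = math.sqrt(sum(e * e for e in residuals) / n)
    # Bias: mean signed error (predicted minus reference, by assumption).
    bias = sum(residuals) / n
    # Pearson correlation coefficient between reference and predicted values.
    mean_ref = sum(reference) / n
    mean_pred = sum(predicted) / n
    cov = sum((r - mean_ref) * (p - mean_pred) for r, p in zip(reference, predicted))
    var_ref = sum((r - mean_ref) ** 2 for r in reference)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    r = cov / math.sqrt(var_ref * var_pred)
    return r, rmsep, bias


# Hypothetical validation values: every prediction is 0.5 too high.
r, rmsep, bias = prediction_metrics([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5])
print(r, rmsep, bias)
```

A constant offset like the one in the toy data leaves r at 1 while RMSEP and bias are both non-zero, which is why the three values are usually reported together.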
Boiret, Mathieu; Meunier, Loïc; Ginot, Yves-Michel
2011-02-20
A near infrared (NIR) method was developed for determination of tablet potency of active pharmaceutical ingredient (API) in a complex coated tablet matrix. The calibration set contained samples from laboratory and production scale batches. The reference values were obtained by high performance liquid chromatography (HPLC) and partial least squares (PLS) regression was used to establish a model. The model was challenged by calculating tablet potency of two external test sets. Root mean square errors of prediction were respectively equal to 2.0% and 2.7%. To use this model with a second spectrometer from the production field, a calibration transfer method called piecewise direct standardisation (PDS) was used. After the transfer, the root mean square error of prediction of the first test set was 2.4% compared to 4.0% without transferring the spectra. A statistical technique using bootstrap of PLS residuals was used to estimate confidence intervals of tablet potency calculations. This method requires an optimised PLS model, selection of the bootstrap number and determination of the risk. In the case of a chemical analysis, the tablet potency value will be included within the confidence interval calculated by the bootstrap method. An easy to use graphical interface was developed to easily determine if the predictions, surrounded by minimum and maximum values, are within the specifications defined by the regulatory organisation. Copyright © 2010 Elsevier B.V. All rights reserved.
Raman microspectroscopy of nucleus and cytoplasm for human colon cancer diagnosis.
Liu, Wenjing; Wang, Hongbo; Du, Jingjing; Jing, Chuanyong
2017-11-15
Subcellular Raman analysis is a promising clinical tool for cancer diagnosis, but is constrained by the difficulty of deciphering subcellular spectra in actual human tissues. We report a label-free subcellular Raman analysis for use in cancer diagnosis that integrates subcellular signature spectra by subtracting cytoplasm from nucleus spectra (Nuc.-Cyt.) with a partial least squares-discriminant analysis (PLS-DA) model. Raman mapping with the classical least-squares (CLS) model allowed direct visualization of the distribution of the cytoplasm and nucleus. The PLS-DA model was employed to evaluate the diagnostic performance of five types of spectral datasets, including non-selective, nucleus, cytoplasm, ratio of nucleus to cytoplasm (Nuc./Cyt.), and nucleus minus cytoplasm (Nuc.-Cyt.), resulting in diagnostic sensitivity of 88.3%, 84.0%, 98.4%, 84.5%, and 98.9%, respectively. Discriminating between normal and cancerous cells of actual human tissues through subcellular Raman markers is feasible, especially when using the nucleus-cytoplasm difference spectra. The subcellular Raman approach had good stability, and had excellent diagnostic performance for rectal as well as colon tissues. The insights gained from this study shed new light on the general applicability of subcellular Raman analysis in clinical trials. Copyright © 2017 Elsevier B.V. All rights reserved.
Rahmania, Halida; Sudjadi; Rohman, Abdul
2015-02-01
For the Indonesian community, meatball is one of the favorite meat food products. For economic gain, beef may be substituted with rat meat because of the price difference between the two. In the present research, the feasibility of FTIR spectroscopy in combination with partial least squares (PLS) multivariate calibration was evaluated for the quantitative analysis of rat meat in binary mixtures with beef in meatball formulations. Meanwhile, principal component analysis (PCA) was used for the classification of rat meat and beef meatballs. Several frequency regions in the mid infrared region were optimized, and finally the frequency region of 750-1000 cm⁻¹ was selected for PLS and PCA modeling. For quantitative analysis, the relationship between actual values (x-axis) and FTIR predicted values (y-axis) of rat meat is described by the equation y = 0.9417x + 2.8410 with a coefficient of determination (R²) of 0.993 and a root mean square error of calibration (RMSEC) of 1.79%. Furthermore, PCA was successfully used for the classification of rat meat meatball and beef meatball.
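The actual-versus-predicted line and its coefficient of determination quoted in this abstract are the kind of summary produced by an ordinary least-squares fit. The sketch below shows one way to compute slope, intercept and R² in plain Python; the function name `fit_line` and the numbers are hypothetical and are not the study's data.

```python
def fit_line(x, y):
    """Least-squares line y = slope*x + intercept and its R^2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares) / (total sum of squares).
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical actual (x) and predicted (y) adulterant levels, in percent.
actual = [0.0, 10.0, 20.0, 30.0, 40.0]
predicted = [2.5, 12.0, 22.5, 31.0, 40.5]
print(fit_line(actual, predicted))
```

A slope near 1 and an intercept near 0 indicate that predictions track the reference values without proportional or constant error; R² alone does not show this.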
NASA Astrophysics Data System (ADS)
Attia, Khalid A. M.; Nassar, Mohammed W. I.; El-Zeiny, Mohamed B.; Serag, Ahmed
2016-03-01
Different chemometric models were applied for the quantitative analysis of amoxicillin (AMX), and flucloxacillin (FLX) in their binary mixtures, namely, partial least squares (PLS), spectral residual augmented classical least squares (SRACLS), concentration residual augmented classical least squares (CRACLS) and artificial neural networks (ANNs). All methods were applied with and without variable selection procedure (genetic algorithm GA). The methods were used for the quantitative analysis of the drugs in laboratory prepared mixtures and real market sample via handling the UV spectral data. Robust and simpler models were obtained by applying GA. The proposed methods were found to be rapid, simple and required no preliminary separation steps.
Determination of total phenolic compounds in compost by infrared spectroscopy.
Cascant, M M; Sisouane, M; Tahiri, S; Krati, M El; Cervera, M L; Garrigues, S; de la Guardia, M
2016-06-01
Middle and near infrared (MIR and NIR) spectroscopy were applied to determine the total phenolic compounds (TPC) content in compost samples based on models built by using partial least squares (PLS) regression. Multiplicative scatter correction, standard normal variate and first derivative were employed as spectra pretreatments, and the number of latent variables was optimized by leave-one-out cross-validation. The performance of the PLS-ATR-MIR and PLS-DR-NIR models was evaluated according to the root mean square error of cross validation and prediction (RMSECV and RMSEP), the coefficient of determination for prediction (R²pred) and the residual predictive deviation (RPD); RPD values of 5.83 and 8.26 were obtained for MIR and NIR, respectively. Copyright © 2016 Elsevier B.V. All rights reserved.
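Standard normal variate (SNV), one of the pretreatments listed in this abstract, rescales each spectrum by its own mean and standard deviation. The following is a minimal sketch under the usual definition (sample standard deviation with n - 1); the function name `snv` and the toy spectra are assumptions for illustration.

```python
import math


def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own statistics."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    # Sample standard deviation across the wavelengths of this single spectrum.
    sd = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / sd for v in spectrum]


# Two hypothetical spectra that differ only by a multiplicative scatter factor.
print(snv([1.0, 2.0, 3.0]))
print(snv([2.0, 4.0, 6.0]))
```

Both toy spectra map to the same transformed values, which is the point of SNV: additive offsets and multiplicative scaling between samples are removed before the calibration model is built.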
Niazi, Ali; Zolgharnein, Javad; Afiuni-Zadeh, Somaie
2007-11-01
Ternary mixtures of thiamin, riboflavin and pyridoxal have been simultaneously determined in synthetic and real samples by applying spectrophotometry and least-squares support vector machines (LS-SVM). The calibration graphs were linear in the ranges of 1.0-20.0, 1.0-10.0 and 1.0-20.0 μg ml⁻¹ with detection limits of 0.6, 0.5 and 0.7 μg ml⁻¹ for thiamin, riboflavin and pyridoxal, respectively. The experimental calibration matrix was designed with 21 mixtures of these chemicals. The concentrations were varied within the calibration ranges of the vitamins. The simultaneous determination of these vitamin mixtures by spectrophotometric methods is a difficult problem because of spectral interferences. Partial least squares (PLS) modeling and LS-SVM were used for the multivariate calibration of the spectrophotometric data. An excellent model was built using LS-SVM, with low prediction errors and superior performance in relation to PLS. The root mean square errors of prediction (RMSEP) for thiamin, riboflavin and pyridoxal were 0.6926, 0.3755 and 0.4322 with PLS and 0.0421, 0.0318 and 0.0457 with LS-SVM, respectively. The proposed method was satisfactorily applied to the rapid simultaneous determination of thiamin, riboflavin and pyridoxal in commercial pharmaceutical preparations and human plasma samples.
Fast classification of hazelnut cultivars through portable infrared spectroscopy and chemometrics
NASA Astrophysics Data System (ADS)
Manfredi, Marcello; Robotti, Elisa; Quasso, Fabio; Mazzucco, Eleonora; Calabrese, Giorgio; Marengo, Emilio
2018-01-01
The authentication and traceability of hazelnuts is very important for both the consumer and the food industry, to safeguard the protected varieties and the food quality. This study investigates the use of a portable FTIR spectrometer coupled to multivariate statistical analysis for the classification of raw hazelnuts. The method discriminates hazelnuts from different origins/cultivars based on differences of the signal intensities of their IR spectra. The multivariate classification methods, namely principal component analysis (PCA) followed by linear discriminant analysis (LDA) and partial least square discriminant analysis (PLS-DA), with or without variable selection, allowed a very good discrimination among the groups, with PLS-DA coupled to variable selection providing the best results. Due to the fast analysis, high sensitivity, simplicity and no sample preparation, the proposed analytical methodology could be successfully used to verify the cultivar of hazelnuts, and the analysis can be performed quickly and directly on site.
Linear and nonlinear methods in modeling the aqueous solubility of organic compounds.
Catana, Cornel; Gao, Hua; Orrenius, Christian; Stouten, Pieter F W
2005-01-01
Solubility data for 930 diverse compounds have been analyzed using linear Partial Least Square (PLS) and nonlinear PLS methods, Continuum Regression (CR), and Neural Networks (NN). 1D and 2D descriptors from the MOE package in combination with E-state or ISIS keys have been used. The best model was obtained using linear PLS for a combination of 22 MOE descriptors and 65 ISIS keys. It has a correlation coefficient (r2) of 0.935 and a root-mean-square error (RMSE) of 0.468 log molar solubility (log S(w)). The model validated on a test set of 177 compounds not included in the training set has r2 0.911 and RMSE 0.475 log S(w). The descriptors were ranked according to their importance, and the 22 MOE descriptors were found at the top of the list. The CR model produced results as good as PLS, and because of the way in which cross-validation has been done it is expected to be a valuable tool in prediction besides the PLS model. The statistics obtained using nonlinear methods did not surpass those obtained with linear ones. The good statistics obtained for linear PLS and CR recommend these models for prediction when it is difficult or impossible to make experimental measurements, for virtual screening, combinatorial library design, and efficient lead optimization.
Paradowska, Katarzyna; Jamróz, Marta Katarzyna; Kobyłka, Mariola; Gowin, Ewelina; Maczka, Paulina; Skibiński, Robert; Komsta, Łukasz
2012-01-01
This paper presents a preliminary study in building discriminant models from solid-state NMR spectrometry data to detect the presence of acetaminophen in over-the-counter pharmaceutical formulations. The dataset, containing 11 spectra of pure substances and 21 spectra of various formulations, was processed by partial least squares discriminant analysis (PLS-DA). The model found coped with the discrimination, and its quality parameters were acceptable. It was found that standard normal variate preprocessing had almost no influence on unsupervised investigation of the dataset. The influence of variable selection with the uninformative variable elimination by PLS method was studied, reducing the dataset from 7601 variables to around 300 informative variables, but not improving the model performance. The results showed the possibility to construct well-working PLS-DA models from such small datasets without a full experimental design.
NASA Astrophysics Data System (ADS)
Mi, Jiaping; Li, Yuanqian; Zhou, Xiaoli; Zheng, Bo; Zhou, Ying
2006-01-01
A flow injection-CCD diode array detection spectrophotometric method with a partial least squares (PLS) program for simultaneous determination of iron, copper and cobalt in food samples has been established. The method was based on the chromogenic reaction of the three metal ions with 2-(5-bromo-2-pyridylazo)-5-diethylaminophenol (5-Br-PADAP) in acetic acid-sodium acetate buffer solution (pH 5) with Triton X-100 and ascorbic acid. The overlapped spectra of the colored complexes were collected by a charge-coupled device (CCD) diode array detector and the multi-wavelength absorbance data were processed using the PLS algorithm. Optimum reaction conditions and parameters of flow injection analysis were investigated. Samples of tea, sesame, laver, millet, cornmeal, mung bean and soybean powder were analyzed by the proposed method. The average recoveries of spiked samples were 91.80%~100.9% for iron, 92.50%~108.0% for copper and 93.00%~110.5% for cobalt, respectively, with relative standard deviation (R.S.D) of 1.1%~12.1%. The sampling rate is 45 samples h⁻¹. The determination results of the food samples were in good agreement between the proposed method and ICP-AES.
Zhang, Chu; Liu, Fei; Kong, Wenwen; He, Yong
2015-01-01
Visible and near-infrared hyperspectral imaging covering spectral range of 380–1030 nm as a rapid and non-destructive method was applied to estimate the soluble protein content of oilseed rape leaves. Average spectrum (500–900 nm) of the region of interest (ROI) of each sample was extracted, and four samples out of 128 samples were defined as outliers by Monte Carlo-partial least squares (MCPLS). Partial least squares (PLS) model using full spectra obtained dependable performance with the correlation coefficient (rp) of 0.9441, root mean square error of prediction (RMSEP) of 0.1658 mg/g and residual prediction deviation (RPD) of 2.98. The weighted regression coefficient (Bw), successive projections algorithm (SPA) and genetic algorithm-partial least squares (GAPLS) selected 18, 15, and 16 sensitive wavelengths, respectively. SPA-PLS model obtained the best performance with rp of 0.9554, RMSEP of 0.1538 mg/g and RPD of 3.25. Distribution of protein content within the rape leaves were visualized and mapped on the basis of the SPA-PLS model. The overall results indicated that hyperspectral imaging could be used to determine and visualize the soluble protein content of rape leaves. PMID:26184198
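The residual prediction deviation (RPD) quoted in this abstract is commonly defined as the standard deviation of the reference values divided by the RMSEP. The sketch below assumes that definition, with the sample standard deviation (n - 1) and an RMSEP without bias correction; conventions differ between papers, and the function name `rpd` and the numbers are hypothetical.

```python
import math


def rpd(reference, predicted):
    """Residual prediction deviation: SD of reference values / RMSEP."""
    n = len(reference)
    mean_ref = sum(reference) / n
    # Sample standard deviation of the reference values in the prediction set.
    sd = math.sqrt(sum((r - mean_ref) ** 2 for r in reference) / (n - 1))
    rmsep = math.sqrt(sum((p - r) ** 2 for p, r in zip(predicted, reference)) / n)
    return sd / rmsep


# Hypothetical prediction set with errors of +/-0.5 around the reference values.
print(rpd([1.0, 2.0, 3.0, 4.0, 5.0], [1.5, 1.5, 3.5, 3.5, 5.5]))
```

Because RPD scales the prediction error by the spread of the reference values, it allows models for properties with different units or ranges to be compared on a common footing.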
Kaniu, M I; Angeyo, K H; Mwala, A K; Mwangi, F K
2012-08-30
Soil quality assessment (SQA) calls for rapid, simple and affordable but accurate analysis of soil quality indicators (SQIs). Routine methods of soil analysis are tedious and expensive. Energy dispersive X-ray fluorescence and scattering (EDXRFS) spectrometry in conjunction with chemometrics is a potentially powerful method for rapid SQA. In this study, a 25 mCi ¹⁰⁹Cd isotope source XRF spectrometer was used to realize EDXRFS spectrometry of soils. Glycerol (a simulant of "organic" soil solution) and kaolin (a model clay soil) doped with soil micro (Fe, Cu, Zn) and macro (NO₃⁻, SO₄²⁻, H₂PO₄⁻) nutrients were used to train multivariate chemometric calibration models for direct (non-invasive) analysis of SQIs based on partial least squares (PLS) and artificial neural networks (ANN). The techniques were compared for each SQI with respect to speed, robustness, correction ability for matrix effects, and resolution of spectral overlap. The method was then applied to perform direct rapid analysis of SQIs in field soils. A one-way ANOVA test showed no statistical difference at the 95% confidence level between PLS and ANN results compared to reference soil nutrients. PLS was more accurate for C, N, Na, P and Zn (R²>0.9, with low SEP of 0.05%, 0.01%, 0.01%, and 1.98 μg g⁻¹, respectively), while ANN was better suited for analysis of Mg, Cu and Fe (R²>0.9 and SEP of 0.08%, 4.02 μg g⁻¹, and 0.88 μg g⁻¹, respectively). Copyright © 2012 Elsevier B.V. All rights reserved.
ERIC Educational Resources Information Center
Baumann, Chris; Hamin
2011-01-01
A nation's culture, competitiveness and economic performance explain academic performance. Partial Least Squares (PLS) testing of 2252 students shows culture affects competitiveness and academic performance. Culture and economic performance each explain 32%; competitiveness 36%. The model predicts academic performance when culture, competitiveness…
Detection of pit fragments in fresh cherries using near infrared spectroscopy
USDA-ARS's Scientific Manuscript database
NIR spectroscopy in the wavelength region from 900nm to 2600nm was evaluated as the basis for a rapid, non-destructive method for the detection of pits and pit fragments in fresh cherries. Partial Least Squares discriminant analysis (PLS-DA) following various spectral pretreatments was applied to sp...
Elkhoudary, Mahmoud M; Abdel Salam, Randa A; Hadad, Ghada M
2014-09-15
Metronidazole (MNZ) is a widely used antibacterial and amoebicide drug. Therefore, it is important to develop a rapid and specific analytical method for the determination of MNZ in mixture with Spiramycin (SPY), Diloxanide (DIX) and Clioquinol (CLQ) in pharmaceutical preparations. This work describes six simple, sensitive and reliable multivariate calibration methods, namely linear and nonlinear artificial neural networks preceded by genetic algorithm (GA-ANN) and principal component analysis (PCA-ANN), as well as partial least squares (PLS) either alone or preceded by genetic algorithm (GA-PLS), for UV spectrophotometric determination of MNZ, SPY, DIX and CLQ in pharmaceutical preparations with no interference from pharmaceutical additives. The results reveal the problem of nonlinearity and show how models like ANN can handle it. Analytical performance of these methods was statistically validated with respect to linearity, accuracy, precision and specificity. The developed methods demonstrate the ability of these multivariate calibration models to resolve UV spectra of the four-component mixtures using a simple and widely used UV spectrophotometer. Copyright © 2014 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Yang, Yue; Wang, Lei; Wu, Yongjiang; Liu, Xuesong; Bi, Yuan; Xiao, Wei; Chen, Yong
2017-07-01
There is a growing need for effective on-line process monitoring during the manufacture of traditional Chinese medicine (TCM) to ensure quality consistency. In this study, the potential of the near infrared (NIR) spectroscopy technique to monitor the extraction process of Flos Lonicerae Japonicae was investigated. A new algorithm of synergy interval PLS with genetic algorithm (Si-GA-PLS) was proposed for modeling. Four different PLS models, namely Full-PLS, Si-PLS, GA-PLS, and Si-GA-PLS, were established, and their performances in predicting two quality parameters (viz. total acid and soluble solid contents) were compared. The Si-GA-PLS model gave the best results because it combines the advantages of Si-PLS and GA. For Si-GA-PLS, the determination coefficient (Rp2) and root-mean-square error for the prediction set (RMSEP) were 0.9561 and 147.6544 μg/ml for total acid, and 0.9062 and 0.1078% for soluble solid contents, respectively. The overall results demonstrated that the NIR spectroscopy technique combined with Si-GA-PLS calibration is a reliable and non-destructive alternative method for on-line monitoring of the extraction process of TCM on the production scale.
NASA Astrophysics Data System (ADS)
Duarte, Janaína; Pacheco, Marcos T. T.; Villaverde, Antonio Balbin; Machado, Rosangela Z.; Zângaro, Renato A.; Silveira, Landulfo
2010-07-01
Toxoplasmosis is an important zoonosis in public health because domestic cats are the main agents responsible for the transmission of this disease in Brazil. We investigate a method for diagnosing toxoplasmosis based on Raman spectroscopy. Dispersive near-infrared Raman spectra are used to quantify anti-Toxoplasma gondii (IgG) antibodies in blood sera from domestic cats. An 830-nm laser is used for sample excitation, and a dispersive spectrometer is used to detect the Raman scattering. A serological test is performed in all serum samples by the enzyme-linked immunosorbent assay (ELISA) for validation. Raman spectra are taken from 59 blood serum samples and a quantification model is implemented based on partial least squares (PLS) to quantify the sample's serology by Raman spectra compared to the results provided by the ELISA test. Based on the serological values provided by the Raman/PLS model, diagnostic parameters such as sensitivity, specificity, accuracy, positive prediction values, and negative prediction values are calculated to discriminate negative from positive samples, obtaining 100, 80, 90, 83.3, and 100%, respectively. Raman spectroscopy, associated with the PLS, is promising as a serological assay for toxoplasmosis, enabling fast and sensitive diagnosis.
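The diagnostic parameters listed in this abstract all follow from a two-by-two confusion matrix against the ELISA reference. The sketch below shows the standard formulas; the function name `diagnostic_parameters` and the counts are hypothetical (the abstract does not give the confusion matrix), chosen only because they happen to reproduce the reported percentages.

```python
def diagnostic_parameters(tp, fn, tn, fp):
    """Sensitivity, specificity, accuracy, PPV and NPV from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),   # positives correctly detected
        "specificity": tn / (tn + fp),   # negatives correctly rejected
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Hypothetical counts, not taken from the study.
for name, value in diagnostic_parameters(tp=25, fn=0, tn=20, fp=5).items():
    print(f"{name}: {100 * value:.1f}%")
```

With no false negatives, sensitivity and negative predictive value are both 100%, while the false positives lower specificity, accuracy and positive predictive value.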
Aguilera, Teodoro; Lozano, Jesús; Paredes, José A.; Álvarez, Fernando J.; Suárez, José I.
2012-01-01
The aim of this work is to propose an alternative way for wine classification and prediction based on an electronic nose (e-nose) combined with Independent Component Analysis (ICA) as a dimensionality reduction technique, Partial Least Squares (PLS) to predict sensorial descriptors and Artificial Neural Networks (ANNs) for classification purpose. A total of 26 wines from different regions, varieties and elaboration processes have been analyzed with an e-nose and tasted by a sensory panel. Successful results have been obtained in most cases for prediction and classification. PMID:22969387
Párta, László; Zalai, Dénes; Borbély, Sándor; Putics, Akos
2014-02-01
The application of dielectric spectroscopy has frequently been investigated as an on-line cell culture monitoring tool; however, it still requires supportive data and experience in order to become a robust technique. In this study, dielectric spectroscopy was used to predict viable cell density (VCD) at industrially relevant high levels in concentrated fed-batch culture of Chinese hamster ovary cells producing a monoclonal antibody for pharmaceutical purposes. For on-line dielectric spectroscopy measurements, capacitance was scanned within a wide range of frequency values (100-19,490 kHz) in six parallel cell cultivation batches. Prior to detailed mathematical analysis of the collected data, principal component analysis (PCA) was applied to compare dielectric behavior of the cultivations. PCA detected measurement disturbances. By using the measured spectroscopic data, partial least squares regression (PLS), Cole-Cole, and linear modeling were applied and compared in order to predict VCD. The Cole-Cole and the PLS model provided reliable prediction over the entire cultivation including both the early and decline phases of cell growth, while the linear model failed to estimate VCD in the later, declining cultivation phase. With regard to measurement error sensitivity, remarkable differences were shown among PLS, Cole-Cole, and linear modeling. VCD prediction accuracy could be improved in the runs with measurement disturbances by first derivative pre-treatment in PLS and by parameter optimization of the Cole-Cole modeling.
Multivariate analysis of gamma spectra to characterize used nuclear fuel
Coble, Jamie; Orton, Christopher; Schwantes, Jon
2017-01-17
The Multi-Isotope Process (MIP) Monitor provides an efficient means to monitor the process conditions in used nuclear fuel reprocessing facilities to support process verification and validation. The MIP Monitor applies multivariate analysis to gamma spectroscopy of key stages in the reprocessing stream in order to detect small changes in the gamma spectrum, which may indicate changes in process conditions. This research extends the MIP Monitor by characterizing a used fuel sample after initial dissolution according to the type of reactor of origin (pressurized or boiling water reactor; PWR and BWR, respectively), initial enrichment, burn up, and cooling time. Simulated gamma spectra were used in this paper to develop and test three fuel characterization algorithms. The classification and estimation models employed are based on the partial least squares regression (PLS) algorithm. A PLS discriminant analysis model was developed which perfectly classified reactor type for the three PWR and three BWR reactor designs studied. Locally weighted PLS models were fitted on-the-fly to estimate the remaining fuel characteristics. For the simulated gamma spectra considered, burn up was predicted with 0.1% root mean squared percent error (RMSPE) and both cooling time and initial enrichment with approximately 2% RMSPE. Finally, this approach to automated fuel characterization can be used to independently verify operator declarations of used fuel characteristics and to inform the MIP Monitor anomaly detection routines at later stages of the fuel reprocessing stream to improve sensitivity to changes in operational parameters that may indicate issues with operational control or malicious activities.
Gorre, Elsa; Owens, Kevin G
2016-11-01
In this work an attenuated total reflection Fourier transform infrared (FT-IR) absorption based method is used to measure the solubility of two matrix-assisted laser desorption-ionization (MALDI) matrices in a few pure solvents and mixtures of acetonitrile and water using low microliter amounts of solution. Results from a method that averages the values obtained from multiple calibration curves created by manual peak picking are compared to those predicted using a partial least squares (PLS) chemometrics approach. The PLS method provided solubility values that were in good agreement with the manual method with significantly greater ease of analysis. As a test, the solubility of adipic acid in acetone was measured using the two methods of analysis, and the values are in good agreement with solubility values reported in literature. The solubilities of the MALDI matrices α-cyano-4-hydroxy cinnamic acid (CHCA) and sinapinic acid (SA) were measured in a series of mixtures made from acetonitrile (ACN) and water; surprisingly, the results show a highly nonlinear trend. While both CHCA and SA show solubility values of less than 10 mg/mL in the pure solvents, the solubility value for SA increases to 56.3 mg/mL in a 75:25 v/v ACN:water mixture. This can have a significant effect on the matrix-to-analyte ratios in the MALDI experiment when sample protocols call for preparation of a saturated solution of the matrix in the chosen solvent system. © The Author(s) 2016.
Real-time Raman spectroscopy for automatic in vivo skin cancer detection: an independent validation.
Zhao, Jianhua; Lui, Harvey; Kalia, Sunil; Zeng, Haishan
2015-11-01
In a recent study, we demonstrated that real-time Raman spectroscopy could be used for skin cancer diagnosis. As a translational study, the objective of this study is to validate previous findings through a completely independent clinical test. In total, 645 confirmed cases were included in the analysis, including a cohort of 518 cases from a previous study, and an independent cohort of 127 new cases. Multivariate statistical data analyses including principal component with general discriminant analysis (PC-GDA) and partial least squares (PLS) were used separately for lesion classification, which generated similar results. When the previous cohort (n = 518) was used as training and the new cohort (n = 127) was used as testing, the area under the receiver operating characteristic curve (ROC AUC) was found to be 0.889 (95 % CI 0.834-0.944; PLS); when the two cohorts were combined, the ROC AUC was 0.894 (95 % CI 0.870-0.918; PLS) with the narrowest confidence intervals. Both analyses were comparable to the previous findings, where the ROC AUC was 0.896 (95 % CI 0.846-0.946; PLS). The independent study validates that real-time Raman spectroscopy could be used for automatic in vivo skin cancer diagnosis with good accuracy.
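The ROC AUC reported in the abstract above can be read as the probability that a randomly chosen positive case receives a higher classifier score than a randomly chosen negative case. The sketch below computes it that way from pairwise comparisons. The function name, scores, and labels are hypothetical illustrations and are not data from the study, and the abstract does not say how the authors computed their AUC.

```python
import numpy as np


def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]      # scores of the positive class
    neg = scores[~labels]     # scores of the negative class
    # Count positive/negative pairs ranked correctly; ties count as half.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)


# Hypothetical classifier outputs (assumption): three positives, three negatives.
scores = [0.9, 0.8, 0.35, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
auc = roc_auc(scores, labels)
print(round(auc, 3))  # 8 of 9 pairs are ranked correctly
```

The pairwise form is quadratic in the number of cases, which is acceptable for cohorts of a few hundred lesions; a rank-based formula would be preferable for much larger data sets.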
Guo, Canyong; Luo, Xuefang; Zhou, Xiaohua; Shi, Beijia; Wang, Juanjuan; Zhao, Jinqi; Zhang, Xiaoxia
2017-06-05
Vibrational spectroscopic techniques such as infrared, near-infrared and Raman spectroscopy have become popular in detecting and quantifying polymorphism of pharmaceutics since they are fast and non-destructive. This study assessed the ability of three vibrational spectroscopic techniques combined with multivariate analysis to quantify a low-content undesired polymorph within a binary polymorphic mixture. Partial least squares (PLS) regression and support vector machine (SVM) regression were employed to build quantitative models. Fusidic acid, a steroidal antibiotic, was used as the model compound. It was found that PLS regression performed slightly better than SVM regression in all three spectroscopic techniques. Root mean square errors of prediction (RMSEP) ranged from 0.48% to 1.17% for diffuse reflectance FTIR spectroscopy, from 1.60% to 1.93% for diffuse reflectance FT-NIR spectroscopy, and from 1.62% to 2.31% for Raman spectroscopy. The results indicate that diffuse reflectance FTIR spectroscopy offers significant advantages in providing accurate measurement of polymorphic content in the fusidic acid binary mixtures, while Raman spectroscopy is the least accurate technique for quantitative analysis of polymorphs. Copyright © 2017 Elsevier B.V. All rights reserved.
NASA Technical Reports Server (NTRS)
Anderson, R. B.; Morris, Richard V.; Clegg, S. M.; Humphries, S. D.; Wiens, R. C.; Bell, J. F., III; Mertzman, S. A.
2010-01-01
The ChemCam instrument [1] on the Mars Science Laboratory (MSL) rover will be used to obtain the chemical composition of surface targets within 7 m of the rover using Laser Induced Breakdown Spectroscopy (LIBS). ChemCam analyzes atomic emission spectra (240-800 nm) from a plasma created by a pulsed Nd:KGW 1067 nm laser. The LIBS spectra can be used in a semiquantitative way to rapidly classify targets (e.g., basalt, andesite, carbonate, sulfate, etc.) and in a quantitative way to estimate their major and minor element chemical compositions. Quantitative chemical analysis from LIBS spectra is complicated by a number of factors, including chemical matrix effects [2]. Recent work has shown promising results using multivariate techniques such as partial least squares (PLS) regression and artificial neural networks (ANN) to predict elemental abundances in samples [e.g. 2-6]. To develop, refine, and evaluate analysis schemes for LIBS spectra of geologic materials, we collected spectra of a diverse set of well-characterized natural geologic samples and are comparing the predictive abilities of PLS, cascade correlation ANN (CC-ANN) and multilayer perceptron ANN (MLP-ANN) analysis procedures.
Steingass, Christof Björn; Jutzi, Manfred; Müller, Jenny; Carle, Reinhold; Schmarr, Hans-Georg
2015-03-01
Ripening-dependent changes of pineapple volatiles were studied in a nontargeted profiling analysis. Volatiles were isolated via headspace solid phase microextraction and analyzed by comprehensive 2D gas chromatography and mass spectrometry (HS-SPME-GC×GC-qMS). Profile patterns presented in the contour plots were evaluated applying image processing techniques and subsequent multivariate statistical data analysis. Statistical methods comprised unsupervised hierarchical cluster analysis (HCA) and principal component analysis (PCA) to classify the samples. Supervised partial least squares discriminant analysis (PLS-DA) and partial least squares (PLS) regression were applied to discriminate different ripening stages and describe the development of volatiles during postharvest storage, respectively. Hereby, substantial chemical markers allowing for class separation were revealed. The workflow permitted the rapid distinction between premature green-ripe pineapples and postharvest-ripened sea-freighted fruits. Volatile profiles of fully ripe air-freighted pineapples were similar to those of green-ripe fruits postharvest ripened for 6 days after simulated sea freight export, after PCA with only two principal components. However, PCA considering also the third principal component allowed differentiation between air-freighted fruits and the four progressing postharvest maturity stages of sea-freighted pineapples.
Monitoring of beer fermentation based on hybrid electronic tongue.
Kutyła-Olesiuk, Anna; Zaborowski, Michał; Prokaryn, Piotr; Ciosek, Patrycja
2012-10-01
Monitoring of biotechnological processes, including fermentation, is extremely important because of the rapidly occurring changes in the composition of the samples during production. In the case of beer, the analysis of physicochemical parameters allows for the determination of the stage of the fermentation process and the control of its possible perturbations. As a tool to control the beer production process, a sensor array can be used, composed of potentiometric and voltammetric sensors (so-called hybrid Electronic Tongue, h-ET). The aim of this study is to apply an electronic tongue system to distinguish samples obtained during alcoholic fermentation. The samples originate from a batch of homemade beer fermentation and from two stages of the process: fermentation reaction and maturation of beer. The applied sensor array consists of 10 miniaturized ion-selective electrodes (potentiometric ET) and silicon based 3-electrode voltammetric transducers (voltammetric ET). The obtained results were processed using Partial Least Squares (PLS) and Partial Least Squares-Discriminant Analysis (PLS-DA). For potentiometric data, voltammetric data, and combined potentiometric and voltammetric data, comparison of the classification ability was conducted based on Root Mean Squared Error (RMSE), sensitivity, specificity, and coefficient F calculation. It is shown that, in contrast to the separately used techniques, the developed hybrid system allowed for a better characterization of the beer samples. Data fusion in the hybrid ET gives better results both in qualitative analysis (RMSE, specificity, sensitivity) and in quantitative analysis (RMSE, R², a, b). Copyright © 2012 Elsevier B.V. All rights reserved.
Jiménez-Carvelo, Ana M; González-Casado, Antonio; Pérez-Castaño, Estefanía; Cuadros-Rodríguez, Luis
2017-03-01
A new analytical method for the differentiation of olive oil from other vegetable oils using reversed-phase LC and applying chemometric techniques was developed. A 3 cm short column was used to obtain the chromatographic fingerprint of the methyl-transesterified fraction of each vegetable oil. The chromatographic analysis took only 4 min. The multivariate classification methods used were k-nearest neighbors, partial least-squares (PLS) discriminant analysis, one-class PLS, support vector machine classification, and soft independent modeling of class analogies. The discrimination of olive oil from other vegetable edible oils was evaluated by several classification quality metrics. Several strategies for the classification of the olive oil were used: one input-class, two input-class, and pseudo two input-class.
Glucose determination in human aqueous humor with Raman spectroscopy
NASA Technical Reports Server (NTRS)
Lambert, James L.; Pelletier, Christine C.; Borchert, Mark
2005-01-01
It has been suggested that spectroscopic analysis of the aqueous humor of the eye could be used to indirectly predict blood glucose levels in diabetics noninvasively. We have been investigating this potential using Raman spectroscopy in combination with partial least squares (PLS) analysis. We have determined that glucose at clinically relevant concentrations can be accurately predicted in human aqueous humor in vitro using a PLS model based on artificial aqueous humor. We have further determined that with proper instrument design, the light energy necessary to achieve clinically acceptable prediction of glucose does not damage the retinas of rabbits and can be delivered at powers below internationally acceptable safety limits. Herein we summarize our current results and address our strategies to improve instrument design. 2005 Society of Photo-Optical Instrumentation Engineers.
Wang, Qi; He, Haijun; Li, Bing; Lin, Hancheng; Zhang, Yinming; Zhang, Ji
2017-01-01
Estimating the postmortem interval (PMI) is of great importance in forensic investigations. Although many methods are used to estimate the PMI, few investigations focus on postmortem redistribution. In this study, ultraviolet–visible (UV–Vis) measurement combined with visual inspection indicated a regular diffusion of hemoglobin into plasma after death, showing the redistribution of postmortem components in blood. Thereafter, attenuated total reflection–Fourier transform infrared (ATR–FTIR) spectroscopy was used to confirm the variations caused by this phenomenon. First, full-spectrum partial least-squares (PLS) and genetic algorithm combined with PLS (GA-PLS) models were constructed to predict the PMI. The performance of the GA-PLS model was better than that of the full-spectrum PLS model based on its root mean square error (RMSE) of cross-validation of 3.46 h (R² = 0.95) and the RMSE of prediction of 3.46 h (R² = 0.94). The investigation on the similarity of spectra between blood plasma and formed elements also supported the role of redistribution of components in spectral changes in postmortem plasma. These results demonstrated that ATR–FTIR spectroscopy coupled with advanced mathematical methods could serve as a convenient and reliable tool to study the redistribution of postmortem components and estimate the PMI. PMID:28753641
Siebers, Nina; Kruse, Jens; Eckhardt, Kai-Uwe; Hu, Yongfeng; Leinweber, Peter
2012-07-01
Cadmium (Cd) has a high toxicity and resolving its speciation in soil is challenging but essential for estimating the environmental risk. In this study partial least-square (PLS) regression was tested for its capability to deconvolute Cd L3-edge X-ray absorption near-edge structure (XANES) spectra of multi-compound mixtures. For this, a library of Cd reference compound spectra and a spectrum of a soil sample were acquired. A good coefficient of determination (R²) of Cd compounds in mixtures was obtained for the PLS model using binary and ternary mixtures of various Cd reference compounds proving the validity of this approach. In order to describe complex systems like soil, multi-compound mixtures of a variety of Cd compounds must be included in the PLS model. The obtained PLS regression model was then applied to a highly Cd-contaminated soil revealing Cd3(PO4)2 (36.1%), Cd(NO3)2·4H2O (24.5%), Cd(OH)2 (21.7%), CdCO3 (17.1%) and CdCl2 (0.4%). These preliminary results proved that PLS regression is a promising approach for a direct determination of Cd speciation in the solid phase of a soil sample.
Borràs, Eva; Ferré, Joan; Boqué, Ricard; Mestres, Montserrat; Aceña, Laura; Calvo, Angels; Busto, Olga
2016-08-01
Headspace-Mass Spectrometry (HS-MS), Fourier Transform Mid-Infrared spectroscopy (FT-MIR) and UV-Visible spectrophotometry (UV-vis) instrumental responses have been combined to predict virgin olive oil sensory descriptors. A total of 343 olive oil samples analyzed during four consecutive harvests (2010-2014) were used to build multivariate calibration models using partial least squares (PLS) regression. The reference values of the sensory attributes were provided by expert assessors from an official taste panel. The instrumental data were modeled individually and also using data fusion approaches. The use of fused data with both low- and mid-level of abstraction improved PLS predictions for all the olive oil descriptors. The best PLS models were obtained for two positive attributes (fruity and bitter) and two defective descriptors (fusty and musty), all of them using data fusion of MS and MIR spectral fingerprints. Although good predictions were not obtained for some sensory descriptors, the results are encouraging, especially considering that the legal categorization of virgin olive oils only requires the determination of fruity and defective descriptors. Copyright © 2016 Elsevier B.V. All rights reserved.
Thermal-to-visible face recognition using partial least squares.
Hu, Shuowen; Choi, Jonghyun; Chan, Alex L; Schwartz, William Robson
2015-03-01
Although visible face recognition has been an active area of research for several decades, cross-modal face recognition has only been explored by the biometrics community relatively recently. Thermal-to-visible face recognition is one of the most difficult cross-modal face recognition challenges, because of the difference in phenomenology between the thermal and visible imaging modalities. We address the cross-modal recognition problem using a partial least squares (PLS) regression-based approach consisting of preprocessing, feature extraction, and PLS model building. The preprocessing and feature extraction stages are designed to reduce the modality gap between the thermal and visible facial signatures, and facilitate the subsequent one-vs-all PLS-based model building. We incorporate multi-modal information into the PLS model building stage to enhance cross-modal recognition. The performance of the proposed recognition algorithm is evaluated on three challenging datasets containing visible and thermal imagery acquired under different experimental scenarios: time-lapse, physical tasks, mental tasks, and subject-to-camera range. These scenarios represent difficult challenges relevant to real-world applications. We demonstrate that the proposed method performs robustly for the examined scenarios.
Darwish, Hany W; Bakheit, Ahmed H; Abdelhameed, Ali S
2016-03-01
Simultaneous spectrophotometric analysis of a multi-component dosage form of olmesartan, amlodipine and hydrochlorothiazide used for the treatment of hypertension has been carried out using various chemometric methods. Multivariate calibration methods include classical least squares (CLS) executed by net analyte processing (NAP-CLS), orthogonal signal correction (OSC-CLS) and direct orthogonal signal correction (DOSC-CLS) in addition to multivariate curve resolution-alternating least squares (MCR-ALS). Results demonstrated the efficiency of the proposed methods as quantitative tools of analysis as well as their qualitative capability. The three analytes were determined precisely using the aforementioned methods in an external data set and in a dosage form after optimization of experimental conditions. Finally, the efficiency of the models was validated via comparison with the partial least squares (PLS) method in terms of accuracy and precision.
Ouyang, Qin; Zhao, Jiewen; Chen, Quansheng
2015-01-01
The non-sugar solids (NSS) content is one of the most important nutrition indicators of Chinese rice wine. This study proposed a rapid method for the measurement of NSS content in Chinese rice wine using near infrared (NIR) spectroscopy. We also systematically studied efficient spectral variable selection algorithms for use in modeling. A new algorithm of synergy interval partial least square with competitive adaptive reweighted sampling (Si-CARS-PLS) was proposed for modeling. The performance of the final model was back-evaluated using root mean square error of calibration (RMSEC) and correlation coefficient (Rc) in the calibration set and similarly tested by root mean square error of prediction (RMSEP) and correlation coefficient (Rp) in the prediction set. The optimum model by the Si-CARS-PLS algorithm was achieved when 7 PLS factors and 18 variables were included, and the results were as follows: Rc=0.95 and RMSEC=1.12 in the calibration set, Rp=0.95 and RMSEP=1.22 in the prediction set. In addition, the Si-CARS-PLS algorithm showed its superiority when compared with the commonly used algorithms in multivariate calibration. This work demonstrated that the NIR spectroscopy technique combined with a suitable multivariate calibration algorithm has a high potential in rapid measurement of NSS content in Chinese rice wine. Copyright © 2015 Elsevier B.V. All rights reserved.
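The two figures of merit quoted in the abstract above, the root mean square error and the correlation coefficient between reference and predicted values, are computed the same way for a calibration set (RMSEC, Rc) and a prediction set (RMSEP, Rp); only the samples differ. The sketch below shows the usual definitions. The function names and the numbers are hypothetical and are not the paper's data.

```python
import numpy as np


def rmse(y_ref, y_pred):
    """Root mean square error between reference and predicted values."""
    y_ref = np.asarray(y_ref, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_ref - y_pred) ** 2)))


def correlation(y_ref, y_pred):
    """Pearson correlation coefficient between reference and predicted values."""
    return float(np.corrcoef(y_ref, y_pred)[0, 1])


# Hypothetical reference values and model predictions (assumption).
y_ref = [10.0, 12.0, 14.0, 16.0]
y_pred = [11.0, 11.0, 15.0, 15.0]

print(rmse(y_ref, y_pred))         # every residual has magnitude 1
print(correlation(y_ref, y_pred))
```

Applied to held-out samples these functions give RMSEP and Rp; applied to the samples used for fitting they give RMSEC and Rc. A large gap between the two pairs is a common sign of overfitting.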
Yu, Peigen; Low, Mei Yin; Zhou, Weibiao
2018-01-01
In order to develop products that would be preferred by consumers, the effects of the chemical compositions of ready-to-drink green tea beverages on consumer liking were studied through regression analyses. Green tea model systems were prepared by dosing solutions of 0.1% green tea extract with differing concentrations of eight flavour keys deemed to be important for green tea aroma and taste, based on a D-optimal experimental design, before undergoing commercial sterilisation. Sensory evaluation of the green tea model system was carried out using an untrained consumer panel to obtain hedonic liking scores of the samples. Regression models were subsequently trained to objectively predict the consumer liking scores of the green tea model systems. A linear partial least squares (PLS) regression model was developed to describe the effects of the eight flavour keys on consumer liking, with a coefficient of determination (R²) of 0.733, and a root-mean-square error (RMSE) of 3.53%. The PLS model was further augmented with an artificial neural network (ANN) to establish a PLS-ANN hybrid model. The established hybrid model was found to give a better prediction of consumer liking scores, based on its R² (0.875) and RMSE (2.41%). Copyright © 2017 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Solimun
2017-05-01
The aim of this research is to model survival data from kidney-transplant patients using the partial least squares (PLS)-Cox regression, which can both meet and not meet the no-multicollinearity assumption. The secondary data were obtained from research entitled "Factors affecting the survival of kidney-transplant patients". The research subjects comprised 250 patients. The predictor variables consisted of: age (X1), sex (X2); two categories, prior hemodialysis duration (X3), diabetes (X4); two categories, prior transplantation number (X5), number of blood transfusions (X6), discrepancy score (X7), use of antilymphocyte globulin(ALG) (X8); two categories, while the response variable was patient survival time (in months). Partial least squares regression is a model that connects the predictor variables X and the response variable y and it initially aims to determine the relationship between them. Results of the above analyses suggest that the survival of kidney transplant recipients ranged from 0 to 55 months, with 62% of the patients surviving until they received treatment that lasted for 55 months. The PLS-Cox regression analysis results revealed that patients' age and the use of ALG significantly affected the survival time of patients. The factor of patients' age (X1) in the PLS-Cox regression model merely affected the failure probability by 1.201. This indicates that the probability of dying for elderly patients with a kidney transplant is 1.152 times higher than that for younger patients.
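As a rough illustration of how PLS regression connects predictors X to a single response y when the predictors are correlated, the sketch below implements ordinary single-response PLS (PLS1) with the NIPALS algorithm in NumPy. It is not the PLS-Cox procedure of the abstract above, which additionally involves a Cox survival model. The function name and the synthetic data are assumptions made for illustration.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit single-response PLS (PLS1, NIPALS); return (coefficients, intercept)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean            # centred working copies
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                          # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                             # scores of the latent component
        tt = t @ t
        p = Xk.T @ t / tt                      # X loadings
        c = (yk @ t) / tt                      # y loading
        Xk = Xk - np.outer(t, p)               # deflate X
        yk = yk - c * t                        # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)     # coefficients in original X space
    return coef, y_mean - x_mean @ coef


# Synthetic data (assumption): two strongly correlated predictors and one independent one.
rng = np.random.default_rng(0)
latent = rng.normal(size=50)
X = np.column_stack([
    latent + 0.1 * rng.normal(size=50),
    latent + 0.1 * rng.normal(size=50),
    rng.normal(size=50),
])
y = 2.0 * latent + X[:, 2] + 0.1 * rng.normal(size=50)

coef, intercept = pls1_fit(X, y, n_components=2)
y_hat = X @ coef + intercept
print(np.corrcoef(y, y_hat)[0, 1])
```

Because each component is built from the covariance between X and y rather than by inverting the predictor covariance matrix, the fit remains stable when predictors are multicollinear. With as many components as full-rank predictors, PLS1 gives the same fit as ordinary least squares.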
Katayama, K; Sato, T; Arai, T; Amao, H; Ohta, Y; Ozawa, T; Kenyon, P R; Hickson, R E; Tazaki, H
2013-02-01
Simple liquid chromatography-mass spectrometry (LC-MS) was applied to non-targeted metabolic analyses to discover new metabolic markers in animal plasma. Principal component analysis (PCA) and partial least squares-discriminant analysis (PLS-DA) were used to analyse LC-MS multivariate data. PCA clearly generated two separate clusters for artificially induced diabetic mice and healthy control mice. PLS-DA of time-course changes in plasma metabolites of chicks after feeding generated three clusters (pre- and immediately after feeding, 0.5-3 h after feeding and 4 h after feeding). Two separate clusters were also generated for plasma metabolites of pregnant Angus heifers with differing live-weight change profiles (gaining or losing). The accompanying PLS-DA loading plot detailed the metabolites that contribute the most to the cluster separation. In each case, the same highly hydrophilic metabolite was strongly correlated to the group separation. The metabolite was identified as betaine by LC-MS/MS. This result indicates that betaine and its metabolic precursor, choline, may be useful biomarkers to evaluate the nutritional and metabolic status of animals. © 2011 Blackwell Verlag GmbH.
Quantitative analysis of red wine tannins using Fourier-transform mid-infrared spectrometry.
Fernandez, Katherina; Agosin, Eduardo
2007-09-05
Tannin content and composition are critical quality components of red wines. No spectroscopic method assessing these phenols in wine has been described so far. We report here a new method using Fourier transform mid-infrared (FT-MIR) spectroscopy and chemometric techniques for the quantitative analysis of red wine tannins. Calibration models were developed using protein precipitation and phloroglucinolysis as analytical reference methods. After spectra preprocessing, six different predictive partial least-squares (PLS) models were evaluated, including the use of interval selection procedures such as iPLS and CSMWPLS. PLS regression with full-range (650-4000 cm⁻¹), second derivative of the spectra and phloroglucinolysis as the reference method gave the most accurate determination for tannin concentration (RMSEC = 2.6%, RMSEP = 9.4%, r = 0.995). The prediction of the mean degree of polymerization (mDP) of the tannins also gave a reasonable prediction (RMSEC = 6.7%, RMSEP = 10.3%, r = 0.958). These results represent the first step in the development of a spectroscopic methodology for the quantification of several phenolic compounds that are critical for wine quality.
Son, Hong-Seok; Kim, Ki Myong; van den Berg, Frans; Hwang, Geum-Sook; Park, Won-Mok; Lee, Cherl-Ho; Hong, Young-Shick
2008-09-10
¹H NMR spectroscopy was used to investigate the metabolic differences in wines produced from different grape varieties and different regions. A significant separation among wines from Campbell Early, Cabernet Sauvignon, and Shiraz grapes was observed using principal component analysis (PCA) and partial least squares-discriminant analysis (PLS-DA). The metabolites contributing to the separation were assigned to be 2,3-butanediol, lactate, acetate, proline, succinate, malate, glycerol, tartrate, glucose, and phenolic compounds by PCA and PLS-DA loading plots. Wines produced from Cabernet Sauvignon grapes harvested in the continental areas of Australia, France, and California were also separated. PLS-DA loading plots revealed that the level of proline in Californian Cabernet Sauvignon wines was higher than that in Australian and French Cabernet Sauvignon, Australian Shiraz, and Korean Campbell Early wines, showing that the chemical composition of the grape berries varies with the variety and growing area. This study highlights the applicability of NMR-based metabolomics with multivariate statistical data sets in determining wine quality and product origin.
Žuvela, Petar; Liu, J Jay; Macur, Katarzyna; Bączek, Tomasz
2015-10-06
In this work, performance of five nature-inspired optimization algorithms, genetic algorithm (GA), particle swarm optimization (PSO), artificial bee colony (ABC), firefly algorithm (FA), and flower pollination algorithm (FPA), was compared in molecular descriptor selection for development of quantitative structure-retention relationship (QSRR) models for 83 peptides that originate from eight model proteins. The matrix with 423 descriptors was used as input, and QSRR models based on selected descriptors were built using partial least squares (PLS), whereas root mean square error of prediction (RMSEP) was used as a fitness function for their selection. Three performance criteria, prediction accuracy, computational cost, and the number of selected descriptors, were used to evaluate the developed QSRR models. The results show that all five variable selection methods outperform interval PLS (iPLS), sparse PLS (sPLS), and the full PLS model, whereas GA is superior because of its lowest computational cost and higher accuracy (RMSEP of 5.534%) with a smaller number of variables (nine descriptors). The GA-QSRR model was validated initially through Y-randomization. In addition, it was successfully validated with an external testing set out of 102 peptides originating from Bacillus subtilis proteomes (RMSEP of 22.030%). Its applicability domain was defined, from which it was evident that the developed GA-QSRR exhibited strong robustness. All the sources of the model's error were identified, thus allowing for further application of the developed methodology in proteomics.
Vindimian, Éric; Garric, Jeanne; Flammarion, Patrick; Thybaud, Éric; Babut, Marc
1999-10-01
The evaluation of the ecotoxicity of effluents requires a battery of biological tests on several species. In order to derive a summary parameter from such a battery, a single endpoint was calculated for all the tests: the EC10, obtained by nonlinear regression, with bootstrap evaluation of the confidence intervals. Principal component analysis was used to characterize and visualize the correlation between the tests. The table of the toxicity of the effluents was then submitted to a panel of experts, who classified the effluents according to the test results. Partial least squares (PLS) regression was used to fit the average value of the experts' judgements to the toxicity data, using a simple equation. Furthermore, PLS regression on partial data sets and other considerations resulted in an optimum battery, with two chronic tests and one acute test. The index is intended to be used for the classification of effluents based on their toxicity to aquatic species. Copyright © 1999 SETAC.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vindimian, E.; Garric, J.; Flammarion, P.
1999-10-01
The evaluation of the ecotoxicity of effluents requires a battery of biological tests on several species. In order to derive a summary parameter from such a battery, a single endpoint was calculated for all the tests: the EC10, obtained by nonlinear regression, with bootstrap evaluation of the confidence intervals. Principal component analysis was used to characterize and visualize the correlation between the tests. The table of the toxicity of the effluents was then submitted to a panel of experts, who classified the effluents according to the test results. Partial least squares (PLS) regression was used to fit the average value of the experts' judgments to the toxicity data, using a simple equation. Furthermore, PLS regression on partial data sets and other considerations resulted in an optimum battery, with two chronic tests and one acute test. The index is intended to be used for the classification of effluents based on their toxicity to aquatic species.
Eliseyev, Andrey; Aksenova, Tetiana
2016-01-01
In the current paper the decoding algorithms for motor-related BCI systems for continuous upper limb trajectory prediction are considered. Two methods for the smooth prediction, namely Sobolev and Polynomial Penalized Multi-Way Partial Least Squares (PLS) regressions, are proposed. The methods are compared to the Multi-Way Partial Least Squares and Kalman Filter approaches. The comparison demonstrated that the proposed methods combined the prediction accuracy of the algorithms of the PLS family and trajectory smoothness of the Kalman Filter. In addition, the prediction delay is significantly lower for the proposed algorithms than for the Kalman Filter approach. The proposed methods could be applied in a wide range of applications beyond neuroscience. PMID:27196417
Prediction of valid acidity in intact apples with Fourier transform near infrared spectroscopy.
Liu, Yan-De; Ying, Yi-Bin; Fu, Xia-Ping
2005-03-01
To develop nondestructive acidity prediction for intact Fuji apples, the potential of the Fourier transform near infrared (FT-NIR) method with fiber optics in interactance mode was investigated. Interactance in the 800 nm to 2619 nm region was measured for intact apples, harvested from early to late maturity stages. Spectral data were analyzed by two multivariate calibration techniques, partial least squares (PLS) and principal component regression (PCR). A total of 120 Fuji apples were tested and 80 of them were used to form a calibration data set. The influences of different data preprocessing and spectra treatments were also quantified. Calibration models based on smoothed spectra were slightly worse than those based on derivative spectra, and the best result was obtained when the segment length was 5 nm and the gap size was 10 points. Depending on data preprocessing and PLS method, the best prediction model yielded a coefficient of determination (r²) of 0.759, a low root mean square error of prediction (RMSEP) of 0.0677, and a low root mean square error of calibration (RMSEC) of 0.0562. The results indicated the feasibility of FT-NIR spectral analysis for predicting apple valid acidity in a nondestructive way.
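The figures of merit quoted in this abstract (RMSEP, RMSEC, r²) can be illustrated with a minimal sketch. It assumes the plain definitions, i.e. root mean square error with n in the denominator and r² = 1 - SS_res/SS_tot; the paper may use a degrees-of-freedom correction that is not stated in the abstract. The function names and the sample values are hypothetical and are not taken from the study.

```python
import math

def rmse(actual, predicted):
    """Root mean square error; called RMSEC on calibration data, RMSEP on prediction data."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical reference acidity values and model predictions (not from the paper).
reference = [3.60, 3.75, 3.90, 4.05]
predicted = [3.65, 3.70, 3.95, 4.00]
rmsep = rmse(reference, predicted)
r2 = r_squared(reference, predicted)
print(f"RMSEP = {rmsep:.4f}, r2 = {r2:.4f}")
```

Applying the same `rmse` function to the calibration samples gives RMSEC, and applying it to held-out samples gives RMSEP.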
Prediction of valid acidity in intact apples with Fourier transform near infrared spectroscopy*
Liu, Yan-de; Ying, Yi-bin; Fu, Xia-ping
2005-01-01
To develop nondestructive acidity prediction for intact Fuji apples, the potential of the Fourier transform near infrared (FT-NIR) method with fiber optics in interactance mode was investigated. Interactance in the 800 nm to 2619 nm region was measured for intact apples, harvested from early to late maturity stages. Spectral data were analyzed by two multivariate calibration techniques, partial least squares (PLS) and principal component regression (PCR). A total of 120 Fuji apples were tested and 80 of them were used to form a calibration data set. The influences of different data preprocessing and spectra treatments were also quantified. Calibration models based on smoothed spectra were slightly worse than those based on derivative spectra, and the best result was obtained when the segment length was 5 nm and the gap size was 10 points. Depending on data preprocessing and PLS method, the best prediction model yielded a coefficient of determination (r²) of 0.759, a low root mean square error of prediction (RMSEP) of 0.0677, and a low root mean square error of calibration (RMSEC) of 0.0562. The results indicated the feasibility of FT-NIR spectral analysis for predicting apple valid acidity in a nondestructive way. PMID:15682498
Locally-Based Kernel PLS Smoothing to Non-Parametric Regression Curve Fitting
NASA Technical Reports Server (NTRS)
Rosipal, Roman; Trejo, Leonard J.; Wheeler, Kevin; Korsmeyer, David (Technical Monitor)
2002-01-01
We present a novel smoothing approach to non-parametric regression curve fitting. This is based on kernel partial least squares (PLS) regression in reproducing kernel Hilbert space. It is our concern to apply the methodology for smoothing experimental data where some level of knowledge about the approximate shape, local inhomogeneities or points where the desired function changes its curvature is known a priori or can be derived based on the observed noisy data. We propose locally-based kernel PLS regression that extends the previous kernel PLS methodology by incorporating this knowledge. We compare our approach with existing smoothing splines, hybrid adaptive splines and wavelet shrinkage techniques on two generated data sets.
Teoh, Shao Thing; Kitamura, Miki; Nakayama, Yasumune; Putri, Sastia; Mukai, Yukio; Fukusaki, Eiichiro
2016-08-01
In recent years, the advent of high-throughput omics technology has made possible a new class of strain engineering approaches, based on identification of possible gene targets for phenotype improvement from omic-level comparison of different strains or growth conditions. Metabolomics, with its focus on the omic level closest to the phenotype, lends itself naturally to this semi-rational methodology. When a quantitative phenotype such as growth rate under stress is considered, regression modeling using multivariate techniques such as partial least squares (PLS) is often used to identify metabolites correlated with the target phenotype. However, linear modeling techniques such as PLS require a consistent metabolite-phenotype trend across the samples, which may not be the case when outliers or multiple conflicting trends are present in the data. To address this, we proposed a data-mining strategy that utilizes random sample consensus (RANSAC) to select subsets of samples with consistent trends for construction of better regression models. By applying a combination of RANSAC and PLS (RANSAC-PLS) to a dataset from a previous study (gas chromatography/mass spectrometry metabolomics data and 1-butanol tolerance of 19 yeast mutant strains), new metabolites were indicated to be correlated with tolerance within certain subsets of the samples. The relevance of these metabolites to 1-butanol tolerance was then validated from single-deletion strains of corresponding metabolic genes. The results showed that RANSAC-PLS is a promising strategy to identify unique metabolites that provide additional hints for phenotype improvement, which could not be detected by traditional PLS modeling using the entire dataset. Copyright © 2016 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.
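The consensus-subset idea behind RANSAC can be sketched as follows. This is an illustration only, not the authors' RANSAC-PLS: to keep it short, the PLS model is replaced by a one-variable ordinary least-squares line, and the function names, tolerance, trial count, and data are all hypothetical.

```python
import random

def fit_line(points):
    """Ordinary least-squares line y = a*x + b (a stand-in for the PLS model)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points)
    if sxx == 0:
        raise ValueError("degenerate sample: all x values are equal")
    a = sum((x - mx) * (y - my) for x, y in points) / sxx
    return a, my - a * mx

def ransac_subset(points, n_trials=200, sample_size=2, tol=0.5, seed=0):
    """Return the largest subset of points that follows one consistent linear trend."""
    rng = random.Random(seed)
    best = []
    for _ in range(n_trials):
        sample = rng.sample(points, sample_size)   # random minimal sample
        try:
            a, b = fit_line(sample)
        except ValueError:
            continue                               # skip degenerate samples
        inliers = [(x, y) for x, y in points if abs(y - (a * x + b)) <= tol]
        if len(inliers) > len(best):               # keep the largest consensus set
            best = inliers
    return best

# Hypothetical data: eight samples on y = 2x + 1 plus three with a conflicting trend.
data = [(x, 2 * x + 1) for x in range(8)] + [(1, 12.0), (3, 9.0), (6, 2.0)]
consensus = ransac_subset(data)
slope, intercept = fit_line(consensus)             # final model on the consensus subset
print(len(consensus), slope, intercept)
```

In the strategy described in the abstract, the final regression on each consensus subset would be a PLS model over many metabolites rather than a single-variable line.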
Data Mining Methods for Omics and Knowledge of Crude Medicinal Plants toward Big Data Biology
Afendi, Farit M.; Ono, Naoaki; Nakamura, Yukiko; Nakamura, Kensuke; Darusman, Latifah K.; Kibinge, Nelson; Morita, Aki Hirai; Tanaka, Ken; Horai, Hisayuki; Altaf-Ul-Amin, Md.; Kanaya, Shigehiko
2013-01-01
Molecular biological data has rapidly increased with the recent progress of the Omics fields, e.g., genomics, transcriptomics, proteomics and metabolomics that necessitates the development of databases and methods for efficient storage, retrieval, integration and analysis of massive data. The present study reviews the usage of KNApSAcK Family DB in metabolomics and related area, discusses several statistical methods for handling multivariate data and shows their application on Indonesian blended herbal medicines (Jamu) as a case study. Exploration using Biplot reveals many plants are rarely utilized while some plants are highly utilized toward specific efficacy. Furthermore, the ingredients of Jamu formulas are modeled using Partial Least Squares Discriminant Analysis (PLS-DA) in order to predict their efficacy. The plants used in each Jamu medicine served as the predictors, whereas the efficacy of each Jamu provided the responses. This model produces 71.6% correct classification in predicting efficacy. Permutation test then is used to determine plants that serve as main ingredients in Jamu formula by evaluating the significance of the PLS-DA coefficients. Next, in order to explain the role of plants that serve as main ingredients in Jamu medicines, information of pharmacological activity of the plants is added to the predictor block. Then N-PLS-DA model, multiway version of PLS-DA, is utilized to handle the three-dimensional array of the predictor block. The resulting N-PLS-DA model reveals that the effects of some pharmacological activities are specific for certain efficacy and the other activities are diverse toward many efficacies. Mathematical modeling introduced in the present study can be utilized in global analysis of big data targeting to reveal the underlying biology. PMID:24688691
Hurtado-Fernández, E; Pacchiarotta, T; Mayboroda, O A; Fernández-Gutiérrez, A; Carrasco-Pancorbo, A
2015-01-01
In order to investigate avocado fruit ripening, nontargeted GC-APCI-TOF MS metabolic profiling analyses were carried out. Principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) were used to explore the metabolic profiles from fruit samples of 13 varieties at two different ripening degrees. Mannoheptulose; pentadecylfuran; aspartic, malic, stearic, citric and pantothenic acids; mannitol; and β-sitosterol were some of the metabolites found as more influential for the PLS-DA model. The similarities among genetically related samples (putative mutants of "Hass") and their metabolic differences from the rest of the varieties under study have also been evaluated. The achieved results reveal new insights into avocado fruit composition and metabolite changes, demonstrating therefore the value of metabolomics as a functional genomics tool in characterizing the mechanism of fruit ripening development, a key developmental stage in most economically important fruit crops.
Multi-element fingerprinting as a tool in origin authentication of four east China marine species.
Guo, Lipan; Gong, Like; Yu, Yanlei; Zhang, Hong
2013-12-01
The contents of 25 elements in 4 types of commercial marine species from the East China Sea were determined by inductively coupled plasma mass spectrometry and atomic absorption spectrometry. The elemental composition was used to differentiate marine species according to geographical origin by multivariate statistical analysis. The results showed that principal component analysis could distinguish samples from different areas and reveal the elements which played the most important role in origin diversity. The established models by partial least squares discriminant analysis (PLS-DA) and by probabilistic neural network (PNN) can both precisely predict the origin of the marine species. Further study indicated that PLS-DA and PNN were efficacious in regional discrimination. The models from these 2 statistical methods, with an accuracy of 97.92% and 100%, respectively, could both distinguish samples from different areas without the need for species differentiation. © 2013 Institute of Food Technologists®
Quantitative analysis of bayberry juice acidity based on visible and near-infrared spectroscopy
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shao Yongni; He Yong; Mao Jingyuan
Visible and near-infrared (Vis/NIR) reflectance spectroscopy has been investigated for its ability to nondestructively detect acidity in bayberry juice. What we believe to be a new, better mathematical model is put forward, which we have named principal component analysis-stepwise regression analysis-backpropagation neural network (PCA-SRA-BPNN), to build a correlation between the spectral reflectivity data and the acidity of bayberry juice. In this model, the optimum network parameters, such as the number of input nodes, hidden nodes, learning rate, and momentum, are chosen by the value of the root-mean-square (rms) error. The results show that its prediction statistical parameters are a correlation coefficient (r) of 0.9451 and a root-mean-square error of prediction (RMSEP) of 0.1168. Partial least-squares (PLS) regression is also established to compare with this model. Before doing this, the influences of various spectral pretreatments (standard normal variate, multiplicative scatter correction, Savitzky-Golay first derivative, and wavelet package transform) are compared. The PLS approach with wavelet package transform preprocessing spectra is found to provide the best results, and its prediction statistical parameters are a correlation coefficient (r) of 0.9061 and an RMSEP of 0.1564. Hence, these two models are both desirable to analyze the data from Vis/NIR spectroscopy and to solve the problem of the acidity prediction of bayberry juice. This supplies basal research to ultimately realize the online measurements of the juice's internal quality through this Vis/NIR spectroscopy technique.
Payne, Courtney E; Wolfrum, Edward J
2015-01-01
Obtaining accurate chemical composition and reactivity (measures of carbohydrate release and yield) information for biomass feedstocks in a timely manner is necessary for the commercialization of biofuels. Our objective was to use near-infrared (NIR) spectroscopy and partial least squares (PLS) multivariate analysis to develop calibration models to predict the feedstock composition and the release and yield of soluble carbohydrates generated by a bench-scale dilute acid pretreatment and enzymatic hydrolysis assay. Major feedstocks included in the calibration models are corn stover, sorghum, switchgrass, perennial cool season grasses, rice straw, and miscanthus. We present individual model statistics to demonstrate model performance and validation samples to more accurately measure predictive quality of the models. The PLS-2 model for composition predicts glucan, xylan, lignin, and ash (wt%) with uncertainties similar to primary measurement methods. A PLS-2 model was developed to predict glucose and xylose release following pretreatment and enzymatic hydrolysis. An additional PLS-2 model was developed to predict glucan and xylan yield. PLS-1 models were developed to predict the sum of glucose/glucan and xylose/xylan for release and yield (grams per gram). The release and yield models have higher uncertainties than the primary methods used to develop the models. It is possible to build effective multispecies feedstock models for composition, as well as carbohydrate release and yield. The model for composition is useful for predicting glucan, xylan, lignin, and ash with good uncertainties. The release and yield models have higher uncertainties; however, these models are useful for rapidly screening sample populations to identify unusual samples.
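The PLS-1 models mentioned in this abstract predict a single response per model. A minimal pure-Python sketch of single-response PLS fitted by the NIPALS algorithm is given below for orientation. It is not the software used in the study; the function names and the toy data are hypothetical, and a real calibration would use a tested chemometrics library and cross-validation to choose the number of components.

```python
import math

def pls1_fit(X, y, n_components):
    """Fit PLS1 (one response) by NIPALS on mean-centred data."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]   # X residuals
    f = [v - y_mean for v in y]                                 # y residuals
    components = []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]  # weights ~ X'y
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:          # nothing left in X that covaries with y
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]  # X loadings
        q = sum(f[i] * t[i] for i in range(n)) / tt                    # y loading
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]  # deflate X
        f = [f[i] - q * t[i] for i in range(n)]                        # deflate y
        components.append((w, p, q))
    return x_mean, y_mean, components

def pls1_predict(model, x):
    """Predict one sample by repeating the centring and deflation steps."""
    x_mean, y_mean, components = model
    e = [v - mu for v, mu in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in components:
        t = sum(a * b for a, b in zip(e, w))
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, p)]
    return y_hat

# Hypothetical toy data: two perfectly collinear predictors, y = 2 * first column.
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
y = [2.0, 4.0, 6.0, 8.0]
model = pls1_fit(X, y, n_components=1)
prediction = pls1_predict(model, [5.0, 10.0])
print(prediction)
```

The toy predictors are deliberately collinear: ordinary multiple regression would be ill-posed on them, while one PLS component is enough to describe the relationship.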
Balabin, Roman M; Smirnov, Sergey V
2011-07-15
Melamine (2,4,6-triamino-1,3,5-triazine) is a nitrogen-rich chemical implicated in the pet and human food recalls and in the global food safety scares involving milk products. Due to the serious health concerns associated with melamine consumption and the extensive scope of affected products, rapid and sensitive methods to detect melamine's presence are essential. We propose the use of spectroscopy data, produced by near-infrared (near-IR/NIR) and mid-infrared (mid-IR/MIR) spectroscopies in particular, for melamine detection in complex dairy matrixes. None of the IR-based methods for melamine detection reported to date has unambiguously shown wide applicability to different dairy products together with a limit of detection (LOD) below 1 ppm on an independent sample set. It was found that infrared spectroscopy is an effective tool to detect melamine in dairy products, such as infant formula, milk powder, or liquid milk. A LOD below 1 ppm (0.76±0.11 ppm) can be reached if a correct spectrum preprocessing (pretreatment) technique and a correct multivariate data analysis (MDA) algorithm, namely partial least squares regression (PLS), polynomial PLS (Poly-PLS), artificial neural network (ANN), support vector regression (SVR), or least squares support vector machine (LS-SVM), are used for spectrum analysis. The relationship between MIR/NIR spectrum of milk products and melamine content is nonlinear. Thus, nonlinear regression methods are needed to correctly predict the triazine-derivative content of milk products. It can be concluded that mid- and near-infrared spectroscopy can be regarded as a quick, sensitive, robust, and low-cost method for liquid milk, infant formula, and milk powder analysis. Copyright © 2011 Elsevier B.V. All rights reserved.
Wang, Chang; Huang, Chichao; Qian, Jian; Xiao, Jian; Li, Huan; Wen, Yongli; He, Xinhua; Ran, Wei; Shen, Qirong; Yu, Guanghui
2014-01-01
The composting industry has been growing rapidly in China because of a boom in the animal industry. Therefore, a rapid and accurate assessment of the quality of commercial organic fertilizers is of the utmost importance. In this study, a novel technique that combines near infrared (NIR) spectroscopy with partial least squares (PLS) analysis is developed for rapidly and accurately assessing commercial organic fertilizers quality. A total of 104 commercial organic fertilizers were collected from full-scale compost factories in Jiangsu Province, east China. In general, the NIR-PLS technique showed accurate predictions of the total organic matter, water soluble organic nitrogen, pH, and germination index; less accurate results of the moisture, total nitrogen, and electrical conductivity; and the least accurate results for water soluble organic carbon. Our results suggested the combined NIR-PLS technique could be applied as a valuable tool to rapidly and accurately assess the quality of commercial organic fertilizers. PMID:24586313
Martelo-Vidal, M J; Vázquez, M
2014-09-01
Spectral analysis is a quick and non-destructive method to analyse wine. In this work, trans-resveratrol, oenin, malvin, catechin, epicatechin, quercetin and syringic acid were determined in commercial red wines from DO Rías Baixas and DO Ribeira Sacra (Spain) by UV-VIS-NIR spectroscopy. Calibration models were developed using principal component regression (PCR) or partial least squares (PLS) regression. HPLC was used as the reference method. The results showed that reliable PLS models were obtained to quantify all polyphenols for Rías Baixas wines. For Ribeira Sacra, feasible models were obtained to determine quercetin, epicatechin, oenin and syringic acid. PCR calibration models predicted less reliably than PLS models. For red wines from Mencía grapes, feasible models were obtained for catechin and oenin, regardless of the geographical origin. The results obtained demonstrate that UV-VIS-NIR spectroscopy can be used to determine individual polyphenolic compounds in red wines. Copyright © 2014 Elsevier Ltd. All rights reserved.
Özbalci, Beril; Boyaci, İsmail Hakkı; Topcu, Ali; Kadılar, Cem; Tamer, Uğur
2013-02-15
The aim of this study was to quantify glucose, fructose, sucrose and maltose contents of honey samples using Raman spectroscopy as a rapid method. Quantification of the individual sugar contents from a single measurement has been considered infeasible because of the molecular similarities between sugar molecules in the honey matrix. This bottleneck was overcome by coupling Raman spectroscopy with chemometric methods (principal component analysis (PCA) and partial least squares (PLS)) and an artificial neural network (ANN). Model solutions of four sugars were processed with PCA and significant separation was observed. This operation, done with the spectral features by using PLS and ANN methods, led to the discriminant analysis of sugar contents. Models/trained networks were created using a calibration data set and evaluated using a validation data set. The correlation coefficient values between actual and predicted values of glucose, fructose, sucrose and maltose were determined as 0.964, 0.965, 0.968 and 0.949 for PLS and 0.965, 0.965, 0.978 and 0.956 for ANN, respectively. The requirement of rapid analysis of sugar contents of commercial honeys has been met by the data processed within this article. Copyright © 2012 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Fragkaki, A. G.; Angelis, Y. S.; Tsantili-Kakoulidou, A.; Koupparis, M.; Georgakopoulos, C.
2009-08-01
Anabolic androgenic steroids (AAS) are included in the List of prohibited substances of the World Anti-Doping Agency (WADA) as substances abused to enhance athletic performance. Gas chromatography coupled to mass spectrometry (GC-MS) plays an important role in doping control analyses identifying AAS as their enolized-trimethylsilyl (TMS)-derivatives using the electron ionization (EI) mode. This paper explores the suitability of complementary GC-MS mass spectra and statistical analysis (principal component analysis, PCA and partial least squares-discriminant analysis, PLS-DA) to differentiate AAS as a function of their structural and conformational features expressed by their fragment ions. The results obtained showed that the application of PCA yielded a classification among the AAS molecules which became more apparent after applying PLS-DA to the dataset. The application of PLS-DA yielded a clear separation among the AAS molecules which were, thus, classified as: 1-ene-3-keto, 3-hydroxyl with saturated A-ring, 1-ene-3-hydroxyl, 4-ene-3-keto, 1,4-diene-3-keto and 3-keto with saturated A-ring anabolic steroids. The study of this paper also presents structurally diagnostic fragment ions and dissociation routes providing evidence for the presence of unknown AAS or chemically modified molecules known as designer steroids.
Quantitative determination of wool in textile by near-infrared spectroscopy and multivariate models.
Chen, Hui; Tan, Chao; Lin, Zan
2018-08-05
The wool content in textiles is a key quality index, and its quantitative analysis is important because adulteration is common in both raw and finished textiles. Conventional methods can be complicated, destructive, time-consuming, and environmentally unfriendly, so a quick, easy-to-use and green alternative is of interest. The work focuses on exploring the feasibility of combining near-infrared (NIR) spectroscopy with several partial least squares (PLS)-based algorithms and elastic component regression (ECR) algorithms for measuring wool content in textile. A total of 108 cloth samples with wool content ranging from 0% to 100% (w/w) were collected, and all of the compositions exist on the market. The dataset was divided equally into training and test sets for developing and validating calibration models. When using local PLS, the original spectrum axis was split into 20 sub-intervals. No obvious difference in performance can be seen among the local PLS models. The ECR model is comparable or superior to the other models owing to its flexibility, i.e., it is a transition state between PCR and PLS. It seems that ECR combined with the NIR technique may be a potential method for determining wool content in textile products. In addition, it might have regulatory advantages by avoiding time-consuming and environmentally unfriendly chemical analysis. Copyright © 2018 Elsevier B.V. All rights reserved.
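The splitting of the spectrum axis into 20 sub-intervals for local PLS can be sketched as an index partition. The abstract does not say how the intervals were sized, so the sketch assumes contiguous, near-equal widths; the function name and the number of spectral points (1557) are hypothetical.

```python
def split_intervals(n_variables, n_intervals):
    """Split indices 0..n_variables-1 into contiguous, near-equal half-open intervals."""
    if not 0 < n_intervals <= n_variables:
        raise ValueError("need 0 < n_intervals <= n_variables")
    base, extra = divmod(n_variables, n_intervals)
    bounds, start = [], 0
    for i in range(n_intervals):
        stop = start + base + (1 if i < extra else 0)  # first `extra` intervals get one more point
        bounds.append((start, stop))                   # half-open range [start, stop)
        start = stop
    return bounds

# Hypothetical spectrum of 1557 points split into the 20 sub-intervals named in the abstract.
intervals = split_intervals(1557, 20)
for start, stop in intervals[:3]:
    print(start, stop)
```

A local model would then be calibrated on the columns `start:stop` of the spectral matrix for each interval, and the interval models compared on the test set.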
Gottfried, Jennifer L
2011-07-01
The potential of laser-induced breakdown spectroscopy (LIBS) to discriminate biological and chemical threat simulant residues prepared on multiple substrates and in the presence of interferents has been explored. The simulant samples tested include Bacillus atrophaeus spores, Escherichia coli, MS-2 bacteriophage, α-hemolysin from Staphylococcus aureus, 2-chloroethyl ethyl sulfide, and dimethyl methylphosphonate. The residue samples were prepared on polycarbonate, stainless steel and aluminum foil substrates by Battelle Eastern Science and Technology Center. LIBS spectra were collected by Battelle on a portable LIBS instrument developed by A3 Technologies. This paper presents the chemometric analysis of the LIBS spectra using partial least-squares discriminant analysis (PLS-DA). The performance of PLS-DA models developed based on the full LIBS spectra, and selected emission intensities and ratios have been compared. The full-spectra models generally provided better classification results based on the inclusion of substrate emission features; however, the intensity/ratio models were able to correctly identify more types of simulant residues in the presence of interferents. The fusion of the two types of PLS-DA models resulted in a significant improvement in classification performance for models built using multiple substrates. In addition to identifying the major components of residue mixtures, minor components such as growth media and solvents can be identified with an appropriately designed PLS-DA model.
NASA Astrophysics Data System (ADS)
Gholizadeh, H.; Robeson, S. M.
2015-12-01
Empirical models have been widely used to estimate global chlorophyll content from remotely sensed data. Here, we focus on the standard NASA empirical models that use blue-green band ratios. These band ratio ocean color (OC) algorithms are in the form of fourth-order polynomials and the parameters of these polynomials (i.e. coefficients) are estimated from the NASA bio-Optical Marine Algorithm Data set (NOMAD). Most of the points in this data set have been sampled from tropical and temperate regions. However, polynomial coefficients obtained from this data set are used to estimate chlorophyll content in all ocean regions with different properties such as sea-surface temperature, salinity, and downwelling/upwelling patterns. Further, the polynomial terms in these models are highly correlated. In sum, the limitations of these empirical models are as follows: 1) the independent variables within the empirical models, in their current form, are correlated (multicollinear), and 2) current algorithms are global approaches and are based on the spatial stationarity assumption, so they are independent of location. Multicollinearity problem is resolved by using partial least squares (PLS). PLS, which transforms the data into a set of independent components, can be considered as a combined form of principal component regression (PCR) and multiple regression. Geographically weighted regression (GWR) is also used to investigate the validity of spatial stationarity assumption. GWR solves a regression model over each sample point by using the observations within its neighbourhood. PLS results show that the empirical method underestimates chlorophyll content in high latitudes, including the Southern Ocean region, when compared to PLS (see Figure 1). Cluster analysis of GWR coefficients also shows that the spatial stationarity assumption in empirical models is not likely a valid assumption.
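The multicollinearity noted in this abstract arises because successive powers of the same band-ratio predictor are strongly correlated with one another. A minimal sketch, using hypothetical predictor values rather than NOMAD data, computes the Pearson correlation between the first-order term and the higher-order terms of a fourth-order polynomial.

```python
import math

def pearson(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)

# Hypothetical band-ratio predictor values (illustrative only, not NOMAD data).
ratios = [i / 10 for i in range(1, 11)]
powers = {k: [r ** k for r in ratios] for k in range(1, 5)}  # terms x, x^2, x^3, x^4

correlations = {k: pearson(powers[1], powers[k]) for k in range(2, 5)}
for k, value in correlations.items():
    print(f"corr(x, x^{k}) = {value:.3f}")
```

Correlations this close to 1 make ordinary least-squares coefficients unstable, which is the motivation given in the abstract for projecting the terms onto a small number of PLS components.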
Robust PLS approach for KPI-related prediction and diagnosis against outliers and missing data
NASA Astrophysics Data System (ADS)
Yin, Shen; Wang, Guang; Yang, Xu
2014-07-01
In practical industrial applications, the key performance indicator (KPI)-related prediction and diagnosis are quite important for the product quality and economic benefits. To meet these requirements, many advanced prediction and monitoring approaches have been developed which can be classified into model-based or data-driven techniques. Among these approaches, partial least squares (PLS) is one of the most popular data-driven methods due to its simplicity and easy implementation in large-scale industrial process. As PLS is totally based on the measured process data, the characteristics of the process data are critical for the success of PLS. Outliers and missing values are two common characteristics of the measured data which can severely affect the effectiveness of PLS. To ensure the applicability of PLS in practical industrial applications, this paper introduces a robust version of PLS to deal with outliers and missing values, simultaneously. The effectiveness of the proposed method is finally demonstrated by the application results of the KPI-related prediction and diagnosis on an industrial benchmark of Tennessee Eastman process.
Many multivariate methods are used in describing and predicting relation; each has its unique usage of categorical and non-categorical data. In multivariate analysis of variance (MANOVA), many response variables (y's) are related to many independent variables that are categorical...
Application of visible and near-infrared spectroscopy to classification of Miscanthus species
Jin, Xiaoli; Chen, Xiaoling; Xiao, Liang; ...
2017-04-03
Here, the feasibility of visible and near infrared (NIR) spectroscopy as a tool to classify Miscanthus samples was explored. Three types of Miscanthus plants, namely, M. sinensis, M. sacchariflorus and M. floridulus, were analyzed using a NIR spectrophotometer. Several classification models based on the NIR spectra data were developed using linear discriminant analysis (LDA), partial least squares (PLS), least squares support vector machine regression (LSSVR), radial basis function (RBF) and neural network (NN). The principal component analysis (PCA) presented rough classification with overlapping samples, while the models of Line_LSSVR, RBF_LSSVR and RBF_NN presented almost the same calibration and validation results. Due to the higher speed of Line_LSSVR than RBF_LSSVR and RBF_NN, we selected the line_LSSVR model as a representative. In our study, the model based on line_LSSVR showed higher accuracy than LDA and PLS models. Total correct classification rates of 87.79% and 96.51% were observed for the LDA and PLS models in the testing set, respectively, while the line_LSSVR model showed a total correct classification rate of 99.42%. Meanwhile, the lin_LSSVR model in the testing set showed correct classification rates of 100%, 100% and 96.77% for M. sinensis, M. sacchariflorus and M. floridulus, respectively. The lin_LSSVR model assigned 99.42% of samples to the right groups, except one M. floridulus sample. The results demonstrated that NIR spectra combined with a preliminary morphological classification could be an effective and reliable procedure for the classification of Miscanthus species.
Application of visible and near-infrared spectroscopy to classification of Miscanthus species
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jin, Xiaoli; Chen, Xiaoling; Xiao, Liang
The feasibility of visible and near infrared (NIR) spectroscopy as a tool to classify Miscanthus samples was explored in this study. Three types of Miscanthus plants, namely, M. sinensis, M. sacchariflorus and M. floridulus, were analyzed using a NIR spectrophotometer. Several classification models based on the NIR spectra data were developed using linear discriminant analysis (LDA), partial least squares (PLS), least squares support vector machine regression (LSSVR), radial basis function (RBF) and neural network (NN). The principal component analysis (PCA) presented rough classification with overlapping samples, while the models of Line_LSSVR, RBF_LSSVR and RBF_NN presented almost the same calibration and validation results. Due to the higher speed of Line_LSSVR than RBF_LSSVR and RBF_NN, we selected the line_LSSVR model as a representative. In our study, the model based on line_LSSVR showed higher accuracy than LDA and PLS models. The total correct classification rates of 87.79 and 96.51% were observed based on LDA and PLS model in the testing set, respectively, while the line_LSSVR showed 99.42% of total correct classification rate. Meanwhile, the lin_LSSVR model in the testing set showed correct classification rate of 100, 100 and 96.77% for M. sinensis, M. sacchariflorus and M. floridulus, respectively. The lin_LSSVR model assigned 99.42% of samples to the right groups, except one M. floridulus sample. The results demonstrated that NIR spectra combined with a preliminary morphological classification could be an effective and reliable procedure for the classification of Miscanthus species.
Application of visible and near-infrared spectroscopy to classification of Miscanthus species.
Jin, Xiaoli; Chen, Xiaoling; Xiao, Liang; Shi, Chunhai; Chen, Liang; Yu, Bin; Yi, Zili; Yoo, Ji Hye; Heo, Kweon; Yu, Chang Yeon; Yamada, Toshihiko; Sacks, Erik J; Peng, Junhua
2017-01-01
The feasibility of visible and near infrared (NIR) spectroscopy as a tool to classify Miscanthus samples was explored in this study. Three types of Miscanthus plants, namely, M. sinensis, M. sacchariflorus and M. floridulus, were analyzed using a NIR spectrophotometer. Several classification models based on the NIR spectra data were developed using linear discriminant analysis (LDA), partial least squares (PLS), least squares support vector machine regression (LSSVR), radial basis function (RBF) and neural network (NN). The principal component analysis (PCA) presented rough classification with overlapping samples, while the models of Line_LSSVR, RBF_LSSVR and RBF_NN presented almost the same calibration and validation results. Due to the higher speed of Line_LSSVR than RBF_LSSVR and RBF_NN, we selected the line_LSSVR model as a representative. In our study, the model based on line_LSSVR showed higher accuracy than LDA and PLS models. The total correct classification rates of 87.79 and 96.51% were observed based on LDA and PLS model in the testing set, respectively, while the line_LSSVR showed 99.42% of total correct classification rate. Meanwhile, the lin_LSSVR model in the testing set showed correct classification rate of 100, 100 and 96.77% for M. sinensis, M. sacchariflorus and M. floridulus, respectively. The lin_LSSVR model assigned 99.42% of samples to the right groups, except one M. floridulus sample. The results demonstrated that NIR spectra combined with a preliminary morphological classification could be an effective and reliable procedure for the classification of Miscanthus species.
Application of visible and near-infrared spectroscopy to classification of Miscanthus species
Shi, Chunhai; Chen, Liang; Yu, Bin; Yi, Zili; Yoo, Ji Hye; Heo, Kweon; Yu, Chang Yeon; Yamada, Toshihiko; Sacks, Erik J.; Peng, Junhua
2017-01-01
The feasibility of visible and near infrared (NIR) spectroscopy as a tool to classify Miscanthus samples was explored in this study. Three types of Miscanthus plants, namely, M. sinensis, M. sacchariflorus and M. floridulus, were analyzed using a NIR spectrophotometer. Several classification models based on the NIR spectra data were developed using linear discriminant analysis (LDA), partial least squares (PLS), least squares support vector machine regression (LSSVR), radial basis function (RBF) and neural network (NN). The principal component analysis (PCA) presented rough classification with overlapping samples, while the models of Line_LSSVR, RBF_LSSVR and RBF_NN presented almost the same calibration and validation results. Due to the higher speed of Line_LSSVR than RBF_LSSVR and RBF_NN, we selected the line_LSSVR model as a representative. In our study, the model based on line_LSSVR showed higher accuracy than LDA and PLS models. The total correct classification rates of 87.79 and 96.51% were observed based on LDA and PLS model in the testing set, respectively, while the line_LSSVR showed 99.42% of total correct classification rate. Meanwhile, the lin_LSSVR model in the testing set showed correct classification rate of 100, 100 and 96.77% for M. sinensis, M. sacchariflorus and M. floridulus, respectively. The lin_LSSVR model assigned 99.42% of samples to the right groups, except one M. floridulus sample. The results demonstrated that NIR spectra combined with a preliminary morphological classification could be an effective and reliable procedure for the classification of Miscanthus species. PMID:28369059
NASA Astrophysics Data System (ADS)
Yehia, Ali M.; Mohamed, Heba M.
2016-01-01
Three advanced chemometric-assisted spectrophotometric methods, namely Concentration Residuals Augmented Classical Least Squares (CRACLS), Multivariate Curve Resolution-Alternating Least Squares (MCR-ALS) and Principal Component Analysis-Artificial Neural Networks (PCA-ANN), were developed, validated and benchmarked against PLS calibration to resolve the severely overlapped spectra and simultaneously determine Paracetamol (PAR), Guaifenesin (GUA) and Phenylephrine (PHE) in their ternary mixture and in the presence of p-aminophenol (AP), the main degradation product and synthesis impurity of Paracetamol. The analytical performance of the proposed methods was described by percentage recoveries, root mean square error of calibration and standard error of prediction. The four multivariate calibration methods could be directly used without any preliminary separation step and were successfully applied for pharmaceutical formulation analysis, showing no excipients' interference.
NASA Astrophysics Data System (ADS)
Goudarzi, Nasser
2016-04-01
In this work, two new and powerful chemometrics methods are applied for the modeling and prediction of the 19F chemical shift values of some fluorinated organic compounds. The radial basis function-partial least square (RBF-PLS) and random forest (RF) are employed to construct the models to predict the 19F chemical shifts. In this study, no separate variable selection method was used, since the RF method can serve as both a variable selection and a modeling technique. The effects of the important parameters affecting the RF prediction power, such as the number of trees (nt) and the number of randomly selected variables to split each node (m), were investigated. The root-mean-square errors of prediction (RMSEP) for the training set and the prediction set for the RBF-PLS and RF models were 44.70, 23.86, 29.77, and 23.69, respectively. Also, the correlation coefficients of the prediction set for the RBF-PLS and RF models were 0.8684 and 0.9313, respectively. The results obtained reveal that the RF model can be used as a powerful chemometrics tool for the quantitative structure-property relationship (QSPR) studies.
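The two figures of merit reported here, RMSEP and the correlation coefficient, can be computed from paired reference and predicted values. The sketch below uses only the Python standard library; the function names and the sample numbers are illustrative assumptions, not data from the study.

```python
import math


def rmsep(y_ref, y_pred):
    """Root-mean-square error of prediction over paired values."""
    n = len(y_ref)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(y_ref, y_pred)) / n)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


# Made-up reference and predicted values, for illustration only.
y_ref = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
error = rmsep(y_ref, y_pred)
corr = pearson_r(y_ref, y_pred)
```

RMSEP carries the units of the predicted property, whereas the correlation coefficient is unitless, which is why abstracts of this kind usually report both.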
Gómez-Carracedo, M P; Andrade, J M; Rutledge, D N; Faber, N M
2007-03-07
Selecting the correct dimensionality is critical for obtaining partial least squares (PLS) regression models with good predictive ability. Although calibration and validation sets are best established using experimental designs, industrial laboratories cannot afford such an approach. Typically, samples are collected in a (formally) undesigned way, spread over time, and their measurements are included in routine measurement processes. This makes it hard to evaluate PLS model dimensionality. In this paper, classical criteria (leave-one-out cross-validation and adjusted Wold's criterion) are compared to recently proposed alternatives (smoothed PLS-PoLiSh and a randomization test) to seek out the optimum dimensionality of PLS models. Kerosene (jet fuel) samples were measured by attenuated total reflectance-mid-IR spectrometry and their spectra were used to predict eight important properties determined using reference methods that are time-consuming and prone to analytical errors. The alternative methods were shown to give reliable dimensionality predictions when compared to external validation. By contrast, the simpler methods seemed to be largely affected by the largest changes in the modeling capabilities of the first components.
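Of the criteria named above, leave-one-out cross-validation is the simplest to illustrate. The following is a minimal sketch, assuming a single response (PLS1) fitted by the NIPALS algorithm on mean-centred data and choosing the dimensionality with the smallest PRESS (predicted residual sum of squares). It does not implement the adjusted Wold criterion, PLS-PoLiSh or the randomization test, and all function names are hypothetical.

```python
import math


def fit_pls1(X, y, n_components):
    """Fit PLS1 by NIPALS on mean-centred data (X: list of rows, y: list)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # X residuals
    f = [v - y_mean for v in y]                                 # y residuals
    W, P, Q = [], [], []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left for a further component to explain
            break
        w = [v / norm for v in w]                                # weight vector
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        W.append(w)
        P.append(load)
        Q.append(q)
    return {"x_mean": x_mean, "y_mean": y_mean, "W": W, "P": P, "Q": Q}


def predict_pls1(model, x, n_components):
    """Predict one sample using the first n_components latent variables."""
    e = [x[j] - model["x_mean"][j] for j in range(len(x))]
    y_hat = model["y_mean"]
    parts = list(zip(model["W"], model["P"], model["Q"]))[:n_components]
    for w, load, q in parts:
        t = sum(e[j] * w[j] for j in range(len(e)))
        y_hat += q * t
        e = [e[j] - t * load[j] for j in range(len(e))]
    return y_hat


def loo_press(X, y, max_components):
    """PRESS for 1..max_components from leave-one-out cross-validation."""
    press = [0.0] * max_components
    for i in range(len(X)):
        model = fit_pls1(X[:i] + X[i + 1:], y[:i] + y[i + 1:], max_components)
        for a in range(1, max_components + 1):
            err = y[i] - predict_pls1(model, X[i], a)
            press[a - 1] += err * err
    return press


def select_dimensionality(X, y, max_components):
    """Return (number of components with minimum PRESS, list of PRESS values)."""
    press = loo_press(X, y, max_components)
    return press.index(min(press)) + 1, press
```

Taking the plain PRESS minimum is a deliberate simplification here. It tends to favour the largest models on noisy data, which is one reason more conservative criteria exist.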
Greene, LaVana; Elzey, Brianda; Franklin, Mariah; Fakayode, Sayo O
2017-03-05
The negative health impact of polycyclic aromatic hydrocarbons (PAHs) and differences in pharmacological activity of enantiomers of chiral molecules in humans highlight the need for analysis of PAHs and their chiral analogue molecules in humans. Herein, the first use of cyclodextrin guest-host inclusion complexation, fluorescence spectrophotometry, and a chemometric approach to PAH (anthracene) and chiral-PAH analogue derivative (1-(9-anthryl)-2,2,2-trifluoroethanol (TFE)) analyses is reported. The binding constants (K_b), stoichiometry (n), and thermodynamic properties (Gibbs free energy (ΔG), enthalpy (ΔH), and entropy (ΔS)) of anthracene and enantiomers of TFE-methyl-β-cyclodextrin (Me-β-CD) guest-host complexes were also determined. Chemometric partial-least-square (PLS) regression analysis of emission spectra data of Me-β-CD-guest-host inclusion complexes was used for the determination of anthracene and TFE enantiomer concentrations in Me-β-CD-guest-host inclusion complex samples. The values of calculated K_b and negative ΔG suggest the thermodynamic favorability of anthracene-Me-β-CD and enantiomeric TFE-Me-β-CD inclusion complexation reactions. However, anthracene-Me-β-CD and enantiomer TFE-Me-β-CD inclusion complexations showed notable differences in the binding affinity behaviors and thermodynamic properties. The PLS regression analysis resulted in square-correlation-coefficients of 0.997530 or better and a low LOD of 3.81×10⁻⁷ M for anthracene and 3.48×10⁻⁸ M for TFE enantiomers at physiological conditions. Most importantly, PLS regression accurately determined the anthracene and TFE enantiomer concentrations with an average low error of 2.31% for anthracene, 4.44% for R-TFE and 3.60% for S-TFE. The results of the study are highly significant because of its high sensitivity and accuracy for analysis of PAH and chiral PAH analogue derivatives without the need of an expensive chiral column, enantiomeric resolution, or use of polarized light.
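The thermodynamic quantities named in this abstract are linked by standard relations. The abstract does not say which expressions the authors used, so the following is a hedged reminder rather than their derivation, with $R$ the gas constant and $T$ the absolute temperature:

```latex
\Delta G = -RT \ln K_b, \qquad \Delta G = \Delta H - T\,\Delta S
```

Under the first relation a negative $\Delta G$ corresponds to $K_b > 1$, which is consistent with the reported favorability of the inclusion complexation.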
Published by Elsevier B.V.
Relationship between Composition and Toxicity of Motor Vehicle Emission Samples
McDonald, Jacob D.; Eide, Ingvar; Seagrave, JeanClare; Zielinska, Barbara; Whitney, Kevin; Lawson, Douglas R.; Mauderly, Joe L.
2004-01-01
In this study we investigated the statistical relationship between particle and semivolatile organic chemical constituents in gasoline and diesel vehicle exhaust samples, and toxicity as measured by inflammation and tissue damage in rat lungs and mutagenicity in bacteria. Exhaust samples were collected from “normal” and “high-emitting” gasoline and diesel light-duty vehicles. We employed a combination of principal component analysis (PCA) and partial least-squares regression (PLS; also known as projection to latent structures) to evaluate the relationships between chemical composition of vehicle exhaust and toxicity. The PLS analysis revealed the chemical constituents covarying most strongly with toxicity and produced models predicting the relative toxicity of the samples with good accuracy. The specific nitro-polycyclic aromatic hydrocarbons important for mutagenicity were the same chemicals that have been implicated by decades of bioassay-directed fractionation. These chemicals were not related to lung toxicity, which was associated with organic carbon and select organic compounds that are present in lubricating oil. The results demonstrate the utility of the PCA/PLS approach for evaluating composition–response relationships in complex mixture exposures and also provide a starting point for confirming causality and determining the mechanisms of the lung effects. PMID:15531438
Detection of triglycerides using immobilized enzymes in food and biological samples
NASA Astrophysics Data System (ADS)
Raichur, Ashish; Lesi, Abiodun; Pedersen, Henrik
1996-04-01
A scheme for the determination of total triglyceride (fat) content in biomedical and food samples is being developed. The primary emphasis is to minimize the reagents used, simplify sample preparation and develop a robust system that would facilitate on-line monitoring. The new detection scheme developed thus far involves extracting triglycerides into an organic solvent (cyclohexane) and performing partial least squares (PLS) analysis on the NIR (1100 - 2500 nm) absorbance spectra of the solution. A training set using 132 spectra of known triglyceride mixtures was compiled. Eight PLS calibrations were generated and were used to predict the total fat extracted from commercial samples such as mayonnaise, butter, corn oil and coconut oil. The results typically gave a correlation coefficient (r) of 0.99 or better. Predictions were typically within 90% and better at higher concentrations. Experiments were also performed using an immobilized lipase reactor to hydrolyze the fat extracted into the organic solvent. Performing PLS analysis on the difference spectra of the substrate and product could enhance specificity. This is being verified experimentally. Further work with biomedical samples is to be performed. This scheme may be developed into a feasible detection method for triglycerides in the biomedical and food industries.
Miller, Arthur L.; Weakley, Andrew Todd; Griffiths, Peter R.; Cauda, Emanuele G.; Bayman, Sean
2017-01-01
In order to help reduce silicosis in miners, the National Institute for Occupational Safety and Health (NIOSH) is developing field-portable methods for measuring airborne respirable crystalline silica (RCS), specifically the polymorph α-quartz, in mine dusts. In this study we demonstrate the feasibility of end-of-shift measurement of α-quartz using a direct-on-filter (DoF) method to analyze coal mine dust samples deposited onto polyvinyl chloride filters. The DoF method is potentially amenable for on-site analyses, but deviates from the current regulatory determination of RCS for coal mines by eliminating two sample preparation steps: ashing the sampling filter and redepositing the ash prior to quantification by Fourier transform infrared (FT-IR) spectrometry. In this study, the FT-IR spectra of 66 coal dust samples from active mines were used, and the RCS was quantified by using: (1) an ordinary least squares (OLS) calibration approach that utilizes standard silica material as done in the Mine Safety and Health Administration's P7 method; and (2) a partial least squares (PLS) regression approach. Both were capable of accounting for kaolinite, which can confound the IR analysis of silica. The OLS method utilized analytical standards for silica calibration and kaolin correction, resulting in a good linear correlation with P7 results and minimal bias but with the accuracy limited by the presence of kaolinite. The PLS approach also produced predictions well-correlated to the P7 method, as well as better accuracy in RCS prediction, and no bias due to variable kaolinite mass. Besides decreased sensitivity to mineral or substrate confounders, PLS has the advantage that the analyst is not required to correct for the presence of kaolinite or background interferences related to the substrate, making the method potentially viable for automated RCS prediction in the field.
This study demonstrated the efficacy of FT-IR transmission spectrometry for silica determination in coal mine dusts, using both OLS and PLS analyses, when kaolinite was present. PMID:27645724
Miller, Arthur L; Weakley, Andrew Todd; Griffiths, Peter R; Cauda, Emanuele G; Bayman, Sean
2017-05-01
In order to help reduce silicosis in miners, the National Institute for Occupational Safety and Health (NIOSH) is developing field-portable methods for measuring airborne respirable crystalline silica (RCS), specifically the polymorph α-quartz, in mine dusts. In this study we demonstrate the feasibility of end-of-shift measurement of α-quartz using a direct-on-filter (DoF) method to analyze coal mine dust samples deposited onto polyvinyl chloride filters. The DoF method is potentially amenable for on-site analyses, but deviates from the current regulatory determination of RCS for coal mines by eliminating two sample preparation steps: ashing the sampling filter and redepositing the ash prior to quantification by Fourier transform infrared (FT-IR) spectrometry. In this study, the FT-IR spectra of 66 coal dust samples from active mines were used, and the RCS was quantified by using: (1) an ordinary least squares (OLS) calibration approach that utilizes standard silica material as done in the Mine Safety and Health Administration's P7 method; and (2) a partial least squares (PLS) regression approach. Both were capable of accounting for kaolinite, which can confound the IR analysis of silica. The OLS method utilized analytical standards for silica calibration and kaolin correction, resulting in a good linear correlation with P7 results and minimal bias but with the accuracy limited by the presence of kaolinite. The PLS approach also produced predictions well-correlated to the P7 method, as well as better accuracy in RCS prediction, and no bias due to variable kaolinite mass. Besides decreased sensitivity to mineral or substrate confounders, PLS has the advantage that the analyst is not required to correct for the presence of kaolinite or background interferences related to the substrate, making the method potentially viable for automated RCS prediction in the field.
This study demonstrated the efficacy of FT-IR transmission spectrometry for silica determination in coal mine dusts, using both OLS and PLS analyses, when kaolinite was present.
Riahi, Siavash; Hadiloo, Farshad; Milani, Seyed Mohammad R; Davarkhah, Nazila; Ganjali, Mohammad R; Norouzi, Parviz; Seyfi, Payam
2011-05-01
The predictive accuracy of different chemometric methods was compared when applied to ordinary UV spectra and first-order derivative spectra. Principal component regression (PCR) and partial least squares with one dependent variable (PLS1) and two dependent variables (PLS2) were applied to spectral data of a pharmaceutical formulation containing pseudoephedrine (PDP) and guaifenesin (GFN). The ability of the derivative to resolve overlapping spectra of chlorpheniramine maleate was evaluated when multivariate methods are adopted for analysis of two-component mixtures without using any chemical pretreatment. The chemometric models were tested on an external validation dataset and finally applied to the analysis of pharmaceuticals. Significant advantages were found in analysis of the real samples when the calibration models from derivative spectra were used. It should also be mentioned that the proposed method is a simple and rapid way requiring no preliminary separation steps and can be used easily for the analysis of these compounds, especially in quality control laboratories. Copyright © 2011 John Wiley & Sons, Ltd.
Wang, Jun; Kliks, Michael M; Jun, Soojin; Jackson, Mel; Li, Qing X
2010-03-01
Quantitative analysis of glucose, fructose, sucrose, and maltose in honey samples of different geographic origins worldwide using Fourier transform infrared (FTIR) spectroscopy and chemometrics such as partial least squares (PLS) and principal component regression was studied. The calibration series consisted of 45 standard mixtures, which were made up of glucose, fructose, sucrose, and maltose. There were distinct peak variations of all sugar mixtures in the spectral "fingerprint" region between 1500 and 800 cm⁻¹. The calibration model was successfully validated using 7 synthetic blend sets of sugars. The PLS 2nd-derivative model showed the highest degree of prediction accuracy, with a highest R² value of 0.999. Along with the canonical variate analysis, the calibration model, further validated by high-performance liquid chromatography measurements for commercial honey samples, demonstrates that FTIR can qualitatively and quantitatively determine the presence of glucose, fructose, sucrose, and maltose in multiple regional honey samples.
Liu, Xue-Mei; Liu, Jian-She
2012-11-01
Visible and short-wave near-infrared spectroscopy (Vis/SW-NIRS) was investigated in the present study for its accuracy in measuring soil properties, namely available nitrogen (N) and available potassium (K). Three types of pretreatments, including standard normal variate (SNV), multiplicative scattering correction (MSC) and Savitzky-Golay smoothing + first derivative, were adopted to eliminate system noise and external disturbances. Then partial least squares (PLS) and least squares-support vector machine (LS-SVM) analyses were implemented to build calibration models. In addition, the performance of LS-SVM models was compared with three kinds of inputs: principal components (PCs) from PCA, latent variables (LVs), and effective wavelengths (EWs). The results indicated that all LS-SVM models outperformed PLS models. The performance of the model was evaluated by the correlation coefficient (r²) and RMSEP. The optimal models were the EWs-LS-SVM models, with r² and RMSEP of 0.82 and 17.2 for N and 0.72 and 15.0 for K, respectively. The results indicated that Vis/SW-NIRS (325-1075 nm) combined with LS-SVM could be utilized as a precise method for the determination of soil properties.
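Of the pretreatments listed, standard normal variate (SNV) is the easiest to show in a few lines: each spectrum is centred on its own mean and scaled by its own standard deviation. The sketch below is a minimal stdlib-only illustration; the function name `snv` and the use of the sample (n - 1) standard deviation are assumptions, and the abstract does not describe the authors' implementation.

```python
import math


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own statistics."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    # Sample standard deviation (n - 1 denominator), assumed here by convention.
    sd = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / sd for v in spectrum]


# Two made-up spectra that differ only by a multiplicative factor.
weak = snv([1.0, 2.0, 3.0])
strong = snv([10.0, 20.0, 30.0])
```

Because the transform is applied spectrum by spectrum, a purely multiplicative or additive difference between two spectra disappears after SNV, which is the effect such pretreatments are meant to achieve.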
Hutengs, Christopher; Ludwig, Bernard; Jung, András; Eisele, Andreas; Vohland, Michael
2018-03-27
Mid-infrared (MIR) spectroscopy has received widespread interest as a method to complement traditional soil analysis. Recently available portable MIR spectrometers additionally offer potential for on-site applications, given sufficient spectral data quality. We therefore tested the performance of the Agilent 4300 Handheld FTIR (DRIFT spectra) in comparison to a Bruker Tensor 27 bench-top instrument in terms of (i) spectral quality and measurement noise quantified by wavelet analysis; (ii) accuracy of partial least squares (PLS) calibrations for soil organic carbon (SOC), total nitrogen (N), pH, clay and sand content with a repeated cross-validation analysis; and (iii) key spectral regions for these soil properties identified with a Monte Carlo spectral variable selection approach. Measurements and multivariate calibrations with the handheld device were as good as or slightly better than Bruker equipped with a DRIFT accessory, but not as accurate as with directional hemispherical reflectance (DHR) data collected with an integrating sphere. Variations in noise did not markedly affect the accuracy of multivariate PLS calibrations. Identified key spectral regions for PLS calibrations provided a good match between Agilent and Bruker DHR data, especially for SOC and N. Our findings suggest that portable FTIR instruments are a viable alternative for MIR measurements in the laboratory and offer great potential for on-site applications.
NASA Astrophysics Data System (ADS)
Attia, Khalid A. M.; Nassar, Mohammed W. I.; El-Zeiny, Mohamed B.; Serag, Ahmed
2016-04-01
Three simple, specific, accurate and precise spectrophotometric methods were developed for the determination of cefprozil (CZ) in the presence of its alkaline induced degradation product (DCZ). The first method was the bivariate method, while the two other multivariate methods were partial least squares (PLS) and spectral residual augmented classical least squares (SRACLS). The multivariate methods were applied with and without variable selection procedure (genetic algorithm GA). These methods were tested by analyzing laboratory prepared mixtures of the above drug with its alkaline induced degradation product and they were applied to its commercial pharmaceutical products.
Phenolic Analysis and Theoretic Design for Chinese Commercial Wines' Authentication.
Li, Si-Yu; Zhu, Bao-Qing; Reeves, Malcolm J; Duan, Chang-Qing
2018-01-01
To develop a robust tool for Chinese commercial wines' varietal, regional, and vintage authentication, phenolic compounds in 121 Chinese commercial dry red wines were detected and quantified by using high-performance liquid chromatography triple-quadrupole mass spectrometry (HPLC-QqQ-MS/MS), and differentiation abilities of principal component analysis (PCA), partial least squares discriminant analysis (PLS-DA), and orthogonal partial least squares discriminant analysis (OPLS-DA) were compared. Better than PCA and PLS-DA, OPLS-DA models used to differentiate wines according to their varieties (Cabernet Sauvignon or other varieties), regions (east or west Cabernet Sauvignon wines), and vintages (young or old Cabernet Sauvignon wines) were ideally established. The S-plot provided in OPLS-DA models showed the key phenolic compounds which were both statistically and biochemically significant in sample differentiation. Besides, the potential of the OPLS-DA models in deeper sample differentiating of more detailed regional and vintage information of wines was proved optimistic. On the basis of our results, a promising theoretic design for wine authentication was further proposed for the first time, which might be helpful in practical authentication of more commercial wines. The phenolic data of 121 Chinese commercial dry red wines was processed with different statistical tools for varietal, regional, and vintage differentiation. A promising theoretical design was summarized, which might be helpful for wine authentication in practical situation. © 2017 Institute of Food Technologists®.
Study on rapid valid acidity evaluation of apple by fiber optic diffuse reflectance technique
NASA Astrophysics Data System (ADS)
Liu, Yande; Ying, Yibin; Fu, Xiaping; Jiang, Xuesong
2004-03-01
Some issues related to nondestructive evaluation of valid acidity in intact apples by means of the Fourier transform near infrared (FTNIR) (800-2631 nm) method were addressed. A relationship was established between the diffuse reflectance spectra recorded with a bifurcated optic fiber and the valid acidity. The data were analyzed by multivariate calibration analysis such as partial least squares (PLS) analysis and the principal component regression (PCR) technique. A total of 120 Fuji apples were tested and 80 of them were used to form a calibration data set. The influence of data preprocessing and different spectra treatments was also investigated. Models based on smoothing spectra were slightly worse than models based on derivative spectra, and the best result was obtained when the segment length was 5 and the gap size was 10. Depending on data preprocessing and multivariate calibration technique, the best prediction model had a correlation coefficient (0.871), a low RMSEP (0.0677), a low RMSEC (0.056) and a small difference between RMSEP and RMSEC by PLS analysis. The results point out the feasibility of FTNIR spectral analysis to predict the fruit valid acidity non-destructively. The ratio of data standard deviation to the root mean square error of prediction (SDR) is better to be less than 3 in calibration models; however, the results cannot meet the demand of actual application. Therefore, further study is required for better calibration and prediction.
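The SDR defined in this abstract is the standard deviation of the reference values divided by the RMSEP. A minimal sketch of that ratio follows; the function name `sdr`, the sample (n - 1) standard deviation and the numbers are illustrative assumptions, since the abstract reports an RMSEP of 0.0677 but not the standard deviation of the acidity data.

```python
import math


def sdr(y_ref, y_pred):
    """Ratio of the reference-value standard deviation to the RMSEP."""
    n = len(y_ref)
    mean = sum(y_ref) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in y_ref) / (n - 1))
    rmsep = math.sqrt(sum((r - p) ** 2 for r, p in zip(y_ref, y_pred)) / n)
    return sd / rmsep


# Made-up reference and predicted values, for illustration only.
ratio = sdr([3.0, 3.5, 4.0, 4.5], [3.1, 3.4, 4.1, 4.4])
```

The ratio expresses the prediction error relative to the natural spread of the property in the sample set, so it can be compared across data sets with different units.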
NASA Astrophysics Data System (ADS)
Hadad, Ghada M.; El-Gindy, Alaa; Mahmoud, Waleed M. M.
2008-08-01
High-performance liquid chromatography (HPLC) and multivariate spectrophotometric methods are described for the simultaneous determination of ambroxol hydrochloride (AM) and doxycycline (DX) in combined pharmaceutical capsules. The chromatographic separation was achieved on reversed-phase C18 analytical column with a mobile phase consisting of a mixture of 20 mM potassium dihydrogen phosphate, pH 6-acetonitrile in ratio of (1:1, v/v) and UV detection at 245 nm. Also, the resolution has been accomplished by using numerical spectrophotometric methods as classical least squares (CLS), principal component regression (PCR) and partial least squares (PLS-1) applied to the UV spectra of the mixture and graphical spectrophotometric method as first derivative of the ratio spectra (¹DD) method. Analytical figures of merit (FOM), such as sensitivity, selectivity, analytical sensitivity, limit of quantitation and limit of detection were determined for CLS, PLS-1 and PCR methods. The proposed methods were validated and successfully applied for the analysis of pharmaceutical formulation and laboratory-prepared mixtures containing the two component combination.
NASA Astrophysics Data System (ADS)
Metwally, Fadia H.
2008-02-01
The quantitative predictive abilities of the new and simple bivariate spectrophotometric method are compared with the results obtained by the use of multivariate calibration methods [the classical least squares (CLS), principal component regression (PCR) and partial least squares (PLS)], using the information contained in the absorption spectra of the appropriate solutions. Mixtures of the two drugs Nifuroxazide (NIF) and Drotaverine hydrochloride (DRO) were resolved by application of the bivariate method. The different chemometric approaches were also applied with previous optimization of the calibration matrix, as they are useful in simultaneous inclusion of many spectral wavelengths. The results found by application of the bivariate, CLS, PCR and PLS methods for the simultaneous determinations of mixtures of both components containing 2-12 μg ml⁻¹ of NIF and 2-8 μg ml⁻¹ of DRO are reported. Both approaches were satisfactorily applied to the simultaneous determination of NIF and DRO in pure form and in pharmaceutical formulation. The results were in accordance with those given by the EVA Pharma reference spectrophotometric method.
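Classical least squares, one of the multivariate methods compared here, treats the mixture spectrum as a Beer's-law sum of pure-component contributions and solves for the concentrations over many wavelengths at once. A minimal two-component sketch follows, solving the 2x2 normal equations directly. The function name, the absorptivity values and the concentrations are hypothetical and are not taken from the NIF/DRO study.

```python
def cls_two_component(k1, k2, absorbance):
    """Least-squares concentrations (c1, c2) for absorbance ~= c1*k1 + c2*k2.

    k1, k2: pure-component absorptivities at each wavelength (unit path length).
    absorbance: measured mixture absorbance at the same wavelengths.
    """
    s11 = sum(v * v for v in k1)
    s22 = sum(v * v for v in k2)
    s12 = sum(u * v for u, v in zip(k1, k2))
    b1 = sum(u * v for u, v in zip(k1, absorbance))
    b2 = sum(u * v for u, v in zip(k2, absorbance))
    det = s11 * s22 - s12 * s12
    if abs(det) < 1e-15:
        raise ValueError("pure-component spectra are collinear")
    # Cramer's rule on the normal equations [[s11, s12], [s12, s22]] c = [b1, b2].
    c1 = (b1 * s22 - b2 * s12) / det
    c2 = (b2 * s11 - b1 * s12) / det
    return c1, c2


# Hypothetical absorptivities at four wavelengths and a synthetic mixture.
k_a = [0.10, 0.20, 0.05, 0.01]
k_b = [0.02, 0.08, 0.15, 0.12]
mixture = [6.0 * a + 4.0 * b for a, b in zip(k_a, k_b)]
conc_a, conc_b = cls_two_component(k_a, k_b, mixture)
```

A bivariate method can be viewed as the special case with exactly two wavelengths, where the system is solved exactly instead of in the least-squares sense.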
Hadad, Ghada M; El-Gindy, Alaa; Mahmoud, Waleed M M
2008-08-01
High-performance liquid chromatography (HPLC) and multivariate spectrophotometric methods are described for the simultaneous determination of ambroxol hydrochloride (AM) and doxycycline (DX) in combined pharmaceutical capsules. The chromatographic separation was achieved on reversed-phase C18 analytical column with a mobile phase consisting of a mixture of 20 mM potassium dihydrogen phosphate, pH 6-acetonitrile in ratio of (1:1, v/v) and UV detection at 245 nm. Also, the resolution has been accomplished by using numerical spectrophotometric methods as classical least squares (CLS), principal component regression (PCR) and partial least squares (PLS-1) applied to the UV spectra of the mixture and graphical spectrophotometric method as first derivative of the ratio spectra (¹DD) method. Analytical figures of merit (FOM), such as sensitivity, selectivity, analytical sensitivity, limit of quantitation and limit of detection were determined for CLS, PLS-1 and PCR methods. The proposed methods were validated and successfully applied for the analysis of pharmaceutical formulation and laboratory-prepared mixtures containing the two component combination.
Blood analysis by Raman spectroscopy.
Enejder, Annika M K; Koo, Tae-Woong; Oh, Jeankun; Hunter, Martin; Sasic, Slobodan; Feld, Michael S; Horowitz, Gary L
2002-11-15
Concentrations of multiple analytes were simultaneously measured in whole blood with clinical accuracy, without sample processing, using near-infrared Raman spectroscopy. Spectra were acquired with an instrument employing nonimaging optics, designed using Monte Carlo simulations of the influence of light-scattering-absorbing blood cells on the excitation and emission of Raman light in turbid medium. Raman spectra were collected from whole blood drawn from 31 individuals. Quantitative predictions of glucose, urea, total protein, albumin, triglycerides, hematocrit, and hemoglobin were made by means of partial least-squares (PLS) analysis with clinically relevant precision (r(2) values >0.93). The similarity of the features of the PLS calibration spectra to those of the respective analyte spectra illustrates that the predictions are based on molecular information carried by the Raman light. This demonstrates the feasibility of using Raman spectroscopy for quantitative measurements of biomolecular contents in highly light-scattering and absorbing media.
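Partial least squares regression with a single response (PLS-1), the technique used for each analyte in the abstract above, can be sketched with the textbook NIPALS algorithm. This is a minimal illustration on synthetic data, not the authors' implementation; the function names `pls1_fit` and `pls1_predict`, the data dimensions and the three-factor latent structure are all assumptions.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit PLS-1 by NIPALS; returns (coefficients, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean            # centred residual matrices
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr                          # weight: covariance direction
        w /= np.linalg.norm(w)
        t = Xr @ w                             # scores
        tt = t @ t
        p = Xr.T @ t / tt                      # X loadings
        qa = yr @ t / tt                       # y loading
        Xr = Xr - np.outer(t, p)               # deflate X
        yr = yr - qa * t                       # deflate y
        W.append(w); P.append(p); q.append(qa)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    b = W @ np.linalg.solve(P.T @ W, q)        # regression vector in X space
    return b, x_mean, y_mean

def pls1_predict(model, X):
    b, x_mean, y_mean = model
    return (X - x_mean) @ b + y_mean

# Synthetic "spectra" driven by three latent factors (noise-free).
rng = np.random.default_rng(1)
T = rng.normal(size=(40, 3))                   # latent factors per sample
L = rng.normal(size=(3, 25))                   # factor-to-channel loadings
X_all = T @ L
y_all = T @ np.array([1.0, -2.0, 0.5]) + 4.0   # response linear in the factors

X_cal, y_cal = X_all[:30], y_all[:30]          # calibration set
X_val, y_val = X_all[30:], y_all[30:]          # independent validation set

model = pls1_fit(X_cal, y_cal, n_components=3)
y_val_pred = pls1_predict(model, X_val)
```

With noise-free data and as many components as latent factors, the validation predictions match the reference values; on real spectra the number of components is normally chosen by cross-validation.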
HPLC-based metabolic profiling and quality control of leaves of different Panax species
Yang, Seung-Ok; Lee, Sang Won; Kim, Young Ock; Sohn, Sang-Hyun; Kim, Young Chang; Hyun, Dong Yoon; Hong, Yoon Pyo; Shin, Yu Su
2013-01-01
Leaves from Panax ginseng Meyer (Korean origin and Chinese origin of Korean ginseng) and P. quinquefolius (American ginseng) were harvested in Haenam province, Korea, and were analyzed to investigate patterns in major metabolites using HPLC-based metabolic profiling. Partial least squares discriminant analysis (PLS-DA) was used to analyze the HPLC chromatogram data. There was a clear separation between Panax species and/or origins from different countries in the PLS-DA score plots. The ginsenoside compounds of Rg1, Re, Rg2, Rb2, Rb3, and Rd in Korean leaves were higher than in Chinese and American ginseng leaves, and the Rb1 level in P. quinquefolius leaves was higher than in P. ginseng (Korean origin or Chinese origin). HPLC chromatogram data coupled with multivariate statistical analysis can be used to profile the metabolite content and undertake quality control of Panax products. PMID:23717177
Üstündağ, Özgür; Dinç, Erdal; Özdemir, Nurten; Tilkan, M Günseli
2015-01-01
In the development of new and generic drug products, the simultaneous in-vitro dissolution behavior of oral dosage formulations is the most important indication for the quantitative estimation of the efficiency and biopharmaceutical characteristics of drug substances. This drives scientists in the field to develop powerful analytical methods that give more reliable, precise and accurate results in the quantitative analysis and dissolution testing of drug formulations. In this context, two chemometric tools, partial least squares (PLS) and principal component regression (PCR), were developed for the simultaneous quantitative estimation and dissolution testing of zidovudine (ZID) and lamivudine (LAM) in a tablet dosage form. The results obtained in this study strongly encourage their use for the quality control, routine analysis and dissolution testing of marketed tablets containing ZID and LAM.
Noninvasive and fast measurement of blood glucose in vivo by near infrared (NIR) spectroscopy
NASA Astrophysics Data System (ADS)
Jintao, Xue; Liming, Ye; Yufei, Liu; Chunyan, Li; Han, Chen
2017-05-01
This research aimed to develop a method for noninvasive and fast blood glucose assay in vivo. Near-infrared (NIR) spectroscopy, a more promising technique compared to other methods, was investigated in rats with diabetes and normal rats. Calibration models were generated by two different multivariate strategies: partial least squares (PLS) as a linear regression method and artificial neural networks (ANN) as a non-linear regression method. The PLS model was optimized individually by considering spectral range, spectral pretreatment methods and number of model factors, while the ANN model was studied individually by selecting spectral pretreatment methods, parameters of network topology, number of hidden neurons, and number of epochs. The results of the validation showed the two models were robust, accurate and repeatable. Compared to the ANN model, the performance of the PLS model was much better, with a lower root mean square error of validation (RMSEP) of 0.419 and a higher correlation coefficient (R) of 96.22%.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Payne, Courtney E.; Wolfrum, Edward J.
2015-03-12
Obtaining accurate chemical composition and reactivity (measures of carbohydrate release and yield) information for biomass feedstocks in a timely manner is necessary for the commercialization of biofuels. Our objective was to use near-infrared (NIR) spectroscopy and partial least squares (PLS) multivariate analysis to develop calibration models to predict the feedstock composition and the release and yield of soluble carbohydrates generated by a bench-scale dilute acid pretreatment and enzymatic hydrolysis assay. Major feedstocks included in the calibration models are corn stover, sorghum, switchgrass, perennial cool season grasses, rice straw, and miscanthus. We present individual model statistics to demonstrate model performance and validation samples to more accurately measure predictive quality of the models. The PLS-2 model for composition predicts glucan, xylan, lignin, and ash (wt%) with uncertainties similar to primary measurement methods. A PLS-2 model was developed to predict glucose and xylose release following pretreatment and enzymatic hydrolysis. An additional PLS-2 model was developed to predict glucan and xylan yield. PLS-1 models were developed to predict the sum of glucose/glucan and xylose/xylan for release and yield (grams per gram). The release and yield models have higher uncertainties than the primary methods used to develop the models. In conclusion, it is possible to build effective multispecies feedstock models for composition, as well as carbohydrate release and yield. The model for composition is useful for predicting glucan, xylan, lignin, and ash with good uncertainties. The release and yield models have higher uncertainties; however, these models are useful for rapidly screening sample populations to identify unusual samples.
NASA Astrophysics Data System (ADS)
Krepper, Gabriela; Romeo, Florencia; Fernandes, David Douglas de Sousa; Diniz, Paulo Henrique Gonçalves Dias; de Araújo, Mário César Ugulino; Di Nezio, María Susana; Pistonesi, Marcelo Fabián; Centurión, María Eugenia
2018-01-01
Determining fat content in hamburgers is very important to minimize or control the negative effects of fat on human health, effects such as cardiovascular diseases and obesity, which are caused by the high consumption of saturated fatty acids and cholesterol. This study proposed an alternative analytical method based on Near Infrared Spectroscopy (NIR) and Successive Projections Algorithm for interval selection in Partial Least Squares regression (iSPA-PLS) for fat content determination in commercial chicken hamburgers. For this, 70 hamburger samples with a fat content ranging from 14.27 to 32.12 mg kg-1 were prepared based on the upper limit recommended by the Argentinean Food Codex, which is 20% (w w-1). NIR spectra were then recorded and preprocessed by applying different approaches: baseline correction, SNV, MSC, and Savitzky-Golay smoothing. For comparison, full-spectrum PLS and Interval PLS were also used. The best performance for the prediction set was obtained for the first derivative Savitzky-Golay smoothing with a second-order polynomial and window size of 19 points, achieving a coefficient of correlation of 0.94, RMSEP of 1.59 mg kg-1, REP of 7.69% and RPD of 3.02. The proposed methodology represents an excellent alternative to the conventional Soxhlet extraction method, since waste generation is avoided, yet without the use of either chemical reagents or solvents, which follows the primary principles of Green Chemistry. The new method was successfully applied to chicken hamburger analysis, and the results agreed with the reference values at a 95% confidence level, making it very attractive for routine analysis.
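The prediction-set figures of merit quoted in the abstract above (RMSEP, REP and RPD) can be computed as in the sketch below. The abstract does not define them, so the conventions used here are assumptions: REP is taken as RMSEP relative to the mean reference value, and RPD as the sample standard deviation of the reference values divided by RMSEP. The numbers in the demo are made up for illustration.

```python
import numpy as np

def rmsep(y_ref, y_pred):
    """Root mean square error of prediction."""
    y_ref, y_pred = np.asarray(y_ref, float), np.asarray(y_pred, float)
    return float(np.sqrt(np.mean((y_pred - y_ref) ** 2)))

def rep_percent(y_ref, y_pred):
    """Relative error of prediction: RMSEP as a percentage of the mean reference value."""
    return 100.0 * rmsep(y_ref, y_pred) / float(np.mean(y_ref))

def rpd(y_ref, y_pred):
    """Ratio of performance to deviation: SD of reference values over RMSEP."""
    return float(np.std(y_ref, ddof=1)) / rmsep(y_ref, y_pred)

# Made-up reference and predicted values (not data from the study).
y_ref = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 32.0, 38.0]

print(rmsep(y_ref, y_pred))        # 2.0
print(rep_percent(y_ref, y_pred))  # 8.0
print(round(rpd(y_ref, y_pred), 3))
```

Every prediction in the demo is off by exactly 2 units, so RMSEP is 2.0 and, with a mean reference value of 25, REP is 8%.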
Francisco, Fabiane Lacerda; Saviano, Alessandro Morais; Almeida, Túlia de Souza Botelho; Lourenço, Felipe Rebello
2016-05-01
Microbiological assays are widely used to estimate the relative potencies of antibiotics in order to guarantee the efficacy, safety, and quality of drug products. Despite the advantages of turbidimetric bioassays when compared to other methods, they have limitations concerning the linearity and range of the dose-response curve determination. Here, we proposed to use partial least squares (PLS) regression to solve these limitations and to improve the prediction of relative potencies of antibiotics. Kinetic-reading microplate turbidimetric bioassays for apramycin and vancomycin were performed using Escherichia coli (ATCC 8739) and Bacillus subtilis (ATCC 6633), respectively. Microbial growth was measured as absorbance up to 180 and 300 min for the apramycin and vancomycin turbidimetric bioassays, respectively. Conventional dose-response curves (absorbances or area under the microbial growth curve vs. log of antibiotic concentration) showed significant regression; however, there was significant deviation from linearity. Thus, they could not be used for relative potency estimations. PLS regression allowed us to construct a predictive model for estimating the relative potencies of apramycin and vancomycin without over-fitting, and it improved the linear range of the turbidimetric bioassay. In addition, PLS regression provided predictions of relative potencies equivalent to those obtained from official agar diffusion methods. Therefore, we conclude that PLS regression may be used to estimate the relative potencies of antibiotics with significant advantages when compared to conventional dose-response curve determination. Copyright © 2016 Elsevier B.V. All rights reserved.
Yehia, Ali M; Mohamed, Heba M
2016-01-05
Three advanced chemometric-assisted spectrophotometric methods, namely Concentration Residuals Augmented Classical Least Squares (CRACLS), Multivariate Curve Resolution-Alternating Least Squares (MCR-ALS) and Principal Component Analysis-Artificial Neural Networks (PCA-ANN), were developed, validated and benchmarked against PLS calibration to resolve the severely overlapped spectra and simultaneously determine Paracetamol (PAR), Guaifenesin (GUA) and Phenylephrine (PHE) in their ternary mixture and in the presence of p-aminophenol (AP), the main degradation product and synthesis impurity of Paracetamol. The analytical performance of the proposed methods was described by percentage recoveries, root mean square error of calibration and standard error of prediction. The four multivariate calibration methods could be used directly without any preliminary separation step and were successfully applied to pharmaceutical formulation analysis, showing no interference from excipients. Copyright © 2015 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
de Oliveira, Isadora R. N.; Roque, Jussara V.; Maia, Mariza P.; Stringheta, Paulo C.; Teófilo, Reinaldo F.
2018-04-01
A new method was developed to determine the antioxidant properties of red cabbage extract (Brassica oleracea) by mid (MID) and near (NIR) infrared spectroscopies and partial least squares (PLS) regression. A 70% (v/v) ethanolic extract of red cabbage was concentrated to 9° Brix and further diluted (12 to 100%) in water. The dilutions were used as external standards for the building of PLS models. For the first time, this strategy was applied for building multivariate regression models. Reference analyses and spectral data were obtained from diluted extracts. The properties determined were total and monomeric anthocyanins, total polyphenols and antioxidant capacity by the ABTS (2,2-azino-bis(3-ethyl-benzothiazoline-6-sulfonate)) and DPPH (2,2-diphenyl-1-picrylhydrazyl) methods. Ordered predictors selection (OPS) and genetic algorithm (GA) were used for feature selection before PLS regression (PLS-1). In addition, a PLS-2 regression was applied to all properties simultaneously. PLS-1 models were more predictive than PLS-2 regression. PLS-OPS and PLS-GA models presented excellent prediction results with a correlation coefficient higher than 0.98. However, the best models were obtained using PLS and variable selection with the OPS algorithm, and the models based on NIR spectra were considered more predictive for all properties. These models thus provide a simple, rapid and accurate method for determining the antioxidant properties of red cabbage extract and are suitable for use in the food industry.
Towards molecular design using 2D-molecular contour maps obtained from PLS regression coefficients
NASA Astrophysics Data System (ADS)
Borges, Cleber N.; Barigye, Stephen J.; Freitas, Matheus P.
2017-12-01
The multivariate image analysis descriptors used in quantitative structure-activity relationships are direct representations of chemical structures as they are simply numerical decodifications of pixels forming the 2D chemical images. These MDs have found great utility in the modeling of diverse properties of organic molecules. Given the multicollinearity and high dimensionality of the data matrices generated with the MIA-QSAR approach, modeling techniques that involve the projection of the data space onto orthogonal components e.g. Partial Least Squares (PLS) have been generally used. However, the chemical interpretation of the PLS-based MIA-QSAR models, in terms of the structural moieties affecting the modeled bioactivity has not been straightforward. This work describes the 2D-contour maps based on the PLS regression coefficients, as a means of assessing the relevance of single MIA predictors to the response variable, and thus allowing for the structural, electronic and physicochemical interpretation of the MIA-QSAR models. A sample study to demonstrate the utility of the 2D-contour maps to design novel drug-like molecules is performed using a dataset of some anti-HIV-1 2-amino-6-arylsulfonylbenzonitriles and derivatives, and the inferences obtained are consistent with other reports in the literature. In addition, the different schemes for encoding atomic properties in molecules are discussed and evaluated.
Wu, Jing-zhu; Wang, Feng-zhu; Wang, Li-li; Zhang, Xiao-chao; Mao, Wen-hua
2015-01-01
In order to improve the accuracy and robustness of detecting tomato seedling nitrogen content by near-infrared (NIR) spectroscopy, four characteristic-spectrum selection methods were studied in the present paper, i.e. competitive adaptive reweighted sampling (CARS), Monte Carlo uninformative variable elimination (MCUVE), backward interval partial least squares (BiPLS) and synergy interval partial least squares (SiPLS). In total, 60 tomato seedlings were cultivated at 10 different nitrogen-treatment levels (urea concentration from 0 to 120 mg L-1), with 6 samples at each level, covering excess nitrogen, moderate nitrogen, nitrogen deficiency and no nitrogen. Leaves of each sample were collected and scanned by near-infrared spectroscopy from 12500 to 3600 cm-1. Quantitative models based on the above four methods were established. According to the experimental results, the calibration models based on the CARS and MCUVE selection methods performed better than those based on BiPLS and SiPLS, but their prediction ability was much lower than that of the latter. Among them, the model built by BiPLS had the best prediction performance: the correlation coefficient (r), root mean square error of prediction (RMSEP) and ratio of performance to standard deviation (RPD) were 0.9527, 0.1183 and 3.291, respectively. Therefore, NIR technology combined with characteristic-spectrum selection methods can improve model performance, but these selection methods are not universal. Models built on single-wavelength variable selection are more sensitive and are better suited to uniform samples, while models built on wavelength-interval selection have much stronger anti-interference ability and are better suited to uneven samples with poor reproducibility. Therefore, characteristic-spectrum selection improves model building only when the sample state and the model indexes are also taken into account.
Quantification of trace metals in infant formula premixes using laser-induced breakdown spectroscopy
NASA Astrophysics Data System (ADS)
Cama-Moncunill, Raquel; Casado-Gavalda, Maria P.; Cama-Moncunill, Xavier; Markiewicz-Keszycka, Maria; Dixit, Yash; Cullen, Patrick J.; Sullivan, Carl
2017-09-01
Infant formula is a human milk substitute generally based upon fortified cow milk components. In order to mimic the composition of breast milk, trace elements such as copper, iron and zinc are usually added in a single operation using a premix. The correct addition of premixes must be verified to ensure that the target levels in infant formulae are achieved. In this study, a laser-induced breakdown spectroscopy (LIBS) system was assessed as a fast validation tool for trace element premixes. LIBS is a promising emission spectroscopic technique for elemental analysis, which offers real-time analyses, little to no sample preparation and ease of use. LIBS was employed for copper and iron determinations of premix samples ranging approximately from 0 to 120 mg/kg Cu/1640 mg/kg Fe. LIBS spectra are affected by several parameters, hindering subsequent quantitative analyses. This work aimed at testing three matrix-matched calibration approaches (simple-linear regression, multi-linear regression and partial least squares regression (PLS)) as means for precision and accuracy enhancement of LIBS quantitative analysis. All calibration models were first developed using a training set and then validated with an independent test set. PLS yielded the best results. For instance, the PLS model for copper provided a coefficient of determination (R2) of 0.995 and a root mean square error of prediction (RMSEP) of 14 mg/kg. Furthermore, LIBS was employed to penetrate through the samples by repetitively measuring the same spot. Consequently, LIBS spectra can be obtained as a function of sample layers. This information was used to explore whether measuring deeper into the sample could reduce possible surface-contaminant effects and provide better quantifications.
Tursi, Antonio; Mastromarino, Paola; Capobianco, Daniela; Elisei, Walter; Miccheli, Alfredo; Capuani, Giorgio; Tomassini, Alberta; Campagna, Giuseppe; Picchio, Marcello; Giorgetti, GianMarco; Fabiocchi, Federica; Brandimarte, Giovanni
2016-10-01
The aim of this study was to assess fecal microbiota and metabolome in a population with symptomatic uncomplicated diverticular disease (SUDD). Whether intestinal microbiota and metabolic profiling may be altered in patients with SUDD is unknown. Stool samples from 44 consecutive women [15 patients with SUDD, 13 with asymptomatic diverticulosis (AD), and 16 healthy controls (HCs)] were analyzed. Real-time polymerase chain reaction was used to quantify targeted microorganisms. High-resolution proton nuclear magnetic resonance spectroscopy associated with multivariate analysis with partial least-square discriminant analysis (PLS-DA) was applied to the metabolite data set. The overall bacterial quantity did not differ among the 3 groups (P=0.449), with no difference in Bacteroides/Prevotella, Clostridium coccoides, Bifidobacterium, Lactobacillus, and Escherichia coli subgroups. The amount of Akkermansia muciniphila species was significantly different between HC, AD, and SUDD subjects (P=0.017). PLS-DA analysis of nuclear magnetic resonance-based metabolomics associated with microbiological data showed significant discrimination between HCs and AD patients (R=0.733; Q=0.383; P<0.05, LV=2). PLS analysis showed lower N-acetyl compound and isovalerate levels in AD, associated with higher levels of A. muciniphila, as compared with the HC group. PLS-DA applied to AD and SUDD samples showed a good discrimination between these 2 groups (R=0.69; Q=0.35; LV=2). SUDD patients were characterized by low levels of valerate, butyrate, and choline and by high levels of N-acetyl derivatives and U1. SUDD and AD do not show colonic bacterial overgrowth, but a significant difference in the levels of fecal A. muciniphila was observed. Moreover, increasing expression of some metabolites as expression of different AD and SUDD metabolic activity was found.
Burgués, Javier; Marco, Santiago
2018-08-17
Metal oxide semiconductor (MOX) sensors are usually temperature-modulated and calibrated with multivariate models such as partial least squares (PLS) to increase the inherent low selectivity of this technology. The multivariate sensor response patterns exhibit heteroscedastic and correlated noise, which suggests that maximum likelihood methods should outperform PLS. One contribution of this paper is the comparison between PLS and maximum likelihood principal components regression (MLPCR) in MOX sensors. PLS is often criticized for its lack of interpretability when the model complexity increases beyond the chemical rank of the problem. This happens in MOX sensors due to cross-sensitivities to interferences, such as temperature or humidity, and non-linearity. Additionally, the estimation of fundamental figures of merit, such as the limit of detection (LOD), is still not standardized in multivariate models. Orthogonalization methods, such as orthogonal projection to latent structures (O-PLS), have been successfully applied in other fields to reduce the complexity of PLS models. In this work, we propose a LOD estimation method based on applying the well-accepted univariate LOD formulas to the scores of the first component of an orthogonal PLS model. The resulting LOD is compared to the multivariate LOD range derived from error-propagation. The methodology is applied to data extracted from temperature-modulated MOX sensors (FIS SB-500-12 and Figaro TGS 3870-A04), aiming at the detection of low concentrations of carbon monoxide in the presence of uncontrolled humidity (chemical noise). We found that PLS models were simpler and more accurate than MLPCR models. Average LOD values of 0.79 ppm (FIS) and 1.06 ppm (Figaro) were found using the approach described in this paper. These values were contained within the LOD ranges obtained with the error-propagation approach. The mean LOD increased to 1.13 ppm (FIS) and 1.59 ppm (Figaro) when considering validation samples collected two weeks after calibration, which represents a 43% and 46% degradation, respectively. The orthogonal score-plot was a very convenient tool to visualize MOX sensor data and to validate the LOD estimates. Copyright © 2018 Elsevier B.V. All rights reserved.
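The idea of applying a univariate LOD formula to first-component scores, as described in the abstract above, can be illustrated with a short sketch. The abstract does not say which univariate formula was used, so this example assumes one common choice, LOD = 3.3 · s / |slope|, where s is the residual standard deviation of a straight-line calibration of score against concentration. The function name `univariate_lod` and the synthetic scores are hypothetical.

```python
import numpy as np

def univariate_lod(conc, signal, k=3.3):
    """LOD from a straight-line calibration: k * residual SD / |slope|."""
    conc, signal = np.asarray(conc, float), np.asarray(signal, float)
    slope, intercept = np.polyfit(conc, signal, 1)
    residuals = signal - (slope * conc + intercept)
    # Two fitted parameters, hence n - 2 degrees of freedom.
    s_res = np.sqrt(np.sum(residuals ** 2) / (conc.size - 2))
    return float(k * s_res / abs(slope))

# Synthetic stand-in for first-component scores versus concentration (ppm).
rng = np.random.default_rng(0)
conc = np.repeat(np.arange(0.0, 10.0, 2.0), 4)          # 5 levels, 4 replicates
scores = 2.0 * conc + 0.5 + rng.normal(scale=0.1, size=conc.size)

lod = univariate_lod(conc, scores)
print(f"estimated LOD: {lod:.3f} ppm")
```

In this sketch the single score axis plays the role of a univariate signal; whether that is appropriate for a given sensor depends on how well the first predictive component isolates the analyte from interferences such as humidity.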
Han, Xue; Jiang, Hong; Zhang, Dingkun; Zhang, Yingying; Xiong, Xi; Jiao, Jiaojiao; Xu, Runchun; Yang, Ming; Han, Li; Lin, Junzhi
2017-01-01
Background: Current astringency evaluation for herbs no longer meets the requirements of the pharmaceutical process, and a new method is needed to assess astringency accurately. Methods: First, quinine, sucrose, citric acid, sodium chloride, monosodium glutamate, and tannic acid (TA) were analyzed by electronic tongue (e-tongue) to determine the approximate region of astringency in the partial least square (PLS) map. Second, different concentrations of TA were measured to define the standard curve of astringency. A coordinate-concentration relationship was then obtained by fitting the PLS abscissa of the standard curve to the corresponding concentration. Third, Chebulae Fructus (CF), Yuganzi throat tablets (YGZTT), and Sanlejiang oral liquid (SLJOL) were tested to define their region in the PLS map. Finally, the astringent intensities of the samples were calculated from the standard coordinate-concentration relationship and expressed as concentrations of TA. Euclidean distance (Ed) analysis and a human sensory test were then carried out to verify the results. Results: The fitting equation between concentration and abscissa of TA was Y = 0.00498 × e^(−X/0.51035) + 0.10905 (r = 0.999). The astringency of 1, 0.1 mg/mL CF was predicted at 0.28, 0.12 mg/mL TA; 2, 0.2 mg/mL YGZTT was predicted at 0.18, 0.11 mg/mL TA; 0.002, 0.0002 mg/mL SLJOL was predicted at 0.15, 0.10 mg/mL TA. The validation results showed that the astringency predicted by e-tongue was basically consistent with human sensory evaluation and was more accurate than Ed analysis. Conclusion: The study indicated that the established method was objective and feasible. It provided a new quantitative method for the astringency of herbs.
SUMMARY: The astringency of Chebulae Fructus, Yuganzi throat tablets, and Sanlejiang oral liquid was predicted by electronic tongue. Euclidean distance analysis and human sensory test verified the results. A new strategy which was objective, simple, and sensitive to compare astringent intensity of herbs and preparations was provided. Abbreviations used: CF: Chebulae Fructus; E-tongue: Electronic tongue; Ed: Euclidean distance; PLS: Partial least square; PCA: Principal component analysis; SLJOL: Sanlejiang oral liquid; TA: Tannic acid; VAS: Visual analog scale; YGZTT: Yuganzi throat tablets. PMID:28839378
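The fitted coordinate-concentration curve reported in the Results above can be evaluated directly. The sketch below assumes that Y is the tannic acid concentration in mg/mL and X is the PLS-map abscissa, which is how the abstract reads but is not stated explicitly; the function names are hypothetical.

```python
import math

# Fitted constants quoted in the abstract.
A, TAU, Y0 = 0.00498, 0.51035, 0.10905

def ta_equivalent(x):
    """TA-equivalent concentration (assumed mg/mL) for a PLS abscissa x."""
    return A * math.exp(-x / TAU) + Y0

def abscissa_for(y):
    """Inverse of the fitted curve; defined only above the asymptote Y0."""
    if y <= Y0:
        raise ValueError("concentration must exceed the curve's asymptote")
    return -TAU * math.log((y - Y0) / A)

# Round trip: a TA-equivalent of 0.28 maps to an abscissa and back.
x_028 = abscissa_for(0.28)
print(round(x_028, 3), round(ta_equivalent(x_028), 3))
```

The curve decreases monotonically with X and levels off at 0.10905, so TA-equivalent values close to that asymptote correspond to a wide range of abscissa values and are resolved less sharply than higher values.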
Wu, Xia; Zhu, Jian-Cheng; Zhang, Yu; Li, Wei-Min; Rong, Xiang-Lu; Feng, Yi-Fan
2016-08-25
The potential impact of lipid research has been increasingly recognized in both disease treatment and prevention. An effective metabolomics approach based on ultra-performance liquid chromatography/quadrupole-time-of-flight mass spectrometry (UPLC/Q-TOF-MS) along with multivariate statistical analysis was applied to investigate the dynamic change of plasma phospholipid composition in early type 2 diabetic rats after treatment with Huang-Qi-San, an ancient prescription of Chinese Medicine. The exported UPLC/Q-TOF-MS data of plasma samples were subjected to SIMCA-P and processed by the bioMark, mixOmics and Rcomdr packages with R software. Clear score plots of the plasma sample groups, including the normal control group (NC), model group (MC), positive medicine control group (Flu) and Huang-Qi-San group (HQS), were achieved by principal-components analysis (PCA), partial least-squares discriminant analysis (PLS-DA) and orthogonal partial least-squares discriminant analysis (OPLS-DA). Biomarkers were screened out using Student's t-test, principal component regression (PCR), partial least-squares regression (PLS) and the important variable method (variable influence on projection, VIP). Structures of metabolites were identified and metabolic pathways were deduced by correlation coefficient. The relationship between compounds was explained by the correlation coefficient diagram, and the metabolic differences between similar compounds were illustrated. Based on the KEGG database, the biological significance of the identified biomarkers was described. The correlation coefficient was applied for the first time to identify the structure and deduce the metabolic pathways of phospholipid metabolites, and the study provided a new methodological cue for further understanding the molecular mechanisms by which Huang-Qi-San regulates metabolites in treating early type 2 diabetes. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Pappas, Christos; Kyraleou, Maria; Voskidi, Eleni; Kotseridis, Yorgos; Taranilis, Petros A; Kallithraka, Stamatina
2015-02-01
The mean degree of polymerization (mDP) and the degree of galloylation (%G) in grape seeds were determined directly and simultaneously using diffuse reflectance infrared Fourier transform spectroscopy and partial least squares (PLS). The results were compared with those obtained using the conventional analysis employing phloroglucinolysis as pretreatment followed by high performance liquid chromatography with UV and mass spectrometry detection. Infrared spectra were recorded from solid-state samples after freeze drying. The 2nd derivative of the 1832 to 1416 and 918 to 739 cm(-1) spectral regions was used for the quantification of mDP and the 2nd derivative of the 1813 to 607 cm(-1) spectral region for the determination of %G, both with PLS regression. The determination coefficients (R(2)) of mDP and %G were 0.99 and 0.98, respectively. The corresponding values of the root-mean-square error of calibration were 0.506 and 0.692, of the root-mean-square error of cross validation 0.811 and 0.921, and of the root-mean-square error of prediction 0.612 and 0.801. Compared with the conventional method, the proposed method is simpler, less time consuming and more economical, and it requires smaller quantities of chemical reagents and fewer sample pretreatment steps. It could be a starting point for the design of more specific models according to the requirements of the wineries. © 2015 Institute of Food Technologists®
NASA Astrophysics Data System (ADS)
Luna, Aderval S.; da Silva, Arnaldo P.; Pinho, Jéssica S. A.; Ferré, Joan; Boqué, Ricard
Near infrared (NIR) spectroscopy and multivariate classification were applied to discriminate soybean oil samples into non-transgenic and transgenic. Principal Component Analysis (PCA) was applied to extract relevant features from the spectral data and to remove the anomalous samples. The best results were obtained with Support Vector Machine-Discriminant Analysis (SVM-DA) and Partial Least Squares-Discriminant Analysis (PLS-DA) after mean centering plus multiplicative scatter correction. For SVM-DA, the percentage of successful classification was 100% for the training group, and 100% and 90% in the validation group for non-transgenic and transgenic soybean oil samples, respectively. For PLS-DA, the percentage of successful classification was 95% and 100% in the training group for non-transgenic and transgenic soybean oil samples, respectively, and 100% and 80% in the validation group for non-transgenic and transgenic samples, respectively. The results demonstrate that NIR spectroscopy can provide a rapid, nondestructive and reliable method to distinguish non-transgenic and transgenic soybean oils.
Carranco, Núria; Farrés-Cebrián, Mireia; Saurina, Javier
2018-01-01
High performance liquid chromatography method with ultra-violet detection (HPLC-UV) fingerprinting was applied for the analysis and characterization of olive oils, and was performed using a Zorbax Eclipse XDB-C8 reversed-phase column under gradient elution, employing 0.1% formic acid aqueous solution and methanol as mobile phase. More than 130 edible oils, including monovarietal extra-virgin olive oils (EVOOs) and other vegetable oils, were analyzed. Principal component analysis results showed a noticeable discrimination between olive oils and other vegetable oils using raw HPLC-UV chromatographic profiles as data descriptors. However, selected HPLC-UV chromatographic time-window segments were necessary to achieve discrimination among monovarietal EVOOs. Partial least square (PLS) regression was employed to tackle olive oil authentication of Arbequina EVOO adulterated with Picual EVOO, a refined olive oil, and sunflower oil. Highly satisfactory results were obtained after PLS analysis, with overall errors in the quantitation of adulteration in the Arbequina EVOO (minimum 2.5% adulterant) below 2.9%. PMID:29561820
Rapid detection of talcum powder in tea using FT-IR spectroscopy coupled with chemometrics
Li, Xiaoli; Zhang, Yuying; He, Yong
2016-01-01
This paper investigated the feasibility of Fourier transform infrared transmission (FT-IR) spectroscopy to detect talcum powder illegally added to tea, based on chemometric methods. Firstly, 210 samples of tea powder with 13 dose levels of talcum powder were prepared for FT-IR spectra acquisition. In order to highlight the slight variations in FT-IR spectra, smoothing, normalization and standard normal variate (SNV) were employed to preprocess the raw spectra. Among them, SNV preprocessing had the best performance, with a high correlation of prediction (RP = 0.948) and a low root mean square error of prediction (RMSEP = 0.108) for the partial least squares (PLS) model. Then 18 characteristic wavenumbers were selected based on a hybrid of backward interval partial least squares (biPLS) regression, the competitive adaptive reweighted sampling (CARS) algorithm and the successive projections algorithm (SPA). These characteristic wavenumbers accounted for only 0.64% of the full set of wavenumbers. Following that, the 18 characteristic wavenumbers were used to build linear and nonlinear determination models by PLS regression and extreme learning machine (ELM), respectively. The optimal model, with RP = 0.963 and RMSEP = 0.137, was achieved by the ELM algorithm. These results demonstrated that FT-IR spectroscopy with chemometrics could be used successfully to detect talcum powder in tea. PMID:27468701
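The SNV preprocessing mentioned in the abstract above can be sketched as follows. This is a minimal illustration assuming the usual definition of SNV (each spectrum is centred on its own mean and divided by its own sample standard deviation); the function name `snv` and the example values are made up here and are not taken from the paper.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre one spectrum on its own mean
    and scale it by its own sample standard deviation (n - 1)."""
    m = mean(spectrum)
    s = stdev(spectrum)
    if s == 0:
        raise ValueError("a flat spectrum cannot be SNV-scaled")
    return [(v - m) / s for v in spectrum]


# Hypothetical absorbance values for a single spectrum (illustration only).
raw = [0.12, 0.15, 0.30, 0.22, 0.18]
corrected = snv(raw)
```

Because the correction uses only the spectrum itself, it can be applied to new samples without reference to the calibration set.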
NASA Astrophysics Data System (ADS)
Tewari, Jagdish C.; Dixit, Vivechana; Cho, Byoung-Kwan; Malik, Kamal A.
2008-12-01
The capacity to confirm the variety or origin and to estimate the sucrose, glucose and fructose contents of citrus fruits is of major interest to the citrus juice industry. A rapid classification and quantification technique was developed and validated for simultaneously and nondestructively quantifying the sugar constituent concentrations and identifying the origin of citrus fruits using Fourier Transform Near-Infrared (FT-NIR) spectroscopy in conjunction with Artificial Neural Network (ANN) using genetic algorithm, Chemometrics and Correspondence Analysis (CA). To acquire good classification accuracy and to present a wide range of concentrations of sucrose, glucose and fructose, we collected 22 different varieties of citrus fruits from the market during the entire citrus season. FT-NIR spectra were recorded in the NIR region from 1100 to 2500 nm using the fiber optic probe, and three types of data analysis were performed. Chemometrics analysis using Partial Least Squares (PLS) was performed in order to determine the concentration of individual sugars. Artificial Neural Network analysis was performed for classification, origin or variety identification of citrus fruits using genetic algorithm. Correspondence analysis was performed in order to visualize the relationship between the citrus fruits. To compute a PLS model based upon the reference values and to validate the developed method, high performance liquid chromatography (HPLC) was performed. Spectral range and the number of PLS factors were optimized for the lowest standard error of calibration (SEC), prediction (SEP) and correlation coefficient (R²). The calibration model developed was able to assess the sucrose, glucose and fructose contents in unknown citrus fruit up to an R² value of 0.996-0.998. Numbers of factors from F1 to F10 were optimized for correspondence analysis for relationship visualization of citrus fruits based on the output values of genetic algorithm.
ANN and CA analysis showed excellent classification of citrus according to the variety to which they belong and well-classified citrus according to their origin. The technique has potential in rapid determination of sugars content and to identify different varieties and origins of citrus in citrus juice industry.
NASA Astrophysics Data System (ADS)
Palou, Anna; Miró, Aira; Blanco, Marcelo; Larraz, Rafael; Gómez, José Francisco; Martínez, Teresa; González, Josep Maria; Alcalà, Manel
2017-06-01
Although the feasibility of using near infrared (NIR) spectroscopy combined with partial least squares (PLS) regression for prediction of physico-chemical properties of biodiesel/diesel blends has been widely demonstrated, including the whole variability of diesel samples from diverse production origins in the calibration sets remains an important challenge when constructing the models. This work presents a useful strategy for the systematic selection of calibration sets of samples of biodiesel/diesel blends from diverse origins, based on a binary code, principal components analysis (PCA) and the Kennard-Stone algorithm. Results show that, using this methodology, the models can keep their robustness over time. PLS calculations were done using specialized chemometric software as well as the software of the NIR instrument installed in the plant, and both produced RMSEP values below the reproducibility values of the reference methods. The models have been proven for on-line simultaneous determination of seven properties: density, cetane index, fatty acid methyl esters (FAME) content, cloud point, boiling point at 95% of recovery, flash point and sulphur.
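The Kennard-Stone selection named in the abstract above can be sketched as below. This is a generic textbook version under stated assumptions: Euclidean distance, seeding with the two most distant samples, and rows that would in practice be PCA scores. The binary-code step of the paper is not reproduced, and the function name `kennard_stone` and the toy data are hypothetical.

```python
import math


def kennard_stone(samples, k):
    """Return the indices of k samples chosen by the Kennard-Stone rule:
    start from the two most distant samples, then repeatedly add the sample
    whose nearest already-selected neighbour is farthest away."""
    n = len(samples)
    if not 2 <= k <= n:
        raise ValueError("k must be between 2 and the number of samples")
    dist = [[math.dist(a, b) for b in samples] for a in samples]
    pairs = ((i, j) for i in range(n) for j in range(i + 1, n))
    first, second = max(pairs, key=lambda p: dist[p[0]][p[1]])
    selected = [first, second]
    while len(selected) < k:
        remaining = [i for i in range(n) if i not in selected]
        nxt = max(remaining, key=lambda i: min(dist[i][s] for s in selected))
        selected.append(nxt)
    return selected


# Toy one-dimensional "scores" (illustration only).
scores = [[0.0], [1.0], [2.0], [10.0]]
calibration_idx = kennard_stone(scores, 3)
```

The selected indices would form the calibration set; the samples left over can serve as a validation set.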
Cebeci Maltaş, Derya; Kwok, Kaho; Wang, Ping; Taylor, Lynne S; Ben-Amotz, Dor
2013-06-01
Identifying pharmaceutical ingredients is a routine procedure required during industrial manufacturing. Here we show that a recently developed Raman compressive detection strategy can be employed to classify various widely used pharmaceutical materials using a hybrid supervised/unsupervised strategy in which only two ingredients are used for training and yet six other ingredients can also be distinguished. More specifically, our liquid crystal spatial light modulator (LC-SLM) based compressive detection instrument is trained using only the active ingredient, tadalafil, and the excipient, lactose, but is tested using these and various other excipients: microcrystalline cellulose, magnesium stearate, titanium (IV) oxide, talc, sodium lauryl sulfate and hydroxypropyl cellulose. Partial least squares discriminant analysis (PLS-DA) is used to generate the compressive detection filters necessary for fast chemical classification. Although the filters used in this study are trained on only lactose and tadalafil, we show that all the pharmaceutical ingredients mentioned above can be differentiated and classified using PLS-DA compressive detection filters with an accumulation time of 10 ms per filter. Copyright © 2013 Elsevier B.V. All rights reserved.
Mohammadi Moghaddam, Toktam; Razavi, Seyed M A; Taghizadeh, Masoud; Sazgarnia, Ameneh
2016-01-01
Roasting is an important step in the processing of pistachio nuts. The effects of hot air roasting temperature (90, 120 and 150 °C), time (20, 35 and 50 min) and air velocity (0.5, 1.5 and 2.5 m/s) on the textural and sensory characteristics of pistachio nuts and kernels were investigated. The results showed that increasing the roasting temperature decreased the fracture force (82-25.54 N), instrumental hardness (82.76-37.59 N), apparent modulus of elasticity (47-21.22 N/s) and compressive energy (280.73-101.18 N.s), and increased the bitterness (1-2.5) and the hardness score (6-8.40) of pistachio kernels. Longer roasting times improved the flavor of the samples. The results of the consumer test showed that the roasted pistachio kernels have good acceptability for flavor (score 5.83-8.40), color (score 7.20-8.40) and hardness (score 6-8.40). Moreover, Partial Least Square (PLS) analysis of instrumental and sensory data provided important information for the correlation of objective and subjective properties. The univariate analysis showed that over 93.87% of the variation in sensory hardness and almost 87% of the variation in sensory acceptability could be explained by instrumental texture properties.
Chotimah, Chusnul; Sudjadi; Riyanto, Sugeng; Rohman, Abdul
2015-01-01
Purpose: Analysis of drugs in multicomponent systems is officially carried out using chromatographic techniques; however, these techniques are laborious and require sophisticated instruments. Therefore, UV-VIS spectrophotometry coupled with multivariate calibration by partial least squares (PLS) was developed for quantitative analysis of metamizole, thiamin and pyridoxin in the presence of cyanocobalamine without any separation step. Methods: Calibration and validation samples were prepared. The calibration model was built from a series of sample mixtures consisting of these drugs in certain proportions. Cross validation of the calibration samples using the leave-one-out technique was used to identify the smaller set of components that provides the greatest predictive ability. The evaluation of the calibration model was based on the coefficient of determination (R2) and the root mean square error of calibration (RMSEC). Results: The coefficient of determination (R2) for the relationship between actual values and predicted values for all studied drugs was higher than 0.99, indicating good accuracy. The RMSEC values obtained were relatively low, indicating good precision. The accuracy and precision results of the developed method showed no significant difference compared to those obtained by the official HPLC method. Conclusion: The developed method (UV-VIS spectrophotometry in combination with PLS) was successfully used for analysis of metamizole, thiamin and pyridoxin in tablet dosage form. PMID:26819934
[Quality evaluation of American ginseng using UPLC coupled with multivariate analysis].
Tang, Yan; Yan, Shu-Mo; Wang, Jing-Jing; Yuan, Yuan; Yang, Bin
2016-05-01
An ultra performance liquid chromatography (UPLC) method combined with multivariate data analysis was developed to evaluate the quality of American ginseng by simultaneously determining the concentrations of six ginsenosides (Rg₁, Re, Rb₁, Rc, Ro and Rd) in the samples. For UPLC, acetonitrile with 0.01% formic acid and water with 0.01% formic acid were used as the mobile phase with gradient elution. Under the established chromatographic conditions, the six ginsenosides could be well separated, and the results for linearity, stability, precision, repeatability and recovery rate all met the requirements of quantitative analysis. The total contents of Rg₁, Re and Rb₁ in all 57 samples met the requirement of the 2015 edition of the Chinese Pharmacopoeia. The experimental data were also analyzed by principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA). The crude drugs and the decoction pieces can be discriminated by PCA, and samples of different ages can be distinguished by PLS-DA. Copyright© by the Chinese Pharmaceutical Association.
Kim, So-Hyun; Cho, Somi K; Hyun, Sun-Hee; Park, Hae-Eun; Kim, Young-Suk; Choi, Hyung-Kyoon
2011-01-01
Guava leaves were classified and the free radical scavenging activity (FRSA) evaluated according to different harvest times by using the (1)H-NMR-based metabolomic technique. A principal component analysis (PCA) of (1)H-NMR data from the guava leaves provided clear clusters according to the harvesting time. A partial least squares (PLS) analysis indicated a correlation between the metabolic profile and FRSA. FRSA levels of the guava leaves harvested during May and August were high, and those leaves contained higher amounts of 3-hydroxybutyric acid, acetic acid, glutamic acid, asparagine, citric acid, malonic acid, trans-aconitic acid, ascorbic acid, maleic acid, cis-aconitic acid, epicatechin, protocatechuic acid, and xanthine than the leaves harvested during October and December. Epicatechin and protocatechuic acid among those compounds seem to have enhanced FRSA of the guava leaf samples harvested in May and August. A PLS regression model was established to predict guava leaf FRSA at different harvesting times by using a (1)H-NMR data set. The predictability of the PLS model was then tested by internal and external validation. The results of this study indicate that (1)H-NMR-based metabolomic data could usefully characterize guava leaves according to their time of harvesting.
NASA Astrophysics Data System (ADS)
Hu, Leqian; Ma, Shuai; Yin, Chunling
2018-03-01
In this work, fluorescence spectroscopy combined with multi-way pattern recognition techniques was developed for determining the geographical origin of kudzu root and for detecting and quantifying adulterants in kudzu root. Excitation-emission matrix (EEM) spectra were obtained for 150 pure kudzu root samples of different geographical origins and 150 fake kudzu roots with different adulteration proportions by recording emission from 330 to 570 nm with excitation in the range of 320-480 nm. Multi-way principal components analysis (M-PCA) and multilinear partial least squares discriminant analysis (N-PLS-DA) methods were used to decompose the excitation-emission matrix datasets. The 150 pure kudzu root samples could be differentiated exactly from each other according to their geographical origins by the M-PCA and N-PLS-DA models. For the adulterated kudzu root samples, N-PLS-DA gave a better and more reliable classification result than the M-PCA model. The results obtained in this study indicated that EEM spectroscopy coupled with multi-way pattern recognition could be used as an easy, rapid and novel tool to distinguish the geographical origin of kudzu root and detect adulterated kudzu root. In addition, this method is also suitable for determining the geographic origin and detecting the adulteration of other foodstuffs that fluoresce.
Kusumaningrum, Dewi; Lee, Hoonsoo; Lohumi, Santosh; Mo, Changyeun; Kim, Moon S; Cho, Byoung-Kwan
2018-03-01
The viability of seeds is important for determining their quality. A high-quality seed is one that has a high capability of germination, which is necessary to ensure high productivity. Hence, developing technology for the detection of seed viability is a high priority in agriculture. Fourier transform near-infrared (FT-NIR) spectroscopy is one of the most popular vibrational spectroscopy techniques. This study aims to use FT-NIR spectroscopy to determine the viability of soybean seeds. Viable seeds and artificially aged seeds (as non-viable soybeans) were used in this research. The FT-NIR spectra of soybean seeds were collected and analysed using partial least-squares discriminant analysis (PLS-DA) to classify viable and non-viable soybean seeds. Moreover, the variable importance in projection (VIP) method for variable selection was combined with PLS-DA. The most effective wavelengths were selected by the VIP method, which selected 146 optimal variables from the full set of 1557 variables. The results demonstrated that FT-NIR spectral analysis with PLS-DA, using either all variables or the selected variables, performed well, with a prediction accuracy for soybean viability close to 100%. Hence, FT-NIR techniques with chemometric analysis have the potential for rapidly measuring soybean seed viability. © 2017 Society of Chemical Industry.
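For orientation, VIP scores are commonly defined as in the formula below. This is the standard textbook definition and not necessarily the exact variant used by the authors; all symbols are notation introduced here: p is the number of spectral variables (1557 in the abstract above), A the number of latent variables, w_a the weight vector of component a, t_a its score vector, and q_a its y-loading.

```latex
\mathrm{VIP}_j \;=\; \sqrt{\, p \,
  \frac{\sum_{a=1}^{A} \mathrm{SS}_a \left( w_{ja} / \lVert \mathbf{w}_a \rVert \right)^{2}}
       {\sum_{a=1}^{A} \mathrm{SS}_a} },
\qquad
\mathrm{SS}_a \;=\; q_a^{2}\, \mathbf{t}_a^{\top} \mathbf{t}_a
```

Under this definition the squared VIP scores average to 1 over all variables, which is why "VIP greater than 1" is a widely used rule of thumb for retaining a variable; the abstract does not state which threshold produced the 146 selected variables.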
Niazi, Ali; Khorshidi, Neda; Ghaemmaghami, Pegah
2015-01-25
In this study, an analytical procedure based on microwave-assisted dispersive liquid-liquid microextraction (MA-DLLME) and spectrophotometry coupled with chemometric methods is proposed to determine uranium. In the proposed method, 4-(2-pyridylazo) resorcinol (PAR) is used as a chelating agent, and chloroform and ethanol are selected as extraction and dispersive solvents. The optimization strategy is carried out using two-level full factorial designs. Results of the two-level full factorial design (2⁴), based on an analysis of variance, demonstrated that the pH, the concentration of PAR, and the amounts of dispersive and extraction solvents are statistically significant. Optimal conditions for three variables: pH, concentration of PAR, amount of dispersive and extraction solvents are obtained by using a Box-Behnken design. Under the optimum conditions, the calibration graphs are linear in the range of 20.0-350.0 ng mL⁻¹ with a detection limit of 6.7 ng mL⁻¹ (3δB/slope), and the enrichment factor of this method for uranium reached 135. The relative standard deviation (R.S.D.) is 1.64% (n=7, c=50 ng mL⁻¹). Partial least squares (PLS) modeling was used for multivariate calibration of the spectrophotometric data. Orthogonal signal correction (OSC) was used for preprocessing of the data matrices, and the prediction results of the model with and without OSC were statistically compared. The MA-DLLME-OSC-PLS method is presented for the first time in this study. The root mean square errors of prediction (RMSEP) for uranium determination using the PLS and OSC-PLS models were 4.63 and 0.98, respectively. This procedure allows the determination of uranium in synthetic and real samples such as waste water with good reliability. Copyright © 2014. Published by Elsevier B.V.
Aleixandre-Tudo, José Luis; Nieuwoudt, Helené; Aleixandre, José Luis; Du Toit, Wessel J
2015-02-04
The validation of ultraviolet-visible (UV-vis) spectroscopy combined with partial least-squares (PLS) regression to quantify red wine tannins is reported. The methylcellulose precipitable (MCP) tannin assay and the bovine serum albumin (BSA) tannin assay were used as reference methods. To take the high variability of wine tannins into account when the calibration models were built, a diverse data set was collected from samples of South African red wines that consisted of 18 different cultivars, from regions spanning the wine grape-growing areas of South Africa with their various sites, climates, and soils, ranging in vintage from 2000 to 2012. A total of 240 wine samples were analyzed, and these were divided into a calibration set (n = 120) and a validation set (n = 120) to evaluate the predictive ability of the models. To test the robustness of the PLS calibration models, the predictive ability of the classifying variables cultivar, vintage year, and experimental versus commercial wines was also tested. In general, the statistics obtained when BSA was used as a reference method were slightly better than those obtained with MCP. Despite this, the MCP tannin assay should also be considered as a valid reference method for developing PLS calibrations. The best calibration statistics for the prediction of new samples were coefficient of correlation (R²val) = 0.89, root mean standard error of prediction (RMSEP) = 0.16, and residual predictive deviation (RPD) = 3.49 for MCP and R²val = 0.93, RMSEP = 0.08, and RPD = 4.07 for BSA, when only the UV region (260-310 nm) was selected, which also led to a faster analysis time. In addition, a difference in the results obtained when the predictive ability of the classifying variables vintage, cultivar, or commercial versus experimental wines was studied suggests that tannin composition is highly affected by many factors.
This study also discusses the correlations in tannin values between the methylcellulose and protein precipitation methods.
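The validation statistics quoted in the abstract above (RMSEP and RPD) can be computed as in the following sketch. It assumes the common definitions: RMSEP is the root mean square of the prediction residuals on the validation set, and RPD is the sample standard deviation of the reference values divided by RMSEP. Some authors use a bias-corrected SEP in the denominator instead, and the abstract does not say which variant was used; the function names and numbers below are illustrative only.

```python
import math
from statistics import stdev


def rmsep(reference, predicted):
    """Root mean square error of prediction over a validation set."""
    residuals = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(residuals) / len(residuals))


def rpd(reference, predicted):
    """Residual predictive deviation: spread of the reference values
    (sample standard deviation) relative to the prediction error."""
    return stdev(reference) / rmsep(reference, predicted)


# Made-up reference and predicted values (illustration only).
y_ref = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.5, 1.5, 3.5, 3.5]
error = rmsep(y_ref, y_hat)
ratio = rpd(y_ref, y_hat)
```

A larger RPD means the prediction error is small compared with the natural spread of the reference values, which is why it is reported alongside RMSEP.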
[Determination of Cu in Shell of Preserved Egg by LIBS Coupled with PLS].
Hu, Hui-qin; Xu, Xue-hong; Liu, Mu-hua; Tu, Jian-ping; Huang, Le; Huang, Lin; Yao, Ming-yin; Chen, Tian-bing; Yang, Ping
2015-12-01
In this work, the copper content in the shell of preserved eggs was determined directly by laser-induced breakdown spectroscopy (LIBS), and the characteristic lines of Cu were obtained. The eggshell samples were pretreated by wet acid digestion, and the true Cu content was obtained by atomic absorption spectrophotometry (AAS). The precision and accuracy of LIBS are influenced by a series of factors, for example the complex matrix effect of the sample, environmental noise, the system noise of the instrument, and the stability of the laser energy. A conventional univariate linear calibration curve between LIBS intensity and the elemental content of the sample, such as one based on the Scheibe-Lomakin equation, cannot meet the requirements of quantitative analysis, so a multivariate calibration method is needed. In this work, the LIBS spectra were processed by partial least squares (PLS), and the precision and accuracy of the PLS models were compared across different smoothing treatments and five pretreatment methods. The results showed that 11-point smoothing with multiplicative scatter correction (MSC) pretreatment improved the correlation coefficient and accuracy of the PLS model and effectively reduced the root mean square error and the average relative error. The results of the study show that the heavy metal Cu in preserved egg shells can be detected directly and accurately by LIBS. As a next step, batch tests will be conducted to establish the relationship between the Cu content of the eggshell, egg white and egg yolk of preserved eggs, with the goal of inferring the heavy metal content of the egg white and egg yolk from LIBS measurements of the eggshell, providing a new method for rapid non-destructive testing of the quality and safety of agricultural products.
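The MSC pretreatment named in the abstract above can be sketched as follows. This assumes the usual formulation, in which each spectrum is regressed on the mean spectrum of the set and then corrected with the fitted intercept and slope; the 11-point smoothing step is omitted, and the function name `msc` and the toy data are hypothetical.

```python
def msc(spectra):
    """Multiplicative scatter correction: fit each spectrum as
    intercept + slope * mean_spectrum by least squares, then return
    (spectrum - intercept) / slope for every spectrum."""
    n, p = len(spectra), len(spectra[0])
    ref = [sum(s[j] for s in spectra) / n for j in range(p)]
    ref_mean = sum(ref) / p
    ref_ss = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / p
        slope = sum((r - ref_mean) * (v - s_mean) for r, v in zip(ref, s)) / ref_ss
        intercept = s_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in s])
    return corrected


# Two toy spectra that differ only by a multiplicative factor (illustration only).
toy = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
toy_corrected = msc(toy)
```

In this toy case both spectra collapse onto the mean spectrum, which is the intended effect when the only difference between them is scatter-like scaling.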
Teng, Wei-Zhuo; Song, Jia; Meng, Fan-Xin; Meng, Qing-Fan; Lu, Jia-Hui; Hu, Shuang; Teng, Li-Rong; Wang, Di; Xie, Jing
2014-10-01
Partial least squares (PLS) and radial basis function neural network (RBFNN) combined with near infrared spectroscopy (NIR) were applied to develop models for cordycepic acid, polysaccharide and adenosine analysis in Paecilomyces hepialid fermentation mycelium. The developed models possess good generalization and predictive ability and can be applied to the determination of crude drugs and related products. During the experiment, 214 Paecilomyces hepialid mycelium samples were obtained via chemical mutagenesis combined with submerged fermentation. The contents of cordycepic acid, polysaccharide and adenosine were determined via traditional methods, and the near infrared spectroscopy data were collected. The outliers were removed and the size of the calibration set was determined via the Monte Carlo partial least square (MCPLS) method. Based on the values of degree of approach (Da), both moving window partial least squares (MWPLS) and moving window radial basis function neural network (MWRBFNN) were applied to optimize characteristic wavelength variables, optimum preprocessing methods and other important variables in the models. After comparison, the RBFNN, RBFNN and PLS models were developed successfully for cordycepic acid, polysaccharide and adenosine detection, and the correlation between reference values and predicted values in the calibration set (R2c) and validation set (R2p) of the optimum models was 0.9417 and 0.9663, 0.9803 and 0.9850, and 0.9761 and 0.9728, respectively. All the data suggest that these models possess good fitness and predictive ability.
Winning, Hanne; Roldán-Marín, Eduvigis; Dragsted, Lars O; Viereck, Nanna; Poulsen, Morten; Sánchez-Moreno, Concepción; Cano, M Pilar; Engelsen, Søren B
2009-11-01
The metabolome following intake of onion by-products is evaluated. Thirty-two rats were fed a diet containing an onion by-product or one of the two derived onion by-product fractions: an ethanol extract and the residue. A 24 hour urine sample was analyzed using (1)H NMR spectroscopy in order to investigate the effects of onion intake on the rat metabolism. Application of interval extended canonical variates analysis (ECVA) proved to be able to distinguish between the metabolomic profiles from rats consuming normal feed and rats fed with an onion diet. Two dietary biomarkers for onion intake were identified as dimethyl sulfone and 3-hydroxyphenylacetic acid. The same two dietary biomarkers were subsequently revealed by interval partial least squares regression (PLS) to be perfect quantitative markers for onion intake. The best PLS calibration model yielded a root mean square error of cross-validation (RMSECV) of 0.97% (w/w) with only 1 latent variable and a squared correlation coefficient of 0.94. This indicates that urine from rats on the by-product diet, the extract diet, and the residue diet all contain the same dietary biomarkers and it is concluded that dimethyl sulfone and 3-hydroxyphenylacetic acid are dietary biomarkers for onion intake. Being able to detect specific dietary biomarkers is highly beneficial in the control of nutritionally enhanced functional foods.
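As an illustration of how an RMSECV for a one-latent-variable PLS model, like the one reported above, could be obtained, here is a minimal sketch. It assumes plain PLS1 on mean-centred data with leave-one-out cross-validation; the abstract does not state the cross-validation scheme, the interval-selection step of interval PLS is not reproduced, and all function names and data are hypothetical.

```python
import math


def fit_pls1_one_lv(X, y):
    """Fit PLS1 with a single latent variable on mean-centred data and
    return (x_mean, y_mean, regression_vector)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction of maximum covariance between X and y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores and the y-loading of the single component.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return x_mean, y_mean, [wj * q for wj in w]


def predict(model, x):
    x_mean, y_mean, b = model
    return y_mean + sum((xj - mj) * bj for xj, mj, bj in zip(x, x_mean, b))


def rmsecv_loo(X, y):
    """Leave-one-out root mean square error of cross-validation."""
    total = 0.0
    for i in range(len(X)):
        model = fit_pls1_one_lv(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        total += (y[i] - predict(model, X[i])) ** 2
    return math.sqrt(total / len(X))


# Toy data with an exactly linear relationship (illustration only).
X_toy = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
y_toy = [2.0, 4.0, 6.0, 8.0]
cv_error = rmsecv_loo(X_toy, y_toy)
```

With a single component no deflation step is needed, which keeps the sketch short; models with more latent variables require deflating X after each component.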
Domain-Invariant Partial-Least-Squares Regression.
Nikzad-Langerodi, Ramin; Zellinger, Werner; Lughofer, Edwin; Saminger-Platz, Susanne
2018-05-11
Multivariate calibration models often fail to extrapolate beyond the calibration samples because of changes associated with the instrumental response, environmental condition, or sample matrix. Most of the current methods used to adapt a source calibration model to a target domain exclusively apply to calibration transfer between similar analytical devices, while generic methods for calibration-model adaptation are largely missing. To fill this gap, we here introduce domain-invariant partial-least-squares (di-PLS) regression, which extends ordinary PLS by a domain regularizer in order to align the source and target distributions in the latent-variable space. We show that a domain-invariant weight vector can be derived in closed form, which allows the integration of (partially) labeled data from the source and target domains as well as entirely unlabeled data from the latter. We test our approach on a simulated data set where the aim is to desensitize a source calibration model to an unknown interfering agent in the target domain (i.e., unsupervised model adaptation). In addition, we demonstrate unsupervised, semisupervised, and supervised model adaptation by di-PLS on two real-world near-infrared (NIR) spectroscopic data sets.
Roberts, D K; Winters, J E; Castells, D D; Clark, C A; Teitelbaum, B A
2001-01-01
To investigate pigmented striae of the anterior lens capsule in African-Americans, a potential indicator of significant anterior segment pigment dispersion. A group of 40 African-American subjects who exhibited pigmented lens striae (PLS) were identified from a non-referred, primary eye care population in Chicago, IL, USA. These subjects were then compared to an age, race, and gender matched control group relative to refractive error and the presence or absence of diabetes and hypertension. The PLS subjects (mean age = 65.4 +/- 8.8 years, range = 50-87 years) consisted of 36 females and 4 males. PLS were bilateral in 36 (85%) of the 40 subjects. Among the eyes with PLS, 21 (55%) of 38 right eyes and 22 (61%) of 36 left eyes also had significant corneal endothelial pigment dusting, commonly in the shape of a Krukenberg's spindle. Ten (25%) of the PLS subjects had either glaucoma or ocular hypertension (7 bilateral, 3 unilateral). The presence of trabecular meshwork pigment varied from minimal to heavy. The mean +/- SD (range) refractive error of the PLS right eyes was +1.61 +/- 1.43D (-1.50 to +5.00D) and +1.77 +/- 1.37D (-1.00 to +5.00D) for the left eyes. Based on these data, the PLS right eyes were +1.63D (Student's t, p = 0.0001; 95% CI = +0.82 to +2.44D) more hyperopic on average than the control right eyes, and the PLS left eyes were +1.77D (p = 0.0001; 95% CI = +0.92 to +2.63D) more hyperopic on average than the control left eyes. Trend analysis showed a gradually increasing likelihood of PLS with increasing magnitude of hyperopia in both eyes (Mantel-Haenszel chi-square, p = 0.001). Among PLS subjects, 24 (60%) of 40 were hypertensive and 9 (23%) of 40 were diabetic. However, these proportions were not significantly different (two-tailed Fisher's exact test; hypertension: p = 0.30; diabetes: p = 0.70) from the randomly selected controls. 
Among our African-American group, which consisted predominately of females >50 years of age, the likelihood of PLS increased with increasing hyperopic refractive error. This finding is consistent with the possibility that PLS may, in some circumstances, indicate a significant pigment dispersal process due to iris-lens rubbing that may be associated with crowding of anterior segment structures. Additional study is warranted to further assess the nature of PLS, their precise relationship with an age-related pigment dispersal process, and their true significance as a risk factor for development of glaucoma.
Hashimoto, Ryu-Ichiro; Itahashi, Takashi; Okada, Rieko; Hasegawa, Sayaka; Tani, Masayuki; Kato, Nobumasa; Mimura, Masaru
2018-01-01
Abnormalities in functional brain networks in schizophrenia have been studied by examining intrinsic and extrinsic brain activity under various experimental paradigms. However, the identified patterns of abnormal functional connectivity (FC) vary depending on the adopted paradigms. Thus, it is unclear whether and how these patterns are inter-related. In order to assess relationships between abnormal patterns of FC during intrinsic activity and those during extrinsic activity, we adopted a data-fusion approach and applied partial least square (PLS) analyses to FC datasets from 25 patients with chronic schizophrenia and 25 age- and sex-matched normal controls. For the input to the PLS analyses, we generated a pair of FC maps during the resting state (REST) and the auditory deviance response (ADR) from each participant using the common seed region in the left middle temporal gyrus, which is a focus of activity associated with auditory verbal hallucinations (AVHs). PLS correlation (PLS-C) analysis revealed that patients with schizophrenia have significantly lower loadings of a component containing positive FCs in default-mode network regions during REST and a component containing positive FCs in the auditory and attention-related networks during ADR. Specifically, loadings of the REST component were significantly correlated with the severities of positive symptoms and AVH in patients with schizophrenia. The co-occurrence of such altered FC patterns during REST and ADR was replicated using PLS regression, wherein FC patterns during REST are modeled to predict patterns during ADR. These findings provide an integrative understanding of altered FCs during intrinsic and extrinsic activity underlying core schizophrenia symptoms.
Lakshmi, KS; Lakshmi, S
2010-01-01
Two chemometric methods were developed for the simultaneous determination of telmisartan and hydrochlorothiazide. The chemometric methods applied were principal component regression (PCR) and partial least square (PLS-1). These approaches were successfully applied to quantify the two drugs in the mixture using the information included in the UV absorption spectra of appropriate solutions in the range of 200-350 nm with the intervals Δλ = 1 nm. The calibration of PCR and PLS-1 models was evaluated by internal validation (prediction of compounds in its own designed training set of calibration) and by external validation over laboratory prepared mixtures and pharmaceutical preparations. The PCR and PLS-1 methods require neither any separation step, nor any prior graphical treatment of the overlapping spectra of the two drugs in a mixture. The results of PCR and PLS-1 methods were compared with each other and a good agreement was found. PMID:21331198
Statistical process control of cocrystallization processes: A comparison between OPLS and PLS.
Silva, Ana F T; Sarraguça, Mafalda Cruz; Ribeiro, Paulo R; Santos, Adenilson O; De Beer, Thomas; Lopes, João Almeida
2017-03-30
Orthogonal partial least squares regression (OPLS) is being increasingly adopted as an alternative to partial least squares (PLS) regression due to the better generalization that can be achieved. Particularly in multivariate batch statistical process control (BSPC), the use of OPLS for estimating nominal trajectories is advantageous. In OPLS, the nominal process trajectories are expected to be captured in a single predictive principal component while uncorrelated variations are filtered out to orthogonal principal components. In theory, OPLS will yield a better estimation of the Hotelling's T² statistic and corresponding control limits, thus lowering the number of false positives and false negatives when assessing the process disturbances. Although the advantages of OPLS have been demonstrated in the context of regression, its use in BSPC has seldom been reported. This study proposes an OPLS-based approach for BSPC of a cocrystallization process between hydrochlorothiazide and p-aminobenzoic acid monitored on-line with near infrared spectroscopy and compares the fault detection performance with the same approach based on PLS. A series of cocrystallization batches with imposed disturbances were used to test the ability of OPLS- and PLS-based BSPC methods to detect abnormal situations. Results demonstrated that OPLS was generally superior in terms of sensitivity and specificity in most situations. In some abnormal batches, it was found that the imposed disturbances were only detected with OPLS. Copyright © 2017 Elsevier B.V. All rights reserved.
In vivo diagnosis of cervical precancer using Raman spectroscopy and genetic algorithm techniques.
Duraipandian, Shiyamala; Zheng, Wei; Ng, Joseph; Low, Jeffrey J H; Ilancheran, A; Huang, Zhiwei
2011-10-21
This study aimed to evaluate the clinical utility of applying near-infrared (NIR) Raman spectroscopy and genetic algorithm-partial least squares-discriminant analysis (GA-PLS-DA) to identify biomolecular changes of cervical tissues associated with dysplastic transformation during colposcopic examination. A total of 105 in vivo Raman spectra were measured from 57 cervical sites (35 normal and 22 precancer sites) of 29 patients recruited, in which 65 spectra were from normal sites, while 40 spectra were from cervical precancerous lesions (i.e., 7 low-grade CIN and 33 high-grade CIN). The GA feature selection technique incorporated with PLS was utilized to study the significant biochemical Raman bands for differentiation between normal and precancer cervical tissues. The GA-PLS-DA algorithm with double cross-validation (dCV) identified seven diagnostically significant Raman bands in the ranges of 925-935, 979-999, 1080-1090, 1240-1260, 1320-1340, 1400-1420, and 1625-1645 cm⁻¹ related to proteins, nucleic acids and lipids in tissue, and yielded a diagnostic accuracy of 82.9% (sensitivity of 72.5% (29/40) and specificity of 89.2% (58/65)) for precancer detection. The results of this exploratory study suggest that Raman spectroscopy in conjunction with GA-PLS-DA and dCV methods has the potential to provide clinically significant discrimination between normal and precancer cervical tissues at the molecular level.
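As a quick check on the figures quoted in the abstract above, the sensitivity, specificity, and overall accuracy follow directly from the reported counts (29 of 40 precancer spectra and 58 of 65 normal spectra correctly classified). The sketch below is illustrative only; the helper name `diagnostic_metrics` is hypothetical and is not taken from the study.

```python
def diagnostic_metrics(true_pos, n_pos, true_neg, n_neg):
    """Return (sensitivity, specificity, accuracy) as fractions.

    Hypothetical helper for illustration; not part of the cited study.
    """
    sensitivity = true_pos / n_pos                       # true-positive rate
    specificity = true_neg / n_neg                       # true-negative rate
    accuracy = (true_pos + true_neg) / (n_pos + n_neg)   # overall correct rate
    return sensitivity, specificity, accuracy


# Counts quoted in the abstract: 29/40 precancer, 58/65 normal
sens, spec, acc = diagnostic_metrics(29, 40, 58, 65)
print(f"sensitivity={sens:.1%} specificity={spec:.1%} accuracy={acc:.1%}")
```

Run on the quoted counts, the three values round to the 72.5%, 89.2%, and 82.9% stated in the abstract.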
NASA Astrophysics Data System (ADS)
Jintao, Xue; Yufei, Liu; Liming, Ye; Chunyan, Li; Quanwei, Yang; Weiying, Wang; Yun, Jing; Minxiang, Zhang; Peng, Li
2018-01-01
Near-Infrared Spectroscopy (NIRS) was first used to develop a method for rapid and simultaneous determination of 5 active alkaloids (berberine, coptisine, palmatine, epiberberine and jatrorrhizine) in 4 parts (rhizome, fibrous root, stem and leaf) of Coptidis Rhizoma. A total of 100 samples from 4 main places of origin were collected and studied. With HPLC analysis values as calibration reference, the quantitative analysis of 5 marker components was performed by two different modeling methods, partial least-squares (PLS) regression as linear regression and artificial neural networks (ANN) as non-linear regression. The results indicated that the 2 types of models established were robust, accurate and repeatable for the five active alkaloids, and that the ANN models were more suitable for the determination of berberine, coptisine and palmatine while the PLS model was more suitable for the analysis of epiberberine and jatrorrhizine. The performance of the optimal models was as follows: the correlation coefficient (R) for berberine, coptisine, palmatine, epiberberine and jatrorrhizine was 0.9958, 0.9956, 0.9959, 0.9963 and 0.9923, respectively; the root mean square error of validation (RMSEP) was 0.5093, 0.0578, 0.0443, 0.0563 and 0.0090, respectively. Furthermore, for the comprehensive exploitation and utilization of the plant resources of Coptidis Rhizoma, the established NIR models were used to analyze the content of the 5 active alkaloids in the 4 parts of Coptidis Rhizoma from the 4 main places of origin. This work demonstrated that NIRS may be a promising method for routine screening in off-line fast analysis or on-line quality assessment of traditional Chinese medicine (TCM).
Recognition of beer brand based on multivariate analysis of volatile fingerprint.
Cajka, Tomas; Riddellova, Katerina; Tomaniova, Monika; Hajslova, Jana
2010-06-18
An automated head-space solid-phase microextraction (HS-SPME)-based sampling procedure, coupled to gas chromatography-time-of-flight mass spectrometry (GC-TOFMS), was developed and employed to obtain fingerprints (GC profiles) of beer volatiles. In total, 265 speciality beer samples were collected over a 1-year period with the aim to distinguish, based on analytical (profiling) data, (i) the beers labelled as Rochefort 8; (ii) a group consisting of Rochefort 6, 8, 10 beers; and (iii) Trappist beers. For the chemometric evaluation of the data, partial least squares discriminant analysis (PLS-DA), linear discriminant analysis (LDA), and artificial neural networks with multilayer perceptrons (ANN-MLP) were tested. The best prediction ability was obtained for the model that distinguished a group of Rochefort 6, 8, 10 beers from the rest of the beers. In this case, all chemometric tools employed provided 100% correct classification. Slightly worse prediction abilities were achieved for the models "Trappist vs. non-Trappist beers" with the values of 93.9% (PLS-DA), 91.9% (LDA) and 97.0% (ANN-MLP) and "Rochefort 8 vs. the rest" with the values of 87.9% (PLS-DA), 84.8% (LDA) and 93.9% (ANN-MLP). In addition to chromatographic profiling, the potential of directly coupling SPME (extraction/pre-concentration device) with high-resolution TOFMS employing a direct analysis in real time (DART) ion source has also been demonstrated as a challenging profiling approach. Copyright (c) 2010 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Anderson, R. B.; Clegg, S. M.; Frydenvang, J.
2015-12-01
One of the primary challenges faced by the ChemCam instrument on the Curiosity Mars rover is developing a regression model that can accurately predict the composition of the wide range of target types encountered (basalts, calcium sulfate, feldspar, oxides, etc.). The original calibration used 69 rock standards to train a partial least squares (PLS) model for each major element. By expanding the suite of calibration samples to >400 targets spanning a wider range of compositions, the accuracy of the model was improved, but some targets with "extreme" compositions (e.g. pure minerals) were still poorly predicted. We have therefore developed a simple method, referred to as "submodel PLS", to improve the performance of PLS across a wide range of target compositions. In addition to generating a "full" (0-100 wt.%) PLS model for the element of interest, we also generate several overlapping submodels (e.g. for SiO2, we generate "low" (0-50 wt.%), "mid" (30-70 wt.%), and "high" (60-100 wt.%) models). The submodels are generally more accurate than the "full" model for samples within their range because they are able to adjust for matrix effects that are specific to that range. To predict the composition of an unknown target, we first predict the composition with the submodels and the "full" model. Then, based on the predicted composition from the "full" model, the appropriate submodel prediction can be used (e.g. if the full model predicts a low composition, use the "low" model result, which is likely to be more accurate). For samples with "full" predictions that occur in a region of overlap between submodels, the submodel predictions are "blended" using a simple linear weighted sum. The submodel PLS method shows improvements in most of the major elements predicted by ChemCam and reduces the occurrence of negative predictions for low wt.% targets. Submodel PLS is currently being used in conjunction with ICA regression for the major element compositions of ChemCam data.
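The selection-and-blending step described in the abstract above can be sketched as follows. This is a minimal illustration, assuming the SiO2 example ranges given there (low 0-50, mid 30-70, high 60-100 wt.%) and a weight that moves linearly across each overlap region; the abstract only says "a simple linear weighted sum", so the exact weighting, the function name `blend_submodels`, and the example numbers are assumptions.

```python
def blend_submodels(full_pred, low_pred, mid_pred, high_pred):
    """Choose or blend submodel predictions using the full-model prediction (wt.%).

    Assumed ranges (from the SiO2 example): low 0-50, mid 30-70, high 60-100.
    Overlaps are therefore 30-50 (low/mid) and 60-70 (mid/high).
    """
    if full_pred < 30:
        return low_pred
    if full_pred <= 50:
        # Low/mid overlap: weight shifts linearly from the low to the mid model
        w = (full_pred - 30) / (50 - 30)
        return (1 - w) * low_pred + w * mid_pred
    if full_pred < 60:
        return mid_pred
    if full_pred <= 70:
        # Mid/high overlap: weight shifts linearly from the mid to the high model
        w = (full_pred - 60) / (70 - 60)
        return (1 - w) * mid_pred + w * high_pred
    return high_pred


# Hypothetical target: the full model says 40 wt.%, halfway through the 30-50 overlap
blended = blend_submodels(full_pred=40.0, low_pred=38.0, mid_pred=44.0, high_pred=55.0)
print(blended)
```

Because the weight is 0 at one edge of an overlap and 1 at the other, the blended value joins the neighbouring single-submodel predictions without a jump.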
Genisheva, Z; Quintelas, C; Mesquita, D P; Ferreira, E C; Oliveira, J M; Amaral, A L
2018-04-25
This work aims to explore the potential of near infrared (NIR) spectroscopy to quantify volatile compounds in Vinho Verde wines, commonly determined by gas chromatography. For this purpose, 105 Vinho Verde wine samples were analyzed using Fourier transform near infrared (FT-NIR) transmission spectroscopy in the range of 5435 cm⁻¹ to 6357 cm⁻¹. Boxplot and principal components analysis (PCA) were performed for cluster identification and outlier removal. A partial least square (PLS) regression was then applied to develop the calibration models, by a new iterative approach. The predictive ability of the models was confirmed by an external validation procedure with an independent sample set. The obtained results could be considered as quite good, with coefficients of determination (R²) varying from 0.94 to 0.97. The current methodology, using NIR spectroscopy and chemometrics, can be seen as a promising rapid tool to determine volatile compounds in Vinho Verde wines. Copyright © 2017 Elsevier Ltd. All rights reserved.
Drivelos, Spiros A; Danezis, Georgios P; Haroutounian, Serkos A; Georgiou, Constantinos A
2016-12-15
This study examines the trace and rare earth elemental (REE) fingerprint variations of PDO (Protected Designation of Origin) "Fava Santorinis" over three consecutive harvesting years (2011-2013). Classification of samples in harvesting years was studied by performing discriminant analysis (DA), k nearest neighbours (κ-NN), partial least squares (PLS) analysis and probabilistic neural networks (PNN) using rare earth elements and trace metals determined using ICP-MS. DA performed better than κ-NN, producing 100% discrimination using trace elements and 79% using REEs. PLS was found to be superior to PNN, achieving 99% and 90% classification for trace and REEs, respectively, while PNN achieved 96% and 71% classification for trace and REEs, respectively. The information obtained using REEs did not enhance classification, indicating that REEs vary minimally per harvesting year, providing robust geographical origin discrimination. The results show that seasonal patterns can occur in the elemental composition of "Fava Santorinis", probably reflecting seasonality of climate. Copyright © 2016 Elsevier Ltd. All rights reserved.
Teodoro, Janaína Aparecida Reis; Pereira, Hebert Vinicius; Sena, Marcelo Martins; Piccin, Evandro; Zacca, Jorge Jardim; Augusti, Rodinei
2017-12-15
A direct method based on the application of paper spray mass spectrometry (PS-MS) combined with a chemometric supervised method (partial least square discriminant analysis, PLS-DA) was developed and applied to the discrimination of authentic and counterfeit samples of blended Scottish whiskies. The developed methodology employed negative ion mode MS and included 44 authentic whiskies from diverse brands and batches and 44 counterfeit samples of the same brands seized during operations of the Brazilian Federal Police, totaling 88 samples. An exploratory principal component analysis (PCA) model showed a reasonable discrimination of the counterfeit whiskies in PC2. In spite of the sample heterogeneity, a robust, reliable and accurate PLS-DA model was generated and validated, which was able to correctly classify the samples with a nearly 100% success rate. The use of PS-MS also allowed the identification of the main marker compounds associated with each type of sample analyzed: authentic or counterfeit. Copyright © 2017 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Darwish, Hany W.; Hassan, Said A.; Salem, Maissa Y.; El-Zeany, Badr A.
2016-02-01
Two advanced, accurate and precise chemometric methods are developed for the simultaneous determination of amlodipine besylate (AML) and atorvastatin calcium (ATV) in the presence of their acidic degradation products in tablet dosage forms. The first method was Partial Least Squares (PLS-1) and the second was Artificial Neural Networks (ANN). PLS was compared to ANN models with and without variable selection procedure (genetic algorithm (GA)). For proper analysis, a 5-factor 5-level experimental design was established resulting in 25 mixtures containing different ratios of the interfering species. Fifteen mixtures were used as calibration set and the other ten mixtures were used as validation set to validate the prediction ability of the suggested models. The proposed methods were successfully applied to the analysis of pharmaceutical tablets containing AML and ATV. The methods indicated the ability of the mentioned models to solve the highly overlapped spectra of the quinary mixture, yet using inexpensive and easy to handle instruments like the UV-VIS spectrophotometer.
NASA Technical Reports Server (NTRS)
Anderson, R. B.; Morris, R. V.; Clegg, S. M.; Bell, J. F., III; Humphries, S. D.; Wiens, R. C.
2011-01-01
The ChemCam instrument selected for the Curiosity rover is capable of remote laser-induced breakdown spectroscopy (LIBS).[1] We used a remote LIBS instrument similar to ChemCam to analyze 197 geologic slab samples and 32 pressed-powder geostandards. The slab samples are well-characterized and have been used to validate the calibration of previous instruments on Mars missions, including CRISM [2], OMEGA [3], the MER Pancam [4], Mini-TES [5], and Moessbauer [6] instruments and the Phoenix SSI [7]. The resulting dataset was used to compare multivariate methods for quantitative LIBS and to determine the effect of grain size on calculations. Three multivariate methods - partial least squares (PLS), multilayer perceptron artificial neural networks (MLP ANNs) and cascade correlation (CC) ANNs - were used to generate models and extract the quantitative composition of unknown samples. PLS can be used to predict one element (PLS1) or multiple elements (PLS2) at a time, as can the neural network methods. Although MLP and CC ANNs were successful in some cases, PLS generally produced the most accurate and precise results.
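To make the PLS1 case mentioned in the abstract above concrete (one response variable predicted at a time), the following is a minimal sketch of a single-latent-variable PLS1 fit in plain Python. It is not the authors' implementation: real calibrations use several components with deflation and a dedicated library, and the function name `pls1_one_component` and the toy data are assumptions made for illustration.

```python
def pls1_one_component(X, y):
    """Fit a one-latent-variable PLS1 model and return a predict(x) function.

    Illustrative sketch only: a single component, no deflation, no scaling.
    """
    n, p = len(X), len(X[0])
    # Mean-center the predictors and the response
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: the X-space direction with maximal covariance with y
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]
    # Scores, then the regression of y on the scores
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)

    def predict(x):
        score = sum((x[j] - x_mean[j]) * w[j] for j in range(p))
        return y_mean + q * score

    return predict


# Hypothetical toy data: two "channels" and one response
predict = pls1_one_component([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]], [1.0, 2.0, 3.0])
prediction = predict([4.0, 8.0])
```

A PLS2 model differs in that `y` becomes a matrix of several responses fitted jointly, which is the distinction the abstract draws between predicting one element and multiple elements at a time.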
NASA Astrophysics Data System (ADS)
Zhang, Xuexi; Xiao, Zhi-Yan; Yin, Jianhua; Xia, Yang
2014-09-01
Fourier transform infrared imaging (FTIRI) combined with chemometrics can be used to probe the structure of bio-macromolecules and to measure the concentrations of some components. In this study, FTIRI with Partial Least-Squares (PLS) regression was applied to study the concentration of two main components in bovine nasal cartilage (BNC), collagen and proteoglycan. An infrared spectrum library was built by mixing collagen and chondroitin 6-sulfate (the main component of proteoglycan) at different ratios. Some pretreatments were needed to build the PLS model. FTIR images were collected from BNC sections at 6.25 μm and 25 μm pixel sizes. The spectra extracted from the BNC FTIR images were imported into the PLS regression program to predict the concentrations of collagen and proteoglycan. These PLS-determined concentrations agree with the results of our previous work and with biochemical analytical results. The prediction shows that the concentrations of collagen and proteoglycan in BNC are comparable on the whole; however, the concentration of proteoglycan is a little higher than that of collagen.
[Detection of Hawthorn Fruit Defects Using Hyperspectral Imaging].
Liu, De-hua; Zhang, Shu-juan; Wang, Bin; Yu, Ke-qiang; Zhao, Yan-ru; He, Yong
2015-11-01
Hyperspectral imaging technology covering the range of 380-1000 nm was employed to detect defects (bruise and insect damage) of hawthorn fruit. A total of 134 samples were collected, comprising 46 bruised fruit, 30 insect-damaged fruit, 10 fruit with both bruise and insect damage, and 48 intact fruit. Because the calyx/stem-end and the bruise/insect damage regions have a similar appearance in RGB images, they are easily confused. Hence, five types of regions (bruise, insect damage, sound, calyx, and stem-end) were collected from 230 hawthorn fruits. After acquiring hyperspectral images of the hawthorn fruits, the spectral data were extracted from the regions of interest (ROI). Then, several pretreatment methods, namely standard normal variate (SNV), Savitzky-Golay (SG), median filter (MF), and multiplicative scatter correction (MSC), were applied, and partial least squares (PLS) models were built to compare their performance. According to the PLS results, SNV was the best pretreatment method and was chosen for the subsequent analysis. The spectral features of the five regions, combined with the regression coefficients (RCs) of a partial least squares-discriminant analysis (PLS-DA) model, were used to identify the important wavelengths, and ten wavebands at 483, 563, 645, 671, 686, 722, 777, 819, 837 and 942 nm were selected from all of the wavebands. Using the Kennard-Stone algorithm, the samples of all types were divided into a training set (173) and a test set (57) at a ratio of 3:1. A least squares-support vector machine (LS-SVM) discriminant model was then established using the selected wavebands. The results showed that the discrimination accuracy of the method was 91.23%. In addition, principal component analysis (PCA) was applied to the images at the ten important wavebands.
Using the "Sobel" operator and the region growing algorithm "Regiongrow", the edge and defect features of 86 hawthorn fruits could be recognized. The detection accuracy for bruised, insect-damaged, and two-defect samples was 95.65%, 86.67% and 100%, respectively. This investigation demonstrated that hyperspectral imaging technology can detect bruise, insect damage, calyx, and stem-end regions in hawthorn fruit through qualitative analysis and feature detection, which provides a theoretical reference for the nondestructive detection of defects in hawthorn fruit.
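The SNV pretreatment selected in the hawthorn study above is simple enough to sketch directly: each spectrum is centered and scaled by its own mean and standard deviation, which reduces multiplicative scatter differences between samples. The snippet below is a generic illustration, not the authors' code; the function name `snv` and the three-point example spectrum are assumptions.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own statistics."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1 in the denominator)
    return [(v - m) / s for v in spectrum]


# Hypothetical three-band spectrum; real spectra have hundreds of bands
corrected = snv([1.0, 2.0, 3.0])
print(corrected)
```

SNV is applied row by row (one spectrum at a time), so it needs no reference spectrum, unlike multiplicative scatter correction, which regresses each spectrum against a mean spectrum.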
Ahmad, Iftikhar; Ahmad, Manzoor; Khan, Karim; Ikram, Masroor
2016-06-01
Optical polarimetry was employed for assessment of ex vivo healthy and basal cell carcinoma (BCC) tissue samples from human skin. Polarimetric analyses revealed that depolarization and retardance for healthy tissue group were significantly higher (p<0.001) compared to BCC tissue group. Histopathology indicated that these differences partially arise from BCC-related characteristic changes in tissue morphology. Wilks lambda statistics demonstrated the potential of all investigated polarimetric properties for computer assisted classification of the two tissue groups. Based on differences in polarimetric properties, partial least square (PLS) regression classified the samples with 100% accuracy, sensitivity and specificity. These findings indicate that optical polarimetry together with PLS statistics hold promise for automated pathology classification. Copyright © 2016 Elsevier B.V. All rights reserved.
Detection of Tetracycline in Milk using NIR Spectroscopy and Partial Least Squares
NASA Astrophysics Data System (ADS)
Wu, Nan; Xu, Chenshan; Yang, Renjie; Ji, Xinning; Liu, Xinyuan; Yang, Fan; Zeng, Ming
2018-02-01
The feasibility of measuring tetracycline in milk was investigated by the near infrared (NIR) spectroscopic technique combined with the partial least squares (PLS) method. The NIR transmittance spectra of 40 pure milk samples and 40 tetracycline adulterated milk samples with different concentrations (from 0.005 to 40 mg/L) were obtained. The pure milk and tetracycline adulterated milk samples were properly assigned to their categories with 100% accuracy in the calibration set, and a correct classification rate of 96.3% was obtained in the prediction set. For the quantitation of tetracycline in adulterated milk, the root mean square errors for the calibration and prediction models were 0.61 mg/L and 4.22 mg/L, respectively. The PLS model fitted the calibration set well; however, its predictive ability was limited, especially for samples with low tetracycline concentrations. Overall, this approach can be considered a promising tool for discrimination of tetracycline adulterated milk, as a supplement to high performance liquid chromatography.
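The calibration and prediction errors quoted in the abstract above are root mean square errors between reference concentrations and model predictions. A minimal sketch of that calculation follows; the function name `rmse` and the example concentrations are hypothetical and are not data from the study.

```python
from math import sqrt


def rmse(reference, predicted):
    """Root mean square error between reference values and model predictions."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    squared_errors = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical concentrations in mg/L, each prediction off by exactly 1 mg/L
error = rmse([1.0, 2.0, 10.0], [2.0, 3.0, 9.0])
print(error)
```

Because the error is expressed in the same units as the concentration, a prediction error of a few mg/L is large relative to samples at the low end of a 0.005 to 40 mg/L range, which is consistent with the limited low-concentration performance the abstract reports.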
NASA Astrophysics Data System (ADS)
Isingizwe Nturambirwe, J. Frédéric; Perold, Willem J.; Opara, Umezuruike L.
2016-02-01
Near infrared (NIR) spectroscopy has gained extensive use in quality evaluation. It is arguably one of the most advanced spectroscopic tools in non-destructive quality testing of food stuff, from measurement to data analysis and interpretation. NIR spectral data are interpreted through means often involving multivariate statistical analysis, sometimes associated with optimisation techniques for model improvement. The objective of this research was to explore the extent to which genetic algorithms (GA) can be used to enhance model development, for predicting fruit quality. Apple fruits were used, and NIR spectra in the range from 12000 to 4000 cm-1 were acquired on both bruised and healthy tissues, with different degrees of mechanical damage. GAs were used in combination with partial least squares regression methods to develop bruise severity prediction models, and compared to PLS models developed using the full NIR spectrum. A classification model was developed, which clearly separated bruised from unbruised apple tissue. GAs helped improve prediction models by over 10%, in comparison with full spectrum-based models, as evaluated in terms of error of prediction (Root Mean Square Error of Cross-validation). PLS models to predict internal quality, such as sugar content and acidity were developed and compared to the versions optimized by genetic algorithm. Overall, the results highlighted the potential use of GA method to improve speed and accuracy of fruit quality prediction.
Quantitative analysis of multi-component gas mixture based on AOTF-NIR spectroscopy
NASA Astrophysics Data System (ADS)
Hao, Huimin; Zhang, Yong; Liu, Junhua
2007-12-01
Near Infrared (NIR) spectroscopy analysis technology has attracted wide attention and has found wide application in many domains in recent years because of its remarkable advantages. So far, however, NIR spectrometers have been used only for liquid and solid analysis. In this paper, a new quantitative analysis method for gas mixtures using a new-generation NIR spectrometer is explored. To collect the NIR spectra of gas mixtures, a vacuumable gas cell was designed and fitted to a Luminar 5030-731 Acousto-Optic Tunable Filter (AOTF)-NIR spectrometer. Standard gas samples of methane (CH₄), ethane (C₂H₆) and propane (C₃H₈) were diluted with ultra-pure nitrogen via precision volumetric gas flow controllers to obtain gas mixture samples of different concentrations dynamically. The gas mixtures were injected into the gas cell and spectra in the wavelength range 1100-2300 nm were collected. The feature components extracted from the gas mixture spectra by Partial Least Squares (PLS) were used as the inputs of a support vector regression (SVR) machine to establish the quantitative analysis model. The effectiveness of the model was tested on the samples of the prediction set. The prediction Root Mean Square Error (RMSE) of CH₄, C₂H₆ and C₃H₈ was 1.27%, 0.89%, and 1.20%, respectively, when the concentrations of the component gases were over 0.5%. This shows that the AOTF-NIR spectrometer with a gas cell can be used for gas mixture analysis and that PLS combined with SVR performs well in NIR spectroscopy analysis. This paper provides a basis for extending the application of NIR spectroscopy analysis to gas detection.
Alves, Junia O; Botelho, Bruno G; Sena, Marcelo M; Augusti, Rodinei
2013-10-01
Direct infusion electrospray ionization mass spectrometry in the positive ion mode [ESI(+)-MS] is used to obtain fingerprints of aqueous-methanolic extracts of two types of olive oils, extra virgin (EV) and ordinary (OR), as well as of samples of EV olive oil adulterated by the addition of OR olive oil and other edible oils: corn (CO), sunflower (SF), soybean (SO) and canola (CA). The MS data is treated by the partial least squares discriminant analysis (PLS-DA) protocol aiming at discriminating the above-mentioned classes formed by the genuine olive oils, EV (1) and OR (2), as well as the EV adulterated samples, i.e. EV/SO (3), EV/CO (4), EV/SF (5), EV/CA (6) and EV/OR (7). The PLS-DA model employed is built with 190 and 70 samples for the training and test sets, respectively. For all classes (1-7), EV and OR olive oils as well as the adulterated samples (in a proportion varying from 0.5 to 20.0% w/w) are properly classified. The developed methodology required no ions identification and demonstrated to be fast, as each measurement lasted about 3 min including the extraction step and MS analysis, and reliable, because high sensitivities (rate of true positives) and specificities (rate of true negatives) were achieved. Finally, it can be envisaged that this approach has potential to be applied in quality control of EV olive oils. Copyright © 2013 John Wiley & Sons, Ltd.
Marschner, C B; Kokla, M; Amigo, J M; Rozanski, E A; Wiinberg, B; McEvoy, F J
2017-07-11
Diagnosis of pulmonary thromboembolism (PTE) in dogs relies on computed tomography pulmonary angiography (CTPA), but detailed interpretation of CTPA images is demanding for the radiologist and only large vessels may be evaluated. New approaches for better detection of smaller thrombi include dual energy computed tomography (DECT) as well as computer assisted diagnosis (CAD) techniques. The purpose of this study was to investigate the performance of quantitative texture analysis for detecting dogs with PTE using grey-level co-occurrence matrices (GLCM) and multivariate statistical classification analyses. CT images from healthy (n = 6) and diseased (n = 29) dogs with and without PTE confirmed on CTPA were segmented so that only tissue with CT numbers between -1024 and -250 Hounsfield Units (HU) was preserved. GLCM analysis and subsequent multivariate classification analyses were performed on texture parameters extracted from these images. Leave-one-dog-out cross validation and receiver operating characteristic (ROC) analysis showed that the models generated from the texture analysis were able to predict healthy dogs with optimal levels of performance. Partial Least Square Discriminant Analysis (PLS-DA) obtained a sensitivity of 94% and a specificity of 96%, while Support Vector Machines (SVM) yielded a sensitivity of 99% and a specificity of 100%. The models, however, performed worse in classifying the type of disease in the diseased dog group: In diseased dogs with PTE sensitivities were 30% (PLS-DA) and 38% (SVM), and specificities were 80% (PLS-DA) and 89% (SVM). In diseased dogs without PTE the sensitivities of the models were 59% (PLS-DA) and 79% (SVM) and specificities were 79% (PLS-DA) and 82% (SVM). The results indicate that texture analysis of CTPA images using GLCM is an effective tool for distinguishing healthy from abnormal lung.
Furthermore the texture of pulmonary parenchyma in dogs with PTE is altered, when compared to the texture of pulmonary parenchyma of healthy dogs. The models' poorer performance in classifying dogs within the diseased group, may be related to the low number of dogs compared to texture variables, a lack of balanced number of dogs within each group or a real lack of difference in the texture features among the diseased dogs.
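To make the GLCM step concrete, here is a small sketch of how a grey-level co-occurrence matrix and one texture parameter can be computed for a quantised image. It is an illustration under stated assumptions, not the study's pipeline: the abstract does not say which offsets, grey-level counts or texture parameters were used, so the horizontal offset, the four grey levels and the contrast feature below are hypothetical choices.

```python
def glcm(image, dx=1, dy=0, levels=4, symmetric=True):
    """Normalised grey-level co-occurrence matrix for one pixel offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                if symmetric:
                    counts[j][i] += 1  # count the pair in both directions
    total = sum(sum(row) for row in counts)
    return [[v / total for v in row] for row in counts]


def contrast(p):
    """Contrast texture parameter: sum of p(i, j) * (i - j)^2."""
    return sum(p[i][j] * (i - j) ** 2 for i in range(len(p)) for j in range(len(p)))


# Toy 4x4 image with grey levels 0..3 (hypothetical data)
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
p = glcm(image)
texture_contrast = contrast(p)
```

In a classification setting, several such parameters per image would form the feature vector passed to PLS-DA or SVM.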
Mo, Changyeun; Kim, Giyoung; Lee, Kangjin; Kim, Moon S; Cho, Byoung-Kwan; Lim, Jongguk; Kang, Sukwon
2014-04-24
In this study, we developed a viability evaluation method for pepper (Capsicum annuum L.) seeds based on hyperspectral reflectance imaging. The reflectance spectra of pepper seeds in the 400-700 nm range are collected from hyperspectral reflectance images obtained using blue, green, and red LED illumination. A partial least squares-discriminant analysis (PLS-DA) model is developed to classify viable and non-viable seeds. Four spectral ranges generated with four types of LEDs (blue, green, red, and RGB), which were pretreated using various methods, are investigated to develop the classification models. The optimal PLS-DA model based on the standard normal variate for RGB LED illumination (400-700 nm) yields discrimination accuracies of 96.7% and 99.4% for viable seeds and nonviable seeds, respectively. The use of images based on the PLS-DA model with the first-order derivative of a 31.5-nm gap for red LED illumination (600-700 nm) yields 100% discrimination accuracy for both viable and nonviable seeds. The results indicate that a hyperspectral imaging technique based on LED light can be potentially applied to high-quality pepper seed sorting.
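The standard normal variate (SNV) pretreatment named in this abstract centres and scales each spectrum by its own mean and standard deviation. The sketch below is a generic illustration of that transform; the function name and the toy spectrum are assumptions, and the sample standard deviation (n - 1 denominator) is used, which may differ from the software the authors employed.

```python
import statistics


def snv(spectrum):
    """Standard normal variate: subtract the spectrum mean, divide by its standard deviation."""
    mean = statistics.mean(spectrum)
    sd = statistics.stdev(spectrum)  # sample standard deviation (n - 1)
    return [(value - mean) / sd for value in spectrum]


# Toy reflectance values (hypothetical); a scaled and offset copy mimics a scatter effect
raw = [0.21, 0.25, 0.40, 0.38, 0.30]
scattered = [1.5 * v + 0.1 for v in raw]
corrected_raw = snv(raw)
corrected_scattered = snv(scattered)  # matches corrected_raw up to rounding
```

Because SNV removes multiplicative and additive differences between spectra, the two corrected spectra coincide, which is why it is a common choice before PLS-DA.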
Lu, Shao Hua; Li, Bao Qiong; Zhai, Hong Lin; Zhang, Xin; Zhang, Zhuo Yong
2018-04-25
Terahertz time-domain spectroscopy (THz-TDS) has been applied in many fields; however, it still encounters drawbacks in the analysis of multicomponent mixtures due to serious spectral overlapping. Here, an effective approach to quantitative analysis was proposed and applied to the determination of a ternary mixture of amino acids in a foxtail millet substrate. Utilizing three parameters derived from THz-TDS, images were constructed and Tchebichef image moments were used to extract the information of the target components. The quantitative models were then obtained by stepwise regression. The correlation coefficients of leave-one-out cross-validation (R²_loo-cv) were more than 0.9595. For the external test set, the predictive correlation coefficients (R²_p) were more than 0.8026 and the root mean square errors of prediction (RMSE_p) were less than 1.2601. Compared with the traditional methods (PLS and N-PLS), our approach is more accurate, robust and reliable, and is a potentially excellent approach for quantifying multicomponent mixtures with THz-TDS. Copyright © 2017 Elsevier Ltd. All rights reserved.
Klein-Júnior, Luiz C; Viaene, Johan; Tuenter, Emmy; Salton, Juliana; Gasper, André L; Apers, Sandra; Andries, Jan P M; Pieters, Luc; Henriques, Amélia T; Vander Heyden, Yvan
2016-09-09
Psychotria nemorosa is chemically characterized by indole alkaloids and displays significant inhibitory activity on butyrylcholinesterase (BChE) and monoamine oxidase-A (MAO-A), both enzymes related to neurodegenerative disorders. In the present study, 43 samples of P. nemorosa leaves were extracted and fractionated in accordance with previously optimized methods (see Part I). These fractions were analyzed by means of UPLC-DAD and assayed for their BChE and MAO-A inhibitory potencies. The chromatographic fingerprint data were first aligned using correlation optimized warping, and Principal Component Analysis was performed to explore the data structure. Multivariate calibration techniques, namely Partial Least Squares (PLS1), PLS2 and Orthogonal Projections to Latent Structure (O-PLS1), were evaluated for modelling the activities as a function of the fingerprints. Since the best results were obtained with the O-PLS1 model (RMSECV = 9.3 and 3.3 for BChE and MAO-A, respectively), the regression coefficients of the model were analyzed and plotted relative to the original fingerprints. Four peaks were indicated as multifunctional compounds, with the capacity to impair both BChE and MAO-A activities. In order to confirm these results, a semi-prep HPLC technique was used and a fraction containing the four peaks was purified and evaluated in vitro. It was observed that the fraction exhibited an IC50 of 2.12 μg mL⁻¹ for BChE and 1.07 μg mL⁻¹ for MAO-A. These results reinforce the prediction obtained by O-PLS1 modelling. Copyright © 2016 Elsevier B.V. All rights reserved.
Zhang, Mengliang; Harrington, Peter de B
2015-01-01
A multivariate partial least-squares (PLS) method was applied to the quantification of two complex polychlorinated biphenyls (PCBs) commercial mixtures, Aroclor 1254 and 1260, in a soil matrix. PCBs in soil samples were extracted by headspace solid phase microextraction (SPME) and determined by gas chromatography/mass spectrometry (GC/MS). Decachlorinated biphenyl (deca-CB) was used as internal standard. After the baseline correction was applied, four data representations including extracted ion chromatograms (EIC) for Aroclor 1254, EIC for Aroclor 1260, EIC for both Aroclors and two-way data sets were constructed for PLS-1 and PLS-2 calibrations and evaluated with respect to quantitative prediction accuracy. The PLS model was optimized with respect to the number of latent variables using cross validation of the calibration data set. The validation of the method was performed with certified soil samples and real field soil samples and the predicted concentrations for both Aroclors using EIC data sets agreed with the certified values. The linear range of the method was from 10 μg kg⁻¹ to 1000 μg kg⁻¹ for both Aroclor 1254 and 1260 in soil matrices and the detection limit was 4 μg kg⁻¹ for Aroclor 1254 and 6 μg kg⁻¹ for Aroclor 1260. This holistic approach for the determination of mixtures of complex samples has broad application to environmental forensics and modeling. Copyright © 2014 Elsevier Ltd. All rights reserved.
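Choosing the number of latent variables by cross validation, as described in this abstract, can be sketched with a small PLS-1 implementation. This is a generic NIPALS-style illustration written with plain Python lists, not the authors' software; all function names and the toy data are hypothetical, and leave-one-out is assumed as the cross-validation scheme because the abstract does not specify one.

```python
import math


def pls1_fit(X, y, n_components):
    """Fit PLS-1 on mean-centred data by successive deflation."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # X residuals
    f = [v - y_mean for v in y]                                # y residuals
    components = []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left in y to model
            break
        w = [v / norm for v in w]                                    # weight vector
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return {"x_mean": x_mean, "y_mean": y_mean, "components": components}


def pls1_predict(model, x):
    """Predict one sample by applying the stored components in order."""
    e = [x[j] - model["x_mean"][j] for j in range(len(x))]
    y_hat = model["y_mean"]
    for w, load, q in model["components"]:
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_hat += q * t
        e = [ej - t * lj for ej, lj in zip(e, load)]
    return y_hat


def rmsecv_loo(X, y, n_components):
    """Leave-one-out root mean square error of cross validation."""
    squared = 0.0
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_components)
        squared += (pls1_predict(model, X[i]) - y[i]) ** 2
    return math.sqrt(squared / len(X))


# Toy, noise-free calibration data (hypothetical): y = 2*x1 - x2 + 3
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 6.0]]
y = [2 * a - b + 3 for a, b in X]
scores = {a: rmsecv_loo(X, y, a) for a in (1, 2)}
best_n_components = min(scores, key=scores.get)
```

With real chromatographic data the RMSECV curve usually flattens rather than dropping to zero, and the smallest number of latent variables near the minimum is typically preferred.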
Maltesen, Morten Jonas; van de Weert, Marco; Grohganz, Holger
2012-09-01
Moisture content and aerodynamic particle size are critical quality attributes for spray-dried protein formulations. In this study, spray-dried insulin powders intended for pulmonary delivery were produced applying design of experiments methodology. Near infrared spectroscopy (NIR) in combination with preprocessing and multivariate analysis in the form of partial least squares projections to latent structures (PLS) were used to correlate the spectral data with moisture content and aerodynamic particle size measured by a time of flight principle. PLS models predicting the moisture content were based on the chemical information of the water molecules in the NIR spectrum. Models yielded prediction errors (RMSEP) between 0.39% and 0.48% with thermal gravimetric analysis used as reference method. The PLS models predicting the aerodynamic particle size were based on baseline offset in the NIR spectra and yielded prediction errors between 0.27 and 0.48 μm. The morphology of the spray-dried particles had a significant impact on the predictive ability of the models. Good predictive models could be obtained for spherical particles with a calibration error (RMSECV) of 0.22 μm, whereas wrinkled particles resulted in much less robust models with a Q² of 0.69. Based on the results in this study, NIR is a suitable tool for process analysis of the spray-drying process and for control of moisture content and particle size, in particular for smooth and spherical particles.
Bajoub, Aadil; Medina-Rodríguez, Santiago; Ajal, El Amine; Cuadros-Rodríguez, Luis; Monasterio, Romina Paula; Vercammen, Joeri; Fernández-Gutiérrez, Alberto; Carrasco-Pancorbo, Alegría
2018-04-01
Selected ion flow tube mass spectrometry (SIFT-MS) in combination with chemometrics was used to authenticate the geographical origin of Mediterranean virgin olive oils (VOOs) produced under geographical origin labels. In particular, 130 oil samples from six different Mediterranean regions (Kalamata (Greece); Toscana (Italy); Meknès and Tyout (Morocco); and Priego de Córdoba and Baena (Spain)) were considered. The headspace volatile fingerprints were measured by SIFT-MS in full scan with H₃O⁺, NO⁺ and O₂⁺ as precursor ions and the results were subjected to chemometric treatments. Principal Component Analysis (PCA) was used for preliminary multivariate data analysis and Partial Least Squares-Discriminant Analysis (PLS-DA) was applied to build different models (considering the three reagent ions) to classify samples according to the country of origin and regions (within the same country). The multi-class PLS-DA models showed very good performance in terms of fitting accuracy (98.90-100%) and prediction accuracy (96.70-100% accuracy for cross validation and 97.30-100% accuracy for external validation (test set)). Considering the two-class PLS-DA models, the one for the Spanish samples showed 100% sensitivity, specificity and accuracy in calibration, cross validation and external validation; the model for Moroccan oils also showed very satisfactory results (with perfect scores for almost every parameter in all the cases). Copyright © 2017 Elsevier Ltd. All rights reserved.
Lund, Jensen A; Brown, Paula N; Shipley, Paul R
2017-09-01
For compliance with US Current Good Manufacturing Practice regulations for dietary supplements, manufacturers must provide identity of source plant material. Despite the popularity of hawthorn as a dietary supplement, relatively little is known about the comparative phytochemistry of different hawthorn species, and in particular North American hawthorns. The combination of NMR spectrometry with chemometric analyses offers an innovative approach to differentiating hawthorn species and exploring the phytochemistry. Two European and two North American species, harvested from a farm trial in late summer 2008, were analyzed by standard 1D ¹H and J-resolved (JRES) experiments. The data were preprocessed and modelled by principal component analysis (PCA). A supervised model was then generated by partial least squares-discriminant analysis (PLS-DA) for classification and evaluated by cross validation. Supervised random forests models were constructed from the dataset to explore the potential of machine learning for identification of unique patterns across species. 1D ¹H NMR data yielded increased differentiation over the JRES data. The random forests results correlated with PLS-DA results and outperformed PLS-DA in classification accuracy. In all of these analyses differentiation of the Crataegus spp. was best achieved by focusing on the NMR spectral region that contains signals unique to plant phenolic compounds. Identification of potentially significant metabolites for differentiation between species was approached using univariate techniques including significance analysis of microarrays and Kruskal-Wallis tests. Copyright © 2017 Elsevier Ltd. All rights reserved.
Li, Yankun; Shao, Xueguang; Cai, Wensheng
2007-04-15
Consensus modeling, which combines the results of multiple independent models to produce a single prediction, avoids the instability of a single model. Based on the principle of consensus modeling, a consensus least squares support vector regression (LS-SVR) method for calibrating near-infrared (NIR) spectra was proposed. In the proposed approach, NIR spectra of plant samples were first preprocessed using the discrete wavelet transform (DWT) to filter the spectral background and noise; the consensus LS-SVR technique was then used to build the calibration model. With an optimization of the parameters involved in the modeling, a satisfactory model was achieved for predicting the content of reducing sugar in plant samples. The predicted results show that the consensus LS-SVR model is more robust and reliable than the conventional partial least squares (PLS) and LS-SVR methods.
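The consensus idea, averaging the predictions of several member models trained on different subsets of the calibration data, can be sketched as follows. This is a deliberately simplified illustration: the member models here are ordinary straight-line fits rather than LS-SVR, and the subset size, number of members and random seed are hypothetical choices that do not come from the paper.

```python
import random


def fit_line(xs, ys):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def consensus_predict(xs, ys, x_new, n_models=25, subset_size=None, seed=0):
    """Average the predictions of member models built on random calibration subsets."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    k = subset_size or max(2, (2 * len(xs)) // 3)
    predictions = []
    for _ in range(n_models):
        idx = rng.sample(range(len(xs)), k)  # subset drawn without replacement
        slope, intercept = fit_line([xs[i] for i in idx], [ys[i] for i in idx])
        predictions.append(slope * x_new + intercept)
    return sum(predictions) / len(predictions)


# Toy, noise-free calibration data (hypothetical): y = 3x + 1
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
ys = [3 * x + 1 for x in xs]
prediction = consensus_predict(xs, ys, x_new=10.0)
```

With noisy data the member predictions differ, and the average is usually less sensitive to any one unlucky subset than a single model would be.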
Liu, Fei; Feng, Lei; Lou, Bing-gan; Sun, Guang-ming; Wang, Lian-ping; He, Yong
2010-07-01
The combinational-stimulated bands were used to develop linear and nonlinear calibrations for the early detection of sclerotinia of oilseed rape (Brassica napus L.). Eighty healthy and 100 Sclerotinia leaf samples were scanned, and different preprocessing methods combined with successive projections algorithm (SPA) were applied to develop partial least squares (PLS) discriminant models, multiple linear regression (MLR) and least squares-support vector machine (LS-SVM) models. The results indicated that the optimal full-spectrum PLS model was achieved by direct orthogonal signal correction (DOSC), then De-trending and Raw spectra with correct recognition ratio of 100%, 95.7% and 95.7%, respectively. When using combinational-stimulated bands, the optimal linear models were SPA-MLR (DOSC) and SPA-PLS (DOSC) with correct recognition ratio of 100%. All SPA-LSSVM models using DOSC, De-trending and Raw spectra achieved perfect results with recognition of 100%. The overall results demonstrated that it was feasible to use combinational-stimulated bands for the early detection of Sclerotinia of oilseed rape, and DOSC-SPA was a powerful way for informative wavelength selection. This method supplied a new approach to the early detection and portable monitoring instrument of sclerotinia.
NASA Astrophysics Data System (ADS)
Suhandy, D.; Yulia, M.; Ogawa, Y.; Kondo, N.
2018-05-01
In the present research, near infrared (NIR) spectroscopy in tandem with full spectrum partial least squares (FS-PLS) regression was evaluated for quantifying the degree of adulteration in civet coffee. A total of 126 ground roasted coffee samples with a degree of adulteration of 0-51% were prepared. Spectral data were acquired using a NIR spectrometer equipped with an integrating sphere for diffuse reflectance measurement in the range of 1300-2500 nm. The samples were divided into two groups: a calibration sample set (84 samples) and a prediction sample set (42 samples). The calibration model was developed on original spectra using FS-PLS regression with the full-cross validation method. The calibration model exhibited a determination coefficient of R² = 0.96 for calibration and R² = 0.92 for validation. The prediction resulted in a low root mean square error of prediction (RMSEP) (4.67%) and a high ratio prediction to deviation (RPD) (3.75). In conclusion, the degree of adulteration in civet coffee has been quantified successfully by using NIR spectroscopy and FS-PLS regression in a non-destructive, economical, precise, and highly sensitive method, which uses very simple sample preparation.
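The two prediction statistics quoted in this abstract can be illustrated with a short sketch. It assumes the commonly used definition of RPD as the standard deviation of the reference values divided by RMSEP; the abstract does not state the exact formula, and the function names and toy values are hypothetical.

```python
import math
import statistics


def rmsep(y_ref, y_pred):
    """Root mean square error of prediction."""
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(y_ref, y_pred)) / len(y_ref))


def rpd(y_ref, y_pred):
    """Ratio of the reference standard deviation to RMSEP (assumed definition)."""
    return statistics.stdev(y_ref) / rmsep(y_ref, y_pred)


# Toy adulteration levels in percent (hypothetical reference and predicted values)
y_ref = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0]
y_pred = [2.0, 8.0, 22.0, 28.0, 42.0, 48.0]
error = rmsep(y_ref, y_pred)
ratio = rpd(y_ref, y_pred)
```

A larger RPD means the prediction error is small relative to the spread of the reference values, which is why it is reported alongside RMSEP.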
Prediction of ethanol in bottled Chinese rice wine by NIR spectroscopy
NASA Astrophysics Data System (ADS)
Ying, Yibin; Yu, Haiyan; Pan, Xingxiang; Lin, Tao
2006-10-01
To evaluate the applicability of non-invasive visible and near infrared (VIS-NIR) spectroscopy for determining ethanol concentration of Chinese rice wine in square brown glass bottles, transmission spectra of 100 bottled Chinese rice wine samples were collected in the spectral range of 350-1200 nm. Statistical equations were established between the reference data and VIS-NIR spectra by the partial least squares (PLS) regression method. The performance of three kinds of mathematical treatment of spectra (original spectra, first derivative spectra and second derivative spectra) was also discussed. The PLS models of original spectra gave better results, with a higher correlation coefficient in calibration (R_cal) of 0.89, a lower root mean square error of calibration (RMSEC) of 0.165, and a lower root mean square error of cross validation (RMSECV) of 0.179. Using original spectra, PLS models for ethanol concentration prediction were developed. The R_cal and the correlation coefficient in validation (R_val) were 0.928 and 0.875, respectively; and the RMSEC and the root mean square error of validation (RMSEP) were 0.135 (%, v v⁻¹) and 0.177 (%, v v⁻¹), respectively. The results demonstrated that VIS-NIR spectroscopy could be used to predict ethanol concentration in bottled Chinese rice wine.
Mabood, F; Boqué, R; Folcarelli, R; Busto, O; Jabeen, F; Al-Harrasi, Ahmed; Hussain, J
2016-05-15
In this study the effect of thermal treatment on the enhancement of synchronous fluorescence spectroscopic method for discrimination and quantification of pure extra virgin olive oil (EVOO) samples from EVOO samples adulterated with refined oil was investigated. Two groups of samples were used. One group was analyzed at room temperature (25 °C) and the other group was thermally treated in a thermostatic water bath at 75 °C for 8h, in contact with air and with light exposure, to favor oxidation. All the samples were then measured with synchronous fluorescence spectroscopy. Synchronous fluorescence spectra were acquired by varying the wavelength in the region from 250 to 720 nm at 20 nm wavelength differential interval of excitation and emission. Pure and adulterated olive oils were discriminated by using partial least-squares discriminant analysis (PLS-DA). It was found that the best PLS-DA models were those built with the difference spectra (75 °C-25 °C), which were able to discriminate pure from adulterated oils at a 2% level of adulteration of refined olive oils. Furthermore, PLS regression models were also built to quantify the level of adulteration. Again, the best model was the one built with the difference spectra, with a prediction error of 3.18% of adulteration. Copyright © 2016 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Mabood, F.; Boqué, R.; Folcarelli, R.; Busto, O.; Jabeen, F.; Al-Harrasi, Ahmed; Hussain, J.
2016-05-01
In this study the effect of thermal treatment on the enhancement of synchronous fluorescence spectroscopic method for discrimination and quantification of pure extra virgin olive oil (EVOO) samples from EVOO samples adulterated with refined oil was investigated. Two groups of samples were used. One group was analyzed at room temperature (25 °C) and the other group was thermally treated in a thermostatic water bath at 75 °C for 8 h, in contact with air and with light exposure, to favor oxidation. All the samples were then measured with synchronous fluorescence spectroscopy. Synchronous fluorescence spectra were acquired by varying the wavelength in the region from 250 to 720 nm at 20 nm wavelength differential interval of excitation and emission. Pure and adulterated olive oils were discriminated by using partial least-squares discriminant analysis (PLS-DA). It was found that the best PLS-DA models were those built with the difference spectra (75 °C-25 °C), which were able to discriminate pure from adulterated oils at a 2% level of adulteration of refined olive oils. Furthermore, PLS regression models were also built to quantify the level of adulteration. Again, the best model was the one built with the difference spectra, with a prediction error of 3.18% of adulteration.
Fischedick, Justin T
2017-01-01
Introduction: With laws changing around the world regarding the legal status of Cannabis sativa (cannabis) it is important to develop objective classification systems that help explain the chemical variation found among various cultivars. Currently cannabis cultivars are named using obscure and inconsistent nomenclature. Terpenoids, responsible for the aroma of cannabis, are a useful group of compounds for distinguishing cannabis cultivars with similar cannabinoid content. Methods: In this study we analyzed terpenoid content of cannabis samples obtained from a single medical cannabis dispensary in California over the course of a year. Terpenoids were quantified by gas chromatography with flame ionization detection and peak identification was confirmed with gas chromatography mass spectrometry. Quantitative data from 16 major terpenoids were analyzed using hierarchical clustering analysis (HCA), principal component analysis (PCA), partial least squares discriminant analysis (PLS-DA), and orthogonal partial least squares discriminant analysis (OPLS-DA). Results: A total of 233 samples representing 30 cultivars were used to develop a classification scheme based on quantitative data, HCA, PCA, and OPLS-DA. Initially cultivars were divided into five major groups, which were subdivided into 13 classes based on differences in terpenoid profile. Different classification models were compared with PLS-DA and found to perform best when many representative samples of a particular class were included. Conclusion: A hierarchy of terpenoid chemotypes was observed in the data set. Some cultivars fit into distinct chemotypes, whereas others seemed to represent a continuum of chemotypes. This study has demonstrated an approach to classifying cannabis cultivars based on terpenoid profile.
NASA Astrophysics Data System (ADS)
Chen, Jiemei; Peng, Lijun; Han, Yun; Yao, Lijun; Zhang, Jing; Pan, Tao
2018-03-01
Near-infrared (NIR) spectroscopy combined with chemometrics was applied to rapidly analyse haemoglobin A2 (HbA2) for β-thalassemia screening in human haemolysate samples. The relative content indicator HbA2 was indirectly quantified by simultaneous analysis of two absolute content indicators (Hb and Hb • HbA2). According to the comprehensive prediction effect of the multiple partitioning of calibration and prediction sets, the parameters were optimized to achieve modelling stability, and the preferred models were validated using the samples not involved in modelling. Savitzky-Golay smoothing was first used for the spectral pretreatment. The absorbance optimization partial least squares (AO-PLS) was used to eliminate high-absorption wave-bands appropriately. The equidistant combination PLS (EC-PLS) was further used to optimize wavelength models. The selected optimal models were I = 856 nm, N = 16, G = 1 and F = 6 for Hb and I = 988 nm, N = 12, G = 2 and F = 5 for Hb • HbA2. Through independent validation, the root-mean-square errors and correlation coefficients for prediction (RMSEP, RP) were 3.50 g L⁻¹ and 0.977 for Hb and 0.38 g L⁻¹ and 0.917 for Hb • HbA2, respectively. The predicted values of relative percentage HbA2 were further calculated, and the calculated RMSEP and RP were 0.31% and 0.965, respectively. The sensitivity and specificity for β-thalassemia both reached 100%. Therefore, the prediction of HbA2 achieved high accuracy for distinguishing β-thalassemia. The local optimal models for single parameter and the optimal equivalent model sets were proposed, providing more models to match possible constraints in practical applications. The NIR analysis method for the screening indicator of β-thalassemia was successfully established. The proposed method was rapid, simple and promising for thalassemia screening in a large population.
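The indirect quantification described here, deriving the relative HbA2 percentage from two predicted absolute quantities, amounts to a ratio. The sketch below assumes that Hb • HbA2 denotes the absolute HbA2 concentration in the same units as Hb (g L⁻¹), so that the relative content is 100 × (Hb • HbA2) / Hb; this reading, the function name and the example values are assumptions, not details confirmed by the abstract.

```python
def hba2_percent(hb, hb_hba2):
    """Relative HbA2 content (%) from predicted Hb and Hb*HbA2, both in g/L (assumed units)."""
    if hb <= 0:
        raise ValueError("Hb concentration must be positive")
    return 100.0 * hb_hba2 / hb


# Hypothetical predicted values from the two NIR models
relative_hba2 = hba2_percent(hb=140.0, hb_hba2=4.2)
```

Because the percentage is a quotient of two predictions, errors in both models propagate into it, which is presumably why the abstract reports a separate RMSEP for the calculated HbA2 values.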
Mirjankar, Nikhil S; Fraga, Carlos G; Carman, April J; Moran, James J
2016-02-02
Chemical attribution signatures (CAS) for chemical threat agents (CTAs), such as cyanides, are being investigated to provide an evidentiary link between CTAs and specific sources to support criminal investigations and prosecutions. Herein, stocks of KCN and NaCN were analyzed for trace anions by high performance ion chromatography (HPIC), carbon stable isotope ratio (δ¹³C) by isotope ratio mass spectrometry (IRMS), and trace elements by inductively coupled plasma optical emission spectroscopy (ICP-OES). The collected analytical data were evaluated using hierarchical cluster analysis (HCA), Fisher-ratio (F-ratio), interval partial least-squares (iPLS), genetic algorithm-based partial least-squares (GAPLS), partial least-squares discriminant analysis (PLSDA), K nearest neighbors (KNN), and support vector machines discriminant analysis (SVMDA). HCA of anion impurity profiles from multiple cyanide stocks from six reported countries of origin resulted in cyanide samples clustering into three groups, independent of the associated alkali metal (K or Na). The three groups were independently corroborated by HCA of cyanide elemental profiles and corresponded to countries each having one known solid cyanide factory: Czech Republic, Germany, and United States. Carbon stable isotope measurements resulted in two clusters: Germany and United States (the single Czech stock grouped with United States stocks). Classification errors for two validation studies using anion impurity profiles collected over five years on different instruments were as low as zero for KNN and SVMDA, demonstrating the excellent reliability associated with using anion impurities for matching a cyanide sample to its factory using our current cyanide stocks. Variable selection methods reduced errors for those classification methods having errors greater than zero; iPLS-forward selection and F-ratio typically provided the lowest errors.
Finally, using anion profiles to classify cyanides to a specific stock or stock group for a subset of United States stocks resulted in cross-validation errors ranging from 0 to 5.3%.
NASA Astrophysics Data System (ADS)
Sindt, Nathan M.; Robison, Faith; Brick, Mark A.; Schwartz, Howard F.; Heuberger, Adam L.; Prenni, Jessica E.
2018-02-01
Matrix-assisted laser desorption/ionization time of flight mass spectrometry (MALDI-TOF-MS) is a fast and effective tool for microbial species identification. However, current approaches are limited to species-level identification even when genetic differences are known. Here, we present a novel workflow that applies the statistical method of partial least squares discriminant analysis (PLS-DA) to MALDI-TOF-MS protein fingerprint data of Xanthomonas axonopodis, an important bacterial plant pathogen of fruit and vegetable crops. Mass spectra of 32 X. axonopodis strains were used to create a mass spectral library and PLS-DA was employed to model the closely related strains. A robust workflow was designed to optimize the PLS-DA model by assessing the model performance over a range of signal-to-noise ratios (s/n) and mass filter (MF) thresholds. The optimized parameters were observed to be s/n = 3 and MF = 0.7. The model correctly classified 83% of spectra withheld from the model as a test set. A new decision rule was developed, termed the rolled-up Maximum Decision Rule (ruMDR), and this method improved identification rates to 92%. These results demonstrate that MALDI-TOF-MS protein fingerprints of bacterial isolates can be utilized to enable identification at the strain level. Furthermore, the open-source framework of this workflow allows for broad implementation across various instrument platforms as well as integration with alternative modeling and classification algorithms.
The Role of Safety Culture in Influencing Provider Perceptions of Patient Safety.
Bishop, Andrea C; Boyle, Todd A
2016-12-01
To determine how provider perceptions of safety culture influence their involvement in patient safety practices. Health-care providers were surveyed in 2 tertiary hospitals located in Atlantic Canada, composed of 4 units in total. The partial least squares (PLS) approach to structural equation modeling was used to analyze the data. The provider latent-variable PLS model encompassed the hypothesized relationships between provider characteristics, safety culture, perceptions of patient safety practices, and actual performance of patient safety practices, using the Health Belief Model (HBM) as a guide. Data analysis was conducted using SmartPLS. A total of 113 health-care providers completed a survey out of an eligible 318, representing a response rate of 35.5%. The final PLS model showed acceptable internal consistency, with all four latent variables having a composite reliability score above the recommended 0.70 cutoff value (safety culture = 0.86, threat = 0.76, expectations = 0.83, PS practices = 0.75). Discriminant validity was established, and all path coefficients were found to be significant at the α = 0.05 level using nonparametric bootstrapping. The survey results show that safety culture accounted for 34% of the variance in perceptions of threat and 42% of the variance in expectations. This research supports the role that safety culture plays in the promotion and maintenance of patient safety activities for health-care providers. As such, it is recommended that the introduction of new patient safety strategies follow a thorough exploration of an organization's safety culture.
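Two of the figures in this abstract are simple to reproduce in code: the response rate and the composite reliability of a latent variable. The sketch below uses the widely used composite reliability formula for standardized indicator loadings, (Σλ)² / ((Σλ)² + Σ(1 − λ²)); whether SmartPLS was configured exactly this way is an assumption, and the loadings shown are invented, not taken from the study.

```python
def response_rate(completed, eligible):
    """Percentage of eligible participants who completed the survey."""
    return 100.0 * completed / eligible


def composite_reliability(loadings):
    """Composite reliability from standardized loadings of one latent variable."""
    squared_sum = sum(loadings) ** 2
    error_variance = sum(1.0 - lam ** 2 for lam in loadings)
    return squared_sum / (squared_sum + error_variance)


rate = response_rate(113, 318)  # counts reported in the abstract

# Hypothetical standardized loadings for a three-indicator latent variable
cr = composite_reliability([0.8, 0.7, 0.9])
meets_cutoff = cr > 0.70  # 0.70 is the cutoff cited in the abstract
```

Rounded to one decimal place, the computed response rate matches the 35.5% stated in the abstract.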
NASA Astrophysics Data System (ADS)
Hegazy, Maha A.; Lotfy, Hayam M.; Mowaka, Shereen; Mohamed, Ekram Hany
2016-07-01
Wavelets have been adapted for a vast number of signal-processing applications due to the amount of information that can be extracted from a signal. In this work, a comparative study was conducted on the efficiency of the continuous wavelet transform (CWT) as a signal processing tool in univariate regression and as a pre-processing tool in multivariate analysis using partial least squares (CWT-PLS). These were applied to complex spectral signals of ternary and quaternary mixtures. The CWT-PLS method succeeded in the simultaneous determination of a quaternary mixture of drotaverine (DRO), caffeine (CAF), paracetamol (PAR) and p-aminophenol (PAP, the major impurity of paracetamol). In contrast, the univariate CWT failed to determine the quaternary mixture components simultaneously; it could determine only PAR and PAP, and the ternary mixtures of DRO, CAF, and PAR and of CAF, PAR, and PAP. During the calculations of CWT, different wavelet families were tested. The univariate CWT method was validated according to the ICH guidelines. For the development of the CWT-PLS model, a calibration set was prepared by means of an orthogonal experimental design, and the absorption spectra were recorded and processed by CWT. The CWT-PLS model was constructed by regression between the wavelet coefficients and concentration matrices, and validation was performed by both cross validation and external validation sets. Both methods were successfully applied for determination of the studied drugs in pharmaceutical formulations.
NASA Astrophysics Data System (ADS)
Thangsunan, Patcharapong; Kittiwachana, Sila; Meepowpan, Puttinan; Kungwan, Nawee; Prangkio, Panchika; Hannongbua, Supa; Suree, Nuttee
2016-06-01
Improving performance of scoring functions for drug docking simulations is a challenging task in the modern discovery pipeline. Among various ways to enhance the efficiency of scoring function, tuning of energetic component approach is an attractive option that provides better predictions. Herein we present the first development of rapid and simple tuning models for predicting and scoring inhibitory activity of investigated ligands docked into catalytic core domain structures of HIV-1 integrase (IN) enzyme. We developed the models using all energetic terms obtained from flexible ligand-rigid receptor dockings by AutoDock4, followed by a data analysis using either partial least squares (PLS) or self-organizing maps (SOMs). The models were established using 66 and 64 ligands of mercaptobenzenesulfonamides for the PLS-based and the SOMs-based inhibitory activity predictions, respectively. The models were then evaluated for their predictability quality using closely related test compounds, as well as five different unrelated inhibitor test sets. Weighting constants for each energy term were also optimized, thus customizing the scoring function for this specific target protein. Root-mean-square error (RMSE) values between the predicted and the experimental inhibitory activities were determined to be <1 (i.e. within a magnitude of a single log scale of actual IC50 values). Hence, we propose that, as a pre-functional assay screening step, AutoDock4 docking in combination with these subsequent rapid weighted energy tuning methods via PLS and SOMs analyses is a viable approach to predict the potential inhibitory activity and to discriminate among small drug-like molecules to target a specific protein of interest.
NASA Astrophysics Data System (ADS)
He, Anhua; Singh, Ramesh P.; Sun, Zhaohua; Ye, Qing; Zhao, Gang
2016-07-01
Earth tides, atmospheric pressure, precipitation and earthquakes all cause fluctuations in water well levels; earthquakes in particular have a strong impact, and anomalous co-seismic changes in groundwater levels have been observed. In this paper, we use four different models, simple linear regression (SLR), multiple linear regression (MLR), principal component analysis (PCA) and partial least squares (PLS), to compute the atmospheric pressure and earth tidal effects on water level. Furthermore, we use the Akaike information criterion (AIC) to compare the performance of the models. Based on the lowest AIC and sum of squares for error values, the best estimate of the effects of atmospheric pressure and earth tide on water level is found using the MLR model. However, the MLR model does not account for multicollinearity between inputs; as a result, the atmospheric pressure and earth tidal response coefficients fail to reflect the mechanisms associated with the groundwater level fluctuations. Once the serious multicollinearity of the inputs is addressed, the PLS model shows the minimum AIC value. The atmospheric pressure and earth tidal response coefficients obtained with the PLS model agree closely with the observations. The atmospheric pressure and the earth tidal response coefficients are found to be sensitive to the stress-strain state using the observed data for the period 1 April-8 June 2008 of Chuan 03# well. The transient enhancement of porosity of the rock mass around Chuan 03# well associated with the Wenchuan earthquake (Mw = 7.9 of 12 May 2008), which returned to its original pre-seismic level after 13 days, indicates that the co-seismic sharp rise of the water level in the well could be induced by static stress change, rather than development of new fractures.
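The AIC-based model comparison described in this abstract can be sketched as follows. This is a minimal illustration, assuming the common least-squares form AIC = n·ln(SSE/n) + 2k (Gaussian errors, constant terms dropped); the function name and the numbers are hypothetical and are not taken from the study.

```python
import math

def aic_least_squares(sse: float, n_obs: int, n_params: int) -> float:
    """AIC of a least-squares fit, up to an additive constant.

    Assumes Gaussian residuals: AIC = n * ln(SSE / n) + 2 * k.
    """
    if sse <= 0 or n_obs <= 0:
        raise ValueError("sse and n_obs must be positive")
    return n_obs * math.log(sse / n_obs) + 2 * n_params

# Hypothetical SSE values and parameter counts, for illustration only.
candidates = {
    "SLR": aic_least_squares(sse=12.0, n_obs=100, n_params=2),
    "MLR": aic_least_squares(sse=6.0, n_obs=100, n_params=4),
}

# The model with the lowest AIC is preferred.
best_model = min(candidates, key=candidates.get)
```

A lower AIC only ranks models fitted to the same data; as the abstract notes, it does not by itself reveal whether collinear inputs have made the fitted coefficients physically uninterpretable.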
Analysis of spreadable cheese by Raman spectroscopy and chemometric tools.
Oliveira, Kamila de Sá; Callegaro, Layce de Souza; Stephani, Rodrigo; Almeida, Mariana Ramos; de Oliveira, Luiz Fernando Cappa
2016-03-01
In this work, FT-Raman spectroscopy was explored to evaluate spreadable cheese samples. A partial least squares discriminant analysis was employed to identify the spreadable cheese samples containing starch. To build the models, two types of samples were used: commercial samples and samples manufactured in local industries. The method of supervised classification PLS-DA was employed to classify the samples as adulterated or without starch. Multivariate regression was performed using the partial least squares method to quantify the starch in the spreadable cheese. The limit of detection obtained for the model was 0.34% (w/w) and the limit of quantification was 1.14% (w/w). The reliability of the models was evaluated by determining the confidence interval, which was calculated using the bootstrap re-sampling technique. The results show that the classification models can be used to complement classical analysis and as screening methods. Copyright © 2015 Elsevier Ltd. All rights reserved.
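The abstract states that confidence intervals were obtained by bootstrap re-sampling but does not say which statistic was resampled. The sketch below assumes a percentile bootstrap of the mean prediction error; the function name, the data and the choice of statistic are all hypothetical.

```python
import random
import statistics

def bootstrap_ci(values, stat=statistics.mean, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for stat(values)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(values)
    # Resample with replacement, recompute the statistic, and sort the estimates.
    estimates = sorted(
        stat([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_resamples)
    )
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = n_resamples - 1 - lo_idx
    return estimates[lo_idx], estimates[hi_idx]

# Hypothetical prediction errors in % (w/w); not data from the study.
errors = [0.21, -0.35, 0.10, 0.42, -0.18, 0.05, -0.27, 0.33, -0.02, 0.15]
low, high = bootstrap_ci(errors)
```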
Croker, Denise M; Hennigan, Michelle C; Maher, Anthony; Hu, Yun; Ryder, Alan G; Hodnett, Benjamin K
2012-04-07
Diffraction and spectroscopic methods were evaluated for quantitative analysis of binary powder mixtures of FII(6.403) and FIII(6.525) piracetam. The two polymorphs of piracetam could be distinguished using powder X-ray diffraction (PXRD), Raman and near-infrared (NIR) spectroscopy. The results demonstrated that Raman and NIR spectroscopy are most suitable for quantitative analysis of this polymorphic mixture. When the spectra are treated with the combination of multiplicative scatter correction (MSC) and second derivative data pretreatments, the partial least squares (PLS) regression model gave a root mean square error of calibration (RMSEC) of 0.94 and 0.99%, respectively. FIII(6.525) demonstrated some preferred orientation in PXRD analysis, making PXRD the least preferred method of quantification. Copyright © 2012 Elsevier B.V. All rights reserved.
Whelan, Jessica; Craven, Stephen; Glennon, Brian
2012-01-01
In this study, the application of Raman spectroscopy to the simultaneous quantitative determination of glucose, glutamine, lactate, ammonia, glutamate, total cell density (TCD), and viable cell density (VCD) in a CHO fed-batch process was demonstrated in situ in 3 L and 15 L bioreactors. Spectral preprocessing and partial least squares (PLS) regression were used to correlate spectral data with off-line reference data. Separate PLS calibration models were developed for each analyte at the 3 L laboratory bioreactor scale before assessing its transferability to the same bioprocess conducted at the 15 L pilot scale. PLS calibration models were successfully developed for all analytes bar VCD and transferred to the 15 L scale. Copyright © 2012 American Institute of Chemical Engineers (AIChE).
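To make the idea of correlating spectra with off-line reference values concrete, here is a minimal sketch of PLS regression for a single response (PLS1) with one latent variable, in plain Python. It is not the authors' calibration procedure: real models use several latent variables, spectral preprocessing and cross-validation, and the toy data below are hypothetical.

```python
def pls1_one_component(X, y):
    """Fit a one-latent-variable PLS1 model and return a predict(x) function."""
    n, p = len(X), len(X[0])
    # Mean-center predictors and response.
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]

    # Weight vector: the X-space direction with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(v * v for v in w) ** 0.5
    if norm == 0:
        raise ValueError("X and y have zero covariance")
    w = [v / norm for v in w]

    # Scores along that direction, then regress y on the scores.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)

    def predict(x):
        score = sum((x[j] - x_mean[j]) * w[j] for j in range(p))
        return y_mean + q * score

    return predict

# Toy data: two perfectly collinear "wavelengths" and y = 3 * x1 + 2.
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
y = [5.0, 8.0, 11.0, 14.0]
predict = pls1_one_component(X, y)
```

Because the model regresses on a latent score rather than on the raw columns, the perfect collinearity in the toy data causes no problem, which is one reason PLS is popular for spectra with many correlated wavelengths.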
Ferragina, A.; de los Campos, G.; Vazquez, A. I.; Cecchinato, A.; Bittante, G.
2017-01-01
The aim of this study was to assess the performance of Bayesian models commonly used for genomic selection to predict “difficult-to-predict” dairy traits, such as milk fatty acid (FA) expressed as percentage of total fatty acids, and technological properties, such as fresh cheese yield and protein recovery, using Fourier-transform infrared (FTIR) spectral data. Our main hypothesis was that Bayesian models that can estimate shrinkage and perform variable selection may improve our ability to predict FA traits and technological traits above and beyond what can be achieved using the current calibration models (e.g., partial least squares, PLS). To this end, we assessed a series of Bayesian methods and compared their prediction performance with that of PLS. The comparison between models was done using the same sets of data (i.e., same samples, same variability, same spectral treatment) for each trait. Data consisted of 1,264 individual milk samples collected from Brown Swiss cows for which gas chromatographic FA composition, milk coagulation properties, and cheese-yield traits were available. For each sample, 2 spectra in the infrared region from 5,011 to 925 cm−1 were available and averaged before data analysis. Three Bayesian models: Bayesian ridge regression (Bayes RR), Bayes A, and Bayes B, and 2 reference models: PLS and modified PLS (MPLS) procedures, were used to calibrate equations for each of the traits. The Bayesian models used were implemented in the R package BGLR (http://cran.r-project.org/web/packages/BGLR/index.html), whereas the PLS and MPLS were those implemented in the WinISI II software (Infrasoft International LLC, State College, PA). Prediction accuracy was estimated for each trait and model using 25 replicates of a training-testing validation procedure. Compared with PLS, which is currently the most widely used calibration method, MPLS and the 3 Bayesian methods showed significantly greater prediction accuracy. 
Accuracy increased in moving from calibration to external validation methods, and in moving from PLS and MPLS to Bayesian methods, particularly Bayes A and Bayes B. The maximum R2 value of validation was obtained with Bayes B and Bayes A. For the FA, C10:0 (% of each FA on total FA basis) had the highest R2 (0.75, achieved with Bayes A and Bayes B), and among the technological traits, fresh cheese yield R2 of 0.82 (achieved with Bayes B). These 2 methods have proven to be useful instruments in shrinking and selecting very informative wavelengths and inferring the structure and functions of the analyzed traits. We conclude that Bayesian models are powerful tools for deriving calibration equations, and, importantly, these equations can be easily developed using existing open-source software. As part of our study, we provide scripts based on the open source R software BGLR, which can be used to train customized prediction equations for other traits or populations. PMID:26387015
Yang, Jun-Ho; Yoh, Jack J
2018-01-01
A novel technique is reported for separating overlapping latent fingerprints using chemometric approaches that combine laser-induced breakdown spectroscopy (LIBS) and multivariate analysis. The LIBS technique provides the capability of real time analysis and high frequency scanning as well as the data regarding the chemical composition of overlapping latent fingerprints. These spectra offer valuable information for the classification and reconstruction of overlapping latent fingerprints by implementing appropriate statistical multivariate analysis. The current study employs principal component analysis and partial least square methods for the classification of latent fingerprints from the LIBS spectra. This technique was successfully demonstrated through a classification study of four distinct latent fingerprints using classification methods such as soft independent modeling of class analogy (SIMCA) and partial least squares discriminant analysis (PLS-DA). The novel method yielded an accuracy of more than 85% and was proven to be sufficiently robust. Furthermore, through laser scanning analysis at a spatial interval of 125 µm, the overlapping fingerprints were reconstructed as separate two-dimensional forms.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Clegg, Samuel M; Barefield, James E; Wiens, Roger C
2008-01-01
Quantitative analysis with LIBS traditionally employs calibration curves that are complicated by the chemical matrix effects. These chemical matrix effects influence the LIBS plasma and the ratio of elemental composition to elemental emission line intensity. Consequently, LIBS calibration typically requires a priori knowledge of the unknown, in order for a series of calibration standards similar to the unknown to be employed. In this paper, three new Multivariate Analysis (MVA) techniques are employed to analyze the LIBS spectra of 18 disparate igneous and highly-metamorphosed rock samples. Partial Least Squares (PLS) analysis is used to generate a calibration model from which unknown samples can be analyzed. Principal Components Analysis (PCA) and Soft Independent Modeling of Class Analogy (SIMCA) are employed to generate a model and predict the rock type of the samples. These MVA techniques appear to exploit the matrix effects associated with the chemistries of these 18 samples.
Leaf Chlorophyll Content Estimation of Winter Wheat Based on Visible and Near-Infrared Sensors.
Zhang, Jianfeng; Han, Wenting; Huang, Lvwen; Zhang, Zhiyong; Ma, Yimian; Hu, Yamin
2016-03-25
The leaf chlorophyll content is one of the most important factors for the growth of winter wheat. Visible and near-infrared sensors are a quick and non-destructive testing technology for the estimation of crop leaf chlorophyll content. In this paper, a new approach is developed for leaf chlorophyll content estimation of winter wheat based on visible and near-infrared sensors. First, the sliding window smoothing (SWS) was integrated with the multiplicative scatter correction (MSC) or the standard normal variable transformation (SNV) to preprocess the reflectance spectra images of wheat leaves. Then, a model for the relationship between the leaf relative chlorophyll content and the reflectance spectra was developed using the partial least squares (PLS) and the back propagation neural network. A total of 300 samples from areas surrounding Yangling, China, were used for the experimental studies. The visible and near-infrared spectra of the samples, covering wavelengths of 450-900 nm, were preprocessed using SWS, MSC and SNV. The experimental results indicate that preprocessing with SWS and SNV followed by modeling with PLS achieves the most accurate estimation, with a correlation coefficient of 0.8492 and a root mean square error of 1.7216. Thus, the proposed approach can be widely used for winter wheat chlorophyll content analysis.
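The SWS-then-SNV preprocessing chain can be sketched as below. The abstract does not define its sliding window smoothing, so a centered moving average is assumed here; the function names and the window width are illustrative only.

```python
import statistics

def sliding_window_smooth(spectrum, window=3):
    """Centered moving average; the window shrinks at the two edges.

    Assumption: SWS is treated as a plain moving average.
    """
    half = window // 2
    smoothed = []
    for i in range(len(spectrum)):
        lo, hi = max(0, i - half), min(len(spectrum), i + half + 1)
        smoothed.append(sum(spectrum[lo:hi]) / (hi - lo))
    return smoothed

def snv(spectrum):
    """Standard normal variate: center each spectrum and scale by its own SD."""
    mean = statistics.mean(spectrum)
    sd = statistics.stdev(spectrum)  # sample standard deviation
    return [(v - mean) / sd for v in spectrum]

# Hypothetical reflectance values for one leaf spectrum.
raw = [0.10, 0.14, 0.12, 0.20, 0.26, 0.25, 0.31]
preprocessed = snv(sliding_window_smooth(raw))
```

SNV operates on each spectrum independently, so unlike MSC it needs no reference (mean) spectrum from the calibration set.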
Farrés, Mireia; Piña, Benjamí; Tauler, Romà
2016-08-01
Copper-containing fungicides are used to protect vineyards from fungal infections. High copper residues in grapes are potentially toxic and affect the microorganisms living in vineyards, such as Saccharomyces cerevisiae. In this study, the response of the metabolic profiles of S. cerevisiae at different concentrations of copper sulphate (control, 1 mM, 3 mM and 6 mM) was analysed by liquid chromatography coupled to mass spectrometry (LC-MS) and multivariate curve resolution-alternating least squares (MCR-ALS) using an untargeted metabolomics approach. Peak areas of the MCR-ALS resolved elution profiles in control and in Cu(II)-treated samples were compared using partial least squares regression (PLSR) and PLS-discriminant analysis (PLS-DA), and the intracellular metabolites best contributing to sample discrimination were selected and identified. Fourteen metabolites showed significant concentration changes upon Cu(II) exposure, following a dose-response effect. The observed changes were consistent with the expected effects of Cu(II) toxicity, including oxidative stress and DNA damage. This research confirmed that LC-MS based metabolomics coupled to chemometric methods is a powerful approach for discerning metabolomic changes in S. cerevisiae and for elucidating modes of toxicity of environmental stressors, including heavy metals like Cu(II).
de Almeida, Maurício Liberal; Saatkamp, Cassiano Junior; Fernandes, Adriana Barrinha; Pinheiro, Antonio Luiz Barbosa; Silveira, Landulfo
2016-09-01
Urea and creatinine are commonly used as biomarkers of renal function. Abnormal concentrations of these biomarkers are indicative of pathological processes such as renal failure. This study aimed to develop a model based on Raman spectroscopy to estimate the concentration values of urea and creatinine in human serum. Blood sera from 55 clinically normal subjects and 47 patients with chronic kidney disease undergoing dialysis were collected, and concentrations of urea and creatinine were determined by spectrophotometric methods. A Raman spectrum was obtained with a high-resolution dispersive Raman spectrometer (830 nm). A spectral model was developed based on partial least squares (PLS), where the concentrations of urea and creatinine were correlated with the Raman features. Principal components analysis (PCA) was used to discriminate dialysis patients from normal subjects. The PLS model showed r = 0.97 and r = 0.93 for urea and creatinine, respectively. The root mean square errors of cross-validation (RMSECV) for the model were 17.6 and 1.94 mg/dL, respectively. PCA showed high discrimination between dialysis and normality (95 % accuracy). The Raman technique was able to determine the concentrations with low error and to discriminate dialysis from normal subjects, consistent with a rapid and low-cost test.
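The two figures of merit quoted in this abstract, the correlation coefficient r and the root mean square error, can be computed as in the following sketch. The function names and the concentration values are hypothetical; for RMSECV the predicted values would come from the held-out folds of a cross-validation.

```python
import math

def rmse(reference, predicted):
    """Root mean square error between reference and predicted values."""
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))

def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

# Hypothetical urea concentrations in mg/dL (reference assay vs. model).
reference = [30.0, 45.0, 120.0, 150.0]
predicted = [35.0, 40.0, 110.0, 160.0]
error = rmse(reference, predicted)
r = pearson_r(reference, predicted)
```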
Lee, Hoonsoo; Kim, Moon S; Song, Yu-Rim; Oh, Chang-Sik; Lim, Hyoun-Sub; Lee, Wang-Hee; Kang, Jum-Soon; Cho, Byoung-Kwan
2017-03-01
There is a need to minimize economic damage by sorting infected seeds from healthy seeds before seeding. However, current methods of detecting infected seeds, such as seedling grow-out, enzyme-linked immunosorbent assays, the polymerase chain reaction (PCR) and real-time PCR, have critical drawbacks in that they are time-consuming, labor-intensive and destructive procedures. The present study aimed to evaluate the potential of a visible/near-infrared (Vis/NIR) hyperspectral imaging system for detecting bacteria-infected watermelon seeds. A hyperspectral Vis/NIR reflectance imaging system (spectral region of 400-1000 nm) was constructed to obtain hyperspectral reflectance images for 336 bacteria-infected watermelon seeds, which were then subjected to partial least squares discriminant analysis (PLS-DA) and a least-squares support vector machine (LS-SVM) to classify bacteria-infected watermelon seeds from healthy watermelon seeds. The developed system detected bacteria-infected watermelon seeds with an accuracy > 90% (PLS-DA: 91.7%, LS-SVM: 90.5%), suggesting that the Vis/NIR hyperspectral imaging system is effective for quarantining bacteria-infected watermelon seeds. The results of the present study show that it is possible to use the Vis/NIR hyperspectral imaging system for detecting bacteria-infected watermelon seeds. © 2016 Society of Chemical Industry.
NASA Astrophysics Data System (ADS)
Hong, Jangho; Kawashima, Ayato; Hamada, Noriaki
2017-06-01
In this study, we developed a facile fabrication method to access a highly reproducible plasmonic surface enhanced Raman scattering substrate via the immobilization of gold nanoparticles on an Ultrafiltration (UF) membrane using a suction technique. This was combined with a simple and rapid analyte concentration and detection method utilizing portable Raman spectroscopy. The minimum detectable concentrations for aqueous thiabendazole standard solution and thiabendazole in orange extract are 0.01 μg/mL and 0.125 μg/g, respectively. The partial least squares (PLS) regression plot shows a good linear relationship between 0.001 and 100 μg/mL of analyte, with a root mean square error of prediction (RMSEP) of 0.294 and a correlation coefficient (R2) of 0.976 for the thiabendazole standard solution. Meanwhile, the PLS plot also shows a good linear relationship between 0.0 and 2.5 μg/g of analyte, with an RMSEP value of 0.298 and an R2 value of 0.993 for the orange peel extract. In addition to the detection of other types of pesticides in agricultural products, this highly uniform plasmonic substrate has great potential for application in various environmentally-related areas.
Li, Zhigang; Wang, Qiaoyun; Lv, Jiangtao; Ma, Zhenhe; Yang, Linjuan
2015-06-01
Spectroscopy is often applied when a rapid quantitative analysis is required, but one challenge is the translation of raw spectra into a final analysis. Derivative spectra are often used as a preliminary preprocessing step to resolve overlapping signals, enhance signal properties, and suppress unwanted spectral features that arise due to non-ideal instrument and sample properties. In this study, to improve quantitative analysis of near-infrared spectra, derivatives of noisy raw spectral data need to be estimated with high accuracy. A new spectral estimator based on singular perturbation technique, called the singular perturbation spectra estimator (SPSE), is presented, and the stability analysis of the estimator is given. Theoretical analysis and simulation experimental results confirm that the derivatives can be estimated with high accuracy using this estimator. Furthermore, the effectiveness of the estimator for processing noisy infrared spectra is evaluated using the analysis of beer spectra. The derivative spectra of the beer and the marzipan are used to build the calibration model using partial least squares (PLS) modeling. The results show that the PLS based on the new estimator can achieve better performance compared with the Savitzky-Golay algorithm and can serve as an alternative choice for quantitative analytical applications.
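For context, the Savitzky-Golay baseline against which the new estimator is compared can be sketched in a few lines. The code below is not the SPSE method; it is the standard 5-point Savitzky-Golay first derivative (quadratic fit, coefficients -2, -1, 0, 1, 2 divided by 10), applied to interior points only. The function name and the test signal are illustrative.

```python
def savgol_first_derivative_5pt(y, spacing=1.0):
    """First derivative from a 5-point quadratic Savitzky-Golay fit.

    Returns values for interior points only (the first and last two
    samples are skipped rather than padded).
    """
    coeffs = (-2.0, -1.0, 0.0, 1.0, 2.0)
    derivative = []
    for i in range(2, len(y) - 2):
        window = y[i - 2:i + 3]
        derivative.append(sum(c * v for c, v in zip(coeffs, window)) / (10.0 * spacing))
    return derivative

# A noise-free parabola: the quadratic fit recovers d/dx (x^2) = 2x exactly.
signal = [float(x * x) for x in range(7)]
slopes = savgol_first_derivative_5pt(signal)
```

With noisy spectra, a wider window suppresses more noise but also flattens narrow bands, which is the trade-off any alternative derivative estimator has to improve on.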
Effective diagnosis of Alzheimer’s disease by means of large margin-based methodology
2012-01-01
Background: Functional brain images such as Single-Photon Emission Computed Tomography (SPECT) and Positron Emission Tomography (PET) have been widely used to guide clinicians in the diagnosis of Alzheimer’s Disease (AD). However, the subjectivity involved in their evaluation has favoured the development of Computer Aided Diagnosis (CAD) Systems. Methods: A novel combination of feature extraction techniques is proposed to improve the diagnosis of AD. Firstly, Regions of Interest (ROIs) are selected by means of a t-test carried out on 3D Normalised Mean Square Error (NMSE) features restricted to be located within a predefined brain activation mask. In order to address the small sample-size problem, the dimension of the feature space was further reduced by: Large Margin Nearest Neighbours using a rectangular matrix (LMNN-RECT), Principal Component Analysis (PCA) or Partial Least Squares (PLS) (the two latter also analysed with a LMNN transformation). Regarding the classifiers, kernel Support Vector Machines (SVMs) and LMNN using Euclidean, Mahalanobis and Energy-based metrics were compared. Results: Several experiments were conducted in order to evaluate the proposed LMNN-based feature extraction algorithms and their benefits as: i) linear transformation of the PLS or PCA reduced data, ii) feature reduction technique, and iii) classifier (with Euclidean, Mahalanobis or Energy-based methodology). The system was evaluated by means of k-fold cross-validation yielding accuracy, sensitivity and specificity values of 92.78%, 91.07% and 95.12% (for SPECT) and 90.67%, 88% and 93.33% (for PET), respectively, when a NMSE-PLS-LMNN feature extraction method was used in combination with a SVM classifier, thus outperforming recently reported baseline methods. Conclusions: All the proposed methods turned out to be a valid solution for the presented problem. 
One of the advances is the robustness of the LMNN algorithm that not only provides higher separation rate between the classes but it also makes (in combination with NMSE and PLS) this rate variation more stable. In addition, their generalization ability is another advance since several experiments were performed on two image modalities (SPECT and PET). PMID:22849649
Effective diagnosis of Alzheimer's disease by means of large margin-based methodology.
Chaves, Rosa; Ramírez, Javier; Górriz, Juan M; Illán, Ignacio A; Gómez-Río, Manuel; Carnero, Cristobal
2012-07-31
Functional brain images such as Single-Photon Emission Computed Tomography (SPECT) and Positron Emission Tomography (PET) have been widely used to guide clinicians in the diagnosis of Alzheimer's Disease (AD). However, the subjectivity involved in their evaluation has favoured the development of Computer Aided Diagnosis (CAD) Systems. A novel combination of feature extraction techniques is proposed to improve the diagnosis of AD. Firstly, Regions of Interest (ROIs) are selected by means of a t-test carried out on 3D Normalised Mean Square Error (NMSE) features restricted to be located within a predefined brain activation mask. In order to address the small sample-size problem, the dimension of the feature space was further reduced by: Large Margin Nearest Neighbours using a rectangular matrix (LMNN-RECT), Principal Component Analysis (PCA) or Partial Least Squares (PLS) (the two latter also analysed with a LMNN transformation). Regarding the classifiers, kernel Support Vector Machines (SVMs) and LMNN using Euclidean, Mahalanobis and Energy-based metrics were compared. Several experiments were conducted in order to evaluate the proposed LMNN-based feature extraction algorithms and their benefits as: i) linear transformation of the PLS or PCA reduced data, ii) feature reduction technique, and iii) classifier (with Euclidean, Mahalanobis or Energy-based methodology). The system was evaluated by means of k-fold cross-validation yielding accuracy, sensitivity and specificity values of 92.78%, 91.07% and 95.12% (for SPECT) and 90.67%, 88% and 93.33% (for PET), respectively, when a NMSE-PLS-LMNN feature extraction method was used in combination with a SVM classifier, thus outperforming recently reported baseline methods. All the proposed methods turned out to be a valid solution for the presented problem. 
One of the advances is the robustness of the LMNN algorithm that not only provides higher separation rate between the classes but it also makes (in combination with NMSE and PLS) this rate variation more stable. In addition, their generalization ability is another advance since several experiments were performed on two image modalities (SPECT and PET).
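The accuracy, sensitivity and specificity values reported for the CAD system follow from the confusion-matrix counts of a binary classifier. The sketch below shows the standard definitions; the counts are hypothetical and are not the study's data.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts.

    tp/fn: patients correctly / wrongly classified;
    tn/fp: controls correctly / wrongly classified.
    """
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity

# Hypothetical counts pooled over the test folds of a k-fold cross-validation.
accuracy, sensitivity, specificity = classification_metrics(tp=45, fn=5, tn=38, fp=2)
```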
NASA Astrophysics Data System (ADS)
Pérez-Rodríguez, Marta; Horák-Terra, Ingrid; Rodríguez-Lado, Luis; Martínez Cortizas, Antonio
2016-11-01
Despite its potential, infrared spectroscopy combined with multivariate statistics has seldom been used to model peat properties with environmental value, such as the concentration of potentially toxic metals. In this research, we applied attenuated total reflectance (ATR) Fourier-Transform Infrared (FTIR) spectroscopy to evaluate the ability of the technique to predict mercury concentrations in late-Pleistocene/Holocene peat from a minerogenic peatland from Minas Gerais (Brazil). Mercury concentrations were analysed using a Milestone DMA-80 analyzer and attenuated total reflectance FTIR-ATR was performed using a Gladi-ATR (Pike Technologies) in the mid IR spectrum (4000-400 cm−1). Concentrations were modelled using principal components (PCR) and partial least squares regression (PLS). The performance of the models varied between moderate and very good (R2 0.67-0.90), with low RMSD values (0.35-1.06). A PLS model based on three latent vectors (LV1 to LV3) provided the best (R2 0.90, RMSD 0.35) results. LV1 reflected total organic matter content versus mineral matter (mainly quartz from local fluxes), LV2 was related to dust deposition from regional sources, and LV3 reflected peat organic matter decomposition. Compared to a previous investigation based on geochemical data, the spectroscopy-based PLS model performed better, but it has to be complemented with additional data (such as δ13C ratios) to reliably reproduce the changes of the factors controlling mercury accumulation over time. This time- and cost-effective methodology may help to develop multi-core approaches to study the within and between mire (of a similar type and area) variability in mercury accumulation, and probably also other peat properties.
Fig. S2. Loading weights of the three and two significant components from the direct (dPCR) and transposed (trPCR) PCR models.
Fig. S3. Depth records of the cumulative effects of the factors involved in the variation of mercury concentrations. Left, MIR-PLS model; centre, MIR-PLS + δ13C data model; right, geochemical model from Pérez-Rodríguez et al. [44].
Chang, Xiangwei; Zhang, Juanjuan; Li, Dekun; Zhou, Dazheng; Zhang, Yuling; Wang, Jincheng; Hu, Bing; Ju, Aichun; Ye, Zhengliang
2017-07-15
The adulteration or falsification of the cultivation age of mountain cultivated ginseng (MCG) has been a serious problem in the commercial MCG market. To develop an efficient discrimination tool for the cultivation age and to explore potential age-dependent markers, an optimized ultra high-performance liquid chromatography/quadrupole time-of-flight mass spectrometry (UHPLC/QTOF-MS)-based metabolomics approach was applied in the global metabolite profiling of 156 MCG leaf (MGL) samples aged from 6 to 18 years. Multivariate statistical methods such as principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) were used to compare the derived patterns between MGL samples of different cultivation ages. The present study demonstrated that 6-18-year-old MGL samples can be successfully discriminated using two simple successive steps, together with four PLS-DA discrimination models. Furthermore, 39 robust age-dependent markers enabling differentiation among the 6-18-year-old MGL samples were discovered. The results were validated by a permutation test and an external test set to verify the predictability and reliability of the established discrimination models. More importantly, without destroying the MCG roots, the proposed approach could also be applied to discriminate MCG root ages indirectly, using a minimum amount of homophyletic MGL samples combined with the established four PLS-DA models and identified markers. Additionally, to the best of our knowledge, this is the first study in which 6-18-year-old MCG root ages have been nondestructively differentiated by analyzing homophyletic MGL samples using UHPLC/QTOF-MS analysis and two simple successive steps together with four PLS-DA models. The method developed in this study can be used as a standard protocol for discriminating and predicting MGL ages directly and homophyletic MCG root ages indirectly. Copyright © 2017 Elsevier B.V. All rights reserved.
Rapid Analysis of Deoxynivalenol in Durum Wheat by FT-NIR Spectroscopy
De Girolamo, Annalisa; Cervellieri, Salvatore; Visconti, Angelo; Pascale, Michelangelo
2014-01-01
Fourier-transform-near infrared (FT-NIR) spectroscopy has been used to develop quantitative and classification models for the prediction of deoxynivalenol (DON) levels in durum wheat samples. Partial least-squares (PLS) regression analysis was used to determine DON in wheat samples in the range of <50–16,000 µg/kg DON. The model displayed a large root mean square error of prediction value (1,977 µg/kg) as compared to the EU maximum limit for DON in unprocessed durum wheat (i.e., 1,750 µg/kg), thus making the PLS approach unsuitable for quantitative prediction of DON in durum wheat. Linear discriminant analysis (LDA) was successfully used to differentiate wheat samples based on their DON content. A first approach used LDA to group wheat samples into three classes: A (DON ≤ 1,000 µg/kg), B (1,000 < DON ≤ 2,500 µg/kg), and C (DON > 2,500 µg/kg) (LDA I). A second approach was used to discriminate highly contaminated wheat samples based on three different cut-off limits, namely 1,000 (LDA II), 1,200 (LDA III) and 1,400 µg/kg DON (LDA IV). The overall classification and false compliant rates for the three models were 75%–90% and 3%–7%, respectively, with model LDA IV using a cut-off of 1,400 µg/kg fulfilling the requirement of the European official guidelines for screening methods. These findings confirmed the suitability of FT-NIR to screen a large number of wheat samples for DON contamination and to verify the compliance with EU regulation. PMID:25384107
Rapid analysis of deoxynivalenol in durum wheat by FT-NIR spectroscopy.
De Girolamo, Annalisa; Cervellieri, Salvatore; Visconti, Angelo; Pascale, Michelangelo
2014-11-06
Fourier-transform-near infrared (FT-NIR) spectroscopy has been used to develop quantitative and classification models for the prediction of deoxynivalenol (DON) levels in durum wheat samples. Partial least-squares (PLS) regression analysis was used to determine DON in wheat samples in the range of <50-16,000 µg/kg DON. The model displayed a large root mean square error of prediction value (1,977 µg/kg) as compared to the EU maximum limit for DON in unprocessed durum wheat (i.e., 1,750 µg/kg), thus making the PLS approach unsuitable for quantitative prediction of DON in durum wheat. Linear discriminant analysis (LDA) was successfully used to differentiate wheat samples based on their DON content. A first approach used LDA to group wheat samples into three classes: A (DON ≤ 1,000 µg/kg), B (1,000 < DON ≤ 2,500 µg/kg), and C (DON > 2,500 µg/kg) (LDA I). A second approach was used to discriminate highly contaminated wheat samples based on three different cut-off limits, namely 1,000 (LDA II), 1,200 (LDA III) and 1,400 µg/kg DON (LDA IV). The overall classification and false compliant rates for the three models were 75%-90% and 3%-7%, respectively, with model LDA IV using a cut-off of 1,400 µg/kg fulfilling the requirement of the European official guidelines for screening methods. These findings confirmed the suitability of FT-NIR to screen a large number of wheat samples for DON contamination and to verify the compliance with EU regulation.
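The class limits and cut-offs quoted in this abstract translate directly into simple threshold rules. The sketch below encodes the three LDA I classes and a generic cut-off check on reference DON values; the function names are hypothetical, and the LDA models themselves (which predict these labels from FT-NIR spectra) are not reproduced here.

```python
def don_class(don_ug_per_kg: float) -> str:
    """Assign the LDA I class from a reference DON content in µg/kg."""
    if don_ug_per_kg <= 1000:
        return "A"   # DON <= 1,000 µg/kg
    if don_ug_per_kg <= 2500:
        return "B"   # 1,000 < DON <= 2,500 µg/kg
    return "C"       # DON > 2,500 µg/kg

def exceeds_cutoff(don_ug_per_kg: float, cutoff: float = 1400.0) -> bool:
    """True if a sample counts as highly contaminated at the given cut-off.

    The default of 1,400 µg/kg is the LDA IV cut-off from the abstract.
    """
    return don_ug_per_kg > cutoff

# Hypothetical reference values, for illustration only.
labels = [don_class(v) for v in (40.0, 1000.0, 1750.0, 16000.0)]
```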
Philip Ye, X; Liu, Lu; Hayes, Douglas; Womac, Alvin; Hong, Kunlun; Sokhansanj, Shahab
2008-10-01
The objectives of this research were to determine the variation of chemical composition across botanical fractions of cornstover, and to probe the potential of Fourier transform near-infrared (FT-NIR) techniques in qualitatively classifying separated cornstover fractions and in quantitatively analyzing chemical compositions of cornstover by developing calibration models to predict chemical compositions of cornstover based on FT-NIR spectra. Large variations of cornstover chemical composition for wide calibration ranges, which is required by a reliable calibration model, were achieved by manually separating the cornstover samples into six botanical fractions, and their chemical compositions were determined by conventional wet chemical analyses, which proved that chemical composition varies significantly among different botanical fractions of cornstover. Different botanic fractions, having total saccharide content in descending order, are husk, sheath, pith, rind, leaf, and node. Based on FT-NIR spectra acquired on the biomass, classification by Soft Independent Modeling of Class Analogy (SIMCA) was employed to conduct qualitative classification of cornstover fractions, and partial least square (PLS) regression was used for quantitative chemical composition analysis. SIMCA was successfully demonstrated in classifying botanical fractions of cornstover. The developed PLS model yielded root mean square error of prediction (RMSEP %w/w) of 0.92, 1.03, 0.17, 0.27, 0.21, 1.12, and 0.57 for glucan, xylan, galactan, arabinan, mannan, lignin, and ash, respectively. The results showed the potential of FT-NIR techniques in combination with multivariate analysis to be utilized by biomass feedstock suppliers, bioethanol manufacturers, and bio-power producers in order to better manage bioenergy feedstocks and enhance bioconversion.
NASA Astrophysics Data System (ADS)
Yan, Hong; Song, Xiangzhong; Tian, Kuangda; Chen, Yilin; Xiong, Yanmei; Min, Shungeng
2018-02-01
A novel method based on mid-infrared (MIR) spectroscopy, which enables the determination of chlorantraniliprole in abamectin within minutes, is proposed. We further evaluate the prediction ability of four wavelength selection methods: the bootstrapping soft shrinkage approach (BOSS), Monte Carlo uninformative variable elimination (MCUVE), genetic algorithm partial least squares (GA-PLS) and competitive adaptive reweighted sampling (CARS). The results showed that the BOSS method obtained the lowest root mean squared error of cross-validation (RMSECV) (0.0245) and root mean squared error of prediction (RMSEP) (0.0271), as well as the highest coefficient of determination of cross-validation (Q²cv) (0.9998) and coefficient of determination of the test set (Q²test) (0.9989), which demonstrated that mid-infrared spectroscopy can be used to detect chlorantraniliprole in abamectin conveniently. The results also indicate that a suitable wavelength selection method (here BOSS) is essential when conducting a component spectral analysis.
Darwish, Hany W; Hassan, Said A; Salem, Maissa Y; El-Zeany, Badr A
2016-02-05
Two advanced, accurate and precise chemometric methods are developed for the simultaneous determination of amlodipine besylate (AML) and atorvastatin calcium (ATV) in the presence of their acidic degradation products in tablet dosage forms. The first method was Partial Least Squares (PLS-1) and the second was Artificial Neural Networks (ANN). PLS was compared to ANN models with and without variable selection procedure (genetic algorithm (GA)). For proper analysis, a 5-factor 5-level experimental design was established resulting in 25 mixtures containing different ratios of the interfering species. Fifteen mixtures were used as calibration set and the other ten mixtures were used as validation set to validate the prediction ability of the suggested models. The proposed methods were successfully applied to the analysis of pharmaceutical tablets containing AML and ATV. The methods indicated the ability of the mentioned models to solve the highly overlapped spectra of the quinary mixture, yet using inexpensive and easy to handle instruments like the UV-VIS spectrophotometer. Copyright © 2015 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Thumanu, Kanjana; Tanthanuch, Waraporn; Ye, Danna; Sangmalee, Anawat; Lorthongpanich, Chanchao; Parnpai, Rangsun; Heraud, Philip
2011-05-01
Stem cell-based therapy for liver regeneration has been proposed to overcome the persistent shortage in the supply of suitable donor organs. A requirement for this to succeed is to find a rapid method to detect functional hepatocytes, differentiated from embryonic stem cells. We propose Fourier transform infrared (FTIR) microspectroscopy as a versatile method to identify the early and last stages of the differentiation process leading to the formation of hepatocytes. Using synchrotron-FTIR microspectroscopy, the means of identifying hepatocytes at the single-cell level is possible and explored. Principal component analysis and subsequent partial least-squares (PLS) discriminant analysis is applied to distinguish endoderm induction from hepatic progenitor cells and matured hepatocyte-like cells. The data are well modeled by PLS with endoderm induction, hepatic progenitor cells, and mature hepatocyte-like cells able to be discriminated with very high sensitivity and specificity. This method provides a practical tool to monitor endoderm induction and has the potential to be applied for quality control of cell differentiation leading to hepatocyte formation.
Chemometric studies on potential larvicidal compounds against Aedes aegypti.
Scotti, Luciana; Scotti, Marcus Tullius; Silva, Viviane Barros; Santos, Sandra Regina Lima; Cavalcanti, Sócrates C H; Mendonça, Francisco J B
2014-03-01
The mosquito Aedes aegypti (Diptera, Culicidae) is the vector of yellow and dengue fever. In this study, chemometric tools such as Principal Component Analysis (PCA), Consensus PCA (CPCA), and Partial Least Squares Regression (PLS) were applied to a set of fifty-five compounds active against Ae. aegypti larvae, which includes terpenes, cyclic alcohols, phenolic compounds, and their synthetic derivatives. The calculations were performed using the VolSurf+ program. CPCA analysis suggests that the descriptor blocks with the highest weights were SIZE/SHAPE, DRY, and H2O. The PCA was generated with 48 descriptors selected from these blocks. The scores plot showed good separation between more and less potent compounds. The first two PCs accounted for over 60% of the data variance. The best PLS model, after leave-one-out validation, exhibited q² = 0.679 and r² = 0.714. External prediction gave R² = 0.623. The independent variables having a hydrophobic profile were strongly correlated with the biological data. The interaction maps generated with the GRID force field showed that the most active compounds exhibit more interaction with the DRY probe.
Mello, Cesar; Ribeiro, Diórginis; Novaes, Fábio; Poppi, Ronei J
2005-10-01
Use of classical microbiological methods to differentiate bacteria that cause gastroenteritis is cumbersome but usually very efficient. The high cost of reagents and the time required for such identifications, approximately four days, could have serious consequences, however, mainly when the patients are children, the elderly, or adults with low resistance. The search for new methods enabling rapid and reagentless differentiation of these microorganisms is, therefore, extremely relevant. In this work the main microorganisms responsible for gastroenteritis, Escherichia coli, Salmonella choleraesuis, and Shigella flexneri, were studied. For each microorganism sixty different dispersions were prepared in physiological solution. The Raman spectra of these dispersions were recorded using a diode laser operating in the near infrared region. Partial least-squares (PLS) discriminant analysis was used to differentiate among the bacteria by use of their respective Raman spectra. This approach enabled correct classification of 100% of the bacteria evaluated and unknown samples from the clinical environment, in less time (approximately 10 h), by use of a low-cost, portable Raman spectrometer, which can be easily used in intensive care units and clinical environments.
Fernández de la Ossa, Mª Ángeles; Amigo, José Manuel; García-Ruiz, Carmen
2014-09-01
In this study, near infrared hyperspectral imaging (NIR-HSI) is used to provide a fast, non-contact, non-invasive and non-destructive method for the analysis of explosive residues on human handprints. For this purpose, classical explosives potentially used as part of improvised explosive devices (IEDs), such as ammonium nitrate, blackpowder, single- and double-base smokeless gunpowders and dynamite, were studied. Volunteers handled each of these explosives individually and then deposited their handprints on plastic sheets. A partial least squares discriminant analysis (PLS-DA) model was built to detect and classify the presence of explosive residues in handprints. High levels of sensitivity and specificity were obtained for the PLS-DA classification model created to identify ammonium nitrate, blackpowder, single- and double-base smokeless gunpowders and dynamite residues, allowing the development of a preliminary library and facilitating the direct and in situ detection of explosives by NIR-HSI. Consequently, this technique is shown to be a promising forensic tool for the detection of explosive residues and other related samples. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
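Sensitivity and specificity for one class of a multi-class classifier such as PLS-DA can be computed one-versus-rest, as in the sketch below. The class labels and predictions are invented for illustration and do not reproduce the study's results.

```python
def sensitivity_specificity(true_labels, predicted_labels, target):
    """One-vs-rest sensitivity and specificity for the class `target`."""
    tp = fn = fp = tn = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == target and guess == target:
            tp += 1          # target sample recognised
        elif truth == target:
            fn += 1          # target sample missed
        elif guess == target:
            fp += 1          # other sample wrongly assigned to target
        else:
            tn += 1          # other sample correctly kept out
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical pixel labels: AN = ammonium nitrate, BP = blackpowder,
# DYN = dynamite, none = clean handprint.
truth = ["AN", "AN", "BP", "BP", "DYN", "none", "none", "none"]
guess = ["AN", "BP", "BP", "BP", "DYN", "none", "AN", "none"]
sens_an, spec_an = sensitivity_specificity(truth, guess, "AN")
```

Repeating the call for each class gives the per-class figures that are normally tabulated for a PLS-DA model.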
Wang, Mei; Avula, Bharathi; Wang, Yan-Hong; Zhao, Jianping; Avonto, Cristina; Parcher, Jon F; Raman, Vijayasankar; Zweigenbaum, Jerry A; Wylie, Philip L; Khan, Ikhlas A
2014-01-01
As part of an ongoing research program on authentication, safety and biological evaluation of phytochemicals and dietary supplements, an in-depth chemical investigation of different types of chamomile was performed. A collection of chamomile samples including authenticated plants, commercial products and essential oils was analysed by GC/MS. Twenty-seven authenticated plant samples representing three types of chamomile, viz. German chamomile, Roman chamomile and Juhua, were analysed. This set of data was employed to construct a sample class prediction (SCP) model based on stepwise reduction of data dimensionality followed by principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA). The model was cross-validated with samples including authenticated plants and commercial products. The model demonstrated 100.0% accuracy for both recognition and prediction abilities. In addition, 35 commercial products and 11 essential oils purported to contain chamomile were subsequently predicted by the validated PLS-DA model. Furthermore, tentative identification of the marker compounds correlated with different types of chamomile was explored. Copyright © 2013 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Pecháček, Pavel; Stella, David; Keil, Petr; Kleisner, Karel
2014-12-01
The males of the Brimstone butterfly (Gonepteryx rhamni) have an ultraviolet pattern on the dorsal surfaces of their wings. Using geometric morphometrics, we have analysed correlations between environmental variables (climate, productivity) and shape variability of the ultraviolet pattern and the forewing in 110 male specimens of G. rhamni collected in the Palaearctic zone. To start with, we subjected the environmental variables to principal component analysis (PCA). The first PCA axis (precipitation, temperature, latitude) correlated significantly with shape variation of the ultraviolet patterns across the Palaearctic. Additionally, we performed two-block partial least squares (PLS) analysis to assess co-variation between intraspecific shape variation and the variation of 11 environmental variables. The first PLS axis explained 93% of variability and represented the effect of precipitation, temperature and latitude. Along this axis, we observed a systematic increase in the relative area of ultraviolet colouration with increasing temperature and precipitation and decreasing latitude. We conclude that the shape variation of ultraviolet patterns on the forewings of male Brimstones is correlated with large-scale environmental factors.
Visible micro-Raman spectroscopy for determining glucose content in beverage industry.
Delfino, I; Camerlingo, C; Portaccio, M; Ventura, B Della; Mita, L; Mita, D G; Lepore, M
2011-07-15
The potential of Raman spectroscopy with visible excitation as a tool for the quantitative determination of single components in food industry products was investigated, focusing on glucose content in commercial sport drinks. To this aim, micro-Raman spectra of four sport drinks were recorded in the 600-1600 cm⁻¹ wavenumber shift region, showing well-defined and separated vibrational fingerprints of the sugars they contain (glucose, fructose and sucrose). Taking advantage of the spectral separation of some characteristic peaks, glucose content was quantified by a multivariate statistical analysis based on the interval partial least squares (iPLS) approach. The iPLS model needed for the data analysis was built using aqueous glucose solutions of known concentration as calibration data. This model was then applied to the sport drink spectra and gave predicted glucose concentrations in good agreement with the values obtained by a biochemical assay. These results represent a significant step towards the development of a fast and simple method for on-line glucose quantification in products of the food and beverage industry. Copyright © 2011 Elsevier Ltd. All rights reserved.
Li, Min; Zhang, Lu; Yao, Xiaolong; Jiang, Xingyu
2017-01-01
The emerging membrane introduction mass spectrometry technique has been successfully used to detect benzene, toluene, ethyl benzene and xylene (BTEX), while overlapped spectra have unfortunately hindered its further application to the analysis of mixtures. Multivariate calibration, an efficient method to analyze mixtures, has been widely applied. In this paper, we compared univariate and multivariate analyses for quantification of the individual components of mixture samples. The results showed that the univariate analysis creates poor models with regression coefficients of 0.912, 0.867, 0.440 and 0.351 for BTEX, respectively. For multivariate analysis, a comparison to the partial-least squares (PLS) model shows that the orthogonal partial-least squares (OPLS) regression exhibits an optimal performance with regression coefficients of 0.995, 0.999, 0.980 and 0.976, favorable calibration parameters (RMSEC and RMSECV) and a favorable validation parameter (RMSEP). Furthermore, the OPLS exhibits a good recovery of 73.86 - 122.20% and relative standard deviation (RSD) of the repeatability of 1.14 - 4.87%. Thus, MIMS coupled with the OPLS regression provides an optimal approach for a quantitative BTEX mixture analysis in monitoring and predicting water pollution.
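The recovery and repeatability figures quoted above follow from two simple formulas, sketched here with invented replicate values. The sample standard deviation (n - 1 denominator) is assumed for the RSD; the study does not state which form was used.

```python
import statistics


def recovery_percent(measured, spiked):
    """Recovery: measured concentration as a percentage of the spiked amount."""
    if spiked == 0:
        raise ValueError("spiked concentration must be non-zero")
    return 100.0 * measured / spiked


def rsd_percent(replicates):
    """Relative standard deviation of replicate measurements, in percent."""
    return 100.0 * statistics.stdev(replicates) / statistics.mean(replicates)


# Hypothetical replicate determinations of one analyte in a spiked sample.
replicates = [9.8, 10.0, 10.2]
recovery = recovery_percent(measured=9.5, spiked=10.0)
repeatability = rsd_percent(replicates)
```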
A multiple hold-out framework for Sparse Partial Least Squares.
Monteiro, João M; Rao, Anil; Shawe-Taylor, John; Mourão-Miranda, Janaina
2016-09-15
Supervised classification machine learning algorithms may have limitations when studying brain diseases with heterogeneous populations, as the labels might be unreliable. More exploratory approaches, such as Sparse Partial Least Squares (SPLS), may provide insights into the brain's mechanisms by finding relationships between neuroimaging and clinical/demographic data. The identification of these relationships has the potential to improve the current understanding of disease mechanisms, refine clinical assessment tools, and stratify patients. SPLS finds multivariate associative effects in the data by computing pairs of sparse weight vectors, where each pair is used to remove its corresponding associative effect from the data by matrix deflation, before computing additional pairs. We propose a novel SPLS framework which selects the adequate number of voxels and clinical variables to describe each associative effect, and tests their reliability by fitting the model to different splits of the data. As a proof of concept, the approach was applied to find associations between grey matter probability maps and individual items of the Mini-Mental State Examination (MMSE) in a clinical sample with various degrees of dementia. The framework found two statistically significant associative effects between subsets of brain voxels and subsets of the questions/tasks. SPLS was compared with its non-sparse version (PLS). The use of projection deflation versus a classical PLS deflation was also tested in both PLS and SPLS. SPLS outperformed PLS, finding statistically significant effects and providing higher correlation values in hold-out data. Moreover, projection deflation provided better results. Copyright © 2016 The Author(s). Published by Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Toubar, Safaa S.; Hegazy, Maha A.; Elshahed, Mona S.; Helmy, Marwa I.
2016-06-01
In this work, resolution and quantitation of spectral signals are achieved by several univariate and multivariate techniques. The novel pure component contribution algorithm (PCCA), along with mean centering of ratio spectra (MCR) and the factor-based partial least squares (PLS) algorithm, was developed for the simultaneous determination of chlorzoxazone (CXZ), aceclofenac (ACF) and paracetamol (PAR) in their pure form and in recently co-formulated tablets. The PCCA method allows the determination of each drug at its λmax, whereas the mean centered values at 230, 302 and 253 nm were used for quantification of CXZ, ACF and PAR, respectively, by the MCR method. The PLS algorithm was applied as a multivariate calibration method. The three methods were successfully applied to the determination of CXZ, ACF and PAR in pure form and tablets. Good linear relationships were obtained in the ranges of 2-50, 2-40 and 2-30 μg mL⁻¹ for CXZ, ACF and PAR, respectively, by both PCCA and MCR, while the PLS model was built for the three compounds each in the range of 2-10 μg mL⁻¹. The results obtained from the proposed methods were statistically compared with those of a reported method. The PCCA and MCR methods were validated according to ICH guidelines, while the PLS method was validated by both cross-validation and an independent data set. The methods were found suitable for the determination of the studied drugs in bulk powder and tablets.
NASA Astrophysics Data System (ADS)
Chen, Hua-cai; Chen, Xing-dan; Lu, Yong-jun; Cao, Zhi-qiang
2006-01-01
Near infrared (NIR) reflectance spectroscopy was used to develop a fast determination method for total ginsenosides in ginseng (Panax ginseng) powder. The spectra were analyzed with the multiplicative signal correction (MSC) correlation method. The spectral regions best correlated with total ginsenoside content were 1660-1880 nm and 2230-2380 nm. NIR calibration models for ginsenosides were built with multiple linear regression (MLR), principal component regression (PCR) and partial least squares (PLS) regression. The results showed that the calibration model built with PLS combined with MSC and the optimal spectral region was the best one. The correlation coefficient and the root mean square error of calibration (RMSEC) of the best calibration model were 0.98 and 0.15%, respectively. The optimal spectral region for calibration was 1204-2014 nm. The results suggest that using NIR to rapidly determine the total ginsenoside content in ginseng powder is feasible.
Consistent Partial Least Squares Path Modeling via Regularization.
Jung, Sunho; Park, JaeHong
2018-01-01
Partial least squares (PLS) path modeling is a component-based structural equation modeling that has been adopted in social and psychological research due to its data-analytic capability and flexibility. A recent methodological advance is consistent PLS (PLSc), designed to produce consistent estimates of path coefficients in structural models involving common factors. In practice, however, PLSc may frequently encounter multicollinearity in part because it takes a strategy of estimating path coefficients based on consistent correlations among independent latent variables. PLSc has yet no remedy for this multicollinearity problem, which can cause loss of statistical power and accuracy in parameter estimation. Thus, a ridge type of regularization is incorporated into PLSc, creating a new technique called regularized PLSc. A comprehensive simulation study is conducted to evaluate the performance of regularized PLSc as compared to its non-regularized counterpart in terms of power and accuracy. The results show that our regularized PLSc is recommended for use when serious multicollinearity is present.
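The effect of a ridge penalty on collinear predictors can be seen in a two-predictor toy case. The sketch below solves (R + kI)b = c for standardized path coefficients, where R is the correlation matrix of two latent predictors and c holds their correlations with the outcome. This is generic ridge regression on invented correlations, not the authors' regularized PLSc estimator, and the function name and the ridge value are assumptions.

```python
def path_coefficients(r12, c1, c2, ridge=0.0):
    """Solve (R + ridge*I) b = c for two standardized predictors.

    r12:    correlation between the two latent predictors
    c1, c2: correlations of each predictor with the outcome
    ridge:  ridge constant added to the diagonal (0 gives ordinary LS)
    """
    d = 1.0 + ridge
    det = d * d - r12 * r12          # determinant of [[d, r12], [r12, d]]
    if abs(det) < 1e-12:
        raise ValueError("predictors are perfectly collinear")
    b1 = (d * c1 - r12 * c2) / det   # Cramer's rule for the 2x2 system
    b2 = (d * c2 - r12 * c1) / det
    return b1, b2


# Invented, highly collinear case: predictors correlate at 0.95.
unregularized = path_coefficients(0.95, 0.6, 0.5)
regularized = path_coefficients(0.95, 0.6, 0.5, ridge=0.2)
```

Without the penalty the near-singular matrix inflates the coefficients and flips the sign of the second one; the penalty shrinks both towards zero, which illustrates the stabilising behaviour the abstract attributes to regularization.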
Shimizu, Yu; Yoshimoto, Junichiro; Takamura, Masahiro; Okada, Go; Okamoto, Yasumasa; Yamawaki, Shigeto; Doya, Kenji
2017-01-01
In diagnostic applications of statistical machine learning methods to brain imaging data, common problems include data high-dimensionality and co-linearity, which often cause over-fitting and instability. To overcome these problems, we applied partial least squares (PLS) regression to resting-state functional magnetic resonance imaging (rs-fMRI) data, creating a low-dimensional representation that relates symptoms to brain activity and that predicts clinical measures. Our experimental results, based upon data from clinically depressed patients and healthy controls, demonstrated that PLS and its kernel variants provided significantly better prediction of clinical measures than ordinary linear regression. Subsequent classification using predicted clinical scores distinguished depressed patients from healthy controls with 80% accuracy. Moreover, loading vectors for latent variables enabled us to identify brain regions relevant to depression, including the default mode network, the right superior frontal gyrus, and the superior motor area. PMID:28700672
Comparison of 3 Methods for Identifying Dietary Patterns Associated With Risk of Disease
DiBello, Julia R.; Kraft, Peter; McGarvey, Stephen T.; Goldberg, Robert; Campos, Hannia
2008-01-01
Reduced rank regression and partial least-squares regression (PLS) are proposed alternatives to principal component analysis (PCA). Using all 3 methods, the authors derived dietary patterns in Costa Rican data collected on 3,574 cases and controls in 1994–2004 and related the resulting patterns to risk of first incident myocardial infarction. Four dietary patterns associated with myocardial infarction were identified. Factor 1, characterized by high intakes of lean chicken, vegetables, fruit, and polyunsaturated oil, was generated by all 3 dietary pattern methods and was associated with a significantly decreased adjusted risk of myocardial infarction (28%–46%, depending on the method used). PCA and PLS also each yielded a pattern associated with a significantly decreased risk of myocardial infarction (31% and 23%, respectively); this pattern was characterized by moderate intake of alcohol and polyunsaturated oil and low intake of high-fat dairy products. The fourth factor derived from PCA was significantly associated with a 38% increased risk of myocardial infarction and was characterized by high intakes of coffee and palm oil. Contrary to previous studies, the authors found PCA and PLS to produce more patterns associated with cardiovascular disease than reduced rank regression. The most effective method for deriving dietary patterns related to disease may vary depending on the study goals. PMID:18945692
Adedipe, Oluwatosin E; Johanningsmeier, Suzanne D; Truong, Van-Den; Yencho, G Craig
2016-03-02
This study investigated the ability of near-infrared spectroscopy (NIRS) to predict acrylamide content in French-fried potato. Potato flour spiked with acrylamide (50-8000 μg/kg) was used to determine if acrylamide could be accurately predicted in a potato matrix. French fries produced with various pretreatments and cook times (n = 84) and obtained from quick-service restaurants (n = 64) were used for model development and validation. Acrylamide was quantified using gas chromatography-mass spectrometry, and reflectance spectra (400-2500 nm) of each freeze-dried sample were captured on a Foss XDS Rapid Content Analyzer-NIR spectrometer. Partial least-squares (PLS) discriminant analysis and PLS regression modeling demonstrated that NIRS could accurately detect acrylamide content as low as 50 μg/kg in the model potato matrix. Prediction errors of 135 μg/kg (R² = 0.98) and 255 μg/kg (R² = 0.93) were achieved with the best PLS models for acrylamide prediction in Russet Norkotah French-fried potato and multiple samples of unknown varieties, respectively. The findings indicate that NIRS can be used as a screening tool in potato breeding and potato processing research to reduce acrylamide in the food supply.
Yao, Sen; Li, Tao; Liu, HongGao; Li, JieQing; Wang, YuanZhong
2018-04-01
Boletaceae mushrooms are wild-grown edible mushrooms distributed in Yunnan Province, China, that have high nutritional value, a delicious flavor and large economic value. Traceability is important for the authentication and quality assessment of Boletaceae mushrooms. In this study, UV-visible and Fourier transform infrared (FTIR) spectroscopies were applied in combination with chemometrics for the traceability of 247 Boletaceae mushroom samples. Compared with a single spectroscopy technique, the data fusion strategy clearly improved the classification performance of partial least squares discriminant analysis (PLS-DA) and grid-search support vector machine (GS-SVM) models, for both species and geographical origin traceability. In addition, the PLS-DA and GS-SVM models provided 100.00% accuracy for species traceability, with reliable evaluation parameters. For geographical origin traceability, the prediction accuracy of the PLS-DA model with data fusion was only 64.63%, whereas that of the GS-SVM model with data fusion was 100.00%. The results demonstrated that the data fusion of UV-visible and FTIR spectra combined with GS-SVM could provide a stronger synergistic effect for the traceability of Boletaceae mushrooms and has good generalization ability for the comprehensive quality control and evaluation of similar foods. © 2017 Society of Chemical Industry.
Dinç, Erdal; Ustündağ, Ozgür; Baleanu, Dumitru
2010-08-01
The sole use of isoniazid during treatment of tuberculosis gives rise to pyridoxine deficiency. Therefore, a combination of pyridoxine hydrochloride and isoniazid is used in pharmaceutical dosage form in tuberculosis treatment to reduce this side effect. In this study, two chemometric methods, partial least squares (PLS) and principal component regression (PCR), were applied to the simultaneous determination of pyridoxine (PYR) and isoniazid (ISO) in their tablets. A concentration training set of 20 binary mixtures of PYR and ISO in different combinations was randomly prepared in 0.1 M HCl. Both multivariate calibration models were constructed using the relationship between the concentration data matrix and the absorbance data matrix in the spectral region 200-330 nm. The accuracy and precision of the proposed chemometric methods were validated by analyzing synthetic mixtures containing the investigated drugs. The recoveries obtained by applying the PCR and PLS calibrations to the artificial mixtures were between 100.0 and 100.7%. Satisfactory results were obtained by applying the PLS and PCR methods to both artificial and commercial samples. These results support the use of the methods for quality control and routine analysis of marketed tablets containing PYR and ISO. Copyright © 2010 John Wiley & Sons, Ltd.
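The calibration idea described above, relating an absorbance matrix to a concentration vector through a few latent variables, can be sketched with a small NIPALS-style PLS1 routine. This is a generic textbook formulation written with the standard library only; it is not the software used in the study, and the "pure spectra" and concentrations are invented, noise-free values at four hypothetical wavelengths.

```python
import math


def pls1_fit(X, y, n_components):
    """Fit PLS1 on mean-centered data; returns (x_mean, y_mean, components)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # X residuals
    f = [v - y_mean for v in y]                                # y residuals
    components = []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:          # nothing left in X that covaries with y
            break
        w = [v / norm for v in w]                              # weight vector
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y by the part explained by this component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict y for one spectrum by replaying the deflation steps."""
    x_mean, y_mean, components = model
    e = [a - b for a, b in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in components:
        t = sum(a * b for a, b in zip(e, w))
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, load)]
    return y_hat


def mix(c1, c2, s1=(0.9, 0.5, 0.2, 0.1), s2=(0.1, 0.3, 0.6, 0.8)):
    """Beer-Lambert style additive spectrum of a binary mixture (invented)."""
    return [c1 * a + c2 * b for a, b in zip(s1, s2)]


training = [(1, 4), (2, 1), (3, 3), (4, 2), (5, 5)]   # (analyte 1, analyte 2)
X = [mix(c1, c2) for c1, c2 in training]
y = [c1 for c1, _ in training]                        # calibrate analyte 1
model = pls1_fit(X, y, n_components=2)
predicted = pls1_predict(model, mix(2.5, 3.5))
```

With noise-free two-component data, two latent variables are enough to recover the concentration of the first analyte in an unseen mixture; real spectra would need the number of components chosen by cross-validation.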
Song, Jingwei; He, Jiaying; Zhu, Menghua; Tan, Debao; Zhang, Yu; Ye, Song; Shen, Dingtao; Zou, Pengfei
2014-01-01
A simulated annealing (SA) based variable weighted forecast model is proposed to combine and weigh local chaotic model, artificial neural network (ANN), and partial least square support vector machine (PLS-SVM) to build a more accurate forecast model. The hybrid model was built and multistep ahead prediction ability was tested based on daily MSW generation data from Seattle, Washington, the United States. The hybrid forecast model was proved to produce more accurate and reliable results and to degrade less in longer predictions than three individual models. The average one-week step ahead prediction has been raised from 11.21% (chaotic model), 12.93% (ANN), and 12.94% (PLS-SVM) to 9.38%. Five-week average has been raised from 13.02% (chaotic model), 15.69% (ANN), and 15.92% (PLS-SVM) to 11.27%. PMID:25301508
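A weighted combination of several forecasts, with the weights tuned by simulated annealing, can be sketched as follows. The error measure (mean absolute percentage error), the cooling schedule, the step size and all data are assumptions made for illustration; the abstract does not specify them, and the three input series merely stand in for the chaotic, ANN and PLS-SVM forecasts.

```python
import math
import random


def mape(actual, forecast):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(a - f) / abs(a)
                       for a, f in zip(actual, forecast)) / len(actual)


def combine(forecasts, weights):
    """Weighted sum of the individual forecast series."""
    horizon = len(forecasts[0])
    return [sum(w * f[t] for w, f in zip(weights, forecasts))
            for t in range(horizon)]


def anneal_weights(actual, forecasts, steps=2000, t0=1.0, seed=0):
    """Search non-negative weights summing to 1 that minimise MAPE."""
    rng = random.Random(seed)
    k = len(forecasts)
    current = [1.0 / k] * k                      # start from equal weights
    cur_err = mape(actual, combine(forecasts, current))
    best, best_err = current[:], cur_err
    for step in range(steps):
        temp = t0 * (1.0 - step / steps) + 1e-9  # linear cooling (assumed)
        cand = [max(0.0, w + rng.gauss(0.0, 0.05)) for w in current]
        total = sum(cand)
        if total == 0.0:
            continue
        cand = [w / total for w in cand]         # renormalise to sum to 1
        err = mape(actual, combine(forecasts, cand))
        # Always accept improvements; accept worse moves with a
        # probability that shrinks as the temperature falls.
        if err < cur_err or rng.random() < math.exp((cur_err - err) / temp):
            current, cur_err = cand, err
            if err < best_err:
                best, best_err = cand[:], err
    return best, best_err


# Invented weekly waste-generation figures and three model forecasts.
actual = [100.0, 110.0, 105.0, 120.0, 115.0]
forecasts = [
    [95.0, 112.0, 100.0, 126.0, 110.0],
    [108.0, 104.0, 111.0, 113.0, 121.0],
    [101.0, 118.0, 98.0, 122.0, 109.0],
]
weights, combined_error = anneal_weights(actual, forecasts)
equal_error = mape(actual, combine(forecasts, [1.0 / 3] * 3))
```

Because the search keeps the best weights seen and starts from equal weights, the returned error can never be worse than that of a plain average of the three models.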
Gomes, Adriano de Araújo; Alcaraz, Mirta Raquel; Goicoechea, Hector C; Araújo, Mario Cesar U
2014-02-06
In this work the Successive Projection Algorithm is presented for intervals selection in N-PLS for three-way data modeling. The proposed algorithm combines noise-reduction properties of PLS with the possibility of discarding uninformative variables in SPA. In addition, second-order advantage can be achieved by the residual bilinearization (RBL) procedure when an unexpected constituent is present in a test sample. For this purpose, SPA was modified in order to select intervals for use in trilinear PLS. The ability of the proposed algorithm, namely iSPA-N-PLS, was evaluated on one simulated and two experimental data sets, comparing the results to those obtained by N-PLS. In the simulated system, two analytes were quantitated in two test sets, with and without unexpected constituent. In the first experimental system, the determination of the four fluorophores (l-phenylalanine; l-3,4-dihydroxyphenylalanine; 1,4-dihydroxybenzene and l-tryptophan) was conducted with excitation-emission data matrices. In the second experimental system, quantitation of ofloxacin was performed in water samples containing two other uncalibrated quinolones (ciprofloxacin and danofloxacin) by high performance liquid chromatography with UV-vis diode array detector. For comparison purpose, a GA algorithm coupled with N-PLS/RBL was also used in this work. In most of the studied cases iSPA-N-PLS proved to be a promising tool for selection of variables in second-order calibration, generating models with smaller RMSEP, when compared to both the global model using all of the sensors in two dimensions and GA-NPLS/RBL. Copyright © 2013 Elsevier B.V. All rights reserved.
Lafuente, Victoria; Herrera, Luis J; Pérez, María del Mar; Val, Jesús; Negueruela, Ignacio
2015-08-15
In this work, near infrared spectroscopy (NIR) and an acoustic measure (AWETA), two non-destructive methods, were applied to Prunus persica fruit 'Calrico' (n = 260) to predict Magness-Taylor (MT) firmness. Separate and combined use of these measures was evaluated and compared using partial least squares (PLS) and least squares support vector machine (LS-SVM) regression methods. In addition, a mutual-information-based variable selection method, which seeks the most significant variables to produce optimal accuracy of the regression models, was applied to a joint set of variables (NIR wavelengths and AWETA measure). The newly proposed combined NIR-AWETA model gave good values of the determination coefficient (R²) for the PLS and LS-SVM methods (0.77 and 0.78, respectively), improving the reliability of MT firmness prediction in comparison with separate NIR and AWETA predictions. The three variables selected by the variable selection method (AWETA measure plus NIR wavelengths 675 and 697 nm) achieved R² values of 0.76 and 0.77 for PLS and LS-SVM, respectively. These results indicated that the proposed mutual-information-based variable selection algorithm was a powerful tool for the selection of the most relevant variables. © 2014 Society of Chemical Industry.
Chen, Ru-huang; Jin, Gang
2015-08-01
This paper presents an application of mid-infrared (MIR), near-infrared (NIR) and Raman spectroscopies to collect the spectra of 31 kinds of low density polyethylene/polypropylene (LDPE/PP) samples with different proportions. Different pre-processing methods (multiplicative scatter correction, mean centering and Savitzky-Golay first derivative) and spectral regions were explored to develop partial least-squares (PLS) models for LDPE, and their influence on the accuracy of the PLS models is discussed. The three spectroscopies were compared in terms of the accuracy of quantitative measurement. The pre-processing method and spectral region have a great impact on the accuracy of the PLS model, especially for spectra with subtle differences, random noise and baseline variation. After pre-processing and spectral region selection, the calibration models for MIR, NIR and Raman exhibited R²/RMSEC values of 0.9906/2.941, 0.9973/1.561 and 0.9972/1.598, respectively, compared with 0.8876/10.15, 0.8493/11.75 and 0.8757/10.67 before any treatment. The results also suggest that MIR, NIR and Raman are three strong tools for predicting the content of LDPE in LDPE/PP blends. However, NIR and Raman showed higher accuracy after pre-processing and are more suitable for fast quantitative characterization owing to their high measuring speed.
Pan, Yu; Zhang, Ji; Li, Hong; Wang, Yuan-Zhong; Li, Wan-Yi
2016-10-01
Macamides with a benzylalkylamide nucleus are characteristic and major bioactive compounds in the functional food maca (Lepidium meyenii Walp). The aim of this study was to explore variations in macamide content among maca from China and Peru. Twenty-seven batches of maca hypocotyls with different phenotypes, sampled from different geographical origins, were extracted and profiled by liquid chromatography with ultraviolet detection/tandem mass spectrometry (LC-UV/MS/MS). Twelve macamides were identified by MS operated in multiple scanning modes. Similarity analysis showed that maca samples differed significantly in their macamide fingerprinting. Partial least squares discriminant analysis (PLS-DA) was used to differentiate samples according to their geographical origin and to identify the most relevant variables in the classification model. The prediction accuracy for raw maca was 91% and five macamides were selected and considered as chemical markers for sample classification. When combined with a PLS-DA model, characteristic fingerprinting based on macamides could be recommended for labelling for the authentication of maca from different geographical origins. The results provided potential evidence for the relationships between environmental or other factors and distribution of macamides. © 2016 Society of Chemical Industry.
Lourenço, Vera; Herdling, Thorsten; Reich, Gabriele; Menezes, José C; Lochmann, Dirk
2011-08-01
A set of 192 fluid bed granulation batches at industrial scale were in-line monitored using microwave resonance technology (MRT) to determine moisture, temperature and density of the granules. Multivariate data analysis techniques such as multiway partial least squares (PLS), multiway principal component analysis (PCA) and multivariate batch control charts were applied onto collected batch data sets. The combination of all these techniques, along with off-line particle size measurements, led to significantly increased process understanding. A seasonality effect could be put into evidence that impacted further processing through its influence on the final granule size. Moreover, it was demonstrated by means of a PLS that a relation between the particle size and the MRT measurements can be quantitatively defined, highlighting a potential ability of the MRT sensor to predict information about the final granule size. This study has contributed to improve a fluid bed granulation process, and the process knowledge obtained shows that the product quality can be built in process design, following Quality by Design (QbD) and Process Analytical Technology (PAT) principles. Copyright © 2011. Published by Elsevier B.V.
Zhang, Ji; Li, Bing; Wang, Qi; Wei, Xin; Feng, Weibo; Chen, Yijiu; Huang, Ping; Wang, Zhenyuan
2017-12-21
Postmortem interval (PMI) evaluation remains a challenge in the forensic community due to the lack of efficient methods. Studies have focused on chemical analysis of biofluids for PMI estimation; however, no reports using spectroscopic methods in pericardial fluid (PF) are available. In this study, Fourier transform infrared (FTIR) spectroscopy with attenuated total reflectance (ATR) accessory was applied to collect comprehensive biochemical information from rabbit PF at different PMIs. The PMI-dependent spectral signature was determined by two-dimensional (2D) correlation analysis. The partial least square (PLS) and nu-support vector machine (nu-SVM) models were then established based on the acquired spectral dataset. Spectral variables associated with amide I, amide II, COO-, C-H bending, and C-O or C-OH vibrations arising from proteins, polypeptides, amino acids and carbohydrates, respectively, were susceptible to PMI in 2D correlation analysis. Moreover, the nu-SVM model appeared to achieve a more satisfactory prediction than the PLS model in calibration; the reliability of both models was determined in an external validation set. The study shows the possibility of application of ATR-FTIR methods in postmortem interval estimation using PF samples.
Basatnia, Nabee; Hossein, Seyed Abbas; Rodrigo-Comino, Jesús; Khaledian, Yones; Brevik, Eric C; Aitkenhead-Peterson, Jacqueline; Natesan, Usha
2018-04-29
Coastal lagoon ecosystems are vulnerable to eutrophication, which leads to the accumulation of nutrients from the surrounding watershed over the long term. However, there is a lack of information about methods that could accurately quantify this problem in rapidly developed countries. Therefore, various statistical methods such as cluster analysis (CA), principal component analysis (PCA), partial least square (PLS), principal component regression (PCR), and ordinary least squares regression (OLS) were used in this study to estimate total organic matter content in sediments (TOM) using other parameters such as temperature, dissolved oxygen (DO), pH, electrical conductivity (EC), nitrite (NO2), nitrate (NO3), biological oxygen demand (BOD), phosphate (PO4), total phosphorus (TP), salinity, and water depth along a 3-km transect in the Gomishan Lagoon (Iran). Results indicated that nutrient concentration and the dissolved oxygen gradient were the most significant parameters in the lagoon water quality heterogeneity. Additionally, anoxia at the bottom of the lagoon in sediments and re-suspension of the sediments were the main factors affecting internal nutrient loading. To validate the models, R2, RMSECV, and RPDCV were used. The PLS model was stronger than the other models. Also, classification analysis of the Gomishan Lagoon identified two hydrological zones: (i) a North Zone characterized by higher water exchange, higher dissolved oxygen and lower salinity and nutrients, and (ii) a Central and South Zone with high residence time, higher nutrient concentrations, lower dissolved oxygen, and higher salinity. A recommendation for the management of coastal lagoons, specifically the Gomishan Lagoon, to decrease or eliminate nutrient loadings is discussed and should be transferred to policy makers, the scientific community, and local inhabitants.
Marzetti, Emanuele; Landi, Francesco; Marini, Federico; Cesari, Matteo; Buford, Thomas W.; Manini, Todd M.; Onder, Graziano; Pahor, Marco; Bernabei, Roberto; Leeuwenburgh, Christiaan; Calvani, Riccardo
2014-01-01
Background: Chronic, low-grade inflammation and declining physical function are hallmarks of the aging process. However, previous attempts to correlate individual inflammatory biomarkers with physical performance in older people have produced mixed results. Given the complexity of the inflammatory response, the simultaneous analysis of an array of inflammatory mediators may provide more insights into the relationship between inflammation and age-related physical function decline. This study was designed to explore the association between a panel of inflammatory markers and physical performance in older adults through a multivariate statistical approach. Methods: Community-dwelling older persons were categorized into “normal walkers” (NWs; n = 27) or “slow walkers” (SWs; n = 11) groups using 0.8 m s−1 as the 4-m gait speed cutoff. A panel of 14 circulating inflammatory biomarkers was assayed by multiplex analysis. Partial least squares-discriminant analysis (PLS-DA) was used to identify patterns of inflammatory mediators associated with gait speed categories. Results: The optimal complexity of the PLS-DA model was found to be five latent variables. The proportion of correct classification was 88.9% for NW subjects (74.1% in cross-validation) and 90.9% for SW individuals (81.8% in cross-validation). Discriminant biomarkers in the model were interleukin 8, myeloperoxidase, and tumor necrosis factor alpha (all higher in the SW group), and P-selectin, interferon gamma, and granulocyte–macrophage colony-stimulating factor (all higher in the NW group). Conclusion: Distinct profiles of circulating inflammatory biomarkers characterize older subjects with different levels of physical performance. The dissection of these patterns may provide novel insights into the role played by inflammation in the disabling cascade and possible new targets for interventions. PMID:25593902
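The classification rates in this abstract can be read back into subject counts, given the stated group sizes (27 normal walkers, 11 slow walkers). The sketch below is only an arithmetic check; the counts it recovers are inferred from the reported percentages and are not stated in the source, and the helper names are hypothetical.

```python
def rate_percent(correct, total):
    """Correct-classification rate as a percentage, rounded to one decimal."""
    return round(100.0 * correct / total, 1)


def matching_counts(reported_percent, total):
    """All counts of correctly classified subjects consistent with a reported rate."""
    return [k for k in range(total + 1)
            if rate_percent(k, total) == reported_percent]


# Inferred counts (assumption: percentages were rounded to one decimal place).
nw_fit = matching_counts(88.9, 27)   # normal walkers, model fit
nw_cv = matching_counts(74.1, 27)    # normal walkers, cross-validation
sw_fit = matching_counts(90.9, 11)   # slow walkers, model fit
sw_cv = matching_counts(81.8, 11)    # slow walkers, cross-validation
```

With groups this small, one subject changes the slow-walker rate by about nine percentage points, which is worth keeping in mind when comparing the fitted and cross-validated figures.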
Henrique, C M; Teófilo, R F; Sabino, L; Ferreira, M M C; Cereda, M P
2007-05-01
Cassava starches are widely used in the production of biodegradable films, but their resistance to humidity migration is very low. In this work, commercial cassava starch films were studied and classified according to their physicochemical properties. A nondestructive method for water vapor permeability determination, which combines with infrared spectroscopy and multivariate calibration, is also presented. The following commercial cassava starches were studied: pregelatinized (amidomax 3550), carboxymethylated starch (CMA) of low and high viscosities, and esterified starches. To make the films, 2 different starch concentrations were evaluated, consisting of water suspensions with 3% and 5% starch. The filmogenic solutions were dried and characterized for their thickness, grammage, water vapor permeability, water activity, tensile strength (deformation force), water solubility, and puncture strength (deformation). The minimum thicknesses were 0.5 to 0.6 mm in pregelatinized starch films. The results were treated by means of the following chemometric methods: principal component analysis (PCA) and partial least squares (PLS) regression. PCA analysis on the physicochemical properties of the films showed that the differences in concentration of the dried material (3% and 5% starch) and also in the type of starch modification were mainly related to the following properties: permeability, solubility, and thickness. IR spectra collected in the region of 4000 to 600 cm(-1) were used to build a PLS model with good predictive power for water vapor permeability determination, with mean relative errors of 10.0% for cross-validation and 7.8% for the prediction set.
NASA Astrophysics Data System (ADS)
Mabood, Fazal; Boqué, Ricard; Folcarelli, Rita; Busto, Olga; Al-Harrasi, Ahmed; Hussain, Javid
2015-05-01
We have investigated the effect of thermal treatment on the discrimination of pure extra virgin olive oil (EVOO) samples from EVOO samples adulterated with sunflower oil. Two groups of samples were used. One group was analyzed at room temperature (25 °C) and the other group was thermally treated in a thermostatic water bath at 75 °C for 8 h, in contact with air and with light exposure, to favor oxidation. All samples were then measured with synchronous fluorescence spectroscopy. Fluorescence spectra were acquired by varying the excitation wavelength in the region from 250 to 720 nm. In order to optimize the differences between excitation and emission wavelengths, four constant differential wavelengths, i.e., 20 nm, 40 nm, 60 nm and 80 nm, were tried. Partial least-squares discriminant analysis (PLS-DA) was used to discriminate between pure and adulterated oils. It was found that the 20 nm difference was the optimal, at which the discrimination models showed the best results. The best PLS-DA models were those built with the difference spectra (75-25 °C), which were able to discriminate pure from adulterated oils at a 2% level of adulteration. Furthermore, PLS regression models were built to quantify the level of adulteration. Again, the best model was the one built with the difference spectra, with a prediction error of 1.75% of adulteration.
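The two data-handling ideas in this abstract, a synchronous scan at a constant excitation/emission offset and a difference spectrum between heated and room-temperature measurements, can be illustrated as follows. This is a hedged sketch: the function names are hypothetical, the 10 nm step is an assumption (the abstract does not give the scan increment), and the intensity values in any usage are placeholders rather than measured data.

```python
def synchronous_pairs(ex_start, ex_stop, step, delta):
    """(excitation, emission) wavelength pairs in nm for a synchronous scan,
    where emission is always excitation + delta."""
    return [(ex, ex + delta) for ex in range(ex_start, ex_stop + 1, step)]


def difference_spectrum(heated, room):
    """Point-by-point difference: heated-sample spectrum minus room-temperature spectrum."""
    if len(heated) != len(room):
        raise ValueError("spectra must be sampled at the same wavelengths")
    return [h - r for h, r in zip(heated, room)]


# Scan from 250 to 720 nm at the 20 nm offset reported as optimal.
# The 10 nm step is an assumed value for illustration only.
pairs = synchronous_pairs(250, 720, 10, 20)
```

The difference spectrum is the input the abstract reports as giving the best PLS-DA and PLS regression models.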
Pereira, Hebert Vinicius; Amador, Victória Silva; Sena, Marcelo Martins; Augusti, Rodinei; Piccin, Evandro
2016-10-12
Paper spray mass spectrometry (PS-MS) combined with partial least squares discriminant analysis (PLS-DA) was applied for the first time in a forensic context to a fast and effective differentiation of beers. Eight different brands of American standard lager beers produced by four different breweries (141 samples from 55 batches) were studied with the aim of differentiating them according to their market prices. The three leading brands in the Brazilian beer market, which have been subject to fraud, were modeled as the higher-price class, while the five brands most used for counterfeiting were modeled as the lower-price class. Parameters affecting the paper spray ionization were examined and optimized. The best MS signal stability and intensity were obtained in the positive ion mode, with PS(+) mass spectra characterized by intense pairs of signals corresponding to sodium and potassium adducts of malto-oligosaccharides. Discrimination was apparent neither by visual inspection nor by principal component analysis (PCA). However, supervised classification models provided high rates of sensitivity and specificity. A PLS-DA model using full scan mass spectra was improved by variable selection with ordered predictors selection (OPS), providing a 100% reliability rate and reducing the number of variables from 1701 to 60. This model was interpreted by detecting the fifteen variables with the most significant VIP (variable importance in projection) scores, which were therefore considered diagnostic ions for this type of beer counterfeit. Copyright © 2016 Elsevier B.V. All rights reserved.
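For reference, the VIP score mentioned here is usually defined as below. This is the common textbook form for a single-response PLS model, given as background rather than as the exact expression used by the authors; the symbols are assumptions introduced for this note: p is the number of variables, A the number of latent variables, w_a the weight vector, t_a the score vector and q_a the response loading of component a.

```latex
\mathrm{VIP}_j = \sqrt{\frac{p \sum_{a=1}^{A} SS_a \left( w_{ja} / \lVert w_a \rVert \right)^2}{\sum_{a=1}^{A} SS_a}},
\qquad SS_a = q_a^{2}\, t_a^{\top} t_a
```

Because the squared VIP scores average to one over all variables, values above one are conventionally read as more influential than average.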
Analysis of lard in meatball broth using Fourier transform infrared spectroscopy and chemometrics.
Kurniawati, Endah; Rohman, Abdul; Triyana, Kuwat
2014-01-01
Meatball is one of the favorite foods in Indonesia. For the economic reason (due to the price difference), the substitution of beef meat with pork can occur. In this study, FTIR spectroscopy in combination with chemometrics of partial least square (PLS) and principal component analysis (PCA) was used for analysis of pork fat (lard) in meatball broth. Lard in meatball broth was quantitatively determined at wavenumber region of 1018-1284 cm(-1). The coefficient of determination (R(2)) and root mean square error of calibration (RMSEC) values obtained were 0.9975 and 1.34% (v/v), respectively. Furthermore, the classification of lard and beef fat in meatball broth as well as in commercial samples was performed at wavenumber region of 1200-1000 cm(-1). The results showed that FTIR spectroscopy coupled with chemometrics can be used for quantitative analysis and classification of lard in meatball broth for Halal verification studies. The developed method is simple in operation, rapid and not involving extensive sample preparation. © 2013.
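The two calibration figures of merit quoted in this abstract, the coefficient of determination and the root mean square error of calibration, can be computed as sketched below. The function names are hypothetical, and dividing by n in RMSEC is an assumption: some chemometrics packages divide by n minus the number of latent variables minus one instead. The numbers in the example are made up for illustration and are not the study's data.

```python
from math import sqrt


def rmsec(y_true, y_pred):
    """Root mean square error of calibration (assumed denominator: n)."""
    n = len(y_true)
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    y_mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Illustrative values only (e.g. % v/v of an adulterant), not from the paper.
reference = [0.0, 10.0, 20.0, 30.0]
predicted = [1.0, 9.0, 21.0, 29.0]
error = rmsec(reference, predicted)        # 1.0
fit = r_squared(reference, predicted)      # 0.992
```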
Kortesniemi, Maaria; Vuorinen, Anssi L; Sinkkonen, Jari; Yang, Baoru; Rajala, Ari; Kallio, Heikki
2015-04-01
The oilseeds of the commercially important oilseed rape (Brassica napus) and turnip rape (Brassica rapa) were investigated with (1)H NMR metabolomics. The compositions of ripened (cultivated in field trials) and developing seeds (cultivated in controlled conditions) were compared in multivariate models using principal component analysis (PCA), partial least squares discriminant analysis (PLS-DA), and orthogonal partial least squares discriminant analysis (OPLS-DA). Differences in the major lipids and the minor metabolites between the two species were found. A higher content of polyunsaturated fatty acids and sucrose was observed in turnip rape, while the overall oil content and sinapine levels were higher in oilseed rape. The genotype traits were negligible compared to the effect of the growing site and concomitant conditions on the oilseed metabolome. This study demonstrates the applicability of NMR-based analysis in determining the species, geographical origin, developmental stage, and quality of oilseed Brassicas. Copyright © 2014 Elsevier Ltd. All rights reserved.
Ma, W; Zhang, T-F; Lu, P; Lu, S H
2014-01-01
Breast cancer is categorized into two broad groups: estrogen receptor positive (ER+) and ER negative (ER-) groups. A previous study proposed that under trastuzumab-based neoadjuvant chemotherapy, tumor initiating cell (TIC) featured ER- tumors respond better than ER+ tumors. Exploration of the molecular difference between these two groups may help in developing new therapeutic strategies, especially for ER- patients. With gene expression profiles from the Gene Expression Omnibus (GEO) database, we performed partial least squares (PLS) based analysis, which is more sensitive than common variance/regression analysis. We acquired 512 differentially expressed genes. Four pathways were found to be enriched with differentially expressed genes, involving immune system, metabolism and genetic information processing. Network analysis identified five hub genes with degrees higher than 10, including APP, ESR1, SMAD3, HDAC2, and PRKAA1. Our findings provide new understanding of the molecular difference between TIC featured ER- and ER+ breast tumors, with the hope of offering support for therapeutic studies.
Song, Seung Yeob; Lee, Young Koung; Kim, In-Jung
2016-01-01
A high-throughput screening system for Citrus lines with higher sugar and acid contents was established using Fourier transform infrared (FT-IR) spectroscopy in combination with multivariate analysis. FT-IR spectra confirmed typical spectral differences between the frequency regions of 950-1100 cm(-1), 1300-1500 cm(-1), and 1500-1700 cm(-1). Principal component analysis (PCA) and subsequent partial least square-discriminant analysis (PLS-DA) were able to discriminate five Citrus lines into three separate clusters corresponding to their taxonomic relationships. The quantitative predictive modeling of sugar and acid contents from Citrus fruits was established using partial least square regression algorithms from FT-IR spectra. The regression coefficients (R(2)) between predicted values and estimated sugar and acid content values were 0.99. These results demonstrate that by using FT-IR spectra and applying quantitative prediction modeling to Citrus sugar and acid contents, excellent Citrus lines can be detected early with greater accuracy. Copyright © 2015 Elsevier Ltd. All rights reserved.
Hegazy, Maha A; Lotfy, Hayam M; Mowaka, Shereen; Mohamed, Ekram Hany
2016-07-05
Wavelets have been adapted for a vast number of signal-processing applications due to the amount of information that can be extracted from a signal. In this work, a comparative study on the efficiency of continuous wavelet transform (CWT) as a signal processing tool in univariate regression and a pre-processing tool in multivariate analysis using partial least square (CWT-PLS) was conducted. These were applied to complex spectral signals of ternary and quaternary mixtures. The CWT-PLS method succeeded in the simultaneous determination of a quaternary mixture of drotaverine (DRO), caffeine (CAF), paracetamol (PAR) and p-aminophenol (PAP, the major impurity of paracetamol). The univariate CWT, in contrast, failed to determine the quaternary mixture components simultaneously and was able to determine only PAR and PAP, and the ternary mixtures of DRO, CAF, and PAR and of CAF, PAR, and PAP. During the calculations of CWT, different wavelet families were tested. The univariate CWT method was validated according to the ICH guidelines. For the development of the CWT-PLS model, a calibration set was prepared by means of an orthogonal experimental design and the absorption spectra were recorded and processed by CWT. The CWT-PLS model was constructed by regression between the wavelet coefficients and concentration matrices, and validation was performed by both cross validation and external validation sets. Both methods were successfully applied for determination of the studied drugs in pharmaceutical formulations. Copyright © 2016 Elsevier B.V. All rights reserved.
Paper Spray Mass Spectrometry for the Forensic Analysis of Black Ballpoint Pen Inks
NASA Astrophysics Data System (ADS)
Amador, Victoria Silva; Pereira, Hebert Vinicius; Sena, Marcelo Martins; Augusti, Rodinei; Piccin, Evandro
2017-09-01
This article describes the use of paper spray mass spectrometry (PS-MS) for the direct analysis of black ink writings made with ballpoint pens. The novel approach was developed in a forensic context by first performing the classification of commercially available ballpoint pens according to their brands. Six of the most commonly utilized brands worldwide (Bic, Paper Mate, Faber Castell, Pentel, Compactor, and Pilot) were differentiated according to their characteristic chemical patterns obtained by PS-MS. MS in the negative ion mode at a mass range of m/z 100-1000 allowed prompt discrimination just by visual inspection. On the other hand, the concept of relative ion intensity (RII) and the analysis at other mass ranges were necessary for the differentiation using the positive ion mode. PS-MS combined with partial least squares (PLS) was utilized to monitor changes in the ink chemical composition after light exposure (artificial aging studies). The PLS model was optimized by variable selection, which allowed the identification of the ions most influential in the degradation process. The feasibility of the method in forensic investigations was also demonstrated in three different applications: (1) analysis of overlapped fresh ink lines, (2) analysis of old inks from archived documents, and (3) detection of alterations (simulated forgeries) performed on archived documents.
Yang, Yan-Qin; Yin, Hong-Xu; Yuan, Hai-Bo; Jiang, Yong-Wen; Dong, Chun-Wang; Deng, Yu-Liang
2018-01-01
In the present work, a novel infrared-assisted extraction coupled to headspace solid-phase microextraction (IRAE-HS-SPME) followed by gas chromatography-mass spectrometry (GC-MS) was developed for rapid determination of the volatile components in green tea. The extraction parameters such as fiber type, sample amount, infrared power, extraction time, and infrared lamp distance were optimized by orthogonal experimental design. Under optimum conditions, a total of 82 volatile compounds in 21 green tea samples from different geographical origins were identified. Compared with classical water-bath heating, the proposed technique has remarkable advantages of considerably reducing the analytical time and high efficiency. In addition, an effective classification of green teas based on their volatile profiles was achieved by partial least square-discriminant analysis (PLS-DA) and hierarchical clustering analysis (HCA). Furthermore, the application of a dual criterion based on the variable importance in the projection (VIP) values of the PLS-DA models and on the category from one-way univariate analysis (ANOVA) allowed the identification of 12 potential volatile markers, which were considered to make the most important contribution to the discrimination of the samples. The results suggest that IRAE-HS-SPME/GC-MS technique combined with multivariate analysis offers a valuable tool to assess geographical traceability of different tea varieties.
Analysis of Flavonoid in Medicinal Plant Extract Using Infrared Spectroscopy and Chemometrics
Retnaningtyas, Yuni; Nuri; Lukman, Hilmia
2016-01-01
Infrared (IR) spectroscopy combined with chemometrics has been developed for simple analysis of flavonoid in the medicinal plant extract. Flavonoid was extracted from medicinal plant leaves by ultrasonication and maceration. IR spectra of selected medicinal plant extract were correlated with flavonoid content using chemometrics. The chemometric method used for calibration analysis was Partial Least Square (PLS) and the methods used for classification analysis were Linear Discriminant Analysis (LDA), Soft Independent Modelling of Class Analogies (SIMCA), and Support Vector Machines (SVM). In this study, the NIR model showed the best calibration, with R2 and RMSEC values of 0.9916499 and 2.1521897, respectively, while the accuracy of all classification models (LDA, SIMCA, and SVM) was 100%. R2 and RMSEC of calibration of the FTIR model were 0.8653689 and 8.8958149, respectively, while the accuracy of LDA, SIMCA, and SVM was 86.0%, 91.2%, and 77.3%, respectively. PLS and LDA of NIR models were further used to predict unknown flavonoid content in commercial samples. Using these models, the significance of flavonoid content that has been measured by NIR and UV-Vis spectrophotometry was evaluated with paired samples t-test. The flavonoid content that has been measured with both methods gave no significant difference. PMID:27529051
Yin, Hong-Xu; Yuan, Hai-Bo; Jiang, Yong-Wen; Dong, Chun-Wang; Deng, Yu-Liang
2018-01-01
In the present work, a novel infrared-assisted extraction coupled to headspace solid-phase microextraction (IRAE-HS-SPME) followed by gas chromatography-mass spectrometry (GC-MS) was developed for rapid determination of the volatile components in green tea. The extraction parameters such as fiber type, sample amount, infrared power, extraction time, and infrared lamp distance were optimized by orthogonal experimental design. Under optimum conditions, a total of 82 volatile compounds in 21 green tea samples from different geographical origins were identified. Compared with classical water-bath heating, the proposed technique has remarkable advantages of considerably reducing the analytical time and high efficiency. In addition, an effective classification of green teas based on their volatile profiles was achieved by partial least square-discriminant analysis (PLS-DA) and hierarchical clustering analysis (HCA). Furthermore, the application of a dual criterion based on the variable importance in the projection (VIP) values of the PLS-DA models and on the category from one-way univariate analysis (ANOVA) allowed the identification of 12 potential volatile markers, which were considered to make the most important contribution to the discrimination of the samples. The results suggest that IRAE-HS-SPME/GC-MS technique combined with multivariate analysis offers a valuable tool to assess geographical traceability of different tea varieties. PMID:29494626
A (1)H NMR-based metabonomic study on the SAMP8 and SAMR1 mice and the effect of electro-acupuncture.
Qiao-feng, Wu; Ling-ling, Guo; Shu-guang, Yu; Qi, Zhang; Sheng-feng, Lu; Fang, Zeng; Hai-yan, Yin; Yong, Tang; Xian-zhong, Yan
2011-10-01
A (1)H NMR-based metabonomic method was used to investigate the metabolic change of plasma in senescence-prone 8 (SAMP8) mice before and after electro-acupuncture (EA). Sixteen SAMP8 male mice (aged 8 months) were randomly divided into a model group and an acupuncture treatment group, and the latter group received EA treatment for 21 days. Eight senescence-resistant 1 (SAMR1) mice were used as the control group. The Morris water maze was used to evaluate the effects of EA. All mice plasma samples obtained from different groups were analyzed by using 600 MHz (1)H nuclear magnetic resonance ((1)H NMR) spectroscopy. The data sets were analyzed by Principal Components Analysis (PCA) and Partial Least Squares-Discriminant Analysis (PLS-DA) to discriminate the key plasma metabolites among different groups. Results indicated that both the escape and probe tasks of SAMP8 could be improved by EA treatment. Metabonomic study showed that SAMR1 and SAMP8 were separated clearly in both CPMG_OSC_PLS and LED_OSC_PLS score plots. Interestingly, samples obtained from the EA group were distributed closely to the SAMR1 group in the CPMG_OSC_PLS score plot, but away from the SAMP8 group in the LED_OSC_PLS score plot. Corresponding loading plots showed that much less lactate was seen in SAMP8 mice plasma. Other changes including higher levels of dimethylamine (DMA), choline and α-glucose but lower levels of leucine/isoleucine, HDL, LDL/VLDL, 3-Hydroxybutyrate (3-HB), and Trimethylamine N-oxide (TMAO) were observed in the SAMP8 mice plasma than in the SAMR1. After EA treatment, the levels of lactate, DMA, choline and TMAO were improved. Results of this work can provide valuable clues to the understanding of the metabolic changes in the senile impairment of mice. It is also hoped that the methodology can be used in evaluating the effects of EA and understanding the underlying acupuncture mechanism in treating neurodegenerative diseases. Copyright © 2011 Elsevier Inc. All rights reserved.
Bao, Yidan; Kong, Wenwen; Liu, Fei; Qiu, Zhengjun; He, Yong
2012-01-01
Amino acids are quite important indices to indicate the growth status of oilseed rape under herbicide stress. Near infrared (NIR) spectroscopy combined with chemometrics was applied for fast determination of glutamic acid in oilseed rape leaves. The optimal spectral preprocessing method was obtained after comparing Savitzky-Golay smoothing, standard normal variate, multiplicative scatter correction, first and second derivatives, detrending and direct orthogonal signal correction. Linear and nonlinear calibration methods were developed, including partial least squares (PLS) and least squares-support vector machine (LS-SVM). The most effective wavelengths (EWs) were determined by the successive projections algorithm (SPA), and these wavelengths were used as the inputs of PLS and LS-SVM model. The best prediction results were achieved by SPA-LS-SVM (Raw) model with correlation coefficient r = 0.9943 and root mean squares error of prediction (RMSEP) = 0.0569 for prediction set. These results indicated that NIR spectroscopy combined with SPA-LS-SVM was feasible for the fast and effective detection of glutamic acid in oilseed rape leaves. The selected EWs could be used to develop spectral sensors, and the important and basic amino acid data were helpful to study the function mechanism of herbicide. PMID:23203052
Hacisalihoglu, Gokhan; Larbi, Bismark; Settles, A Mark
2010-01-27
The objective of this study was to explore the potential of near-infrared reflectance (NIR) spectroscopy to determine individual seed composition in common bean ( Phaseolus vulgaris L.). NIR spectra and analytical measurements of seed weight, protein, and starch were collected from 267 individual bean seeds representing 91 diverse genotypes. Partial least-squares (PLS) regression models were developed with 61 bean accessions randomly assigned to a calibration data set and 30 accessions assigned to an external validation set. Protein gave the most accurate PLS regression, with the external validation set having a standard error of prediction (SEP) = 1.6%. PLS regressions for seed weight and starch had sufficient accuracy for seed sorting applications, with SEP = 41.2 mg and 4.9%, respectively. Seed color had a clear effect on the NIR spectra, with black beans having a distinct spectral type. Seed coat color did not impact the accuracy of PLS predictions. This research demonstrates that NIR is a promising technique for simultaneous sorting of multiple seed traits in single bean seeds with no sample preparation.
Land, Walker H; Heine, John J; Raway, Tom; Mizaku, Alda; Kovalchuk, Nataliya; Yang, Jack Y; Yang, Mary Qu
2008-01-01
The automated decision paradigms presented in this work address the false positive (FP) biopsy occurrence in diagnostic mammography. An EP/ES stochastic hybrid and two kernelized Partial Least Squares (K-PLS) paradigms were investigated with the following studies: methodology performance comparisons and automated diagnostic accuracy assessments with two data sets. The findings showed that the new hybrid produced comparable results more rapidly and that the new K-PLS paradigms train and operate essentially in real time for the data sets studied. Both advancements are essential components for eventually achieving the FP reduction goal, while maintaining acceptable diagnostic sensitivities.
Raway, Tom; Mizaku, Alda; Kovalchuk, Nataliya; Yang, Jack Y.; Yang, Mary Qu
2015-01-01
The automated decision paradigms presented in this work address the false positive (FP) biopsy occurrence in diagnostic mammography. An EP/ES stochastic hybrid and two kernelized Partial Least Squares (K-PLS) paradigms were investigated with the following studies: methodology performance comparisons and automated diagnostic accuracy assessments with two data sets. The findings showed that the new hybrid produced comparable results more rapidly and that the new K-PLS paradigms train and operate essentially in real time for the data sets studied. Both advancements are essential components for eventually achieving the FP reduction goal, while maintaining acceptable diagnostic sensitivities. PMID:26430470
Goicoechea, H C; Olivieri, A C
1999-08-01
The use of multivariate spectrophotometric calibration is presented for the simultaneous determination of the active components of tablets used in the treatment of pulmonary tuberculosis. The resolution of ternary mixtures of rifampicin, isoniazid and pyrazinamide has been accomplished by using partial least squares (PLS-1) regression analysis. Although the components show an important degree of spectral overlap, they have been simultaneously determined with high accuracy and precision, rapidly and with no need of nonaqueous solvents for dissolving the samples. No interference has been observed from the tablet excipients. A comparison is presented with the related multivariate method of classical least squares (CLS) analysis, which is shown to yield less reliable results due to the severe spectral overlap among the studied compounds. This is highlighted in the case of isoniazid, due to the small absorbances measured for this component.
The Extent and Prediction of Heavy Metal Pollution in Soils of Shahrood and Damghan, Iran.
Sakizadeh, Mohamad; Mirzaei, Rouhollah; Ghorbani, Hadi
2015-12-01
The levels of 12 heavy metals (Ag, Ba, Be, Cd, Co, Cr, Cu, Ni, Pb, Tl, V, Zn) were considered in 229 soil samples in Semnan Province, Iran. To discriminate between natural and anthropogenic inputs of heavy metals, factor analysis was used. Seven factors accounting for 90.5 % of the total variance were extracted. The mining and agricultural activities along with geogenic sources have been attributed as the main causes of the levels of heavy metals in the study area. Partial least squares regression was utilized to predict the level of the soil pollution index (SPI) from the concentrations of the 12 heavy metals. The eigenvectors from the first three PLS components represented more than 98 % of the overall variance. The correlation coefficient between the observed and predicted SPI was 0.99, indicating the high efficiency of this method. The resultant coefficient of determination for three PLS components was 0.984, confirming the predictive ability of this method.
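As a rough illustration of the PLS regression step described in this abstract, the sketch below fits a single-response PLS model (PLS1 with NIPALS-style deflation) in plain Python. The data are synthetic stand-ins for 12 metal concentrations and a pollution index; the function names `pls1_fit` and `pls1_predict`, the sample count, the weights, and the noise level are assumptions made for the example and are not taken from the study.

```python
import random

def pls1_fit(X, y, n_components):
    """Fit PLS1 by NIPALS deflation; X is a list of rows, y a list of floats."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                # centered y
    comps = []
    for _ in range(n_components):
        # Weight vector: direction in X-space with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = sum(v * v for v in w) ** 0.5
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, load, q))
    return x_mean, y_mean, comps

def pls1_predict(model, row):
    """Predict y for one sample by replaying the deflation steps."""
    x_mean, y_mean, comps = model
    e = [v - m for v, m in zip(row, x_mean)]
    y_hat = y_mean
    for w, load, q in comps:
        t = sum(a * b for a, b in zip(e, w))
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, load)]
    return y_hat

# Hypothetical data: 80 samples, 12 "metal" variables, noisy linear index.
random.seed(0)
n_samples, n_vars = 80, 12
X = [[random.uniform(0.0, 1.0) for _ in range(n_vars)] for _ in range(n_samples)]
weights = [0.5 + 0.1 * j for j in range(n_vars)]
y = [sum(wj * xj for wj, xj in zip(weights, row)) + random.gauss(0.0, 0.05)
     for row in X]

model = pls1_fit(X, y, n_components=3)
y_hat = [pls1_predict(model, row) for row in X]
y_bar = sum(y) / len(y)
ss_res = sum((a - b) ** 2 for a, b in zip(y, y_hat))
ss_tot = sum((a - y_bar) ** 2 for a in y)
r2 = 1.0 - ss_res / ss_tot  # coefficient of determination on the training data
print(f"R2 with 3 components: {r2:.3f}")
```

The coefficient of determination here is computed on the calibration data only; judging predictive ability would normally call for held-out samples or cross-validation.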
Melquiades, Fábio L; Thomaz, Edivaldo L
2016-05-01
An important aspect for the evaluation of fire effects in slash-and-burn agricultural system, as well as in wildfire, is the soil burn severity. The objective of this study is to estimate the maximum temperature reached in real soil burn events using energy dispersive X-ray fluorescence (EDXRF) as an analytical tool, combined with partial least square (PLS) regression. Muffle-heated soil samples were used for PLS regression model calibration and two real slash-and-burn soils were tested as external samples in the model. It was possible to associate EDXRF spectra alterations to the maximum temperature reached in the heat affected soils with about 17% relative standard deviation. The results are promising since the analysis is fast, nondestructive, and conducted after the burn event, although local calibration for each type of burned soil is necessary. Copyright © by the American Society of Agronomy, Crop Science Society of America, and Soil Science Society of America, Inc.
Potential of near-infrared spectroscopy for quality evaluation of cattle leather.
Braz, Carlos Eduardo M; Jacinto, Manuel Antonio C; Pereira-Filho, Edenir R; Souza, Gilberto B; Nogueira, Ana Rita A
2018-05-09
Models using near-infrared spectroscopy (NIRS) were constructed based on physical-mechanical tests to determine the quality of cattle leather. The following official parameters were used, considering the industry requirements: tensile strength (TS), percentage elongation (%E), tear strength (TT), and double hole tear strength (DHS). Classification models were constructed with the use of k-nearest neighbor (kNN), soft independent modeling of class analogy (SIMCA), and partial least squares-discriminant analysis (PLS-DA). The evaluated figures of merit (accuracy, sensitivity, and specificity) presented results between 85% and 93%, and the false alarm rates ranged from 9% to 14%. The model with the lowest validation percentage (92%) was kNN, and the highest was PLS-DA (100%). For TS, lower values were obtained: 52% for kNN and 74% for SIMCA. The other parameters, %E, TT, and DHS, presented hit rates between 87 and 100%. The abilities of the models were similar, showing they can be used to predict the quality of cattle leather. Copyright © 2018 Elsevier B.V. All rights reserved.
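The figures of merit named in this abstract can be computed from a two-class confusion matrix. The following is a minimal sketch, assuming the usual definitions (sensitivity = TP/(TP+FN), specificity = TN/(TN+FP), false alarm rate = FP/(FP+TN)); the abstract does not state which definitions the authors used, and the labels below are invented for illustration.

```python
def figures_of_merit(y_true, y_pred, positive=1):
    """Return accuracy, sensitivity, specificity and false alarm rate."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "false_alarm_rate": fp / (fp + tn),
    }

# Hypothetical class labels (1 = meets the quality requirement, 0 = does not).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
fom = figures_of_merit(y_true, y_pred)
for name, value in fom.items():
    print(f"{name}: {value:.2f}")
```

With these definitions the false alarm rate is simply one minus the specificity.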
Random forest models to predict aqueous solubility.
Palmer, David S; O'Boyle, Noel M; Glen, Robert C; Mitchell, John B O
2007-01-01
Random Forest regression (RF), Partial-Least-Squares (PLS) regression, Support Vector Machines (SVM), and Artificial Neural Networks (ANN) were used to develop QSPR models for the prediction of aqueous solubility, based on experimental data for 988 organic molecules. The Random Forest regression model predicted aqueous solubility more accurately than those created by PLS, SVM, and ANN and offered methods for automatic descriptor selection, an assessment of descriptor importance, and an in-parallel measure of predictive ability, all of which serve to recommend its use. The prediction of log molar solubility for an external test set of 330 molecules that are solid at 25 °C gave an r2 = 0.89 and RMSE = 0.69 log S units. For a standard data set selected from the literature, the model performed well with respect to other documented methods. Finally, the diversity of the training and test sets is compared to the chemical space occupied by molecules in the MDL drug data report, on the basis of molecular descriptors selected by the regression analysis.
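For readers unfamiliar with the two statistics reported for the external test set, here is a small sketch of how RMSE and r2 can be computed from observed and predicted values. It assumes r2 is the coefficient of determination (1 - SS_res/SS_tot); the abstract does not say whether that or the squared correlation coefficient was used, and the log S values below are invented.

```python
import math

def rmse(obs, pred):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(obs, pred)) / len(obs))

def r_squared(obs, pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot

# Hypothetical log S values for four molecules.
obs = [-1.0, -2.0, -3.0, -4.0]
pred = [-1.5, -2.0, -2.5, -4.0]
print(f"RMSE = {rmse(obs, pred):.3f} log S units, r2 = {r_squared(obs, pred):.2f}")
```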
Ferragina, A; de los Campos, G; Vazquez, A I; Cecchinato, A; Bittante, G
2015-11-01
The aim of this study was to assess the performance of Bayesian models commonly used for genomic selection to predict "difficult-to-predict" dairy traits, such as milk fatty acid (FA) expressed as percentage of total fatty acids, and technological properties, such as fresh cheese yield and protein recovery, using Fourier-transform infrared (FTIR) spectral data. Our main hypothesis was that Bayesian models that can estimate shrinkage and perform variable selection may improve our ability to predict FA traits and technological traits above and beyond what can be achieved using the current calibration models (e.g., partial least squares, PLS). To this end, we assessed a series of Bayesian methods and compared their prediction performance with that of PLS. The comparison between models was done using the same sets of data (i.e., same samples, same variability, same spectral treatment) for each trait. Data consisted of 1,264 individual milk samples collected from Brown Swiss cows for which gas chromatographic FA composition, milk coagulation properties, and cheese-yield traits were available. For each sample, 2 spectra in the infrared region from 5,011 to 925 cm(-1) were available and averaged before data analysis. Three Bayesian models: Bayesian ridge regression (Bayes RR), Bayes A, and Bayes B, and 2 reference models: PLS and modified PLS (MPLS) procedures, were used to calibrate equations for each of the traits. The Bayesian models used were implemented in the R package BGLR (http://cran.r-project.org/web/packages/BGLR/index.html), whereas the PLS and MPLS were those implemented in the WinISI II software (Infrasoft International LLC, State College, PA). Prediction accuracy was estimated for each trait and model using 25 replicates of a training-testing validation procedure. Compared with PLS, which is currently the most widely used calibration method, MPLS and the 3 Bayesian methods showed significantly greater prediction accuracy. 
Accuracy increased in moving from calibration to external validation methods, and in moving from PLS and MPLS to Bayesian methods, particularly Bayes A and Bayes B. The maximum R(2) value of validation was obtained with Bayes B and Bayes A. For the FA, C10:0 (% of each FA on total FA basis) had the highest R(2) (0.75, achieved with Bayes A and Bayes B), and among the technological traits, fresh cheese yield R(2) of 0.82 (achieved with Bayes B). These 2 methods have proven to be useful instruments in shrinking and selecting very informative wavelengths and inferring the structure and functions of the analyzed traits. We conclude that Bayesian models are powerful tools for deriving calibration equations, and, importantly, these equations can be easily developed using existing open-source software. As part of our study, we provide scripts based on the open source R software BGLR, which can be used to train customized prediction equations for other traits or populations. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
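The replicated training-testing validation mentioned in this abstract can be sketched as repeated random splits, each scored on the held-out samples. In the sketch below a simple straight-line fit stands in for the PLS and Bayesian calibration models, which are not reproduced here; the 25% test fraction, the function names, and the synthetic data are assumptions, since the abstract gives only the number of replicates (25).

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def replicated_validation(xs, ys, n_replicates=25, test_fraction=0.25, seed=0):
    """Return one test-set R2 per random training/testing split."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    n_test = max(2, int(len(xs) * test_fraction))
    scores = []
    for _ in range(n_replicates):
        rng.shuffle(idx)
        test, train = idx[:n_test], idx[n_test:]
        slope, icpt = fit_line([xs[i] for i in train], [ys[i] for i in train])
        obs = [ys[i] for i in test]
        pred = [slope * xs[i] + icpt for i in test]
        mean_obs = sum(obs) / len(obs)
        ss_res = sum((o - p) ** 2 for o, p in zip(obs, pred))
        ss_tot = sum((o - mean_obs) ** 2 for o in obs)
        scores.append(1.0 - ss_res / ss_tot)
    return scores

# Hypothetical single-predictor data with noise.
noise = random.Random(1)
xs = [float(i) for i in range(40)]
ys = [2.0 * x + 1.0 + noise.gauss(0.0, 3.0) for x in xs]
scores = replicated_validation(xs, ys)
print(f"mean test R2 over {len(scores)} replicates: {sum(scores) / len(scores):.3f}")
```

Averaging the score over replicates reduces the dependence of the reported accuracy on any single split.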
Du, Lijuan; Lu, Weiying; Cai, Zhenzhen Julia; Bao, Lei; Hartmann, Christoph; Gao, Boyan; Yu, Liangli Lucy
2018-02-01
Flow injection mass spectrometry (FIMS) combined with chemometrics was evaluated for rapidly detecting economically motivated adulteration (EMA) of milk. Twenty-two pure milk and thirty-five counterparts adulterated with soybean, pea, and whey protein isolates at 0.5, 1, 3, 5, and 10% (w/w) levels were analyzed. The principal component analysis (PCA), partial least-squares-discriminant analysis (PLS-DA), and support vector machine (SVM) classification models indicated that the adulterated milks could successfully be classified from the pure milks. FIMS combined with chemometrics might be an effective method to detect possible EMA in milk. Copyright © 2017 Elsevier Ltd. All rights reserved.
Bhatt, Chet R; Jain, Jinesh C; Goueguel, Christian L; McIntyre, Dustin L; Singh, Jagdish P
2018-01-01
Laser-induced breakdown spectroscopy (LIBS) was used to detect rare earth elements (REEs) in natural geological samples. Low and high intensity emission lines of Ce, La, Nd, Y, Pr, Sm, Eu, Gd, and Dy were identified in the spectra recorded from the samples to claim the presence of these REEs. Multivariate analysis was executed by developing partial least squares regression (PLS-R) models for the quantification of Ce, La, and Nd. Analysis of unknown samples indicated that the prediction results of these samples were found comparable to those obtained by inductively coupled plasma mass spectrometry analysis. Data support that LIBS has potential to quantify REEs in geological minerals/ores.
NASA Astrophysics Data System (ADS)
Abdel Hameed, Eman A.; Abdel Salam, Randa A.; Hadad, Ghada M.
2015-04-01
Chemometric-assisted spectrophotometric methods and high performance liquid chromatography (HPLC) were developed for the simultaneous determination of the seven most commonly prescribed β-blockers (atenolol, sotalol, metoprolol, bisoprolol, propranolol, carvedilol and nebivolol). Principal component regression (PCR), partial least squares (PLS) and PLS with previous wavelength selection by genetic algorithm (GA-PLS) were used for chemometric analysis of spectral data of these drugs. The compositions of the mixtures used in the calibration set were varied to cover the linearity ranges 0.7-10 μg ml⁻¹ for AT, 1-15 μg ml⁻¹ for ST, 1-15 μg ml⁻¹ for MT, 0.3-5 μg ml⁻¹ for BS, 0.1-3 μg ml⁻¹ for PR, 0.1-3 μg ml⁻¹ for CV and 0.7-5 μg ml⁻¹ for NB. The analytical performances of these chemometric methods were characterized by relative prediction errors and were compared with each other. GA-PLS showed superiority over the other applied multivariate methods due to the wavelength selection. A new gradient HPLC method was developed using statistical experimental design. Optimum conditions of separation were determined with the aid of central composite design. The developed HPLC method was found to be linear in the range of 0.2-20 μg ml⁻¹ for AT, 0.2-20 μg ml⁻¹ for ST, 0.1-15 μg ml⁻¹ for MT, 0.1-15 μg ml⁻¹ for BS, 0.1-13 μg ml⁻¹ for PR, 0.1-13 μg ml⁻¹ for CV and 0.4-20 μg ml⁻¹ for NB. No significant difference was found between the results of the proposed GA-PLS and HPLC methods with respect to accuracy and precision. The proposed analytical methods did not show any interference from the excipients when applied to pharmaceutical products.
Vignaduzzo, Silvana E; Maggio, Rubén M; Castellano, Patricia M; Kaufman, Teodoro S
2006-12-01
Two new analytical methods have been developed as convenient and useful alternatives for simultaneous determination of hydrochlorothiazide (HCT) and propranolol hydrochloride (PRO) in pharmaceutical formulations. The methods are based on the first derivative of ratio spectra (DRS) and on partial least squares (PLS) analysis of the ultraviolet absorption spectra of the samples in the 250-350-nm region. The methods were calibrated between 8.7 and 16.0 mg L(-1) for HCT and between 14.0 and 51.5 mg L(-1) for PRO. An asymmetric full-factorial design and wavelength selection (277-294 nm for HCT and 297-319 for PRO) were used for the PLS method and signal intensities at 276 and 322 nm were used in the DRS method for HCT and PRO, respectively. Performance characteristics of the analytical methods were evaluated by use of validation samples and both methods showed to be accurate and precise, furnishing near quantitative analyte recoveries (100.4 and 99.3% for HCT and PRO by use of PLS) and relative standard deviations below 2%. For PLS the lower limits of quantification were 0.37 and 0.66 mg L(-1) for HCT and PRO, respectively, whereas for DRS they were 1.15 and 3.05 mg L(-1) for HCT and PRO, respectively. The methods were used for quantification of HCT and PRO in synthetic mixtures and in two commercial tablet preparations containing different proportions of the analytes. The results of the drug content assay and the tablet dissolution test were in statistical agreement (p < 0.05) with those furnished by the official procedures of the USP 29. Preparation of dissolution profiles of the combined tablet formulations was also performed with the aid of the proposed methods. The methods are easy to apply, use relatively simple equipment, require minimum sample pre-treatment, enable high sample throughput, and generate less solvent waste than other procedures.
NASA Astrophysics Data System (ADS)
Al-Harrasi, Ahmed; Rehman, Najeeb Ur; Mabood, Fazal; Albroumi, Muhammaed; Ali, Liaqat; Hussain, Javid; Hussain, Hidayat; Csuk, René; Khan, Abdul Latif; Alam, Tanveer; Alameri, Saif
2017-09-01
In the present study, for the first time, NIR spectroscopy coupled with PLS regression was developed as a rapid, alternative method to quantify the amount of Keto-β-Boswellic Acid (KBA) in different plant parts of Boswellia sacra and the resin exudates of the trunk. NIR spectroscopy was used for the measurement of KBA standards and B. sacra samples in absorption mode in the wavelength range from 700 to 2500 nm. A PLS regression model was built from the obtained spectral data using 70% of the KBA standards (training set) in the range from 0.1 ppm to 100 ppm. The resulting PLS regression model had an R-square value of 98% with a correlation of 0.99, and showed good prediction with an RMSEP of 3.2 and a correlation of 0.99. It was then used to quantify the amount of KBA in the samples of B. sacra. The results indicated that the MeOH extract of the resin has the highest concentration of KBA (0.6%), followed by the essential oil (0.1%). However, no KBA was found in the aqueous extract. The MeOH extract of the resin was subjected to column chromatography to obtain various sub-fractions at different polarities of organic solvents. The sub-fraction at 4% MeOH/CHCl3 (4.1% of KBA) was found to contain the highest percentage of KBA, followed by another sub-fraction at 2% MeOH/CHCl3 (2.2% of KBA). The present results also indicated that KBA is only present in the gum-resin of the trunk and not in all parts of the plant. These results were further confirmed through HPLC analysis, and it is therefore concluded that NIRS coupled with PLS regression is a rapid, alternative method for quantification of KBA in Boswellia sacra. It is non-destructive, rapid, sensitive and uses simple methods of sample preparation.
Wang, Chunyan; Zhu, Hongbin; Pi, Zifeng; Song, Fengrui; Liu, Zhiqiang; Liu, Shuying
2013-09-15
An analytical method for quantifying underivatized amino acids (AAs) in urine samples of rats was developed by using liquid chromatography coupled with tandem mass spectrometry (LC-MS/MS). Classification of type 2 diabetes rats was based on metabolic profiling of urine amino acids. LC-MS/MS analysis was applied through chromatographic separation and multiple reaction monitoring (MRM) transitions of MS/MS. Multivariate profile-wide predictive models were constructed using partial least squares discriminant analysis (PLS-DA) with the SIMCA-P 11.5 software package and hierarchical cluster analysis (HCA) with SPSS 18.0 software. Some amino acids in the urine of the rats changed significantly. The results of the present study show that this method can quantify free AAs in rat urine by using LC-MS/MS. In summary, the PLS-DA and HCA statistical analyses in our research were able to differentiate healthy rats from type 2 diabetes rats by the quantification of AAs in their urine samples. In addition, compared with the healthy group, the seven amino acids that were increased in the urine of type 2 diabetes rats returned to normal under treatment with acarbose. Copyright © 2013 Elsevier B.V. All rights reserved.
Wang, Juan; Li, Jing; Li, Hongfa; Wu, Xiaolei; Gao, Wenyuan
2015-09-01
An electrospray ionization tandem mass spectrometry (ESI-MS(n)) analysis was performed in order to identify the active composition in Pseudostellaria heterophylla adventitious roots. Pseudostellarin A, C, D, and G were identified from P. heterophylla adventitious roots on the basis of LC-MS(n) analysis. The culture conditions of adventitious roots were optimized, and datasets were subjected to a partial least squares discriminant analysis (PLS-DA), in which the growth ratio and some compounds showed a positive correlation with an aeration volume of 0.3 vvm and inoculum density of 0.15 %. Fed-batch cultivation enhanced the contents of total saponin, polysaccharides, and specific oxygen uptake rate (SOUR). The maximum dry root weight (4.728 g l(-1)) was achieved in the 3/4 Murashige and Skoog (MS) medium group. PLS-DA showed that polysaccharides contributed significantly to the clustering of different groups and showed a positive correlation in the MS medium group. The delayed-type hypersensitivity (DTH) reaction induced in mice by 2,4-dinitrofluorobenzene (DNFB) was applied to compare the immunocompetence effects of adventitious roots (AR) with field native roots (NR) of P. heterophylla. As a result, AR possessed a similar immunoregulation function to NR.
Metabolic Response to XD14 Treatment in Human Breast Cancer Cell Line MCF-7
Pan, Daqiang; Kather, Michel; Willmann, Lucas; Schlimpert, Manuel; Bauer, Christoph; Lagies, Simon; Schmidtkunz, Karin; Eisenhardt, Steffen U.; Jung, Manfred; Günther, Stefan; Kammerer, Bernd
2016-01-01
XD14 is a 4-acyl pyrrole derivative, which was discovered by a high-throughput virtual screening experiment. XD14 inhibits bromodomain and extra-terminal domain (BET) proteins (BRD2, BRD3, BRD4 and BRDT) and consequently suppresses cell proliferation. In this study, metabolic profiling reveals the molecular effects in the human breast cancer cell line MCF-7 (Michigan Cancer Foundation-7) treated by XD14. A three-day time series experiment with two concentrations of XD14 was performed. Gas chromatography-mass spectrometry (GC-MS) was applied for untargeted profiling of treated and non-treated MCF-7 cells. The resulting data sets were evaluated by several statistical methods: analysis of variance (ANOVA), clustering analysis, principal component analysis (PCA), and partial least squares discriminant analysis (PLS-DA). Cell proliferation was strongly inhibited by treatment with 50 µM XD14. Samples could be discriminated by time and XD14 concentration using PLS-DA. From the 117 identified metabolites, 67 were significantly altered after XD14 treatment. These metabolites include amino acids, fatty acids, Krebs cycle and glycolysis intermediates, as well as compounds of purine and pyrimidine metabolism. This massive intervention in energy metabolism and the lack of available nucleotides could explain the decreased proliferation rate of the cancer cells. PMID:27783056
Li, Zihui; Du, Boping; Li, Jing; Zhang, Jinli; Zheng, Xiaojing; Jia, Hongyan; Xing, Aiying; Sun, Qi; Liu, Fei; Zhang, Zongde
2017-03-01
Tuberculous meningitis (TBM) is the most severe and frequent form of central nervous system tuberculosis. The current lack of efficient diagnostic tests makes it difficult to differentiate TBM from other common types of meningitis, especially viral meningitis (VM). Metabolomics is an important tool to identify disease-specific biomarkers. However, little metabolomic information is available on adult TBM. We used ¹H nuclear magnetic resonance-based metabolomics to investigate the metabolic features of the CSF from 18 TBM and 20 VM patients. Principal component analysis and orthogonal signal correction-partial least squares-discriminant analysis (OSC-PLS-DA) were applied to analyze profiling data. Metabolites were identified using the Human Metabolome Database and pathway analysis was performed with MetaboAnalyst 3.0. The OSC-PLS-DA model could distinguish TBM from VM with high reliability. A total of 25 key metabolites that contributed to their discrimination were identified, including some, such as betaine and cyclohexane, rarely reported before in TBM. Pathway analysis indicated that amino acid and energy metabolism was significantly different in the CSF of TBM compared with VM. Twenty-five key metabolites identified in our study may be potential biomarkers for TBM differential diagnosis and are worthy of further investigation. Copyright © 2017 Elsevier B.V. All rights reserved.
Rahman, Ziyaur; Siddiqui, Akhtar; Khan, Mansoor A
2013-12-01
The focus of the present investigation was to characterize and evaluate the variability of solid dispersions (SD) of amorphous vancomycin (VCM), utilizing crystalline polyethylene glycol (PEG-6000) as a carrier, and subsequently to determine their percentage composition by a nondestructive method using process analytical technology (PAT) sensors. The SD were prepared by the heat fusion method and characterized for physicochemical and spectral properties. Enhanced dissolution was shown by the SD formulations. Decreased crystallinity of PEG-6000 was observed, indicating that the drug was present in solution and dispersed form within the polymer. The SD formulations were homogeneous, as shown by near infrared (NIR) chemical imaging data. Principal component analysis (PCA) and partial least squares (PLS) methods were applied to NIR and PXRD (powder X-ray diffraction) data to develop models for quantification of drug and carrier. PLS of both data sets showed correlation coefficients >0.9934 with good prediction capability, as revealed by smaller values of root mean square and standard error. The models based on NIR and PXRD were twofold more accurate in estimating PEG-6000 than VCM. In conclusion, the drug dissolution from the SD increased with decreasing crystallinity of PEG-6000, and the chemometric models showed the usefulness of PAT sensors in estimating the percentages of both VCM and PEG-6000 simultaneously. © 2013 Wiley Periodicals, Inc. and the American Pharmacists Association.
Vásquez, Valeria; Báez, María E; Bravo, Manuel; Fuentes, Edwar
2013-09-01
Seven heavy polycyclic aromatic hydrocarbons (PAHs) of concern on the US Environmental Protection Agency priority pollutant list (benzo[a]anthracene, benzo[b]fluoranthene, benzo[k]fluoranthene, benzo[a]pyrene, dibenz[a,h]anthracene, benzo[g,h,i]perylene, and indeno[1,2,3-c,d]-pyrene) were simultaneously analyzed in extra virgin olive oil. The analysis is based on the measurement of excitation-emission matrices on nylon membrane and processing of data using unfolded partial least-squares regression with residual bilinearization (U-PLS/RBL). The conditions needed to retain the PAHs present in the oil matrix on the nylon membrane were evaluated. The limit of detection for the proposed method ranged from 0.29 to 1.0 μg kg(-1), with recoveries between 64 and 78 %. The predicted U-PLS/RBL concentrations compared favorably with those measured using high-performance liquid chromatography with fluorescence detection. The proposed method was applied to ten samples of edible oil, two of which presented PAHs ranging from 0.35 to 0.63 μg kg(-1). The principal advantages of the proposed analytical method are that it provides a significant reduction in time and solvent consumption with a similar limit of detection as compared with chromatography.
Elkhoudary, Mahmoud M; Naguib, Ibrahim A; Abdel Salam, Randa A; Hadad, Ghada M
2017-05-01
Four accurate, sensitive and reliable stability-indicating chemometric methods were developed for the quantitative determination of Agomelatine (AGM), whether in pure form or in pharmaceutical formulations. Two supervised learning machine methods, linear artificial neural networks preceded by principal component analysis (PC-linANN) and linear support vector regression (linSVR), were compared with two principal component based methods, principal component regression (PCR) and partial least squares (PLS), for the spectrofluorimetric determination of AGM and its degradants. The results showed the benefits of using linear learning machine methods and the inherent merits of their algorithms in handling overlapped noisy spectral data, especially during the challenging determination of AGM alkaline and acidic degradants (DG1 and DG2). Relative mean squared error of prediction (RMSEP) for the proposed models in the determination of AGM were 1.68, 1.72, 0.68 and 0.22 for PCR, PLS, SVR and PC-linANN, respectively. The results showed the superiority of supervised learning machine methods over principal component based methods. Besides, the results suggested that linANN is the method of choice for determination of components in low amounts with similar overlapped spectra and narrow linearity range. Comparison between the proposed chemometric models and a reported HPLC method revealed the comparable performance and quantification power of the proposed models.
Malzert-Fréon, A; Hennequin, D; Rault, S
2010-11-01
Lipidic nanoparticles (NP), formulated from a phase inversion temperature process, have been studied with chemometric techniques to emphasize the influence of the four major components (Solutol®, Labrasol®, Labrafac®, water) on their average diameter and their distribution in size. Typically, these NP present a monodisperse size lower than 200 nm, as determined by dynamic light scattering measurements. From the application of the partial least squares (PLS) regression technique to the experimental data collected during definition of the feasibility zone, it was established that NP present a core-shell structure where Labrasol® is well encapsulated and contributes to the structuring of the NP. Even if this solubility enhancer is regarded as a pure surfactant in the literature, it appears that the oil moieties of this macrogolglyceride mixture significantly influence its properties. Furthermore, results have shown that PLS technique can be also used for predictions of sizes for given relative proportions of components and it was established that from a mixture design, the quantitative mixture composition to use in order to reach a targeted size and a targeted polydispersity index (PDI) can be easily predicted. Hence, statistical models can be a useful tool to control and optimize the characteristics in size of NP. © 2010 Wiley-Liss, Inc. and the American Pharmacists Association
NASA Astrophysics Data System (ADS)
Qiu, Peng; D'Souza, Warren D.; McAvoy, Thomas J.; Liu, K. J. Ray
2007-09-01
Tumor motion induced by respiration presents a challenge to the reliable delivery of conformal radiation treatments. Real-time motion compensation represents the technologically most challenging clinical solution but has the potential to overcome the limitations of existing methods. The performance of a real-time couch-based motion compensation system is mainly dependent on two aspects: the ability to infer the internal anatomical position and the performance of the feedback control system. In this paper, we propose two novel methods for the two aspects respectively, and then combine the proposed methods into one system. To accurately estimate the internal tumor position, we present partial-least squares (PLS) regression to predict the position of the diaphragm using skin-based motion surrogates. Four radio-opaque markers were placed on the abdomen of patients who underwent fluoroscopic imaging of the diaphragm. The coordinates of the markers served as input variables and the position of the diaphragm served as the output variable. PLS resulted in lower prediction errors compared with standard multiple linear regression (MLR). The performance of the feedback control system depends on the system dynamics and dead time (delay between the initiation and execution of the control action). While the dynamics of the system can be inverted in a feedback control system, the dead time cannot be inverted. To overcome the dead time of the system, we propose a predictive feedback control system by incorporating forward prediction using least-mean-square (LMS) and recursive least square (RLS) filtering into the couch-based control system. Motion data were obtained using a skin-based marker. The proposed predictive feedback control system was benchmarked against pure feedback control (no forward prediction) and resulted in a significant performance gain. 
Finally, we combined the PLS inference model and the predictive feedback control to evaluate the overall performance of the feedback control system. Our results show that, with the tumor motion unknown but inferred by skin-based markers through the PLS model, the predictive feedback control system was able to effectively compensate intra-fraction motion.
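The forward-prediction idea in this abstract, estimating where the target will be after the dead time, can be sketched with a least-mean-square (LMS) adaptive FIR filter. The example below is an idealized illustration only: the breathing trace is a noise-free sinusoid, and the period (20 samples), dead time (3 samples), number of taps, and step size `mu` are assumptions chosen for the sketch, not values from the paper. The RLS variant and the couch control loop are not shown.

```python
import math

def lms_forward_predict(signal, delay, n_taps=4, mu=0.2):
    """Predict signal[n + delay] at each time n with an LMS-adapted FIR filter."""
    w = [0.0] * n_taps
    preds = [None] * len(signal)  # preds[n] estimates signal[n + delay]
    for n in range(n_taps - 1, len(signal)):
        # The target for the input window from `delay` steps ago is now
        # observable (it is signal[n]), so adapt the weights on that window.
        m = n - delay
        if m >= n_taps - 1:
            u_old = [signal[m - k] for k in range(n_taps)]
            err = signal[n] - sum(wk * uk for wk, uk in zip(w, u_old))
            w = [wk + mu * err * uk for wk, uk in zip(w, u_old)]
        # Forward prediction from the most recent samples.
        u = [signal[n - k] for k in range(n_taps)]
        preds[n] = sum(wk * uk for wk, uk in zip(w, u))
    return preds

# Hypothetical skin-marker trace: unit-amplitude sinusoid, 20 samples per breath.
period, delay = 20, 3
signal = [math.sin(2 * math.pi * n / period) for n in range(2000)]
preds = lms_forward_predict(signal, delay)

# Compare against "no forward prediction" (holding the current position).
tail = range(1800, len(signal) - delay)
lms_err = sum(abs(preds[n] - signal[n + delay]) for n in tail) / len(tail)
hold_err = sum(abs(signal[n] - signal[n + delay]) for n in tail) / len(tail)
print(f"mean abs error, LMS prediction: {lms_err:.4f}")
print(f"mean abs error, no prediction:  {hold_err:.4f}")
```

Real respiratory traces are noisy and drift in period and amplitude, so the gap between the two errors would be smaller than in this idealized case.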
Elsohaby, Ibrahim; McClure, J Trenton; Riley, Christopher B; Bryanton, Janet; Bigsby, Kathryn; Shaw, R Anthony
2018-02-20
Attenuated total reflectance infrared (ATR-IR) spectroscopy is a simple, rapid and cost-effective method for the analysis of serum. However, the complex nature of serum remains a limiting factor to the reliability of this method. We investigated the benefits of coupling centrifugal ultrafiltration with ATR-IR spectroscopy for quantification of human serum IgA concentration. Human serum samples (n = 196) were analyzed for IgA using an immunoturbidimetric assay. ATR-IR spectra were acquired for whole serum samples and for the retentate (residue) reconstituted with saline following 300 kDa centrifugal ultrafiltration. IR-based analytical methods were developed for each of the two spectroscopic datasets, and the accuracy of the two methods was compared. Analytical methods were based upon partial least squares regression (PLSR) calibration models - one with 5 PLS factors (for whole serum) and the second with 9 PLS factors (for the reconstituted retentate). Comparison of the two sets of IR-based analytical results to reference IgA values revealed improvements in the Pearson correlation coefficient (from 0.66 to 0.76), and the root mean squared error of prediction in IR-based IgA concentrations (from 102 to 79 mg/dL) for the ultrafiltration retentate-based method as compared to the method built upon whole serum spectra. Depleting human serum low molecular weight proteins using a 300 kDa centrifugal filter thus enhances the accuracy of IgA quantification by ATR-IR spectroscopy. Further evaluation and optimization of this general approach may ultimately lead to routine analysis of a range of high molecular-weight analytical targets that are otherwise unsuitable for IR-based analysis. Copyright © 2017 Elsevier B.V. All rights reserved.
PLS-LS-SVM based modeling of ATR-IR as a robust method in detection and qualification of alprazolam
NASA Astrophysics Data System (ADS)
Parhizkar, Elahehnaz; Ghazali, Mohammad; Ahmadi, Fatemeh; Sakhteman, Amirhossein
2017-02-01
According to the United States Pharmacopeia (USP), the gold standard technique for alprazolam determination in dosage forms is HPLC, an expensive and time-consuming method that is not easily accessible. In this study, chemometrics-assisted ATR-IR was introduced as an alternative method that produces similar results with less time and energy. Fifty-eight samples containing different concentrations of commercial alprazolam were evaluated by HPLC and ATR-IR. A preprocessing approach was applied to convert the raw data obtained from ATR-IR spectra to a normal matrix. Finally, a relationship between the alprazolam concentrations obtained by HPLC and the ATR-IR data was established using PLS-LS-SVM (partial least squares least squares support vector machines). The validity of the method was verified, yielding a model with low error values (root mean square error of cross validation equal to 0.98). The model was able to predict about 99% of the samples according to the R2 of the prediction set. A response permutation test was also applied to confirm that the model did not result from chance correlations. In conclusion, ATR-IR can be a reliable method in the manufacturing process for detection and qualification of alprazolam content.
NASA Astrophysics Data System (ADS)
Moustafa, Azza A.; Hegazy, Maha A.; Mohamed, Dalia; Ali, Omnia
2016-02-01
A novel approach for the resolution and quantitation of severely overlapped quaternary mixture of carbinoxamine maleate (CAR), pholcodine (PHL), ephedrine hydrochloride (EPH) and sunset yellow (SUN) in syrup was demonstrated utilizing different spectrophotometric assisted multivariate calibration methods. The applied methods have used different processing and pre-processing algorithms. The proposed methods were partial least squares (PLS), concentration residuals augmented classical least squares (CRACLS), and a novel method; continuous wavelet transforms coupled with partial least squares (CWT-PLS). These methods were applied to a training set in the concentration ranges of 40-100 μg/mL, 40-160 μg/mL, 100-500 μg/mL and 8-24 μg/mL for the four components, respectively. The utilized methods have not required any preliminary separation step or chemical pretreatment. The validity of the methods was evaluated by an external validation set. The selectivity of the developed methods was demonstrated by analyzing the drugs in their combined pharmaceutical formulation without any interference from additives. The obtained results were statistically compared with the official and reported methods where no significant difference was observed regarding both accuracy and precision.
Shashilov, Victor A; Sikirzhytski, Vitali; Popova, Ludmila A; Lednev, Igor K
2010-09-01
Here we report on novel quantitative approaches for protein structural characterization using deep UV resonance Raman (DUVRR) spectroscopy. Specifically, we propose a new method combining hydrogen-deuterium (HD) exchange and Bayesian source separation for extracting the DUVRR signatures of various structural elements of aggregated proteins including the cross-beta core and unordered parts of amyloid fibrils. The proposed method is demonstrated using the set of DUVRR spectra of hen egg white lysozyme acquired at various stages of HD exchange. Prior information about the concentration matrix and the spectral features of the individual components was incorporated into the Bayesian equation to eliminate the ill-conditioning of the problem caused by 100% correlation of the concentration profiles of protonated and deuterated species. Secondary structure fractions obtained by partial least squares (PLS) and least squares support vector machines (LS-SVMs) were used as the initial guess for the Bayesian source separation. Advantages of the PLS and LS-SVMs methods over the classical least squares calibration (CLSC) are discussed and illustrated using the DUVRR data of the prion protein in its native and aggregated forms. Copyright (c) 2010 Elsevier Inc. All rights reserved.
Lao, Wan-li; He, Yu-chan; Li, Gai-yun; Zhou, Qun
2016-01-01
The biomass to plastic ratio in wood plastic composites (WPCs) greatly affects their physical and mechanical properties and price. Fast and accurate evaluation of the biomass to plastic ratio is important for the further development of WPCs. Quantitative analysis of the main composition of WPCs currently relies primarily on thermo-analytical methods. However, these methods have some inherent disadvantages, including long analysis times, high analytical errors and complicated procedures, which severely limit their application. Therefore, in this study, Fourier Transform Infrared (FTIR) spectroscopy in combination with partial least squares (PLS) was used for rapid prediction of bamboo and polypropylene (PP) content in bamboo/PP composites. The bamboo powders were used as filler after being dried at 105 °C for 24 h. PP was used as the matrix material, and some chemical reagents were used as additives. Then 42 WPC samples with different ratios of bamboo and PP were prepared by extrusion. FTIR spectral data of the 42 WPC samples were collected by means of the KBr pellet technique. The model for bamboo and PP content prediction was developed by PLS-2 and full cross validation. Results of internal cross validation showed that the first derivative spectra in the range of 1800-800 cm(-1) corrected by standard normal variate (SNV) yielded the optimal model. For both bamboo and PP calibration, the coefficients of determination (R2) were 0.955. The standard errors of calibration (SEC) were 1.872 for bamboo content and 1.848 for PP content, respectively. For both bamboo and PP validation, the R2 values were 0.950. The standard errors of cross validation (SECV) were 1.927 for bamboo content and 1.950 for PP content, respectively. The ratios of performance to deviation (RPD) were 4.45 for both biomass and PP determinations. The results of external validation showed that the relative prediction deviations for both biomass and PP contents were lower than ± 6%.
FTIR combined with PLS can be used for rapid and accurate determination of bamboo and PP content in bamboo/PP composites.
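The preprocessing steps and the RPD statistic named in this abstract can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the authors' procedure: SNV is taken in its usual form (each spectrum centred on its own mean and scaled by its own sample standard deviation), the first derivative is approximated by a simple point-to-point difference, and RPD is taken as the standard deviation of the reference values divided by the standard error of cross validation. All function names and numbers are hypothetical.

```python
import statistics

def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)
    return [(value - mean) / sd for value in spectrum]

def first_difference(spectrum):
    """Crude first derivative: difference between neighbouring points."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]

def rpd(reference_values, standard_error):
    """Ratio of performance to deviation: SD of reference values / standard error."""
    return statistics.stdev(reference_values) / standard_error

# Toy absorbance values and toy reference contents, for illustration only.
toy_spectrum = [1.0, 2.0, 3.0]
print("SNV:", snv(toy_spectrum))
print("First difference:", first_difference([1.0, 4.0, 9.0]))
print("RPD:", rpd([10.0, 20.0, 30.0], 5.0))
```

Under this definition a larger RPD means the cross-validation error is small relative to the spread of the reference values.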
Dou, Yun-De; Huang, Tao; Wang, Qun; Shu, Xin; Zhao, Shi-Gang; Li, Lei; Liu, Tao; Lu, Gang; Chan, Wai-Yee; Liu, Hong-Bin
2018-01-29
Characterization of the genetic landscapes of familial ovarian cancer through integrated analysis of microRNA and mRNA by partial least squares (PLS) and Monte Carlo technique based on genome-wide association studies (GWAS). The miRNA and mRNA transcriptional data in familial ovarian cancer were obtained from the Gene Expression Omnibus (GEO) database. The miRNA and mRNA expression profiles in peripheral blood lymphocytes (PBLs) of 74 familial ovarian cancer patients and 47 control subjects were analyzed by integrating partial least squares (PLS) and Monte Carlo techniques. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were also performed. A total of 16 miRNA-mRNA pairs were identified from the target gene prediction results of miRNAs and mRNAs. A novel miRNA-mRNA integrated network was constructed in which 6 downregulated miRNAs and 1 upregulated miRNA were included. KEGG and GO pathway enrichment analysis revealed over-representation of dysregulated miRNAs in various biological processes, especially in cancer pathology. Hsa-miR-34b played a pivotal role in this network and interacted with other miRNAs. Hsa-miR-136 and hsa-miR-335 were associated with p53 and Erk1/2 pathways and tumor suppressors, such as PTEN. The results from this research provide insights on miRNA-mRNA networks and offer new tools for studying transcriptional variants in familial ovarian cancer. Copyright © 2018 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Ni, Yongnian; Wang, Yong; Kokot, Serge
2008-10-01
A spectrophotometric method for the simultaneous determination of the important pharmaceuticals, pefloxacin and its structurally similar metabolite, norfloxacin, is described for the first time. The analysis is based on the monitoring of a kinetic spectrophotometric reaction of the two analytes with potassium permanganate as the oxidant. The measurement of the reaction process followed the absorbance decrease of potassium permanganate at 526 nm, and the accompanying increase of the product, potassium manganate, at 608 nm. It was essential to use multivariate calibrations to overcome severe spectral overlaps and similarities in reaction kinetics. Calibration curves for the individual analytes showed linear relationships over the concentration ranges of 1.0-11.5 mg L⁻¹ at 526 and 608 nm for pefloxacin, and 0.15-1.8 mg L⁻¹ at 526 and 608 nm for norfloxacin. Various multivariate calibration models were applied, at the two analytical wavelengths, for the simultaneous prediction of the two analytes including classical least squares (CLS), principal component regression (PCR), partial least squares (PLS), radial basis function-artificial neural network (RBF-ANN) and principal component-radial basis function-artificial neural network (PC-RBF-ANN). PLS and PC-RBF-ANN calibrations with the data collected at 526 nm were the preferred methods (%RPE_T ≈ 5, and LODs for pefloxacin and norfloxacin of 0.36 and 0.06 mg L⁻¹, respectively). Then, the proposed method was applied successfully for the simultaneous determination of pefloxacin and norfloxacin present in pharmaceutical and human plasma samples. The results compared well with those from the alternative analysis by HPLC.
Zhang, Bing-Fang; Yuan, Li-Bo; Kong, Qing-Ming; Shen, Wei-Zheng; Zhang, Bing-Xiu; Liu, Cheng-Hai
2014-10-01
In the present study, a new method using near infrared spectroscopy combined with optical fiber sensing technology was applied to the analysis of hogwash oil in blended oil. The 50 samples were blends of frying oil and "nine three" soybean oil at set volume ratios. The near infrared transmission spectra were collected and the quantitative analysis model of frying oil was established by partial least squares (PLS) and a BP artificial neural network. The coefficients of determination of the calibration sets were 0.908 and 0.934, respectively. The coefficients of determination of the validation sets were 0.961 and 0.952, the root mean square errors of calibration (RMSEC) were 0.184 and 0.136, and the root mean square error of prediction (RMSEP) was 0.1116 for both. They conform to the model application requirement. At the same time, frying oil and qualified edible oil were identified with principal component analysis (PCA), and the accuracy rate was 100%. The experiment proved that near infrared spectral technology can not only quickly and accurately identify hogwash oil, but also quantitatively detect hogwash oil. This method has wide application prospects in the detection of oil.
Pec, Jaroslav; Flores-Sanchez, Isvett Josefina; Choi, Young Hae; Verpoorte, Robert
2010-07-01
Cannabis sativa L. plants produce a diverse array of secondary metabolites. Cannabis cell cultures were treated with jasmonic acid (JA) and pectin as elicitors to evaluate their effect on metabolism from two cell lines using NMR spectroscopy and multivariate data analysis. According to principal component analysis (PCA) and partial least square-discriminant analysis (PLS-DA), the chloroform extract of the pectin-treated cultures were more different than control and JA-treated cultures; but in the methanol/water extract the metabolome of the JA-treated cells showed clear differences with control and pectin-treated cultures. Tyrosol, an antioxidant metabolite, was detected in cannabis cell cultures. The tyrosol content increased after eliciting with JA.
de Peinder, P; Vredenbregt, M J; Visser, T; de Kaste, D
2008-08-05
Research has been carried out on the feasibility of near infrared (NIR) and Raman spectroscopy as rapid screening methods to discriminate between genuine and counterfeit samples of the cholesterol-lowering medicine Lipitor. Classification, based on partial least squares discriminant analysis (PLS-DA) models, appears to be successful for both spectroscopic techniques, irrespective of whether atorvastatine or lovastatine has been used as the active pharmaceutical ingredient (API). The discriminative power of the NIR model, in particular, largely relies on the spectral differences of the tablet matrix. This is due to the relatively large sample volume that is probed with NIR and the strong spectroscopic activity of the excipients. PLS-DA models based on NIR or Raman spectra can also be applied to distinguish between atorvastatine and lovastatine as the API used in the counterfeits tested in this study. A disadvantage of Raman microscopy for this type of analysis is that it is primarily a surface technique. As a consequence, spectra of the coating and the tablet core might differ. Besides, spectra may change with the position of the laser if the sample is inhomogeneous. However, the robustness of the PLS-DA models turned out to be sufficiently large to allow a reliable discrimination. Principal component analysis (PCA) of the spectra revealed that the conditions under which tablets have been stored affect the NIR data. This effect is attributed to the adsorption of water from the atmosphere after unpacking from the blister. It implies that storage conditions should be taken into account when the NIR technique is used for discriminating purposes. However, in this study both models based on NIR spectra and Raman data enabled reliable discrimination between genuine and counterfeited Lipitor tablets, regardless of their storage conditions.
Izquierdo-Garcia, Jose L; Nin, Nicolas; Jimenez-Clemente, Jorge; Horcajada, Juan P; Arenas-Miras, Maria Del Mar; Gea, Joaquim; Esteban, Andres; Ruiz-Cabello, Jesus; Lorente, Jose A
2017-12-29
The integrated analysis of changes in the metabolic profile could be critical for the discovery of biomarkers of lung injury, and also for generating new pathophysiological hypotheses and designing novel therapeutic targets for the acute respiratory distress syndrome (ARDS). This study aimed at developing a Nuclear Magnetic Resonance (NMR)-based approach for the identification of the metabolomic profile of ARDS in patients with H1N1 influenza virus pneumonia. Serum samples from 30 patients (derivation set) diagnosed with H1N1 influenza virus pneumonia were analysed by unsupervised Principal Component Analysis (PCA) to identify metabolic differences between patients with and without ARDS by NMR spectroscopy. A predictive model of partial least squares discriminant analysis (PLS-DA) was developed for the identification of ARDS. PLS-DA was trained with the derivation set and tested in another set of samples from 26 patients also diagnosed with H1N1 influenza virus pneumonia (validation set). Decreased serum glucose, alanine, glutamine, methylhistidine and fatty acids concentrations, and elevated serum phenylalanine and methylguanidine concentrations, discriminated patients with ARDS versus patients without ARDS. The PLS-DA model successfully identified the presence of ARDS in the validation set with a success rate of 92% (sensitivity 100% and specificity 91%). The classification functions showed a good correlation with the Sequential Organ Failure Assessment (SOFA) score (R = 0.74, p < 0.0001) and the PaO2/FiO2 ratio (R = 0.41, p = 0.03). The serum metabolomic profile is sensitive and specific to identify ARDS in patients with H1N1 influenza A pneumonia. Future studies are needed to determine the role of NMR spectroscopy as a biomarker of ARDS.
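Success rate, sensitivity and specificity, as quoted for the validation set, all derive from the four cells of a confusion matrix. The sketch below shows the arithmetic in plain Python. The abstract does not report the underlying counts, so the counts used here are hypothetical and were chosen only to give a 26-sample example; they should not be read as the study's data.

```python
def diagnostic_rates(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # fraction of true cases that were detected
    specificity = tn / (tn + fp)   # fraction of non-cases correctly rejected
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical counts for a 26-sample validation set; NOT taken from the study.
sens, spec, acc = diagnostic_rates(tp=3, fn=0, tn=21, fp=2)
print(f"sensitivity={sens:.0%} specificity={spec:.0%} accuracy={acc:.0%}")
```

With small validation sets, one or two samples change these percentages by several points, which is one reason the abstract calls for further studies.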
Passos, Cláudia P; Cardoso, Susana M; Barros, António S; Silva, Carlos M; Coimbra, Manuel A
2010-02-28
Fourier transform infrared (FTIR) spectroscopy has been emphasised as a widespread technique for the quick assessment of food components. In this work, procyanidins were extracted with methanol and acetone/water from the seeds of white and red grape varieties. Fractionation by graded methanol/chloroform precipitations yielded 26 samples that were characterised using thiolysis as pre-treatment followed by HPLC-UV and MS detection. The average degree of polymerisation (DPn) of the procyanidins in the samples ranged from 2 to 11 flavan-3-ol residues. FTIR spectroscopy within the wavenumber region of 1800-700 cm(-1) allowed a partial least squares (PLS1) regression model with 8 latent variables (LVs) to be built for the estimation of the DPn, giving a RMSECV of 11.7%, with a R(2) of 0.91 and a RMSEP of 2.58. The application of orthogonal projection to latent structures (O-PLS1) clarifies the interpretation of the regression model vectors. Moreover, the O-PLS procedure removed 88% of the variation not correlated with the DPn, which allowed the increase of the absorbance peaks at 1203 and 1099 cm(-1) to be related to the increase of the DPn due to the higher proportion of substitutions in the aromatic ring of the polymerised procyanidin molecules. Copyright 2009 Elsevier B.V. All rights reserved.
Pereira, Leandro S A; Lisboa, Fernanda L C; Coelho Neto, José; Valladão, Frederico N; Sena, Marcelo M
2018-05-09
Several new psychoactive substances (NPS) have reached the illegal drug market in recent years, and ecstasy-like tablets are one of the forms affected by this change. Cathinones and tryptamines have increasingly been found in ecstasy-like seized samples as well as other amphetamine type stimulants. A presumptive method for identifying different drugs in seized ecstasy tablets (n=92) using ATR-FTIR (attenuated total reflectance - Fourier transform infrared spectroscopy) and PLS-DA (partial least squares discriminant analysis) was developed. A hierarchical strategy of sequential modeling was performed with PLS-DA. The main model discriminated four classes: 5-MeO-MIPT, methylenedioxyamphetamines (MDMA and MDA), methamphetamine, and cathinones. Two submodels were built to identify drugs present in MDs and cathinones classes. Models were validated through the estimate of figures of merit. The average reliability rate (RLR) of the main model was 96.8% and accordance (ACC) was 100%. For the submodels, RLR and ACC were 100%. The reliability of the models was corroborated through their spectral interpretation. Thus, spectral assignments were performed by associating informative vectors of each specific modeled class to the respective drugs. The developed method is simple, fast, and can be applied to the forensic laboratory routine, leading to objective results reports useful for forensic scientists and law enforcement. Copyright © 2018 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Larasati, Ophilia; Puspita Dirgahayani, Eng., Dr.
2018-05-01
Transport services are essential to support daily life. A lack of transport supply leads to the existence of transport disadvantaged (TDA) groups who are vulnerable to social exclusion, which happens when a particular group or individual is having difficulties to access certain activities that are considered normal in society. To tackle this phenomenon, the understanding of the influence of TDA variables on social exclusion is needed. The aim of this study is to analyze the influences of TDA variables on social exclusion in a rural context, with Cibeureum Village (Bandung Barat Regency) and Bunikasih Village (Subang Regency) as the study case. Both case studies provide different characteristics of accessibility. Partial Least Squares (PLS) Structural Equation Modeling (SEM) is chosen as the method to analyze the influences of TDA variables on social exclusion. The PLS-SEM model is developed according to the social exclusion variable and four TDA variables, i.e., accessibility, individual characteristics, private vehicle existence, and travel behavior. IPMA is done after the PLS-SEM model is evaluated. The study reveals that among four of the TDA variables, accessibility has the most influence on social exclusion, hence interventions related to improving accessibility are needed to tackle social exclusion. More specifically, the provision of alternative modes is needed in both study areas, while in Bunikasih Village the cost of travel is also an important variable to consider.
Jiang, Wei; Zhou, Chengfeng; Han, Guangting; Via, Brian; Swain, Tammy; Fan, Zhaofei; Liu, Shaoyang
2017-01-01
Plant fibrous material is a good resource for the textile and other industries. Normally, the several kinds of plant fibrous materials used in one process need to be identified and characterized in advance. They are easy to identify in raw condition. However, most of the materials are semi-finished products that have been ground, rotted or pre-hydrolyzed. Classifying such samples, which include different species, with high accuracy is a big challenge. In this research, both qualitative and quantitative analysis methods were chosen to classify six different species of samples, including softwood, hardwood, bast, and aquatic plant. Soft Independent Modeling of Class Analogy (SIMCA) and partial least squares (PLS) were used. The algorithm to classify different species of samples using PLS was created independently in this research. The results showed that the six species can be successfully classified using the SIMCA and PLS methods, and that the two methods give similar results. The identification rates of kenaf, ramie and pine are 100%, and the identification rates of lotus, eucalyptus and tallow are higher than 94%. It was also found that spectral loadings can help select the best wavenumber ranges for constructing the NIR model. Inter-material distance shows how close two species are. The scores graph is helpful for choosing the number of principal components during model construction. PMID:28105037
Geographical provenance of palm oil by fatty acid and volatile compound fingerprinting techniques.
Tres, A; Ruiz-Samblas, C; van der Veer, G; van Ruth, S M
2013-04-15
Analytical methods are required in addition to administrative controls to verify the geographical origin of vegetable oils such as palm oil in an objective manner. In this study the application of fatty acid and volatile organic compound fingerprinting in combination with chemometrics have been applied to verify the geographical origin of crude palm oil (continental scale). For this purpose 94 crude palm oil samples were collected from South East Asia (55), South America (11) and Africa (28). Partial least squares discriminant analysis (PLS-DA) was used to develop a hierarchical classification model by combining two consecutive binary PLS-DA models. First, a PLS-DA model was built to distinguish South East Asian from non-South East Asian palm oil samples. Then a second model was developed, only for the non-Asian samples, to discriminate African from South American crude palm oil. Models were externally validated by using them to predict the identity of new authentic samples. The fatty acid fingerprinting model revealed three misclassified samples. The volatile compound fingerprinting models showed an 88%, 100% and 100% accuracy for the South East Asian, African and American class, respectively. The verification of the geographical origin of crude palm oil is feasible by fatty acid and volatile compound fingerprinting. Further research is required to further validate the approach and to increase its spatial specificity to country/province scale. Copyright © 2012 Elsevier Ltd. All rights reserved.
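The hierarchical scheme described here, two consecutive binary models in which the second is applied only to samples the first rejects, can be sketched as a small decision function. The code below is an illustration under assumptions: the two "models" are toy threshold rules standing in for fitted PLS-DA models, and all names (`classify_origin`, `toy_sea_model`, `toy_africa_model`, the score keys) are hypothetical.

```python
def classify_origin(sample, is_south_east_asian, is_african):
    """Two-stage binary cascade: stage 2 runs only on non-Asian samples."""
    if is_south_east_asian(sample):
        return "South East Asia"
    return "Africa" if is_african(sample) else "South America"

# Toy stand-ins for the two fitted binary models (assumed, not from the paper).
def toy_sea_model(sample):
    return sample["score_sea"] > 0.5

def toy_africa_model(sample):
    return sample["score_africa"] > 0.5

samples = [
    {"score_sea": 0.9, "score_africa": 0.1},
    {"score_sea": 0.2, "score_africa": 0.8},
    {"score_sea": 0.1, "score_africa": 0.3},
]
predicted = [classify_origin(s, toy_sea_model, toy_africa_model) for s in samples]
print(predicted)
```

One consequence of this design is that an error in the first stage cannot be corrected by the second, so the first model's accuracy bounds the accuracy of the downstream classes.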
Wang, Shenghao; Zhang, Yuyan; Cao, Fuyi; Pei, Zhenying; Gao, Xuewei; Zhang, Xu; Zhao, Yong
2018-02-13
This paper presents a novel spectrum analysis tool named synergy adaptive moving window modeling based on immune clone algorithm (SA-MWM-ICA), motivated by the tedious and inconvenient labor involved in selecting pre-processing methods and spectral variables by prior experience. In this work, the immune clone algorithm is introduced into the spectrum analysis field for the first time as a new optimization strategy, addressing the shortcomings of the relatively traditional methods. Based on the working principle of the human immune system, the performance of the quantitative model is regarded as the antigen, and a special vector corresponding to this antigen is regarded as the antibody. The antibody contains a pre-processing method optimization region, which is made up of 11 decimal digits, and a spectrum variable optimization region, which is formed by several moving windows with changeable width and position. A set of original antibodies is created by modeling with this algorithm. After the affinity of these antibodies is calculated, those with high affinity are selected for cloning. The rule for cloning is that the higher the affinity, the more copies are made. In the next step, another important operation named hyper-mutation is applied to the antibodies after cloning. The rule for hyper-mutation is that the lower the affinity, the higher the probability of mutation. Several antibodies with high affinity are created on the basis of these steps. Groups of simulated datasets, a gasoline near-infrared spectra dataset, and a soil near-infrared spectra dataset are employed to verify and illustrate the performance of SA-MWM-ICA.
Analysis results show that the quantitative models obtained with SA-MWM-ICA perform better, especially for structures with relatively complex spectra, than traditional models such as partial least squares (PLS), moving window PLS (MWPLS), genetic algorithm PLS (GAPLS), and pretreatment method classification and adjustable parameter changeable size moving window PLS (CA-CSMWPLS). The selected pre-processing methods and spectrum variables are easily explained. The proposed method converges in a few generations and can be used not only for near-infrared spectroscopy analysis but also for other similar spectral analysis, such as infrared spectroscopy. Copyright © 2017 Elsevier B.V. All rights reserved.
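The two rules stated in this abstract, more clones for higher affinity and a higher mutation probability for lower affinity, can be sketched as follows. This is a simplified illustration and not the SA-MWM-ICA implementation: antibodies are reduced to plain bit lists rather than the decimal-digit and moving-window encoding the paper describes, and the proportional and linear scaling rules, the function names and the `max_rate` parameter are all assumptions.

```python
import random

def clone_counts(affinities, total_clones):
    """Allocate clones in proportion to affinity (at least one per antibody)."""
    total = sum(affinities)
    return [max(1, round(total_clones * a / total)) for a in affinities]

def mutation_rates(affinities, max_rate=0.5):
    """Lower affinity -> higher per-position mutation probability."""
    best = max(affinities)
    return [max_rate * (1 - a / best) for a in affinities]

def hypermutate(bits, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]

# Toy affinities (e.g. derived from model error); values are illustrative only.
affinities = [1.0, 3.0]
rng = random.Random(0)
print("clones per antibody:", clone_counts(affinities, total_clones=8))
print("mutation rates:", mutation_rates(affinities))
print("mutated copy:", hypermutate([0, 1, 1, 0], rate=0.25, rng=rng))
```

The combined effect is that good candidates are exploited through many lightly perturbed copies while poor candidates are explored more aggressively.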
Zhang, Xuan; Li, Wei; Yin, Bin; Chen, Weizhong; Kelly, Declan P; Wang, Xiaoxin; Zheng, Kaiyi; Du, Yiping
2013-10-01
Coffee is the most heavily consumed beverage in the world after water, for which quality is a key consideration in commercial trade. Therefore, caffeine content, which has a significant effect on the final quality of coffee products, needs to be determined quickly and reliably by new analytical techniques. The main purpose of this work was to establish a powerful and practical analytical method based on near infrared spectroscopy (NIRS) and chemometrics for quantitative determination of caffeine content in roasted Arabica coffees. Ground coffee samples covering a wide range of roast levels were analyzed by NIR, while their caffeine contents were quantitatively determined by the most commonly used HPLC-UV method to provide reference values. Calibration models based on chemometric analyses of the NIR spectral data and reference concentrations of the coffee samples were then developed. Partial least squares (PLS) regression was used to construct the models. Furthermore, diverse spectra pretreatment and variable selection techniques were applied in order to obtain robust and reliable reduced-spectrum regression models. Comparing the respective quality of the different models constructed, the application of second derivative pretreatment and stability competitive adaptive reweighted sampling (SCARS) variable selection provided a notably improved regression model, with root mean square error of cross validation (RMSECV) of 0.375 mg/g and correlation coefficient (R) of 0.918 at 7 PLS factors. An independent test set was used to assess the model, with the root mean square error of prediction (RMSEP) of 0.378 mg/g, mean relative error of 1.976% and mean relative standard deviation (RSD) of 1.707%.
Thus, the results provided by the high-quality calibration model revealed the feasibility of NIR spectroscopy for at-line application to predict the caffeine content of unknown roasted coffee samples, thanks to the short analysis time of a few seconds and non-destructive advantages of NIRS. Copyright © 2013 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Zhang, Xuan; Li, Wei; Yin, Bin; Chen, Weizhong; Kelly, Declan P.; Wang, Xiaoxin; Zheng, Kaiyi; Du, Yiping
2013-10-01
Coffee is the most heavily consumed beverage in the world after water, for which quality is a key consideration in commercial trade. Therefore, caffeine content, which has a significant effect on the final quality of coffee products, needs to be determined quickly and reliably by new analytical techniques. The main purpose of this work was to establish a powerful and practical analytical method based on near infrared spectroscopy (NIRS) and chemometrics for quantitative determination of caffeine content in roasted Arabica coffees. Ground coffee samples covering a wide range of roast levels were analyzed by NIR, while their caffeine contents were quantitatively determined by the most commonly used HPLC-UV method to provide reference values. Calibration models based on chemometric analyses of the NIR spectral data and reference concentrations of the coffee samples were then developed. Partial least squares (PLS) regression was used to construct the models. Furthermore, diverse spectra pretreatment and variable selection techniques were applied in order to obtain robust and reliable reduced-spectrum regression models. Comparing the respective quality of the different models constructed, the application of second derivative pretreatment and stability competitive adaptive reweighted sampling (SCARS) variable selection provided a notably improved regression model, with root mean square error of cross validation (RMSECV) of 0.375 mg/g and correlation coefficient (R) of 0.918 at 7 PLS factors. An independent test set was used to assess the model, with the root mean square error of prediction (RMSEP) of 0.378 mg/g, mean relative error of 1.976% and mean relative standard deviation (RSD) of 1.707%.
Thus, the results provided by the high-quality calibration model revealed the feasibility of NIR spectroscopy for at-line application to predict the caffeine content of unknown roasted coffee samples, thanks to the short analysis time of a few seconds and non-destructive advantages of NIRS.
NASA Astrophysics Data System (ADS)
Belal, F.; Ibrahim, F.; Sheribah, Z. A.; Alaa, H.
2018-06-01
In this paper, novel univariate and multivariate regression methods along with a model-updating technique were developed and validated for the simultaneous determination of a quaternary mixture of imatinib (IMB), gemifloxacin (GMI), nalbuphine (NLP) and naproxen (NAP). The univariate method is extended derivative ratio (EDR), which depends on measuring every drug in the quaternary mixture by using a ternary mixture of the other three drugs as divisor. Peak amplitudes were measured at 294 nm, 250 nm, 283 nm and 239 nm within linear concentration ranges of 4.0-17.0, 3.0-15.0, 4.0-80.0 and 1.0-6.0 μg mL⁻¹ for IMB, GMI, NLP and NAP, respectively. The multivariate methods adopted are partial least squares (PLS) in original and derivative mode. These models were constructed for simultaneous determination of the studied drugs in the ranges of 4.0-8.0, 3.0-11.0, 10.0-18.0 and 1.0-3.0 μg mL⁻¹ for IMB, GMI, NLP and NAP, respectively, by using eighteen mixtures as a calibration set and seven mixtures as a validation set. The root mean square errors of prediction (RMSEP) were 0.09 and 0.06 for IMB, 0.14 and 0.13 for GMI, 0.07 and 0.02 for NLP and 0.64 and 0.27 for NAP by PLS in original and derivative mode, respectively. Both models were successfully applied for analysis of IMB, GMI, NLP and NAP in their dosage forms. Updated PLS in derivative mode and EDR were applied for determination of the studied drugs in spiked human urine. The obtained results were statistically compared with those obtained by the reported methods, leading to the conclusion that there is no significant difference regarding accuracy and precision.
Belal, F; Ibrahim, F; Sheribah, Z A; Alaa, H
2018-06-05
In this paper, novel univariate and multivariate regression methods along with a model-updating technique were developed and validated for the simultaneous determination of a quaternary mixture of imatinib (IMB), gemifloxacin (GMI), nalbuphine (NLP) and naproxen (NAP). The univariate method is extended derivative ratio (EDR), which depends on measuring every drug in the quaternary mixture by using a ternary mixture of the other three drugs as divisor. Peak amplitudes were measured at 294 nm, 250 nm, 283 nm and 239 nm within linear concentration ranges of 4.0-17.0, 3.0-15.0, 4.0-80.0 and 1.0-6.0 μg mL⁻¹ for IMB, GMI, NLP and NAP, respectively. The multivariate methods adopted are partial least squares (PLS) in original and derivative mode. These models were constructed for simultaneous determination of the studied drugs in the ranges of 4.0-8.0, 3.0-11.0, 10.0-18.0 and 1.0-3.0 μg mL⁻¹ for IMB, GMI, NLP and NAP, respectively, by using eighteen mixtures as a calibration set and seven mixtures as a validation set. The root mean square errors of prediction (RMSEP) were 0.09 and 0.06 for IMB, 0.14 and 0.13 for GMI, 0.07 and 0.02 for NLP and 0.64 and 0.27 for NAP by PLS in original and derivative mode, respectively. Both models were successfully applied for analysis of IMB, GMI, NLP and NAP in their dosage forms. Updated PLS in derivative mode and EDR were applied for determination of the studied drugs in spiked human urine. The obtained results were statistically compared with those obtained by the reported methods, leading to the conclusion that there is no significant difference regarding accuracy and precision. Copyright © 2018 Elsevier B.V. All rights reserved.
Delwiche, Stephen R; Reeves, James B
2010-01-01
In multivariate regression analysis of spectroscopy data, spectral preprocessing is often performed to reduce unwanted background information (offsets, sloped baselines) or accentuate absorption features in intrinsically overlapping bands. These procedures, also known as pretreatments, are commonly smoothing operations or derivatives. While such operations are often useful in reducing the number of latent variables of the actual decomposition and lowering residual error, they also run the risk of misleading the practitioner into accepting calibration equations that are poorly adapted to samples outside of the calibration. The current study developed a graphical method to examine this effect on partial least squares (PLS) regression calibrations of near-infrared (NIR) reflection spectra of ground wheat meal with two analytes, protein content and sodium dodecyl sulfate sedimentation (SDS) volume (an indicator of the quantity of the gluten proteins that contribute to strong doughs). These two properties were chosen because of their differing abilities to be modeled by NIR spectroscopy: excellent for protein content, fair for SDS sedimentation volume. To further demonstrate the potential pitfalls of preprocessing, an artificial component, a randomly generated value, was included in PLS regression trials. Savitzky-Golay (digital filter) smoothing, first-derivative, and second-derivative preprocess functions (5 to 25 centrally symmetric convolution points, derived from quadratic polynomials) were applied to PLS calibrations of 1 to 15 factors. The results demonstrated the danger of an overreliance on preprocessing when (1) the number of samples used in a multivariate calibration is low (<50), (2) the spectral response of the analyte is weak, and (3) the goodness of the calibration is based on the coefficient of determination (R²) rather than a term based on residual error. The graphical method has application to the evaluation of other preprocess functions and various types of spectroscopy data.
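The Savitzky-Golay pretreatments described above are convolutions with fixed, centrally symmetric weights. The sketch below illustrates only the smallest case mentioned (5 points, quadratic polynomial) using the classic tabulated weights; the helper name `convolve_centered` is hypothetical, unit sample spacing is assumed, and edge points are simply dropped rather than padded.

```python
# Classic 5-point quadratic Savitzky-Golay weights (unit sample spacing).
SMOOTH_5 = [-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35]
DERIV1_5 = [-2 / 10, -1 / 10, 0.0, 1 / 10, 2 / 10]

def convolve_centered(signal, weights):
    """Apply centrally symmetric weights; the first and last
    len(weights) // 2 points are omitted from the output."""
    half = len(weights) // 2
    out = []
    for i in range(half, len(signal) - half):
        window = signal[i - half:i + half + 1]
        out.append(sum(w * v for w, v in zip(weights, window)))
    return out

# A pure quadratic is reproduced exactly by quadratic-polynomial smoothing,
# and its first derivative comes out as 2 * x at each interior point.
quadratic = [float(x * x) for x in range(7)]
smoothed = convolve_centered(quadratic, SMOOTH_5)
slope = convolve_centered(quadratic, DERIV1_5)
print(smoothed, slope)
```

Wider windows (up to the 25 points used in the study) smooth more aggressively, which is part of why the abstract warns against judging the result by R² alone.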
Measurement of single soybean seed attributes by near infrared technologies. A comparative study
USDA-ARS's Scientific Manuscript database
Four near infrared spectrophotometers, and their associated spectral collection methods, were tested and compared for measuring three soybean single seed attributes: weight (g), protein (%), and oil (%). Using partial least squares (PLS) and 4 preprocessing methods, the attribute which was significa...
Jović, Ozren
2016-12-15
A novel method for quantitative prediction and variable selection on spectroscopic data, called Durbin-Watson partial least-squares regression (dwPLS), is proposed in this paper. The idea is to inspect serial correlation in infrared data, which is known to consist of highly correlated neighbouring variables. The method selects only those variables whose intervals have a lower Durbin-Watson statistic (dw) than a certain optimal cutoff. For each interval, dw is calculated on a vector of regression coefficients. Adulteration of cold-pressed linseed oil (L), a well-known nutrient beneficial to health, is studied in this work by mixing it with cheaper oils: rapeseed oil (R), sesame oil (Se) and sunflower oil (Su). The samples for each botanical origin of oil vary with respect to producer, content and geographic origin. The results obtained indicate that MIR-ATR combined with dwPLS could be used for the quantitative determination of edible-oil adulteration. Copyright © 2016 Elsevier Ltd. All rights reserved.
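The interval screening described above can be sketched with the standard Durbin-Watson formula applied to a coefficient vector. This is an illustration under stated assumptions, not the authors' implementation: the fixed interval width, the cutoff value and the function names are hypothetical.

```python
def durbin_watson(values):
    """Sum of squared successive differences divided by the sum of squares."""
    denominator = sum(v * v for v in values)
    if denominator == 0:
        raise ValueError("statistic is undefined for an all-zero vector")
    numerator = sum((b - a) ** 2 for a, b in zip(values, values[1:]))
    return numerator / denominator

def select_intervals(coefficients, width, cutoff):
    """Return (start, stop) index pairs of intervals whose dw is below cutoff."""
    kept = []
    for start in range(0, len(coefficients) - width + 1, width):
        chunk = coefficients[start:start + width]
        if any(chunk) and durbin_watson(chunk) < cutoff:
            kept.append((start, start + width))
    return kept

# Hypothetical regression coefficients: a smooth band followed by a noisy one.
coefficients = [1.0, 1.1, 1.2, 1.3, 0.5, -0.5, 0.5, -0.5]
print(select_intervals(coefficients, width=4, cutoff=1.0))
```

Smoothly varying coefficients give a dw near zero, while coefficients that flip sign between neighbouring variables push dw up, so a low cutoff keeps the serially correlated regions.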
NASA Astrophysics Data System (ADS)
Fu, Y.; Yang, W.; Xu, O.; Zhou, L.; Wang, J.
2017-04-01
To investigate time-variant and nonlinear characteristics in industrial processes, a soft sensor modelling method based on time difference, moving-window recursive partial least square (PLS) and adaptive model updating is proposed. In this method, time difference values of input and output variables are used as training samples to construct the model, which can reduce the effects of the nonlinear characteristic on modelling accuracy and retain the advantages of recursive PLS algorithm. To solve the high updating frequency of the model, a confidence value is introduced, which can be updated adaptively according to the results of the model performance assessment. Once the confidence value is updated, the model can be updated. The proposed method has been used to predict the 4-carboxy-benz-aldehyde (CBA) content in the purified terephthalic acid (PTA) oxidation reaction process. The results show that the proposed soft sensor modelling method can reduce computation effectively, improve prediction accuracy by making use of process information and reflect the process characteristics accurately.
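The two data-handling ideas in this abstract, time differencing and a moving window of recent samples, can be illustrated in a few lines. The sketch below is only a rough illustration: it omits the recursive PLS update and the confidence-value logic entirely, and the function names and readings are hypothetical.

```python
from collections import deque

def time_differences(series, lag=1):
    """Return x(t) - x(t - lag) for every t where both samples exist."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

def moving_windows(samples, size):
    """Yield the most recent `size` samples after each new arrival."""
    window = deque(maxlen=size)
    for sample in samples:
        window.append(sample)
        if len(window) == size:
            yield list(window)

# Hypothetical process readings and the same readings with a constant offset.
readings = [10.0, 10.5, 11.5, 13.0, 15.0]
offset_readings = [v + 100.0 for v in readings]

# Differencing removes the constant offset, so both series give the same input.
print(time_differences(readings), time_differences(offset_readings))
print(list(moving_windows(time_differences(readings), size=3)))
```

In a soft sensor of this kind, each window of differenced samples would be the training set for one model update; here the windows are only printed.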
Consistent Partial Least Squares Path Modeling via Regularization
Jung, Sunho; Park, JaeHong
2018-01-01
Partial least squares (PLS) path modeling is a component-based structural equation modeling that has been adopted in social and psychological research due to its data-analytic capability and flexibility. A recent methodological advance is consistent PLS (PLSc), designed to produce consistent estimates of path coefficients in structural models involving common factors. In practice, however, PLSc may frequently encounter multicollinearity in part because it takes a strategy of estimating path coefficients based on consistent correlations among independent latent variables. PLSc has yet no remedy for this multicollinearity problem, which can cause loss of statistical power and accuracy in parameter estimation. Thus, a ridge type of regularization is incorporated into PLSc, creating a new technique called regularized PLSc. A comprehensive simulation study is conducted to evaluate the performance of regularized PLSc as compared to its non-regularized counterpart in terms of power and accuracy. The results show that our regularized PLSc is recommended for use when serious multicollinearity is present. PMID:29515491
Grisales, Jaiver Osorio; Arancibia, Juan A; Castells, Cecilia B; Olivieri, Alejandro C
2012-12-01
In this report, we demonstrate how chiral liquid chromatography combined with multivariate chemometric techniques, specifically unfolded-partial least-squares regression (U-PLS), provides a powerful analytical methodology. Using U-PLS, strongly overlapped enantiomer profiles in a sample could be successfully processed and enantiomeric purity could be accurately determined without requiring baseline enantioresolution between peaks. The samples were partially enantioseparated with a permethyl-β-cyclodextrin chiral column under reversed-phase conditions. Signals detected with a diode-array detector within a wavelength range from 198 to 241 nm were recorded, and the data were processed by a second-order multivariate algorithm to decrease detection limits. The R-(-)-enantiomer of ibuprofen in tablet formulation samples could be determined at the level of 0.5 mg L⁻¹ in the presence of 99.9% of the S-(+)-enantiomorph with relative prediction error within ±3%. Copyright © 2012 Elsevier B.V. All rights reserved.
Liu, Changhong; Liu, Wei; Chen, Wei; Yang, Jianbo; Zheng, Lei
2015-04-15
Tomato is an important health-stimulating fruit because of the antioxidant properties of its main bioactive compounds, dominantly lycopene and phenolic compounds. Nowadays, product differentiation in the fruit market requires an accurate evaluation of these value-added compounds. An experiment was conducted to simultaneously and non-destructively measure lycopene and phenolic compounds content in intact tomatoes using multispectral imaging combined with chemometric methods. Partial least squares (PLS), least squares-support vector machines (LS-SVM) and back propagation neural network (BPNN) were applied to develop quantitative models. Compared with PLS and LS-SVM, BPNN model considerably improved the performance with coefficient of determination in prediction (RP(2))=0.938 and 0.965, residual predictive deviation (RPD)=4.590 and 9.335 for lycopene and total phenolics content prediction, respectively. It is concluded that multispectral imaging is an attractive alternative to the standard methods for determination of bioactive compounds content in intact tomatoes, providing a useful platform for infield fruit sorting/grading. Copyright © 2014 Elsevier Ltd. All rights reserved.
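The two figures of merit reported above can be computed from reference and predicted values. The sketch assumes common chemometric definitions, which the abstract does not spell out: the coefficient of determination in prediction as one minus the ratio of residual to total sum of squares, and RPD as the sample standard deviation of the reference values divided by the root mean square error of prediction. The function name and numbers are hypothetical.

```python
import math
import statistics

def prediction_metrics(reference, predicted):
    """Return R2 in prediction and RPD for a prediction set."""
    n = len(reference)
    mean_ref = sum(reference) / n
    residual_ss = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    total_ss = sum((r - mean_ref) ** 2 for r in reference)
    rmse = math.sqrt(residual_ss / n)
    return {
        "r2_p": 1.0 - residual_ss / total_ss,
        "rpd": statistics.stdev(reference) / rmse,
    }

# Hypothetical reference and predicted contents, not data from the paper.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 1.5, 3.5, 3.5, 5.5]
print(prediction_metrics(reference, predicted))
```

Because RPD scales the error by the spread of the reference values, it allows models for analytes with different ranges, such as lycopene and total phenolics, to be compared on one scale.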
Sundbom, E; Jeanneau, M
1996-03-01
The main aim of the study is to establish an empirical connection between perceptual defences as measured by the Defense Mechanism Test (DMT)--a projective percept-genetic method--and manifest linguistic expressions based on word pattern analyses. The subjects were 25 psychiatric patients with the diagnoses neurotic personality organization (NPO), borderline personality organization (BPO) and psychotic personality organization (PPO) in accordance with Kernberg's theory. A set of 130 DMT variables and 40 linguistic variables were analyzed by means of partial least squares (PLS) discriminant analysis separately and then pooled together. The overall hypothesis was that it would be possible to define the personality organization of the patients in terms of an amalgam of perceptual defences and word patterns, and that these two kinds of data would confirm each other. The result of the combined PLS analysis revealed a very good separation between the diagnostic groups as measured by the pooled variable sets. Among other things, it was shown that NPO patients are principally characterized by linguistic variables, whereas BPO and PPO patients are better defined by perceptual defences as measured by the DMT method.
[NIR Fingerprints of Different Medicinal Parts of Angelicae Sinensis Radix].
Zhang, Ya-ya; Gu, Zhi-rong; Ding, Jun-xia; Wang, Yao-peng; Sun, Yu-jing; Wang, Ya-li
2015-07-01
This study investigated the spectral characteristics of the near-infrared diffuse reflectance spectroscopy (NIR) fingerprints of different medicinal parts of Angelicae Sinensis Radix. Ninety-six batches of samples were collected from 14 counties of Gansu Province and Yunnan Province. The NIR fingerprints were collected using an integrating sphere. Similarity analysis and partial least squares discriminant analysis (PLS-DA) were used to analyze the fingerprints. The average NIR fingerprint spectra of the different medicinal parts of Angelicae Sinensis Radix showed some differences; the absorbance at the characteristic absorption bands decreased in the order body > tail > head > whole. Most NIR fingerprint similarities between the different medicinal parts exceeded 0.95. The established PLS-DA model could accurately classify the medicinal parts of Angelicae Sinensis Radix. The differences between the NIR fingerprints of the different medicinal parts lay mainly in the wavenumber ranges of 8,443-8,284 cm⁻¹, 7,003-6,896 cm⁻¹, 6,102-5,864 cm⁻¹, 4,847-4,674 cm⁻¹, and 4,386-4,208 cm⁻¹. The different medicinal parts of Angelicae Sinensis Radix differ somewhat in their chemical components.
Beauclercq, Stéphane; Nadal-Desbarats, Lydie; Hennequet-Antier, Christelle; Gabriel, Irène; Tesseraud, Sophie; Calenge, Fanny; Le Bihan-Duval, Elisabeth; Mignon-Grasteau, Sandrine
2018-04-27
The increasing cost of conventional feedstuffs has bolstered interest in genetic selection for digestive efficiency (DE), a component of feed efficiency, assessed by apparent metabolisable energy corrected to zero nitrogen retention (AMEn). However, its measurement is time-consuming and constraining, and its relationship with metabolic efficiency is poorly understood. To simplify selection for this trait, we searched for indirect metabolic biomarkers through an analysis of the serum metabolome using nuclear magnetic resonance (¹H NMR). A partial least squares (PLS) model including six amino acids and two derivatives from butyrate predicted 59% of AMEn variability. Moreover, to increase our knowledge of the molecular mechanisms controlling DE, we investigated ¹H NMR metabolomes of ileal, caecal, and serum contents by fitting canonical sparse PLS. This analysis revealed strong associations between metabolites and DE. Models based on the ileal, caecal, and serum metabolome respectively explained 77%, 78%, and 74% of the variability of AMEn and its constitutive components (utilisation of starch, lipids, and nitrogen). In our conditions, the metabolites presenting the strongest associations with AMEn were proline in the serum, fumarate in the ileum and glucose in caeca. This study shows that serum metabolomics offers new opportunities to predict chicken DE.
Exploring the influence of encoding format on subsequent memory.
Turney, Indira C; Dennis, Nancy A; Maillet, David; Rajah, M Natasha
2017-05-01
Distinctive encoding is greatly influenced by gist-based processes and has been shown to suffer when highly similar items are presented in close succession. Thus, elucidating the mechanisms underlying how presentation format affects gist processing is essential in determining the factors that influence these encoding processes. The current study utilised multivariate partial least squares (PLS) analysis to identify encoding networks directly associated with retrieval performance in a blocked and intermixed presentation condition. Subsequent memory analysis for successfully encoded items indicated no significant differences between reaction time and retrieval performance and presentation format. Despite no significant behavioural differences, behaviour PLS revealed differences in brain-behaviour correlations and mean condition activity in brain regions associated with gist-based vs. distinctive encoding. Specifically, the intermixed format encouraged more distinctive encoding, showing increased activation of regions associated with strategy use and visual processing (e.g., frontal and visual cortices, respectively). Alternatively, the blocked format exhibited increased gist-based processes, accompanied by increased activity in the right inferior frontal gyrus. Together, results suggest that the sequence that information is presented during encoding affects the degree to which distinctive encoding is engaged. These findings extend our understanding of the Fuzzy Trace Theory and the role of presentation format on encoding processes.
Wei, Zhenbo; Wang, Jun; Ye, Linshuang
2011-08-15
A voltammetric electronic tongue (VE-tongue) was developed to discriminate the difference between Chinese rice wines in this research. Three types of Chinese rice wine with different marked ages (1, 3, and 5 years) were classified by the VE-tongue by principal component analysis (PCA) and cluster analysis (CA). The VE-tongue consisted of six working electrodes (gold, silver, platinum, palladium, tungsten, and titanium) in a standard three-electrode configuration. The multi-frequency large amplitude pulse voltammetry (MLAPV), which consisted of four segments of 1 Hz, 10 Hz, 100 Hz, and 1000 Hz, was applied as the potential waveform. The three types of Chinese rice wine could be classified accurately by PCA and CA, and some interesting regularity is shown in the score plots with the help of PCA. Two regression models, partial least squares (PLS) and back-error propagation-artificial neural network (BP-ANN), were used for wine age prediction. The regression results showed that the marked ages of the three types of Chinese rice wine were successfully predicted using PLS and BP-ANN. Copyright © 2011 Elsevier B.V. All rights reserved.
Devrim, Burcu; Dinç, Erdal; Bozkir, Asuman
2014-01-01
Diphenhydramine hydrochloride (DPH), a histamine H1-receptor antagonist, is widely used as antiallergic, antiemetic and antitussive drug found in many pharmaceutical preparations. In this study, a new reconstitutable syrup formulation of DPH was prepared because it is more stable in solid form than that in liquid form. The quantitative estimation of the DPH content of a reconstitutable syrup formulation in the presence of pharmaceutical excipients, D-sorbitol, sodium citrate, sodium benzoate and sodium EDTA is not possible by the direct absorbance measurement. Therefore, a signal processing approach based on continuous wavelet transform was used to determine the DPH in the reconstitutable syrup formulations and to eliminate the effect of excipients on the analysis. The absorption spectra of DPH in the range of 5.0-40.0 μg/mL were recorded between 200-300 nm. Various wavelet families were tested and Biorthogonal1.1 continuous wavelet transform (BIOR1.1-CWT) was found to be optimal signal processing family to get fast and desirable determination results and to overcome excipient interference effects. For a comparison of the experimental results obtained by partial least squares (PLS) and principal component regression (PCR) methods were applied to the quantitative prediction of DPH in the mentioned samples. The validity of the proposed BIOR1.1-CWT, PLS and PCR methods were achieved analyzing the prepared samples containing the mentioned excipients and using standard addition technique. It was observed that the proposed graphical and numerical approaches are suitable for the quantitative analysis of DPH in samples including excipients.
Determinants of caregivers' awareness of Universal Newborn Hearing Screening in Malaysia.
Abdul Majid, Abdul Halim; Zakaria, Mohd Normani; Abdullah, Nor Azimah Chew; Hamzah, Sulaiman; Mukari, Siti Zamratol-Mai Sarah
2017-10-01
This paper aims to investigate the effects of perceived attitude and anxiety on awareness of UNHS among caregivers in Malaysia. Using cross sectional research approach, data were collected and some 46 out of 87 questionnaires distributed to caregivers attending UNHS programs at selected public hospitals were usable for analysis (response rate of 52.8%). Partial Least Squares Method (PLS) algorithm and bootstrapping technique were employed to test the hypotheses of the study. R square value is 0.205, and it implies that exogenous latent variables explained 21% of the variance of the endogenous latent variable. This value indicates moderate and acceptable level of R-squared values. Findings from PLS structural model evaluation revealed that anxiety has no significant influence (β = -0.091, t = 0.753, p > 0.10) on caregivers' awareness; but perceived attitude has significant effect (β = -0.444, t = 3.434, p < 0.01) on caregivers' awareness. Caregivers' awareness of UNHS is influenced by their perceived attitude while anxiety is not associated with caregivers' awareness. This implies that caregivers may not believe in early detection of hearing impairment in children, thinking that their babies are too young to be tested for hearing loss. Moreover, socio-economic situation of the caregivers may have contributed to their failure to honor UNHS screening appointments as some of them may need to work to earn a living while some may perceive it a waste of time honoring such appointments. Non-significant relationship between anxiety and caregivers' awareness may be due to religious beliefs of caregivers. Limitations and suggestions were discussed. Copyright © 2017 Elsevier B.V. All rights reserved.
Balabin, Roman M; Lomakina, Ekaterina I
2011-04-21
In this study, we make a general comparison of the accuracy and robustness of five multivariate calibration models: partial least squares (PLS) regression or projection to latent structures, polynomial partial least squares (Poly-PLS) regression, artificial neural networks (ANNs), and two novel techniques based on support vector machines (SVMs) for multivariate data analysis: support vector regression (SVR) and least-squares support vector machines (LS-SVMs). The comparison is based on fourteen (14) different datasets: seven sets of gasoline data (density, benzene content, and fractional composition/boiling points), two sets of ethanol gasoline fuel data (density and ethanol content), one set of diesel fuel data (total sulfur content), three sets of petroleum (crude oil) macromolecules data (weight percentages of asphaltenes, resins, and paraffins), and one set of petroleum resins data (resins content). Vibrational (near-infrared, NIR) spectroscopic data are used to predict the properties and quality coefficients of gasoline, biofuel/biodiesel, diesel fuel, and other samples of interest. The four systems presented here range greatly in composition, properties, strength of intermolecular interactions (e.g., van der Waals forces, H-bonds), colloid structure, and phase behavior. Due to the high diversity of chemical systems studied, general conclusions about SVM regression methods can be made. We try to answer the following question: to what extent can SVM-based techniques replace ANN-based approaches in real-world (industrial/scientific) applications? The results show that both SVR and LS-SVM methods are comparable to ANNs in accuracy. Due to the much higher robustness of the former, the SVM-based approaches are recommended for practical (industrial) application. This has been shown to be especially true for complicated, highly nonlinear objects.
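PLS regression is the linear baseline against which the nonlinear models above are compared. For readers unfamiliar with it, here is a minimal single-response (PLS1) sketch using the NIPALS deflation scheme in plain Python. It is not the software used in the study; the function names and the toy data are hypothetical, and a real analysis would normally rely on an established library.

```python
import math

def pls1_fit(X, y, n_components):
    """Fit PLS1 by NIPALS; returns the means and (weight, loading, q) per component."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centred X
    f = [v - y_mean for v in y]                                # centred y
    components = []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:          # nothing left in y to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y by the part explained by this component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components

def pls1_predict(model, x):
    """Predict one sample by repeating the same deflation steps on it."""
    x_mean, y_mean, components = model
    r = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in components:
        t = sum(rj * wj for rj, wj in zip(r, w))
        y_hat += q * t
        r = [rj - t * lj for rj, lj in zip(r, load)]
    return y_hat

# Toy data with an exact linear relation y = 2*x1 - x2 + 1 (hypothetical).
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0]]
y = [2 * a - b + 1 for a, b in X]
model = pls1_fit(X, y, n_components=2)
print(pls1_predict(model, [5.0, 0.0]))
```

With as many components as the rank of the centred predictor matrix, PLS coincides with ordinary least squares; using fewer components is what gives PLS its regularizing effect on collinear spectra.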
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tripathi, Markandey M.; Krishnan, Sundar R.; Srinivasan, Kalyan K.
Chemiluminescence emissions from OH*, CH*, C2, and CO2 formed within the reaction zone of premixed flames depend upon the fuel-air equivalence ratio in the burning mixture. In the present paper, a new partial least square regression (PLS-R) based multivariate sensing methodology is investigated and compared with an OH*/CH* intensity ratio-based calibration model for sensing equivalence ratio in atmospheric methane-air premixed flames. Five replications of spectral data at nine different equivalence ratios ranging from 0.73 to 1.48 were used in the calibration of both models. During model development, the PLS-R model was initially validated with the calibration data set using the leave-one-out cross validation technique. Since the PLS-R model used the entire raw spectral intensities, it did not need the nonlinear background subtraction of CO2 emission that is required for typical OH*/CH* intensity ratio calibrations. An unbiased spectral data set (not used in the PLS-R model development), for 28 different equivalence ratio conditions ranging from 0.71 to 1.67, was used to predict equivalence ratios using the PLS-R and the intensity ratio calibration models. It was found that the equivalence ratios predicted with the PLS-R based multivariate calibration model matched the experimentally measured equivalence ratios within 7%; whereas, the OH*/CH* intensity ratio calibration grossly underpredicted equivalence ratios in comparison to measured equivalence ratios, especially under rich conditions (equivalence ratio > 1.2). The practical implications of the chemiluminescence-based multivariate equivalence ratio sensing methodology are also discussed.
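Leave-one-out cross validation, used above to check the PLS-R model during development, refits the calibration once per sample with that sample held out. The sketch below shows the procedure with a univariate straight-line calibration as a stand-in for the multivariate model; the function names and the intensity-ratio values are hypothetical.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor has no variation")
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    return slope, mean_y - slope * mean_x

def loo_rmsecv(x, y):
    """Root mean square error over predictions of each held-out sample."""
    squared = []
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        squared.append((y[i] - (slope * x[i] + intercept)) ** 2)
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical intensity ratios and equivalence ratios, not data from the paper.
ratios = [0.8, 1.0, 1.3, 1.5, 1.9]
equivalence = [0.75, 0.88, 1.02, 1.20, 1.41]
print(loo_rmsecv(ratios, equivalence))
```

Because every prediction is made on a sample the model has not seen, the resulting error is a more honest estimate than the calibration fit, although an independent data set such as the 28-condition set described above remains the stronger test.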
Terra, Luciana A; Filgueiras, Paulo R; Tose, Lílian V; Romão, Wanderson; de Souza, Douglas D; de Castro, Eustáquio V R; de Oliveira, Mirela S L; Dias, Júlio C M; Poppi, Ronei J
2014-10-07
Negative-ion mode electrospray ionization, ESI(-), with Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS) was coupled to a Partial Least Squares (PLS) regression and variable selection methods to estimate the total acid number (TAN) of Brazilian crude oil samples. Generally, ESI(-)-FT-ICR mass spectra present a power of resolution of ca. 500,000 and a mass accuracy less than 1 ppm, producing a data matrix containing over 5700 variables per sample. These variables correspond to heteroatom-containing species detected as deprotonated molecules, [M - H](-) ions, which are identified primarily as naphthenic acids, phenols and carbazole analog species. The TAN values for all samples ranged from 0.06 to 3.61 mg of KOH g(-1). To facilitate the spectral interpretation, three methods of variable selection were studied: variable importance in the projection (VIP), interval partial least squares (iPLS) and elimination of uninformative variables (UVE). The UVE method seems to be more appropriate for selecting important variables, reducing the dimension of the variables to 183 and producing a root mean square error of prediction of 0.32 mg of KOH g(-1). By reducing the size of the data, it was possible to relate the selected variables with their corresponding molecular formulas, thus identifying the main chemical species responsible for the TAN values.
Singh, Digar; Lee, Choong H
2018-01-01
Notwithstanding its mitosporic nature, an improbable morpho-transformation state i. e., sclerotial development (SD), is vaguely known in Aspergillus oryzae . Nevertheless an intriguing phenomenon governing mold's development and stress response, the effects of exogenous factors engendering SD, especially the volatile organic compounds (VOCs) mediated interactions (VMI) pervasive in microbial niches have largely remained unexplored. Herein, we examined the effects of intra-species VMI on SD in A. oryzae RIB 40, followed by comprehensive analyses of associated growth rates, pH alterations, biochemical phenotypes, and exometabolomes. We cultivated A. oryzae RIB 40 (S1 VMI : KACC 44967) opposite a non-SD partner strain, A. oryzae (S2: KCCM 60345), conditioning VMI in a specially designed "twin plate assembly." Notably, SD in S1 VMI was delayed relative to its non-conditioned control (S1) cultivated without partner strain (S2) in twin plate. Selectively evaluating A. oryzae RIB 40 (S1 VMI vs. S1) for altered phenotypes concomitant to SD, we observed a marked disparity for corresponding growth rates (S1 VMI < S1) 7days , media pH (S1 VMI > S1) 7days , and biochemical characteristics viz ., protease (S1 VMI > S1) 7days , amylase (S1 VMI > nS1) 3-7 days , and antioxidants (S1 VMI > S1) 7days levels. The partial least squares-discriminant analysis (PLS-DA) of gas chromatography-time of flight-mass spectrometry (GC-TOF-MS) datasets for primary metabolites exhibited a clustered pattern (PLS1, 22.04%; PLS2, 11.36%), with 7 days incubated S1 VMI extracts showed higher abundance of amino acids, sugars, and sugar alcohols with lower organic acids and fatty acids levels, relative to S1. Intriguingly, the higher amino acid and sugar alcohol levels were positively correlated with antioxidant activity, likely impeding SD in S1 VMI . 
Further, the PLS-DA (PLS1, 18.11%; PLS2, 15.02%) based on liquid chromatography-mass spectrometry (LC-MS) datasets exhibited a notable disparity for post-SD (9-11 days) sample extracts with higher oxylipins and 13-desoxypaxilline levels in S1 VMI relative to S1, intertwining Aspergillus morphogenesis and secondary metabolism. The analysis of VOCs for the 7 days incubated samples displayed considerably higher accumulation of C-8 compounds in the headspace of twin-plate experimental sets (S1 VMI :S2) compared to those in non-conditioned controls (S1 and S2-without respective partner strains), potentially triggering altered morpho-transformation and concurring biochemical as well as metabolic states in molds.
Fernández-Novales, Juan; López, María-Isabel; González-Caballero, Virginia; Ramírez, Pilar; Sánchez, María-Teresa
2011-06-01
Volumic mass, a key component of must quality control tests during alcoholic fermentation, is of great interest to the winemaking industry. Transmittance near-infrared (NIR) spectra of 124 must samples over the range of 200-1,100 nm were obtained using a miniature spectrometer. The performance of this instrument in predicting volumic mass was evaluated using partial least squares (PLS) regression and multiple linear regression (MLR). The validation statistics, coefficient of determination (r²) and standard error of prediction (SEP), were r² = 0.98 (n = 31) and SEP = 5.85 g/dm³ for PLS, and r² = 0.96 (n = 31) and SEP = 7.49 g/dm³ for MLR, for equations developed to fit reference data for volumic mass and spectral data. Comparison of results from MLR and PLS demonstrates that an MLR model with six significant wavelengths (P < 0.05) fits volumic mass data to transmittance (1/T) data slightly worse than a more sophisticated PLS model using the full scanning range. The results suggest that NIR spectroscopy is a suitable technique for predicting volumic mass during alcoholic fermentation, and that a low-cost NIR instrument can be used for this purpose.
NASA Astrophysics Data System (ADS)
Ying, Yibin; Liu, Yande; Fu, Xiaping; Lu, Huishan
2005-11-01
Artificial neural networks (ANNs) have been used successfully in applications such as pattern recognition, image processing, automation and control. However, the majority of today's ANN applications use the back-propagation feed-forward ANN (BP-ANN). In this paper, back-propagation artificial neural networks (BP-ANN) were applied to model the soluble solid content (SSC) of intact pears from their Fourier transform near infrared (FT-NIR) spectra. One hundred and sixty-four pear samples were used to build the calibration models and evaluate the models' predictive ability. The results are compared to the classical calibration approaches, i.e. principal component regression (PCR), partial least squares (PLS) and non-linear PLS (NPLS). The effect of optimizing the training parameters on the prediction model was also investigated. BP-ANN combined with principal component regression (PCR) always performed better than the classical PCR, PLS and Weight-PLS methods in terms of predictive ability. Based on the results, it can be concluded that FT-NIR spectroscopy and BP-ANN models can be properly employed for rapid and nondestructive determination of fruit internal quality.
Kuriakose, Saji; Joe, I Hubert
2013-11-01
Determination of the authenticity of essential oils has become more significant, in recent years, following some illegal adulteration and contamination scandals. The present investigative study focuses on the application of near infrared spectroscopy to detect sample authenticity and quantify economic adulteration of sandalwood oils. Several data pre-treatments are investigated for calibration and prediction using partial least square regression (PLSR). The quantitative data analysis is done using a new spectral approach - full spectrum or sequential spectrum. The optimum number of PLS components is obtained according to the lowest root mean square error of calibration (RMSEC=0.00009% v/v). The lowest root mean square error of prediction (RMSEP=0.00016% v/v) in the test set and the highest coefficient of determination (R(2)=0.99989) are used as the evaluation tools for the best model. A nonlinear method, locally weighted regression (LWR), is added to extract nonlinear information and to compare with the linear PLSR model. Copyright © 2013 Elsevier B.V. All rights reserved.
Guelpa, Anina; Bevilacqua, Marta; Marini, Federico; O'Kennedy, Kim; Geladi, Paul; Manley, Marena
2015-04-15
It has been established in this study that the Rapid Visco Analyser (RVA) can describe maize hardness, irrespective of the RVA profile, when used in association with appropriate multivariate data analysis techniques. Therefore, the RVA can complement or replace current and/or conventional methods as a hardness descriptor. Hardness modelling based on RVA viscograms was carried out using seven conventional hardness methods (hectoliter mass (HLM), hundred kernel mass (HKM), particle size index (PSI), percentage vitreous endosperm (%VE), protein content, percentage chop (%chop) and near infrared (NIR) spectroscopy) as references and three different RVA profiles (hard, soft and standard) as predictors. An approach using locally weighted partial least squares (LW-PLS) was followed to build the regression models. The resulted prediction errors (root mean square error of cross-validation (RMSECV) and root mean square error of prediction (RMSEP)) for the quantification of hardness values were always lower or in the same order of the laboratory error of the reference method. Copyright © 2014 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Kuriakose, Saji; Joe, I. Hubert
2013-11-01
Determination of the authenticity of essential oils has become more significant in recent years, following some illegal adulteration and contamination scandals. The present investigative study focuses on the application of near infrared spectroscopy to detect sample authenticity and quantify economic adulteration of sandalwood oils. Several data pre-treatments are investigated for calibration and prediction using partial least square regression (PLSR). The quantitative data analysis is done using a new spectral approach - full spectrum or sequential spectrum. The optimum number of PLS components is obtained according to the lowest root mean square error of calibration (RMSEC = 0.00009% v/v). The lowest root mean square error of prediction (RMSEP = 0.00016% v/v) in the test set and the highest coefficient of determination (R² = 0.99989) are used as the evaluation tools for the best model. A nonlinear method, locally weighted regression (LWR), is added to extract nonlinear information and to compare with the linear PLSR model.
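The evaluation quantities named in this abstract can be illustrated with a minimal Python sketch. It assumes the common definitions (RMSE as the root of the mean squared residual, R² as one minus the ratio of residual to total sum of squares); the abstract does not state which formulas the authors used, and the function names and sample values below are hypothetical.

```python
import math

def rmse(reference, predicted):
    """Root mean square error between reference and predicted values."""
    n = len(reference)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)

def r_squared(reference, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot

# Hypothetical adulterant levels (% v/v) and hypothetical model predictions
reference = [0.0, 1.0, 2.0, 3.0, 4.0]
predicted = [0.1, 0.9, 2.1, 2.9, 4.1]
print(rmse(reference, predicted), r_squared(reference, predicted))
```

Evaluated on the calibration samples, the same `rmse` function would give an RMSEC-type figure; evaluated on a held-out test set, an RMSEP-type figure.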
Optical scatterometry of quarter-micron patterns using neural regression
NASA Astrophysics Data System (ADS)
Bischoff, Joerg; Bauer, Joachim J.; Haak, Ulrich; Hutschenreuther, Lutz; Truckenbrodt, Horst
1998-06-01
With shrinking dimensions and increasing chip areas, a rapid and non-destructive full wafer characterization after every patterning cycle is an inevitable necessity. In former publications it was shown that Optical Scatterometry (OS) has the potential to push the attainable feature limits of optical techniques from 0.8 . . . 0.5 microns for imaging methods down to 0.1 micron and below. Thus the demands of future metrology can be met. Basically being a nonimaging method, OS combines light scatter (or diffraction) measurements with modern data analysis schemes to solve the inverse scatter issue. For very fine patterns with lambda-to-pitch ratios greater than one, the specular reflected light versus the incidence angle is recorded. Usually, the data analysis comprises two steps -- a training cycle connected to a rigorous forward modeling and the prediction itself. Until now, two data analysis schemes have usually been applied -- the multivariate regression based Partial Least Squares method (PLS) and a look-up-table technique which is also referred to as Minimum Mean Square Error approach (MMSE). Both methods are afflicted with serious drawbacks. On the one hand, the prediction accuracy of multivariate regression schemes degrades with larger parameter ranges due to the linearization properties of the method. On the other hand, look-up-table methods are rather time consuming during prediction, thus prolonging the processing time and reducing the throughput. An alternate method is an Artificial Neural Network (ANN) based regression which combines the advantages of multivariate regression and MMSE. Due to the versatility of a neural network, not only can its structure be adapted more properly to the scatter problem, but also the nonlinearity of the neuronal transfer functions mimics the nonlinear behavior of optical diffraction processes more adequately. In spite of these pleasant properties, the prediction speed of ANN regression is comparable with that of the PLS-method.
In this paper, the viability and performance of ANN-regression will be demonstrated with the example of sub-quarter-micron resist metrology. To this end, 0.25 micrometer line/space patterns have been printed in positive photoresist by means of DUV projection lithography. In order to evaluate the total metrology chain from light scatter measurement through data analysis, a thorough modeling has been performed. Assuming a trapezoidal shape of the developed resist profile, a training data set was generated by means of the Rigorous Coupled Wave Approach (RCWA). After training the model, a second data set was computed and deteriorated by Gaussian noise to imitate real measuring conditions. Then, these data have been fed into the models established before resulting in a Standard Error of Prediction (SEP) which corresponds to the measuring accuracy. Even with putting only little effort in the design of a back-propagation network, the ANN is clearly superior to the PLS-method. Depending on whether a network with one or two hidden layers was used, accuracy gains between 2 and 5 can be achieved compared with PLS regression. Furthermore, the ANN is less noise sensitive, for there is only a doubling of the SEP at 5% noise for ANN whereas for PLS the accuracy degrades rapidly with increasing noise. The accuracy gain also depends on the light polarization and on the measured parameters. Finally, these results have been proven experimentally, where the OS-results are in good accordance with the profiles obtained from cross- sectioning micrographs.
NASA Astrophysics Data System (ADS)
Figueroa-Navedo, Amanda; Galán-Freyle, Nataly Y.; Pacheco-Londoño, Leonardo C.; Hernández-Rivera, Samuel P.
2013-05-01
Terrorists conceal highly energetic materials (HEM) as Improvised Explosive Devices (IED) in various types of materials such as PVC, wood, Teflon, aluminum, acrylic, carton and rubber to disguise them from detection equipment used by military and security agency personnel. Infrared emissions (IREs) of substrates, with and without HEM, were measured to generate models for detection and discrimination. Multivariable analysis techniques such as principal component analysis (PCA), soft independent modeling by class analogy (SIMCA), partial least squares-discriminant analysis (PLS-DA), support vector machine (SVM) and neural networks (NN) were employed to generate models, in which the emission of IR light from heated samples was stimulated using a CO2 laser, giving rise to laser induced thermal emission (LITE) of HEMs. Traces of a specific target threat chemical explosive, PETN, in surface concentrations of 10 to 300 µg/cm² were studied on the surfaces mentioned. A custom-built experimental setup used a CO2 laser as a heating source positioned with a telescope, where a minimal loss in reflective optics was reported, for the Mid-IR at a distance of 4 m and 32 scans at 10 s. SVM-DA was the best-performing statistical technique, with a discrimination performance of 97%. PLS-DA accurately predicted over 94% and NN 88%.
LIBS data analysis using a predictor-corrector based digital signal processor algorithm
NASA Astrophysics Data System (ADS)
Sanders, Alex; Griffin, Steven T.; Robinson, Aaron
2012-06-01
There are many accepted sensor technologies for generating spectra for material classification. Once the spectra are generated, communication bandwidth limitations favor local material classification with its attendant reduction in data transfer rates and power consumption. Transferring sensor technologies such as Cavity Ring-Down Spectroscopy (CRDS) and Laser Induced Breakdown Spectroscopy (LIBS) requires effective material classifiers. A result of recent efforts has been emphasis on Partial Least Squares - Discriminant Analysis (PLS-DA) and Principal Component Analysis (PCA). Implementation of these via general purpose computers is difficult in small portable sensor configurations. This paper addresses the creation of a low mass, low power, robust hardware spectra classifier for a limited set of predetermined materials in an atmospheric matrix. Crucial to this is the incorporation of PCA or PLS-DA classifiers into a predictor-corrector style implementation. The system configuration guarantees rapid convergence. Software running on multi-core Digital Signal Processors (DSPs) simulates a stream-lined plasma physics model estimator, reducing Analog-to-Digital Converter (ADC) power requirements. This paper presents the results of a predictor-corrector model implemented on a low power multi-core DSP to perform substance classification. This configuration emphasizes the hardware system and software design via a predictor-corrector model that simultaneously decreases the sample rate while performing the classification.
Liu, Changhong; Liu, Wei; Lu, Xuzhong; Ma, Fei; Chen, Wei; Yang, Jianbo; Zheng, Lei
2014-01-01
Multispectral imaging with 19 wavelengths in the range of 405-970 nm has been evaluated for nondestructive determination of firmness, total soluble solids (TSS) content and ripeness stage in strawberry fruit. Several analysis approaches, including partial least squares (PLS), support vector machine (SVM) and back propagation neural network (BPNN), were applied to develop theoretical models for predicting the firmness and TSS of intact strawberry fruit. Compared with PLS and SVM, BPNN considerably improved the performance of multispectral imaging for predicting firmness and total soluble solids content, with correlation coefficients (r) of 0.94 and 0.83, SEP of 0.375 and 0.573, and bias of 0.035 and 0.056, respectively. Subsequently, the ability of multispectral imaging technology to classify fruit based on ripeness stage was tested using SVM and principal component analysis-back propagation neural network (PCA-BPNN) models. The higher classification accuracy, 100%, was achieved using the SVM model. Moreover, the results of all these models demonstrated that the VIS parts of the spectra were the main contributor to the determination of firmness, TSS content estimation and classification of ripeness stage in strawberry fruit. These results suggest that multispectral imaging, together with a suitable analysis model, is a promising technology for rapid estimation of quality attributes and classification of ripeness stage in strawberry fruit.
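The SEP and bias figures quoted in this abstract can be illustrated with a small Python sketch. It assumes the definitions commonly used in spectroscopic calibration (bias as the mean prediction error, SEP as the standard deviation of the errors around that bias, with an n - 1 denominator); the abstract does not give the formulas, and the function name and data are hypothetical.

```python
import math

def bias_and_sep(reference, predicted):
    """Return (bias, SEP) for paired reference and predicted values.

    bias = mean of (predicted - reference)
    SEP  = sqrt(sum((error - bias)^2) / (n - 1))   # assumed definition
    """
    errors = [p - r for r, p in zip(reference, predicted)]
    n = len(errors)
    bias = sum(errors) / n
    sep = math.sqrt(sum((e - bias) ** 2 for e in errors) / (n - 1))
    return bias, sep

# Hypothetical firmness readings and hypothetical model predictions
reference = [1.0, 2.0, 3.0, 4.0]
predicted = [1.0, 3.0, 3.0, 5.0]
bias, sep = bias_and_sep(reference, predicted)
print(bias, sep)
```

Under this definition a constant offset between predictions and references shows up entirely in the bias and leaves the SEP at zero, which is why the two numbers are usually reported together.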
Measuring virtues--development of a scale to measure employee virtues and their influence on health.
Wärnå-Furu, Carola; Sääksjärvi, Maria; Santavirta, Nina
2010-12-01
The objectives of this article are to present a measurement instrument for virtues, and to examine the link between virtues and health. The instrument was tested by the occupational health care at a large Finnish pulp and paper manufacturer and was shown to be consistent, valid and reliable. In developing the scale, we had two samples of employees and used factor analysis and partial least squares modelling (PLS) on both samples. Factor analysis showed that pride is the most important virtue, followed by love and generosity. In the PLS analysis, we found virtues to significantly reduce the number of sick days. In addition, we found significant relationships between virtues and fatigue, depression and happiness. Virtuous behaviour decreased sick leave and depression. The virtues had a positive influence on happiness and on improvement in one's health. The results show that by taking into account virtues in working life, companies can significantly improve their employees' well-being. The measurement instrument helps broaden the traditional view on health and is meant to be used by health care professionals in their daily practice. By addressing a person's physical, mental and virtual well-being, health care practitioners can take care of employees on a broader level than before. © 2010 The Authors. Journal compilation © 2010 Nordic College of Caring Science.
Schönig, Sarah; Recke, Andreas; Hirose, Misa; Ludwig, Ralf J; Seeger, Karsten
2013-06-26
Epidermolysis bullosa acquisita (EBA) is a rare skin blistering disease with a prevalence of 0.2 per million people. EBA is characterized by autoantibodies against type VII collagen. Type VII collagen builds anchoring fibrils that are essential for the dermal-epidermal junction. The pathogenic relevance of antibodies against type VII collagen subdomains has been demonstrated both in vitro and in vivo. Despite the multitude of clinical and immunological data, no information on metabolic changes exists. We used an animal model of EBA to obtain insights into metabolomic changes during EBA. Sera from mice with immunization-induced EBA and control mice were obtained and metabolites were isolated by filtration. Proton nuclear magnetic resonance (NMR) spectra were recorded and analyzed by principal component analysis (PCA), partial least squares discrimination analysis (PLS-DA) and random forest. The metabolic pattern of immunized mice and control mice could be clearly distinguished with PCA and PLS-DA. Metabolites that contribute to the discrimination could be identified via random forest. The observed changes in the metabolic pattern of EBA sera, i.e. increased levels of amino acids, point toward an increased energy demand in EBA. Knowledge about metabolic changes due to EBA could help in the future to assess the disease status during treatment. Confirming the metabolic changes in patients will probably require large cohorts.
2013-01-01
Background: A major hindrance to the development of high yielding biofuel feedstocks is the ability to rapidly assess large populations for fermentable sugar yields. Whilst recent advances have outlined methods for the rapid assessment of biomass saccharification efficiency, none take into account the total biomass, or the soluble sugar fraction of the plant. Here we present a holistic high-throughput methodology for assessing sweet Sorghum bicolor feedstocks at 10 days post-anthesis for total fermentable sugar yields including stalk biomass, soluble sugar concentrations, and cell wall saccharification efficiency. Results: A mathematical method for assessing whole S. bicolor stalks using the fourth internode from the base of the plant proved to be an effective high-throughput strategy for assessing stalk biomass, soluble sugar concentrations, and cell wall composition and allowed calculation of total stalk fermentable sugars. A high-throughput method for measuring soluble sucrose, glucose, and fructose using partial least squares (PLS) modelling of juice Fourier transform infrared (FTIR) spectra was developed. The PLS prediction was shown to be highly accurate with each sugar attaining a coefficient of determination (R²) of 0.99 with a root mean squared error of prediction (RMSEP) of 11.93, 5.52, and 3.23 mM for sucrose, glucose, and fructose, respectively, which constitutes an error of <4% in each case. The sugar PLS model correlated well with gas chromatography–mass spectrometry (GC-MS) and brix measures. Similarly, a high-throughput method for predicting enzymatic cell wall digestibility using PLS modelling of FTIR spectra obtained from S. bicolor bagasse was developed. The PLS prediction was shown to be accurate with an R² of 0.94 and RMSEP of 0.64 μg mgDW⁻¹ h⁻¹.
Conclusions: This methodology has been demonstrated as an efficient and effective way to screen large biofuel feedstock populations for biomass, soluble sugar concentrations, and cell wall digestibility simultaneously, allowing a total fermentable yield calculation. It unifies and simplifies previous screening methodologies to produce a holistic assessment of biofuel feedstock potential. PMID:24365407
PREDICTION OF MOLECULAR PROPERTIES WITH MID-INFRARED SPECTRA AND INTERFEROGRAMS
We have built infrared spectroscopy-based partial least squares (PLS) models for molecular polarizabilities using a 97 member training set and a 59 member independent prediction set. These 156 compounds span a very wide range of chemical structure. Our goal was to use this well...
Comparison of three chemometrics methods for near-infrared spectra of glucose in the whole blood
NASA Astrophysics Data System (ADS)
Zhang, Hongyan; Ding, Dong; Li, Xin; Chen, Yu; Tang, Yuguo
2005-01-01
Principal Component Regression (PCR), Partial Least Square (PLS) and Artificial Neural Networks (ANN) methods are used to analyze the near infrared (NIR) spectra of glucose in whole blood. With these methods, the calibration model is built in the spectral band where glucose has much stronger absorption than water, fat, and protein, and the correlation coefficients of the models are shown in this paper. By comparing these results, a suitable method to analyze the NIR spectrum of glucose in whole blood is found.
Gu, Haiwei; Pan, Zhengzheng; Xi, Bowei; Asiago, Vincent; Musselman, Brian; Raftery, Daniel
2011-02-07
Nuclear magnetic resonance (NMR) spectroscopy and mass spectrometry (MS) are the two most commonly used analytical tools in metabolomics, and their complementary nature makes the combination particularly attractive. A combined analytical approach can improve the potential for providing reliable methods to detect metabolic profile alterations in biofluids or tissues caused by disease, toxicity, etc. In this paper, ¹H NMR spectroscopy and direct analysis in real time (DART)-MS were used for the metabolomics analysis of serum samples from breast cancer patients and healthy controls. Principal component analysis (PCA) of the NMR data showed that the first principal component (PC1) scores could be used to separate cancer from normal samples. However, no such obvious clustering could be observed in the PCA score plot of DART-MS data, even though DART-MS can provide a rich and informative metabolic profile. Using a modified multivariate statistical approach, the DART-MS data were then reevaluated by orthogonal signal correction (OSC) pretreated partial least squares (PLS), in which the Y matrix in the regression was set to the PC1 score values from the NMR data analysis. This approach, and a similar one using the first latent variable from PLS-DA of the NMR data, resulted in a significant improvement of the separation between the disease samples and normals, and a metabolic profile related to breast cancer could be extracted from DART-MS. The new approach allows the disease classification to be expressed on a continuum as opposed to a binary scale and thus better represents the disease and healthy classifications. An improved metabolic profile obtained by combining MS and NMR by this approach may be useful to achieve more accurate disease detection and gain more insight regarding disease mechanisms and biology. Copyright © 2010 Elsevier B.V. All rights reserved.
Sciubba, Fabio; Avanzato, Damiano; Vaccaro, Angela; Capuani, Giorgio; Spagnoli, Mariangela; Di Cocco, Maria Enrica; Tzareva, Irina Nikolova; Delfini, Maurizio
2017-04-01
The metabolic profiling of pistachio (Pistacia vera) aqueous extracts from two different cultivars, namely 'Bianca' and 'Gloria', was monitored over the months from May to September employing high field NMR spectroscopy. A large number of water-soluble metabolites were assigned by means of 1D and 2D NMR experiments. The change in the metabolic profiles monitored over time allowed the pistachio development to be investigated. Specific temporal trends of amino acids, sugars, organic acids and other metabolites were observed and analysed by multivariate Partial Least Squares (PLS) analysis. Statistical analysis showed that while in the period from May to September there were few differences between the two cultivars, the ripening rate was different.
USDA-ARS's Scientific Manuscript database
Hyperspectral scattering is a promising technique for rapid and noninvasive measurement of multiple quality attributes of apple fruit. A hierarchical evolutionary algorithm (HEA) approach, in combination with subspace decomposition and partial least squares (PLS) regression, was proposed to select o...
Enhancement of partial robust M-regression (PRM) performance using Bisquare weight function
NASA Astrophysics Data System (ADS)
Mohamad, Mazni; Ramli, Norazan Mohamed; Ghani@Mamat, Nor Azura Md; Ahmad, Sanizah
2014-09-01
Partial Least Squares (PLS) regression is a popular regression technique for handling multicollinearity in low and high dimensional data, which fits a linear relationship between sets of explanatory and response variables. Several robust PLS methods have been proposed to accommodate the classical PLS algorithms, which are easily affected by the presence of outliers. The most recent one is called partial robust M-regression (PRM). Unfortunately, the use of a monotonic weighting function in the PRM algorithm fails to assign appropriate weights to large outliers according to their severity. Thus, in this paper, a modified partial robust M-regression is introduced to enhance the performance of the original PRM. A re-descending weight function, known as the bisquare weight function, is recommended to replace the fair function in the PRM. A simulation study is done to assess the performance of the modified PRM, and its efficiency is also tested in both contaminated and uncontaminated simulated data under various percentages of outliers, sample sizes and numbers of predictors.
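The contrast between a monotonically decreasing weight and a re-descending one can be sketched in Python. This is an illustration under assumptions, not the authors' implementation: the fair-type form 1 / (1 + |r/c|)² with c = 4 and the Tukey bisquare tuning constant c = 4.685 are common textbook choices that the abstract does not confirm, and `r` stands for a standardized residual.

```python
def fair_weight(r, c=4.0):
    """Fair-type weight (assumed form): shrinks as |r| grows but never reaches zero."""
    return 1.0 / (1.0 + abs(r / c)) ** 2

def bisquare_weight(r, c=4.685):
    """Tukey bisquare weight: re-descending, exactly zero once |r| >= c."""
    u = r / c
    return (1.0 - u * u) ** 2 if abs(u) < 1.0 else 0.0

# Compare the two weights for a clean point, a moderate outlier, and a gross outlier
for residual in (0.0, 3.0, 10.0):
    print(residual, fair_weight(residual), bisquare_weight(residual))
```

With these settings a gross outlier (standardized residual of 10) still receives a small positive fair-type weight, whereas the bisquare function removes it from the fit entirely.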
NASA Astrophysics Data System (ADS)
Barbeira, Paulo J. S.; Paganotti, Rosilene S. N.; Ássimos, Ariane A.
2013-10-01
This study had the objective of determining the content of dry extract of commercial alcoholic extracts of bee propolis through Partial Least Squares (PLS) multivariate calibration and electronic spectroscopy. The PLS model provided a good prediction of dry extract content in commercial alcoholic extracts of bee propolis in the range of 2.7 to 16.8% (m/v), presenting the advantage of being less laborious and faster than the traditional gravimetric methodology. The PLS model was optimized with outlier detection tests according to ASTM E 1655-05. In this study it was possible to verify that a centrifugation stage is extremely important in order to avoid the presence of waxes, resulting in a more accurate model. Around 50% of the analyzed samples presented a dry extract content lower than the value established by Brazilian legislation; in most cases, the values found were different from the values claimed on the product's label.
Missing RRI interpolation for HRV analysis using locally-weighted partial least squares regression.
Kamata, Keisuke; Fujiwara, Koichi; Yamakawa, Toshiki; Kano, Manabu
2016-08-01
The R-R interval (RRI) fluctuation in the electrocardiogram (ECG) is called heart rate variability (HRV). Since HRV reflects autonomic nervous function, HRV-based health monitoring services, such as stress estimation, drowsy driving detection, and epileptic seizure prediction, have been proposed. In these HRV-based health monitoring services, precise R wave detection from ECG is required; however, R waves cannot always be detected due to ECG artifacts. Missing RRI data should be interpolated appropriately for HRV analysis. The present work proposes a missing RRI interpolation method using just-in-time (JIT) modeling. The proposed method adopts locally weighted partial least squares (LW-PLS) for RRI interpolation, which is a well-known JIT modeling method used in the field of process control. The usefulness of the proposed method was demonstrated through a case study of real RRI data collected from healthy persons. The proposed JIT-based interpolation method could improve the interpolation accuracy in comparison with a static interpolation method.
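The just-in-time idea, fitting a fresh local model around each query point, can be sketched in Python with a deliberately simplified stand-in: a locally weighted straight-line fit on a single predictor instead of LW-PLS on several. The similarity weight exp(-phi * d / sigma_d), the parameter `phi`, the function name, and the sample data are all assumptions for illustration and are not taken from the abstract.

```python
import math
import statistics

def locally_weighted_predict(x, y, x_query, phi=1.0):
    """Fit a weighted least-squares line around x_query and evaluate it there.

    Samples closer to the query get larger weights (assumed weight form).
    """
    distances = [abs(xi - x_query) for xi in x]
    sigma_d = statistics.pstdev(distances)  # scale of the distances (must be > 0)
    weights = [math.exp(-phi * d / sigma_d) for d in distances]
    w_sum = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / w_sum
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / w_sum
    sxy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    sxx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    slope = sxy / sxx
    return y_bar + slope * (x_query - x_bar)

# Hypothetical beat indices and RRI values (ms); the beat at index 4 is missing
beat_index = [0, 1, 2, 3, 5, 6]
rri_ms = [800.0, 810.0, 820.0, 830.0, 850.0, 860.0]
estimate = locally_weighted_predict(beat_index, rri_ms, x_query=4)
print(estimate)
```

Because the model is rebuilt for every query, a new local fit is computed for each missing interval, in contrast to a static interpolation rule applied uniformly to the whole record.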
Determination of elemental composition of shale rocks by laser induced breakdown spectroscopy
NASA Astrophysics Data System (ADS)
Sanghapi, Hervé K.; Jain, Jinesh; Bol'shakov, Alexander; Lopano, Christina; McIntyre, Dustin; Russo, Richard
2016-08-01
In this study laser induced breakdown spectroscopy (LIBS) is used for elemental characterization of outcrop samples from the Marcellus Shale. Powdered samples were pressed to form pellets and used for LIBS analysis. Partial least squares regression (PLS-R) and univariate calibration curves were used for quantification of analytes. The matrix effect is substantially reduced using the partial least squares calibration method. Predicted results with LIBS are compared to ICP-OES results for Si, Al, Ti, Mg, and Ca. As for C, its results are compared to those obtained by a carbon analyzer. Relative errors of the LIBS measurements are in the range of 1.7 to 12.6%. The limits of detection (LODs) obtained for Si, Al, Ti, Mg and Ca are 60.9, 33.0, 15.6, 4.2 and 0.03 ppm, respectively. An LOD of 0.4 wt.% was obtained for carbon. This study shows that the LIBS method can provide a rapid analysis of shale samples and can potentially benefit depleted gas shale carbon storage research.
Dong, Jian-Jun; Li, Qing-Liang; Yin, Hua; Zhong, Cheng; Hao, Jun-Guang; Yang, Pan-Fei; Tian, Yu-Hong; Jia, Shi-Ru
2014-10-15
Sensory evaluation is regarded as a necessary procedure to ensure a reproducible quality of beer. Meanwhile, high-throughput analytical methods provide a powerful tool to analyse various flavour compounds, such as higher alcohols and esters. In this study, the relationship between flavour compounds and sensory evaluation was established by non-linear models such as partial least squares (PLS), genetic algorithm back-propagation neural network (GA-BP), and support vector machine (SVM). It was shown that SVM with a Radial Basis Function (RBF) kernel had a better prediction accuracy for both the calibration set (94.3%) and the validation set (96.2%) than the other models. Relatively lower prediction abilities were observed for GA-BP (52.1%) and PLS (31.7%). In addition, the kernel function of SVM played an essential role in model training, as the prediction accuracy of SVM with a polynomial kernel function was 32.9%. As a powerful multivariate statistics method, SVM holds great potential to assess beer quality. Copyright © 2014 Elsevier Ltd. All rights reserved.
Dong, Yanhong; Li, Juan; Zhong, Xiaoxiao; Cao, Liya; Luo, Yang; Fan, Qi
2016-04-15
This paper establishes a novel method to simultaneously predict the tablet weight (TW) and trimethoprim (TMP) content of compound sulfamethoxazole tablets (SMZCO) by near infrared (NIR) spectroscopy with partial least squares (PLS) regression for controlling the uniformity of dosage units (UODU). The NIR spectra for 257 samples were measured using the optimized parameter values and pretreated using the optimized chemometric techniques. After the outliers were ignored, two PLS models for predicting TW and TMP content were respectively established by using the selected spectral sub-ranges and the reference values. The TW model reaches the correlation coefficient of calibration (R(c)) 0.9543 and the TMP content model has the R(c) 0.9205. The experimental results indicate that this strategy expands the NIR application in controlling UODU, especially in the high-throughput and rapid analysis of TWs and contents of the compound pharmaceutical tablets, and may be an important complement to the common NIR on-line analytical method for pharmaceutical tablets. Copyright © 2016 Elsevier B.V. All rights reserved.
Sandoval, S; Torres, A; Pawlowsky-Reusing, E; Riechel, M; Caradot, N
2013-01-01
The present study aims to explore the relationship between rainfall variables and water quality/quantity characteristics of combined sewer overflows (CSOs), by the use of multivariate statistical methods and online measurements at a principal CSO outlet in Berlin (Germany). Canonical correlation results showed that the maximum and average rainfall intensities are the most influential variables to describe CSO water quantity and pollutant loads whereas the duration of the rainfall event and the rain depth seem to be the most influential variables to describe CSO pollutant concentrations. The analysis of partial least squares (PLS) regression models confirms the findings of the canonical correlation and highlights three main influences of rainfall on CSO characteristics: (i) CSO water quantity characteristics are mainly influenced by the maximal rainfall intensities, (ii) CSO pollutant concentrations were found to be mostly associated with duration of the rainfall and (iii) pollutant loads seemed to be principally influenced by dry weather duration before the rainfall event. The prediction quality of PLS models is rather low (R² < 0.6) but results can be useful to explore qualitatively the influence of rainfall on CSO characteristics.
In vivo study of dermal collagen of striae distensae by confocal Raman spectroscopy.
Lung, Pam Wen; Tippavajhala, Vamshi Krishna; de Oliveira Mendes, Thiago; Téllez-Soto, Claudio A; Schuck, Desirée Cigaran; Brohem, Carla Abdo; Lorencini, Marcio; Martin, Airton Abrahão
2018-04-01
This research work mainly deals with studying qualitatively the changes in the dermal collagen of two forms of striae distensae (SD) namely striae rubrae (SR) and striae albae (SA) when compared to normal skin (NS) using confocal Raman spectroscopy. The methodology includes an in vivo human skin study for the comparison of confocal Raman spectra of dermis region of SR, SA, and NS by supervised multivariate analysis using partial least squares discriminant analysis (PLS-DA) to determine qualitatively the changes in dermal collagen. These groups are further analyzed for the extent of hydration of dermal collagen by studying the changes in the water content bound to it. PLS-DA score plot showed good separation of the confocal Raman spectra of dermis region into SR, SA, and NS data groups. Further analysis using loading plot and S-plot indicated the participation of various components of dermal collagen in the separation of these groups. Bound water content analysis showed that the extent of hydration of collagen is more in SD when compared to NS. Based on the results obtained, this study confirms the active involvement of dermal collagen in the formation of SD. It also emphasizes the need to study quantitatively the role of these various biochemical changes in the dermal collagen responsible for the variance between SR, SA, and NS.
NASA Astrophysics Data System (ADS)
Tsai, Yu-Hsuan; Garrett, Timothy J.; Carter, Christy S.; Yost, Richard A.
2015-06-01
Skeletal muscles are composed of heterogeneous muscle fibers that have different physiological, morphological, biochemical, and histological characteristics. In this work, skeletal muscles extensor digitorum longus, soleus, and whole gastrocnemius were analyzed by matrix-assisted laser desorption/ionization mass spectrometry to characterize small molecule metabolites of oxidative and glycolytic muscle fiber types as well as to visualize biomarker localization. Multivariate data analysis such as principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) were performed to extract significant features. Different metabolic fingerprints were observed from oxidative and glycolytic fibers. Higher abundances of biomolecules such as antioxidant anserine as well as acylcarnitines were observed in the glycolytic fibers, whereas taurine and some nucleotides were found to be localized in the oxidative fibers.
Novel near-infrared sampling apparatus for single kernel analysis of oil content in maize.
Janni, James; Weinstock, B André; Hagen, Lisa; Wright, Steve
2008-04-01
A method of rapid, nondestructive chemical and physical analysis of individual maize (Zea mays L.) kernels is needed for the development of high value food, feed, and fuel traits. Near-infrared (NIR) spectroscopy offers a robust nondestructive method of trait determination. However, traditional NIR bulk sampling techniques cannot be applied successfully to individual kernels. Obtaining optimized single kernel NIR spectra for applied chemometric predictive analysis requires a novel sampling technique that can account for the heterogeneous forms, morphologies, and opacities exhibited in individual maize kernels. In this study such a novel technique is described and compared to less effective means of single kernel NIR analysis. Results of the application of a partial least squares (PLS) derived model for predictive determination of percent oil content per individual kernel are shown.
Inácio, Maria Raquel Cavalcanti; de Lima, Kássio Michell Gomes; Lopes, Valquiria Garcia; Pessoa, José Dalton Cruz; de Almeida Teixeira, Gustavo Henrique
2013-02-15
The aim of this study was to evaluate the potential of near-infrared reflectance spectroscopy (NIR) and multivariate calibration as a rapid method to determine anthocyanin content in intact fruit (açaí and palmitero-juçara). Several multivariate calibration techniques, including partial least squares (PLS), interval partial least squares, genetic algorithm, successive projections algorithm, and net analyte signal were compared and validated by establishing figures of merit. Suitable results were obtained with the PLS model (four latent variables and 5-point smoothing) with a detection limit of 6.2 g kg⁻¹, limit of quantification of 20.7 g kg⁻¹, accuracy estimated as root mean square error of prediction of 4.8 g kg⁻¹, mean selectivity of 0.79 g kg⁻¹, sensitivity of 5.04×10⁻³ g kg⁻¹, precision of 27.8 g kg⁻¹, and signal-to-noise ratio of 1.04×10⁻³ g kg⁻¹. These results suggest NIR spectroscopy and multivariate calibration can be effectively used to determine anthocyanin content in intact açaí and palmitero-juçara fruit. Copyright © 2012 Elsevier Ltd. All rights reserved.
Impact of multicollinearity on small sample hydrologic regression models
NASA Astrophysics Data System (ADS)
Kroll, Charles N.; Song, Peter
2013-06-01
Often hydrologic regression models are developed with ordinary least squares (OLS) procedures. The use of OLS with highly correlated explanatory variables produces multicollinearity, which creates highly sensitive parameter estimators with inflated variances and improper model selection. It is not clear how to best address multicollinearity in hydrologic regression models. Here a Monte Carlo simulation is developed to compare four techniques to address multicollinearity: OLS, OLS with variance inflation factor screening (VIF), principal component regression (PCR), and partial least squares regression (PLS). The performance of these four techniques was observed for varying sample sizes, correlation coefficients between the explanatory variables, and model error variances consistent with hydrologic regional regression models. The negative effects of multicollinearity are magnified at smaller sample sizes, higher correlations between the variables, and larger model error variances (smaller R²). The Monte Carlo simulation indicates that if the true model is known, multicollinearity is present, and the estimation and statistical testing of regression parameters are of interest, then PCR or PLS should be employed. If the model is unknown, or if the interest is solely on model predictions, it is recommended that OLS be employed, since using more complicated techniques did not produce any improvement in model performance. A leave-one-out cross-validation case study was also performed using low-streamflow data sets from the eastern United States. Results indicate that OLS with stepwise selection generally produces models across study regions with varying levels of multicollinearity that are as good as biased regression techniques such as PCR and PLS.
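As an illustration of the variance inflation factor screening mentioned in the abstract above, the following is a minimal sketch, not the authors' code. It assumes the standard definition VIF_j = 1 / (1 - R_j²), where R_j² comes from regressing explanatory variable j on the remaining ones. The function name `variance_inflation_factors` and the synthetic data are hypothetical, and the abstract does not state which screening threshold was used.

```python
import numpy as np

def variance_inflation_factors(X):
    """Return the VIF of each column of X (n_samples x n_features)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress column j on the other columns plus an intercept.
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        r2 = 1.0 - (resid @ resid) / ((y - y.mean()) ** 2).sum()
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs

# Hypothetical data: x2 is built to correlate strongly with x1, x3 is independent.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = 0.95 * x1 + np.sqrt(1 - 0.95 ** 2) * rng.normal(size=200)
x3 = rng.normal(size=200)
vifs = variance_inflation_factors(np.column_stack([x1, x2, x3]))
print(vifs)
```

In a screening step, variables whose VIF exceeds a chosen cutoff would be dropped before fitting OLS; cutoffs of 5 or 10 are common rules of thumb, used here only as an assumption.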
Shi, Yue; Huang, Wenjiang; Ye, Huichun; Ruan, Chao; Xing, Naichen; Geng, Yun; Dong, Yingying; Peng, Dailiang
2018-06-11
In recent decades, rice disease co-epidemics have caused tremendous damage to crop production in both China and Southeast Asia. A variety of remote sensing based approaches have been developed and applied to map disease distribution using coarse- to moderate-resolution imagery. However, the detection and discrimination of various disease species infecting rice were seldom assessed using high spatial resolution data. The aims of this study were (1) to develop a set of normalized two-stage vegetation indices (VIs) for characterizing the progressive development of different rice diseases; (2) to explore the performance of combined normalized two-stage VIs in partial least square discriminant analysis (PLS-DA); and (3) to map and evaluate the damage caused by rice diseases at fine spatial scales, for the first time using bi-temporal, high spatial resolution imagery from PlanetScope datasets at a 3 m spatial resolution. Our findings suggest that the primary biophysical parameters affected by different diseases (e.g., leaf area, pigment contents, or canopy morphology) can be captured using combined normalized two-stage VIs. PLS-DA was able to classify rice diseases at a sub-field scale, with an overall accuracy of 75.62% and a Kappa value of 0.47. The approach was successfully applied during a typical co-epidemic outbreak of rice dwarf (Rice dwarf virus, RDV), rice blast (Magnaporthe oryzae), and glume blight (Phyllosticta glumarum) in Guangxi Province, China. Furthermore, our approach highlighted the feasibility of the method in capturing heterogeneous disease patterns at fine spatial scales over large spatial extents.
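The abstract above reports an overall accuracy and a Kappa value. The sketch below shows how these two figures are conventionally derived from a confusion matrix, assuming Cohen's kappa is the statistic meant. The function name `accuracy_and_kappa` and the 3-class matrix are hypothetical and are not the study's data.

```python
def accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix
    (rows: reference classes, columns: predicted classes)."""
    k = len(confusion)
    n = sum(sum(row) for row in confusion)
    observed = sum(confusion[i][i] for i in range(k)) / n
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    # Agreement expected by chance from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (n * n)
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa

# Hypothetical confusion matrix for three disease classes.
confusion = [
    [40, 5, 5],
    [6, 30, 4],
    [4, 6, 20],
]
accuracy, kappa = accuracy_and_kappa(confusion)
print(f"overall accuracy = {accuracy:.4f}, kappa = {kappa:.4f}")
```

Kappa is lower than the overall accuracy because it discounts the agreement that would be expected by chance given the class proportions.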
Akimoto, Yuki; Yugi, Katsuyuki; Uda, Shinsuke; Kudo, Takamasa; Komori, Yasunori; Kubota, Hiroyuki; Kuroda, Shinya
2013-01-01
Cells use common signaling molecules for the selective control of downstream gene expression and cell-fate decisions. The relationship between signaling molecules and downstream gene expression and cellular phenotypes is a multiple-input and multiple-output (MIMO) system and is difficult to understand due to its complexity. For example, it has been reported that, in PC12 cells, different types of growth factors activate MAP kinases (MAPKs) including ERK, JNK, and p38, and CREB, for selective protein expression of immediate early genes (IEGs) such as c-FOS, c-JUN, EGR1, JUNB, and FOSB, leading to cell differentiation, proliferation and cell death; however, how multiple-inputs such as MAPKs and CREB regulate multiple-outputs such as expression of the IEGs and cellular phenotypes remains unclear. To address this issue, we employed a statistical method called partial least squares (PLS) regression, which involves a reduction of the dimensionality of the inputs and outputs into latent variables and a linear regression between these latent variables. We measured 1,200 data points for MAPKs and CREB as the inputs and 1,900 data points for IEGs and cellular phenotypes as the outputs, and we constructed the PLS model from these data. The PLS model highlighted the complexity of the MIMO system and growth factor-specific input-output relationships of cell-fate decisions in PC12 cells. Furthermore, to reduce the complexity, we applied a backward elimination method to the PLS regression, in which 60 input variables were reduced to 5 variables, including the phosphorylation of ERK at 10 min, CREB at 5 min and 60 min, AKT at 5 min and JNK at 30 min. The simple PLS model with only 5 input variables demonstrated a predictive ability comparable to that of the full PLS model. The 5 input variables effectively extracted the growth factor-specific simple relationships within the MIMO system in cell-fate decisions in PC12 cells.
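The PLS regression described in the abstract above (projecting inputs and outputs onto latent variables, then regressing between them) can be sketched with the classical NIPALS algorithm. This is a generic textbook variant written under that assumption, not the authors' implementation. The function name `pls_nipals`, the synthetic data, and the choice of three components are all hypothetical.

```python
import numpy as np

def pls_nipals(X, Y, n_components, tol=1e-10, max_iter=500):
    """Fit a multi-output PLS regression by NIPALS.
    Returns (B, x_mean, y_mean) so that Y_hat = (X - x_mean) @ B + y_mean."""
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xr, Yr = X - x_mean, Y - y_mean  # centered copies, deflated per component
    W, P, C = [], [], []
    for a in range(n_components):
        u = Yr[:, np.argmax(Yr.var(axis=0))]  # start from the most variable output
        t_old = np.zeros(Xr.shape[0])
        for it in range(max_iter):
            w = Xr.T @ u
            w /= np.linalg.norm(w)      # input weights (unit length)
            t = Xr @ w                  # input scores (latent variable)
            c = Yr.T @ t / (t @ t)      # output loadings
            u = Yr @ c / (c @ c)        # output scores
            if np.linalg.norm(t - t_old) <= tol * np.linalg.norm(t):
                break
            t_old = t
        p = Xr.T @ t / (t @ t)          # input loadings
        Xr = Xr - np.outer(t, p)        # deflate inputs
        Yr = Yr - np.outer(t, c)        # deflate outputs
        W.append(w)
        P.append(p)
        C.append(c)
    W, P, C = np.array(W).T, np.array(P).T, np.array(C).T
    B = W @ np.linalg.solve(P.T @ W, C.T)  # regression coefficients in input space
    return B, x_mean, y_mean

# Hypothetical data: 8 inputs and 4 outputs driven by 3 shared latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(100, 3))
X = latent @ rng.normal(size=(3, 8)) + 0.05 * rng.normal(size=(100, 8))
Y = latent @ rng.normal(size=(3, 4)) + 0.05 * rng.normal(size=(100, 4))

B, x_mean, y_mean = pls_nipals(X, Y, n_components=3)
Y_hat = (X - x_mean) @ B + y_mean
r2 = 1.0 - ((Y - Y_hat) ** 2).sum() / ((Y - Y.mean(axis=0)) ** 2).sum()
print(f"in-sample R^2 with 3 latent variables: {r2:.3f}")
```

The R² printed here is an in-sample fit only. A backward elimination like the one the abstract mentions would refit such a model after removing one input variable at a time and compare predictive ability, ideally under cross-validation.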
Noncontact analysis of the fiber weight per unit area in prepreg by near-infrared spectroscopy.
Jiang, B; Huang, Y D
2008-05-26
The fiber weight per unit area in prepreg is an important factor in ensuring the quality of composite products. Near-infrared spectroscopy (NIRS) together with a noncontact reflectance source was applied to the quality analysis of the fiber weight per unit area. The fiber weight per unit area ranged from 13.39 to 14.14 mg cm⁻². Partial least squares (PLS) and principal components regression (PCR) were used as regression methods. The calibration model was developed from 55 samples to determine the fiber weight per unit area in prepreg. The determination coefficient (R²), root mean square error of calibration (RMSEC) and root mean square error of prediction (RMSEP) were 0.82, 0.092 and 0.099, respectively. The predicted values of the fiber weight per unit area in prepreg measured by NIRS technology were comparable to the values obtained by the reference method. For this technology, the noncontact reflectance source was focused directly on the sample with neither prior treatment nor manipulation. A paired t-test revealed no significant difference between the NIR method and the reference method. In addition, the prepreg could be analyzed within 20 s per measurement without sample destruction.
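The error figures and the paired t-test named in the abstract above can be illustrated with a short sketch. It assumes the plain definition RMSE = sqrt(mean squared difference), with RMSEC and RMSEP being that formula applied to calibration and prediction samples respectively. Some chemometrics texts adjust the RMSEC denominator for degrees of freedom, and the abstract does not say which convention was used. The function names and the six data values are hypothetical.

```python
import math

def rmse(reference, predicted):
    """Root mean square error between reference and predicted values."""
    n = len(reference)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)

def paired_t_statistic(a, b):
    """t statistic of a paired t-test on the differences a[i] - b[i]."""
    d = [x - y for x, y in zip(a, b)]
    n = len(d)
    mean_d = sum(d) / n
    var_d = sum((v - mean_d) ** 2 for v in d) / (n - 1)  # sample variance
    return mean_d / math.sqrt(var_d / n)

# Hypothetical fiber weights (mg per square cm): reference method vs NIR prediction.
reference = [13.40, 13.55, 13.70, 13.85, 14.00, 14.10]
predicted = [13.45, 13.50, 13.78, 13.80, 14.05, 14.06]

error = rmse(reference, predicted)
t_stat = paired_t_statistic(reference, predicted)
print(f"RMSE = {error:.4f}, paired t = {t_stat:.3f} (df = {len(reference) - 1})")
```

With 5 degrees of freedom the two-sided 5% critical value is 2.571, so a t statistic of smaller magnitude, as in this made-up example, would indicate no significant difference between the two methods.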
[Identification of two varieties of Citri Fructus by fingerprint and chemometrics].
Su, Jing-hua; Zhang, Chao; Sun, Lei; Gu, Bing-ren; Ma, Shuang-cheng
2015-06-01
Identification of Citri Fructus by fingerprint and chemometrics was investigated in this paper. Twenty-three Citri Fructus samples were collected, representing the two varieties recorded in the Chinese Pharmacopoeia, Citrus wilsonii and C. medica. HPLC chromatograms were obtained. The components were partly identified using reference substances, and a common pattern was then established for chemometric analysis. Similarity analysis, principal component analysis (PCA), partial least squares-discriminant analysis (PLS-DA) and hierarchical cluster analysis with a heatmap were applied. The results indicated that C. wilsonii and C. medica could be well classified using a common pattern containing twenty-five characteristic peaks. In addition, preliminary pattern recognition verified the chemometric results. Absolute peak area (APA) was used for the related quantitative analysis; the results showed differences between the two varieties and are valuable for selecting characteristic components for further quality control.
Pavurala, Naresh; Xu, Xiaoming; Krishnaiah, Yellela S R
2017-05-15
Hyperspectral imaging using near infrared spectroscopy (NIRS) integrates spectroscopy and conventional imaging to obtain both spectral and spatial information of materials. The non-invasive and rapid nature of hyperspectral imaging using NIRS makes it a valuable process analytical technology (PAT) tool for in-process monitoring and control of the manufacturing process for transdermal drug delivery systems (TDS). The focus of this investigation was to develop and validate the use of Near Infra-red (NIR) hyperspectral imaging to monitor coat thickness uniformity, a critical quality attribute (CQA) for TDS. Chemometric analysis was used to process the hyperspectral image and a partial least square (PLS) model was developed to predict the coat thickness of the TDS. The goodness of model fit and prediction were 0.9933 and 0.9933, respectively, indicating an excellent fit to the training data and also good predictability. The % Prediction Error (%PE) for internal and external validation samples was less than 5% confirming the accuracy of the PLS model developed in the present study. The feasibility of the hyperspectral imaging as a real-time process analytical tool for continuous processing was also investigated. When the PLS model was applied to detect deliberate variation in coating thickness, it was able to predict both the small and large variations as well as identify coating defects such as non-uniform regions and presence of air bubbles.
Standoff detection of chemical and biological threats using laser-induced breakdown spectroscopy.
Gottfried, Jennifer L; De Lucia, Frank C; Munson, Chase A; Miziolek, Andrzej W
2008-04-01
Laser-induced breakdown spectroscopy (LIBS) is a promising technique for real-time chemical and biological warfare agent detection in the field. We have demonstrated the detection and discrimination of the biological warfare agent surrogates Bacillus subtilis (BG) (2% false negatives, 0% false positives) and ovalbumin (0% false negatives, 1% false positives) at 20 meters using standoff laser-induced breakdown spectroscopy (ST-LIBS) and linear correlation. Unknown interferent samples (not included in the model), samples on different substrates, and mixtures of BG and Arizona road dust have been classified with reasonable success using partial least squares discriminant analysis (PLS-DA). A few of the samples tested such as the soot (not included in the model) and the 25% BG:75% dust mixture resulted in a significant number of false positives or false negatives, respectively. Our preliminary results indicate that while LIBS is able to discriminate biomaterials with similar elemental compositions at standoff distances based on differences in key intensity ratios, further work is needed to reduce the number of false positives/negatives by refining the PLS-DA model to include a sufficient range of material classes and carefully selecting a detection threshold. In addition, we have demonstrated that LIBS can distinguish five different organophosphate nerve agent simulants at 20 meters, despite their similar stoichiometric formulas. Finally, a combined PLS-DA model for chemical, biological, and explosives detection using a single ST-LIBS sensor has been developed in order to demonstrate the potential of standoff LIBS for universal hazardous materials detection.
Orthogonal decomposition of left ventricular remodeling in myocardial infarction
Zhang, Xingyu; Medrano-Gracia, Pau; Ambale-Venkatesh, Bharath; Bluemke, David A.; Cowan, Brett R; Finn, J. Paul; Kadish, Alan H.; Lee, Daniel C.; Lima, Joao A. C.; Young, Alistair A.; Suinesiaputra, Avan
2017-01-01
Left ventricular size and shape are important for quantifying cardiac remodeling in response to cardiovascular disease. Geometric remodeling indices have been shown to have prognostic value in predicting adverse events in the clinical literature, but these often describe interrelated shape changes. We developed a novel method for deriving orthogonal remodeling components directly from any (moderately independent) set of clinical remodeling indices. Results: Six clinical remodeling indices (end-diastolic volume index, sphericity, relative wall thickness, ejection fraction, apical conicity, and longitudinal shortening) were evaluated using cardiac magnetic resonance images of 300 patients with myocardial infarction, and 1991 asymptomatic subjects, obtained from the Cardiac Atlas Project. Partial least squares (PLS) regression of left ventricular shape models resulted in remodeling components that were optimally associated with each remodeling index. A Gram–Schmidt orthogonalization process, by which remodeling components were successively removed from the shape space in the order of shape variance explained, resulted in a set of orthonormal remodeling components. Remodeling scores could then be calculated that quantify the amount of each remodeling component present in each case. A one-factor PLS regression led to more decoupling between scores from the different remodeling components across the entire cohort, and zero correlation between clinical indices and subsequent scores. Conclusions: The PLS orthogonal remodeling components had similar power to describe differences between myocardial infarction patients and asymptomatic subjects as principal component analysis, but were better associated with well-understood clinical indices of cardiac remodeling. The data and analyses are available from www.cardiacatlas.org. PMID:28327972
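The Gram–Schmidt step named in the abstract above can be sketched generically as follows. This shows only the orthonormalization of an ordered set of vectors (in its numerically more stable "modified" form). It does not reproduce the study's shape models, the PLS step, or the ordering by shape variance explained. The function name `gram_schmidt` and the random stand-in components are hypothetical.

```python
import numpy as np

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize the rows of `vectors` in the given order.
    Each vector has the previously accepted directions removed before
    it is normalized; near-zero remainders are skipped."""
    basis = []
    for v in np.asarray(vectors, dtype=float):
        w = v.copy()
        for q in basis:
            w -= (q @ w) * q  # remove the part of w along q
        norm = np.linalg.norm(w)
        if norm > tol:
            basis.append(w / norm)
    return np.array(basis)

# Hypothetical stand-ins: 3 correlated "components" in a 10-dimensional space.
rng = np.random.default_rng(1)
components = rng.normal(size=(3, 10))
components[1] += 0.8 * components[0]  # make the second overlap with the first

Q = gram_schmidt(components)
scores = rng.normal(size=(5, 10)) @ Q.T  # projections of 5 hypothetical cases
print(np.round(Q @ Q.T, 6))
```

Because the result depends on the order in which vectors are processed, the ordering rule matters; the abstract states that components were removed in order of shape variance explained.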
Lin, M; Al-Holy, M; Mousavi-Hesary, M; Al-Qadiri, H; Cavinato, A G; Rasco, B A
2004-01-01
The aim was to evaluate the feasibility of visible and short-wavelength near-infrared (SW-NIR) diffuse reflectance spectroscopy (600-1100 nm) for quantifying the microbial loads in chicken meat and to develop a rapid methodology for monitoring the onset of spoilage. Twenty-four prepackaged fresh chicken breast muscle samples were prepared and stored at 21 °C for 24 h. Visible and SW-NIR spectroscopy was used to detect and quantify the microbial loads in chicken breast muscle at time intervals of 0, 2, 4, 6, 8, 10, 12 and 24 h. Spectra were collected in the diffuse reflectance mode (600-1100 nm). Total aerobic plate count (APC) of each sample was determined by the spread plate method at 32 °C for 48 h. Principal component analysis (PCA) and partial least squares (PLS) based prediction models were developed. PCA showed clear segregation of samples held 8 h or longer compared with the 0-h control. An optimum PLS model required eight latent variables for chicken muscle (R = 0.91, SEP = 0.48 log CFU g⁻¹). Visible and SW-NIR spectroscopy combined with PCA is capable of perceiving the change in the microbial loads in chicken muscle once the APC increases slightly above 1 log cycle. Accurate quantification of the bacterial loads in chicken muscle can be obtained from the PLS-based prediction method. Visible and SW-NIR spectroscopy is a technique with considerable potential for monitoring food safety and food spoilage. It can acquire a metabolic snapshot and quantify the microbial loads of food samples rapidly, accurately, and noninvasively. This method would allow for more expeditious applications of quality control in food industries.