NASA Astrophysics Data System (ADS)
Shan, Jiajia; Wang, Xue; Zhou, Hao; Han, Shuqing; Riza, Dimas Firmanda Al; Kondo, Naoshi
2018-04-01
Synchronous fluorescence spectra combined with multivariate analysis were used to predict the flavonoid content of green tea rapidly and nondestructively. This paper presents a new and efficient spectral interval selection method called clustering-based partial least squares (CL-PLS), which selects informative wavelengths by combining the clustering concept with partial least squares (PLS) to improve the performance of models built on synchronous fluorescence spectra. The fluorescence spectra of tea samples were obtained, k-means and Kohonen self-organizing map clustering algorithms were used to cluster the full spectra into several clusters, and a sub-PLS regression model was developed on each cluster. Finally, CL-PLS models consisting of gradually selected clusters were built. The correlation coefficient (R) was used to evaluate the prediction performance of the PLS models. In addition, variable influence on projection partial least squares (VIP-PLS), selectivity ratio partial least squares (SR-PLS), interval partial least squares (iPLS) and full-spectrum PLS models were investigated and the results were compared. The results showed that CL-PLS gave the best flavonoid predictions from synchronous fluorescence spectra.
Kernel Partial Least Squares for Nonlinear Regression and Discrimination
NASA Technical Reports Server (NTRS)
Rosipal, Roman; Clancy, Daniel (Technical Monitor)
2002-01-01
This paper summarizes recent results on applying the method of partial least squares (PLS) in a reproducing kernel Hilbert space (RKHS). A previously proposed kernel PLS regression model was proven to be competitive with other regularized regression methods in RKHS. The family of nonlinear kernel-based PLS models is extended by considering the kernel PLS method for discrimination. Theoretical and experimental results on a two-class discrimination problem indicate usefulness of the method.
Lu, Yuzhen; Du, Changwen; Yu, Changbing; Zhou, Jianmin
2014-08-01
Fast and non-destructive determination of rapeseed protein content carries significant implications for rapeseed production. This study presents the first attempt to use Fourier transform mid-infrared photoacoustic spectroscopy (FTIR-PAS) to quantify the protein content of rapeseed. The full-spectrum model was first built using partial least squares (PLS). Interval selection methods including interval partial least squares (iPLS), synergy interval partial least squares (siPLS), backward elimination interval partial least squares (biPLS) and dynamic backward elimination interval partial least squares (dyn-biPLS) were then employed to select the relevant band or band combination for PLS modeling. The full-spectrum PLS model achieved a ratio of prediction to deviation (RPD) of 2.047. In comparison, all interval selection methods produced better results than full-spectrum modeling. siPLS achieved the best predictive accuracy, with an RPD of 3.215, when the spectrum was sectioned into 25 intervals and two intervals (1198-1335 and 1614-1753 cm⁻¹) were selected. iPLS outperformed biPLS and dyn-biPLS, and dyn-biPLS performed slightly better than biPLS. FTIR-PAS was verified as a promising analytical tool to quantify rapeseed protein content. Interval selection could extract the relevant individual band or synergy band associated with the sample constituent of interest, and thereby improve on the prediction accuracy of the full-spectrum model. © 2013 Society of Chemical Industry.
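The RPD figures quoted above can be illustrated with a minimal sketch. It assumes the common definition of RPD as the standard deviation of the reference values divided by the RMSEP; the abstract does not state which definition the authors used, and the sample values below are hypothetical, not data from the study.

```python
import math

def rmsep(reference, predicted):
    """Root mean square error of prediction."""
    n = len(reference)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)

def rpd(reference, predicted):
    """RPD, assumed here to be SD(reference) / RMSEP."""
    n = len(reference)
    mean = sum(reference) / n
    sd = math.sqrt(sum((r - mean) ** 2 for r in reference) / (n - 1))
    return sd / rmsep(reference, predicted)

# Hypothetical protein contents (%) and model predictions.
y_ref = [20.0, 22.0, 24.0, 26.0, 28.0]
y_hat = [20.5, 21.5, 24.5, 25.5, 28.5]

print(rmsep(y_ref, y_hat), rpd(y_ref, y_hat))
```

Under this definition a larger RPD means the prediction error is small relative to the natural spread of the reference values, which is why the rise from 2.047 to 3.215 is read as an improvement.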
Divya, O; Mishra, Ashok K
2007-05-29
Quantitative determination of kerosene fraction present in diesel has been carried out based on excitation emission matrix fluorescence (EEMF) along with parallel factor analysis (PARAFAC) and N-way partial least squares regression (N-PLS). EEMF is a simple, sensitive and nondestructive method suitable for the analysis of multifluorophoric mixtures. Calibration models consisting of varying compositions of diesel and kerosene were constructed and their validation was carried out using leave-one-out cross validation method. The accuracy of the model was evaluated through the root mean square error of prediction (RMSEP) for the PARAFAC, N-PLS and unfold PLS methods. N-PLS was found to be a better method compared to PARAFAC and unfold PLS method because of its low RMSEP values.
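Leave-one-out cross validation, as used above, can be sketched as follows. A univariate least-squares line stands in for the PARAFAC and N-PLS calibration models, which are far more involved; the signal and concentration values are hypothetical.

```python
import math

def fit_line(x, y):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx

def loocv_rmse(x, y):
    """Hold out each sample once, refit on the rest, and pool the squared errors."""
    squared_errors = []
    for i in range(len(x)):
        x_train = x[:i] + x[i + 1:]
        y_train = y[:i] + y[i + 1:]
        slope, intercept = fit_line(x_train, y_train)
        squared_errors.append((y[i] - (slope * x[i] + intercept)) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical fluorescence signal vs. kerosene fraction (%).
signal = [1.0, 2.0, 3.0, 4.0, 5.0]
kerosene_exact = [10.0, 20.0, 30.0, 40.0, 50.0]
kerosene_noisy = [10.0, 21.0, 29.0, 41.0, 50.0]

print(loocv_rmse(signal, kerosene_exact), loocv_rmse(signal, kerosene_noisy))
```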
Cao, Hui; Li, Yao-Jiang; Zhou, Yan; Wang, Yan-Xia
2014-11-01
To deal with the nonlinear characteristics of spectral data for thermal power plant flue gas, a nonlinear partial least squares (PLS) analysis method with an internal model based on a neural network is adopted in this paper. The latent variables of the independent and dependent variables are first extracted by PLS regression, and are then used as the inputs and outputs of the neural network, respectively, to build the nonlinear internal model through training. For spectral data of thermal power plant flue gases, PLS, nonlinear PLS with a back propagation neural network internal model (BP-NPLS), nonlinear PLS with a radial basis function neural network internal model (RBF-NPLS) and nonlinear PLS with an adaptive fuzzy inference system internal model (ANFIS-NPLS) are compared. Relative to PLS, the root mean square error of prediction (RMSEP) for sulfur dioxide is reduced by 16.96%, 16.60% and 19.55% by BP-NPLS, RBF-NPLS and ANFIS-NPLS, respectively. The RMSEP for nitric oxide is reduced by 8.60%, 8.47% and 10.09%, and the RMSEP for nitrogen dioxide by 2.11%, 3.91% and 3.97%, respectively. Experimental results show that nonlinear PLS is more suitable than PLS for the quantitative analysis of flue gas. Moreover, by using neural network functions that can closely approximate nonlinear characteristics, the nonlinear PLS method with an internal model described in this paper has good predictive capability and robustness, and can to a certain extent overcome the limitations of nonlinear PLS methods with other internal models such as polynomial and spline functions. ANFIS-NPLS has the best performance, because the adaptive fuzzy inference system internal model is able to learn more and reduce the residuals effectively. Hence, ANFIS-NPLS is an accurate and useful method for quantitative analysis of thermal power plant flue gas.
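The latent-variable extraction step described above can be sketched for a single response (PLS1) and a single latent variable. This is a minimal NIPALS-style illustration under stated assumptions, not the authors' implementation: the inner relation between the score t and the response is the ordinary linear coefficient q, which is the part the paper replaces with a neural network. The data are hypothetical.

```python
import math

def pls1_first_component(X, y):
    """Return (w, t, q, y_mean) for the first PLS1 latent variable.

    w: unit-length X-weights, t: sample scores, q: linear inner coefficient.
    """
    n, p = len(X), len(X[0])
    # Mean-center X column-wise and y.
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector is proportional to X^T y, normalised to unit length.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores: projection of each centered sample onto w.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    # Linear inner relation y ~ q * t (a neural network would be fitted here instead).
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return w, t, q, y_mean

# Hypothetical two-channel "spectra" and gas concentrations.
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
y = [1.0, 2.0, 3.0, 4.0]

w, t, q, y_mean = pls1_first_component(X, y)
fitted = [y_mean + q * ti for ti in t]
print(fitted)
```

In a nonlinear variant, the pairs (t, y) produced here would become the training inputs and outputs of the internal model.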
NASA Astrophysics Data System (ADS)
Talebpour, Zahra; Tavallaie, Roya; Ahmadi, Seyyed Hamid; Abdollahpour, Assem
2010-09-01
In this study, a new method for the simultaneous determination of penicillin G salts in a pharmaceutical mixture via FT-IR spectroscopy combined with chemometrics was investigated. The mixture of penicillin G salts is a complex system because the components have similar analytical characteristics. Partial least squares (PLS) and radial basis function-partial least squares (RBF-PLS) were used to model the linear and nonlinear relations between spectra and components, respectively. The orthogonal signal correction (OSC) preprocessing method was used to correct unwanted information, such as spectral overlapping and scattering effects. To compare the influence of OSC on PLS and RBF-PLS models, the optimal linear (PLS) and nonlinear (RBF-PLS) models based on conventional and OSC-preprocessed spectra were established and compared. The results demonstrated that OSC clearly enhanced the performance of both RBF-PLS and PLS calibration models. In addition, where there was some nonlinear relation between spectra and component, OSC-RBF-PLS gave more satisfactory results than the OSC-PLS model, which indicated that OSC helped to remove extrinsic deviations from linearity without eliminating nonlinear information related to the component. The chemometric models were tested on an external dataset and finally applied to the analysis of a commercial injection product of penicillin G salts.
NASA Astrophysics Data System (ADS)
Yuniarto, Budi; Kurniawan, Robert
2017-03-01
PLS Path Modeling (PLS-PM) differs from covariance-based SEM in that it uses a variance- or component-based approach; PLS-PM is therefore also known as component-based SEM. Multiblock Partial Least Squares (MBPLS) is a PLS regression method that can be used in PLS Path Modeling, where it is known as Multiblock PLS Path Modeling (MBPLS-PM). This method uses an iterative procedure in its algorithm. This research aims to modify MBPLS-PM with a Back Propagation Neural Network approach. The result is that the MBPLS-PM algorithm can be modified using the Back Propagation Neural Network approach to replace the iterative process in the backward and forward steps that obtain the matrix t and the matrix u. With this modification, the model parameters obtained do not differ significantly from those obtained by the original MBPLS-PM algorithm.
Song, Weiran; Wang, Hui; Maguire, Paul; Nibouche, Omar
2018-06-07
Partial Least Squares Discriminant Analysis (PLS-DA) is one of the most effective multivariate analysis methods for spectral data analysis, which extracts latent variables and uses them to predict responses. In particular, it is an effective method for handling high-dimensional and collinear spectral data. However, PLS-DA does not explicitly address data multimodality, i.e., within-class multimodal distribution of data. In this paper, we present a novel method termed nearest clusters based PLS-DA (NCPLS-DA) for addressing the multimodality and nonlinearity issues explicitly and improving the performance of PLS-DA on spectral data classification. The new method applies hierarchical clustering to divide samples into clusters and calculates the corresponding centre of every cluster. For a given query point, only clusters whose centres are nearest to such a query point are used for PLS-DA. Such a method can provide a simple and effective tool for separating multimodal and nonlinear classes into clusters which are locally linear and unimodal. Experimental results on 17 datasets, including 12 UCI and 5 spectral datasets, show that NCPLS-DA can outperform 4 baseline methods, namely, PLS-DA, kernel PLS-DA, local PLS-DA and k-NN, achieving the highest classification accuracy most of the time. Copyright © 2018 Elsevier B.V. All rights reserved.
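The cluster selection step described above (keep only the clusters whose centres are nearest to the query point) can be sketched as follows. The sketch assumes cluster memberships are already available, whereas the paper obtains them by hierarchical clustering; the function names, the Euclidean distance and the data are illustrative assumptions.

```python
import math

def centroid(points):
    """Coordinate-wise mean of a list of equal-length vectors."""
    n = len(points)
    return [sum(p[j] for p in points) / n for j in range(len(points[0]))]

def nearest_clusters(clusters, query, k):
    """Return the ids of the k clusters whose centres are closest to the query."""
    centres = {cid: centroid(pts) for cid, pts in clusters.items()}
    ranked = sorted(centres, key=lambda cid: math.dist(centres[cid], query))
    return ranked[:k]

# Hypothetical two-dimensional samples grouped into three clusters.
clusters = {
    "a": [[0.0, 0.0], [0.0, 2.0]],
    "b": [[5.0, 5.0], [7.0, 5.0]],
    "c": [[10.0, 0.0], [12.0, 0.0]],
}
selected = nearest_clusters(clusters, query=[5.0, 4.0], k=2)
print(selected)
```

Only the samples in the selected clusters would then be passed to a local PLS-DA model for that query.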
Local classification: Locally weighted-partial least squares-discriminant analysis (LW-PLS-DA).
Bevilacqua, Marta; Marini, Federico
2014-08-01
The possibility of devising a simple, flexible and accurate non-linear classification method, by extending the locally weighted partial least squares (LW-PLS) approach to the cases where the algorithm is used in a discriminant way (partial least squares discriminant analysis, PLS-DA), is presented. In particular, to assess which category an unknown sample belongs to, the proposed algorithm identifies the training objects most similar to the one to be predicted and builds a PLS-DA model using these calibration samples only. Moreover, the influence of the selected training samples on the local model can be further modulated by adopting a non-uniform distance-based weighting scheme, which gives the farthest calibration objects less impact than the closest ones. The performance of the proposed locally weighted-partial least squares-discriminant analysis (LW-PLS-DA) algorithm has been tested on three simulated data sets characterized by varying degrees of non-linearity: in all cases, a classification accuracy higher than 99% on external validation samples was achieved. Moreover, when applied to a real data set (classification of rice varieties) characterized by a high degree of non-linearity, the proposed method provided an average correct classification rate of about 93% on the test set. According to the preliminary results shown in this paper, the performance of the proposed LW-PLS-DA approach is comparable to, and in some cases better than, that obtained by other non-linear methods (k nearest neighbors, kernel-PLS-DA and, in the case of rice, counterpropagation neural networks). Copyright © 2014 Elsevier B.V. All rights reserved.
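The local sample selection and distance-based weighting described above can be sketched as follows. The abstract does not specify the weighting function, so the tricube kernel used here is an assumption, as are the function name and the data.

```python
import math

def local_training_set(X_train, query, n_neighbors):
    """Return (index, weight) pairs for the n_neighbors objects closest to the query.

    Weights follow a tricube kernel (an assumption): 1 at zero distance,
    falling to 0 for the farthest selected object.
    """
    dists = sorted((math.dist(x, query), i) for i, x in enumerate(X_train))
    nearest = dists[:n_neighbors]
    d_max = nearest[-1][0] or 1.0  # guard against all distances being zero
    return [(i, (1 - (d / d_max) ** 3) ** 3) for d, i in nearest]

# Hypothetical training objects along one axis.
X_train = [[0.0, 0.0], [1.0, 0.0], [2.0, 0.0], [5.0, 0.0]]
local = local_training_set(X_train, query=[0.0, 0.0], n_neighbors=3)
print(local)
```

The weighted subset would then be used to fit a PLS-DA model for that single query, and the procedure repeated for each new sample.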
NASA Astrophysics Data System (ADS)
Hart, Brian K.; Griffiths, Peter R.
1998-06-01
Partial least squares (PLS) regression has been evaluated as a robust calibration technique for over 100 hazardous air pollutants (HAPs) measured by open path Fourier transform infrared (OP/FT-IR) spectrometry. PLS has the advantage over the current recommended calibration method of classical least squares (CLS), in that it can look at the whole useable spectrum (700-1300 cm⁻¹, 2000-2150 cm⁻¹, and 2400-3000 cm⁻¹), and detect several analytes simultaneously. Up to one hundred HAPs synthetically added to OP/FT-IR backgrounds have been simultaneously calibrated and detected using PLS. PLS also has the advantage in requiring less preprocessing of spectra than that which is required in CLS calibration schemes, allowing PLS to provide user independent real-time analysis of OP/FT-IR spectra.
Marques Junior, Jucelino Medeiros; Muller, Aline Lima Hermes; Foletto, Edson Luiz; da Costa, Adilson Ben; Bizzi, Cezar Augusto; Irineu Muller, Edson
2015-01-01
A method for the determination of propranolol hydrochloride in pharmaceutical preparations using near infrared spectrometry with a fiber optic probe (FTNIR/PROBE) combined with chemometric methods was developed. Calibration models were developed using two variable selection methods: interval partial least squares (iPLS) and synergy interval partial least squares (siPLS). Treatments based on mean-centered data and multiplicative scatter correction (MSC) were selected for model construction. A root mean square error of prediction (RMSEP) of 8.2 mg g⁻¹ was achieved using the siPLS (s2i20PLS) algorithm with spectra divided into 20 intervals and a combination of 2 intervals (8501 to 8801 and 5201 to 5501 cm⁻¹). Results obtained by the proposed method were compared with those from the pharmacopoeia reference method and no significant difference was observed. The proposed method therefore allowed a fast, precise, and accurate determination of propranolol hydrochloride in pharmaceutical preparations. Furthermore, on-line analysis of this active principle in pharmaceutical formulations is possible with the fiber optic probe.
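The siPLS search space described above (20 intervals, combinations of 2) can be enumerated with a short sketch. It only generates the candidate variable subsets; fitting and scoring a PLS model on each subset is omitted. The number of spectral variables (1000) is a hypothetical value, not taken from the study.

```python
from itertools import combinations

def split_intervals(n_variables, n_intervals):
    """Split variable indices 0..n_variables-1 into contiguous, near-equal blocks."""
    base, extra = divmod(n_variables, n_intervals)
    intervals, start = [], 0
    for k in range(n_intervals):
        size = base + (1 if k < extra else 0)
        intervals.append(range(start, start + size))
        start += size
    return intervals

def interval_combinations(intervals, n_combined):
    """Yield (interval ids, merged variable indices) for every candidate siPLS model."""
    for ids in combinations(range(len(intervals)), n_combined):
        yield ids, [j for k in ids for j in intervals[k]]

intervals = split_intervals(n_variables=1000, n_intervals=20)
candidates = list(interval_combinations(intervals, n_combined=2))
print(len(candidates))  # number of two-interval models to evaluate
```

Each candidate would be scored by cross validation, and the combination with the lowest error retained.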
Rapid and Simultaneous Prediction of Eight Diesel Quality Parameters through ATR-FTIR Analysis.
Nespeca, Maurilio Gustavo; Hatanaka, Rafael Rodrigues; Flumignan, Danilo Luiz; de Oliveira, José Eduardo
2018-01-01
Quality assessment of diesel fuel is highly necessary for society, but the costs and time spent are very high while using standard methods. Therefore, this study aimed to develop an analytical method capable of simultaneously determining eight diesel quality parameters (density; flash point; total sulfur content; distillation temperatures at 10% (T10), 50% (T50), and 85% (T85) recovery; cetane index; and biodiesel content) through attenuated total reflection Fourier transform infrared (ATR-FTIR) spectroscopy and the multivariate regression method, partial least square (PLS). For this purpose, the quality parameters of 409 samples were determined using standard methods, and their spectra were acquired in ranges of 4000-650 cm⁻¹. The use of the multivariate filters, generalized least squares weighting (GLSW) and orthogonal signal correction (OSC), was evaluated to improve the signal-to-noise ratio of the models. Likewise, four variable selection approaches were tested: manual exclusion, forward interval PLS (FiPLS), backward interval PLS (BiPLS), and genetic algorithm (GA). The multivariate filters and variables selection algorithms generated more fitted and accurate PLS models. According to the validation, the FTIR/PLS models presented accuracy comparable to the reference methods and, therefore, the proposed method can be applied in the diesel routine monitoring to significantly reduce costs and analysis time.
Rapid and Simultaneous Prediction of Eight Diesel Quality Parameters through ATR-FTIR Analysis
Hatanaka, Rafael Rodrigues; Flumignan, Danilo Luiz; de Oliveira, José Eduardo
2018-01-01
Quality assessment of diesel fuel is highly necessary for society, but the costs and time spent are very high while using standard methods. Therefore, this study aimed to develop an analytical method capable of simultaneously determining eight diesel quality parameters (density; flash point; total sulfur content; distillation temperatures at 10% (T10), 50% (T50), and 85% (T85) recovery; cetane index; and biodiesel content) through attenuated total reflection Fourier transform infrared (ATR-FTIR) spectroscopy and the multivariate regression method, partial least square (PLS). For this purpose, the quality parameters of 409 samples were determined using standard methods, and their spectra were acquired in ranges of 4000–650 cm−1. The use of the multivariate filters, generalized least squares weighting (GLSW) and orthogonal signal correction (OSC), was evaluated to improve the signal-to-noise ratio of the models. Likewise, four variable selection approaches were tested: manual exclusion, forward interval PLS (FiPLS), backward interval PLS (BiPLS), and genetic algorithm (GA). The multivariate filters and variables selection algorithms generated more fitted and accurate PLS models. According to the validation, the FTIR/PLS models presented accuracy comparable to the reference methods and, therefore, the proposed method can be applied in the diesel routine monitoring to significantly reduce costs and analysis time. PMID:29629209
NASA Astrophysics Data System (ADS)
Kang, Qian; Ru, Qingguo; Liu, Yan; Xu, Lingyan; Liu, Jia; Wang, Yifei; Zhang, Yewen; Li, Hui; Zhang, Qing; Wu, Qing
2016-01-01
An on-line near infrared (NIR) spectroscopy monitoring method with an appropriate multivariate calibration method was developed for the extraction process of Fu-fang Shuanghua oral solution (FSOS). On-line NIR spectra were collected through two fiber optic probes, which were designed to transmit NIR radiation by a 2 mm flange. Partial least squares (PLS), interval PLS (iPLS) and synergy interval PLS (siPLS) algorithms were compared for building the calibration regression models. During the extraction process, NIR spectroscopy was used to determine the chlorogenic acid (CA) content, total phenolic acids content (TPC), total flavonoids content (TFC) and soluble solid content (SSC). High performance liquid chromatography (HPLC), ultraviolet spectrophotometry (UV) and loss on drying were employed as reference methods. Experimental results showed that the siPLS model performed best compared with PLS and iPLS. The calibration models for CA, TPC, TFC and SSC had high determination coefficients (R²) (0.9948, 0.9992, 0.9950 and 0.9832) and low root mean square errors of cross validation (RMSECV) (0.0113, 0.0341, 0.1787 and 1.2158), which indicates a good correlation between reference values and NIR predicted values. The overall results show that the on-line detection method could be feasible in real application and would be of great value for monitoring the mixed decoction process of FSOS and other Chinese patent medicines.
Xie, Chuanqi; He, Yong
2016-01-01
This study was carried out to use hyperspectral imaging technique for determining color (L*, a* and b*) and eggshell strength and identifying cracked chicken eggs. Partial least squares (PLS) models based on full and selected wavelengths suggested by regression coefficient (RC) method were established to predict the four parameters, respectively. Partial least squares-discriminant analysis (PLS-DA) and RC-partial least squares-discriminant analysis (RC-PLS-DA) models were applied to identify cracked eggs. PLS models performed well with the correlation coefficient (rp) of 0.788 for L*, 0.810 for a*, 0.766 for b* and 0.835 for eggshell strength. RC-PLS models also obtained the rp of 0.771 for L*, 0.806 for a*, 0.767 for b* and 0.841 for eggshell strength. The classification results were 97.06% in PLS-DA model and 88.24% in RC-PLS-DA model. It demonstrated that hyperspectral imaging technique has the potential to be used to detect color and eggshell strength values and identify cracked chicken eggs. PMID:26882990
da Silva, Fabiana E B; Flores, Érico M M; Parisotto, Graciele; Müller, Edson I; Ferrão, Marco F
2016-03-01
An alternative method for the quantification of sulphamethoxazole (SMZ) and trimethoprim (TMP) using diffuse reflectance infrared Fourier-transform spectroscopy (DRIFTS) and partial least squares regression (PLS) was developed. Interval partial least squares (iPLS) and synergy interval partial least squares (siPLS) were applied to select a spectral range that provided a lower prediction error than the full-spectrum model. Fifteen commercial tablet formulations and forty-nine synthetic samples were used. The concentration ranges considered were 400 to 900 mg g⁻¹ SMZ and 80 to 240 mg g⁻¹ TMP. Spectral data were recorded by DRIFTS between 600 and 4000 cm⁻¹ with a 4 cm⁻¹ resolution. The proposed procedure was compared to high performance liquid chromatography (HPLC). The root mean square errors of prediction (RMSEP) obtained during validation of the siPLS models for SMZ and TMP demonstrate that this approach is a valid technique for quantitative analysis of pharmaceutical formulations. The interval selection algorithm allowed regression models to be built with smaller errors than the full-spectrum PLS model. An RMSEP of 13.03 mg g⁻¹ for SMZ and 4.88 mg g⁻¹ for TMP was obtained after selection of the best spectral regions by siPLS.
NASA Astrophysics Data System (ADS)
Müller, Aline Lima Hermes; Picoloto, Rochele Sogari; Mello, Paola de Azevedo; Ferrão, Marco Flores; dos Santos, Maria de Fátima Pereira; Guimarães, Regina Célia Lourenço; Müller, Edson Irineu; Flores, Erico Marlon Moraes
2012-04-01
Total sulfur concentration was determined in atmospheric residue (AR) and vacuum residue (VR) samples obtained from petroleum distillation process by Fourier transform infrared spectroscopy with attenuated total reflectance (FT-IR/ATR) in association with chemometric methods. Calibration and prediction set consisted of 40 and 20 samples, respectively. Calibration models were developed using two variable selection models: interval partial least squares (iPLS) and synergy interval partial least squares (siPLS). Different treatments and pre-processing steps were also evaluated for the development of models. The pre-treatment based on multiplicative scatter correction (MSC) and the mean centered data were selected for models construction. The use of siPLS as variable selection method provided a model with root mean square error of prediction (RMSEP) values significantly better than those obtained by PLS model using all variables. The best model was obtained using siPLS algorithm with spectra divided in 20 intervals and combinations of 3 intervals (911-824, 823-736 and 737-650 cm⁻¹). This model produced a RMSECV of 400 mg kg⁻¹ S and RMSEP of 420 mg kg⁻¹ S, showing a correlation coefficient of 0.990.
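The multiplicative scatter correction (MSC) pre-treatment mentioned above can be sketched in its usual textbook form: each spectrum is regressed on the mean spectrum and then corrected by the fitted offset and slope. This is a generic illustration with hypothetical data, not the authors' processing pipeline.

```python
def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum.

    For each spectrum s, fit s ~ offset + slope * ref by least squares,
    then return (s - offset) / slope.
    """
    n, p = len(spectra), len(spectra[0])
    ref = [sum(s[j] for s in spectra) / n for j in range(p)]
    ref_mean = sum(ref) / p
    sxx = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / p
        slope = sum((r - ref_mean) * (v - s_mean) for r, v in zip(ref, s)) / sxx
        offset = s_mean - slope * ref_mean
        corrected.append([(v - offset) / slope for v in s])
    return corrected

# Hypothetical spectra: the same band shape with different scatter (gain and offset).
base = [1.0, 2.0, 4.0, 3.0]
spectra = [base, [2.0 * b + 1.0 for b in base], [0.5 * b - 0.2 for b in base]]
corrected = msc(spectra)
print(corrected)
```

Because the three hypothetical spectra differ only by gain and offset, they collapse onto the same curve after correction; mean centering would then be applied column-wise before PLS modeling.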
Oliveri, Paolo; López, M Isabel; Casolino, M Chiara; Ruisánchez, Itziar; Callao, M Pilar; Medini, Luca; Lanteri, Silvia
2014-12-03
A new class-modeling method, referred to as partial least squares density modeling (PLS-DM), is presented. The method is based on partial least squares (PLS), using a distance-based sample density measurement as the response variable. Potential function probability density is subsequently calculated on PLS scores and used, jointly with residual Q statistics, to develop efficient class models. The influence of adjustable model parameters on the resulting performances has been critically studied by means of cross-validation and application of the Pareto optimality criterion. The method has been applied to verify the authenticity of olives in brine from cultivar Taggiasca, based on near-infrared (NIR) spectra recorded on homogenized solid samples. Two independent test sets were used for model validation. The final optimal model was characterized by high efficiency and equilibrate balance between sensitivity and specificity values, if compared with those obtained by application of well-established class-modeling methods, such as soft independent modeling of class analogy (SIMCA) and unequal dispersed classes (UNEQ). Copyright © 2014 Elsevier B.V. All rights reserved.
Nondestructive evaluation of soluble solid content in strawberry by near infrared spectroscopy
NASA Astrophysics Data System (ADS)
Guo, Zhiming; Huang, Wenqian; Chen, Liping; Wang, Xiu; Peng, Yankun
This paper demonstrates the feasibility of using near infrared (NIR) spectroscopy combined with the synergy interval partial least squares (siPLS) algorithm as a rapid nondestructive method to estimate the soluble solid content (SSC) of strawberry. Spectral preprocessing methods were selected by cross-validation during model calibration. The partial least squares (PLS) algorithm was used to calibrate the regression model. The performance of the final model was back-evaluated by the root mean square error of calibration (RMSEC) and correlation coefficient (R²c) in the calibration set, and tested by the root mean square error of prediction (RMSEP) and correlation coefficient (R²p) in the prediction set. The optimal siPLS model was obtained after first-derivative spectral preprocessing. The best model gave the following results: RMSEC = 0.2259 and R²c = 0.9590 in the calibration set; RMSEP = 0.2892 and R²p = 0.9390 in the prediction set. This work demonstrates that NIR spectroscopy and siPLS with efficient spectral preprocessing are a useful tool for nondestructive evaluation of SSC in strawberry.
Wang, Yan-peng; Gong, Qi; Yu, Sheng-rong; Liu, You-yan
2012-04-01
A method for detecting trace impurities in a high-concentration matrix by ICP-AES based on partial least squares (PLS) was established. The research showed that PLS could effectively correct the interference caused by high levels of matrix concentration error and could withstand higher matrix concentrations than multicomponent spectral fitting (MSF). When the mass ratios of matrix to impurities ranged from 1000:1 to 20000:1, the standard addition recoveries by PLS were between 95% and 105%. For systems in which the interference effect is nonlinearly correlated with the matrix concentration, the prediction accuracy of the normal PLS method was poor, but it could be improved greatly by using LIN-PPLS, which is based on a matrix transformation of sample concentration. The contents of Co, Pb and Ga in stream sediment (GBW07312) were determined by MSF, PLS and LIN-PPLS, respectively. The results showed that the prediction accuracy of LIN-PPLS was better than that of PLS, and the prediction accuracy of PLS was better than that of MSF.
Katsarov, Plamen; Gergov, Georgi; Alin, Aylin; Pilicheva, Bissera; Al-Degs, Yahya; Simeonov, Vasil; Kassarova, Margarita
2018-03-01
The prediction power of partial least squares (PLS) and multivariate curve resolution-alternating least squares (MCR-ALS) methods has been studied for simultaneous quantitative analysis of the binary drug combination doxylamine succinate and pyridoxine hydrochloride. Analysis of overlapped first-order UV spectra was performed using different PLS models: classical PLS1 and PLS2 as well as partial robust M-regression (PRM). These linear models were compared to MCR-ALS with equality and correlation constraints (MCR-ALS-CC). All techniques operated within the full spectral region and extracted maximum information for the drugs analysed. The developed chemometric methods were validated on external sample sets and were applied to the analysis of pharmaceutical formulations. The obtained statistical parameters were satisfactory for calibration and validation sets. All developed methods can be successfully applied for simultaneous spectrophotometric determination of doxylamine and pyridoxine both in laboratory-prepared mixtures and commercial dosage forms.
NASA Astrophysics Data System (ADS)
Meksiarun, Phiranuphon; Ishigaki, Mika; Huck-Pezzei, Verena A. C.; Huck, Christian W.; Wongravee, Kanet; Sato, Hidetoshi; Ozaki, Yukihiro
2017-03-01
This study aimed to extract the paraffin component from paraffin-embedded oral cancer tissue spectra using three multivariate analysis (MVA) methods: Independent Component Analysis (ICA), Partial Least Squares (PLS) and Independent Component-Partial Least Squares (IC-PLS). The estimated paraffin components were used to remove the contribution of paraffin from the tissue spectra. The three methods were compared in terms of the efficiency of paraffin removal and the ability to retain the tissue information. It was found that ICA, PLS and IC-PLS could remove the paraffin component from the spectra at almost the same level, while Principal Component Analysis (PCA) could not. In terms of retaining cancer tissue spectral integrity, the effects of PLS and IC-PLS on the non-paraffin region were significantly smaller than that of ICA, with which cancer tissue spectral areas deteriorated. The paraffin-removed spectra were used to construct Raman images of oral cancer tissue and compared with Hematoxylin and Eosin (H&E) stained tissues for verification. This study has demonstrated the capability of Raman spectroscopy together with multivariate analysis methods as a diagnostic tool for paraffin-embedded tissue sections.
Miaw, Carolina Sheng Whei; Assis, Camila; Silva, Alessandro Rangel Carolino Sales; Cunha, Maria Luísa; Sena, Marcelo Martins; de Souza, Scheilla Vitorino Carvalho
2018-07-15
Grape, orange, peach and passion fruit nectars were formulated and adulterated by dilution with syrup, apple and cashew juices at 10 levels for each adulterant. Attenuated total reflectance Fourier transform mid infrared (ATR-FTIR) spectra were obtained. Partial least squares (PLS) multivariate calibration models allied to different variable selection methods, such as interval partial least squares (iPLS), ordered predictors selection (OPS) and genetic algorithm (GA), were used to quantify the main fruits. PLS improved by iPLS-OPS variable selection showed the highest predictive capacity to quantify the main fruit contents. The selected variables in the final models varied from 72 to 100; the root mean square errors of prediction were estimated from 0.5 to 2.6%; the correlation coefficients of prediction ranged from 0.948 to 0.990; and, the mean relative errors of prediction varied from 3.0 to 6.7%. All of the developed models were validated. Copyright © 2018 Elsevier Ltd. All rights reserved.
Müller, Aline Lima Hermes; Picoloto, Rochele Sogari; de Azevedo Mello, Paola; Ferrão, Marco Flores; de Fátima Pereira dos Santos, Maria; Guimarães, Regina Célia Lourenço; Müller, Edson Irineu; Flores, Erico Marlon Moraes
2012-04-01
Total sulfur concentration was determined in atmospheric residue (AR) and vacuum residue (VR) samples obtained from the petroleum distillation process by Fourier transform infrared spectroscopy with attenuated total reflectance (FT-IR/ATR) in association with chemometric methods. The calibration and prediction sets consisted of 40 and 20 samples, respectively. Calibration models were developed using two variable selection methods: interval partial least squares (iPLS) and synergy interval partial least squares (siPLS). Different treatments and pre-processing steps were also evaluated for the development of the models. The pre-treatment based on multiplicative scatter correction (MSC) and mean-centered data was selected for model construction. The use of siPLS as the variable selection method provided a model with root mean square error of prediction (RMSEP) values significantly better than those obtained by the PLS model using all variables. The best model was obtained using the siPLS algorithm with the spectra divided into 20 intervals and combinations of 3 intervals (911-824, 823-736 and 737-650 cm(-1)). This model produced an RMSECV of 400 mg kg(-1) S and an RMSEP of 420 mg kg(-1) S, showing a correlation coefficient of 0.990.
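The interval search described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the count of 1000 spectral variables and the function names are hypothetical and do not come from the paper. The sketch enumerates the candidate interval combinations; in siPLS, each candidate would then be used to fit a PLS model and the candidates would be ranked by a criterion such as RMSECV.

```python
from itertools import combinations


def split_into_intervals(n_variables, n_intervals):
    """Split variable indices 0..n_variables-1 into contiguous, near-equal intervals."""
    base, extra = divmod(n_variables, n_intervals)
    intervals, start = [], 0
    for i in range(n_intervals):
        size = base + (1 if i < extra else 0)  # spread any remainder over the first intervals
        intervals.append(range(start, start + size))
        start += size
    return intervals


def sipls_candidates(n_variables, n_intervals, n_combined):
    """Yield (interval ids, column indices) for every combination of n_combined intervals."""
    intervals = split_into_intervals(n_variables, n_intervals)
    for combo in combinations(range(n_intervals), n_combined):
        columns = [j for k in combo for j in intervals[k]]
        yield combo, columns


# Hypothetical spectrum size; 20 intervals combined 3 at a time, as in the abstract.
candidates = list(sipls_candidates(n_variables=1000, n_intervals=20, n_combined=3))
print(len(candidates))  # 1140 combinations, i.e. C(20, 3)
```

The number of candidate models grows combinatorially with the number of intervals combined, which is presumably why such searches are usually limited to combinations of two to four intervals.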
Li, Wen-bing; Yao, Lin-tao; Liu, Mu-hua; Huang, Lin; Yao, Ming-yin; Chen, Tian-bing; He, Xiu-wen; Yang, Ping; Hu, Hui-qin; Nie, Jiang-hui
2015-05-01
Cu in navel orange was detected rapidly by laser-induced breakdown spectroscopy (LIBS) combined with partial least squares (PLS) for quantitative analysis, and the effect of different spectral data pretreatment methods on the detection accuracy of the model was explored. Spectral data for the 52 Gannan navel orange samples were pretreated by different data smoothing, mean centering and standard normal variate transformation. The 319~338 nm wavelength section containing the characteristic spectral lines of Cu was then selected to build PLS models, and the main evaluation indexes of the models, namely the regression coefficient (r), root mean square error of cross validation (RMSECV) and root mean square error of prediction (RMSEP), were compared and analyzed. After 13-point smoothing and mean centering, these three indicators of the PLS model reached 0.9928, 3.43 and 3.4, respectively, and the average relative error of the prediction model was only 5.55%; this model gave the best calibration and prediction quality. The results show that selecting an appropriate data pre-processing method can effectively improve the prediction accuracy of PLS quantitative models for fruits and vegetables detected by LIBS, providing a new method for fast and accurate detection of fruits and vegetables by LIBS.
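The pretreatment steps named in this abstract (smoothing, mean centering, standard normal variate) can be sketched in a few lines. The abstract does not say which smoother was used, so the centered moving average below is an assumption, and all function names are illustrative.

```python
from statistics import mean, stdev


def moving_average(spectrum, window=13):
    """Centered moving-average smoothing; the window is truncated at the edges.

    Assumption: the source only says "13 points smoothing", not which filter.
    """
    half = window // 2
    smoothed = []
    for i in range(len(spectrum)):
        lo, hi = max(0, i - half), min(len(spectrum), i + half + 1)
        smoothed.append(sum(spectrum[lo:hi]) / (hi - lo))
    return smoothed


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and std."""
    m, s = mean(spectrum), stdev(spectrum)
    return [(v - m) / s for v in spectrum]


def mean_center(spectra):
    """Subtract the column (per-wavelength) mean from every spectrum in the set."""
    col_means = [sum(col) / len(spectra) for col in zip(*spectra)]
    return [[v - m for v, m in zip(row, col_means)] for row in spectra]


# Tiny made-up spectra, for illustration only.
spectra = [[1.0, 2.0, 4.0, 3.0], [2.0, 3.0, 6.0, 5.0]]
centered = mean_center([moving_average(s, window=3) for s in spectra])
print(centered)
```

Note that SNV works on each spectrum separately, while mean centering works on each wavelength across the whole sample set, so mean centering has to be recomputed from the calibration set and reused for new samples.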
Balabin, Roman M; Smirnov, Sergey V
2011-04-29
During the past several years, near-infrared (near-IR/NIR) spectroscopy has increasingly been adopted as an analytical tool in various fields from petroleum to biomedical sectors. The NIR spectrum (above 4000 cm(-1)) of a sample is typically measured by modern instruments at a few hundred of wavelengths. Recently, considerable effort has been directed towards developing procedures to identify variables (wavelengths) that contribute useful information. Variable selection (VS) or feature selection, also called frequency selection or wavelength selection, is a critical step in data analysis for vibrational spectroscopy (infrared, Raman, or NIRS). In this paper, we compare the performance of 16 different feature selection methods for the prediction of properties of biodiesel fuel, including density, viscosity, methanol content, and water concentration. The feature selection algorithms tested include stepwise multiple linear regression (MLR-step), interval partial least squares regression (iPLS), backward iPLS (BiPLS), forward iPLS (FiPLS), moving window partial least squares regression (MWPLS), (modified) changeable size moving window partial least squares (CSMWPLS/MCSMWPLSR), searching combination moving window partial least squares (SCMWPLS), successive projections algorithm (SPA), uninformative variable elimination (UVE, including UVE-SPA), simulated annealing (SA), back-propagation artificial neural networks (BP-ANN), Kohonen artificial neural network (K-ANN), and genetic algorithms (GAs, including GA-iPLS). Two linear techniques for calibration model building, namely multiple linear regression (MLR) and partial least squares regression/projection to latent structures (PLS/PLSR), are used for the evaluation of biofuel properties. A comparison with a non-linear calibration model, artificial neural networks (ANN-MLP), is also provided. Discussion of gasoline, ethanol-gasoline (bioethanol), and diesel fuel data is presented. 
The results obtained with other spectroscopic techniques, such as Raman, ultraviolet-visible (UV-vis), or nuclear magnetic resonance (NMR) spectroscopies, can also be greatly improved by an appropriate choice of feature selection method.
Dealing with gene expression missing data.
Brás, L P; Menezes, J C
2006-05-01
A comparative evaluation of different methods is presented for estimating missing values in microarray data: weighted K-nearest neighbours imputation (KNNimpute), regression-based methods such as local least squares imputation (LLSimpute) and partial least squares imputation (PLSimpute), and Bayesian principal component analysis (BPCA). The influence on prediction accuracy of several factors is elucidated, such as the methods' parameters, the type of data relationships used in the estimation process (i.e. row-wise, column-wise or both), the missing rate and pattern, and the type of experiment [time series (TS), non-time series (NTS) or mixed (MIX) experiments]. Improvements are proposed based on the iterative use of data (iterative LLS and PLS imputation: ILLSimpute and IPLSimpute), the need to perform initial imputations (modified PLS and Helland PLS imputation: MPLSimpute and HPLSimpute) and the type of relationships employed (KNNarray, LLSarray, HPLSarray and alternating PLS: APLSimpute). Overall, it is shown that data set properties (type of experiment, missing rate and pattern) affect the data similarity structure, therefore influencing the methods' performance. LLSimpute and ILLSimpute are preferable in the presence of data with a stronger similarity structure (TS and MIX experiments), whereas PLS-based methods (MPLSimpute, IPLSimpute and APLSimpute) are preferable when estimating NTS missing data.
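The idea behind weighted K-nearest-neighbours imputation can be sketched as below. This is a simplified illustration, not the KNNimpute implementation evaluated in the paper: the distance (root mean squared difference over shared observed columns), the inverse-distance weights and the function name are assumptions made for the example.

```python
import math


def knn_impute(data, k=2):
    """Fill None entries row-wise with an inverse-distance weighted mean of k nearest rows."""
    filled = [row[:] for row in data]
    for i, row in enumerate(data):
        for j, value in enumerate(row):
            if value is not None:
                continue
            neighbours = []
            for m, other in enumerate(data):
                # A neighbour must be a different row with column j observed.
                if m == i or other[j] is None:
                    continue
                shared = [(a, b) for a, b in zip(row, other)
                          if a is not None and b is not None]
                if not shared:
                    continue
                dist = math.sqrt(sum((a - b) ** 2 for a, b in shared) / len(shared))
                neighbours.append((dist, other[j]))
            nearest = sorted(neighbours, key=lambda pair: pair[0])[:k]
            if not nearest:
                continue  # no usable neighbour: leave the gap unfilled
            weights = [1.0 / (d + 1e-12) for d, _ in nearest]
            filled[i][j] = sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)
    return filled


# Made-up expression matrix (rows = genes, columns = arrays) with one missing value.
expression = [[1.0, 2.0, None], [1.0, 2.0, 3.0], [5.0, 6.0, 9.0]]
print(knn_impute(expression, k=1))
```

Here rows play the role of genes, so this corresponds to the row-wise use of data relationships; a column-wise variant would apply the same function to the transposed matrix.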
Luoma, Pekka; Natschläger, Thomas; Malli, Birgit; Pawliczek, Marcin; Brandstetter, Markus
2018-05-12
A model recalibration method based on additive Partial Least Squares (PLS) regression is generalized for multi-adjustment scenarios of independent variance sources (referred to as additive PLS - aPLS). aPLS allows for effortless model readjustment under changing measurement conditions and the combination of independent variance sources with the initial model by means of additive modelling. We demonstrate these distinguishing features on two NIR spectroscopic case-studies. In case study 1 aPLS was used as a readjustment method for an emerging offset. The achieved RMS error of prediction (1.91 a.u.) was of similar level as before the offset occurred (2.11 a.u.). In case-study 2 a calibration combining different variance sources was conducted. The achieved performance was of sufficient level with an absolute error being better than 0.8% of the mean concentration, therefore being able to compensate negative effects of two independent variance sources. The presented results show the applicability of the aPLS approach. The main advantages of the method are that the original model stays unadjusted and that the modelling is conducted on concrete changes in the spectra thus supporting efficient (in most cases straightforward) modelling. Additionally, the method is put into context of existing machine learning algorithms.
NASA Astrophysics Data System (ADS)
Darwish, Hany W.; Hassan, Said A.; Salem, Maissa Y.; El-Zeany, Badr A.
2014-03-01
Different chemometric models were applied for the quantitative analysis of Amlodipine (AML), Valsartan (VAL) and Hydrochlorothiazide (HCT) in ternary mixture, namely, Partial Least Squares (PLS) as traditional chemometric model and Artificial Neural Networks (ANN) as advanced model. PLS and ANN were applied with and without variable selection procedure (Genetic Algorithm GA) and data compression procedure (Principal Component Analysis PCA). The chemometric methods applied are PLS-1, GA-PLS, ANN, GA-ANN and PCA-ANN. The methods were used for the quantitative analysis of the drugs in raw materials and pharmaceutical dosage form via handling the UV spectral data. A 3-factor 5-level experimental design was established resulting in 25 mixtures containing different ratios of the drugs. Fifteen mixtures were used as a calibration set and the other ten mixtures were used as validation set to validate the prediction ability of the suggested methods. The validity of the proposed methods was assessed using the standard addition technique.
Wang, Yonghua; Li, Yan; Wang, Bin
2007-01-01
Nicotine and a variety of other drugs and toxins are metabolized by cytochrome P450 (CYP) 2A6. The aim of the present study was to build a quantitative structure-activity relationship (QSAR) model to predict the activities of nicotine analogues on CYP2A6. Kernel partial least squares (K-PLS) regression was employed with the electro-topological descriptors to build the computational models. Both the internal and external predictabilities of the models were evaluated with test sets to ensure their validity and reliability. As a comparison to K-PLS, a standard PLS algorithm was also applied on the same training and test sets. Our results show that the K-PLS produced reasonable results that outperformed the PLS model on the datasets. The obtained K-PLS model will be helpful for the design of novel nicotine-like selective CYP2A6 inhibitors.
NASA Astrophysics Data System (ADS)
Glavanović, Siniša; Glavanović, Marija; Tomišić, Vladislav
2016-03-01
The UV spectrophotometric methods for simultaneous quantitative determination of paracetamol and tramadol in paracetamol-tramadol tablets were developed. The spectrophotometric data obtained were processed by means of partial least squares (PLS) and genetic algorithm coupled with PLS (GA-PLS) methods in order to determine the content of active substances in the tablets. The results gained by chemometric processing of the spectroscopic data were statistically compared with those obtained by means of validated ultra-high performance liquid chromatographic (UHPLC) method. The accuracy and precision of data obtained by the developed chemometric models were verified by analysing the synthetic mixture of drugs, and by calculating recovery as well as relative standard error (RSE). A statistically good agreement was found between the amounts of paracetamol determined using PLS and GA-PLS algorithms, and that obtained by UHPLC analysis, whereas for tramadol GA-PLS results were proven to be more reliable compared to those of PLS. The simplest and the most accurate and precise models were constructed by using the PLS method for paracetamol (mean recovery 99.5%, RSE 0.89%) and the GA-PLS method for tramadol (mean recovery 99.4%, RSE 1.69%).
Filgueiras, Paulo R; Terra, Luciana A; Castro, Eustáquio V R; Oliveira, Lize M S L; Dias, Júlio C M; Poppi, Ronei J
2015-09-01
This paper aims to estimate the temperature equivalent to 10% (T10%), 50% (T50%) and 90% (T90%) of distilled volume in crude oils using (1)H NMR and support vector regression (SVR). Confidence intervals for the predicted values were calculated using a boosting-type ensemble method in a procedure called ensemble support vector regression (eSVR). The estimated confidence intervals obtained by eSVR were compared with previously accepted calculations from partial least squares (PLS) models and a boosting-type ensemble applied in the PLS method (ePLS). By using the proposed boosting strategy, it was possible to identify outliers in the T10% property dataset. The eSVR procedure improved the accuracy of the distillation temperature predictions in relation to standard PLS, ePLS and SVR. For T10%, a root mean square error of prediction (RMSEP) of 11.6°C was obtained in comparison with 15.6°C for PLS, 15.1°C for ePLS and 28.4°C for SVR. The RMSEPs for T50% were 24.2°C, 23.4°C, 22.8°C and 14.4°C for PLS, ePLS, SVR and eSVR, respectively. For T90%, the values of RMSEP were 39.0°C, 39.9°C and 39.9°C for PLS, ePLS, SVR and eSVR, respectively. The confidence intervals calculated by the proposed boosting methodology presented acceptable values for the three properties analyzed; however, they were lower than those calculated by the standard methodology for PLS.
Eliseyev, Andrey; Aksenova, Tetiana
2016-01-01
In the current paper the decoding algorithms for motor-related BCI systems for continuous upper limb trajectory prediction are considered. Two methods for the smooth prediction, namely Sobolev and Polynomial Penalized Multi-Way Partial Least Squares (PLS) regressions, are proposed. The methods are compared to the Multi-Way Partial Least Squares and Kalman Filter approaches. The comparison demonstrated that the proposed methods combined the prediction accuracy of the algorithms of the PLS family and trajectory smoothness of the Kalman Filter. In addition, the prediction delay is significantly lower for the proposed algorithms than for the Kalman Filter approach. The proposed methods could be applied in a wide range of applications beyond neuroscience. PMID:27196417
Kuligowski, Julia; Carrión, David; Quintás, Guillermo; Garrigues, Salvador; de la Guardia, Miguel
2011-01-01
The selection of an appropriate calibration set is a critical step in multivariate method development. In this work, the effect of using different calibration sets, based on a previous classification of unknown samples, on the partial least squares (PLS) regression model performance has been discussed. As an example, attenuated total reflection (ATR) mid-infrared spectra of deep-fried vegetable oil samples from three botanical origins (olive, sunflower, and corn oil), with increasing polymerized triacylglyceride (PTG) content induced by a deep-frying process were employed. The use of a one-class-classifier partial least squares-discriminant analysis (PLS-DA) and a rooted binary directed acyclic graph tree provided accurate oil classification. Oil samples fried without foodstuff could be classified correctly, independent of their PTG content. However, class separation of oil samples fried with foodstuff, was less evident. The combined use of double-cross model validation with permutation testing was used to validate the obtained PLS-DA classification models, confirming the results. To discuss the usefulness of the selection of an appropriate PLS calibration set, the PTG content was determined by calculating a PLS model based on the previously selected classes. In comparison to a PLS model calculated using a pooled calibration set containing samples from all classes, the root mean square error of prediction could be improved significantly using PLS models based on the selected calibration sets using PLS-DA, ranging between 1.06 and 2.91% (w/w).
de Almeida, Valber Elias; de Araújo Gomes, Adriano; de Sousa Fernandes, David Douglas; Goicoechea, Héctor Casimiro; Galvão, Roberto Kawakami Harrop; Araújo, Mario Cesar Ugulino
2018-05-01
This paper proposes a new variable selection method for nonlinear multivariate calibration, combining the Successive Projections Algorithm for interval selection (iSPA) with the Kernel Partial Least Squares (Kernel-PLS) modelling technique. The proposed iSPA-Kernel-PLS algorithm is employed in a case study involving a Vis-NIR spectrometric dataset with complex nonlinear features. The analytical problem consists of determining Brix and sucrose content in samples from a sugar production system, on the basis of transflectance spectra. As compared to full-spectrum Kernel-PLS, the iSPA-Kernel-PLS models involve a smaller number of variables and display statistically significant superiority in terms of accuracy and/or bias in the predictions.
Linear and nonlinear methods in modeling the aqueous solubility of organic compounds.
Catana, Cornel; Gao, Hua; Orrenius, Christian; Stouten, Pieter F W
2005-01-01
Solubility data for 930 diverse compounds have been analyzed using linear Partial Least Square (PLS) and nonlinear PLS methods, Continuum Regression (CR), and Neural Networks (NN). 1D and 2D descriptors from the MOE package in combination with E-state or ISIS keys have been used. The best model was obtained using linear PLS for a combination of 22 MOE descriptors and 65 ISIS keys. It has a correlation coefficient (r2) of 0.935 and a root-mean-square error (RMSE) of 0.468 log molar solubility (log S(w)). The model validated on a test set of 177 compounds not included in the training set has r2 0.911 and RMSE 0.475 log S(w). The descriptors were ranked according to their importance, and the 22 MOE descriptors were found at the top of the list. The CR model produced results as good as PLS, and because of the way in which cross-validation was done it is expected to be a valuable prediction tool alongside the PLS model. The statistics obtained using nonlinear methods did not surpass those obtained with linear ones. The good statistics obtained for linear PLS and CR recommend these models for prediction when it is difficult or impossible to make experimental measurements, for virtual screening, combinatorial library design, and efficient lead optimization.
Rahman, Md. Jahanur; Shamim, Abu Ahmed; Klemm, Rolf D. W.; Labrique, Alain B.; Rashid, Mahbubur; Christian, Parul; West, Keith P.
2017-01-01
Birth weight, length and circumferences of the head, chest and arm are key measures of newborn size and health in developing countries. We assessed maternal socio-demographic factors associated with multiple measures of newborn size in a large rural population in Bangladesh using partial least squares (PLS) regression method. PLS regression, combining features from principal component analysis and multiple linear regression, is a multivariate technique with an ability to handle multicollinearity while simultaneously handling multiple dependent variables. We analyzed maternal and infant data from singletons (n = 14,506) born during a double-masked, cluster-randomized, placebo-controlled maternal vitamin A or β-carotene supplementation trial in rural northwest Bangladesh. PLS regression results identified numerous maternal factors (parity, age, early pregnancy MUAC, living standard index, years of education, number of antenatal care visits, preterm delivery and infant sex) significantly (p<0.001) associated with newborn size. Among them, preterm delivery had the largest negative influence on newborn size (Standardized β = -0.29 − -0.19; p<0.001). Scatter plots of the scores of first two PLS components also revealed an interaction between newborn sex and preterm delivery on birth size. PLS regression was found to be more parsimonious than both ordinary least squares regression and principal component regression. It also provided more stable estimates than the ordinary least squares regression and provided the effect measure of the covariates with greater accuracy as it accounts for the correlation among the covariates and outcomes. Therefore, PLS regression is recommended when either there are multiple outcome measurements in the same study, or the covariates are correlated, or both situations exist in a dataset. PMID:29261760
Kabir, Alamgir; Rahman, Md Jahanur; Shamim, Abu Ahmed; Klemm, Rolf D W; Labrique, Alain B; Rashid, Mahbubur; Christian, Parul; West, Keith P
2017-01-01
Birth weight, length and circumferences of the head, chest and arm are key measures of newborn size and health in developing countries. We assessed maternal socio-demographic factors associated with multiple measures of newborn size in a large rural population in Bangladesh using partial least squares (PLS) regression method. PLS regression, combining features from principal component analysis and multiple linear regression, is a multivariate technique with an ability to handle multicollinearity while simultaneously handling multiple dependent variables. We analyzed maternal and infant data from singletons (n = 14,506) born during a double-masked, cluster-randomized, placebo-controlled maternal vitamin A or β-carotene supplementation trial in rural northwest Bangladesh. PLS regression results identified numerous maternal factors (parity, age, early pregnancy MUAC, living standard index, years of education, number of antenatal care visits, preterm delivery and infant sex) significantly (p<0.001) associated with newborn size. Among them, preterm delivery had the largest negative influence on newborn size (Standardized β = -0.29 - -0.19; p<0.001). Scatter plots of the scores of first two PLS components also revealed an interaction between newborn sex and preterm delivery on birth size. PLS regression was found to be more parsimonious than both ordinary least squares regression and principal component regression. It also provided more stable estimates than the ordinary least squares regression and provided the effect measure of the covariates with greater accuracy as it accounts for the correlation among the covariates and outcomes. Therefore, PLS regression is recommended when either there are multiple outcome measurements in the same study, or the covariates are correlated, or both situations exist in a dataset.
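How PLS regression copes with correlated covariates can be illustrated with a small single-response (PLS1) sketch using the NIPALS-style deflation scheme. This is a teaching example under assumptions, not the analysis used in the study (which handled multiple outcomes): the function names and the toy data are hypothetical, and the two predictors are made perfectly collinear on purpose, a case where ordinary least squares has no unique solution.

```python
import math


def pls1_fit(X, y, n_components):
    """Fit PLS1 by successive deflation; returns (x_mean, y_mean, components)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centred predictors
    f = [v - y_mean for v in y]                                 # centred response
    components = []
    for _ in range(n_components):
        # Weight vector: direction in X-space with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # nothing left to explain
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate: remove the part of X and y explained by this component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict one sample by replaying the same projection and deflation steps."""
    x_mean, y_mean, components = model
    e = [v - m for v, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in components:
        t = sum(a * b for a, b in zip(e, w))
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, load)]
    return y_hat


# Toy data: second predictor duplicates the first, and y = 2 * x1.
X = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]]
y = [2.0, 4.0, 6.0, 8.0]
model = pls1_fit(X, y, n_components=2)
print(len(model[2]), pls1_predict(model, [5.0, 5.0]))
```

In this toy case a single latent component captures all of the shared variation, so the loop stops early even though two components were requested.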
NASA Astrophysics Data System (ADS)
Yan, Wen-juan; Yang, Ming; He, Guo-quan; Qin, Lin; Li, Gang
2014-11-01
In order to identify diabetic patients by using the tongue near-infrared (NIR) spectrum, a spectral classification model of the NIR reflectivity of the tongue tip is proposed, based on the partial least square (PLS) method. Thirty-nine samples of tongue-tip NIR spectra each were collected from healthy people and from diabetic patients. After pretreatment of the reflectivity, the spectral data are set as the independent variable matrix and the classification information as the dependent variable matrix. The samples were divided into two groups, i.e. 53 samples as the calibration set and 25 as the prediction set, and PLS was then used to build the classification model. The model constructed from the 53 samples has a correlation of 0.9614 and a root mean square error of cross-validation (RMSECV) of 0.1387. The predictions for the 25 samples have a correlation of 0.9146 and an RMSECV of 0.2122. The experimental result shows that the PLS method can achieve good classification of healthy people and diabetic patients.
NASA Astrophysics Data System (ADS)
Liu, Fei; He, Yong
2008-02-01
Visible and near infrared (Vis/NIR) transmission spectroscopy and chemometric methods were utilized to predict the pH values of cola beverages. Five varieties of cola were prepared and 225 samples (45 samples for each variety) were selected for the calibration set, while 75 samples (15 samples for each variety) were used for the validation set. Savitzky-Golay smoothing and standard normal variate (SNV) followed by first derivative were used as the pre-processing methods. Partial least squares (PLS) analysis was employed to extract the principal components (PCs), which were used as the inputs of the least squares-support vector machine (LS-SVM) model according to their accumulative reliabilities. Then LS-SVM with a radial basis function (RBF) kernel and a two-step grid search technique was applied to build the regression model, which was compared with PLS regression. The correlation coefficient (r), root mean square error of prediction (RMSEP) and bias were 0.961, 0.040 and 0.012 for PLS, and 0.975, 0.031 and 4.697x10(-3) for LS-SVM, respectively. Both methods obtained a satisfying precision. The results indicated that Vis/NIR spectroscopy combined with chemometric methods could be applied as an alternative way for the prediction of pH of cola beverages.
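The three figures of merit quoted in this abstract (r, RMSEP and bias) are simple to compute from reference and predicted values. The sketch below uses common textbook definitions; the sign convention for bias (predicted minus reference) is an assumption, since the abstract does not state it, and the sample values are made up.

```python
import math


def rmsep(y_ref, y_pred):
    """Root mean square error of prediction over a validation set."""
    return math.sqrt(sum((p - r) ** 2 for r, p in zip(y_ref, y_pred)) / len(y_ref))


def bias(y_ref, y_pred):
    """Mean signed error; assumed convention: predicted minus reference."""
    return sum(p - r for r, p in zip(y_ref, y_pred)) / len(y_ref)


def pearson_r(y_ref, y_pred):
    """Pearson correlation coefficient between reference and predicted values."""
    n = len(y_ref)
    mr, mp = sum(y_ref) / n, sum(y_pred) / n
    cov = sum((r - mr) * (p - mp) for r, p in zip(y_ref, y_pred))
    var_r = sum((r - mr) ** 2 for r in y_ref)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov / math.sqrt(var_r * var_p)


# Made-up reference and predicted values, for illustration only.
reference = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(rmsep(reference, predicted), bias(reference, predicted), pearson_r(reference, predicted))
```

RMSEP is expressed in the units of the predicted property, whereas r is unitless, so the two are best read together: a high r with a large bias still indicates a systematically offset model.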
Noninvasive and fast measurement of blood glucose in vivo by near infrared (NIR) spectroscopy
NASA Astrophysics Data System (ADS)
Jintao, Xue; Liming, Ye; Yufei, Liu; Chunyan, Li; Han, Chen
2017-05-01
This research aimed to develop a method for noninvasive and fast blood glucose assay in vivo. Near-infrared (NIR) spectroscopy, a more promising technique compared to other methods, was investigated in rats with diabetes and normal rats. Calibration models were generated by two different multivariate strategies: partial least squares (PLS) as a linear regression method and artificial neural networks (ANN) as a non-linear regression method. The PLS model was optimized individually by considering spectral range, spectral pretreatment methods and number of model factors, while the ANN model was studied individually by selecting spectral pretreatment methods, parameters of network topology, number of hidden neurons, and number of epochs. The results of the validation showed that the two models were robust, accurate and repeatable. Compared to the ANN model, the performance of the PLS model was much better, with a lower root mean square error of validation (RMSEP) of 0.419 and a higher correlation coefficient (R) of 96.22%.
Kernel PLS-SVC for Linear and Nonlinear Discrimination
NASA Technical Reports Server (NTRS)
Rosipal, Roman; Trejo, Leonard J.; Matthews, Bryan
2003-01-01
A new methodology for discrimination is proposed. This is based on kernel orthonormalized partial least squares (PLS) dimensionality reduction of the original data space followed by support vector machines for classification. The close connection of orthonormalized PLS with Fisher's approach to linear discrimination, or equivalently with canonical correlation analysis, is described. This gives preference to the use of orthonormalized PLS over principal component analysis. Good behavior of the proposed method is demonstrated on 13 different benchmark data sets and on the real world problem of the classification of finger movement periods versus non-movement periods based on electroencephalogram.
An improved partial least-squares regression method for Raman spectroscopy
NASA Astrophysics Data System (ADS)
Momenpour Tehran Monfared, Ali; Anis, Hanan
2017-10-01
It is known that the performance of partial least-squares (PLS) regression analysis can be improved using the backward variable selection method (BVSPLS). In this paper, we further improve the BVSPLS based on a novel selection mechanism. The proposed method is based on sorting the weighted regression coefficients, and then the importance of each variable of the sorted list is evaluated using the root mean square error of prediction (RMSEP) criterion in each iteration step. Our Improved BVSPLS (IBVSPLS) method has been applied to leukemia and heparin data sets and led to an improvement in the limit of detection of Raman biosensing ranging from 10% to 43% compared to PLS. Our IBVSPLS was also compared to the jack-knifing (simpler) and Genetic Algorithm (more complex) methods. Our method was consistently better than the jack-knifing method and showed either a similar or a better performance compared to the genetic algorithm.
NASA Astrophysics Data System (ADS)
Goudarzi, Nasser
2016-04-01
In this work, two new and powerful chemometrics methods are applied for the modeling and prediction of the 19F chemical shift values of some fluorinated organic compounds. The radial basis function-partial least square (RBF-PLS) and random forest (RF) methods are employed to construct the models to predict the 19F chemical shifts. In this study, we did not use any variable selection method, since the RF method can serve as both a variable selection and a modeling technique. The effects of the important parameters affecting the prediction power of RF, such as the number of trees (nt) and the number of randomly selected variables to split each node (m), were investigated. The root-mean-square errors of prediction (RMSEP) for the training set and the prediction set for the RBF-PLS and RF models were 44.70, 23.86, 29.77, and 23.69, respectively. Also, the correlation coefficients of the prediction set for the RBF-PLS and RF models were 0.8684 and 0.9313, respectively. The results obtained reveal that the RF model can be used as a powerful chemometrics tool for quantitative structure-property relationship (QSPR) studies.
Partial Least Squares for Discrimination in fMRI Data
Andersen, Anders H.; Rayens, William S.; Liu, Yushu; Smith, Charles D.
2011-01-01
Multivariate methods for discrimination were used in the comparison of brain activation patterns between groups of cognitively normal women who are at either high or low Alzheimer's disease risk based on family history and apolipoprotein-E4 status. Linear discriminant analysis (LDA) was preceded by dimension reduction using either principal component analysis (PCA), partial least squares (PLS), or a new oriented partial least squares (OrPLS) method. The aim was to identify a spatial pattern of functionally connected brain regions that was differentially expressed by the risk groups and yielded optimal classification accuracy. Multivariate dimension reduction is required prior to LDA when the data contains more feature variables than there are observations on individual subjects. Whereas PCA has been commonly used to identify covariance patterns in neuroimaging data, this approach only identifies gross variability and is not capable of distinguishing among-groups from within-groups variability. PLS and OrPLS provide a more focused dimension reduction by incorporating information on class structure and therefore lead to more parsimonious models for discrimination. Performance was evaluated in terms of the cross-validated misclassification rates. The results support the potential of using fMRI as an imaging biomarker or diagnostic tool to discriminate individuals with disease or high risk. PMID:22227352
NASA Astrophysics Data System (ADS)
Samadi-Maybodi, Abdolraouf; Darzi, S. K. Hassani Nejad
2008-10-01
Resolution of binary mixtures of vitamin B12, methylcobalamin and B12 coenzyme with minimum sample pre-treatment and without analyte separation has been successfully achieved by methods of partial least squares algorithm with one dependent variable (PLS1), orthogonal signal correction/partial least squares (OSC/PLS), principal component regression (PCR) and hybrid linear analysis (HLA). Data of analysis were obtained from UV-vis spectra. The UV-vis spectra of the vitamin B12, methylcobalamin and B12 coenzyme were recorded in the same spectral conditions. The method of central composite design was used in the ranges of 10-80 mg L(-1) for vitamin B12 and methylcobalamin and 20-130 mg L(-1) for B12 coenzyme. The models refinement procedure and validation were performed by cross-validation. The minimum root mean square error of prediction (RMSEP) was 2.26 mg L(-1) for vitamin B12 with PLS1, 1.33 mg L(-1) for methylcobalamin with OSC/PLS and 3.24 mg L(-1) for B12 coenzyme with HLA techniques. Figures of merit such as selectivity, sensitivity, analytical sensitivity and LOD were determined for three compounds. The procedure was successfully applied to simultaneous determination of three compounds in synthetic mixtures and in a pharmaceutical formulation.
Niazi, Ali; Zolgharnein, Javad; Afiuni-Zadeh, Somaie
2007-11-01
Ternary mixtures of thiamin, riboflavin and pyridoxal have been simultaneously determined in synthetic and real samples by spectrophotometry combined with least-squares support vector machines (LS-SVM). The calibration graphs were linear in the ranges of 1.0 - 20.0, 1.0 - 10.0 and 1.0 - 20.0 microg ml(-1) with detection limits of 0.6, 0.5 and 0.7 microg ml(-1) for thiamin, riboflavin and pyridoxal, respectively. The experimental calibration matrix was designed with 21 mixtures of these chemicals. The concentrations were varied within the linear calibration ranges of the vitamins. The simultaneous determination of these vitamin mixtures by using spectrophotometric methods is a difficult problem, due to spectral interferences. Partial least squares (PLS) modeling and LS-SVM were used for the multivariate calibration of the spectrophotometric data. An excellent model was built using LS-SVM, with low prediction errors and superior performance in relation to PLS. The root mean square errors of prediction (RMSEP) for thiamin, riboflavin and pyridoxal with PLS and LS-SVM were 0.6926, 0.3755, 0.4322 and 0.0421, 0.0318, 0.0457, respectively. The proposed method was satisfactorily applied to the rapid simultaneous determination of thiamin, riboflavin and pyridoxal in commercial pharmaceutical preparations and human plasma samples.
Error propagation of partial least squares for parameters optimization in NIR modeling.
Du, Chenzhao; Dai, Shengyun; Qiao, Yanjiang; Wu, Zhisheng
2018-03-05
A novel methodology is proposed to determine the error propagation of partial least squares (PLS) for parameter optimization in near-infrared (NIR) modeling. The parameters include spectral pretreatment, latent variables and variable selection. An open-source dataset (corn) and a complicated dataset (Gardenia) were used to establish PLS models under different modeling parameters. The error propagation of the modeling parameters for water quantity in corn and geniposide quantity in Gardenia was presented in terms of both type I and type II error. For example, when variable importance in the projection (VIP), interval partial least squares (iPLS) and backward interval partial least squares (BiPLS) variable selection algorithms were used for geniposide in Gardenia, compared with synergy interval partial least squares (SiPLS), the error weight varied from 5% to 65%, 55% and 15%. The results demonstrated how and to what extent the different modeling parameters affect the error propagation of PLS for parameter optimization in NIR modeling. The larger the error weight, the worse the model. Finally, the trials yielded a powerful process for developing robust PLS models for corn and Gardenia under the optimal modeling parameters. Furthermore, the approach could provide significant guidance for selecting the modeling parameters of other multivariate calibration models. Copyright © 2017. Published by Elsevier B.V.
NASA Astrophysics Data System (ADS)
Hemmateenejad, Bahram; Rezaei, Zahra; Khabnadideh, Soghra; Saffari, Maryam
2007-11-01
Carbamazepine (CBZ) undergoes enzymatic biotransformation through epoxidation, forming its metabolite carbamazepine-10,11-epoxide (CBZE). A simple chemometrics-assisted spectrophotometric method is proposed for the simultaneous determination of CBZ and CBZE in plasma. A liquid extraction procedure was used to separate the analytes from plasma, and the UV absorbance spectra of the resulting solutions were subjected to partial least squares (PLS) regression. The optimum number of PLS latent variables was selected according to the PRESS values of leave-one-out cross-validation. An HPLC method was also employed for comparison. The respective mean recoveries for CBZ and CBZE in synthetic mixtures were 102.57 (±0.25)% and 103.00 (±0.09)% for PLS and 99.40 (±0.15)% and 102.20 (±0.02)% for HPLC. The concentrations of CBZ and CBZE were also determined in five patients using the PLS and HPLC methods. The results showed that the data obtained by PLS were comparable with those obtained by the HPLC method.
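The latent-variable selection described in this abstract (PRESS from leave-one-out cross-validation) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses a textbook NIPALS PLS1 written in NumPy, and the two-component "spectra", noise level and function names (`pls1_fit`, `loo_press`) are all assumptions made for the example.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit PLS1 by NIPALS; returns (x_mean, y_mean, regression coefficients)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr                 # weight vector: covariance of X with y
        w = w / np.linalg.norm(w)
        t = Xr @ w                    # scores
        tt = t @ t
        p = Xr.T @ t / tt             # X loadings
        c = yr @ t / tt               # y loading
        Xr = Xr - np.outer(t, p)      # deflate X and y
        yr = yr - c * t
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return x_mean, y_mean, coef

def loo_press(X, y, max_components):
    """PRESS (sum of squared leave-one-out errors) for 1..max_components."""
    press = []
    for a in range(1, max_components + 1):
        sq_err = 0.0
        for i in range(len(y)):
            keep = np.arange(len(y)) != i
            x_mean, y_mean, coef = pls1_fit(X[keep], y[keep], a)
            pred = y_mean + (X[i] - x_mean) @ coef
            sq_err += (y[i] - pred) ** 2
        press.append(sq_err)
    return np.array(press)

# Synthetic data (assumption): two overlapping Gaussian bands, 20 mixtures.
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 50)
pure = np.vstack([np.exp(-((grid - 0.4) / 0.1) ** 2),
                  np.exp(-((grid - 0.6) / 0.1) ** 2)])
conc = rng.uniform(1.0, 10.0, size=(20, 2))
X = conc @ pure + rng.normal(scale=0.01, size=(20, 50))
y = conc[:, 0]                        # calibrate for the first component only

press = loo_press(X, y, max_components=4)
best = int(np.argmin(press)) + 1      # number of latent variables with lowest PRESS
print(press, best)
```

Because two overlapping components vary independently in the simulated mixtures, a single latent variable cannot separate them, so PRESS drops sharply from one to two latent variables.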
NASA Astrophysics Data System (ADS)
Yeganeh, B.; Motlagh, M. Shafie Pour; Rashidi, Y.; Kamalan, H.
2012-08-01
Because of the health impacts of exposure to air pollutants in urban areas, monitoring and forecasting of air quality parameters have become an important topic in atmospheric and environmental research. Knowledge of the dynamics and complexity of air pollutant behavior has made artificial intelligence models a useful tool for more accurate prediction of pollutant concentrations. This paper focuses on an innovative method of daily air pollution prediction that combines a Support Vector Machine (SVM) as predictor with Partial Least Squares (PLS) as a data selection tool, based on measured CO concentrations. The CO concentrations from the Rey monitoring station in the south of Tehran, from Jan. 2007 to Feb. 2011, were used to test the effectiveness of this method. The hourly CO concentrations were predicted using the SVM and the hybrid PLS-SVM models. Similarly, daily CO concentrations were predicted based on the aforementioned four years of measured data. The results demonstrated that both models have good prediction ability, but the hybrid PLS-SVM is more accurate. Statistical estimators including relative mean errors, root mean squared errors and the mean absolute relative error were employed to compare the performance of the models. It was concluded that the errors decrease after size reduction and that the coefficients of determination increase from 56-81% for the SVM model to 65-85% for the hybrid PLS-SVM model. The hybrid PLS-SVM model also required less computational time than the SVM model, as expected, supporting its more accurate and faster prediction ability.
Liao, Xiang; Wang, Qing; Fu, Ji-hong; Tang, Jun
2015-09-01
This work was undertaken to establish a quantitative analysis model for rapidly determining the content of linalool and linalyl acetate in Xinjiang lavender essential oil. A total of 165 lavender essential oil samples were measured by near-infrared (NIR) absorption spectroscopy. Analysis of the NIR absorption peaks of all samples showed that the spectral interval of 7100-4500 cm⁻¹ carries abundant chemical information with relatively low interference from random noise, so the PLS models were constructed on this interval. Eight abnormal samples were eliminated. Using a clustering method, the remaining 157 samples were divided into 105 calibration-set samples and 52 validation-set samples. Gas chromatography-mass spectrometry (GC-MS) was used to determine the content of linalool and linalyl acetate in the lavender essential oil, and the data matrix was established by combining the GC-MS data for the two compounds with the original NIR data. To optimize the model, different pretreatment methods were applied to the raw NIR spectra and their filtering effects were compared. Judged by the quantitative model results for linalool and linalyl acetate, orthogonal signal correction (OSC) was the optimum pretreatment method, with root mean square errors of prediction (RMSEP) of 0.226 and 0.558, respectively. In addition, forward interval partial least squares (FiPLS) was used to exclude wavelength points that are unrelated to the target components or that show nonlinear correlation; 8 spectral intervals totalling 160 wavelength points were finally retained as the dataset. The data set optimized by OSC-FiPLS was combined with partial least squares (PLS) to establish a rapid quantitative analysis model for determining the content of linalool and linalyl acetate in Xinjiang lavender essential oil; the number of latent variables was 8 for both components. The performance of the model was evaluated by the root mean square error of cross-validation (RMSECV) and the root mean square error of prediction (RMSEP). In the model, the RMSECV values for linalool and linalyl acetate were 0.170 and 0.416, respectively, and the RMSEP values were 0.188 and 0.364. The results indicated that, with the raw data pretreated by OSC and FiPLS, the NIR-PLS quantitative analysis model has good robustness and high measurement precision, and can quickly determine the content of linalool and linalyl acetate in lavender essential oil. The model also has favorable prediction ability. The study provides a new and effective method for the rapid quantitative analysis of the major components of Xinjiang lavender essential oil.
NASA Astrophysics Data System (ADS)
Li, Lin
2008-12-01
Partial least squares (PLS) regressions were applied to lunar highland and mare soil data characterized by the Lunar Soil Characterization Consortium (LSCC) for spectral estimation of the abundance of the lunar soil chemical constituents FeO and Al2O3. The LSCC data set was split into a number of subsets, including the total highland, Apollo 16, Apollo 14, and total mare soils, and PLS was then applied to each to investigate the effect of nonlinearity on the performance of the PLS method. The weight-loading vectors resulting from PLS were analyzed to identify the mineral species responsible for spectral estimation of the soil chemicals. The results from PLS modeling indicate that PLS performance depends on the correlation of the constituents of interest with their major mineral carriers, and that the Apollo 16 soils are responsible for the large errors in the FeO and Al2O3 estimates when they are modeled along with other types of soils. These large errors are attributed primarily to the degraded correlation of FeO with pyroxene in the relatively mature Apollo 16 soils as a result of space weathering, and secondarily to the interference of olivine. PLS consistently yields very accurate fits to the two soil chemicals when applied to mare soils. Although Al2O3 has no spectrally diagnostic characteristics, it can be predicted for all data subsets by PLS modeling at high accuracy because of its correlation with FeO. This correlation is reflected in the symmetry of the PLS weight-loading vectors for FeO and Al2O3, which prove to be very useful for qualitative interpretation of the PLS results. However, this qualitative interpretation cannot be achieved using principal component regression loading vectors.
NASA Astrophysics Data System (ADS)
Bai, Xue-Mei; Liu, Tie; Liu, De-Long; Wei, Yong-Ju
2018-02-01
A chemometrics-assisted excitation-emission matrix (EEM) fluorescence method was proposed for the simultaneous determination of α-asarone and β-asarone in Acorus tatarinowii. By combining EEM data with chemometric methods, the simultaneous determination of α-asarone and β-asarone in this complex Traditional Chinese Medicine system was achieved successfully, even in the presence of unexpected interferents. The physical or chemical separation step was avoided through the use of "mathematical separation". Six second-order calibration methods were used: parallel factor analysis (PARAFAC), alternating trilinear decomposition (ATLD), alternating penalty trilinear decomposition (APTLD), self-weighted alternating trilinear decomposition (SWATLD), and unfolded partial least-squares (U-PLS) and multidimensional partial least-squares (N-PLS), both with residual bilinearization (RBL). In addition, an HPLC method was developed to further validate the presented strategy. For the validation samples, the six second-order calibration methods all gave fairly accurate results. For the Acorus tatarinowii samples, however, the results indicated a slightly better predictive ability of the N-PLS/RBL procedure over the other methods.
He, Yan-Lin; Xu, Yuan; Geng, Zhi-Qiang; Zhu, Qun-Xiong
2016-03-01
In this paper, a hybrid robust model based on an improved functional link neural network integrated with partial least squares (IFLNN-PLS) is proposed. First, an improved functional link neural network with small norm of expanded weights and high input-output correlation (SNEWHIOC-FLNN) is proposed to enhance the generalization performance of the FLNN. Unlike in the traditional FLNN, the expanded variables of the original inputs are not used directly as inputs in the proposed SNEWHIOC-FLNN model. Instead, the original inputs are attached to expanded weights with small norm. As a result, the correlation coefficient between some of the expanded variables and the outputs is enhanced. The larger the correlation coefficient, the more relevant the expanded variables tend to be. The expanded variables with larger correlation coefficients are then selected as inputs to improve on the performance of the traditional FLNN. To test the proposed SNEWHIOC-FLNN model, three UCI (University of California, Irvine) regression datasets, Housing, Concrete Compressive Strength (CCS) and Yacht Hydro Dynamics (YHD), were selected. Then a hybrid model based on the improved FLNN integrated with partial least squares (IFLNN-PLS) was built. In the IFLNN-PLS model, the connection weights are calculated using the partial least squares method rather than the error back-propagation algorithm. Lastly, IFLNN-PLS was developed as an intelligent measurement model for accurately predicting the key variables in the Purified Terephthalic Acid (PTA) process and the High Density Polyethylene (HDPE) process. Simulation results illustrated that IFLNN-PLS could significantly improve the prediction performance. Copyright © 2015 ISA. Published by Elsevier Ltd. All rights reserved.
Cao, Hui; Yan, Xingyu; Li, Yaojiang; Wang, Yanxia; Zhou, Yan; Yang, Sanchun
2014-01-01
Quantitative analysis of the flue gas of natural gas-fired generators is significant for energy conservation and emission reduction. The traditional partial least squares method may not deal effectively with nonlinear problems. In this paper, a nonlinear partial least squares method with extended input based on a radial basis function neural network (RBFNN) is used to predict the components of flue gas. In the proposed method, the original independent input matrix is the input of the RBFNN, and the outputs of the RBFNN hidden-layer nodes form the extension term of the original independent input matrix. Partial least squares regression is then performed on the extended input matrix and the output matrix to establish the component prediction model for flue gas. A near-infrared spectral dataset of flue gas from natural gas combustion is used to evaluate the effectiveness of the proposed method in comparison with PLS. The experimental results show that the root-mean-square errors of prediction of the proposed method for methane, carbon monoxide, and carbon dioxide are reduced by 4.74%, 21.76%, and 5.32%, respectively, compared with those of PLS. Hence, the proposed method has higher predictive capability and better robustness.
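The input-extension step described in this abstract can be sketched as below: the Gaussian hidden-layer outputs of an RBF network are appended as extra columns to the original input matrix, and a PLS regression would then be fitted on the extended matrix (that fit is not shown here). This is only an illustration under stated assumptions: the abstract does not say how centres or widths are chosen, so the centres are drawn from the samples and a single shared width is used, and the function names are hypothetical.

```python
import numpy as np

def rbf_hidden_outputs(X, centers, width):
    """Gaussian hidden-layer outputs, one column per centre."""
    sq_dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-sq_dist / (2.0 * width ** 2))

def extend_inputs(X, centers, width):
    """Original inputs followed by the RBF hidden-layer outputs."""
    return np.hstack([X, rbf_hidden_outputs(X, centers, width)])

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 5))                      # 30 samples, 5 input variables (synthetic)
# Assumption: centres are a random subset of the samples; width is fixed at 1.0.
centers = X[rng.choice(len(X), size=4, replace=False)]
X_ext = extend_inputs(X, centers, width=1.0)
print(X_ext.shape)                                # 5 original + 4 extension columns
```

The extended matrix keeps the linear terms intact, so a linear PLS model fitted on it can still fall back to the ordinary PLS solution while the extra columns supply nonlinear features.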
Gómez-Carracedo, M P; Andrade, J M; Rutledge, D N; Faber, N M
2007-03-07
Selecting the correct dimensionality is critical for obtaining partial least squares (PLS) regression models with good predictive ability. Although calibration and validation sets are best established using experimental designs, industrial laboratories cannot afford such an approach. Typically, samples are collected in a (formally) undesigned way, spread over time, and their measurements are included in routine measurement processes. This makes it hard to evaluate PLS model dimensionality. In this paper, classical criteria (leave-one-out cross-validation and adjusted Wold's criterion) are compared to recently proposed alternatives (smoothed PLS-PoLiSh and a randomization test) to seek out the optimum dimensionality of PLS models. Kerosene (jet fuel) samples were measured by attenuated total reflectance-mid-IR spectrometry and their spectra were used to predict eight important properties determined using reference methods that are time-consuming and prone to analytical errors. The alternative methods were shown to give reliable dimensionality predictions when compared to external validation. By contrast, the simpler methods seemed to be largely affected by the largest changes in the modeling capabilities of the first components.
Wu, Jing-zhu; Wang, Feng-zhu; Wang, Li-li; Zhang, Xiao-chao; Mao, Wen-hua
2015-01-01
To improve the accuracy and robustness of detecting the nitrogen content of tomato seedlings by near-infrared (NIR) spectroscopy, four characteristic-spectrum selection methods were studied: competitive adaptive reweighted sampling (CARS), Monte Carlo uninformative variables elimination (MCUVE), backward interval partial least squares (BiPLS) and synergy interval partial least squares (SiPLS). A total of 60 tomato seedlings were cultivated at 10 different nitrogen-treatment levels (urea concentration from 0 to 120 mg·L⁻¹), with 6 samples at each level, covering excess-nitrogen, moderate-nitrogen, nitrogen-deficient and no-nitrogen states. The leaves of each sample were collected and scanned by NIR spectroscopy from 12 500 to 3 600 cm⁻¹. Quantitative models based on the four methods were established. According to the experimental results, the calibration models based on the CARS and MCUVE selection methods perform better than those based on the BiPLS and SiPLS selection methods, but their prediction ability is much lower than that of the latter. Among them, the model built by BiPLS has the best prediction performance, with a correlation coefficient (r), root mean square error of prediction (RMSEP) and ratio of performance to deviation (RPD) of 0.9527, 0.1183 and 3.291, respectively. Therefore, NIR technology combined with characteristic-spectrum selection methods can improve model performance, but the selection methods are not universal. Models built on single-wavelength variable selection are more sensitive and are better suited to uniform objects, whereas models built on wavelength-interval selection have much stronger anti-interference ability and are better suited to uneven objects with poor reproducibility. Characteristic-spectrum selection therefore improves model building only when the sample state and the model indexes are also taken into account.
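The three figures of merit quoted in this abstract (r, RMSEP and RPD) can be computed as in the sketch below. It assumes the common definition of RPD as the standard deviation of the reference values divided by RMSEP, using the sample standard deviation (`ddof=1`); the abstract does not state which variant the authors used. The five reference/predicted values are made up for illustration.

```python
import numpy as np

def prediction_metrics(y_ref, y_pred):
    """Return (r, RMSEP, RPD) for a prediction set."""
    y_ref = np.asarray(y_ref, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    r = np.corrcoef(y_ref, y_pred)[0, 1]              # Pearson correlation coefficient
    rmsep = np.sqrt(np.mean((y_ref - y_pred) ** 2))   # root mean square error of prediction
    rpd = np.std(y_ref, ddof=1) / rmsep               # assumption: sample SD of references
    return r, rmsep, rpd

# Hypothetical reference and predicted values, not taken from the paper.
r, rmsep, rpd = prediction_metrics([1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.8, 5.0])
print(f"r = {r:.4f}, RMSEP = {rmsep:.4f}, RPD = {rpd:.2f}")
```

RPD relates the prediction error to the natural spread of the reference values, so a larger RPD means the error is small compared with the variation the model has to explain.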
Multimodal Classification of Mild Cognitive Impairment Based on Partial Least Squares.
Wang, Pingyue; Chen, Kewei; Yao, Li; Hu, Bin; Wu, Xia; Zhang, Jiacai; Ye, Qing; Guo, Xiaojuan
2016-08-10
In recent years, increasing attention has been given to the identification of the conversion of mild cognitive impairment (MCI) to Alzheimer's disease (AD). Brain neuroimaging techniques have been widely used to support the classification or prediction of MCI. The present study combined magnetic resonance imaging (MRI), 18F-fluorodeoxyglucose PET (FDG-PET), and 18F-florbetapir PET (florbetapir-PET) to discriminate MCI converters (MCI-c, individuals with MCI who convert to AD) from MCI non-converters (MCI-nc, individuals with MCI who have not converted to AD in the follow-up period) based on the partial least squares (PLS) method. Two types of PLS models (informed PLS and agnostic PLS) were built based on 64 MCI-c and 65 MCI-nc from the Alzheimer's Disease Neuroimaging Initiative (ADNI) database. The results showed that the three-modality informed PLS model achieved better classification accuracy of 81.40%, sensitivity of 79.69%, and specificity of 83.08% compared with the single-modality model, and the three-modality agnostic PLS model also achieved better classification compared with the two-modality model. Moreover, combining the three modalities with clinical test score (ADAS-cog), the agnostic PLS model (independent data: florbetapir-PET; dependent data: FDG-PET and MRI) achieved optimal accuracy of 86.05%, sensitivity of 81.25%, and specificity of 90.77%. In addition, the comparison of PLS, support vector machine (SVM), and random forest (RF) showed greater diagnostic power of PLS. These results suggested that our multimodal PLS model has the potential to discriminate MCI-c from the MCI-nc and may therefore be helpful in the early diagnosis of AD.
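As a worked check on the reported rates, the sketch below recomputes accuracy, sensitivity and specificity from confusion-matrix counts. The counts are not given in the abstract: they are inferred here as the only whole numbers of the 64 MCI-c and 65 MCI-nc subjects that reproduce the quoted percentages, so they should be read as an assumption.

```python
def classification_rates(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)               # converters correctly identified
    specificity = tn / (tn + fp)               # non-converters correctly identified
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return accuracy, sensitivity, specificity

# Three-modality informed PLS model; inferred counts: 51 of 64 MCI-c, 54 of 65 MCI-nc.
informed = classification_rates(tp=51, fn=13, tn=54, fp=11)
print(" ".join(f"{100 * v:.2f}" for v in informed))   # 81.40 79.69 83.08

# Agnostic PLS model with ADAS-cog; inferred counts: 52 of 64 MCI-c, 59 of 65 MCI-nc.
agnostic = classification_rates(tp=52, fn=12, tn=59, fp=6)
print(" ".join(f"{100 * v:.2f}" for v in agnostic))   # 86.05 81.25 90.77
```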
Wang, Qi; He, Haijun; Li, Bing; Lin, Hancheng; Zhang, Yinming; Zhang, Ji
2017-01-01
Estimating the postmortem interval (PMI) is of great importance in forensic investigations. Although many methods are used to estimate the PMI, few investigations have focused on postmortem redistribution. In this study, ultraviolet–visible (UV–Vis) measurement combined with visual inspection indicated a regular diffusion of hemoglobin into plasma after death, showing the redistribution of postmortem components in blood. Thereafter, attenuated total reflection–Fourier transform infrared (ATR–FTIR) spectroscopy was used to confirm the variations caused by this phenomenon. First, full-spectrum partial least-squares (PLS) and genetic algorithm combined with PLS (GA-PLS) models were constructed to predict the PMI. The GA-PLS model performed better than the full-spectrum PLS model, with a root mean square error (RMSE) of cross-validation of 3.46 h (R2 = 0.95) and an RMSE of prediction of 3.46 h (R2 = 0.94). The investigation of the similarity of spectra between blood plasma and formed elements also supported the role of component redistribution in the spectral changes in postmortem plasma. These results demonstrated that ATR-FTIR spectroscopy coupled with advanced mathematical methods could serve as a convenient and reliable tool to study the redistribution of postmortem components and estimate the PMI. PMID:28753641
NASA Astrophysics Data System (ADS)
Chen, Hui; Tan, Chao; Lin, Zan; Wu, Tong
2018-01-01
Milk is among the most popular nutrient sources worldwide and is of great interest because of its beneficial medicinal properties. The feasibility of classifying milk powder samples by brand and of determining their protein concentration is investigated using NIR spectroscopy along with chemometrics. Two datasets were prepared for the experiments: one contains 179 samples of four brands for classification, and the other contains 30 samples for quantitative analysis. Principal component analysis (PCA) was used for exploratory analysis. Based on an effective model-independent variable selection method, minimal-redundancy maximal-relevance (MRMR), only 18 variables were selected to construct a partial least-squares discriminant analysis (PLS-DA) model. On the test set, the PLS-DA model based on the selected variable set was compared with the full-spectrum PLS-DA model; both achieved 100% accuracy. In quantitative analysis, the partial least-squares regression (PLSR) model constructed from the selected subset of 260 variables significantly outperforms the full-spectrum model. The combination of NIR spectroscopy, MRMR and PLS-DA or PLSR appears to be a powerful tool for classifying different brands of milk and determining the protein content.
Liu, Xue-Mei; Zhang, Hai-Liang
2014-10-01
Ultraviolet/visible (UV/Vis) spectroscopy was studied for the rapid determination of chemical oxygen demand (COD), an indicator of the concentration of organic matter in aquaculture water. To reduce the influence of noise in the spectra, the 135 extracted absorbance spectra were preprocessed by Savitzky-Golay smoothing (SG), empirical mode decomposition (EMD) and wavelet transform (WT) methods. Latent variables (LVs) were then extracted from the preprocessed spectra by partial least squares (PLS). PLS was used to build models with the full spectra, and back-propagation neural network (BPNN) and least squares support vector machine (LS-SVM) were applied to build models with the selected LVs. The overall results showed that the BPNN and LS-SVM models performed better than the PLS models, and that the LS-SVM models with LVs based on WT-preprocessed spectra gave the best results, with a determination coefficient (r2) and RMSE of 0.83 and 14.78 mg·L⁻¹ for the calibration set, and 0.82 and 14.82 mg·L⁻¹ for the prediction set, respectively. The results indicated that UV/Vis spectroscopy with LVs obtained by PLS, combined with LS-SVM calibration, is feasible for the rapid and accurate determination of COD in aquaculture water. Moreover, this study laid the foundation for further implementation of online analysis of aquaculture water and rapid determination of other water quality parameters.
Lee, Soo Yee; Mediani, Ahmed; Maulidiani, Maulidiani; Khatib, Alfi; Ismail, Intan Safinar; Zawawi, Norhasnida; Abas, Faridah
2018-01-01
Neptunia oleracea is a plant that is consumed as a vegetable and has been used as a folk remedy for several diseases. Herein, two regression models (partial least squares, PLS; and random forest, RF) in a metabolomics approach were compared and applied to the evaluation of the relationship between phenolics and bioactivities of N. oleracea. In addition, the effects of different extraction conditions on the phenolic constituents were assessed by pattern recognition analysis. Comparison of PLS and RF showed that RF exhibited poorer generalization and hence poorer predictive performance. Both the regression coefficient of PLS and the variable importance of RF revealed that quercetin and kaempferol derivatives, caffeic acid and vitexin-2-O-rhamnoside were significant towards the tested bioactivities. Furthermore, principal component analysis (PCA) and partial least squares-discriminant analysis (PLS-DA) results showed that sonication and absolute ethanol are the preferable extraction method and ethanol ratio, respectively, to produce N. oleracea extracts with high phenolic levels and therefore high DPPH scavenging and α-glucosidase inhibitory activities. Both PLS and RF are useful regression models in metabolomics studies. This work provides insight into the performance of different multivariate data analysis tools and the effects of different extraction conditions on the extraction of desired phenolics from plants. © 2017 Society of Chemical Industry.
Quantification of adulterations in extra virgin flaxseed oil using MIR and PLS.
de Souza, Letícia Maria; de Santana, Felipe Bachion; Gontijo, Lucas Caixeta; Mazivila, Sarmento Júnior; Borges Neto, Waldomiro
2015-09-01
This paper proposes a new method for the quantitative analysis of soybean oil (SO) and sunflower oil (SFO) as adulterants in extra virgin flaxseed oil (EFO) by applying mid-infrared spectroscopy (MIR) combined with the chemometric technique of partial least squares (PLS). The PLS models were built in accordance with standard method ASTM E1655-05 and showed good correlation between the reference values and those calculated using the PLS models, with low error values and with R = 0.998 for SFO and R = 0.999 for SO in EFO. The models were validated analytically in accordance with Brazilian and international guidelines through the estimation of figures of merit, demonstrating an effective and feasible method for controlling the quality of extra virgin flaxseed oil. Copyright © 2015 Elsevier Ltd. All rights reserved.
Wu, Yanwei; Guo, Pan; Chen, Siying; Chen, He; Zhang, Yinchao
2017-04-01
Auto-adaptive background subtraction (AABS) is proposed as a denoising method for data processing of the coherent Doppler lidar (CDL). The method is proposed specifically for a low-signal-to-noise-ratio regime, in which the drifting power spectral density of CDL data occurs. Unlike the periodogram maximum (PM) and adaptive iteratively reweighted penalized least squares (airPLS), the proposed method presents reliable peaks and is thus advantageous in identifying peak locations. According to the analysis results of simulated and actually measured data, the proposed method outperforms the airPLS method and the PM algorithm in the furthest detectable range. The proposed method improves the detection range approximately up to 16.7% and 40% when compared to the airPLS method and the PM method, respectively. It also has smaller mean wind velocity and standard error values than the airPLS and PM methods. The AABS approach improves the quality of Doppler shift estimates and can be applied to obtain the whole wind profiling by the CDL.
Lakshmi, KS; Lakshmi, S
2010-01-01
Two chemometric methods were developed for the simultaneous determination of telmisartan and hydrochlorothiazide. The chemometric methods applied were principal component regression (PCR) and partial least square (PLS-1). These approaches were successfully applied to quantify the two drugs in the mixture using the information included in the UV absorption spectra of appropriate solutions in the range of 200-350 nm with the intervals Δλ = 1 nm. The calibration of PCR and PLS-1 models was evaluated by internal validation (prediction of compounds in its own designed training set of calibration) and by external validation over laboratory prepared mixtures and pharmaceutical preparations. The PCR and PLS-1 methods require neither any separation step, nor any prior graphical treatment of the overlapping spectra of the two drugs in a mixture. The results of PCR and PLS-1 methods were compared with each other and a good agreement was found. PMID:21331198
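One of the two methods named in this abstract, principal component regression (PCR), can be sketched with NumPy alone: the centred spectra are projected onto their leading principal components and the concentration is regressed on the scores. The example is a sketch under stated assumptions. Only the 200-350 nm range with Δλ = 1 nm comes from the abstract; the Gaussian band shapes, concentration range, noise level and function names are invented, and the bands do not represent the real spectra of telmisartan or hydrochlorothiazide.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Principal component regression via SVD of the centred spectra."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    _, _, Vt = np.linalg.svd(X - x_mean, full_matrices=False)
    V = Vt[:n_components].T                       # loadings of the retained PCs
    scores = (X - x_mean) @ V
    gamma = np.linalg.lstsq(scores, y - y_mean, rcond=None)[0]
    return x_mean, y_mean, V @ gamma              # coefficients in wavelength space

def pcr_predict(model, X):
    x_mean, y_mean, coef = model
    return y_mean + (X - x_mean) @ coef

rng = np.random.default_rng(0)
wl = np.arange(200, 351)                          # 200-350 nm, 1 nm steps
# Hypothetical overlapping absorption bands for the two components.
bands = np.vstack([0.05 * np.exp(-0.5 * ((wl - 230) / 15) ** 2),
                   0.04 * np.exp(-0.5 * ((wl - 270) / 20) ** 2)])

def simulate(n):
    conc = rng.uniform(2.0, 20.0, size=(n, 2))    # synthetic concentrations
    spectra = conc @ bands + rng.normal(scale=1e-3, size=(n, wl.size))
    return spectra, conc

X_train, C_train = simulate(16)                   # calibration set
X_test, C_test = simulate(5)                      # external validation set

model = pcr_fit(X_train, C_train[:, 0], n_components=2)
max_abs_error = np.max(np.abs(pcr_predict(model, X_test) - C_test[:, 0]))
print(max_abs_error)
```

Two components are retained here because the simulated mixtures contain exactly two absorbing species; with real spectra the number would be chosen by validation.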
Robust PLS approach for KPI-related prediction and diagnosis against outliers and missing data
NASA Astrophysics Data System (ADS)
Yin, Shen; Wang, Guang; Yang, Xu
2014-07-01
In practical industrial applications, the key performance indicator (KPI)-related prediction and diagnosis are quite important for the product quality and economic benefits. To meet these requirements, many advanced prediction and monitoring approaches have been developed which can be classified into model-based or data-driven techniques. Among these approaches, partial least squares (PLS) is one of the most popular data-driven methods due to its simplicity and easy implementation in large-scale industrial process. As PLS is totally based on the measured process data, the characteristics of the process data are critical for the success of PLS. Outliers and missing values are two common characteristics of the measured data which can severely affect the effectiveness of PLS. To ensure the applicability of PLS in practical industrial applications, this paper introduces a robust version of PLS to deal with outliers and missing values, simultaneously. The effectiveness of the proposed method is finally demonstrated by the application results of the KPI-related prediction and diagnosis on an industrial benchmark of Tennessee Eastman process.
Application of near-infrared spectroscopy for the rapid quality assessment of Radix Paeoniae Rubra
NASA Astrophysics Data System (ADS)
Zhan, Hao; Fang, Jing; Tang, Liying; Yang, Hongjun; Li, Hua; Wang, Zhuju; Yang, Bin; Wu, Hongwei; Fu, Meihong
2017-08-01
Near-infrared (NIR) spectroscopy with multivariate analysis was used to quantify gallic acid, catechin, albiflorin, and paeoniflorin in Radix Paeoniae Rubra, and the feasibility of classifying samples originating from different areas was investigated. A new high-performance liquid chromatography method was developed and validated to analyze gallic acid, catechin, albiflorin, and paeoniflorin in Radix Paeoniae Rubra as the reference. Partial least squares (PLS), principal component regression (PCR), and stepwise multivariate linear regression (SMLR) were performed to calibrate the regression model. Different data pretreatments such as derivatives (1st and 2nd), multiplicative scatter correction, standard normal variate, Savitzky-Golay filter, and Norris derivative filter were applied to remove the systematic errors. The performance of the model was evaluated according to the root mean square error of calibration (RMSEC), root mean square error of prediction (RMSEP), root mean square error of cross-validation (RMSECV), and correlation coefficient (r). The results show that, compared to PCR and SMLR, PLS had a lower RMSEC, RMSECV, and RMSEP and a higher r for all four analytes. PLS coupled with proper pretreatments showed good performance in both fitting and prediction. Furthermore, the areas of origin of the Radix Paeoniae Rubra samples were partly distinguished by principal component analysis. This study shows that NIR with PLS is a reliable, inexpensive, and rapid tool for the quality assessment of Radix Paeoniae Rubra.
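Two of the pretreatments named in this abstract, standard normal variate (SNV) and multiplicative scatter correction (MSC), can be illustrated with a minimal NumPy sketch. The function names `snv` and `msc` are illustrative assumptions, and the MSC reference defaults to the mean spectrum, which is a common convention rather than something the abstract specifies.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) on its own."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum.

    Assumption: the reference defaults to the mean of the supplied spectra.
    """
    spectra = np.asarray(spectra, dtype=float)
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, row in enumerate(spectra):
        # Fit row ≈ intercept + slope * ref, then undo offset and gain.
        slope, intercept = np.polyfit(ref, row, 1)
        corrected[i] = (row - intercept) / slope
    return corrected
```

Both functions expect one spectrum per row. SNV needs no reference, whereas MSC applied to new samples would normally reuse the reference spectrum taken from the calibration set.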
NASA Astrophysics Data System (ADS)
Tewari, Jagdish; Strong, Richard; Boulas, Pierre
2017-02-01
This article summarizes the development and validation of a Fourier transform near infrared spectroscopy (FT-NIR) method for the rapid at-line prediction of active pharmaceutical ingredient (API) in a powder blend to optimize small molecule formulations. The method was used to determine the blend uniformity end-point for a pharmaceutical solid dosage formulation containing a range of API concentrations. A set of calibration spectra from samples with concentrations ranging from 1% to 15% of API (w/w) was collected at-line from 4000 to 12,500 cm⁻¹. The ability of the FT-NIR method to predict API concentration in the blend samples was validated against a reference high performance liquid chromatography (HPLC) method. The prediction efficiency of four multivariate data modeling methods, namely partial least-squares 1 (PLS1), partial least-squares 2 (PLS2), principal component regression (PCR) and artificial neural network (ANN), was compared using relevant multivariate figures of merit. The prediction ability of the regression models was cross validated against results generated with the reference HPLC method. PLS1 and ANN showed superior prediction abilities when compared to PLS2 and PCR. Based upon these results and because of its decreased complexity compared to ANN, PLS1 was selected as the best chemometric method to predict blend uniformity at-line. The FT-NIR measurement and the associated chemometric analysis were implemented in the production environment for rapid at-line determination of the end-point of the small molecule blending operation.
[Figures and table not reproduced. Figure 1: Correlation coefficient vs rank plot. Figure 2: FT-NIR spectra of different blending steps and the final blend. Figure 3: Prediction ability of PCR. Figure 4: Blend uniformity prediction ability of PLS2. Figure 5: Prediction efficiency of blend uniformity using ANN. Figure 6: Comparison of prediction efficiency of chemometric models. Table 1: Order of addition for blending steps.]
Jiang, Hui; Zhang, Hang; Chen, Quansheng; Mei, Congli; Liu, Guohai
2015-01-01
The use of wavelength variable selection before partial least squares discriminant analysis (PLS-DA) for qualitative identification of solid state fermentation degree by the FT-NIR spectroscopy technique was investigated in this study. Two wavelength variable selection methods, competitive adaptive reweighted sampling (CARS) and stability competitive adaptive reweighted sampling (SCARS), were employed to select the important wavelengths. PLS-DA was applied to calibrate identification models using the wavelength variables selected by CARS and SCARS. Experimental results showed that CARS and SCARS selected 58 and 47 wavelength variables, respectively, from the 1557 original wavelength variables. Compared with the results of full-spectrum PLS-DA, both wavelength variable selection methods enhanced the performance of the identification models. Meanwhile, compared with the CARS-PLS-DA model, the SCARS-PLS-DA model achieved better results, with an identification rate of 91.43% in the validation process. The overall results demonstrate that a PLS-DA model constructed using wavelength variables selected by a proper wavelength variable selection method can identify the solid state fermentation degree more accurately. Copyright © 2015 Elsevier B.V. All rights reserved.
Kehimkar, Benjamin; Parsons, Brendon A; Hoggard, Jamin C; Billingsley, Matthew C; Bruno, Thomas J; Synovec, Robert E
2015-01-01
Recent efforts in predicting rocket propulsion (RP-1) fuel performance through modeling put greater emphasis on obtaining detailed and accurate fuel properties, as well as elucidating the relationships between fuel compositions and their properties. Herein, we study multidimensional chromatographic data obtained by comprehensive two-dimensional gas chromatography combined with time-of-flight mass spectrometry (GC × GC-TOFMS) to analyze RP-1 fuels. For GC × GC separations, RTX-Wax (polar stationary phase) and RTX-1 (non-polar stationary phase) columns were implemented for the primary and secondary dimensions, respectively, to separate the chemical compound classes (alkanes, cycloalkanes, aromatics, etc.), providing a significant level of chemical compositional information. The GC × GC-TOFMS data were analyzed using partial least squares regression (PLS) chemometric analysis to model and predict advanced distillation curve (ADC) data for ten RP-1 fuels that were previously analyzed using the ADC method. The PLS modeling provides insight into the chemical species that impact the ADC data. The PLS modeling correlates compositional information found in the GC × GC-TOFMS chromatograms of each RP-1 fuel, and their respective ADC, and allows prediction of the ADC for each RP-1 fuel with good precision and accuracy. The root-mean-square error of calibration (RMSEC) ranged from 0.1 to 0.5 °C, and was typically below ∼0.2 °C, for the PLS calibration of the ADC modeling with GC × GC-TOFMS data, indicating a good fit of the model to the calibration data. Likewise, the predictive power of the overall method via PLS modeling was assessed using leave-one-out cross-validation (LOOCV) yielding root-mean-square error of cross-validation (RMSECV) ranging from 1.4 to 2.6 °C, and was typically below ∼2.0 °C, at each % distilled measurement point during the ADC analysis.
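The leave-one-out RMSECV reported here can be illustrated with a small NumPy sketch. It uses a textbook NIPALS PLS1 fit for a single response, which is an assumption: the abstract does not say which PLS algorithm or software was used, and the names `pls1_fit` and `loocv_rmsecv` are hypothetical.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit PLS1 (single response) by NIPALS; return (coefficients, intercept)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean          # centered predictors and response
    W, P, q = [], [], []
    for _ in range(n_components):
        w = E.T @ f
        w /= np.linalg.norm(w)             # weight vector
        t = E @ w                          # scores
        tt = t @ t
        p = E.T @ t / tt                   # X loadings
        c = f @ t / tt                     # y loading
        E = E - np.outer(t, p)             # deflate X
        f = f - c * t                      # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    intercept = y_mean - x_mean @ coef
    return coef, intercept

def loocv_rmsecv(X, y, n_components):
    """Leave-one-out cross-validation: refit without sample i, then predict it."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    residuals = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        coef, intercept = pls1_fit(X[keep], y[keep], n_components)
        residuals[i] = y[i] - (X[i] @ coef + intercept)
    return float(np.sqrt(np.mean(residuals ** 2)))
```

With only ten samples, as in the study, each LOOCV model is trained on nine samples. This is one reason the RMSECV values are expected to be noticeably larger than the RMSEC values.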
NASA Astrophysics Data System (ADS)
de Oliveira, Isadora R. N.; Roque, Jussara V.; Maia, Mariza P.; Stringheta, Paulo C.; Teófilo, Reinaldo F.
2018-04-01
A new method was developed to determine the antioxidant properties of red cabbage extract (Brassica oleracea) by mid (MID) and near (NIR) infrared spectroscopies and partial least squares (PLS) regression. A 70% (v/v) ethanolic extract of red cabbage was concentrated to 9° Brix and further diluted (12 to 100%) in water. The dilutions were used as external standards for the building of PLS models. For the first time, this strategy was applied for building multivariate regression models. Reference analyses and spectral data were obtained from diluted extracts. The determinate properties were total and monomeric anthocyanins, total polyphenols and antioxidant capacity by ABTS (2,2-azino-bis(3-ethyl-benzothiazoline-6-sulfonate)) and DPPH (2,2-diphenyl-1-picrylhydrazyl) methods. Ordered predictors selection (OPS) and genetic algorithm (GA) were used for feature selection before PLS regression (PLS-1). In addition, a PLS-2 regression was applied to all properties simultaneously. PLS-1 models provided more predictive models than did PLS-2 regression. PLS-OPS and PLS-GA models presented excellent prediction results with a correlation coefficient higher than 0.98. However, the best models were obtained using PLS and variable selection with the OPS algorithm and the models based on NIR spectra were considered more predictive for all properties. Then, these models provided a simple, rapid and accurate method for determination of red cabbage extract antioxidant properties and its suitability for use in the food industry.
Zhou, Yan; Cao, Hui
2013-01-01
We propose an augmented classical least squares (ACLS) calibration method for quantitative Raman spectral analysis against component information loss. The Raman spectral signals with low analyte concentration correlations were selected and used as the substitutes for unknown quantitative component information during the CLS calibration procedure. The number of selected signals was determined by using the leave-one-out root-mean-square error of cross-validation (RMSECV) curve. An ACLS model was built based on the augmented concentration matrix and the reference spectral signal matrix. The proposed method was compared with partial least squares (PLS) and principal component regression (PCR) using one example: a data set recorded from an experiment of analyte concentration determination using Raman spectroscopy. A 2-fold cross-validation with Venetian blinds strategy was used to evaluate the predictive power of the proposed method. One-way analysis of variance (ANOVA) was used to assess the difference in predictive power between the proposed method and existing methods. Results indicated that the proposed method is effective at increasing the robust predictive power of the traditional CLS model against component information loss and that its predictive power is comparable to that of PLS or PCR.
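The Venetian blinds splitting mentioned in this abstract can be sketched as follows. The assumption is the usual convention that sample i goes to fold i modulo the number of folds, and the function name `venetian_blinds_folds` is illustrative.

```python
import numpy as np

def venetian_blinds_folds(n_samples, n_folds):
    """Yield (train_idx, test_idx) pairs; sample i belongs to fold i % n_folds."""
    indices = np.arange(n_samples)
    fold_of = indices % n_folds
    for k in range(n_folds):
        yield indices[fold_of != k], indices[fold_of == k]
```

For a 2-fold split the even-indexed samples form one test set and the odd-indexed samples the other, so the sample ordering (for example by concentration) affects how representative each fold is.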
de Groot, P J; Swierenga, H; Postma, G J; Melssen, W J; Buydens, L M C
2003-06-01
The combination of Raman and infrared spectroscopy on the one hand and wavelength selection on the other hand is used to improve the partial least-squares (PLS) prediction of seven selected yarn properties. These properties are important for on-line quality control during production. From 71 yarn samples, the Raman and infrared spectra are measured and reference methods are used to determine the selected properties. Making separate PLS models for all yarn properties using the Raman and infrared spectra, prior to wavelength selection, reveals that Raman spectroscopy outperforms infrared spectroscopy. If wavelength selection is applied, the PLS prediction error decreases and the correlation coefficient increases for all properties. However, a substantial wavelength selection effect is present for the infrared spectra compared to the Raman spectra. For the infrared spectra, wavelength selection results in PLS prediction errors comparable with the prediction performance of the Raman spectra prior to wavelength selection. Concatenating the Raman and infrared spectra does not enhance the PLS prediction performance, not even after wavelength selection. It is concluded that an infrared spectrometer, combined with a wavelength selection procedure, can be used if no (suitable) Raman instrument is available.
Waskitho, Dri; Lukitaningsih, Endang; Sudjadi; Rohman, Abdul
2016-01-01
Analysis of lard extracted from lipstick formulation containing castor oil has been performed using an FTIR spectroscopic method combined with multivariate calibration. Three different extraction methods were compared, namely saponification followed by liquid/liquid extraction with hexane/dichloromethane/ethanol/water, saponification followed by liquid/liquid extraction with dichloromethane/ethanol/water, and the Bligh & Dyer method using chloroform/methanol/water as extracting solvent. Qualitative and quantitative analyses of lard were performed using principal component analysis (PCA) and partial least square (PLS) analysis, respectively. The results showed that, in all samples prepared by the three extraction methods, PCA was capable of identifying lard in the wavenumber region of 1200-800 cm⁻¹, with the best result obtained by the Bligh & Dyer method. Furthermore, PLS analysis in the same wavenumber region showed that Bligh & Dyer was the most suitable extraction method, with the highest determination coefficient (R²) and the lowest root mean square error of calibration (RMSEC) and root mean square error of prediction (RMSEP) values.
NASA Technical Reports Server (NTRS)
Anderson, R. B.; Morris, R. V.; Clegg, S. M.; Bell, J. F., III; Humphries, S. D.; Wiens, R. C.
2011-01-01
The ChemCam instrument selected for the Curiosity rover is capable of remote laser-induced breakdown spectroscopy (LIBS).[1] We used a remote LIBS instrument similar to ChemCam to analyze 197 geologic slab samples and 32 pressed-powder geostandards. The slab samples are well-characterized and have been used to validate the calibration of previous instruments on Mars missions, including CRISM [2], OMEGA [3], the MER Pancam [4], Mini-TES [5], and Moessbauer [6] instruments and the Phoenix SSI [7]. The resulting dataset was used to compare multivariate methods for quantitative LIBS and to determine the effect of grain size on calculations. Three multivariate methods - partial least squares (PLS), multilayer perceptron artificial neural networks (MLP ANNs) and cascade correlation (CC) ANNs - were used to generate models and extract the quantitative composition of unknown samples. PLS can be used to predict one element (PLS1) or multiple elements (PLS2) at a time, as can the neural network methods. Although MLP and CC ANNs were successful in some cases, PLS generally produced the most accurate and precise results.
Peng, Jiangtao; Peng, Silong; Xie, Qiong; Wei, Jiping
2011-04-01
In order to eliminate the lower order polynomial interferences, a new quantitative calibration algorithm "Baseline Correction Combined Partial Least Squares (BCC-PLS)", which combines baseline correction and conventional PLS, is proposed. By embedding baseline correction constraints into PLS weights selection, the proposed calibration algorithm overcomes the uncertainty in baseline correction and can meet the requirement of on-line attenuated total reflectance Fourier transform infrared (ATR-FTIR) quantitative analysis. The effectiveness of the algorithm is evaluated by the analysis of glucose and marzipan ATR-FTIR spectra. BCC-PLS algorithm shows improved prediction performance over PLS. The root mean square error of cross-validation (RMSECV) on marzipan spectra for the prediction of the moisture is found to be 0.53%, w/w (range 7-19%). The sugar content is predicted with a RMSECV of 2.04%, w/w (range 33-68%). Copyright © 2011 Elsevier B.V. All rights reserved.
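To make the idea of low-order polynomial interference concrete, the sketch below removes the polynomial component of each spectrum by projecting onto the orthogonal complement of a polynomial basis. This is a plain preprocessing step written under stated assumptions and is not the BCC-PLS algorithm, which according to the abstract embeds the baseline constraints into the PLS weight selection. The function name `remove_polynomial_baseline` and the default order of 2 are illustrative choices.

```python
import numpy as np

def remove_polynomial_baseline(spectra, order=2):
    """Project each spectrum (row) onto the complement of polynomials up to `order`."""
    spectra = np.atleast_2d(np.asarray(spectra, dtype=float))
    n_points = spectra.shape[1]
    axis = np.linspace(-1.0, 1.0, n_points)               # scaled channel axis
    basis = np.vander(axis, order + 1, increasing=True)   # columns 1, x, ..., x^order
    q, _ = np.linalg.qr(basis)                            # orthonormal polynomial basis
    return spectra - (spectra @ q) @ q.T
```

Because the projection is linear, any additive drift of degree up to `order` is removed exactly. It also removes the part of the analyte signal that happens to lie in the polynomial subspace.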
Zhang, Yan; Zou, Hong-Yan; Shi, Pei; Yang, Qin; Tang, Li-Juan; Jiang, Jian-Hui; Wu, Hai-Long; Yu, Ru-Qin
2016-01-01
Determination of benzo[a]pyrene (BaP) in cigarette smoke can be very important for the tobacco quality control and the assessment of its harm to human health. In this study, mid-infrared spectroscopy (MIR) coupled to chemometric algorithm (DPSO-WPT-PLS), which was based on the wavelet packet transform (WPT), discrete particle swarm optimization algorithm (DPSO) and partial least squares regression (PLS), was used to quantify harmful ingredient benzo[a]pyrene in the cigarette mainstream smoke with promising result. Furthermore, the proposed method provided better performance compared to several other chemometric models, i.e., PLS, radial basis function-based PLS (RBF-PLS), PLS with stepwise regression variable selection (Stepwise-PLS) as well as WPT-PLS with informative wavelet coefficients selected by correlation coefficient test (rtest-WPT-PLS). It can be expected that the proposed strategy could become a new effective, rapid quantitative analysis technique in analyzing the harmful ingredient BaP in cigarette mainstream smoke. Copyright © 2015 Elsevier B.V. All rights reserved.
Lee, Byeong-Ju; Kim, Hye-Youn; Lim, Sa Rang; Huang, Linfang; Choi, Hyung-Kyoon
2017-01-01
Panax ginseng C.A. Meyer is a herb used for medicinal purposes, and its discrimination according to cultivation age has been an important and practical issue. This study employed Fourier-transform infrared (FT-IR) spectroscopy with multivariate statistical analysis to obtain a prediction model for discriminating cultivation ages (5 and 6 years) and three different parts (rhizome, tap root, and lateral root) of P. ginseng. The optimal partial-least-squares regression (PLSR) models for discriminating ginseng samples were determined by selecting normalization methods, number of partial-least-squares (PLS) components, and variable influence on projection (VIP) cutoff values. The best prediction model for discriminating 5- and 6-year-old ginseng was developed using tap root, vector normalization applied after the second differentiation, one PLS component, and a VIP cutoff of 1.0 (based on the lowest root-mean-square error of prediction value). In addition, for discriminating among the three parts of P. ginseng, optimized PLSR models were established using data sets obtained from vector normalization, two PLS components, and VIP cutoff values of 1.5 (for 5-year-old ginseng) and 1.3 (for 6-year-old ginseng). To our knowledge, this is the first study to provide a novel strategy for rapidly discriminating the cultivation ages and parts of P. ginseng using FT-IR by selected normalization methods, number of PLS components, and VIP cutoff values.
PMID:29049369
Barimani, Shirin; Kleinebudde, Peter
2017-10-01
A multivariate analysis method, Science-Based Calibration (SBC), was used for the first time for endpoint determination of a tablet coating process using Raman data. Two types of tablet cores, placebo and caffeine cores, received a coating suspension comprising a polyvinyl alcohol-polyethylene glycol graft-copolymer and titanium dioxide to a maximum coating thickness of 80 µm. Raman spectroscopy was used as an in-line PAT tool. The spectra were acquired every minute and correlated to the amount of applied aqueous coating suspension. SBC was compared to another well-known multivariate analysis method, Partial Least Squares regression (PLS), and a simpler approach, Univariate Data Analysis (UVDA). All developed calibration models had coefficient of determination values (R²) higher than 0.99. The coating endpoints could be predicted with root mean square errors of prediction (RMSEP) less than 3.1% of the applied coating suspensions. Compared to PLS and UVDA, SBC proved to be an alternative multivariate calibration method with high predictive power. Copyright © 2017 Elsevier B.V. All rights reserved.
A Partial Least Squares Based Procedure for Upstream Sequence Classification in Prokaryotes.
Mehmood, Tahir; Bohlin, Jon; Snipen, Lars
2015-01-01
The upstream region of coding genes is important for several reasons, for instance locating transcription factor binding sites and start site initiation in genomic DNA. Motivated by a recently conducted study, where a multivariate approach was successfully applied to coding sequence modeling, we have introduced a partial least squares (PLS) based procedure for the classification of true upstream prokaryotic sequence from background upstream sequence. The upstream sequences of conserved coding genes over genomes were considered in the analysis, where conserved coding genes were found by using the pan-genomics concept for each considered prokaryotic species. PLS uses a position specific scoring matrix (PSSM) to study the characteristics of the upstream region. Results obtained by the PLS based method were compared with the Gini importance of random forest (RF) and with support vector machine (SVM), which is a widely used method for sequence classification. The upstream sequence classification performance was evaluated by using cross validation, and the suggested approach identifies the prokaryotic upstream region significantly better than RF (p-value < 0.01) and SVM (p-value < 0.01). Further, the proposed method also produced results that concurred with known biological characteristics of the upstream region.
NASA Astrophysics Data System (ADS)
Aimran, Ahmad Nazim; Ahmad, Sabri; Afthanorhan, Asyraf; Awang, Zainudin
2017-05-01
Structural equation modeling (SEM) is the second generation statistical analysis technique developed for analyzing the inter-relationships among multiple variables in a model. Previous studies have shown that there seemed to be at least an implicit agreement about the factors that should drive the choice between covariance-based structural equation modeling (CB-SEM) and partial least square path modeling (PLS-PM). PLS-PM appears to be the preferred method by previous scholars because of its less stringent assumption and the need to avoid the perceived difficulties in CB-SEM. Along with this issue has been the increasing debate among researchers on the use of CB-SEM and PLS-PM in studies. The present study intends to assess the performance of CB-SEM and PLS-PM as a confirmatory study in which the findings will contribute to the body of knowledge of SEM. Maximum likelihood (ML) was chosen as the estimator for CB-SEM and was expected to be more powerful than PLS-PM. Based on the balanced experimental design, the multivariate normal data with specified population parameter and sample sizes were generated using Pro-Active Monte Carlo simulation, and the data were analyzed using AMOS for CB-SEM and SmartPLS for PLS-PM. Comparative Bias Index (CBI), construct relationship, average variance extracted (AVE), composite reliability (CR), and Fornell-Larcker criterion were used to study the consequence of each estimator. The findings conclude that CB-SEM performed notably better than PLS-PM in estimation for large sample size (100 and above), particularly in terms of estimations accuracy and consistency.
Goicoechea, H C; Olivieri, A C
2001-07-01
A newly developed multivariate method involving net analyte preprocessing (NAP) was tested using central composite calibration designs of progressively decreasing size regarding the multivariate simultaneous spectrophotometric determination of three active components (phenylephrine, diphenhydramine and naphazoline) and one excipient (methylparaben) in nasal solutions. Its performance was evaluated and compared with that of partial least-squares (PLS-1). Minimisation of the calibration predicted error sum of squares (PRESS) as a function of a moving spectral window helped to select appropriate working spectral ranges for both methods. The comparison of NAP and PLS results was carried out using two tests: (1) the elliptical joint confidence region for the slope and intercept of a predicted versus actual concentrations plot for a large validation set of samples and (2) the D-optimality criterion concerning the information content of the calibration data matrix. Extensive simulations and experimental validation showed that, unlike PLS, the NAP method is able to furnish highly satisfactory results when the calibration set is reduced from a full four-component central composite to a fractional central composite, as expected from the modelling requirements of net analyte based methods.
Zhang, Chu; Liu, Fei; Kong, Wenwen; He, Yong
2015-01-01
Visible and near-infrared hyperspectral imaging covering spectral range of 380–1030 nm as a rapid and non-destructive method was applied to estimate the soluble protein content of oilseed rape leaves. Average spectrum (500–900 nm) of the region of interest (ROI) of each sample was extracted, and four samples out of 128 samples were defined as outliers by Monte Carlo-partial least squares (MCPLS). Partial least squares (PLS) model using full spectra obtained dependable performance with the correlation coefficient (rp) of 0.9441, root mean square error of prediction (RMSEP) of 0.1658 mg/g and residual prediction deviation (RPD) of 2.98. The weighted regression coefficient (Bw), successive projections algorithm (SPA) and genetic algorithm-partial least squares (GAPLS) selected 18, 15, and 16 sensitive wavelengths, respectively. SPA-PLS model obtained the best performance with rp of 0.9554, RMSEP of 0.1538 mg/g and RPD of 3.25. Distribution of protein content within the rape leaves were visualized and mapped on the basis of the SPA-PLS model. The overall results indicated that hyperspectral imaging could be used to determine and visualize the soluble protein content of rape leaves. PMID:26184198
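The residual prediction deviation (RPD) quoted alongside RMSEP can be computed as in the sketch below. It assumes the common definition RPD = SD of the reference values in the prediction set divided by RMSEP. Some authors use a bias-corrected standard error instead, and the abstract does not state which variant was applied. The function names are illustrative.

```python
import numpy as np

def rmsep(y_true, y_pred):
    """Root mean square error of prediction."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def rpd(y_true, y_pred):
    """Residual prediction deviation: sample SD of reference values over RMSEP."""
    sd = float(np.std(np.asarray(y_true, dtype=float), ddof=1))
    return sd / rmsep(y_true, y_pred)
```

Under this definition, the reported pairs (RPD 2.98 with RMSEP 0.1658 mg/g, and RPD 3.25 with RMSEP 0.1538 mg/g) would both correspond to a reference standard deviation of roughly 0.49 to 0.50 mg/g.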
USDA-ARS's Scientific Manuscript database
Two simple fingerprinting methods, flow-injection UV spectroscopy (FIUV) and 1H nuclear magnetic resonance (NMR), for discrimination of Aurantii Fructus Immaturus and Fructus Poniciri Trifoliatae Immaturus were described. Both methods were combined with partial least-squares discriminant analysis...
NASA Astrophysics Data System (ADS)
Yang, Yue; Wang, Lei; Wu, Yongjiang; Liu, Xuesong; Bi, Yuan; Xiao, Wei; Chen, Yong
2017-07-01
There is a growing need for the effective on-line process monitoring during the manufacture of traditional Chinese medicine to ensure quality consistency. In this study, the potential of near infrared (NIR) spectroscopy technique to monitor the extraction process of Flos Lonicerae Japonicae was investigated. A new algorithm of synergy interval PLS with genetic algorithm (Si-GA-PLS) was proposed for modeling. Four different PLS models, namely Full-PLS, Si-PLS, GA-PLS, and Si-GA-PLS, were established, and their performances in predicting two quality parameters (viz. total acid and soluble solid contents) were compared. In conclusion, Si-GA-PLS model got the best results due to the combination of superiority of Si-PLS and GA. For Si-GA-PLS, the determination coefficient (Rp2) and root-mean-square error for the prediction set (RMSEP) were 0.9561 and 147.6544 μg/ml for total acid, 0.9062 and 0.1078% for soluble solid contents, correspondingly. The overall results demonstrated that the NIR spectroscopy technique combined with Si-GA-PLS calibration is a reliable and non-destructive alternative method for on-line monitoring of the extraction process of TCM on the production scale.
Dönmez, Ozlem Aksu; Aşçi, Bürge; Bozdoğan, Abdürrezzak; Sungur, Sidika
2011-02-15
A simple and rapid analytical procedure was proposed for the determination of chromatographic peaks by means of partial least squares multivariate calibration (PLS) of high-performance liquid chromatography with diode array detection (HPLC-DAD). The method is exemplified with analysis of quaternary mixtures of potassium guaiacolsulfonate (PG), guaifenesin (GU), diphenhydramine HCl (DP) and carbetapentane citrate (CP) in syrup preparations. In this method, the area does not need to be directly measured and predictions are more accurate. Though the chromatographic and spectral peaks of the analytes were heavily overlapped and interferents coeluted with the compounds studied, good recoveries of analytes could be obtained with HPLC-DAD coupled with PLS calibration. This method was tested by analyzing a synthetic mixture of PG, GU, DP and CP. As a comparison method, a classical HPLC method was used. The proposed methods were applied to syrup samples containing the four drugs and the obtained results were statistically compared with each other. Finally, the main advantages of the HPLC-PLS method over the classical HPLC method are emphasized: a simple mobile phase, a shorter analysis time, and no need for an internal standard or gradient elution. Copyright © 2010 Elsevier B.V. All rights reserved.
Thermal-to-visible face recognition using partial least squares.
Hu, Shuowen; Choi, Jonghyun; Chan, Alex L; Schwartz, William Robson
2015-03-01
Although visible face recognition has been an active area of research for several decades, cross-modal face recognition has only been explored by the biometrics community relatively recently. Thermal-to-visible face recognition is one of the most difficult cross-modal face recognition challenges, because of the difference in phenomenology between the thermal and visible imaging modalities. We address the cross-modal recognition problem using a partial least squares (PLS) regression-based approach consisting of preprocessing, feature extraction, and PLS model building. The preprocessing and feature extraction stages are designed to reduce the modality gap between the thermal and visible facial signatures, and facilitate the subsequent one-vs-all PLS-based model building. We incorporate multi-modal information into the PLS model building stage to enhance cross-modal recognition. The performance of the proposed recognition algorithm is evaluated on three challenging datasets containing visible and thermal imagery acquired under different experimental scenarios: time-lapse, physical tasks, mental tasks, and subject-to-camera range. These scenarios represent difficult challenges relevant to real-world applications. We demonstrate that the proposed method performs robustly for the examined scenarios.
Li, Juan; Jiang, Yue; Fan, Qi; Chen, Yang; Wu, Ruanqi
2014-05-05
This paper establishes a high-throughput and highly selective method to determine the impurity oxidized glutathione (GSSG) and the radial tensile strength (RTS) of reduced glutathione (GSH) tablets based on near infrared (NIR) spectroscopy and partial least squares (PLS). In order to build and evaluate the calibration models, the NIR diffuse reflectance spectra (DRS) and transmittance spectra (TS) for 330 GSH tablets were accurately measured by using the optimized parameter values. For analyzing GSSG or RTS of GSH tablets, the NIR-DRS or NIR-TS were selected, subdivided reasonably into calibration and prediction sets, and processed appropriately with chemometric techniques. After selecting spectral sub-ranges and neglecting spectrum outliers, the PLS calibration models were built and the factor numbers were optimized. Then, the PLS models were evaluated by the root mean square errors of calibration (RMSEC), cross-validation (RMSECV) and prediction (RMSEP), and by the correlation coefficients of calibration (R(c)) and prediction (R(p)). The results indicate that the proposed models perform well. It is thus clear that NIR-PLS can simultaneously, selectively, nondestructively and rapidly analyze the GSSG and RTS of GSH tablets, although the contents of the GSSG impurity were quite low while those of the GSH active pharmaceutical ingredient (API) were quite high. This strategy can be an important complement to the common NIR methods used in the on-line analysis of API in pharmaceutical preparations. This work also expands NIR applications in high-throughput and extraordinarily selective analysis. Copyright © 2014 Elsevier B.V. All rights reserved.
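The figures of merit named in this abstract (RMSEC, RMSECV, RMSEP and the correlation coefficients) share one computation applied to different sample sets. A minimal sketch follows, assuming the plain root mean square definition with division by n; some software divides RMSEC by the residual degrees of freedom instead, and the abstract does not say which convention was used. The sample values are hypothetical and are not data from the study.

```python
import math

def rmse(reference, predicted):
    """Root mean square error between reference and predicted values (divides by n)."""
    n = len(reference)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)

def pearson_r(reference, predicted):
    """Pearson correlation coefficient between reference and predicted values."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_p = sum(predicted) / n
    cov = sum((r - mean_r) * (p - mean_p) for r, p in zip(reference, predicted))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_r * var_p)

# Hypothetical reference values and model predictions for a prediction set.
y_ref = [1.0, 2.0, 3.0, 4.0]
y_hat = [1.1, 1.9, 3.2, 3.8]
print("RMSEP:", rmse(y_ref, y_hat))
print("Rp:", pearson_r(y_ref, y_hat))
```

Applying `rmse` to the calibration samples gives RMSEC, to the held-out cross-validation predictions gives RMSECV, and to an independent prediction set gives RMSEP.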
Angeyo, K H; Gari, S; Mustapha, A O; Mangala, J M
2012-11-01
The greatest challenge to material characterization by XRF technique is encountered in direct trace analysis of complex matrices. We exploited partial least squares (PLS) in conjunction with energy dispersive X-ray fluorescence and scattering (EDXRFS) spectrometry to rapidly (200 s) analyze lubricating oils. The PLS-EDXRFS method affords non-invasive quality assurance (QA) analysis of complex matrix liquids as it gave optimistic results for both heavy- and low-Z metal additives. Scatter peaks may further be used for QA characterization via the light elements. Copyright © 2012 Elsevier Ltd. All rights reserved.
Cai, Rui; Wang, Shisheng; Tang, Bo; Li, Yueqing; Zhao, Weijie
2018-01-01
Sea cucumber is the major tonic seafood worldwide, and geographical origin traceability is an important part of its quality and safety control. In this work, a non-destructive method for origin traceability of sea cucumber (Apostichopus japonicus) from northern China Sea and East China Sea using near infrared spectroscopy (NIRS) and multivariate analysis methods was proposed. Total fat contents of 189 fresh sea cucumber samples were determined and partial least-squares (PLS) regression was used to establish the quantitative NIRS model. The ordered predictor selection algorithm was performed to select feasible wavelength regions for the construction of PLS and identification models. The identification model was developed by principal component analysis combined with Mahalanobis distance and scaling to the first range algorithms. In the test set of the optimum PLS models, the root mean square error of prediction was 0.45, and correlation coefficient was 0.90. The correct classification rates of 100% were obtained in both identification calibration model and test model. The overall results indicated that NIRS method combined with chemometric analysis was a suitable tool for origin traceability and identification of fresh sea cucumber samples from nine origins in China. PMID:29410795
Guo, Xiuhan; Cai, Rui; Wang, Shisheng; Tang, Bo; Li, Yueqing; Zhao, Weijie
2018-01-01
Sea cucumber is the major tonic seafood worldwide, and geographical origin traceability is an important part of its quality and safety control. In this work, a non-destructive method for origin traceability of sea cucumber ( Apostichopus japonicus ) from northern China Sea and East China Sea using near infrared spectroscopy (NIRS) and multivariate analysis methods was proposed. Total fat contents of 189 fresh sea cucumber samples were determined and partial least-squares (PLS) regression was used to establish the quantitative NIRS model. The ordered predictor selection algorithm was performed to select feasible wavelength regions for the construction of PLS and identification models. The identification model was developed by principal component analysis combined with Mahalanobis distance and scaling to the first range algorithms. In the test set of the optimum PLS models, the root mean square error of prediction was 0.45, and correlation coefficient was 0.90. The correct classification rates of 100% were obtained in both identification calibration model and test model. The overall results indicated that NIRS method combined with chemometric analysis was a suitable tool for origin traceability and identification of fresh sea cucumber samples from nine origins in China.
Burgués, Javier; Marco, Santiago
2018-08-17
Metal oxide semiconductor (MOX) sensors are usually temperature-modulated and calibrated with multivariate models such as partial least squares (PLS) to improve the inherently low selectivity of this technology. The multivariate sensor response patterns exhibit heteroscedastic and correlated noise, which suggests that maximum likelihood methods should outperform PLS. One contribution of this paper is the comparison between PLS and maximum likelihood principal components regression (MLPCR) in MOX sensors. PLS is often criticized for its lack of interpretability when the model complexity increases beyond the chemical rank of the problem. This happens in MOX sensors due to cross-sensitivities to interferences, such as temperature or humidity, and non-linearity. Additionally, the estimation of fundamental figures of merit, such as the limit of detection (LOD), is still not standardized in multivariate models. Orthogonalization methods, such as orthogonal projection to latent structures (O-PLS), have been successfully applied in other fields to reduce the complexity of PLS models. In this work, we propose a LOD estimation method based on applying the well-accepted univariate LOD formulas to the scores of the first component of an orthogonal PLS model. The resulting LOD is compared to the multivariate LOD range derived from error-propagation. The methodology is applied to data extracted from temperature-modulated MOX sensors (FIS SB-500-12 and Figaro TGS 3870-A04), aiming at the detection of low concentrations of carbon monoxide in the presence of uncontrolled humidity (chemical noise). We found that PLS models were simpler and more accurate than MLPCR models. Average LOD values of 0.79 ppm (FIS) and 1.06 ppm (Figaro) were found using the approach described in this paper. These values were contained within the LOD ranges obtained with the error-propagation approach. The mean LOD increased to 1.13 ppm (FIS) and 1.59 ppm (Figaro) when considering validation samples collected two weeks after calibration, which represents a 43% and 46% degradation, respectively. The orthogonal score-plot was a very convenient tool to visualize MOX sensor data and to validate the LOD estimates. Copyright © 2018 Elsevier B.V. All rights reserved.
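The idea of applying a univariate LOD formula to first-component scores can be illustrated with a short sketch. The abstract does not state which univariate formula was used, so the expression below (3.3 times the residual standard deviation divided by the slope of a straight-line fit) is an assumption, chosen because it is one widely used form. The function name `univariate_lod`, the factor `k` and the sample values are hypothetical.

```python
import math

def univariate_lod(concentrations, scores, k=3.3):
    """Fit score = slope * concentration + intercept; return k * s_res / |slope|.

    s_res is the residual standard deviation with n - 2 degrees of freedom.
    The factor k = 3.3 is an assumed, commonly used value.
    """
    n = len(concentrations)
    mean_c = sum(concentrations) / n
    mean_t = sum(scores) / n
    sxx = sum((c - mean_c) ** 2 for c in concentrations)
    sxy = sum((c - mean_c) * (t - mean_t) for c, t in zip(concentrations, scores))
    slope = sxy / sxx
    intercept = mean_t - slope * mean_c
    ss_res = sum((t - (slope * c + intercept)) ** 2
                 for c, t in zip(concentrations, scores))
    s_res = math.sqrt(ss_res / (n - 2))
    return k * s_res / abs(slope)

# Hypothetical CO concentrations (ppm) and first-component scores.
conc = [0.0, 5.0, 10.0, 15.0, 20.0]
t1 = [0.02, 0.51, 0.98, 1.52, 1.99]
print("LOD estimate (ppm):", univariate_lod(conc, t1))
```

Because the scores of a single component form a one-dimensional signal, the usual straight-line calibration machinery applies to them directly, which is what makes the univariate formula usable here.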
ATR-FTIR spectroscopy for the determination of Na4EDTA in detergent aqueous solutions.
Suárez, Leticia; García, Roberto; Riera, Francisco A; Diez, María A
2013-10-15
Fourier transform infrared spectroscopy in the attenuated total reflectance mode (ATR-FTIR) combined with partial least squares (PLS) algorithms was used to design calibration and prediction models for a wide range of tetrasodium ethylenediaminetetraacetate (Na4EDTA) concentrations (0.1 to 28% w/w) in aqueous solutions. The spectra obtained using air and water as a background medium were tested for the best fit. The PLS models designed afforded a sufficient level of precision and accuracy to allow even very small amounts of Na4EDTA to be determined. A root mean square error of nearly 0.37 for the validation set was obtained. Over a concentration range below 5% w/w, the values estimated from a combination of ATR-FTIR spectroscopy and a PLS algorithm model were similar to those obtained from an HPLC analysis of NaFeEDTA complexes and subsequent detection by UV absorbance. However, the lowest detection limit for Na4EDTA concentrations afforded by this spectroscopic/chemometric method was 0.3% w/w. The PLS model was successfully used as a rapid and simple method to quantify Na4EDTA in aqueous solutions of industrial detergents as an alternative to HPLC-UV analysis, which involves time-consuming dilution and complexation processes. © 2013 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Toubar, Safaa S.; Hegazy, Maha A.; Elshahed, Mona S.; Helmy, Marwa I.
2016-06-01
In this work, resolution and quantitation of spectral signals are achieved by several univariate and multivariate techniques. The novel pure component contribution algorithm (PCCA), along with mean centering of ratio spectra (MCR) and the factor-based partial least squares (PLS) algorithms, was developed for simultaneous determination of chlorzoxazone (CXZ), aceclofenac (ACF) and paracetamol (PAR) in their pure form and in recently co-formulated tablets. The PCCA method allows the determination of each drug at its λmax, while the mean centered values at 230, 302 and 253 nm were used for quantification of CXZ, ACF and PAR, respectively, by the MCR method. The partial least squares (PLS) algorithm was applied as a multivariate calibration method. The three methods were successfully applied for determination of CXZ, ACF and PAR in pure form and tablets. Good linear relationships were obtained in the ranges of 2-50, 2-40 and 2-30 μg mL⁻¹ for CXZ, ACF and PAR, respectively, by both PCCA and MCR, while the PLS model was built for the three compounds each in the range of 2-10 μg mL⁻¹. The results obtained from the proposed methods were statistically compared with a reported one. The PCCA and MCR methods were validated according to ICH guidelines, while the PLS method was validated by both cross validation and an independent data set. They were found suitable for the determination of the studied drugs in bulk powder and tablets.
Francisco, Fabiane Lacerda; Saviano, Alessandro Morais; Almeida, Túlia de Souza Botelho; Lourenço, Felipe Rebello
2016-05-01
Microbiological assays are widely used to estimate the relative potencies of antibiotics in order to guarantee the efficacy, safety, and quality of drug products. Despite the advantages of turbidimetric bioassays over other methods, they have limitations concerning the linearity and range of the dose-response curve determination. Here, we proposed to use partial least squares (PLS) regression to overcome these limitations and to improve the prediction of relative potencies of antibiotics. Kinetic-reading microplate turbidimetric bioassays for apramycin and vancomycin were performed using Escherichia coli (ATCC 8739) and Bacillus subtilis (ATCC 6633), respectively. Microbial growth was measured as absorbance up to 180 and 300 min for the apramycin and vancomycin turbidimetric bioassays, respectively. Conventional dose-response curves (absorbances or area under the microbial growth curve vs. log of antibiotic concentration) showed significant regression; however, there were significant deviations from linearity. Thus, they could not be used for relative potency estimations. PLS regression allowed us to construct a predictive model for estimating the relative potencies of apramycin and vancomycin without over-fitting, and it improved the linear range of the turbidimetric bioassay. In addition, PLS regression provided predictions of relative potencies equivalent to those obtained from official agar diffusion methods. Therefore, we conclude that PLS regression may be used to estimate the relative potencies of antibiotics with significant advantages when compared to conventional dose-response curve determination. Copyright © 2016 Elsevier B.V. All rights reserved.
Wang, L; Qin, X C; Lin, H C; Deng, K F; Luo, Y W; Sun, Q R; Du, Q X; Wang, Z Y; Tuo, Y; Sun, J H
2018-02-01
To analyse the relationship between Fourier transform infrared (FTIR) spectrum of rat's spleen tissue and postmortem interval (PMI) for PMI estimation using FTIR spectroscopy combined with data mining method. Rats were sacrificed by cervical dislocation, and the cadavers were placed at 20 ℃. The FTIR spectrum data of rats' spleen tissues were taken and measured at different time points. After pretreatment, the data was analysed by data mining method. The absorption peak intensity of rat's spleen tissue spectrum changed with the PMI, while the absorption peak position was unchanged. The results of principal component analysis (PCA) showed that the cumulative contribution rate of the first three principal components was 96%. There was an obvious clustering tendency for the spectrum sample at each time point. The methods of partial least squares discriminant analysis (PLS-DA) and support vector machine classification (SVMC) effectively divided the spectrum samples with different PMI into four categories (0-24 h, 48-72 h, 96-120 h and 144-168 h). The determination coefficient ( R ²) of the PMI estimation model established by PLS regression analysis was 0.96, and the root mean square error of calibration (RMSEC) and root mean square error of cross validation (RMSECV) were 9.90 h and 11.39 h respectively. In prediction set, the R ² was 0.97, and the root mean square error of prediction (RMSEP) was 10.49 h. The FTIR spectrum of the rat's spleen tissue can be effectively analyzed qualitatively and quantitatively by the combination of FTIR spectroscopy and data mining method, and the classification and PLS regression models can be established for PMI estimation. Copyright© by the Editorial Department of Journal of Forensic Medicine.
NASA Astrophysics Data System (ADS)
Jiang, Hui; Liu, Guohai; Mei, Congli; Yu, Shuang; Xiao, Xiahong; Ding, Yuhan
2012-11-01
The feasibility of rapid determination of the process variables (i.e. pH and moisture content) in solid-state fermentation (SSF) of wheat straw using Fourier transform near infrared (FT-NIR) spectroscopy was studied. Synergy interval partial least squares (siPLS) algorithm was implemented to calibrate regression model. The number of PLS factors and the number of subintervals were optimized simultaneously by cross-validation. The performance of the prediction model was evaluated according to the root mean square error of cross-validation (RMSECV), the root mean square error of prediction (RMSEP) and the correlation coefficient (R). The measurement results of the optimal model were obtained as follows: RMSECV = 0.0776, Rc = 0.9777, RMSEP = 0.0963, and Rp = 0.9686 for pH model; RMSECV = 1.3544% w/w, Rc = 0.8871, RMSEP = 1.4946% w/w, and Rp = 0.8684 for moisture content model. Finally, compared with classic PLS and iPLS models, the siPLS model revealed its superior performance. The overall results demonstrate that FT-NIR spectroscopy combined with siPLS algorithm can be used to measure process variables in solid-state fermentation of wheat straw, and NIR spectroscopy technique has a potential to be utilized in SSF industry.
NASA Astrophysics Data System (ADS)
Moustafa, Azza A.; Hegazy, Maha A.; Mohamed, Dalia; Ali, Omnia
2016-02-01
A novel approach for the resolution and quantitation of severely overlapped quaternary mixture of carbinoxamine maleate (CAR), pholcodine (PHL), ephedrine hydrochloride (EPH) and sunset yellow (SUN) in syrup was demonstrated utilizing different spectrophotometric assisted multivariate calibration methods. The applied methods have used different processing and pre-processing algorithms. The proposed methods were partial least squares (PLS), concentration residuals augmented classical least squares (CRACLS), and a novel method; continuous wavelet transforms coupled with partial least squares (CWT-PLS). These methods were applied to a training set in the concentration ranges of 40-100 μg/mL, 40-160 μg/mL, 100-500 μg/mL and 8-24 μg/mL for the four components, respectively. The utilized methods have not required any preliminary separation step or chemical pretreatment. The validity of the methods was evaluated by an external validation set. The selectivity of the developed methods was demonstrated by analyzing the drugs in their combined pharmaceutical formulation without any interference from additives. The obtained results were statistically compared with the official and reported methods where no significant difference was observed regarding both accuracy and precision.
Domain-Invariant Partial-Least-Squares Regression.
Nikzad-Langerodi, Ramin; Zellinger, Werner; Lughofer, Edwin; Saminger-Platz, Susanne
2018-05-11
Multivariate calibration models often fail to extrapolate beyond the calibration samples because of changes associated with the instrumental response, environmental condition, or sample matrix. Most of the current methods used to adapt a source calibration model to a target domain exclusively apply to calibration transfer between similar analytical devices, while generic methods for calibration-model adaptation are largely missing. To fill this gap, we here introduce domain-invariant partial-least-squares (di-PLS) regression, which extends ordinary PLS by a domain regularizer in order to align the source and target distributions in the latent-variable space. We show that a domain-invariant weight vector can be derived in closed form, which allows the integration of (partially) labeled data from the source and target domains as well as entirely unlabeled data from the latter. We test our approach on a simulated data set where the aim is to desensitize a source calibration model to an unknown interfering agent in the target domain (i.e., unsupervised model adaptation). In addition, we demonstrate unsupervised, semisupervised, and supervised model adaptation by di-PLS on two real-world near-infrared (NIR) spectroscopic data sets.
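For orientation, the ordinary single-response PLS that the abstract says di-PLS extends can be sketched with the classical NIPALS-style PLS1 recursion. This is a minimal illustration only: the domain regularizer that defines di-PLS is not implemented, and the function names `pls1_fit` and `pls1_predict` and the toy data are hypothetical.

```python
import math

def pls1_fit(X, y, n_components):
    """Fit a PLS1 model; X is a list of rows, y a list of responses."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                # centered y
    weights, loadings, coefs = [], [], []
    for _ in range(n_components):
        # Weight vector: direction in X space with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p_load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y by the part explained by this component.
        for i in range(n):
            for j in range(p):
                E[i][j] -= t[i] * p_load[j]
            f[i] -= q * t[i]
        weights.append(w)
        loadings.append(p_load)
        coefs.append(q)
    return x_mean, y_mean, weights, loadings, coefs

def pls1_predict(model, x):
    """Predict the response for one sample by repeating the deflation steps."""
    x_mean, y_mean, weights, loadings, coefs = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, p_load, q in zip(weights, loadings, coefs):
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_hat += q * t
        e = [ej - t * pj for ej, pj in zip(e, p_load)]
    return y_hat

# Toy data generated from y = 2*x1 - x2 + 1 (hypothetical, for illustration).
X_cal = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 8.0]]
y_cal = [2 * a - b + 1 for a, b in X_cal]
model = pls1_fit(X_cal, y_cal, n_components=2)
print(pls1_predict(model, [6.0, 4.0]))
```

With as many components as there are independent predictors, PLS1 coincides with ordinary least squares, which is why the toy example recovers the generating relationship; in spectroscopic practice far fewer components than wavelengths are retained.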
Ouyang, Qin; Zhao, Jiewen; Chen, Quansheng
2015-01-01
The non-sugar solids (NSS) content is one of the most important nutrition indicators of Chinese rice wine. This study proposed a rapid method for the measurement of NSS content in Chinese rice wine using near infrared (NIR) spectroscopy. We also systematically studied efficient spectral variable selection algorithms for use in modeling. A new algorithm of synergy interval partial least square with competitive adaptive reweighted sampling (Si-CARS-PLS) was proposed for modeling. The performance of the final model was back-evaluated using root mean square error of calibration (RMSEC) and correlation coefficient (Rc) in the calibration set and similarly tested by root mean square error of prediction (RMSEP) and correlation coefficient (Rp) in the prediction set. The optimum model by the Si-CARS-PLS algorithm was achieved when 7 PLS factors and 18 variables were included, and the results were as follows: Rc=0.95 and RMSEC=1.12 in the calibration set, Rp=0.95 and RMSEP=1.22 in the prediction set. In addition, the Si-CARS-PLS algorithm showed its superiority when compared with the commonly used algorithms in multivariate calibration. This work demonstrated that the NIR spectroscopy technique combined with a suitable multivariate calibration algorithm has a high potential in rapid measurement of NSS content in Chinese rice wine. Copyright © 2015 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Liu, Fei; He, Yong
2008-03-01
Three different chemometric methods were applied for the determination of sugar content of cola soft drinks using visible and near infrared spectroscopy (Vis/NIRS). Four varieties of colas were prepared and 180 samples (45 samples for each variety) were selected for the calibration set, while 60 samples (15 samples for each variety) were selected for the validation set. Savitzky-Golay smoothing, standard normal variate (SNV) and Savitzky-Golay first derivative transformation were applied for the pre-processing of spectral data. The first eleven principal components (PCs) extracted by partial least squares (PLS) analysis were employed as the inputs of the BP neural network (BPNN) and least squares-support vector machine (LS-SVM) models. Then the BPNN model with the optimal structural parameters and the LS-SVM model with radial basis function (RBF) kernel were applied to build the regression model for comparison with PLS regression. The correlation coefficient (r), root mean square error of prediction (RMSEP) and bias for prediction were 0.971, 1.259 and -0.335 for PLS, 0.986, 0.763 and -0.042 for BPNN, and 0.978, 0.995 and -0.227 for LS-SVM, respectively. All three methods provided high and satisfactory precision. The results indicated that Vis/NIR spectroscopy combined with chemometric methods could be utilized as a high precision way for the determination of sugar content of cola soft drinks.
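The pre-processing steps named in this abstract can be sketched in a few lines. The block below is an illustration under stated assumptions: the Savitzky-Golay window length and polynomial order used in the study are not given, so a 5-point quadratic window with unit spacing is assumed, and only the points where the full window fits are returned. Function names and the sample spectrum are hypothetical.

```python
import math

def snv(spectrum):
    """Standard normal variate: center one spectrum and scale by its own std."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / sd for v in spectrum]

# Standard 5-point quadratic Savitzky-Golay coefficients (assumed window).
SG_SMOOTH = [-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35]
SG_DERIV1 = [-2 / 10, -1 / 10, 0.0, 1 / 10, 2 / 10]

def sg_filter(spectrum, coefficients):
    """Apply a 5-point convolution; output is 4 points shorter than the input."""
    half = len(coefficients) // 2
    return [
        sum(c * spectrum[i + k - half] for k, c in enumerate(coefficients))
        for i in range(half, len(spectrum) - half)
    ]

# Hypothetical absorbance values at equally spaced wavelengths.
raw = [0.10, 0.12, 0.18, 0.30, 0.45, 0.52, 0.50, 0.41]
print(snv(raw))
print(sg_filter(raw, SG_SMOOTH))
print(sg_filter(raw, SG_DERIV1))
```

SNV acts on each spectrum independently, so it can be applied to calibration and validation samples without sharing any statistics between the two sets.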
Boiret, Mathieu; Meunier, Loïc; Ginot, Yves-Michel
2011-02-20
A near infrared (NIR) method was developed for determination of tablet potency of active pharmaceutical ingredient (API) in a complex coated tablet matrix. The calibration set contained samples from laboratory and production scale batches. The reference values were obtained by high performance liquid chromatography (HPLC) and partial least squares (PLS) regression was used to establish a model. The model was challenged by calculating tablet potency of two external test sets. Root mean square errors of prediction were respectively equal to 2.0% and 2.7%. To use this model with a second spectrometer from the production field, a calibration transfer method called piecewise direct standardisation (PDS) was used. After the transfer, the root mean square error of prediction of the first test set was 2.4% compared to 4.0% without transferring the spectra. A statistical technique using bootstrap of PLS residuals was used to estimate confidence intervals of tablet potency calculations. This method requires an optimised PLS model, selection of the bootstrap number and determination of the risk. In the case of a chemical analysis, the tablet potency value will be included within the confidence interval calculated by the bootstrap method. An easy to use graphical interface was developed to easily determine if the predictions, surrounded by minimum and maximum values, are within the specifications defined by the regulatory organisation. Copyright © 2010 Elsevier B.V. All rights reserved.
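The bootstrap-of-residuals idea mentioned in this abstract can be sketched as follows. To keep the example self-contained, a univariate straight-line fit stands in for the optimised PLS model, which is a deliberate simplification; the three ingredients the abstract lists (a fitted model, the bootstrap number, and the risk) appear as the fit, `n_boot`, and `risk`. All names and data are hypothetical.

```python
import random

def fit_line(x, y):
    """Least-squares straight line; stand-in for the optimised PLS model."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    return slope, mean_y - slope * mean_x

def bootstrap_interval(x, y, x_new, n_boot=1000, risk=0.05, seed=0):
    """Percentile interval for a new prediction from resampled residuals."""
    rng = random.Random(seed)
    slope, intercept = fit_line(x, y)
    fitted = [slope * xi + intercept for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    predictions = []
    for _ in range(n_boot):
        # Build a pseudo-response by adding resampled residuals, then refit.
        y_star = [fi + rng.choice(residuals) for fi in fitted]
        s, b = fit_line(x, y_star)
        # Add one more resampled residual so the interval covers a new sample.
        predictions.append(s * x_new + b + rng.choice(residuals))
    predictions.sort()
    lower = predictions[int((risk / 2) * n_boot)]
    upper = predictions[int((1 - risk / 2) * n_boot) - 1]
    return lower, upper

# Hypothetical spectral response vs. potency (% of label claim).
signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
potency = [90.5, 94.8, 100.3, 104.9, 110.4, 114.7]
print(bootstrap_interval(signal, potency, x_new=3.5))
```

A prediction whose whole interval lies inside the specification limits can be accepted with the chosen risk, which appears to be the decision the graphical interface described in the abstract is meant to support.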
The development of comparative bias index
NASA Astrophysics Data System (ADS)
Aimran, Ahmad Nazim; Ahmad, Sabri; Afthanorhan, Asyraf; Awang, Zainudin
2017-08-01
Structural Equation Modeling (SEM) is a second-generation statistical analysis technique developed for analyzing the inter-relationships among multiple variables in a model simultaneously. The two most commonly used methods in SEM are Covariance-Based Structural Equation Modeling (CB-SEM) and Partial Least Square Path Modeling (PLS-PM). There has been continuous debate among researchers about the use of PLS-PM over CB-SEM, yet few studies have tested the bias of CB-SEM and PLS-PM in estimating simulated data. This study intends to address this problem by a) developing the Comparative Bias Index and b) testing the performance of CB-SEM and PLS-PM using the developed index. Based on a balanced experimental design, two multivariate normal simulated data sets with distinct specifications, of sizes 50, 100, 200 and 500, are generated and analyzed using CB-SEM and PLS-PM.
Partial least squares (PLS) analysis offers a number of advantages over the more traditionally used regression analyses applied in landscape ecology, particularly for determining the associations among multiple constituents of surface water and landscape configuration. Common dat...
Lafuente, Victoria; Herrera, Luis J; Pérez, María del Mar; Val, Jesús; Negueruela, Ignacio
2015-08-15
In this work, near infrared spectroscopy (NIR) and an acoustic measure (AWETA) (two non-destructive methods) were applied in Prunus persica fruit 'Calrico' (n = 260) to predict Magness-Taylor (MT) firmness. Separate and combined use of these measures was evaluated and compared using partial least squares (PLS) and least squares support vector machine (LS-SVM) regression methods. Also, a mutual-information-based variable selection method, seeking to find the most significant variables to produce optimal accuracy of the regression models, was applied to a joint set of variables (NIR wavelengths and AWETA measure). The newly proposed combined NIR-AWETA model gave good values of the determination coefficient (R(2)) for PLS and LS-SVM methods (0.77 and 0.78, respectively), improving the reliability of MT firmness prediction in comparison with separate NIR and AWETA predictions. The three variables selected by the variable selection method (AWETA measure plus NIR wavelengths 675 and 697 nm) achieved R(2) values 0.76 and 0.77, PLS and LS-SVM. These results indicated that the proposed mutual-information-based variable selection algorithm was a powerful tool for the selection of the most relevant variables. © 2014 Society of Chemical Industry.
Žuvela, Petar; Liu, J Jay; Macur, Katarzyna; Bączek, Tomasz
2015-10-06
In this work, performance of five nature-inspired optimization algorithms, genetic algorithm (GA), particle swarm optimization (PSO), artificial bee colony (ABC), firefly algorithm (FA), and flower pollination algorithm (FPA), was compared in molecular descriptor selection for development of quantitative structure-retention relationship (QSRR) models for 83 peptides that originate from eight model proteins. The matrix with 423 descriptors was used as input, and QSRR models based on selected descriptors were built using partial least squares (PLS), whereas root mean square error of prediction (RMSEP) was used as a fitness function for their selection. Three performance criteria, prediction accuracy, computational cost, and the number of selected descriptors, were used to evaluate the developed QSRR models. The results show that all five variable selection methods outperform interval PLS (iPLS), sparse PLS (sPLS), and the full PLS model, whereas GA is superior because of its lowest computational cost and higher accuracy (RMSEP of 5.534%) with a smaller number of variables (nine descriptors). The GA-QSRR model was validated initially through Y-randomization. In addition, it was successfully validated with an external testing set out of 102 peptides originating from Bacillus subtilis proteomes (RMSEP of 22.030%). Its applicability domain was defined, from which it was evident that the developed GA-QSRR exhibited strong robustness. All the sources of the model's error were identified, thus allowing for further application of the developed methodology in proteomics.
Dinç, Erdal; Ertekin, Zehra Ceren
2016-01-01
An application of parallel factor analysis (PARAFAC) and three-way partial least squares (3W-PLS1) regression models to ultra-performance liquid chromatography-photodiode array detection (UPLC-PDA) data with co-eluted peaks in the same wavelength and time regions was described for the multicomponent quantitation of hydrochlorothiazide (HCT) and olmesartan medoxomil (OLM) in tablets. Three-way dataset of HCT and OLM in their binary mixtures containing telmisartan (IS) as an internal standard was recorded with a UPLC-PDA instrument. Firstly, the PARAFAC algorithm was applied for the decomposition of three-way UPLC-PDA data into the chromatographic, spectral and concentration profiles to quantify the concerned compounds. Secondly, 3W-PLS1 approach was subjected to the decomposition of a tensor consisting of three-way UPLC-PDA data into a set of triads to build 3W-PLS1 regression for the analysis of the same compounds in samples. For the proposed three-way analysis methods in the regression and prediction steps, the applicability and validity of PARAFAC and 3W-PLS1 models were checked by analyzing the synthetic mixture samples, inter-day and intra-day samples, and standard addition samples containing HCT and OLM. Two different three-way analysis methods, PARAFAC and 3W-PLS1, were successfully applied to the quantitative estimation of the solid dosage form containing HCT and OLM. Regression and prediction results provided from three-way analysis were compared with those obtained by traditional UPLC method. Copyright © 2015 Elsevier B.V. All rights reserved.
Partial least squares (PLS) analysis offers a number of advantages over the more traditionally used regression analyses applied in landscape ecology to study the associations among constituents of surface water and landscapes. Common data problems in ecological studies include: s...
NASA Astrophysics Data System (ADS)
Yu, Jiajia; He, Yong
Mango is a popular tropical fruit, and the soluble solid content is an important in this study visible and short-wave near-infrared spectroscopy (VIS/SWNIR) technique was applied. To investigate the feasibility of using VIS/SWNIR spectroscopy to measure the soluble solid content in mango, and to validate the performance of the selected sensitive bands, the calibration set was formed from 135 mango samples, and the remaining 45 mango samples formed the prediction set. The combination of partial least squares and backpropagation artificial neural networks (PLS-BP) was used to build the prediction model based on raw spectrum data. Based on PLS-BP, the determination coefficient for prediction (Rp) was 0.757 and root mean square and the process is simple and easy to operate. Compared with the partial least squares (PLS) result, the performance of PLS-BP is better.
NASA Astrophysics Data System (ADS)
Shi, Ji-yong; Zou, Xiao-bo; Zhao, Jie-wen; Mel, Holmes; Wang, Kai-liang; Wang, Xue; Chen, Hong
Total flavonoids content is often considered an important quality index of Ginkgo biloba leaf. The feasibility of using near infrared (NIR) spectra at the wavelength range of 10,000-4000 cm⁻¹ for rapid and nondestructive determination of total flavonoids content in G. biloba leaf was investigated. 120 fresh G. biloba leaves in different colors (green, green-yellowish and yellow) were used for spectra acquisition and total flavonoids determination. Partial least squares (PLS), interval partial least squares (iPLS) and synergy interval partial least squares (SiPLS) were used to develop calibration models for total flavonoids content in leaves of two colors (green-yellowish and yellow) and leaves of three colors (green, green-yellowish and yellow), respectively. The level of total flavonoids content for green, green-yellowish and yellow leaves was in an increasing order. Two characteristic wavelength regions (5840-6090 cm⁻¹ and 6620-6880 cm⁻¹), which corresponded to the absorptions of two aromatic rings in the basic flavonoid structure, were selected by SiPLS. The optimal SiPLS model for total flavonoids content in the leaves of two colors (r² = 0.82, RMSEP = 2.62 mg g⁻¹) had better performance than the PLS and iPLS models. It could be concluded that NIR spectroscopy has significant potential in the nondestructive determination of total flavonoids content in fresh G. biloba leaf.
Jiang, Hui; Liu, Guohai; Mei, Congli; Yu, Shuang; Xiao, Xiahong; Ding, Yuhan
2012-11-01
The feasibility of rapid determination of the process variables (i.e. pH and moisture content) in solid-state fermentation (SSF) of wheat straw using Fourier transform near infrared (FT-NIR) spectroscopy was studied. Synergy interval partial least squares (siPLS) algorithm was implemented to calibrate regression model. The number of PLS factors and the number of subintervals were optimized simultaneously by cross-validation. The performance of the prediction model was evaluated according to the root mean square error of cross-validation (RMSECV), the root mean square error of prediction (RMSEP) and the correlation coefficient (R). The measurement results of the optimal model were obtained as follows: RMSECV=0.0776, R(c)=0.9777, RMSEP=0.0963, and R(p)=0.9686 for pH model; RMSECV=1.3544% w/w, R(c)=0.8871, RMSEP=1.4946% w/w, and R(p)=0.8684 for moisture content model. Finally, compared with classic PLS and iPLS models, the siPLS model revealed its superior performance. The overall results demonstrate that FT-NIR spectroscopy combined with siPLS algorithm can be used to measure process variables in solid-state fermentation of wheat straw, and NIR spectroscopy technique has a potential to be utilized in SSF industry. Copyright © 2012 Elsevier B.V. All rights reserved.
PLS-LS-SVM based modeling of ATR-IR as a robust method in detection and qualification of alprazolam
NASA Astrophysics Data System (ADS)
Parhizkar, Elahehnaz; Ghazali, Mohammad; Ahmadi, Fatemeh; Sakhteman, Amirhossein
2017-02-01
According to the United States Pharmacopeia (USP), the gold standard technique for alprazolam determination in dosage forms is HPLC, an expensive and time-consuming method that is not easily accessible. In this study, chemometrics-assisted ATR-IR was introduced as an alternative method that produces similar results with less time and energy. Fifty-eight samples containing different concentrations of commercial alprazolam were evaluated by HPLC and by the ATR-IR method. A preprocessing approach was applied to convert the raw data obtained from ATR-IR spectra to a normal matrix. Finally, a relationship between the alprazolam concentrations obtained by HPLC and the ATR-IR data was established using PLS-LS-SVM (partial least squares least squares support vector machines). The method was validated and yielded a model with low error values (root mean square error of cross validation equal to 0.98). According to the R2 of the prediction set, the model was able to predict about 99% of the samples. A response permutation test was also applied to confirm that the model was not the result of chance correlations. In conclusion, ATR-IR can be a reliable method for the detection and qualification of alprazolam content in the manufacturing process.
Ouyang, Qin; Zhao, Jiewen; Pan, Wenxiu; Chen, Quansheng
2016-01-01
A portable and low-cost spectral analytical system was developed and used to monitor real-time process parameters, i.e. total sugar content (TSC), alcohol content (AC) and pH during rice wine fermentation. Various partial least square (PLS) algorithms were implemented to construct models. The performance of a model was evaluated by the correlation coefficient (Rp) and the root mean square error (RMSEP) in the prediction set. Among the models used, the synergy interval PLS (Si-PLS) was found to be superior. The optimal performance by the Si-PLS model for the TSC was Rp = 0.8694, RMSEP = 0.438; the AC was Rp = 0.8097, RMSEP = 0.617; and the pH was Rp = 0.9039, RMSEP = 0.0805. The stability and reliability of the system, as well as the optimal models, were verified using coefficients of variation, most of which were found to be less than 5%. The results suggest this portable system is a promising tool that could be used as an alternative method for rapid monitoring of process parameters during rice wine fermentation. Copyright © 2015 Elsevier Ltd. All rights reserved.
Navy Fuel Composition and Screening Tool (FCAST) v2.8
2016-05-10
allowed us to develop partial least squares (PLS) models based on gas chromatography–mass spectrometry (GC-MS) data that predict fuel properties. The...Chemometric property modeling Partial least squares PLS Compositional profiler Naval Air Systems Command Air-4.4.5 Patuxent River Naval Air Station Patuxent...Cumulative predicted residual error sum of squares DiEGME Diethylene glycol monomethyl ether FCAST Fuel Composition and Screening Tool FFP Fit for
Zhang, Mengliang; Harrington, Peter de B
2015-01-01
Multivariate partial least-squares (PLS) method was applied to the quantification of two complex polychlorinated biphenyls (PCBs) commercial mixtures, Aroclor 1254 and 1260, in a soil matrix. PCBs in soil samples were extracted by headspace solid phase microextraction (SPME) and determined by gas chromatography/mass spectrometry (GC/MS). Decachlorinated biphenyl (deca-CB) was used as internal standard. After the baseline correction was applied, four data representations including extracted ion chromatograms (EIC) for Aroclor 1254, EIC for Aroclor 1260, EIC for both Aroclors and two-way data sets were constructed for PLS-1 and PLS-2 calibrations and evaluated with respect to quantitative prediction accuracy. The PLS model was optimized with respect to the number of latent variables using cross validation of the calibration data set. The validation of the method was performed with certified soil samples and real field soil samples and the predicted concentrations for both Aroclors using EIC data sets agreed with the certified values. The linear range of the method was from 10 μg kg-1 to 1000 μg kg-1 for both Aroclor 1254 and 1260 in soil matrices and the detection limit was 4 μg kg-1 for Aroclor 1254 and 6 μg kg-1 for Aroclor 1260. This holistic approach for the determination of mixtures of complex samples has broad application to environmental forensics and modeling. Copyright © 2014 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Duan, Fajie; Fu, Xiao; Jiang, Jiajia; Huang, Tingting; Ma, Ling; Zhang, Cong
2018-05-01
In this work, an automatic variable selection method for quantitative analysis of soil samples using laser-induced breakdown spectroscopy (LIBS) is proposed, which is based on full spectrum correction (FSC) and modified iterative predictor weighting-partial least squares (mIPW-PLS). The method features automatic selection without manual intervention. To illustrate the feasibility and effectiveness of the method, it was compared with the genetic algorithm (GA) and the successive projections algorithm (SPA) for the detection of different elements (copper, barium and chromium) in soil. The experimental results showed that all three methods could accomplish variable selection effectively, among which FSC-mIPW-PLS required significantly shorter computation time (approximately 12 s for 40,000 initial variables) than the others. Moreover, improved quantification models were obtained with the variable selection approaches. The root mean square errors of prediction (RMSEP) of models utilizing the new method were 27.47 (copper), 37.15 (barium) and 39.70 (chromium) mg/kg, a prediction performance comparable with that of GA and SPA.
Newman, J; Egan, T; Harbourne, N; O'Riordan, D; Jacquier, J C; O'Sullivan, M
2014-08-01
Sensory evaluation can be problematic for ingredients with a bitter taste during the research and development phase of new food products. In this study, 19 dairy protein hydrolysates (DPH) were analysed by an electronic tongue and by their physicochemical characteristics. The data obtained from these methods were correlated with bitterness intensity as scored by a trained sensory panel, and each model was also assessed for its predictive capabilities. The physicochemical characteristics of the DPHs investigated were degree of hydrolysis (DH%), and data relating to peptide size and relative hydrophobicity from size exclusion chromatography (SEC) and reverse phase (RP) HPLC. Partial least square regression (PLS) was used to construct the prediction models. All PLS regressions had good correlations (0.78 to 0.93), with the strongest being the combination of data obtained from SEC and RP HPLC. However, the PLS with the strongest predictive power was based on the e-tongue, which had the PLS regression with the lowest root mean predicted residual error sum of squares (PRESS) in the study. The results show that the PLS models constructed with the e-tongue and with the combination of SEC and RP-HPLC have potential to be used for prediction of bitterness, thus reducing the reliance on sensory analysis of DPHs in future food research. Copyright © 2014 Elsevier B.V. All rights reserved.
Ramírez, J; Górriz, J M; Segovia, F; Chaves, R; Salas-Gonzalez, D; López, M; Alvarez, I; Padilla, P
2010-03-19
This letter presents a computer aided diagnosis (CAD) technique for the early detection of Alzheimer's disease (AD) by means of single photon emission computed tomography (SPECT) image classification. The proposed method is based on a partial least squares (PLS) regression model and a random forest (RF) predictor. The curse of dimensionality is addressed by reducing the large dimensionality of the input data: the SPECT images are downscaled and score features are extracted using PLS. An RF predictor then forms an ensemble of classification and regression tree (CART)-like classifiers whose output is determined by a majority vote of the trees in the forest. A baseline principal component analysis (PCA) system is also developed for reference. The experimental results show that the combined PLS-RF system yields a generalization error that converges to a limit when increasing the number of trees in the forest. Thus, the generalization error is reduced when using PLS and depends on the strength of the individual trees in the forest and the correlation between them. Moreover, PLS feature extraction is found to be more effective than PCA for extracting discriminative information from the data, yielding peak sensitivity, specificity and accuracy values of 100%, 92.7%, and 96.9%, respectively. In addition, the proposed CAD system outperformed several other recently developed AD CAD systems. Copyright 2010 Elsevier Ireland Ltd. All rights reserved.
Piccirilli, Gisela N; Escandar, Graciela M
2006-09-01
This paper demonstrates for the first time the power of a chemometric second-order algorithm for predicting, in a simple way and using spectrofluorimetric data, the concentration of analytes in the presence of both the inner-filter effect and unsuspected species. The simultaneous determination of the systemic fungicides carbendazim and thiabendazole was achieved and employed for the discussion of the scopes of the applied second-order chemometric tools: parallel factor analysis (PARAFAC) and partial least-squares with residual bilinearization (PLS/RBL). The chemometric study was performed using fluorescence excitation-emission matrices obtained after the extraction of the analytes over a C18-membrane surface. The ability of PLS/RBL to recognize and overcome the significant changes produced by thiabendazole in both the excitation and emission spectra of carbendazim is demonstrated. The high performance of the selected PLS/RBL method was established with the determination of both pesticides in artificial and real samples.
Miller, Arthur L.; Weakley, Andrew Todd; Griffiths, Peter R.; Cauda, Emanuele G.; Bayman, Sean
2017-01-01
In order to help reduce silicosis in miners, the National Institute for Occupational Safety and Health (NIOSH) is developing field-portable methods for measuring airborne respirable crystalline silica (RCS), specifically the polymorph α-quartz, in mine dusts. In this study we demonstrate the feasibility of end-of-shift measurement of α-quartz using a direct-on-filter (DoF) method to analyze coal mine dust samples deposited onto polyvinyl chloride filters. The DoF method is potentially amenable for on-site analyses, but deviates from the current regulatory determination of RCS for coal mines by eliminating two sample preparation steps: ashing the sampling filter and redepositing the ash prior to quantification by Fourier transform infrared (FT-IR) spectrometry. In this study, the FT-IR spectra of 66 coal dust samples from active mines were used, and the RCS was quantified by using: (1) an ordinary least squares (OLS) calibration approach that utilizes standard silica material as done in the Mine Safety and Health Administration's P7 method; and (2) a partial least squares (PLS) regression approach. Both were capable of accounting for kaolinite, which can confound the IR analysis of silica. The OLS method utilized analytical standards for silica calibration and kaolin correction, resulting in a good linear correlation with P7 results and minimal bias but with the accuracy limited by the presence of kaolinite. The PLS approach also produced predictions well-correlated to the P7 method, as well as better accuracy in RCS prediction, and no bias due to variable kaolinite mass. Besides decreased sensitivity to mineral or substrate confounders, PLS has the advantage that the analyst is not required to correct for the presence of kaolinite or background interferences related to the substrate, making the method potentially viable for automated RCS prediction in the field.
This study demonstrated the efficacy of FT-IR transmission spectrometry for silica determination in coal mine dusts, using both OLS and PLS analyses, when kaolinite was present. PMID:27645724
Liu, Fei; Feng, Lei; Lou, Bing-gan; Sun, Guang-ming; Wang, Lian-ping; He, Yong
2010-07-01
The combinational-stimulated bands were used to develop linear and nonlinear calibrations for the early detection of sclerotinia of oilseed rape (Brassica napus L.). Eighty healthy and 100 Sclerotinia leaf samples were scanned, and different preprocessing methods combined with successive projections algorithm (SPA) were applied to develop partial least squares (PLS) discriminant models, multiple linear regression (MLR) and least squares-support vector machine (LS-SVM) models. The results indicated that the optimal full-spectrum PLS model was achieved by direct orthogonal signal correction (DOSC), then De-trending and Raw spectra with correct recognition ratio of 100%, 95.7% and 95.7%, respectively. When using combinational-stimulated bands, the optimal linear models were SPA-MLR (DOSC) and SPA-PLS (DOSC) with correct recognition ratio of 100%. All SPA-LSSVM models using DOSC, De-trending and Raw spectra achieved perfect results with recognition of 100%. The overall results demonstrated that it was feasible to use combinational-stimulated bands for the early detection of Sclerotinia of oilseed rape, and DOSC-SPA was a powerful way for informative wavelength selection. This method supplied a new approach to the early detection and portable monitoring instrument of sclerotinia.
NASA Astrophysics Data System (ADS)
Suhandy, D.; Yulia, M.; Ogawa, Y.; Kondo, N.
2018-05-01
In the present research, near infrared (NIR) spectroscopy in tandem with full spectrum partial least squares (FS-PLS) regression was evaluated for quantification of the degree of adulteration in civet coffee. A total of 126 ground roasted coffee samples with degrees of adulteration of 0-51% were prepared. Spectral data were acquired using a NIR spectrometer equipped with an integrating sphere for diffuse reflectance measurement in the range of 1300-2500 nm. The samples were divided into two groups: a calibration sample set (84 samples) and a prediction sample set (42 samples). The calibration model was developed on the original spectra using FS-PLS regression with the full cross-validation method. The calibration model exhibited a determination coefficient of R2=0.96 for calibration and R2=0.92 for validation. The prediction resulted in a low root mean square error of prediction (RMSEP) (4.67%) and a high ratio of prediction to deviation (RPD) (3.75). In conclusion, the degree of adulteration in civet coffee has been quantified successfully by using NIR spectroscopy and FS-PLS regression in a non-destructive, economical, precise, and highly sensitive method, which uses very simple sample preparation.
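The two prediction statistics quoted above can be computed as in the following minimal sketch. It assumes the commonly used definitions, which the abstract does not spell out: RMSEP as the root of the mean squared prediction error, and RPD as the sample standard deviation of the reference values divided by RMSEP. The function names are illustrative.

```python
import math
import statistics


def rmsep(reference, predicted):
    """Root mean square error of prediction over a prediction set."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("reference and predicted must be non-empty and equal in length")
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rpd(reference, predicted):
    """Assumed definition: sample SD of the reference values divided by RMSEP."""
    return statistics.stdev(reference) / rmsep(reference, predicted)
```

Under that assumed definition, the reported RPD of 3.75 and RMSEP of 4.67% would correspond to a reference standard deviation of roughly 17.5% adulteration in the prediction set.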
Improved Quantitative Analysis of Ion Mobility Spectrometry by Chemometric Multivariate Calibration
DOE Office of Scientific and Technical Information (OSTI.GOV)
Fraga, Carlos G.; Kerr, Dayle; Atkinson, David A.
2009-09-01
Traditional peak-area calibration and the multivariate calibration methods of principal component regression (PCR) and partial least squares (PLS), including unfolded PLS (U-PLS) and multi-way PLS (N-PLS), were evaluated for the quantification of 2,4,6-trinitrotoluene (TNT) and cyclo-1,3,5-trimethylene-2,4,6-trinitramine (RDX) in Composition B samples analyzed by temperature step desorption ion mobility spectrometry (TSD-IMS). The true TNT and RDX concentrations of eight Composition B samples were determined by high performance liquid chromatography with UV absorbance detection. Most of the Composition B samples were found to have distinct TNT and RDX concentrations. Applying PCR and PLS on the exact same IMS spectra used for the peak-area study improved quantitative accuracy and precision approximately 3 to 5 fold and 2 to 4 fold, respectively. This in turn improved the probability of correctly identifying Composition B samples based upon the estimated RDX and TNT concentrations from 11% with peak area to 44% and 89% with PLS. This improvement increases the potential of obtaining forensic information from IMS analyzers by providing some ability to differentiate or match Composition B samples based on their TNT and RDX concentrations.
Measurement of single soybean seed attributes by near infrared technologies. A comparative study
USDA-ARS's Scientific Manuscript database
Four near infrared spectrophotometers, and their associated spectral collection methods, were tested and compared for measuring three soybean single seed attributes: weight (g), protein (%), and oil (%). Using partial least squares (PLS) and 4 preprocessing methods, the attribute which was significa...
NASA Astrophysics Data System (ADS)
Fu, Y.; Yang, W.; Xu, O.; Zhou, L.; Wang, J.
2017-04-01
To investigate time-variant and nonlinear characteristics in industrial processes, a soft sensor modelling method based on time difference, moving-window recursive partial least square (PLS) and adaptive model updating is proposed. In this method, time difference values of input and output variables are used as training samples to construct the model, which can reduce the effects of the nonlinear characteristic on modelling accuracy and retain the advantages of recursive PLS algorithm. To solve the high updating frequency of the model, a confidence value is introduced, which can be updated adaptively according to the results of the model performance assessment. Once the confidence value is updated, the model can be updated. The proposed method has been used to predict the 4-carboxy-benz-aldehyde (CBA) content in the purified terephthalic acid (PTA) oxidation reaction process. The results show that the proposed soft sensor modelling method can reduce computation effectively, improve prediction accuracy by making use of process information and reflect the process characteristics accurately.
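The time-difference and moving-window ideas can be sketched as below. This is an illustration under assumptions rather than the authors' algorithm: it takes the common formulation in which the model is trained on differences x(t) - x(t - lag) and y(t) - y(t - lag), and the predicted difference is added back to the last measured output. The function names and the `lag` and `window_size` parameters are hypothetical, and the recursive PLS update and the confidence-value logic are not shown.

```python
def time_difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both values exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def reconstruct(previous_y, predicted_delta):
    """Recover the output estimate from the last measured value and a predicted difference."""
    return previous_y + predicted_delta


def moving_window(samples, window_size):
    """Yield successive windows holding the most recent window_size samples."""
    for end in range(window_size, len(samples) + 1):
        yield samples[end - window_size:end]
```

A steady drift such as 1.0, 2.5, 4.0, 5.5 becomes a constant series of differences, which is one reason differencing can make slowly varying process behaviour easier to fit with a linear model.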
Shashilov, Victor A; Sikirzhytski, Vitali; Popova, Ludmila A; Lednev, Igor K
2010-09-01
Here we report on novel quantitative approaches for protein structural characterization using deep UV resonance Raman (DUVRR) spectroscopy. Specifically, we propose a new method combining hydrogen-deuterium (HD) exchange and Bayesian source separation for extracting the DUVRR signatures of various structural elements of aggregated proteins including the cross-beta core and unordered parts of amyloid fibrils. The proposed method is demonstrated using the set of DUVRR spectra of hen egg white lysozyme acquired at various stages of HD exchange. Prior information about the concentration matrix and the spectral features of the individual components was incorporated into the Bayesian equation to eliminate the ill-conditioning of the problem caused by 100% correlation of the concentration profiles of protonated and deuterated species. Secondary structure fractions obtained by partial least squares (PLS) and least squares support vector machines (LS-SVMs) were used as the initial guess for the Bayesian source separation. Advantages of the PLS and LS-SVMs methods over the classical least squares calibration (CLSC) are discussed and illustrated using the DUVRR data of the prion protein in its native and aggregated forms. Copyright (c) 2010 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Alaoui, G.; Leger, M.; Gagne, J.; Tremblay, L.
2009-05-01
The goal of this work was to evaluate the capability of infrared reflectance spectroscopy for a fast quantification of the elemental and molecular compositions of sedimentary and particulate organic matter (OM). A partial least-squares (PLS) regression model was used for analysis and values were compared to those obtained by traditional methods (i.e., elemental, humic and HPLC analyses). PLS tools are readily accessible from software such as GRAMS (Thermo-Fisher) used in spectroscopy. This spectroscopic-chemometric approach has several advantages including its rapidity and use of whole unaltered samples. To predict properties, a set of infrared spectra from representative samples must first be fitted to form a PLS calibration model. In this study, a large set (180) of sediments and particles on GFF filters from the St. Lawrence estuarine system were used. These samples are very heterogenous (e.g., various tributaries, terrigenous vs. marine, events such as landslides and floods) and thus represent a challenging test for PLS prediction. For sediments, the infrared spectra were obtained with a diffuse reflectance, or DRIFT, accessory. Sedimentary carbon, nitrogen, humic substance contents as well as humic substance proportions in OM and N:C ratios were predicted by PLS. The relative root mean square error of prediction (%RMSEP) for these properties were between 5.7% (humin content) and 14.1% (total humic substance yield) using the cross-validation, or leave-one out, approach. The %RMSEP calculated by PLS for carbon content was lower with the PLS model (7.6%) than with an external calibration method (11.7%) (Tremblay and Gagné, 2002, Anal. Chem., 74, 2985). Moreover, the PLS approach does not require the extraction of POM needed in external calibration. Results highlighted the importance of using a PLS calibration set representative of the unknown samples (e.g., same area). 
For filtered particles, the infrared spectra were obtained using a novel approach based on attenuated total reflectance, or ATR, allowing the direct analysis of the filters. In addition to carbon and nitrogen contents, amino acid and muramic acid (a bacterial biomarker) yields were predicted using PLS. Calculated %RMSEP varied from 6.4% (total amino acid content) to 18.6% (muramic acid content) with cross-validation. PLS regression modeling does not require a priori knowledge of the spectral bands associated with the properties to be predicted. In turn, the spectral regions that give good PLS predictions provided valuable information on band assignment and geochemical processes. For instance, nitrogen and humin contents were greatly determined by an absorption band caused by aluminosilicate OH group. This supports the idea that OM-clay interactions, important in humin formation and OM preservation, are mediated by nitrogen-containing groups.
Chang, Wen-Qi; Zhou, Jian-Liang; Li, Yi; Shi, Zi-Qi; Wang, Li; Yang, Jie; Li, Ping; Liu, Li-Fang; Xin, Gui-Zhong
2017-01-15
The elevation of free fatty acids (FFAs) has been regarded as a universal metabolic signature of excessive adipocyte lipolysis. Nowadays, an in vitro lipolysis assay is generally essential for drug screening prior to animal studies. Here, we present a novel in vitro approach for lipolysis measurement combining UHPLC-Orbitrap and partial least squares (PLS) based analysis. Firstly, the calibration matrix was constructed from serial proportions of mixed samples (blends of control and model samples). Then, lipidome profiling was performed by UHPLC-Orbitrap, and 403 variables were extracted and aligned as the dataset. Owing to the high resolution of the Orbitrap analyzer and open source lipid identification software, 28 FFAs were further screened and identified. Based on the relative intensity of the screened FFAs, a PLS regression model was constructed for lipolysis measurement. After leave-one-out cross-validation, ten principal components were selected to build the final PLS model, which showed excellent performance (RMSECV, 0.0268; RMSEC, 0.0173; R2, 0.9977). In addition, the high predictive accuracy (R2 = 0.9907 and RMSEP = 0.0345) of the trained PLS model was also demonstrated using test samples. Finally, taking curcumin as a model compound, its antilipolytic effect on palmitic acid-induced lipolysis was successfully predicted as 31.78% by the proposed approach. Besides, supplementary evidence of curcumin-induced modification in FFA composition as well as in the lipidome was given by PLS extended methods. Unlike general biological assays, high resolution MS-based methods provide more detailed information about biological events. Thus, the novel biological evaluation model proposed here shows promise for drug evaluation or disease diagnosis. Copyright © 2016 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Chen, Hua-cai; Chen, Xing-dan; Lu, Yong-jun; Cao, Zhi-qiang
2006-01-01
Near infrared (NIR) reflectance spectroscopy was used to develop a fast determination method for total ginsenosides in Ginseng (Panax Ginseng) powder. The spectra were analyzed with the multiplicative signal correction (MSC) correlation method. The spectral regions best correlated with the total ginsenosides content were 1660-1880 nm and 2230-2380 nm. The NIR calibration models of ginsenosides were built with multiple linear regression (MLR), principal component regression (PCR) and partial least squares (PLS) regression, respectively. The results showed that the calibration model built with PLS combined with MSC and the optimal spectrum region was the best one. The correlation coefficient and the root mean square error of calibration (RMSEC) of the best calibration model were 0.98 and 0.15%, respectively. The optimal spectrum region for calibration was 1204-2014 nm. The results suggested that using NIR to rapidly determine the total ginsenosides content in ginseng powder was feasible.
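The MSC preprocessing step named above can be sketched as follows. This is a minimal sketch of the textbook form of the correction, assumed here because the abstract gives no formula: each spectrum is regressed on the mean spectrum of the set, and the fitted offset and slope are then removed. The function name `msc` is illustrative.

```python
import statistics


def msc(spectra):
    """Textbook multiplicative signal (scatter) correction against the mean spectrum.

    spectra: list of equal-length lists of absorbance values.
    Returns (corrected_spectra, reference_spectrum).
    """
    n_points = len(spectra[0])
    reference = [statistics.fmean(s[j] for s in spectra) for j in range(n_points)]
    ref_mean = statistics.fmean(reference)
    sxx = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for s in spectra:
        s_mean = statistics.fmean(s)
        # Least-squares fit of s = intercept + slope * reference
        slope = sum((r - ref_mean) * (v - s_mean) for r, v in zip(reference, s)) / sxx
        intercept = s_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in s])
    return corrected, reference
```

Spectra that differ from the mean spectrum only by an additive offset and a multiplicative factor collapse onto the reference after correction, so the remaining differences reflect composition rather than scatter.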
NASA Astrophysics Data System (ADS)
Anderson, R. B.; Clegg, S. M.; Frydenvang, J.
2015-12-01
One of the primary challenges faced by the ChemCam instrument on the Curiosity Mars rover is developing a regression model that can accurately predict the composition of the wide range of target types encountered (basalts, calcium sulfate, feldspar, oxides, etc.). The original calibration used 69 rock standards to train a partial least squares (PLS) model for each major element. By expanding the suite of calibration samples to >400 targets spanning a wider range of compositions, the accuracy of the model was improved, but some targets with "extreme" compositions (e.g. pure minerals) were still poorly predicted. We have therefore developed a simple method, referred to as "submodel PLS", to improve the performance of PLS across a wide range of target compositions. In addition to generating a "full" (0-100 wt.%) PLS model for the element of interest, we also generate several overlapping submodels (e.g. for SiO2, we generate "low" (0-50 wt.%), "mid" (30-70 wt.%), and "high" (60-100 wt.%) models). The submodels are generally more accurate than the "full" model for samples within their range because they are able to adjust for matrix effects that are specific to that range. To predict the composition of an unknown target, we first predict the composition with the submodels and the "full" model. Then, based on the predicted composition from the "full" model, the appropriate submodel prediction can be used (e.g. if the full model predicts a low composition, use the "low" model result, which is likely to be more accurate). For samples with "full" predictions that occur in a region of overlap between submodels, the submodel predictions are "blended" using a simple linear weighted sum. The submodel PLS method shows improvements in most of the major elements predicted by ChemCam and reduces the occurrence of negative predictions for low wt.% targets. Submodel PLS is currently being used in conjunction with ICA regression for the major element compositions of ChemCam data.
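The selection and blending logic described above can be sketched as follows. This is a sketch under assumptions, not the ChemCam team's code: the range boundaries are taken from the SiO2 example in the abstract, the weights are a plain linear ramp across each overlap region (the abstract says only that a simple linear weighted sum is used), and the function name `blend_submodels` is hypothetical.

```python
def blend_submodels(full_pred, low_pred, mid_pred, high_pred):
    """Pick or blend submodel predictions (wt.%) based on the full-model prediction.

    Assumed ranges: low 0-50, mid 30-70, high 60-100, so the overlap
    regions are 30-50 (low/mid) and 60-70 (mid/high).
    """
    if full_pred < 30:
        return low_pred
    if full_pred <= 50:
        w = (full_pred - 30) / (50 - 30)  # 0 at 30 wt.%, 1 at 50 wt.%
        return (1 - w) * low_pred + w * mid_pred
    if full_pred < 60:
        return mid_pred
    if full_pred <= 70:
        w = (full_pred - 60) / (70 - 60)  # 0 at 60 wt.%, 1 at 70 wt.%
        return (1 - w) * mid_pred + w * high_pred
    return high_pred
```

The ramp keeps the blended prediction continuous as the full-model prediction moves from one submodel's range into the next, which avoids a jump at the range boundaries.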
Li, Yuanpeng; Li, Fucui; Yang, Xinhao; Guo, Liu; Huang, Furong; Chen, Zhenqiang; Chen, Xingdan; Zheng, Shifu
2018-08-05
A rapid quantitative analysis model for determining glycated albumin (GA) content, based on attenuated total reflectance (ATR) Fourier transform infrared spectroscopy (FTIR) combined with linear SiPLS and nonlinear SVM, has been developed. Firstly, the real GA content in human serum was determined by the GA enzymatic method; meanwhile, the ATR-FTIR spectra of serum samples from a health examination population were obtained. The spectral data of the whole mid-infrared region (4000-600 cm-1) and GA's characteristic region (1800-800 cm-1) were used for quantitative analysis. Secondly, several preprocessing steps, including first derivative, second derivative, variable standardization and spectral normalization, were performed. Lastly, quantitative regression models were established using SiPLS and SVM, respectively. The SiPLS modeling results are as follows: root mean square error of cross validation (RMSECV(T)) = 0.523 g/L, calibration coefficient (R(C)) = 0.937, root mean square error of prediction (RMSEP(T)) = 0.787 g/L, and prediction coefficient (R(P)) = 0.938. The SVM modeling results are as follows: RMSECV(T) = 0.0048 g/L, R(C) = 0.998, RMSEP(T) = 0.442 g/L, and R(P) = 0.916. The results indicated that model performance improved significantly after preprocessing and optimization of characteristic regions, and that the modeling performance of nonlinear SVM was considerably better than that of linear SiPLS. Hence, the quantitative analysis model for GA in human serum based on ATR-FTIR combined with SiPLS and SVM is effective. It does not need sample preprocessing, is simple to operate and is time-efficient, providing a rapid and accurate method for GA content determination. Copyright © 2018 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Hadad, Ghada M.; El-Gindy, Alaa; Mahmoud, Waleed M. M.
2008-08-01
High-performance liquid chromatography (HPLC) and multivariate spectrophotometric methods are described for the simultaneous determination of ambroxol hydrochloride (AM) and doxycycline (DX) in combined pharmaceutical capsules. The chromatographic separation was achieved on a reversed-phase C18 analytical column with a mobile phase consisting of a mixture of 20 mM potassium dihydrogen phosphate, pH 6-acetonitrile in a ratio of (1:1, v/v) and UV detection at 245 nm. Also, the resolution has been accomplished by using numerical spectrophotometric methods such as classical least squares (CLS), principal component regression (PCR) and partial least squares (PLS-1) applied to the UV spectra of the mixture, and a graphical spectrophotometric method, the first derivative of the ratio spectra (1DD) method. Analytical figures of merit (FOM), such as sensitivity, selectivity, analytical sensitivity, limit of quantitation and limit of detection were determined for CLS, PLS-1 and PCR methods. The proposed methods were validated and successfully applied for the analysis of pharmaceutical formulation and laboratory-prepared mixtures containing the two component combination.
Rapid detection of talcum powder in tea using FT-IR spectroscopy coupled with chemometrics
Li, Xiaoli; Zhang, Yuying; He, Yong
2016-01-01
This paper investigated the feasibility of Fourier transform infrared (FT-IR) transmission spectroscopy to detect talcum powder illegally added to tea, based on chemometric methods. Firstly, 210 samples of tea powder with 13 dose levels of talcum powder were prepared for FT-IR spectra acquisition. In order to highlight the slight variations in FT-IR spectra, smoothing, normalization and standard normal variate (SNV) were employed to preprocess the raw spectra. Among them, SNV preprocessing had the best performance, with high correlation of prediction (RP = 0.948) and low root mean square error of prediction (RMSEP = 0.108) for the partial least squares (PLS) model. Then 18 characteristic wavenumbers were selected based on a hybrid of backward interval partial least squares (biPLS) regression, the competitive adaptive reweighted sampling (CARS) algorithm and the successive projections algorithm (SPA). These characteristic wavenumbers accounted for only 0.64% of the full wavenumbers. Following that, the 18 characteristic wavenumbers were used to build linear and nonlinear determination models by PLS regression and extreme learning machine (ELM), respectively. The optimal model, with RP = 0.963 and RMSEP = 0.137, was achieved by the ELM algorithm. These results demonstrated that FT-IR spectroscopy with chemometrics could be used successfully to detect talcum powder in tea. PMID:27468701
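The standard normal variate (SNV) step named in this abstract can be illustrated in a few lines: each spectrum is centred on its own mean and scaled by its own standard deviation, which removes additive offsets and multiplicative scaling between samples. This is a minimal sketch assuming a spectrum stored as a plain list of intensities; the function name and sample values are hypothetical, and the paper's actual software is not stated in the abstract.

```python
import math

def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own statistics."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    # Sample standard deviation (n - 1 in the denominator).
    sd = math.sqrt(sum((x - mean) ** 2 for x in spectrum) / (n - 1))
    if sd == 0:
        raise ValueError("a constant spectrum cannot be SNV-scaled")
    return [(x - mean) / sd for x in spectrum]

# Two hypothetical spectra that differ only by an offset and a gain
# collapse onto the same SNV-corrected curve.
raw = [1.0, 2.0, 3.0, 4.0]
shifted = [3.0 + 2.0 * x for x in raw]
print(snv(raw))
print(snv(shifted))
```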
Xu, Yun; Muhamadali, Howbeer; Sayqal, Ali; Dixon, Neil; Goodacre, Royston
2016-10-28
Partial least squares (PLS) is one of the most commonly used supervised modelling approaches for analysing multivariate metabolomics data. PLS is typically employed as either a regression model (PLS-R) or a classification model (PLS-DA). However, in metabolomics studies it is common to investigate multiple, potentially interacting, factors simultaneously following a specific experimental design. Such data often cannot be considered as a "pure" regression or a classification problem. Nevertheless, these data have often still been treated as a regression or classification problem and this could lead to ambiguous results. In this study, we investigated the feasibility of designing a hybrid target matrix Y that better reflects the experimental design than simple regression or binary class membership coding commonly used in PLS modelling. The new design of Y coding was based on the same principle used by structural modelling in machine learning techniques. Two real metabolomics datasets were used as examples to illustrate how the new Y coding can improve the interpretability of the PLS model compared to classic regression/classification coding.
NASA Astrophysics Data System (ADS)
Yang, Renjie; Dong, Guimei; Sun, Xueshan; Yang, Yanrong; Yu, Yaping; Liu, Haixue; Zhang, Weiyu
2018-02-01
A new approach for the quantitative determination of polycyclic aromatic hydrocarbons (PAHs) in the environment was proposed, based on two-dimensional (2D) fluorescence correlation spectroscopy in conjunction with a multivariate method. Forty mixture solutions of anthracene and pyrene were prepared in the laboratory. Excitation-emission matrix (EEM) fluorescence spectra of all samples were collected, and 2D fluorescence correlation spectra were calculated under the excitation perturbation. N-way partial least squares (N-PLS) models were developed based on the 2D fluorescence correlation spectra, showing a root mean square error of calibration (RMSEC) of 3.50 μg L⁻¹ and a root mean square error of prediction (RMSEP) of 4.42 μg L⁻¹ for anthracene, and of 3.61 μg L⁻¹ and 4.29 μg L⁻¹ for pyrene, respectively. N-PLS models were also developed for quantitative analysis of anthracene and pyrene using EEM fluorescence spectra. The RMSEC and RMSEP were 3.97 μg L⁻¹ and 4.63 μg L⁻¹ for anthracene, and 4.46 μg L⁻¹ and 4.52 μg L⁻¹ for pyrene, respectively. It was found that the N-PLS model using 2D fluorescence correlation spectra provided better results than the model using EEM fluorescence spectra, as shown by its lower RMSEC and RMSEP. The proposed methodology has the potential to be an alternative method for the detection of PAHs in the environment.
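For readers unfamiliar with 2D correlation spectra, the synchronous spectrum in generalized 2D correlation analysis is the covariance of the mean-centred ("dynamic") spectra across the perturbation. The sketch below assumes that a set of emission spectra recorded at successive excitation settings is stored as a list of equal-length lists; the function name and numbers are hypothetical, and the abstract does not say which variant of the calculation the authors used.

```python
def synchronous_2d(spectra):
    """Synchronous 2D correlation matrix of spectra measured under a perturbation.

    spectra[j][k] is the intensity of the j-th perturbed spectrum at spectral
    variable k. The result phi[a][b] is the covariance of variables a and b
    over the perturbation, using the mean spectrum as reference.
    """
    m = len(spectra)
    n = len(spectra[0])
    mean = [sum(s[k] for s in spectra) / m for k in range(n)]
    dynamic = [[s[k] - mean[k] for k in range(n)] for s in spectra]
    return [
        [sum(d[a] * d[b] for d in dynamic) / (m - 1) for b in range(n)]
        for a in range(n)
    ]

# Hypothetical example: variables 0 and 1 change together, variable 2 is constant.
phi = synchronous_2d([[1.0, 2.0, 0.0], [2.0, 4.0, 0.0], [3.0, 6.0, 0.0]])
print(phi)
```

Positive off-diagonal entries mark spectral variables whose intensities change in the same direction under the perturbation.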
Terra, Luciana A; Filgueiras, Paulo R; Tose, Lílian V; Romão, Wanderson; de Souza, Douglas D; de Castro, Eustáquio V R; de Oliveira, Mirela S L; Dias, Júlio C M; Poppi, Ronei J
2014-10-07
Negative-ion mode electrospray ionization, ESI(-), with Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS) was coupled to a Partial Least Squares (PLS) regression and variable selection methods to estimate the total acid number (TAN) of Brazilian crude oil samples. Generally, ESI(-)-FT-ICR mass spectra present a power of resolution of ca. 500,000 and a mass accuracy less than 1 ppm, producing a data matrix containing over 5700 variables per sample. These variables correspond to heteroatom-containing species detected as deprotonated molecules, [M - H](-) ions, which are identified primarily as naphthenic acids, phenols and carbazole analog species. The TAN values for all samples ranged from 0.06 to 3.61 mg of KOH g(-1). To facilitate the spectral interpretation, three methods of variable selection were studied: variable importance in the projection (VIP), interval partial least squares (iPLS) and elimination of uninformative variables (UVE). The UVE method seems to be more appropriate for selecting important variables, reducing the dimension of the variables to 183 and producing a root mean square error of prediction of 0.32 mg of KOH g(-1). By reducing the size of the data, it was possible to relate the selected variables with their corresponding molecular formulas, thus identifying the main chemical species responsible for the TAN values.
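To make the PLS and variable importance in the projection (VIP) steps concrete, the sketch below fits a single-response PLS model with the NIPALS algorithm and derives VIP scores from its weights. It is a pure-Python illustration under stated assumptions (mean-centred data, no scaling, hypothetical function names and toy data), not the implementation used in the study.

```python
import math

def pls1_fit(X, y, n_components):
    """Fit PLS1 by NIPALS on mean-centred data."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # X residuals
    f = [v - y_mean for v in y]                                # y residuals
    W, P, Q, SS = [], [], [], []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left in y for X to explain
            break
        w = [v / norm for v in w]                              # unit weight vector
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        W.append(w); P.append(load); Q.append(q); SS.append(q * q * tt)
    return {"x_mean": x_mean, "y_mean": y_mean, "W": W, "P": P, "Q": Q, "SS": SS}

def pls1_predict(model, x):
    """Predict one sample by repeating the deflation steps used in fitting."""
    e = [xj - mj for xj, mj in zip(x, model["x_mean"])]
    y_hat = model["y_mean"]
    for w, load, q in zip(model["W"], model["P"], model["Q"]):
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_hat += q * t
        e = [ej - t * lj for ej, lj in zip(e, load)]
    return y_hat

def vip_scores(model):
    """VIP_j = sqrt(p * sum_a(SS_a * w_ja^2) / sum_a(SS_a)), with unit-norm weights."""
    p = len(model["x_mean"])
    total = sum(model["SS"])
    return [
        math.sqrt(p * sum(ss * w[j] ** 2 for ss, w in zip(model["SS"], model["W"])) / total)
        for j in range(p)
    ]

# Toy data (hypothetical): y = 2*x1 - x2 + 3 exactly.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 4.0]]
y = [3.0, 6.0, 4.0, 8.0, 9.0]
model = pls1_fit(X, y, n_components=2)
print([pls1_predict(model, row) for row in X])
print(vip_scores(model))
```

Variables with VIP above about 1 are commonly treated as important, although that threshold is a convention and not something this abstract specifies.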
NASA Astrophysics Data System (ADS)
Liu, Wen; Zhang, Yuying; Yang, Si; Han, Donghai
2018-05-01
A new technique to identify the floral sources of honeys is needed. Terahertz time-domain attenuated total reflection spectroscopy combined with chemometric methods was applied to discriminate between different categories of honey (Medlar honey, Vitex honey, and Acacia honey). Principal component analysis (PCA), cluster analysis (CA) and partial least squares-discriminant analysis (PLS-DA) were used to extract information on the botanical origins of the honeys. The spectral range was also examined in order to increase the precision of the PLS-DA model. An accuracy of 88.46% for the validation set was obtained using the PLS-DA model in the 0.5-1.5 THz range. This work indicated that terahertz time-domain attenuated total reflection spectroscopy is a viable approach for evaluating the quality of honey rapidly.
Lozano, Valeria A; Ibañez, Gabriela A; Olivieri, Alejandro C
2009-10-05
In the presence of analyte-background interactions and a significant background signal, both second-order multivariate calibration and standard addition are required for successful analyte quantitation achieving the second-order advantage. This report discusses a modified second-order standard addition method, in which the test data matrix is subtracted from the standard addition matrices, and quantitation proceeds via the classical external calibration procedure. It is shown that this novel data processing method allows one to apply not only parallel factor analysis (PARAFAC) and multivariate curve resolution-alternating least-squares (MCR-ALS), but also the recently introduced and more flexible partial least-squares (PLS) models coupled to residual bilinearization (RBL). In particular, the multidimensional variant N-PLS/RBL is shown to produce the best analytical results. The comparison is carried out with the aid of a set of simulated data, as well as two experimental data sets: one aimed at the determination of salicylate in human serum in the presence of naproxen as an additional interferent, and the second one devoted to the analysis of danofloxacin in human serum in the presence of salicylate.
Prediction of valid acidity in intact apples with Fourier transform near infrared spectroscopy.
Liu, Yan-De; Ying, Yi-Bin; Fu, Xia-Ping
2005-03-01
To develop nondestructive acidity prediction for intact Fuji apples, the potential of Fourier transform near infrared (FT-NIR) method with fiber optics in interactance mode was investigated. Interactance in the 800 nm to 2619 nm region was measured for intact apples, harvested from early to late maturity stages. Spectral data were analyzed by two multivariate calibration techniques including partial least squares (PLS) and principal component regression (PCR) methods. A total of 120 Fuji apples were tested and 80 of them were used to form a calibration data set. The influences of different data preprocessing and spectra treatments were also quantified. Calibration models based on smoothing spectra were slightly worse than those based on derivative spectra, and the best result was obtained when the segment length was 5 nm and the gap size was 10 points. Depending on data preprocessing and PLS method, the best prediction model yielded a coefficient of determination (r²) of 0.759, a low root mean square error of prediction (RMSEP) of 0.0677 and a low root mean square error of calibration (RMSEC) of 0.0562. The results indicated the feasibility of FT-NIR spectral analysis for predicting apple valid acidity in a nondestructive way. PMID:15682498
Szymanska-Chargot, M; Chylinska, M; Kruk, B; Zdunek, A
2015-01-22
The aim of this work was to quantitatively and qualitatively determine the composition of the cell wall material from apples during development by means of Fourier transform infrared (FT-IR) spectroscopy. The FT-IR region of 1500-800 cm(-1), containing characteristic bands for galacturonic acid, hemicellulose and cellulose, was examined using principal component analysis (PCA), k-means clustering and partial least squares (PLS). The samples were differentiated by development stage and cultivar using PCA and k-means clustering. PLS calibration models for galacturonic acid, hemicellulose and cellulose content from FT-IR spectra were developed and validated with the reference data. PLS models were tested using the root-mean-square errors of cross-validation for the contents of galacturonic acid, hemicellulose and cellulose, which were 8.30 mg/g, 4.08% and 1.74%, respectively. It was proven that FT-IR spectroscopy combined with chemometric methods has potential for fast and reliable determination of the main constituents of fruit cell walls. Copyright © 2014 Elsevier Ltd. All rights reserved.
USDA-ARS's Scientific Manuscript database
A technique of using multiple calibration sets in partial least squares regression (PLS) was proposed to improve the quantitative determination of ammonia from open-path Fourier transform infrared spectra. The spectra were measured near animal farms, and the path-integrated concentration of ammonia...
Koch, Cosima; Posch, Andreas E; Goicoechea, Héctor C; Herwig, Christoph; Lendl, Bernhard
2014-01-07
This paper presents the inline quantification of Penicillin V and phenoxyacetic acid, a precursor, during Penicillium chrysogenum fermentations by FTIR spectroscopy with partial least squares (PLS) regression and multivariate curve resolution - alternating least squares (MCR-ALS). First, the applicability of an attenuated total reflection FTIR fiber optic probe was assessed offline by measuring standards of the analytes of interest and investigating matrix effects of the fermentation broth. Then measurements were performed inline during four fed-batch fermentations with online HPLC for the determination of Penicillin V and phenoxyacetic acid as reference analysis. PLS and MCR-ALS models were built using these data and validated by comparison of single analyte spectra with the selectivity ratio of the PLS models and the extracted spectral traces of the MCR-ALS models, respectively. The achieved root mean square errors of cross-validation for the PLS regressions were 0.22 g L(-1) for Penicillin V and 0.32 g L(-1) for phenoxyacetic acid and the root mean square errors of prediction for MCR-ALS were 0.23 g L(-1) for Penicillin V and 0.15 g L(-1) for phenoxyacetic acid. A general work-flow for building and assessing chemometric regression models for the quantification of multiple analytes in bioprocesses by FTIR spectroscopy is given. Copyright © 2013 The Authors. Published by Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Metwally, Fadia H.
2008-02-01
The quantitative predictive abilities of the new and simple bivariate spectrophotometric method are compared with the results obtained by the use of multivariate calibration methods [classical least squares (CLS), principal component regression (PCR) and partial least squares (PLS)], using the information contained in the absorption spectra of the appropriate solutions. Mixtures of the two drugs Nifuroxazide (NIF) and Drotaverine hydrochloride (DRO) were resolved by application of the bivariate method. The different chemometric approaches were also applied with previous optimization of the calibration matrix, as they are useful for the simultaneous inclusion of many spectral wavelengths. The results found by application of the bivariate, CLS, PCR and PLS methods for the simultaneous determination of mixtures of both components containing 2-12 μg ml⁻¹ of NIF and 2-8 μg ml⁻¹ of DRO are reported. Both approaches were satisfactorily applied to the simultaneous determination of NIF and DRO in pure form and in pharmaceutical formulation. The results were in accordance with those given by the EVA Pharma reference spectrophotometric method.
Determination of butter adulteration with margarine using Raman spectroscopy.
Uysal, Reyhan Selin; Boyaci, Ismail Hakki; Genis, Hüseyin Efe; Tamer, Ugur
2013-12-15
In this study, adulteration of butter with margarine was analysed using Raman spectroscopy combined with chemometric methods (principal component analysis (PCA), principal component regression (PCR), partial least squares (PLS)) and artificial neural networks (ANNs). Different butter and margarine samples were mixed at various concentrations ranging from 0% to 100% w/w. PCA analysis was applied for the classification of butters, margarines and mixtures. PCR, PLS and ANN were used for the detection of adulteration ratios of butter. Models were created using a calibration data set and developed models were evaluated using a validation data set. The coefficient of determination (R(2)) values between actual and predicted values obtained for PCR, PLS and ANN for the validation data set were 0.968, 0.987 and 0.978, respectively. In conclusion, a combination of Raman spectroscopy with chemometrics and ANN methods can be applied for testing butter adulteration. Copyright © 2013 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Attia, Khalid A. M.; Nassar, Mohammed W. I.; El-Zeiny, Mohamed B.; Serag, Ahmed
2016-04-01
Three simple, specific, accurate and precise spectrophotometric methods were developed for the determination of cefprozil (CZ) in the presence of its alkaline induced degradation product (DCZ). The first method was the bivariate method, while the two other multivariate methods were partial least squares (PLS) and spectral residual augmented classical least squares (SRACLS). The multivariate methods were applied with and without variable selection procedure (genetic algorithm GA). These methods were tested by analyzing laboratory prepared mixtures of the above drug with its alkaline induced degradation product and they were applied to its commercial pharmaceutical products.
Kumar, Keshav; Mishra, Ashok Kumar
2015-07-01
The fluorescence characteristics of 8-anilinonaphthalene-1-sulfonic acid (ANS) in ethanol-water mixtures, in combination with partial least squares (PLS) analysis, were used to propose a simple and sensitive analytical procedure for monitoring the adulteration of ethanol by water. The proposed analytical procedure was found to be capable of detecting even small levels of adulteration of ethanol by water. The robustness of the procedure is evident from statistical parameters such as the square of the correlation coefficient (R(2)), root mean square error of calibration (RMSEC) and root mean square error of prediction (RMSEP), which were found to be well within the acceptable limits.
Chen, Ru-huang; Jin, Gang
2015-08-01
This paper presented an application of mid-infrared (MIR), near-infrared (NIR) and Raman spectroscopies for collecting the spectra of 31 kinds of low density polyethylene/polypropylene (LDPE/PP) samples with different proportions. Different pre-processing methods (multiplicative scatter correction, mean centering and Savitzky-Golay first derivative) and spectral regions were explored to develop partial least-squares (PLS) models for LDPE, and their influence on the accuracy of the PLS models was discussed. The three spectroscopies were compared with respect to the accuracy of quantitative measurement. The pre-processing methods and spectral region were found to have a great impact on the accuracy of the PLS model, especially for spectra with subtle differences, random noise and baseline variation. After pre-processing and spectral region selection, the calibration models for MIR, NIR and Raman exhibited R2/RMSEC values of 0.9906/2.941, 0.9973/1.561 and 0.9972/1.598, respectively, compared with 0.8876/10.15, 0.8493/11.75 and 0.8757/10.67 before any treatment. The results also suggested that MIR, NIR and Raman are three strong tools for predicting the content of LDPE in LDPE/PP blends. However, NIR and Raman showed higher accuracy after pre-processing and are more suitable for fast quantitative characterization due to their high measuring speed.
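Of the pre-processing methods listed in this abstract, multiplicative scatter correction (MSC) is the least self-explanatory: each spectrum is regressed on a reference spectrum (here the mean spectrum), and the fitted offset and slope are then removed. The sketch below is an illustration under those assumptions, with a hypothetical function name and toy data; it is not the authors' code.

```python
def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum.

    Each spectrum x is modelled as x = intercept + slope * reference and
    replaced by (x - intercept) / slope.
    """
    n = len(spectra[0])
    reference = [sum(s[k] for s in spectra) / len(spectra) for k in range(n)]
    ref_mean = sum(reference) / n
    sxx = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / n
        slope = sum((r - ref_mean) * (x - s_mean) for r, x in zip(reference, s)) / sxx
        intercept = s_mean - slope * ref_mean
        corrected.append([(x - intercept) / slope for x in s])
    return corrected

# Hypothetical spectra: the second is the first with a gain of 2 and an offset of 1.
base = [1.0, 3.0, 2.0, 5.0]
print(msc([base, [2.0 * b + 1.0 for b in base]]))
```

Because both toy spectra are exact affine transforms of the same shape, MSC maps them onto the same corrected curve, which is the behaviour that removes scatter-induced baseline and gain differences before PLS modelling.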
Statistical variation in progressive scrambling
NASA Astrophysics Data System (ADS)
Clark, Robert D.; Fox, Peter C.
2004-07-01
The two methods most often used to evaluate the robustness and predictivity of partial least squares (PLS) models are cross-validation and response randomization. Both methods may be overly optimistic for data sets that contain redundant observations, however. The kinds of perturbation analysis widely used for evaluating model stability in the context of ordinary least squares regression are only applicable when the descriptors are independent of each other and errors are independent and normally distributed; neither assumption holds for QSAR in general and for PLS in particular. Progressive scrambling is a novel, non-parametric approach to perturbing models in the response space in a way that does not disturb the underlying covariance structure of the data. Here, we introduce adjustments for two of the characteristic values produced by a progressive scrambling analysis - the deprecated predictivity ($Q_s^{*2}$) and standard error of prediction ($\mathrm{SDEP}_s^{*}$) - that correct for the effect of introduced perturbation. We also explore the statistical behavior of the adjusted values ($Q_0^{*2}$ and $\mathrm{SDEP}_0^{*}$) and the sensitivity to perturbation ($dq^2/dr_{yy'}^2$). It is shown that the three statistics are all robust for stable PLS models, in terms of the stochastic component of their determination and of their variation due to sampling effects involved in training set selection.
Fakayode, Sayo O; Mitchell, Breanna S; Pollard, David A
2014-08-01
Accurate understanding of analyte boiling points (BP) is of critical importance in gas chromatographic (GC) separation and crude oil refinery operation in petrochemical industries. This study reported the first combined use of GC separation and partial least squares (PLS1) multivariate regression analysis (MRA) of petrochemical structural activity relationship (SAR) for accurate BP determination of two commercially available (D3710 and MA VHP) calibration gas mix samples. The results of the BP determination using PLS1 multivariate regression were further compared with the results of the traditional simulated distillation method of BP determination. The developed PLS1 regression was able to correctly predict analyte BPs in the D3710 and MA VHP calibration gas mix samples, with a root-mean-square-%-relative-error (RMS%RE) of 6.4% and 10.8%, respectively. In contrast, the overall RMS%RE values of 32.9% and 40.4% obtained for BP determination in D3710 and MA VHP, respectively, using a traditional simulated distillation method were approximately four times larger than the corresponding RMS%RE of BP prediction using MRA, demonstrating the better predictive ability of MRA. The reported method is rapid, robust, and promising, and can potentially be used routinely for fast analysis, pattern recognition, and analyte BP determination in petrochemical industries. Copyright © 2014 Elsevier B.V. All rights reserved.
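The abstract does not spell out how the root-mean-square percent relative error (RMS%RE) is calculated. One common definition, assumed here, takes the percent relative error of each prediction and then the root mean square over all analytes. The function name and boiling-point values below are hypothetical.

```python
import math

def rms_percent_relative_error(actual, predicted):
    """RMS%RE under the assumed definition sqrt(mean((100 * (pred - actual) / actual)^2))."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    percent_errors = [100.0 * (p - a) / a for a, p in zip(actual, predicted)]
    return math.sqrt(sum(e * e for e in percent_errors) / len(percent_errors))

# Hypothetical boiling points in degrees Celsius (illustrative only).
actual_bp = [100.0, 200.0]
predicted_bp = [110.0, 180.0]
print(rms_percent_relative_error(actual_bp, predicted_bp))
```

Because each error is divided by the actual value, this metric is undefined for an actual value of zero, which matters if boiling points are expressed on a scale that crosses zero.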
Alarcón, Francis; Báez, María E; Bravo, Manuel; Richter, Pablo; Escandar, Graciela M; Olivieri, Alejandro C; Fuentes, Edwar
2013-01-15
The possibility of simultaneously determining seven concerned heavy polycyclic aromatic hydrocarbons (PAHs) of the US-EPA priority pollutant list, in extra virgin olive and sunflower oils was examined using unfolded partial least-squares with residual bilinearization (U-PLS/RBL) and parallel factor analysis (PARAFAC). Both of these methods were applied to fluorescence excitation emission matrices. The compounds studied were benzo[a]anthracene, benzo[b]fluoranthene, benzo[k]fluoranthene, benzo[a]pyrene, dibenz[a,h]anthracene, benzo[g,h,i]perylene and indeno[1,2,3-c,d]-pyrene. The analysis was performed using fluorescence spectroscopy after a microwave assisted liquid-liquid extraction and solid-phase extraction on silica. The U-PLS/RBL algorithm exhibited the best performance for resolving the heavy PAH mixture in the presence of both the highly complex oil matrix and other unpredicted PAHs of the US-EPA list. The obtained limit of detection for the proposed method ranged from 0.07 to 2 μg kg(-1). The predicted U-PLS/RBL concentrations were satisfactorily compared with those obtained using high-performance liquid chromatography with fluorescence detection. A simple analysis with a considerable reduction in time and solvent consumption in comparison with chromatography are the principal advantages of the proposed method. Copyright © 2012 Elsevier B.V. All rights reserved.
Monitoring multiple components in vinegar fermentation using Raman spectroscopy.
Uysal, Reyhan Selin; Soykut, Esra Acar; Boyaci, Ismail Hakki; Topcu, Ali
2013-12-15
In this study, the utility of Raman spectroscopy (RS) with chemometric methods for quantification of multiple components in the fermentation process was investigated. Vinegar, the product of a two stage fermentation, was used as a model and glucose and fructose consumption, ethanol production and consumption and acetic acid production were followed using RS and the partial least squares (PLS) method. Calibration of the PLS method was performed using model solutions. The prediction capability of the method was then investigated with both model and real samples. HPLC was used as a reference method. The results from comparing RS-PLS and HPLC with each other showed good correlations were obtained between predicted and actual sample values for glucose (R(2)=0.973), fructose (R(2)=0.988), ethanol (R(2)=0.996) and acetic acid (R(2)=0.983). In conclusion, a combination of RS with chemometric methods can be applied to monitor multiple components of the fermentation process from start to finish with a single measurement in a short time. Copyright © 2013 Elsevier Ltd. All rights reserved.
Detection of Tetracycline in Milk using NIR Spectroscopy and Partial Least Squares
NASA Astrophysics Data System (ADS)
Wu, Nan; Xu, Chenshan; Yang, Renjie; Ji, Xinning; Liu, Xinyuan; Yang, Fan; Zeng, Ming
2018-02-01
The feasibility of measuring tetracycline in milk was investigated using near infrared (NIR) spectroscopy combined with the partial least squares (PLS) method. The NIR transmittance spectra of 40 pure milk samples and 40 tetracycline-adulterated milk samples with different concentrations (from 0.005 to 40 mg/L) were obtained. The pure milk and tetracycline-adulterated milk samples were assigned to the correct categories with 100% accuracy in the calibration set, and a correct classification rate of 96.3% was obtained in the prediction set. For the quantitation of tetracycline in adulterated milk, the root mean square errors for the calibration and prediction sets were 0.61 mg/L and 4.22 mg/L, respectively. The PLS model fitted the calibration set well; however, its predictive ability was limited, especially for samples with low tetracycline concentration. Overall, this approach can be considered a promising tool for the discrimination of tetracycline-adulterated milk, as a supplement to high performance liquid chromatography.
Bao, Yidan; Kong, Wenwen; Liu, Fei; Qiu, Zhengjun; He, Yong
2012-01-01
Amino acids are quite important indices to indicate the growth status of oilseed rape under herbicide stress. Near infrared (NIR) spectroscopy combined with chemometrics was applied for fast determination of glutamic acid in oilseed rape leaves. The optimal spectral preprocessing method was obtained after comparing Savitzky-Golay smoothing, standard normal variate, multiplicative scatter correction, first and second derivatives, detrending and direct orthogonal signal correction. Linear and nonlinear calibration methods were developed, including partial least squares (PLS) and least squares-support vector machine (LS-SVM). The most effective wavelengths (EWs) were determined by the successive projections algorithm (SPA), and these wavelengths were used as the inputs of PLS and LS-SVM model. The best prediction results were achieved by SPA-LS-SVM (Raw) model with correlation coefficient r = 0.9943 and root mean squares error of prediction (RMSEP) = 0.0569 for prediction set. These results indicated that NIR spectroscopy combined with SPA-LS-SVM was feasible for the fast and effective detection of glutamic acid in oilseed rape leaves. The selected EWs could be used to develop spectral sensors, and the important and basic amino acid data were helpful to study the function mechanism of herbicide. PMID:23203052
NASA Astrophysics Data System (ADS)
Peerbhay, Kabir Yunus; Mutanga, Onisimo; Ismail, Riyad
2013-05-01
Discriminating commercial tree species using hyperspectral remote sensing techniques is critical in monitoring the spatial distributions and compositions of commercial forests. However, issues related to data dimensionality and multicollinearity limit the successful application of the technology. The aim of this study was to examine the utility of the partial least squares discriminant analysis (PLS-DA) technique in accurately classifying six exotic commercial forest species (Eucalyptus grandis, Eucalyptus nitens, Eucalyptus smithii, Pinus patula, Pinus elliotii and Acacia mearnsii) using airborne AISA Eagle hyperspectral imagery (393-900 nm). Additionally, the variable importance in the projection (VIP) method was used to identify subsets of bands that could successfully discriminate the forest species. Results indicated that the PLS-DA model that used all the AISA Eagle bands (n = 230) produced an overall accuracy of 80.61% and a kappa value of 0.77, with user's and producer's accuracies ranging from 50% to 100%. In comparison, incorporating the optimal subset of VIP selected wavebands (n = 78) in the PLS-DA model resulted in an improved overall accuracy of 88.78% and a kappa value of 0.87, with user's and producer's accuracies ranging from 70% to 100%. Bands located predominantly within the visible region of the electromagnetic spectrum (393-723 nm) showed the most capability in terms of discriminating between the six commercial forest species. Overall, the research has demonstrated the potential of using PLS-DA for reducing the dimensionality of hyperspectral datasets as well as determining the optimal subset of bands to produce the highest classification accuracies.
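The accuracy measures reported in this abstract (overall accuracy, kappa, user's and producer's accuracies) are all derived from a confusion matrix. The sketch below shows the standard calculations, assuming the convention that rows hold the classified (map) labels and columns hold the reference labels; the function name and the two-class matrix are hypothetical and unrelated to the six-species result.

```python
def accuracy_report(confusion):
    """Overall accuracy, Cohen's kappa, user's and producer's accuracies.

    confusion[i][j] = number of samples classified as class i whose
    reference class is j (rows = classified, columns = reference).
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    observed = sum(confusion[i][i] for i in range(k)) / total
    # Agreement expected by chance from the marginal totals.
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / total ** 2
    return {
        "overall": observed,
        "kappa": (observed - expected) / (1 - expected),
        "users": [confusion[i][i] / row_totals[i] for i in range(k)],
        "producers": [confusion[i][i] / col_totals[i] for i in range(k)],
    }

# Hypothetical two-class confusion matrix.
print(accuracy_report([[45, 5], [10, 40]]))
```

Kappa discounts the agreement expected from the class proportions alone, which is why it is reported alongside overall accuracy when class sizes are unequal.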
Lakshmi, Karunanidhi Santhana; Lakshmi, Sivasubramanian
2011-03-01
Simultaneous determination of valsartan and hydrochlorothiazide by the H-point standard additions method (HPSAM) and partial least squares (PLS) calibration is described. Absorbances at a pair of wavelengths, 216 and 228 nm, were monitored with the addition of standard solutions of valsartan. Results of applying HPSAM showed that valsartan and hydrochlorothiazide can be determined simultaneously at concentration ratios varying from 20:1 to 1:15 in a mixed sample. The proposed PLS method does not require chemical separation and spectral graphical procedures for quantitative resolution of mixtures containing the titled compounds. The calibration model was based on absorption spectra in the 200-350 nm range for 25 different mixtures of valsartan and hydrochlorothiazide. Calibration matrices contained 0.5-3 μg mL⁻¹ of both valsartan and hydrochlorothiazide. The standard error of prediction (SEP) for valsartan and hydrochlorothiazide was 0.020 and 0.038 μg mL⁻¹, respectively. Both proposed methods were successfully applied to the determination of valsartan and hydrochlorothiazide in several synthetic and real matrix samples.
Elkhoudary, Mahmoud M; Abdel Salam, Randa A; Hadad, Ghada M
2014-09-15
Metronidazole (MNZ) is a widely used antibacterial and amoebicide drug. Therefore, it is important to develop a rapid and specific analytical method for the determination of MNZ in mixture with Spiramycin (SPY), Diloxanide (DIX) and Cliquinol (CLQ) in pharmaceutical preparations. This work describes six simple, sensitive and reliable multivariate calibration methods, namely linear and nonlinear artificial neural networks preceded by genetic algorithm (GA-ANN) and principal component analysis (PCA-ANN), as well as partial least squares (PLS) either alone or preceded by genetic algorithm (GA-PLS), for UV spectrophotometric determination of MNZ, SPY, DIX and CLQ in pharmaceutical preparations with no interference from pharmaceutical additives. The results highlight the problem of nonlinearity and how models like ANN can handle it. The analytical performance of these methods was statistically validated with respect to linearity, accuracy, precision and specificity. The developed methods demonstrate the ability of the aforementioned multivariate calibration models to resolve the UV spectra of the four-component mixtures using a simple and widely available UV spectrophotometer. Copyright © 2014 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Elkhoudary, Mahmoud M.; Abdel Salam, Randa A.; Hadad, Ghada M.
2014-09-01
Metronidazole (MNZ) is a widely used antibacterial and amoebicide drug. Therefore, it is important to develop a rapid and specific analytical method for the determination of MNZ in mixture with Spiramycin (SPY), Diloxanide (DIX) and Cliquinol (CLQ) in pharmaceutical preparations. This work describes six simple, sensitive and reliable multivariate calibration methods, namely linear and nonlinear artificial neural networks preceded by genetic algorithm (GA-ANN) and principal component analysis (PCA-ANN), as well as partial least squares (PLS) either alone or preceded by genetic algorithm (GA-PLS), for UV spectrophotometric determination of MNZ, SPY, DIX and CLQ in pharmaceutical preparations with no interference from pharmaceutical additives. The results reveal the problem of nonlinearity and show how models like ANN can handle it. Analytical performance of these methods was statistically validated with respect to linearity, accuracy, precision and specificity. The developed methods demonstrate the ability of these multivariate calibration models to resolve UV spectra of the four-component mixtures using a simple and widely available UV spectrophotometer.
Statistical process control of cocrystallization processes: A comparison between OPLS and PLS.
Silva, Ana F T; Sarraguça, Mafalda Cruz; Ribeiro, Paulo R; Santos, Adenilson O; De Beer, Thomas; Lopes, João Almeida
2017-03-30
Orthogonal partial least squares regression (OPLS) is being increasingly adopted as an alternative to partial least squares (PLS) regression due to the better generalization that can be achieved. Particularly in multivariate batch statistical process control (BSPC), the use of OPLS for estimating nominal trajectories is advantageous. In OPLS, the nominal process trajectories are expected to be captured in a single predictive principal component while uncorrelated variations are filtered out to orthogonal principal components. In theory, OPLS will yield a better estimation of the Hotelling's T² statistic and corresponding control limits, thus lowering the number of false positives and false negatives when assessing process disturbances. Although the advantages of OPLS have been demonstrated in the context of regression, its use in BSPC has seldom been reported. This study proposes an OPLS-based approach for BSPC of a cocrystallization process between hydrochlorothiazide and p-aminobenzoic acid monitored on-line with near infrared spectroscopy and compares the fault detection performance with the same approach based on PLS. A series of cocrystallization batches with imposed disturbances were used to test the ability of OPLS- and PLS-based BSPC methods to detect abnormal situations. Results demonstrated that OPLS was generally superior in terms of sensitivity and specificity in most situations. In some abnormal batches, the imposed disturbances were detected only with OPLS. Copyright © 2017 Elsevier B.V. All rights reserved.
[NIR Assignment of Magnolol by 2D-COS Technology and Model Application Huoxiangzhengqi Oral Liquid].
Pei, Yan-ling; Wu, Zhi-sheng; Shi, Xin-yuan; Pan, Xiao-ning; Peng, Yan-fang; Qiao, Yan-jiang
2015-08-01
Near infrared (NIR) spectral assignment of Magnolol was performed using deuterated chloroform as the solvent and two-dimensional correlation spectroscopy (2D-COS). According to the synchronous spectra of deuterated chloroform and Magnolol, 1365~1455, 1600~1720, 2000~2181 and 2275~2465 nm were the characteristic absorption bands of Magnolol. In relation to the structure of Magnolol, 1440 nm was assigned to the stretching vibration of the phenolic O-H group; 1679 nm to the stretching vibration of aryl and of methyl connected to aryl; 2117, 2304, 2339 and 2370 nm to the combination of stretching, bending and deformation vibrations of aryl C-H; and 2445 nm to the bending vibration of methyl linked to the aryl group. These bands are attributed to the characteristics of Magnolol. Huoxiangzhengqi Oral Liquid was adopted to study Magnolol: the characteristic bands from spectral assignment and the bands selected by interval partial least squares (iPLS) and synergy interval partial least squares (SiPLS) were used to establish partial least squares (PLS) quantitative models. The coefficients of determination Rcal² and Rpre² were greater than 0.99, and the root mean square error of calibration (RMSEC), root mean square error of cross-validation (RMSECV) and root mean square error of prediction (RMSEP) were very small. This indicated that the characteristic bands from spectral assignment give the same results as the chemometric band selection in the PLS model. The work provides a reference for NIR spectral assignment of chemical components in Chinese Materia Medica, and the band filters of NIR were interpreted.
2013-01-01
Background A major hindrance to the development of high yielding biofuel feedstocks is the ability to rapidly assess large populations for fermentable sugar yields. Whilst recent advances have outlined methods for the rapid assessment of biomass saccharification efficiency, none take into account the total biomass, or the soluble sugar fraction of the plant. Here we present a holistic high-throughput methodology for assessing sweet Sorghum bicolor feedstocks at 10 days post-anthesis for total fermentable sugar yields including stalk biomass, soluble sugar concentrations, and cell wall saccharification efficiency. Results A mathematical method for assessing whole S. bicolor stalks using the fourth internode from the base of the plant proved to be an effective high-throughput strategy for assessing stalk biomass, soluble sugar concentrations, and cell wall composition and allowed calculation of total stalk fermentable sugars. A high-throughput method for measuring soluble sucrose, glucose, and fructose using partial least squares (PLS) modelling of juice Fourier transform infrared (FTIR) spectra was developed. The PLS prediction was shown to be highly accurate with each sugar attaining a coefficient of determination (R²) of 0.99 with a root mean squared error of prediction (RMSEP) of 11.93, 5.52, and 3.23 mM for sucrose, glucose, and fructose, respectively, which constitutes an error of <4% in each case. The sugar PLS model correlated well with gas chromatography–mass spectrometry (GC-MS) and Brix measures. Similarly, a high-throughput method for predicting enzymatic cell wall digestibility using PLS modelling of FTIR spectra obtained from S. bicolor bagasse was developed. The PLS prediction was shown to be accurate with an R² of 0.94 and RMSEP of 0.64 μg mgDW⁻¹ h⁻¹.
Conclusions This methodology has been demonstrated as an efficient and effective way to screen large biofuel feedstock populations for biomass, soluble sugar concentrations, and cell wall digestibility simultaneously allowing a total fermentable yield calculation. It unifies and simplifies previous screening methodologies to produce a holistic assessment of biofuel feedstock potential. PMID:24365407
Zhang, Xue-Xi; Yin, Jian-Hua; Mao, Zhi-Hua; Xia, Yang
2015-01-01
Fourier transform infrared imaging (FTIRI) combined with chemometric algorithms has strong potential to obtain complex chemical information from biological tissues. FTIRI and partial least squares-discriminant analysis (PLS-DA) were used to differentiate healthy and osteoarthritic (OA) cartilages for the first time. A PLS model was built on the calibration matrix of spectra that was randomly selected from the FTIRI spectral datasets of healthy and lesioned cartilage. Leave-one-out cross-validation was performed in the PLS model, and the fitting coefficient between actual and predicted categorical values of the calibration matrix reached 0.95. In the calibration and prediction matrices, the correct identification rates for healthy and lesioned cartilage spectra were 100% and 90.24%, respectively. These results demonstrated that FTIRI combined with PLS-DA could provide a promising approach for the categorical identification of healthy and OA cartilage specimens. PMID:26057029
Zhang, Xue-Xi; Yin, Jian-Hua; Mao, Zhi-Hua; Xia, Yang
2015-06-01
Fourier transform infrared imaging (FTIRI) combined with chemometric algorithms has strong potential to obtain complex chemical information from biological tissues. FTIRI and partial least squares-discriminant analysis (PLS-DA) were used to differentiate healthy and osteoarthritic (OA) cartilages for the first time. A PLS model was built on the calibration matrix of spectra that was randomly selected from the FTIRI spectral datasets of healthy and lesioned cartilage. Leave-one-out cross-validation was performed in the PLS model, and the fitting coefficient between actual and predicted categorical values of the calibration matrix reached 0.95. In the calibration and prediction matrices, the correct identification rates for healthy and lesioned cartilage spectra were 100% and 90.24%, respectively. These results demonstrated that FTIRI combined with PLS-DA could provide a promising approach for the categorical identification of healthy and OA cartilage specimens.
NASA Astrophysics Data System (ADS)
Luna, Aderval S.; Gonzaga, Fabiano B.; da Rocha, Werickson F. C.; Lima, Igor C. A.
2018-01-01
Laser-induced breakdown spectroscopy (LIBS) analysis was carried out on eleven steel samples to quantify the concentrations of chromium, nickel, and manganese. LIBS spectral data were correlated to known concentrations of the samples using different strategies in partial least squares (PLS) regression models. For the PLS analysis, one predictive model was separately generated for each element, while different approaches were used for the selection of variables (VIP: variable importance in projection and iPLS: interval partial least squares) in the PLS model to quantify the contents of the elements. The comparison of the performance of the models showed that there was no significant statistical difference using the Wilcoxon signed rank test. The elliptical joint confidence region (EJCR) did not detect systematic errors in these proposed methodologies for each metal.
Rodríguez-Entrena, Macario; Schuberth, Florian; Gelhard, Carsten
2018-01-01
Structural equation modeling using partial least squares (PLS-SEM) has become a mainstream modeling approach in various disciplines. Nevertheless, prior literature still lacks practical guidance on how to properly test for differences between parameter estimates. Whereas existing techniques, such as parametric and non-parametric approaches in PLS multi-group analysis, only allow assessing differences between parameters that are estimated for different subpopulations, the study at hand introduces a technique that also allows assessing whether two parameter estimates derived from the same sample are statistically different. To illustrate this advancement to PLS-SEM, we refer in particular to a reduced version of the well-established technology acceptance model.
NASA Astrophysics Data System (ADS)
Ahmed, Shamim; Miorelli, Roberto; Calmon, Pierre; Anselmi, Nicola; Salucci, Marco
2018-04-01
This paper describes a Learning-By-Examples (LBE) technique for performing quasi-real-time flaw localization and characterization within a conductive tube based on Eddy Current Testing (ECT) signals. Within the framework of LBE, a combination of full-factorial (i.e., GRID) sampling and Partial Least Squares (PLS) feature extraction (i.e., GRID-PLS) is applied to generate a suitable training set in the offline phase. Support Vector Regression (SVR) is utilized for model development and inversion during the offline and online phases, respectively. The performance and robustness of the proposed GRID-PLS/SVR strategy on a noisy test set are evaluated and compared with the standard GRID/SVR approach.
Niazi, Ali; Khorshidi, Neda; Ghaemmaghami, Pegah
2015-01-25
In this study, an analytical procedure based on microwave-assisted dispersive liquid-liquid microextraction (MA-DLLME) and spectrophotometry coupled with chemometric methods is proposed to determine uranium. In the proposed method, 4-(2-pyridylazo) resorcinol (PAR) is used as a chelating agent, and chloroform and ethanol are selected as extraction and dispersive solvents, respectively. The optimization strategy is carried out using two-level full factorial designs. Results of the two-level full factorial design (2⁴), based on an analysis of variance, demonstrated that the pH, concentration of PAR, and amounts of dispersive and extraction solvents are statistically significant. Optimal conditions for three variables: pH, concentration of PAR, amount of dispersive and extraction solvents are obtained by using a Box-Behnken design. Under the optimum conditions, the calibration graphs are linear in the range of 20.0-350.0 ng mL⁻¹ with a detection limit of 6.7 ng mL⁻¹ (3δB/slope), and the enrichment factor of this method for uranium reached 135. The relative standard deviation (R.S.D.) is 1.64% (n=7, c=50 ng mL⁻¹). Partial least squares (PLS) modeling was used for multivariate calibration of the spectrophotometric data. Orthogonal signal correction (OSC) was used for preprocessing of data matrices, and the prediction results of the model, with and without OSC, were statistically compared. The MA-DLLME-OSC-PLS method is presented for the first time in this study. The root mean square errors of prediction (RMSEP) for uranium determination using PLS and OSC-PLS models were 4.63 and 0.98, respectively. This procedure allows the reliable determination of uranium in synthetic and real samples such as waste water. Copyright © 2014. Published by Elsevier B.V.
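The detection-limit expression quoted in this abstract (3δB/slope) can be illustrated with a short sketch. It assumes that δB denotes the standard deviation of replicate blank signals and that the slope comes from a linear calibration fit; the blank readings and calibration points below are hypothetical and are not data from the paper.

```python
import statistics


def calibration_slope(conc, signal):
    """Least-squares slope of signal versus concentration."""
    mean_c = statistics.fmean(conc)
    mean_s = statistics.fmean(signal)
    sxx = sum((c - mean_c) ** 2 for c in conc)
    sxy = sum((c - mean_c) * (s - mean_s) for c, s in zip(conc, signal))
    return sxy / sxx


def detection_limit(blank_signals, slope, k=3.0):
    """k * (sample standard deviation of the blank) / slope."""
    return k * statistics.stdev(blank_signals) / slope


# Hypothetical calibration points (concentration in ng/mL, absorbance).
conc = [20.0, 100.0, 200.0, 350.0]
signal = [0.010 + 0.002 * c for c in conc]

# Hypothetical replicate blank readings.
blanks = [0.010, 0.012, 0.008, 0.011, 0.009]

slope = calibration_slope(conc, signal)
lod = detection_limit(blanks, slope)
print(round(slope, 6), round(lod, 3))
```

The result is expressed in the concentration units of the calibration axis, because the blank standard deviation (in signal units) is divided by the sensitivity (signal per unit concentration).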
Jović, Ozren
2016-12-15
A novel method for quantitative prediction and variable selection on spectroscopic data, called Durbin-Watson partial least-squares regression (dwPLS), is proposed in this paper. The idea is to inspect serial correlation in infrared data that is known to consist of highly correlated neighbouring variables. The method selects only those variables whose intervals have a lower Durbin-Watson statistic (dw) than a certain optimal cutoff. For each interval, dw is calculated on a vector of regression coefficients. Adulteration of cold-pressed linseed oil (L), a well-known nutrient beneficial to health, is studied in this work by mixing it with cheaper oils: rapeseed oil (R), sesame oil (Se) and sunflower oil (Su). The samples for each botanical origin of oil vary with respect to producer, content and geographic origin. The results obtained indicate that MIR-ATR combined with dwPLS could be applied to the quantitative determination of edible-oil adulteration. Copyright © 2016 Elsevier Ltd. All rights reserved.
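The interval-selection idea in this abstract can be sketched as follows. The Durbin-Watson statistic is computed in its standard form, the sum of squared successive differences divided by the sum of squares, on each window of regression coefficients. The use of non-overlapping windows, the window width and the cutoff are assumptions made for illustration only; the abstract does not specify how the paper chooses them.

```python
def durbin_watson(values):
    """Sum of squared successive differences over the sum of squares."""
    num = sum((b - a) ** 2 for a, b in zip(values, values[1:]))
    den = sum(v ** 2 for v in values)
    # An all-zero window carries no information; treat it as never selected.
    return num / den if den > 0 else float("inf")


def select_intervals(coefs, width, cutoff):
    """Return (start, stop) index pairs of windows whose dw is below cutoff."""
    selected = []
    for start in range(0, len(coefs) - width + 1, width):
        window = coefs[start:start + width]
        if durbin_watson(window) < cutoff:
            selected.append((start, start + width))
    return selected


# Hypothetical regression coefficients: a smoothly varying region followed
# by a region that alternates in sign from one variable to the next.
smooth = [1.0, 1.1, 1.2, 1.3]
noisy = [1.0, -1.0, 1.0, -1.0]
coefs = smooth + noisy

print(durbin_watson(smooth), durbin_watson(noisy))
print(select_intervals(coefs, width=4, cutoff=1.0))
```

A low dw means neighbouring coefficients change slowly, which is what one would expect from a genuine spectral band; a high dw indicates coefficients that oscillate between adjacent variables.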
Lim, Jongguk; Kim, Giyoung; Mo, Changyeun; Oh, Kyoungmin; Yoo, Hyeonchae; Ham, Hyeonheui; Kim, Moon S.
2017-01-01
The purpose of this study is to use near-infrared reflectance (NIR) spectroscopy equipment to nondestructively and rapidly discriminate Fusarium-infected hulled barley. Both normal hulled barley and Fusarium-infected hulled barley were scanned by using a NIR spectrometer with a wavelength range of 1175 to 2170 nm. Multiple mathematical pretreatments were applied to the reflectance spectra obtained for Fusarium discrimination and the multivariate analysis method of partial least squares discriminant analysis (PLS-DA) was used for discriminant prediction. The PLS-DA prediction model developed by applying the second-order derivative pretreatment to the reflectance spectra obtained from the side of hulled barley without crease achieved 100% accuracy in discriminating the normal hulled barley and the Fusarium-infected hulled barley. These results demonstrated the feasibility of rapid discrimination of the Fusarium-infected hulled barley by combining multivariate analysis with the NIR spectroscopic technique, which is utilized as a nondestructive detection method. PMID:28974012
Liu, Changhong; Liu, Wei; Chen, Wei; Yang, Jianbo; Zheng, Lei
2015-04-15
Tomato is an important health-stimulating fruit because of the antioxidant properties of its main bioactive compounds, dominantly lycopene and phenolic compounds. Nowadays, product differentiation in the fruit market requires an accurate evaluation of these value-added compounds. An experiment was conducted to simultaneously and non-destructively measure lycopene and phenolic compounds content in intact tomatoes using multispectral imaging combined with chemometric methods. Partial least squares (PLS), least squares-support vector machines (LS-SVM) and back propagation neural network (BPNN) were applied to develop quantitative models. Compared with PLS and LS-SVM, BPNN model considerably improved the performance with coefficient of determination in prediction (RP(2))=0.938 and 0.965, residual predictive deviation (RPD)=4.590 and 9.335 for lycopene and total phenolics content prediction, respectively. It is concluded that multispectral imaging is an attractive alternative to the standard methods for determination of bioactive compounds content in intact tomatoes, providing a useful platform for infield fruit sorting/grading. Copyright © 2014 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Qiu, Peng; D'Souza, Warren D.; McAvoy, Thomas J.; Liu, K. J. Ray
2007-09-01
Tumor motion induced by respiration presents a challenge to the reliable delivery of conformal radiation treatments. Real-time motion compensation represents the technologically most challenging clinical solution but has the potential to overcome the limitations of existing methods. The performance of a real-time couch-based motion compensation system is mainly dependent on two aspects: the ability to infer the internal anatomical position and the performance of the feedback control system. In this paper, we propose two novel methods for the two aspects respectively, and then combine the proposed methods into one system. To accurately estimate the internal tumor position, we present partial-least squares (PLS) regression to predict the position of the diaphragm using skin-based motion surrogates. Four radio-opaque markers were placed on the abdomen of patients who underwent fluoroscopic imaging of the diaphragm. The coordinates of the markers served as input variables and the position of the diaphragm served as the output variable. PLS resulted in lower prediction errors compared with standard multiple linear regression (MLR). The performance of the feedback control system depends on the system dynamics and dead time (delay between the initiation and execution of the control action). While the dynamics of the system can be inverted in a feedback control system, the dead time cannot be inverted. To overcome the dead time of the system, we propose a predictive feedback control system by incorporating forward prediction using least-mean-square (LMS) and recursive least square (RLS) filtering into the couch-based control system. Motion data were obtained using a skin-based marker. The proposed predictive feedback control system was benchmarked against pure feedback control (no forward prediction) and resulted in a significant performance gain. 
Finally, we combined the PLS inference model and the predictive feedback control to evaluate the overall performance of the feedback control system. Our results show that, with the tumor motion unknown but inferred by skin-based markers through the PLS model, the predictive feedback control system was able to effectively compensate intra-fraction motion.
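The forward-prediction step described in this abstract can be sketched with a plain least-mean-square (LMS) filter that predicts a sample a fixed number of steps ahead from a short window of past samples. The filter order, step size, horizon and test signal below are illustrative assumptions, not parameters from the paper.

```python
def lms_predict(signal, order=3, horizon=1, mu=0.1):
    """Predict signal[t + horizon] from the last `order` samples with LMS.

    Returns the list of predictions and the final filter weights.
    This is an offline sketch: the weight update at step t uses
    signal[t + horizon], which a real-time system would only be able to
    apply once that sample has been observed.
    """
    weights = [0.0] * order
    predictions = []
    for t in range(order - 1, len(signal) - horizon):
        window = signal[t - order + 1:t + 1]
        y_hat = sum(w * x for w, x in zip(weights, window))
        error = signal[t + horizon] - y_hat
        # Standard LMS update: move the weights along error * input.
        weights = [w + mu * error * x for w, x in zip(weights, window)]
        predictions.append(y_hat)
    return predictions, weights


# Hypothetical input: a constant position trace, the simplest signal on
# which convergence of the predictor can be checked by hand.
trace = [1.0] * 60
predictions, weights = lms_predict(trace, order=3, horizon=2, mu=0.1)
print(len(predictions), round(predictions[-1], 6))
```

For this constant input the prediction error shrinks by a factor of 1 - mu * order (here 0.7) at every step, so the prediction approaches the true value geometrically. Respiratory traces are not constant, and the step size would need tuning against the signal power to keep the filter stable.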
ERIC Educational Resources Information Center
Pierce, Karisa M.; Schale, Stephen P.; Le, Trang M.; Larson, Joel C.
2011-01-01
We present a laboratory experiment for an advanced analytical chemistry course where we first focus on the chemometric technique partial least-squares (PLS) analysis applied to one-dimensional (1D) total-ion-current gas chromatography-mass spectrometry (GC-TIC) separations of biodiesel blends. Then, we focus on n-way PLS (n-PLS) applied to…
Umesh P. Agarwal; Richard S. Reiner; Sally A. Ralph
2010-01-01
Two new methods based on FT-Raman spectroscopy, one simple, based on band intensity ratio, and the other using a partial least squares (PLS) regression model, are proposed to determine cellulose I crystallinity. In the simple method, crystallinity in cellulose I samples was determined based on univariate regression that was first developed using the Raman band...
Kernel analysis of partial least squares (PLS) regression models.
Shinzawa, Hideyuki; Ritthiruangdej, Pitiporn; Ozaki, Yukihiro
2011-05-01
An analytical technique based on kernel matrix representation is demonstrated to provide further chemically meaningful insight into partial least squares (PLS) regression models. The kernel matrix condenses essential information about scores derived from PLS or principal component analysis (PCA). Thus, it becomes possible to establish the proper interpretation of the scores. A PLS model for the total nitrogen (TN) content in multiple Thai fish sauces is built with a set of near-infrared (NIR) transmittance spectra of the fish sauce samples. The kernel analysis of the scores effectively reveals that the variation of the spectral feature induced by the change in protein content is substantially associated with the total water content and the protein hydration. Kernel analysis is also carried out on a set of time-dependent infrared (IR) spectra representing transient evaporation of ethanol from a binary mixture solution of ethanol and oleic acid. A PLS model to predict the elapsed time is built with the IR spectra and the kernel matrix is derived from the scores. The detailed analysis of the kernel matrix provides penetrating insight into the interaction between the ethanol and the oleic acid.
2012-01-01
Background Decision-making in healthcare is complex. Research on coverage decision-making has focused on comparative studies for several countries, statistical analyses for single decision-makers, the decision outcome and appraisal criteria. Accounting for decision processes extends the complexity, as they are multidimensional and process elements need to be regarded as latent constructs (composites) that are not observed directly. The objective of this study was to present a practical application of partial least square path modelling (PLS-PM) to evaluate how it offers a method for empirical analysis of decision-making in healthcare. Methods Empirical approaches that applied PLS-PM to decision-making in healthcare were identified through a systematic literature search. PLS-PM was used as an estimation technique for a structural equation model that specified hypotheses between the components of decision processes and the reasonableness of decision-making in terms of medical, economic and other ethical criteria. The model was estimated for a sample of 55 coverage decisions on the extension of newborn screening programmes in Europe. Results were evaluated by standard reliability and validity measures for PLS-PM. Results After modification by dropping two indicators that showed poor measures in the measurement models’ quality assessment and were not meaningful for newborn screening, the structural equation model estimation produced plausible results. The presence of three influences was supported: the links between both stakeholder participation or transparency and the reasonableness of decision-making; and the effect of transparency on the degree of scientific rigour of assessment. Reliable and valid measurement models were obtained to describe the composites of ‘transparency’, ‘participation’, ‘scientific rigour’ and ‘reasonableness’. Conclusions The structural equation model was among the first applications of PLS-PM to coverage decision-making. 
It allowed testing of hypotheses in situations where there are links between several non-observable constructs. PLS-PM was compatible in accounting for the complexity of coverage decisions to obtain a more realistic perspective for empirical analysis. The model specification can be used for hypothesis testing by using larger sample sizes and for data in the full domain of health technologies. PMID:22856325
Detection of Genetically Modified Sugarcane by Using Terahertz Spectroscopy and Chemometrics
NASA Astrophysics Data System (ADS)
Liu, J.; Xie, H.; Zha, B.; Ding, W.; Luo, J.; Hu, C.
2018-03-01
A methodology is proposed to identify genetically modified sugarcane from non-genetically modified sugarcane by using terahertz spectroscopy and chemometrics techniques, including linear discriminant analysis (LDA), support vector machine-discriminant analysis (SVM-DA), and partial least squares-discriminant analysis (PLS-DA). The classification rate of the above mentioned methods is compared, and different types of preprocessing are considered. According to the experimental results, the best option is PLS-DA, with an identification rate of 98%. The results indicated that THz spectroscopy and chemometrics techniques are a powerful tool to identify genetically modified and non-genetically modified sugarcane.
NASA Astrophysics Data System (ADS)
Mi, Jiaping; Li, Yuanqian; Zhou, Xiaoli; Zheng, Bo; Zhou, Ying
2006-01-01
A flow injection-CCD diode array detection spectrophotometric method with a partial least squares (PLS) program for the simultaneous determination of iron, copper and cobalt in food samples has been established. The method was based on the chromogenic reaction of the three metal ions with 2-(5-bromo-2-pyridylazo)-5-diethylaminophenol (5-Br-PADAP) in acetic acid-sodium acetate buffer solution (pH 5) with Triton X-100 and ascorbic acid. The overlapped spectra of the colored complexes were collected by a charge-coupled device (CCD)-diode array detector, and the multi-wavelength absorbance data were processed using the partial least squares (PLS) algorithm. Optimum reaction conditions and parameters of flow injection analysis were investigated. Samples of tea, sesame, laver, millet, cornmeal, mung bean and soybean powder were analyzed by the proposed method. The average recoveries of spiked samples were 91.80%~100.9% for iron, 92.50%~108.0% for copper and 93.00%~110.5% for cobalt, with relative standard deviations (R.S.D.) of 1.1%~12.1%. The sampling rate is 45 samples h⁻¹. The results for the food samples obtained by the proposed method were in good agreement with those obtained by ICP-AES.
Quantitative analysis of red wine tannins using Fourier-transform mid-infrared spectrometry.
Fernandez, Katherina; Agosin, Eduardo
2007-09-05
Tannin content and composition are critical quality components of red wines. No spectroscopic method assessing these phenols in wine has been described so far. We report here a new method using Fourier transform mid-infrared (FT-MIR) spectroscopy and chemometric techniques for the quantitative analysis of red wine tannins. Calibration models were developed using protein precipitation and phloroglucinolysis as analytical reference methods. After spectra preprocessing, six different predictive partial least-squares (PLS) models were evaluated, including the use of interval selection procedures such as iPLS and CSMWPLS. PLS regression with full-range (650-4000 cm(-1)), second derivative of the spectra and phloroglucinolysis as the reference method gave the most accurate determination for tannin concentration (RMSEC = 2.6%, RMSEP = 9.4%, r = 0.995). The prediction of the mean degree of polymerization (mDP) of the tannins also gave a reasonable prediction (RMSEC = 6.7%, RMSEP = 10.3%, r = 0.958). These results represent the first step in the development of a spectroscopic methodology for the quantification of several phenolic compounds that are critical for wine quality.
Determination of cellulose I crystallinity by FT-Raman spectroscopy
Umesh P. Agarwal; Richard S. Reiner; Sally A. Ralph
2009-01-01
Two new methods based on FT-Raman spectroscopy, one simple, based on band intensity ratio, and the other, using a partial least-squares (PLS) regression model, are proposed to determine cellulose I crystallinity. In the simple method, crystallinity in semicrystalline cellulose I samples was determined based on univariate regression that was first developed using the...
Prediction of ethanol in bottled Chinese rice wine by NIR spectroscopy
NASA Astrophysics Data System (ADS)
Ying, Yibin; Yu, Haiyan; Pan, Xingxiang; Lin, Tao
2006-10-01
To evaluate the applicability of non-invasive visible and near infrared (VIS-NIR) spectroscopy for determining ethanol concentration of Chinese rice wine in square brown glass bottle, transmission spectra of 100 bottled Chinese rice wine samples were collected in the spectral range of 350-1200 nm. Statistical equations were established between the reference data and VIS-NIR spectra by partial least squares (PLS) regression method. Performance of three kinds of mathematical treatment of spectra (original spectra, first derivative spectra and second derivative spectra) were also discussed. The PLS models of original spectra turned out better results, with higher correlation coefficient in calibration (R cal) of 0.89, lower root mean standard error of calibration (RMSEC) of 0.165, and lower root mean standard error of cross validation (RMSECV) of 0.179. Using original spectra, PLS models for ethanol concentration prediction were developed. The R cal and the correlation coefficient in validation (R val) were 0.928 and 0.875, respectively; and the RMSEC and the root mean standard error of validation (RMSEP) were 0.135 (%, v v -1) and 0.177 (%, v v -1), respectively. The results demonstrated that VIS-NIR spectroscopy could be used to predict ethanol concentration in bottled Chinese rice wine.
Gorre, Elsa; Owens, Kevin G
2016-11-01
In this work an attenuated total reflection Fourier transform infrared (FT-IR) absorption based method is used to measure the solubility of two matrix-assisted laser desorption-ionization (MALDI) matrices in a few pure solvents and mixtures of acetonitrile and water using low microliter amounts of solution. Results from a method that averages the values obtained from multiple calibration curves created by manual peak picking are compared to those predicted using a partial least squares (PLS) chemometrics approach. The PLS method provided solubility values that were in good agreement with the manual method with significantly greater ease of analysis. As a test, the solubility of adipic acid in acetone was measured using the two methods of analysis, and the values are in good agreement with solubility values reported in literature. The solubilities of the MALDI matrices α-cyano-4-hydroxy cinnamic acid (CHCA) and sinapinic acid (SA) were measured in a series of mixtures made from acetonitrile (ACN) and water; surprisingly, the results show a highly nonlinear trend. While both CHCA and SA show solubility values of less than 10 mg/mL in the pure solvents, the solubility value for SA increases to 56.3 mg/mL in a 75:25 v/v ACN:water mixture. This can have a significant effect on the matrix-to-analyte ratios in the MALDI experiment when sample protocols call for preparation of a saturated solution of the matrix in the chosen solvent system. © The Author(s) 2016.
NASA Astrophysics Data System (ADS)
Attia, Khalid A. M.; Nassar, Mohammed W. I.; El-Zeiny, Mohamed B.; Serag, Ahmed
2016-03-01
Different chemometric models were applied for the quantitative analysis of amoxicillin (AMX), and flucloxacillin (FLX) in their binary mixtures, namely, partial least squares (PLS), spectral residual augmented classical least squares (SRACLS), concentration residual augmented classical least squares (CRACLS) and artificial neural networks (ANNs). All methods were applied with and without variable selection procedure (genetic algorithm GA). The methods were used for the quantitative analysis of the drugs in laboratory prepared mixtures and real market sample via handling the UV spectral data. Robust and simpler models were obtained by applying GA. The proposed methods were found to be rapid, simple and required no preliminary separation steps.
NASA Astrophysics Data System (ADS)
Ying, Yibin; Liu, Yande; Fu, Xiaping; Lu, Huishan
2005-11-01
Artificial neural networks (ANNs) have been used successfully in applications such as pattern recognition, image processing, automation and control. However, the majority of today's ANN applications use the back-propagation feed-forward ANN (BP-ANN). In this paper, back-propagation artificial neural networks (BP-ANN) were applied to model the soluble solid content (SSC) of intact pears from their Fourier transform near infrared (FT-NIR) spectra. One hundred and sixty-four pear samples were used to build the calibration models and evaluate the models' predictive ability. The results are compared to the classical calibration approaches, i.e. principal component regression (PCR), partial least squares (PLS) and non-linear PLS (NPLS). The effects of the methods used to optimize the training parameters on the prediction model were also investigated. BP-ANN combined with principal component regression (PCR) always performed better than the classical PCR, PLS and Weight-PLS methods in terms of predictive ability. Based on the results, it can be concluded that FT-NIR spectroscopy and BP-ANN models can be properly employed for rapid and nondestructive determination of fruit internal quality.
Aleixandre-Tudo, José Luis; Nieuwoudt, Helené; Aleixandre, José Luis; Du Toit, Wessel J
2015-02-04
The validation of ultraviolet-visible (UV-vis) spectroscopy combined with partial least-squares (PLS) regression to quantify red wine tannins is reported. The methylcellulose precipitable (MCP) tannin assay and the bovine serum albumin (BSA) tannin assay were used as reference methods. To take the high variability of wine tannins into account when the calibration models were built, a diverse data set was collected from samples of South African red wines that consisted of 18 different cultivars, from regions spanning the wine grape-growing areas of South Africa with their various sites, climates, and soils, ranging in vintage from 2000 to 2012. A total of 240 wine samples were analyzed, and these were divided into a calibration set (n = 120) and a validation set (n = 120) to evaluate the predictive ability of the models. To test the robustness of the PLS calibration models, the predictive ability of the classifying variables cultivar, vintage year, and experimental versus commercial wines was also tested. In general, the statistics obtained when BSA was used as a reference method were slightly better than those obtained with MCP. Despite this, the MCP tannin assay should also be considered as a valid reference method for developing PLS calibrations. The best calibration statistics for the prediction of new samples were coefficient of correlation (R²val) = 0.89, root mean standard error of prediction (RMSEP) = 0.16, and residual predictive deviation (RPD) = 3.49 for MCP and R²val = 0.93, RMSEP = 0.08, and RPD = 4.07 for BSA, when only the UV region (260-310 nm) was selected, which also led to a faster analysis time. In addition, a difference in the results obtained when the predictive ability of the classifying variables vintage, cultivar, or commercial versus experimental wines was studied suggests that tannin composition is highly affected by many factors. This study also discusses the correlations in tannin values between the methylcellulose and protein precipitation methods.
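The validation statistics quoted in this record (R²val, RMSEP and RPD) can be computed from paired reference and predicted values. The following is a minimal sketch, not the authors' procedure: the function name is hypothetical, R² is taken as 1 - SSres/SStot (some software reports the squared correlation instead), and RPD is taken as the sample standard deviation of the reference values divided by RMSEP.

```python
import math


def validation_stats(reference, predicted):
    """Return (r2, rmsep, rpd) for paired reference and predicted values."""
    n = len(reference)
    mean_ref = sum(reference) / n
    # Residual and total sums of squares over the validation samples.
    ss_res = sum((y - p) ** 2 for y, p in zip(reference, predicted))
    ss_tot = sum((y - mean_ref) ** 2 for y in reference)
    r2 = 1.0 - ss_res / ss_tot           # assumed definition of R^2
    rmsep = math.sqrt(ss_res / n)        # root mean square error of prediction
    sd_ref = math.sqrt(ss_tot / (n - 1)) # sample standard deviation of reference
    rpd = sd_ref / rmsep                 # residual predictive deviation
    return r2, rmsep, rpd


# Illustrative values only, not data from the study.
r2, rmsep, rpd = validation_stats([1, 2, 3, 4, 5], [1.1, 1.9, 3.2, 3.8, 5.0])
print(round(r2, 3), round(rmsep, 3), round(rpd, 2))
```

Because RPD scales the prediction error by the spread of the reference values, it allows models built against assays with different units or ranges (here MCP and BSA) to be compared on a common footing.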
Lascola, Robert; O'Rourke, Patrick E.; Kyser, Edward A.
2017-10-05
Here, we have developed a piecewise local (PL) partial least squares (PLS) analysis method for total plutonium measurements by absorption spectroscopy in nitric acid-based nuclear material processing streams. Instead of using a single PLS model that covers all expected solution conditions, the method selects one of several local models based on an assessment of solution absorbance, acidity, and Pu oxidation state distribution. The local models match the global model for accuracy against the calibration set, but were observed in several instances to be more robust to variations associated with measurements in the process. The improvements are attributed to the relative parsimony of the local models. Not all of the sources of spectral variation are uniformly present at each part of the calibration range. Thus, the global model is locally overfitting and susceptible to increased variance when presented with new samples. A second set of models quantifies the relative concentrations of Pu(III), (IV), and (VI). Standards containing a mixture of these species were not at equilibrium due to a disproportionation reaction. Therefore, a separate principal component analysis is used to estimate the concentrations of the individual oxidation states in these standards in the absence of independent confirmatory analysis. The PL analysis approach is generalizable to other systems where the analysis of chemically complicated systems can be aided by rational division of the overall range of solution conditions into simpler sub-regions.
Ferragina, A.; de los Campos, G.; Vazquez, A. I.; Cecchinato, A.; Bittante, G.
2017-01-01
The aim of this study was to assess the performance of Bayesian models commonly used for genomic selection to predict “difficult-to-predict” dairy traits, such as milk fatty acid (FA) expressed as percentage of total fatty acids, and technological properties, such as fresh cheese yield and protein recovery, using Fourier-transform infrared (FTIR) spectral data. Our main hypothesis was that Bayesian models that can estimate shrinkage and perform variable selection may improve our ability to predict FA traits and technological traits above and beyond what can be achieved using the current calibration models (e.g., partial least squares, PLS). To this end, we assessed a series of Bayesian methods and compared their prediction performance with that of PLS. The comparison between models was done using the same sets of data (i.e., same samples, same variability, same spectral treatment) for each trait. Data consisted of 1,264 individual milk samples collected from Brown Swiss cows for which gas chromatographic FA composition, milk coagulation properties, and cheese-yield traits were available. For each sample, 2 spectra in the infrared region from 5,011 to 925 cm−1 were available and averaged before data analysis. Three Bayesian models: Bayesian ridge regression (Bayes RR), Bayes A, and Bayes B, and 2 reference models: PLS and modified PLS (MPLS) procedures, were used to calibrate equations for each of the traits. The Bayesian models used were implemented in the R package BGLR (http://cran.r-project.org/web/packages/BGLR/index.html), whereas the PLS and MPLS were those implemented in the WinISI II software (Infrasoft International LLC, State College, PA). Prediction accuracy was estimated for each trait and model using 25 replicates of a training-testing validation procedure. Compared with PLS, which is currently the most widely used calibration method, MPLS and the 3 Bayesian methods showed significantly greater prediction accuracy. 
Accuracy increased in moving from calibration to external validation methods, and in moving from PLS and MPLS to Bayesian methods, particularly Bayes A and Bayes B. The maximum R2 value of validation was obtained with Bayes B and Bayes A. For the FA, C10:0 (% of each FA on total FA basis) had the highest R2 (0.75, achieved with Bayes A and Bayes B), and among the technological traits, fresh cheese yield R2 of 0.82 (achieved with Bayes B). These 2 methods have proven to be useful instruments in shrinking and selecting very informative wavelengths and inferring the structure and functions of the analyzed traits. We conclude that Bayesian models are powerful tools for deriving calibration equations, and, importantly, these equations can be easily developed using existing open-source software. As part of our study, we provide scripts based on the open source R software BGLR, which can be used to train customized prediction equations for other traits or populations. PMID:26387015
Shimizu, Yu; Yoshimoto, Junichiro; Takamura, Masahiro; Okada, Go; Okamoto, Yasumasa; Yamawaki, Shigeto; Doya, Kenji
2017-01-01
In diagnostic applications of statistical machine learning methods to brain imaging data, common problems include data high-dimensionality and co-linearity, which often cause over-fitting and instability. To overcome these problems, we applied partial least squares (PLS) regression to resting-state functional magnetic resonance imaging (rs-fMRI) data, creating a low-dimensional representation that relates symptoms to brain activity and that predicts clinical measures. Our experimental results, based upon data from clinically depressed patients and healthy controls, demonstrated that PLS and its kernel variants provided significantly better prediction of clinical measures than ordinary linear regression. Subsequent classification using predicted clinical scores distinguished depressed patients from healthy controls with 80% accuracy. Moreover, loading vectors for latent variables enabled us to identify brain regions relevant to depression, including the default mode network, the right superior frontal gyrus, and the superior motor area. PMID:28700672
Darwish, Hany W; Bakheit, Ahmed H; Abdelhameed, Ali S
2016-03-01
Simultaneous spectrophotometric analysis of a multi-component dosage form of olmesartan, amlodipine and hydrochlorothiazide used for the treatment of hypertension has been carried out using various chemometric methods. Multivariate calibration methods include classical least squares (CLS) executed by net analyte processing (NAP-CLS), orthogonal signal correction (OSC-CLS) and direct orthogonal signal correction (DOSC-CLS) in addition to multivariate curve resolution-alternating least squares (MCR-ALS). Results demonstrated the efficiency of the proposed methods as quantitative tools of analysis as well as their qualitative capability. The three analytes were determined precisely using the aforementioned methods in an external data set and in a dosage form after optimization of experimental conditions. Finally, the efficiency of the models was validated via comparison with the partial least squares (PLS) method in terms of accuracy and precision.
Kumar, Keshav
2018-03-01
Excitation-emission matrix fluorescence (EEMF) and total synchronous fluorescence spectroscopy (TSFS) are the 2 fluorescence techniques that are commonly used for the analysis of multifluorophoric mixtures. These 2 fluorescence techniques are conceptually different and provide certain advantages over each other. Manual analysis of such large volumes of highly correlated EEMF and TSFS data towards developing a calibration model is difficult. Partial least square (PLS) analysis can analyze large volumes of EEMF and TSFS data by finding important factors that maximize the correlation between the spectral and concentration information for each fluorophore. However, the application of PLS analysis to entire data sets often does not provide a robust calibration model and requires a suitable pre-processing step. The present work evaluates the application of genetic algorithm (GA) analysis prior to PLS analysis on EEMF and TSFS data sets towards improving the precision and accuracy of the calibration model. The GA essentially combines the advantages provided by stochastic methods with those provided by deterministic approaches and can find the set of EEMF and TSFS variables that correlate well with the concentration of each of the fluorophores present in the multifluorophoric mixtures. The utility of the GA-assisted PLS analysis is successfully validated using (i) EEMF data sets acquired for dilute aqueous mixtures of four biomolecules and (ii) TSFS data sets acquired for dilute aqueous mixtures of four carcinogenic polycyclic aromatic hydrocarbons (PAHs). In the present work, it is shown that by using the GA it is possible to significantly improve the accuracy and precision of the PLS calibration model developed for both EEMF and TSFS data sets. Hence, GA must be considered as a useful pre-processing technique while developing an EEMF and TSFS calibration model.
Martelo-Vidal, M J; Vázquez, M
2014-09-01
Spectral analysis is a quick and non-destructive method to analyse wine. In this work, trans-resveratrol, oenin, malvin, catechin, epicatechin, quercetin and syringic acid were determined in commercial red wines from DO Rías Baixas and DO Ribeira Sacra (Spain) by UV-VIS-NIR spectroscopy. Calibration models were developed using principal component regression (PCR) or partial least squares (PLS) regression. HPLC was used as the reference method. The results showed that reliable PLS models were obtained to quantify all polyphenols for Rías Baixas wines. For Ribeira Sacra, feasible models were obtained to determine quercetin, epicatechin, oenin and syringic acid. PCR calibration models showed poorer prediction reliability than PLS models. For red wines from mencía grapes, feasible models were obtained for catechin and oenin, regardless of the geographical origin. The results obtained demonstrate that UV-VIS-NIR spectroscopy can be used to determine individual polyphenolic compounds in red wines. Copyright © 2014 Elsevier Ltd. All rights reserved.
Jiménez-Carvelo, Ana M; González-Casado, Antonio; Pérez-Castaño, Estefanía; Cuadros-Rodríguez, Luis
2017-03-01
A new analytical method for the differentiation of olive oil from other vegetable oils using reversed-phase LC and applying chemometric techniques was developed. A 3 cm short column was used to obtain the chromatographic fingerprint of the methyl-transesterified fraction of each vegetable oil. The chromatographic analysis took only 4 min. The multivariate classification methods used were k-nearest neighbors, partial least-squares (PLS) discriminant analysis, one-class PLS, support vector machine classification, and soft independent modeling of class analogies. The discrimination of olive oil from other vegetable edible oils was evaluated by several classification quality metrics. Several strategies for the classification of the olive oil were used: one input-class, two input-class, and pseudo two input-class.
de Oliveira, Rodrigo Rocha; de Lima, Kássio Michell Gomes; Tauler, Romà; de Juan, Anna
2014-07-01
This study describes two applications of a variant of the multivariate curve resolution alternating least squares (MCR-ALS) method with a correlation constraint. The first application describes the use of MCR-ALS for the determination of biodiesel concentrations in biodiesel blends using near infrared (NIR) spectroscopic data. In the second application, the proposed method allowed the determination of the synthetic antioxidant N,N'-Di-sec-butyl-p-phenylenediamine (PDA) present in biodiesel mixtures from different vegetable sources using UV-visible spectroscopy. Models based on a well-established multivariate regression algorithm, partial least squares (PLS), were calculated to compare quantification performance in both applications. The correlation constraint has been adapted to handle the presence of batch-to-batch matrix effects due to ageing effects, which might occur when different groups of samples were used to build a calibration model in the first application. Different data set configurations and diverse modes of application of the correlation constraint are explored and guidelines are given to cope with different types of analytical problems, such as the correction of matrix effects among biodiesel samples, where MCR-ALS outperformed PLS, reducing the relative error of prediction RE (%) from 9.82% to 4.85% in the first application, or the determination of a minor compound with overlapped weak spectroscopic signals, where MCR-ALS gave a higher prediction error for PDA (RE (%)=3.16%) than PLS (RE (%)=1.99%), but with the advantage of recovering the related pure spectral profiles of analytes and interferences. The obtained results show the potential of the MCR-ALS method with correlation constraint to be adapted to diverse data set configurations and analytical problems related to the determination of biodiesel mixtures and added compounds therein. Copyright © 2014 Elsevier B.V. All rights reserved.
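The relative error of prediction RE (%) quoted in this record is not defined in the abstract. A minimal sketch of one common definition is given below, assuming RE (%) = 100 · sqrt(Σ(c − ĉ)² / Σc²) over the prediction samples; the function name is hypothetical, and the study itself may use a different formula.

```python
import math


def relative_error_percent(reference, predicted):
    """Relative error of prediction in percent (assumed definition)."""
    # Squared prediction errors, normalised by the squared reference values.
    num = sum((c - p) ** 2 for c, p in zip(reference, predicted))
    den = sum(c ** 2 for c in reference)
    return 100.0 * math.sqrt(num / den)


# Illustrative concentrations only, not data from the study.
print(relative_error_percent([3.0, 4.0], [3.0, 4.5]))
```

Under this definition the error is expressed relative to the overall magnitude of the reference concentrations, so values for different analytes or data sets can be compared directly.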
Bunaciu, Andrei A.; Udristioiu, Gabriela Elena; Ruţă, Lavinia L.; Fleschin, Şerban; Aboul-Enein, Hassan Y.
2009-01-01
A Fourier transform infrared (FT-IR) spectrometric method was developed for the rapid, direct measurement of diosmin in different pharmaceutical drugs. Conventional KBr-spectra were compared for best determination of active substance in commercial preparations. The Beer–Lambert law and two chemometric approaches, partial least squares (PLS) and principal component regression (PCR+) methods, were tried in data processing. PMID:23960715
NASA Astrophysics Data System (ADS)
Hashim, Noor Haslinda Noor; Latip, Jalifah; Khatib, Alfi
2016-11-01
The metabolites of Clinacanthus nutans leaves extracts and their dependence on drying process were systematically characterized using 1H nuclear magnetic resonance spectroscopy (NMR) multivariate data analysis. Principal component analysis (PCA) and partial least square-discriminant analysis (PLS-DA) were able to distinguish the leaves extracts obtained from different drying methods. The identified metabolites were carbohydrates, amino acid, flavonoids and sulfur glucoside compounds. The major metabolites responsible for the separation in PLS-DA loading plots were lupeol, cycloclinacosides, betulin, cerebrosides and choline. The results showed that the combination of 1H NMR spectroscopy and multivariate data analyses could act as an efficient technique to understand the C. nutans composition and its variation.
Determination of total phenolic compounds in compost by infrared spectroscopy.
Cascant, M M; Sisouane, M; Tahiri, S; Krati, M El; Cervera, M L; Garrigues, S; de la Guardia, M
2016-06-01
Middle and near infrared (MIR and NIR) spectroscopy were applied to determine the total phenolic compounds (TPC) content in compost samples based on models built by using partial least squares (PLS) regression. Multiplicative scatter correction, standard normal variate and first derivative were employed as spectral pretreatments, and the number of latent variables was optimized by leave-one-out cross-validation. The performance of the PLS-ATR-MIR and PLS-DR-NIR models was evaluated according to the root mean square errors of cross-validation and prediction (RMSECV and RMSEP), the coefficient of determination for prediction (R²pred) and the residual predictive deviation (RPD); RPD values of 5.83 and 8.26 were obtained for MIR and NIR, respectively. Copyright © 2016 Elsevier B.V. All rights reserved.
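Two of the pretreatments named in this record, standard normal variate (SNV) and the first derivative, can be sketched in a few lines. This is an illustrative sketch only: the function names are hypothetical, and the derivative is approximated by a plain finite difference between adjacent points, whereas the study does not state which derivative algorithm was used.

```python
import math


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own statistics."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in spectrum) / (n - 1))
    return [(x - mean) / sd for x in spectrum]


def first_difference(spectrum):
    """Finite-difference approximation of the first derivative (assumed form)."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


# Illustrative absorbance values only.
print(snv([1.0, 2.0, 3.0]))
print(first_difference([1.0, 4.0, 9.0]))
```

SNV removes per-spectrum offset and scale differences, so two spectra that differ only by a constant factor map to the same result; the difference step removes a constant baseline.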
Vignaduzzo, Silvana E; Maggio, Rubén M; Castellano, Patricia M; Kaufman, Teodoro S
2006-12-01
Two new analytical methods have been developed as convenient and useful alternatives for simultaneous determination of hydrochlorothiazide (HCT) and propranolol hydrochloride (PRO) in pharmaceutical formulations. The methods are based on the first derivative of ratio spectra (DRS) and on partial least squares (PLS) analysis of the ultraviolet absorption spectra of the samples in the 250-350 nm region. The methods were calibrated between 8.7 and 16.0 mg L⁻¹ for HCT and between 14.0 and 51.5 mg L⁻¹ for PRO. An asymmetric full-factorial design and wavelength selection (277-294 nm for HCT and 297-319 nm for PRO) were used for the PLS method, and signal intensities at 276 and 322 nm were used in the DRS method for HCT and PRO, respectively. Performance characteristics of the analytical methods were evaluated by use of validation samples, and both methods proved to be accurate and precise, furnishing near quantitative analyte recoveries (100.4 and 99.3% for HCT and PRO by use of PLS) and relative standard deviations below 2%. For PLS the lower limits of quantification were 0.37 and 0.66 mg L⁻¹ for HCT and PRO, respectively, whereas for DRS they were 1.15 and 3.05 mg L⁻¹ for HCT and PRO, respectively. The methods were used for quantification of HCT and PRO in synthetic mixtures and in two commercial tablet preparations containing different proportions of the analytes. The results of the drug content assay and the tablet dissolution test were in statistical agreement (p < 0.05) with those furnished by the official procedures of the USP 29. Preparation of dissolution profiles of the combined tablet formulations was also performed with the aid of the proposed methods. The methods are easy to apply, use relatively simple equipment, require minimum sample pre-treatment, enable high sample throughput, and generate less solvent waste than other procedures.
Paradowska, Katarzyna; Jamróz, Marta Katarzyna; Kobyłka, Mariola; Gowin, Ewelina; Maczka, Paulina; Skibiński, Robert; Komsta, Łukasz
2012-01-01
This paper presents a preliminary study in building discriminant models from solid-state NMR spectrometry data to detect the presence of acetaminophen in over-the-counter pharmaceutical formulations. The dataset, containing 11 spectra of pure substances and 21 spectra of various formulations, was processed by partial least squares discriminant analysis (PLS-DA). The model found coped with the discrimination, and its quality parameters were acceptable. It was found that standard normal variate preprocessing had almost no influence on unsupervised investigation of the dataset. The influence of variable selection with the uninformative variable elimination by PLS method was studied, reducing the dataset from 7601 variables to around 300 informative variables, but not improving the model performance. The results showed the possibility to construct well-working PLS-DA models from such small datasets without a full experimental design.
Balabin, Roman M; Smirnov, Sergey V
2011-07-15
Melamine (2,4,6-triamino-1,3,5-triazine) is a nitrogen-rich chemical implicated in the pet and human food recalls and in the global food safety scares involving milk products. Due to the serious health concerns associated with melamine consumption and the extensive scope of affected products, rapid and sensitive methods to detect melamine's presence are essential. We propose the use of spectroscopy data, produced by near-infrared (near-IR/NIR) and mid-infrared (mid-IR/MIR) spectroscopies in particular, for melamine detection in complex dairy matrixes. None of the IR-based methods for melamine detection reported to date has unambiguously shown wide applicability to different dairy products together with a limit of detection (LOD) below 1 ppm on an independent sample set. It was found that infrared spectroscopy is an effective tool to detect melamine in dairy products, such as infant formula, milk powder, or liquid milk. An LOD below 1 ppm (0.76±0.11 ppm) can be reached if a correct spectrum preprocessing (pretreatment) technique and a correct multivariate (MDA) algorithm, namely partial least squares regression (PLS), polynomial PLS (Poly-PLS), artificial neural network (ANN), support vector regression (SVR), or least squares support vector machine (LS-SVM), are used for spectrum analysis. The relationship between the MIR/NIR spectrum of milk products and melamine content is nonlinear. Thus, nonlinear regression methods are needed to correctly predict the triazine-derivative content of milk products. It can be concluded that mid- and near-infrared spectroscopy can be regarded as a quick, sensitive, robust, and low-cost method for liquid milk, infant formula, and milk powder analysis. Copyright © 2011 Elsevier B.V. All rights reserved.
Liu, Xue-Mei; Liu, Jian-She
2012-11-01
Visible and short wave-near infrared spectroscopy (Vis/SW-NIRS) was investigated in the present study to assess its accuracy in measuring soil properties, namely available nitrogen (N) and available potassium (K). Three types of pretreatment, standard normal variate (SNV), multiplicative scattering correction (MSC) and Savitzky-Golay smoothing plus first derivative, were adopted to eliminate system noise and external disturbances. Partial least squares (PLS) and least squares-support vector machine (LS-SVM) analyses were then used to build calibration models. The performance of LS-SVM models was also compared across three kinds of inputs: principal components (PCs) from PCA, latent variables (LVs), and effective wavelengths (EWs). The results indicated that all LS-SVM models outperformed PLS models. The performance of the models was evaluated by the correlation coefficient (r²) and RMSEP. The optimal EWs-LS-SVM models were achieved, and the correlation coefficient (r²) and RMSEP were 0.82 and 17.2 for N and 0.72 and 15.0 for K, respectively. The results indicated that visible and short wave-near infrared spectroscopy (Vis/SW-NIRS) (325-1075 nm) combined with LS-SVM could be utilized as a precise method for the determination of soil properties.
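Multiplicative scattering correction (MSC), one of the pretreatments named in this record, regresses each spectrum on a reference spectrum and removes the fitted offset and slope. The following is a minimal sketch under stated assumptions: the mean of the supplied spectra is used as the reference (a common choice that the abstract does not confirm), and the function name is hypothetical.

```python
def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum."""
    n_pts = len(spectra[0])
    # Reference spectrum: point-wise mean over all spectra (assumed choice).
    ref = [sum(s[j] for s in spectra) / len(spectra) for j in range(n_pts)]
    ref_mean = sum(ref) / n_pts
    ref_ss = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / n_pts
        # Least-squares fit s = intercept + slope * ref.
        slope = sum((r - ref_mean) * (x - s_mean) for r, x in zip(ref, s)) / ref_ss
        intercept = s_mean - slope * ref_mean
        corrected.append([(x - intercept) / slope for x in s])
    return corrected


# Two illustrative spectra that differ only by an offset and a scale factor.
base = [1.0, 2.0, 4.0, 3.0]
shifted = [2.0 * x + 1.0 for x in base]
for row in msc([base, shifted]):
    print([round(v, 6) for v in row])
```

In this toy case both spectra collapse onto the mean spectrum after correction, which is the intended effect: additive and multiplicative scatter differences are removed while the shared spectral shape is kept.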
Zuo, Yamin; Deng, Xuehua; Wu, Qing
2018-05-04
Discrimination of the geographical origin of Gastrodia elata (G. elata) is of great importance to pharmaceutical companies and consumers in China. This paper examines the feasibility of near infrared spectroscopy (NIRS) combined with multivariate analysis as a rapid and non-destructive method for this purpose. Firstly, 16 batches of G. elata samples from four main cultivation regions in China were quantified by a traditional HPLC method. It showed that samples from different origins could not be efficiently differentiated by the contents of the four phenolic compounds in this study. Secondly, the raw near infrared (NIR) spectra of those samples were acquired and two different pattern recognition techniques were used to classify the geographical origins. The results showed that, with the spectral transformation optimized, discriminant analysis (DA) provided 97% and 99% correct classification for the calibration and validation sets when discriminating the four main cultivation regions, and 98% and 99% correct classification for the calibration and validation sets of samples from eight different cities, respectively, in all cases performing better than the principal component analysis (PCA) method. Thirdly, as phenolic compounds content (PCC) is highly related to the quality of G. elata, synergy interval partial least squares (Si-PLS) was applied to build the PCC prediction model. The coefficient of determination for prediction (R²p) of the Si-PLS model was 0.9209, and the root mean square error for prediction (RMSEP) was 0.338. The two regions (4800 cm⁻¹ to 5200 cm⁻¹ and 5600 cm⁻¹ to 6000 cm⁻¹) selected by Si-PLS corresponded to the absorptions of the aromatic ring in the basic phenolic structure. It can be concluded that NIR spectroscopy combined with PCA, DA and Si-PLS would be a potential tool to provide a reference for the quality control of G. elata.
[Determination of Cu in Shell of Preserved Egg by LIBS Coupled with PLS].
Hu, Hui-qin; Xu, Xue-hong; Liu, Mu-hua; Tu, Jian-ping; Huang, Le; Huang, Lin; Yao, Ming-yin; Chen, Tian-bing; Yang, Ping
2015-12-01
In this work, the content of copper in the shell of preserved eggs was determined directly by laser induced breakdown spectroscopy (LIBS), and the characteristic lines of Cu were obtained. The eggshell samples were pretreated by wet acid digestion, and the real content of Cu was obtained by atomic absorption spectrophotometry (AAS). The precision and accuracy of LIBS are influenced by a series of factors, for example the complex matrix effect of the sample, environmental noise, instrument system noise and the stability of the laser energy, so a conventional univariate linear calibration curve between LIBS intensity and element content, such as one based on the Scheibe-Lomakin equation, cannot meet the requirements of quantitative analysis. A multivariate calibration method is therefore needed. In this work, the LIBS spectra were processed by partial least squares (PLS), and the precision and accuracy of PLS models built with different smoothing treatments and five pretreatment methods were compared. The results showed that 11-point smoothing with multiplicative scatter correction (MSC) pretreatment improved the correlation coefficient and accuracy of the PLS model and effectively reduced the root mean square error and the average relative error. The results of the study show that the heavy metal Cu in preserved egg shells can be detected directly and accurately by LIBS. In the next step, batch tests will be conducted to find the relationship between the Cu content of the eggshell, egg white and egg yolk of preserved eggs, so that the heavy metal content of the egg white and yolk can be inferred by measuring the eggshell with LIBS, providing a new method for rapid non-destructive testing of the quality and safety of agricultural products.
Teng, Wei-Zhuo; Song, Jia; Meng, Fan-Xin; Meng, Qing-Fan; Lu, Jia-Hui; Hu, Shuang; Teng, Li-Rong; Wang, Di; Xie, Jing
2014-10-01
Partial least squares (PLS) and radial basis function neural network (RBFNN) combined with near infrared spectroscopy (NIR) were applied to develop models for cordycepic acid, polysaccharide and adenosine analysis in Paecilomyces hepiali fermentation mycelium. The developed models show good generalization and predictive ability and can be applied to the analysis of crude drugs and related products. During the experiment, 214 Paecilomyces hepiali mycelium samples were obtained via chemical mutagenesis combined with submerged fermentation. The contents of cordycepic acid, polysaccharide and adenosine were determined via traditional methods and the near infrared spectroscopy data were collected. The outliers were removed and the size of the calibration set was determined via the Monte Carlo partial least square (MCPLS) method. Based on the values of degree of approach (Da), both moving window partial least squares (MWPLS) and moving window radial basis function neural network (MWRBFNN) were applied to optimize characteristic wavelength variables, optimum preprocessing methods and other important variables in the models. After comparison, RBFNN, RBFNN and PLS models were developed successfully for cordycepic acid, polysaccharide and adenosine detection, respectively, and the correlation between reference values and predicted values in the calibration set (R2c) and validation set (R2p) of the optimum models was 0.9417 and 0.9663, 0.9803 and 0.9850, and 0.9761 and 0.9728, respectively. All the data suggest that these models show good fit and predictive ability.
Detection of pit fragments in fresh cherries using near infrared spectroscopy
USDA-ARS's Scientific Manuscript database
NIR spectroscopy in the wavelength region from 900nm to 2600nm was evaluated as the basis for a rapid, non-destructive method for the detection of pits and pit fragments in fresh cherries. Partial Least Squares discriminant analysis (PLS-DA) following various spectral pretreatments was applied to sp...
Li, Wei; Zhang, Xuan; Zheng, Kaiyi; Du, Yiping; Cap, Peng; Sui, Tao; Geng, Jinpei
2015-01-01
A fluidized bed enrichment technique was developed to improve sensitivity of near infrared (NIR) spectroscopy with features of rapidness and large volume solution. D301 resin was used as an adsorption material to preconcentrate β-naphthalenesulfonic acid in solutions in a concentration range of 2.0-100.0 μg/mL, and NIR spectra were measured directly relative to the β-naphthalenesulfonic acid adsorbed on the material. An improved partial least squares (PLS) model was attained with the aid of multiplicative scatter correction pretreatment and stability competitive adaptive reweighted sampling wavenumber selection method. The root mean square error of cross validation was 1.87 μg/mL at PLS factor of 7. An independent test set was used to assess the model, with the relative error (RE) in an acceptable range of 0.46 to 10.03% and mean RE of 3.72%. This study confirmed the viability of the proposed method for the measurement of a low content of β-naphthalenesulfonic acid in water.
Partial least squares for efficient models of fecal indicator bacteria on Great Lakes beaches
Brooks, Wesley R.; Fienen, Michael N.; Corsi, Steven R.
2013-01-01
At public beaches, it is now common to mitigate the impact of water-borne pathogens by posting a swimmer's advisory when the concentration of fecal indicator bacteria (FIB) exceeds an action threshold. Since culturing the bacteria delays public notification when dangerous conditions exist, regression models are sometimes used to predict the FIB concentration based on readily-available environmental measurements. It is hard to know which environmental parameters are relevant to predicting FIB concentration, and the parameters are usually correlated, which can hurt the predictive power of a regression model. Here the method of partial least squares (PLS) is introduced to automate the regression modeling process. Model selection is reduced to the process of setting a tuning parameter to control the decision threshold that separates predicted exceedances of the standard from predicted non-exceedances. The method is validated by application to four Great Lakes beaches during the summer of 2010. Performance of the PLS models compares favorably to that of the existing state-of-the-art regression models at these four sites.
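The role of the decision threshold described above can be illustrated with a small sketch. All numbers and the name `exceedance_counts` are hypothetical and chosen only for illustration; the predicted values would come from a fitted regression model such as PLS, which is not reproduced here.

```python
def exceedance_counts(predicted, observed, decision_threshold, action_threshold):
    """Count true/false positives and negatives for predicted exceedances.

    A sample is flagged when its predicted concentration exceeds the decision
    threshold (the tuning parameter); it is a real exceedance when the
    observed concentration exceeds the regulatory action threshold.
    """
    counts = {"tp": 0, "fp": 0, "tn": 0, "fn": 0}
    for p, o in zip(predicted, observed):
        flagged = p > decision_threshold
        exceeded = o > action_threshold
        key = ("t" if flagged == exceeded else "f") + ("p" if flagged else "n")
        counts[key] += 1
    return counts

# Hypothetical data, not taken from the study.
observed = [100, 300, 150, 500, 220, 260]
predicted = [120, 280, 200, 450, 240, 210]
strict = exceedance_counts(predicted, observed, decision_threshold=235, action_threshold=235)
lenient = exceedance_counts(predicted, observed, decision_threshold=205, action_threshold=235)
```

Lowering the decision threshold in this toy data removes the missed exceedance while leaving the false alarms unchanged; in general the trade-off depends on the data and has to be tuned per site.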
NASA Astrophysics Data System (ADS)
Duarte, Janaína; Pacheco, Marcos T. T.; Villaverde, Antonio Balbin; Machado, Rosangela Z.; Zângaro, Renato A.; Silveira, Landulfo
2010-07-01
Toxoplasmosis is an important zoonosis in public health because domestic cats are the main agents responsible for the transmission of this disease in Brazil. We investigate a method for diagnosing toxoplasmosis based on Raman spectroscopy. Dispersive near-infrared Raman spectra are used to quantify anti-Toxoplasma gondii (IgG) antibodies in blood sera from domestic cats. An 830-nm laser is used for sample excitation, and a dispersive spectrometer is used to detect the Raman scattering. A serological test is performed in all serum samples by the enzyme-linked immunosorbent assay (ELISA) for validation. Raman spectra are taken from 59 blood serum samples and a quantification model is implemented based on partial least squares (PLS) to quantify the sample's serology by Raman spectra compared to the results provided by the ELISA test. Based on the serological values provided by the Raman/PLS model, diagnostic parameters such as sensitivity, specificity, accuracy, positive prediction values, and negative prediction values are calculated to discriminate negative from positive samples, obtaining 100, 80, 90, 83.3, and 100%, respectively. Raman spectroscopy, associated with the PLS, is promising as a serological assay for toxoplasmosis, enabling fast and sensitive diagnosis.
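The diagnostic parameters listed above follow from a 2x2 confusion matrix. The sketch below uses hypothetical counts, picked only because their ratios reproduce the reported percentages; the study's actual counts are not given in the abstract, and `diagnostic_metrics` is an illustrative name.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic parameters from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),            # true positive rate
        "specificity": tn / (tn + fp),            # true negative rate
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp),                    # positive prediction value
        "npv": tn / (tn + fn),                    # negative prediction value
    }

# Hypothetical counts (assumption): 5 true positives, 1 false positive,
# 4 true negatives, 0 false negatives.
metrics = diagnostic_metrics(tp=5, fp=1, tn=4, fn=0)
```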
Quantitative determination of wool in textile by near-infrared spectroscopy and multivariate models.
Chen, Hui; Tan, Chao; Lin, Zan
2018-08-05
The wool content of textiles is a key quality index, and its quantitative analysis is important because adulteration is common in both raw and finished textiles. Conventional methods can be complicated, destructive, time-consuming and environmentally unfriendly, so a quick, easy-to-use and green alternative is of interest. This work explores the feasibility of combining near-infrared (NIR) spectroscopy with several partial least squares (PLS)-based algorithms and elastic component regression (ECR) algorithms for measuring wool content in textiles. A total of 108 cloth samples with wool content ranging from 0% to 100% (w/w) were collected, and all of the compositions are found on the market. The dataset was divided equally into training and test sets for developing and validating calibration models. When using local PLS, the original spectral axis was split into 20 sub-intervals. No obvious difference in performance was seen among the local PLS models. The ECR model is comparable or superior to the other models owing to its flexibility, i.e., it is a transition state between PCR and PLS. ECR combined with the NIR technique may be a potential method for determining wool content in textile products. In addition, it might have regulatory advantages by avoiding time-consuming and environmentally unfriendly chemical analysis. Copyright © 2018 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Luna, Aderval S.; da Silva, Arnaldo P.; Ferré, Joan; Boqué, Ricard
This research work describes two studies for the classification and characterization of edible oils and its quality parameters through Fourier transform mid infrared spectroscopy (FT-mid-IR) together with chemometric methods. The discrimination of canola, sunflower, corn and soybean oils was investigated using SVM-DA, SIMCA and PLS-DA. Using FT-mid-IR, DPLS was able to classify 100% of the samples from the validation set, but SIMCA and SVM-DA were not. The quality parameters: refraction index and relative density of edible oils were obtained from reference methods. Prediction models for FT-mid-IR spectra were calculated for these quality parameters using partial least squares (PLS) and support vector machines (SVM). Several preprocessing alternatives (first derivative, multiplicative scatter correction, mean centering, and standard normal variate) were investigated. The best result for the refraction index was achieved with SVM as well as for the relative density except when the preprocessing combination of mean centering and first derivative was used. For both of quality parameters, the best results obtained for the figures of merit expressed by the root mean square error of cross validation (RMSECV) and prediction (RMSEP) were equal to 0.0001.
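Three of the preprocessing alternatives named above (standard normal variate, first derivative, mean centering) can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the first derivative is taken as a simple finite difference (an assumption; smoothed derivatives are also common), and the function names are hypothetical.

```python
from statistics import mean, stdev

def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own statistics."""
    m, s = mean(spectrum), stdev(spectrum)
    return [(v - m) / s for v in spectrum]

def first_derivative(spectrum, step=1.0):
    """Finite-difference first derivative along the wavenumber axis."""
    return [(b - a) / step for a, b in zip(spectrum, spectrum[1:])]

def mean_center(spectra):
    """Subtract the column (per-variable) mean from every spectrum."""
    n_vars = len(spectra[0])
    col_means = [mean(s[j] for s in spectra) for j in range(n_vars)]
    return [[s[j] - col_means[j] for j in range(n_vars)] for s in spectra]
```

SNV works row by row and removes per-sample offset and scale, whereas mean centering works column by column across the calibration set, which is why the two are often combined rather than treated as substitutes.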
NASA Astrophysics Data System (ADS)
Saad, Ahmed S.; Hamdy, Abdallah M.; Salama, Fathy M.; Abdelkawy, Mohamed
2016-10-01
The effect of data manipulation in the preprocessing step preceding the construction of chemometric models was assessed. The same set of UV spectral data was used to construct PLS and PCR models, both directly and after mathematical manipulation according to well-known spectrophotometric methods: first and second derivatives of the absorption spectra, ratio spectra, and first and second derivatives of the ratio spectra. The optimal working wavelength ranges were carefully selected for each model before the models were constructed. Unexpectedly, the number of latent variables used for model construction varied among the different methods. The predictive power of the different models was compared using a validation set of 8 mixtures prepared according to a multilevel multifactor design, and the results were statistically compared using a two-way ANOVA test. Root mean square error of prediction (RMSEP) was used for further comparison of predictive ability among the constructed models. Although no significant difference was found between the results obtained using partial least squares (PLS) and principal component regression (PCR) models, discrepancies among the results were attributed to variation in the discriminating power of the adopted spectrophotometric methods on the spectral data.
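For reference, RMSEP as used in the comparison above is the square root of the mean squared difference between reference and predicted concentrations over the validation set. A minimal sketch (the function name `rmsep` and the example values are illustrative only):

```python
import math

def rmsep(reference, predicted):
    """Root mean square error of prediction over a validation set."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    squared_errors = [(p - r) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical validation concentrations and model predictions.
example = rmsep([10.0, 20.0, 30.0], [10.5, 19.0, 30.5])
```

Because RMSEP is expressed in the units of the measured quantity, values are comparable between models only when they are computed on the same validation set.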
Ferragina, A; de los Campos, G; Vazquez, A I; Cecchinato, A; Bittante, G
2015-11-01
The aim of this study was to assess the performance of Bayesian models commonly used for genomic selection to predict "difficult-to-predict" dairy traits, such as milk fatty acid (FA) expressed as percentage of total fatty acids, and technological properties, such as fresh cheese yield and protein recovery, using Fourier-transform infrared (FTIR) spectral data. Our main hypothesis was that Bayesian models that can estimate shrinkage and perform variable selection may improve our ability to predict FA traits and technological traits above and beyond what can be achieved using the current calibration models (e.g., partial least squares, PLS). To this end, we assessed a series of Bayesian methods and compared their prediction performance with that of PLS. The comparison between models was done using the same sets of data (i.e., same samples, same variability, same spectral treatment) for each trait. Data consisted of 1,264 individual milk samples collected from Brown Swiss cows for which gas chromatographic FA composition, milk coagulation properties, and cheese-yield traits were available. For each sample, 2 spectra in the infrared region from 5,011 to 925 cm(-1) were available and averaged before data analysis. Three Bayesian models: Bayesian ridge regression (Bayes RR), Bayes A, and Bayes B, and 2 reference models: PLS and modified PLS (MPLS) procedures, were used to calibrate equations for each of the traits. The Bayesian models used were implemented in the R package BGLR (http://cran.r-project.org/web/packages/BGLR/index.html), whereas the PLS and MPLS were those implemented in the WinISI II software (Infrasoft International LLC, State College, PA). Prediction accuracy was estimated for each trait and model using 25 replicates of a training-testing validation procedure. Compared with PLS, which is currently the most widely used calibration method, MPLS and the 3 Bayesian methods showed significantly greater prediction accuracy. 
Accuracy increased in moving from calibration to external validation methods, and in moving from PLS and MPLS to Bayesian methods, particularly Bayes A and Bayes B. The maximum R(2) value of validation was obtained with Bayes B and Bayes A. For the FA, C10:0 (% of each FA on total FA basis) had the highest R(2) (0.75, achieved with Bayes A and Bayes B), and among the technological traits, fresh cheese yield R(2) of 0.82 (achieved with Bayes B). These 2 methods have proven to be useful instruments in shrinking and selecting very informative wavelengths and inferring the structure and functions of the analyzed traits. We conclude that Bayesian models are powerful tools for deriving calibration equations, and, importantly, these equations can be easily developed using existing open-source software. As part of our study, we provide scripts based on the open source R software BGLR, which can be used to train customized prediction equations for other traits or populations. Copyright © 2015 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Abdel Hameed, Eman A.; Abdel Salam, Randa A.; Hadad, Ghada M.
2015-04-01
Chemometric-assisted spectrophotometric methods and high performance liquid chromatography (HPLC) were developed for the simultaneous determination of the seven most commonly prescribed β-blockers: atenolol (AT), sotalol (ST), metoprolol (MT), bisoprolol (BS), propranolol (PR), carvedilol (CV) and nebivolol (NB). Principal component regression (PCR), partial least squares (PLS) and PLS with prior wavelength selection by genetic algorithm (GA-PLS) were used for chemometric analysis of the spectral data of these drugs. The compositions of the mixtures used in the calibration set were varied to cover the linearity ranges 0.7-10 μg ml⁻¹ for AT, 1-15 μg ml⁻¹ for ST, 1-15 μg ml⁻¹ for MT, 0.3-5 μg ml⁻¹ for BS, 0.1-3 μg ml⁻¹ for PR, 0.1-3 μg ml⁻¹ for CV and 0.7-5 μg ml⁻¹ for NB. The analytical performances of these chemometric methods were characterized by relative prediction errors and were compared with each other. GA-PLS was superior to the other applied multivariate methods owing to the wavelength selection. A new gradient HPLC method was developed using statistical experimental design. Optimum separation conditions were determined with the aid of a central composite design. The developed HPLC method was found to be linear in the range of 0.2-20 μg ml⁻¹ for AT, 0.2-20 μg ml⁻¹ for ST, 0.1-15 μg ml⁻¹ for MT, 0.1-15 μg ml⁻¹ for BS, 0.1-13 μg ml⁻¹ for PR, 0.1-13 μg ml⁻¹ for CV and 0.4-20 μg ml⁻¹ for NB. No significant difference was found between the results of the proposed GA-PLS and HPLC methods with respect to accuracy and precision. The proposed analytical methods did not show any interference from excipients when applied to pharmaceutical products.
Elkhoudary, Mahmoud M; Naguib, Ibrahim A; Abdel Salam, Randa A; Hadad, Ghada M
2017-05-01
Four accurate, sensitive and reliable stability-indicating chemometric methods were developed for the quantitative determination of Agomelatine (AGM), whether in pure form or in pharmaceutical formulations. Two supervised learning machine methods, linear artificial neural networks preceded by principal component analysis (PC-linANN) and linear support vector regression (linSVR), were compared with two principal component based methods, principal component regression (PCR) and partial least squares (PLS), for the spectrofluorimetric determination of AGM and its degradants. The results showed the benefits of using linear learning machine methods and the inherent merits of their algorithms in handling overlapped, noisy spectral data, especially during the challenging determination of the AGM alkaline and acidic degradants (DG1 and DG2). Relative mean squared error of prediction (RMSEP) for the proposed models in the determination of AGM was 1.68, 1.72, 0.68 and 0.22 for PCR, PLS, SVR and PC-linANN, respectively. The results showed the superiority of supervised learning machine methods over principal component based methods. In addition, the results suggested that linANN is the method of choice for the determination of components in low amounts with similar overlapped spectra and a narrow linearity range. Comparison between the proposed chemometric models and a reported HPLC method revealed comparable performance and quantification power.
NASA Astrophysics Data System (ADS)
Yehia, Ali M.; Mohamed, Heba M.
2016-01-01
Three advanced chemometric-assisted spectrophotometric methods, namely Concentration Residuals Augmented Classical Least Squares (CRACLS), Multivariate Curve Resolution-Alternating Least Squares (MCR-ALS) and Principal Component Analysis-Artificial Neural Networks (PCA-ANN), were developed, validated and benchmarked against PLS calibration to resolve severely overlapped spectra and simultaneously determine Paracetamol (PAR), Guaifenesin (GUA) and Phenylephrine (PHE) in their ternary mixture and in the presence of p-aminophenol (AP), the main degradation product and synthesis impurity of Paracetamol. The analytical performance of the proposed methods was described by percentage recoveries, root mean square error of calibration and standard error of prediction. The four multivariate calibration methods could be used directly without any preliminary separation step and were successfully applied to pharmaceutical formulation analysis, showing no interference from excipients.
Pappas, Christos; Kyraleou, Maria; Voskidi, Eleni; Kotseridis, Yorgos; Taranilis, Petros A; Kallithraka, Stamatina
2015-02-01
The mean degree of polymerization (mDP) and the degree of galloylation (%G) in grape seeds were determined directly and simultaneously using diffuse reflectance infrared Fourier transform spectroscopy and partial least squares (PLS). The results were compared with those obtained using the conventional analysis, which employs phloroglucinolysis as pretreatment followed by high performance liquid chromatography with UV and mass spectrometry detection. Infrared spectra were recorded on solid-state samples after freeze drying. PLS regression was applied to the 2nd derivative of the 1832 to 1416 and 918 to 739 cm(-1) spectral regions for the quantification of mDP, and to the 2nd derivative of the 1813 to 607 cm(-1) spectral region for the determination of %G. The determination coefficients (R(2)) of mDP and %G were 0.99 and 0.98, respectively. The corresponding values of the root-mean-square error of calibration were 0.506 and 0.692, of the root-mean-square error of cross validation 0.811 and 0.921, and of the root-mean-square error of prediction 0.612 and 0.801. Compared with the conventional method, the proposed method is simpler, less time consuming and more economical, and requires smaller quantities of chemical reagents and fewer sample pretreatment steps. It could be a starting point for the design of more specific models according to the requirements of wineries. © 2015 Institute of Food Technologists®
Siebers, Nina; Kruse, Jens; Eckhardt, Kai-Uwe; Hu, Yongfeng; Leinweber, Peter
2012-07-01
Cadmium (Cd) has a high toxicity and resolving its speciation in soil is challenging but essential for estimating the environmental risk. In this study partial least-square (PLS) regression was tested for its capability to deconvolute Cd L(3)-edge X-ray absorption near-edge structure (XANES) spectra of multi-compound mixtures. For this, a library of Cd reference compound spectra and a spectrum of a soil sample were acquired. A good coefficient of determination (R(2)) of Cd compounds in mixtures was obtained for the PLS model using binary and ternary mixtures of various Cd reference compounds proving the validity of this approach. In order to describe complex systems like soil, multi-compound mixtures of a variety of Cd compounds must be included in the PLS model. The obtained PLS regression model was then applied to a highly Cd-contaminated soil revealing Cd(3)(PO(4))(2) (36.1%), Cd(NO(3))(2)·4H(2)O (24.5%), Cd(OH)(2) (21.7%), CdCO(3) (17.1%) and CdCl(2) (0.4%). These preliminary results proved that PLS regression is a promising approach for a direct determination of Cd speciation in the solid phase of a soil sample.
Borràs, Eva; Ferré, Joan; Boqué, Ricard; Mestres, Montserrat; Aceña, Laura; Calvo, Angels; Busto, Olga
2016-08-01
Headspace-Mass Spectrometry (HS-MS), Fourier Transform Mid-Infrared spectroscopy (FT-MIR) and UV-Visible spectrophotometry (UV-vis) instrumental responses have been combined to predict virgin olive oil sensory descriptors. A total of 343 olive oil samples analyzed during four consecutive harvests (2010-2014) were used to build multivariate calibration models using partial least squares (PLS) regression. The reference values of the sensory attributes were provided by expert assessors from an official taste panel. The instrumental data were modeled individually and also using data fusion approaches. The use of fused data at both low and mid levels of abstraction improved PLS predictions for all the olive oil descriptors. The best PLS models were obtained for two positive attributes (fruity and bitter) and two defective descriptors (fusty and musty), all of them using data fusion of MS and MIR spectral fingerprints. Although good predictions were not obtained for some sensory descriptors, the results are encouraging, especially considering that the legal categorization of virgin olive oils only requires the determination of fruity and defective descriptors. Copyright © 2016 Elsevier B.V. All rights reserved.
Yang, Yuan-Gui; Zhang, Ji; Zhao, Yan-Li; Zhang, Jin-Yu; Wang, Yuan-Zhong
2017-07-01
A rapid method was developed and validated by ultra-performance liquid chromatography-triple quadrupole mass spectroscopy with ultraviolet detection (UPLC-UV-MS) for simultaneous determination of paris saponin I, paris saponin II, paris saponin VI and paris saponin VII. Partial least squares discriminant analysis (PLS-DA) based on UPLC and Fourier transform infrared (FT-IR) spectroscopy was employed to evaluate Paris polyphylla var. yunnanensis (PPY) at different harvesting times. Quantitative determination implied that the various contents of bioactive compounds with different harvesting times may lead to different pharmacological effects; the average content of total saponins for PPY harvested at 8 years was higher than that from other samples. The PLS-DA of FT-IR spectra had a better performance than that of UPLC for discrimination of PPY from different harvesting times. Copyright © 2016 John Wiley & Sons, Ltd.
ERIC Educational Resources Information Center
Bookstein, Fred L.; And Others
1996-01-01
Discusses the use of new statistical procedures in a study of the enduring effects of prenatal alcohol exposure upon the neurobehavioral development of some 500 children born in 1975-76. Explains how the Partial Least Squares (PLS) methodology can summarize the data powerfully while avoiding familiar inferential pitfalls. (MDM)
Analysis of pork adulteration in beef meatball using Fourier transform infrared (FTIR) spectroscopy.
Rohman, A; Sismindari; Erwanto, Y; Che Man, Yaakob B
2011-05-01
Meatball is one of the favorite foods in Indonesia, and the adulteration of beef meatball with pork occurs frequently. This study aimed to develop a fast and non-destructive technique for the detection and quantification of pork in beef meatball using Fourier transform infrared (FTIR) spectroscopy and partial least squares (PLS) calibration. The spectral bands associated with pork fat (PF), beef fat (BF), and their mixtures in meatball formulation were scanned, interpreted, and identified by relating them to those spectroscopically representative of pure PF and BF. For quantitative analysis, PLS regression was used to develop a calibration model at the selected fingerprint region of 1200-1000 cm(-1). The equation obtained for the relationship between actual PF values and FTIR-predicted values in the PLS calibration model was y = 0.999x + 0.004, with a coefficient of determination (R(2)) of 0.999 and a root mean square error of calibration of 0.442. The PLS calibration model was subsequently used to predict independent samples, namely laboratory-made meatball samples containing mixtures of BF and PF. Using 4 principal components, the root mean square error of prediction was 0.742. The results showed that FTIR spectroscopy can be used for the detection and quantification of pork in beef meatball formulation for Halal verification purposes. Copyright © 2010 The American Meat Science Association. Published by Elsevier Ltd. All rights reserved.
Yu, Peigen; Low, Mei Yin; Zhou, Weibiao
2018-01-01
In order to develop products that would be preferred by consumers, the effects of the chemical compositions of ready-to-drink green tea beverages on consumer liking were studied through regression analyses. Green tea model systems were prepared by dosing solutions of 0.1% green tea extract with differing concentrations of eight flavour keys deemed to be important for green tea aroma and taste, based on a D-optimal experimental design, before undergoing commercial sterilisation. Sensory evaluation of the green tea model system was carried out using an untrained consumer panel to obtain hedonic liking scores of the samples. Regression models were subsequently trained to objectively predict the consumer liking scores of the green tea model systems. A linear partial least squares (PLS) regression model was developed to describe the effects of the eight flavour keys on consumer liking, with a coefficient of determination (R(2)) of 0.733, and a root-mean-square error (RMSE) of 3.53%. The PLS model was further augmented with an artificial neural network (ANN) to establish a PLS-ANN hybrid model. The established hybrid model was found to give a better prediction of consumer liking scores, based on its R(2) (0.875) and RMSE (2.41%). Copyright © 2017 Elsevier Ltd. All rights reserved.
Comparison of 3 Methods for Identifying Dietary Patterns Associated With Risk of Disease
DiBello, Julia R.; Kraft, Peter; McGarvey, Stephen T.; Goldberg, Robert; Campos, Hannia
2008-01-01
Reduced rank regression and partial least-squares regression (PLS) are proposed alternatives to principal component analysis (PCA). Using all 3 methods, the authors derived dietary patterns in Costa Rican data collected on 3,574 cases and controls in 1994–2004 and related the resulting patterns to risk of first incident myocardial infarction. Four dietary patterns associated with myocardial infarction were identified. Factor 1, characterized by high intakes of lean chicken, vegetables, fruit, and polyunsaturated oil, was generated by all 3 dietary pattern methods and was associated with a significantly decreased adjusted risk of myocardial infarction (28%–46%, depending on the method used). PCA and PLS also each yielded a pattern associated with a significantly decreased risk of myocardial infarction (31% and 23%, respectively); this pattern was characterized by moderate intake of alcohol and polyunsaturated oil and low intake of high-fat dairy products. The fourth factor derived from PCA was significantly associated with a 38% increased risk of myocardial infarction and was characterized by high intakes of coffee and palm oil. Contrary to previous studies, the authors found PCA and PLS to produce more patterns associated with cardiovascular disease than reduced rank regression. The most effective method for deriving dietary patterns related to disease may vary depending on the study goals. PMID:18945692
Variables selection methods in near-infrared spectroscopy.
Xiaobo, Zou; Jiewen, Zhao; Povey, Malcolm J W; Holmes, Mel; Hanpin, Mao
2010-05-14
Near-infrared (NIR) spectroscopy has increasingly been adopted as an analytical tool in various fields, such as the petrochemical, pharmaceutical, environmental, clinical, agricultural, food and biomedical sectors during the past 15 years. A NIR spectrum of a sample is typically measured by modern scanning instruments at hundreds of equally spaced wavelengths. The large number of spectral variables in most data sets encountered in NIR spectral chemometrics often renders the prediction of a dependent variable unreliable. Recently, considerable effort has been directed towards developing and evaluating different procedures that objectively identify variables which contribute useful information and/or eliminate variables containing mostly noise. This review focuses on the variable selection methods in NIR spectroscopy. Selection methods include some classical approaches, such as manual approach (knowledge based selection), "Univariate" and "Sequential" selection methods; sophisticated methods such as successive projections algorithm (SPA) and uninformative variable elimination (UVE), elaborate search-based strategies such as simulated annealing (SA), artificial neural networks (ANN) and genetic algorithms (GAs) and interval base algorithms such as interval partial least squares (iPLS), windows PLS and iterative PLS. Wavelength selection with B-spline, Kalman filtering, Fisher's weights and Bayesian are also mentioned. Finally, the websites of some variable selection software and toolboxes for non-commercial use are given. Copyright 2010 Elsevier B.V. All rights reserved.
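As one concrete instance of the classical "univariate" selection mentioned in the review above, wavelengths can be ranked by the absolute Pearson correlation between each spectral variable and the reference value. This is a deliberately simple sketch with hypothetical names (`pearson`, `rank_wavelengths`) and toy data; it is not one of the more sophisticated methods (SPA, UVE, iPLS) listed in the abstract.

```python
import math
from statistics import mean

def pearson(x, y):
    """Pearson correlation; returns 0.0 for a constant variable."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def rank_wavelengths(spectra, y, keep):
    """Return the indices of the `keep` variables most correlated with y."""
    n_vars = len(spectra[0])
    scores = []
    for j in range(n_vars):
        column = [row[j] for row in spectra]
        scores.append((abs(pearson(column, y)), j))
    scores.sort(reverse=True)
    return sorted(j for _, j in scores[:keep])

# Toy data: 4 samples x 4 "wavelengths"; columns 0 and 3 track y, 1 is noise, 2 is constant.
y = [1.0, 2.0, 3.0, 4.0]
spectra = [
    [2.0, 5.0, 3.0, -1.0],
    [4.0, 1.0, 3.0, -2.1],
    [6.0, 4.0, 3.0, -2.9],
    [8.0, 2.0, 3.0, -4.0],
]
selected = rank_wavelengths(spectra, y, keep=2)
```

Univariate ranking ignores interactions between variables, which is one reason the review goes on to discuss multivariate and interval-based alternatives.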
Fischer, Katharina E
2012-08-02
Decision-making in healthcare is complex. Research on coverage decision-making has focused on comparative studies for several countries, statistical analyses for single decision-makers, the decision outcome and appraisal criteria. Accounting for decision processes extends the complexity, as they are multidimensional and process elements need to be regarded as latent constructs (composites) that are not observed directly. The objective of this study was to present a practical application of partial least square path modelling (PLS-PM) to evaluate how it offers a method for empirical analysis of decision-making in healthcare. Empirical approaches that applied PLS-PM to decision-making in healthcare were identified through a systematic literature search. PLS-PM was used as an estimation technique for a structural equation model that specified hypotheses between the components of decision processes and the reasonableness of decision-making in terms of medical, economic and other ethical criteria. The model was estimated for a sample of 55 coverage decisions on the extension of newborn screening programmes in Europe. Results were evaluated by standard reliability and validity measures for PLS-PM. After modification by dropping two indicators that showed poor measures in the measurement models' quality assessment and were not meaningful for newborn screening, the structural equation model estimation produced plausible results. The presence of three influences was supported: the links between both stakeholder participation or transparency and the reasonableness of decision-making; and the effect of transparency on the degree of scientific rigour of assessment. Reliable and valid measurement models were obtained to describe the composites of 'transparency', 'participation', 'scientific rigour' and 'reasonableness'. The structural equation model was among the first applications of PLS-PM to coverage decision-making. 
It allowed testing of hypotheses in situations where there are links between several non-observable constructs. PLS-PM was compatible in accounting for the complexity of coverage decisions to obtain a more realistic perspective for empirical analysis. The model specification can be used for hypothesis testing by using larger sample sizes and for data in the full domain of health technologies.
Effective diagnosis of Alzheimer’s disease by means of large margin-based methodology
2012-01-01
Background Functional brain images such as Single-Photon Emission Computed Tomography (SPECT) and Positron Emission Tomography (PET) have been widely used to guide clinicians in the diagnosis of Alzheimer’s Disease (AD). However, the subjectivity involved in their evaluation has favoured the development of Computer Aided Diagnosis (CAD) Systems. Methods A novel combination of feature extraction techniques is proposed to improve the diagnosis of AD. Firstly, Regions of Interest (ROIs) are selected by means of a t-test carried out on 3D Normalised Mean Square Error (NMSE) features restricted to be located within a predefined brain activation mask. In order to address the small sample-size problem, the dimension of the feature space was further reduced by: Large Margin Nearest Neighbours using a rectangular matrix (LMNN-RECT), Principal Component Analysis (PCA) or Partial Least Squares (PLS) (the two latter also analysed with a LMNN transformation). Regarding the classifiers, kernel Support Vector Machines (SVMs) and LMNN using Euclidean, Mahalanobis and Energy-based metrics were compared. Results Several experiments were conducted in order to evaluate the proposed LMNN-based feature extraction algorithms and their benefits as: i) a linear transformation of the PLS or PCA reduced data, ii) a feature reduction technique, and iii) a classifier (with Euclidean, Mahalanobis or Energy-based methodology). The system was evaluated by means of k-fold cross-validation, yielding accuracy, sensitivity and specificity values of 92.78%, 91.07% and 95.12% (for SPECT) and 90.67%, 88% and 93.33% (for PET), respectively, when a NMSE-PLS-LMNN feature extraction method was used in combination with a SVM classifier, thus outperforming recently reported baseline methods. Conclusions All the proposed methods turned out to be a valid solution for the presented problem. 
One of the advances is the robustness of the LMNN algorithm that not only provides higher separation rate between the classes but it also makes (in combination with NMSE and PLS) this rate variation more stable. In addition, their generalization ability is another advance since several experiments were performed on two image modalities (SPECT and PET). PMID:22849649
Akimoto, Yuki; Yugi, Katsuyuki; Uda, Shinsuke; Kudo, Takamasa; Komori, Yasunori; Kubota, Hiroyuki; Kuroda, Shinya
2013-01-01
Cells use common signaling molecules for the selective control of downstream gene expression and cell-fate decisions. The relationship between signaling molecules and downstream gene expression and cellular phenotypes is a multiple-input and multiple-output (MIMO) system and is difficult to understand due to its complexity. For example, it has been reported that, in PC12 cells, different types of growth factors activate MAP kinases (MAPKs) including ERK, JNK, and p38, and CREB, for selective protein expression of immediate early genes (IEGs) such as c-FOS, c-JUN, EGR1, JUNB, and FOSB, leading to cell differentiation, proliferation and cell death; however, how multiple-inputs such as MAPKs and CREB regulate multiple-outputs such as expression of the IEGs and cellular phenotypes remains unclear. To address this issue, we employed a statistical method called partial least squares (PLS) regression, which involves a reduction of the dimensionality of the inputs and outputs into latent variables and a linear regression between these latent variables. We measured 1,200 data points for MAPKs and CREB as the inputs and 1,900 data points for IEGs and cellular phenotypes as the outputs, and we constructed the PLS model from these data. The PLS model highlighted the complexity of the MIMO system and growth factor-specific input-output relationships of cell-fate decisions in PC12 cells. Furthermore, to reduce the complexity, we applied a backward elimination method to the PLS regression, in which 60 input variables were reduced to 5 variables, including the phosphorylation of ERK at 10 min, CREB at 5 min and 60 min, AKT at 5 min and JNK at 30 min. The simple PLS model with only 5 input variables demonstrated a predictive ability comparable to that of the full PLS model. The 5 input variables effectively extracted the growth factor-specific simple relationships within the MIMO system in cell-fate decisions in PC12 cells.
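The latent-variable regression described in this abstract can be illustrated with a minimal sketch. It assumes a single response (PLS1) fitted with the NIPALS algorithm, whereas the study itself modelled several outputs at once; the function names `pls1_fit` and `pls1_predict` are hypothetical and are not taken from the paper.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model with NIPALS (illustrative sketch)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean          # centred working copies
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                        # direction of maximal covariance with y
        w = w / np.linalg.norm(w)
        t = Xk @ w                           # latent variable (scores)
        tt = t @ t
        p = Xk.T @ t / tt                    # X loadings
        c = (yk @ t) / tt                    # regression of y on the scores
        Xk = Xk - np.outer(t, p)             # deflate X
        yk = yk - c * t                      # deflate y
        W.append(w); P.append(p); q.append(c)
    W, P = np.column_stack(W), np.column_stack(P)
    # Coefficients expressed in the original (centred) input variables.
    beta = W @ np.linalg.solve(P.T @ W, np.array(q))
    return beta, x_mean, y_mean

def pls1_predict(model, X):
    beta, x_mean, y_mean = model
    return (X - x_mean) @ beta + y_mean
```

With as many components as input variables the fit coincides with ordinary least squares; using fewer components is what provides the dimensionality reduction. A backward elimination of inputs, as in the abstract, could be layered on top by refitting after dropping one variable at a time.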
Data Mining Methods for Omics and Knowledge of Crude Medicinal Plants toward Big Data Biology
Afendi, Farit M.; Ono, Naoaki; Nakamura, Yukiko; Nakamura, Kensuke; Darusman, Latifah K.; Kibinge, Nelson; Morita, Aki Hirai; Tanaka, Ken; Horai, Hisayuki; Altaf-Ul-Amin, Md.; Kanaya, Shigehiko
2013-01-01
Molecular biological data has rapidly increased with the recent progress of the Omics fields, e.g., genomics, transcriptomics, proteomics and metabolomics that necessitates the development of databases and methods for efficient storage, retrieval, integration and analysis of massive data. The present study reviews the usage of KNApSAcK Family DB in metabolomics and related area, discusses several statistical methods for handling multivariate data and shows their application on Indonesian blended herbal medicines (Jamu) as a case study. Exploration using Biplot reveals many plants are rarely utilized while some plants are highly utilized toward specific efficacy. Furthermore, the ingredients of Jamu formulas are modeled using Partial Least Squares Discriminant Analysis (PLS-DA) in order to predict their efficacy. The plants used in each Jamu medicine served as the predictors, whereas the efficacy of each Jamu provided the responses. This model produces 71.6% correct classification in predicting efficacy. Permutation test then is used to determine plants that serve as main ingredients in Jamu formula by evaluating the significance of the PLS-DA coefficients. Next, in order to explain the role of plants that serve as main ingredients in Jamu medicines, information of pharmacological activity of the plants is added to the predictor block. Then N-PLS-DA model, multiway version of PLS-DA, is utilized to handle the three-dimensional array of the predictor block. The resulting N-PLS-DA model reveals that the effects of some pharmacological activities are specific for certain efficacy and the other activities are diverse toward many efficacies. Mathematical modeling introduced in the present study can be utilized in global analysis of big data targeting to reveal the underlying biology. PMID:24688691
NASA Astrophysics Data System (ADS)
Krepper, Gabriela; Romeo, Florencia; Fernandes, David Douglas de Sousa; Diniz, Paulo Henrique Gonçalves Dias; de Araújo, Mário César Ugulino; Di Nezio, María Susana; Pistonesi, Marcelo Fabián; Centurión, María Eugenia
2018-01-01
Determining fat content in hamburgers is very important to minimize or control the negative effects of fat on human health, such as cardiovascular diseases and obesity, which are caused by the high consumption of saturated fatty acids and cholesterol. This study proposed an alternative analytical method based on Near Infrared Spectroscopy (NIR) and the Successive Projections Algorithm for interval selection in Partial Least Squares regression (iSPA-PLS) for fat content determination in commercial chicken hamburgers. For this, 70 hamburger samples with a fat content ranging from 14.27 to 32.12 mg kg⁻¹ were prepared based on the upper limit recommended by the Argentinean Food Codex, which is 20% (w w⁻¹). NIR spectra were then recorded and preprocessed by applying different approaches: baseline correction, SNV, MSC, and Savitzky-Golay smoothing. For comparison, full-spectrum PLS and interval PLS were also used. The best performance for the prediction set was obtained for the first derivative Savitzky-Golay smoothing with a second-order polynomial and window size of 19 points, achieving a coefficient of correlation of 0.94, RMSEP of 1.59 mg kg⁻¹, REP of 7.69% and RPD of 3.02. The proposed methodology represents an excellent alternative to the conventional Soxhlet extraction method, since waste generation is avoided and neither chemical reagents nor solvents are used, which follows the primary principles of Green Chemistry. The new method was successfully applied to chicken hamburger analysis, and the results agreed with the reference values at a 95% confidence level, making it very attractive for routine analysis.
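The figures of merit quoted in this abstract (correlation coefficient, RMSEP, REP and RPD) can be computed as in the sketch below. The definitions used are the common chemometric ones (REP as RMSEP relative to the mean reference value, RPD as the sample standard deviation of the reference values divided by RMSEP); the abstract does not state its exact formulas, so these are assumptions, and `prediction_metrics` is a hypothetical name.

```python
import numpy as np

def prediction_metrics(y_ref, y_pred):
    """Common prediction-set figures of merit (assumed definitions)."""
    y_ref = np.asarray(y_ref, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmsep = np.sqrt(np.mean((y_ref - y_pred) ** 2))   # root mean square error of prediction
    rep = 100.0 * rmsep / y_ref.mean()                # relative error of prediction, in percent
    rpd = y_ref.std(ddof=1) / rmsep                   # ratio of performance to deviation
    r = np.corrcoef(y_ref, y_pred)[0, 1]              # Pearson correlation coefficient
    return {"r": r, "RMSEP": rmsep, "REP_percent": rep, "RPD": rpd}

# Toy usage with made-up numbers (not data from the study).
metrics = prediction_metrics([10.0, 20.0, 30.0], [11.0, 19.0, 31.0])
```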
Locally-Based Kernel PLS Smoothing to Non-Parametric Regression Curve Fitting
NASA Technical Reports Server (NTRS)
Rosipal, Roman; Trejo, Leonard J.; Wheeler, Kevin; Korsmeyer, David (Technical Monitor)
2002-01-01
We present a novel smoothing approach to non-parametric regression curve fitting. This is based on kernel partial least squares (PLS) regression in reproducing kernel Hilbert space. It is our concern to apply the methodology for smoothing experimental data where some level of knowledge about the approximate shape, local inhomogeneities or points where the desired function changes its curvature is known a priori or can be derived based on the observed noisy data. We propose locally-based kernel PLS regression that extends the previous kernel PLS methodology by incorporating this knowledge. We compare our approach with existing smoothing splines, hybrid adaptive splines and wavelet shrinkage techniques on two generated data sets.
Teoh, Shao Thing; Kitamura, Miki; Nakayama, Yasumune; Putri, Sastia; Mukai, Yukio; Fukusaki, Eiichiro
2016-08-01
In recent years, the advent of high-throughput omics technology has made possible a new class of strain engineering approaches, based on identification of possible gene targets for phenotype improvement from omic-level comparison of different strains or growth conditions. Metabolomics, with its focus on the omic level closest to the phenotype, lends itself naturally to this semi-rational methodology. When a quantitative phenotype such as growth rate under stress is considered, regression modeling using multivariate techniques such as partial least squares (PLS) is often used to identify metabolites correlated with the target phenotype. However, linear modeling techniques such as PLS require a consistent metabolite-phenotype trend across the samples, which may not be the case when outliers or multiple conflicting trends are present in the data. To address this, we proposed a data-mining strategy that utilizes random sample consensus (RANSAC) to select subsets of samples with consistent trends for construction of better regression models. By applying a combination of RANSAC and PLS (RANSAC-PLS) to a dataset from a previous study (gas chromatography/mass spectrometry metabolomics data and 1-butanol tolerance of 19 yeast mutant strains), new metabolites were indicated to be correlated with tolerance within certain subsets of the samples. The relevance of these metabolites to 1-butanol tolerance was then validated using single-deletion strains of the corresponding metabolic genes. The results showed that RANSAC-PLS is a promising strategy to identify unique metabolites that provide additional hints for phenotype improvement, which could not be detected by traditional PLS modeling using the entire dataset. Copyright © 2016 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.
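The idea of combining RANSAC with PLS can be sketched as a generic consensus loop wrapped around a small single-response PLS fit. The abstract does not give the study's sampling scheme, residual threshold or scoring rule, so every parameter below (`subset_size`, `threshold`, `n_trials`) and both function names are assumptions made for illustration only.

```python
import numpy as np

def fit_pls1(X, y, n_components):
    """Single-response NIPALS PLS; returns (beta, x_mean, y_mean)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        w = w / np.linalg.norm(w)
        t = Xk @ w
        p = Xk.T @ t / (t @ t)
        c = (yk @ t) / (t @ t)
        Xk, yk = Xk - np.outer(t, p), yk - c * t
        W.append(w); P.append(p); q.append(c)
    W, P = np.column_stack(W), np.column_stack(P)
    beta = W @ np.linalg.solve(P.T @ W, np.array(q))
    return beta, x_mean, y_mean

def ransac_pls(X, y, n_components=1, subset_size=5, threshold=1.0,
               n_trials=200, seed=0):
    """Keep the largest set of samples that agree with a PLS model fitted
    on a random subset, then refit PLS on that consensus set."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(y), dtype=bool)
    for _ in range(n_trials):
        idx = rng.choice(len(y), size=subset_size, replace=False)
        beta, xm, ym = fit_pls1(X[idx], y[idx], n_components)
        # Samples whose absolute residual is below the threshold are inliers.
        inliers = np.abs((X - xm) @ beta + ym - y) < threshold
        if inliers.sum() > best.sum():
            best = inliers
    if best.sum() <= n_components:
        raise ValueError("no usable consensus set found")
    return best, fit_pls1(X[best], y[best], n_components)
```

The returned boolean mask marks the samples sharing one consistent trend; rerunning the loop on the remaining samples would be one way to look for a second, conflicting trend.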
Real‐time monitoring and control of the load phase of a protein A capture step
Rüdt, Matthias; Brestrich, Nina; Rolinger, Laura
2016-01-01
ABSTRACT The load phase in preparative Protein A capture steps is commonly not controlled in real‐time. The load volume is generally based on an offline quantification of the monoclonal antibody (mAb) prior to loading and on a conservative column capacity determined by resin‐life time studies. While this results in a reduced productivity in batch mode, the bottleneck of suitable real‐time analytics has to be overcome in order to enable continuous mAb purification. In this study, Partial Least Squares Regression (PLS) modeling on UV/Vis absorption spectra was applied to quantify mAb in the effluent of a Protein A capture step during the load phase. A PLS model based on several breakthrough curves with variable mAb titers in the HCCF was successfully calibrated. The PLS model predicted the mAb concentrations in the effluent of a validation experiment with a root mean square error (RMSE) of 0.06 mg/mL. The information was applied to automatically terminate the load phase, when a product breakthrough of 1.5 mg/mL was reached. In a second part of the study, the sensitivity of the method was further increased by only considering small mAb concentrations in the calibration and by subtracting an impurity background signal. The resulting PLS model exhibited a RMSE of prediction of 0.01 mg/mL and was successfully applied to terminate the load phase, when a product breakthrough of 0.15 mg/mL was achieved. The proposed method has hence potential for the real‐time monitoring and control of capture steps at large scale production. This might enhance the resin capacity utilization, eliminate time‐consuming offline analytics, and contribute to the realization of continuous processing. Biotechnol. Bioeng. 2017;114: 368–373. © 2016 The Authors. Biotechnology and Bioengineering published by Wiley Periodicals, Inc. PMID:27543789
2013-01-01
Background: Given the serious threats posed to terrestrial ecosystems by industrial contamination, environmental monitoring is a standard procedure used for assessing the current status of an environment or trends in environmental parameters. Measurement of metal concentrations at different trophic levels followed by their statistical analysis using exploratory multivariate methods can provide meaningful information on the status of environmental quality. In this context, the present paper proposes a novel chemometric approach to standard statistical methods by combining block clustering with partial least squares (PLS) analysis to investigate the accumulation patterns of metals in anthropized terrestrial ecosystems. The present study focused on copper, zinc, manganese, iron, cobalt, cadmium, nickel, and lead transfer along a soil-plant-snail food chain, and the hepatopancreas of the Roman snail (Helix pomatia) was used as a biological end-point of metal accumulation. Results: Block clustering delineates between the areas exposed to industrial and vehicular contamination. The toxic metals have similar distributions in the nettle leaves and snail hepatopancreas. PLS analysis showed that (1) zinc and copper concentrations at the lower trophic levels are the most important latent factors that contribute to metal accumulation in land snails; (2) cadmium and lead are the main determinants of the pollution pattern in areas exposed to industrial contamination; (3) at the sites located near roads, lead is the most threatening metal for terrestrial ecosystems. Conclusion: There were three major benefits of applying block clustering with PLS for processing the obtained data. Firstly, it helped in grouping sites depending on the type of contamination. Secondly, it was valuable for identifying the latent factors that contribute the most to metal accumulation in land snails. Finally, it optimized the number and type of data that are best for monitoring the status of metallic contamination in terrestrial ecosystems exposed to different kinds of anthropic pollution. PMID:23987502
NASA Astrophysics Data System (ADS)
Hegazy, Maha A.; Lotfy, Hayam M.; Mowaka, Shereen; Mohamed, Ekram Hany
2016-07-01
Wavelets have been adapted for a vast number of signal-processing applications due to the amount of information that can be extracted from a signal. In this work, a comparative study was conducted on the efficiency of the continuous wavelet transform (CWT) as a signal processing tool in univariate regression and as a pre-processing tool in multivariate analysis using partial least squares (CWT-PLS). These were applied to complex spectral signals of ternary and quaternary mixtures. The CWT-PLS method succeeded in the simultaneous determination of a quaternary mixture of drotaverine (DRO), caffeine (CAF), paracetamol (PAR) and p-aminophenol (PAP, the major impurity of paracetamol). In contrast, the univariate CWT failed to simultaneously determine the quaternary mixture components and was able to determine only PAR and PAP, as well as the ternary mixtures of DRO, CAF, and PAR and of CAF, PAR, and PAP. During the calculations of CWT, different wavelet families were tested. The univariate CWT method was validated according to the ICH guidelines. For the development of the CWT-PLS model, a calibration set was prepared by means of an orthogonal experimental design, and the absorption spectra were recorded and processed by CWT. The CWT-PLS model was constructed by regression between the wavelet coefficients and concentration matrices, and validation was performed using both cross-validation and external validation sets. Both methods were successfully applied for determination of the studied drugs in pharmaceutical formulations.
Yehia, Ali M; Mohamed, Heba M
2016-01-05
Three advanced chemometric-assisted spectrophotometric methods, namely Concentration Residuals Augmented Classical Least Squares (CRACLS), Multivariate Curve Resolution-Alternating Least Squares (MCR-ALS) and Principal Component Analysis-Artificial Neural Networks (PCA-ANN), were developed, validated and benchmarked against PLS calibration to resolve the severely overlapped spectra and simultaneously determine Paracetamol (PAR), Guaifenesin (GUA) and Phenylephrine (PHE) in their ternary mixture and in the presence of p-aminophenol (AP), the main degradation product and synthesis impurity of Paracetamol. The analytical performance of the proposed methods was described by percentage recoveries, root mean square error of calibration and standard error of prediction. The four multivariate calibration methods could be directly used without any preliminary separation step and were successfully applied for pharmaceutical formulation analysis, showing no interference from excipients. Copyright © 2015 Elsevier B.V. All rights reserved.
Kaniu, M I; Angeyo, K H; Mwala, A K; Mwangi, F K
2012-08-30
Soil quality assessment (SQA) calls for rapid, simple and affordable but accurate analysis of soil quality indicators (SQIs). Routine methods of soil analysis are tedious and expensive. Energy dispersive X-ray fluorescence and scattering (EDXRFS) spectrometry in conjunction with chemometrics is a potentially powerful method for rapid SQA. In this study, a 25 mCi ¹⁰⁹Cd isotope source XRF spectrometer was used to realize EDXRFS spectrometry of soils. Glycerol (a simulant of "organic" soil solution) and kaolin (a model clay soil) doped with soil micro (Fe, Cu, Zn) and macro (NO₃⁻, SO₄²⁻, H₂PO₄⁻) nutrients were used to train multivariate chemometric calibration models for direct (non-invasive) analysis of SQIs based on partial least squares (PLS) and artificial neural networks (ANN). The techniques were compared for each SQI with respect to speed, robustness, correction ability for matrix effects, and resolution of spectral overlap. The method was then applied to perform direct rapid analysis of SQIs in field soils. A one-way ANOVA test showed no statistical difference at the 95% confidence level between PLS and ANN results compared to reference soil nutrients. PLS was more accurate in analyzing C, N, Na, P and Zn (R² > 0.9, with low SEP of 0.05%, 0.01%, 0.01%, and 1.98 μg g⁻¹, respectively), while ANN was better suited for analysis of Mg, Cu and Fe (R² > 0.9 and SEP of 0.08%, 4.02 μg g⁻¹, and 0.88 μg g⁻¹, respectively). Copyright © 2012 Elsevier B.V. All rights reserved.
Payne, Courtney E; Wolfrum, Edward J
2015-01-01
Obtaining accurate chemical composition and reactivity (measures of carbohydrate release and yield) information for biomass feedstocks in a timely manner is necessary for the commercialization of biofuels. Our objective was to use near-infrared (NIR) spectroscopy and partial least squares (PLS) multivariate analysis to develop calibration models to predict the feedstock composition and the release and yield of soluble carbohydrates generated by a bench-scale dilute acid pretreatment and enzymatic hydrolysis assay. Major feedstocks included in the calibration models are corn stover, sorghum, switchgrass, perennial cool season grasses, rice straw, and miscanthus. We present individual model statistics to demonstrate model performance and validation samples to more accurately measure predictive quality of the models. The PLS-2 model for composition predicts glucan, xylan, lignin, and ash (wt%) with uncertainties similar to primary measurement methods. A PLS-2 model was developed to predict glucose and xylose release following pretreatment and enzymatic hydrolysis. An additional PLS-2 model was developed to predict glucan and xylan yield. PLS-1 models were developed to predict the sum of glucose/glucan and xylose/xylan for release and yield (grams per gram). The release and yield models have higher uncertainties than the primary methods used to develop the models. It is possible to build effective multispecies feedstock models for composition, as well as carbohydrate release and yield. The model for composition is useful for predicting glucan, xylan, lignin, and ash with good uncertainties. The release and yield models have higher uncertainties; however, these models are useful for rapidly screening sample populations to identify unusual samples.
Özbalci, Beril; Boyaci, İsmail Hakkı; Topcu, Ali; Kadılar, Cem; Tamer, Uğur
2013-02-15
The aim of this study was to quantify glucose, fructose, sucrose and maltose contents of honey samples using Raman spectroscopy as a rapid method. Quantification of sugar contents from a single measurement has been considered infeasible because of the molecular similarities between the sugar molecules in the honey matrix. This bottleneck was overcome by coupling Raman spectroscopy with chemometric methods (principal component analysis (PCA) and partial least squares (PLS)) and an artificial neural network (ANN). Model solutions of four sugars were processed with PCA and significant separation was observed. This operation, done with the spectral features by using PLS and ANN methods, led to the discriminant analysis of sugar contents. Models/trained networks were created using a calibration data set and evaluated using a validation data set. The correlation coefficient values between actual and predicted values of glucose, fructose, sucrose and maltose were determined as 0.964, 0.965, 0.968 and 0.949 for PLS and 0.965, 0.965, 0.978 and 0.956 for ANN, respectively. The requirement of rapid analysis of sugar contents of commercial honeys has been met by the data processed within this article. Copyright © 2012 Elsevier Ltd. All rights reserved.
Dinç, Erdal; Ustündağ, Ozgür; Baleanu, Dumitru
2010-08-01
The sole use of pyridoxine hydrochloride during treatment of tuberculosis gives rise to pyridoxine deficiency. Therefore, a combination of pyridoxine hydrochloride and isoniazid is used in pharmaceutical dosage form in tuberculosis treatment to reduce this side effect. In this study, two chemometric methods, partial least squares (PLS) and principal component regression (PCR), were applied to the simultaneous determination of pyridoxine (PYR) and isoniazid (ISO) in their tablets. A concentration training set of 20 different binary mixtures of PYR and ISO was randomly prepared in 0.1 M HCl. Both multivariate calibration models were constructed using the relationships between the concentration data set (concentration data matrix) and the absorbance data matrix in the spectral region 200-330 nm. The accuracy and the precision of the proposed chemometric methods were validated by analyzing synthetic mixtures containing the investigated drugs. The recovery results obtained by applying PCR and PLS calibrations to the artificial mixtures were between 100.0 and 100.7%. Satisfactory results were obtained by applying the PLS and PCR methods to both artificial and commercial samples. These results strongly encourage the use of both methods for the quality control and routine analysis of marketed tablets containing PYR and ISO. Copyright © 2010 John Wiley & Sons, Ltd.
El Alami El Hassani, Nadia; Tahri, Khalid; Llobet, Eduard; Bouchikhi, Benachir; Errachid, Abdelhamid; Zine, Nadia; El Bari, Nezha
2018-03-15
Moroccan and French honeys from different geographical areas were classified and characterized by applying a voltammetric electronic tongue (VE-tongue) coupled to analytical methods. The studied parameters include color intensity, free, lactonic and total acidity, proteins, phenols, hydroxymethylfurfural (HMF) content, sucrose, reducing and total sugars. The geographical classification of different honeys was developed through three pattern recognition techniques: principal component analysis (PCA), support vector machines (SVMs) and hierarchical cluster analysis (HCA). Honey characterization was achieved by partial least squares modeling (PLS). All the PLS models developed were able to accurately estimate the correct values of the parameters analyzed using as input the voltammetric experimental data (i.e. r>0.9). This confirms the potential ability of the VE-tongue for performing a rapid characterization of honeys via PLS, with an uncomplicated, cost-effective sample preparation process that does not require additional chemicals. Copyright © 2017 Elsevier Ltd. All rights reserved.
Enhancement of partial robust M-regression (PRM) performance using Bisquare weight function
NASA Astrophysics Data System (ADS)
Mohamad, Mazni; Ramli, Norazan Mohamed; Ghani@Mamat, Nor Azura Md; Ahmad, Sanizah
2014-09-01
Partial Least Squares (PLS) regression is a popular regression technique for handling multicollinearity in low and high dimensional data which fits a linear relationship between sets of explanatory and response variables. Several robust PLS methods have been proposed as alternatives to the classical PLS algorithms, which are easily affected by the presence of outliers. The most recent one is called partial robust M-regression (PRM). Unfortunately, the monotonic weighting function used in the PRM algorithm fails to assign appropriate weights to large outliers according to their severity. Thus, in this paper, a modified partial robust M-regression is introduced to enhance the performance of the original PRM. A re-descending weight function, known as the Bisquare weight function, is recommended to replace the Fair function in the PRM. A simulation study is done to assess the performance of the modified PRM, and its efficiency is also tested on both contaminated and uncontaminated simulated data under various percentages of outliers, sample sizes and numbers of predictors.
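The difference between the two weight functions named in this abstract can be seen in a short sketch. It assumes residuals already standardized by a robust scale estimate, the Fair weight in the squared form commonly quoted for PRM, and the usual tuning constants (4 for Fair, 4.685 for Bisquare); the abstract states none of these, so they are illustrative assumptions.

```python
import numpy as np

def fair_weight(r, c=4.0):
    """Fair weight, squared form (assumed): decreases with |r| but never reaches zero."""
    r = np.asarray(r, dtype=float)
    return 1.0 / (1.0 + np.abs(r / c)) ** 2

def bisquare_weight(r, c=4.685):
    """Tukey bisquare weight: re-descending, exactly zero once |r| >= c."""
    r = np.asarray(r, dtype=float)
    u = r / c
    return np.where(np.abs(u) < 1.0, (1.0 - u ** 2) ** 2, 0.0)

# Standardized residuals from small to very large.
residuals = np.array([0.0, 1.0, 3.0, 6.0, 20.0])
fair = fair_weight(residuals)          # still positive for the residual of 20
bisquare = bisquare_weight(residuals)  # zero for the residuals of 6 and 20
```

Under these assumptions a gross outlier keeps a small but non-zero influence with the Fair function, whereas the Bisquare function removes it from the fit entirely.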
Many multivariate methods are used in describing and predicting relation; each has its unique usage of categorical and non-categorical data. In multivariate analysis of variance (MANOVA), many response variables (y's) are related to many independent variables that are categorical...
NASA Astrophysics Data System (ADS)
Jiang, Junjun; Hu, Ruimin; Han, Zhen; Wang, Zhongyuan; Chen, Jun
2013-10-01
Face superresolution (SR), or face hallucination, refers to the technique of generating a high-resolution (HR) face image from a low-resolution (LR) one with the help of a set of training examples. It aims at transcending the limitations of electronic imaging systems. Applications of face SR include video surveillance, in which the individual of interest is often far from cameras. A two-step method is proposed to infer a high-quality and HR face image from a low-quality and LR observation. First, we establish the nonlinear relationship between LR face images and HR ones, according to radial basis function and partial least squares (RBF-PLS) regression, to transform the LR face into the global face space. Then, a locality-induced sparse representation (LiSR) approach is presented to enhance the local facial details once all the global faces for each LR training face are constructed. A comparison of some state-of-the-art SR methods shows the superiority of the proposed two-step approach, RBF-PLS global face regression followed by LiSR-based local patch reconstruction. Experiments also demonstrate the effectiveness under both simulation conditions and some real conditions.
NASA Astrophysics Data System (ADS)
Gholizadeh, H.; Robeson, S. M.
2015-12-01
Empirical models have been widely used to estimate global chlorophyll content from remotely sensed data. Here, we focus on the standard NASA empirical models that use blue-green band ratios. These band ratio ocean color (OC) algorithms are in the form of fourth-order polynomials and the parameters of these polynomials (i.e. coefficients) are estimated from the NASA bio-Optical Marine Algorithm Data set (NOMAD). Most of the points in this data set have been sampled from tropical and temperate regions. However, polynomial coefficients obtained from this data set are used to estimate chlorophyll content in all ocean regions with different properties such as sea-surface temperature, salinity, and downwelling/upwelling patterns. Further, the polynomial terms in these models are highly correlated. In sum, the limitations of these empirical models are as follows: 1) the independent variables within the empirical models, in their current form, are correlated (multicollinear), and 2) current algorithms are global approaches and are based on the spatial stationarity assumption, so they are independent of location. Multicollinearity problem is resolved by using partial least squares (PLS). PLS, which transforms the data into a set of independent components, can be considered as a combined form of principal component regression (PCR) and multiple regression. Geographically weighted regression (GWR) is also used to investigate the validity of spatial stationarity assumption. GWR solves a regression model over each sample point by using the observations within its neighbourhood. PLS results show that the empirical method underestimates chlorophyll content in high latitudes, including the Southern Ocean region, when compared to PLS (see Figure 1). Cluster analysis of GWR coefficients also shows that the spatial stationarity assumption in empirical models is not likely a valid assumption.
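The multicollinearity among polynomial terms mentioned in this abstract can be made concrete with a small numerical sketch. The predictor values below are a synthetic, evenly spaced grid chosen purely for illustration; they are not NOMAD data, and `polynomial_design` is a hypothetical helper name.

```python
import numpy as np

def polynomial_design(x, degree=4):
    """Columns x, x^2, ..., x^degree, as used by a fourth-order polynomial model."""
    x = np.asarray(x, dtype=float)
    return np.column_stack([x ** k for k in range(1, degree + 1)])

# Synthetic stand-in for a band-ratio predictor (assumed range, not real data).
x = np.linspace(0.2, 1.0, 200)
terms = polynomial_design(x)

# Pairwise correlations between the polynomial terms.
corr = np.corrcoef(terms, rowvar=False)

# Condition number of the centred design matrix; large values signal
# that ordinary least-squares coefficients are poorly determined.
cond = np.linalg.cond(terms - terms.mean(axis=0))
```

On this grid every pair of terms is strongly correlated, which is the situation in which projecting onto a few latent components, as PLS does, tends to give more stable coefficients than regressing on the raw terms.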
NASA Astrophysics Data System (ADS)
Yan, Hong; Song, Xiangzhong; Tian, Kuangda; Chen, Yilin; Xiong, Yanmei; Min, Shungeng
2018-02-01
A novel method based on mid-infrared (MIR) spectroscopy, which enables the determination of Chlorantraniliprole in Abamectin within minutes, is proposed. We further evaluate the prediction ability of four wavelength selection methods: bootstrapping soft shrinkage (BOSS), Monte Carlo uninformative variable elimination (MCUVE), genetic algorithm partial least squares (GA-PLS) and competitive adaptive reweighted sampling (CARS). The results showed that the BOSS method obtained the lowest root mean squared error of cross-validation (RMSECV) (0.0245) and root mean squared error of prediction (RMSEP) (0.0271), as well as the highest coefficient of determination of cross-validation (Q²cv) (0.9998) and coefficient of determination of the test set (Q²test) (0.9989), which demonstrated that mid-infrared spectroscopy can be used to detect Chlorantraniliprole in Abamectin conveniently. Meanwhile, a suitable wavelength selection method (BOSS) is essential to conducting a component spectral analysis.
NASA Astrophysics Data System (ADS)
Guo, Yugao; Zhao, He; Han, Yelin; Liu, Xia; Guan, Shan; Zhang, Qingyin; Bian, Xihui
2017-02-01
A simultaneous spectrophotometric determination method for trace heavy metal ions based on solid-phase extraction coupled with partial least squares approaches was developed. In the proposed method, trace metal ions in aqueous samples were adsorbed by cation exchange fibers and desorbed from the fibers by acidic solution. After the ion preconcentration process, the enriched solution was measured with an ultraviolet and visible (UV-Vis) spectrophotometer. Then, the concentrations of heavy metal ions were quantified by analyzing the ultraviolet and visible spectrum with the help of partial least squares (PLS) approaches. Under the optimal conditions of operation time, flow rate and detection parameters, the overlapped absorption peaks of mixed ions were obtained. The experimental data showed that the concentration of each metal ion, which can be calculated by chemometric methods, increased significantly. The heavy metal ions can be enriched more than 80-fold. The limits of detection (LOD) for the target analytes in a mixture of copper ions (Cu²⁺), cobalt ions (Co²⁺) and nickel ions (Ni²⁺) were 0.10 μg L⁻¹, 0.15 μg L⁻¹ and 0.13 μg L⁻¹, respectively. The relative standard deviations (RSD) were less than 5%. The solid-phase extraction can enrich the ions efficiently, and the combined method of spectrophotometric detection and PLS can evaluate the ion concentrations accurately. The work proposed here is an interesting and promising attempt at trace ion determination in water samples and is expected to find wider application.
Guo, Yugao; Zhao, He; Han, Yelin; Liu, Xia; Guan, Shan; Zhang, Qingyin; Bian, Xihui
2017-02-15
A simultaneous spectrophotometric determination method for trace heavy metal ions based on solid-phase extraction coupled with partial least squares approaches was developed. In the proposed method, trace metal ions in aqueous samples were adsorbed by cation exchange fibers and desorbed from the fibers by acidic solution. After the ion preconcentration process, the enriched solution was measured with an ultraviolet and visible (UV-Vis) spectrophotometer. Then, the concentrations of heavy metal ions were quantified by analyzing the ultraviolet and visible spectrum with the help of partial least squares (PLS) approaches. Under the optimal conditions of operation time, flow rate and detection parameters, the overlapped absorption peaks of mixed ions were obtained. The experimental data showed that the concentration of each metal ion, which can be calculated through chemometric methods, increased significantly. The heavy metal ions can be enriched more than 80-fold. The limits of detection (LOD) for the target analytes in a mixture of copper ions (Cu2+), cobalt ions (Co2+) and nickel ions (Ni2+) were 0.10 μg L-1, 0.15 μg L-1 and 0.13 μg L-1, respectively. The relative standard deviations (RSD) were less than 5%. The solid-phase extraction enriches the ions efficiently, and the combination of spectrophotometric detection and PLS evaluates the ion concentrations accurately. The proposed work is a promising approach for determining trace ions in water samples and may find wider application. Copyright © 2016 Elsevier B.V. All rights reserved.
Hou, Siyuan; Riley, Christopher B; Mitchell, Cynthia A; Shaw, R Anthony; Bryanton, Janet; Bigsby, Kathryn; McClure, J Trenton
2015-09-01
Immunoglobulin G (IgG) is crucial for the protection of the host from invasive pathogens. Due to its importance for human health, tools that enable the monitoring of IgG levels are highly desired. Consequently there is a need for methods to determine the IgG concentration that are simple, rapid, and inexpensive. This work explored the potential of attenuated total reflectance (ATR) infrared spectroscopy as a method to determine IgG concentrations in human serum samples. Venous blood samples were collected from adults and children, and from the umbilical cord of newborns. The serum was harvested and tested using ATR infrared spectroscopy. Partial least squares (PLS) regression provided the basis to develop the new analytical methods. Three PLS calibrations were determined: one for the combined set of the venous and umbilical cord serum samples, the second for only the umbilical cord samples, and the third for only the venous samples. The number of PLS factors was chosen by critical evaluation of Monte Carlo-based cross validation results. The predictive performance for each PLS calibration was evaluated using the Pearson correlation coefficient, scatter plot and Bland-Altman plot, and percent deviations for independent prediction sets. The repeatability was evaluated by standard deviation and relative standard deviation. The results showed that ATR infrared spectroscopy is potentially a simple, quick, and inexpensive method to measure IgG concentrations in human serum samples. The results also showed that it is possible to build a united calibration curve for the umbilical cord and the venous samples. Copyright © 2015 Elsevier B.V. All rights reserved.
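The Bland-Altman evaluation mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the common convention of reporting the mean difference (bias) and limits of agreement at bias ± 1.96 sample standard deviations; the abstract does not state which limits the authors used, and the function name and numbers below are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(reference, predicted):
    """Return (bias, lower_limit, upper_limit) for paired measurements.

    Assumption: limits of agreement are bias +/- 1.96 * sample SD of the
    differences, a common convention that the abstract does not confirm.
    """
    if len(reference) != len(predicted) or len(reference) < 2:
        raise ValueError("need two equal-length sequences with at least 2 pairs")
    # Difference between the spectroscopic prediction and the reference assay.
    diffs = [p - r for r, p in zip(reference, predicted)]
    bias = mean(diffs)
    spread = 1.96 * stdev(diffs)
    return bias, bias - spread, bias + spread


# Made-up values for illustration only (not data from the study).
ref = [10.0, 12.0, 14.0, 16.0]
pred = [10.5, 11.5, 14.5, 15.5]
bias, lower, upper = bland_altman(ref, pred)
```

A bias near zero with narrow limits would suggest that the two methods can be used interchangeably for the measured range.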
NASA Astrophysics Data System (ADS)
Darwish, Hany W.; Hassan, Said A.; Salem, Maissa Y.; El-Zeany, Badr A.
2016-02-01
Two advanced, accurate and precise chemometric methods are developed for the simultaneous determination of amlodipine besylate (AML) and atorvastatin calcium (ATV) in the presence of their acidic degradation products in tablet dosage forms. The first method was Partial Least Squares (PLS-1) and the second was Artificial Neural Networks (ANN). PLS was compared to ANN models with and without variable selection procedure (genetic algorithm (GA)). For proper analysis, a 5-factor 5-level experimental design was established resulting in 25 mixtures containing different ratios of the interfering species. Fifteen mixtures were used as calibration set and the other ten mixtures were used as validation set to validate the prediction ability of the suggested models. The proposed methods were successfully applied to the analysis of pharmaceutical tablets containing AML and ATV. The methods indicated the ability of the mentioned models to solve the highly overlapped spectra of the quinary mixture, yet using inexpensive and easy to handle instruments like the UV-VIS spectrophotometer.
NASA Astrophysics Data System (ADS)
Wu, Di; He, Yong
2007-11-01
The aim of this study is to investigate the potential of the visible and near infrared spectroscopy (Vis/NIRS) technique for non-destructive measurement of soluble solids content (SSC) in grape juice beverage. A total of 380 samples were studied. Savitzky-Golay smoothing and standard normal variate were applied for the pre-processing of spectral data. Least-squares support vector machines (LS-SVM) with an RBF kernel function was applied to develop the SSC prediction model based on the Vis/NIRS absorbance data. The determination coefficient for prediction (Rp2) of the LS-SVM model was 0.962 and the root mean square error of prediction (RMSEP) was 0.434137. It is concluded that the Vis/NIRS technique can quantify the SSC of grape juice beverage quickly and non-destructively. The LS-SVM model was also compared with PLS and back propagation neural network (BP-NN) methods. The results showed that LS-SVM was superior to the conventional linear and non-linear methods in predicting SSC of grape juice beverage. The generalization ability of the LS-SVM, PLS and BP-NN models was also investigated. It is concluded that the LS-SVM regression method is a promising technique for chemometrics in quantitative prediction.
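The standard normal variate pre-processing named in the abstract above can be sketched as follows. This is a minimal illustration, assuming the usual definition (each spectrum is centered on its own mean and divided by its own sample standard deviation); the function names are hypothetical and the abstract does not describe the authors' implementation.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: center and scale ONE spectrum by its own
    mean and sample standard deviation (row-wise, not column-wise)."""
    m = mean(spectrum)
    s = stdev(spectrum)
    if s == 0:
        raise ValueError("flat spectrum cannot be scaled")
    return [(x - m) / s for x in spectrum]


def snv_all(spectra):
    """Apply SNV independently to every spectrum in a list of spectra."""
    return [snv(s) for s in spectra]


# Toy absorbance values for illustration only.
corrected = snv_all([[1.0, 2.0, 3.0], [10.0, 20.0, 30.0]])
```

Because the scaling is done per spectrum, two spectra that differ only by a multiplicative factor or constant offset map to the same corrected spectrum, which is the usual motivation for using SNV against scatter effects.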
Guo, Wei-Liang; Du, Yi-Ping; Zhou, Yong-Can; Yang, Shuang; Lu, Jia-Hui; Zhao, Hong-Yu; Wang, Yao; Teng, Li-Rong
2012-03-01
An analytical procedure has been developed for at-line (fast off-line) monitoring of 4 key parameters, namely nisin titer (NT), the concentration of reducing sugars, cell concentration and pH, during a nisin fermentation process. This procedure is based on near infrared (NIR) spectroscopy and Partial Least Squares (PLS). Samples without any preprocessing were collected at intervals of 1 h during fifteen batches of fermentation. These fermentation processes were implemented in 3 different 5 L fermentors at various conditions. NIR spectra of the samples were collected in 10 min. PLS was then used for modeling the relationship between the NIR spectra and the key parameters, which were determined by reference methods. Monte Carlo Partial Least Squares (MCPLS) was applied to identify the outliers and to select the most efficacious methods for preprocessing spectra, the wavelengths and the suitable number of latent variables (nLV). Then, the optimum models for determining NT, concentration of reducing sugars, cell concentration and pH were established. The correlation coefficients of the calibration set (Rc) were 0.8255, 0.9000, 0.9883 and 0.9581, respectively. These results demonstrated that this method can be successfully applied to at-line monitoring of NT, concentration of reducing sugars, cell concentration and pH during nisin fermentation processes.
Freye, Chris E; Fitz, Brian D; Billingsley, Matthew C; Synovec, Robert E
2016-06-01
The chemical composition and several physical properties of RP-1 fuels were studied using comprehensive two-dimensional (2D) gas chromatography (GC×GC) coupled with flame ionization detection (FID). A "reversed column" GC×GC configuration was implemented with a RTX-wax column on the first dimension (¹D), and a RTX-1 as the second dimension (²D). Modulation was achieved using a high temperature diaphragm valve mounted directly in the oven. Using leave-one-out cross-validation (LOOCV), the summed GC×GC-FID signal of three compound-class selective 2D regions (alkanes, cycloalkanes, and aromatics) was regressed against previously measured ASTM derived values for these compound classes, yielding root mean square errors of cross validation (RMSECV) of 0.855, 0.734, and 0.530 mass%, respectively. For comparison, using partial least squares (PLS) analysis with LOOCV, the GC×GC-FID signal of the entire 2D separations was regressed against the same ASTM values, giving a linear trend for the three compound classes (alkanes, cycloalkanes, and aromatics) with RMSECV values of 1.52, 2.76, and 0.945 mass%, respectively. Additionally, a more detailed PLS analysis was undertaken of the compound classes (n-alkanes, iso-alkanes, mono-, di-, and tri-cycloalkanes, and aromatics), and of physical properties previously determined by ASTM methods (such as net heat of combustion, hydrogen content, density, kinematic viscosity, sustained boiling temperature and vapor rise temperature). Results from these PLS studies using the relatively simple to use and inexpensive GC×GC-FID instrumental platform are compared to previously reported results using the GC×GC-TOFMS instrumental platform. Copyright © 2016 Elsevier B.V. All rights reserved.
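The leave-one-out RMSECV reported in the abstract above can be illustrated with a small sketch. It assumes a univariate ordinary least squares line (one summed signal regressed against one reference value), which is a simplification chosen for illustration; the function names and numbers are hypothetical and do not reproduce the authors' calculation.

```python
from math import sqrt


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values are all identical")
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return slope, my - slope * mx


def rmsecv_loo(x, y):
    """Leave-one-out cross-validation: refit without sample i, predict
    sample i, then take the root mean square of the held-out errors."""
    squared_errors = []
    for i in range(len(x)):
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        squared_errors.append((y[i] - (slope * x[i] + intercept)) ** 2)
    return sqrt(sum(squared_errors) / len(squared_errors))


# Made-up signal/reference pairs for illustration only.
signal = [1.0, 2.0, 3.0, 4.0]
reference = [3.0, 5.5, 6.5, 9.0]
error = rmsecv_loo(signal, reference)
```

The same loop structure applies when the inner model is a PLS regression; only the fit and predict steps change.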
González-Sáiz, J M; Esteban-Díez, I; Sánchez-Gallardo, C; Pizarro, C
2008-08-01
Wastes and by-products of the onion-processing industry pose an increasing disposal and environmental problem and represent a loss of valuable sources of nutrients. The present study focused on the production of vinegar from worthless onions as a potential valorisation route which could provide a viable solution to multiple disposal and environmental problems, simultaneously offering the possibility of converting waste materials into a useful food-grade product and of exploiting the unique properties and health benefits of onions. This study deals specifically with the second and definitive step of the onion vinegar production process: the efficient production of vinegar from onion waste by transforming onion ethanol, previously produced by alcoholic fermentation, into acetic acid via acetic fermentation. Near-infrared spectroscopy (NIRS), coupled with multivariate calibration methods, has been used to monitor the concentrations of both substrates and products in acetic fermentation. Separate partial least squares (PLS) regression models, correlating NIR spectral data of fermentation samples with each kinetic parameter studied, were developed. Wavelength selection was also performed applying the iterative predictor weighting-PLS (IPW-PLS) method in order to only consider significant spectral features in each model development to improve the quality of the final models constructed. Biomass, substrate (ethanol) and product (acetic acid) concentration were predicted in the acetic fermentation of onion alcohol with high accuracy using IPW-PLS models with a root-mean-square error of the residuals in external prediction (RMSEP) lower than 2.5% for both ethanol and acetic acid, and an RMSEP of 6.1% for total biomass concentration (a very satisfactory result considering the relatively low precision and accuracy associated with the reference method used for determining the latter). 
Thus, the simple and reliable calibration models proposed in this study suggest that they could be implemented in routine applications to monitor and predict the key species involved in the acetic fermentation of onion alcohol, allowing the onion vinegar production process to be controlled in real time.
Effective diagnosis of Alzheimer's disease by means of large margin-based methodology.
Chaves, Rosa; Ramírez, Javier; Górriz, Juan M; Illán, Ignacio A; Gómez-Río, Manuel; Carnero, Cristobal
2012-07-31
Functional brain images such as Single-Photon Emission Computed Tomography (SPECT) and Positron Emission Tomography (PET) have been widely used to guide the clinicians in the Alzheimer's Disease (AD) diagnosis. However, the subjectivity involved in their evaluation has favoured the development of Computer Aided Diagnosis (CAD) Systems. It is proposed a novel combination of feature extraction techniques to improve the diagnosis of AD. Firstly, Regions of Interest (ROIs) are selected by means of a t-test carried out on 3D Normalised Mean Square Error (NMSE) features restricted to be located within a predefined brain activation mask. In order to address the small sample-size problem, the dimension of the feature space was further reduced by: Large Margin Nearest Neighbours using a rectangular matrix (LMNN-RECT), Principal Component Analysis (PCA) or Partial Least Squares (PLS) (the two latter also analysed with a LMNN transformation). Regarding the classifiers, kernel Support Vector Machines (SVMs) and LMNN using Euclidean, Mahalanobis and Energy-based metrics were compared. Several experiments were conducted in order to evaluate the proposed LMNN-based feature extraction algorithms and its benefits as: i) linear transformation of the PLS or PCA reduced data, ii) feature reduction technique, and iii) classifier (with Euclidean, Mahalanobis or Energy-based methodology). The system was evaluated by means of k-fold cross-validation yielding accuracy, sensitivity and specificity values of 92.78%, 91.07% and 95.12% (for SPECT) and 90.67%, 88% and 93.33% (for PET), respectively, when a NMSE-PLS-LMNN feature extraction method was used in combination with a SVM classifier, thus outperforming recently reported baseline methods. All the proposed methods turned out to be a valid solution for the presented problem. 
One advance is the robustness of the LMNN algorithm, which not only provides a higher separation rate between the classes but also (in combination with NMSE and PLS) makes the variation of this rate more stable. Another advance is the generalization ability of the methods, since several experiments were performed on two image modalities (SPECT and PET).
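The accuracy, sensitivity and specificity figures quoted in the abstract above follow from confusion-matrix counts. The sketch below shows the standard definitions on made-up labels; the function name and data are hypothetical, and it does not reproduce the study's k-fold procedure.

```python
def classification_metrics(true_labels, predicted_labels, positive=1):
    """Return (accuracy, sensitivity, specificity) for binary labels."""
    tp = fp = tn = fn = 0
    for t, p in zip(true_labels, predicted_labels):
        if t == positive and p == positive:
            tp += 1  # patient correctly flagged
        elif t == positive:
            fn += 1  # patient missed
        elif p == positive:
            fp += 1  # control wrongly flagged
        else:
            tn += 1  # control correctly cleared
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in true_labels")
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return accuracy, sensitivity, specificity


# Made-up labels for illustration only (1 = patient, 0 = control).
truth = [1, 1, 1, 0, 0, 0, 0, 1]
guess = [1, 1, 0, 0, 0, 1, 0, 1]
acc, sens, spec = classification_metrics(truth, guess)
```

In a k-fold setting these counts would typically be accumulated over the held-out folds before the ratios are taken.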
Prediction model of sinoatrial node field potential using high order partial least squares.
Feng, Yu; Cao, Hui; Zhang, Yanbin
2015-01-01
High order partial least squares (HOPLS) is a novel data processing method. It is highly suitable for building prediction model which has tensor input and output. The objective of this study is to build a prediction model of the relationship between sinoatrial node field potential and high glucose using HOPLS. The three sub-signals of the sinoatrial node field potential made up the model's input. The concentration and the actuation duration of high glucose made up the model's output. The results showed that on the premise of predicting two dimensional variables, HOPLS had the same predictive ability and a lower dispersion degree compared with partial least squares (PLS).
NASA Astrophysics Data System (ADS)
Belal, F.; Ibrahim, F.; Sheribah, Z. A.; Alaa, H.
2018-06-01
In this paper, novel univariate and multivariate regression methods along with a model-updating technique were developed and validated for the simultaneous determination of a quaternary mixture of imatinib (IMB), gemifloxacin (GMI), nalbuphine (NLP) and naproxen (NAP). The univariate method is extended derivative ratio (EDR), which depends on measuring every drug in the quaternary mixture by using a ternary mixture of the other three drugs as divisor. Peak amplitudes were measured at 294 nm, 250 nm, 283 nm and 239 nm within linear concentration ranges of 4.0-17.0, 3.0-15.0, 4.0-80.0 and 1.0-6.0 μg mL-1 for IMB, GMI, NLP and NAP, respectively. The multivariate methods adopted are partial least squares (PLS) in original and derivative mode. These models were constructed for simultaneous determination of the studied drugs in the ranges of 4.0-8.0, 3.0-11.0, 10.0-18.0 and 1.0-3.0 μg mL-1 for IMB, GMI, NLP and NAP, respectively, by using eighteen mixtures as a calibration set and seven mixtures as a validation set. The root mean square errors of prediction (RMSEP) were 0.09 and 0.06 for IMB, 0.14 and 0.13 for GMI, 0.07 and 0.02 for NLP and 0.64 and 0.27 for NAP by PLS in original and derivative mode, respectively. Both models were successfully applied for analysis of IMB, GMI, NLP and NAP in their dosage forms. Updated PLS in derivative mode and EDR were applied for determination of the studied drugs in spiked human urine. The obtained results were statistically compared with those obtained by the reported methods, leading to the conclusion that there is no significant difference regarding accuracy and precision.
Belal, F; Ibrahim, F; Sheribah, Z A; Alaa, H
2018-06-05
In this paper, novel univariate and multivariate regression methods along with a model-updating technique were developed and validated for the simultaneous determination of a quaternary mixture of imatinib (IMB), gemifloxacin (GMI), nalbuphine (NLP) and naproxen (NAP). The univariate method is extended derivative ratio (EDR), which depends on measuring every drug in the quaternary mixture by using a ternary mixture of the other three drugs as divisor. Peak amplitudes were measured at 294 nm, 250 nm, 283 nm and 239 nm within linear concentration ranges of 4.0-17.0, 3.0-15.0, 4.0-80.0 and 1.0-6.0 μg mL-1 for IMB, GMI, NLP and NAP, respectively. The multivariate methods adopted are partial least squares (PLS) in original and derivative mode. These models were constructed for simultaneous determination of the studied drugs in the ranges of 4.0-8.0, 3.0-11.0, 10.0-18.0 and 1.0-3.0 μg mL-1 for IMB, GMI, NLP and NAP, respectively, by using eighteen mixtures as a calibration set and seven mixtures as a validation set. The root mean square errors of prediction (RMSEP) were 0.09 and 0.06 for IMB, 0.14 and 0.13 for GMI, 0.07 and 0.02 for NLP and 0.64 and 0.27 for NAP by PLS in original and derivative mode, respectively. Both models were successfully applied for analysis of IMB, GMI, NLP and NAP in their dosage forms. Updated PLS in derivative mode and EDR were applied for determination of the studied drugs in spiked human urine. The obtained results were statistically compared with those obtained by the reported methods, leading to the conclusion that there is no significant difference regarding accuracy and precision. Copyright © 2018 Elsevier B.V. All rights reserved.
[Detection of Hawthorn Fruit Defects Using Hyperspectral Imaging].
Liu, De-hua; Zhang, Shu-juan; Wang, Bin; Yu, Ke-qiang; Zhao, Yan-ru; He, Yong
2015-11-01
Hyperspectral imaging technology covering the range of 380-1000 nm was employed to detect defects (bruise and insect damage) of hawthorn fruit. A total of 134 samples were collected, which included 46 bruised fruit, 30 insect-damaged fruit, 10 fruit with both bruise and insect damage, and 48 intact fruit. Because calyx/stem-end and bruise/insect damage regions have a similar appearance in RGB images, they could easily be confused. Hence, five types of regions, including bruise, insect damage, sound, calyx, and stem-end, were collected from 230 hawthorn fruits. After acquiring hyperspectral images of the hawthorn fruits, the spectral data were extracted from regions of interest (ROI). Then, several pretreatment methods, namely standard normal variate (SNV), Savitzky-Golay (SG), median filter (MF) and multiplicative scatter correction (MSC), were applied and partial least squares (PLS) models were built to compare their performance. According to the results, SNV was judged the best pretreatment method and was chosen for further analysis. Spectral features of the five different regions were combined with the regression coefficients (RCs) of a partial least squares-discriminant analysis (PLS-DA) model to identify the important wavelengths, and ten wavebands at 483, 563, 645, 671, 686, 722, 777, 819, 837 and 942 nm were selected from all of the wavebands. Using the Kennard-Stone algorithm, all kinds of samples were randomly divided into a training set (173) and a test set (57) according to the proportion of 3:1. Then, a least squares-support vector machine (LS-SVM) discriminant model was established using the selected wavebands. The results showed that the discrimination accuracy of the method was 91.23%. On the other hand, principal component analysis (PCA) was applied to the images at the ten important wavebands. 
Using the "Sobel" operator and the region growing algorithm "Regiongrow", the edge and defect features of 86 hawthorn fruits could be recognized. Lastly, the detection precision for bruised, insect-damaged and two-defect samples was 95.65%, 86.67% and 100%, respectively. This investigation demonstrated that hyperspectral imaging technology could detect the defects of bruise, insect damage, calyx, and stem-end in hawthorn fruit in qualitative analysis and feature detection, which provides a theoretical reference for the nondestructive detection of defects in hawthorn fruit.
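The Kennard-Stone split named in the abstract above can be sketched in its classic form: start from the two most distant samples, then repeatedly add the sample whose nearest already-selected neighbor is farthest away. This is a minimal illustration with Euclidean distance and hypothetical names; the abstract does not describe the variant or the distance measure the authors used.

```python
from math import dist


def kennard_stone(samples, n_select):
    """Return (selected_indices, remaining_indices) using the classic
    Kennard-Stone max-min distance rule (Euclidean distance assumed)."""
    n = len(samples)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and len(samples)")
    # Seed the training set with the two samples that are farthest apart.
    pairs = ((i, j) for i in range(n) for j in range(i + 1, n))
    first, second = max(pairs, key=lambda ij: dist(samples[ij[0]], samples[ij[1]]))
    selected = [first, second]
    remaining = [k for k in range(n) if k not in selected]
    while len(selected) < n_select:
        # Pick the candidate whose closest selected sample is farthest away.
        nxt = max(
            remaining,
            key=lambda k: min(dist(samples[k], samples[s]) for s in selected),
        )
        selected.append(nxt)
        remaining.remove(nxt)
    return selected, remaining


# Toy one-dimensional "spectra" for illustration only.
train_idx, test_idx = kennard_stone([(0.0,), (1.0,), (2.0,), (10.0,)], 3)
```

In this toy case the two extremes are chosen first, then the sample at 2.0, leaving the sample at 1.0 for the test set.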
Mo, Changyeun; Kim, Giyoung; Lee, Kangjin; Kim, Moon S; Cho, Byoung-Kwan; Lim, Jongguk; Kang, Sukwon
2014-04-24
In this study, we developed a viability evaluation method for pepper (Capsicum annuum L.) seeds based on hyperspectral reflectance imaging. The reflectance spectra of pepper seeds in the 400-700 nm range are collected from hyperspectral reflectance images obtained using blue, green, and red LED illumination. A partial least squares-discriminant analysis (PLS-DA) model is developed to classify viable and non-viable seeds. Four spectral ranges generated with four types of LEDs (blue, green, red, and RGB), which were pretreated using various methods, are investigated to develop the classification models. The optimal PLS-DA model based on the standard normal variate for RGB LED illumination (400-700 nm) yields discrimination accuracies of 96.7% and 99.4% for viable seeds and nonviable seeds, respectively. The use of images based on the PLS-DA model with the first-order derivative of a 31.5-nm gap for red LED illumination (600-700 nm) yields 100% discrimination accuracy for both viable and nonviable seeds. The results indicate that a hyperspectral imaging technique based on LED light can be potentially applied to high-quality pepper seed sorting.
NASA Astrophysics Data System (ADS)
Al-Harrasi, Ahmed; Rehman, Najeeb Ur; Mabood, Fazal; Albroumi, Muhammaed; Ali, Liaqat; Hussain, Javid; Hussain, Hidayat; Csuk, René; Khan, Abdul Latif; Alam, Tanveer; Alameri, Saif
2017-09-01
In the present study, for the first time, NIR spectroscopy coupled with PLS regression was developed as a rapid alternative method to quantify the amount of Keto-β-Boswellic Acid (KBA) in different plant parts of Boswellia sacra and the resin exudates of the trunk. NIR spectroscopy was used for the measurement of KBA standards and B. sacra samples in absorption mode in the wavelength range from 700-2500 nm. A PLS regression model was built from the obtained spectral data using 70% of the KBA standards (training set) in the range from 0.1 ppm to 100 ppm. The PLS regression model obtained had an R-square value of 98% with a correlation of 0.99, and showed good prediction with an RMSEP value of 3.2 and a correlation of 0.99. It was then used to quantify the amount of KBA in the samples of B. sacra. The results indicated that the MeOH extract of resin has the highest concentration of KBA (0.6%) followed by essential oil (0.1%). However, no KBA was found in the aqueous extract. The MeOH extract of the resin was subjected to column chromatography to obtain various sub-fractions at different polarities of organic solvents. The sub-fraction at 4% MeOH/CHCl3 (4.1% of KBA) was found to contain the highest percentage of KBA followed by another sub-fraction at 2% MeOH/CHCl3 (2.2% of KBA). The present results also indicated that KBA is only present in the gum-resin of the trunk and not in all parts of the plant. These results were further confirmed through HPLC analysis and therefore it is concluded that NIRS coupled with PLS regression is a rapid alternative method for quantification of KBA in Boswellia sacra. It is non-destructive, rapid, sensitive and uses simple methods of sample preparation.
Hordge, LaQuana N; McDaniel, Kiara L; Jones, Derick D; Fakayode, Sayo O
2016-05-15
The endocrine disruption property of estrogens necessitates effective monitoring and the development of analytical protocols for their analysis in biological and human specimens. This study explores the first combined use of steady-state fluorescence spectroscopy and multivariate partial-least-squares (PLS) regression analysis for the simultaneous determination of the concentrations of two estrogens (17α-ethinylestradiol (EE) and norgestimate (NOR)) in bovine serum albumin (BSA) and human serum albumin (HSA) samples. The influence of EE and NOR concentrations and temperature on the emission spectra of EE-HSA, EE-BSA, NOR-HSA, and NOR-BSA complexes was also investigated. The binding of EE with HSA and BSA resulted in an increase in the emission of HSA and BSA and a significant blue spectral shift. In contrast, the interaction of NOR with HSA and BSA quenched the emission of HSA and BSA. The observed emission spectral shifts preclude the effective use of traditional univariate regression analysis of fluorescence data for the determination of EE and NOR concentrations in HSA and BSA samples. Multivariate PLS regression analysis was utilized to correlate the changes in emission spectra with EE and NOR concentrations in HSA and BSA samples. The figures of merit of the developed PLS regression models were excellent, with limits of detection as low as 1.6×10⁻⁸ M for EE and 2.4×10⁻⁷ M for NOR and good linearity (R2 > 0.994985). The PLS models correctly predicted EE and NOR concentrations in independent validation HSA and BSA samples with a root-mean-square-percent-relative-error (RMS%RE) of less than 6.0% at physiological conditions. In contrast, the use of univariate regression resulted in poor predictions of EE and NOR in HSA and BSA samples, with RMS%RE larger than 40% at physiological conditions. 
High accuracy, low sensitivity, simplicity, and low cost with no prior analyte extraction or separation required make this method a promising, compelling, and attractive alternative for the rapid determination of estrogen concentrations in biomedical and biological specimens, pharmaceuticals, or environmental samples. Published by Elsevier B.V.
Chotimah, Chusnul; Sudjadi; Riyanto, Sugeng; Rohman, Abdul
2015-01-01
Purpose: Analysis of drugs in multicomponent systems is officially carried out using chromatographic techniques; however, these techniques are laborious and involve sophisticated instruments. Therefore, UV-VIS spectrophotometry coupled with partial least square (PLS) multivariate calibration was developed for quantitative analysis of metamizole, thiamin and pyridoxin in the presence of cyanocobalamine without any separation step. Methods: Calibration and validation samples were prepared. The calibration model was built from a series of sample mixtures containing these drugs in certain proportions. Cross validation of the calibration samples using the leave-one-out technique was used to identify the smaller set of components that provides the greatest predictive ability. The evaluation of the calibration model was based on the coefficient of determination (R2) and the root mean square error of calibration (RMSEC). Results: The coefficient of determination (R2) for the relationship between actual values and predicted values for all studied drugs was higher than 0.99, indicating good accuracy. The RMSEC values obtained were relatively low, indicating good precision. The accuracy and precision results of the developed method showed no significant difference compared to those obtained by the official HPLC method. Conclusion: The developed method (UV-VIS spectrophotometry in combination with PLS) was successfully used for analysis of metamizole, thiamin and pyridoxin in tablet dosage form. PMID:26819934
Inácio, Maria Raquel Cavalcanti; de Lima, Kássio Michell Gomes; Lopes, Valquiria Garcia; Pessoa, José Dalton Cruz; de Almeida Teixeira, Gustavo Henrique
2013-02-15
The aim of this study was to evaluate the potential of near-infrared reflectance spectroscopy (NIR) and multivariate calibration as a rapid method to determine anthocyanin content in intact fruit (açaí and palmitero-juçara). Several multivariate calibration techniques, including partial least squares (PLS), interval partial least squares, genetic algorithm, successive projections algorithm, and net analyte signal were compared and validated by establishing figures of merit. Suitable results were obtained with the PLS model (four latent variables and 5-point smoothing) with a detection limit of 6.2 g kg-1, limit of quantification of 20.7 g kg-1, accuracy estimated as root mean square error of prediction of 4.8 g kg-1, mean selectivity of 0.79 g kg-1, sensitivity of 5.04×10⁻³ g kg-1, precision of 27.8 g kg-1, and signal-to-noise ratio of 1.04×10⁻³ g kg-1. These results suggest NIR spectroscopy and multivariate calibration can be effectively used to determine anthocyanin content in intact açaí and palmitero-juçara fruit. Copyright © 2012 Elsevier Ltd. All rights reserved.
Fast classification of hazelnut cultivars through portable infrared spectroscopy and chemometrics
NASA Astrophysics Data System (ADS)
Manfredi, Marcello; Robotti, Elisa; Quasso, Fabio; Mazzucco, Eleonora; Calabrese, Giorgio; Marengo, Emilio
2018-01-01
The authentication and traceability of hazelnuts is very important for both the consumer and the food industry, to safeguard the protected varieties and the food quality. This study investigates the use of a portable FTIR spectrometer coupled to multivariate statistical analysis for the classification of raw hazelnuts. The method discriminates hazelnuts from different origins/cultivars based on differences of the signal intensities of their IR spectra. The multivariate classification methods, namely principal component analysis (PCA) followed by linear discriminant analysis (LDA) and partial least square discriminant analysis (PLS-DA), with or without variable selection, allowed a very good discrimination among the groups, with PLS-DA coupled to variable selection providing the best results. Due to the fast analysis, high sensitivity, simplicity and no sample preparation, the proposed analytical methodology could be successfully used to verify the cultivar of hazelnuts, and the analysis can be performed quickly and directly on site.
Üstündağ, Özgür; Dinç, Erdal; Özdemir, Nurten; Tilkan, M Günseli
2015-01-01
In the development strategies of new drug products and generic drug products, the simultaneous in-vitro dissolution behavior of oral dosage formulations is the most important indication for the quantitative estimation of efficiency and biopharmaceutical characteristics of drug substances. This is to force the related field's scientists to improve very powerful analytical methods to get more reliable, precise and accurate results in the quantitative analysis and dissolution testing of drug formulations. In this context, two new chemometric tools, partial least squares (PLS) and principal component regression (PCR) were improved for the simultaneous quantitative estimation and dissolution testing of zidovudine (ZID) and lamivudine (LAM) in a tablet dosage form. The results obtained in this study strongly encourage us to use them for the quality control, the routine analysis and the dissolution test of the marketing tablets containing ZID and LAM drugs.
NASA Technical Reports Server (NTRS)
Soller, Babs R.; Favreau, Janice; Idwasi, Patrick O.
2003-01-01
The feasibility of using near-infrared (NIR) spectroscopy in combination with partial least-squares (PLS) regression was explored to measure electrolyte concentration in whole blood samples. Spectra were collected from diluted blood samples containing randomized, clinically relevant concentrations of Na+, K+, and Ca2+. Sodium was also studied in lysed blood. Reference measurements were made from the same samples using a standard clinical chemistry instrument. Partial least squares (PLS) was used to develop calibration models for each ion with acceptable results (Na+, R2 = 0.86, CVSEP = 9.5 mmol/L; K+, R2 = 0.54, CVSEP = 1.4 mmol/L; Ca2+, R2 = 0.56, CVSEP = 0.18 mmol/L). Slightly improved results were obtained using a narrower wavelength region (470-925 nm) where hemoglobin, but not water, absorbed indicating that ionic interaction with hemoglobin is as effective as water in causing measurable spectral variation. Good models were also achieved for sodium in lysed blood, illustrating that cell swelling, which is correlated with sodium concentration, is not required for calibration model development.
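The PLS calibration step described in the abstract above can be illustrated with a deliberately small sketch: a single-response PLS model (PLS1) restricted to one latent variable, written in plain Python. The restriction to one component, the function names and the toy data are assumptions made for illustration; real calibrations normally use several components chosen by cross-validation and a tested library implementation.

```python
from math import sqrt


def pls1_one_component(X, y):
    """Fit PLS1 with ONE latent variable; return (coefficients, intercept)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered spectra
    yc = [v - y_mean for v in y]                                # centered response
    # Weight vector: direction in spectral space with maximal covariance to y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sqrt(sum(v * v for v in w))
    if norm == 0:
        raise ValueError("X and y are uncorrelated; no component can be built")
    w = [v / norm for v in w]
    # Scores: projection of each centered spectrum onto the weight vector.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    # Inner regression of the response on the scores.
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    coef = [q * wj for wj in w]  # valid for the one-component case
    intercept = y_mean - sum(c * m for c, m in zip(coef, x_mean))
    return coef, intercept


def predict(coef, intercept, spectrum):
    """Predict the response for one new spectrum."""
    return intercept + sum(c * x for c, x in zip(coef, spectrum))


# Toy two-wavelength "spectra" and concentrations, for illustration only.
coef, intercept = pls1_one_component([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]], [2.0, 4.0, 6.0])
estimate = predict(coef, intercept, [4.0, 4.0])
```

With perfectly collinear wavelengths, as in the toy data, ordinary least squares has no unique solution, while the single PLS component still yields a stable regression vector; this is one reason PLS is popular for spectra.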
Lopes, Marta B; Calado, Cecília R C; Figueiredo, Mário A T; Bioucas-Dias, José M
2017-06-01
The monitoring of biopharmaceutical products using Fourier transform infrared (FT-IR) spectroscopy relies on calibration techniques involving the acquisition of spectra of bioprocess samples along the process. The most commonly used method for that purpose is partial least squares (PLS) regression, under the assumption that a linear model is valid. Despite being successful in the presence of small nonlinearities, linear methods may fail in the presence of strong nonlinearities. This paper studies the potential usefulness of nonlinear regression methods for predicting, from in situ near-infrared (NIR) and mid-infrared (MIR) spectra acquired in high-throughput mode, biomass and plasmid concentrations in Escherichia coli DH5-α cultures producing the plasmid model pVAX-LacZ. The linear methods PLS and ridge regression (RR) are compared with their kernel (nonlinear) versions, kPLS and kRR, as well as with the (also nonlinear) relevance vector machine (RVM) and Gaussian process regression (GPR). For the systems studied, RR provided better predictive performances compared to the remaining methods. Moreover, the results point to further investigation based on larger data sets whenever differences in predictive accuracy between a linear method and its kernelized version could not be found. The use of nonlinear methods, however, shall be judged regarding the additional computational cost required to tune their additional parameters, especially when the less computationally demanding linear methods herein studied are able to successfully monitor the variables under study.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Payne, Courtney E.; Wolfrum, Edward J.
2015-03-12
Obtaining accurate chemical composition and reactivity (measures of carbohydrate release and yield) information for biomass feedstocks in a timely manner is necessary for the commercialization of biofuels. Our objective was to use near-infrared (NIR) spectroscopy and partial least squares (PLS) multivariate analysis to develop calibration models to predict the feedstock composition and the release and yield of soluble carbohydrates generated by a bench-scale dilute acid pretreatment and enzymatic hydrolysis assay. Major feedstocks included in the calibration models are corn stover, sorghum, switchgrass, perennial cool season grasses, rice straw, and miscanthus. Here are the results: We present individual model statistics to demonstrate model performance and validation samples to more accurately measure predictive quality of the models. The PLS-2 model for composition predicts glucan, xylan, lignin, and ash (wt%) with uncertainties similar to primary measurement methods. A PLS-2 model was developed to predict glucose and xylose release following pretreatment and enzymatic hydrolysis. An additional PLS-2 model was developed to predict glucan and xylan yield. PLS-1 models were developed to predict the sum of glucose/glucan and xylose/xylan for release and yield (grams per gram). The release and yield models have higher uncertainties than the primary methods used to develop the models. In conclusion, it is possible to build effective multispecies feedstock models for composition, as well as carbohydrate release and yield. The model for composition is useful for predicting glucan, xylan, lignin, and ash with good uncertainties. The release and yield models have higher uncertainties; however, these models are useful for rapidly screening sample populations to identify unusual samples.
Hegazy, Maha A; Lotfy, Hayam M; Mowaka, Shereen; Mohamed, Ekram Hany
2016-07-05
Wavelets have been adapted for a vast number of signal-processing applications due to the amount of information that can be extracted from a signal. In this work, a comparative study was conducted on the efficiency of the continuous wavelet transform (CWT) as a signal processing tool in univariate regression and as a pre-processing tool in multivariate analysis using partial least squares (CWT-PLS). These were applied to complex spectral signals of ternary and quaternary mixtures. The CWT-PLS method succeeded in the simultaneous determination of a quaternary mixture of drotaverine (DRO), caffeine (CAF), paracetamol (PAR) and p-aminophenol (PAP, the major impurity of paracetamol). By contrast, the univariate CWT failed to determine the quaternary mixture components simultaneously and was able to determine only PAR and PAP and the ternary mixtures of DRO, CAF, and PAR and of CAF, PAR, and PAP. During the calculations of CWT, different wavelet families were tested. The univariate CWT method was validated according to the ICH guidelines. For the development of the CWT-PLS model, a calibration set was prepared by means of an orthogonal experimental design, and the absorption spectra were recorded and processed by CWT. The CWT-PLS model was constructed by regression between the wavelet coefficients and concentration matrices, and validation was performed by both cross validation and external validation sets. Both methods were successfully applied for determination of the studied drugs in pharmaceutical formulations. Copyright © 2016 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
De Lucia, Frank C., Jr.; Gottfried, Jennifer L.
2011-02-01
Using a series of thirteen organic materials that includes novel high-nitrogen energetic materials, conventional organic military explosives, and benign organic materials, we have demonstrated the importance of variable selection for maximizing residue discrimination with partial least squares discriminant analysis (PLS-DA). We built several PLS-DA models using different variable sets based on laser induced breakdown spectroscopy (LIBS) spectra of the organic residues on an aluminum substrate under an argon atmosphere. The model classification results for each sample are presented and the influence of the variables on these results is discussed. We found that using the whole spectra as the data input for the PLS-DA model gave the best results. However, variables due to the surrounding atmosphere and the substrate contribute to discrimination when the whole spectra are used, indicating this may not be the most robust model. Further iterative testing with additional validation data sets is necessary to determine the most robust model.
Li, Yankun; Shao, Xueguang; Cai, Wensheng
2007-04-15
Consensus modeling, which combines the results of multiple independent models to produce a single prediction, avoids the instability of a single model. Based on the principle of consensus modeling, a consensus least squares support vector regression (LS-SVR) method for calibrating near-infrared (NIR) spectra was proposed. In the proposed approach, NIR spectra of plant samples were first preprocessed using the discrete wavelet transform (DWT) to filter the spectral background and noise; then, the consensus LS-SVR technique was used to build the calibration model. With an optimization of the parameters involved in the modeling, a satisfactory model was achieved for predicting the content of reducing sugar in plant samples. The predicted results show that the consensus LS-SVR model is more robust and reliable than the conventional partial least squares (PLS) and LS-SVR methods.
Alladio, E; Giacomelli, L; Biosa, G; Corcia, D Di; Gerace, E; Salomone, A; Vincenti, M
2018-01-01
The chronic intake of an excessive amount of alcohol is currently ascertained by determining the concentration of direct alcohol metabolites in the hair samples of the alleged abusers, including ethyl glucuronide (EtG) and, less frequently, fatty acid ethyl esters (FAEEs). Indirect blood biomarkers of alcohol abuse are still determined to support hair EtG results and diagnose a consequent liver impairment. In the present study, the supporting role of hair FAEEs is compared with indirect blood biomarkers with respect to the contexts in which hair EtG interpretation is uncertain. Receiver Operating Characteristics (ROC) curves and multivariate Principal Component Analysis (PCA) demonstrated a much stronger correlation of EtG results with FAEEs than with any single indirect biomarker or their combinations. Partial Least Squares Discriminant Analysis (PLS-DA) models based on hair EtG and FAEEs were developed to maximize the biomarkers' information content on a multivariate background. The final PLS-DA model yielded 100% correct classification on a training/evaluation dataset of 155 subjects, including both chronic alcohol abusers and social drinkers. Then, the PLS-DA model was validated on an external dataset of 81 individuals, providing optimal discrimination ability between chronic alcohol abusers and social drinkers in terms of specificity and sensitivity. The PLS-DA scores obtained for each subject, with respect to the PLS-DA model threshold that separates the probabilistic distributions for the two classes, furnished a likelihood ratio value, which in turn conveys the strength of the experimental data support to the classification decision, within a Bayesian logic. Typical boundary real cases from daily work are discussed, too. Copyright © 2017 Elsevier B.V. All rights reserved.
Balss, Karin M; Long, Frederick H; Veselov, Vladimir; Orana, Argjenta; Akerman-Revis, Eugena; Papandreou, George; Maryanoff, Cynthia A
2008-07-01
Multivariate data analysis was applied to confocal Raman measurements on stents coated with the polymers and drug used in the CYPHER Sirolimus-eluting Coronary Stents. Partial least-squares (PLS) regression was used to establish three independent calibration curves for the coating constituents: sirolimus, poly(n-butyl methacrylate) [PBMA], and poly(ethylene-co-vinyl acetate) [PEVA]. The PLS calibrations were based on average spectra generated from each spatial location profiled. The PLS models were tested on six unknown stent samples to assess accuracy and precision. The wt % difference between PLS predictions and laboratory assay values for sirolimus was less than 1 wt % for the composite of the six unknowns, while the polymer models were estimated to be less than 0.5 wt % difference for the combined samples. The linearity and specificity of the three PLS models were also demonstrated. In contrast to earlier univariate models, the PLS models achieved mass balance with better accuracy. This analysis was extended to evaluate the spatial distribution of the three constituents. Quantitative bitmap images of drug-eluting stent coatings are presented for the first time to assess the local distribution of components.
Seasonal forecasting of high wind speeds over Western Europe
NASA Astrophysics Data System (ADS)
Palutikof, J. P.; Holt, T.
2003-04-01
As financial losses associated with extreme weather events escalate, there is interest from end users in the forestry and insurance industries, for example, in the development of seasonal forecasting models with a long lead time. This study uses exceedances of the 90th, 95th, and 99th percentiles of daily maximum wind speed over the period 1958 to present to derive predictands of winter wind extremes. The source data is the 6-hourly NCEP Reanalysis gridded surface wind field. Predictor variables include principal components of Atlantic sea surface temperature and several indices of climate variability, including the NAO and SOI. Lead times of up to a year are considered, in monthly increments. Three regression techniques are evaluated: multiple linear regression (MLR), principal component regression (PCR), and partial least squares regression (PLS). PCR and PLS proved considerably superior to MLR, with much lower standard errors. PLS was chosen to formulate the predictive model since it offers more flexibility in experimental design and gave slightly better results than PCR. The results indicate that winter windiness can be predicted with considerable skill one year ahead for much of coastal Europe, but that this skill deteriorates rapidly in the hinterland. The experiment succeeded in highlighting PLS as a very useful method for developing more precise forecasting models, and in identifying areas of high predictability.
NASA Astrophysics Data System (ADS)
Sindt, Nathan M.; Robison, Faith; Brick, Mark A.; Schwartz, Howard F.; Heuberger, Adam L.; Prenni, Jessica E.
2018-02-01
Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS) is a fast and effective tool for microbial species identification. However, current approaches are limited to species-level identification even when genetic differences are known. Here, we present a novel workflow that applies the statistical method of partial least squares discriminant analysis (PLS-DA) to MALDI-TOF-MS protein fingerprint data of Xanthomonas axonopodis, an important bacterial plant pathogen of fruit and vegetable crops. Mass spectra of 32 X. axonopodis strains were used to create a mass spectral library and PLS-DA was employed to model the closely related strains. A robust workflow was designed to optimize the PLS-DA model by assessing the model performance over a range of signal-to-noise ratios (s/n) and mass filter (MF) thresholds. The optimized parameters were observed to be s/n = 3 and MF = 0.7. The model correctly classified 83% of spectra withheld from the model as a test set. A new decision rule was developed, termed the rolled-up Maximum Decision Rule (ruMDR), and this method improved identification rates to 92%. These results demonstrate that MALDI-TOF-MS protein fingerprints of bacterial isolates can be utilized to enable identification at the strain level. Furthermore, the open-source framework of this workflow allows for broad implementation across various instrument platforms as well as integration with alternative modeling and classification algorithms.
Han, Xue; Jiang, Hong; Zhang, Dingkun; Zhang, Yingying; Xiong, Xi; Jiao, Jiaojiao; Xu, Runchun; Yang, Ming; Han, Li; Lin, Junzhi
2017-01-01
Background: Current astringency evaluation methods for herbs no longer meet the requirements of pharmaceutical processing, and a new method is needed to assess astringency accurately. Methods: First, quinine, sucrose, citric acid, sodium chloride, monosodium glutamate, and tannic acid (TA) were analyzed by electronic tongue (e-tongue) to determine the approximate region of astringency in the partial least square (PLS) map. Second, different concentrations of TA were measured to define the standard curve of astringency. A coordinate-concentration relationship was then obtained by fitting the PLS abscissa of the standard curve against the corresponding concentration. Third, Chebulae Fructus (CF), Yuganzi throat tablets (YGZTT), and Sanlejiang oral liquid (SLJOL) were tested to define their region in the PLS map. Finally, the astringent intensities of the samples were calculated from the standard coordinate-concentration relationship and expressed as concentrations of TA. Euclidean distance (Ed) analysis and a human sensory test were then carried out to verify the results. Results: The fitting equation between concentration and abscissa of TA was Y = 0.00498 × e^(−X/0.51035) + 0.10905 (r = 0.999). The astringency of 1 and 0.1 mg/mL CF was predicted at 0.28 and 0.12 mg/mL TA; that of 2 and 0.2 mg/mL YGZTT at 0.18 and 0.11 mg/mL TA; and that of 0.002 and 0.0002 mg/mL SLJOL at 0.15 and 0.10 mg/mL TA. The validation results showed that the astringency predicted by the e-tongue was basically consistent with the human sensory test and was more accurate than Ed analysis. Conclusion: The study indicated that the established method was objective and feasible. It provides a new quantitative method for the astringency of herbs.
SUMMARY: The astringency of Chebulae Fructus, Yuganzi throat tablets, and Sanlejiang oral liquid was predicted by electronic tongue. Euclidean distance analysis and human sensory test verified the results. A new strategy which was objective, simple, and sensitive to compare astringent intensity of herbs and preparations was provided. Abbreviations used: CF: Chebulae Fructus; E-tongue: Electronic tongue; Ed: Euclidean distance; PLS: Partial least square; PCA: Principal component analysis; SLJOL: Sanlejiang oral liquid; TA: Tannic acid; VAS: Visual analog scale; YGZTT: Yuganzi throat tablets. PMID: 28839378
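The fitted coordinate-concentration relationship quoted in the Results can be evaluated directly. The following is a minimal sketch, assuming that X is the PLS abscissa of a sample and Y is the tannic acid equivalent in mg/mL; the abstract does not state which variable is which, and the function and constant names are hypothetical.

```python
import math

# Constants quoted in the abstract: Y = 0.00498 * e^(-X / 0.51035) + 0.10905
AMPLITUDE = 0.00498
DECAY = 0.51035
OFFSET = 0.10905


def ta_equivalent(abscissa: float) -> float:
    """Evaluate the fitted curve for a PLS abscissa X.

    Assumption: the returned Y is a tannic acid concentration in mg/mL.
    """
    return AMPLITUDE * math.exp(-abscissa / DECAY) + OFFSET
```

Under this reading the curve decreases monotonically in X and approaches 0.10905 for large X, so more negative abscissa values map to higher tannic acid equivalents.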
Riahi, Siavash; Hadiloo, Farshad; Milani, Seyed Mohammad R; Davarkhah, Nazila; Ganjali, Mohammad R; Norouzi, Parviz; Seyfi, Payam
2011-05-01
The accuracy in predicting different chemometric methods was compared when applied on ordinary UV spectra and first order derivative spectra. Principal component regression (PCR) and partial least squares with one dependent variable (PLS1) and two dependent variables (PLS2) were applied on spectral data of pharmaceutical formula containing pseudoephedrine (PDP) and guaifenesin (GFN). The ability to derivative in resolved overlapping spectra chloropheniramine maleate was evaluated when multivariate methods are adopted for analysis of two component mixtures without using any chemical pretreatment. The chemometrics models were tested on an external validation dataset and finally applied to the analysis of pharmaceuticals. Significant advantages were found in analysis of the real samples when the calibration models from derivative spectra were used. It should also be mentioned that the proposed method is a simple and rapid way requiring no preliminary separation steps and can be used easily for the analysis of these compounds, especially in quality control laboratories. Copyright © 2011 John Wiley & Sons, Ltd.
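For readers unfamiliar with PLS1 (PLS with one dependent variable), the following is a minimal sketch of the classical NIPALS-style algorithm in plain Python. It is an illustration only, not the implementation used in the study above; the function names are hypothetical and no preprocessing other than mean centering is applied.

```python
def pls1_fit(X, y, n_components):
    """Fit PLS1 on rows of X (list of lists) and responses y (list)."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # X residual
    f = [v - y_mean for v in y]                                 # y residual
    weights, loadings, inner = [], [], []
    for _ in range(n_components):
        # Weight vector: direction in X space with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = sum(v * v for v in w) ** 0.5
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        c = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate both blocks before extracting the next latent variable.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - c * t[i] for i in range(n)]
        weights.append(w)
        loadings.append(p)
        inner.append(c)
    return x_mean, y_mean, weights, loadings, inner


def pls1_predict(model, x):
    """Predict one response by repeating the deflation steps on a new sample."""
    x_mean, y_mean, weights, loadings, inner = model
    e = [v - mu for v, mu in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, c in zip(weights, loadings, inner):
        t = sum(a * b for a, b in zip(e, w))
        y_hat += c * t
        e = [a - t * b for a, b in zip(e, p)]
    return y_hat
```

With as many latent variables as X has independent columns, the model reproduces ordinary least squares; in spectroscopic practice far fewer are kept, and the number is usually chosen by cross validation.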
Computerized pigment design based on property hypersurfaces
NASA Astrophysics Data System (ADS)
Halova, Jaroslava; Sulcova, Petra; Kupka, Karel
2007-05-01
Competition is tough in the pigment market. Rational pigment design therefore has a competitive advantage, saving time and money. The aim of this work is to provide methods that can assist in designing pigments with defined properties. These methods include partial least squares regression (PLSR), neural networks (NN) and a generalized regression ANOVA model. The authors show how a PLS bi-plot can be used to identify market gaps poorly covered by pigment manufacturers, thus giving an opportunity to develop pigments with potentially profitable properties.
NASA Astrophysics Data System (ADS)
Hong, Jangho; Kawashima, Ayato; Hamada, Noriaki
2017-06-01
In this study, we developed a facile fabrication method to access a highly reproducible plasmonic surface enhanced Raman scattering substrate via the immobilization of gold nanoparticles on an Ultrafiltration (UF) membrane using a suction technique. This was combined with a simple and rapid analyte concentration and detection method utilizing portable Raman spectroscopy. The minimum detectable concentrations for aqueous thiabendazole standard solution and thiabendazole in orange extract are 0.01 μg/mL and 0.125 μg/g, respectively. The partial least squares (PLS) regression plot shows a good linear relationship between 0.001 and 100 μg/mL of analyte, with a root mean square error of prediction (RMSEP) of 0.294 and a correlation coefficient (R2) of 0.976 for the thiabendazole standard solution. Meanwhile, the PLS plot also shows a good linear relationship between 0.0 and 2.5 μg/g of analyte, with an RMSEP value of 0.298 and an R2 value of 0.993 for the orange peel extract. In addition to the detection of other types of pesticides in agricultural products, this highly uniform plasmonic substrate has great potential for application in various environmentally-related areas.
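The two figures of merit quoted above, RMSEP and R2, can be computed from paired reference and predicted values. The sketch below uses the usual definitions; the abstract does not say which variant of R2 was reported, so the coefficient-of-determination form shown here is an assumption, and the function names are hypothetical.

```python
import math


def rmsep(reference, predicted):
    """Root mean square error of prediction over paired values."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("need two non-empty sequences of equal length")
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def r_squared(reference, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot
```

RMSEP carries the units of the analyte, so it should be read against the concentration range of the validation set.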
NASA Astrophysics Data System (ADS)
Palou, Anna; Miró, Aira; Blanco, Marcelo; Larraz, Rafael; Gómez, José Francisco; Martínez, Teresa; González, Josep Maria; Alcalà, Manel
2017-06-01
Although the feasibility of using near infrared (NIR) spectroscopy combined with partial least squares (PLS) regression for prediction of physico-chemical properties of biodiesel/diesel blends has been widely demonstrated, including in the calibration sets the whole variability of diesel samples from diverse production origins remains an important challenge when constructing the models. This work presents a useful strategy for the systematic selection of calibration sets of samples of biodiesel/diesel blends from diverse origins, based on a binary code, principal components analysis (PCA) and the Kennard-Stone algorithm. Results show that with this methodology the models can keep their robustness over time. PLS calculations were done using specialized chemometric software as well as the software of the NIR instrument installed in the plant, and both produced RMSEP values below the reproducibility values of the reference methods. The models have been proven for the on-line simultaneous determination of seven properties: density, cetane index, fatty acid methyl esters (FAME) content, cloud point, boiling point at 95% of recovery, flash point and sulphur.
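The Kennard-Stone step mentioned above selects samples that cover the measurement space uniformly. The following is a minimal sketch of the standard algorithm, assuming each sample is represented by a tuple of coordinates such as PCA scores; the binary-code stage of the published strategy is not reproduced, and the function name is hypothetical.

```python
import math


def kennard_stone(points, n_select):
    """Return indices of n_select points chosen by the Kennard-Stone rule."""
    n = len(points)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and len(points)")
    # Start with the two mutually most distant points.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: math.dist(points[pair[0]], points[pair[1]]),
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in (first, second)]
    while len(selected) < n_select:
        # Add the candidate whose nearest selected neighbour is farthest away.
        best = max(
            remaining,
            key=lambda i: min(math.dist(points[i], points[s]) for s in selected),
        )
        selected.append(best)
        remaining.remove(best)
    return selected
```

The pairwise search for the starting points is quadratic in the number of samples, which is acceptable for calibration sets of a few hundred spectra.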
A rapid tool for determination of titanium dioxide content in white chickpea samples.
Sezer, Banu; Bilge, Gonca; Berkkan, Aysel; Tamer, Ugur; Hakki Boyaci, Ismail
2018-02-01
Titanium dioxide (TiO2) is a widely used additive in foods. However, in the scientific community there is an ongoing debate on health concerns about TiO2. The main goal of this study is to determine TiO2 content by using laser induced breakdown spectroscopy (LIBS). To this end, different amounts of TiO2 were added to white chickpeas and analyzed by using LIBS. A calibration curve was obtained by following Ti emissions at 390.11 nm for univariate calibration, and a partial least square (PLS) calibration curve was obtained by evaluating the whole spectra. The results showed that the Ti calibration curve at 390.11 nm provides successful determination of Ti level with an R2 of 0.985 and a limit of detection (LOD) of 33.9 ppm, while PLS has an R2 of 0.989 and an LOD of 60.9 ppm. Furthermore, commercial white chickpea samples were used to validate the method, and the validation R2 values for simple calibration and PLS were calculated as 0.989 and 0.951, respectively. Copyright © 2017 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Wilson, Machelle; Ustin, Susan L.; Rocke, David
2003-03-01
Remote sensing technologies with high spatial and spectral resolution show a great deal of promise in addressing critical environmental monitoring issues, but the ability to analyze and interpret the data lags behind the technology. Robust analytical methods are required before the wealth of data available through remote sensing can be applied to a wide range of environmental problems for which remote detection is the best method. In this study we compare the classification effectiveness of two relatively new techniques on data consisting of leaf-level reflectance from plants that have been exposed to varying levels of heavy metal toxicity. If these methodologies work well on leaf-level data, then there is some hope that they will also work well on data from airborne and space-borne platforms. The classification methods compared were support vector machine classification of exposed and non-exposed plants based on the reflectance data, and partial least squares compression of the reflectance data followed by classification using logistic discrimination (PLS/LD). PLS/LD was performed in two ways. We used the continuous concentration data as the response during compression, and then used the binary response required during logistic discrimination. We also used a binary response during compression followed by logistic discrimination. The statistic we used to compare the effectiveness of the methodologies was the leave-one-out cross validation estimate of the prediction error.
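The leave-one-out cross validation estimate of the prediction error can be sketched as follows. The nearest-class-mean classifier is a hypothetical stand-in chosen only to keep the example short; it is not the support vector machine or the PLS/LD procedure compared in the study.

```python
def loo_error(samples, labels, fit, predict):
    """Leave-one-out estimate of the misclassification rate.

    samples and labels are lists; fit(xs, ys) returns a model and
    predict(model, x) returns a label.
    """
    errors = 0
    for i in range(len(samples)):
        # Train on every sample except the i-th, then test on the i-th.
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        if predict(model, samples[i]) != labels[i]:
            errors += 1
    return errors / len(samples)


def fit_nearest_mean(xs, ys):
    """Stand-in classifier: store the mean of a single feature per class."""
    means = {}
    for label in set(ys):
        values = [x for x, y in zip(xs, ys) if y == label]
        means[label] = sum(values) / len(values)
    return means


def predict_nearest_mean(means, x):
    """Assign the class whose stored mean is closest to x."""
    return min(means, key=lambda label: abs(x - means[label]))
```

Any compression or variable selection step has to be refitted inside the loop on the training part only; otherwise the held-out sample leaks into the model and the error estimate is optimistic.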
Rahman, Ziyaur; Siddiqui, Akhtar; Khan, Mansoor A
2013-12-01
The focus of the present investigation was to characterize and evaluate the variability of solid dispersions (SD) of amorphous vancomycin (VCM), utilizing crystalline polyethylene glycol (PEG-6000) as a carrier, and subsequently to determine their percentage composition by a nondestructive method using process analytical technology (PAT) sensors. The SD were prepared by the heat fusion method and characterized for physicochemical and spectral properties. Enhanced dissolution was shown by the SD formulations. Decreased crystallinity of PEG-6000 was observed, indicating that the drug was present in solution and dispersed form within the polymer. The SD formulations were homogeneous, as shown by near infrared (NIR) chemical imaging data. Principal component analysis (PCA) and partial least square (PLS) methods were applied to NIR and PXRD (powder X-ray diffraction) data to develop models for quantification of drug and carrier. PLS of both data sets showed correlation coefficients >0.9934 with good prediction capability, as revealed by smaller values of root mean square and standard error. The models based on NIR and PXRD were twofold more accurate in estimating PEG-6000 than VCM. In conclusion, the drug dissolution from the SD increased with decreasing crystallinity of PEG-6000, and the chemometric models showed the usefulness of PAT sensors in estimating the percentage of both VCM and PEG-6000 simultaneously. © 2013 Wiley Periodicals, Inc. and the American Pharmacists Association.
Vásquez, Valeria; Báez, María E; Bravo, Manuel; Fuentes, Edwar
2013-09-01
Seven heavy polycyclic aromatic hydrocarbons (PAHs) of concern on the US Environmental Protection Agency priority pollutant list (benzo[a]anthracene, benzo[b]fluoranthene, benzo[k]fluoranthene, benzo[a]pyrene, dibenz[a,h]anthracene, benzo[g,h,i]perylene, and indeno[1,2,3-c,d]-pyrene) were simultaneously analyzed in extra virgin olive oil. The analysis is based on the measurement of excitation-emission matrices on nylon membrane and processing of data using unfolded partial least-squares regression with residual bilinearization (U-PLS/RBL). The conditions needed to retain the PAHs present in the oil matrix on the nylon membrane were evaluated. The limit of detection for the proposed method ranged from 0.29 to 1.0 μg kg⁻¹, with recoveries between 64 and 78%. The predicted U-PLS/RBL concentrations compared favorably with those measured using high-performance liquid chromatography with fluorescence detection. The proposed method was applied to ten samples of edible oil, two of which presented PAHs ranging from 0.35 to 0.63 μg kg⁻¹. The principal advantages of the proposed analytical method are that it provides a significant reduction in time and solvent consumption with a similar limit of detection as compared with chromatography.
NASA Astrophysics Data System (ADS)
Zhang, George Z.; Myers, Kyle J.; Park, Subok
2013-03-01
Digital breast tomosynthesis (DBT) has shown promise for improving the detection of breast cancer, but it has not yet been fully optimized due to a large space of system parameters to explore. A task-based statistical approach [1] is a rigorous method for evaluating and optimizing this promising imaging technique with the use of optimal observers such as the Hotelling observer (HO). However, the high data dimensionality found in DBT has been the bottleneck for the use of a task-based approach in DBT evaluation. To reduce data dimensionality while extracting salient information for performing a given task, efficient channels have to be used for the HO. In the past few years, 2D Laguerre-Gauss (LG) channels, which are a complete basis for stationary backgrounds and rotationally symmetric signals, have been utilized for DBT evaluation [2, 3]. But since background and signal statistics from DBT data are neither stationary nor rotationally symmetric, LG channels may not be efficient in providing reliable performance trends as a function of system parameters. Recently, partial least squares (PLS) has been shown to generate efficient channels for the Hotelling observer in detection tasks involving random backgrounds and signals [4]. In this study, we investigate the use of PLS as a method for extracting salient information from DBT in order to better evaluate such systems.
Dabkiewicz, Vanessa Emídio; de Mello Pereira Abrantes, Shirley; Cassella, Ricardo Jorgensen
2018-08-05
Near infrared (NIR) spectroscopy with diffuse reflectance, combined with multivariate calibration, has as its main advantage the replacement of the physical separation of interferents by the mathematical separation of their signals, rapidly and with no need for reagent consumption, chemical waste production or sample manipulation. Seeking to optimize quality control analyses, this spectroscopic analytical method was shown to be a viable alternative to the classical Kjeldahl method for the determination of protein nitrogen in yellow fever vaccine. The most suitable multivariate calibration was achieved by the partial least squares method (PLS) with multiplicative signal correction (MSC) treatment and data mean centering (MC), using a minimum number of latent variables (LV) equal to 1, with the lowest value of the square root of the mean squared prediction error (0.00330) associated with the highest percentage value (91%) of samples. Accuracy ranged from 95 to 105% recovery in the 4000-5184 cm⁻¹ region. Copyright © 2018 Elsevier B.V. All rights reserved.
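The two preprocessing steps named above, multiplicative signal correction (MSC) and mean centering (MC), can be sketched in plain Python. The sketch assumes the common form of MSC in which every spectrum is regressed on the mean spectrum of the set; the abstract does not specify the reference spectrum, and the function names are hypothetical.

```python
def msc(spectra):
    """Correct each spectrum against the mean spectrum of the set."""
    n = len(spectra)
    width = len(spectra[0])
    reference = [sum(s[k] for s in spectra) / n for k in range(width)]
    ref_mean = sum(reference) / width
    ref_var = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / width
        # Least-squares fit: s is approximated by intercept + slope * reference.
        slope = sum(
            (r - ref_mean) * (v - s_mean) for r, v in zip(reference, s)
        ) / ref_var
        intercept = s_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in s])
    return corrected


def mean_center(rows):
    """Subtract the column mean from every column."""
    n = len(rows)
    col_means = [sum(r[k] for r in rows) / n for k in range(len(rows[0]))]
    return [[v - m for v, m in zip(r, col_means)] for r in rows]
```

Spectra that differ only by an additive offset and a multiplicative factor collapse onto the reference after MSC, which is why the correction is applied before mean centering and PLS.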
Darwish, Hany W; Hassan, Said A; Salem, Maissa Y; El-Zeany, Badr A
2016-02-05
Two advanced, accurate and precise chemometric methods are developed for the simultaneous determination of amlodipine besylate (AML) and atorvastatin calcium (ATV) in the presence of their acidic degradation products in tablet dosage forms. The first method was Partial Least Squares (PLS-1) and the second was Artificial Neural Networks (ANN). PLS was compared to ANN models with and without variable selection procedure (genetic algorithm (GA)). For proper analysis, a 5-factor 5-level experimental design was established resulting in 25 mixtures containing different ratios of the interfering species. Fifteen mixtures were used as calibration set and the other ten mixtures were used as validation set to validate the prediction ability of the suggested models. The proposed methods were successfully applied to the analysis of pharmaceutical tablets containing AML and ATV. The methods indicated the ability of the mentioned models to solve the highly overlapped spectra of the quinary mixture, yet using inexpensive and easy to handle instruments like the UV-VIS spectrophotometer. Copyright © 2015 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Yulia, M.; Suhandy, D.
2018-03-01
NIR spectra obtained from a spectral data acquisition system contain both chemical information about the samples and physical information, such as particle size and bulk density. Several methods have been established for developing calibration models that can compensate for variations in sample physical information. One common approach is to include physical information variation in the calibration model both explicitly and implicitly. The objective of this study was to evaluate the feasibility of using the explicit method to compensate for the influence of different coffee powder particle sizes on NIR calibration model performance. A total of 220 coffee powder samples of two types of coffee (civet and non-civet) and two particle sizes (212 and 500 µm) were prepared. Spectral data were acquired using an NIR spectrometer equipped with an integrating sphere for diffuse reflectance measurement. A discrimination method based on PLS-DA was applied, and the influence of particle size on the performance of PLS-DA was investigated. In the explicit method, the particle size is added directly as a predicted variable, which results in an X block containing only the NIR spectra and a Y block containing the particle size and type of coffee. The explicit inclusion of the particle size in the calibration model is expected to improve the accuracy of coffee type determination. The results show that, using the explicit method, the quality of the developed calibration model for coffee type determination is slightly superior, with coefficient of determination (R2) = 0.99 and root mean square error of cross-validation (RMSECV) = 0.041. The performance of the PLS2 calibration model for coffee type determination with particle size compensation was quite good, and the model was able to predict the type of coffee at two different particle sizes with relatively high R2 pred values. The prediction also resulted in low bias and RMSEP values.
Fan, Shu-Xiang; Huang, Wen-Qian; Li, Jiang-Bo; Guo, Zhi-Ming; Zhaq, Chun-Jiang
2014-10-01
In order to detect the soluble solids content (SSC) of apples conveniently and rapidly, a ring fiber probe and a portable spectrometer were applied to obtain the spectra of apples. Different wavelength variable selection methods, including uninformative variable elimination (UVE), competitive adaptive reweighted sampling (CARS) and genetic algorithm (GA), were proposed to select effective wavelength variables of the NIR spectroscopy of the SSC in apple based on PLS. The back interval LS-SVM (BiLS-SVM) and GA were used to select effective wavelength variables based on LS-SVM. Selected wavelength variables and the full wavelength range were set as input variables of the PLS model and the LS-SVM model, respectively. The results indicated that the PLS model built using GA-CARS on 50 characteristic variables selected from the full spectrum of 1512 wavelengths achieved the optimal performance. The correlation coefficient (Rp) and root mean square error of prediction (RMSEP) for the prediction set were 0.962 and 0.403 °Brix, respectively, for SSC. The proposed GA-CARS method could effectively simplify the portable detection model of SSC in apple based on near infrared spectroscopy and enhance the predictive precision. The study can provide a reference for the development of a portable spectrometer for apple soluble solids content.
NASA Astrophysics Data System (ADS)
Hu, Leqian; Ma, Shuai; Yin, Chunling
2018-03-01
In this work, fluorescence spectroscopy combined with multi-way pattern recognition techniques was developed for determining the geographical origin of kudzu root and for detecting and quantifying adulterants in kudzu root. Excitation-emission matrix (EEM) spectra were obtained for 150 pure kudzu root samples of different geographical origins and 150 fake kudzu roots with different adulteration proportions by recording emission from 330 to 570 nm with excitation in the range of 320-480 nm. Multi-way principal components analysis (M-PCA) and multilinear partial least squares discriminant analysis (N-PLS-DA) methods were used to decompose the excitation-emission matrix datasets. The 150 pure kudzu root samples could be differentiated exactly from each other according to their geographical origins by the M-PCA and N-PLS-DA models. For the adulterated kudzu root samples, N-PLS-DA gave a better and more reliable classification result compared with the M-PCA model. The results obtained in this study indicated that EEM spectroscopy coupled with multi-way pattern recognition could be used as an easy, rapid and novel tool to distinguish the geographical origin of kudzu root and detect adulterated kudzu root. In addition, this method is also suitable for determining the geographical origin and detecting the adulteration of other foodstuffs that produce fluorescence.
Pereira, Leandro S A; Lisboa, Fernanda L C; Coelho Neto, José; Valladão, Frederico N; Sena, Marcelo M
2018-05-09
Several new psychoactive substances (NPS) have reached the illegal drug market in recent years, and ecstasy-like tablets are one of the forms affected by this change. Cathinones and tryptamines have increasingly been found in ecstasy-like seized samples as well as other amphetamine type stimulants. A presumptive method for identifying different drugs in seized ecstasy tablets (n=92) using ATR-FTIR (attenuated total reflectance - Fourier transform infrared spectroscopy) and PLS-DA (partial least squares discriminant analysis) was developed. A hierarchical strategy of sequential modeling was performed with PLS-DA. The main model discriminated four classes: 5-MeO-MIPT, methylenedioxyamphetamines (MDMA and MDA), methamphetamine, and cathinones. Two submodels were built to identify drugs present in MDs and cathinones classes. Models were validated through the estimate of figures of merit. The average reliability rate (RLR) of the main model was 96.8% and accordance (ACC) was 100%. For the submodels, RLR and ACC were 100%. The reliability of the models was corroborated through their spectral interpretation. Thus, spectral assignments were performed by associating informative vectors of each specific modeled class to the respective drugs. The developed method is simple, fast, and can be applied to the forensic laboratory routine, leading to objective results reports useful for forensic scientists and law enforcement. Copyright © 2018 Elsevier B.V. All rights reserved.
Kusumaningrum, Dewi; Lee, Hoonsoo; Lohumi, Santosh; Mo, Changyeun; Kim, Moon S; Cho, Byoung-Kwan
2018-03-01
The viability of seeds is important for determining their quality. A high-quality seed is one that has a high capability of germination, which is necessary to ensure high productivity. Hence, developing technology for the detection of seed viability is a high priority in agriculture. Fourier transform near-infrared (FT-NIR) spectroscopy is one of the most popular vibrational spectroscopy techniques. This study aims to use FT-NIR spectroscopy to determine the viability of soybean seeds. Viable seeds and artificially aged seeds (as non-viable soybeans) were used in this research. The FT-NIR spectra of soybean seeds were collected and analysed using partial least-squares discriminant analysis (PLS-DA) to classify viable and non-viable soybean seeds. Moreover, the variable importance in projection (VIP) method for variable selection combined with PLS-DA was employed. The most effective wavelengths were selected by the VIP method, which selected 146 optimal variables from the full set of 1557 variables. The results demonstrated that FT-NIR spectral analysis with the PLS-DA method, using either all variables or the selected variables, showed good performance, with a prediction accuracy for soybean viability close to 100%. Hence, FT-NIR techniques with a chemometric analysis have the potential for rapidly measuring soybean seed viability. © 2017 Society of Chemical Industry.
Jiang, Wei; Zhou, Chengfeng; Han, Guangting; Via, Brian; Swain, Tammy; Fan, Zhaofei; Liu, Shaoyang
2017-01-01
Plant fibrous material is a good resource in textile and other industries. Normally, several kinds of plant fibrous materials used in one process need to be identified and characterized in advance. It is easy to identify them when they are in raw condition. However, most of the materials are semi-finished products which are ground, rotted or pre-hydrolyzed. Classifying these samples, which include different species, with high accuracy is a big challenge. In this research, both qualitative and quantitative analysis methods were chosen to classify six different species of samples, including softwood, hardwood, bast, and aquatic plant. Soft Independent Modeling of Class Analogy (SIMCA) and partial least squares (PLS) were used. The algorithm to classify different species of samples using PLS was created independently in this research. The results showed that the six species can be successfully classified using the SIMCA and PLS methods, and these two methods show similar results. The identification rates of kenaf, ramie and pine are 100%, and the identification rates of lotus, eucalyptus and tallow are higher than 94%. It was also found that spectral loadings can help pick the best wavenumber ranges for constructing the NIR model. The inter-material distance can show how close two species are. The scores plot is helpful for choosing the number of principal components during model construction. PMID:28105037
NASA Astrophysics Data System (ADS)
Chen, Jiemei; Peng, Lijun; Han, Yun; Yao, Lijun; Zhang, Jing; Pan, Tao
2018-03-01
Near-infrared (NIR) spectroscopy combined with chemometrics was applied to rapidly analyse haemoglobin A2 (HbA2) for β-thalassemia screening in human haemolysate samples. The relative content indicator HbA2 was indirectly quantified by simultaneous analysis of two absolute content indicators (Hb and Hb • HbA2). According to the comprehensive prediction effect of the multiple partitioning of calibration and prediction sets, the parameters were optimized to achieve modelling stability, and the preferred models were validated using the samples not involved in modelling. Savitzky-Golay smoothing was firstly used for the spectral pretreatment. The absorbance optimization partial least squares (AO-PLS) was used to eliminate high-absorption wave-bands appropriately. The equidistant combination PLS (EC-PLS) was further used to optimize wavelength models. The selected optimal models were I = 856 nm, N = 16, G = 1 and F = 6 for Hb and I = 988 nm, N = 12, G = 2 and F = 5 for Hb • HbA2. Through independent validation, the root-mean-square errors and correlation coefficients for prediction (RMSEP, RP) were 3.50 g L⁻¹ and 0.977 for Hb and 0.38 g L⁻¹ and 0.917 for Hb • HbA2, respectively. The predicted values of relative percentage HbA2 were further calculated, and the calculated RMSEP and RP were 0.31% and 0.965, respectively. The sensitivity and specificity for β-thalassemia both reached 100%. Therefore, the prediction of HbA2 achieved high accuracy for distinguishing β-thalassemia. The local optimal models for single parameter and the optimal equivalent model sets were proposed, providing more models to match possible constraints in practical applications. The NIR analysis method for the screening indicator of β-thalassemia was successfully established. The proposed method was rapid, simple and promising for thalassemia screening in a large population.
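The indirect quantification step can be sketched as a simple ratio. The snippet below assumes that Hb • HbA2 denotes the absolute HbA2 content in g/L (total Hb multiplied by the HbA2 fraction), so that the relative HbA2 percentage is recovered by dividing by Hb; that reading, the function name and the example values are assumptions for illustration, not details confirmed by the abstract.

```python
# Sketch: recover relative HbA2 (%) from two predicted absolute indicators.
# Assumption (not confirmed by the abstract): "Hb . HbA2" is the absolute
# HbA2 content in g/L, i.e. Hb (g/L) times the HbA2 fraction.

def relative_hba2_percent(hb_g_per_l, hb_hba2_g_per_l):
    """Return HbA2 as a percentage of total Hb for paired predictions."""
    if len(hb_g_per_l) != len(hb_hba2_g_per_l):
        raise ValueError("both inputs must have the same number of samples")
    percentages = []
    for hb, absolute_hba2 in zip(hb_g_per_l, hb_hba2_g_per_l):
        if hb <= 0:
            raise ValueError("Hb must be positive")
        percentages.append(100.0 * absolute_hba2 / hb)
    return percentages


# Hypothetical predicted values for two samples (g/L).
example = relative_hba2_percent([120.0, 100.0], [3.6, 5.5])
```

Because the result is a ratio of two predictions, errors in both the Hb and the Hb • HbA2 models propagate into the relative value, which is why the abstract reports a separate RMSEP for the calculated percentage.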
Delwiche, Stephen R; Reeves, James B
2010-01-01
In multivariate regression analysis of spectroscopy data, spectral preprocessing is often performed to reduce unwanted background information (offsets, sloped baselines) or accentuate absorption features in intrinsically overlapping bands. These procedures, also known as pretreatments, are commonly smoothing operations or derivatives. While such operations are often useful in reducing the number of latent variables of the actual decomposition and lowering residual error, they also run the risk of misleading the practitioner into accepting calibration equations that are poorly adapted to samples outside of the calibration. The current study developed a graphical method to examine this effect on partial least squares (PLS) regression calibrations of near-infrared (NIR) reflection spectra of ground wheat meal with two analytes, protein content and sodium dodecyl sulfate sedimentation (SDS) volume (an indicator of the quantity of the gluten proteins that contribute to strong doughs). These two properties were chosen because of their differing abilities to be modeled by NIR spectroscopy: excellent for protein content, fair for SDS sedimentation volume. To further demonstrate the potential pitfalls of preprocessing, an artificial component, a randomly generated value, was included in PLS regression trials. Savitzky-Golay (digital filter) smoothing, first-derivative, and second-derivative preprocess functions (5 to 25 centrally symmetric convolution points, derived from quadratic polynomials) were applied to PLS calibrations of 1 to 15 factors. The results demonstrated the danger of an over-reliance on preprocessing when (1) the number of samples used in a multivariate calibration is low (<50), (2) the spectral response of the analyte is weak, and (3) the goodness of the calibration is based on the coefficient of determination (R²) rather than a term based on residual error. The graphical method has application to the evaluation of other preprocess functions and various types of spectroscopy data.
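As an illustration of the Savitzky-Golay preprocess functions mentioned above, the sketch below applies the standard 5-point quadratic convolution weights for smoothing, first derivative and second derivative. Only the smallest (5-point) window is shown; the helper name `sg_filter`, the edge handling (edge points are simply dropped) and the test signal are choices made for this example, not details taken from the study.

```python
# Savitzky-Golay 5-point convolution weights derived from a quadratic fit.
# These are the standard tabulated coefficients for a 5-point window.
SG5_SMOOTH = [c / 35.0 for c in (-3, 12, 17, 12, -3)]
SG5_FIRST_DERIV = [c / 10.0 for c in (-2, -1, 0, 1, 2)]
SG5_SECOND_DERIV = [c / 7.0 for c in (2, -1, -2, -1, 2)]


def sg_filter(signal, coeffs, spacing=1.0, deriv_order=0):
    """Convolve `signal` with symmetric-window weights `coeffs`.

    Only interior points are returned, so the output is shorter than the
    input by len(coeffs) - 1 points. `spacing` is the (uniform) step
    between data points; derivatives are scaled by spacing ** deriv_order.
    """
    half = len(coeffs) // 2
    scale = spacing ** deriv_order
    out = []
    for i in range(half, len(signal) - half):
        window = signal[i - half:i + half + 1]
        out.append(sum(c * v for c, v in zip(coeffs, window)) / scale)
    return out


# Hypothetical test signal: y = x^2 sampled at x = 0..6 with unit spacing.
quadratic = [float(x * x) for x in range(7)]
smoothed = sg_filter(quadratic, SG5_SMOOTH)
first = sg_filter(quadratic, SG5_FIRST_DERIV, deriv_order=1)
second = sg_filter(quadratic, SG5_SECOND_DERIV, deriv_order=2)
```

A quadratic input is reproduced exactly by a quadratic-based filter, which makes it a convenient sanity check; wider windows smooth more aggressively, which is the kind of choice the abstract warns can flatter a calibration.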
Balabin, Roman M; Lomakina, Ekaterina I
2011-04-21
In this study, we make a general comparison of the accuracy and robustness of five multivariate calibration models: partial least squares (PLS) regression or projection to latent structures, polynomial partial least squares (Poly-PLS) regression, artificial neural networks (ANNs), and two novel techniques based on support vector machines (SVMs) for multivariate data analysis: support vector regression (SVR) and least-squares support vector machines (LS-SVMs). The comparison is based on fourteen (14) different datasets: seven sets of gasoline data (density, benzene content, and fractional composition/boiling points), two sets of ethanol gasoline fuel data (density and ethanol content), one set of diesel fuel data (total sulfur content), three sets of petroleum (crude oil) macromolecules data (weight percentages of asphaltenes, resins, and paraffins), and one set of petroleum resins data (resins content). Vibrational (near-infrared, NIR) spectroscopic data are used to predict the properties and quality coefficients of gasoline, biofuel/biodiesel, diesel fuel, and other samples of interest. The four systems presented here range greatly in composition, properties, strength of intermolecular interactions (e.g., van der Waals forces, H-bonds), colloid structure, and phase behavior. Due to the high diversity of chemical systems studied, general conclusions about SVM regression methods can be made. We try to answer the following question: to what extent can SVM-based techniques replace ANN-based approaches in real-world (industrial/scientific) applications? The results show that both SVR and LS-SVM methods are comparable to ANNs in accuracy. Due to the much higher robustness of the former, the SVM-based approaches are recommended for practical (industrial) application. This has been shown to be especially true for complicated, highly nonlinear objects.
Wood, Clive; Alwati, Abdolati; Halsey, Sheelagh; Gough, Tim; Brown, Elaine; Kelly, Adrian; Paradkar, Anant
2016-09-10
The use of near infra red spectroscopy to predict the concentration of two pharmaceutical co-crystals; 1:1 ibuprofen-nicotinamide (IBU-NIC) and 1:1 carbamazepine-nicotinamide (CBZ-NIC) has been evaluated. A partial least squares (PLS) regression model was developed for both co-crystal pairs using sets of standard samples to create calibration and validation data sets with which to build and validate the models. Parameters such as the root mean square error of calibration (RMSEC), root mean square error of prediction (RMSEP) and correlation coefficient were used to assess the accuracy and linearity of the models. Accurate PLS regression models were created for both co-crystal pairs which can be used to predict the co-crystal concentration in a powder mixture of the co-crystal and the active pharmaceutical ingredient (API). The IBU-NIC model had smaller errors than the CBZ-NIC model, possibly due to the complex CBZ-NIC spectra which could reflect the different arrangement of hydrogen bonding associated with the co-crystal compared to the IBU-NIC co-crystal. These results suggest that NIR spectroscopy can be used as a PAT tool during a variety of pharmaceutical co-crystal manufacturing methods and the presented data will facilitate future offline and in-line NIR studies involving pharmaceutical co-crystals. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
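The figures of merit named above can be computed with a few lines of code. The sketch below gives a root mean square error (used as RMSEC on calibration samples and RMSEP on validation samples) and a Pearson correlation coefficient; the function names and the numbers in the example are hypothetical and are not data from the study.

```python
import math


def rmse(reference, predicted):
    """Root mean square error between reference and predicted values.

    Computed on calibration samples this is RMSEC; on independent
    validation samples it is RMSEP.
    """
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


# Hypothetical co-crystal concentrations (% w/w): reference vs. predicted.
reference_values = [10.0, 20.0, 30.0, 40.0]
predicted_values = [11.0, 19.0, 31.0, 39.0]
validation_error = rmse(reference_values, predicted_values)
validation_r = pearson_r(reference_values, predicted_values)
```

Reporting an error term alongside the correlation coefficient matters because a high correlation can coexist with a systematic offset that only the error term reveals.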
On-line milk spectrometry: analysis of bovine milk composition
NASA Astrophysics Data System (ADS)
Spitzer, Kyle; Kuennemeyer, Rainer; Woolford, Murray; Claycomb, Rod
2005-04-01
We present partial least squares (PLS) regressions to predict the composition of raw, unhomogenised milk using visible to near infrared spectroscopy. A total of 370 milk samples from individual quarters were collected and analysed on-line by two low cost spectrometers in the wavelength ranges 380-1100 nm and 900-1700 nm. Samples were collected from 22 Friesian, 17 Jersey, 2 Ayrshire and 3 Friesian-Jersey crossbred cows over a period of 7 consecutive days. Transmission spectra were recorded in an inline flowcell through a 0.5 mm thick milk sample. PLS models, where wavelength selection was performed using iterative PLS, were developed for fat, protein, lactose, and somatic cell content. The root mean square error of prediction (and correlation coefficient) for the NIR and visible spectrometers, respectively, were 0.70% (0.93) and 0.91% (0.91) for fat, 0.65% (0.5) and 0.47% (0.79) for protein, 0.36% (0.49) and 0.45% (0.43) for lactose, and 0.50 (0.54) and 0.48 (0.51) for log10 somatic cells.
Li, Shuifang; Zhang, Xin; Shan, Yang; Su, Donglin; Ma, Qiang; Wen, Ruizhi; Li, Jiaojuan
2017-03-01
Near-infrared spectroscopy (NIR) was used for qualitative and quantitative detection of honey adulterated with high-fructose corn syrup (HFCS) or maltose syrup (MS). Competitive adaptive reweighted sampling (CARS) was employed to select key variables. Partial least squares linear discriminant analysis (PLS-LDA) was adopted to classify the adulterated honey samples. The CARS-PLS-LDA models showed an accuracy of 86.3% (honey vs. adulterated honey with HFCS) and 96.1% (honey vs. adulterated honey with MS), respectively. PLS regression (PLSR) was used to predict the extent of adulteration in the honeys. The results showed that NIR combined with PLSR could not be used to quantify adulteration with HFCS, but could be used to quantify adulteration with MS: the coefficient of determination for prediction (R²p) and root mean square error of prediction (RMSEP) were 0.901 and 4.041 for MS-adulterated samples from different floral origins, and 0.981 and 1.786 for MS-adulterated samples from the same floral origin (Brassica spp.), respectively. Copyright © 2016. Published by Elsevier Ltd.
De Luca, Michele; Restuccia, Donatella; Clodoveo, Maria Lisa; Puoci, Francesco; Ragno, Gaetano
2016-07-01
Chemometric discrimination of extra virgin olive oils (EVOO) from whole and stoned olive pastes was carried out by using Fourier transform infrared (FTIR) data and partial least squares-discriminant analysis (PLS1-DA) approach. Four Italian commercial EVOO brands, all in both whole and stoned version, were considered in this study. The adopted chemometric methodologies were able to describe the different chemical features in phenolic and volatile compounds contained in the two types of oil by using unspecific IR spectral information. Principal component analysis (PCA) was employed in cluster analysis to capture data patterns and to highlight differences between technological processes and EVOO brands. The PLS1-DA algorithm was used as supervised discriminant analysis to identify the different oil extraction procedures. Discriminant analysis was extended to the evaluation of possible adulteration by addition of aliquots of oil from whole paste to the most valuable oil from stoned olives. The statistical parameters from external validation of all the PLS models were very satisfactory, with low root mean square error of prediction (RMSEP) and relative error (RE%). Copyright © 2016 Elsevier Ltd. All rights reserved.
Comparison of three chemometrics methods for near-infrared spectra of glucose in the whole blood
NASA Astrophysics Data System (ADS)
Zhang, Hongyan; Ding, Dong; Li, Xin; Chen, Yu; Tang, Yuguo
2005-01-01
Principal Component Regression (PCR), Partial Least Squares (PLS) and Artificial Neural Network (ANN) methods are used to analyze the near-infrared (NIR) spectra of glucose in whole blood. With these methods, the calibration model is built in the spectral band where glucose absorbs much more strongly than water, fat, and protein, and the correlation coefficients of the models are reported in this paper. By comparing these results, a suitable method for analyzing the NIR spectrum of glucose in whole blood is found.
In vivo diagnosis of cervical precancer using Raman spectroscopy and genetic algorithm techniques.
Duraipandian, Shiyamala; Zheng, Wei; Ng, Joseph; Low, Jeffrey J H; Ilancheran, A; Huang, Zhiwei
2011-10-21
This study aimed to evaluate the clinical utility of applying near-infrared (NIR) Raman spectroscopy and genetic algorithm-partial least squares-discriminant analysis (GA-PLS-DA) to identify biomolecular changes of cervical tissues associated with dysplastic transformation during colposcopic examination. A total of 105 in vivo Raman spectra were measured from 57 cervical sites (35 normal and 22 precancer sites) of 29 patients recruited, in which 65 spectra were from normal sites, while 40 spectra were from cervical precancerous lesions (i.e., 7 low-grade CIN and 33 high-grade CIN). The GA feature selection technique incorporated with PLS was utilized to study the significant biochemical Raman bands for differentiation between normal and precancer cervical tissues. The GA-PLS-DA algorithm with double cross-validation (dCV) identified seven diagnostically significant Raman bands in the ranges of 925-935, 979-999, 1080-1090, 1240-1260, 1320-1340, 1400-1420, and 1625-1645 cm⁻¹ related to proteins, nucleic acids and lipids in tissue, and yielded a diagnostic accuracy of 82.9% (sensitivity of 72.5% (29/40) and specificity of 89.2% (58/65)) for precancer detection. The results of this exploratory study suggest that Raman spectroscopy in conjunction with GA-PLS-DA and dCV methods has the potential to provide clinically significant discrimination between normal and precancer cervical tissues at the molecular level.
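The reported diagnostic figures follow directly from the counts given in the abstract (29 of 40 precancer spectra and 58 of 65 normal spectra correctly classified). The short sketch below recomputes them; the function name and the returned dictionary layout are illustrative choices.

```python
# Recompute sensitivity, specificity and accuracy from confusion counts.
# Counts are those stated in the abstract: 29/40 precancer spectra and
# 58/65 normal spectra were classified correctly.

def diagnostic_metrics(true_pos, false_neg, true_neg, false_pos):
    """Return sensitivity, specificity and overall accuracy as percentages."""
    total = true_pos + false_neg + true_neg + false_pos
    return {
        "sensitivity": 100.0 * true_pos / (true_pos + false_neg),
        "specificity": 100.0 * true_neg / (true_neg + false_pos),
        "accuracy": 100.0 * (true_pos + true_neg) / total,
    }


metrics = diagnostic_metrics(true_pos=29, false_neg=40 - 29,
                             true_neg=58, false_pos=65 - 58)
rounded = {name: round(value, 1) for name, value in metrics.items()}
```

Rounded to one decimal place, these reproduce the 72.5% sensitivity, 89.2% specificity and 82.9% accuracy quoted above, the last being (29 + 58) / 105.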
Quantification of amine functional groups and their influence on OM/OC in the IMPROVE network
NASA Astrophysics Data System (ADS)
Kamruzzaman, Mohammed; Takahama, Satoshi; Dillner, Ann M.
2018-01-01
Recently, we developed a method using FT-IR spectroscopy coupled with partial least squares (PLS) regression to measure the four most abundant organic functional groups, aliphatic C-H, alcohol OH, carboxylic acid OH and carbonyl C=O, in atmospheric particulate matter. These functional groups are summed to estimate organic matter (OM) while the carbon from the functional groups is summed to estimate organic carbon (OC). With this method, OM and OM/OC can be estimated for each sample rather than relying on one assumed value to convert OC measurements to OM. This study continues the development of the FT-IR and PLS method for estimating OM and OM/OC by including the amine functional group. Amines are ubiquitous in the atmosphere and come from motor vehicle exhaust, animal husbandry, biomass burning, and vegetation among other sources. In this study, calibration standards for amines are produced by aerosolizing individual amine compounds and collecting them on PTFE filters using an IMPROVE sampler, thereby mimicking the filter media and collection geometry of ambient standards. The moles of amine functional group on each standard and a narrow range of amine-specific wavenumbers in the FT-IR spectra (wavenumber range 1550-1500 cm⁻¹) are used to develop a PLS calibration model. The PLS model is validated using three methods: prediction of a set of laboratory standards not included in the model, a peak height analysis and a PLS model with a broader wavenumber range. The model is then applied to the ambient samples collected throughout 2013 from 16 IMPROVE sites in the USA. Urban sites have higher amine concentrations than most rural sites, but amine functional groups account for a lower fraction of OM at urban sites. Amine concentrations, contributions to OM and seasonality vary by site and sample. Amine has a small impact on the annual average OM/OC for urban sites, but for some rural sites including amine in the OM/OC calculations increased OM/OC by 0.1 or more.
Guelpa, Anina; Bevilacqua, Marta; Marini, Federico; O'Kennedy, Kim; Geladi, Paul; Manley, Marena
2015-04-15
It has been established in this study that the Rapid Visco Analyser (RVA) can describe maize hardness, irrespective of the RVA profile, when used in association with appropriate multivariate data analysis techniques. Therefore, the RVA can complement or replace current and/or conventional methods as a hardness descriptor. Hardness modelling based on RVA viscograms was carried out using seven conventional hardness methods (hectoliter mass (HLM), hundred kernel mass (HKM), particle size index (PSI), percentage vitreous endosperm (%VE), protein content, percentage chop (%chop) and near infrared (NIR) spectroscopy) as references and three different RVA profiles (hard, soft and standard) as predictors. An approach using locally weighted partial least squares (LW-PLS) was followed to build the regression models. The resulting prediction errors (root mean square error of cross-validation (RMSECV) and root mean square error of prediction (RMSEP)) for the quantification of hardness values were always lower than, or of the same order as, the laboratory error of the reference method. Copyright © 2014 Elsevier Ltd. All rights reserved.
Crop/weed discrimination using near-infrared reflectance spectroscopy (NIRS)
NASA Astrophysics Data System (ADS)
Zhang, Yun; He, Yong
2006-09-01
Traditional uniform herbicide application often leaves excessive chemical residues on soil, crop plants and agricultural produce, which imperil the environment and food security. Near-infrared reflectance spectroscopy (NIRS) offers a promising means for weed detection and site-specific herbicide application. In the laboratory, a total of 90 samples (30 for each species) of the detached leaves of two weeds, i.e., threeseeded mercury (Acalypha australis L.) and fourleafed duckweed (Marsilea quadrifolia L.), and one crop, soybean (Glycine max), were investigated by NIRS over 325-1075 nm using a field spectroradiometer. Twenty absorbance samples of each species after pretreatment were exported, and the missing Y variables were assigned independent values for partial least squares (PLS) analysis. In the combined principal component analysis (PCA) on 400-1000 nm, PC1 and PC2 together explained over 91% of the total variance and identified the three plant species with 98.3% accuracy. The full cross-validation results of PLS, i.e., standard error of prediction (SEP) 0.247, correlation coefficient (r) 0.954 and root mean square error of prediction (RMSEP) 0.245, indicated an optimum model for weed identification. When the remaining 10 samples of each species were predicted with the PLS model, the results with deviation gave a 100% crop/weed detection rate. Thus, it could be concluded that PLS is a viable alternative for qualitative weed discrimination with NIRS.
USDA-ARS's Scientific Manuscript database
Several partial least squares (PLS) models were created correlating various properties and chemical composition measurements with the 1H and 13C NMR spectra of 73 different pyrolysis bio-oil samples from various biomass sources (crude and intermediate products), finished oils and small molecule s...
NASA Astrophysics Data System (ADS)
Luna, Aderval S.; da Silva, Arnaldo P.; Pinho, Jéssica S. A.; Ferré, Joan; Boqué, Ricard
Near infrared (NIR) spectroscopy and multivariate classification were applied to discriminate soybean oil samples into non-transgenic and transgenic. Principal Component Analysis (PCA) was applied to extract relevant features from the spectral data and to remove the anomalous samples. The best results were obtained with Support Vector Machine-Discriminant Analysis (SVM-DA) and Partial Least Squares-Discriminant Analysis (PLS-DA) after mean centering plus multiplicative scatter correction. For SVM-DA, the percentage of successful classification was 100% for the training group, and 100% and 90% in the validation group for non-transgenic and transgenic soybean oil samples, respectively. For PLS-DA, the percentage of successful classification was 95% and 100% in the training group for non-transgenic and transgenic soybean oil samples, respectively, and 100% and 80% in the validation group for non-transgenic and transgenic samples, respectively. The results demonstrate that NIR spectroscopy can provide a rapid, nondestructive and reliable method to distinguish non-transgenic and transgenic soybean oils.
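The pretreatment named above, multiplicative scatter correction (MSC) followed by mean centering, can be sketched in pure Python as follows. This is a generic textbook formulation using the mean spectrum as the reference; whether the authors used the mean spectrum or another reference is not stated, and the function names and toy spectra are assumptions for illustration.

```python
# Multiplicative scatter correction (MSC) followed by column mean centering.
# Generic formulation; the reference is assumed to be the mean spectrum.

def mean_spectrum(spectra):
    """Average spectrum across samples (column-wise mean)."""
    n = len(spectra)
    return [sum(col) / n for col in zip(*spectra)]


def msc(spectra):
    """Regress each spectrum on the reference and remove offset and slope."""
    ref = mean_spectrum(spectra)
    ref_mean = sum(ref) / len(ref)
    ref_ss = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for spectrum in spectra:
        s_mean = sum(spectrum) / len(spectrum)
        # Least-squares fit: spectrum ~ offset + slope * ref
        slope = sum((r - ref_mean) * (v - s_mean)
                    for r, v in zip(ref, spectrum)) / ref_ss
        offset = s_mean - slope * ref_mean
        corrected.append([(v - offset) / slope for v in spectrum])
    return corrected


def mean_center(spectra):
    """Subtract the column-wise mean from every spectrum."""
    ref = mean_spectrum(spectra)
    return [[v - m for v, m in zip(row, ref)] for row in spectra]


# Toy data: one underlying spectrum seen with different scatter effects
# (a multiplicative factor and an additive offset per sample).
base = [1.0, 2.0, 4.0, 3.0]
raw = [base,
       [2.0 * v + 0.5 for v in base],
       [0.5 * v - 0.2 for v in base]]
pretreated = mean_center(msc(raw))
```

In this toy case every sample differs only by scatter, so MSC maps all three onto the same spectrum and mean centering leaves nothing but zeros; with real samples, the chemical differences are what remains.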
Wastewater quality monitoring system using sensor fusion and machine learning techniques.
Qin, Xusong; Gao, Furong; Chen, Guohua
2012-03-15
A multi-sensor water quality monitoring system incorporating a UV/Vis spectrometer and a turbidimeter was used to monitor the Chemical Oxygen Demand (COD), Total Suspended Solids (TSS) and Oil & Grease (O&G) concentrations of the effluents from the Chinese restaurant on campus and an electrocoagulation-electroflotation (EC-EF) pilot plant. In order to handle the noise and information imbalance in the fused UV/Vis spectra and turbidity measurements during calibration model building, an improved boosting method, Boosting-Iterative Predictor Weighting-Partial Least Squares (Boosting-IPW-PLS), was developed in the present study. The Boosting-IPW-PLS method incorporates IPW into the boosting scheme to suppress the quality-irrelevant variables by assigning them small weights, and builds the models for wastewater quality prediction based on the weighted variables. The monitoring system was tested in the field with satisfactory results, underlining the potential of this technique for the online monitoring of water quality. Copyright © 2011 Elsevier Ltd. All rights reserved.
The Extent and Prediction of Heavy Metal Pollution in Soils of Shahrood and Damghan, Iran.
Sakizadeh, Mohamad; Mirzaei, Rouhollah; Ghorbani, Hadi
2015-12-01
The levels of 12 heavy metals (Ag, Ba, Be, Cd, Co, Cr, Cu, Ni, Pb, Tl, V, Zn) were considered in 229 soil samples in Semnan Province, Iran. To discriminate between natural and anthropogenic inputs of heavy metals, factor analysis was used. Seven factors accounting for 90.5% of the total variance were extracted. Mining and agricultural activities, along with geogenic sources, were identified as the main causes of the levels of heavy metals in the study area. Partial least squares regression was utilized to predict the level of the soil pollution index (SPI) from the concentrations of the 12 heavy metals. The eigenvectors from the first three PLS components represented more than 98% of the overall variance. The correlation coefficient between the observed and predicted SPI was 0.99, indicating the high efficiency of this method. The resultant coefficient of determination for three PLS components was 0.984, confirming the predictive ability of this method.
Teodoro, Janaína Aparecida Reis; Pereira, Hebert Vinicius; Sena, Marcelo Martins; Piccin, Evandro; Zacca, Jorge Jardim; Augusti, Rodinei
2017-12-15
A direct method based on the application of paper spray mass spectrometry (PS-MS) combined with a chemometric supervised method (partial least squares discriminant analysis, PLS-DA) was developed and applied to the discrimination of authentic and counterfeit samples of blended Scottish whiskies. The developed methodology employed negative ion mode MS and included 44 authentic whiskies from diverse brands and batches and 44 counterfeit samples of the same brands seized during operations of the Brazilian Federal Police, totaling 88 samples. An exploratory principal component analysis (PCA) model showed a reasonable discrimination of the counterfeit whiskies in PC2. In spite of the heterogeneity of the samples, a robust, reliable and accurate PLS-DA model was generated and validated, which was able to correctly classify the samples with a nearly 100% success rate. The use of PS-MS also allowed the identification of the main marker compounds associated with each type of sample analyzed: authentic or counterfeit. Copyright © 2017 Elsevier Ltd. All rights reserved.
Random forest models to predict aqueous solubility.
Palmer, David S; O'Boyle, Noel M; Glen, Robert C; Mitchell, John B O
2007-01-01
Random Forest regression (RF), Partial-Least-Squares (PLS) regression, Support Vector Machines (SVM), and Artificial Neural Networks (ANN) were used to develop QSPR models for the prediction of aqueous solubility, based on experimental data for 988 organic molecules. The Random Forest regression model predicted aqueous solubility more accurately than those created by PLS, SVM, and ANN and offered methods for automatic descriptor selection, an assessment of descriptor importance, and an in-parallel measure of predictive ability, all of which serve to recommend its use. The prediction of log molar solubility for an external test set of 330 molecules that are solid at 25 degrees C gave an r2 = 0.89 and RMSE = 0.69 log S units. For a standard data set selected from the literature, the model performed well with respect to other documented methods. Finally, the diversity of the training and test sets are compared to the chemical space occupied by molecules in the MDL drug data report, on the basis of molecular descriptors selected by the regression analysis.
Laser-Induced Breakdown Spectroscopy (LIBS) Measurement of Uranium in Molten Salt.
Williams, Ammon; Phongikaroon, Supathorn
2018-01-01
In this study, the molten salt aerosol-laser-induced breakdown spectroscopy (LIBS) system was used to measure the uranium (U) content in a ternary UCl3-LiCl-KCl salt to investigate and assess a near real-time analytical approach for material safeguards and accountability. Experiments were conducted using five different U concentrations to determine the analytical figures of merit for the system with respect to U. In the analysis, three U lines were used to develop univariate calibration curves at the 367.01 nm, 385.96 nm, and 387.10 nm lines. The 367.01 nm line had the lowest limit of detection (LOD) of 0.065 wt% U. The 385.96 nm line had the best root mean square error of cross-validation (RMSECV) of 0.20 wt% U. In addition to the univariate calibration approach, a multivariate partial least squares (PLS) model was developed to further analyze the data. Using PLS modeling, an RMSECV of 0.085 wt% U was determined. The RMSECV from the multivariate approach was significantly better than the univariate case and the PLS model is recommended for future LIBS analysis. Overall, the aerosol-LIBS system performed well in monitoring the U concentration and it is expected that the system could be used to quantitatively determine the U compositions within the normal operational concentrations of U in pyroprocessing molten salts.
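The univariate calibration step mentioned above can be sketched as a straight-line fit of line intensity against concentration, followed by a detection-limit estimate. The abstract does not state how the LOD was computed; the sketch assumes the common convention LOD = 3 × (standard deviation of the blank signal) / slope, and the helper names `fit_line` and `limit_of_detection` are hypothetical.

```python
def fit_line(concentrations, intensities):
    """Ordinary least-squares fit of intensity = slope * concentration + intercept."""
    n = len(concentrations)
    mean_x = sum(concentrations) / n
    mean_y = sum(intensities) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(concentrations, intensities))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def limit_of_detection(blank_sd, slope, k=3.0):
    """Assumed convention: LOD = k * (SD of blank signal) / calibration slope."""
    return k * blank_sd / slope
```

With five calibration standards, as in the study, `fit_line` would be called once per emission line, and the resulting slope passed to `limit_of_detection` together with a measured blank standard deviation.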
NASA Astrophysics Data System (ADS)
Zhang, Xuexi; Xiao, Zhi-Yan; Yin, Jianhua; Xia, Yang
2014-09-01
Fourier transform infrared imaging (FTIRI) combined with chemometrics can be used to detect the structure of bio-macromolecules, measure the concentrations of some components, and so on. In this study, FTIRI with Partial Least-Squares (PLS) regression was applied to study the concentration of the two main components in bovine nasal cartilage (BNC), collagen and proteoglycan. An infrared spectrum library was built by mixing collagen and chondroitin 6-sulfate (the main component of proteoglycan) at different ratios. Some pretreatments are needed for building the PLS model. FTIR images were collected from BNC sections at 6.25 μm and 25 μm pixel size. The spectra extracted from the BNC-FTIR images were imported into the PLS regression program to predict the concentrations of collagen and proteoglycan. These PLS-determined concentrations agree with the results of our previous work and with biochemical analytical results. The prediction shows that the concentrations of collagen and proteoglycan in BNC are comparable on the whole. However, the concentration of proteoglycan is a little higher than that of collagen, to some extent.
Andrade, Letícia; Farhat, Imad A; Aeberhardt, Kasia; Bro, Rasmus; Engelsen, Søren Balling
2009-02-01
The influence of temperature on near-infrared (NIR) and nuclear magnetic resonance (NMR) spectroscopy complicates the industrial applications of both spectroscopic methods. The focus of this study is to analyze and model the effect of temperature variation on NIR spectra and NMR relaxation data. Different multivariate methods were tested for constructing robust prediction models based on NIR and NMR data acquired at various temperatures. Data were acquired on model spray-dried limonene systems at five temperatures in the range from 20 degrees C to 60 degrees C and partial least squares (PLS) regression models were computed for limonene and water predictions. The predictive ability of the models computed on the NIR spectra (acquired at various temperatures) improved significantly when data were preprocessed using extended inverted signal correction (EISC). The average PLS regression prediction error was reduced to 0.2%, corresponding to 1.9% and 3.4% of the full range of limonene and water reference values, respectively. The removal of variation induced by temperature prior to calibration, by direct orthogonalization (DO), slightly enhanced the predictive ability of the models based on NMR data. Bilinear PLS models, with implicit inclusion of the temperature, enabled limonene and water predictions by NMR with an error of 0.3% (corresponding to 2.8% and 7.0% of the full range of limonene and water). For NMR, and in contrast to the NIR results, modeling the data using multi-way N-PLS improved the models' performance. N-PLS models, in which temperature was included as an extra variable, enabled more accurate prediction, especially for limonene (prediction error was reduced to 0.2%). Overall, this study proved that it is possible to develop models for limonene and water content prediction based on NIR and NMR data, independent of the measurement temperature.
NASA Astrophysics Data System (ADS)
Isingizwe Nturambirwe, J. Frédéric; Perold, Willem J.; Opara, Umezuruike L.
2016-02-01
Near infrared (NIR) spectroscopy has gained extensive use in quality evaluation. It is arguably one of the most advanced spectroscopic tools in non-destructive quality testing of food stuff, from measurement to data analysis and interpretation. NIR spectral data are interpreted through means often involving multivariate statistical analysis, sometimes associated with optimisation techniques for model improvement. The objective of this research was to explore the extent to which genetic algorithms (GA) can be used to enhance model development, for predicting fruit quality. Apple fruits were used, and NIR spectra in the range from 12000 to 4000 cm-1 were acquired on both bruised and healthy tissues, with different degrees of mechanical damage. GAs were used in combination with partial least squares regression methods to develop bruise severity prediction models, and compared to PLS models developed using the full NIR spectrum. A classification model was developed, which clearly separated bruised from unbruised apple tissue. GAs helped improve prediction models by over 10%, in comparison with full spectrum-based models, as evaluated in terms of error of prediction (Root Mean Square Error of Cross-validation). PLS models to predict internal quality, such as sugar content and acidity were developed and compared to the versions optimized by genetic algorithm. Overall, the results highlighted the potential use of GA method to improve speed and accuracy of fruit quality prediction.
Ahmad, Iftikhar; Ahmad, Manzoor; Khan, Karim; Ikram, Masroor
2016-06-01
Optical polarimetry was employed for assessment of ex vivo healthy and basal cell carcinoma (BCC) tissue samples from human skin. Polarimetric analyses revealed that depolarization and retardance for healthy tissue group were significantly higher (p<0.001) compared to BCC tissue group. Histopathology indicated that these differences partially arise from BCC-related characteristic changes in tissue morphology. Wilks lambda statistics demonstrated the potential of all investigated polarimetric properties for computer assisted classification of the two tissue groups. Based on differences in polarimetric properties, partial least square (PLS) regression classified the samples with 100% accuracy, sensitivity and specificity. These findings indicate that optical polarimetry together with PLS statistics hold promise for automated pathology classification. Copyright © 2016 Elsevier B.V. All rights reserved.
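The accuracy, sensitivity and specificity figures quoted above are standard two-class summary statistics. A minimal sketch of how they are usually derived from true and predicted labels is given below; the function name `classification_metrics` and the label strings in the usage note are illustrative assumptions, not taken from the paper.

```python
def classification_metrics(true_labels, predicted_labels, positive):
    """Accuracy, sensitivity and specificity for a two-class problem.

    `positive` is the label treated as the positive (e.g. diseased) class.
    """
    tp = fp = tn = fn = 0
    for truth, guess in zip(true_labels, predicted_labels):
        if truth == positive and guess == positive:
            tp += 1  # positive sample correctly detected
        elif truth == positive:
            fn += 1  # positive sample missed
        elif guess == positive:
            fp += 1  # negative sample flagged as positive
        else:
            tn += 1  # negative sample correctly rejected
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }
```

Calling `classification_metrics(truth, predicted, positive="BCC")` with identical `truth` and `predicted` lists returns 1.0 for all three values, which corresponds to the 100% figures reported in the abstract.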
Sankar, A S Kamatchi; Vetrichelvan, Thangarasu; Venkappaya, Devashya
2011-09-01
In the present work, three different spectrophotometric methods for simultaneous estimation of ramipril, aspirin and atorvastatin calcium in raw materials and in formulations are described. Overlapped data was quantitatively resolved by using chemometric methods, viz. inverse least squares (ILS), principal component regression (PCR) and partial least squares (PLS). Calibrations were constructed using the absorption data matrix corresponding to the concentration data matrix. The linearity range was found to be 1-5, 10-50 and 2-10 μg mL-1 for ramipril, aspirin and atorvastatin calcium, respectively. The absorbance matrix was obtained by measuring the zero-order absorbance in the wavelength range between 210 and 320 nm. A training set design of the concentration data corresponding to the ramipril, aspirin and atorvastatin calcium mixtures was organized statistically to maximize the information content from the spectra and to minimize the error of multivariate calibrations. By applying the respective algorithms for PLS 1, PCR and ILS to the measured spectra of the calibration set, a suitable model was obtained. This model was selected on the basis of RMSECV and RMSEP values. The same was applied to the prediction set and capsule formulation. Mean recoveries of the commercial formulation set together with the figures of merit (calibration sensitivity, selectivity, limit of detection, limit of quantification and analytical sensitivity) were estimated. Validity of the proposed approaches was successfully assessed for analyses of drugs in the various prepared physical mixtures and formulations.
Yulia, Meinilwita
2017-01-01
Asian palm civet coffee or kopi luwak (Indonesian words for coffee and palm civet) is well known as the world's priciest and rarest coffee. To protect the authenticity of luwak coffee and protect consumers from luwak coffee adulteration, it is very important to develop a robust and simple method for determining the adulteration of luwak coffee. In this research, the use of UV-Visible spectra combined with PLSR was evaluated to establish rapid and simple methods for quantification of adulteration in luwak-arabica coffee blends. Several preprocessing methods were tested and the results show that most of the preprocessed spectra were effective in improving the quality of calibration models, with the best PLS calibration model selected for Savitzky-Golay smoothed spectra, which had the lowest RMSECV (0.039) and highest RPDcal value (4.64). Using this PLS model, a prediction for quantification of luwak content was calculated and resulted in satisfactory prediction performance with both high RPDp and RER values. PMID:28913348
NASA Astrophysics Data System (ADS)
Thumanu, Kanjana; Tanthanuch, Waraporn; Ye, Danna; Sangmalee, Anawat; Lorthongpanich, Chanchao; Parnpai, Rangsun; Heraud, Philip
2011-05-01
Stem cell-based therapy for liver regeneration has been proposed to overcome the persistent shortage in the supply of suitable donor organs. A requirement for this to succeed is to find a rapid method to detect functional hepatocytes, differentiated from embryonic stem cells. We propose Fourier transform infrared (FTIR) microspectroscopy as a versatile method to identify the early and last stages of the differentiation process leading to the formation of hepatocytes. Using synchrotron-FTIR microspectroscopy, the means of identifying hepatocytes at the single-cell level is possible and explored. Principal component analysis and subsequent partial least-squares (PLS) discriminant analysis is applied to distinguish endoderm induction from hepatic progenitor cells and matured hepatocyte-like cells. The data are well modeled by PLS with endoderm induction, hepatic progenitor cells, and mature hepatocyte-like cells able to be discriminated with very high sensitivity and specificity. This method provides a practical tool to monitor endoderm induction and has the potential to be applied for quality control of cell differentiation leading to hepatocyte formation.
Cheng, Weiwei; Sun, Da-Wen; Pu, Hongbin; Wei, Qingyi
2017-04-15
The feasibility of hyperspectral imaging (HSI) (400-1000 nm) for tracing the chemical spoilage extent of the raw meat used for two kinds of processed meats was investigated. Calibration models established separately for salted and cooked meats using full wavebands showed good results with the determination coefficient in prediction (R²P) of 0.887 and 0.832, respectively. For simplifying the calibration models, two variable selection methods were used and compared. The results showed that genetic algorithm-partial least squares (GA-PLS) with as many continuous wavebands selected as possible always had better performance. The potential of HSI to develop one multispectral system for simultaneously tracing the chemical spoilage extent of the two kinds of processed meats was also studied. A good result with an R²P of 0.854 was obtained using GA-PLS as the dimension reduction method, which was thus used to visualize total volatile base nitrogen (TVB-N) contents corresponding to each pixel of the image. Copyright © 2016 Elsevier Ltd. All rights reserved.
Dong, Jian-Jun; Li, Qing-Liang; Yin, Hua; Zhong, Cheng; Hao, Jun-Guang; Yang, Pan-Fei; Tian, Yu-Hong; Jia, Shi-Ru
2014-10-15
Sensory evaluation is regarded as a necessary procedure to ensure a reproducible quality of beer. Meanwhile, high-throughput analytical methods provide a powerful tool to analyse various flavour compounds, such as higher alcohol and ester. In this study, the relationship between flavour compounds and sensory evaluation was established by non-linear models such as partial least squares (PLS), genetic algorithm back-propagation neural network (GA-BP), support vector machine (SVM). It was shown that SVM with a Radial Basis Function (RBF) had a better performance of prediction accuracy for both calibration set (94.3%) and validation set (96.2%) than other models. Relatively lower prediction abilities were observed for GA-BP (52.1%) and PLS (31.7%). In addition, the kernel function of SVM played an essential role of model training when the prediction accuracy of SVM with polynomial kernel function was 32.9%. As a powerful multivariate statistics method, SVM holds great potential to assess beer quality. Copyright © 2014 Elsevier Ltd. All rights reserved.
[Quality evaluation of American ginseng using UPLC coupled with multivariate analysis].
Tang, Yan; Yan, Shu-Mo; Wang, Jing-Jing; Yuan, Yuan; Yang, Bin
2016-05-01
An ultra performance liquid chromatography (UPLC) method combined with multivariate data analysis was developed to evaluate the quality of American ginseng by simultaneously determining the concentrations of six ginsenosides (Rg₁, Re, Rb₁, Rc, Ro and Rd) in the samples. For UPLC, acetonitrile with 0.01% formic acid and water with 0.01% formic acid were used as the mobile phase with gradient elution. Under the established chromatographic conditions, the six ginsenosides could be well separated and the results for linearity, stability, precision, repeatability, and recovery rate all met the requirements of quantitative analysis. The total contents of Rg₁, Re, and Rb₁ in 57 samples all met the requirement of the 2015 edition of the Chinese Pharmacopoeia. At the same time, the experimental data were analyzed by principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA). The crude drugs and the decoction pieces can be discriminated by PCA, and samples of different ages can be distinguished by PLS-DA. Copyright© by the Chinese Pharmaceutical Association.
Detection of triglycerides using immobilized enzymes in food and biological samples
NASA Astrophysics Data System (ADS)
Raichur, Ashish; Lesi, Abiodun; Pedersen, Henrik
1996-04-01
A scheme for the determination of total triglyceride (fat) content in biomedical and food samples is being developed. The primary emphasis is to minimize the reagents used, simplify sample preparation and develop a robust system that would facilitate on-line monitoring. The new detection scheme developed thus far involves extracting triglycerides into an organic solvent (cyclohexane) and performing partial least squares (PLS) analysis on the NIR (1100 - 2500 nm) absorbance spectra of the solution. A training set using 132 spectra of known triglyceride mixtures was compiled. Eight PLS calibrations were generated and were used to predict the total fat extracted from commercial samples such as mayonnaise, butter, corn oil and coconut oil. The results typically gave a correlation coefficient (r) of 0.99 or better. Predictions were typically within 90% and better at higher concentrations. Experiments were also performed using an immobilized lipase reactor to hydrolyze the fat extracted into the organic solvent. Performing PLS analysis on the difference spectra of the substrate and product could enhance specificity. This is being verified experimentally. Further work with biomedical samples is to be performed. This scheme may be developed into a feasible detection method for triglycerides in the biomedical and food industries.
ERIC Educational Resources Information Center
Huang, Jie-Tsuen; Hsieh, Hui-Hsien
2011-01-01
The purpose of this study was to investigate the contributions of socioeconomic status (SES) in predicting social cognitive career theory (SCCT) factors. Data were collected from 738 college students in Taiwan. The results of the partial least squares (PLS) analyses indicated that SES significantly predicted career decision self-efficacy (CDSE);…
Elsohaby, Ibrahim; McClure, J Trenton; Riley, Christopher B; Bryanton, Janet; Bigsby, Kathryn; Shaw, R Anthony
2018-02-20
Attenuated total reflectance infrared (ATR-IR) spectroscopy is a simple, rapid and cost-effective method for the analysis of serum. However, the complex nature of serum remains a limiting factor to the reliability of this method. We investigated the benefits of coupling centrifugal ultrafiltration with ATR-IR spectroscopy for quantification of human serum IgA concentration. Human serum samples (n = 196) were analyzed for IgA using an immunoturbidimetric assay. ATR-IR spectra were acquired for whole serum samples and for the retentate (residue) reconstituted with saline following 300 kDa centrifugal ultrafiltration. IR-based analytical methods were developed for each of the two spectroscopic datasets, and the accuracy of the two methods was compared. Analytical methods were based upon partial least squares regression (PLSR) calibration models - one with 5 PLS factors (for whole serum) and the second with 9 PLS factors (for the reconstituted retentate). Comparison of the two sets of IR-based analytical results to reference IgA values revealed improvements in the Pearson correlation coefficient (from 0.66 to 0.76), and the root mean squared error of prediction in IR-based IgA concentrations (from 102 to 79 mg/dL) for the ultrafiltration retentate-based method as compared to the method built upon whole serum spectra. Depleting human serum of low molecular weight proteins using a 300 kDa centrifugal filter thus enhances the accuracy of IgA quantification by ATR-IR spectroscopy. Further evaluation and optimization of this general approach may ultimately lead to routine analysis of a range of high molecular-weight analytical targets that are otherwise unsuitable for IR-based analysis. Copyright © 2017 Elsevier B.V. All rights reserved.
Multi-element fingerprinting as a tool in origin authentication of four east China marine species.
Guo, Lipan; Gong, Like; Yu, Yanlei; Zhang, Hong
2013-12-01
The contents of 25 elements in 4 types of commercial marine species from the East China Sea were determined by inductively coupled plasma mass spectrometry and atomic absorption spectrometry. The elemental composition was used to differentiate marine species according to geographical origin by multivariate statistical analysis. The results showed that principal component analysis could distinguish samples from different areas and reveal the elements which played the most important role in origin diversity. The established models by partial least squares discriminant analysis (PLS-DA) and by probabilistic neural network (PNN) can both precisely predict the origin of the marine species. Further study indicated that PLS-DA and PNN were efficacious in regional discrimination. The models from these 2 statistical methods, with an accuracy of 97.92% and 100%, respectively, could both distinguish samples from different areas without the need for species differentiation. © 2013 Institute of Food Technologists®
[On-site evaluation of raw milk qualities by portable Vis/NIR transmittance technique].
Wang, Jia-Hua; Zhang, Xiao-Wei; Wang, Jun; Han, Dong-Hai
2014-10-01
To ensure the material safety of dairy products, visible (Vis)/near infrared (NIR) spectroscopy combined with chemometrics methods was used to develop models for on-site evaluation of fat, protein, dry matter (DM) and lactose. A total of 88 raw milk samples were collected from individual livestock in different years. The spectra of raw milk were measured by a portable Vis/NIR spectrometer with a diffused transmittance accessory. To remove the scatter effect and baseline drift, the diffused transmittance spectra were preprocessed by 2nd order derivative with Savitzky-Golay (polynomial order 2, data point 25). Changeable size moving window partial least squares (CSMWPLS) and genetic algorithms partial least squares (GAPLS) methods were suggested to select informative regions for PLS calibration. The PLS and multiple linear regression (MLR) methods were used to develop models for predicting quality indexes of raw milk. The prediction performance of the CSMWPLS models was similar to that of the GAPLS models for fat, protein, DM and lactose evaluation: the root mean square errors of prediction (RMSEP) were 0.1156/0.1033, 0.0962/0.1137, 0.2013/0.1237 and 0.0774/0.0668, and the relative standard deviations of prediction (RPD) were 8.99/10.06, 3.53/2.99, 5.76/9.38 and 1.81/2.10, respectively. Meanwhile, the MLR models were also calibrated with 8, 10, 9 and 7 variables for fat, protein, DM and lactose, respectively. The prediction performance of the MLR models was better than or close to that of the PLS models. The MLR models to predict fat, protein, DM and lactose yielded RMSEP values of 0.1070, 0.0930, 0.1360 and 0.0658, and RPD values of 9.72, 3.66, 8.53 and 2.13, respectively. The results demonstrated the usefulness of Vis/NIR spectra combined with multivariate calibration methods as an objective and rapid method for the quality evaluation of complicated raw milks. The results obtained also highlight the potential of portable Vis/NIR instruments for on-site assessment of quality indexes of raw milk.
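The two figures of merit reported above can be sketched in a few lines. The abstract does not give its formulas; this sketch assumes the usual definitions, RMSEP as the root mean square difference between reference and predicted values on a prediction set, and RPD as the sample standard deviation of the reference values divided by RMSEP. The function names are hypothetical.

```python
import math
import statistics


def rmsep(reference, predicted):
    """Root mean square error of prediction over a prediction set."""
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def rpd(reference, predicted):
    """Assumed definition: SD of reference values divided by RMSEP."""
    return statistics.stdev(reference) / rmsep(reference, predicted)
```

Under this definition a larger RPD means the prediction error is small relative to the natural spread of the reference values, which is why the lactose models (RPD near 2) would be read as weaker than the fat models (RPD near 9 to 10) despite their lower RMSEP.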
Zhou, Zhenyu; Liu, Wei; Cui, Jiali; Wang, Xunheng; Arias, Diana; Wen, Ying; Bansal, Ravi; Hao, Xuejun; Wang, Zhishun; Peterson, Bradley S; Xu, Dongrong
2011-02-01
Signal variation in diffusion-weighted images (DWIs) is influenced both by thermal noise and by spatially and temporally varying artifacts, such as rigid-body motion and cardiac pulsation. Motion artifacts are particularly prevalent when scanning difficult patient populations, such as human infants. Although some motion during data acquisition can be corrected using image coregistration procedures, frequently individual DWIs are corrupted beyond repair by sudden, large amplitude motion either within or outside of the imaging plane. We propose a novel approach to identify and reject outlier images automatically using local binary patterns (LBP) and 2D partial least square (2D-PLS) to estimate diffusion tensors robustly. This method uses an enhanced LBP algorithm to extract texture features from a local texture feature of the image matrix from the DWI data. Because the images have been transformed to local texture matrices, we are able to extract discriminating information that identifies outliers in the data set by extending a traditional one-dimensional PLS algorithm to a two-dimension operator. The class-membership matrix in this 2D-PLS algorithm is adapted to process samples that are image matrix, and the membership matrix thus represents varying degrees of importance of local information within the images. We also derive the analytic form of the generalized inverse of the class-membership matrix. We show that this method can effectively extract local features from brain images obtained from a large sample of human infants to identify images that are outliers in their textural features, permitting their exclusion from further processing when estimating tensors using the DWIs. This technique is shown to be superior in performance when compared with visual inspection and other common methods to address motion-related artifacts in DWI data. 
This technique is applicable to correct motion artifact in other magnetic resonance imaging (MRI) techniques (e.g., the bootstrapping estimation) that use univariate or multivariate regression methods to fit MRI data to a pre-specified model. Copyright © 2011 Elsevier Inc. All rights reserved.
Klein-Júnior, Luiz C; Viaene, Johan; Tuenter, Emmy; Salton, Juliana; Gasper, André L; Apers, Sandra; Andries, Jan P M; Pieters, Luc; Henriques, Amélia T; Vander Heyden, Yvan
2016-09-09
Psychotria nemorosa is chemically characterized by indole alkaloids and displays significant inhibitory activity on butyrylcholinesterase (BChE) and monoamine oxidase-A (MAO-A), both enzymes related to neurodegenerative disorders. In the present study, 43 samples of P. nemorosa leaves were extracted and fractionated in accordance with previously optimized methods (see Part I). These fractions were analyzed by means of UPLC-DAD and assayed for their BChE and MAO-A inhibitory potencies. The chromatographic fingerprint data were first aligned using correlation optimized warping, and Principal Component Analysis was performed to explore the data structure. Multivariate calibration techniques, namely Partial Least Squares (PLS1), PLS2 and Orthogonal Projections to Latent Structure (O-PLS1), were evaluated for modelling the activities as a function of the fingerprints. Since the best results were obtained with the O-PLS1 model (RMSECV = 9.3 and 3.3 for BChE and MAO-A, respectively), the regression coefficients of the model were analyzed and plotted relative to the original fingerprints. Four peaks were indicated as multifunctional compounds, with the capacity to impair both BChE and MAO-A activities. In order to confirm these results, a semi-prep HPLC technique was used and a fraction containing the four peaks was purified and evaluated in vitro. It was observed that the fraction exhibited an IC50 of 2.12 μg mL⁻¹ for BChE and 1.07 μg mL⁻¹ for MAO-A. These results reinforce the prediction obtained by O-PLS1 modelling. Copyright © 2016 Elsevier B.V. All rights reserved.
Optical scatterometry of quarter-micron patterns using neural regression
NASA Astrophysics Data System (ADS)
Bischoff, Joerg; Bauer, Joachim J.; Haak, Ulrich; Hutschenreuther, Lutz; Truckenbrodt, Horst
1998-06-01
With shrinking dimensions and increasing chip areas, a rapid and non-destructive full wafer characterization after every patterning cycle is an inevitable necessity. In former publications it was shown that Optical Scatterometry (OS) has the potential to push the attainable feature limits of optical techniques from 0.8 to 0.5 microns for imaging methods down to 0.1 micron and below. Thus the demands of future metrology can be met. Basically being a nonimaging method, OS combines light scatter (or diffraction) measurements with modern data analysis schemes to solve the inverse scatter issue. For very fine patterns with lambda-to-pitch ratios greater than one, the specular reflected light versus the incidence angle is recorded. Usually, the data analysis comprises two steps -- a training cycle connected to a rigorous forward modeling and the prediction itself. Until now, two data analysis schemes are usually applied -- the multivariate regression based Partial Least Squares method (PLS) and a look-up-table technique which is also referred to as the Minimum Mean Square Error approach (MMSE). Both methods are afflicted with serious drawbacks. On the one hand, the prediction accuracy of multivariate regression schemes degrades with larger parameter ranges due to the linearization properties of the method. On the other hand, look-up-table methods are rather time consuming during prediction, thus prolonging the processing time and reducing the throughput. An alternate method is an Artificial Neural Network (ANN) based regression which combines the advantages of multivariate regression and MMSE. Due to the versatility of a neural network, not only can its structure be adapted more properly to the scatter problem, but also the nonlinearity of the neuronal transfer functions mimics the nonlinear behavior of optical diffraction processes more adequately. In spite of these pleasant properties, the prediction speed of ANN regression is comparable with that of the PLS-method.
In this paper, the viability and performance of ANN-regression will be demonstrated with the example of sub-quarter-micron resist metrology. To this end, 0.25 micrometer line/space patterns have been printed in positive photoresist by means of DUV projection lithography. In order to evaluate the total metrology chain from light scatter measurement through data analysis, a thorough modeling has been performed. Assuming a trapezoidal shape of the developed resist profile, a training data set was generated by means of the Rigorous Coupled Wave Approach (RCWA). After training the model, a second data set was computed and deteriorated by Gaussian noise to imitate real measuring conditions. Then, these data were fed into the models established before, resulting in a Standard Error of Prediction (SEP) which corresponds to the measuring accuracy. Even with only little effort put into the design of a back-propagation network, the ANN is clearly superior to the PLS-method. Depending on whether a network with one or two hidden layers was used, accuracy gains between 2 and 5 can be achieved compared with PLS regression. Furthermore, the ANN is less noise sensitive, for there is only a doubling of the SEP at 5% noise for ANN whereas for PLS the accuracy degrades rapidly with increasing noise. The accuracy gain also depends on the light polarization and on the measured parameters. Finally, these results have been proven experimentally, where the OS-results are in good accordance with the profiles obtained from cross-sectioning micrographs.
Moscetti, Roberto; Sturm, Barbara; Crichton, Stuart Oj; Amjad, Waseem; Massantini, Riccardo
2018-05-01
The potential of hyperspectral imaging (500-1010 nm) was evaluated for monitoring the quality of potato slices (var. Anuschka) of 5, 7 and 9 mm thickness subjected to air drying at 50 °C. The study investigated three different feature selection methods for the prediction of dry basis moisture content and colour of potato slices using partial least squares regression (PLS). The feature selection strategies tested include interval PLS regression (iPLS), and differences and ratios between raw reflectance values for each possible pair of wavelengths (R[λ1]-R[λ2] and R[λ1]:R[λ2], respectively). Moreover, the combination of spectral and spatial domains was tested. Excellent results were obtained using the iPLS algorithm. However, features from both datasets of raw reflectance differences and ratios represent suitable alternatives for the development of low-complexity prediction models. Finally, the dry basis moisture content was predicted with high accuracy by combining spectral data (i.e. R[511 nm]-R[994 nm]) and spatial domain (i.e. relative area shrinkage of slice). Modelling the data acquired during drying through hyperspectral imaging can provide useful information concerning the chemical and physicochemical changes of the product. With all this information, the proposed approach lays the foundations for a more efficient smart dryer that can be designed and its process optimized for drying of potato slices. © 2017 Society of Chemical Industry.
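The pairwise difference and ratio features mentioned above can be generated as follows. This is a minimal sketch: the function name and the three-band toy spectrum are hypothetical, and it enumerates unordered wavelength pairs only (the abstract does not say whether both orderings were used).

```python
from itertools import combinations


def pairwise_features(wavelengths_nm, reflectance):
    """Build R[w1]-R[w2] and R[w1]:R[w2] features for every unordered band pair."""
    features = {}
    bands = list(zip(wavelengths_nm, reflectance))
    for (w1, r1), (w2, r2) in combinations(bands, 2):
        features[f"R[{w1}]-R[{w2}]"] = r1 - r2
        if r2 != 0:  # skip ratios with a zero denominator
            features[f"R[{w1}]:R[{w2}]"] = r1 / r2
    return features


# Toy spectrum with three bands (values are illustrative, not measured data).
toy = pairwise_features([511, 750, 994], [0.5, 0.4, 0.25])
for name, value in toy.items():
    print(name, value)
```

With n bands this yields n(n-1)/2 differences and up to as many ratios, so on a full hyperspectral cube a selection step is needed to pick the few informative pairs.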
Analysis of spreadable cheese by Raman spectroscopy and chemometric tools.
Oliveira, Kamila de Sá; Callegaro, Layce de Souza; Stephani, Rodrigo; Almeida, Mariana Ramos; de Oliveira, Luiz Fernando Cappa
2016-03-01
In this work, FT-Raman spectroscopy was explored to evaluate spreadable cheese samples. A partial least squares discriminant analysis was employed to identify the spreadable cheese samples containing starch. To build the models, two types of samples were used: commercial samples and samples manufactured in local industries. The method of supervised classification PLS-DA was employed to classify the samples as adulterated or without starch. Multivariate regression was performed using the partial least squares method to quantify the starch in the spreadable cheese. The limit of detection obtained for the model was 0.34% (w/w) and the limit of quantification was 1.14% (w/w). The reliability of the models was evaluated by determining the confidence interval, which was calculated using the bootstrap re-sampling technique. The results show that the classification models can be used to complement classical analysis and as screening methods. Copyright © 2015 Elsevier Ltd. All rights reserved.
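The bootstrap re-sampling idea used for the confidence interval can be illustrated with a percentile bootstrap. This is a sketch under assumptions: the abstract does not state which statistic was bootstrapped or which interval variant was used, so the mean and the percentile method are chosen here purely for illustration, and the sample values are invented.

```python
import random
import statistics


def bootstrap_ci(values, stat=statistics.mean, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for `stat` over `values`."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement n_boot times and collect the statistic.
    stats = sorted(
        stat([rng.choice(values) for _ in range(n)]) for _ in range(n_boot)
    )
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = int((1 - alpha / 2) * n_boot) - 1
    return stats[lo_idx], stats[hi_idx]


# Illustrative replicate predictions of starch content, % (w/w); not real data.
replicates = [0.9, 1.1, 1.0, 1.2, 0.8, 1.05]
low, high = bootstrap_ci(replicates)
print("95% interval for the mean:", low, high)
```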
NASA Astrophysics Data System (ADS)
Thangsunan, Patcharapong; Kittiwachana, Sila; Meepowpan, Puttinan; Kungwan, Nawee; Prangkio, Panchika; Hannongbua, Supa; Suree, Nuttee
2016-06-01
Improving the performance of scoring functions for drug docking simulations is a challenging task in the modern discovery pipeline. Among the various ways to enhance the efficiency of a scoring function, tuning its energetic components is an attractive option that provides better predictions. Herein we present the first development of rapid and simple tuning models for predicting and scoring the inhibitory activity of investigated ligands docked into catalytic core domain structures of the HIV-1 integrase (IN) enzyme. We developed the models using all energetic terms obtained from flexible ligand-rigid receptor dockings by AutoDock4, followed by a data analysis using either partial least squares (PLS) or self-organizing maps (SOMs). The models were established using 66 and 64 ligands of mercaptobenzenesulfonamides for the PLS-based and the SOMs-based inhibitory activity predictions, respectively. The models were then evaluated for their predictive quality using closely related test compounds, as well as five different unrelated inhibitor test sets. Weighting constants for each energy term were also optimized, thus customizing the scoring function for this specific target protein. Root-mean-square error (RMSE) values between the predicted and the experimental inhibitory activities were determined to be <1 (i.e. within a magnitude of a single log scale of actual IC50 values). Hence, we propose that, as a pre-functional assay screening step, AutoDock4 docking in combination with these subsequent rapid weighted energy tuning methods via PLS and SOMs analyses is a viable approach to predict the potential inhibitory activity and to discriminate among small drug-like molecules to target a specific protein of interest.
NASA Astrophysics Data System (ADS)
He, Anhua; Singh, Ramesh P.; Sun, Zhaohua; Ye, Qing; Zhao, Gang
2016-07-01
Earth tides, atmospheric pressure, precipitation and, especially, earthquakes strongly affect water levels in wells; anomalous co-seismic changes in groundwater levels have accordingly been observed. In this paper, we have used four different models, simple linear regression (SLR), multiple linear regression (MLR), principal component analysis (PCA) and partial least squares (PLS), to compute the atmospheric pressure and earth tidal effects on water level. Furthermore, we have used the Akaike information criterion (AIC) to study the performance of the various models. Based on the lowest AIC and sum of squares for error values, the best estimate of the effects of atmospheric pressure and earth tide on water level is found using the MLR model. However, the MLR model does not account for multicollinearity between inputs; as a result, the atmospheric pressure and earth tidal response coefficients fail to reflect the mechanisms associated with the groundwater level fluctuations. Once the serious multicollinearity of the inputs is addressed, the PLS model shows the minimum AIC value. With the PLS model, the atmospheric pressure and earth tidal response coefficients agree closely with the observations. The atmospheric pressure and the earth tidal response coefficients are found to be sensitive to the stress-strain state using the observed data for the period 1 April-8 June 2008 of Chuan 03# well. The transient enhancement of porosity of the rock mass around Chuan 03# well associated with the Wenchuan earthquake (Mw = 7.9, 12 May 2008), which returned to its pre-seismic level after 13 days, indicates that the co-seismic sharp rise of the well water level could be induced by static stress change, rather than by the development of new fractures.
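The AIC-based model comparison can be sketched with the common least-squares form AIC = n ln(SSE/n) + 2k, which assumes Gaussian errors and drops an additive constant. The abstract does not give the exact formula the authors used, and the SSE values and parameter counts below are invented for illustration.

```python
import math


def aic_least_squares(n_obs, sse, n_params):
    """AIC for a least-squares fit: n * ln(SSE / n) + 2k (constant term dropped)."""
    return n_obs * math.log(sse / n_obs) + 2 * n_params


# Hypothetical candidates: name -> (sum of squared errors, number of parameters).
candidates = {"SLR": (42.0, 2), "MLR": (18.0, 5), "PLS": (19.5, 3)}
n_obs = 120
scores = {name: aic_least_squares(n_obs, sse, k)
          for name, (sse, k) in candidates.items()}
best = min(scores, key=scores.get)
print(scores, "lowest AIC:", best)
```

The 2k term penalizes extra parameters, so a model with a slightly larger error but fewer parameters can still rank first.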
Zhan, Xiaobin; Jiang, Shulan; Yang, Yili; Liang, Jian; Shi, Tielin; Li, Xiwen
2015-09-18
This paper proposes an ultrasonic measurement system based on least squares support vector machines (LS-SVM) for inline measurement of particle concentrations in multicomponent suspensions. Firstly, the ultrasonic signals are analyzed and processed, and the optimal feature subset that contributes to the best model performance is selected based on the importance of features. Secondly, the LS-SVM model is tuned, trained and tested with different feature subsets to obtain the optimal model. In addition, a comparison is made between the partial least squares (PLS) model and the LS-SVM model. Finally, the optimal LS-SVM model with the optimal feature subset is applied to inline measurement of particle concentrations in the mixing process. The results show that the proposed method is reliable and accurate for inline measurement of particle concentrations in multicomponent suspensions and that the measurement accuracy is sufficiently high for industrial application. Furthermore, the proposed method is applicable to the dynamic modeling of nonlinear systems and provides a feasible way to monitor industrial processes.
Souza, Beatriz C C; De Oliveira, Tiago B; Aquino, Thiago M; de Lima, Maria C A; Pitta, Ivan R; Galdino, Suely L; Lima, Edeltrudes O; Gonçalves-Silva, Teresinha; Militão, Gardênia C G; Scotti, Luciana; Scotti, Marcus T; Mendonça, Francisco J B
2012-06-01
A series of 2-[(arylidene)amino]-cycloalkyl[b]thiophene-3-carbonitriles (2a-x) was synthesized by incorporation of substituted aromatic aldehydes in Gewald adducts (1a-c). The title compounds were screened for their antifungal activity against Candida krusei and Cryptococcus neoformans and for their antiproliferative activity against a panel of 3 human cancer cell lines (HT29, NCI H-292 and HEP). For antiproliferative activity, the partial least squares (PLS) methodology was applied. Some of the prepared compounds exhibited promising antifungal and antiproliferative properties. The most active compounds for antifungal activity were cyclohexyl[b]thiophene derivatives, and for antiproliferative activity cycloheptyl[b]thiophene derivatives, especially 2-[(1H-indol-2-yl-methylidene)amino]-5,6,7,8-tetrahydro-4H-cyclohepta[b]thiophene-3-carbonitrile (2r), which inhibited more than 97% of the growth of the three cell lines. The PLS discriminant analysis (PLS-DA) applied generated good exploratory and predictive results and showed that the descriptors having shape characteristics were strongly correlated with the biological data.
ERIC Educational Resources Information Center
Henseler, Jorg; Chin, Wynne W.
2010-01-01
In social and business sciences, the importance of the analysis of interaction effects between manifest as well as latent variables steadily increases. Researchers using partial least squares (PLS) to analyze interaction effects between latent variables need an overview of the available approaches as well as their suitability. This article…
Liu, Jie; Zhang, Fu-Dong; Teng, Fei; Li, Jun; Wang, Zhi-Hong
2014-10-01
To detect the oil yield of oil shale in situ, modeling and analysis methods based on portable near infrared spectroscopy were investigated using 66 rock core samples from the No. 2 well of the Fuyu oil shale base in Jilin. Spectra in 3 data formats (reflectance, absorbance and K-M function) were acquired with the developed portable spectrometer. Modeling and analysis experiments were performed to determine the optimum analysis model and method, using the same data pre-processing throughout, 4 different modeling data optimization methods (principal component-Mahalanobis distance (PCA-MD) for eliminating abnormal samples, uninformative variables elimination (UVE) for wavelength selection, and their combinations PCA-MD + UVE and UVE + PCA-MD), and 2 modeling methods (partial least squares (PLS) and back propagation artificial neural network (BPANN)). The results show that the data format, modeling data optimization method and modeling method all affect the analysis precision of the model. Whether or not an optimization method is used, reflectance or K-M function is the proper spectrum format of the modeling database for the two modeling methods. Using two different modeling methods and four different data optimization methods, the model precisions of the same modeling database are different. For the PLS modeling method, the PCA-MD and UVE + PCA-MD data optimization methods can improve the modeling precision of the database using the K-M function spectrum data format. For the BPANN modeling method, the UVE, UVE + PCA-MD and PCA-MD + UVE data optimization methods can improve the modeling precision of the database using any of the 3 spectrum data formats. In addition to using the reflectance spectra and PCA-MD data optimization method, modeling precision by BPANN method is better than that by PLS method. Modeling with reflectance spectra, the UVE optimization method and the BPANN modeling method gives the highest analysis precision, with a correlation coefficient (Rp) of 0.92 and a standard error of prediction (SEP) of 0.69%.
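The three spectral data formats named in this abstract are related by standard transforms: absorbance is A = log10(1/R) and the Kubelka-Munk (K-M) function is F(R) = (1 - R)^2 / (2R). The sketch below applies both to a toy reflectance spectrum; the function names and sample values are illustrative assumptions, not the instrument's software.

```python
import math


def absorbance(reflectance):
    """Apparent absorbance A = log10(1 / R) for a reflectance R in (0, 1]."""
    if reflectance <= 0:
        raise ValueError("reflectance must be positive")
    return -math.log10(reflectance)


def kubelka_munk(reflectance):
    """Kubelka-Munk function F(R) = (1 - R)^2 / (2 R)."""
    if reflectance <= 0:
        raise ValueError("reflectance must be positive")
    return (1.0 - reflectance) ** 2 / (2.0 * reflectance)


# Toy reflectance spectrum (fractions, not percent); values are illustrative.
spectrum = [0.8, 0.5, 0.25]
as_absorbance = [absorbance(r) for r in spectrum]
as_km = [kubelka_munk(r) for r in spectrum]
print(as_absorbance)
print(as_km)
```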
Missing RRI interpolation for HRV analysis using locally-weighted partial least squares regression.
Kamata, Keisuke; Fujiwara, Koichi; Yamakawa, Toshiki; Kano, Manabu
2016-08-01
The R-R interval (RRI) fluctuation in an electrocardiogram (ECG) is called heart rate variability (HRV). Since HRV reflects autonomic nervous function, HRV-based health monitoring services, such as stress estimation, drowsy driving detection, and epileptic seizure prediction, have been proposed. In these HRV-based health monitoring services, precise R wave detection from ECG is required; however, R waves cannot always be detected due to ECG artifacts. Missing RRI data should be interpolated appropriately for HRV analysis. The present work proposes a missing RRI interpolation method using just-in-time (JIT) modeling. The proposed method adopts locally weighted partial least squares (LW-PLS) for RRI interpolation, which is a well-known JIT modeling method used in the field of process control. The usefulness of the proposed method was demonstrated through a case study of real RRI data collected from healthy persons. The proposed JIT-based interpolation method could improve the interpolation accuracy in comparison with a static interpolation method.
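The just-in-time idea is to build a local model around each query at prediction time, weighting stored samples by their similarity to the query. The sketch below is deliberately simplified: it uses a locally weighted straight-line fit on a single input rather than LW-PLS, and the Gaussian kernel, the bandwidth, the beat-index input and the RRI values are all assumptions made for illustration.

```python
import math


def locally_weighted_predict(xs, ys, query, bandwidth):
    """Fit a weighted straight line around `query` and evaluate it there."""
    # Gaussian similarity weights: samples near the query dominate the fit.
    weights = [math.exp(-((x - query) ** 2) / (2.0 * bandwidth ** 2)) for x in xs]
    sw = sum(weights)
    x_bar = sum(w * x for w, x in zip(weights, xs)) / sw
    y_bar = sum(w * y for w, y in zip(weights, ys)) / sw
    sxx = sum(w * (x - x_bar) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - x_bar) * (y - y_bar) for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return y_bar + slope * (query - x_bar)


# Toy RRI series in ms indexed by beat number; beat 5 is treated as missing.
beats = [0, 1, 2, 3, 4, 6, 7, 8, 9]
rri_ms = [812, 805, 798, 790, 801, 820, 826, 815, 808]
estimate = locally_weighted_predict(beats, rri_ms, query=5, bandwidth=1.5)
print("interpolated RRI at beat 5:", estimate)
```

A smaller bandwidth makes the fit more local; LW-PLS additionally handles many correlated inputs by fitting a weighted PLS model instead of a single-input line.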
Whelan, Jessica; Craven, Stephen; Glennon, Brian
2012-01-01
In this study, the application of Raman spectroscopy to the simultaneous quantitative determination of glucose, glutamine, lactate, ammonia, glutamate, total cell density (TCD), and viable cell density (VCD) in a CHO fed-batch process was demonstrated in situ in 3 L and 15 L bioreactors. Spectral preprocessing and partial least squares (PLS) regression were used to correlate spectral data with off-line reference data. Separate PLS calibration models were developed for each analyte at the 3 L laboratory bioreactor scale before assessing its transferability to the same bioprocess conducted at the 15 L pilot scale. PLS calibration models were successfully developed for all analytes bar VCD and transferred to the 15 L scale. Copyright © 2012 American Institute of Chemical Engineers (AIChE).
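A single-response PLS calibration of the kind built per analyte here can be sketched with the NIPALS PLS1 algorithm. This is a minimal pure-Python illustration, not the authors' software: the function names and the two-variable toy data are hypothetical, and no spectral preprocessing or cross-validation is included.

```python
import math


def fit_pls1(X, y, n_components):
    """Fit PLS1 by NIPALS on mean-centred data; returns the model as a dict."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # X residual
    f = [v - y_mean for v in y]                                 # y residual
    W, P, Q = [], [], []
    for _ in range(n_components):
        # Weight vector: direction in X most covariant with the y residual.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate both blocks before extracting the next component.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        W.append(w)
        P.append(p)
        Q.append(q)
    return {"x_mean": x_mean, "y_mean": y_mean, "W": W, "P": P, "Q": Q}


def predict_pls1(model, x):
    """Predict y for one sample by computing its scores with the same deflation."""
    e = [v - mu for v, mu in zip(x, model["x_mean"])]
    y_hat = model["y_mean"]
    for w, p, q in zip(model["W"], model["P"], model["Q"]):
        t = sum(a * b for a, b in zip(e, w))
        y_hat += q * t
        e = [a - t * b for a, b in zip(e, p)]
    return y_hat


# Toy calibration set: two "spectral" variables, response y = 2*x1 - x2 + 3.
X_cal = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0]]
y_cal = [2.0 * a - b + 3.0 for a, b in X_cal]
model = fit_pls1(X_cal, y_cal, n_components=2)
print(predict_pls1(model, [5.0, 5.0]))
```

With as many components as the rank of the centred X block, PLS1 coincides with ordinary least squares; in spectroscopy far fewer components than wavelengths are kept, usually chosen by cross-validation.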
Liu, Xiaona; Zhang, Qiao; Wu, Zhisheng; Shi, Xinyuan; Zhao, Na; Qiao, Yanjiang
2015-01-01
Laser-induced breakdown spectroscopy (LIBS) was applied to perform a rapid elemental analysis and provenance study of Blumea balsamifera DC. Principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) were implemented to exploit the multivariate nature of the LIBS data. Scores and loadings of computed principal components visually illustrated the differing spectral data. The PLS-DA algorithm showed good classification performance. The PLS-DA model using complete spectra as input variables had similar discrimination performance to using selected spectral lines as input variables. The down-selection of spectral lines was specifically focused on the major elements of B. balsamifera samples. Results indicated that LIBS could be used to rapidly analyze elements and to perform provenance study of B. balsamifera. PMID:25558999
Mixture quantification using PLS in plastic scintillation measurements.
Bagán, H; Tarancón, A; Rauret, G; García, J F
2011-06-01
This article reports the capability of plastic scintillation (PS) combined with multivariate calibration (partial least squares; PLS) to detect and quantify alpha and beta emitters in mixtures. While several attempts have been made for this purpose using liquid scintillation (LS), none has been made using PS, which has the great advantage of not producing mixed waste after the measurements are performed. With this objective, ternary mixtures of alpha and beta emitters ((241)Am, (137)Cs and (90)Sr/(90)Y) have been quantified. Procedure optimisation has evaluated the use of the net spectra or the sample spectra, the inclusion of different spectra obtained at different values of the Pulse Shape Analysis parameter and the application of the PLS1 or PLS2 algorithms. The conclusions show that the use of PS+PLS2 applied to the sample spectra, without the use of any pulse shape discrimination, allows quantification of the activities with relative errors less than 10% in most of the cases. This procedure not only allows quantification of mixtures but also reduces measurement time (no blanks are required), and it does not require detectors that include the pulse shape analysis parameter. Copyright © 2011 Elsevier Ltd. All rights reserved.
Farrés, Mireia; Piña, Benjamí; Tauler, Romà
2016-08-01
Copper-containing fungicides are used to protect vineyards from fungal infections. High copper residues in grapes are potentially toxic and affect the microorganisms living in vineyards, such as Saccharomyces cerevisiae. In this study, the response of the metabolic profiles of S. cerevisiae at different concentrations of copper sulphate (control, 1 mM, 3 mM and 6 mM) was analysed by liquid chromatography coupled to mass spectrometry (LC-MS) and multivariate curve resolution-alternating least squares (MCR-ALS) using an untargeted metabolomics approach. Peak areas of the MCR-ALS resolved elution profiles in control and in Cu(II)-treated samples were compared using partial least squares regression (PLSR) and PLS-discriminant analysis (PLS-DA), and the intracellular metabolites best contributing to sample discrimination were selected and identified. Fourteen metabolites showed significant concentration changes upon Cu(II) exposure, following a dose-response effect. The observed changes were consistent with the expected effects of Cu(II) toxicity, including oxidative stress and DNA damage. This research confirmed that LC-MS based metabolomics coupled to chemometric methods is a powerful approach for discerning metabolic changes in S. cerevisiae and for elucidating modes of toxicity of environmental stressors, including heavy metals like Cu(II).
de Almeida, Maurício Liberal; Saatkamp, Cassiano Junior; Fernandes, Adriana Barrinha; Pinheiro, Antonio Luiz Barbosa; Silveira, Landulfo
2016-09-01
Urea and creatinine are commonly used as biomarkers of renal function. Abnormal concentrations of these biomarkers are indicative of pathological processes such as renal failure. This study aimed to develop a model based on Raman spectroscopy to estimate the concentration values of urea and creatinine in human serum. Blood sera from 55 clinically normal subjects and 47 patients with chronic kidney disease undergoing dialysis were collected, and concentrations of urea and creatinine were determined by spectrophotometric methods. A Raman spectrum was obtained with a high-resolution dispersive Raman spectrometer (830 nm). A spectral model was developed based on partial least squares (PLS), where the concentrations of urea and creatinine were correlated with the Raman features. Principal components analysis (PCA) was used to discriminate dialysis patients from normal subjects. The PLS model showed r = 0.97 and r = 0.93 for urea and creatinine, respectively. The root mean square errors of cross-validation (RMSECV) for the model were 17.6 and 1.94 mg/dL, respectively. PCA showed high discrimination between dialysis and normality (95 % accuracy). The Raman technique was able to determine the concentrations with low error and to discriminate dialysis from normal subjects, consistent with a rapid and low-cost test.
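The RMSECV figure of merit quoted above comes from cross-validation: each sample is predicted by a model trained without it. The sketch below shows leave-one-out cross-validation with a univariate straight-line model standing in for the PLS model. The abstract does not state which cross-validation scheme was used, so leave-one-out is an assumption, and the data are invented.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns (slope, intercept)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    return slope, y_bar - slope * x_bar


def rmsecv_loo(xs, ys):
    """Root mean square error of leave-one-out cross-validation."""
    squared = 0.0
    for i in range(len(xs)):
        # Train on every sample except i, then predict sample i.
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        squared += (ys[i] - (slope * xs[i] + intercept)) ** 2
    return math.sqrt(squared / len(xs))


# Toy data: a spectral feature intensity versus reference concentration (mg/dL).
feature = [0.10, 0.22, 0.31, 0.39, 0.52, 0.60]
reference = [20.0, 41.0, 63.0, 79.0, 104.0, 118.0]
print("RMSECV (mg/dL):", rmsecv_loo(feature, reference))
```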
Steingass, Christof Björn; Jutzi, Manfred; Müller, Jenny; Carle, Reinhold; Schmarr, Hans-Georg
2015-03-01
Ripening-dependent changes of pineapple volatiles were studied in a nontargeted profiling analysis. Volatiles were isolated via headspace solid phase microextraction and analyzed by comprehensive 2D gas chromatography and mass spectrometry (HS-SPME-GC×GC-qMS). Profile patterns presented in the contour plots were evaluated applying image processing techniques and subsequent multivariate statistical data analysis. Statistical methods comprised unsupervised hierarchical cluster analysis (HCA) and principal component analysis (PCA) to classify the samples. Supervised partial least squares discriminant analysis (PLS-DA) and partial least squares (PLS) regression were applied to discriminate different ripening stages and describe the development of volatiles during postharvest storage, respectively. Hereby, substantial chemical markers allowing for class separation were revealed. The workflow permitted the rapid distinction between premature green-ripe pineapples and postharvest-ripened sea-freighted fruits. Volatile profiles of fully ripe air-freighted pineapples were similar to those of green-ripe fruits postharvest ripened for 6 days after simulated sea freight export, after PCA with only two principal components. However, PCA considering also the third principal component allowed differentiation between air-freighted fruits and the four progressing postharvest maturity stages of sea-freighted pineapples.
Lee, Hoonsoo; Kim, Moon S; Song, Yu-Rim; Oh, Chang-Sik; Lim, Hyoun-Sub; Lee, Wang-Hee; Kang, Jum-Soon; Cho, Byoung-Kwan
2017-03-01
There is a need to minimize economic damage by sorting infected seeds from healthy seeds before seeding. However, current methods of detecting infected seeds, such as seedling grow-out, enzyme-linked immunosorbent assays, the polymerase chain reaction (PCR) and real-time PCR, have critical drawbacks in that they are time-consuming, labor-intensive and destructive procedures. The present study aimed to evaluate the potential of a visible/near-infrared (Vis/NIR) hyperspectral imaging system for detecting bacteria-infected watermelon seeds. A hyperspectral Vis/NIR reflectance imaging system (spectral region of 400-1000 nm) was constructed to obtain hyperspectral reflectance images for 336 bacteria-infected watermelon seeds, which were then subjected to partial least squares discriminant analysis (PLS-DA) and a least-squares support vector machine (LS-SVM) to distinguish bacteria-infected watermelon seeds from healthy watermelon seeds. The developed system detected bacteria-infected watermelon seeds with an accuracy > 90% (PLS-DA: 91.7%, LS-SVM: 90.5%), suggesting that the Vis/NIR hyperspectral imaging system is effective for quarantining bacteria-infected watermelon seeds. The results of the present study show that it is possible to use the Vis/NIR hyperspectral imaging system for detecting bacteria-infected watermelon seeds. © 2016 Society of Chemical Industry.
Jović, Ozren; Smrečki, Neven; Popović, Zora
2016-04-01
A novel quantitative prediction and variable selection method called interval ridge regression (iRR) is studied in this work. The method is applied to six FTIR data sets, two UV-vis data sets and one DSC data set. The obtained results show that models built with ridge regression on optimal variables selected with iRR significantly outperform models built with ridge regression on all variables in both calibration (6 out of 9 cases) and validation (2 out of 9 cases). In this study, iRR is also compared with interval partial least squares regression (iPLS). iRR outperformed iPLS in validation (insignificantly in 6 out of 9 cases and significantly in one out of 9 cases for p<0.05). Also, iRR can be a fast alternative to iPLS, especially when the degree of complexity of the analyzed system is unknown, i.e. if the upper limit of the number of latent variables is not easily estimated for iPLS. Adulteration of hempseed (H) oil, a well-known health-beneficial nutrient, is studied in this work by mixing it with cheap and widely used oils such as soybean (So) oil, rapeseed (R) oil and sunflower (Su) oil. Binary mixture sets of hempseed oil with these three oils (HSo, HR and HSu) and a ternary mixture set of H oil, R oil and Su oil (HRSu) were considered. The obtained accuracy indicates that using iRR on FTIR and UV-vis data, each particular oil can be very successfully quantified (in all 8 cases RMSEP<1.2%). This means that FTIR-ATR coupled with iRR can very rapidly and effectively determine the level of adulteration in the adulterated hempseed oil (R²>0.99). Copyright © 2015 Elsevier B.V. All rights reserved.
Lao, Wan-li; He, Yu-chan; Li, Gai-yun; Zhou, Qun
2016-01-01
The biomass to plastic ratio in wood plastic composites (WPCs) greatly affects their physical and mechanical properties and price. Fast and accurate evaluation of the biomass to plastic ratio is important for the further development of WPCs. Quantitative analysis of the WPC main composition currently relies primarily on thermo-analytical methods. However, these methods have some inherent disadvantages, including long analysis times, high analytical errors and complexity, which severely limit their applications. Therefore, in this study, Fourier Transform Infrared (FTIR) spectroscopy in combination with partial least squares (PLS) has been used for rapid prediction of bamboo and polypropylene (PP) content in bamboo/PP composites. The bamboo powders were used as filler after being dried at 105 °C for 24 h. PP was used as the matrix material, and some chemical reagents were used as additives. Then 42 WPC samples with different ratios of bamboo and PP were prepared by extrusion. FTIR spectral data of the 42 WPC samples were collected by means of the KBr pellet technique. The model for bamboo and PP content prediction was developed by PLS-2 and full cross validation. Results of internal cross validation showed that the first derivative spectra in the range of 1800-800 cm⁻¹ corrected by standard normal variate (SNV) yielded the optimal model. For both bamboo and PP calibration, the coefficients of determination (R²) were 0.955. The standard errors of calibration (SEC) were 1.872 for bamboo content and 1.848 for PP content, respectively. For both bamboo and PP validation, the R² values were 0.950. The standard errors of cross validation (SECV) were 1.927 for bamboo content and 1.950 for PP content, respectively. The ratios of performance to deviation (RPD) were 4.45 for both biomass and PP examinations. The results of external validation showed that the relative prediction deviations for both biomass and PP contents were lower than ± 6%. FTIR combined with PLS can be used for rapid and accurate determination of bamboo and PP content in bamboo/PP composites.
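Two quantities from this abstract are easy to make concrete: the standard normal variate (SNV) correction, which centres and scales each spectrum by its own mean and standard deviation, and the ratio of performance to deviation (RPD), taken here as the standard deviation of the reference values divided by SECV. The sketch assumes the sample standard deviation (n - 1 denominator), and the toy values are illustrative.

```python
import statistics


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own stats."""
    mean = statistics.mean(spectrum)
    std = statistics.stdev(spectrum)  # sample standard deviation (n - 1)
    return [(v - mean) / std for v in spectrum]


def rpd(reference_values, secv):
    """Ratio of performance to deviation: SD of reference values over SECV."""
    return statistics.stdev(reference_values) / secv


# Toy spectrum and toy reference contents (%); values are not from the paper.
corrected = snv([0.42, 0.47, 0.55, 0.61, 0.58])
ratio = rpd([30.0, 40.0, 50.0, 60.0, 70.0], secv=2.0)
print(corrected)
print("RPD:", ratio)
```

Read the other way round, the reported RPD of 4.45 with an SECV near 1.93 would correspond to a reference standard deviation of roughly 8.6 under this definition.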
NASA Astrophysics Data System (ADS)
Jintao, Xue; Yufei, Liu; Liming, Ye; Chunyan, Li; Quanwei, Yang; Weiying, Wang; Yun, Jing; Minxiang, Zhang; Peng, Li
2018-01-01
Near-Infrared Spectroscopy (NIRS) was used for the first time to develop a method for rapid and simultaneous determination of 5 active alkaloids (berberine, coptisine, palmatine, epiberberine and jatrorrhizine) in 4 parts (rhizome, fibrous root, stem and leaf) of Coptidis Rhizoma. A total of 100 samples from 4 main places of origin were collected and studied. With HPLC analysis values as calibration reference, the quantitative analysis of the 5 marker components was performed by two different modeling methods, partial least-squares (PLS) regression as linear regression and artificial neural networks (ANN) as non-linear regression. The results indicated that the 2 types of models established were robust, accurate and repeatable for the five active alkaloids, and that the ANN models were more suitable for the determination of berberine, coptisine and palmatine while the PLS model was more suitable for the analysis of epiberberine and jatrorrhizine. The performance of the optimal models was as follows: the correlation coefficient (R) for berberine, coptisine, palmatine, epiberberine and jatrorrhizine was 0.9958, 0.9956, 0.9959, 0.9963 and 0.9923, respectively; the root mean square error of prediction (RMSEP) was 0.5093, 0.0578, 0.0443, 0.0563 and 0.0090, respectively. Furthermore, for the comprehensive exploitation and utilization of the plant resource of Coptidis Rhizoma, the established NIR models were used to analyze the content of the 5 active alkaloids in the 4 parts of Coptidis Rhizoma and the 4 main places of origin. This work demonstrated that NIRS may be a promising method for routine screening in off-line fast analysis or on-line quality assessment of traditional Chinese medicine (TCM).
Monakhova, Yulia B; Diehl, Bernd W K; Do, Tung X; Schulze, Margit; Witzleben, Steffen
2018-02-05
Apart from the characterization of impurities, the full characterization of heparin and low molecular weight heparin (LMWH) also requires the determination of average molecular weight, which is closely related to the pharmaceutical properties of anticoagulant drugs. To determine the average molecular weight of these animal-derived polymer products, partial least squares regression (PLS) was utilized for modelling of diffusion-ordered spectroscopy NMR data (DOSY) of a representative set of heparin (n=32) and LMWH (n=30) samples. The same sets of samples were measured by gel permeation chromatography (GPC) to obtain reference data. The application of PLS to the data led to calibration models with root mean square errors of prediction of 498 Da and 179 Da for heparin and LMWH, respectively. The average coefficients of variation (CVs) did not exceed 2.1% excluding sample preparation (by successively measuring one solution, n=5) and 2.5% including sample preparation (by preparing and analyzing separate samples, n=5). An advantage of the method is that the sample after standard 1D NMR characterization can be used for the molecular weight determination without further manipulation. The accuracy of the multivariate models is better than the previous results for other matrices employing internal standards. Therefore, the DOSY experiment is recommended for the calculation of the molecular weight of heparin products as a complementary measurement to standard 1D NMR quality control. The method can be easily transferred to other matrices as well. Copyright © 2017 Elsevier B.V. All rights reserved.
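The repeatability figures above are coefficients of variation, i.e. the standard deviation of replicate results expressed as a percentage of their mean. A minimal sketch follows; the replicate molecular weights are invented and the sample standard deviation is assumed.

```python
import statistics


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation divided by the mean, times 100."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


# Hypothetical replicate molecular-weight results in Da (n=5); not real data.
replicates_da = [15800.0, 16100.0, 15950.0, 16250.0, 15900.0]
print("CV (%):", coefficient_of_variation(replicates_da))
```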
NASA Astrophysics Data System (ADS)
Pérez-Rodríguez, Marta; Horák-Terra, Ingrid; Rodríguez-Lado, Luis; Martínez Cortizas, Antonio
2016-11-01
Despite its potential, infrared spectroscopy combined with multivariate statistics has seldom been used to model peat properties with environmental value, such as the concentration of potentially toxic metals. In this research, we applied attenuated total reflectance (ATR) Fourier-Transform Infrared (FTIR) spectroscopy to evaluate the ability of the technique to predict mercury concentrations in late-Pleistocene/Holocene peat from a minerogenic peatland from Minas Gerais (Brazil). Mercury concentrations were analysed using a Milestone DMA-80 analyzer, and FTIR-ATR was performed using a Gladi-ATR (Pike Technologies) in the mid IR spectrum (4000-400 cm⁻¹). Concentrations were modelled using principal components regression (PCR) and partial least squares regression (PLS). The performance of the models varied between moderate and very good (R² 0.67-0.90), with low RMSD values (0.35-1.06). A PLS model based on three latent vectors (LV1 to LV3) provided the best results (R² 0.90, RMSD 0.35). LV1 reflected total organic matter content versus mineral matter (mainly quartz from local fluxes), LV2 was related to dust deposition from regional sources, and LV3 reflected peat organic matter decomposition. Compared to a previous investigation based on geochemical data, the spectroscopy-based PLS model performed better, but it has to be complemented with additional data (such as δ13C ratios) to reliably reproduce the changes of the factors controlling mercury accumulation over time. This time- and cost-effective methodology may help to develop multi-core approaches to study the variability in mercury accumulation, and probably also in other peat properties, within and between mires of a similar type and area. Fig. S2 Loading weights of the three and two significant components from the direct (dPCR) and transposed (trPCR) PCR models. Fig. S3 Depth records of the cumulative effects of the factors involved in the variation of mercury concentrations. Left, MIR-PLS model; centre, MIR-PLS + δ13C data model; right, geochemical model from Pérez-Rodríguez et al. [44].
Lu, Shao Hua; Li, Bao Qiong; Zhai, Hong Lin; Zhang, Xin; Zhang, Zhuo Yong
2018-04-25
Terahertz time-domain spectroscopy (THz-TDS) has been applied in many fields; however, it still encounters drawbacks in the analysis of multicomponent mixtures because of serious spectral overlap. Here, an effective approach to quantitative analysis was proposed and applied to the determination of ternary amino acids in a foxtail millet substrate. Using three parameters derived from THz-TDS, images were constructed and Tchebichef image moments were used to extract the information of the target components. The quantitative models were then obtained by stepwise regression. The correlation coefficients of leave-one-out cross-validation (R²_loo-cv) were more than 0.9595. For the external test set, the predictive correlation coefficients (R²_p) were more than 0.8026 and the root mean square errors of prediction (RMSE_p) were less than 1.2601. Compared with the traditional methods (PLS and N-PLS), our approach is more accurate, robust and reliable, and is a potentially excellent approach for quantifying multicomponent mixtures with THz-TDS. Copyright © 2017 Elsevier Ltd. All rights reserved.
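As a rough illustration of the leave-one-out statistic quoted in the abstract above, the sketch below computes a cross-validated R² for a one-variable least-squares fit. The data and the function names (`fit_line`, `loo_cv_r2`) are hypothetical; the paper's models are built on Tchebichef image moments selected by stepwise regression, which is not reproduced here.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for a single predictor."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def loo_cv_r2(xs, ys):
    """Leave-one-out cross-validated R^2, computed as 1 - PRESS / SS_total."""
    press = 0.0
    for i in range(len(xs)):
        # Refit the model without sample i, then predict sample i.
        x_train = xs[:i] + xs[i + 1:]
        y_train = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(x_train, y_train)
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    my = sum(ys) / len(ys)
    ss_total = sum((y - my) ** 2 for y in ys)
    return 1.0 - press / ss_total


# Hypothetical calibration data: one predictor vs. reference concentration.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0]
q2 = loo_cv_r2(x, y)
```

Because each sample is predicted by a model that never saw it, this statistic is normally lower than the R² of the fit to all samples, which is why abstracts usually report the two separately.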
Mello, Cesar; Ribeiro, Diórginis; Novaes, Fábio; Poppi, Ronei J
2005-10-01
Use of classical microbiological methods to differentiate bacteria that cause gastroenteritis is cumbersome but usually very efficient. The high cost of reagents and the time required for such identifications, approximately four days, could have serious consequences, however, mainly when the patients are children, the elderly, or adults with low resistance. The search for new methods enabling rapid and reagentless differentiation of these microorganisms is, therefore, extremely relevant. In this work the main microorganisms responsible for gastroenteritis, Escherichia coli, Salmonella choleraesuis, and Shigella flexneri, were studied. For each microorganism sixty different dispersions were prepared in physiological solution. The Raman spectra of these dispersions were recorded using a diode laser operating in the near infrared region. Partial least-squares (PLS) discriminant analysis was used to differentiate among the bacteria by use of their respective Raman spectra. This approach enabled correct classification of 100% of the bacteria evaluated and unknown samples from the clinical environment, in less time (approximately 10 h), by use of a low-cost, portable Raman spectrometer, which can be easily used in intensive care units and clinical environments.
NASA Astrophysics Data System (ADS)
Anne, Marie-Laure; Le Lan, Caroline; Monbet, Valérie; Boussard-Plédel, Catherine; Ropert, Martine; Sire, Olivier; Pouchard, Michel; Jard, Christine; Lucas, Jacques; Adam, Jean Luc; Brissot, Pierre; Bureau, Bruno; Loréal, Olivier
2009-09-01
Fiber evanescent wave spectroscopy (FEWS) explores the mid-infrared domain, providing information on functional chemical groups represented in the sample. Our goal is to evaluate whether spectral fingerprints obtained by FEWS might guide clinical diagnosis. Serum samples from normal volunteers and from four groups of patients with metabolic abnormalities are analyzed by FEWS. These groups consist of iron overloaded genetic hemochromatosis (GH), iron depleted GH, cirrhosis, and dysmetabolic hepatosiderosis (DYSH). A partial least squares (PLS) logistic method is used in a training group to create a classification algorithm, thereafter applied to a test group. Patients with cirrhosis or DYSH, two groups exhibiting important metabolic disturbances, are clearly discriminated from control groups with AUROC values of 0.94 ± 0.05 and 0.90 ± 0.06, and sensitivity/specificity of 86/84% and 87/87%, respectively. When pooling all groups, the PLS method helps to discriminate controls, cirrhotic, and dysmetabolic patients. Our data demonstrate that metabolic profiling using infrared FEWS is a possible way to investigate metabolic alterations in patients.
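The AUROC and sensitivity/specificity figures reported above can be computed from classifier scores as in the following minimal sketch. The scores, the 0.5 threshold and the function names are assumptions made for illustration; they are not taken from the study, whose scores come from a PLS logistic model.

```python
def auroc(scores_pos, scores_neg):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def sensitivity_specificity(scores_pos, scores_neg, threshold):
    """Fraction of positives at or above the threshold, and of negatives below it."""
    sens = sum(s >= threshold for s in scores_pos) / len(scores_pos)
    spec = sum(s < threshold for s in scores_neg) / len(scores_neg)
    return sens, spec


# Hypothetical classifier outputs (e.g. predicted probability of disease).
patients = [0.9, 0.8, 0.7, 0.4]
controls = [0.6, 0.3, 0.2, 0.1]
area = auroc(patients, controls)
sens, spec = sensitivity_specificity(patients, controls, threshold=0.5)
```

The AUROC summarises discrimination over all possible thresholds, whereas a sensitivity/specificity pair describes one chosen operating point, so both are usually reported together.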
Dong, Yanhong; Li, Juan; Zhong, Xiaoxiao; Cao, Liya; Luo, Yang; Fan, Qi
2016-04-15
This paper establishes a novel method to simultaneously predict the tablet weight (TW) and trimethoprim (TMP) content of compound sulfamethoxazole tablets (SMZCO) by near infrared (NIR) spectroscopy with partial least squares (PLS) regression for controlling the uniformity of dosage units (UODU). The NIR spectra for 257 samples were measured using the optimized parameter values and pretreated using the optimized chemometric techniques. After the outliers were ignored, two PLS models for predicting TW and TMP content were respectively established by using the selected spectral sub-ranges and the reference values. The TW model reaches the correlation coefficient of calibration (R(c)) 0.9543 and the TMP content model has the R(c) 0.9205. The experimental results indicate that this strategy expands the NIR application in controlling UODU, especially in the high-throughput and rapid analysis of TWs and contents of the compound pharmaceutical tablets, and may be an important complement to the common NIR on-line analytical method for pharmaceutical tablets. Copyright © 2016 Elsevier B.V. All rights reserved.
Ciura, Krzesimir; Belka, Mariusz; Kawczak, Piotr; Bączek, Tomasz; Markuszewski, Michał J; Nowakowska, Joanna
2017-09-05
The objective of this paper is to build QSRR/QSAR models for predicting blood-brain barrier (BBB) permeability. The obtained models are based on salting-out thin layer chromatography (SOTLC) constants and calculated molecular descriptors. Among chromatographic methods, SOTLC was chosen because its mobile phases are free of organic solvent. As a consequence, they are less toxic and have a lower environmental impact than those of classical reversed-phase liquid chromatography (RPLC). During the study, three stationary phases (silica gel, cellulose plates and neutral aluminum oxide) were examined. The model set of solutes covers a wide range of log BB values, containing compounds that cross the BBB readily and molecules poorly distributed to the brain, including drugs acting on the nervous system as well as peripherally acting drugs. Additionally, three regression models were compared: multiple linear regression (MLR), partial least squares (PLS) and orthogonal partial least squares (OPLS). The designed QSRR/QSAR models could be useful for predicting the BBB permeability of newly synthesized compounds in the drug development pipeline and are attractive alternatives to time-consuming and demanding direct methods for log BB measurement. The study also showed that significant differences in model performance, measured by R2 and Q2, can be obtained among regression techniques; hence it is strongly suggested to evaluate all available options, such as MLR, PLS and OPLS. Copyright © 2017 Elsevier B.V. All rights reserved.
Fadzillah, Nurrulhidayah Ahmad; Man, Yaakob bin Che; Rohman, Abdul; Rosman, Arieff Salleh; Ismail, Amin; Mustafa, Shuhaimi; Khatib, Alfi
2015-01-01
The authentication of food products with respect to the presence of components not allowed by certain religions, such as lard, is very important. In this study, we used proton nuclear magnetic resonance ((1)H-NMR) spectroscopy for the analysis of butter adulterated with lard by simultaneous quantification of all proton-bearing compounds, and consequently all relevant sample classes. Since the spectra obtained were too complex to be analyzed visually, the classification of spectra was carried out. Partial least squares (PLS) regression, a multivariate calibration method, was used to model the relationship between the actual and predicted values of lard. The model yielded the highest coefficient of determination (R(2)) of 0.998, the lowest root mean square error of calibration (RMSEC) of 0.0091% and a root mean square error of prediction (RMSEP) of 0.0090. Cross-validation testing was used to evaluate the predictive power of the model. The PLS model was shown to be a good model, as the intercepts of R(2)Y and Q(2)Y were 0.0853 and -0.309, respectively.
NASA Astrophysics Data System (ADS)
Shao, Yongni; Xie, Chuanqi; Jiang, Linjun; Shi, Jiahui; Zhu, Jiajin; He, Yong
2015-04-01
Visible/near infrared spectroscopy (Vis/NIR) based on sensitive wavelengths (SWs) and chemometrics was proposed to discriminate different tomatoes bred by spaceflight mutagenesis from their leaves or fruits (green or mature). The tomato breeds were mutant M1, mutant M2 and their parent. Partial least squares (PLS) analysis and least squares-support vector machine (LS-SVM) were implemented for calibration models. PLS analysis was applied to different wavebands, including the visible region (400-700 nm) and the near infrared region (700-1000 nm). The best PLS models were achieved in the visible region for the leaf and green fruit samples and in the near infrared region for the mature fruit samples. Furthermore, different latent variables (4-8 LVs for leaves, 5-9 LVs for green fruits, and 4-9 LVs for mature fruits) were used as inputs of LS-SVM to develop the LV-LS-SVM models with the grid search technique and radial basis function (RBF) kernel. The optimal LV-LS-SVM models were achieved with six LVs for the leaf samples, seven LVs for green fruits, and six LVs for mature fruits, respectively, and they outperformed the PLS models. Moreover, independent component analysis (ICA) was executed to select several SWs based on loading weights. The optimal LS-SVM model was achieved with SWs of 550-560 nm, 562-574 nm, 670-680 nm and 705-715 nm for the leaf samples; 548-556 nm, 559-564 nm, 678-685 nm and 962-974 nm for the green fruit samples; and 712-718 nm, 720-729 nm, 968-978 nm and 820-830 nm for the mature fruit samples. All of them had better performance than PLS and LV-LS-SVM, with the parameters of correlation coefficient (rp), root mean square error of prediction (RMSEP) and bias of 0.9792, 0.2632 and 0.0901 based on leaf discrimination, 0.9837, 0.2783 and 0.1758 based on green fruit discrimination, 0.9804, 0.2215 and -0.0035 based on mature fruit discrimination, respectively.
The overall results indicated that ICA was an effective way for the selection of SWs, and the Vis/NIR combined with LS-SVM models had the capability to predict the different breeds (mutant M1, mutant M2 and their parent) of tomatoes from leaves and fruits.
Zhang, Xuan; Li, Wei; Yin, Bin; Chen, Weizhong; Kelly, Declan P; Wang, Xiaoxin; Zheng, Kaiyi; Du, Yiping
2013-10-01
Coffee is the most heavily consumed beverage in the world after water, and its quality is a key consideration in commercial trade. Therefore, caffeine content, which has a significant effect on the final quality of coffee products, needs to be determined quickly and reliably by new analytical techniques. The main purpose of this work was to establish a powerful and practical analytical method based on near infrared spectroscopy (NIRS) and chemometrics for quantitative determination of caffeine content in roasted Arabica coffees. Ground coffee samples covering a wide range of roast levels were analyzed by NIR, while their caffeine contents were quantitatively determined by the most commonly used HPLC-UV method to provide reference values. Calibration models based on chemometric analyses of the NIR spectral data and reference concentrations of coffee samples were then developed. Partial least squares (PLS) regression was used to construct the models. Furthermore, diverse spectral pretreatment and variable selection techniques were applied in order to obtain robust and reliable reduced-spectrum regression models. Comparing the respective quality of the different models constructed, the application of second derivative pretreatment and stability competitive adaptive reweighted sampling (SCARS) variable selection provided a notably improved regression model, with root mean square error of cross validation (RMSECV) of 0.375 mg/g and correlation coefficient (R) of 0.918 at PLS factor of 7. An independent test set was used to assess the model, with the root mean square error of prediction (RMSEP) of 0.378 mg/g, mean relative error of 1.976% and mean relative standard deviation (RSD) of 1.707%.
Thus, the results provided by the high-quality calibration model revealed the feasibility of NIR spectroscopy for at-line application to predict the caffeine content of unknown roasted coffee samples, thanks to the short analysis time of a few seconds and non-destructive advantages of NIRS. Copyright © 2013 Elsevier B.V. All rights reserved.
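The figures of merit quoted in the abstract above (RMSECV, RMSEP, R and mean relative error) share simple definitions, sketched below. The reference and predicted values are hypothetical and are not data from the study. RMSECV and RMSEP use the same formula and differ only in the samples scored: held-out cross-validation predictions for the former, an independent test set for the latter.

```python
import math


def rmse(reference, predicted):
    """Root mean square error between reference and predicted values."""
    n = len(reference)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)


def pearson_r(reference, predicted):
    """Correlation coefficient between reference and predicted values."""
    n = len(reference)
    mr = sum(reference) / n
    mp = sum(predicted) / n
    sxy = sum((r - mr) * (p - mp) for r, p in zip(reference, predicted))
    sxx = sum((r - mr) ** 2 for r in reference)
    syy = sum((p - mp) ** 2 for p in predicted)
    return sxy / math.sqrt(sxx * syy)


def mean_relative_error(reference, predicted):
    """Mean absolute relative error, in percent of the reference value."""
    n = len(reference)
    return 100.0 * sum(abs(p - r) / r for r, p in zip(reference, predicted)) / n


# Hypothetical caffeine contents in mg/g: HPLC reference vs. NIR prediction.
hplc = [10.0, 12.0, 14.0, 16.0]
nir = [10.5, 11.5, 14.5, 15.5]
error = rmse(hplc, nir)
r = pearson_r(hplc, nir)
mre = mean_relative_error(hplc, nir)
```

RMSE keeps the units of the measured quantity (mg/g here), which makes it easier to judge against the precision of the reference method than the unitless correlation coefficient.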
Hutengs, Christopher; Ludwig, Bernard; Jung, András; Eisele, Andreas; Vohland, Michael
2018-03-27
Mid-infrared (MIR) spectroscopy has received widespread interest as a method to complement traditional soil analysis. Recently available portable MIR spectrometers additionally offer potential for on-site applications, given sufficient spectral data quality. We therefore tested the performance of the Agilent 4300 Handheld FTIR (DRIFT spectra) in comparison to a Bruker Tensor 27 bench-top instrument in terms of (i) spectral quality and measurement noise quantified by wavelet analysis; (ii) accuracy of partial least squares (PLS) calibrations for soil organic carbon (SOC), total nitrogen (N), pH, clay and sand content with a repeated cross-validation analysis; and (iii) key spectral regions for these soil properties identified with a Monte Carlo spectral variable selection approach. Measurements and multivariate calibrations with the handheld device were as good as or slightly better than Bruker equipped with a DRIFT accessory, but not as accurate as with directional hemispherical reflectance (DHR) data collected with an integrating sphere. Variations in noise did not markedly affect the accuracy of multivariate PLS calibrations. Identified key spectral regions for PLS calibrations provided a good match between Agilent and Bruker DHR data, especially for SOC and N. Our findings suggest that portable FTIR instruments are a viable alternative for MIR measurements in the laboratory and offer great potential for on-site applications.
Firefly as a novel swarm intelligence variable selection method in spectroscopy.
Goodarzi, Mohammad; dos Santos Coelho, Leandro
2014-12-10
A critical step in multivariate calibration is wavelength selection, which is used to build models with better prediction performance when applied to spectral data. Up to now, many feature selection techniques have been developed. Among these, techniques based on swarm intelligence optimization are of particular interest, since they usually simulate animal and insect behavior, e.g., finding the shortest path between a food source and the nest. Such a decision is made by a crowd, leading to a more robust model that is less prone to falling into local minima during the optimization cycle. This paper presents a novel feature selection approach for spectroscopic data, leading to more robust calibration models. The performance of the firefly algorithm, a swarm intelligence paradigm, was evaluated and compared with genetic algorithm and particle swarm optimization. All three techniques were coupled with partial least squares (PLS) and applied to three spectroscopic data sets. They demonstrate improved prediction results in comparison to a PLS model built using all wavelengths. Results show that the firefly algorithm, as a novel swarm paradigm, leads to a lower number of selected wavelengths while the prediction performance of the built PLS model stays the same. Copyright © 2014. Published by Elsevier B.V.
Adedipe, Oluwatosin E; Johanningsmeier, Suzanne D; Truong, Van-Den; Yencho, G Craig
2016-03-02
This study investigated the ability of near-infrared spectroscopy (NIRS) to predict acrylamide content in French-fried potato. Potato flour spiked with acrylamide (50-8000 μg/kg) was used to determine if acrylamide could be accurately predicted in a potato matrix. French fries produced with various pretreatments and cook times (n = 84) and obtained from quick-service restaurants (n = 64) were used for model development and validation. Acrylamide was quantified using gas chromatography-mass spectrometry, and reflectance spectra (400-2500 nm) of each freeze-dried sample were captured on a Foss XDS Rapid Content Analyzer-NIR spectrometer. Partial least-squares (PLS) discriminant analysis and PLS regression modeling demonstrated that NIRS could accurately detect acrylamide content as low as 50 μg/kg in the model potato matrix. Prediction errors of 135 μg/kg (R(2) = 0.98) and 255 μg/kg (R(2) = 0.93) were achieved with the best PLS models for acrylamide prediction in Russet Norkotah French-fried potato and multiple samples of unknown varieties, respectively. The findings indicate that NIRS can be used as a screening tool in potato breeding and potato processing research to reduce acrylamide in the food supply.
Yao, Sen; Li, Tao; Liu, HongGao; Li, JieQing; Wang, YuanZhong
2018-04-01
Boletaceae mushrooms are wild-grown edible mushrooms with high nutritional value, delicious flavor and large economic value that are distributed in Yunnan Province, China. Traceability is important for the authentication and quality assessment of Boletaceae mushrooms. In this study, UV-visible and Fourier transform infrared (FTIR) spectroscopies were applied for traceability of 247 Boletaceae mushroom samples in combination with chemometrics. Compared with a single spectroscopy technique, a data fusion strategy can obviously improve the classification performance in partial least square discriminant analysis (PLS-DA) and grid-search support vector machine (GS-SVM) models, for both species and geographical origin traceability. In addition, PLS-DA and GS-SVM models can provide 100.00% accuracy for species traceability and have reliable evaluation parameters. For geographical origin traceability, the accuracy of prediction in the PLS-DA model by data fusion was just 64.63%, but that of the GS-SVM model based on data fusion was 100.00%. The results demonstrated that the data fusion strategy of UV-visible and FTIR combined with GS-SVM could provide a higher synergic effect for traceability of Boletaceae mushrooms and have a good generalization ability for the comprehensive quality control and evaluation of similar foods. © 2017 Society of Chemical Industry.
Differences in chewing sounds of dry-crisp snacks by multivariate data analysis
NASA Astrophysics Data System (ADS)
De Belie, N.; Sivertsvik, M.; De Baerdemaeker, J.
2003-09-01
Chewing sounds of different types of dry-crisp snacks (two types of potato chips, prawn crackers, cornflakes and low calorie snacks from extruded starch) were analysed to assess differences in sound emission patterns. The emitted sounds were recorded by a microphone placed over the ear canal. The first bite and the first subsequent chew were selected from the time signal and a fast Fourier transformation provided the power spectra. Different multivariate analysis techniques were used for classification of the snack groups. This included principal component analysis (PCA) and unfold partial least-squares (PLS) algorithms, as well as multi-way techniques such as three-way PLS, three-way PCA (Tucker3), and parallel factor analysis (PARAFAC) on the first bite and subsequent chew. The models were evaluated by calculating the classification errors and the root mean square error of prediction (RMSEP) for independent validation sets. It appeared that the logarithm of the power spectra obtained from the chewing sounds could be used successfully to distinguish the different snack groups. When different chewers were used, recalibration of the models was necessary. Multi-way models distinguished better between chewing sounds of different snack groups than PCA on bite or chew separately and than unfold PLS. From all three-way models applied, N-PLS with three components showed the best classification capabilities, resulting in classification errors of 14-18%. The major amount of incorrect classifications was due to one type of potato chips that had a very irregular shape, resulting in a wide variation of the emitted sounds.
A multiple hold-out framework for Sparse Partial Least Squares.
Monteiro, João M; Rao, Anil; Shawe-Taylor, John; Mourão-Miranda, Janaina
2016-09-15
Supervised classification machine learning algorithms may have limitations when studying brain diseases with heterogeneous populations, as the labels might be unreliable. More exploratory approaches, such as Sparse Partial Least Squares (SPLS), may provide insights into the brain's mechanisms by finding relationships between neuroimaging and clinical/demographic data. The identification of these relationships has the potential to improve the current understanding of disease mechanisms, refine clinical assessment tools, and stratify patients. SPLS finds multivariate associative effects in the data by computing pairs of sparse weight vectors, where each pair is used to remove its corresponding associative effect from the data by matrix deflation, before computing additional pairs. We propose a novel SPLS framework which selects the adequate number of voxels and clinical variables to describe each associative effect, and tests their reliability by fitting the model to different splits of the data. As a proof of concept, the approach was applied to find associations between grey matter probability maps and individual items of the Mini-Mental State Examination (MMSE) in a clinical sample with various degrees of dementia. The framework found two statistically significant associative effects between subsets of brain voxels and subsets of the questions/tasks. SPLS was compared with its non-sparse version (PLS). The use of projection deflation versus a classical PLS deflation was also tested in both PLS and SPLS. SPLS outperformed PLS, finding statistically significant effects and providing higher correlation values in hold-out data. Moreover, projection deflation provided better results. Copyright © 2016 The Author(s). Published by Elsevier B.V. All rights reserved.
Quantification of brain lipids by FTIR spectroscopy and partial least squares regression
NASA Astrophysics Data System (ADS)
Dreissig, Isabell; Machill, Susanne; Salzer, Reiner; Krafft, Christoph
2009-01-01
Brain tissue is characterized by high lipid content. The lipid content decreases and the lipid composition changes during transformation from normal brain tissue to tumors. Therefore, the analysis of brain lipids might complement the existing diagnostic tools to determine the tumor type and tumor grade. The objective of this work is to extract lipids from gray matter and white matter of porcine brain tissue, record infrared (IR) spectra of these extracts and develop a quantification model for the main lipids based on partial least squares (PLS) regression. IR spectra of the pure lipids cholesterol, cholesterol ester, phosphatidic acid, phosphatidylcholine, phosphatidylethanolamine, phosphatidylserine, phosphatidylinositol, sphingomyelin, galactocerebroside and sulfatide were used as references. Two lipid mixtures were prepared for training and validation of the quantification model. The composition of lipid extracts that were predicted by the PLS regression of IR spectra was compared with lipid quantification by thin layer chromatography.
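To make the idea of a PLS quantification model for mixture spectra concrete, the sketch below fits a single-response PLS model (PLS1) with the NIPALS algorithm and predicts the concentration of one component in a new mixture. Everything numerical is hypothetical: the two "pure spectra", the four spectral channels and the concentrations are invented, and the abstract does not state which PLS variant or software the authors used.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def column(M, j):
    return [row[j] for row in M]


def pls1_fit(X, y, n_components):
    """Fit PLS1 (one response) with the NIPALS algorithm on mean-centred data."""
    n, m = len(X), len(X[0])
    x_mean = [sum(column(X, j)) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # X residual
    f = [v - y_mean for v in y]                                 # y residual
    weights, loadings, y_loadings = [], [], []
    for _ in range(n_components):
        w = [dot(column(E, j), f) for j in range(m)]  # direction of maximal covariance with y
        norm = dot(w, w) ** 0.5
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]                # latent-variable scores
        tt = dot(t, t)
        p = [dot(column(E, j), t) / tt for j in range(m)]  # X loadings
        q = dot(f, t) / tt                                  # y loading
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]  # deflate X
        f = [f[i] - q * t[i] for i in range(n)]                            # deflate y
        weights.append(w)
        loadings.append(p)
        y_loadings.append(q)
    return x_mean, y_mean, weights, loadings, y_loadings


def pls1_predict(model, x):
    """Predict one sample by repeating the score and deflation steps on its spectrum."""
    x_mean, y_mean, weights, loadings, y_loadings = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in zip(weights, loadings, y_loadings):
        t = dot(e, w)
        y_hat += q * t
        e = [ej - t * pj for ej, pj in zip(e, p)]
    return y_hat


# Hypothetical two-component mixtures measured at four spectral channels.
pure_spectra = [[1.0, 0.5, 0.2, 0.0], [0.0, 0.3, 0.8, 1.0]]
concentrations = [[1.0, 4.0], [2.0, 1.0], [3.0, 3.0], [4.0, 2.0], [5.0, 5.0]]
spectra = [[c[0] * pure_spectra[0][j] + c[1] * pure_spectra[1][j] for j in range(4)]
           for c in concentrations]
model = pls1_fit(spectra, [c[0] for c in concentrations], n_components=2)
new_spectrum = [2.5 * pure_spectra[0][j] + 1.5 * pure_spectra[1][j] for j in range(4)]
predicted = pls1_predict(model, new_spectrum)
```

With noise-free, strictly additive spectra and as many latent variables as independent components, the model recovers the concentration essentially exactly; with real spectra the number of latent variables is normally chosen by cross-validation.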
DOE Office of Scientific and Technical Information (OSTI.GOV)
Cen Haiyan; Bao Yidan; He Yong
2006-10-10
Visible and near-infrared reflectance (visible-NIR) spectroscopy is applied to discriminate different varieties of bayberry juices. The discrimination of visible-NIR spectra from samples is a matter of pattern recognition. By partial least squares (PLS), the spectrum is reduced to certain factors, which are then taken as the input of the backpropagation neural network (BPNN). Through training and prediction, three different varieties of bayberry juice are classified based on the output of the BPNN. In addition, a mathematical model is built and the algorithm is optimized. With proper parameters in the training set, 100% accuracy is obtained by the BPNN. Thus it is concluded that the PLS analysis combined with the BPNN is an alternative for pattern recognition based on visible and NIR spectroscopy.
Consistent Partial Least Squares Path Modeling via Regularization.
Jung, Sunho; Park, JaeHong
2018-01-01
Partial least squares (PLS) path modeling is a component-based structural equation modeling that has been adopted in social and psychological research due to its data-analytic capability and flexibility. A recent methodological advance is consistent PLS (PLSc), designed to produce consistent estimates of path coefficients in structural models involving common factors. In practice, however, PLSc may frequently encounter multicollinearity in part because it takes a strategy of estimating path coefficients based on consistent correlations among independent latent variables. PLSc has yet no remedy for this multicollinearity problem, which can cause loss of statistical power and accuracy in parameter estimation. Thus, a ridge type of regularization is incorporated into PLSc, creating a new technique called regularized PLSc. A comprehensive simulation study is conducted to evaluate the performance of regularized PLSc as compared to its non-regularized counterpart in terms of power and accuracy. The results show that our regularized PLSc is recommended for use when serious multicollinearity is present.
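The effect of a ridge-type penalty under multicollinearity can be seen in a two-predictor toy case. The sketch below solves (R + kI)b = r for standardised path coefficients, where R is the correlation matrix of two latent variables and r holds their correlations with the outcome. The correlations and the ridge constant k = 0.1 are hypothetical, and this is generic ridge regression rather than the authors' regularized PLSc estimator, whose details are not given in the abstract.

```python
def ridge_path_coefficients(r12, r1y, r2y, ridge=0.0):
    """Standardised coefficients for two correlated predictors: (R + ridge*I)^-1 r_y."""
    d = 1.0 + ridge            # diagonal of R + ridge*I
    det = d * d - r12 * r12    # determinant of the 2x2 matrix
    b1 = (d * r1y - r12 * r2y) / det
    b2 = (d * r2y - r12 * r1y) / det
    return b1, b2


# Hypothetical correlations: the two latent predictors are almost collinear.
ols = ridge_path_coefficients(0.95, 0.60, 0.50)
ridged = ridge_path_coefficients(0.95, 0.60, 0.50, ridge=0.1)
```

Without the penalty the near-singular correlation matrix produces a coefficient above 1 and another of opposite sign; a small ridge constant shrinks both toward more plausible values at the cost of some bias.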
Devrim, Burcu; Dinç, Erdal; Bozkir, Asuman
2014-01-01
Diphenhydramine hydrochloride (DPH), a histamine H1-receptor antagonist, is widely used as an antiallergic, antiemetic and antitussive drug found in many pharmaceutical preparations. In this study, a new reconstitutable syrup formulation of DPH was prepared because it is more stable in solid form than in liquid form. The quantitative estimation of the DPH content of a reconstitutable syrup formulation in the presence of the pharmaceutical excipients D-sorbitol, sodium citrate, sodium benzoate and sodium EDTA is not possible by direct absorbance measurement. Therefore, a signal processing approach based on continuous wavelet transform was used to determine the DPH in the reconstitutable syrup formulations and to eliminate the effect of excipients on the analysis. The absorption spectra of DPH in the range of 5.0-40.0 μg/mL were recorded between 200 and 300 nm. Various wavelet families were tested and the Biorthogonal1.1 continuous wavelet transform (BIOR1.1-CWT) was found to be the optimal signal processing family to get fast and desirable determination results and to overcome excipient interference effects. For comparison, partial least squares (PLS) and principal component regression (PCR) methods were applied to the quantitative prediction of DPH in the mentioned samples. The validity of the proposed BIOR1.1-CWT, PLS and PCR methods was assessed by analyzing prepared samples containing the mentioned excipients and by using the standard addition technique. It was observed that the proposed graphical and numerical approaches are suitable for the quantitative analysis of DPH in samples including excipients.
Zhang, Ji; Li, Bing; Wang, Qi; Wei, Xin; Feng, Weibo; Chen, Yijiu; Huang, Ping; Wang, Zhenyuan
2017-12-21
Postmortem interval (PMI) evaluation remains a challenge in the forensic community due to the lack of efficient methods. Studies have focused on chemical analysis of biofluids for PMI estimation; however, no reports using spectroscopic methods in pericardial fluid (PF) are available. In this study, Fourier transform infrared (FTIR) spectroscopy with an attenuated total reflectance (ATR) accessory was applied to collect comprehensive biochemical information from rabbit PF at different PMIs. The PMI-dependent spectral signature was determined by two-dimensional (2D) correlation analysis. Partial least squares (PLS) and nu-support vector machine (nu-SVM) models were then established based on the acquired spectral dataset. Spectral variables associated with amide I, amide II, COO⁻, C-H bending, and C-O or C-OH vibrations arising from proteins, polypeptides, amino acids and carbohydrates, respectively, were susceptible to PMI in 2D correlation analysis. Moreover, the nu-SVM model appeared to achieve a more satisfactory prediction than the PLS model in calibration; the reliability of both models was determined in an external validation set. The study shows the possibility of applying ATR-FTIR methods to postmortem interval estimation using PF samples.
Igne, Benoît; Drennen, James K; Anderson, Carl A
2014-01-01
Changes in raw materials and process wear and tear can have significant effects on the prediction error of near-infrared calibration models. When the variability that is present during routine manufacturing is not included in the calibration, test, and validation sets, the long-term performance and robustness of the model will be limited. Nonlinearity is a major source of interference. In near-infrared spectroscopy, nonlinearity can arise from light path-length differences that can come from differences in particle size or density. The usefulness of support vector machine (SVM) regression to handle nonlinearity and improve the robustness of calibration models in scenarios where the calibration set did not include all the variability present in test was evaluated. Compared to partial least squares (PLS) regression, SVM regression was less affected by physical (particle size) and chemical (moisture) differences. The linearity of the SVM predicted values was also improved. Nevertheless, although visualization and interpretation tools have been developed to enhance the usability of SVM-based methods, work is yet to be done to provide chemometricians in the pharmaceutical industry with a regression method that can supplement PLS-based methods.
Investigation of iterative image reconstruction in three-dimensional optoacoustic tomography
Wang, Kun; Su, Richard; Oraevsky, Alexander A; Anastasio, Mark A
2012-01-01
Iterative image reconstruction algorithms for optoacoustic tomography (OAT), also known as photoacoustic tomography, have the ability to improve image quality over analytic algorithms due to their ability to incorporate accurate models of the imaging physics, instrument response, and measurement noise. However, to date, there have been few reported attempts to employ advanced iterative image reconstruction algorithms for improving image quality in three-dimensional (3D) OAT. In this work, we implement and investigate two iterative image reconstruction methods for use with a 3D OAT small animal imager: namely, a penalized least-squares (PLS) method employing a quadratic smoothness penalty and a PLS method employing a total variation norm penalty. The reconstruction algorithms employ accurate models of the ultrasonic transducer impulse responses. Experimental data sets are employed to compare the performances of the iterative reconstruction algorithms to that of a 3D filtered backprojection (FBP) algorithm. By use of quantitative measures of image quality, we demonstrate that the iterative reconstruction algorithms can mitigate image artifacts and preserve spatial resolution more effectively than FBP algorithms. These features suggest that the use of advanced image reconstruction algorithms can improve the effectiveness of 3D OAT while reducing the amount of data required for biomedical applications. PMID:22864062
Diaz, Sílvia O; Barros, António S; Goodfellow, Brian J; Duarte, Iola F; Galhano, Eulália; Pita, Cristina; Almeida, Maria do Céu; Carreira, Isabel M; Gil, Ana M
2013-06-07
Given the recognized lack of prenatal clinical methods for the early diagnosis of preterm delivery, intrauterine growth restriction, preeclampsia and gestational diabetes mellitus, and the continuing need for optimized diagnosis methods for specific chromosomal disorders (e.g., trisomy 21) and fetal malformations, this work sought specific metabolic signatures of these conditions in second trimester maternal urine, using (1)H Nuclear Magnetic Resonance ((1)H NMR) metabolomics. Several variable importance to the projection (VIP)- and b-coefficient-based variable selection methods were tested, both individually and through their intersection, and the resulting data sets were analyzed by partial least-squares discriminant analysis (PLS-DA) and submitted to Monte Carlo cross validation (MCCV) and permutation tests to evaluate model predictive power. The NMR data subsets produced significantly improved PLS-DA models for all conditions except for pre-premature rupture of membranes. Specific urinary metabolic signatures were unveiled for central nervous system malformations, trisomy 21, preterm delivery, gestational diabetes, intrauterine growth restriction and preeclampsia, and biochemical interpretations were proposed. This work demonstrated, for the first time, the value of maternal urine profiling as a complementary means of prenatal diagnostics and early prediction of several poor pregnancy outcomes.
NASA Astrophysics Data System (ADS)
Li, Zhe; Feng, Jinchao; Liu, Pengyu; Sun, Zhonghua; Li, Gang; Jia, Kebin
2018-05-01
Temperature is usually considered a source of fluctuation in near-infrared spectral measurement, and chemometric methods have been extensively studied to correct the effect of temperature variations. However, temperature can be considered a constructive parameter that provides detailed chemical information when systematically changed during the measurement. Our group has researched the relationship between temperature-induced spectral variation (TSVC) and normalized squared temperature. In this study, we focused on the influence of the temperature distribution in the calibration set. A multi-temperature calibration set selection (MTCS) method was proposed to improve the prediction accuracy by considering the temperature distribution of calibration samples. Furthermore, a double-temperature calibration set selection (DTCS) method was proposed based on the MTCS method and the relationship between TSVC and normalized squared temperature. We compared the prediction performance of PLS models based on the random sampling method and the proposed methods. The results from experimental studies showed that the prediction performance was improved by using the proposed methods. Therefore, the MTCS and DTCS methods offer alternative ways to improve prediction accuracy in near-infrared spectral measurement.
Razi-Asrami, Mahboobeh; Ghasemi, Jahan B; Amiri, Nayereh; Sadeghi, Seyed Jamal
2017-04-01
In this paper, a simple, fast, and inexpensive method is introduced for the simultaneous spectrophotometric determination of crystal violet (CV) and malachite green (MG) contents in aquatic samples using partial least squares regression (PLS) as a multivariate calibration technique after preconcentration by graphene oxide (GO). The method was based on the sorption and desorption of analytes onto GO and direct determination by ultraviolet-visible spectrophotometric techniques. GO was synthesized according to the Hummers method. To characterize the shape and structure of GO, FT-IR, SEM, and XRD were used. The factors affecting the extraction efficiency, such as pH, extraction time, and the amount of adsorbent, were optimized using a central composite design. The optimum values of these factors were 6, 15 min, and 12 mg, respectively. The maximum capacity of GO for the adsorption of CV and MG was 63.17 and 77.02 mg g⁻¹, respectively. Preconcentration factors and extraction recoveries were 19.6 and 98% for CV and 20 and 100% for MG, respectively. The LOD and linear dynamic range were 0.009 and 0.03-0.3 μg mL⁻¹ for CV and 0.015 and 0.05-0.5 μg mL⁻¹ for MG, respectively. The intra-day and inter-day relative standard deviations were 1.99 and 0.58 for CV and 1.69 and 3.13 for MG at the concentration level of 50 ng mL⁻¹, respectively. Finally, the proposed DSPE/PLS method was successfully applied for the simultaneous determination of trace amounts of CV and MG in real water samples.
Song, Jingwei; He, Jiaying; Zhu, Menghua; Tan, Debao; Zhang, Yu; Ye, Song; Shen, Dingtao; Zou, Pengfei
2014-01-01
A simulated annealing (SA) based variable weighted forecast model is proposed to combine and weigh local chaotic model, artificial neural network (ANN), and partial least square support vector machine (PLS-SVM) to build a more accurate forecast model. The hybrid model was built and multistep ahead prediction ability was tested based on daily MSW generation data from Seattle, Washington, the United States. The hybrid forecast model was proved to produce more accurate and reliable results and to degrade less in longer predictions than three individual models. The average one-week step ahead prediction has been raised from 11.21% (chaotic model), 12.93% (ANN), and 12.94% (PLS-SVM) to 9.38%. Five-week average has been raised from 13.02% (chaotic model), 15.69% (ANN), and 15.92% (PLS-SVM) to 11.27%. PMID:25301508
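The weighting idea in the abstract above can be illustrated with a minimal sketch. It assumes fixed (not time-varying) non-negative weights that sum to 1, MAPE as the error measure, and invented toy forecasts; none of these choices, and none of the function names, are taken from the paper.

```python
import math
import random

def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    return 100.0 * sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)

def combine(forecasts, weights):
    """Weighted sum of several forecast series (one list per model)."""
    return [sum(w * f[i] for w, f in zip(weights, forecasts)) for i in range(len(forecasts[0]))]

def anneal_weights(actual, forecasts, steps=2000, t0=1.0, cooling=0.995, seed=0):
    """Simulated-annealing search for combination weights that minimise MAPE."""
    rng = random.Random(seed)
    n = len(forecasts)
    current = [1.0 / n] * n                      # start from equal weights
    current_err = mape(actual, combine(forecasts, current))
    best, best_err = current[:], current_err
    temp = t0
    for _ in range(steps):
        # Perturb, clip at zero, renormalise so the weights still sum to 1.
        cand = [max(0.0, w + rng.gauss(0.0, 0.1)) for w in current]
        total = sum(cand)
        if total == 0.0:
            continue
        cand = [w / total for w in cand]
        err = mape(actual, combine(forecasts, cand))
        # Metropolis rule: accept improvements, occasionally accept worse moves.
        if err < current_err or rng.random() < math.exp((current_err - err) / temp):
            current, current_err = cand, err
            if err < best_err:
                best, best_err = cand[:], err
        temp *= cooling
    return best, best_err

# Hypothetical toy data: one observed series and three model forecasts.
actual = [100.0, 110.0, 120.0, 130.0]
forecasts = [
    [90.0, 100.0, 110.0, 120.0],    # model that underestimates
    [112.0, 121.0, 133.0, 141.0],   # model that overestimates
    [100.0, 115.0, 118.0, 135.0],   # mixed errors
]
weights, err = anneal_weights(actual, forecasts)
print(weights, err)
```

Because the search starts from equal weights and keeps the best candidate seen, the returned error can never be worse than that of a plain average of the three models.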
Tsopelas, Fotios; Konstantopoulos, Dimitris; Kakoulidou, Anna Tsantili
2018-07-26
In the present work, two approaches for the voltammetric fingerprinting of oils and their combination with chemometrics were investigated in order to detect the adulteration of extra virgin olive oil with olive pomace oil as well as the most common seed oils, namely sunflower, soybean and corn oil. In particular, cyclic voltammograms of diluted extra virgin olive oils, regular (pure) olive oils (blends of refined olive oils with virgin olive oils), olive pomace oils and seed oils in the presence of dichloromethane and 0.1 M LiClO4 in EtOH as electrolyte were recorded at a glassy carbon working electrode. Cyclic voltammetry was also employed in methanolic extracts of olive and seed oils. Datapoints of cyclic voltammograms were exported and submitted to Principal Component Analysis (PCA), Partial Least Squares Discriminant Analysis (PLS-DA) and soft independent modeling of class analogy (SIMCA). In diluted oils, PLS-DA provided a clear discrimination between olive oils (extra virgin and regular) and olive pomace/seed oils, while SIMCA showed a clear discrimination of extra virgin olive oil from all other samples. Using methanolic extracts and considering datapoints recorded between 0.6 and 1.3 V, PLS-DA provided more information, resulting in three clusters (extra virgin olive oils, regular olive oils and seed/olive pomace oils), while SIMCA showed inferior performance. For the quantification of extra virgin olive oil adulteration with olive pomace oil or seed oils, a model based on Partial Least Squares (PLS) analysis was developed. The detection limit of adulteration in olive oil was found to be 2% (v/v) and the linearity range up to 33% (v/v). Validation and applicability of all models were proved using a suitable test set. In the case of PLS, synthetic oil mixtures with 4 known adulteration levels in the range of 4-26% were also employed as a blind test set. Copyright © 2018 Elsevier B.V. All rights reserved.
Henrique, C M; Teófilo, R F; Sabino, L; Ferreira, M M C; Cereda, M P
2007-05-01
Cassava starches are widely used in the production of biodegradable films, but their resistance to humidity migration is very low. In this work, commercial cassava starch films were studied and classified according to their physicochemical properties. A nondestructive method for water vapor permeability determination, which combines infrared spectroscopy and multivariate calibration, is also presented. The following commercial cassava starches were studied: pregelatinized (amidomax 3550), carboxymethylated starch (CMA) of low and high viscosities, and esterified starches. To make the films, 2 different starch concentrations were evaluated, consisting of water suspensions with 3% and 5% starch. The filmogenic solutions were dried and characterized for their thickness, grammage, water vapor permeability, water activity, tensile strength (deformation force), water solubility, and puncture strength (deformation). The minimum thicknesses were 0.5 to 0.6 mm in pregelatinized starch films. The results were treated by means of the following chemometric methods: principal component analysis (PCA) and partial least squares (PLS) regression. PCA of the physicochemical properties of the films showed that the differences in concentration of the dried material (3% and 5% starch) and also in the type of starch modification were mainly related to the following properties: permeability, solubility, and thickness. IR spectra collected in the region of 4000 to 600 cm⁻¹ were used to build a PLS model with good predictive power for water vapor permeability determination, with mean relative errors of 10.0% for cross-validation and 7.8% for the prediction set.
Gomes, Adriano de Araújo; Alcaraz, Mirta Raquel; Goicoechea, Hector C; Araújo, Mario Cesar U
2014-02-06
In this work the Successive Projection Algorithm is presented for intervals selection in N-PLS for three-way data modeling. The proposed algorithm combines noise-reduction properties of PLS with the possibility of discarding uninformative variables in SPA. In addition, second-order advantage can be achieved by the residual bilinearization (RBL) procedure when an unexpected constituent is present in a test sample. For this purpose, SPA was modified in order to select intervals for use in trilinear PLS. The ability of the proposed algorithm, namely iSPA-N-PLS, was evaluated on one simulated and two experimental data sets, comparing the results to those obtained by N-PLS. In the simulated system, two analytes were quantitated in two test sets, with and without unexpected constituent. In the first experimental system, the determination of the four fluorophores (l-phenylalanine; l-3,4-dihydroxyphenylalanine; 1,4-dihydroxybenzene and l-tryptophan) was conducted with excitation-emission data matrices. In the second experimental system, quantitation of ofloxacin was performed in water samples containing two other uncalibrated quinolones (ciprofloxacin and danofloxacin) by high performance liquid chromatography with UV-vis diode array detector. For comparison purpose, a GA algorithm coupled with N-PLS/RBL was also used in this work. In most of the studied cases iSPA-N-PLS proved to be a promising tool for selection of variables in second-order calibration, generating models with smaller RMSEP, when compared to both the global model using all of the sensors in two dimensions and GA-NPLS/RBL. Copyright © 2013 Elsevier B.V. All rights reserved.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Coble, Jamie; Orton, Christopher; Schwantes, Jon
The Multi-Isotope Process (MIP) Monitor provides an efficient approach to monitoring the process conditions in used nuclear fuel reprocessing facilities to support process verification and validation. The MIP Monitor applies multivariate analysis to gamma spectroscopy of reprocessing streams in order to detect small changes in the gamma spectrum, which may indicate changes in process conditions. This research extends the MIP Monitor by characterizing a used fuel sample after initial dissolution according to the type of reactor of origin (pressurized or boiling water reactor), initial enrichment, burn up, and cooling time. Simulated gamma spectra were used to develop and test three fuel characterization algorithms. The classification and estimation models employed are based on the partial least squares regression (PLS) algorithm. A PLS discriminant analysis model was developed which perfectly classified reactor type. Locally weighted PLS models were fitted on-the-fly to estimate continuous fuel characteristics. Burn up was predicted within 0.1% root mean squared percent error (RMSPE) and both cooling time and initial enrichment within approximately 2% RMSPE. This automated fuel characterization can be used to independently verify operator declarations of used fuel characteristics and inform the MIP Monitor anomaly detection routines at later stages of the fuel reprocessing stream to improve sensitivity to changes in operational parameters and material diversions.
NASA Astrophysics Data System (ADS)
Solimun
2017-05-01
The aim of this research is to model survival data from kidney-transplant patients using the partial least squares (PLS)-Cox regression, which can both meet and not meet the no-multicollinearity assumption. The secondary data were obtained from research entitled "Factors affecting the survival of kidney-transplant patients". The research subjects comprised 250 patients. The predictor variables consisted of: age (X1), sex (X2); two categories, prior hemodialysis duration (X3), diabetes (X4); two categories, prior transplantation number (X5), number of blood transfusions (X6), discrepancy score (X7), use of antilymphocyte globulin(ALG) (X8); two categories, while the response variable was patient survival time (in months). Partial least squares regression is a model that connects the predictor variables X and the response variable y and it initially aims to determine the relationship between them. Results of the above analyses suggest that the survival of kidney transplant recipients ranged from 0 to 55 months, with 62% of the patients surviving until they received treatment that lasted for 55 months. The PLS-Cox regression analysis results revealed that patients' age and the use of ALG significantly affected the survival time of patients. The factor of patients' age (X1) in the PLS-Cox regression model merely affected the failure probability by 1.201. This indicates that the probability of dying for elderly patients with a kidney transplant is 1.152 times higher than that for younger patients.
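As a rough illustration of how partial least squares connects predictors X to a response y, the following is a minimal sketch of ordinary single-response PLS (PLS1) fitted with the NIPALS deflation scheme. It is not the PLS-Cox procedure used in the study, and the function names and toy data are hypothetical.

```python
def pls1_fit(X, y, n_components):
    """Fit PLS1 by NIPALS; returns (x_mean, y_mean, weights, x_loadings, y_loadings)."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centred X residual
    f = [v - y_mean for v in y]                                # centred y residual
    W, P, q = [], [], []
    for _ in range(n_components):
        # Weight vector: X-space direction with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = sum(v * v for v in w) ** 0.5
        if norm < 1e-12:        # nothing left to explain; stop early
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]       # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]  # X loadings
        c = sum(f[i] * t[i] for i in range(n)) / tt                         # y loading
        # Deflate both blocks by the part explained by this component.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - c * t[i] for i in range(n)]
        W.append(w)
        P.append(p)
        q.append(c)
    return x_mean, y_mean, W, P, q

def pls1_predict(model, x):
    """Predict y for one sample by replaying the component-wise deflation."""
    x_mean, y_mean, W, P, q = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, c in zip(W, P, q):
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_hat += c * t
        e = [ej - t * pj for ej, pj in zip(e, p)]
    return y_hat

# Hypothetical noise-free toy data generated from y = 2*x1 - x2 + 1.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 6.0]]
y = [1.0, 4.0, 2.0, 6.0, 5.0]
model = pls1_fit(X, y, n_components=2)
print(pls1_predict(model, [6.0, 4.0]))
```

With as many components as the rank of the centred X, PLS1 reproduces the ordinary least-squares fit; its practical value comes from using fewer components when predictors are collinear, which is the situation the abstract refers to.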
Sirois, S; Tsoukas, C M; Chou, Kuo-Chen; Wei, Dongqing; Boucher, C; Hatzakis, G E
2005-03-01
Quantitative Structure Activity Relationship (QSAR) techniques are used routinely by computational chemists in drug discovery and development to analyze datasets of compounds. Quantitative numerical methods like Partial Least Squares (PLS) and Artificial Neural Networks (ANN) have been used on QSAR to establish correlations between molecular properties and bioactivity. However, ANN may be advantageous over PLS because it considers the interrelations of the modeled variables. This study focused on the HIV-1 Protease (HIV-1 Pr) inhibitors belonging to the peptidomimetic class of compounds. The main objective was to select molecular descriptors with the best predictive value for antiviral potency (Ki). PLS and ANN were used to predict Ki activity of HIV-1 Pr inhibitors and the results were compared. To address the issue of dimensionality reduction, Genetic Algorithms (GA) were used for variable selection and their performance was compared against that of ANN. Finally, the structure of the optimum ANN achieving the highest Pearson's-R coefficient was determined. On the basis of Pearson's-R, PLS and ANN were compared to determine which exhibits maximum performance. Training and validation of models were performed on 15 random split sets of the master dataset consisting of 231 compounds. For each compound 192 molecular descriptors were considered. The molecular structure and constant of inhibition (Ki) were selected from the NIAID database. Study findings suggested that non-covalent interactions such as hydrophobicity, shape and hydrogen bonding describe well the antiviral activity of the HIV-1 Pr compounds. The significance of lipophilicity and its relationship to HIV-1 associated hyperlipidemia and lipodystrophy syndrome warrant further investigation.
NASA Astrophysics Data System (ADS)
Li, Xiongwei; Wang, Zhe; Lui, Siu-Lung; Fu, Yangting; Li, Zheng; Liu, Jianming; Ni, Weidou
2013-10-01
A bottleneck of the wide commercial application of laser-induced breakdown spectroscopy (LIBS) technology is its relatively high measurement uncertainty. A partial least squares (PLS) based normalization method was proposed to improve pulse-to-pulse measurement precision for LIBS based on our previous spectrum standardization method. The proposed model utilized multi-line spectral information of the measured element and characterized the signal fluctuations due to the variation of plasma characteristic parameters (plasma temperature, electron number density, and total number density) for signal uncertainty reduction. The model was validated by the application of copper concentration prediction in 29 brass alloy samples. The results demonstrated an improvement on both measurement precision and accuracy over the generally applied normalization as well as our previously proposed simplified spectrum standardization method. The average relative standard deviation (RSD), average of the standard error (error bar), the coefficient of determination (R2), the root-mean-square error of prediction (RMSEP), and average value of the maximum relative error (MRE) were 1.80%, 0.23%, 0.992, 1.30%, and 5.23%, respectively, while those for the generally applied spectral area normalization were 3.72%, 0.71%, 0.973, 1.98%, and 14.92%, respectively.
Noncontact analysis of the fiber weight per unit area in prepreg by near-infrared spectroscopy.
Jiang, B; Huang, Y D
2008-05-26
The fiber weight per unit area in prepreg is an important factor in ensuring the quality of composite products. Near-infrared spectroscopy (NIRS) technology together with noncontact reflectance sources has been applied for quality analysis of the fiber weight per unit area. The range of the unit area fiber weight was 13.39-14.14 mg cm⁻². Partial least squares (PLS) and principal components regression (PCR) were employed as regression methods. The calibration model was developed with 55 samples to determine the fiber weight per unit area in prepreg. The determination coefficient (R²), root mean square error of calibration (RMSEC) and root mean square error of prediction (RMSEP) were 0.82, 0.092, and 0.099, respectively. The predicted values of the fiber weight per unit area in prepreg measured by NIRS technology were comparable to the values obtained by the reference method. For this technology, the noncontact reflectance sources focused directly on the sample with neither previous treatment nor manipulation. The results of the paired t-test revealed that there was no significant difference between the NIR method and the reference method. In addition, the prepreg could be analyzed within 20 s without sample destruction.
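The figures of merit quoted above (R², RMSEC, RMSEP) can be computed as in the following minimal sketch. It assumes the plain definitions with a divisor of n (some texts divide RMSEC by n minus the number of latent variables minus 1), and the reference and predicted values are invented for illustration.

```python
def rmse(reference, predicted):
    """Root mean square error between reference and predicted values."""
    n = len(reference)
    return (sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n) ** 0.5

def r_squared(reference, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot

# Hypothetical values in mg cm^-2; not data from the study.
cal_ref = [13.40, 13.60, 13.80, 14.00]    # calibration set, reference method
cal_pred = [13.45, 13.55, 13.85, 13.95]   # calibration set, model output
val_ref = [13.50, 13.90]                  # independent prediction set
val_pred = [13.58, 13.84]

rmsec = rmse(cal_ref, cal_pred)   # error on the samples used to build the model
rmsep = rmse(val_ref, val_pred)   # error on samples the model has not seen
r2 = r_squared(cal_ref, cal_pred)
print(round(rmsec, 3), round(rmsep, 3), round(r2, 3))
```

RMSEC and RMSEP use the same formula; what differs is only whether the samples were part of the calibration set, which is why an RMSEP close to the RMSEC (0.099 versus 0.092 in the abstract) is usually read as a sign that the model is not overfitted.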
Croker, Denise M; Hennigan, Michelle C; Maher, Anthony; Hu, Yun; Ryder, Alan G; Hodnett, Benjamin K
2012-04-07
Diffraction and spectroscopic methods were evaluated for quantitative analysis of binary powder mixtures of FII(6.403) and FIII(6.525) piracetam. The two polymorphs of piracetam could be distinguished using powder X-ray diffraction (PXRD), Raman and near-infrared (NIR) spectroscopy. The results demonstrated that Raman and NIR spectroscopy are most suitable for quantitative analysis of this polymorphic mixture. When the spectra are treated with the combination of multiplicative scatter correction (MSC) and second derivative data pretreatments, the partial least squares (PLS) regression model gave a root mean square error of calibration (RMSEC) of 0.94 and 0.99%, respectively. FIII(6.525) demonstrated some preferred orientation in PXRD analysis, making PXRD the least preferred method of quantification. Copyright © 2012 Elsevier B.V. All rights reserved.
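The two pretreatments named above can be sketched as follows. This is a simplified illustration, assuming MSC against the mean spectrum and a three-point finite-difference second derivative (Savitzky-Golay smoothing derivatives are more common in practice); the function names and toy spectra are hypothetical.

```python
def msc(spectra):
    """Multiplicative scatter correction of each spectrum against the mean spectrum."""
    n_pts = len(spectra[0])
    ref = [sum(s[k] for s in spectra) / len(spectra) for k in range(n_pts)]
    ref_mean = sum(ref) / n_pts
    ref_var = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        # Least-squares fit s ~ offset + slope * ref, then undo offset and slope.
        s_mean = sum(s) / n_pts
        slope = sum((r - ref_mean) * (v - s_mean) for r, v in zip(ref, s)) / ref_var
        offset = s_mean - slope * ref_mean
        corrected.append([(v - offset) / slope for v in s])
    return corrected

def second_derivative(spectrum):
    """Three-point second difference (unit spacing); the two end points are dropped."""
    return [spectrum[k - 1] - 2.0 * spectrum[k] + spectrum[k + 1]
            for k in range(1, len(spectrum) - 1)]

# Hypothetical spectra: one underlying shape with different scatter (gain and offset).
base = [1.0, 4.0, 9.0, 16.0, 25.0]
spectra = [
    base,
    [2.0 * b + 3.0 for b in base],
    [0.5 * b - 1.0 for b in base],
]
corrected = msc(spectra)
pretreated = [second_derivative(s) for s in corrected]
print(pretreated)
```

In this toy case the three spectra differ only by gain and offset, so MSC maps them onto the same curve; real spectra also carry chemical differences, which is what the subsequent PLS model is meant to capture.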
Simultaneous determination of three herbicides by differential pulse voltammetry and chemometrics.
Ni, Yongnian; Wang, Lin; Kokot, Serge
2011-01-01
A novel differential pulse voltammetry (DPV) method was researched and developed for the simultaneous determination of Pendimethalin, Dinoseb and sodium 5-nitroguaiacolate (5NG) with the aid of chemometrics. The voltammograms of these three compounds overlapped significantly, and to facilitate the simultaneous determination of the three analytes, chemometrics methods were applied. These included classical least squares (CLS), principal component regression (PCR), partial least squares (PLS) and radial basis function-artificial neural networks (RBF-ANN). A separately prepared verification data set was used to confirm the calibrations, which were built from the original and first derivative data matrices of the voltammograms. On the basis of relative prediction errors and recoveries of the analytes, the RBF-ANN and the DPLS (D - first derivative spectra) models performed best and are particularly recommended for application. The DPLS calibration model was applied satisfactorily for the prediction of the three analytes from market vegetables and lake water samples.
Augmented classical least squares multivariate spectral analysis
Haaland, David M.; Melgaard, David K.
2004-02-03
A method of multivariate spectral analysis, termed augmented classical least squares (ACLS), provides an improved CLS calibration model when unmodeled sources of spectral variation are contained in a calibration sample set. The ACLS methods use information derived from component or spectral residuals during the CLS calibration to provide an improved calibration-augmented CLS model. The ACLS methods are based on CLS so that they retain the qualitative benefits of CLS, yet they have the flexibility of PLS and other hybrid techniques in that they can define a prediction model even with unmodeled sources of spectral variation that are not explicitly included in the calibration model. The unmodeled sources of spectral variation may be unknown constituents, constituents with unknown concentrations, nonlinear responses, non-uniform and correlated errors, or other sources of spectral variation that are present in the calibration sample spectra. Also, since the various ACLS methods are based on CLS, they can incorporate the new prediction-augmented CLS (PACLS) method of updating the prediction model for new sources of spectral variation contained in the prediction sample set without having to return to the calibration process. The ACLS methods can also be applied to alternating least squares models. The ACLS methods can be applied to all types of multivariate data.
Augmented Classical Least Squares Multivariate Spectral Analysis
Haaland, David M.; Melgaard, David K.
2005-07-26
A method of multivariate spectral analysis, termed augmented classical least squares (ACLS), provides an improved CLS calibration model when unmodeled sources of spectral variation are contained in a calibration sample set. The ACLS methods use information derived from component or spectral residuals during the CLS calibration to provide an improved calibration-augmented CLS model. The ACLS methods are based on CLS so that they retain the qualitative benefits of CLS, yet they have the flexibility of PLS and other hybrid techniques in that they can define a prediction model even with unmodeled sources of spectral variation that are not explicitly included in the calibration model. The unmodeled sources of spectral variation may be unknown constituents, constituents with unknown concentrations, nonlinear responses, non-uniform and correlated errors, or other sources of spectral variation that are present in the calibration sample spectra. Also, since the various ACLS methods are based on CLS, they can incorporate the new prediction-augmented CLS (PACLS) method of updating the prediction model for new sources of spectral variation contained in the prediction sample set without having to return to the calibration process. The ACLS methods can also be applied to alternating least squares models. The ACLS methods can be applied to all types of multivariate data.
Augmented Classical Least Squares Multivariate Spectral Analysis
Haaland, David M.; Melgaard, David K.
2005-01-11
A method of multivariate spectral analysis, termed augmented classical least squares (ACLS), provides an improved CLS calibration model when unmodeled sources of spectral variation are contained in a calibration sample set. The ACLS methods use information derived from component or spectral residuals during the CLS calibration to provide an improved calibration-augmented CLS model. The ACLS methods are based on CLS so that they retain the qualitative benefits of CLS, yet they have the flexibility of PLS and other hybrid techniques in that they can define a prediction model even with unmodeled sources of spectral variation that are not explicitly included in the calibration model. The unmodeled sources of spectral variation may be unknown constituents, constituents with unknown concentrations, nonlinear responses, non-uniform and correlated errors, or other sources of spectral variation that are present in the calibration sample spectra. Also, since the various ACLS methods are based on CLS, they can incorporate the new prediction-augmented CLS (PACLS) method of updating the prediction model for new sources of spectral variation contained in the prediction sample set without having to return to the calibration process. The ACLS methods can also be applied to alternating least squares models. The ACLS methods can be applied to all types of multivariate data.
Hacisalihoglu, Gokhan; Larbi, Bismark; Settles, A Mark
2010-01-27
The objective of this study was to explore the potential of near-infrared reflectance (NIR) spectroscopy to determine individual seed composition in common bean ( Phaseolus vulgaris L.). NIR spectra and analytical measurements of seed weight, protein, and starch were collected from 267 individual bean seeds representing 91 diverse genotypes. Partial least-squares (PLS) regression models were developed with 61 bean accessions randomly assigned to a calibration data set and 30 accessions assigned to an external validation set. Protein gave the most accurate PLS regression, with the external validation set having a standard error of prediction (SEP) = 1.6%. PLS regressions for seed weight and starch had sufficient accuracy for seed sorting applications, with SEP = 41.2 mg and 4.9%, respectively. Seed color had a clear effect on the NIR spectra, with black beans having a distinct spectral type. Seed coat color did not impact the accuracy of PLS predictions. This research demonstrates that NIR is a promising technique for simultaneous sorting of multiple seed traits in single bean seeds with no sample preparation.
Land, Walker H; Heine, John J; Raway, Tom; Mizaku, Alda; Kovalchuk, Nataliya; Yang, Jack Y; Yang, Mary Qu
2008-01-01
The automated decision paradigms presented in this work address the false positive (FP) biopsy occurrence in diagnostic mammography. An EP/ES stochastic hybrid and two kernelized Partial Least Squares (K-PLS) paradigms were investigated with the following studies: methodology performance comparisons and automated diagnostic accuracy assessments with two data sets. The findings showed that the new hybrid produced comparable results more rapidly and that the new K-PLS paradigms train and operate essentially in real time for the data sets studied. Both advancements are essential components for eventually achieving the FP reduction goal, while maintaining acceptable diagnostic sensitivities.
Raway, Tom; Mizaku, Alda; Kovalchuk, Nataliya; Yang, Jack Y.; Yang, Mary Qu
2015-01-01
The automated decision paradigms presented in this work address the false positive (FP) biopsy occurrence in diagnostic mammography. An EP/ES stochastic hybrid and two kernelized Partial Least Squares (K-PLS) paradigms were investigated with the following studies: methodology performance comparisons and automated diagnostic accuracy assessments with two data sets. The findings showed that the new hybrid produced comparable results more rapidly and that the new K-PLS paradigms train and operate essentially in real time for the data sets studied. Both advancements are essential components for eventually achieving the FP reduction goal, while maintaining acceptable diagnostic sensitivities. PMID:26430470
Carranco, Núria; Farrés-Cebrián, Mireia; Saurina, Javier
2018-01-01
High performance liquid chromatography method with ultra-violet detection (HPLC-UV) fingerprinting was applied for the analysis and characterization of olive oils, and was performed using a Zorbax Eclipse XDB-C8 reversed-phase column under gradient elution, employing 0.1% formic acid aqueous solution and methanol as mobile phase. More than 130 edible oils, including monovarietal extra-virgin olive oils (EVOOs) and other vegetable oils, were analyzed. Principal component analysis results showed a noticeable discrimination between olive oils and other vegetable oils using raw HPLC-UV chromatographic profiles as data descriptors. However, selected HPLC-UV chromatographic time-window segments were necessary to achieve discrimination among monovarietal EVOOs. Partial least square (PLS) regression was employed to tackle olive oil authentication of Arbequina EVOO adulterated with Picual EVOO, a refined olive oil, and sunflower oil. Highly satisfactory results were obtained after PLS analysis, with overall errors in the quantitation of adulteration in the Arbequina EVOO (minimum 2.5% adulterant) below 2.9%. PMID:29561820
Optical glucose monitoring using vertical cavity surface emitting lasers (VCSELs)
NASA Astrophysics Data System (ADS)
Talebi Fard, Sahba; Hofmann, Werner; Talebi Fard, Pouria; Kwok, Ezra; Amann, Markus-Christian; Chrostowski, Lukas
2009-08-01
Diabetes Mellitus is a common chronic disease that has become a public health issue. Continuous glucose monitoring improves patient health by stabilizing the glucose levels. Optical methods are one of the painless and promising methods that can be used for blood glucose predictions. However, having accuracies lower than what is acceptable clinically has been a major concern. Using lasers along with multivariate techniques such as Partial Least Square (PLS) can improve glucose predictions. This research involves investigations for developing a novel optical system for accurate glucose predictions, which leads to the development of a small, low power, implantable optical sensor for diabetes patients.
NASA Astrophysics Data System (ADS)
Mei, Yaguang; Cheng, Yuxin; Cheng, Shusen; Hao, Zhongqi; Guo, Lianbo; Li, Xiangyou; Zeng, Xiaoyan
2017-10-01
During the iron-making process in a blast furnace, the Si content of liquid pig iron is commonly used to evaluate the quality of the liquid iron and the thermal state of the furnace. No effective method has been available for rapidly detecting the Si concentration of liquid iron. Laser-induced breakdown spectroscopy (LIBS) is an atomic emission spectrometry technique based on laser ablation. Its main advantage is rapid, in-situ, online analysis of element concentrations in open air without sample pretreatment. The characteristics of Si in liquid iron were analyzed in terms of thermodynamic theory and metallurgical technology. The relationships between Si and C, Mn, S, P or other alloying elements were derived from thermodynamic calculations. Subsequently, LIBS was applied to the rapid detection of Si in pig iron. Several groups of standard pig iron samples were used to calibrate the Si content. The calibration methods compared were linear, quadratic and cubic internal standard calibration, multivariate linear calibration and partial least squares (PLS). PLS improved by normalization proved to be the best calibration method for Si detection by LIBS.
Liu, Ze; Xie, Hua-Lin; Chen, Lin; Huang, Jian-Hua
2018-05-02
Background: Pu-erh tea is a unique microbially fermented tea whose distinctive chemical constituents and activities are worthy of systematic study. Near infrared spectroscopy (NIR) coupled with suitable chemometric approaches can rapidly and accurately quantify multiple compounds in samples. Methods: In this study, an improved weighted partial least squares (PLS) algorithm combined with NIR spectroscopy was used to construct a fast calibration model for determining four main components, i.e., tea polyphenols, tea polysaccharide, total flavonoids and theanine content, and for further determining the total antioxidant capacity of pu-erh tea. Results: The final correlation coefficients (R²) for tea polyphenols, tea polysaccharide, total flavonoids content, theanine content, and total antioxidant capacity were 0.8288, 0.8403, 0.8415, 0.8537 and 0.8682, respectively. Conclusions: The current study provided a comprehensive study of four main ingredients and activity of pu-erh tea, and demonstrated that NIR spectroscopy coupled with multivariate calibration analysis could be successfully applied to pu-erh tea quality assessment.
DOE Office of Scientific and Technical Information (OSTI.GOV)
De Lucia, Frank C. Jr.; Gottfried, Jennifer L.; Munson, Chase A.
2008-11-01
A technique being evaluated for standoff explosives detection is laser-induced breakdown spectroscopy (LIBS). LIBS is a real-time sensor technology that uses components that can be configured into a ruggedized standoff instrument. The U.S. Army Research Laboratory has been coupling standoff LIBS spectra with chemometrics for several years now in order to discriminate between explosives and nonexplosives. We have investigated the use of partial least squares discriminant analysis (PLS-DA) for explosives detection. We have extended our study of PLS-DA to more complex sample types, including binary mixtures, different types of explosives, and samples not included in the model. We demonstrate the importance of building the PLS-DA model by iteratively testing it against sample test sets. Independent test sets are used to test the robustness of the final model.
Study on rapid valid acidity evaluation of apple by fiber optic diffuse reflectance technique
NASA Astrophysics Data System (ADS)
Liu, Yande; Ying, Yibin; Fu, Xiaping; Jiang, Xuesong
2004-03-01
Some issues related to nondestructive evaluation of valid acidity in intact apples by means of Fourier transform near infrared (FTNIR) (800-2631nm) method were addressed. A relationship was established between the diffuse reflectance spectra recorded with a bifurcated optic fiber and the valid acidity. The data were analyzed by multivariate calibration analysis such as partial least squares (PLS) analysis and principal component regression (PCR) technique. A total of 120 Fuji apples were tested and 80 of them were used to form a calibration data set. The influence of data preprocessing and different spectra treatments were also investigated. Models based on smoothing spectra were slightly worse than models based on derivative spectra and the best result was obtained when the segment length was 5 and the gap size was 10. Depending on data preprocessing and multivariate calibration technique, the best prediction model had a correlation coefficient (0.871), a low RMSEP (0.0677), a low RMSEC (0.056) and a small difference between RMSEP and RMSEC by PLS analysis. The results point out the feasibility of FTNIR spectral analysis to predict the fruit valid acidity non-destructively. The ratio of data standard deviation to the root mean square error of prediction (SDR) is better to be less than 3 in calibration models, however, the results cannot meet the demand of actual application. Therefore, further study is required for better calibration and prediction.
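The figures of merit quoted in this abstract (RMSEP, RMSEC, SDR) can be illustrated with a minimal sketch. The acidity values below are hypothetical and are not taken from the study; the use of the sample standard deviation and of division by the number of samples in the error term are assumptions, since the abstract does not state the exact formulas.

```python
import math
import statistics


def rmse(reference, predicted):
    """Root mean square error between reference and predicted values.

    Called RMSEC when computed on calibration samples and RMSEP when
    computed on prediction (test) samples.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def sdr(reference, predicted):
    """Ratio of the standard deviation of the reference data to the RMSE.

    Assumption: the sample standard deviation (n - 1) is used.
    """
    return statistics.stdev(reference) / rmse(reference, predicted)


# Hypothetical reference and predicted acidity values for a small test set.
y_ref = [3.60, 3.72, 3.85, 3.91, 4.02, 4.10]
y_hat = [3.65, 3.70, 3.80, 3.95, 4.00, 4.15]

rmsep = rmse(y_ref, y_hat)
print(f"RMSEP = {rmsep:.4f}, SDR = {sdr(y_ref, y_hat):.2f}")
```

Running the same `rmse` function on the calibration samples gives RMSEC, so the gap between the two errors mentioned in the abstract is simply the difference of two calls.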
NIR spectroscopic measurement of moisture content in Scots pine seeds.
Lestander, Torbjörn A; Geladi, Paul
2003-04-01
When tree seeds are used for seedling production it is important that they are of high quality in order to be viable. One of the factors influencing viability is moisture content and an ideal quality control system should be able to measure this factor quickly for each seed. Seed moisture content within the range 3-34% was determined by near-infrared (NIR) spectroscopy on Scots pine (Pinus sylvestris L.) single seeds and on bulk seed samples consisting of 40-50 seeds. The models for predicting water content from the spectra were made by partial least squares (PLS) and ordinary least squares (OLS) regression. Different conditions were simulated involving both using fewer wavelengths and going from samples to single seeds. Reflectance and transmission measurements were used. Different spectral pretreatment methods were tested on the spectra. Including bias, the lowest prediction errors for PLS models based on reflectance within 780-2280 nm from bulk samples and single seeds were 0.8% and 1.9%, respectively. Reduction of the single seed reflectance spectrum to 850-1048 nm gave higher biases and prediction errors in the test set. In transmission (850-1048 nm) the prediction error was 2.7% for single seeds. OLS models based on a simulated 4-sensor single seed system consisting of optical filters with Gaussian transmission indicated more than 3.4% error in prediction. A practical F-test based on test sets to differentiate models is introduced.
Yang, Jun-Ho; Yoh, Jack J
2018-01-01
A novel technique is reported for separating overlapping latent fingerprints using chemometric approaches that combine laser-induced breakdown spectroscopy (LIBS) and multivariate analysis. The LIBS technique provides the capability of real time analysis and high frequency scanning as well as the data regarding the chemical composition of overlapping latent fingerprints. These spectra offer valuable information for the classification and reconstruction of overlapping latent fingerprints by implementing appropriate statistical multivariate analysis. The current study employs principal component analysis and partial least square methods for the classification of latent fingerprints from the LIBS spectra. This technique was successfully demonstrated through a classification study of four distinct latent fingerprints using classification methods such as soft independent modeling of class analogy (SIMCA) and partial least squares discriminant analysis (PLS-DA). The novel method yielded an accuracy of more than 85% and was proven to be sufficiently robust. Furthermore, through laser scanning analysis at a spatial interval of 125 µm, the overlapping fingerprints were reconstructed as separate two-dimensional forms.
Alves, Julio Cesar Laurentino; Poppi, Ronei Jesus
2013-11-07
Highly polluting fuels based on non-renewable resources such as fossil fuels need to be replaced with potentially less polluting renewable fuels derived from vegetable or animal biomass. These so-called biofuels are a reality nowadays, and many countries have taken up the challenge of increasing the use of different types of biofuels, such as ethanol and biodiesel (fatty acid alkyl esters), often mixed with petroleum derivatives such as gasoline and diesel, respectively. The quantitative determination of these fuel blends using simple, fast and low cost methods based on near infrared (NIR) spectroscopy combined with chemometric methods has been reported. However, advanced biofuels based on a mixture of hydrocarbons or a single hydrocarbon molecule, such as farnesane (2,6,10-trimethyldodecane), a hydrocarbon renewable diesel, can also be used in mixtures with biodiesel and petroleum diesel fuel. The use of NIR spectroscopy for the quantitative determination of a ternary fuel blend of these two hydrocarbon-based fuels and biodiesel can therefore be a useful tool for quality control. This work presents the development of an analytical method for the quantitative determination of hydrocarbon renewable diesel (farnesane), biodiesel and petroleum diesel fuel blends using NIR spectroscopy combined with chemometric methods, such as partial least squares (PLS) and support vector machines (SVM). The resulting method is more accurate, simpler, faster and cheaper than the standard reference method ASTM D6866, with the main advantage of providing the individual quantification of two different biofuels in a mixture with petroleum diesel fuel.
Using the developed PLS model, the three fuel blend components were determined simultaneously with root mean square error of prediction (RMSEP) values of 0.25%, 0.19% and 0.38% for hydrocarbon renewable diesel, biodiesel and petroleum diesel, respectively. The values obtained were in agreement with those suggested by reference methods for the determination of renewable fuels.
Chang, Xiangwei; Zhang, Juanjuan; Li, Dekun; Zhou, Dazheng; Zhang, Yuling; Wang, Jincheng; Hu, Bing; Ju, Aichun; Ye, Zhengliang
2017-07-15
The adulteration or falsification of the cultivation age of mountain cultivated ginseng (MCG) has been a serious problem in the commercial MCG market. To develop an efficient discrimination tool for the cultivation age and to explore potential age-dependent markers, an optimized ultra high-performance liquid chromatography/quadrupole time-of-flight mass spectrometry (UHPLC/QTOF-MS)-based metabolomics approach was applied in the global metabolite profiling of 156 MCG leaf (MGL) samples aged from 6 to 18 years. Multivariate statistical methods such as principal component analysis (PCA) and partial least squares discriminant analysis (PLS-DA) were used to compare the derived patterns between MGL samples of different cultivation ages. The present study demonstrated that 6-18-year-old MGL samples can be successfully discriminated using two simple successive steps, together with four PLS-DA discrimination models. Furthermore, 39 robust age-dependent markers enabling differentiation among the 6-18-year-old MGL samples were discovered. The results were validated by a permutation test and an external test set to verify the predictability and reliability of the established discrimination models. More importantly, without destroying the MCG roots, the proposed approach could also be applied to discriminate MCG root ages indirectly, using a minimum amount of homophyletic MGL samples combined with the established four PLS-DA models and identified markers. Additionally, to the best of our knowledge, this is the first study in which 6-18-year-old MCG root ages have been nondestructively differentiated by analyzing homophyletic MGL samples using UHPLC/QTOF-MS analysis and two simple successive steps together with four PLS-DA models. The method developed in this study can be used as a standard protocol for discriminating and predicting MGL ages directly and homophyletic MCG root ages indirectly. Copyright © 2017 Elsevier B.V. All rights reserved.
Lim, Jongguk; Kim, Giyoung; Mo, Changyeun; Kim, Moon S; Chao, Kuanglin; Qin, Jianwei; Fu, Xiaping; Baek, Insuck; Cho, Byoung-Kwan
2016-05-01
Illegal use of nitrogen-rich melamine (C3H6N6) to boost perceived protein content of food products such as milk, infant formula, frozen yogurt, pet food, biscuits, and coffee drinks has caused serious food safety problems. Conventional methods to detect melamine in foods, such as Enzyme-linked immunosorbent assay (ELISA), High-performance liquid chromatography (HPLC), and Gas chromatography-mass spectrometry (GC-MS), are sensitive but they are time-consuming, expensive, and labor-intensive. In this research, near-infrared (NIR) hyperspectral imaging technique combined with regression coefficient of partial least squares regression (PLSR) model was used to detect melamine particles in milk powders easily and quickly. NIR hyperspectral reflectance imaging data in the spectral range of 990-1700nm were acquired from melamine-milk powder mixture samples prepared at various concentrations ranging from 0.02% to 1%. PLSR models were developed to correlate the spectral data (independent variables) with melamine concentration (dependent variables) in melamine-milk powder mixture samples. PLSR models applying various pretreatment methods were used to reconstruct the two-dimensional PLS images. PLS images were converted to the binary images to detect the suspected melamine pixels in milk powder. As the melamine concentration was increased, the numbers of suspected melamine pixels of binary images were also increased. These results suggested that NIR hyperspectral imaging technique and the PLSR model can be regarded as an effective tool to detect melamine particles in milk powders. Copyright © 2016 Elsevier B.V. All rights reserved.
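The imaging step described in this abstract (applying the PLSR regression coefficients to every pixel spectrum, then thresholding the resulting PLS image into a binary image) can be sketched as follows. This is only an illustration under stated assumptions: the tiny three-band cube, the coefficient vector `beta` and the threshold are all hypothetical, and in practice the coefficients would come from a fitted PLSR model over the 990-1700 nm range.

```python
def pls_image(cube, coefficients, intercept=0.0):
    """Apply a regression vector to every pixel spectrum of a hyperspectral cube.

    cube is a rows x cols x bands nested list; coefficients holds one
    value per band. The result is a rows x cols image of predicted values.
    """
    return [
        [intercept + sum(b * x for b, x in zip(coefficients, pixel)) for pixel in row]
        for row in cube
    ]


def binarize(image, threshold):
    """Mark pixels whose predicted value exceeds the threshold as suspect (1)."""
    return [[1 if value > threshold else 0 for value in row] for row in image]


def count_suspect_pixels(binary):
    """Number of pixels flagged in a binary image."""
    return sum(sum(row) for row in binary)


# Hypothetical 2 x 3 pixel cube with 3 spectral bands per pixel.
cube = [
    [[0.10, 0.20, 0.10], [0.12, 0.22, 0.11], [0.50, 0.90, 0.40]],
    [[0.11, 0.19, 0.10], [0.48, 0.88, 0.42], [0.10, 0.21, 0.12]],
]
beta = [0.5, 1.0, -0.2]   # hypothetical regression coefficients, one per band
threshold = 0.5           # hypothetical cut-off on the PLS image

binary = binarize(pls_image(cube, beta), threshold)
print(binary, count_suspect_pixels(binary))
```

Counting the flagged pixels per sample is what allows the comparison across melamine concentrations reported in the abstract.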
NASA Astrophysics Data System (ADS)
Ni, Yongnian; Wang, Yong; Kokot, Serge
2008-10-01
A spectrophotometric method for the simultaneous determination of the important pharmaceuticals, pefloxacin and its structurally similar metabolite, norfloxacin, is described for the first time. The analysis is based on the monitoring of a kinetic spectrophotometric reaction of the two analytes with potassium permanganate as the oxidant. The measurement of the reaction process followed the absorbance decrease of potassium permanganate at 526 nm, and the accompanying increase of the product, potassium manganate, at 608 nm. It was essential to use multivariate calibrations to overcome severe spectral overlaps and similarities in reaction kinetics. Calibration curves for the individual analytes showed linear relationships over the concentration ranges of 1.0-11.5 mg L⁻¹ at 526 and 608 nm for pefloxacin, and 0.15-1.8 mg L⁻¹ at 526 and 608 nm for norfloxacin. Various multivariate calibration models were applied, at the two analytical wavelengths, for the simultaneous prediction of the two analytes including classical least squares (CLS), principal component regression (PCR), partial least squares (PLS), radial basis function-artificial neural network (RBF-ANN) and principal component-radial basis function-artificial neural network (PC-RBF-ANN). PLS and PC-RBF-ANN calibrations with the data collected at 526 nm were the preferred methods, with %RPE_T ≈ 5 and LODs for pefloxacin and norfloxacin of 0.36 and 0.06 mg L⁻¹, respectively. Then, the proposed method was applied successfully for the simultaneous determination of pefloxacin and norfloxacin present in pharmaceutical and human plasma samples. The results compared well with those from the alternative analysis by HPLC.
Khoshmanesh, Aazam; Cook, Perran L M; Wood, Bayden R
2012-08-21
Phosphorus (P) is a major cause of eutrophication and subsequent loss of water quality in freshwater ecosystems. A major part of the flux of P to eutrophic lake sediments is organically bound or of biogenic origin. Despite the broad relevance of polyphosphate (Poly-P) in bioremediation and P release processes in the environment, its quantification is not yet well developed for sediment samples. Current methods possess significant disadvantages because of the difficulties associated with using a single extractant to extract a specific P compound without altering others. A fast and reliable method to estimate the quantitative contribution of microorganisms to sediment P release processes is needed, especially when an excessive P accumulation in the form of polyphosphate (Poly-P) occurs. Development of novel approaches for application of emerging spectroscopic techniques to complex environmental matrices such as sediments significantly contributes to the speciation models of P mobilization, biogeochemical nutrient cycling and development of nutrient models. In this study, for the first time Attenuated Total Reflectance-Fourier Transform Infrared (ATR-FTIR) spectroscopy in combination with partial least squares (PLS) was used to quantify Poly-P in sediments. To reduce the high absorption matrix components in sediments such as silica, a physical extraction method was developed to separate sediment biological materials from abiotic particles. The aim was to achieve optimal separation of the biological materials from sediment abiotic particles with minimum chemical change in the sample matrix prior to ATR-FTIR analysis. Using a calibration set of 60 samples for the PLS prediction models in the Poly-P concentration range of 0-1 mg g(-1) d.w. (dry weight of sediment) (R(2) = 0.984 and root mean square error of prediction RMSEP = 0.041 at Factor-1) Poly-P could be detected at less than 50 μg g(-1) d.w.
Using this technique, there is no solvent extraction or chemical treatment required, sample preparation is minimal and simple, and the analysis time is greatly reduced. The results from this study demonstrated the potential of ATR FT-IR spectroscopy as an alternative method to study Poly-P in sediments.
Lin, M; Al-Holy, M; Mousavi-Hesary, M; Al-Qadiri, H; Cavinato, A G; Rasco, B A
2004-01-01
To evaluate the feasibility of visible and short-wavelength near-infrared (SW-NIR) diffuse reflectance spectroscopy (600-1100 nm) to quantify the microbial loads in chicken meat and to develop a rapid methodology for monitoring the onset of spoilage. Twenty-four prepackaged fresh chicken breast muscle samples were prepared and stored at 21 degrees C for 24 h. Visible and SW-NIR was used to detect and quantify the microbial loads in chicken breast muscle at time intervals of 0, 2, 4, 6, 8, 10, 12 and 24 h. Spectra were collected in the diffuse reflectance mode (600-1100 nm). Total aerobic plate count (APC) of each sample was determined by the spread plate method at 32 degrees C for 48 h. Principal component analysis (PCA) and partial least squares (PLS) based prediction models were developed. PCA analysis showed clear segregation of samples held 8 h or longer compared with 0-h control. An optimum PLS model required eight latent variables for chicken muscle (R = 0.91, SEP = 0.48 log CFU g(-1)). Visible and SW-NIR combined with PCA is capable of perceiving the change of the microbial loads in chicken muscle once the APC increases slightly above 1 log cycle. Accurate quantification of the bacterial loads in chicken muscle can be calculated from the PLS-based prediction method. Visible and SW-NIR spectroscopy is a technique with a considerable potential for monitoring food safety and food spoilage. Visible and SW-NIR can acquire a metabolic snapshot and quantify the microbial loads of food samples rapidly, accurately, and noninvasively. This method would allow for more expeditious applications of quality control in food industries.
De Girolamo, A; Lippolis, V; Nordkvist, E; Visconti, A
2009-06-01
Fourier transform near-infrared spectroscopy (FT-NIR) was used for rapid and non-invasive analysis of deoxynivalenol (DON) in durum and common wheat. The relevance of using ground wheat samples with a homogeneous particle size distribution to minimize measurement variations and avoid DON segregation among particles of different sizes was established. Calibration models for durum wheat, common wheat and durum + common wheat samples, with particle size <500 μm, were obtained by using partial least squares (PLS) regression with an external validation technique. Values of root mean square error of prediction (RMSEP, 306-379 μg kg(-1)) were comparable and not too far from values of root mean square error of cross-validation (RMSECV, 470-555 μg kg(-1)). Coefficients of determination (r(2)) indicated an "approximate to good" level of prediction of the DON content by FT-NIR spectroscopy in the PLS calibration models (r(2) = 0.71-0.83), and a "good" discrimination between low and high DON contents in the PLS validation models (r(2) = 0.58-0.63). A "limited to good" practical utility of the models was ascertained by range error ratio (RER) values higher than 6. A qualitative model, based on 197 calibration samples, was developed to discriminate between blank and naturally contaminated wheat samples by setting a cut-off at 300 μg kg(-1) DON to separate the two classes. The model correctly classified 69% of the 65 validation samples with most misclassified samples (16 of 20) showing DON contamination levels quite close to the cut-off level. These findings suggest that FT-NIR analysis is suitable for the determination of DON in unprocessed wheat at levels far below the maximum permitted limits set by the European Commission.
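Two quantities in this abstract lend themselves to a short worked sketch: the range error ratio (RER) and the cut-off classification at 300 μg kg(-1). The sketch assumes the common definition of RER as the range of the reference values divided by the prediction error, and assumes that a value equal to the cut-off counts as contaminated; neither detail is stated in the abstract. All DON values are hypothetical.

```python
def range_error_ratio(reference, error):
    """Range of the reference values divided by a prediction error.

    Assumption: RER = (max - min) / error, a common chemometric definition.
    """
    return (max(reference) - min(reference)) / error


def classify_by_cutoff(values, cutoff=300.0):
    """Label each DON value as blank or contaminated relative to the cut-off."""
    return ["contaminated" if v >= cutoff else "blank" for v in values]


def accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted label matches the true label."""
    hits = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return hits / len(true_labels)


# Hypothetical reference and FT-NIR predicted DON contents (μg per kg).
don_reference = [50.0, 120.0, 280.0, 320.0, 900.0, 1500.0]
don_predicted = [80.0, 150.0, 310.0, 290.0, 850.0, 1400.0]

true_labels = classify_by_cutoff(don_reference)
predicted_labels = classify_by_cutoff(don_predicted)

# 350 is a hypothetical error inside the RMSEP range quoted in the abstract.
print(range_error_ratio(don_reference, 350.0))
print(accuracy(true_labels, predicted_labels))
```

In this toy data the two misclassified samples (280 and 320) sit right next to the cut-off, which mirrors the pattern the abstract reports for most of its misclassified validation samples.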
Fadzlillah, Nurrulhidayah Ahmad; Rohman, Abdul; Ismail, Amin; Mustafa, Shuhaimi; Khatib, Alfi
2013-01-01
In the dairy product sector, butter is one of the potential sources of fat-soluble vitamins, namely vitamins A, D, E and K; consequently, butter commands a higher price than other dairy products. This has attracted unscrupulous market players to blend butter with other animal fats to gain economic profit. Animal fats like mutton fat (MF) are potential adulterants of butter because of their similar fatty acid composition. This study focused on the application of FTIR-ATR spectroscopy in conjunction with chemometrics for classification and quantification of MF as an adulterant in butter. The FTIR spectral region of 3910-710 cm⁻¹ was used for classification between butter and butter blended with MF at various concentrations with the aid of discriminant analysis (DA). DA was able to classify butter and adulterated butter without any misclassification. For quantitative analysis, partial least square (PLS) regression was used to develop a calibration model in the frequency region of 3910-710 cm⁻¹. The equation obtained for the relationship between actual and FTIR-predicted values of MF in the PLS calibration model was y = 0.998x + 1.033, with a coefficient of determination (R²) of 0.998 and a root mean square error of calibration of 0.046% (v/v). The PLS calibration model was subsequently used for the prediction of independent samples containing butter in binary mixtures with MF. Using 9 principal components, the root mean square error of prediction (RMSEP) was 1.68% (v/v). The results showed that FTIR spectroscopy can be used for the classification and quantification of MF in butter formulation for verification purposes.
Wang, Shenghao; Zhang, Yuyan; Cao, Fuyi; Pei, Zhenying; Gao, Xuewei; Zhang, Xu; Zhao, Yong
2018-02-13
This paper presents a novel spectrum analysis tool named synergy adaptive moving window modeling based on immune clone algorithm (SA-MWM-ICA), motivated by the tedious and inconvenient labor involved in selecting pre-processing methods and spectral variables by prior experience. In this work, the immune clone algorithm is introduced into the spectrum analysis field for the first time as a new optimization strategy, addressing the shortcomings of the more traditional methods. Based on the working principle of the human immune system, the performance of the quantitative model is regarded as the antigen, and a special vector corresponding to that antigen is regarded as the antibody. The antibody contains a pre-processing method optimization region, which is encoded by 11 decimal digits, and a spectrum variable optimization region, which is formed by several moving windows with changeable width and position. A set of original antibodies is created by modeling with this algorithm. After the affinity of these antibodies is calculated, those with high affinity are selected for cloning. The rule for cloning is that the higher the affinity, the more copies are made. In the next step, another important operation named hyper-mutation is applied to the cloned antibodies. The rule for hyper-mutation is that the lower the affinity, the higher the probability of mutation. Several antibodies with high affinity are created on the basis of these steps. Groups of simulated datasets, a gasoline near-infrared spectra dataset, and a soil near-infrared spectra dataset are employed to verify and illustrate the performance of SA-MWM-ICA.
Analysis results show that the quantitative models adopted by SA-MWM-ICA perform better, especially for structures with relatively complex spectra, than traditional models such as partial least squares (PLS), moving window PLS (MWPLS), genetic algorithm PLS (GAPLS), and pretreatment method classification and adjustable parameter changeable size moving window PLS (CA-CSMWPLS). The selected pre-processing methods and spectrum variables are easily explained. The proposed method converges in a few generations and can be used not only for near-infrared spectroscopy analysis but also for other similar spectral analyses, such as infrared spectroscopy. Copyright © 2017 Elsevier B.V. All rights reserved.
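The clone-and-mutate loop described in this abstract can be sketched in a few lines. This is a generic clonal selection illustration, not the authors' SA-MWM-ICA: the antibody here is only an 11-digit vector, the affinity is a stand-in (closeness to a hypothetical target vector) rather than the performance of a quantitative model, and the parameters `n_select`, `max_clones` and the elitist survivor rule are all assumptions.

```python
import random


def affinity(antibody, target):
    """Stand-in affinity: higher when the digit vector is closer to the target.

    In the abstract, affinity reflects quantitative-model performance instead.
    """
    return -sum(abs(a - t) for a, t in zip(antibody, target))


def clone_and_mutate(population, target, rng, n_select=5, max_clones=6):
    """One generation: select, clone by rank, hyper-mutate, keep the best."""
    ranked = sorted(population, key=lambda ab: affinity(ab, target), reverse=True)
    selected = ranked[:n_select]
    offspring = []
    for rank, antibody in enumerate(selected):
        n_clones = max(1, max_clones - rank)           # higher affinity, more copies
        mutation_rate = (rank + 1) / (n_select + 1)    # lower affinity, more mutation
        for _ in range(n_clones):
            clone = [
                rng.randrange(10) if rng.random() < mutation_rate else digit
                for digit in antibody
            ]
            offspring.append(clone)
    # Assumption: parents compete with their clones, so the best is never lost.
    pool = selected + offspring
    pool.sort(key=lambda ab: affinity(ab, target), reverse=True)
    return pool[:len(population)]


rng = random.Random(0)
target = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]   # hypothetical optimum
population = [[rng.randrange(10) for _ in range(11)] for _ in range(20)]
initial_best = max(affinity(ab, target) for ab in population)

for _ in range(30):
    population = clone_and_mutate(population, target, rng)

final_best = max(affinity(ab, target) for ab in population)
print(initial_best, final_best)
```

Replacing `affinity` with a function that fits a calibration model for the settings encoded in the antibody and returns its validation performance would move this sketch closer to the use described in the abstract.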
Jiménez-Carvelo, Ana M; González-Casado, Antonio; Cuadros-Rodríguez, Luis
2017-03-01
A new analytical method for the quantification of olive oil and palm oil in blends with other vegetable edible oils (canola, safflower, corn, peanut, seeds, grapeseed, linseed, sesame and soybean) using normal phase liquid chromatography, and applying chemometric tools was developed. The procedure for obtaining the chromatographic fingerprint from the methyl-transesterified fraction of each blend is described. The multivariate quantification methods used were Partial Least Square-Regression (PLS-R) and Support Vector Regression (SVR). The quantification results were evaluated by several parameters, such as the Root Mean Square Error of Validation (RMSEV), Mean Absolute Error of Validation (MAEV) and Median Absolute Error of Validation (MdAEV). Notably, with the newly proposed analytical method the chromatographic analysis takes only eight minutes, and the results obtained showed the potential of this method and allowed quantification of mixtures of olive oil and palm oil with other vegetable oils. Copyright © 2016 Elsevier B.V. All rights reserved.
Mabood, F; Boqué, R; Folcarelli, R; Busto, O; Jabeen, F; Al-Harrasi, Ahmed; Hussain, J
2016-05-15
In this study the effect of thermal treatment on the enhancement of synchronous fluorescence spectroscopic method for discrimination and quantification of pure extra virgin olive oil (EVOO) samples from EVOO samples adulterated with refined oil was investigated. Two groups of samples were used. One group was analyzed at room temperature (25 °C) and the other group was thermally treated in a thermostatic water bath at 75 °C for 8h, in contact with air and with light exposure, to favor oxidation. All the samples were then measured with synchronous fluorescence spectroscopy. Synchronous fluorescence spectra were acquired by varying the wavelength in the region from 250 to 720 nm at 20 nm wavelength differential interval of excitation and emission. Pure and adulterated olive oils were discriminated by using partial least-squares discriminant analysis (PLS-DA). It was found that the best PLS-DA models were those built with the difference spectra (75 °C-25 °C), which were able to discriminate pure from adulterated oils at a 2% level of adulteration of refined olive oils. Furthermore, PLS regression models were also built to quantify the level of adulteration. Again, the best model was the one built with the difference spectra, with a prediction error of 3.18% of adulteration. Copyright © 2016 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Larasati, Ophilia; Puspita Dirgahayani, Eng., Dr.
2018-05-01
Transport services are essential to support daily life. A lack of transport supply leads to the existence of transport disadvantaged (TDA) groups who are vulnerable to social exclusion, which happens when a particular group or individual has difficulty accessing certain activities that are considered normal in society. To tackle this phenomenon, an understanding of the influence of TDA variables on social exclusion is needed. The aim of this study is to analyze the influences of TDA variables on social exclusion in a rural context, with Cibeureum Village (Bandung Barat Regency) and Bunikasih Village (Subang Regency) as the case studies. The two case studies have different accessibility characteristics. Partial Least Squares (PLS) Structural Equation Modeling (SEM) is chosen as the method to analyze the influences of TDA variables on social exclusion. The PLS-SEM model is developed according to the social exclusion variable and four TDA variables, i.e., accessibility, individual characteristics, private vehicle existence, and travel behavior. Importance-performance map analysis (IPMA) is done after the PLS-SEM model is evaluated. The study reveals that among the four TDA variables, accessibility has the most influence on social exclusion, hence interventions related to improving accessibility are needed to tackle social exclusion. More specifically, the provision of alternative modes is needed in both study areas, while in Bunikasih Village the cost of travel is also an important variable to consider.
NASA Astrophysics Data System (ADS)
Mabood, F.; Boqué, R.; Folcarelli, R.; Busto, O.; Jabeen, F.; Al-Harrasi, Ahmed; Hussain, J.
2016-05-01
In this study the effect of thermal treatment on the enhancement of synchronous fluorescence spectroscopic method for discrimination and quantification of pure extra virgin olive oil (EVOO) samples from EVOO samples adulterated with refined oil was investigated. Two groups of samples were used. One group was analyzed at room temperature (25 °C) and the other group was thermally treated in a thermostatic water bath at 75 °C for 8 h, in contact with air and with light exposure, to favor oxidation. All the samples were then measured with synchronous fluorescence spectroscopy. Synchronous fluorescence spectra were acquired by varying the wavelength in the region from 250 to 720 nm at 20 nm wavelength differential interval of excitation and emission. Pure and adulterated olive oils were discriminated by using partial least-squares discriminant analysis (PLS-DA). It was found that the best PLS-DA models were those built with the difference spectra (75 °C-25 °C), which were able to discriminate pure from adulterated oils at a 2% level of adulteration of refined olive oils. Furthermore, PLS regression models were also built to quantify the level of adulteration. Again, the best model was the one built with the difference spectra, with a prediction error of 3.18% of adulteration.
Geographical provenance of palm oil by fatty acid and volatile compound fingerprinting techniques.
Tres, A; Ruiz-Samblas, C; van der Veer, G; van Ruth, S M
2013-04-15
Analytical methods are required in addition to administrative controls to verify the geographical origin of vegetable oils such as palm oil in an objective manner. In this study, fatty acid and volatile organic compound fingerprinting in combination with chemometrics was applied to verify the geographical origin of crude palm oil (continental scale). For this purpose 94 crude palm oil samples were collected from South East Asia (55), South America (11) and Africa (28). Partial least squares discriminant analysis (PLS-DA) was used to develop a hierarchical classification model by combining two consecutive binary PLS-DA models. First, a PLS-DA model was built to distinguish South East Asian from non-South East Asian palm oil samples. Then a second model was developed, only for the non-Asian samples, to discriminate African from South American crude palm oil. Models were externally validated by using them to predict the identity of new authentic samples. The fatty acid fingerprinting model revealed three misclassified samples. The volatile compound fingerprinting models showed an 88%, 100% and 100% accuracy for the South East Asian, African and American class, respectively. The verification of the geographical origin of crude palm oil is feasible by fatty acid and volatile compound fingerprinting. Further research is required to validate the approach and to increase its spatial specificity to country/province scale. Copyright © 2012 Elsevier Ltd. All rights reserved.
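The hierarchical scheme in this abstract, two consecutive binary models with the second applied only to samples rejected by the first, can be sketched as a simple composition. The binary models below are stand-ins: each is a hypothetical linear score compared with a 0.5 cut-off, which is how a PLS-DA prediction is often thresholded, but the weights, the two-variable samples and the cut-off are assumptions and not values from the study.

```python
def make_binary_model(weights, intercept, cutoff=0.5):
    """Build a stand-in binary classifier: linear score compared with a cut-off."""
    def predict(sample):
        score = intercept + sum(w * x for w, x in zip(weights, sample))
        return score > cutoff
    return predict


def hierarchical_origin(sample, is_south_east_asian, is_african):
    """Two consecutive binary decisions, mirroring the two-model hierarchy."""
    if is_south_east_asian(sample):
        return "South East Asia"
    # Only samples rejected by the first model reach the second model.
    return "Africa" if is_african(sample) else "South America"


# Hypothetical models acting on a two-variable fingerprint.
sea_model = make_binary_model([1.0, 0.0], 0.0)
africa_model = make_binary_model([0.0, 1.0], 0.0)

samples = {"a": [0.9, 0.1], "b": [0.2, 0.8], "c": [0.1, 0.2]}
origins = {
    name: hierarchical_origin(values, sea_model, africa_model)
    for name, values in samples.items()
}
print(origins)
```

Because the second model never sees South East Asian samples, it can be trained on the non-Asian subset alone, as the abstract describes.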
Shao, Yongni; Xie, Chuanqi; Jiang, Linjun; Shi, Jiahui; Zhu, Jiajin; He, Yong
2015-04-05
Visible/near infrared spectroscopy (Vis/NIR) based on sensitive wavelengths (SWs) and chemometrics was proposed to discriminate different tomatoes bred by spaceflight mutagenesis from their leaves or fruits (green or mature). The tomato breeds were mutant M1, M2 and their parent. Partial least squares (PLS) analysis and least squares-support vector machine (LS-SVM) were implemented for calibration models. PLS analysis was implemented for calibration models with different wavebands including the visible region (400-700 nm) and the near infrared region (700-1000 nm). The best PLS models were achieved in the visible region for the leaf and green fruit samples and in the near infrared region for the mature fruit samples. Furthermore, different latent variables (4-8 LVs for leaves, 5-9 LVs for green fruits, and 4-9 LVs for mature fruits) were used as inputs of LS-SVM to develop the LV-LS-SVM models with the grid search technique and radial basis function (RBF) kernel. The optimal LV-LS-SVM models were achieved with six LVs for the leaf samples, seven LVs for green fruits, and six LVs for mature fruits, respectively, and they outperformed the PLS models. Moreover, independent component analysis (ICA) was executed to select several SWs based on loading weights. The optimal LS-SVM model was achieved with SWs of 550-560 nm, 562-574 nm, 670-680 nm and 705-715 nm for the leaf samples; 548-556 nm, 559-564 nm, 678-685 nm and 962-974 nm for the green fruit samples; and 712-718 nm, 720-729 nm, 968-978 nm and 820-830 nm for the mature fruit samples. All of them had better performance than PLS and LV-LS-SVM, with the parameters of correlation coefficient (rp), root mean square error of prediction (RMSEP) and bias of 0.9792, 0.2632 and 0.0901 based on leaf discrimination, 0.9837, 0.2783 and 0.1758 based on green fruit discrimination, 0.9804, 0.2215 and -0.0035 based on mature fruit discrimination, respectively. The overall results indicated that ICA was an effective way to select SWs, and that Vis/NIR combined with LS-SVM models was able to predict the different breeds (mutant M1, mutant M2 and their parent) of tomatoes from leaves and fruits. Copyright © 2015 Elsevier B.V. All rights reserved.
Mirjankar, Nikhil S; Fraga, Carlos G; Carman, April J; Moran, James J
2016-02-02
Chemical attribution signatures (CAS) for chemical threat agents (CTAs), such as cyanides, are being investigated to provide an evidentiary link between CTAs and specific sources to support criminal investigations and prosecutions. Herein, stocks of KCN and NaCN were analyzed for trace anions by high performance ion chromatography (HPIC), carbon stable isotope ratio (δ(13)C) by isotope ratio mass spectrometry (IRMS), and trace elements by inductively coupled plasma optical emission spectroscopy (ICP-OES). The collected analytical data were evaluated using hierarchical cluster analysis (HCA), Fisher-ratio (F-ratio), interval partial least-squares (iPLS), genetic algorithm-based partial least-squares (GAPLS), partial least-squares discriminant analysis (PLSDA), K nearest neighbors (KNN), and support vector machines discriminant analysis (SVMDA). HCA of anion impurity profiles from multiple cyanide stocks from six reported countries of origin resulted in cyanide samples clustering into three groups, independent of the associated alkali metal (K or Na). The three groups were independently corroborated by HCA of cyanide elemental profiles and corresponded to countries each having one known solid cyanide factory: Czech Republic, Germany, and United States. Carbon stable isotope measurements resulted in two clusters: Germany and United States (the single Czech stock grouped with United States stocks). Classification errors for two validation studies using anion impurity profiles collected over five years on different instruments were as low as zero for KNN and SVMDA, demonstrating the excellent reliability associated with using anion impurities for matching a cyanide sample to its factory using our current cyanide stocks. Variable selection methods reduced errors for those classification methods having errors greater than zero; iPLS-forward selection and F-ratio typically provided the lowest errors. 
Finally, using anion profiles to classify cyanides to a specific stock or stock group for a subset of United States stocks resulted in cross-validation errors ranging from 0 to 5.3%.
Sundbom, E; Jeanneau, M
1996-03-01
The main aim of the study is to establish an empirical connection between perceptual defences as measured by the Defense Mechanism Test (DMT)--a projective percept-genetic method--and manifest linguistic expressions based on word pattern analyses. The subjects were 25 psychiatric patients with the diagnoses neurotic personality organization (NPO), borderline personality organization (BPO) and psychotic personality organization (PPO) in accordance with Kernberg's theory. A set of 130 DMT variables and 40 linguistic variables were analyzed by means of partial least squares (PLS) discriminant analysis separately and then pooled together. The overall hypothesis was that it would be possible to define the personality organization of the patients in terms of an amalgam of perceptual defences and word patterns, and that these two kinds of data would confirm each other. The result of the combined PLS analysis revealed a very good separation between the diagnostic groups as measured by the pooled variable sets. Among other things, it was shown that NPO patients are principally characterized by linguistic variables, whereas BPO and PPO patients are better defined by perceptual defences as measured by the DMT method.
NASA Astrophysics Data System (ADS)
Kong, Wenwen; Liu, Fei; Zhang, Chu; Bao, Yidan; Yu, Jiajia; He, Yong
2014-01-01
Tomatoes are cultivated around the world, and gray mold is one of their most prominent and destructive diseases. An early disease detection method can decrease losses caused by plant diseases and prevent the spread of diseases. The activity of peroxidase (POD) is a very important indicator of disease stress in plants. The objective of this study is to examine the possibility of fast detection of POD activity in tomato leaves infected with Botrytis cinerea using hyperspectral imaging data. Five pre-treatment methods were investigated. Genetic algorithm-partial least squares (GA-PLS) was applied to select optimal wavelengths. A new fast learning neural algorithm named extreme learning machine (ELM) was employed as the multivariate analytical tool in this study. 21 optimal wavelengths were selected by GA-PLS and used as inputs of three calibration models. The optimal prediction result was achieved by the ELM model with selected wavelengths, and the r and RMSEP in validation were 0.8647 and 465.9880, respectively. The results indicated that hyperspectral imaging could be considered a valuable tool for POD activity prediction. The selected wavelengths could be potential resources for instrument development.
Wu, Yan-Wen; Sun, Su-Qin; Zhou, Qun; Leung, Hei-Wun
2008-02-13
Honghua Oil (HHO), a traditional Chinese medicine (TCM) oil preparation, is a mixture of several plant essential oils. In this text, the extended ranges of Fourier transform mid-infrared (FT-MIR) and near infrared (FT-NIR) were recorded for 48 commercially available HHOs of different batches from nine manufacturers. The qualitative and quantitative analysis of three marker components, alpha-pinene, methyl salicylate and eugenol, in different HHO products were performed rapidly by the two vibrational spectroscopic methods, i.e. MIR with horizontal attenuated total reflection (HATR) accessory and NIR with direct sampling technique, followed by partial least squares (PLS) regression treatment of the set of spectra obtained. The results indicated that it was successful to identify alpha-pinene, methyl salicylate and eugenol in all of the samples by simple inspection of the MIR-HATR spectra. Both PLS models established with MIR-HATR and NIR spectral data using gas chromatography (GC) peak areas as calibration reference showed a good linear correlation for each of all three target substances in HHO samples. The above spectroscopic techniques may be the promising methods for the rapid quality assessment/quality control (QA/QC) of TCM oil preparations.
Passos, Cláudia P; Cardoso, Susana M; Barros, António S; Silva, Carlos M; Coimbra, Manuel A
2010-02-28
Fourier transform infrared (FTIR) spectroscopy has been emphasised as a widespread technique for the quick assessment of food components. In this work, procyanidins were extracted with methanol and acetone/water from the seeds of white and red grape varieties. A fractionation by graded methanol/chloroform precipitations yielded 26 samples that were characterised using thiolysis as pre-treatment followed by HPLC-UV and MS detection. The average degree of polymerisation (DPn) of the procyanidins in the samples ranged from 2 to 11 flavan-3-ol residues. FTIR spectroscopy within the wavenumber region of 1800-700 cm⁻¹ allowed a partial least squares (PLS1) regression model with 8 latent variables (LVs) to be built for the estimation of the DPn, giving an RMSECV of 11.7%, with an R² of 0.91 and an RMSEP of 2.58. The application of orthogonal projection to latent structures (O-PLS1) clarifies the interpretation of the regression model vectors. Moreover, the O-PLS procedure removed 88% of the variation not correlated with the DPn, making it possible to relate the increase of the absorbance peaks at 1203 and 1099 cm⁻¹ to the increase of the DPn due to the higher proportion of substitutions in the aromatic ring of the polymerised procyanidin molecules. Copyright 2009 Elsevier B.V. All rights reserved.
New consensus multivariate models based on PLS and ANN studies of sigma-1 receptor antagonists.
Oliveira, Aline A; Lipinski, Célio F; Pereira, Estevão B; Honorio, Kathia M; Oliveira, Patrícia R; Weber, Karen C; Romero, Roseli A F; de Sousa, Alexsandro G; da Silva, Albérico B F
2017-10-02
The treatment of neuropathic pain is very complex and there are few drugs approved for this purpose. Among the studied compounds in the literature, sigma-1 receptor antagonists have shown to be promising. In order to develop QSAR studies applied to the compounds of 1-arylpyrazole derivatives, multivariate analyses have been performed in this work using partial least square (PLS) and artificial neural network (ANN) methods. A PLS model has been obtained and validated with 45 compounds in the training set and 13 compounds in the test set (r² training = 0.761, q² = 0.656, r² test = 0.746, MSE test = 0.132 and MAE test = 0.258). Additionally, multi-layer perceptron ANNs (MLP-ANNs) were employed in order to propose non-linear models trained by gradient descent with momentum backpropagation function. Based on MSE test values, the best MLP-ANN models were combined in a MLP-ANN consensus model (MLP-ANN-CM; r² test = 0.824, MSE test = 0.088 and MAE test = 0.197). In the end, a general consensus model (GCM) has been obtained using PLS and MLP-ANN-CM models (r² test = 0.811, MSE test = 0.100 and MAE test = 0.218). Besides, the selected descriptors (GGI6, Mor23m, SRW06, H7m, MLOGP, and μ) revealed important features that should be considered when one is planning new compounds of the 1-arylpyrazole class. The multivariate models proposed in this work are definitely a powerful tool for the rational drug design of new compounds for neuropathic pain treatment. Graphical abstract: main scaffold of the 1-arylpyrazole derivatives and the selected descriptors.
Nazari, Seyed Saeed Hashemi; Mokhayeri, Yaser; Mansournia, Mohammad Ali; Khodakarim, Soheila; Soori, Hamid
2018-05-21
Some studies have shed light on the association between dietary patterns and stroke, but none of them applied reduced rank regression (RRR). Therefore, we sought to extract dietary patterns using RRR, and to show how well the scores extracted by RRR predict stroke in comparison to the scores produced by partial least squares (PLS) and principal components regression (PCR). Diet data at baseline with four response variables including body mass index (BMI), fibrinogen, IL-6, low-density lipoprotein (LDL) cholesterol were used to extract dietary patterns. Analyses were based on 5468 men and women aged 45-84 y who had no clinical cardiovascular diseases (CVD) from the Multi-Ethnic Study of Atherosclerosis (MESA). Dietary patterns were created by three methods: RRR, PLS, and PCR. The RRR1 was positively associated with stroke incidence in both models (for model 1 hazard ratio (HR): 7.49; 95% CI: 1.66, 33.69 P for trend = 0.01 and for model 2 HR: 6.83; 95% CI: 1.51, 30.87 for quintile 5 compared with the reference category P for trend = 0.02). The RRR1, PLS1, and PCR1 were high in fats and oils, poultry, tomatoes, fried potato and processed meat. Additionally, RRR1 and PLS1 were high in dark-yellow and cruciferous vegetables which were negatively correlated with the first dietary pattern. Mainly according to the RRR, we identified that a dietary pattern high in fats and oil, poultry, non-diet soda, processed meat, tomatoes, legumes, chicken, tuna and egg salad, fried potato and low in dark-yellow and cruciferous vegetables may increase the incidence of stroke.
Bricklemyer, Ross S; Brown, David J; Turk, Philip J; Clegg, Sam M
2013-10-01
Laser-induced breakdown spectroscopy (LIBS) provides a potential method for rapid, in situ soil C measurement. In previous research on the application of LIBS to intact soil cores, we hypothesized that ultraviolet (UV) spectrum LIBS (200-300 nm) might not provide sufficient elemental information to reliably discriminate between soil organic C (SOC) and inorganic C (IC). In this study, using a custom complete spectrum (245-925 nm) core-scanning LIBS instrument, we analyzed 60 intact soil cores from six wheat fields. Predictive multi-response partial least squares (PLS2) models using full and reduced spectrum LIBS were compared for directly determining soil total C (TC), IC, and SOC. Two regression shrinkage and variable selection approaches, the least absolute shrinkage and selection operator (LASSO) and sparse multivariate regression with covariance estimation (MRCE), were tested for soil C predictions and the identification of wavelengths important for soil C prediction. Using complete spectrum LIBS for PLS2 modeling reduced the calibration standard error of prediction (SEP) 15 and 19% for TC and IC, respectively, compared to UV spectrum LIBS. The LASSO and MRCE approaches provided significantly improved calibration accuracy and reduced SEP 32-55% over UV spectrum PLS2 models. We conclude that (1) complete spectrum LIBS is superior to UV spectrum LIBS for predicting soil C for intact soil cores without pretreatment; (2) LASSO and MRCE approaches provide improved calibration prediction accuracy over PLS2 but require additional testing with increased soil and target analyte diversity; and (3) measurement errors associated with analyzing intact cores (e.g., sample density and surface roughness) require further study and quantification.
Determinants of caregivers' awareness of Universal Newborn Hearing Screening in Malaysia.
Abdul Majid, Abdul Halim; Zakaria, Mohd Normani; Abdullah, Nor Azimah Chew; Hamzah, Sulaiman; Mukari, Siti Zamratol-Mai Sarah
2017-10-01
This paper aims to investigate the effects of perceived attitude and anxiety on awareness of UNHS among caregivers in Malaysia. Using a cross-sectional research approach, data were collected, and 46 of the 87 questionnaires distributed to caregivers attending UNHS programs at selected public hospitals were usable for analysis (response rate of 52.8%). The Partial Least Squares (PLS) algorithm and a bootstrapping technique were employed to test the hypotheses of the study. The R-squared value is 0.205, which implies that the exogenous latent variables explained 21% of the variance of the endogenous latent variable. This indicates a moderate and acceptable R-squared value. Findings from the PLS structural model evaluation revealed that anxiety has no significant influence (β = -0.091, t = 0.753, p > 0.10) on caregivers' awareness, but perceived attitude has a significant effect (β = -0.444, t = 3.434, p < 0.01) on caregivers' awareness. Caregivers' awareness of UNHS is influenced by their perceived attitude while anxiety is not associated with caregivers' awareness. This implies that caregivers may not believe in early detection of hearing impairment in children, thinking that their babies are too young to be tested for hearing loss. Moreover, the socio-economic situation of the caregivers may have contributed to their failure to honor UNHS screening appointments, as some of them may need to work to earn a living while some may perceive honoring such appointments as a waste of time. The non-significant relationship between anxiety and caregivers' awareness may be due to religious beliefs of caregivers. Limitations and suggestions were discussed. Copyright © 2017 Elsevier B.V. All rights reserved.
Wu, Xia; Zhu, Jian-Cheng; Zhang, Yu; Li, Wei-Min; Rong, Xiang-Lu; Feng, Yi-Fan
2016-08-25
The potential impact of lipid research has been increasingly realized both in disease treatment and prevention. An effective metabolomics approach based on ultra-performance liquid chromatography/quadrupole-time-of-flight mass spectrometry (UPLC/Q-TOF-MS) along with multivariate statistical analysis has been applied for investigating the dynamic change of plasma phospholipid compositions in early type 2 diabetic rats after treatment with Huang-Qi-San, an ancient prescription of Chinese Medicine. The exported UPLC/Q-TOF-MS data of plasma samples were subjected to SIMCA-P and processed by bioMark, mixOmics, Rcomdr packages with R software. Clear score plots of plasma sample groups, including normal control group (NC), model group (MC), positive medicine control group (Flu) and Huang-Qi-San group (HQS), were achieved by principal-components analysis (PCA), partial least-squares discriminant analysis (PLS-DA) and orthogonal partial least-squares discriminant analysis (OPLS-DA). Biomarkers were screened out using Student's t-test, principal component regression (PCR), partial least-squares regression (PLS) and important variable method (variable influence on projection, VIP). Structures of metabolites were identified and metabolic pathways were deduced by correlation coefficient. The relationship between compounds was explained by the correlation coefficient diagram, and the metabolic differences between similar compounds were illustrated. Based on the KEGG database, the biological significance of the identified biomarkers was described. The correlation coefficient was applied for the first time to identify the structure and deduce the metabolic pathways of phospholipid metabolites, and the study provided a new methodological cue for further understanding the molecular mechanisms of metabolites in the process of regulating Huang-Qi-San for treating early type 2 diabetes. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Vindimian, Éric; Garric, Jeanne; Flammarion, Patrick; Thybaud, Éric; Babut, Marc
1999-10-01
The evaluation of the ecotoxicity of effluents requires a battery of biological tests on several species. In order to derive a summary parameter from such a battery, a single endpoint was calculated for all the tests: the EC10, obtained by nonlinear regression, with bootstrap evaluation of the confidence intervals. Principal component analysis was used to characterize and visualize the correlation between the tests. The table of the toxicity of the effluents was then submitted to a panel of experts, who classified the effluents according to the test results. Partial least squares (PLS) regression was used to fit the average value of the experts' judgements to the toxicity data, using a simple equation. Furthermore, PLS regression on partial data sets and other considerations resulted in an optimum battery, with two chronic tests and one acute test. The index is intended to be used for the classification of effluents based on their toxicity to aquatic species. Copyright © 1999 SETAC.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Vindimian, E.; Garric, J.; Flammarion, P.
1999-10-01
The evaluation of the ecotoxicity of effluents requires a battery of biological tests on several species. In order to derive a summary parameter from such a battery, a single endpoint was calculated for all the tests: the EC10, obtained by nonlinear regression, with bootstrap evaluation of the confidence intervals. Principal component analysis was used to characterize and visualize the correlation between the tests. The table of the toxicity of the effluents was then submitted to a panel of experts, who classified the effluents according to the test results. Partial least squares (PLS) regression was used to fit the average value of the experts' judgments to the toxicity data, using a simple equation. Furthermore, PLS regression on partial data sets and other considerations resulted in an optimum battery, with two chronic tests and one acute test. The index is intended to be used for the classification of effluents based on their toxicity to aquatic species.
NASA Astrophysics Data System (ADS)
Chen, Quansheng; Qi, Shuai; Li, Huanhuan; Han, Xiaoyan; Ouyang, Qin; Zhao, Jiewen
2014-10-01
To rapidly and efficiently detect the presence of adulterants in honey, three-dimensional fluorescence spectroscopy (3DFS) technique was employed with the help of multivariate calibration. The data of 3D fluorescence spectra were compressed using characteristic extraction and the principal component analysis (PCA). Then, partial least squares (PLS) and back propagation neural network (BP-ANN) algorithms were used for modeling. The model was optimized by cross validation, and its performance was evaluated according to root mean square error of prediction (RMSEP) and correlation coefficient (R) in prediction set. The results showed that BP-ANN model was superior to PLS models, and the optimum prediction results of the mixed group (sunflower ± longan ± buckwheat ± rape) model were achieved as follow: RMSEP = 0.0235 and R = 0.9787 in the prediction set. The study demonstrated that the 3D fluorescence spectroscopy technique combined with multivariate calibration has high potential in rapid, nondestructive, and accurate quantitative analysis of honey adulteration.
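The two figures of merit quoted above, the root mean square error of prediction (RMSEP) and the correlation coefficient (R) on the prediction set, can be computed as in the sketch below. It assumes R is the Pearson correlation between reference and predicted values, which is the usual convention but is not stated in the abstract; the function names are illustrative.

```python
import math


def rmsep(reference, predicted):
    """Root mean square error of prediction over a prediction set."""
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - r) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(reference, predicted):
    """Pearson correlation between reference and predicted values."""
    if len(reference) != len(predicted) or len(reference) < 2:
        raise ValueError("need two equal-length inputs with at least 2 values")
    n = len(reference)
    mean_r = sum(reference) / n
    mean_p = sum(predicted) / n
    cov = sum((r - mean_r) * (p - mean_p) for r, p in zip(reference, predicted))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_r == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_r * var_p)
```

RMSEP is expressed in the units of the predicted quantity, so it is only comparable between models that predict the same property on the same scale, whereas R is unitless.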
Consistent Partial Least Squares Path Modeling via Regularization
Jung, Sunho; Park, JaeHong
2018-01-01
Partial least squares (PLS) path modeling is a component-based structural equation modeling that has been adopted in social and psychological research due to its data-analytic capability and flexibility. A recent methodological advance is consistent PLS (PLSc), designed to produce consistent estimates of path coefficients in structural models involving common factors. In practice, however, PLSc may frequently encounter multicollinearity in part because it takes a strategy of estimating path coefficients based on consistent correlations among independent latent variables. PLSc has yet no remedy for this multicollinearity problem, which can cause loss of statistical power and accuracy in parameter estimation. Thus, a ridge type of regularization is incorporated into PLSc, creating a new technique called regularized PLSc. A comprehensive simulation study is conducted to evaluate the performance of regularized PLSc as compared to its non-regularized counterpart in terms of power and accuracy. The results show that our regularized PLSc is recommended for use when serious multicollinearity is present. PMID:29515491
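The effect of a ridge-type penalty on path coefficients estimated from correlations can be illustrated with two correlated predictors. The sketch below is a textbook ridge solution of the normal equations, (R + λI)b = r, and is not the regularized PLSc estimator from the paper; the function name, the penalty value and the example correlations are assumptions chosen for illustration.

```python
def path_coefficients(rho, r1y, r2y, ridge=0.0):
    """Solve (R + ridge * I) b = r for two predictors.

    rho is the correlation between the two predictors; r1y and r2y are their
    correlations with the outcome. ridge = 0 gives the ordinary solution.
    """
    d = 1.0 + ridge                      # diagonal of the penalized matrix
    det = d * d - rho * rho              # determinant of [[d, rho], [rho, d]]
    if det == 0:
        raise ValueError("predictors are perfectly collinear and ridge is 0")
    b1 = (d * r1y - rho * r2y) / det
    b2 = (d * r2y - rho * r1y) / det
    return b1, b2


# Hypothetical, strongly collinear predictors (rho = 0.99).
plain = path_coefficients(0.99, 0.60, 0.55)
penalized = path_coefficients(0.99, 0.60, 0.55, ridge=0.1)
```

With nearly collinear predictors the unpenalized coefficients become large and take opposite signs, while the penalized ones stay moderate, which is the instability that motivates adding a ridge term.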
Partial least squares based identification of Duchenne muscular dystrophy specific genes.
An, Hui-bo; Zheng, Hua-cheng; Zhang, Li; Ma, Lin; Liu, Zheng-yan
2013-11-01
Large-scale parallel gene expression analysis has provided a greater ease for investigating the underlying mechanisms of Duchenne muscular dystrophy (DMD). Previous studies typically implemented variance/regression analysis, which would be fundamentally flawed when unaccounted sources of variability in the arrays existed. Here we aim to identify genes that contribute to the pathology of DMD using partial least squares (PLS) based analysis. We carried out PLS-based analysis with two datasets downloaded from the Gene Expression Omnibus (GEO) database to identify genes contributing to the pathology of DMD. Except for the genes related to inflammation, muscle regeneration and extracellular matrix (ECM) modeling, we found some genes with high fold change, which have not been identified by previous studies, such as SRPX, GPNMB, SAT1, and LYZ. In addition, downregulation of the fatty acid metabolism pathway was found, which may be related to the progressive muscle wasting process. Our results provide a better understanding for the downstream mechanisms of DMD.
Grisales, Jaiver Osorio; Arancibia, Juan A; Castells, Cecilia B; Olivieri, Alejandro C
2012-12-01
In this report, we demonstrate how chiral liquid chromatography combined with multivariate chemometric techniques, specifically unfolded-partial least-squares regression (U-PLS), provides a powerful analytical methodology. Using U-PLS, strongly overlapped enantiomer profiles in a sample could be successfully processed and enantiomeric purity could be accurately determined without requiring baseline enantioresolution between peaks. The samples were partially enantioseparated with a permethyl-β-cyclodextrin chiral column under reversed-phase conditions. Signals detected with a diode-array detector within a wavelength range from 198 to 241 nm were recorded, and the data were processed by a second-order multivariate algorithm to decrease detection limits. The R-(-)-enantiomer of ibuprofen in tablet formulation samples could be determined at the level of 0.5 mg L⁻¹ in the presence of 99.9% of the S-(+)-enantiomorph with relative prediction error within ±3%. Copyright © 2012 Elsevier B.V. All rights reserved.
Kuriakose, Saji; Joe, I Hubert
2013-11-01
Determination of the authenticity of essential oils has become more significant, in recent years, following some illegal adulteration and contamination scandals. The present investigative study focuses on the application of near infrared spectroscopy to detect sample authenticity and quantify economic adulteration of sandalwood oils. Several data pre-treatments are investigated for calibration and prediction using partial least square regression (PLSR). The quantitative data analysis is done using a new spectral approach - full spectrum or sequential spectrum. The optimum number of PLS components is obtained according to the lowest root mean square error of calibration (RMSEC = 0.00009% v/v). The lowest root mean square error of prediction (RMSEP = 0.00016% v/v) in the test set and the highest coefficient of determination (R² = 0.99989) are used as the evaluation tools for the best model. A nonlinear method, locally weighted regression (LWR), is added to extract nonlinear information and to compare with the linear PLSR model. Copyright © 2013 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Kuriakose, Saji; Joe, I. Hubert
2013-11-01
Determination of the authenticity of essential oils has become more significant, in recent years, following some illegal adulteration and contamination scandals. The present investigative study focuses on the application of near infrared spectroscopy to detect sample authenticity and quantify economic adulteration of sandalwood oils. Several data pre-treatments are investigated for calibration and prediction using partial least square regression (PLSR). The quantitative data analysis is done using a new spectral approach - full spectrum or sequential spectrum. The optimum number of PLS components is obtained according to the lowest root mean square error of calibration (RMSEC = 0.00009% v/v). The lowest root mean square error of prediction (RMSEP = 0.00016% v/v) in the test set and the highest coefficient of determination (R² = 0.99989) are used as the evaluation tools for the best model. A nonlinear method, locally weighted regression (LWR), is added to extract nonlinear information and to compare with the linear PLSR model.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tripathi, Markandey M.; Krishnan, Sundar R.; Srinivasan, Kalyan K.
Chemiluminescence emissions from OH*, CH*, C2, and CO2 formed within the reaction zone of premixed flames depend upon the fuel-air equivalence ratio in the burning mixture. In the present paper, a new partial least square regression (PLS-R) based multivariate sensing methodology is investigated and compared with an OH*/CH* intensity ratio-based calibration model for sensing equivalence ratio in atmospheric methane-air premixed flames. Five replications of spectral data at nine different equivalence ratios ranging from 0.73 to 1.48 were used in the calibration of both models. During model development, the PLS-R model was initially validated with the calibration data set using the leave-one-out cross validation technique. Since the PLS-R model used the entire raw spectral intensities, it did not need the nonlinear background subtraction of CO2 emission that is required for typical OH*/CH* intensity ratio calibrations. An unbiased spectral data set (not used in the PLS-R model development), for 28 different equivalence ratio conditions ranging from 0.71 to 1.67, was used to predict equivalence ratios using the PLS-R and the intensity ratio calibration models. It was found that the equivalence ratios predicted with the PLS-R based multivariate calibration model matched the experimentally measured equivalence ratios within 7%, whereas the OH*/CH* intensity ratio calibration grossly underpredicted equivalence ratios in comparison to measured equivalence ratios, especially under rich conditions (equivalence ratio > 1.2). The practical implications of the chemiluminescence-based multivariate equivalence ratio sensing methodology are also discussed.
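Leave-one-out cross validation, used above for the initial validation of the PLS-R model, can be sketched in a few lines. To keep the example self-contained, a one-variable least-squares line stands in for the PLS-R model on full spectra, so this shows only the validation loop and is not the calibration used in the paper; all names are illustrative.

```python
import math


def fit_line(x, y):
    """Ordinary least-squares line y = slope * x + intercept (stand-in model)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    return slope, mean_y - slope * mean_x


def loo_cv_rmse(x, y):
    """Root mean square error from leave-one-out cross validation."""
    x, y = list(x), list(y)
    if len(x) != len(y) or len(x) < 3:
        raise ValueError("need at least 3 paired observations")
    squared_errors = []
    for i in range(len(x)):
        # Refit the model without observation i, then predict observation i.
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        squared_errors.append((slope * x[i] + intercept - y[i]) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

Because every observation is predicted by a model that never saw it, the resulting error is a less optimistic estimate than the calibration error, although an independent data set such as the 28 conditions mentioned above remains the stronger check.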
Infrared microspectroscopic determination of collagen cross-links in articular cartilage
NASA Astrophysics Data System (ADS)
Rieppo, Lassi; Kokkonen, Harri T.; Kulmala, Katariina A. M.; Kovanen, Vuokko; Lammi, Mikko J.; Töyräs, Juha; Saarakkala, Simo
2017-03-01
Collagen forms an organized network in articular cartilage to give tensile stiffness to the tissue. Due to its long half-life, collagen is susceptible to cross-links caused by advanced glycation end-products. The current standard method for determination of cross-link concentrations in tissues is the destructive high-performance liquid chromatography (HPLC). The aim of this study was to analyze the cross-link concentrations nondestructively from standard unstained histological articular cartilage sections by using Fourier transform infrared (FTIR) microspectroscopy. Half of the bovine articular cartilage samples (n=27) were treated with threose to increase the collagen cross-linking while the other half (n=27) served as a control group. Partial least squares (PLS) regression with variable selection algorithms was used to predict the cross-link concentrations from the measured average FTIR spectra of the samples, and HPLC was used as the reference method for cross-link concentrations. The correlation coefficients between the PLS regression models and the biochemical reference values were r=0.84 (p<0.001), r=0.87 (p<0.001) and r=0.92 (p<0.001) for hydroxylysyl pyridinoline (HP), lysyl pyridinoline (LP), and pentosidine (Pent) cross-links, respectively. The study demonstrated that FTIR microspectroscopy is a feasible method for investigating cross-link concentrations in articular cartilage.
Analysis of Flavonoid in Medicinal Plant Extract Using Infrared Spectroscopy and Chemometrics
Retnaningtyas, Yuni; Nuri; Lukman, Hilmia
2016-01-01
Infrared (IR) spectroscopy combined with chemometrics has been developed for simple analysis of flavonoid in medicinal plant extracts. Flavonoid was extracted from medicinal plant leaves by ultrasonication and maceration. IR spectra of selected medicinal plant extracts were correlated with flavonoid content using chemometrics. The chemometric method used for calibration analysis was Partial Least Squares (PLS), and the methods used for classification analysis were Linear Discriminant Analysis (LDA), Soft Independent Modelling of Class Analogies (SIMCA), and Support Vector Machines (SVM). In this study, the NIR model gave the best calibration, with R² and RMSEC values of 0.9916499 and 2.1521897, respectively, while the accuracy of all classification models (LDA, SIMCA, and SVM) was 100%. The R² and RMSEC of the FTIR calibration model were 0.8653689 and 8.8958149, respectively, while the accuracy of LDA, SIMCA, and SVM was 86.0%, 91.2%, and 77.3%, respectively. The PLS and LDA NIR models were further used to predict unknown flavonoid content in commercial samples. Using these models, the significance of the difference between flavonoid content measured by NIR and by UV-Vis spectrophotometry was evaluated with a paired-samples t-test. The two methods gave no significant difference in measured flavonoid content. PMID:27529051
Maltarollo, Vinícius G; Homem-de-Mello, Paula; Honorio, Káthia M
2011-10-01
Current research on treatments for metabolic diseases involves a class of biological receptors called peroxisome proliferator-activated receptors (PPARs), which control the metabolism of carbohydrates and lipids. A subclass of these receptors, PPARδ, regulates several metabolic processes, and the substances that activate them are being studied as new drug candidates for the treatment of diabetes mellitus and metabolic syndrome. In this study, several PPARδ agonists with experimental biological activity were selected for a structural and chemical study. Electronic, stereochemical, lipophilic and topological descriptors were calculated for the selected compounds using various theoretical methods, such as density functional theory (DFT). Fisher's weight and principal components analysis (PCA) methods were employed to select the most relevant variables for this study. The partial least squares (PLS) method was used to construct the multivariate statistical model, and the best model obtained had 4 PCs, q² = 0.80 and r² = 0.90, indicating good internal consistency. The prediction residuals calculated for the compounds in the test set had low values, indicating the good predictive capability of our PLS model. The model obtained in this study is reliable and can be used to predict the biological activity of new untested compounds. Docking studies have also confirmed the importance of the molecular descriptors selected for this system.
Fernández-Novales, Juan; López, María-Isabel; González-Caballero, Virginia; Ramírez, Pilar; Sánchez, María-Teresa
2011-06-01
Volumic mass, a key component of must quality control tests during alcoholic fermentation, is of great interest to the winemaking industry. Transmittance near-infrared (NIR) spectra of 124 must samples over the range of 200-1,100 nm were obtained using a miniature spectrometer. The performance of this instrument to predict volumic mass was evaluated using partial least squares (PLS) regression and multiple linear regression (MLR). The validation statistics, coefficient of determination (r²) and standard error of prediction (SEP), were r² = 0.98 (n = 31) and SEP = 5.85 g/dm³ for PLS, and r² = 0.96 (n = 31) and SEP = 7.49 g/dm³ for MLR, for equations developed to fit reference data for volumic mass and spectral data. Comparison of results from MLR and PLS demonstrates that an MLR model with six significant wavelengths (P < 0.05) fit volumic mass data to transmittance (1/T) data slightly worse than a more sophisticated PLS model using the full scanning range. The results suggest that NIR spectroscopy is a suitable technique for predicting volumic mass during alcoholic fermentation, and that a low-cost NIR instrument can be used for this purpose.
In Vivo and Ex Vivo Transcutaneous Glucose Detection Using Surface-Enhanced Raman Spectroscopy
NASA Astrophysics Data System (ADS)
Ma, Ke
Diabetes mellitus is widely acknowledged as a large and growing health concern. The lack of practical methods for continuously monitoring glucose levels causes significant difficulties in successful diabetes management. Extensive validation work has been carried out using surface-enhanced Raman spectroscopy (SERS) for in vivo glucose sensing. This dissertation details progress made towards a Raman-based glucose sensor for in vivo, transcutaneous glucose detection. The first presented study combines spatially offset Raman spectroscopy (SORS) with SERS (SESORS) to explore the possibility of in vivo, transcutaneous glucose sensing. A SERS-based glucose sensor was implanted subcutaneously in Sprague-Dawley rats. SERS spectra were acquired transcutaneously and analyzed using partial least-squares (PLS). Highly accurate and consistent results were obtained, especially in the hypoglycemic range. Additionally, the sensor demonstrated functionality at least 17 days after implantation. A subsequent study further extends the application of SESORS to the possibility of in vivo detection of glucose in the brain through the skull. Specifically, SERS nanoantennas were buried in ovine tissue behind a bone 8 mm thick and detected by using SESORS. In addition, quantitative detection through bones by using SESORS was also demonstrated. A device that could measure glucose continuously as well as noninvasively would be of great use to patients with diabetes. The inherent limitation of the SESORS approach may prevent this technique from becoming a noninvasive method. Therefore, the prospect of using normal Raman spectroscopy for glucose detection was re-examined. Quantitative detection of glucose and lactate in the clinically relevant range was demonstrated by using normal Raman spectroscopy with low power and short acquisition time.
Finally, a nonlinear calibration method called least-squares support vector machine regression (LS-SVR) was investigated for analyzing spectroscopic data sets of glucose detection. Comparison studies were demonstrated between LS-SVR and PLS. LS-SVR demonstrated significant improvements in accuracy over PLS for glucose detection, especially when a global calibration model was required. The improvements imparted by LS-SVR open up the possibility of developing an accurate prediction algorithm for Raman-based glucose sensing applicable to a large human population. Overall, these studies show the high promise held by the Raman-based sensor for the challenge of optimal glycemic control.
NASA Astrophysics Data System (ADS)
Dyar, M. D.; Carmosino, M. L.; Breves, E. A.; Ozanne, M. V.; Clegg, S. M.; Wiens, R. C.
2012-04-01
A remote laser-induced breakdown spectrometer (LIBS) designed to simulate the ChemCam instrument on the Mars Science Laboratory Rover Curiosity was used to probe 100 geologic samples at a 9-m standoff distance. ChemCam consists of an integrated remote LIBS instrument that will probe samples up to 7 m from the mast of the rover and a remote micro-imager (RMI) that will record context images. The elemental compositions of 100 igneous and highly-metamorphosed rocks are determined with LIBS using three variations of multivariate analysis, with a goal of improving the analytical accuracy. Two forms of partial least squares (PLS) regression are employed with finely-tuned parameters: PLS-1 regresses a single response variable (elemental concentration) against the observation variables (spectra, or intensity at each of 6144 spectrometer channels), while PLS-2 simultaneously regresses multiple response variables (concentrations of the ten major elements in rocks) against the observation predictor variables, taking advantage of natural correlations between elements. Those results are contrasted with those from the multivariate regression technique of the least absolute shrinkage and selection operator (lasso), which is a penalized shrunken regression method that selects the specific channels for each element that explain the most variance in the concentration of that element. To make this comparison, we use results of cross-validation and of held-out testing, and employ unscaled and uncentered spectral intensity data because all of the input variables are already in the same units. Results demonstrate that the lasso, PLS-1, and PLS-2 all yield comparable results in terms of accuracy for this dataset. However, the interpretability of these methods differs greatly in terms of fundamental understanding of LIBS emissions. 
PLS techniques generate principal components, linear combinations of intensities at any number of spectrometer channels, which explain as much variance in the response variables as possible while avoiding multicollinearity between principal components. When the selected number of principal components is projected back into the original feature space of the spectra, 6144 correlation coefficients are generated, a small fraction of which are mathematically significant to the regression. In contrast, the lasso models require only a small number (< 24) of non-zero correlation coefficients (β values) to determine the concentration of each of the ten major elements. Causality between the positively-correlated emission lines chosen by the lasso and the elemental concentration was examined. In general, the higher the lasso coefficient (β), the greater the likelihood that the selected line results from an emission of that element. Emission lines with negative β values should arise from elements that are anti-correlated with the element being predicted. For elements except Fe, Al, Ti, and P, the lasso-selected wavelength with the highest β value corresponds to the element being predicted, e.g. 559.8 nm for neutral Ca. However, the specific lines chosen by the lasso with positive β values are not always those from the element being predicted. Other wavelengths and the elements that most strongly correlate with them to predict concentration are obviously related to known geochemical correlations or close overlap of emission lines, while others must result from matrix effects. Use of the lasso technique thus directly informs our understanding of the underlying physical processes that give rise to LIBS emissions by determining which lines can best represent concentration, and which lines from other elements are causing matrix effects.
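The sparsity property described above, where the lasso keeps only a few non-zero coefficients, can be sketched with a small coordinate-descent solver. This is an illustrative sketch only, not the implementation used in the study: it assumes centered data, no intercept, and the objective (1/2n)·||y − Xβ||² + α·||β||₁, and the function names and the penalty value are hypothetical.

```python
def soft_threshold(rho, alpha):
    """Shrink rho toward zero by alpha; values within [-alpha, alpha] become 0."""
    if rho > alpha:
        return rho - alpha
    if rho < -alpha:
        return rho + alpha
    return 0.0


def lasso_cd(X, y, alpha, n_sweeps=100):
    """Lasso by cyclic coordinate descent on centered data (no intercept)."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    # Mean squared value of each column (channel).
    z = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            # Correlation of channel j with the residual that ignores channel j.
            rho = 0.0
            for i in range(n):
                partial = sum(X[i][k] * beta[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - partial)
            rho /= n
            beta[j] = soft_threshold(rho, alpha) / z[j] if z[j] > 0 else 0.0
    return beta
```

In this formulation a channel whose correlation with the residual stays below the penalty α receives a coefficient of exactly zero, which is why the fitted model can be read as a short list of selected channels. A PLS model, by contrast, assigns a coefficient to every channel.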
Rahman, Anisur; Faqeerzada, Mohammad A; Cho, Byoung-Kwan
2018-03-14
Allicin and soluble solid content (SSC) in garlic are responsible for its pungent flavor and odor. However, current conventional methods such as the use of high-pressure liquid chromatography and a refractometer have critical drawbacks in that they are time-consuming, labor-intensive and destructive procedures. The present study aimed to predict allicin and SSC in garlic using hyperspectral imaging in combination with variable selection algorithms and calibration models. Hyperspectral images of 100 garlic cloves were acquired that covered two spectral ranges, from which the mean spectra of each clove were extracted. The calibration models included partial least squares (PLS) and least squares-support vector machine (LS-SVM) regression, as well as different spectral pre-processing techniques, from which the highest performing spectral preprocessing technique and spectral range were selected. Then, variable selection methods, such as regression coefficients, variable importance in projection (VIP) and the successive projections algorithm (SPA), were evaluated for the selection of effective wavelengths (EWs). Furthermore, PLS and LS-SVM regression methods were applied to quantitatively predict the quality attributes of garlic using the selected EWs. Of the established models, the SPA-LS-SVM model obtained an R²pred of 0.90 and standard error of prediction (SEP) of 1.01% for SSC prediction, whereas the VIP-LS-SVM model produced the best result with an R²pred of 0.83 and SEP of 0.19 mg g⁻¹ for allicin prediction in the range 1000-1700 nm. Furthermore, chemical images of garlic were developed using the best predictive model to facilitate visualization of the spatial distributions of allicin and SSC. The present study clearly demonstrates that hyperspectral imaging combined with an appropriate chemometrics method can potentially be employed as a fast, non-invasive method to predict the allicin and SSC in garlic. © 2018 Society of Chemical Industry.
[Research on Identification and Determination of Pesticides in Apples Using Raman Spectroscopy].
Zhai, Chen; Peng, Yan-kun; Li, Yong-yu; Dhakal, Sagar; Xu, Tian-feng; Guo, Lang-hua
2015-08-01
Raman spectroscopy combined with chemometric methods is considered an efficient method for the identification and determination of pesticide residues in fruits and vegetables. In the present research, a rapid and nondestructive method based on a self-developed Raman system was proposed and tested for the identification and determination of deltamethrin and acetamiprid residues in apples. The Raman peaks at 574 and 843 cm⁻¹ can be used to identify deltamethrin and acetamiprid, respectively; the characteristic peaks of deltamethrin and acetamiprid were still visible when the concentrations of the two pesticides in apple samples were 0.78 and 0.15 mg·kg⁻¹, respectively. Calibration models of pesticide content were developed by the partial least squares (PLS) algorithm with different spectral pretreatment methods (Savitzky-Golay smoothing, first derivative transformation, second derivative transformation, baseline calibration, standard normal variable transformation). Baseline calibration by 8th-order polynomial fitting gave the best results. For deltamethrin, the prediction coefficient (Rp) between the PLS model predictions and the gas chromatography measurements was 0.94, and the root mean square error of prediction (RMSEP) was 0.55 mg·kg⁻¹. For acetamiprid, Rp and RMSEP were 0.85 and 0.12 mg·kg⁻¹, respectively. Given this detection performance, applying Raman technology to the nondestructive determination of pesticide residues in apples is feasible. Because it needs no pretreatment before spectra collection and causes no damage to the sample, this technology can be used in inspection departments, fruit and vegetable processing enterprises, supermarkets, and vegetable markets. The results of this research are promising for the development of industrially feasible technology for rapid, nondestructive, real-time detection of different types of pesticides and their concentrations in apples.
This supplies a rapid nondestructive and environmentally friendly way for the determination of fruit and vegetable quality and safety.
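The two figures of merit quoted in this abstract can be computed as in the sketch below. It assumes that Rp is the Pearson correlation between predicted and reference values and that RMSEP is the plain root mean square of the prediction errors. Some authors use bias-corrected or n − 1 variants, so reported values may differ slightly. The function names are hypothetical.

```python
import math


def rmsep(reference, predicted):
    """Root mean square error of prediction over a validation set."""
    n = len(reference)
    return math.sqrt(sum((p - r) ** 2 for r, p in zip(reference, predicted)) / n)


def pearson_r(reference, predicted):
    """Pearson correlation coefficient between reference and predicted values."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_p = sum(predicted) / n
    cov = sum((r - mean_r) * (p - mean_p) for r, p in zip(reference, predicted))
    ss_r = sum((r - mean_r) ** 2 for r in reference)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(ss_r * ss_p)
```

RMSEP carries the units of the measured quantity (here mg·kg⁻¹), whereas the correlation coefficient is dimensionless, so the two are usually reported together.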
PREDICTION OF MOLECULAR PROPERTIES WITH MID-INFRARED SPECTRA AND INTERFEROGRAMS
We have built infrared spectroscopy-based partial least squares (PLS) models for molecular polarizabilities using a 97 member training set and a 59 member independent prediction set. These 156 compounds span a very wide range of chemical structure. Our goal was to use this well...
García, M D Gil; Culzoni, M J; De Zan, M M; Valverde, R Santiago; Galera, M Martínez; Goicoechea, H C
2008-02-01
A powerful new algorithm (unfolded-partial least squares followed by residual bilinearization (U-PLS/RBL)) was applied for the first time to second-order liquid chromatography with diode array detection (LC-DAD) data and compared with a well-established method (multivariate curve resolution-alternating least squares (MCR-ALS)) for the simultaneous determination of eight tetracyclines (tetracycline, oxytetracycline, meclocycline, minocycline, metacycline, chlortetracycline, demeclocycline and doxycycline) in wastewaters. Tetracyclines were pre-concentrated using Oasis Max C18 cartridges and then separated on a Thermo Aquasil C18 (150 mm x 4.6 mm, 5 µm) column. The whole method was validated using Milli-Q water samples, and both univariate and multivariate analytical figures of merit were obtained. Additionally, two data pre-treatments were applied (baseline correction and piecewise direct standardization), which made it possible to correct the effect of breakthrough and to reduce the total interferences retained after pre-concentration of wastewaters. The results showed that the eight tetracycline antibiotics can be successfully determined in wastewaters, the drawbacks due to matrix interferences being adequately handled and overcome by using U-PLS/RBL.
Castritius, Stefan; Kron, Alexander; Schäfer, Thomas; Rädle, Matthias; Harms, Diedrich
2010-12-22
A new approach of combination of near-infrared (NIR) spectroscopy and refractometry was developed in this work to determine the concentration of alcohol and real extract in various beer samples. A partial least-squares (PLS) regression, as multivariate calibration method, was used to evaluate the correlation between the data of spectroscopy/refractometry and alcohol/extract concentration. This multivariate combination of spectroscopy and refractometry enhanced the precision in the determination of alcohol, compared to single spectroscopy measurements, due to the effect of high extract concentration on the spectral data, especially of nonalcoholic beer samples. For NIR calibration, two mathematical pretreatments (first-order derivation and linear baseline correction) were applied to eliminate light scattering effects. A sample grouping of the refractometry data was also applied to increase the accuracy of the determined concentration. The root mean squared errors of validation (RMSEV) of the validation process concerning alcohol and extract concentration were 0.23 Mas% (method A), 0.12 Mas% (method B), and 0.19 Mas% (method C) and 0.11 Mas% (method A), 0.11 Mas% (method B), and 0.11 Mas% (method C), respectively.
Roberts, D K; Winters, J E; Castells, D D; Clark, C A; Teitelbaum, B A
2001-01-01
To investigate pigmented striae of the anterior lens capsule in African-Americans, a potential indicator of significant anterior segment pigment dispersion. A group of 40 African-American subjects who exhibited pigmented lens striae (PLS) were identified from a non-referred, primary eye care population in Chicago, IL, USA. These subjects were then compared to an age, race, and gender matched control group relative to refractive error and the presence or absence of diabetes and hypertension. The PLS subjects (mean age = 65.4 +/- 8.8 years, range = 50-87 years) consisted of 36 females and 4 males. PLS were bilateral in 36 (85%) of the 40 subjects. Among the eyes with PLS, 21 (55%) of 38 right eyes and 22 (61%) of 36 left eyes also had significant corneal endothelial pigment dusting, commonly in the shape of a Krukenberg's spindle. Ten (25%) of the PLS subjects had either glaucoma or ocular hypertension (7 bilateral, 3 unilateral). The presence of trabecular meshwork pigment varied from minimal to heavy. The mean +/- SD (range) refractive error of the PLS right eyes was +1.61 +/- 1.43D (-1.50 to +5.00D) and +1.77 +/- 1.37D (-1.00 to +5.00D) for the left eyes. Based on these data, the PLS right eyes were +1.63D (Student's t, p = 0.0001; 95% CI = +0.82 to +2.44D) more hyperopic on average than the control right eyes, and the PLS left eyes were +1.77D (p = 0.0001; 95% CI = +0.92 to +2.63D) more hyperopic on average than the control left eyes. Trend analysis showed a gradually increasing likelihood of PLS with increasing magnitude of hyperopia in both eyes (Mantel-Haenszel chi-square, p = 0.001). Among PLS subjects, 24 (60%) of 40 were hypertensive and 9 (23%) of 40 were diabetic. However, these proportions were not significantly different (two-tailed Fisher's exact test; hypertension: p = 0.30; diabetes: p = 0.70) from the randomly selected controls. 
Among our African-American group, which consisted predominately of females >50 years of age, the likelihood of PLS increased with increasing hyperopic refractive error. This finding is consistent with the possibility that PLS may, in some circumstances, indicate a significant pigment dispersal process due to iris-lens rubbing that may be associated with crowding of anterior segment structures. Additional study is warranted to further assess the nature of PLS, their precise relationship with an age-related pigment dispersal process, and their true significance as a risk factor for development of glaucoma.
Determination of elemental composition of shale rocks by laser induced breakdown spectroscopy
NASA Astrophysics Data System (ADS)
Sanghapi, Hervé K.; Jain, Jinesh; Bol'shakov, Alexander; Lopano, Christina; McIntyre, Dustin; Russo, Richard
2016-08-01
In this study laser induced breakdown spectroscopy (LIBS) is used for elemental characterization of outcrop samples from the Marcellus Shale. Powdered samples were pressed to form pellets and used for LIBS analysis. Partial least squares regression (PLS-R) and univariate calibration curves were used for quantification of analytes. The matrix effect is substantially reduced using the partial least squares calibration method. Predicted results with LIBS are compared to ICP-OES results for Si, Al, Ti, Mg, and Ca. As for C, its results are compared to those obtained by a carbon analyzer. Relative errors of the LIBS measurements are in the range of 1.7 to 12.6%. The limits of detection (LODs) obtained for Si, Al, Ti, Mg and Ca are 60.9, 33.0, 15.6, 4.2 and 0.03 ppm, respectively. An LOD of 0.4 wt.% was obtained for carbon. This study shows that the LIBS method can provide a rapid analysis of shale samples and can potentially benefit depleted gas shale carbon storage research.
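As a rough illustration of the partial least squares calibration referred to above, the following sketch implements single-response PLS (PLS-1) with the NIPALS deflation scheme. It is a generic textbook formulation under stated assumptions (mean-centered data, no scaling), not the software or settings used in the study, and all names are hypothetical.

```python
import math


def pls1_fit(X, y, n_components):
    """Fit PLS-1 by NIPALS; returns (x_mean, y_mean, list of (w, p, q))."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    components = []
    for _ in range(n_components):
        # Weight vector: direction in channel space most covariant with y.
        w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left in y to explain
            break
        w = [v / norm for v in w]
        # Scores, X loadings and y loading for this component.
        t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        load = [sum(Xc[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(yc[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next component.
        Xc = [[Xc[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        yc = [yc[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict one sample by replaying the deflation steps on it."""
    x_mean, y_mean, components = model
    r = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in components:
        t = sum(rj * wj for rj, wj in zip(r, w))
        y_hat += q * t
        r = [rj - t * lj for rj, lj in zip(r, load)]
    return y_hat
```

Each component is a weighted combination of all spectral channels, so every channel contributes to the prediction. A univariate calibration curve, by contrast, relies on a single emission line.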
Multivariate classification of the infrared spectra of cell and tissue samples
DOE Office of Scientific and Technical Information (OSTI.GOV)
Haaland, D.M.; Jones, H.D.; Thomas, E.V.
1997-03-01
Infrared microspectroscopy of biopsied canine lymph cells and tissue was performed to investigate the possibility of using IR spectra coupled with multivariate classification methods to classify the samples as normal, hyperplastic, or neoplastic (malignant). IR spectra were obtained in transmission mode through BaF₂ windows and in reflection mode from samples prepared on gold-coated microscope slides. Cytology and histopathology samples were prepared by a variety of methods to identify the optimal methods of sample preparation. Cytospinning procedures that yielded a monolayer of cells on the BaF₂ windows produced a limited set of IR transmission spectra. These transmission spectra were converted to absorbance and formed the basis for a classification rule that yielded 100% correct classification in a cross-validated context. Classifications of normal, hyperplastic, and neoplastic cell sample spectra were achieved by using both partial least-squares (PLS) and principal component regression (PCR) classification methods. Linear discriminant analysis applied to principal components obtained from the spectral data yielded a small number of misclassifications. PLS weight loading vectors yield valuable qualitative insight into the molecular changes that are responsible for the success of the infrared classification. These successful classification results show promise for assisting pathologists in the diagnosis of cell types and offer future potential for in vivo IR detection of some types of cancer. © 1997 Society for Applied Spectroscopy
Identification of Coffee Varieties Using Laser-Induced Breakdown Spectroscopy and Chemometrics.
Zhang, Chu; Shen, Tingting; Liu, Fei; He, Yong
2017-12-31
We linked coffee quality to its different varieties. This is of interest because the identification of coffee varieties should help coffee trading and consumption. Laser-induced breakdown spectroscopy (LIBS) combined with chemometric methods was used to identify coffee varieties. Wavelet transform (WT) was used to reduce LIBS spectra noise. Partial least squares-discriminant analysis (PLS-DA), radial basis function neural network (RBFNN), and support vector machine (SVM) were used to build classification models. Loadings of principal component analysis (PCA) were used to select the spectral variables contributing most to the identification of coffee varieties. Twenty wavelength variables corresponding to C I, Mg I, Mg II, Al II, CN, H, Ca II, Fe I, K I, Na I, N I, and O I were selected. PLS-DA, RBFNN, and SVM models on selected wavelength variables showed acceptable results. SVM and RBFNN models performed better with a classification accuracy of over 80% in the prediction set, for both full spectra and the selected variables. The overall results indicated that it was feasible to use LIBS and chemometric methods to identify coffee varieties. For further studies, more samples are needed to produce robust classification models, research should be conducted on which methods to use to select spectral peaks that correspond to the elements contributing most to identification, and the methods for acquiring stable spectra should also be studied.
Identification of Coffee Varieties Using Laser-Induced Breakdown Spectroscopy and Chemometrics
Zhang, Chu; Shen, Tingting
2017-01-01
We linked coffee quality to its different varieties. This is of interest because the identification of coffee varieties should help coffee trading and consumption. Laser-induced breakdown spectroscopy (LIBS) combined with chemometric methods was used to identify coffee varieties. Wavelet transform (WT) was used to reduce LIBS spectra noise. Partial least squares-discriminant analysis (PLS-DA), radial basis function neural network (RBFNN), and support vector machine (SVM) were used to build classification models. Loadings of principal component analysis (PCA) were used to select the spectral variables contributing most to the identification of coffee varieties. Twenty wavelength variables corresponding to C I, Mg I, Mg II, Al II, CN, H, Ca II, Fe I, K I, Na I, N I, and O I were selected. PLS-DA, RBFNN, and SVM models on selected wavelength variables showed acceptable results. SVM and RBFNN models performed better with a classification accuracy of over 80% in the prediction set, for both full spectra and the selected variables. The overall results indicated that it was feasible to use LIBS and chemometric methods to identify coffee varieties. For further studies, more samples are needed to produce robust classification models, research should be conducted on which methods to use to select spectral peaks that correspond to the elements contributing most to identification, and the methods for acquiring stable spectra should also be studied. PMID:29301228
NASA Astrophysics Data System (ADS)
Boucher, Thomas F.; Ozanne, Marie V.; Carmosino, Marco L.; Dyar, M. Darby; Mahadevan, Sridhar; Breves, Elly A.; Lepore, Kate H.; Clegg, Samuel M.
2015-05-01
The ChemCam instrument on the Mars Curiosity rover is generating thousands of LIBS spectra and bringing interest in this technique to public attention. The key to interpreting Mars or any other types of LIBS data are calibrations that relate laboratory standards to unknowns examined in other settings and enable predictions of chemical composition. Here, LIBS spectral data are analyzed using linear regression methods including partial least squares (PLS-1 and PLS-2), principal component regression (PCR), least absolute shrinkage and selection operator (lasso), elastic net, and linear support vector regression (SVR-Lin). These were compared against results from nonlinear regression methods including kernel principal component regression (K-PCR), polynomial kernel support vector regression (SVR-Py) and k-nearest neighbor (kNN) regression to discern the most effective models for interpreting chemical abundances from LIBS spectra of geological samples. The results were evaluated for 100 samples analyzed with 50 laser pulses at each of five locations averaged together. Wilcoxon signed-rank tests were employed to evaluate the statistical significance of differences among the nine models using their predicted residual sum of squares (PRESS) to make comparisons. For MgO, SiO2, Fe2O3, CaO, and MnO, the sparse models outperform all the others except for linear SVR, while for Na2O, K2O, TiO2, and P2O5, the sparse methods produce inferior results, likely because their emission lines in this energy range have lower transition probabilities. The strong performance of the sparse methods in this study suggests that use of dimensionality-reduction techniques as a preprocessing step may improve the performance of the linear models. Nonlinear methods tend to overfit the data and predict less accurately, while the linear methods proved to be more generalizable with better predictive performance. 
These results are attributed to the high dimensionality of the data (6144 channels) relative to the small number of samples studied. The best-performing models were SVR-Lin for SiO2, MgO, Fe2O3, and Na2O, lasso for Al2O3, elastic net for MnO, and PLS-1 for CaO, TiO2, and K2O. Although these differences in model performance between methods were identified, most of the models produce comparable results when p ≤ 0.05 and all techniques except kNN produced statistically-indistinguishable results. It is likely that a combination of models could be used together to yield a lower total error of prediction, depending on the requirements of the user.
Fernández de la Ossa, Mª Ángeles; Amigo, José Manuel; García-Ruiz, Carmen
2014-09-01
In this study near infrared hyperspectral imaging (NIR-HSI) is used to provide a fast, non-contact, non-invasive and non-destructive method for the analysis of explosive residues on human handprints. For this purpose, classical explosives potentially used as part of improvised explosive devices (IEDs), such as ammonium nitrate, blackpowder, single- and double-base smokeless gunpowders and dynamite, were studied. Volunteers handled each of these explosives individually and then deposited their handprints on plastic sheets. A partial least squares discriminant analysis (PLS-DA) model was built to detect and classify the presence of explosive residues in handprints. High levels of sensitivity and specificity were obtained for the PLS-DA classification model created to identify ammonium nitrate, blackpowder, single- and double-base smokeless gunpowders and dynamite residues, allowing the development of a preliminary library and facilitating the direct and in situ detection of explosives by NIR-HSI. Consequently, this technique is shown to be a promising forensic tool for the detection of explosive residues and other related samples. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Pérez-Castaño, Estefanía; Sánchez-Viñas, Mercedes; Gázquez-Evangelista, Domingo; Bagur-González, M Gracia
2018-01-15
This paper describes and discusses the application of chromatographic fingerprints of trimethylsilyl (TMS)-4,4'-desmethylsterol derivatives (obtained from an off-line HPLC-GC-FID system) for the quantification of extra virgin olive oil in commercial vinaigrettes, salad dressings and in-house reference materials (i-HRM) using two different Partial Least Squares Regression (PLS-R) multivariate quantification methods. Different data pre-processing strategies were tested, the complete sequence being: (i) internal normalization; (ii) sampling based on the Nyquist theorem; (iii) interval correlation optimized shifting (icoshift); (iv) baseline correction; (v) mean centering; and (vi) zone selection. The first model corresponds to a matrix of dimensions 'n×911' variables and the second one to a matrix of dimensions 'n×431' variables. It should be highlighted that the two proposed PLS-R models allow the quantification of extra virgin olive oil in binary blends, foodstuffs, etc., when the percentage present is greater than 25%. Copyright © 2017 Elsevier Ltd. All rights reserved.
Visible micro-Raman spectroscopy for determining glucose content in beverage industry.
Delfino, I; Camerlingo, C; Portaccio, M; Ventura, B Della; Mita, L; Mita, D G; Lepore, M
2011-07-15
The potential of Raman spectroscopy with excitation in the visible as a tool for quantitative determination of single components in food industry products was investigated by focusing on glucose content in commercial sport drinks. To this end, micro-Raman spectra in the 600-1600 cm⁻¹ wavenumber shift region of four sport drinks were recorded, showing well defined and separated vibrational fingerprints of the various contained sugars (glucose, fructose and sucrose). By exploiting the spectral separation of some characteristic peaks, glucose content was quantified by using a multivariate statistical analysis based on the interval Partial Least Square (iPLS) approach. The iPLS model needed for the data analysis procedure was built by using glucose aqueous solutions at known sugar concentrations as calibration data. This model was then applied to sport drink spectra and gave predicted glucose concentrations in good agreement with the values obtained by using a biochemical assay. These results represent a significant step towards the development of a fast and simple method for the on-line glucose quantification in products of the food and beverage industry. Copyright © 2011 Elsevier Ltd. All rights reserved.
Sandoval, S; Torres, A; Pawlowsky-Reusing, E; Riechel, M; Caradot, N
2013-01-01
The present study aims to explore the relationship between rainfall variables and water quality/quantity characteristics of combined sewer overflows (CSOs), by the use of multivariate statistical methods and online measurements at a principal CSO outlet in Berlin (Germany). Canonical correlation results showed that the maximum and average rainfall intensities are the most influential variables to describe CSO water quantity and pollutant loads whereas the duration of the rainfall event and the rain depth seem to be the most influential variables to describe CSO pollutant concentrations. The analysis of partial least squares (PLS) regression models confirms the findings of the canonical correlation and highlights three main influences of rainfall on CSO characteristics: (i) CSO water quantity characteristics are mainly influenced by the maximal rainfall intensities, (ii) CSO pollutant concentrations were found to be mostly associated with duration of the rainfall and (iii) pollutant loads seemed to be principally influenced by dry weather duration before the rainfall event. The prediction quality of PLS models is rather low (R² < 0.6) but results can be useful to explore qualitatively the influence of rainfall on CSO characteristics.
Visible/near-infrared spectroscopy to predict water holding capacity in broiler breast meat
USDA-ARS's Scientific Manuscript database
Visible/Near-infrared spectroscopy (Vis/NIRS) was examined as a tool for rapidly determining water holding capacity (WHC) in broiler breast meat. Both partial least squares (PLS) and principal component analysis (PCA) models were developed to relate Vis/NIRS spectra of 85 broiler breast meat sample...
USDA-ARS's Scientific Manuscript database
Hyperspectral scattering is a promising technique for rapid and noninvasive measurement of multiple quality attributes of apple fruit. A hierarchical evolutionary algorithm (HEA) approach, in combination with subspace decomposition and partial least squares (PLS) regression, was proposed to select o...
Organizational Commitment, Knowledge Management Interventions, and Learning Organization Capacity
ERIC Educational Resources Information Center
Massingham, Peter; Diment, Kieren
2009-01-01
Purpose: The purpose of this paper is to examine the relationship between organizational commitment and knowledge management initiatives in developing learning organization capacity (LOC). Design/methodology/approach: This is an empirical study based on a single case study, using partial least squares (PLS) analysis. Findings: The strategic…
Kauppinen, Ari; Toiviainen, Maunu; Korhonen, Ossi; Aaltonen, Jaakko; Järvinen, Kristiina; Paaso, Janne; Juuti, Mikko; Ketolainen, Jarkko
2013-02-19
During the past decade, near-infrared (NIR) spectroscopy has been applied for in-line moisture content quantification during a freeze-drying process. However, NIR has been used as a single-vial technique and thus is not representative of the entire batch. This has been considered as one of the main barriers for NIR spectroscopy becoming widely used in process analytical technology (PAT) for freeze-drying. Clearly it would be essential to monitor samples that reliably represent the whole batch. The present study evaluated multipoint NIR spectroscopy for in-line moisture content quantification during a freeze-drying process. Aqueous sucrose solutions were used as model formulations. NIR data was calibrated to predict the moisture content using partial least-squares (PLS) regression with Karl Fischer titration being used as a reference method. PLS calibrations resulted in root-mean-square error of prediction (RMSEP) values lower than 0.13%. Three noncontact, diffuse reflectance NIR probe heads were positioned on the freeze-dryer shelf to measure the moisture content in a noninvasive manner, through the side of the glass vials. The results showed that the detection of unequal sublimation rates within a freeze-dryer shelf was possible with the multipoint NIR system in use. Furthermore, in-line moisture content quantification was reliable especially toward the end of the process. These findings indicate that the use of multipoint NIR spectroscopy can achieve representative quantification of moisture content and hence a drying end point determination to a desired residual moisture level.
Antoniou, Constantinos G; Markopoulou, Catherine K; Kouskoura, Maria G; Koundourellis, John E
2011-01-01
Different HPLC chromatographic systems were investigated on a C18 ACE 5 μm, 150 x 4.6 mm id column for the determination of tymazoline, tramazoline, and antazoline, with either naphazoline or xylometazoline, in commercial preparations. For the development and optimization of the systems, a Response Surface Method (r=0.925-0.980) was used to illustrate the changes in k as a function of pH values and different salt concentrations. The simultaneous separation of 2-imidazolines was accomplished at 40 °C with 0.01 M ammonium acetate-methanol (50+50, v/v, pH 6.0) mobile phase at a flow rate of 1.2 mL/min. In order to deal with the usual coexistence of 2-imidazolines with benzethonium and benzalkonium chloride preservatives, it was necessary to use another chromatographic system, 0.01 M ammonium acetate-methanol (50+50, v/v) mobile phase on a cyano ACE 5 μm, 150 x 4.6 mm id column. As part of a more thorough theoretical investigation, a partial least-squares (PLS) technique was used for modeling the RP-HPLC retention data. The model used molecular structure descriptors of the analytes as the X variables and their retention time (Log K) as the Y variable. The goodness of fit was estimated by the PLS correlation coefficient (r²) and root mean square error of estimation values, which were 0.994 and 0.0479, respectively.
Wang, Chang; Huang, Chichao; Qian, Jian; Xiao, Jian; Li, Huan; Wen, Yongli; He, Xinhua; Ran, Wei; Shen, Qirong; Yu, Guanghui
2014-01-01
The composting industry has been growing rapidly in China because of a boom in the animal industry. Therefore, a rapid and accurate assessment of the quality of commercial organic fertilizers is of the utmost importance. In this study, a novel technique that combines near infrared (NIR) spectroscopy with partial least squares (PLS) analysis is developed for rapidly and accurately assessing commercial organic fertilizers quality. A total of 104 commercial organic fertilizers were collected from full-scale compost factories in Jiangsu Province, east China. In general, the NIR-PLS technique showed accurate predictions of the total organic matter, water soluble organic nitrogen, pH, and germination index; less accurate results of the moisture, total nitrogen, and electrical conductivity; and the least accurate results for water soluble organic carbon. Our results suggested the combined NIR-PLS technique could be applied as a valuable tool to rapidly and accurately assess the quality of commercial organic fertilizers. PMID:24586313
NASA Astrophysics Data System (ADS)
Barbeira, Paulo J. S.; Paganotti, Rosilene S. N.; Ássimos, Ariane A.
2013-10-01
This study aimed to determine the dry extract content of commercial alcoholic extracts of bee propolis through Partial Least Squares (PLS) multivariate calibration and electronic spectroscopy. The PLS model provided a good prediction of dry extract content in commercial alcoholic extracts of bee propolis in the range of 2.7 to 16.8% (m/v), with the advantage of being less laborious and faster than the traditional gravimetric methodology. The PLS model was optimized with outlier detection tests according to ASTM E 1655-05. In this study it was possible to verify that a centrifugation stage is extremely important in order to avoid the presence of waxes, resulting in a more accurate model. Around 50% of the analyzed samples presented a dry extract content lower than the value established by Brazilian legislation; in most cases, the values found differed from those claimed on the product label.
Durakli Velioglu, Serap; Ercioglu, Elif; Boyaci, Ismail Hakki
2017-05-01
This research paper describes the potential of synchronous fluorescence (SF) spectroscopy for authentication of buffalo milk, a favourable raw material in the production of some premium dairy products. Buffalo milk is subjected to fraudulent activities like many other high priced foodstuffs. The current methods widely used for the detection of adulteration of buffalo milk have various disadvantages making them unattractive for routine analysis. Thus, the aim of the present study was to assess the potential of SF spectroscopy in combination with multivariate methods for rapid discrimination between buffalo and cow milk and detection of the adulteration of buffalo milk with cow milk. SF spectra of cow and buffalo milk samples were recorded between 400-550 nm excitation range with Δλ of 10-100 nm, in steps of 10 nm. The data obtained for ∆λ = 10 nm were utilised to classify the samples using principal component analysis (PCA), and detect the adulteration level of buffalo milk with cow milk using partial least square (PLS) methods. Successful discrimination of samples and detection of adulteration of buffalo milk with limit of detection value (LOD) of 6% are achieved with the models having root mean square error of calibration (RMSEC) and the root mean square error of cross-validation (RMSECV) and root mean square error of prediction (RMSEP) values of 2, 7, and 4%, respectively. The results reveal the potential of SF spectroscopy for rapid authentication of buffalo milk.
Monitoring fungal growth on brown rice grains using rapid and non-destructive hyperspectral imaging.
Siripatrawan, U; Makino, Y
2015-04-16
This research aimed to develop a rapid, non-destructive, and accurate method based on hyperspectral imaging (HSI) for monitoring spoilage fungal growth on stored brown rice. Brown rice was inoculated with a non-pathogenic strain of Aspergillus oryzae and stored at 30 °C and 85% RH. Growth of A. oryzae on rice was monitored using viable colony counts, expressed as colony forming units per gram (CFU/g). The fungal development was observed using scanning electron microscopy. The HSI system was used to acquire reflectance images of the samples covering the visible and near-infrared (NIR) wavelength range of 400-1000 nm. Unsupervised self-organizing map (SOM) was used to visualize data classification of different levels of fungal infection. Partial least squares (PLS) regression was used to predict fungal growth on rice grains from the HSI reflectance spectra. The HSI spectral signals decreased with increasing colony counts, while conserving similar spectral pattern during the fungal growth. When integrated with SOM, the proposed HSI method could be used to classify rice samples with different levels of fungal infection without sample manipulation. Moreover, HSI was able to rapidly identify infected rice although the samples showed no symptoms of fungal infection. Based on PLS regression, the coefficient of determination was 0.97 and root mean square error of prediction was 0.39 log (CFU/g), demonstrating that the HSI technique was effective for prediction of fungal infection in rice grains. The ability of HSI to detect fungal infection at early stage would help to prevent contaminated rice grains from entering the food chain. This research provides scientific information on the rapid, non-destructive, and effective fungal detection system for rice grains. Copyright © 2015 Elsevier B.V. All rights reserved.
Großhans, Steffen; Rüdt, Matthias; Sanden, Adrian; Brestrich, Nina; Morgenstern, Josefine; Heissler, Stefan; Hubbuch, Jürgen
2018-04-27
Fourier-transform infrared spectroscopy (FTIR) is a well-established spectroscopic method in the analysis of small molecules and protein secondary structure. However, FTIR is not commonly applied for in-line monitoring of protein chromatography. Here, the potential of in-line FTIR as a process analytical technology (PAT) in downstream processing was investigated in three case studies addressing the limits of currently applied spectroscopic PAT methods. A first case study exploited the secondary structural differences of monoclonal antibodies (mAbs) and lysozyme to selectively quantify the two proteins with partial least squares regression (PLS) giving root mean square errors of cross validation (RMSECV) of 2.42 g/l and 1.67 g/l, respectively. The corresponding Q² values are 0.92 and 0.99, respectively, indicating robust models in the calibration range. Second, a process separating lysozyme and PEGylated lysozyme species was monitored giving an estimate of the PEGylation degree of currently eluting species with RMSECV of 2.35 g/l for lysozyme and 1.24 g/l for PEG with Q² of 0.96 and 0.94, respectively. Finally, Triton X-100 was added to a feed of lysozyme as a typical process-related impurity. It was shown that the species could be selectively quantified from the FTIR 3D field without PLS calibration. In summary, the proposed PAT tool has the potential to be used as a versatile option for monitoring protein chromatography. It may help to achieve a more complete implementation of the PAT initiative by mitigating limitations of currently used techniques. Copyright © 2018 The Authors. Published by Elsevier B.V. All rights reserved.
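The Q² values quoted in this abstract are cross-validated analogues of R². The abstract does not say which definition was used, so the following is only a minimal sketch assuming the common form Q² = 1 − PRESS/TSS, where PRESS is the sum of squared cross-validation residuals and TSS is the total sum of squares about the mean. The function name and arguments are hypothetical.

```python
def q_squared(y_true, y_cv_pred):
    """Cross-validated Q^2, assuming Q^2 = 1 - PRESS / TSS.

    y_true    : reference values (e.g. concentrations in g/l)
    y_cv_pred : predictions for the same samples made while each
                sample was left out of the calibration
    """
    n = len(y_true)
    if n == 0 or n != len(y_cv_pred):
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / n
    # PRESS: squared error of the cross-validated predictions
    press = sum((y - p) ** 2 for y, p in zip(y_true, y_cv_pred))
    # TSS: squared deviation of the reference values from their mean
    tss = sum((y - mean_y) ** 2 for y in y_true)
    if tss == 0:
        raise ValueError("reference values have zero variance")
    return 1.0 - press / tss
```

Under this definition a model that predicts every left-out sample perfectly gives Q² = 1, and a model that does no better than predicting the mean gives Q² = 0.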
Basatnia, Nabee; Hossein, Seyed Abbas; Rodrigo-Comino, Jesús; Khaledian, Yones; Brevik, Eric C; Aitkenhead-Peterson, Jacqueline; Natesan, Usha
2018-04-29
Coastal lagoon ecosystems are vulnerable to eutrophication, which leads to the accumulation of nutrients from the surrounding watershed over the long term. However, there is a lack of information about methods that could accurately quantify this problem in rapidly developed countries. Therefore, various statistical methods such as cluster analysis (CA), principal component analysis (PCA), partial least square (PLS), principal component regression (PCR), and ordinary least squares regression (OLS) were used in this study to estimate total organic matter content in sediments (TOM) using other parameters such as temperature, dissolved oxygen (DO), pH, electrical conductivity (EC), nitrite (NO₂), nitrate (NO₃), biological oxygen demand (BOD), phosphate (PO₄), total phosphorus (TP), salinity, and water depth along a 3-km transect in the Gomishan Lagoon (Iran). Results indicated that nutrient concentration and the dissolved oxygen gradient were the most significant parameters in the lagoon water quality heterogeneity. Additionally, anoxia at the bottom of the lagoon in sediments and re-suspension of the sediments were the main factors affecting internal nutrient loading. To validate the models, R², RMSECV, and RPDCV were used. The PLS model was stronger than the other models. Also, classification analysis of the Gomishan Lagoon identified two hydrological zones: (i) a North Zone characterized by higher water exchange, higher dissolved oxygen and lower salinity and nutrients, and (ii) a Central and South Zone with high residence time, higher nutrient concentrations, lower dissolved oxygen, and higher salinity. A recommendation for the management of coastal lagoons, specifically the Gomishan Lagoon, to decrease or eliminate nutrient loadings is discussed and should be transferred to policy makers, the scientific community, and local inhabitants.
Zhang, Bing-Fang; Yuan, Li-Bo; Kong, Qing-Ming; Shen, Wei-Zheng; Zhang, Bing-Xiu; Liu, Cheng-Hai
2014-10-01
In the present study, a new method using near infrared spectroscopy combined with optical fiber sensing technology was applied to the analysis of hogwash oil in blended oil. The 50 samples were a blend of frying oil and "nine three" soybean oil according to a certain volume ratio. The near infrared transmission spectra were collected and the quantitative analysis model of frying oil was established by partial least squares (PLS) and BP artificial neural network. The coefficients of determination of the calibration sets were 0.908 and 0.934, respectively. The coefficients of determination of the validation sets were 0.961 and 0.952, the root mean square errors of calibration (RMSEC) were 0.184 and 0.136, and the root mean square errors of prediction (RMSEP) were both 0.1116. They conform to the model application requirement. At the same time, frying oil and qualified edible oil were identified with principal component analysis (PCA), and the accuracy was 100%. The experiment proved that near infrared spectral technology can not only quickly and accurately identify hogwash oil, but also quantitatively detect hogwash oil. This method has a wide application prospect in the detection of oil.
Esteki, M; Nouroozi, S; Shahsavari, Z
2016-02-01
To develop a simple and efficient spectrophotometric technique combined with chemometrics for the simultaneous determination of methyl paraben (MP) and hydroquinone (HQ) in cosmetic products, and specifically, to: (i) evaluate the potential use of the successive projections algorithm (SPA) on derivative spectrophotometric data in order to provide sufficient accuracy and model robustness and (ii) determine MP and HQ concentration in cosmetics without tedious pre-treatments such as derivatization or extraction techniques which are time-consuming and require hazardous solvents. The absorption spectra were measured in the wavelength range of 200-350 nm. Prior to building chemometric models, the original and first-derivative absorption spectra of binary mixtures were used as calibration matrices. Variables selected by the successive projections algorithm were used to obtain multiple linear regression (MLR) models based on a small subset of wavelengths. The number of wavelengths and the starting vector were optimized, and the comparison of the root mean square error of calibration (RMSEC) and cross-validation (RMSECV) was applied to select effective wavelengths with the least collinearity and redundancy. Principal component regression (PCR) and partial least squares (PLS) models were also developed for comparison. The concentrations of the calibration matrix ranged from 0.1 to 20 μg mL⁻¹ for MP, and from 0.1 to 25 μg mL⁻¹ for HQ. The constructed models were tested on an external validation data set and finally on cosmetic samples. The results indicated that successive projections algorithm-multiple linear regression (SPA-MLR), applied on the first-derivative spectra, achieved the optimal performance for the two compounds when compared with the full-spectrum PCR and PLS. The root mean square error of prediction (RMSEP) was 0.083 and 0.314 for MP and HQ, respectively. 
To verify the accuracy of the proposed method, a recovery study on real cosmetic samples was carried out with satisfactory results (84-112%). The proposed method, which is an environmentally friendly approach, using minimum amount of solvent, is a simple, fast and low-cost analysis method that can provide high accuracy and robust models. The suggested method does not need any complex extraction procedure which is time-consuming and requires hazardous solvents. © 2015 Society of Cosmetic Scientists and the Société Française de Cosmétologie.
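The abstract above calibrates on first-derivative absorption spectra but does not state how the derivatives were computed. As a rough illustration only, the sketch below uses plain central differences on an evenly spaced wavelength grid. In practice a smoothing derivative filter such as Savitzky-Golay is often preferred, and that is not what is shown here. The function name and the `step_nm` parameter are hypothetical.

```python
def first_derivative(absorbance, step_nm):
    """Central-difference first derivative of an evenly sampled spectrum.

    absorbance : absorbance values at equally spaced wavelengths
    step_nm    : wavelength spacing between neighbouring points (nm)

    Returns dA/dlambda at the interior points only, so the result
    has two fewer values than the input.
    """
    if len(absorbance) < 3:
        raise ValueError("need at least three points")
    if step_nm <= 0:
        raise ValueError("step_nm must be positive")
    return [
        (absorbance[i + 1] - absorbance[i - 1]) / (2.0 * step_nm)
        for i in range(1, len(absorbance) - 1)
    ]
```

A constant baseline offset added to every point cancels in the difference, which is one common reason for working with derivative spectra.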
Prediction of specialty coffee cup quality based on near infrared spectra of green coffee beans.
Tolessa, Kassaye; Rademaker, Michael; De Baets, Bernard; Boeckx, Pascal
2016-04-01
The growing global demand for specialty coffee increases the need for improved coffee quality assessment methods. Green bean coffee quality analysis is usually carried out by physical (e.g. black beans, immature beans) and cup quality (e.g. acidity, flavour) evaluation. However, these evaluation methods are subjective, costly, time consuming, require sample preparation and may end up in poor grading systems. This calls for the development of a rapid, low-cost, reliable and reproducible analytical method to evaluate coffee quality attributes and eventually chemical compounds of interest (e.g. chlorogenic acid) in coffee beans. The aim of this study was to develop a model able to predict coffee cup quality based on NIR spectra of green coffee beans. NIR spectra of 86 samples of green Arabica beans of varying quality were analysed. The partial least squares (PLS) regression method was used to develop a model correlating spectral data to cupping score data (cup quality). The selected PLS model had a good predictive power for total specialty cup quality and its individual quality attributes (overall cup preference, acidity, body and aftertaste) showing a high correlation coefficient with r-values of 90, 90, 78, 72 and 72, respectively, between measured and predicted cupping scores for 20 out of 86 samples. The corresponding root mean square error of prediction (RMSEP) was 1.04, 0.22, 0.27, 0.24 and 0.27 for total specialty cup quality, overall cup preference, acidity, body and aftertaste, respectively. The results obtained suggest that NIR spectra of green coffee beans are a promising tool for fast and accurate prediction of coffee quality and for classifying green coffee beans into different specialty grades. However, the model should be further tested on coffee samples from different regions in Ethiopia to determine whether one generic model or region-specific models should be developed. Copyright © 2015 Elsevier B.V. All rights reserved.
Mass Spectrometry and Fourier Transform Infrared Spectroscopy for Analysis of Biological Materials
DOE Office of Scientific and Technical Information (OSTI.GOV)
Anderson, Timothy J.
Time-of-flight mass spectrometry along with statistical analysis was utilized to study metabolic profiles among rats fed resistant starch (RS) diets. Fischer 344 rats were fed four starch diets consisting of 55% (w/w, dbs) starch. A control starch diet consisting of corn starch was compared against three RS diets. The RS diets were high-amylose corn starch (HA7), HA7 chemically modified with octenyl succinic anhydride, and stearic-acid-complexed HA7 starch. A subgroup received antibiotic treatment to determine if perturbations in the gut microbiome were long lasting. A second subgroup was treated with azoxymethane (AOM), a carcinogen. At the end of the eight-week study, cecal and distal-colon content samples were collected from the sacrificed rats. Metabolites were extracted from cecal and distal colon samples into acetonitrile. The extracts were then analyzed on an accurate-mass time-of-flight mass spectrometer to obtain their metabolic profile. The data were analyzed using partial least-squares discriminant analysis (PLS-DA). The PLS-DA analysis utilized a training set and verification set to classify samples within diet and treatment groups. PLS-DA could reliably differentiate the diet treatments for both cecal and distal colon samples. The PLS-DA analyses of the antibiotic and no antibiotic treated subgroups were well classified for cecal samples and modestly separated for distal-colon samples. PLS-DA analysis had limited success separating distal colon samples for rats given AOM from those not treated; the cecal samples from AOM had very poor classification. Mass spectrometry profiling coupled with PLS-DA can readily classify metabolite differences among rats given RS diets.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Anderson, Timothy J.; Jones, Roger W.; Ai, Yongfeng
Time-of-flight mass spectrometry along with statistical analysis was utilized to study metabolic profiles among rats fed resistant starch (RS) diets. Fischer 344 rats were fed four starch diets consisting of 55% (w/w, dbs) starch. A control starch diet consisting of corn starch was compared against three RS diets. The RS diets were high-amylose corn starch (HA7), HA7 chemically modified with octenyl succinic anhydride, and stearic-acid-complexed HA7 starch. A subgroup received antibiotic treatment to determine if perturbations in the gut microbiome were long lasting. A second subgroup was treated with azoxymethane (AOM), a carcinogen. At the end of the 8-week study, cecal and distal colon content samples were collected from the sacrificed rats. Metabolites were extracted from cecal and distal colon samples into acetonitrile. The extracts were then analyzed on an accurate-mass time-of-flight mass spectrometer to obtain their metabolic profile. The data were analyzed using partial least-squares discriminant analysis (PLS-DA). The PLS-DA analysis utilized a training set and verification set to classify samples within diet and treatment groups. PLS-DA could reliably differentiate the diet treatments for both cecal and distal colon samples. The PLS-DA analyses of the antibiotic and no antibiotic-treated subgroups were well classified for cecal samples and modestly separated for distal colon samples. PLS-DA analysis had limited success separating distal colon samples for rats given AOM from those not treated; the cecal samples from AOM had very poor classification. Mass spectrometry profiling coupled with PLS-DA can readily classify metabolite differences among rats given RS diets.
Impact of multicollinearity on small sample hydrologic regression models
NASA Astrophysics Data System (ADS)
Kroll, Charles N.; Song, Peter
2013-06-01
Often hydrologic regression models are developed with ordinary least squares (OLS) procedures. The use of OLS with highly correlated explanatory variables produces multicollinearity, which creates highly sensitive parameter estimators with inflated variances and improper model selection. It is not clear how to best address multicollinearity in hydrologic regression models. Here a Monte Carlo simulation is developed to compare four techniques to address multicollinearity: OLS, OLS with variance inflation factor screening (VIF), principal component regression (PCR), and partial least squares regression (PLS). The performance of these four techniques was observed for varying sample sizes, correlation coefficients between the explanatory variables, and model error variances consistent with hydrologic regional regression models. The negative effects of multicollinearity are magnified at smaller sample sizes, higher correlations between the variables, and larger model error variances (smaller R²). The Monte Carlo simulation indicates that if the true model is known, multicollinearity is present, and the estimation and statistical testing of regression parameters are of interest, then PCR or PLS should be employed. If the model is unknown, or if the interest is solely in model predictions, it is recommended that OLS be employed, since using more complicated techniques did not produce any improvement in model performance. A leave-one-out cross-validation case study was also performed using low-streamflow data sets from the eastern United States. Results indicate that OLS with stepwise selection generally produces models across study regions with varying levels of multicollinearity that are as good as biased regression techniques such as PCR and PLS.
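The variance inflation factor mentioned in this abstract can be illustrated for the simplest case. With exactly two explanatory variables, the VIF of either one reduces to 1/(1 − r²), where r is their Pearson correlation. The sketch below covers only that two-predictor case and is not the screening procedure used in the study. With more predictors, each VIF requires regressing one variable on all the others. Function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("a sample has zero variance")
    return sxy / math.sqrt(sxx * syy)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors in a two-predictor model: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    if abs(r) >= 1.0:
        raise ValueError("predictors are perfectly collinear")
    return 1.0 / (1.0 - r * r)
```

Uncorrelated predictors give a VIF of 1, and the value grows without bound as the correlation approaches ±1, which is the variance inflation the abstract describes.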
Povey, Jane F; O'Malley, Christopher J; Root, Tracy; Martin, Elaine B; Montague, Gary A; Feary, Marc; Trim, Carol; Lang, Dietmar A; Alldread, Richard; Racher, Andrew J; Smales, C Mark
2014-08-20
Despite many advances in the generation of high producing recombinant mammalian cell lines over the last few decades, cell line selection and development is often slowed by the inability to predict a cell line's phenotypic characteristics (e.g. growth or recombinant protein productivity) at larger scale (large volume bioreactors) using data from early cell line construction at small culture scale. Here we describe the development of an intact cell MALDI-ToF mass spectrometry fingerprinting method for mammalian cells early in the cell line construction process whereby the resulting mass spectrometry data are used to predict the phenotype of mammalian cell lines at larger culture scale using a Partial Least Squares Discriminant Analysis (PLS-DA) model. Using MALDI-ToF mass spectrometry, a library of mass spectrometry fingerprints was generated for individual cell lines at the 96 deep well plate stage of cell line development. The growth and productivity of these cell lines were evaluated in a 10L bioreactor model of Lonza's large-scale (up to 20,000L) fed-batch cell culture processes. Using the mass spectrometry information at the 96 deep well plate stage and phenotype information at the 10L bioreactor scale a PLS-DA model was developed to predict the productivity of unknown cell lines at the 10L scale based upon their MALDI-ToF fingerprint at the 96 deep well plate scale. This approach provides the basis for the very early prediction of cell lines' performance in cGMP manufacturing-scale bioreactors and the foundation for methods and models for predicting other mammalian cell phenotypes from rapid, intact-cell mass spectrometry based measurements. Copyright © 2014 Elsevier B.V. All rights reserved.
Arroz, Erin; Jordan, Michael; Dumancas, Gerard G
2017-07-01
An ultraviolet visible (UV-Vis) spectrophotometric and partial least squares (PLS) chemometric method was developed for the simultaneous determination of erythrosine B (red), Brilliant Blue, and tartrazine (yellow) dyes. A training set (n = 64) was generated using a full factorial design and its accuracy was tested in a test set (n = 13) using a Box-Behnken design. The test set garnered a root mean square error (RMSE) of 1.79 × 10⁻⁷ for blue, 4.59 × 10⁻⁷ for red, and 1.13 × 10⁻⁶ for yellow dyes. The relatively small RMSE suggests only a small difference between predicted and measured concentrations, demonstrating the accuracy of our model. The relative errors of prediction (REP) for the test set were 11.73%, 19.52%, and 19.38% for blue, red, and yellow dyes, respectively. A comparable overlay between the actual candy samples and their replicated synthetic spectra was also obtained, indicating the model as a potentially accurate method for determining concentrations of dyes in food samples.
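The RMSE and REP figures above can be related with a short calculation. The abstract does not define REP, so this sketch assumes one common convention, REP(%) = 100 × RMSE / mean of the measured values. Other conventions exist and may have been used by the authors. Function names are hypothetical.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    n = len(measured)
    if n == 0 or n != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(
        sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n
    )


def rep_percent(measured, predicted):
    """Relative error of prediction in percent.

    Assumed definition: 100 * RMSE / mean(measured).
    """
    mean_measured = sum(measured) / len(measured)
    if mean_measured == 0:
        raise ValueError("mean of measured values is zero")
    return 100.0 * rmse(measured, predicted) / mean_measured
```

Because REP is scaled by the concentration level, a very small absolute RMSE can still correspond to a double-digit REP when the concentrations themselves are small.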
Wang, Pei; Zhang, Hui; Yang, Hailong; Nie, Lei; Zang, Hengchang
2015-02-25
Near-infrared (NIR) spectroscopy has been developed into an indispensable tool for both academic research and industrial quality control in a wide field of applications. The feasibility of NIR spectroscopy to monitor the concentration of puerarin, daidzin, daidzein and total isoflavonoid (TIF) during the extraction process of kudzu (Pueraria lobata) was verified in this work. NIR spectra were collected in transmission mode and pretreated with smoothing and derivative. Partial least square regression (PLSR) was used to establish calibration models. Three different variable selection methods, including correlation coefficient method, interval partial least squares (iPLS), and successive projections algorithm (SPA), were performed and compared with models based on all of the variables. The results showed that the approach was very efficient and environmentally friendly for rapid determination of the four quality indices (QIs) in the kudzu extraction process. The established method may have the potential to be used as a process analytical technology (PAT) tool in the future. Copyright © 2014 Elsevier B.V. All rights reserved.
Maltesen, Morten Jonas; van de Weert, Marco; Grohganz, Holger
2012-09-01
Moisture content and aerodynamic particle size are critical quality attributes for spray-dried protein formulations. In this study, spray-dried insulin powders intended for pulmonary delivery were produced applying design of experiments methodology. Near infrared spectroscopy (NIR) in combination with preprocessing and multivariate analysis in the form of partial least squares projections to latent structures (PLS) were used to correlate the spectral data with moisture content and aerodynamic particle size measured by a time of flight principle. PLS models predicting the moisture content were based on the chemical information of the water molecules in the NIR spectrum. Models yielded prediction errors (RMSEP) between 0.39% and 0.48% with thermal gravimetric analysis used as reference method. The PLS models predicting the aerodynamic particle size were based on baseline offset in the NIR spectra and yielded prediction errors between 0.27 and 0.48 μm. The morphology of the spray-dried particles had a significant impact on the predictive ability of the models. Good predictive models could be obtained for spherical particles with a calibration error (RMSECV) of 0.22 μm, whereas wrinkled particles resulted in much less robust models with a Q(2) of 0.69. Based on the results in this study, NIR is a suitable tool for process analysis of the spray-drying process and for control of moisture content and particle size, in particular for smooth and spherical particles.
Orthogonal decomposition of left ventricular remodeling in myocardial infarction
Zhang, Xingyu; Medrano-Gracia, Pau; Ambale-Venkatesh, Bharath; Bluemke, David A.; Cowan, Brett R; Finn, J. Paul; Kadish, Alan H.; Lee, Daniel C.; Lima, Joao A. C.; Young, Alistair A.; Suinesiaputra, Avan
2017-01-01
Left ventricular size and shape are important for quantifying cardiac remodeling in response to cardiovascular disease. Geometric remodeling indices have been shown to have prognostic value in predicting adverse events in the clinical literature, but these often describe interrelated shape changes. We developed a novel method for deriving orthogonal remodeling components directly from any (moderately independent) set of clinical remodeling indices. Results: Six clinical remodeling indices (end-diastolic volume index, sphericity, relative wall thickness, ejection fraction, apical conicity, and longitudinal shortening) were evaluated using cardiac magnetic resonance images of 300 patients with myocardial infarction, and 1991 asymptomatic subjects, obtained from the Cardiac Atlas Project. Partial least squares (PLS) regression of left ventricular shape models resulted in remodeling components that were optimally associated with each remodeling index. A Gram–Schmidt orthogonalization process, by which remodeling components were successively removed from the shape space in the order of shape variance explained, resulted in a set of orthonormal remodeling components. Remodeling scores could then be calculated that quantify the amount of each remodeling component present in each case. A one-factor PLS regression led to more decoupling between scores from the different remodeling components across the entire cohort, and zero correlation between clinical indices and subsequent scores. Conclusions: The PLS orthogonal remodeling components had similar power to describe differences between myocardial infarction patients and asymptomatic subjects as principal component analysis, but were better associated with well-understood clinical indices of cardiac remodeling. The data and analyses are available from www.cardiacatlas.org. PMID:28327972
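The Gram–Schmidt step mentioned above can be sketched on plain vectors. This is only an illustration of the orthogonalization itself, under the assumption that each component is a direction vector processed in a fixed order; the study applies the idea to PLS-derived shape components and deflates the shape space between regressions, which is not reproduced here. All names and values are hypothetical.

```python
import math

def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal list built from `vectors`, processed in the given order."""
    basis = []
    for v in vectors:
        residual = list(v)
        for b in basis:
            # Remove the part of the residual that lies along an earlier component
            coeff = dot(residual, b)
            residual = [r - coeff * bi for r, bi in zip(residual, b)]
        norm = math.sqrt(dot(residual, residual))
        if norm > tol:  # skip vectors that are linearly dependent on earlier ones
            basis.append([r / norm for r in residual])
    return basis

# Three correlated toy "components" (illustrative values only)
components = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
ortho = gram_schmidt(components)
print(ortho)
```

The processing order matters: the first vector is kept as is (up to scaling), and each later one only retains what the earlier ones do not already explain, which is why the abstract specifies an ordering by shape variance explained.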
A (1)H NMR-based metabonomic study on the SAMP8 and SAMR1 mice and the effect of electro-acupuncture.
Qiao-feng, Wu; Ling-ling, Guo; Shu-guang, Yu; Qi, Zhang; Sheng-feng, Lu; Fang, Zeng; Hai-yan, Yin; Yong, Tang; Xian-zhong, Yan
2011-10-01
A (1)H NMR-based metabonomic method was used to investigate the metabolic change of plasma in senescence-prone 8 (SAMP8) mice before and after electro-acupuncture (EA). Sixteen SAMP8 male mice (aged 8 months) were randomly divided into a model group and an acupuncture treatment group, and the latter group received EA treatment for 21 days. Eight senescence-resistant 1 (SAMR1) mice were used as the control group. The Morris water maze was used to evaluate the effects of EA. All mouse plasma samples obtained from the different groups were analyzed using 600 MHz (1)H nuclear magnetic resonance ((1)H NMR) spectroscopy. The data sets were analyzed by Principal Components Analysis (PCA) and Partial Least Squares-Discriminant Analysis (PLS-DA) to discriminate the key plasma metabolites among different groups. Results indicated that both the escape and probe tasks of SAMP8 could be improved by EA treatment. Metabonomic study showed that SAMR1 and SAMP8 were separated clearly in both CPMG_OSC_PLS and LED_OSC_PLS score plots. Interestingly, samples obtained from the EA group were distributed close to the SAMR1 group in the CPMG_OSC_PLS score plot, but away from the SAMP8 group in the LED_OSC_PLS score plot. Corresponding loading plots showed that much less lactate was seen in SAMP8 mice plasma. Other changes, including higher levels of dimethylamine (DMA), choline and α-glucose but lower levels of leucine/isoleucine, HDL, LDL/VLDL, 3-hydroxybutyrate (3-HB), and trimethylamine N-oxide (TMAO), were observed in the SAMP8 mice plasma than in the SAMR1. After EA treatment, the levels of lactate, DMA, choline and TMAO were improved. Results of this work can provide valuable clues to the understanding of the metabolic changes in the senile impairment of mice. It is also hoped that the methodology can be used in evaluating the effects of EA and understanding the underlying acupuncture mechanism in treating neurodegenerative diseases. Copyright © 2011 Elsevier Inc. All rights reserved.
Weisberg, Arel; Lakis, Rollin E; Simpson, Michael F; Horowitz, Leo; Craparo, Joseph
2014-01-01
The versatility of laser-induced breakdown spectroscopy (LIBS) as an analytical method for high-temperature applications was demonstrated through measurement of the concentrations of the lanthanide elements europium (Eu) and praseodymium (Pr) in molten eutectic lithium chloride-potassium chloride (LiCl-KCl) salts at a temperature of 500 °C. Laser pulses (1064 nm, 7 ns, 120 mJ/pulse) were focused on the top surface of the molten salt samples in a laboratory furnace under an argon atmosphere, and the resulting LIBS signals were collected using a broadband Echelle-type spectrometer. Partial least squares (PLS) regression using leave-one-sample-out cross-validation was used to quantify the concentrations of Eu and Pr in the samples. The root mean square error of prediction (RMSEP) for Eu was 0.13% (absolute) over a concentration range of 0-3.01%, and for Pr was 0.13% (absolute) over a concentration range of 0-1.04%.
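The cross-validated error reported above can be sketched with a deliberately small model. The code below assumes a single-latent-variable PLS1 fit and treats each row as one sample; the study's actual number of latent variables, preprocessing, and handling of replicate spectra are not given in the abstract, and every name and number here is hypothetical.

```python
import math

def fit_pls1_one_component(X, y):
    """Fit a one-latent-variable PLS1 model; returns (x_mean, y_mean, coefficients)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction in X with maximum covariance with y
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    w_norm = math.sqrt(sum(v * v for v in w))
    if w_norm == 0.0:
        return x_mean, y_mean, [0.0] * p  # no covariance: predict the mean
    w = [v / w_norm for v in w]
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return x_mean, y_mean, [q * wj for wj in w]

def predict(model, x):
    """Predict y for one spectrum x from a fitted model."""
    x_mean, y_mean, coef = model
    return y_mean + sum((xj - mj) * bj for xj, mj, bj in zip(x, x_mean, coef))

def loo_rmsep(X, y):
    """Leave-one-out cross-validation: refit without each sample, then predict it."""
    errors = []
    for i in range(len(X)):
        model = fit_pls1_one_component(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        errors.append(predict(model, X[i]) - y[i])
    return math.sqrt(sum(e * e for e in errors) / len(errors))

# Noise-free toy data: each spectrum is concentration times one pure profile
profile = [1.0, 2.0, 0.5]
conc = [0.0, 1.0, 2.0, 3.0]
spectra = [[c * s for s in profile] for c in conc]
print(loo_rmsep(spectra, conc))
```

With noise-free, single-analyte toy data the held-out predictions are essentially exact; real spectra with noise and several interfering species would normally need more latent variables, chosen by the same cross-validation loop.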
Kandhro, Aftab A; Laghari, Abdul Hafeez; Mahesar, Sarfaraz A; Saleem, Rubina; Nelofar, Aisha; Khan, Salman Tariq; Sherazi, S T H
2013-11-01
A quick and reliable analytical method for the quantitative assessment of cefixime in orally administered pharmaceutical formulations is developed by using diamond cell attenuated total reflectance (ATR) Fourier transform infrared (FT-IR) spectroscopy as an easy procedure for quality control laboratories. The standards for calibration were prepared in aqueous medium ranging from 350 to 6000 mg/kg. The calibration model was developed based on partial least square (PLS) using the fingerprint region of the FT-IR spectrum in the range from 1485 to 887 cm(-1). Excellent coefficient of determination (R(2)) was achieved as high as 0.99976 with root mean square error of 44.8 for calibration. The application of diamond cell (smart accessory) ATR FT-IR proves a reliable determination of cefixime in pharmaceutical formulations to assess the quality of the final product. Copyright © 2013 Elsevier B.V. All rights reserved.
Facial Age Synthesis Using Sparse Partial Least Squares (The Case of Ben Needham).
Bukar, Ali M; Ugail, Hassan
2017-09-01
Automatic facial age progression (AFAP) has been an active area of research in recent years. This is due to its numerous applications, which include searching for missing people. This study presents a new method of AFAP. Here, we use an active appearance model (AAM) to extract facial features from available images. An aging function is then modelled using sparse partial least squares regression (sPLS). Thereafter, the aging function is used to render new faces at different ages. To test the accuracy of our algorithm, extensive evaluation is conducted using a database of 500 face images with known ages. Furthermore, the algorithm is used to progress Ben Needham's facial image that was taken when he was 21 months old to the ages of 6, 14, and 22 years. The algorithm presented in this study could potentially be used to enhance the search for missing people worldwide. © 2017 American Academy of Forensic Sciences.
Goicoechea, H C; Olivieri, A C
1999-08-01
The use of multivariate spectrophotometric calibration is presented for the simultaneous determination of the active components of tablets used in the treatment of pulmonary tuberculosis. The resolution of ternary mixtures of rifampicin, isoniazid and pyrazinamide has been accomplished by using partial least squares (PLS-1) regression analysis. Although the components show an important degree of spectral overlap, they have been simultaneously determined with high accuracy and precision, rapidly and with no need of nonaqueous solvents for dissolving the samples. No interference has been observed from the tablet excipients. A comparison is presented with the related multivariate method of classical least squares (CLS) analysis, which is shown to yield less reliable results due to the severe spectral overlap among the studied compounds. This is highlighted in the case of isoniazid, due to the small absorbances measured for this component.
Melamine detection in infant formula powder using near- and mid-infrared spectroscopy.
Mauer, Lisa J; Chernyshova, Alona A; Hiatt, Ashley; Deering, Amanda; Davis, Reeta
2009-05-27
Near- and mid-infrared spectroscopy methods (NIR, FTIR-ATR, FTIR-DRIFT) were evaluated for the detection and quantification of melamine in infant formula powder. Partial least-squares (PLS) models were established for correlating spectral data to melamine concentration: R(2) > 0.99, RMSECV ≤ 0.9, and RPD ≥ 12. Factorization analysis of spectra was able to differentiate unadulterated infant formula powder from samples containing 1 ppm melamine with no misclassifications, a confidence level of 99.99%, and selectivity > 2. These nondestructive methods require little or no sample preparation. The NIR method has an assay time of 1 min, and a 2 min total time to detection. The FTIR methods require up to 5 min for melamine detection. Therefore, NIR and FTIR methods enable rapid detection of 1 ppm melamine in infant formula powder.
Yu, Shaohui; Xiao, Xue; Ding, Hong; Xu, Ge; Li, Haixia; Liu, Jing
2017-08-05
Quantitative analysis is very difficult for the excitation-emission fluorescence spectroscopy of multi-component mixtures whose fluorescence peaks overlap seriously. As an effective method for quantitative analysis, partial least squares can extract the latent variables from both the independent variables and the dependent variables, so it can model multiple correlations between variables. However, some factors usually affect the prediction results of partial least squares, such as the noise and the distribution and number of samples in the calibration set. This work focuses on these calibration-set problems. Firstly, the outliers in the calibration set are removed by leave-one-out cross-validation. Then, according to two different prediction requirements, the EWPLS method and the VWPLS method are proposed. The independent variables and dependent variables are weighted in the EWPLS method by the maximum error of the recovery rate and weighted in the VWPLS method by the maximum variance of the recovery rate. Three organic substances with seriously overlapping excitation-emission fluorescence spectra are selected for the experiments. The step adjustment parameter, the iteration number and the number of samples in the calibration set are discussed. The results show that the EWPLS method and the VWPLS method are superior to the PLS method, especially for small calibration sets. Copyright © 2017 Elsevier B.V. All rights reserved.
Real-time Raman spectroscopy for automatic in vivo skin cancer detection: an independent validation.
Zhao, Jianhua; Lui, Harvey; Kalia, Sunil; Zeng, Haishan
2015-11-01
In a recent study, we demonstrated that real-time Raman spectroscopy could be used for skin cancer diagnosis. As a translational study, the objective of this study is to validate previous findings through a completely independent clinical test. In total, 645 confirmed cases were included in the analysis, including a cohort of 518 cases from a previous study, and an independent cohort of 127 new cases. Multivariate statistical data analyses including principal component with general discriminant analysis (PC-GDA) and partial least squares (PLS) were used separately for lesion classification, which generated similar results. When the previous cohort (n = 518) was used as training and the new cohort (n = 127) was used as testing, the area under the receiver operating characteristic curve (ROC AUC) was found to be 0.889 (95 % CI 0.834-0.944; PLS); when the two cohorts were combined, the ROC AUC was 0.894 (95 % CI 0.870-0.918; PLS) with the narrowest confidence intervals. Both analyses were comparable to the previous findings, where the ROC AUC was 0.896 (95 % CI 0.846-0.946; PLS). The independent study validates that real-time Raman spectroscopy could be used for automatic in vivo skin cancer diagnosis with good accuracy.
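The ROC AUC figures above can be read as the probability that a randomly chosen cancer case receives a higher classifier score than a randomly chosen benign case. A minimal sketch of that pairwise definition follows; the labels and scores are invented for illustration, and the study's confidence-interval method is not reproduced.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical classifier outputs: 1 = malignant, 0 = benign
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.8, 0.3, 0.2, 0.4]
print(roc_auc(labels, scores))
```

This pairwise form is quadratic in the number of cases, which is fine for cohorts of a few hundred lesions; a rank-based formulation gives the same value more efficiently for larger sets.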
Fulcher, Yan G.; Fotso, Martial; Chang, Chee-Hoon; Rindt, Hans; Reinero, Carol R.
2016-01-01
Asthma is prevalent in children and cats, and needs means of noninvasive diagnosis. We sought to distinguish noninvasively the differences in 53 cats before and soon after induction of allergic asthma, using NMR spectra of exhaled breath condensate (EBC). Statistical pattern recognition was improved considerably by preprocessing the spectra with probabilistic quotient normalization and glog transformation. Classification of the 106 preprocessed spectra by principal component analysis and partial least squares with discriminant analysis (PLS-DA) appears to be impaired by variances unrelated to eosinophilic asthma. By filtering out confounding variances, orthogonal signal correction (OSC) PLS-DA greatly improved the separation of the healthy and early asthmatic states, attaining 94% specificity and 94% sensitivity in predictions. OSC enhancement of multi-level PLS-DA boosted the specificity of the prediction to 100%. OSC-PLS-DA of the normalized spectra suggests the most promising biomarkers of allergic asthma in cats to include increased acetone, metabolite(s) with overlapped NMR peaks near 5.8 ppm, and a hydroxyphenyl-containing metabolite, as well as decreased phthalate. Acetone is elevated in the EBC of 74% of the cats with early asthma. The noninvasive detection of early experimental asthma, biomarkers in EBC, and metabolic perturbation invite further investigation of the diagnostic potential in humans. PMID:27764146
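The probabilistic quotient normalization step named above can be sketched as follows. This is a simplified version that assumes the per-variable median across all samples is used as the reference spectrum and omits the preliminary total-area normalization that is often applied first; the glog transformation is not shown. The function name and toy spectra are hypothetical.

```python
from statistics import median

def pqn_normalize(spectra):
    """Probabilistic quotient normalization of a list of equal-length spectra."""
    n_vars = len(spectra[0])
    # Reference spectrum: per-variable median across all samples (an assumption)
    reference = [median(s[j] for s in spectra) for j in range(n_vars)]
    normalized = []
    for s in spectra:
        # Quotient of each variable against the reference, skipping zero references
        quotients = [s[j] / reference[j] for j in range(n_vars) if reference[j] != 0]
        factor = median(quotients)  # most probable dilution factor for this sample
        normalized.append([v / factor for v in s])
    return normalized

# Toy spectra: the same profile at three different dilutions
base = [1.0, 2.0, 4.0, 8.0]
spectra = [[v * k for v in base] for k in (1.0, 2.0, 0.5)]
print(pqn_normalize(spectra))
```

Using the median quotient rather than the total intensity makes the estimated dilution factor robust to a few peaks that genuinely change between groups, which is the usual motivation for this method with variable-dilution samples such as breath condensate.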
NASA Astrophysics Data System (ADS)
Tan, Chao; Chen, Hui; Wang, Chao; Zhu, Wanping; Wu, Tong; Diao, Yuanbo
2013-03-01
Near and mid-infrared (NIR/MIR) spectroscopy techniques have gained great acceptance in the industry due to their multiple applications and versatility. However, successful application often depends heavily on the construction of accurate and stable calibration models. For this purpose, a simple multi-model fusion strategy is proposed. It combines a Kohonen self-organizing map (KSOM), mutual information (MI) and partial least squares (PLS) and is therefore named KMICPLS. It works as follows: First, the original training set is fed into a KSOM for unsupervised clustering of samples, on which a series of training subsets are constructed. Thereafter, on each of the training subsets, a MI spectrum is calculated and only the variables with higher MI values than the mean value are retained, based on which a candidate PLS model is constructed. Finally, a fixed number of PLS models are selected to produce a consensus model. Two NIR/MIR spectral datasets from the brewing industry are used for experiments. The results confirm its superior performance over two reference algorithms, i.e., the conventional PLS and genetic algorithm-PLS (GAPLS). It can build more accurate and stable calibration models without increasing the complexity, and can be generalized to other NIR/MIR applications.
Towards molecular design using 2D-molecular contour maps obtained from PLS regression coefficients
NASA Astrophysics Data System (ADS)
Borges, Cleber N.; Barigye, Stephen J.; Freitas, Matheus P.
2017-12-01
The multivariate image analysis descriptors used in quantitative structure-activity relationships are direct representations of chemical structures as they are simply numerical decodifications of pixels forming the 2D chemical images. These MDs have found great utility in the modeling of diverse properties of organic molecules. Given the multicollinearity and high dimensionality of the data matrices generated with the MIA-QSAR approach, modeling techniques that involve the projection of the data space onto orthogonal components, e.g. Partial Least Squares (PLS), have been generally used. However, the chemical interpretation of the PLS-based MIA-QSAR models, in terms of the structural moieties affecting the modeled bioactivity, has not been straightforward. This work describes the 2D-contour maps based on the PLS regression coefficients, as a means of assessing the relevance of single MIA predictors to the response variable, and thus allowing for the structural, electronic and physicochemical interpretation of the MIA-QSAR models. A sample study to demonstrate the utility of the 2D-contour maps to design novel drug-like molecules is performed using a dataset of some anti-HIV-1 2-amino-6-arylsulfonylbenzonitriles and derivatives, and the inferences obtained are consistent with other reports in the literature. In addition, the different schemes for encoding atomic properties in molecules are discussed and evaluated.
Simultaneous determination of specific alpha and beta emitters by LSC-PLS in water samples.
Fons-Castells, J; Tent-Petrus, J; Llauradó, M
2017-01-01
Liquid scintillation counting (LSC) is a commonly used technique for the determination of alpha and beta emitters. However, LSC has poor resolution and the continuous spectra for beta emitters hinder the simultaneous determination of several alpha and beta emitters from the same spectrum. In this paper, the feasibility of multivariate calibration by partial least squares (PLS) models for the determination of several alpha ((nat)U, (241)Am and (226)Ra) and beta emitters ((40)K, (60)Co, (90)Sr/(90)Y, (134)Cs and (137)Cs) in water samples is reported. A set of alpha and beta spectra from radionuclide calibration standards were used to construct three PLS models. Experimentally mixed radionuclides and intercomparison materials were used to validate the models. The results had a maximum relative bias of 25% when all the radionuclides in the sample were included in the calibration set; otherwise the relative bias was over 100% for some radionuclides. The results obtained show that LSC-PLS is a useful approach for the simultaneous determination of alpha and beta emitters in multi-radionuclide samples. However, to obtain useful results, it is important to include all the radionuclides expected in the studied scenario in the calibration set. Copyright © 2016 Elsevier Ltd. All rights reserved.
Experiments on Supervised Learning Algorithms for Text Categorization
NASA Technical Reports Server (NTRS)
Namburu, Setu Madhavi; Tu, Haiying; Luo, Jianhui; Pattipati, Krishna R.
2005-01-01
Modern information society is facing the challenge of handling massive volumes of online documents, news, intelligence reports, and so on. How to use the information accurately and in a timely manner becomes a major concern in many areas. While the general information may also include images and voice, we focus on the categorization of text data in this paper. We provide a brief overview of the information processing flow for text categorization, and discuss two supervised learning algorithms, viz., support vector machines (SVM) and partial least squares (PLS), which have been successfully applied in other domains, e.g., fault diagnosis [9]. While SVM has been well explored for binary classification and was reported as an efficient algorithm for text categorization, PLS has not yet been applied to text categorization. Our experiments are conducted on three data sets: the Reuters-21578 dataset about corporate mergers and data acquisitions (ACQ), WebKB and the 20-Newsgroups. Results show that the performance of PLS is comparable to SVM in text categorization. A major drawback of SVM for multi-class categorization is that it requires a voting scheme based on the results of pair-wise classification. PLS does not have this drawback and could be a better candidate for multi-class text categorization.