Sample records for square regression plsr

  1. Modified locally weighted--partial least squares regression improving clinical predictions from infrared spectra of human serum samples.

    PubMed

    Perez-Guaita, David; Kuligowski, Julia; Quintás, Guillermo; Garrigues, Salvador; Guardia, Miguel de la

    2013-03-30

    Locally weighted partial least squares regression (LW-PLSR) has been applied to the determination of four clinical parameters in human serum samples (total protein, triglyceride, glucose and urea contents) by Fourier transform infrared (FTIR) spectroscopy. Classical LW-PLSR models were constructed using different spectral regions. For the selection of parameters by LW-PLSR modeling, a multi-parametric study was carried out employing the minimum root-mean square error of cross validation (RMSCV) as objective function. In order to overcome the effect of strong matrix interferences on the predictive accuracy of LW-PLSR models, this work focuses on sample selection. Accordingly, a novel strategy for the development of local models is proposed. It was based on the use of: (i) principal component analysis (PCA) performed on an analyte specific spectral region for identifying the most similar sample spectra and (ii) partial least squares regression (PLSR) constructed using the whole spectrum. Results found by using this strategy were compared to those provided by PLSR using the same spectral intervals as for LW-PLSR. Prediction errors found by both classical and modified LW-PLSR improved on those obtained by PLSR. Hence, both proposed approaches were useful for the determination of analytes present in a complex matrix as in the case of human serum samples. Copyright © 2013 Elsevier B.V. All rights reserved.

  2. Partial Least Squares Regression Models for the Analysis of Kinase Signaling.

    PubMed

    Bourgeois, Danielle L; Kreeger, Pamela K

    2017-01-01

    Partial least squares regression (PLSR) is a data-driven modeling approach that can be used to analyze multivariate relationships between kinase networks and cellular decisions or patient outcomes. In PLSR, a linear model relating an X matrix of independent variables and a Y matrix of dependent variables is generated by extracting the factors with the strongest covariation. While the identified relationship is correlative, PLSR models can be used to generate quantitative predictions for new conditions or perturbations to the network, allowing for mechanisms to be identified. This chapter will provide a brief explanation of PLSR and provide an instructive example to demonstrate the use of PLSR to analyze kinase signaling.
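The factor extraction described in this abstract can be illustrated with a minimal single-response PLS (often called PLS1) sketch. This is not the chapter's own example: the function names `pls1_fit` and `pls1_predict` are hypothetical, only one response column is handled, and the deflation scheme shown is one common textbook variant.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[List[float]]
Component = Tuple[Vector, Vector, float]  # (weights w, X-loadings p, y-loading q)


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def pls1_fit(X: Matrix, y: Vector, n_components: int) -> Tuple[Vector, float, List[Component]]:
    """Fit a single-response PLS model (hypothetical helper, illustration only)."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    # Work on mean-centered copies so the caller's data is left untouched.
    E = [[v - mu for v, mu in zip(row, x_mean)] for row in X]
    f = [v - y_mean for v in y]
    components: List[Component] = []
    for _ in range(n_components):
        # Weight vector: the X-direction with the strongest covariation with y.
        w = [dot([row[j] for row in E], f) for j in range(m)]
        norm = dot(w, w) ** 0.5
        if norm < 1e-12:
            break  # no covariation left to extract
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]  # scores of each sample on this factor
        tt = dot(t, t)
        p = [dot([row[j] for row in E], t) / tt for j in range(m)]  # X-loadings
        q = dot(f, t) / tt  # y-loading
        # Deflate: remove what this factor explains before extracting the next.
        E = [[v - ti * pj for v, pj in zip(row, p)] for row, ti in zip(E, t)]
        f = [v - q * ti for v, ti in zip(f, t)]
        components.append((w, p, q))
    return x_mean, y_mean, components


def pls1_predict(model: Tuple[Vector, float, List[Component]], x: Vector) -> float:
    """Predict the response for one new sample by replaying the deflation steps."""
    x_mean, y_mean, components = model
    e = [v - mu for v, mu in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in components:
        t = dot(e, w)
        y_hat += q * t
        e = [v - t * pj for v, pj in zip(e, p)]
    return y_hat
```

With as many factors as the rank of the centered X matrix, this sketch reproduces an ordinary least squares fit; in practice far fewer factors are kept, which is where the method differs from plain regression.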

  3. Rapid Detection of Volatile Oil in Mentha haplocalyx by Near-Infrared Spectroscopy and Chemometrics.

    PubMed

    Yan, Hui; Guo, Cheng; Shao, Yang; Ouyang, Zhen

    2017-01-01

    Near-infrared spectroscopy combined with partial least squares regression (PLSR) and support vector machine (SVM) was applied for the rapid determination of the volatile oil content in Mentha haplocalyx. The effects of data pre-processing methods on the accuracy of the PLSR calibration models were investigated. The performance of the final model was evaluated according to the correlation coefficient (R) and root mean square error of prediction (RMSEP). For the PLSR model, the best preprocessing combination was first-order derivative, standard normal variate transformation (SNV), and mean centering, which gave correlation coefficients of 0.8805 and 0.8719, an RMSEC of 0.091, and an RMSEP of 0.097. The wavenumber variables linked to volatile oil are from 5500 to 4000 cm-1, as determined by analyzing the loading weights and variable importance in projection (VIP) scores. For the SVM model, six LVs (fewer than the seven LVs in the PLSR model) were adopted, and the result was better than that of the PLSR model. The correlation coefficients were 0.9232 and 0.9202, with RMSEC and RMSEP of 0.084 and 0.082, respectively, which indicated that the predicted values were accurate and reliable. This work demonstrated that near-infrared reflectance spectroscopy with chemometrics could be used to rapidly detect the volatile oil content in M. haplocalyx. The quality of medicine is directly linked to clinical efficacy; thus, it is important to control the quality of Mentha haplocalyx.
    Abbreviations used: 1st der: First-order derivative; 2nd der: Second-order derivative; LOO: Leave-one-out; LVs: Latent variables; MC: Mean centering; NIR: Near-infrared; NIRS: Near-infrared spectroscopy; PCR: Principal component regression; PLSR: Partial least squares regression; RBF: Radial basis function; RMSECV: Root mean square error of cross-validation; RMSEC: Root mean square error of calibration; RMSEP: Root mean square error of prediction; SNV: Standard normal variate transformation; SVM: Support vector machine; VIP: Variable importance in projection.
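The preprocessing chain and the error metric named in this record can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the derivative is approximated by adjacent differences (published work often uses a smoothing derivative filter instead), SNV uses the sample standard deviation, and all function names are hypothetical.

```python
import math
from typing import List, Sequence


def first_derivative(spectrum: Sequence[float]) -> List[float]:
    """First-order derivative approximated by adjacent differences (assumption)."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


def snv(spectrum: Sequence[float]) -> List[float]:
    """Standard normal variate: center and scale one spectrum by its own statistics."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    # Sample standard deviation (n - 1 in the denominator) is assumed here.
    std = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    return [(v - mean) / std for v in spectrum]


def mean_center(spectra: Sequence[Sequence[float]]) -> List[List[float]]:
    """Subtract the per-variable mean across all samples."""
    n = len(spectra)
    means = [sum(column) / n for column in zip(*spectra)]
    return [[v - m for v, m in zip(row, means)] for row in spectra]


def rmse(reference: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean square error; called RMSEC or RMSEP depending on the sample set."""
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))
```

The same `rmse` function gives RMSEC when applied to calibration samples and RMSEP when applied to an independent prediction set; only the data passed in changes.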

  4. Application of principal component regression and partial least squares regression in ultraviolet spectrum water quality detection

    NASA Astrophysics Data System (ADS)

    Li, Jiangtong; Luo, Yongdao; Dai, Honglin

    2018-01-01

    Water is the source of life and the essential foundation of all living things. With the development of industrialization, water pollution is becoming more and more frequent, which directly affects human survival and development. Water quality detection is one of the necessary measures to protect water resources. Ultraviolet (UV) spectral analysis is an important research method in the field of water quality detection, in which partial least squares regression (PLSR) is becoming the predominant technique; however, in some special cases, PLSR analysis produces considerable errors. To solve this problem, the traditional principal component regression (PCR) method was improved in this paper by using the principle of PLSR. The experimental results show that, for some special experimental data sets, the improved PCR method performs better than PLSR. PCR and PLSR are the focus of this paper. First, principal component analysis (PCA) is performed in MATLAB to reduce the dimensionality of the spectral data; on the basis of a large number of experiments, the optimized principal component, which carries most of the original data information, is extracted by using the principle of PLSR. Second, linear regression analysis of the principal component is carried out with the Statistical Package for the Social Sciences (SPSS), from which the coefficients and relations of the principal components are obtained. Finally, the same water spectral data set is analyzed by PLSR and by improved PCR and the two results are compared: improved PCR and PLSR give similar results for most data, but improved PCR is better than PLSR for data near the detection limit. Both PLSR and improved PCR can be used in UV spectral analysis of water, but for data near the detection limit, improved PCR gives better results than PLSR.

  5. Hyperspectral analysis of soil organic matter in coal mining regions using wavelets, correlations, and partial least squares regression.

    PubMed

    Lin, Lixin; Wang, Yunjia; Teng, Jiyao; Wang, Xuchen

    2016-02-01

    Hyperspectral estimation of soil organic matter (SOM) in coal mining regions is an important tool for enhancing fertilization in soil restoration programs. The correlation--partial least squares regression (PLSR) method effectively solves the information loss problem of correlation--multiple linear stepwise regression, but results of the correlation analysis must be optimized to improve precision. This study considers the relationship between spectral reflectance and SOM based on spectral reflectance curves of soil samples collected from coal mining regions. Based on the major absorption troughs in the 400-1006 nm spectral range, PLSR analysis was performed using 289 independent bands of the second derivative (SDR) with three levels and measured SOM values. A wavelet-correlation-PLSR (W-C-PLSR) model was then constructed. By amplifying useful information that was previously obscured by noise, the W-C-PLSR model was optimal for estimating SOM content, with smaller prediction errors in both calibration (R(2) = 0.970, root mean square error (RMSEC) = 3.10, and mean relative error (MREC) = 8.75) and validation (RMSEV = 5.85 and MREV = 14.32) analyses, as compared with other models. Results indicate that W-C-PLSR has great potential to estimate SOM in coal mining regions.

  6. Hierarchical cluster-based partial least squares regression (HC-PLSR) is an efficient tool for metamodelling of nonlinear dynamic models.

    PubMed

    Tøndel, Kristin; Indahl, Ulf G; Gjuvsland, Arne B; Vik, Jon Olav; Hunter, Peter; Omholt, Stig W; Martens, Harald

    2011-06-01

    Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback loops. 
HC-PLSR is a promising approach for metamodelling in systems biology, especially for highly nonlinear or non-monotone parameter to phenotype maps. The algorithm can be flexibly adjusted to suit the complexity of the dynamic model behaviour, inviting automation in the metamodelling of complex systems.

  7. Hierarchical Cluster-based Partial Least Squares Regression (HC-PLSR) is an efficient tool for metamodelling of nonlinear dynamic models

    PubMed Central

    2011-01-01

    Background: Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Results: Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback loops.
    Conclusions: HC-PLSR is a promising approach for metamodelling in systems biology, especially for highly nonlinear or non-monotone parameter to phenotype maps. The algorithm can be flexibly adjusted to suit the complexity of the dynamic model behaviour, inviting automation in the metamodelling of complex systems. PMID:21627852

  8. Partial Least Squares Regression Calibration of an Ultraviolet-Visible Spectrophotometer for Measurements of Chemical Oxygen Demand in Dye Wastewater

    NASA Astrophysics Data System (ADS)

    Mai, W.; Zhang, J.-F.; Zhao, X.-M.; Li, Z.; Xu, Z.-W.

    2017-11-01

    Wastewater from the dye industry is typically analyzed using a standard method for measurement of chemical oxygen demand (COD) or by a single-wavelength spectroscopic method. To overcome the disadvantages of these methods, ultraviolet-visible (UV-Vis) spectroscopy was combined with principal component regression (PCR) and partial least squares regression (PLSR) in this study. Unlike the standard method, this method does not require digestion of the samples for preparation. Experiments showed that the PLSR model offered high prediction performance for COD, with a mean relative error of about 5% for two dyes. This error is similar to that obtained with the standard method. In this study, the precision of the PLSR model decreased with the number of dye compounds present. It is likely that multiple models will be required in reality, and the complexity of a COD monitoring system would be greatly reduced if the PLSR model is used because it can include several dyes. UV-Vis spectroscopy with PLSR successfully enhanced the performance of COD prediction for dye wastewater and showed good potential for application in on-line water quality monitoring.

  9. Acidity measurement of iron ore powders using laser-induced breakdown spectroscopy with partial least squares regression.

    PubMed

    Hao, Z Q; Li, C M; Shen, M; Yang, X Y; Li, K H; Guo, L B; Li, X Y; Lu, Y F; Zeng, X Y

    2015-03-23

    Laser-induced breakdown spectroscopy (LIBS) with partial least squares regression (PLSR) has been applied to measuring the acidity of iron ore, which can be defined by the concentrations of oxides: CaO, MgO, Al₂O₃, and SiO₂. With the conventional internal standard calibration, it is difficult to establish the calibration curves of CaO, MgO, Al₂O₃, and SiO₂ in iron ore due to the serious matrix effects. PLSR is effective to address this problem due to its excellent performance in compensating the matrix effects. In this work, fifty samples were used to construct the PLSR calibration models for the above-mentioned oxides. These calibration models were validated by the 10-fold cross-validation method with the minimum root-mean-square errors (RMSE). Another ten samples were used as a test set. The acidities were calculated according to the estimated concentrations of CaO, MgO, Al₂O₃, and SiO₂ using the PLSR models. The average relative error (ARE) and RMSE of the acidity achieved 3.65% and 0.0048, respectively, for the test samples.
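The validation scheme in this record (10-fold cross-validation on fifty samples, then error statistics on a separate test set) can be sketched generically. The helper names, the contiguous unshuffled folds, and the definition of average relative error used below are assumptions for illustration; the abstract does not state them.

```python
import math
from typing import Any, Callable, Iterator, List, Sequence, Tuple


def kfold_indices(n_samples: int, n_folds: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists; folds are contiguous and not shuffled."""
    base, extra = divmod(n_samples, n_folds)
    start = 0
    for fold in range(n_folds):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def cross_val_rmse(
    X: Sequence[Any],
    y: Sequence[float],
    n_folds: int,
    fit: Callable[[List[Any], List[float]], Any],
    predict: Callable[[Any, Any], float],
) -> float:
    """Pool the squared errors of all held-out samples and return their RMSE."""
    squared_errors: List[float] = []
    for train, test in kfold_indices(len(y), n_folds):
        model = fit([X[i] for i in train], [y[i] for i in train])
        for i in test:
            squared_errors.append((predict(model, X[i]) - y[i]) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def average_relative_error(reference: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean of |predicted - reference| / |reference|, in percent (assumed definition)."""
    ratios = [abs(p - r) / abs(r) for r, p in zip(reference, predicted)]
    return 100.0 * sum(ratios) / len(ratios)
```

Any regression model can be plugged in through the `fit` and `predict` callables; repeating the call for different numbers of latent variables and keeping the setting with the smallest value is one way to read "minimum RMSE" in the abstract.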

  10. Spectroscopic Determination of Aboveground Biomass in Grasslands Using Spectral Transformations, Support Vector Machine and Partial Least Squares Regression

    PubMed Central

    Marabel, Miguel; Alvarez-Taboada, Flor

    2013-01-01

    Aboveground biomass (AGB) is one of the strategic biophysical variables of interest in vegetation studies. The main objective of this study was to evaluate the Support Vector Machine (SVM) and Partial Least Squares Regression (PLSR) for estimating the AGB of grasslands from field spectrometer data and to find out which data pre-processing approach was the most suitable. The most accurate model to predict the total AGB involved PLSR and the Maximum Band Depth index derived from the continuum removed reflectance in the absorption features between 916–1,120 nm and 1,079–1,297 nm (R2 = 0.939, RMSE = 7.120 g/m2). Regarding the green fraction of the AGB, the Area Over the Minimum index derived from the continuum removed spectra provided the most accurate model overall (R2 = 0.939, RMSE = 3.172 g/m2). Identifying the appropriate absorption features was proved to be crucial to improve the performance of PLSR to estimate the total and green aboveground biomass, by using the indices derived from those spectral regions. Ordinary Least Square Regression could be used as a surrogate for the PLSR approach with the Area Over the Minimum index as the independent variable, although the resulting model would not be as accurate. PMID:23925082

  11. Soil salt content estimation in the Yellow River delta with satellite hyperspectral data

    USGS Publications Warehouse

    Weng, Yongling; Gong, Peng; Zhu, Zhi-Liang

    2008-01-01

    Soil salinization is one of the most common land degradation processes and is a severe environmental hazard. The primary objective of this study is to investigate the potential of predicting salt content in soils with hyperspectral data acquired with EO-1 Hyperion. Both partial least-squares regression (PLSR) and conventional multiple linear regression (MLR), such as stepwise regression (SWR), were tested as the prediction model. PLSR is commonly used to overcome the problem caused by high-dimensional and correlated predictors. Chemical analysis of 95 samples collected from the top layer of soils in the Yellow River delta area shows that salt content was high on average, and the dominant chemicals in the saline soil were NaCl and MgCl2. Multivariate models were established between soil contents and hyperspectral data. Our results indicate that the PLSR technique with laboratory spectral data has a strong prediction capacity. Spectral bands at 1487-1527, 1971-1991, 2032-2092, and 2163-2355 nm possessed large absolute values of regression coefficients, with the largest coefficient at 2203 nm. We obtained a root mean squared error (RMSE) for calibration (with 61 samples) of RMSEC = 0.753 (R2 = 0.893) and a root mean squared error for validation (with 30 samples) of RMSEV = 0.574. The prediction model was applied on a pixel-by-pixel basis to a Hyperion reflectance image to yield a quantitative surface distribution map of soil salt content. The result was validated successfully from 38 sampling points. We obtained an RMSE estimate of 1.037 (R2 = 0.784) for the soil salt content map derived by the PLSR model. The salinity map derived from the SWR model shows that the predicted value is higher than the true value. These results demonstrate that the PLSR method is a more suitable technique than stepwise regression for quantitative estimation of soil salt content in a large area. © 2008 CASI.

  12. The comparison of robust partial least squares regression with robust principal component regression on a real

    NASA Astrophysics Data System (ADS)

    Polat, Esra; Gunay, Suleyman

    2013-10-01

    One of the problems encountered in Multiple Linear Regression (MLR) is multicollinearity, which causes overestimation of the regression parameters and inflates their variance. Hence, when multicollinearity is present, biased estimation procedures such as classical Principal Component Regression (CPCR) and Partial Least Squares Regression (PLSR) are performed. The SIMPLS algorithm is the leading PLSR algorithm because of its speed and efficiency and because its results are easier to interpret. However, both CPCR and SIMPLS yield very unreliable results when the data set contains outlying observations. Therefore, Hubert and Vanden Branden (2003) presented a robust PCR (RPCR) method and a robust PLSR (RPLSR) method called RSIMPLS. In RPCR, a robust Principal Component Analysis (PCA) method for high-dimensional data is first applied to the independent variables; the dependent variables are then regressed on the scores using a robust regression method. RSIMPLS is constructed from a robust covariance matrix for high-dimensional data and robust linear regression. The purpose of this study is to show the use of the RPCR and RSIMPLS methods on an econometric data set by comparing the two methods on an inflation model of Turkey. The methods are compared in terms of predictive ability and goodness of fit by using a robust Root Mean Squared Error of Cross-validation (R-RMSECV), a robust R2 value and the Robust Component Selection (RCS) statistic.

  13. [Variable selection methods combined with local linear embedding theory used for optimization of near infrared spectral quantitative models].

    PubMed

    Hao, Yong; Sun, Xu-Dong; Yang, Qiang

    2012-12-01

    A variable selection strategy combined with local linear embedding (LLE) was introduced for the analysis of complex samples by near infrared spectroscopy (NIRS). Three methods, namely Monte Carlo uninformative variable elimination (MCUVE), the successive projections algorithm (SPA), and MCUVE combined with SPA, were used to eliminate redundant spectral variables. Partial least squares regression (PLSR) and LLE-PLSR were used for modeling complex samples. The results showed that MCUVE can both extract informative variables and improve the precision of models. Compared with PLSR models, LLE-PLSR models can achieve more accurate analysis results. MCUVE combined with LLE-PLSR is an effective modeling method for NIRS quantitative analysis.

  14. Water Quality Variable Estimation using Partial Least Squares Regression and Multi-Scale Remote Sensing.

    NASA Astrophysics Data System (ADS)

    Peterson, K. T.; Wulamu, A.

    2017-12-01

    Water, essential to all living organisms, is one of the Earth's most precious resources. Remote sensing offers an ideal approach to monitoring water quality compared with traditional in-situ techniques, which are highly time and resource consuming. Using a multi-scale approach, data from handheld spectroscopy, UAS-based hyperspectral imaging, and satellite multispectral images were collected in coordination with in-situ water quality samples for two midwestern watersheds. The remote sensing data were modeled and correlated to the in-situ water quality variables, including chlorophyll content (Chl), turbidity, and total dissolved solids (TDS), using Normalized Difference Spectral Indices (NDSI) and Partial Least Squares Regression (PLSR). The results of the study supported the original hypothesis that correlating water quality variables with remotely sensed data benefits greatly from the use of more complex modeling and regression techniques such as PLSR. The PLSR analysis yielded much higher R2 values for all variables when compared to NDSI. The combination of NDSI and PLSR analysis also identified key wavelengths that aligned with the findings of previous studies. This research shows the advantages and future potential of complex modeling and machine learning techniques for improving water quality variable estimation from spectral data.
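For readers unfamiliar with the simpler of the two approaches compared here, a normalized difference spectral index is conventionally the difference of two band reflectances divided by their sum. The sketch below assumes that conventional form and enumerates every band pair; the function names are hypothetical and the abstract does not say how band pairs were chosen.

```python
from itertools import combinations
from typing import Dict, Sequence, Tuple


def ndsi(r_i: float, r_j: float) -> float:
    """Normalized difference of two band reflectances: (Ri - Rj) / (Ri + Rj)."""
    return (r_i - r_j) / (r_i + r_j)


def ndsi_all_pairs(reflectance: Sequence[float]) -> Dict[Tuple[int, int], float]:
    """Compute the index for every unordered pair of band positions (i < j)."""
    return {
        (i, j): ndsi(reflectance[i], reflectance[j])
        for i, j in combinations(range(len(reflectance)), 2)
    }
```

Each index uses only two bands at a time, whereas PLSR draws on the full spectrum at once, which is one plausible reading of why the abstract reports higher R2 values for PLSR.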

  15. High and low frequency unfolded partial least squares regression based on empirical mode decomposition for quantitative analysis of fuel oil samples.

    PubMed

    Bian, Xihui; Li, Shujuan; Lin, Ligang; Tan, Xiaoyao; Fan, Qingjie; Li, Ming

    2016-06-21

    Accurate model prediction is fundamental to the successful analysis of complex samples. To utilize the abundant information embedded in the frequency and time domains, a novel regression model is presented for quantitative analysis of hydrocarbon contents in fuel oil samples. The proposed method, named high and low frequency unfolded PLSR (HLUPLSR), integrates empirical mode decomposition (EMD) and an unfolding strategy with partial least squares regression (PLSR). In the proposed method, the original signals are first decomposed into a finite number of intrinsic mode functions (IMFs) and a residue by EMD. Second, the former high frequency IMFs are summed as a high frequency matrix and the latter IMFs and residue are summed as a low frequency matrix. Finally, the two matrices are unfolded to an extended matrix in the variable dimension, and the PLSR model is then built between the extended matrix and the target values. Coupled with ultraviolet (UV) spectroscopy, HLUPLSR has been applied to determine hydrocarbon contents of light gas oil and diesel fuel samples. Compared with single PLSR and other signal processing techniques, the proposed method shows superior prediction ability and better model interpretation. Therefore, the HLUPLSR method provides a promising tool for quantitative analysis of complex samples. Copyright © 2016 Elsevier B.V. All rights reserved.
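The summing and unfolding steps described in this abstract can be sketched for a single spectrum, assuming the EMD step has already produced the IMFs and the residue. The function name and the `n_high` split point are hypothetical; the abstract does not say how the boundary between high and low frequency IMFs is chosen.

```python
from typing import List, Sequence


def unfold_high_low(
    imfs: Sequence[Sequence[float]],
    residue: Sequence[float],
    n_high: int,
) -> List[float]:
    """Sum the first n_high IMFs and the remaining parts, then concatenate both sums."""
    if not 1 <= n_high <= len(imfs):
        raise ValueError("n_high must be between 1 and the number of IMFs")
    # High frequency part: channel-wise sum of the leading IMFs.
    high = [sum(values) for values in zip(*imfs[:n_high])]
    # Low frequency part: channel-wise sum of the trailing IMFs plus the residue.
    low_parts = list(imfs[n_high:]) + [residue]
    low = [sum(values) for values in zip(*low_parts)]
    # Unfolding in the variable dimension: the row becomes twice as long.
    return high + low
```

Applying this to every spectrum gives the extended matrix with twice as many variables, on which an ordinary PLSR model is then fitted.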

  16. Evaluation of the prediction precision capability of partial least squares regression approach for analysis of high alloy steel by laser induced breakdown spectroscopy

    NASA Astrophysics Data System (ADS)

    Sarkar, Arnab; Karki, Vijay; Aggarwal, Suresh K.; Maurya, Gulab S.; Kumar, Rohit; Rai, Awadhesh K.; Mao, Xianglei; Russo, Richard E.

    2015-06-01

    Laser induced breakdown spectroscopy (LIBS) was applied for elemental characterization of high alloy steel using partial least squares regression (PLSR) with an objective to evaluate the analytical performance of this multivariate approach. The optimization of the number of principal components for minimizing error in PLSR algorithm was investigated. The effect of different pre-treatment procedures on the raw spectral data before PLSR analysis was evaluated based on several statistical (standard error of prediction, percentage relative error of prediction etc.) parameters. The pre-treatment with "NORM" parameter gave the optimum statistical results. The analytical performance of PLSR model improved by increasing the number of laser pulses accumulated per spectrum as well as by truncating the spectrum to appropriate wavelength region. It was found that the statistical benefit of truncating the spectrum can also be accomplished by increasing the number of laser pulses per accumulation without spectral truncation. The constituents (Co and Mo) present in hundreds of ppm were determined with relative precision of 4-9% (2σ), whereas the major constituents Cr and Ni (present at a few percent levels) were determined with a relative precision of ~2% (2σ).

  17. [Research on optimal modeling strategy for licorice extraction process based on near-infrared spectroscopy technology].

    PubMed

    Wang, Hai-Xia; Suo, Tong-Chuan; Yu, He-Shui; Li, Zheng

    2016-10-01

    The manufacture of traditional Chinese medicine (TCM) products is always accompanied by processing complex raw materials and real-time monitoring of the manufacturing process. In this study, we investigated different modeling strategies for the extraction process of licorice. Near-infrared spectra associated with the extraction time were used to determine the states of the extraction processes. Three modeling approaches, i.e., principal component analysis (PCA), partial least squares regression (PLSR) and parallel factor analysis-PLSR (PARAFAC-PLSR), were adopted for the prediction of the real-time status of the process. The overall results indicated that PCA, PLSR and PARAFAC-PLSR can effectively detect errors in the extraction procedure and predict the process trajectories, which is of great significance for monitoring and controlling the extraction processes. Copyright © by the Chinese Pharmaceutical Association.

  18. Quantitative Analysis of Single and Mix Food Antiseptics Basing on SERS Spectra with PLSR Method

    NASA Astrophysics Data System (ADS)

    Hou, Mengjing; Huang, Yu; Ma, Lingwei; Zhang, Zhengjun

    2016-06-01

    The usage and dosage of food antiseptics are of great concern because of their decisive influence on food safety. The surface-enhanced Raman scattering (SERS) effect was employed in this research to detect trace potassium sorbate (PS) and sodium benzoate (SB). An HfO2 ultrathin film-coated Ag NR array was fabricated as the SERS substrate. Protected by the HfO2 film, the SERS substrate possesses good acid resistance, which makes it applicable in the acidic environments where PS and SB work. The regression relationship between the SERS spectra of 0.3~10 mg/L PS solutions and their concentrations was calibrated by the partial least squares regression (PLSR) method, and the concentration prediction performance was quite satisfactory. Furthermore, mixture solutions of PS and SB were also quantitatively analyzed by the PLSR method. Spectral data from the characteristic peak sections corresponding to PS and SB were used to establish the regression models of these two solutes, respectively, and their concentrations were determined accurately despite the overlap of their characteristic peak sections. It is possible that the unique modeling process of the PLSR method prevented the overlapped Raman signal from reducing the model accuracy.

  19. Detection of melamine in milk powders using near-infrared hyperspectral imaging combined with regression coefficient of partial least square regression model.

    PubMed

    Lim, Jongguk; Kim, Giyoung; Mo, Changyeun; Kim, Moon S; Chao, Kuanglin; Qin, Jianwei; Fu, Xiaping; Baek, Insuck; Cho, Byoung-Kwan

    2016-05-01

    Illegal use of nitrogen-rich melamine (C3H6N6) to boost perceived protein content of food products such as milk, infant formula, frozen yogurt, pet food, biscuits, and coffee drinks has caused serious food safety problems. Conventional methods to detect melamine in foods, such as Enzyme-linked immunosorbent assay (ELISA), High-performance liquid chromatography (HPLC), and Gas chromatography-mass spectrometry (GC-MS), are sensitive but they are time-consuming, expensive, and labor-intensive. In this research, near-infrared (NIR) hyperspectral imaging technique combined with regression coefficient of partial least squares regression (PLSR) model was used to detect melamine particles in milk powders easily and quickly. NIR hyperspectral reflectance imaging data in the spectral range of 990-1700nm were acquired from melamine-milk powder mixture samples prepared at various concentrations ranging from 0.02% to 1%. PLSR models were developed to correlate the spectral data (independent variables) with melamine concentration (dependent variables) in melamine-milk powder mixture samples. PLSR models applying various pretreatment methods were used to reconstruct the two-dimensional PLS images. PLS images were converted to the binary images to detect the suspected melamine pixels in milk powder. As the melamine concentration was increased, the numbers of suspected melamine pixels of binary images were also increased. These results suggested that NIR hyperspectral imaging technique and the PLSR model can be regarded as an effective tool to detect melamine particles in milk powders. Copyright © 2016 Elsevier B.V. All rights reserved.
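The pixel-wise use of regression coefficients described in this abstract can be sketched as two small steps: score each pixel spectrum with a coefficient vector, then threshold the resulting image. The coefficient values, the threshold, and the function names are placeholders for illustration; the abstract does not report how the authors set the threshold.

```python
from typing import List, Sequence

Image = List[List[float]]


def pls_image(
    cube: Sequence[Sequence[Sequence[float]]],
    coefficients: Sequence[float],
    intercept: float = 0.0,
) -> Image:
    """Apply a regression coefficient vector to every pixel spectrum of a cube."""
    return [
        [intercept + sum(b * v for b, v in zip(coefficients, pixel)) for pixel in row]
        for row in cube
    ]


def to_binary(image: Image, threshold: float) -> List[List[int]]:
    """Mark pixels whose predicted value reaches the (assumed) threshold."""
    return [[1 if value >= threshold else 0 for value in row] for row in image]


def count_suspected(binary: Sequence[Sequence[int]]) -> int:
    """Number of pixels flagged as suspected melamine."""
    return sum(sum(row) for row in binary)
```

Counting flagged pixels per sample is what lets the abstract relate pixel counts to melamine concentration.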

  20. Non-destructive and rapid prediction of moisture content in red pepper (Capsicum annuum L.) powder using near-infrared spectroscopy and a partial least squares regression model

    USDA-ARS's Scientific Manuscript database

    Purpose: The aim of this study was to develop a technique for the non-destructive and rapid prediction of the moisture content in red pepper powder using near-infrared (NIR) spectroscopy and a partial least squares regression (PLSR) model. Methods: Three red pepper powder products were separated in...

  1. Baseline Correction of Diffuse Reflection Near-Infrared Spectra Using Searching Region Standard Normal Variate (SRSNV).

    PubMed

    Genkawa, Takuma; Shinzawa, Hideyuki; Kato, Hideaki; Ishikawa, Daitaro; Murayama, Kodai; Komiyama, Makoto; Ozaki, Yukihiro

    2015-12-01

    An alternative baseline correction method for diffuse reflection near-infrared (NIR) spectra, searching region standard normal variate (SRSNV), was proposed. Standard normal variate (SNV) is an effective pretreatment method for baseline correction of diffuse reflection NIR spectra of powder and granular samples; however, its baseline correction performance depends on the NIR region used for SNV calculation. To search for an optimal NIR region for baseline correction using SNV, SRSNV employs moving window partial least squares regression (MWPLSR), and an optimal NIR region is identified based on the root mean square error (RMSE) of cross-validation of the partial least squares regression (PLSR) models with the first latent variable (LV). The performance of SRSNV was evaluated using diffuse reflection NIR spectra of mixture samples consisting of wheat flour and granular glucose (0-100% glucose at 5% intervals). From the obtained NIR spectra of the mixture in the 10 000-4000 cm(-1) region at 4 cm(-1) intervals (1501 spectral channels), a series of spectral windows consisting of 80 spectral channels was constructed, and then SNV spectra were calculated for each spectral window. Using these SNV spectra, a series of PLSR models with the first LV for glucose concentration was built. A plot of RMSE versus the spectral window position obtained using the PLSR models revealed that the 8680–8364 cm(-1) region was optimal for baseline correction using SNV. In the SNV spectra calculated using the 8680–8364 cm(-1) region (SRSNV spectra), a remarkable relative intensity change between a band due to wheat flour at 8500 cm(-1) and that due to glucose at 8364 cm(-1) was observed owing to successful baseline correction using SNV. A PLSR model with the first LV based on the SRSNV spectra yielded a determination coefficient (R2) of 0.999 and an RMSE of 0.70%, while a PLSR model with three LVs based on SNV spectra calculated in the full spectral region gave an R2 of 0.995 and an RMSE of 2.29%. Additional evaluation of SRSNV was carried out using diffuse reflection NIR spectra of marzipan and corn samples, and PLSR models based on SRSNV spectra showed good prediction results. These evaluation results indicate that SRSNV is effective in baseline correction of diffuse reflection NIR spectra and provides regression models with good prediction accuracy.
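
    The window search described in this abstract can be illustrated with a small pure-Python sketch: compute SNV inside each candidate window, fit a one-latent-variable PLS1 model, and keep the window with the lowest cross-validated RMSE. It rests on several assumptions not stated in the abstract: leave-one-out cross-validation is used, the function names are invented for the example, and the toy spectra are synthetic rather than the wheat flour and glucose data.

```python
import math


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    if sd == 0.0:  # flat window: nothing to scale
        return [0.0] * n
    return [(v - mean) / sd for v in spectrum]


def pls1_one_lv_predict(X_train, y_train, X_test):
    """Fit a PLS1 model with a single latent variable and predict for X_test."""
    n, p = len(X_train), len(X_train[0])
    x_mean = [sum(row[j] for row in X_train) / n for j in range(p)]
    y_mean = sum(y_train) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X_train]
    yc = [v - y_mean for v in y_train]
    # Weight vector is proportional to X^T y (covariance between channels and y).
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    if norm == 0.0:  # no covariance with y: fall back to the mean
        return [y_mean for _ in X_test]
    w = [v / norm for v in w]
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return [y_mean + q * sum((row[j] - x_mean[j]) * w[j] for j in range(p))
            for row in X_test]


def rmsecv_loo(X, y):
    """Leave-one-out RMSE of cross-validation for the one-LV model (assumed CV scheme)."""
    squared = []
    for i in range(len(X)):
        pred = pls1_one_lv_predict(X[:i] + X[i + 1:], y[:i] + y[i + 1:], [X[i]])[0]
        squared.append((pred - y[i]) ** 2)
    return math.sqrt(sum(squared) / len(squared))


def search_region(spectra, y, window):
    """Return (start_index, rmsecv) of the window whose SNV spectra give the lowest RMSECV."""
    best = None
    for start in range(len(spectra[0]) - window + 1):
        X = [snv(s[start:start + window]) for s in spectra]
        score = rmsecv_loo(X, y)
        if best is None or score < best[1]:
            best = (start, score)
    return best


# Synthetic toy data: 5 samples, 6 channels (not real NIR spectra).
y = [0.0, 25.0, 50.0, 75.0, 100.0]
spectra = [[0.5 + 0.01 * i * ch + 0.1 * math.sin(ch + i) for ch in range(6)]
           for i in range(5)]
print(search_region(spectra, y, window=3))
```

    With real data the window would span 80 channels of the 1501-channel spectra, and a library PLS implementation would normally replace the hand-written one-LV fit.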

  2. Quantitative structure-retention relationship studies with immobilized artificial membrane chromatography II: partial least squares regression.

    PubMed

    Li, Jie; Sun, Jin; He, Zhonggui

    2007-01-26

    We aimed to establish quantitative structure-retention relationship (QSRR) with immobilized artificial membrane (IAM) chromatography using easily understood and obtained physicochemical molecular descriptors and to elucidate which descriptors are critical to affect the interaction process between solutes and immobilized phospholipid membranes. The retention indices (logk(IAM)) of 55 structurally diverse drugs were determined on an immobilized artificial membrane column (IAM.PC.DD2) directly or obtained by extrapolation method for highly hydrophobic compounds. Ten simple physicochemical property descriptors (clogP, rings, rotatory bond, hydro-bond counting, etc.) of these drugs were collected and used to establish QSRR and predict the retention data by partial least squares regression (PLSR). Five descriptors, clogP, rotatory bond (RotB), rings, molecular weight (MW) and total surface area (TSA), were reserved by using the Variable Importance for Projection (VIP) values as criterion to build the final PLSR model. An external test set was employed to verify the QSRR based on the training set with the five variables, and QSRR by PLSR exhibited a satisfying predictive ability with R(p)=0.902 and RMSE(p)=0.400. Comparison of coefficients of centered and scaled variables by PLSR demonstrated that, for the descriptors studied, clogP and TSA have the most significant positive effect but the rotatable bond has significant negative effect on drug IAM chromatographic retention.
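
    Comparing regression coefficients across descriptors, as in the last sentence above, is only meaningful when the descriptors are on a common scale. A minimal sketch of centering and scaling (autoscaling) is given below; the descriptor values are made up for illustration and the function name is an assumption, not something taken from the paper.

```python
import statistics


def autoscale(columns):
    """Centre each descriptor to mean 0 and scale it to unit sample SD."""
    scaled = {}
    for name, values in columns.items():
        mean = statistics.mean(values)
        sd = statistics.stdev(values)
        scaled[name] = [(v - mean) / sd for v in values]
    return scaled


# Made-up descriptor values on very different scales.
descriptors = {
    "clogP": [1.0, 2.0, 3.0],
    "MW": [100.0, 300.0, 500.0],
}
scaled = autoscale(descriptors)
print(scaled)
```

    After autoscaling, both descriptors span the same range, so the size and sign of their coefficients in a fitted model can be compared directly.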

  3. Determination of total iron-reactive phenolics, anthocyanins and tannins in wine grapes of skins and seeds based on near-infrared hyperspectral imaging.

    PubMed

    Zhang, Ni; Liu, Xu; Jin, Xiaoduo; Li, Chen; Wu, Xuan; Yang, Shuqin; Ning, Jifeng; Yanne, Paul

    2017-12-15

    Phenolics contents in wine grapes are key indicators for assessing ripeness. Near-infrared hyperspectral images during ripening have been explored to achieve an effective method for predicting phenolics contents. Principal component regression (PCR), partial least squares regression (PLSR) and support vector regression (SVR) models were built, respectively. The results show that SVR behaves globally better than PLSR and PCR, except in predicting tannins content of seeds. For the best prediction results, the squared correlation coefficient and root mean square error reached 0.8960 and 0.1069g/L (+)-catechin equivalents (CE), respectively, for tannins in skins, 0.9065 and 0.1776 (g/L CE) for total iron-reactive phenolics (TIRP) in skins, 0.8789 and 0.1442 (g/L M3G) for anthocyanins in skins, 0.9243 and 0.2401 (g/L CE) for tannins in seeds, and 0.8790 and 0.5190 (g/L CE) for TIRP in seeds. Our results indicated that NIR hyperspectral imaging has good prospects for evaluation of phenolics in wine grapes. Copyright © 2017 Elsevier Ltd. All rights reserved.

  4. Feasibility of using near infrared spectroscopy to detect and quantify an adulterant in high quality sandalwood oil.

    PubMed

    Kuriakose, Saji; Joe, I Hubert

    2013-11-01

    Determination of the authenticity of essential oils has become more significant, in recent years, following some illegal adulteration and contamination scandals. The present investigative study focuses on the application of near infrared spectroscopy to detect sample authenticity and quantify economic adulteration of sandalwood oils. Several data pre-treatments are investigated for calibration and prediction using partial least square regression (PLSR). The quantitative data analysis is done using a new spectral approach - full spectrum or sequential spectrum. The optimum number of PLS components is obtained according to the lowest root mean square error of calibration (RMSEC=0.00009% v/v). The lowest root mean square error of prediction (RMSEP=0.00016% v/v) in the test set and the highest coefficient of determination (R(2)=0.99989) are used as the evaluation tools for the best model. A nonlinear method, locally weighted regression (LWR), is added to extract nonlinear information and to compare with the linear PLSR model. Copyright © 2013 Elsevier B.V. All rights reserved.

  5. Feasibility of using near infrared spectroscopy to detect and quantify an adulterant in high quality sandalwood oil

    NASA Astrophysics Data System (ADS)

    Kuriakose, Saji; Joe, I. Hubert

    2013-11-01

    Determination of the authenticity of essential oils has become more significant, in recent years, following some illegal adulteration and contamination scandals. The present investigative study focuses on the application of near infrared spectroscopy to detect sample authenticity and quantify economic adulteration of sandalwood oils. Several data pre-treatments are investigated for calibration and prediction using partial least square regression (PLSR). The quantitative data analysis is done using a new spectral approach - full spectrum or sequential spectrum. The optimum number of PLS components is obtained according to the lowest root mean square error of calibration (RMSEC = 0.00009% v/v). The lowest root mean square error of prediction (RMSEP = 0.00016% v/v) in the test set and the highest coefficient of determination (R2 = 0.99989) are used as the evaluation tools for the best model. A nonlinear method, locally weighted regression (LWR), is added to extract nonlinear information and to compare with the linear PLSR model.

  6. A deep belief network with PLSR for nonlinear system modeling.

    PubMed

    Qiao, Junfei; Wang, Gongming; Li, Wenjing; Li, Xiaoli

    2018-08-01

    Nonlinear system modeling plays an important role in practical engineering, and deep learning-based deep belief network (DBN) is now popular in nonlinear system modeling and identification because of the strong learning ability. However, the existing weights optimization for DBN is based on gradient, which always leads to a local optimum and a poor training result. In this paper, a DBN with partial least square regression (PLSR-DBN) is proposed for nonlinear system modeling, which focuses on the problem of weights optimization for DBN using PLSR. Firstly, unsupervised contrastive divergence (CD) algorithm is used in weights initialization. Secondly, initial weights derived from CD algorithm are optimized through layer-by-layer PLSR modeling from top layer to bottom layer. Instead of gradient method, PLSR-DBN can determine the optimal weights using several PLSR models, so that a better performance of PLSR-DBN is achieved. Then, the analysis of convergence is theoretically given to guarantee the effectiveness of the proposed PLSR-DBN model. Finally, the proposed PLSR-DBN is tested on two benchmark nonlinear systems and an actual wastewater treatment system as well as a handwritten digit recognition (nonlinear mapping and modeling) with high-dimension input data. The experiment results show that the proposed PLSR-DBN has better performances of time and accuracy on nonlinear system modeling than that of other methods. Copyright © 2017 Elsevier Ltd. All rights reserved.

  7. Developing a NIR multispectral imaging for prediction and visualization of peanut protein content using variable selection algorithms

    NASA Astrophysics Data System (ADS)

    Cheng, Jun-Hu; Jin, Huali; Liu, Zhiwei

    2018-01-01

    The feasibility of developing a multispectral imaging method using important wavelengths from hyperspectral images selected by genetic algorithm (GA), successive projection algorithm (SPA) and regression coefficient (RC) methods for modeling and predicting protein content in peanut kernels was investigated for the first time. A partial least squares regression (PLSR) calibration model was established between the spectral data from the selected optimal wavelengths and the reference measured protein content, which ranged from 23.46% to 28.43%. The RC-PLSR model established using eight key wavelengths (1153, 1567, 1972, 2143, 2288, 2339, 2389 and 2446 nm) showed the best predictive results, with a coefficient of determination of prediction (R2P) of 0.901, a root mean square error of prediction (RMSEP) of 0.108 and a residual predictive deviation (RPD) of 2.32. Based on the obtained best model and image processing algorithms, the distribution maps of protein content were generated. The overall results of this study indicated that developing a rapid and online multispectral imaging system using the feature wavelengths and PLSR analysis is feasible for determination of the protein content in peanut kernels.
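
    The three figures of merit quoted in this record (R2P, RMSEP and RPD) recur throughout the listing, so a short sketch of how they are commonly computed may help. The definitions below are assumptions, since the abstracts do not give formulas: R2 is taken as 1 - SS_res/SS_tot (some papers report the squared correlation instead) and RPD as the sample standard deviation of the reference values divided by RMSEP. The numbers are invented.

```python
import math
import statistics


def rmsep(reference, predicted):
    """Root mean square error of prediction."""
    n = len(reference)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(reference, predicted)) / n)


def r2(reference, predicted):
    """Coefficient of determination, defined here as 1 - SS_res / SS_tot."""
    mean_ref = statistics.mean(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1.0 - ss_res / ss_tot


def rpd(reference, predicted):
    """Residual predictive deviation: SD of the reference values over RMSEP."""
    return statistics.stdev(reference) / rmsep(reference, predicted)


# Invented reference and predicted values for a five-sample prediction set.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
print(r2(reference, predicted), rmsep(reference, predicted), rpd(reference, predicted))
```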

  8. Hyperspectral Imaging in Tandem with R Statistics and Image Processing for Detection and Visualization of pH in Japanese Big Sausages Under Different Storage Conditions.

    PubMed

    Feng, Chao-Hui; Makino, Yoshio; Yoshimura, Masatoshi; Thuyet, Dang Quoc; García-Martín, Juan Francisco

    2018-02-01

    Hyperspectral imaging with wavelengths of 380 to 1000 nm was used to determine the pH of cooked sausages after different storage conditions (4 °C for 1 d, 35 °C for 1, 3, and 5 d). The mean spectra of the sausages were extracted from the hyperspectral images and a partial least squares regression (PLSR) model was developed to relate spectral profiles with the pH of the cooked sausages. Eleven important wavelengths were selected based on the regression coefficient values. The PLSR model established using the optimal wavelengths showed good precision, with a prediction coefficient of determination (Rp2) of 0.909 and a root mean square error of prediction of 0.035. The prediction map for illustrating pH indices in sausages was for the first time developed by R statistics. The overall results suggested that hyperspectral imaging combined with PLSR and R statistics is capable of quantifying and visualizing the evolution of sausage pH under different storage conditions. In this paper, hyperspectral imaging is for the first time used to detect pH in cooked sausages using R statistics, which provides a useful alternative for researchers who do not have access to Matlab. Eleven optimal wavelengths were successfully selected, which were used for simplifying the PLSR model established based on the full wavelengths. This simplified model achieved a high Rp2 (0.909) and a low root mean square error of prediction (0.035), which can be useful for the design of multispectral imaging systems. © 2017 Institute of Food Technologists®.

  9. Quantitative determination of Auramine O by terahertz spectroscopy with 2DCOS-PLSR model

    NASA Astrophysics Data System (ADS)

    Zhang, Huo; Li, Zhi; Chen, Tao; Qin, Binyi

    2017-09-01

    Residues of harmful dyes such as Auramine O (AO) in herb and food products threaten the health of people. So, fast and sensitive detection techniques of the residues are needed. As a powerful tool for substance detection, terahertz (THz) spectroscopy was used for the quantitative determination of AO by combining with an improved partial least-squares regression (PLSR) model in this paper. Absorbance of herbal samples with different concentrations was obtained by THz-TDS in the band between 0.2THz and 1.6THz. We applied two-dimensional correlation spectroscopy (2DCOS) to improve the PLSR model. This method highlighted the spectral differences of different concentrations, provided a clear criterion of the input interval selection, and improved the accuracy of detection result. The experimental result indicated that the combination of the THz spectroscopy and 2DCOS-PLSR is an excellent quantitative analysis method.

  10. Discrimination and characterization of strawberry juice based on electronic nose and tongue: comparison of different juice processing approaches by LDA, PLSR, RF, and SVM.

    PubMed

    Qiu, Shanshan; Wang, Jun; Gao, Liping

    2014-07-09

    An electronic nose (E-nose) and an electronic tongue (E-tongue) have been used to characterize five types of strawberry juices based on processing approaches (i.e., microwave pasteurization, steam blanching, high temperature short time pasteurization, frozen-thawed, and freshly squeezed). Juice quality parameters (vitamin C, pH, total soluble solid, total acid, and sugar/acid ratio) were detected by traditional measuring methods. Multivariate statistical methods (linear discriminant analysis (LDA) and partial least squares regression (PLSR)) and neural networks (Random Forest (RF) and Support Vector Machines (SVM)) were employed for qualitative classification and quantitative regression. The E-tongue system reached higher accuracy rates than the E-nose did, and the simultaneous utilization did have an advantage in LDA classification and PLSR regression. According to cross-validation, RF has shown outstanding and indisputable performances in the qualitative and quantitative analysis. This work indicates that the simultaneous utilization of E-nose and E-tongue can discriminate processed fruit juices and predict quality parameters successfully for the beverage industry.

  11. Application of near-infrared spectroscopy in the detection of fat-soluble vitamins in premix feed

    NASA Astrophysics Data System (ADS)

    Jia, Lian Ping; Tian, Shu Li; Zheng, Xue Cong; Jiao, Peng; Jiang, Xun Peng

    2018-02-01

    Vitamins are organic compounds necessary for animal physiological maintenance. The rapid determination of the content of different vitamins in premix feed can help to achieve accurate diets and efficient feeding. Compared with high-performance liquid chromatography and other wet chemical methods, near-infrared spectroscopy is a fast, non-destructive, non-polluting method. A total of 168 samples of premix feed were collected and the contents of vitamin A, vitamin E and vitamin D3 were detected by the standard method. The near-infrared spectra of the samples ranging from 10 000 to 4000 cm(-1) were obtained. Partial least squares regression (PLSR) and support vector machine regression (SVMR) were used to construct the quantitative models. The results showed that the RMSEP values of the PLSR models of vitamin A, vitamin E and vitamin D3 were 0.43×10^7 IU/kg, 0.09×10^5 IU/kg and 0.17×10^7 IU/kg, respectively. The RMSEP values of the SVMR models were 0.45×10^7 IU/kg, 0.11×10^5 IU/kg and 0.18×10^7 IU/kg. Compared with the nonlinear regression method (SVMR), the linear regression method (PLSR) is more suitable for the quantitative analysis of vitamins in premix feed.

  12. Aroma profile and sensory characteristics of a sulfur dioxide-free mulberry (Morus nigra) wine subjected to non-thermal accelerating aging techniques.

    PubMed

    Tchabo, William; Ma, Yongkun; Kwaw, Emmanuel; Zhang, Haining; Xiao, Lulu; Tahir, Haroon Elrasheid

    2017-10-01

    The present study was undertaken to assess accelerating aging effects of high pressure, ultrasound and manosonication on the aromatic profile and sensorial attributes of aged mulberry wines (AMW). A total of 166 volatile compounds were found amongst the AMW. The outcomes of the investigation were presented by means of geometric mean (GM), cluster analysis (CA), principal component analysis (PCA), partial least squares regressions (PLSR) and principal component regression (PCR). GM highlighted 24 organoleptic attributes responsible for the sensorial profile of the AMW. Moreover, CA revealed that the volatile composition of the non-thermal accelerated aged wines differs from that of the conventional aged wines. Besides, PCA discriminated the AMW on the basis of their main sensorial characteristics. Furthermore, PLSR identified 75 aroma compounds which were mainly responsible for the olfactory notes of the AMW. Finally, the overall quality of the AMW was noted to be better predicted by PLSR than PCR. Copyright © 2017 Elsevier Ltd. All rights reserved.

  13. How to predict the sugariness and hardness of melons: A near-infrared hyperspectral imaging method.

    PubMed

    Sun, Meijun; Zhang, Dong; Liu, Li; Wang, Zheng

    2017-03-01

    Hyperspectral imaging (HSI) in the near-infrared (NIR) region (900-1700nm) was used for non-intrusive quality measurements (of sweetness and texture) in melons. First, HSI data from melon samples were acquired to extract the spectral signatures. The corresponding sample sweetness and hardness values were recorded using traditional intrusive methods. Partial least squares regression (PLSR), principal component analysis (PCA), support vector machine (SVM), and artificial neural network (ANN) models were created to predict melon sweetness and hardness values from the hyperspectral data. Experimental results for the three types of melons show that PLSR produces the most accurate results. To reduce the high dimensionality of the hyperspectral data, the weighted regression coefficients of the resulting PLSR models were used to identify the most important wavelengths. On the basis of these wavelengths, each image pixel was used to visualize the sweetness and hardness in all the portions of each sample. Copyright © 2016 Elsevier Ltd. All rights reserved.

  14. Hyperspectral Reflectance Imaging Technique for Visualization of Moisture Distribution in Cooked Chicken Breast

    PubMed Central

    Kandpal, Lalit Mohan; Lee, Hoonsoo; Kim, Moon S.; Mo, Changyeun; Cho, Byoung-Kwan

    2013-01-01

    Spectroscopy has proven to be an efficient tool for measuring the properties of meat. In this article, hyperspectral imaging (HSI) techniques are used to determine the moisture content in cooked chicken breast over the VIS/NIR (400–1,000 nm) spectral range. Moisture measurements were performed using an oven drying method. A partial least squares regression (PLSR) model was developed to extract a relationship between the HSI spectra and the moisture content. In the full wavelength range, the PLSR model possessed a maximum R2p of 0.90 and an SEP of 0.74%. For the NIR range, the PLSR model yielded an R2p of 0.94 and an SEP of 0.71%. The majority of the absorption peaks occurred around 760 and 970 nm, representing the water content in the samples. Finally, PLSR images were constructed to visualize the dehydration and water distribution within different sample regions. The high correlation coefficient and low prediction error from the PLSR analysis validates that HSI is an effective tool for visualizing the chemical properties of meat. PMID:24084119

  15. Least-Squares Regression and Spectral Residual Augmented Classical Least-Squares Chemometric Models for Stability-Indicating Analysis of Agomelatine and Its Degradation Products: A Comparative Study.

    PubMed

    Naguib, Ibrahim A; Abdelrahman, Maha M; El Ghobashy, Mohamed R; Ali, Nesma A

    2016-01-01

    Two accurate, sensitive, and selective stability-indicating methods are developed and validated for simultaneous quantitative determination of agomelatine (AGM) and its forced degradation products (Deg I and Deg II), whether in pure forms or in pharmaceutical formulations. Partial least-squares regression (PLSR) and spectral residual augmented classical least-squares (SRACLS) are two chemometric models that are being subjected to a comparative study through handling UV spectral data in range (215-350 nm). For proper analysis, a three-factor, four-level experimental design was established, resulting in a training set consisting of 16 mixtures containing different ratios of interfering species. An independent test set consisting of eight mixtures was used to validate the prediction ability of the suggested models. The results presented indicate the ability of mentioned multivariate calibration models to analyze AGM, Deg I, and Deg II with high selectivity and accuracy. The analysis results of the pharmaceutical formulations were statistically compared to the reference HPLC method, with no significant differences observed regarding accuracy and precision. The SRACLS model gives comparable results to the PLSR model; however, it keeps the qualitative spectral information of the classical least-squares algorithm for analyzed components.

  16. Estimation of leaf water contents from mid- and thermal infrared spectra by coupling genetic algorithm and partial least squares regression

    NASA Astrophysics Data System (ADS)

    Arshad, Muhammad; Ullah, Saleem; Khurshid, Khurram; Ali, Asad

    2017-10-01

    Leaf Water Content (LWC) is an essential constituent of plant leaves that determines vegetation health and its productivity. An accurate and on-time measurement of water content is crucial for planning irrigation, forecasting drought and predicting woodland fire. The retrieval of LWC from Visible to Shortwave Infrared (VSWIR: 0.4-2.5 μm) has been extensively investigated, but little has been done in the Mid and Thermal Infrared (MIR and TIR: 2.50-14.0 μm) windows of the electromagnetic spectrum. This study is mainly focused on retrieval of LWC from Mid and Thermal Infrared, using Genetic Algorithm integrated with Partial Least Square Regression (PLSR). Genetic Algorithm fused with PLSR selects spectral wavebands with high predictive performance, i.e., wavebands that yield high adjusted-R2 and low RMSE. In our case, GA-PLSR selected eight variables (bands) and yielded highly accurate models with adjusted-R2 of 0.93 and RMSEcv equal to 7.1%. The study also demonstrated that MIR is more sensitive to the variation in LWC as compared to TIR. However, the combined use of MIR and TIR spectra enhances the predictive performance in retrieval of LWC. The integration of Genetic Algorithm and PLSR not only increases the estimation precision by selecting the most sensitive spectral bands but also helps in identifying the important spectral regions for quantifying water stresses in vegetation. The findings of this study will allow future space missions (like HyspIRI) to position wavebands at sensitive regions for characterizing vegetation stresses.
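
    For readers unfamiliar with the adjusted-R2 criterion mentioned above, the usual textbook definition is shown below. It is given here as an assumption, because the abstract does not state the formula it used; n denotes the number of samples and p the number of predictors (the abstract does not say whether p counts selected bands or latent variables).

```latex
R^2_{\mathrm{adj}} = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1}
```

    Because the penalty grows with p, this criterion favours band subsets that reach a given R2 with fewer variables, which fits its use as a fitness measure in a variable-selection search.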

  17. Using an optimal CC-PLSR-RBFNN model and NIR spectroscopy for the starch content determination in corn

    NASA Astrophysics Data System (ADS)

    Jiang, Hao; Lu, Jiangang

    2018-05-01

    Corn starch is an important material which has been traditionally used in the fields of food and chemical industry. In order to enhance the rapidness and reliability of the determination for starch content in corn, a methodology is proposed in this work, using an optimal CC-PLSR-RBFNN calibration model and near-infrared (NIR) spectroscopy. The proposed model was developed based on the optimal selection of crucial parameters and the combination of correlation coefficient method (CC), partial least squares regression (PLSR) and radial basis function neural network (RBFNN). To test the performance of the model, a standard NIR spectroscopy data set was introduced, containing spectral information and chemical reference measurements of 80 corn samples. For comparison, several other models based on the identical data set were also briefly discussed. In this process, the root mean square error of prediction (RMSEP) and coefficient of determination (Rp2) in the prediction set were used to make evaluations. As a result, the proposed model presented the best predictive performance with the smallest RMSEP (0.0497%) and the highest Rp2 (0.9968). Therefore, the proposed method combining NIR spectroscopy with the optimal CC-PLSR-RBFNN model can be helpful to determine starch content in corn.

  18. The prediction of food additives in the fruit juice based on electronic nose with chemometrics.

    PubMed

    Qiu, Shanshan; Wang, Jun

    2017-09-01

    Food additives are added to products to enhance their taste, and preserve flavor or appearance. While their use should be restricted to achieve a technological benefit, the contents of food additives should also be strictly controlled. In this study, E-nose was applied as an alternative to traditional monitoring technologies for determining two food additives, namely benzoic acid and chitosan. For quantitative monitoring, support vector machine (SVM), random forest (RF), extreme learning machine (ELM) and partial least squares regression (PLSR) were applied to establish regression models between E-nose signals and the amount of food additives in fruit juices. The monitoring models based on ELM and RF reached higher correlation coefficients (R2s) and lower root mean square errors (RMSEs) than models based on PLSR and SVM. This work indicates that E-nose combined with RF or ELM can be a cost-effective, easy-to-build and rapid detection system for food additive monitoring. Copyright © 2017 Elsevier Ltd. All rights reserved.

  19. Assessing the sensitivity and robustness of prediction models for apple firmness using spectral scattering technique

    USDA-ARS's Scientific Manuscript database

    Spectral scattering is useful for nondestructive sensing of fruit firmness. Prediction models, however, are typically built using multivariate statistical methods such as partial least squares regression (PLSR), whose performance generally depends on the characteristics of the data. The aim of this ...

  20. Spatial assessment of soluble solid contents on apple slices using hyperspectral imaging

    USDA-ARS's Scientific Manuscript database

    A partial least squares regression (PLSR) model to map internal soluble solids content (SSC) of apples using visible/near-infrared (VNIR) hyperspectral imaging was developed. The reflectance spectra of sliced apples were extracted from hyperspectral absorbance images obtained in the 400-1000 nm rang...

  1. In Situ Measurement of Some Soil Properties in Paddy Soil Using Visible and Near-Infrared Spectroscopy

    PubMed Central

    Wenjun, Ji; Zhou, Shi; Jingyi, Huang; Shuo, Li

    2014-01-01

    In situ measurements with visible and near-infrared spectroscopy (vis-NIR) provide an efficient way for acquiring soil information of paddy soils in the short time gap between the harvest and following rotation. The aim of this study was to evaluate its feasibility to predict a series of soil properties including organic matter (OM), organic carbon (OC), total nitrogen (TN), available nitrogen (AN), available phosphorus (AP), available potassium (AK) and pH of paddy soils in Zhejiang province, China. Firstly, the linear partial least squares regression (PLSR) was performed on the in situ spectra and the predictions were compared to those with laboratory-based recorded spectra. Then, the non-linear least-square support vector machine (LS-SVM) algorithm was carried out aiming to extract more useful information from the in situ spectra and improve predictions. Results show that in terms of OC, OM, TN, AN and pH, (i) the predictions were worse using in situ spectra compared to laboratory-based spectra with the PLSR algorithm; (ii) the prediction accuracy using LS-SVM (R2>0.75, RPD>1.90) was obviously improved with in situ vis-NIR spectra compared to the PLSR algorithm, and comparable or even better than results generated using laboratory-based spectra with PLSR; (iii) in terms of AP and AK, poor predictions were obtained with in situ spectra (R2<0.5, RPD<1.50) either using PLSR or LS-SVM. The results highlight the use of LS-SVM for in situ vis-NIR spectroscopic estimation of soil properties of paddy soils. PMID:25153132

  2. Synchronous front-face fluorescence spectroscopy for authentication of the adulteration of edible vegetable oil with refined used frying oil.

    PubMed

    Tan, Jin; Li, Rong; Jiang, Zi-Tao; Tang, Shu-Hua; Wang, Ying; Shi, Meng; Xiao, Yi-Qian; Jia, Bin; Lu, Tian-Xiang; Wang, Hao

    2017-02-15

    Synchronous front-face fluorescence spectroscopy has been developed for the discrimination of used frying oil (UFO) from edible vegetable oil (EVO), the estimation of the using time of UFO, and the determination of the adulteration of EVO with UFO. Both the heating time of laboratory prepared UFO and the adulteration of EVO with UFO could be determined by partial least squares regression (PLSR). To simulate the EVO adulteration with UFO, for each kind of oil, fifty adulterated samples at the adulterant amounts range of 1-50% were prepared. PLSR was then adopted to build the model and both full (leave-one-out) cross-validation and external validation were performed to evaluate the predictive ability. Under the optimum condition, the plots of observed versus predicted values exhibited high linearity (R(2)>0.96). The root mean square error of cross-validation (RMSECV) and root mean square error of prediction (RMSEP) were both lower than 3%. Copyright © 2016 Elsevier Ltd. All rights reserved.

  3. Novel insights into the lipidome of glioblastoma cells based on a combined PLSR and DD-HDS computational analysis

    NASA Astrophysics Data System (ADS)

    Lespinats, S.; Meyer-Bäse, Anke; He, Huan; Marshall, Alan G.; Conrad, Charles A.; Emmett, Mark R.

    2009-05-01

    Partial Least Square Regression (PLSR) and Data-Driven High Dimensional Scaling (DD-HDS) are employed for the prediction and the visualization of changes in polar lipid expression induced by different combinations of wild-type (wt) p53 gene therapy and SN38 chemotherapy of U87 MG glioblastoma cells. A very detailed analysis of the gangliosides reveals that certain gangliosides of GM3 or GD1-type have unique properties not shared by the others. In summary, this preliminary work shows that data mining techniques are able to determine the modulation of gangliosides by different treatment combinations.

  4. A comparison of five methods to predict genomic breeding values of dairy bulls from genome-wide SNP markers

    PubMed Central

    2009-01-01

    Background Genomic selection (GS) uses molecular breeding values (MBV) derived from dense markers across the entire genome for selection of young animals. The accuracy of MBV prediction is important for a successful application of GS. Recently, several methods have been proposed to estimate MBV. Initial simulation studies have shown that these methods can accurately predict MBV. In this study we compared the accuracies and possible bias of five different regression methods in an empirical application in dairy cattle. Methods Genotypes of 7,372 SNP and highly accurate EBV of 1,945 dairy bulls were used to predict MBV for protein percentage (PPT) and a profit index (Australian Selection Index, ASI). Marker effects were estimated by least squares regression (FR-LS), Bayesian regression (Bayes-R), random regression best linear unbiased prediction (RR-BLUP), partial least squares regression (PLSR) and nonparametric support vector regression (SVR) in a training set of 1,239 bulls. Accuracy and bias of MBV prediction were calculated from cross-validation of the training set and tested against a test team of 706 young bulls. Results For both traits, FR-LS using a subset of SNP was significantly less accurate than all other methods which used all SNP. Accuracies obtained by Bayes-R, RR-BLUP, PLSR and SVR were very similar for ASI (0.39-0.45) and for PPT (0.55-0.61). Overall, SVR gave the highest accuracy. All methods resulted in biased MBV predictions for ASI, for PPT only RR-BLUP and SVR predictions were unbiased. A significant decrease in accuracy of prediction of ASI was seen in young test cohorts of bulls compared to the accuracy derived from cross-validation of the training set. This reduction was not apparent for PPT. Combining MBV predictions with pedigree based predictions gave 1.05 - 1.34 times higher accuracies compared to predictions based on pedigree alone. 
Some methods have largely different computational requirements, with PLSR and RR-BLUP requiring the least computing time. Conclusions The four methods which use information from all SNP namely RR-BLUP, Bayes-R, PLSR and SVR generate similar accuracies of MBV prediction for genomic selection, and their use in the selection of immediate future generations in dairy cattle will be comparable. The use of FR-LS in genomic selection is not recommended. PMID:20043835

  5. Fusing face-verification algorithms and humans.

    PubMed

    O'Toole, Alice J; Abdi, Hervé; Jiang, Fang; Phillips, P Jonathon

    2007-10-01

    It has been demonstrated recently that state-of-the-art face-recognition algorithms can surpass human accuracy at matching faces over changes in illumination. The ranking of algorithms and humans by accuracy, however, does not provide information about whether algorithms and humans perform the task comparably or whether algorithms and humans can be fused to improve performance. In this paper, we fused humans and algorithms using partial least square regression (PLSR). In the first experiment, we applied PLSR to face-pair similarity scores generated by seven algorithms participating in the Face Recognition Grand Challenge. The PLSR produced an optimal weighting of the similarity scores, which we tested for generality with a jackknife procedure. Fusing the algorithms' similarity scores using the optimal weights produced a twofold reduction of error rate over the most accurate algorithm. Next, human-subject-generated similarity scores were added to the PLSR analysis. Fusing humans and algorithms increased the performance to near-perfect classification accuracy. These results are discussed in terms of maximizing face-verification accuracy with hybrid systems consisting of multiple algorithms and humans.

  6. [Measurement of soil organic matter and available K based on SPA-LS-SVM].

    PubMed

    Zhang, Hai-Liang; Liu, Xue-Mei; He, Yong

    2014-05-01

    Visible and short wave infrared spectroscopy (Vis/SW-NIRS) was investigated in the present study for measurement of soil organic matter (OM) and available potassium (K). Four types of pretreatments including smoothing, SNV, MSC and SG smoothing+first derivative were adopted to eliminate the system noises and external disturbances. Then partial least squares regression (PLSR) and least squares-support vector machine (LS-SVM) models were implemented for calibration models. The LS-SVM model was built by using characteristic wavelength based on successive projections algorithm (SPA). Simultaneously, the performance of LS-SVM models was compared with PLSR models. The results indicated that LS-SVM models using characteristic wavelength as inputs based on SPA outperformed PLSR models. The optimal SPA-LS-SVM models were achieved, and the correlation coefficient (r) and RMSEP were 0.8602 and 2.98 for OM and 0.7305 and 15.78 for K, respectively. The results indicated that visible and short wave near infrared spectroscopy (Vis/SW-NIRS) (325-1075 nm) combined with LS-SVM based on SPA could be utilized as a precision method for the determination of soil properties.

  7. Development of variable pathlength UV-vis spectroscopy combined with partial-least-squares regression for wastewater chemical oxygen demand (COD) monitoring.

    PubMed

    Chen, Baisheng; Wu, Huanan; Li, Sam Fong Yau

    2014-03-01

    To overcome the challenging task to select an appropriate pathlength for wastewater chemical oxygen demand (COD) monitoring with high accuracy by UV-vis spectroscopy in wastewater treatment process, a variable pathlength approach combined with partial-least squares regression (PLSR) was developed in this study. Two new strategies were proposed to extract relevant information of UV-vis spectral data from variable pathlength measurements. The first strategy was by data fusion with two data fusion levels: low-level data fusion (LLDF) and mid-level data fusion (MLDF). Predictive accuracy was found to improve, indicated by the lower root-mean-square errors of prediction (RMSEP) compared with those obtained for single pathlength measurements. Both fusion levels were found to deliver very robust PLSR models with residual predictive deviations (RPD) greater than 3 (i.e. 3.22 and 3.29, respectively). The second strategy involved calculating the slopes of absorbance against pathlength at each wavelength to generate slope-derived spectra. Without the requirement to select the optimal pathlength, the predictive accuracy (RMSEP) was improved by 20-43% as compared to single pathlength spectroscopy. Comparing to nine-factor models from fusion strategy, the PLSR model from slope-derived spectroscopy was found to be more parsimonious with only five factors and more robust with residual predictive deviation (RPD) of 3.72. It also offered excellent correlation of predicted and measured COD values with R(2) of 0.936. In sum, variable pathlength spectroscopy with the two proposed data analysis strategies proved to be successful in enhancing prediction performance of COD in wastewater and showed high potential to be applied in on-line water quality monitoring. Copyright © 2013 Elsevier B.V. All rights reserved.
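
    The slope-derived spectra described in this record (slope of absorbance against pathlength at each wavelength) might be computed along the following lines. This is a minimal sketch assuming a per-wavelength least-squares fit; the function name `slope_spectrum` and the data layout are hypothetical, and the study's actual procedure may differ.

```python
def slope_spectrum(pathlengths, spectra):
    """Least-squares slope of absorbance versus pathlength at each wavelength.

    spectra[k][j] is the absorbance measured at pathlengths[k] and wavelength j
    (an assumed layout chosen for this illustration).
    """
    n = len(pathlengths)
    mean_l = sum(pathlengths) / n
    sxx = sum((l - mean_l) ** 2 for l in pathlengths)
    slopes = []
    for j in range(len(spectra[0])):
        column = [spectra[k][j] for k in range(n)]
        mean_a = sum(column) / n
        sxy = sum((pathlengths[k] - mean_l) * (column[k] - mean_a)
                  for k in range(n))
        slopes.append(sxy / sxx)
    return slopes
```

    Under Beer-Lambert behaviour the slope is proportional to concentration and is unaffected by a constant offset in absorbance, which is one plausible reason a single slope-derived spectrum can replace the choice of an optimal pathlength.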

  8. Hyperspectral imaging using a color camera and its application for pathogen detection

    NASA Astrophysics Data System (ADS)

    Yoon, Seung-Chul; Shin, Tae-Sung; Heitschmidt, Gerald W.; Lawrence, Kurt C.; Park, Bosoon; Gamble, Gary

    2015-02-01

    This paper reports the results of a feasibility study for the development of a hyperspectral image recovery (reconstruction) technique using a RGB color camera and regression analysis in order to detect and classify colonies of foodborne pathogens. The target bacterial pathogens were the six representative non-O157 Shiga-toxin producing Escherichia coli (STEC) serogroups (O26, O45, O103, O111, O121, and O145) grown in Petri dishes of Rainbow agar. The purpose of the feasibility study was to evaluate whether a DSLR camera (Nikon D700) could be used to predict hyperspectral images in the wavelength range from 400 to 1,000 nm and even to predict the types of pathogens using a hyperspectral STEC classification algorithm that was previously developed. Unlike many other studies using color charts with known and noise-free spectra for training reconstruction models, this work used hyperspectral and color images, separately measured by a hyperspectral imaging spectrometer and the DSLR color camera. The color images were calibrated (i.e. normalized) to relative reflectance, subsampled and spatially registered to match with counterpart pixels in hyperspectral images that were also calibrated to relative reflectance. Polynomial multivariate least-squares regression (PMLR) was previously developed with simulated color images. In this study, partial least squares regression (PLSR) was also evaluated as a spectral recovery technique to minimize multicollinearity and overfitting. The two spectral recovery models (PMLR and PLSR) and their parameters were evaluated by cross-validation. The QR decomposition was used to find a numerically more stable solution of the regression equation. The preliminary results showed that PLSR was more effective especially with higher order polynomial regressions than PMLR. The best classification accuracy measured with an independent test set was about 90%. 
The results suggest the potential of cost-effective color imaging using hyperspectral image classification algorithms for rapidly differentiating pathogens in agar plates.

  9. Discrimination and prediction of the origin of Chinese and Korean soybeans using Fourier transform infrared spectrometry (FT-IR) with multivariate statistical analysis

    PubMed Central

    Lee, Byeong-Ju; Zhou, Yaoyao; Lee, Jae Soung; Shin, Byeung Kon; Seo, Jeong-Ah; Lee, Doyup; Kim, Young-Suk

    2018-01-01

    The ability to determine the origin of soybeans is an important issue following the inclusion of this information in the labeling of agricultural food products becoming mandatory in South Korea in 2017. This study was carried out to construct a prediction model for discriminating Chinese and Korean soybeans using Fourier-transform infrared (FT-IR) spectroscopy and multivariate statistical analysis. The optimal prediction models for discriminating soybean samples were obtained by selecting appropriate scaling methods, normalization methods, variable influence on projection (VIP) cutoff values, and wave-number regions. The factors for constructing the optimal partial-least-squares regression (PLSR) prediction model were using second derivatives, vector normalization, unit variance scaling, and the 4000–400 cm–1 region (excluding water vapor and carbon dioxide). The PLSR model for discriminating Chinese and Korean soybean samples had the best predictability when a VIP cutoff value was not applied. When Chinese soybean samples were identified, a PLSR model that has the lowest root-mean-square error of the prediction value was obtained using a VIP cutoff value of 1.5. The optimal PLSR prediction model for discriminating Korean soybean samples was also obtained using a VIP cutoff value of 1.5. This is the first study that has combined FT-IR spectroscopy with normalization methods, VIP cutoff values, and selected wave-number regions for discriminating Chinese and Korean soybeans. PMID:29689113

  10. Discrimination of serum Raman spectroscopy between normal and colorectal cancer

    NASA Astrophysics Data System (ADS)

    Li, Xiaozhou; Yang, Tianyue; Yu, Ting; Li, Siqi

    2011-07-01

    Raman spectroscopy of tissues has been widely studied for the diagnosis of various cancers, but biofluids were seldom used as the analyte because of the low concentration. Herein, Raman spectra of serum from 30 normal people, 46 colon cancer patients, and 44 rectum cancer patients were measured and analyzed. The information of Raman peaks (intensity and width) and that of the fluorescence background (baseline function coefficients) were selected as parameters for statistical analysis. Principal component regression (PCR) and partial least square regression (PLSR) were used on the selected parameters separately to see the performance of the parameters. PCR performed better than PLSR in our spectral data. Then linear discriminant analysis (LDA) was used on the principal components (PCs) of the two regression methods on the selected parameters, and diagnostic accuracies of 88% and 83% were obtained. The conclusion is that the selected features can maintain the information of original spectra well and Raman spectroscopy of serum has the potential for the diagnosis of colorectal cancer.

  11. Using visible and near-infrared diffuse reflectance spectroscopy for predicting soil properties based on regression with peaks parameters as derived from continuum-removed spectra

    NASA Astrophysics Data System (ADS)

    Vasat, Radim; Klement, Ales; Jaksik, Ondrej; Kodesova, Radka; Drabek, Ondrej; Boruvka, Lubos

    2014-05-01

    Visible and near-infrared diffuse reflectance spectroscopy (VNIR-DRS) provides a rapid and inexpensive tool for simultaneous prediction of a variety of soil properties. Usually, some sophisticated multivariate mathematical or statistical methods are employed in order to extract the required information from the raw spectra measurement. For this purpose especially the Partial least squares regression (PLSR) and Support vector machines (SVM) are the most frequently used. These methods generally benefit from the complexity with which the soil spectra are treated. But it is interesting that also techniques that focus only on a single spectral feature, such as a simple linear regression with selected continuum-removed spectra (CRS) characteristic (e.g. peak depth), can often provide competitive results. Therefore, we decided to enhance the potential of CRS taking into account all possible CRS peak parameters (area, width and depth) and develop a comprehensive methodology based on multiple linear regression approach. The eight considered soil properties were oxidizable carbon content (Cox), exchangeable (pHex) and active soil pH (pHa), particle and bulk density, CaCO3 content, crystalline and amorphous (Fed) and amorphous Fe (Feox) forms. In four cases (pHa, bulk density, Fed and Feox), of which two (Fed and Feox) were predicted reliably accurately (0.50 < R2cv < 0.80) and the other two (pHa and bulk density) only poorly (R2cv < 0.50), we obtained slightly better results than with PLSR and SVM. In one case (pHex) we achieved a significantly higher, although just reliable, accuracy (R2cv = 0.601) than with PLSR and SVM (R2cv = 0.448 and 0.442, resp.). But most interestingly, in the case of particle density, the presented approach outperformed the PLSR and SVM dramatically offering a fairly accurate prediction (R2cv = 0.827) against two failures (R2cv = 0.034 and 0.121 for PLSR and SVM, resp.). 
In the last two cases (Cox and CaCO3), slightly worse results were achieved than with PLSR and SVM, with overall fairly accurate prediction (R2cv > 0.80). Acknowledgment: Authors acknowledge the financial support of the Ministry of Agriculture of the Czech Republic (grant No. QJ1230319).
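
    The continuum-removed spectra (CRS) and peak depth referred to in this record can be illustrated with a small sketch. It assumes the common definition of the continuum as the upper convex hull of the reflectance spectrum and of peak depth as one minus the minimum continuum-removed value; the function names are hypothetical and the authors' exact procedure is not given in the abstract.

```python
def continuum_removed(wavelengths, reflectance):
    """Divide a spectrum by its upper convex hull (the assumed continuum).

    Wavelengths must be strictly increasing and contain at least two points.
    """
    points = list(zip(wavelengths, reflectance))
    hull = []
    for x3, y3 in points:
        # Drop the last hull point while it lies on or below the line
        # joining its predecessor to the new point.
        while len(hull) >= 2:
            (x1, y1), (x2, y2) = hull[-2], hull[-1]
            cross = (x2 - x1) * (y3 - y1) - (y2 - y1) * (x3 - x1)
            if cross >= 0:
                hull.pop()
            else:
                break
        hull.append((x3, y3))
    ratios = []
    k = 0
    for x, y in points:
        # Advance to the hull segment that contains this wavelength.
        while hull[k + 1][0] < x:
            k += 1
        (x1, y1), (x2, y2) = hull[k], hull[k + 1]
        continuum = y1 + (y2 - y1) * (x - x1) / (x2 - x1)
        ratios.append(y / continuum)
    return ratios

def peak_depth(ratios):
    """Depth of the deepest absorption feature in a continuum-removed spectrum."""
    return 1.0 - min(ratios)
```

    Peak area and width, the other two CRS parameters named in the record, would be derived from the same continuum-removed curve.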

  12. Rapid prediction of total petroleum hydrocarbons concentration in contaminated soil using vis-NIR spectroscopy and regression techniques.

    PubMed

    Douglas, R K; Nawar, S; Alamar, M C; Mouazen, A M; Coulon, F

    2018-03-01

    Visible and near infrared spectrometry (vis-NIRS) coupled with data mining techniques can offer fast and cost-effective quantitative measurement of total petroleum hydrocarbons (TPH) in contaminated soils. The literature, however, shows significant differences in the performance of vis-NIRS between linear and non-linear calibration methods. This study compared the performance of linear partial least squares regression (PLSR) with a nonlinear random forest (RF) regression for the calibration of vis-NIRS when analysing TPH in soils. 88 soil samples (3 uncontaminated and 85 contaminated) collected from three sites located in the Niger Delta were scanned using an analytical spectral device (ASD) spectrophotometer (350-2500 nm) in diffuse reflectance mode. Sequential ultrasonic solvent extraction-gas chromatography (SUSE-GC) was used as the reference quantification method for TPH, which equals the sum of aliphatic and aromatic fractions ranging between C10 and C35. Prior to model development, spectra were subjected to pre-processing including noise cut, maximum normalization, first derivative and smoothing. Then 65 samples were selected as calibration set and the remaining 20 samples as validation set. Both vis-NIR spectrometry and gas chromatography profiles of the 85 soil samples were subjected to RF and PLSR with leave-one-out cross-validation (LOOCV) for the calibration models. Results showed that the RF calibration model, with a coefficient of determination (R2) of 0.85, a root mean square error of prediction (RMSEP) of 68.43 mg kg-1, and a residual prediction deviation (RPD) of 2.61, outperformed PLSR (R2=0.63, RMSEP=107.54 mg kg-1 and RPD=2.55) in cross-validation. These results indicate that the RF modelling approach accounts for the nonlinearity of the soil spectral responses, hence providing significantly higher prediction accuracy compared to the linear PLSR. 
It is recommended to adopt the vis-NIRS coupled with RF modelling approach as a portable and cost effective method for the rapid quantification of TPH in soils. Copyright © 2017 Elsevier B.V. All rights reserved.
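
    The figures of merit quoted in this record (R2, RMSEP and RPD) can be computed as in the sketch below. The formulas are common textbook conventions and are assumptions here: RPD is taken as the sample standard deviation of the reference values divided by the RMSE, and other definitions exist. Function names are illustrative.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between reference and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def rpd(observed, predicted):
    """Residual prediction deviation: SD of reference values over RMSE.

    Assumes the sample standard deviation (n - 1 denominator).
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    sd = math.sqrt(sum((o - mean_obs) ** 2 for o in observed) / (n - 1))
    return sd / rmse(observed, predicted)
```

    Because RPD scales the error by the spread of the reference data, two models with quite different RMSEP can report similar RPD values when their validation sets differ in variability.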

  13. Modeling of feed-forward control using the partial least squares regression method in the tablet compression process.

    PubMed

    Hattori, Yusuke; Otsuka, Makoto

    2017-05-30

    In the pharmaceutical industry, the implementation of continuous manufacturing has been widely promoted in lieu of the traditional batch manufacturing approach. More specially, in recent years, the innovative concept of feed-forward control has been introduced in relation to process analytical technology. In the present study, we successfully developed a feed-forward control model for the tablet compression process by integrating data obtained from near-infrared (NIR) spectra and the physical properties of granules. In the pharmaceutical industry, batch manufacturing routinely allows for the preparation of granules with the desired properties through the manual control of process parameters. On the other hand, continuous manufacturing demands the automatic determination of these process parameters. Here, we proposed the development of a control model using the partial least squares regression (PLSR) method. The most significant feature of this method is the use of dataset integrating both the NIR spectra and the physical properties of the granules. Using our model, we determined that the properties of products, such as tablet weight and thickness, need to be included as independent variables in the PLSR analysis in order to predict unknown process parameters. Copyright © 2017 Elsevier B.V. All rights reserved.

  14. Feature reconstruction of LFP signals based on PLSR in the neural information decoding study.

    PubMed

    Yonghui Dong; Zhigang Shang; Mengmeng Li; Xinyu Liu; Hong Wan

    2017-07-01

    To solve the problems of Signal-to-Noise Ratio (SNR) and multicollinearity when the Local Field Potential (LFP) signals is used for the decoding of animal motion intention, a feature reconstruction of LFP signals based on partial least squares regression (PLSR) in the neural information decoding study is proposed in this paper. Firstly, the feature information of LFP coding band is extracted based on wavelet transform. Then the PLSR model is constructed by the extracted LFP coding features. According to the multicollinearity characteristics among the coding features, several latent variables which contribute greatly to the steering behavior are obtained, and the new LFP coding features are reconstructed. Finally, the K-Nearest Neighbor (KNN) method is used to classify the reconstructed coding features to verify the decoding performance. The results show that the proposed method can achieve the highest accuracy compared to the other three methods and the decoding effect of the proposed method is robust.
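
    The extraction of latent variables by PLSR, as used in this record to cope with multicollinear features, can be sketched for a single response (PLS1) with NIPALS-style deflation. This is a generic textbook formulation written for illustration, not the authors' implementation; `pls1_fit` and `pls1_predict` are hypothetical names.

```python
import math

def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model by successive deflation."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centred X
    f = [v - y_mean for v in y]                                # centred y
    components = []
    for _ in range(n_components):
        # Weight vector: direction in X with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # nothing left in X that covaries with y
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y by the part explained by this latent variable.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components

def pls1_predict(model, x):
    """Predict y for one sample by replaying the deflation steps."""
    x_mean, y_mean, components = model
    e = [x[j] - x_mean[j] for j in range(len(x))]
    y_hat = y_mean
    for w, load, q in components:
        t = sum(e[j] * w[j] for j in range(len(e)))
        y_hat += q * t
        e = [e[j] - t * load[j] for j in range(len(e))]
    return y_hat
```

    The per-sample scores `t` are the latent variables. In a scheme like the one described, they would serve as the reconstructed features passed to a downstream classifier such as KNN.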

  15. Computerized pigment design based on property hypersurfaces

    NASA Astrophysics Data System (ADS)

    Halova, Jaroslava; Sulcova, Petra; Kupka, Karel

    2007-05-01

    Competition is tough in the pigment market. Rational pigment design has therefore a competitive advantage, saving time and money. The aim of this work is to provide methods that can assist in designing pigments with defined properties. These methods include partial least squares regression (PLSR), neural network (NN) and generalized regression ANOVA model. Authors show how PLS bi-plot can be used to identify market gaps poorly covered by pigment manufacturers, thus giving an opportunity to develop pigments with potentially profitable properties.

  16. Classification and quantification analysis of peach kernel from different origins with near-infrared diffuse reflection spectroscopy

    PubMed Central

    Liu, Wei; Wang, Zhen-Zhong; Qing, Jian-Ping; Li, Hong-Juan; Xiao, Wei

    2014-01-01

    Background: Peach kernels, which contain various fatty acids, play an important role in the regulation of a variety of physiological and biological functions. Objective: To establish an innovative and rapid diffuse reflectance near-infrared spectroscopy (DR-NIR) analysis method along with chemometric techniques for the qualitative and quantitative determination of a peach kernel. Materials and Methods: Peach kernel samples from nine different origins were analyzed with high-performance liquid chromatography (HPLC) as a reference method. The DR-NIR spectral range was 1100-2300 nm. Principal component analysis (PCA) and the partial least squares regression (PLSR) algorithm were applied to obtain prediction models. The Savitzky-Golay derivative and first derivative were adopted for the spectral pre-processing. PCA was applied to classify the varieties of those samples. For the quantitative calibration, the models of linoleic and oleinic acids were established with the PLSR algorithm and the optimal principal component (PC) numbers were selected with leave-one-out (LOO) cross-validation. The established models were evaluated with the root mean square error of deviation (RMSED) and corresponding correlation coefficients (R2). Results: The PCA results of DR-NIR spectra yield clear classification of the two varieties of peach kernel. PLSR had a better predictive ability. The correlation coefficients of the two calibration models were above 0.99, and the RMSED of linoleic and oleinic acids were 1.266% and 1.412%, respectively. Conclusion: DR-NIR combined with the PCA and PLSR algorithms could be used efficiently to identify and quantify peach kernels and also help to solve the variety problem. PMID:25422544

  17. Chemiluminescence-based multivariate sensing of local equivalence ratios in premixed atmospheric methane-air flames

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Tripathi, Markandey M.; Krishnan, Sundar R.; Srinivasan, Kalyan K.

    Chemiluminescence emissions from OH*, CH*, C2, and CO2 formed within the reaction zone of premixed flames depend upon the fuel-air equivalence ratio in the burning mixture. In the present paper, a new partial least square regression (PLS-R) based multivariate sensing methodology is investigated and compared with an OH*/CH* intensity ratio-based calibration model for sensing equivalence ratio in atmospheric methane-air premixed flames. Five replications of spectral data at nine different equivalence ratios ranging from 0.73 to 1.48 were used in the calibration of both models. During model development, the PLS-R model was initially validated with the calibration data set using the leave-one-out cross validation technique. Since the PLS-R model used the entire raw spectral intensities, it did not need the nonlinear background subtraction of CO2 emission that is required for typical OH*/CH* intensity ratio calibrations. An unbiased spectral data set (not used in the PLS-R model development), for 28 different equivalence ratio conditions ranging from 0.71 to 1.67, was used to predict equivalence ratios using the PLS-R and the intensity ratio calibration models. It was found that the equivalence ratios predicted with the PLS-R based multivariate calibration model matched the experimentally measured equivalence ratios within 7%; whereas, the OH*/CH* intensity ratio calibration grossly underpredicted equivalence ratios in comparison to measured equivalence ratios, especially under rich conditions ( > 1.2). The practical implications of the chemiluminescence-based multivariate equivalence ratio sensing methodology are also discussed.

  18. Contribution of Monomeric Anthocyanins to the Color of Young Red Wine: Statistical and Experimental Approaches.

    PubMed

    Han, Fu Liang; Li, Zheng; Xu, Yan

    2015-12-01

    Monomeric anthocyanin contributions to young red wine color were investigated using partial least square regression (PLSR) and aqueous alcohol solutions in this study. Results showed that the correlation between the anthocyanin concentration and the solution color fitted in a quadratic regression rather than linear or cubic regression. Malvidin-3-O-glucoside was estimated to show the highest contribution to young red wine color according to its concentration in wine, whereas peonidin-3-O-glucoside in its concentration contributed the least. The PLSR suggested that delphinidin-3-O-glucoside and peonidin-3-O-glucoside under the same concentration resulted in a stronger color of young red wine compared with malvidin-3-O-glucoside. These estimates were further confirmed by their color in aqueous alcohol solutions. These results suggested that delphinidin-3-O-glucoside and peonidin-3-O-glucoside were primary anthocyanins to enhance young red wine color by increasing their concentrations. This study could provide an alternative approach to improve young red wine color by adjusting anthocyanin composition and concentration. © 2015 Institute of Food Technologists®

  19. Estimation of soil clay and organic matter using two quantitative methods (PLSR and MARS) based on reflectance spectroscopy

    NASA Astrophysics Data System (ADS)

    Nawar, Said; Buddenbaum, Henning; Hill, Joachim

    2014-05-01

    A rapid and inexpensive soil analytical technique is needed for soil quality assessment and accurate mapping. This study investigated a method for improved estimation of soil clay (SC) and organic matter (OM) using reflectance spectroscopy. Seventy soil samples were collected from Sinai peninsula in Egypt to estimate the soil clay and organic matter relative to the soil spectra. Soil samples were scanned with an Analytical Spectral Devices (ASD) spectrometer (350-2500 nm). Three spectral formats were used in the calibration models derived from the spectra and the soil properties: (1) original reflectance spectra (OR), (2) first-derivative spectra smoothened using the Savitzky-Golay technique (FD-SG) and (3) continuum-removed reflectance (CR). Partial least-squares regression (PLSR) models using the CR of the 400-2500 nm spectral region resulted in R2 = 0.76 and 0.57, and RPD = 2.1 and 1.5 for estimating SC and OM, respectively, indicating better performance than that obtained using OR and SG. The multivariate adaptive regression splines (MARS) calibration model with the CR spectra resulted in an improved performance (R2 = 0.89 and 0.83, RPD = 3.1 and 2.4) for estimating SC and OM, respectively. The results show that the MARS models have a great potential for estimating SC and OM compared with PLSR models. The results obtained in this study have potential value in the field of soil spectroscopy because they can be applied directly to the mapping of soil properties using remote sensing imagery in arid environment conditions. Key Words: soil clay, organic matter, PLSR, MARS, reflectance spectroscopy.

  20. A comparison of two adaptive multivariate analysis methods (PLSR and ANN) for winter wheat yield forecasting using Landsat-8 OLI images

    NASA Astrophysics Data System (ADS)

    Chen, Pengfei; Jing, Qi

    2017-02-01

    An assumption that the non-linear method is more reasonable than the linear method when canopy reflectance is used to establish the yield prediction model was proposed and tested in this study. For this purpose, partial least squares regression (PLSR) and artificial neural networks (ANN), representing linear and non-linear analysis methods, respectively, were applied and compared for wheat yield prediction. Multi-period Landsat-8 OLI images were collected at two different wheat growth stages, and a field campaign was conducted to obtain grain yields at selected sampling sites in 2014. The field data were divided into a calibration database and a testing database. Using calibration data, a cross-validation concept was introduced for the PLSR and ANN model construction to prevent over-fitting. All models were tested using the test data. The ANN yield-prediction model produced R2, RMSE and RMSE% values of 0.61, 979 kg ha-1, and 10.38%, respectively, in the testing phase, performing better than the PLSR yield-prediction model, which produced R2, RMSE, and RMSE% values of 0.39, 1211 kg ha-1, and 12.84%, respectively. The non-linear method was suggested as a better method for yield prediction.

  1. Classification and quantitation of milk powder by near-infrared spectroscopy and mutual information-based variable selection and partial least squares

    NASA Astrophysics Data System (ADS)

    Chen, Hui; Tan, Chao; Lin, Zan; Wu, Tong

    2018-01-01

    Milk is among the most popular nutrient source worldwide, which is of great interest due to its beneficial medicinal properties. The feasibility of the classification of milk powder samples with respect to their brands and the determination of protein concentration is investigated by NIR spectroscopy along with chemometrics. Two datasets were prepared for experiment. One contains 179 samples of four brands for classification and the other contains 30 samples for quantitative analysis. Principal component analysis (PCA) was used for exploratory analysis. Based on an effective model-independent variable selection method, i.e., minimal-redundancy maximal-relevance (MRMR), only 18 variables were selected to construct a partial least-square discriminant analysis (PLS-DA) model. On the test set, the PLS-DA model based on the selected variable set was compared with the full-spectrum PLS-DA model, both of which achieved 100% accuracy. In quantitative analysis, the partial least-square regression (PLSR) model constructed by the selected subset of 260 variables outperforms significantly the full-spectrum model. It seems that the combination of NIR spectroscopy, MRMR and PLS-DA or PLSR is a powerful tool for classifying different brands of milk and determining the protein content.

  2. Rapid Isolation and Detection for RNA Biomarkers for TBI Diagnostics

    DTIC Science & Technology

    2015-10-01

    V., Grape and wine sensory attributes correlate with pattern-based discrimination of Cabernet Sauvignon wines by a peptidic sensor array, Tetrahedron... wine samples. Partial Least Squares Regression (PLSR) was used for the correlation of wine sensory attributes to the peptide-based receptor...responses. Data analysis was done using the software XLSTAT (Addinsoft, New York) and R. Absorbance values due to wine without the sensing ensembles were

  3. Rapid and non-destructive determination of rancidity levels in butter cookies by multi-spectral imaging.

    PubMed

    Xia, Qing; Liu, Changhong; Liu, Jinxia; Pan, Wenjuan; Lu, Xuzhong; Yang, Jianbo; Chen, Wei; Zheng, Lei

    2016-03-30

    Rancidity is an important attribute for quality assessment of butter cookies, while traditional methods for rancidity measurement are usually laborious, destructive and prone to operational error. In the present paper, the potential of applying multi-spectral imaging (MSI) technology with 19 wavelengths in the range of 405-970 nm to evaluate the rancidity in butter cookies was investigated. Moisture content, acid value and peroxide value were determined by traditional methods and then related with the spectral information by partial least squares regression (PLSR) and back-propagation artificial neural network (BP-ANN). The optimal models for predicting moisture content, acid value and peroxide value were obtained by PLSR. The correlation coefficient (r) obtained by PLSR models revealed that MSI had a perfect ability to predict moisture content (r = 0.909), acid value (r = 0.944) and peroxide value (r = 0.971). The study demonstrated that the rancidity level of butter cookies can be continuously monitored and evaluated in real-time by the multi-spectral imaging, which is of great significance for developing online food safety monitoring solutions. © 2015 Society of Chemical Industry.

  4. Qualitative and quantitative detection of honey adulterated with high-fructose corn syrup and maltose syrup by using near-infrared spectroscopy.

    PubMed

    Li, Shuifang; Zhang, Xin; Shan, Yang; Su, Donglin; Ma, Qiang; Wen, Ruizhi; Li, Jiaojuan

    2017-03-01

    Near-infrared spectroscopy (NIR) was used for qualitative and quantitative detection of honey adulterated with high-fructose corn syrup (HFCS) or maltose syrup (MS). Competitive adaptive reweighted sampling (CARS) was employed to select key variables. Partial least squares linear discriminant analysis (PLS-LDA) was adopted to classify the adulterated honey samples. The CARS-PLS-LDA models showed an accuracy of 86.3% (honey vs. adulterated honey with HFCS) and 96.1% (honey vs. adulterated honey with MS), respectively. PLS regression (PLSR) was used to predict the extent of adulteration in the honeys. The results showed that NIR combined with PLSR could not be used to quantify adulteration with HFCS, but could be used to quantify adulteration with MS: the coefficient of determination (R2p) and root mean square error of prediction (RMSEP) were 0.901 and 4.041 for MS-adulterated samples from different floral origins, and 0.981 and 1.786 for MS-adulterated samples from the same floral origin (Brassica spp.), respectively. Copyright © 2016. Published by Elsevier Ltd.

  5. A partial least square regression method to quantitatively retrieve soil salinity using hyper-spectral reflectance data

    NASA Astrophysics Data System (ADS)

    Qu, Yonghua; Jiao, Siong; Lin, Xudong

    2008-10-01

    Hetao Irrigation District, located in Inner Mongolia, is one of the three largest irrigated areas in China. Because far more effort has been put into irrigation than into drainage in this agricultural region, much of the salt that is normally dissolved in water has been deposited in the surface soil. As a result, soil salinity has become a chief factor causing land degradation in the district. Remote sensing is an efficient way to map salinity at the regional scale. In remote sensing, the soil spectrum is one of the most important indicators of soil salinity status. In past decades, many efforts have been made to reveal the spectral characteristics of salinized soil, for example with traditional statistical regression methods. However, traditional regression methods cannot handle hyper-spectral reflectance data, because such data usually have too many spectral bands. In this paper, a partial least squares regression (PLSR) model was established based on a statistical analysis of soil salinity and hyper-spectral reflectance. The dataset was built from field soil samples collected in the Hetao irrigation region from the end of July to the beginning of August. Independent validation using data not included in the calibration model reveals that the proposed model can predict the main soil components, such as the content of total ions (S%) and pH, with determination coefficients (R2) of 0.728 and 0.715, respectively. The ratio of prediction to deviation (RPD) of these predicted values is larger than 1.6, which indicates that the calibrated PLSR model can be used as a tool to retrieve soil salinity with accurate results. When the PLSR model's regression coefficients were aggregated according to the wavelengths of the visible (blue, green, red) and near-infrared bands of the Landsat Thematic Mapper (TM) sensor, some significant response values were observed, which indicates that the proposed method can be used to analyse remotely sensed data from space-borne platforms.
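
The RPD statistic mentioned in this abstract can be sketched as below. This is a hedged illustration that assumes the common chemometric definition, the sample standard deviation of the reference values divided by the root mean square error of prediction; the abstract does not give its exact formula, and the numbers are hypothetical.

```python
import math

def rpd(observed, predicted):
    """RPD = SD(observed) / RMSEP, assuming the usual chemometric definition."""
    n = len(observed)
    mean_obs = sum(observed) / n
    # Sample standard deviation of the reference (observed) values.
    sd = math.sqrt(sum((o - mean_obs) ** 2 for o in observed) / (n - 1))
    # Root mean square error of prediction on the validation samples.
    rmsep = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return sd / rmsep

# Hypothetical validation values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 1.5, 3.0, 4.5, 4.5]
value = rpd(observed, predicted)
```

A larger RPD means the prediction error is small relative to the natural spread of the property being predicted.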

  6. Impacts of land use change on watershed streamflow and sediment yield: An assessment using hydrologic modelling and partial least squares regression

    NASA Astrophysics Data System (ADS)

    Yan, B.; Fang, N. F.; Zhang, P. C.; Shi, Z. H.

    2013-03-01

    Understanding how changes in individual land use types influence the dynamics of streamflow and sediment yield would greatly improve the predictability of the hydrological consequences of land use changes and could thus help stakeholders to make better decisions. Multivariate statistics are commonly used to compare individual land use types to control the dynamics of streamflow or sediment yields. However, one issue with the use of conventional statistical methods to address relationships between land use types and streamflow or sediment yield is multicollinearity. In this study, an integrated approach involving hydrological modelling and partial least squares regression (PLSR) was used to quantify the contributions of changes in individual land use types to changes in streamflow and sediment yield. In a case study, hydrological modelling was conducted using land use maps from four time periods (1978, 1987, 1999, and 2007) for the Upper Du watershed (8973 km2) in China using the Soil and Water Assessment Tool (SWAT). Changes in streamflow and sediment yield across the two simulations conducted using the land use maps from 2007 and 1978 were found to be related to land use changes according to a PLSR, which was used to quantify the effect of this influence at the sub-basin scale. The major land use changes that affected streamflow in the studied catchment areas were related to changes in the farmland, forest and urban areas between 1978 and 2007; the corresponding regression coefficients were 0.232, -0.147 and 1.256, respectively, and the Variable Influence on Projection (VIP) was greater than 1. The dominant first-order factors affecting the changes in sediment yield in our study were: farmland (the VIP and regression coefficient were 1.762 and 14.343, respectively) and forest (the VIP and regression coefficient were 1.517 and -7.746, respectively). The PLSR methodology presented in this paper is beneficial and novel, as it partially eliminates the co-dependency of the variables and facilitates a more unbiased view of the contribution of the changes in individual land use types to changes in streamflow and sediment yield. This practicable and simple approach could be applied to a variety of other watersheds for which time-sequenced digital land use maps are available.
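
The VIP values cited in this abstract can be illustrated with a short sketch. It assumes the commonly used definition, in which each predictor's squared normalised weight on every PLS component is weighted by the response variance that component explains; the abstract does not spell out its formula, and the weights and sums of squares below are hypothetical rather than taken from the study.

```python
import math

def vip_scores(weights, explained_ss):
    """Variable Influence on Projection, assuming the usual definition.

    weights[a][j]   : PLS weight of predictor j on component a
    explained_ss[a] : response sum of squares explained by component a
    """
    n_predictors = len(weights[0])
    total_ss = sum(explained_ss)
    scores = []
    for j in range(n_predictors):
        acc = 0.0
        for w_a, ss_a in zip(weights, explained_ss):
            norm_sq = sum(w * w for w in w_a)  # normalise each weight vector
            acc += ss_a * (w_a[j] ** 2) / norm_sq
        scores.append(math.sqrt(n_predictors * acc / total_ss))
    return scores

# Hypothetical two-component model with three predictors.
weights = [[0.8, 0.5, 0.33], [0.1, -0.7, 0.7]]
explained_ss = [6.0, 2.0]
vip = vip_scores(weights, explained_ss)
```

Under this definition the squared VIP values average to 1 across predictors, which is why VIP greater than 1 is a common threshold for an above-average contribution.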

  7. Fresh Biomass Estimation in Heterogeneous Grassland Using Hyperspectral Measurements and Multivariate Statistical Analysis

    NASA Astrophysics Data System (ADS)

    Darvishzadeh, R.; Skidmore, A. K.; Mirzaie, M.; Atzberger, C.; Schlerf, M.

    2014-12-01

    Accurate estimation of grassland biomass at its peak productivity can provide crucial information regarding the functioning and productivity of the rangelands. Hyperspectral remote sensing has proved to be valuable for estimation of vegetation biophysical parameters such as biomass using different statistical techniques. However, in statistical analysis of hyperspectral data, multicollinearity is a common problem due to the large number of correlated hyper-spectral reflectance measurements. The aim of this study was to examine the prospect of above ground biomass estimation in a heterogeneous Mediterranean rangeland employing multivariate calibration methods. Canopy spectral measurements were made in the field using a GER 3700 spectroradiometer, along with concomitant in situ measurements of above ground biomass for 170 sample plots. Multivariate calibrations including partial least squares regression (PLSR), principal component regression (PCR), and Least-Squared Support Vector Machine (LS-SVM) were used to estimate the above ground biomass. The prediction accuracy of the multivariate calibration methods was assessed using cross-validated R2 and RMSE. The best model performance was obtained using LS-SVM followed by PLSR, both calibrated with the first-derivative reflectance dataset (R2cv = 0.88 and 0.86; RMSEcv = 1.15 and 1.07, respectively). The weakest prediction accuracy was obtained when PCR was used (R2cv = 0.31 and RMSEcv = 2.48). The obtained results highlight the importance of multivariate calibration methods for biomass estimation when hyperspectral data are used.
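
The first-derivative reflectance preprocessing mentioned above can be sketched with simple finite differences. This is only an illustration: the abstract does not say how the derivative was computed (smoothed derivatives such as Savitzky-Golay filters are also common), and the function name and spectrum values are hypothetical.

```python
def first_derivative(wavelengths, reflectance):
    """Finite-difference derivative of a spectrum with respect to wavelength.

    Central differences are used in the interior and one-sided
    differences at the two ends of the spectrum.
    """
    n = len(wavelengths)
    deriv = []
    for i in range(n):
        lo = max(i - 1, 0)
        hi = min(i + 1, n - 1)
        deriv.append((reflectance[hi] - reflectance[lo])
                     / (wavelengths[hi] - wavelengths[lo]))
    return deriv

# Hypothetical four-band spectrum (wavelength in nm, reflectance as a fraction).
wavelengths = [400.0, 410.0, 420.0, 430.0]
reflectance = [0.10, 0.12, 0.16, 0.16]
d = first_derivative(wavelengths, reflectance)
```

Differentiating removes constant baseline offsets between spectra, which is one reason derivative spectra are often used as inputs to PLSR.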

  8. Soil-Bacterium Compatibility Model as a Decision-Making Tool for Soil Bioremediation.

    PubMed

    Horemans, Benjamin; Breugelmans, Philip; Saeys, Wouter; Springael, Dirk

    2017-02-07

    Bioremediation of organic pollutant contaminated soil involving bioaugmentation with dedicated bacteria specialized in degrading the pollutant is suggested as a green and economically sound alternative to physico-chemical treatment. However, intrinsic soil characteristics impact the success of bioaugmentation. The feasibility of using partial least-squares regression (PLSR) to predict the success of bioaugmentation in contaminated soil based on the intrinsic physico-chemical soil characteristics and, hence, to improve the success of bioaugmentation, was examined. As a proof of principle, PLSR was used to build soil-bacterium compatibility models to predict the bioaugmentation success of the phenanthrene-degrading Novosphingobium sp. LH128. The survival and biodegradation activity of strain LH128 were measured in 20 soils and correlated with the soil characteristics. PLSR was able to predict the strain's survival using 12 variables or less while the PAH-degrading activity of strain LH128 in soils that show survival was predicted using 9 variables. A three-step approach using the developed soil-bacterium compatibility models is proposed as a decision making tool and first estimation to select compatible soils and organisms and increase the chance of success of bioaugmentation.

  9. Process spectroscopy in microemulsions—Raman spectroscopy for online monitoring of a homogeneous hydroformylation process

    NASA Astrophysics Data System (ADS)

    Paul, Andrea; Meyer, Klas; Ruiken, Jan-Paul; Illner, Markus; Müller, David-Nicolas; Esche, Erik; Wozny, Günther; Westad, Frank; Maiwald, Michael

    2017-03-01

    A major industrial reaction based on homogeneous catalysis is hydroformylation for the production of aldehydes from alkenes and syngas. Hydroformylation in microemulsions, which is currently under investigation at Technische Universität Berlin on a mini-plant scale, was identified as a cost efficient approach which also enhances product selectivity. Herein, we present the application of online Raman spectroscopy on the reaction of 1-dodecene to 1-tridecanal within a microemulsion. To achieve a good representation of the operation range in the mini-plant with regard to concentrations of the reactants a design of experiments was used. Based on initial Raman spectra partial least squares regression (PLSR) models were calibrated for the prediction of 1-dodecene and 1-tridecanal. Limits of predictions arise from nonlinear correlations between Raman intensity and mass fractions of compounds in the microemulsion system. Furthermore, the prediction power of PLSR models becomes limited due to unexpected by-product formation. Application of the lab-scale derived calibration spectra and PLSR models on online spectra from a mini-plant operation yielded promising estimations of 1-tridecanal and acceptable predictions of 1-dodecene mass fractions suggesting Raman spectroscopy as a suitable technique for process analytics in microemulsions.

  10. Boiling points of halogenated aliphatic compounds: a quantitative structure-property relationship for prediction and validation.

    PubMed

    Oberg, Tomas

    2004-01-01

    Halogenated aliphatic compounds have many technical uses, but substances within this group are also ubiquitous environmental pollutants that can affect the ozone layer and contribute to global warming. The establishment of quantitative structure-property relationships is of interest not only to fill in gaps in the available database but also to validate experimental data already acquired. The three-dimensional structures of 240 compounds were modeled with molecular mechanics prior to the generation of empirical descriptors. Two bilinear projection methods, principal component analysis (PCA) and partial-least-squares regression (PLSR), were used to identify outliers. PLSR was subsequently used to build a multivariate calibration model by extracting the latent variables that describe most of the covariation between the molecular structure and the boiling point. Boiling points were also estimated with an extension of the group contribution method of Stein and Brown.
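
The idea of extracting latent variables that capture the covariation between descriptors and a single response can be sketched with the NIPALS form of PLS1. This is a generic textbook algorithm written in plain Python under stated assumptions (mean-centred data, one response, no scaling); it is not the software or the exact procedure used in the study, and all names and numbers are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fit_pls1(X, y, n_components):
    """Fit PLS1 by NIPALS on mean-centred data; returns (x_mean, y_mean, components)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # X residual
    f = [v - y_mean for v in y]                                # y residual
    components = []
    for _ in range(n_components):
        # Weight vector: direction in X with maximal covariance with y.
        w = [dot([E[i][j] for i in range(n)], f) for j in range(p)]
        norm = dot(w, w) ** 0.5
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [dot(row, w) for row in E]                         # latent variable scores
        tt = dot(t, t)
        loading = [dot([E[i][j] for i in range(n)], t) / tt for j in range(p)]
        q = dot(f, t) / tt                                     # regression of y on t
        # Deflate X and y before extracting the next component.
        E = [[E[i][j] - t[i] * loading[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, loading, q))
    return x_mean, y_mean, components

def predict_pls1(model, x):
    """Predict one sample by repeating the deflation steps on its centred descriptors."""
    x_mean, y_mean, components = model
    e = [v - m for v, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, loading, q in components:
        t = dot(e, w)
        y_hat += q * t
        e = [v - t * l for v, l in zip(e, loading)]
    return y_hat

# Hypothetical data in which the response is exactly 2*x1 - x2 + 1.
X = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 8.0]]
y = [2.0 * a - b + 1.0 for a, b in X]
model = fit_pls1(X, y, n_components=2)
prediction = predict_pls1(model, [6.0, 4.0])
```

With as many components as independent descriptors the model coincides with ordinary least squares; in practice far fewer components are kept, with the number usually chosen by cross-validation.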

  11. Rapid analysis of pharmaceutical drugs using LIBS coupled with multivariate analysis.

    PubMed

    Tiwari, P K; Awasthi, S; Kumar, R; Anand, R K; Rai, P K; Rai, A K

    2018-02-01

    Type 2 diabetes drug tablets containing voglibose having dose strengths of 0.2 and 0.3 mg of various brands have been examined, using laser-induced breakdown spectroscopy (LIBS) technique. The statistical methods such as the principal component analysis (PCA) and the partial least square regression analysis (PLSR) have been employed on LIBS spectral data for classifying and developing the calibration models of drug samples. We have developed the ratio-based calibration model applying PLSR in which relative spectral intensity ratios H/C, H/N and O/N are used. Further, the developed model has been employed to predict the relative concentration of element in unknown drug samples. The experiment has been performed in air and argon atmosphere, respectively, and the obtained results have been compared. The present model provides rapid spectroscopic method for drug analysis with high statistical significance for online control and measurement process in a wide variety of pharmaceutical industrial applications.

  12. High-Throughput Field Phenotyping of Leaves, Leaf Sheaths, Culms and Ears of Spring Barley Cultivars at Anthesis and Dough Ripeness.

    PubMed

    Barmeier, Gero; Schmidhalter, Urs

    2017-01-01

    To optimize plant architecture (e.g., photosynthetic active leaf area, leaf-stem ratio), plant physiologists and plant breeders rely on destructively and tediously harvested biomass samples. A fast and non-destructive method for obtaining information about different plant organs could be vehicle-based spectral proximal sensing. In this 3-year study, the mobile phenotyping platform PhenoTrac 4 was used to compare the measurements from active and passive spectral proximal sensors of leaves, leaf sheaths, culms and ears of 34 spring barley cultivars at anthesis and dough ripeness. Published vegetation indices (VI), partial least square regression (PLSR) models and contour map analysis were compared to assess these traits. Contour maps are matrices consisting of coefficients of determination for all of the binary combinations of wavelengths and the biomass parameters. The PLSR models of leaves, leaf sheaths and culms showed strong correlations (R2 = 0.61-0.76). Published vegetation indices depicted similar coefficients of determination; however, their RMSEs were higher. No wavelength combination could be found by the contour map analysis to improve the results of the PLSR or published VIs. The best results were obtained for the dry weight and N uptake of leaves and culms. The PLSR models yielded satisfactory relationships for leaf sheaths at anthesis (R2 = 0.69), whereas only a low performance for all of sensors and methods was observed at dough ripeness. No relationships with ears were observed. Active and passive sensors performed comparably, with slight advantages observed for the passive spectrometer. The results indicate that tractor-based proximal sensing in combination with optimized spectral indices or PLSR models may represent a suitable tool for plant breeders to assess relevant morphological traits, allowing for a better understanding of plant architecture, which is closely linked to the physiological performance. 
Further validation of PLSR models is required in independent studies. Organ specific phenotyping represents a first step toward breeding by design.

  13. High-Throughput Field Phenotyping of Leaves, Leaf Sheaths, Culms and Ears of Spring Barley Cultivars at Anthesis and Dough Ripeness

    PubMed Central

    Barmeier, Gero; Schmidhalter, Urs

    2017-01-01

    To optimize plant architecture (e.g., photosynthetic active leaf area, leaf-stem ratio), plant physiologists and plant breeders rely on destructively and tediously harvested biomass samples. A fast and non-destructive method for obtaining information about different plant organs could be vehicle-based spectral proximal sensing. In this 3-year study, the mobile phenotyping platform PhenoTrac 4 was used to compare the measurements from active and passive spectral proximal sensors of leaves, leaf sheaths, culms and ears of 34 spring barley cultivars at anthesis and dough ripeness. Published vegetation indices (VI), partial least square regression (PLSR) models and contour map analysis were compared to assess these traits. Contour maps are matrices consisting of coefficients of determination for all of the binary combinations of wavelengths and the biomass parameters. The PLSR models of leaves, leaf sheaths and culms showed strong correlations (R2 = 0.61–0.76). Published vegetation indices depicted similar coefficients of determination; however, their RMSEs were higher. No wavelength combination could be found by the contour map analysis to improve the results of the PLSR or published VIs. The best results were obtained for the dry weight and N uptake of leaves and culms. The PLSR models yielded satisfactory relationships for leaf sheaths at anthesis (R2 = 0.69), whereas only a low performance for all of sensors and methods was observed at dough ripeness. No relationships with ears were observed. Active and passive sensors performed comparably, with slight advantages observed for the passive spectrometer. The results indicate that tractor-based proximal sensing in combination with optimized spectral indices or PLSR models may represent a suitable tool for plant breeders to assess relevant morphological traits, allowing for a better understanding of plant architecture, which is closely linked to the physiological performance. 
Further validation of PLSR models is required in independent studies. Organ specific phenotyping represents a first step toward breeding by design. PMID:29163629

  14. Discrimination and Measurements of Three Flavonols with Similar Structure Using Terahertz Spectroscopy and Chemometrics

    NASA Astrophysics Data System (ADS)

    Yan, Ling; Liu, Changhong; Qu, Hao; Liu, Wei; Zhang, Yan; Yang, Jianbo; Zheng, Lei

    2018-03-01

    Terahertz (THz) technique, a recently developed spectral method, has been researched and used for the rapid discrimination and measurement of food compositions due to its low-energy and non-ionizing characteristics. In this study, THz spectroscopy combined with chemometrics has been utilized for qualitative and quantitative analysis of myricetin, quercetin, and kaempferol with concentrations of 0.025, 0.05, and 0.1 mg/mL. The qualitative discrimination was achieved by KNN, ELM, and RF models with the spectra pre-treatments. An excellent discrimination (100% CCR in the prediction set) could be achieved using the RF model. Furthermore, the quantitative analyses were performed by partial least square regression (PLSR) and least squares support vector machine (LS-SVM). Compared to the PLSR models, the LS-SVM models yielded better results, with lower RMSEP (0.0044, 0.0039, and 0.0048), higher Rp (0.9601, 0.9688, and 0.9359), and higher RPD (8.6272, 9.6333, and 7.9083) for myricetin, quercetin, and kaempferol, respectively. Our results demonstrate that THz spectroscopy is a powerful tool for identification of three flavonols with similar chemical structures and quantitative determination of their concentrations.

  15. Predicting organic matter, nitrogen, and phosphorus concentrations in runoff from peat extraction sites using partial least squares regression

    NASA Astrophysics Data System (ADS)

    Tuukkanen, T.; Marttila, H.; Kløve, B.

    2017-07-01

    Organic matter and nutrient export from drained peatlands is affected by complex hydrological and biogeochemical interactions. Here partial least squares regression (PLSR) was used to relate various soil and catchment characteristics to variations in chemical oxygen demand (COD), total nitrogen (TN), and total phosphorus (TP) concentrations in runoff. Peat core samples and water quality data were collected from 15 peat extraction sites in Finland. PLSR models constructed by cross-validation and variable selection routines predicted 92, 88, and 95% of the variation in mean COD, TN, and TP concentration in runoff, respectively. The results showed that variations in COD were mainly related to net production (temperature and water-extractable dissolved organic carbon (DOC)), hydrology (topographical relief), and solubility of dissolved organic matter (peat sulfur (S) and calcium (Ca) concentrations). Negative correlations for peat S and runoff COD indicated that acidity from oxidation of organic S stored in peat may be an important mechanism suppressing organic matter leaching. Moreover, runoff COD was associated with peat aluminum (Al), P, and sodium (Na) concentrations. Hydrological controls on TN and COD were similar (i.e., related to topography), whereas degree of humification, bulk density, and water-extractable COD and Al provided additional explanations for TN concentration. Variations in runoff TP concentration were attributed to erosion of particulate P, as indicated by a positive correlation with suspended sediment concentration (SSC), and factors associated with metal-humic complexation and P adsorption (peat Al, water-extractable P, and water-extractable iron (Fe)).

  16. Canopy Spectral Reflectance as a Predictor of Soil Water Potential in Rice

    NASA Astrophysics Data System (ADS)

    Panigrahi, N.; Das, B. S.

    2018-04-01

    Soil water potential (SWP) is a key parameter for characterizing water stress. Typically, a tensiometer is used to measure SWP. However, the measurement range for commercially available tensiometers is limited to -90 kPa and a tensiometer can only provide an estimate of SWP at a single location. In this study, a new approach was developed for estimating SWP from spectral reflectance data of a standing rice crop over the visible to shortwave-infrared region (wavelength: 350-2,500 nm). Five water stress treatments corresponding to targeted SWP of -30, -50, -70, -120, and -140 kPa were examined by withholding irrigation during the vegetative growth stage of three rice varieties. Tensiometers and a mechanistic water flow model were used for monitoring SWP. Spectral models for SWP were developed using partial-least-squares regression (PLSR), support vector regression (SVR), and coupled PLSR and feature selection (PLSRFS) approaches. Results showed that the SVR approach was the best model for estimating SWP from spectral reflectance data, with coefficient of determination values of 0.71 and 0.55 for the calibration and validation data sets, respectively. Observed root-mean-squared residuals for the predicted SWPs were in the range of -7 to -19 kPa. A new spectral water stress index was also developed using the reflectance values at 745 and 2,002 nm, which showed strong correlation with relative water contents and electrolyte leakage. This new approach is rapid and noninvasive and may be used for estimating SWP over large areas.

  17. Radioecological modelling of Polonium-210 and Caesium-137 in lichen-reindeer-man and top predators.

    PubMed

    Persson, Bertil R R; Gjelsvik, Runhild; Holm, Elis

    2018-06-01

    This work deals with analysis and modelling of the radionuclides 210Pb and 210Po in the food-chain lichen-reindeer-man in addition to 210Po and 137Cs in top predators. By using the methods of Partial Least Square Regression (PLSR) the atmospheric deposition of 210Pb and 210Po is predicted at the sample locations. Dynamic modelling of the activity concentration with differential equations is fitted to the sample data. Reindeer lichen consumption, gastrointestinal absorption, organ distribution and elimination are derived from information in the literature. Dynamic modelling of transfer of 210Pb and 210Po to reindeer meat, liver and bone from lichen consumption fitted well with data from Sweden and Finland from 1966 to 1971. The activity concentration of 210Pb in the skeleton in man is modelled by using the results of studying the kinetics of lead in skeleton and blood in lead-workers after end of occupational exposure. The result of modelling 210Pb and 210Po activity in skeleton matched well with concentrations of 210Pb and 210Po in teeth from reindeer-breeders and autopsy bone samples in Finland. The previously published results for 210Po and 137Cs in different tissues of wolf, wolverine and lynx are analysed with multivariate data processing methods such as Principal Component Analysis (PCA), and modelled with the method of Projection to Latent Structures (PLS), also known as Partial Least Square Regression (PLSR). Copyright © 2017 Elsevier Ltd. All rights reserved.

  18. Partial Least Squares with Structured Output for Modelling the Metabolomics Data Obtained from Complex Experimental Designs: A Study into the Y-Block Coding.

    PubMed

    Xu, Yun; Muhamadali, Howbeer; Sayqal, Ali; Dixon, Neil; Goodacre, Royston

    2016-10-28

    Partial least squares (PLS) is one of the most commonly used supervised modelling approaches for analysing multivariate metabolomics data. PLS is typically employed as either a regression model (PLS-R) or a classification model (PLS-DA). However, in metabolomics studies it is common to investigate multiple, potentially interacting, factors simultaneously following a specific experimental design. Such data often cannot be considered as a "pure" regression or a classification problem. Nevertheless, these data have often still been treated as a regression or classification problem and this could lead to ambiguous results. In this study, we investigated the feasibility of designing a hybrid target matrix Y that better reflects the experimental design than simple regression or binary class membership coding commonly used in PLS modelling. The new design of Y coding was based on the same principle used by structural modelling in machine learning techniques. Two real metabolomics datasets were used as examples to illustrate how the new Y coding can improve the interpretability of the PLS model compared to classic regression/classification coding.
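
The two classic Y-block codings that this abstract contrasts with its hybrid design can be sketched as follows. This is only an illustration of regression coding and binary class-membership coding; the hybrid target matrix proposed by the authors is not reproduced here, and the labels, doses and function name are hypothetical.

```python
def one_hot(labels):
    """Binary class-membership coding: one 0/1 column per class (PLS-DA style)."""
    classes = sorted(set(labels))
    return classes, [[1 if lab == c else 0 for c in classes] for lab in labels]

# Hypothetical experimental factors for four samples.
labels = ["control", "treated", "treated", "control"]
doses = [0.0, 1.0, 2.0, 0.0]

# Classification coding: Y has one column per class.
classes, Y_class = one_hot(labels)

# Regression coding: Y is a single numeric column (PLS-R style).
Y_reg = [[d] for d in doses]
```

Neither coding alone represents a design in which a categorical factor and a quantitative factor vary together, which is the situation the study addresses.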

  19. Estimation of water quality by UV/Vis spectrometry in the framework of treated wastewater reuse.

    PubMed

    Carré, Erwan; Pérot, Jean; Jauzein, Vincent; Lin, Liming; Lopez-Ferber, Miguel

    2017-07-01

    The aim of this study is to investigate the potential of ultraviolet/visible (UV/Vis) spectrometry as a complementary method for routine monitoring of reclaimed water production. Robustness of the models and compliance of their sensitivity with current quality limits are investigated. The following indicators are studied: total suspended solids (TSS), turbidity, chemical oxygen demand (COD) and nitrate. Partial least squares regression (PLSR) is used to find linear correlations between absorbances and indicators of interest. Artificial samples are made by simulating a sludge leak on the wastewater treatment plant and added to the original dataset, then divided into calibration and prediction datasets. The models are built on the calibration set, and then tested on the prediction set. The best models are developed with: PLSR for COD (R2pred = 0.80), TSS (R2pred = 0.86) and turbidity (R2pred = 0.96), and with a simple linear regression from absorbance at 208 nm (R2pred = 0.95) for nitrate concentration. The input of artificial data significantly enhances the robustness of the models. The sensitivity of the UV/Vis spectrometry monitoring system developed is compatible with quality requirements of reclaimed water production processes.
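
The single-wavelength calibration described for nitrate amounts to an ordinary least squares line through (absorbance, concentration) pairs. The sketch below shows that calculation under the assumption of an intercept term; the function name and the calibration values are hypothetical and are not data from the study.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical calibration set: absorbance at one wavelength vs. nitrate (mg/L).
absorbance = [0.1, 0.2, 0.3, 0.4]
nitrate = [5.0, 9.0, 13.0, 17.0]
slope, intercept = fit_line(absorbance, nitrate)

# Predict the concentration of a new sample from its absorbance.
predicted_nitrate = slope * 0.25 + intercept
```

A univariate model like this is adequate when one wavelength carries most of the signal; PLSR becomes useful when the indicator depends on many correlated wavelengths.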

  20. New robust sensitive fluorescence spectroscopy coupled with PLSR for estimation of quercetin in Ziziphus mucronata and Ziziphus sativa

    NASA Astrophysics Data System (ADS)

    Hussain, Javid; Mabood, Fazal; Al-Harrasi, Ahmed; Ali, Liaqat; Rizvi, Tania Shamim; Jabeen, Farah; Gilani, Syed Abdullah; Shinwari, Shehla; Ahmad, Mushtaq; Alabri, Zahra Khalfan; Al Ghawi, Said Hamood Salim

    2018-04-01

    Flavonoids are natural antioxidants derived from plants and commonly found in a variety of foods to sequester free radicals. Quercetin, belonging to the flavonol subclass of flavonoids, has received considerable attention because of its wide use as a nutritional supplement as well as a phytochemical remedy for a number of diseases. In the current study, quantification of quercetin was carried out in two medicinally important flavonoid-rich plants, Ziziphus mucronata and Ziziphus sativa. Emission spectroscopy was utilized as a new method coupled with Partial Least Squares Regression (PLSR), and the cross validation was done by UV-Visible spectroscopy. The results indicated a higher quercetin content in Z. mucronata (1.50 ± 0.034%) than in Z. sativa (1.21 ± 0.052%), and were further verified through the Folin-Ciocalteu colorimetric method (Z. mucronata: 1.41 ± 0.26%; Z. sativa: 1.13 ± 0.136%). In this study the sensitivity was expressed in terms of the slope, i.e. slope = 0.9973.

  1. Electronic tongue-based discrimination of Korean rice wines (makgeolli) including prediction of sensory evaluation and instrumental measurements.

    PubMed

    Kang, Bo-Sik; Lee, Jang-Eun; Park, Hyun-Jin

    2014-05-15

    A commercial electronic tongue was used to discriminate Korean rice wines (makgeolli) brewed from nine cultivars of rice with different amino acid and fatty acid compositions. The E-tongue was applied to establish prediction models with sensory evaluation or LC-MS/MS by partial least squares regression (PLSR). All makgeollis were classified into three groups by principal components analysis, and the separation pattern was affected by rice qualities and yeast fermentation. Makgeolli taste changed from complex, comprising sweetness, saltiness, and umami, to simple, such as bitterness and then sourness, with a decrease of amino acids and fatty acids in the rice. The quantitative correlation between E-tongue and sensory scores or LC-MS/MS by PLSR demonstrated that the E-tongue could predict most of the sensory attributes well with relatively acceptable r2, except for bitterness, but could not predict most of the chemical compounds responsible for taste attributes, except for ribose, lactate, succinate, and tryptophan. Copyright © 2013 Elsevier Ltd. All rights reserved.

  2. Near-Infrared Spectroscopy as an Analytical Process Technology for the On-Line Quantification of Water Precipitation Processes during Danhong Injection.

    PubMed

    Liu, Xuesong; Wu, Chunyan; Geng, Shu; Jin, Ye; Luan, Lianjun; Chen, Yong; Wu, Yongjiang

    2015-01-01

    This paper used near-infrared (NIR) spectroscopy for the on-line quantitative monitoring of water precipitation during Danhong injection. For these NIR measurements, two fiber optic probes designed to transmit NIR radiation through a 2 mm flow cell were used to collect spectra in real time. Partial least squares regression (PLSR) was adopted as the preferred chemometric method for quantitative analysis of the critical intermediate quality attributes: the concentrations of danshensu (DSS, (R)-3,4-dihydroxyphenyllactic acid), protocatechuic aldehyde (PA), rosmarinic acid (RA), and salvianolic acid B (SAB). Optimized PLSR models were successfully built and used for on-line determination of the DSS, PA, RA, and SAB concentrations during water precipitation. In addition, the DSS, PA, RA, and SAB concentrations could be fed back immediately to on-site technical personnel for timely control and adjustment. The verification experiments showed that the predicted values agreed with the corresponding actual values.

  3. Prediction of erodibility in Oxisols using iron oxides, soil color and diffuse reflectance spectroscopy

    NASA Astrophysics Data System (ADS)

    Arantes Camargo, Livia; Marques, José, Jr.

    2015-04-01

    The prediction of erodibility using indirect methods such as diffuse reflectance spectroscopy could facilitate the characterization of spatial variability over large areas and optimize the implementation of conservation practices. The aim of this study was to evaluate the prediction of interrill erodibility (Ki) and rill erodibility (Kr) from iron oxide content and soil color using multiple linear regression, and from diffuse reflectance spectroscopy (DRS) using partial least squares regression (PLSR). The soils were collected from three geomorphic surfaces, analyzed for chemical, physical and mineralogical properties, and scanned in the visible and infrared spectral range. Maps of the spatial distribution of Ki and Kr were built using geostatistics, with the values calculated by the calibrated models that obtained the best accuracy. Interrill and rill erodibility were negatively correlated with iron extracted by dithionite-citrate-bicarbonate, hematite, and chroma, confirming the influence of iron oxides on soil structural stability. Hematite and hue were the attributes that contributed most to the multiple linear regression calibration models for the prediction of Ki (R2 = 0.55) and Kr (R2 = 0.53). Diffuse reflectance spectroscopy via PLSR allowed interrill and rill erodibility to be predicted with high accuracy (R2adj = 0.76 and 0.81, respectively, and RPD > 2.0) in the visible range of the spectrum (380-800 nm), and allowed the spatial variability of these attributes to be characterized by geostatistics.

  4. Discrimination and prediction of cultivation age and parts of Panax ginseng by Fourier-transform infrared spectroscopy combined with multivariate statistical analysis.

    PubMed

    Lee, Byeong-Ju; Kim, Hye-Youn; Lim, Sa Rang; Huang, Linfang; Choi, Hyung-Kyoon

    2017-01-01

    Panax ginseng C.A. Meyer is a herb used for medicinal purposes, and its discrimination according to cultivation age has been an important and practical issue. This study employed Fourier-transform infrared (FT-IR) spectroscopy with multivariate statistical analysis to obtain a prediction model for discriminating cultivation ages (5 and 6 years) and three different parts (rhizome, tap root, and lateral root) of P. ginseng. The optimal partial-least-squares regression (PLSR) models for discriminating ginseng samples were determined by selecting normalization methods, number of partial-least-squares (PLS) components, and variable influence on projection (VIP) cutoff values. The best prediction model for discriminating 5- and 6-year-old ginseng was developed using tap root, vector normalization applied after the second differentiation, one PLS component, and a VIP cutoff of 1.0 (based on the lowest root-mean-square error of prediction value). In addition, for discriminating among the three parts of P. ginseng, optimized PLSR models were established using data sets obtained from vector normalization, two PLS components, and VIP cutoff values of 1.5 (for 5-year-old ginseng) and 1.3 (for 6-year-old ginseng). To our knowledge, this is the first study to provide a novel strategy for rapidly discriminating the cultivation ages and parts of P. ginseng using FT-IR by selected normalization methods, number of PLS components, and VIP cutoff values.

  5. Discrimination and prediction of cultivation age and parts of Panax ginseng by Fourier-transform infrared spectroscopy combined with multivariate statistical analysis

    PubMed Central

    Lim, Sa Rang; Huang, Linfang

    2017-01-01

    Panax ginseng C.A. Meyer is a herb used for medicinal purposes, and its discrimination according to cultivation age has been an important and practical issue. This study employed Fourier-transform infrared (FT-IR) spectroscopy with multivariate statistical analysis to obtain a prediction model for discriminating cultivation ages (5 and 6 years) and three different parts (rhizome, tap root, and lateral root) of P. ginseng. The optimal partial-least-squares regression (PLSR) models for discriminating ginseng samples were determined by selecting normalization methods, number of partial-least-squares (PLS) components, and variable influence on projection (VIP) cutoff values. The best prediction model for discriminating 5- and 6-year-old ginseng was developed using tap root, vector normalization applied after the second differentiation, one PLS component, and a VIP cutoff of 1.0 (based on the lowest root-mean-square error of prediction value). In addition, for discriminating among the three parts of P. ginseng, optimized PLSR models were established using data sets obtained from vector normalization, two PLS components, and VIP cutoff values of 1.5 (for 5-year-old ginseng) and 1.3 (for 6-year-old ginseng). To our knowledge, this is the first study to provide a novel strategy for rapidly discriminating the cultivation ages and parts of P. ginseng using FT-IR by selected normalization methods, number of PLS components, and VIP cutoff values. PMID:29049369

  6. Comparison of univariate and multivariate calibration for the determination of micronutrients in pellets of plant materials by laser induced breakdown spectrometry

    NASA Astrophysics Data System (ADS)

    Braga, Jez Willian Batista; Trevizan, Lilian Cristina; Nunes, Lidiane Cristina; Rufini, Iolanda Aparecida; Santos, Dário, Jr.; Krug, Francisco José

    2010-01-01

    The application of laser induced breakdown spectrometry (LIBS) aimed at the direct analysis of plant materials is a great challenge that still needs efforts for its development and validation. To this end, a series of experimental approaches has been carried out in order to show that LIBS can be used as an alternative to methods based on wet acid digestion for the analysis of agricultural and environmental samples. The large amount of information provided by LIBS spectra for these complex samples increases the difficulty of selecting the most appropriate wavelengths for each analyte. Some applications have suggested that improvements in both accuracy and precision can be achieved by applying multivariate calibration to LIBS data when compared to univariate regression developed with line emission intensities. In the present work, the performance of univariate and multivariate calibration, based on partial least squares regression (PLSR), was compared for the analysis of pellets of plant materials made from an appropriate mixture of cryogenically ground samples with cellulose as the binding agent. The development of a specific PLSR model for each analyte and the selection of spectral regions containing only lines of the analyte of interest were the best conditions for the analysis. In this particular application, these models showed a similar performance, but PLSR seemed to be more robust due to a lower occurrence of outliers in comparison to the univariate method. The data suggest that efforts dealing with sample presentation and fitness of standards for LIBS analysis must be made in order to fulfill the boundary conditions for matrix independent development and validation.

  7. Data Pre-Processing Method to Remove Interference of Gas Bubbles and Cell Clusters During Anaerobic and Aerobic Yeast Fermentations in a Stirred Tank Bioreactor

    NASA Astrophysics Data System (ADS)

    Princz, S.; Wenzel, U.; Miller, R.; Hessling, M.

    2014-11-01

    One aerobic and four anaerobic batch fermentations of the yeast Saccharomyces cerevisiae were conducted in a stirred bioreactor and monitored inline by NIR spectroscopy and a transflectance dip probe. From the acquired NIR spectra, chemometric partial least squares regression (PLSR) models for predicting biomass, glucose and ethanol were constructed. The spectra were directly measured in the fermentation broth and successfully inspected for adulteration using our novel data pre-processing method. These adulterations manifested as strong fluctuations in the shape and offset of the absorption spectra. They resulted from cells, cell clusters, or gas bubbles intercepting the optical path of the dip probe. In the proposed data pre-processing method, adulterated signals are removed by passing the time-scanned non-averaged spectra through two filter algorithms with a 5% quantile cutoff. The filtered spectra containing meaningful data are then averaged. A second step checks whether the whole time scan is analyzable. If true, the average is calculated and used to prepare the PLSR models. This new method distinctly improved the prediction results. To dissociate possible correlations between analyte concentrations, such as glucose and ethanol, the feeding analytes were alternately supplied at different concentrations (spiking) at the end of the four anaerobic fermentations. This procedure yielded low-error (anaerobic) PLSR models for predicting analyte concentrations of 0.31 g/l for biomass, 3.41 g/l for glucose, and 2.17 g/l for ethanol. The maximum concentrations were 14 g/l biomass, 167 g/l glucose, and 80 g/l ethanol. Data from the aerobic fermentation, carried out under high agitation and high aeration, were incorporated to realize combined PLSR models, which have not been previously reported to our knowledge.

  8. Multi-parameters monitoring during traditional Chinese medicine concentration process with near infrared spectroscopy and chemometrics

    NASA Astrophysics Data System (ADS)

    Liu, Ronghua; Sun, Qiaofeng; Hu, Tian; Li, Lian; Nie, Lei; Wang, Jiayue; Zhou, Wanhui; Zang, Hengchang

    2018-03-01

    As a powerful process analytical technology (PAT) tool, near infrared (NIR) spectroscopy has been widely used in real-time monitoring. In this study, NIR spectroscopy was applied to monitor multiple parameters of the traditional Chinese medicine (TCM) Shenzhiling oral liquid during the concentration process to guarantee product quality. Five lab-scale batches were employed to construct quantitative models to determine five chemical ingredients and a physical change (sample density) during the concentration process. Paeoniflorin, albiflorin, liquiritin and sample density were modeled by partial least square regression (PLSR), while the contents of glycyrrhizic acid and cinnamic acid were modeled by support vector machine regression (SVMR). Standard normal variate (SNV) and/or Savitzky-Golay (SG) smoothing with derivative methods were adopted for spectral pretreatment. Variable selection methods including correlation coefficient (CC), competitive adaptive reweighted sampling (CARS) and interval partial least squares regression (iPLS) were performed to optimize the models. The results indicated that NIR spectroscopy was an effective tool for monitoring the concentration process of Shenzhiling oral liquid.
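    The standard normal variate (SNV) pretreatment named in this abstract can be illustrated with a minimal sketch. It assumes the usual definition (each spectrum is centered on its own mean and divided by its own sample standard deviation); the function name `snv` and the toy absorbance values are hypothetical and are not taken from the paper.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: scale one spectrum by its own mean and std."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    if s == 0:
        raise ValueError("SNV is undefined for a constant spectrum")
    return [(x - m) / s for x in spectrum]


# Hypothetical toy spectra: the second is the first with a multiplicative
# gain (x2) and an additive offset (+0.9), mimicking a scatter effect.
spectra = [
    [0.10, 0.20, 0.30, 0.40],
    [1.10, 1.30, 1.50, 1.70],
]
corrected = [snv(s) for s in spectra]
```

    Because SNV removes a per-spectrum offset and gain, the two toy spectra become numerically identical after correction, which is the effect the pretreatment is meant to achieve before a PLSR or SVMR model is fitted.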

  9. Potential of Visible and Near Infrared Spectroscopy and Pattern Recognition for Rapid Quantification of Notoginseng Powder with Adulterants

    PubMed Central

    Nie, Pengcheng; Wu, Di; Sun, Da-Wen; Cao, Fang; Bao, Yidan; He, Yong

    2013-01-01

    Notoginseng is a classical traditional Chinese medical herb, which is of high economic and medical value. Notoginseng powder (NP) could be easily adulterated with Sophora flavescens powder (SFP) or corn flour (CF), because of their similar tastes and appearances and much lower cost for these adulterants. The objective of this study is to quantify the NP content in adulterated NP by using a rapid and non-destructive visible and near infrared (Vis-NIR) spectroscopy method. Three wavelength ranges of visible spectra, short-wave near infrared spectra (SNIR) and long-wave near infrared spectra (LNIR) were separately used to establish the model based on two calibration methods of partial least square regression (PLSR) and least-squares support vector machines (LS-SVM), respectively. Competitive adaptive reweighted sampling (CARS) was conducted to identify the most important wavelengths/variables that had the greatest influence on the adulterant quantification throughout the whole wavelength range. The CARS-PLSR models based on LNIR were determined as the best models for the quantification of NP adulterated with SFP, CF, and their mixtures, in which the rP values were 0.940, 0.939, and 0.867 for the three models respectively. The research demonstrated the potential of the Vis-NIR spectroscopy technique for the rapid and non-destructive quantification of NP containing adulterants. PMID:24129019

  10. Rapid prediction of phenolic compounds and antioxidant activity of Sudanese honey using Raman and Fourier transform infrared (FT-IR) spectroscopy.

    PubMed

    Tahir, Haroon Elrasheid; Xiaobo, Zou; Zhihua, Li; Jiyong, Shi; Zhai, Xiaodong; Wang, Sheng; Mariod, Abdalbasit Adam

    2017-07-01

    Fourier transform infrared with attenuated total reflectance (FTIR-ATR) and Raman spectroscopy combined with partial least square regression (PLSR) were applied for the prediction of phenolic compounds and antioxidant activity in honey. Standards of catechin, syringic, vanillic, and chlorogenic acids were used for the identification and quantification of the individual phenolic compounds in six honey varieties using HPLC-DAD. Total antioxidant activity (TAC) and ferrous chelating capacity were measured spectrophotometrically. For the establishment of the PLSR model, Raman spectra with Savitzky-Golay smoothing in the wavenumber region 1500-400 cm−1 were used, while for FTIR-ATR the wavenumber regions 1800-700 and 3000-2800 cm−1 with multiplicative scattering correction (MSC) and Savitzky-Golay smoothing were used. The determination coefficients (R2) ranged from 0.9272 to 0.9992 for Raman and from 0.9461 to 0.9988 for FTIR-ATR. FTIR-ATR and Raman spectroscopy proved to be simple, rapid and nondestructive methods for quantifying phenolic compounds and antioxidant activities in honey. Copyright © 2017 Elsevier Ltd. All rights reserved.

  11. Quantification of extra virgin olive oil in dressing and edible oil blends using the representative TMS-4,4'-desmethylsterols gas-chromatographic-normalized fingerprint.

    PubMed

    Pérez-Castaño, Estefanía; Sánchez-Viñas, Mercedes; Gázquez-Evangelista, Domingo; Bagur-González, M Gracia

    2018-01-15

    This paper describes and discusses the application of trimethylsilyl (TMS)-4,4'-desmethylsterols derivatives chromatographic fingerprints (obtained from an off-line HPLC-GC-FID system) for the quantification of extra virgin olive oil in commercial vinaigrettes, salad dressings and in-house reference materials (i-HRM) using two different Partial Least Square-Regression (PLS-R) multivariate quantification methods. Several data pre-processing strategies were carried out, the complete sequence being: (i) internal normalization; (ii) sampling based on the Nyquist theorem; (iii) internal correlation optimized shifting (icoshift); (iv) baseline correction; (v) mean centering; and (vi) selection of zones. The first model corresponds to a matrix of dimensions 'n×911' variables and the second one to a matrix of dimensions 'n×431' variables. Notably, the two proposed PLS-R models allow the quantification of extra virgin olive oil in binary blends, foodstuffs, etc., when the provided percentage is greater than 25%. Copyright © 2017 Elsevier Ltd. All rights reserved.

  12. Near-infrared hyperspectral imaging and partial least squares regression for rapid and reagentless determination of Enterobacteriaceae on chicken fillets.

    PubMed

    Feng, Yao-Ze; Elmasry, Gamal; Sun, Da-Wen; Scannell, Amalia G M; Walsh, Des; Morcy, Noha

    2013-06-01

    Bacterial pathogens are the main culprits for outbreaks of food-borne illnesses. This study aimed to use the hyperspectral imaging technique as a non-destructive tool for quantitative and direct determination of Enterobacteriaceae loads on chicken fillets. Partial least squares regression (PLSR) models were established and the best model using full wavelengths was obtained in the spectral range 930-1450 nm with coefficients of determination R2 ≥ 0.82 and root mean squared errors (RMSEs) ≤ 0.47 log10 CFU g−1. In further development of simplified models, second derivative spectra and weighted PLS regression coefficients (BW) were utilised to select important wavelengths. However, the three wavelengths (930, 1121 and 1345 nm) selected from BW were competent and more preferred for predicting Enterobacteriaceae loads with R2 of 0.89, 0.86 and 0.87 and RMSEs of 0.33, 0.40 and 0.45 log10 CFU g−1 for calibration, cross-validation and prediction, respectively. Besides, the constructed prediction map provided the distribution of Enterobacteriaceae bacteria on chicken fillets, which cannot be achieved by conventional methods. It was demonstrated that hyperspectral imaging is a potential tool for determining food sanitation and detecting bacterial pathogens on food matrix without using complicated laboratory regimes. Copyright © 2012 Elsevier Ltd. All rights reserved.

  13. Multivariate analysis of ATR-FTIR spectra for assessment of oil shale organic geochemical properties

    USGS Publications Warehouse

    Washburn, Kathryn E.; Birdwell, Justin E.

    2013-01-01

    In this study, attenuated total reflectance (ATR) Fourier transform infrared spectroscopy (FTIR) was coupled with partial least squares regression (PLSR) analysis to relate spectral data to parameters from total organic carbon (TOC) analysis and programmed pyrolysis to assess the feasibility of developing predictive models to estimate important organic geochemical parameters. The advantage of ATR-FTIR over traditional analytical methods is that source rocks can be analyzed in the laboratory or field in seconds, facilitating more rapid and thorough screening than would be possible using other tools. ATR-FTIR spectra, TOC concentrations and Rock–Eval parameters were measured for a set of oil shales from deposits around the world and several pyrolyzed oil shale samples. PLSR models were developed to predict the measured geochemical parameters from infrared spectra. Application of the resulting models to a set of test spectra excluded from the training set generated accurate predictions of TOC and most Rock–Eval parameters. The critical region of the infrared spectrum for assessing S1, S2, Hydrogen Index and TOC consisted of aliphatic organic moieties (2800–3000 cm−1) and the models generated a better correlation with measured values of TOC and S2 than did integrated aliphatic peak areas. The results suggest that combining ATR-FTIR with PLSR is a reliable approach for estimating useful geochemical parameters of oil shales that is faster and requires less sample preparation than current screening methods.
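    The workflow in this abstract (calibrate a PLSR model on training spectra, then predict a property for spectra held out of the training set) can be sketched for the simplest case. The code below is a single-component PLS1 fit in plain Python, not the authors' models: the function names, the three-point "spectra" and the property values are all hypothetical, and real applications would normally use several components chosen by cross-validation.

```python
from math import sqrt


def fit_pls1_one_component(X, y):
    """Fit a one-component PLS1 model; returns (x_mean, y_mean, w, q)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    # Mean-center predictors and response.
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: X-space direction with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sqrt(sum(v * v for v in w))
    if norm == 0:
        raise ValueError("X and y have zero covariance")
    w = [v / norm for v in w]
    # Scores: projection of each centered spectrum onto w.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    # Inner regression of y on the scores.
    q = sum(yc[i] * t[i] for i in range(n)) / sum(v * v for v in t)
    return x_mean, y_mean, w, q


def predict(model, x):
    """Predict the response for one new spectrum x."""
    x_mean, y_mean, w, q = model
    score = sum((x[j] - x_mean[j]) * w[j] for j in range(len(x)))
    return y_mean + q * score


# Hypothetical training set: each spectrum is base + c * d for c = 0..3,
# and the property equals 2 * c + 1.
X_train = [[1.0, 2.0, 3.0], [1.5, 3.0, 2.5], [2.0, 4.0, 2.0], [2.5, 5.0, 1.5]]
y_train = [1.0, 3.0, 5.0, 7.0]
model = fit_pls1_one_component(X_train, y_train)

# Held-out spectrum for c = 2.5, whose true property is 6.0.
y_test_pred = predict(model, [2.25, 4.5, 1.75])
```

    In this toy case the spectra vary along a single direction, so one latent component recovers the property exactly; measured spectra would leave a residual that is usually summarized by statistics such as R2 or RMSE on the test set.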

  14. Estimation of Nitrogen Vertical Distribution by Bi-Directional Canopy Reflectance in Winter Wheat

    PubMed Central

    Huang, Wenjiang; Yang, Qinying; Pu, Ruiliang; Yang, Shaoyuan

    2014-01-01

    Timely measurement of vertical foliage nitrogen distribution is critical for increasing crop yield and reducing environmental impact. In this study, a novel method with partial least square regression (PLSR) and vegetation indices was developed to determine optimal models for extracting vertical foliage nitrogen distribution of winter wheat by using bi-directional reflectance distribution function (BRDF) data. The BRDF data were collected from ground-based hyperspectral reflectance measurements recorded at the Xiaotangshan Precision Agriculture Experimental Base in 2003, 2004 and 2007. The view zenith angles (1) at nadir, 40° and 50°; (2) at nadir, 30° and 40°; and (3) at nadir, 20° and 30° were selected as optical view angles to estimate foliage nitrogen density (FND) at an upper, middle and bottom layer, respectively. For each layer, three optimal PLSR analysis models with FND as a dependent variable and two vegetation indices (nitrogen reflectance index (NRI), normalized pigment chlorophyll index (NPCI) or a combination of NRI and NPCI) at corresponding angles as explanatory variables were established. The experimental results from an independent model verification demonstrated that the PLSR analysis models with the combination of NRI and NPCI as the explanatory variables were the most accurate in estimating FND for each layer. The coefficients of determination (R2) of this model between upper layer-, middle layer- and bottom layer-derived and laboratory-measured foliage nitrogen density were 0.7335, 0.7336, 0.6746, respectively. PMID:25353983

  15. Estimation of nitrogen vertical distribution by bi-directional canopy reflectance in winter wheat.

    PubMed

    Huang, Wenjiang; Yang, Qinying; Pu, Ruiliang; Yang, Shaoyuan

    2014-10-28

    Timely measurement of vertical foliage nitrogen distribution is critical for increasing crop yield and reducing environmental impact. In this study, a novel method with partial least square regression (PLSR) and vegetation indices was developed to determine optimal models for extracting vertical foliage nitrogen distribution of winter wheat by using bi-directional reflectance distribution function (BRDF) data. The BRDF data were collected from ground-based hyperspectral reflectance measurements recorded at the Xiaotangshan Precision Agriculture Experimental Base in 2003, 2004 and 2007. The view zenith angles (1) at nadir, 40° and 50°; (2) at nadir, 30° and 40°; and (3) at nadir, 20° and 30° were selected as optical view angles to estimate foliage nitrogen density (FND) at an upper, middle and bottom layer, respectively. For each layer, three optimal PLSR analysis models with FND as a dependent variable and two vegetation indices (nitrogen reflectance index (NRI), normalized pigment chlorophyll index (NPCI) or a combination of NRI and NPCI) at corresponding angles as explanatory variables were established. The experimental results from an independent model verification demonstrated that the PLSR analysis models with the combination of NRI and NPCI as the explanatory variables were the most accurate in estimating FND for each layer. The coefficients of determination (R2) of this model between upper layer-, middle layer- and bottom layer-derived and laboratory-measured foliage nitrogen density were 0.7335, 0.7336, 0.6746, respectively.

  16. New type of dry substances content meter using microwaves for application in biogas plants.

    PubMed

    Nacke, Thomas; Brückner, Kathleen; Göller, Arndt; Kaufhold, Sebastian; Nakos, Xenia; Noack, Stephan; Stöber, Heinrich; Beckmann, Dieter

    2005-11-01

    Dry substances (DS) are an important index for monitoring and controlling anaerobic co-digestion in biogas plants. We have developed and tested an online meter that measures suspended solids by means of the reflection coefficient of an exiting microwave signal, which is dependent on the dielectric properties of the suspensions. Intelligent models based on partial least squares regression (PLSR) and artificial neural network (ANN) for calibration allow exact and reproducible measurements under different circumstances. This measuring method is appropriate for contactless and online measurements of dry substance contents in biogas plants in a large range from 2-14%.

  17. Desert soil clay content estimation using reflectance spectroscopy preprocessed by fractional derivative

    PubMed Central

    Tiyip, Tashpolat; Ding, Jianli; Zhang, Dong; Liu, Wei; Wang, Fei; Tashpolat, Nigara

    2017-01-01

    Effective pretreatment of spectral reflectance is vital to model accuracy in soil parameter estimation. However, the classic integer derivative has some disadvantages, including spectral information loss and the introduction of high-frequency noise. In this paper, the fractional order derivative algorithm was applied to the pretreatment and partial least squares regression (PLSR) was used to assess the clay content of desert soils. Overall, 103 soil samples were collected from the Ebinur Lake basin in the Xinjiang Uighur Autonomous Region of China, and used as data sets for calibration and validation. Following laboratory measurements of spectral reflectance and clay content, the raw spectral reflectance and absorbance data were treated with fractional derivatives of order 0.0 to 2.0 (order interval: 0.2). The ratio of performance to deviation (RPD), coefficient of determination of calibration (Rc2), root mean square error of calibration (RMSEC), coefficient of determination of prediction (Rp2), and root mean square error of prediction (RMSEP) were used to assess the performance of the predictive models. The results showed that models built on fractional order derivatives performed better than those using the classic integer derivative. Comparison of the predictive performance of 22 models for estimating clay content, calibrated by PLSR, showed that the models based on the 1.8-order fractional derivative of spectral reflectance (Rc2 = 0.907, RMSEC = 0.425%, Rp2 = 0.916, RMSEP = 0.364%, and RPD = 2.484 ≥ 2.000) and absorbance (Rc2 = 0.888, RMSEC = 0.446%, Rp2 = 0.918, RMSEP = 0.383% and RPD = 2.511 ≥ 2.000) were most effective. Furthermore, they performed well in quantitative estimations of the clay content of soils in the study area. PMID:28934274
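    Two of the performance statistics listed in this abstract, RMSEP and RPD, can be computed as in the sketch below. It assumes the common definitions (RMSEP as the root of the mean squared prediction error, RPD as the sample standard deviation of the reference values divided by RMSEP); the paper may use a slightly different variant, and the measured and predicted values here are made up for illustration.

```python
from math import sqrt
from statistics import stdev


def rmsep(measured, predicted):
    """Root mean square error of prediction."""
    n = len(measured)
    return sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def rpd(measured, predicted):
    """Ratio of performance to deviation: SD of reference values / RMSEP."""
    return stdev(measured) / rmsep(measured, predicted)


# Hypothetical validation data (e.g. clay content in %), not from the paper.
measured = [10.0, 12.0, 14.0, 16.0, 18.0]
predicted = [11.0, 11.0, 15.0, 15.0, 19.0]

error = rmsep(measured, predicted)  # every residual is +/-1, so RMSEP = 1.0
ratio = rpd(measured, predicted)    # SD of measured is sqrt(10), so RPD ~ 3.16
```

    An RPD above 2.0, the threshold quoted in the abstract, means the prediction error is less than half the spread of the reference values.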

  18. Moisture Influence Reducing Method for Heavy Metals Detection in Plant Materials Using Laser-Induced Breakdown Spectroscopy: A Case Study for Chromium Content Detection in Rice Leaves.

    PubMed

    Peng, Jiyu; He, Yong; Ye, Lanhan; Shen, Tingting; Liu, Fei; Kong, Wenwen; Liu, Xiaodan; Zhao, Yun

    2017-07-18

    Fast detection of heavy metals in plant materials is crucial for environmental remediation and ensuring food safety. However, most plant materials have a high moisture content, the influence of which cannot simply be ignored. Hence, we proposed a moisture-influence-reducing method for fast detection of heavy metals using laser-induced breakdown spectroscopy (LIBS). First, we investigated the effect of moisture content on signal intensity, stability, and plasma parameters (temperature and electron density) and determined the main factors (experimental parameters F and the change of analyte concentration) influencing the variations of the signal. For chromium content detection, the rice leaves were subjected to a quick drying procedure, and two strategies were further used to reduce the effect of moisture content and shot-to-shot fluctuation. An exponential model based on the intensity of the background was used to correct the actual element concentration in the analyte. Also, the signal-to-background ratio for univariate calibration and partial least squares regression (PLSR) for multivariate calibration were used to compensate for the prediction deviations. The PLSR calibration model obtained the best result, with a correlation coefficient of 0.9669 and a root-mean-square error of 4.75 mg/kg in the prediction set. The preliminary results indicated that the proposed method allowed for the detection of heavy metals in plant materials using LIBS, and that it could possibly be used for element mapping in future work.

  19. A new analytical method for quantification of olive and palm oil in blends with other vegetable edible oils based on the chromatographic fingerprints from the methyl-transesterified fraction.

    PubMed

    Jiménez-Carvelo, Ana M; González-Casado, Antonio; Cuadros-Rodríguez, Luis

    2017-03-01

    A new analytical method was developed for the quantification of olive oil and palm oil in blends with other edible vegetable oils (canola, safflower, corn, peanut, seeds, grapeseed, linseed, sesame and soybean) using normal phase liquid chromatography and chemometric tools. The procedure for obtaining the chromatographic fingerprint from the methyl-transesterified fraction of each blend is described. The multivariate quantification methods used were Partial Least Square-Regression (PLS-R) and Support Vector Regression (SVR). The quantification results were evaluated by several parameters, such as the Root Mean Square Error of Validation (RMSEV), Mean Absolute Error of Validation (MAEV) and Median Absolute Error of Validation (MdAEV). Notably, with the proposed method the chromatographic analysis takes only eight minutes, and the results obtained showed the potential of this method and allowed quantification of mixtures of olive oil and palm oil with other vegetable oils. Copyright © 2016 Elsevier B.V. All rights reserved.

  20. Detection of heavy metal Cd in polluted fresh leafy vegetables by laser-induced breakdown spectroscopy.

    PubMed

    Yao, Mingyin; Yang, Hui; Huang, Lin; Chen, Tianbing; Rao, Gangfu; Liu, Muhua

    2017-05-10

    Laser-induced breakdown spectroscopy (LIBS) was evaluated as a novel, green analytical method for monitoring toxic heavy metal residues in fresh leafy vegetables. The spectra of fresh vegetable samples polluted in the lab were collected with an optimized LIBS experimental setup, and the reference concentrations of cadmium (Cd) in the samples were obtained by conventional atomic absorption spectroscopy after wet digestion. Direct calibration relating the intensity of a single Cd line to Cd concentration exposed the weakness of this calibration method. The accuracy of linear calibration could be improved slightly by using three Cd lines as characteristic variables, especially after the spectra were pretreated. However, this was not sufficient for predicting Cd in the samples. Therefore, partial least-squares regression (PLSR) was utilized to enhance the robustness of the quantitative analysis. The results of the PLSR model showed that the prediction accuracy for Cd can meet the requirements of determination in food safety. This investigation showed that LIBS is a promising and emerging method for analyzing toxic components in agricultural products, especially when combined with suitable chemometrics.

  1. Multimodal Image Analysis in Alzheimer’s Disease via Statistical Modelling of Non-local Intensity Correlations

    NASA Astrophysics Data System (ADS)

    Lorenzi, Marco; Simpson, Ivor J.; Mendelson, Alex F.; Vos, Sjoerd B.; Cardoso, M. Jorge; Modat, Marc; Schott, Jonathan M.; Ourselin, Sebastien

    2016-04-01

    The joint analysis of brain atrophy measured with magnetic resonance imaging (MRI) and hypometabolism measured with positron emission tomography with fluorodeoxyglucose (FDG-PET) is of primary importance in developing models of pathological changes in Alzheimer’s disease (AD). Most of the current multimodal analyses in AD assume a local (spatially overlapping) relationship between MR and FDG-PET intensities. However, it is well known that atrophy and hypometabolism are prominent in different anatomical areas. The aim of this work is to describe the relationship between atrophy and hypometabolism by means of a data-driven statistical model of non-overlapping intensity correlations. For this purpose, FDG-PET and MRI signals are jointly analyzed through a computationally tractable formulation of partial least squares regression (PLSR). The PLSR model is estimated and validated on a large clinical cohort of 1049 individuals from the ADNI dataset. Results show that the proposed non-local analysis outperforms classical local approaches in terms of predictive accuracy while providing a plausible description of disease dynamics: early AD is characterised by non-overlapping temporal atrophy and temporo-parietal hypometabolism, while the later disease stages show overlapping brain atrophy and hypometabolism spread in temporal, parietal and cortical areas.

  2. On the prediction of threshold friction velocity of wind erosion using soil reflectance spectroscopy

    NASA Astrophysics Data System (ADS)

    Li, Junran; Flagg, Cody; Okin, Gregory S.; Painter, Thomas H.; Dintwe, Kebonye; Belnap, Jayne

    2015-12-01

    Current approaches to estimate threshold friction velocity (TFV) of soil particle movement, including both experimental and empirical methods, suffer from various disadvantages, and they are particularly not effective to estimate TFVs at regional to global scales. Reflectance spectroscopy has been widely used to obtain TFV-related soil properties (e.g., moisture, texture, crust, etc.), however, no studies have attempted to directly relate soil TFV to their spectral reflectance. The objective of this study was to investigate the relationship between soil TFV and soil reflectance in the visible and near infrared (VIS-NIR, 350-2500 nm) spectral region, and to identify the best range of wavelengths or combinations of wavelengths to predict TFV. Threshold friction velocity of 31 soils, along with their reflectance spectra and texture were measured in the Mojave Desert, California and Moab, Utah. A correlation analysis between TFV and soil reflectance identified a number of isolated, narrow spectral domains that largely fell into two spectral regions, the VIS area (400-700 nm) and the short-wavelength infrared (SWIR) area (1100-2500 nm). A partial least squares regression analysis (PLSR) confirmed the significant bands that were identified by correlation analysis. The PLSR further identified the strong relationship between the first-difference transformation and TFV at several narrow regions around 1400, 1900, and 2200 nm. The use of PLSR allowed us to identify a total of 17 key wavelengths in the investigated spectrum range, which may be used as the optimal spectral settings for estimating TFV in the laboratory and field, or mapping of TFV using airborne/satellite sensors.
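
    As an illustration of the first-difference transformation mentioned in this abstract, the following minimal Python sketch takes the difference between adjacent bands of a reflectance spectrum. The function name `first_difference` and the sample reflectance values are hypothetical and are not taken from the study; the authors' actual preprocessing (band spacing, smoothing) is not described in the abstract and may differ.

```python
def first_difference(reflectance):
    """Return the difference between each band and the one before it.

    The result has one element fewer than the input spectrum.
    """
    return [later - earlier for earlier, later in zip(reflectance, reflectance[1:])]


# Hypothetical reflectance values at four consecutive bands (illustration only).
spectrum = [0.20, 0.25, 0.35, 0.30]
diffs = first_difference(spectrum)
print(diffs)
```

    A transformed spectrum of this kind emphasizes local changes in slope rather than absolute reflectance, which is one plausible reason narrow regions can stand out after the transformation.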

  3. Mapping Soil Salinity/Sodicity by using Landsat OLI Imagery and PLSR Algorithm over Semiarid West Jilin Province, China

    PubMed Central

    Liu, Mingyue; Du, Baojia; Zhang, Bai

    2018-01-01

    Soil salinity and sodicity can significantly reduce the value and productivity of affected lands, causing degradation and threatening the sustainable development of natural resources. This research attempted to map soil salinity/sodicity by disentangling the relationships between Landsat 8 Operational Land Imager (OLI) imagery and in-situ measurements (EC, pH) over west Jilin, China. We established retrieval models for soil salinity and sodicity using Partial Least Square Regression (PLSR). The spatial distribution of soils subject to hybridized salinity and sodicity (HSS) was obtained by overlay analysis of the soil salinity and sodicity maps in a geographical information system (GIS) environment. We analyzed the severity and extent of soil salinity, sodicity, and HSS with regard to specified soil types and land cover. Results indicated that the models’ accuracy was improved by combining the reflectance bands with mathematically transformed spectral indices. Our results therefore indicated that OLI imagery and the PLSR method are applicable to mapping soil salinity and sodicity in the region. The mapping results revealed that the areas of soil salinity, sodicity, and HSS were 1.61 × 10⁶ hm², 1.46 × 10⁶ hm², and 1.36 × 10⁶ hm², respectively. Also, the area of moderate and intensive sodicity was larger than that of salinity. This research may support efficient mapping of regional salinity/sodicity occurrences, improve understanding of the linkages between spectral reflectance and ground measurements of soil salinity and sodicity, and provide tools for soil salinity monitoring and the sustainable utilization of land resources. PMID:29614727

  4. Multidimensional Single-Cell Analysis of BCR Signaling Reveals Proximal Activation Defect As a Hallmark of Chronic Lymphocytic Leukemia B Cells

    PubMed Central

    Palomba, M. Lia; Piersanti, Kelly; Ziegler, Carly G. K.; Decker, Hugo; Cotari, Jesse W.; Bantilan, Kurt; Rijo, Ivelise; Gardner, Jeff R.; Heaney, Mark; Bemis, Debra; Balderas, Robert; Malek, Sami N.; Seymour, Erlene; Zelenetz, Andrew D.

    2014-01-01

    Purpose Chronic Lymphocytic Leukemia (CLL) is defined by a perturbed B-cell receptor-mediated signaling machinery. We aimed to model differential signaling behavior between B cells from CLL and healthy individuals to pinpoint modes of dysregulation. Experimental Design We developed an experimental methodology combining immunophenotyping, multiplexed phosphospecific flow cytometry, and multifactorial statistical modeling. Utilizing patterns of signaling network covariance, we modeled BCR signaling in 67 CLL patients using Partial Least Squares Regression (PLSR). Results from multidimensional modeling were validated using an independent test cohort of 38 patients. Results We identified a dynamic and variable imbalance between proximal (pSYK, pBTK) and distal (pPLCγ2, pBLNK, ppERK) phosphoresponses. PLSR identified the relationship between upstream tyrosine kinase SYK and its target, PLCγ2, as maximally predictive and sufficient to distinguish CLL from healthy samples, pointing to this juncture in the signaling pathway as a hallmark of CLL B cells. Specific BCR pathway signaling signatures that correlate with the disease and its degree of aggressiveness were identified. Heterogeneity in the PLSR response variable within the B cell population is both a characteristic mark of healthy samples and predictive of disease aggressiveness. Conclusion Single-cell multidimensional analysis of BCR signaling permitted focused analysis of the variability and heterogeneity of signaling behavior from patient-to-patient, and from cell-to-cell. Disruption of the pSYK/pPLCγ2 relationship is uncovered as a robust hallmark of CLL B cell signaling behavior. Together, these observations implicate novel elements of the BCR signal transduction as potential therapeutic targets. PMID:24489640

  5. Determination of Rare Earth Elements in Geological Samples Using Laser-Induced Breakdown Spectroscopy (LIBS).

    PubMed

    Bhatt, Chet R; Jain, Jinesh C; Goueguel, Christian L; McIntyre, Dustin L; Singh, Jagdish P

    2018-01-01

    Laser-induced breakdown spectroscopy (LIBS) was used to detect rare earth elements (REEs) in natural geological samples. Low and high intensity emission lines of Ce, La, Nd, Y, Pr, Sm, Eu, Gd, and Dy were identified in the spectra recorded from the samples to claim the presence of these REEs. Multivariate analysis was executed by developing partial least squares regression (PLS-R) models for the quantification of Ce, La, and Nd. Analysis of unknown samples indicated that the prediction results of these samples were found comparable to those obtained by inductively coupled plasma mass spectrometry analysis. Data support that LIBS has potential to quantify REEs in geological minerals/ores.

  6. Variable selection in near-infrared spectroscopy: benchmarking of feature selection methods on biodiesel data.

    PubMed

    Balabin, Roman M; Smirnov, Sergey V

    2011-04-29

    During the past several years, near-infrared (near-IR/NIR) spectroscopy has increasingly been adopted as an analytical tool in various fields from petroleum to biomedical sectors. The NIR spectrum (above 4000 cm⁻¹) of a sample is typically measured by modern instruments at a few hundred wavelengths. Recently, considerable effort has been directed towards developing procedures to identify variables (wavelengths) that contribute useful information. Variable selection (VS) or feature selection, also called frequency selection or wavelength selection, is a critical step in data analysis for vibrational spectroscopy (infrared, Raman, or NIRS). In this paper, we compare the performance of 16 different feature selection methods for the prediction of properties of biodiesel fuel, including density, viscosity, methanol content, and water concentration. The feature selection algorithms tested include stepwise multiple linear regression (MLR-step), interval partial least squares regression (iPLS), backward iPLS (BiPLS), forward iPLS (FiPLS), moving window partial least squares regression (MWPLS), (modified) changeable size moving window partial least squares (CSMWPLS/MCSMWPLSR), searching combination moving window partial least squares (SCMWPLS), successive projections algorithm (SPA), uninformative variable elimination (UVE, including UVE-SPA), simulated annealing (SA), back-propagation artificial neural networks (BP-ANN), Kohonen artificial neural network (K-ANN), and genetic algorithms (GAs, including GA-iPLS). Two linear techniques for calibration model building, namely multiple linear regression (MLR) and partial least squares regression/projection to latent structures (PLS/PLSR), are used for the evaluation of biofuel properties. A comparison with a non-linear calibration model, artificial neural networks (ANN-MLP), is also provided. Discussion of gasoline, ethanol-gasoline (bioethanol), and diesel fuel data is presented. The results of other spectroscopic techniques, such as Raman, ultraviolet-visible (UV-vis), or nuclear magnetic resonance (NMR) spectroscopies, can be greatly improved by an appropriate feature selection choice. Copyright © 2011 Elsevier B.V. All rights reserved.

  7. Detection and quantification of adulteration in sandalwood oil through near infrared spectroscopy.

    PubMed

    Kuriakose, Saji; Thankappan, Xavier; Joe, Hubert; Venkataraman, Venkateswaran

    2010-10-01

    The confirmation of authenticity of essential oils and the detection of adulteration are problems of increasing importance in the perfumes, pharmaceutical, flavor and fragrance industries. This is especially true for 'value added' products like sandalwood oil. A methodical study is conducted here to demonstrate the potential use of Near Infrared (NIR) spectroscopy along with multivariate calibration models like principal component regression (PCR) and partial least square regression (PLSR) as rapid analytical techniques for the qualitative and quantitative determination of adulterants in sandalwood oil. After suitable pre-processing of the NIR raw spectral data, the models are built up by cross-validation. The lowest Root Mean Square Error of Cross-Validation and Calibration (RMSECV and RMSEC % v/v) are used as a decision supporting system to fix the optimal number of factors. The coefficient of determination (R²) and the Root Mean Square Error of Prediction (RMSEP % v/v) in the prediction sets are used as the evaluation parameters (R² = 0.9999 and RMSEP = 0.01355). The overall result leads to the conclusion that NIR spectroscopy with chemometric techniques could be successfully used as a rapid, simple, instant and non-destructive method for the detection of adulterants, even 1% of the low-grade oils, in the high quality form of sandalwood oil.
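
    The two evaluation parameters named in this abstract can be computed as in the following minimal Python sketch. It assumes the common definitions (root of the mean squared prediction error, and one minus the ratio of residual to total sum of squares); the abstract does not state which formulas the authors used, and the function names and sample values are hypothetical.

```python
import math


def rmse(y_ref, y_pred):
    """Root mean square error between reference and predicted values."""
    squared_errors = [(r - p) ** 2 for r, p in zip(y_ref, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_ref))


def r_squared(y_ref, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_ref = sum(y_ref) / len(y_ref)
    ss_res = sum((r - p) ** 2 for r, p in zip(y_ref, y_pred))
    ss_tot = sum((r - mean_ref) ** 2 for r in y_ref)
    return 1.0 - ss_res / ss_tot


# Hypothetical adulterant levels (% v/v) and model predictions, illustration only.
reference = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
rmsep_value = rmse(reference, predicted)
r2_value = r_squared(reference, predicted)
print(rmsep_value, r2_value)
```

    Applied to a calibration set, a cross-validation set, or an independent prediction set, the same `rmse` function would give RMSEC, RMSECV, or RMSEP respectively.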

  8. Wavelet analysis techniques applied to removing varying spectroscopic background in calibration model for pear sugar content

    NASA Astrophysics Data System (ADS)

    Liu, Yande; Ying, Yibin; Lu, Huishan; Fu, Xiaping

    2005-11-01

    A new method is proposed to eliminate the varying background and noise simultaneously for multivariate calibration of Fourier transform near infrared (FT-NIR) spectral signals. An ideal spectrum signal prototype was constructed based on the FT-NIR spectrum of fruit sugar content measurement. The performances of wavelet based threshold de-noising approaches via different combinations of wavelet base functions were compared. Three families of wavelet base function (Daubechies, Symlets and Coiflets) were applied to estimate the performance of those wavelet bases and threshold selection rules by a series of experiments. The experimental results show that the best de-noising performance is reached via the combinations of Daubechies 4 or Symlet 4 wavelet base function. Based on the optimization parameter, wavelet regression models for sugar content of pear were also developed and resulted in a smaller prediction error than a traditional Partial Least Squares Regression (PLSR) model.
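
    The idea of wavelet threshold de-noising can be sketched in a few lines of Python. Note the substitution: this sketch uses a single-level Haar wavelet with soft thresholding because it is easy to write without external libraries, whereas the abstract reports Daubechies 4 and Symlet 4 as the best-performing bases. The function names, the threshold values and the sample signal are hypothetical, and the input is assumed to have an even number of points.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(signal):
    """One-level Haar transform of an even-length signal."""
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / SQRT2 for a, b in pairs]
    detail = [(a - b) / SQRT2 for a, b in pairs]
    return approx, detail


def soft_threshold(coeffs, threshold):
    """Shrink each coefficient toward zero by `threshold`."""
    return [math.copysign(max(abs(c) - threshold, 0.0), c) for c in coeffs]


def haar_inverse(approx, detail):
    """Invert the one-level Haar transform."""
    signal = []
    for a, d in zip(approx, detail):
        signal.append((a + d) / SQRT2)
        signal.append((a - d) / SQRT2)
    return signal


def denoise(signal, threshold):
    """Threshold only the detail coefficients, then reconstruct."""
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, soft_threshold(detail, threshold))


# Hypothetical four-point signal, illustration only.
noisy = [1.0, 3.0, 2.0, 2.0]
smoothed = denoise(noisy, 10.0)   # large threshold removes all detail
unchanged = denoise(noisy, 0.0)   # zero threshold reproduces the input
print(smoothed, unchanged)
```

    In practice several decomposition levels and a data-driven threshold rule would be used; the sketch only shows the transform, shrink and reconstruct sequence.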

  9. Characterization of taste-active compounds of various cherry wines and their correlation with sensory attributes.

    PubMed

    Niu, Yunwei; Zhang, Xiaoming; Xiao, Zuobing; Song, Shiqing; Jia, Chengsheng; Yu, Haiyan; Fang, Lingling; Xu, Chunhua

    2012-08-01

    Five cherry wines exhibiting marked differences in taste and mouthfeel were selected for the study. The taste and mouthfeel of cherry wines were described by four sensory terms: sour, sweet, bitter and astringent. Eight organic acids, seventeen amino acids, three sugars and tannic acid were determined by high performance liquid chromatography (HPLC). Five phenolic acids were determined by ultra performance liquid chromatography coupled with mass spectrometry (UPLC-MS). The relationship between these taste-active compounds, wine samples and sensory attributes was modeled by partial least squares regression (PLSR). The regression analysis indicated that tartaric acid, methionine, proline, sucrose, glucose, fructose, asparagine, serine, glycine, threonine, phenylalanine, leucine, gallic acid, chlorogenic acid, vanillic acid, arginine and tannic acid made a great contribution to the characteristic taste or mouthfeel of cherry wines. Copyright © 2012 Elsevier B.V. All rights reserved.

  10. Mapping of macro and micro nutrients of mixed pastures using airborne AisaFENIX hyperspectral imagery

    NASA Astrophysics Data System (ADS)

    Pullanagari, R. R.; Kereszturi, Gábor; Yule, I. J.

    2016-07-01

    On-farm assessment of mixed pasture nutrient concentrations is important for animal production and pasture management. Hyperspectral imaging is recognized as a potential tool to quantify the nutrient content of vegetation. However, it is a great challenge to estimate macro and micro nutrients in heterogeneous mixed pastures. In this study, canopy reflectance data were measured using a high resolution airborne visible-to-shortwave infrared (Vis-SWIR) imaging spectrometer covering the wavelength region 380-2500 nm to predict the concentrations of nitrogen (N), phosphorus (P), potassium (K), sulfur (S), zinc (Zn), sodium (Na), manganese (Mn), copper (Cu) and magnesium (Mg) in heterogeneous mixed pastures across a sheep and beef farm in hill country in New Zealand. Prediction models were developed using four different methods, namely partial least squares regression (PLSR), kernel PLSR, support vector regression (SVR) and random forest regression (RFR), and their performance was compared using the test data. The results revealed that RFR produced the highest accuracy (0.55 ⩽ R²CV ⩽ 0.78; 6.68% ⩽ nRMSECV ⩽ 26.47%) of all the algorithms for the majority of nutrients (N, P, K, Zn, Na, Cu and Mg), while the remaining nutrients (S and Mn) were predicted with high accuracy (0.68 ⩽ R²CV ⩽ 0.86; 13.00% ⩽ nRMSECV ⩽ 14.64%) using SVR. The best training models were used to extrapolate over the whole farm in order to predict those pasture nutrients, and the predictions were expressed as pixel-based spatial maps. These spatially registered nutrient maps demonstrate the range and geographical location of often large differences in pasture nutrient values, which are normally not measured and therefore not included in decision making when considering more effective ways to utilize pasture.

  11. A rapid integrated bioactivity evaluation system based on near-infrared spectroscopy for quality control of Flos Chrysanthemi.

    PubMed

    Ding, Guoyu; Li, Baiqing; Han, Yanqi; Liu, Aina; Zhang, Jingru; Peng, Jiamin; Jiang, Min; Hou, Yuanyuan; Bai, Gang

    2016-11-30

    For quality control of herbal medicines or functional foods, integral activity evaluation has become more popular in recent studies. The majority of researchers focus on the relationship between chromatography/mass spectroscopy and bioactivity, but the spectrum-activity relationship is easily overlooked. In this paper, the near infrared reflection spectra (NIRS) of Flos Chrysanthemi samples were collected as a representative spectrum technology, and corresponding anti-inflammation activities were utilized to illustrate the spectrum-activity study. HPLC/Q-TOF-MS identification and heat map clustering were used to select the quality markers (Q-marker) from five cultivars of Flos Chrysanthemi. Using boxplot analysis and the interval limits of detection (LODs) theory, six crucial markers, namely, chlorogenic acid, 3,5-dicaffeoylquinic acid, 1,5-dicaffeoylquinic acid, luteoloside, apigenin-7-O-β-d-glucoside, and luteolin-7-O-6-malonylglucoside were screened out. Then partial least squares regression (PLSR) calibration models combined with synergy interval partial least squares (siPLS) and 12 different spectral pretreatment methods were developed for the parameters optimization of these Q-markers in Flos Chrysanthemi powder. When the relationship between Q-marker contents and anti-inflammation activity was compared across three machine learning approaches and PLSR, the back-propagation neural network (BP-ANN) showed a better non-linear fit, with its R for new batches reaching 0.89. These results indicated that the integrated NIRS and bioactive strategy was suitable for fast quality management in Flos Chrysanthemi, and could also be applied to quality control of other botanical foods. Copyright © 2016 Elsevier B.V. All rights reserved.

  12. On the prediction of threshold friction velocity of wind erosion using soil reflectance spectroscopy

    USGS Publications Warehouse

    Li, Junran; Flagg, Cody B.; Okin, Gregory S.; Painter, Thomas H.; Dintwe, Kebonye; Belnap, Jayne

    2015-01-01

    Current approaches to estimate threshold friction velocity (TFV) of soil particle movement, including both experimental and empirical methods, suffer from various disadvantages, and they are particularly not effective to estimate TFVs at regional to global scales. Reflectance spectroscopy has been widely used to obtain TFV-related soil properties (e.g., moisture, texture, crust, etc.), however, no studies have attempted to directly relate soil TFV to their spectral reflectance. The objective of this study was to investigate the relationship between soil TFV and soil reflectance in the visible and near infrared (VIS–NIR, 350–2500 nm) spectral region, and to identify the best range of wavelengths or combinations of wavelengths to predict TFV. Threshold friction velocity of 31 soils, along with their reflectance spectra and texture were measured in the Mojave Desert, California and Moab, Utah. A correlation analysis between TFV and soil reflectance identified a number of isolated, narrow spectral domains that largely fell into two spectral regions, the VIS area (400–700 nm) and the short-wavelength infrared (SWIR) area (1100–2500 nm). A partial least squares regression analysis (PLSR) confirmed the significant bands that were identified by correlation analysis. The PLSR further identified the strong relationship between the first-difference transformation and TFV at several narrow regions around 1400, 1900, and 2200 nm. The use of PLSR allowed us to identify a total of 17 key wavelengths in the investigated spectrum range, which may be used as the optimal spectral settings for estimating TFV in the laboratory and field, or mapping of TFV using airborne/satellite sensors.

  13. Rapid prediction of particulate, humus and resistant fractions of soil organic carbon in reforested lands using infrared spectroscopy.

    PubMed

    Madhavan, Dinesh B; Baldock, Jeff A; Read, Zoe J; Murphy, Simon C; Cunningham, Shaun C; Perring, Michael P; Herrmann, Tim; Lewis, Tom; Cavagnaro, Timothy R; England, Jacqueline R; Paul, Keryn I; Weston, Christopher J; Baker, Thomas G

    2017-05-15

    Reforestation of agricultural lands with mixed-species environmental plantings can effectively sequester C. While accurate and efficient methods for predicting soil organic C content and composition have recently been developed for soils under agricultural land uses, such methods under forested land uses are currently lacking. This study aimed to develop a method using infrared spectroscopy for accurately predicting total organic C (TOC) and its fractions (particulate, POC; humus, HOC; and resistant, ROC organic C) in soils under environmental plantings. Soils were collected from 117 paired agricultural-reforestation sites across Australia. TOC fractions were determined in a subset of 38 reforested soils using physical fractionation by automated wet-sieving and ¹³C nuclear magnetic resonance (NMR) spectroscopy. Mid- and near-infrared spectra (MNIRS, 6000-450 cm⁻¹) were acquired from finely-ground soils from environmental plantings and agricultural land. Satisfactory prediction models based on MNIRS and partial least squares regression (PLSR) were developed for TOC and its fractions. Leave-one-out cross-validations of MNIRS-PLSR models indicated accurate predictions (R² > 0.90, negligible bias, ratio of performance to deviation > 3) and fraction-specific functional group contributions to beta coefficients in the models. TOC and its fractions were predicted using the cross-validated models and soil spectra for 3109 reforested and agricultural soils. The reliability of predictions determined using k-nearest neighbour score distance indicated that >80% of predictions were within the satisfactory inlier limit. The study demonstrated the utility of infrared spectroscopy (MNIRS-PLSR) to rapidly and economically determine TOC and its fractions and thereby accurately describe the effects of land use change such as reforestation on agricultural soils. Copyright © 2017 Elsevier Ltd. All rights reserved.
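
    Two of the validation statistics quoted in this abstract, bias and the ratio of performance to deviation (RPD), can be illustrated with the following minimal Python sketch. It assumes bias is the mean of predicted minus reference values and RPD is the sample standard deviation of the reference values divided by the root mean square error; some authors divide by a bias-corrected standard error instead, and the abstract does not say which convention was used. Function names and sample values are hypothetical.

```python
import math
import statistics


def bias(y_ref, y_pred):
    """Mean signed error (predicted minus reference)."""
    return sum(p - r for r, p in zip(y_ref, y_pred)) / len(y_ref)


def rpd(y_ref, y_pred):
    """Ratio of performance to deviation: SD of reference values / RMSE."""
    rmse = math.sqrt(sum((p - r) ** 2 for r, p in zip(y_ref, y_pred)) / len(y_ref))
    return statistics.stdev(y_ref) / rmse


# Hypothetical organic C contents and cross-validated predictions, illustration only.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.1, 3.9, 5.1]
bias_value = bias(reference, predicted)
rpd_value = rpd(reference, predicted)
print(bias_value, rpd_value)
```

    An RPD well above 3, as in this toy example, means the prediction error is small relative to the natural spread of the reference values.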

  14. Mercury and water level fluctuations in lakes of northern Minnesota

    USGS Publications Warehouse

    Larson, James H.; Maki, Ryan P; Christensen, Victoria G.; Sandheinrich, Mark B.; LeDuc, Jaime F.; Kissane, Claire; Knights, Brent C.

    2017-01-01

    Large lake ecosystems support a variety of ecosystem services in surrounding communities, including recreational and commercial fishing. However, many northern temperate fisheries are contaminated by mercury. Annual variation in mercury accumulation in fish has previously been linked to water level (WL) fluctuations, opening the possibility of regulating water levels in a manner that minimizes or reduces mercury contamination in fisheries. Here, we compiled a long-term dataset (1997-2015) of mercury content in young-of-year Yellow Perch (Perca flavescens) from six lakes on the border between the U.S. and Canada and examined whether mercury content appeared to be related to several metrics of WL fluctuation (e.g., spring WL rise, annual maximum WL, and year-to-year change in maximum WL). Using simple correlation analysis, several WL metrics appear to be strongly correlated to Yellow Perch mercury content, although the strength of these correlations varies by lake. We also used many WL metrics, water quality measurements, temperature and annual deposition data to build predictive models using partial least squares regression (PLSR) analysis for each lake. These PLSR models showed some variation among lakes, but also supported strong associations between WL fluctuations and annual variation in Yellow Perch mercury content. The study lakes underwent a modest change in WL management in 2000, when winter WL minimums were increased by about 1 m in five of the six study lakes. Using the PLSR models, we estimated how this change in WL management would have affected Yellow Perch mercury content. For four of the study lakes, the change in WL management that occurred in 2000 likely reduced Yellow Perch mercury content, relative to the previous WL management regime.

  15. [Application of near infrared spectroscopy combined with particle swarm optimization based least square support vector machine to rapid quantitative analysis of Corni Fructus].

    PubMed

    Liu, Xue-song; Sun, Fen-fang; Jin, Ye; Wu, Yong-jiang; Gu, Zhi-xin; Zhu, Li; Yan, Dong-lan

    2015-12-01

    A novel method was developed for the rapid determination of multi-indicators in corni fructus by means of near infrared (NIR) spectroscopy. Particle swarm optimization (PSO) based least squares support vector machine was investigated to increase the levels of quality control. The calibration models of moisture, extractum, morroniside and loganin were established using the PSO-LS-SVM algorithm. The performance of PSO-LS-SVM models was compared with partial least squares regression (PLSR) and back propagation artificial neural network (BP-ANN). The calibration and validation results of PSO-LS-SVM were superior to both PLS and BP-ANN. For PSO-LS-SVM models, the correlation coefficients (r) of calibrations were all above 0.942. The optimal prediction results were also achieved by PSO-LS-SVM models with the RMSEP (root mean square error of prediction) and RSEP (relative standard errors of prediction) less than 1.176 and 15.5% respectively. The results suggest that PSO-LS-SVM algorithm has a good model performance and high prediction accuracy. NIR has a potential value for rapid determination of multi-indicators in Corni Fructus.

  16. Accuracy of direct genomic values in Holstein bulls and cows using subsets of SNP markers

    PubMed Central

    2010-01-01

    Background At the current price, the use of high-density single nucleotide polymorphisms (SNP) genotyping assays in genomic selection of dairy cattle is limited to applications involving elite sires and dams. The objective of this study was to evaluate the use of low-density assays to predict direct genomic value (DGV) on five milk production traits, an overall conformation trait, a survival index, and two profit index traits (APR, ASI). Methods Dense SNP genotypes were available for 42,576 SNP for 2,114 Holstein bulls and 510 cows. A subset of 1,847 bulls born between 1955 and 2004 was used as a training set to fit models with various sets of pre-selected SNP. A group of 297 bulls born between 2001 and 2004 and all cows born between 1992 and 2004 were used to evaluate the accuracy of DGV prediction. Ridge regression (RR) and partial least squares regression (PLSR) were used to derive prediction equations and to rank SNP based on the absolute value of the regression coefficients. Four alternative strategies were applied to select subsets of SNP, namely: subsets of the highest ranked SNP for each individual trait, or a single subset of evenly spaced SNP, where SNP were selected based on their rank for ASI, APR or minor allele frequency within intervals of approximately equal length. Results RR and PLSR performed very similarly to predict DGV, with PLSR performing better for low-density assays and RR for higher-density SNP sets. When using all SNP, DGV predictions for production traits, which have a higher heritability, were more accurate (0.52-0.64) than for survival (0.19-0.20), which has a low heritability. The gain in accuracy using subsets that included the highest ranked SNP for each trait was marginal (5-6%) over a common set of evenly spaced SNP when at least 3,000 SNP were used. Subsets containing 3,000 SNP provided more than 90% of the accuracy that could be achieved with a high-density assay for cows, and 80% of the high-density assay for young bulls. Conclusions Accurate genomic evaluation of the broader bull and cow population can be achieved with a single genotyping assay containing ~ 3,000 to 5,000 evenly spaced SNP. PMID:20950478

  17. Identification of the Rice Wines with Different Marked Ages by Electronic Nose Coupled with Smartphone and Cloud Storage Platform

    PubMed Central

    Wei, Zhebo; Xiao, Xize

    2017-01-01

    In this study, a portable electronic nose (E-nose) was developed in-house to identify rice wines with different marked ages; all the operations of the E-nose were controlled by a special Smartphone Application. The sensor array of the E-nose was comprised of 12 MOS sensors and the obtained response values were transmitted to the Smartphone through a wireless communication module. Then, Aliyun worked as a cloud storage platform for the storage of responses and identification models. The measurement of the E-nose was composed of the taste information obtained phase (TIOP) and the aftertaste information obtained phase (AIOP). The area feature data obtained from the TIOP and the feature data obtained from the TIOP-AIOP were applied to identify rice wines by using pattern recognition methods. Principal component analysis (PCA), locally linear embedding (LLE) and linear discriminant analysis (LDA) were applied for the classification of those wine samples. LDA based on the area feature data obtained from the TIOP-AIOP proved a powerful tool and showed the best classification results. Partial least-squares regression (PLSR) and support vector machine (SVM) were applied for the predictions of marked ages and SVM (R² = 0.9942) worked much better than PLSR. PMID:29088076

  18. Use of Standing Gold Nanorods for Detection of Malachite Green and Crystal Violet in Fish by SERS.

    PubMed

    Chen, Xiaowei; Nguyen, Trang H D; Gu, Liqun; Lin, Mengshi

    2017-07-01

    With growing consumption of aquaculture products, there is increasing demand for rapid and sensitive techniques that can detect prohibited substances in seafood products. This study aimed to develop a novel surface-enhanced Raman spectroscopy (SERS) method coupled with a simplified extraction protocol and novel gold nanorod (AuNR) substrates to detect banned aquaculture substances (malachite green [MG] and crystal violet [CV]) and their mixture (1:1) in aqueous solution and fish samples. Multivariate statistical tools such as principal component analysis (PCA) and partial least squares regression (PLSR) were used in data analysis. PCA results demonstrate that SERS can distinguish MG, CV and their mixture (1:1) in aqueous solution and in fish samples. The detection limit of SERS coupled with standing AuNR substrates is 1 ppb for both MG and CV in fish samples. A good linear relationship between the actual and predicted concentrations of analytes based on PLSR models, with R² values from 0.87 to 0.99, was obtained, indicating satisfactory quantification results of this method. These results demonstrate that the SERS method coupled with AuNR substrates can be used for rapid and accurate detection of MG and CV in fish samples. © 2017 Institute of Food Technologists®.

  19. Identification of the Rice Wines with Different Marked Ages by Electronic Nose Coupled with Smartphone and Cloud Storage Platform.

    PubMed

    Wei, Zhebo; Xiao, Xize; Wang, Jun; Wang, Hui

    2017-10-31

    In this study, a portable electronic nose (E-nose) was developed in-house to identify rice wines with different marked ages; all the operations of the E-nose were controlled by a special Smartphone Application. The sensor array of the E-nose was comprised of 12 MOS sensors and the obtained response values were transmitted to the Smartphone through a wireless communication module. Then, Aliyun worked as a cloud storage platform for the storage of responses and identification models. The measurement of the E-nose was composed of the taste information obtained phase (TIOP) and the aftertaste information obtained phase (AIOP). The area feature data obtained from the TIOP and the feature data obtained from the TIOP-AIOP were applied to identify rice wines by using pattern recognition methods. Principal component analysis (PCA), locally linear embedding (LLE) and linear discriminant analysis (LDA) were applied for the classification of those wine samples. LDA based on the area feature data obtained from the TIOP-AIOP proved a powerful tool and showed the best classification results. Partial least-squares regression (PLSR) and support vector machine (SVM) were applied for the predictions of marked ages and SVM (R² = 0.9942) worked much better than PLSR.

  20. Prediction of blood-brain partitioning: a model based on molecular electronegativity distance vector descriptors.

    PubMed

    Zhang, Yong-Hong; Xia, Zhi-Ning; Qin, Li-Tang; Liu, Shu-Shen

    2010-09-01

    The objective of this paper is to build a reliable model based on the molecular electronegativity distance vector (MEDV) descriptors for predicting the blood-brain barrier (BBB) permeability and to reveal the effects of the molecular structural segments on the BBB permeability. Using 70 structurally diverse compounds, the partial least squares regression (PLSR) models between the BBB permeability and the MEDV descriptors were developed and validated by the variable selection and modeling based on prediction (VSMP) technique. The estimation ability, stability, and predictive power of a model are evaluated by the estimated correlation coefficient (r), leave-one-out (LOO) cross-validation correlation coefficient (q), and predictive correlation coefficient (R(p)). It was found that the PLSR model has good quality, with r=0.9202, q=0.7956, and R(p)=0.6649 for the M1 model based on the training set of 57 samples. To identify the most important structural factors affecting the BBB permeability of compounds, we analyzed the variable importance in projection (VIP) values of the MEDV descriptors. It was found that some structural fragments in compounds, such as -CH(3), -CH(2)-, =CH-, =C, triple bond C-, -CH<, =C<, =N-, -NH-, =O, and -OH, are the most important factors affecting the BBB permeability. (c) 2010. Published by Elsevier Inc.

  1. Non-invasive prediction of hematocrit levels by portable visible and near-infrared spectrophotometer.

    PubMed

    Sakudo, Akikazu; Kato, Yukiko Hakariya; Kuratsune, Hirohiko; Ikuta, Kazuyoshi

    2009-10-01

    After blood donation, in some individuals having polycythemia, dehydration causes anemia. Although the hematocrit (Ht) level is closely related to anemia, the current method of measuring Ht is performed after blood drawing. Furthermore, the monitoring of Ht levels contributes to a healthy life. Therefore, a non-invasive test for Ht is warranted for the safe donation of blood and good quality of life. A non-invasive procedure for the prediction of hematocrit levels was developed on the basis of a chemometric analysis of visible and near-infrared (Vis-NIR) spectra of the thumbs using portable spectrophotometer. Transmittance spectra in the 600- to 1100-nm region from thumbs of Japanese volunteers were subjected to a partial least squares regression (PLSR) analysis and leave-out cross-validation to develop chemometric models for predicting Ht levels. Ht levels of masked samples predicted by this model from Vis-NIR spectra provided a coefficient of determination in prediction of 0.6349 with a standard error of prediction of 3.704% and a detection limit in prediction of 17.14%, indicating that the model is applicable for normal and abnormal value in Ht level. These results suggest portable Vis-NIR spectrophotometer to have potential for the non-invasive measurement of Ht levels with a combination of PLSR analysis.

  2. Detection of Powdery Mildew in Two Winter Wheat Plant Densities and Prediction of Grain Yield Using Canopy Hyperspectral Reflectance

    PubMed Central

    Cao, Xueren; Luo, Yong; Zhou, Yilin; Fan, Jieru; Xu, Xiangming; West, Jonathan S.; Duan, Xiayu; Cheng, Dengfa

    2015-01-01

    To determine the influence of plant density and powdery mildew infection of winter wheat and to predict grain yield, hyperspectral canopy reflectance of winter wheat was measured for two plant densities at Feekes growth stage (GS) 10.5.3, 10.5.4, and 11.1 in the 2009–2010 and 2010–2011 seasons. Reflectance in near infrared (NIR) regions was significantly correlated with disease index at GS 10.5.3, 10.5.4, and 11.1 at two plant densities in both seasons. For the two plant densities, the area of the red edge peak (Σdr 680–760 nm), difference vegetation index (DVI), and triangular vegetation index (TVI) were significantly correlated negatively with disease index at three GSs in two seasons. Compared with other parameters, Σdr 680–760 nm was the most sensitive parameter for detecting powdery mildew. Linear regression models relating mildew severity to Σdr 680–760 nm were constructed at three GSs in two seasons for the two plant densities, demonstrating no significant difference in the slope estimates between the two plant densities at three GSs. Σdr 680–760 nm was correlated with grain yield at three GSs in two seasons. The accuracies of partial least square regression (PLSR) models were consistently higher than those of models based on Σdr 680–760 nm for disease index and grain yield. PLSR can, therefore, provide more accurate estimation of disease index of wheat powdery mildew and grain yield using canopy reflectance. PMID:25815468

  3. Qualitative Analysis of Dairy and Powder Milk Using Laser-Induced Breakdown Spectroscopy (LIBS).

    PubMed

    Alfarraj, Bader A; Sanghapi, Herve K; Bhatt, Chet R; Yueh, Fang Y; Singh, Jagdish P

    2018-01-01

    Laser-induced breakdown spectroscopy (LIBS) technique was used to compare various types of commercial milk products. Laser-induced breakdown spectroscopy spectra were investigated for the determination of the elemental composition of soy and rice milk powder, dairy milk, and lactose-free dairy milk. The analysis was performed using radiative transitions. Atomic emissions from Ca, K, Na, and Mg lines observed in LIBS spectra of dairy milk were compared. In addition, proteins and fat level in milks can be determined using molecular emissions such as CN bands. Ca concentrations were calculated to be 2.165 ± 0.203 g/L in 1% of dairy milk fat samples and 2.809 ± 0.172 g/L in 2% of dairy milk fat samples using the standard addition method (SAM) with LIBS spectra. Univariate and multivariate statistical analysis methods showed that the contents of major mineral elements were higher in lactose-free dairy milk than those in dairy milk. The principal component analysis (PCA) method was used to discriminate four milk samples depending on their mineral elements concentration. In addition, proteins and fat level in dairy milks were determined using molecular emissions such as CN band. We applied partial least squares regression (PLSR) and simple linear regression (SLR) models to predict levels of milk fat in dairy milk samples. The PLSR model was successfully used to predict levels of milk fat in dairy milk sample with the relative accuracy (RA%) less than 6.62% using CN (0,0) band.

  4. Determination of elemental composition of shale rocks by laser induced breakdown spectroscopy

    NASA Astrophysics Data System (ADS)

    Sanghapi, Hervé K.; Jain, Jinesh; Bol'shakov, Alexander; Lopano, Christina; McIntyre, Dustin; Russo, Richard

    2016-08-01

    In this study laser induced breakdown spectroscopy (LIBS) is used for elemental characterization of outcrop samples from the Marcellus Shale. Powdered samples were pressed to form pellets and used for LIBS analysis. Partial least squares regression (PLS-R) and univariate calibration curves were used for quantification of analytes. The matrix effect is substantially reduced using the partial least squares calibration method. Predicted results with LIBS are compared to ICP-OES results for Si, Al, Ti, Mg, and Ca. As for C, its results are compared to those obtained by a carbon analyzer. Relative errors of the LIBS measurements are in the range of 1.7 to 12.6%. The limits of detection (LODs) obtained for Si, Al, Ti, Mg and Ca are 60.9, 33.0, 15.6, 4.2 and 0.03 ppm, respectively. An LOD of 0.4 wt.% was obtained for carbon. This study shows that the LIBS method can provide a rapid analysis of shale samples and can potentially benefit depleted gas shale carbon storage research.

  5. Multivariate models for prediction of rheological characteristics of filamentous fermentation broth from the size distribution.

    PubMed

    Petersen, Nanna; Stocks, Stuart; Gernaey, Krist V

    2008-05-01

    The main purpose of this article is to demonstrate that principal component analysis (PCA) and partial least squares regression (PLSR) can be used to extract information from particle size distribution data and predict rheological properties. Samples from commercially relevant Aspergillus oryzae fermentations conducted in 550 L pilot scale tanks were characterized with respect to particle size distribution, biomass concentration, and rheological properties. The rheological properties were described using the Herschel-Bulkley model. Estimation of all three parameters in the Herschel-Bulkley model (yield stress (tau(y)), consistency index (K), and flow behavior index (n)) resulted in a large standard deviation of the parameter estimates. The flow behavior index was not found to be correlated with any of the other measured variables and previous studies have suggested a constant value of the flow behavior index in filamentous fermentations. It was therefore chosen to fix this parameter to the average value thereby decreasing the standard deviation of the estimates of the remaining rheological parameters significantly. Using a PLSR model, a reasonable prediction of apparent viscosity (mu(app)), yield stress (tau(y)), and consistency index (K), could be made from the size distributions, biomass concentration, and process information. This provides a predictive method with a high predictive power for the rheology of fermentation broth, and with the advantages over previous models that tau(y) and K can be predicted as well as mu(app). Validation on an independent test set yielded a root mean square error of 1.21 Pa for tau(y), 0.209 Pa s(n) for K, and 0.0288 Pa s for mu(app), corresponding to R(2) = 0.95, R(2) = 0.94, and R(2) = 0.95 respectively. Copyright 2007 Wiley Periodicals, Inc.
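
    For reference, the Herschel-Bulkley model named in this abstract is usually written as below. The shear-rate symbol \dot{\gamma} does not appear in the abstract and is assumed notation; the second expression is the standard definition of apparent viscosity applied to the same model, not a formula quoted from the article.

```latex
\tau = \tau_y + K\,\dot{\gamma}^{\,n}, \qquad
\mu_{\mathrm{app}} = \frac{\tau}{\dot{\gamma}} = \frac{\tau_y}{\dot{\gamma}} + K\,\dot{\gamma}^{\,n-1}
```

    With shear stress in Pa and shear rate in 1/s, the consistency index K carries units of Pa s^n, which agrees with the units reported for K in the abstract. Fixing n, as the authors describe, leaves only the yield stress and K to be estimated.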

  6. Quantitative analysis of curcumin-loaded alginate nanocarriers in hydrogels using Raman and attenuated total reflection infrared spectroscopy.

    PubMed

    Miloudi, Lynda; Bonnier, Franck; Bertrand, Dominique; Byrne, Hugh J; Perse, Xavier; Chourpa, Igor; Munnier, Emilie

    2017-07-01

    Core-shell nanocarriers are increasingly being adapted in cosmetic and dermatological fields, aiming to provide an increased penetration of the active pharmaceutical or cosmetic ingredients (API and ACI) through the skin. In the final form, the nanocarriers (NC) are usually prepared in hydrogels, conferring desired viscous properties for topical application. Combined with the high chemical complexity of the encapsulating system itself, which involves numerous ingredients to form a stable core, quantifying the NC and/or the encapsulated active without labor-intensive and destructive methods remains challenging. In this respect, the specific molecular fingerprint obtained from vibrational spectroscopy analysis could unambiguously overcome current obstacles in the development of fast and cost-effective quality control tools for NC-based products. The present study demonstrates the feasibility of delivering accurate quantification of the concentrations of curcumin (ACI)-loaded alginate nanocarriers in hydrogel matrices, coupling partial least square regression (PLSR) to infrared (IR) absorption and Raman spectroscopic analyses. With respective root mean square errors of 0.1469 ± 0.0175% w/w and 0.4462 ± 0.0631% w/w, both approaches offer acceptable precision. Further investigation of the PLSR results highlighted the different selectivity of each approach, indicating that only IR analysis delivers direct monitoring of the NC through the quantification of the Labrafac®, the main NC ingredient. Raman analyses are instead dominated by the contribution of the ACI, which opens numerous perspectives for quantifying the active molecules without interference from the complex core-shell encapsulating systems, thus positioning the technique as a powerful analytical tool for industrial screening of cosmetic and pharmaceutical products. Graphical abstract: Quantitative analysis of encapsulated active molecules in hydrogel-based samples by means of infrared and Raman spectroscopy.

  7. Structure-activity relationships between sterols and their thermal stability in oil matrix.

    PubMed

    Hu, Yinzhou; Xu, Junli; Huang, Weisu; Zhao, Yajing; Li, Maiquan; Wang, Mengmeng; Zheng, Lufei; Lu, Baiyi

    2018-08-30

    Structure-activity relationships between 20 sterols and their thermal stabilities were studied in a model oil system. All sterol degradations were found to be consistent with a first-order kinetic model, with coefficients of determination (R²) higher than 0.9444. The number of double bonds in the sterol structure was negatively correlated with the thermal stability of the sterol, whereas the length of the branch chain was positively correlated with it. A quantitative structure-activity relationship (QSAR) model to predict the thermal stability of sterols was developed by using partial least squares regression (PLSR) combined with a genetic algorithm (GA). A regression model was built with an R² of 0.806. Almost all sterol degradation constants can be predicted accurately, with a cross-validation R² of 0.680. Four important variables were selected in the optimal QSAR model, and the selected variables were observed to be related to information indices, RDF descriptors, and 3D-MoRSE descriptors. Copyright © 2018 Elsevier Ltd. All rights reserved.
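
    The first-order kinetic check mentioned in this abstract can be illustrated with a small sketch. It assumes the usual linearised form ln C = ln C0 - k t and ordinary least squares; the function name and the example data are hypothetical, and the abstract does not say how the authors actually fitted their degradation constants.

```python
import math


def fit_first_order(times, concentrations):
    """Fit ln(C) = ln(C0) - k*t by ordinary least squares.

    Returns (k, c0, r2), where r2 is the coefficient of determination
    of the linearised fit. Illustrative helper, not the authors' code.
    """
    log_c = [math.log(c) for c in concentrations]
    n = len(times)
    mean_t = sum(times) / n
    mean_y = sum(log_c) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, log_c))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_t
    # Residual and total sums of squares on the log scale.
    ss_res = sum((y - (intercept + slope * t)) ** 2 for t, y in zip(times, log_c))
    ss_tot = sum((y - mean_y) ** 2 for y in log_c)
    return -slope, math.exp(intercept), 1.0 - ss_res / ss_tot


# Made-up example: noise-free decay with k = 0.3 per time unit.
times = [0, 1, 2, 3, 4]
conc = [100.0 * math.exp(-0.3 * t) for t in times]
k, c0, r2 = fit_first_order(times, conc)
```

    A rate constant k estimated this way for each sterol would be the kind of response variable a QSAR model then tries to predict from structural descriptors.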

  8. Improving the prediction of arsenic contents in agricultural soils by combining the reflectance spectroscopy of soils and rice plants

    NASA Astrophysics Data System (ADS)

    Shi, Tiezhu; Wang, Junjie; Chen, Yiyun; Wu, Guofeng

    2016-10-01

    Visible and near-infrared reflectance spectroscopy provides a beneficial tool for investigating soil heavy metal contamination. This study aimed to investigate mechanisms of soil arsenic prediction using laboratory based soil and leaf spectra, compare the prediction of arsenic content using soil spectra with that using rice plant spectra, and determine whether the combination of both could improve the prediction of soil arsenic content. A total of 100 samples were collected and the reflectance spectra of soils and rice plants were measured using a FieldSpec3 portable spectroradiometer (350-2500 nm). After eliminating spectral outliers, the reflectance spectra were divided into calibration (n = 62) and validation (n = 32) data sets using the Kennard-Stone algorithm. Genetic algorithm (GA) was used to select useful spectral variables for soil arsenic prediction. Thereafter, the GA-selected spectral variables of the soil and leaf spectra were individually and jointly employed to calibrate the partial least squares regression (PLSR) models using the calibration data set. The regression models were validated and compared using independent validation data set. Furthermore, the correlation coefficients of soil arsenic against soil organic matter, leaf arsenic and leaf chlorophyll were calculated, and the important wavelengths for PLSR modeling were extracted. Results showed that arsenic prediction using the leaf spectra (coefficient of determination in validation, R²v = 0.54; root mean square error in validation, RMSEv = 12.99 mg kg⁻¹; and residual prediction deviation in validation, RPDv = 1.35) was slightly better than using the soil spectra (R²v = 0.42, RMSEv = 13.35 mg kg⁻¹, and RPDv = 1.31). However, results also showed that the combined use of soil and leaf spectra resulted in better arsenic prediction (R²v = 0.63, RMSEv = 11.94 mg kg⁻¹, RPDv = 1.47) compared with either soil or leaf spectra alone. 
Soil spectral bands near 480, 600, 670, 810, 1980, 2050 and 2290 nm, leaf spectral bands near 700, 890 and 900 nm in PLSR models were important wavelengths for soil arsenic prediction. Moreover, soil arsenic showed significantly positive correlations with soil organic matter (r = 0.62, p < 0.01) and leaf arsenic (r = 0.77, p < 0.01), and a significantly negative correlation with leaf chlorophyll (r = -0.67, p < 0.01). The results showed that the prediction of arsenic contents using soil and leaf spectra may be based on their relationships with soil organic matter and leaf chlorophyll contents, respectively. Although RPD of 1.47 was below the recommended RPD of >2 for soil analysis, arsenic prediction in agricultural soils can be improved by combining the leaf and soil spectra.
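
    The Kennard-Stone split named in this abstract can be sketched as follows. This is a minimal stdlib-only illustration of the commonly described rule (start from the two most distant samples, then repeatedly add the sample farthest from its nearest selected neighbour); the function name, the Euclidean distance, and the toy data are assumptions, and the abstract does not state which distance or preprocessing the authors used.

```python
import math


def kennard_stone(samples, n_select):
    """Return indices of n_select samples chosen by the Kennard-Stone rule."""
    n = len(samples)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and len(samples)")

    # Start with the two samples that are farthest apart.
    best_pair, best_dist = (0, 1), -1.0
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(samples[i], samples[j])
            if d > best_dist:
                best_pair, best_dist = (i, j), d
    selected = list(best_pair)
    remaining = [i for i in range(n) if i not in best_pair]

    # Repeatedly add the sample whose nearest selected neighbour is farthest.
    while len(selected) < n_select:
        next_idx = max(
            remaining,
            key=lambda i: min(math.dist(samples[i], samples[s]) for s in selected),
        )
        selected.append(next_idx)
        remaining.remove(next_idx)
    return selected


# Toy "spectra" with two variables each; selected indices form the calibration set.
toy = [(0, 0), (1, 0), (10, 0), (5, 0), (9, 0)]
calibration_idx = kennard_stone(toy, 3)
validation_idx = [i for i in range(len(toy)) if i not in calibration_idx]
```

    In a setting like the one described, the selected indices would form the calibration set (n = 62) and the remaining samples the validation set (n = 32). The pairwise search is quadratic in the number of samples, which is acceptable for data sets of this size.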

  9. Effects of Subsetting by Parent Materials on Prediction of Soil Organic Matter Content in a Hilly Area Using Vis–NIR Spectroscopy

    PubMed Central

    Xu, Shengxiang; Shi, Xuezheng; Wang, Meiyan; Zhao, Yongcun

    2016-01-01

    Assessment and monitoring of soil organic matter (SOM) quality are important for understanding SOM dynamics and developing management practices that will enhance and maintain the productivity of agricultural soils. Visible and near-infrared (Vis–NIR) diffuse reflectance spectroscopy (350–2500 nm) has received increasing attention over the recent decades as a promising technique for SOM analysis. While heterogeneity of sample sets is one critical factor that complicates the prediction of soil properties from Vis–NIR spectra, a spectral library representing the local soil diversity needs to be constructed. The study area, covering a surface of 927 km² and located in Yujiang County of Jiangsu Province, is characterized by a hilly area with different soil parent materials (e.g., red sandstone, shale, Quaternary red clay, and river alluvium). In total, 232 topsoil (0–20 cm) samples were collected for SOM analysis and scanned with a Vis–NIR spectrometer in the laboratory. Reflectance data were related to surface SOM content by means of a partial least square regression (PLSR) method and several data pre-processing techniques, such as first and second derivatives with a smoothing filter. The performance of the PLSR model was tested under different combinations of calibration/validation sets (global and local calibrations stratified according to parent materials). The results showed that the models based on the global calibrations can only make approximate predictions for SOM content (RMSE (root mean squared error) = 4.23–4.69 g kg⁻¹; R² (coefficient of determination) = 0.80–0.84; RPD (ratio of standard deviation to RMSE) = 2.19–2.44; RPIQ (ratio of performance to inter-quartile distance) = 2.88–3.08). Under the local calibrations, the individual PLSR models for each parent material improved SOM predictions (RMSE = 2.55–3.49 g kg⁻¹; R² = 0.87–0.93; RPD = 2.67–3.12; RPIQ = 3.15–4.02). 
Among the four different parent materials, the largest R² and the smallest RMSE were observed for the shale soils, which had the lowest coefficient of variation (CV) values for clay (18.95%), free iron oxides (15.93%), and pH (1.04%). This demonstrates the importance of a practical subsetting strategy for the continued improvement of SOM prediction with Vis–NIR spectroscopy. PMID:26974821
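
    The validation statistics quoted in this abstract (RMSE, R², RPD, RPIQ) can be computed as in the sketch below. It follows the definitions given in the abstract (RPD as standard deviation over RMSE, RPIQ as inter-quartile distance over RMSE); the use of the sample standard deviation, the "inclusive" quartile method, and the example numbers are assumptions, since the abstract does not specify them.

```python
import math
import statistics


def regression_metrics(observed, predicted):
    """Return RMSE, R2, RPD and RPIQ for paired observed/predicted values."""
    n = len(observed)
    residual_ss = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    mean_obs = statistics.fmean(observed)
    total_ss = sum((o - mean_obs) ** 2 for o in observed)

    rmse = math.sqrt(residual_ss / n)
    r2 = 1.0 - residual_ss / total_ss
    # RPD: sample standard deviation of the observed values over RMSE.
    rpd = statistics.stdev(observed) / rmse
    # RPIQ: inter-quartile range (Q3 - Q1) of the observed values over RMSE.
    q1, _, q3 = statistics.quantiles(observed, n=4, method="inclusive")
    rpiq = (q3 - q1) / rmse
    return {"rmse": rmse, "r2": r2, "rpd": rpd, "rpiq": rpiq}


# Made-up SOM values (g/kg) for illustration only.
obs = [10.0, 20.0, 30.0, 40.0, 50.0]
pred = [12.0, 18.0, 33.0, 39.0, 49.0]
metrics = regression_metrics(obs, pred)
```

    Because RPD and RPIQ both scale the spread of the reference values by the RMSE, a subset with a narrower range of SOM values can show a lower RMSE without a proportionally higher RPD, which is one reason to report several of these statistics together.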

  10. Multivariate analysis relating oil shale geochemical properties to NMR relaxometry

    USGS Publications Warehouse

    Birdwell, Justin E.; Washburn, Kathryn E.

    2015-01-01

    Low-field nuclear magnetic resonance (NMR) relaxometry has been used to provide insight into shale composition by separating relaxation responses from the various hydrogen-bearing phases present in shales in a noninvasive way. Previous low-field NMR work using solid-echo methods provided qualitative information on organic constituents associated with raw and pyrolyzed oil shale samples, but uncertainty in the interpretation of longitudinal-transverse (T1–T2) relaxometry correlation results indicated further study was required. Qualitative confirmation of peaks attributed to kerogen in oil shale was achieved by comparing T1–T2 correlation measurements made on oil shale samples to measurements made on kerogen isolated from those shales. Quantitative relationships between T1–T2 correlation data and organic geochemical properties of raw and pyrolyzed oil shales were determined using partial least-squares regression (PLSR). Relaxometry results were also compared to infrared spectra, and the results not only provided further confidence in the organic matter peak interpretations but also confirmed attribution of T1–T2 peaks to clay hydroxyls. In addition, PLSR analysis was applied to correlate relaxometry data to trace element concentrations with good success. The results of this work show that NMR relaxometry measurements using the solid-echo approach produce T1–T2 peak distributions that correlate well with geochemical properties of raw and pyrolyzed oil shales.

  11. Regional prediction of soil organic carbon content over temperate croplands using visible near-infrared airborne hyperspectral imagery and synchronous field spectra

    NASA Astrophysics Data System (ADS)

    Vaudour, E.; Gilliot, J. M.; Bel, L.; Lefevre, J.; Chehdi, K.

    2016-07-01

    This study aimed at identifying the potential of Vis-NIR airborne hyperspectral AISA-Eagle data for predicting the topsoil organic carbon (SOC) content of bare cultivated soils over a large peri-urban area (221 km²) with both contrasted soils and SOC contents, located in the western region of Paris, France. Soil types comprised haplic luvisols, calcaric cambisols and colluvic cambisols. Airborne AISA-Eagle data (400-1000 nm, 126 bands) with 1 m-resolution were acquired on 17 April 2013 over 13 tracks. Tracks were atmospherically corrected then mosaicked at a 2 m-resolution using a set of 24 synchronous field spectra of bare soils, black and white targets and impervious surfaces. The land use identification system layer (RPG) of 2012 was used to mask non-agricultural areas, then calculation and thresholding of NDVI from an atmospherically corrected SPOT image acquired the same day made it possible to map agricultural fields with bare soil. A total of 101 sites sampled either in 2013 or in the 3 previous years and in 2015 were identified as bare by means of this map. Predictions were made from the mosaic AISA spectra which were related to topsoil SOC contents by means of partial least squares regression (PLSR). Regression robustness was evaluated through a series of 1000 bootstrap data sets of calibration-validation samples, considering 74 sites outside cloud shadows only, and different sampling strategies for selecting calibration samples. Validation root-mean-square errors (RMSE) ranged between 3.73 and 4.49 g kg⁻¹, with a median of ∼4 g kg⁻¹. The best-performing models in terms of coefficient of determination (R²) and Residual Prediction Deviation (RPD) values were the calibration models derived either from Kennard-Stone or conditioned Latin Hypercube sampling on smoothed spectra. 
The most generalizable model, giving the lowest RMSE (3.73 g kg⁻¹ at the regional scale and 1.44 g kg⁻¹ at the within-field scale) and low bias, was the cross-validated leave-one-out PLSR model constructed with the 28 near-synchronous samples and raw spectra.

  12. Qualitative and quantitative prediction of volatile compounds from initial amino acid profiles in Korean rice wine (makgeolli) model.

    PubMed

    Kang, Bo-Sik; Lee, Jang-Eun; Park, Hyun-Jin

    2014-06-01

    In Korean rice wine (makgeolli) model, we tried to develop a prediction model capable of eliciting a quantitative relationship between initial amino acids in makgeolli mash and major aromatic compounds, such as fusel alcohols, their acetate esters, and ethyl esters of fatty acids, in makgeolli brewed. Mass-spectrometry-based electronic nose (MS-EN) was used to qualitatively discriminate between makgeollis made from makgeolli mashes with different amino acid compositions. Following this measurement, headspace solid-phase microextraction coupled to gas chromatography-mass spectrometry (GC-MS) combined with partial least-squares regression (PLSR) method was employed to quantitatively correlate amino acid composition of makgeolli mash with major aromatic compounds evolved during makgeolli fermentation. In qualitative prediction with MS-EN analysis, the makgeollis were well discriminated according to the volatile compounds derived from amino acids of makgeolli mash. Twenty-seven ion fragments with mass-to-charge ratio (m/z) of 55 to 98 amu were responsible for the discrimination. In GC-MS combined with PLSR method, a quantitative approach between the initial amino acids of makgeolli mash and the fusel compounds of makgeolli demonstrated that coefficient of determination (R(2)) of most of the fusel compounds ranged from 0.77 to 0.94 in good correlation, except for 2-phenylethanol (R(2) = 0.21), whereas R(2) for ethyl esters of MCFAs including ethyl caproate, ethyl caprylate, and ethyl caprate was 0.17 to 0.40 in poor correlation. The amino acids have been known to affect the aroma in alcoholic beverages. 
In this study, we demonstrated that an electronic nose rapidly and reproducibly differentiated Korean rice wines (makgeollis) by the volatile compounds evolved from their amino acids and, subsequently, that a quantitative correlation with acceptable R² between amino acids and fusel compounds could be established via HS-SPME GC-MS combined with partial least-squares regression. Our approach for predicting the quantities of volatile compounds in the finished product from the initial conditions of fermentation will give food researchers insight into how to modify and optimize the qualities of the corresponding products. © 2014 Institute of Food Technologists®

  13. Estimating Biochemical Parameters of Tea (Camellia sinensis (L.)) Using Hyperspectral Techniques

    NASA Astrophysics Data System (ADS)

    Bian, M.; Skidmore, A. K.; Schlerf, M.; Liu, Y.; Wang, T.

    2012-07-01

    Tea (Camellia sinensis (L.)) is an important economic crop and the market price of tea depends largely on its quality. This research aims to explore the potential of hyperspectral remote sensing for predicting the concentration of biochemical components, namely total tea polyphenols, as indicators of tea quality at canopy scale. Experiments were carried out for tea plants growing in the field and greenhouse. Partial least squares regression (PLSR), which has proven to be one of the most successful empirical approaches, was performed to establish the relationship between reflectance and biochemical concentration across six tea varieties in the field. Moreover, a novel integrated approach involving the successive projections algorithm as a band selection method and neural networks was developed and applied to detect the concentration of total tea polyphenols for one tea variety, in order to explore and model complex nonlinear relationships between independent (wavebands) and dependent (biochemicals) variables. The good prediction accuracies (r² > 0.8 and relative RMSEP < 10%) achieved for tea plants using both linear (partial least squares regression) and nonlinear (artificial neural networks) modelling approaches in this study demonstrate the feasibility of using airborne and spaceborne sensors to cover wide areas of tea plantation for in situ monitoring of tea quality cheaply and rapidly.

  14. Non-Destructive Evaluation of the Leaf Nitrogen Concentration by In-Field Visible/Near-Infrared Spectroscopy in Pear Orchards.

    PubMed

    Wang, Jie; Shen, Changwei; Liu, Na; Jin, Xin; Fan, Xueshan; Dong, Caixia; Xu, Yangchun

    2017-03-08

    Non-destructive and timely determination of leaf nitrogen (N) concentration is urgently needed for N management in pear orchards. A two-year field experiment was conducted in a commercial pear orchard with five N application rates: 0 (N0), 165 (N1), 330 (N2), 660 (N3), and 990 (N4) kg·N·ha⁻¹. The mid-portion leaves on the year's shoot were selected for the spectral measurement first and then N concentration determination in the laboratory at 50 and 80 days after full bloom (DAB). Three methods of in-field spectral measurement (25° bare fibre under solar conditions, black background attached to plant probe, and white background attached to plant probe) were compared. We also investigated the modelling performances of four chemometric techniques (principal components regression, PCR; partial least squares regression, PLSR; stepwise multiple linear regression, SMLR; and back propagation neural network, BPNN) and three vegetation indices (difference spectral index, normalized difference spectral index, and ratio spectral index). Due to the low correlation of reflectance obtained by the 25° field of view method, all of the modelling was performed on two spectral datasets, both acquired by a plant probe. Results showed that the best modelling and prediction accuracy were found in the model established by PLSR and spectra measured with a black background. The randomly-separated subsets of calibration (n = 1000) and validation (n = 420) of this model resulted in high R² values of 0.86 and 0.85, respectively, as well as a low mean relative error (<6%). Furthermore, a higher coefficient of determination between the leaf N concentration and fruit yield was found at 50 DAB samplings in both 2015 (R² = 0.77) and 2014 (R² = 0.59). Thus, the leaf N concentration was suggested to be determined at 50 DAB by visible/near-infrared spectroscopy and the threshold should be 24-27 g/kg.

  15. Seasonal variability of multiple leaf traits captured by leaf spectroscopy at two temperate deciduous forests

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Yang, Xi; Tang, Jianwu; Mustard, John F.

    Understanding the temporal patterns of leaf traits is critical in determining the seasonality and magnitude of terrestrial carbon, water, and energy fluxes. However, we lack robust and efficient ways to monitor the temporal dynamics of leaf traits. Here we assessed the potential of leaf spectroscopy to predict and monitor leaf traits across their entire life cycle at different forest sites and light environments (sunlit vs. shaded) using a weekly sampled dataset across the entire growing season at two temperate deciduous forests. In addition, the dataset includes field measured leaf-level directional-hemispherical reflectance/transmittance together with seven important leaf traits [total chlorophyll (chlorophyll a and b), carotenoids, mass-based nitrogen concentration (N mass), mass-based carbon concentration (C mass), and leaf mass per area (LMA)]. All leaf traits varied significantly throughout the growing season, and displayed trait-specific temporal patterns. We used a Partial Least Square Regression (PLSR) modeling approach to estimate leaf traits from spectra, and found that PLSR was able to capture the variability across time, sites, and light environments of all leaf traits investigated (R² = 0.6–0.8 for temporal variability; R² = 0.3–0.7 for cross-site variability; R² = 0.4–0.8 for variability from light environments). We also tested alternative field sampling designs and found that for most leaf traits, biweekly leaf sampling throughout the growing season enabled accurate characterization of the seasonal patterns. Compared with the estimation of foliar pigments, the performance of N mass, C mass and LMA PLSR models improved more significantly with sampling frequency. Our results demonstrate that leaf spectra-trait relationships vary with time, and thus tracking the seasonality of leaf traits requires statistical models calibrated with data sampled throughout the growing season. 
In conclusion, our results have broad implications for future research that uses vegetation spectra to infer leaf traits at different growing stages.

  16. Quantitative Prediction of Beef Quality Using Visible and NIR Spectroscopy with Large Data Samples Under Industry Conditions

    NASA Astrophysics Data System (ADS)

    Qiao, T.; Ren, J.; Craigie, C.; Zabalza, J.; Maltin, Ch.; Marshall, S.

    2015-03-01

    It is well known that the eating quality of beef has a significant influence on the repurchase behavior of consumers. There are several key factors that affect the perception of quality, including color, tenderness, juiciness, and flavor. To support consumer repurchase choices, there is a need for an objective measurement of quality that could be applied to meat prior to its sale. Objective approaches such as offered by spectral technologies may be useful, but the analytical algorithms used remain to be optimized. For visible and near infrared (VISNIR) spectroscopy, Partial Least Squares Regression (PLSR) is a widely used technique for meat related quality modeling and prediction. In this paper, a Support Vector Machine (SVM) based machine learning approach is presented to predict beef eating quality traits. Although SVM has been successfully used in various disciplines, it has not been applied extensively to the analysis of meat quality parameters. To this end, the performance of PLSR and SVM as tools for the analysis of meat tenderness is evaluated, using a large dataset acquired under industrial conditions. The spectral dataset was collected using VISNIR spectroscopy with the wavelength ranging from 350 to 1800 nm on 234 beef M. longissimus thoracis steaks from heifers, steers, and young bulls. As the dimensionality of the VISNIR data is very high (over 1600 spectral bands), the Principal Component Analysis (PCA) technique was applied for feature extraction and data reduction. The extracted principal components (fewer than 100) were then used for data modeling and prediction. The prediction results showed that SVM has a greater potential to predict beef eating quality than PLSR, especially for the prediction of tenderness. The influence of animal gender on beef quality prediction was also investigated, and it was found that beef quality traits were predicted most accurately in beef from young bulls.

  17. Seasonal variability of multiple leaf traits captured by leaf spectroscopy at two temperate deciduous forests

    DOE PAGES

    Yang, Xi; Tang, Jianwu; Mustard, John F.; ...

    2016-04-02

    Understanding the temporal patterns of leaf traits is critical in determining the seasonality and magnitude of terrestrial carbon, water, and energy fluxes. However, we lack robust and efficient ways to monitor the temporal dynamics of leaf traits. Here we assessed the potential of leaf spectroscopy to predict and monitor leaf traits across their entire life cycle at different forest sites and light environments (sunlit vs. shaded) using a weekly sampled dataset across the entire growing season at two temperate deciduous forests. In addition, the dataset includes field measured leaf-level directional-hemispherical reflectance/transmittance together with seven important leaf traits [total chlorophyll (chlorophyll a and b), carotenoids, mass-based nitrogen concentration (N mass), mass-based carbon concentration (C mass), and leaf mass per area (LMA)]. All leaf traits varied significantly throughout the growing season, and displayed trait-specific temporal patterns. We used a Partial Least Square Regression (PLSR) modeling approach to estimate leaf traits from spectra, and found that PLSR was able to capture the variability across time, sites, and light environments of all leaf traits investigated (R2 = 0.6–0.8 for temporal variability; R2 = 0.3–0.7 for cross-site variability; R2 = 0.4–0.8 for variability from light environments). We also tested alternative field sampling designs and found that for most leaf traits, biweekly leaf sampling throughout the growing season enabled accurate characterization of the seasonal patterns. Compared with the estimation of foliar pigments, the performance of N mass, C mass and LMA PLSR models improved more significantly with sampling frequency. Our results demonstrate that leaf spectra-trait relationships vary with time, and thus tracking the seasonality of leaf traits requires statistical models calibrated with data sampled throughout the growing season. In conclusion, our results have broad implications for future research that uses vegetation spectra to infer leaf traits at different growing stages.

  18. Quality Detection of Litchi Stored in Different Environments Using an Electronic Nose

    PubMed Central

    Xu, Sai; Lü, Enli; Lu, Huazhong; Zhou, Zhiyan; Wang, Yu; Yang, Jing; Wang, Yajuan

    2016-01-01

    The purpose of this paper was to explore the utility of an electronic nose to detect the quality of litchi fruit stored in different environments. In this study, a PEN3 electronic nose was adopted to test the storage time and hardness of litchi that were stored in three different types of environment (room temperature, refrigerator and controlled-atmosphere). After acquiring data about the hardness of the sample and from the electronic nose, linear discriminant analysis (LDA), canonical correlation analysis (CCA), BP neural network (BPNN) and BP neural network-partial least squares regression (BPNN-PLSR) were employed for data processing. The experimental results showed that the hardness of litchi fruits stored in all three environments decreased during storage. The litchi stored at room temperature had the fastest rate of decrease in hardness, followed by those stored in a refrigerator environment and under a controlled-atmosphere. LDA has a poor ability to classify the storage time of the three environments in which litchi was stored. BPNN can effectively recognize the storage time of litchi stored in a refrigerator and a controlled-atmosphere environment. However, the BPNN classification of the effect of room temperature storage on litchi was poor. CCA results show a significant correlation between electronic nose data and hardness data at room temperature, and the correlation is more obvious for those under the refrigerator environment and controlled-atmosphere environment. The BPNN-PLSR can effectively predict the hardness of litchi under refrigerator storage conditions and a controlled-atmosphere environment. However, the BPNN-PLSR predictions of the effect of room temperature storage on litchi and of global environment storage on litchi were poor. Thus, this experiment proved that an electronic nose can detect the quality of litchi under refrigerated storage and a controlled-atmosphere environment. These results provide a useful reference for future studies on nondestructive and intelligent monitoring of fruit quality. PMID:27338391

  19. Nondestructive detection of pork quality based on dual-band VIS/NIR spectroscopy

    NASA Astrophysics Data System (ADS)

    Wang, Wenxiu; Peng, Yankun; Li, Yongyu; Tang, Xiuying; Liu, Yuanyuan

    2015-05-01

    With the continuous development of living standards and the relative change of dietary structure, consumers' rising and persistent demand for better quality of meat is emphasized. Colour, pH value, and cooking loss are important quality attributes when evaluating meat. Simultaneous nondestructive detection of multiple meat quality parameters is in demand in the production and processing of meat and meat products. The objectives of this research were to compare the effectiveness of two bands for rapid nondestructive and simultaneous detection of pork quality attributes. Reflectance spectra of 60 chilled pork samples were collected from a dual-band visible/near-infrared spectroscopy system which covered 350-1100 nm and 1000-2600 nm. Then colour, pH value and cooking loss were determined by standard methods as reference values. Standard normal variables transform (SNVT) was employed to eliminate the spectral noise. A spectrum connection method was put forward for effective integration of the dual-band spectrum to make full use of the whole efficient information. Partial least squares regression (PLSR) and Principal component analysis (PCA) were applied to establish prediction models based on single-band spectrum and dual-band spectrum, respectively. The experimental results showed that the PLSR model based on dual-band spectral information was superior to the models based on single band spectral information with lower root mean square error (RMSE) and higher accuracy. The PLSR model based on dual-band (use the overlapping part of first band) yielded the best prediction result with correlation coefficient of validation (Rv) of 0.9469, 0.9495, 0.9180, 0.9054 and 0.8789 for L*, a*, b*, pH value and cooking loss, respectively. This is mainly because the dual-band spectrum can provide sufficient and comprehensive information which reflects the quality attributes. Data fusion from dual-band spectrum could significantly improve pork quality parameters prediction performance. The research also indicated that multi-band spectral information fusion has potential to comprehensively evaluate other quality and safety attributes of pork.
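
The standard normal variate step named in the record above can be illustrated with a minimal sketch. It is not the authors' code: each spectrum is centered on its own mean and scaled by its own standard deviation, and two bands are then joined end to end. The function names `snv` and `connect_bands` and the toy reflectance values are assumptions made for illustration, and the abstract does not say how the overlapping 1000-1100 nm region was treated, so the sketch simply concatenates the two bands.

```python
from statistics import mean, stdev

def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own statistics."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    if s == 0:
        raise ValueError("flat spectrum: standard deviation is zero")
    return [(v - m) / s for v in spectrum]

def connect_bands(band_a, band_b):
    """Hypothetical 'spectrum connection': apply SNV per band, then concatenate."""
    return snv(band_a) + snv(band_b)

# Toy reflectance values, made up for illustration only.
vis = [0.20, 0.25, 0.30, 0.35]
nir = [0.50, 0.40, 0.45]
fused = connect_bands(vis, nir)
print(len(fused))
```

Scaling each band separately before joining keeps one band from dominating the fused spectrum purely because of its intensity level; whether the study did this is not stated.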

  20. Evaluation of apparent viscosity of Para rubber latex by diffuse reflection near-infrared spectroscopy.

    PubMed

    Sirisomboon, Panmanas; Chowbankrang, Rawiphan; Williams, Phil

    2012-05-01

    Near-infrared spectroscopy in diffuse reflection mode was used to evaluate the apparent viscosity of Para rubber field latex and concentrated latex over the wavelength range of 1100 to 2500 nm, using partial least square regression (PLSR). The model with ten principal components (PCs) developed using the raw spectra accurately predicted the apparent viscosity with correlation coefficient (r), standard error of prediction (SEP), and bias of 0.974, 8.6 cP, and -0.4 cP, respectively. The ratio of the standard deviation to the SEP (RPD) and the ratio of the range to the SEP (RER) for the prediction were 4.4 and 16.7, respectively. Therefore, the model can be used for measurement of the apparent viscosity of field latex and concentrated latex in quality assurance and process control in the factory.
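
The validation statistics quoted in the record above (bias, SEP, RPD, RER) can be computed as in the following sketch. It assumes the usual chemometric conventions: SEP is the bias-corrected standard deviation of the prediction errors, RPD is the standard deviation of the reference values divided by the SEP, and RER is the range of the reference values divided by the SEP. The function name `validation_stats` and the toy numbers are hypothetical and are not data from the study. Under these conventions, the reported SEP of 8.6 cP with RPD 4.4 and RER 16.7 would imply a reference standard deviation of roughly 38 cP and a range of roughly 144 cP.

```python
from math import sqrt
from statistics import mean, stdev

def validation_stats(reference, predicted):
    """Return bias, SEP, RPD and RER for a prediction set."""
    errors = [p - r for p, r in zip(predicted, reference)]
    bias = mean(errors)
    n = len(errors)
    # SEP: standard deviation of the errors after removing the bias.
    sep = sqrt(sum((e - bias) ** 2 for e in errors) / (n - 1))
    sd = stdev(reference)                      # spread of the reference values
    value_range = max(reference) - min(reference)
    return {"bias": bias, "SEP": sep, "RPD": sd / sep, "RER": value_range / sep}

# Toy reference and predicted values, made up for illustration only.
stats = validation_stats([10, 20, 30, 40, 50], [11, 19, 31, 39, 51])
print(stats)
```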

  1. Rapid quantification of multi-components in alcohol precipitation liquid of Codonopsis Radix using near infrared spectroscopy (NIRS).

    PubMed

    Luo, Yu; Li, Wen-Long; Huang, Wen-Hua; Liu, Xue-Hua; Song, Yan-Gang; Qu, Hai-Bin

    2017-05-01

    A near infrared spectroscopy (NIRS) approach was established for quality control of the alcohol precipitation liquid in the manufacture of Codonopsis Radix. By applying NIRS with multivariate analysis, it was possible to build variation into the calibration sample set, and the Plackett-Burman design, Box-Behnken design, and a concentrating-diluting method were used to obtain the sample set covered with sufficient fluctuation of process parameters and extended concentration information. NIR data were calibrated to predict the four quality indicators using partial least squares regression (PLSR). In the four calibration models, the root mean squares errors of prediction (RMSEPs) were 1.22 μg/ml, 10.5 μg/ml, 1.43 μg/ml, and 0.433% for lobetyolin, total flavonoids, pigments, and total solid contents, respectively. The results indicated that multi-components quantification of the alcohol precipitation liquid of Codonopsis Radix could be achieved with an NIRS-based method, which offers a useful tool for real-time release testing (RTRT) of intermediates in the manufacture of Codonopsis Radix.
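
For readers unfamiliar with the calibration step, the following is a minimal sketch of single-response PLS regression (often called PLS1) with an RMSEP calculation. It uses only the Python standard library and the common deflation-based formulation; it is not the software used in the study. The function names `pls1_fit`, `pls1_predict` and `rmsep`, and the toy data, are assumptions made for illustration.

```python
from math import sqrt

def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model by successive deflation."""
    n, p = len(X), len(X[0])
    x_mean = [sum(col) / n for col in zip(*X)]
    y_mean = sum(y) / n
    E = [[v - m for v, m in zip(row, x_mean)] for row in X]  # centered predictors
    f = [v - y_mean for v in y]                               # centered response
    components = []
    for _ in range(n_components):
        # Weight vector: direction in X with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = sqrt(sum(v * v for v in w))
        if norm < 1e-12:
            break  # no covariance left to model
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y so the next component models what is left.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components

def pls1_predict(model, x):
    """Predict one sample by replaying the same deflation steps."""
    x_mean, y_mean, components = model
    r = [v - m for v, m in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in components:
        t = sum(a * b for a, b in zip(r, w))
        y_hat += q * t
        r = [a - t * b for a, b in zip(r, load)]
    return y_hat

def rmsep(model, X, y):
    """Root mean square error of prediction on a held-out set."""
    return sqrt(sum((pls1_predict(model, x) - yi) ** 2 for x, yi in zip(X, y)) / len(y))

# Toy data with an exact linear relation y = 2*x1 + 3*x2 (illustration only).
X_cal = [[1, 2], [2, 1], [3, 5], [4, 3], [5, 7]]
y_cal = [2 * a + 3 * b for a, b in X_cal]
model = pls1_fit(X_cal, y_cal, n_components=2)
print(pls1_predict(model, [6, 2]), rmsep(model, [[6, 2], [0, 1]], [18, 3]))
```

In real NIR work the number of components is far smaller than the number of wavelengths and is normally chosen by cross-validation; the toy case uses as many components as predictors only so that the result is easy to check.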

  2. Rapid determination of major bioactive isoflavonoid compounds during the extraction process of kudzu (Pueraria lobata) by near-infrared transmission spectroscopy.

    PubMed

    Wang, Pei; Zhang, Hui; Yang, Hailong; Nie, Lei; Zang, Hengchang

    2015-02-25

    Near-infrared (NIR) spectroscopy has been developed into an indispensable tool for both academic research and industrial quality control in a wide field of applications. The feasibility of NIR spectroscopy to monitor the concentration of puerarin, daidzin, daidzein and total isoflavonoid (TIF) during the extraction process of kudzu (Pueraria lobata) was verified in this work. NIR spectra were collected in transmission mode and pretreated with smoothing and derivative. Partial least square regression (PLSR) was used to establish calibration models. Three different variable selection methods, including correlation coefficient method, interval partial least squares (iPLS), and successive projections algorithm (SPA) were performed and compared with models based on all of the variables. The results showed that the approach was very efficient and environmentally friendly for rapid determination of the four quality indices (QIs) in the kudzu extraction process. This method established may have the potential to be used as a process analytical technological (PAT) tool in the future. Copyright © 2014 Elsevier B.V. All rights reserved.

  3. Scaling, propagating and mapping uncertainty in spectroscopy-derived foliar traits from the leaf to the image

    NASA Astrophysics Data System (ADS)

    Singh, A.; Serbin, S. P.; Kingdon, C.; Townsend, P. A.

    2013-12-01

    A major goal of remote sensing, and imaging spectroscopy in particular, is the development of generalizable algorithms to repeatedly and accurately map ecosystem properties such as canopy chemistry across space and time. Existing methods must therefore be tested across a range of measurement approaches to identify and overcome limits to the consistent retrieval of such properties from spectroscopic imagery. Here we illustrate a general approach for the estimation of key foliar biochemical and morphological traits from spectroscopic imagery derived from the AVIRIS instrument and the propagation of errors from the leaf to the image scale using partial least squares regression (PLSR) techniques. Our method involves the integration of three types of data representing different scales of observation: At the image scale, the images were normalized for atmospheric, illumination and BRDF effects. Spectra from field plot locations were extracted from the 51 AVIRIS images and were averaged when the field plot was larger than a single pixel. At the plot level, the scaling was conducted using multiple replicates (1000) derived from the leaf-level uncertainty estimates to generate plot-level estimates with their associated uncertainties. Leaf-level estimates of foliar traits (%N, %C, %Fiber, %Cellulose, %Lignin, LMA) were scaled to the canopy based on relative species composition of each plot. Image spectra were iteratively split into 50/50 randomized calibration-validation datasets and multiple (500) trait-predictive PLSR models were generated, this time sampling from within the plot-level uncertainty distribution. This allowed the propagation of uncertainty from the leaf-level dependent variables to the plot level, and finally to models built using AVIRIS image spectra. Moreover, this method allows us to generate spatially explicit maps of uncertainty in our sampled traits. Both LMA and %N PLSR models had an R2 greater than 0.8, and root mean square errors (RMSEs) for both variables were less than 6% of the range of data. Fiber and lignin were predicted with R2 > 0.65 and carbon and cellulose greater than 0.5. Although R2 of these variables were lower than LMA and %N, their RMSE values were beneath 9% of the range of data. The comparatively lower R2 values for %C and cellulose in particular were related to the low amount of natural variability in these constituents. Further, coefficients from the randomized set of PLSR models were applied to imagery and aggregated to obtain pixel-wise predicted means and uncertainty estimates for each foliar trait. The resulting maps of nutritional and morphological properties together with their overall uncertainties represent a first-of-its-kind data product for examining the spatio-temporal patterns of forest functioning and nutrient cycling. These data are now being used to relate foliar traits with ecosystem processes such as streamwater nutrient export and insect herbivory. In addition, the ability to assign a retrieval uncertainty enables more efficient assimilation of these data products into ecosystem models to help constrain carbon and nutrient cycling projections.
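
The resampling idea described in the record above (many models built on random 50/50 splits, each drawing the response from its uncertainty distribution, then aggregated into a predicted mean and an uncertainty) can be sketched as follows. To keep the example short, an ordinary least-squares line on a single predictor stands in for the PLSR model, and the response uncertainty is assumed to be normal; both are simplifications, not the study's method. The names `fit_line` and `ensemble_predict` and all numbers are hypothetical.

```python
import random
from statistics import mean, stdev

def fit_line(xs, ys):
    """Ordinary least-squares line; a simple stand-in for a PLSR model."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx

def ensemble_predict(xs, ys, y_sds, x_new, n_models=500, seed=0):
    """Mean and spread of predictions from models built on random half-splits."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    preds = []
    for _ in range(n_models):
        rng.shuffle(idx)
        cal = idx[: len(idx) // 2]  # calibration half; the rest would be validation
        # Draw each calibration response from its (assumed normal) uncertainty.
        y_draw = [rng.gauss(ys[i], y_sds[i]) for i in cal]
        slope, intercept = fit_line([xs[i] for i in cal], y_draw)
        preds.append(intercept + slope * x_new)
    return mean(preds), stdev(preds)

# Toy plot-level data: a trait that depends linearly on one spectral index.
xs = [float(i) for i in range(1, 21)]
ys = [2.0 * x + 1.0 for x in xs]
y_sds = [0.5] * len(xs)  # made-up plot-level standard deviations
pred_mean, pred_sd = ensemble_predict(xs, ys, y_sds, x_new=10.0)
print(pred_mean, pred_sd)
```

Applying the same aggregation to every pixel of an image would give the paired maps of predicted mean and uncertainty that the abstract describes.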

  4. Prediction of soil organic carbon with different parent materials development using visible-near infrared spectroscopy.

    PubMed

    Liu, Jinbao; Han, Jichang; Zhang, Yang; Wang, Huanyuan; Kong, Hui; Shi, Lei

    2018-06-05

    The storage of soil organic carbon (SOC) should improve soil fertility. Conventional determination of SOC is expensive and tedious. Visible-near infrared reflectance spectroscopy is a practical and cost-effective approach that has been successfully used to estimate SOC concentration. A soil spectral inversion model could quickly and efficiently determine SOC content. This paper presents a study dealing with SOC estimation through the combination of soil spectroscopy and stepwise multiple linear regression (SMLR), partial least squares regression (PLSR), and principal component regression (PCR). Spectral measurements for 106 soil samples were acquired using an ASD FieldSpec 4 standard-res spectroradiometer (350-2500 nm). Six types of transformations and three regression methods were applied to build models for the quantification of SOC in soils developed from different parent materials. The results show that (1) the basaltic volcanic clastics development of SOC spectral response bands located in 500 nm, 800 nm; Trachyte spectral response of the soil quality, and the volcanic clastics development at 405 nm, 465 nm, 575 nm, 1105 nm. (2) Basaltic volcanic debris soil development, first deviation of maximum correlation coefficient is 0.8898; thick surface soil of the development of rocky volcanic debris from bottom reflectivity logarithm of first deviation of maximum correlation coefficient is 0.9029. (3) Soil organic matter content of basaltic volcanic clastics development optimal prediction model based on spectral reflectance inverse logarithms of first deviation of SMLR. Independent variable number is 7, Rv2 = 0.9720, RMSEP = 2.0590, sig = 0.003. Trachyte qualitative volcanic clastics developed soil organic matter content of the optimal prediction model based on spectral reflectance inverse logarithms of first deviation of PLSR. Model number of the independent variables Pc = 5, Rc = 0.9872, Rc2 = 0.9745, RMSEC = 0.4821, SEC = 0.4906, forecasts determine coefficient Rv2 = 0.9702, RMSEP = 0.9563, SEP = 0.9711, Bias = 0.0637. Copyright © 2018 Elsevier B.V. All rights reserved.

  5. ToxiM: A Toxicity Prediction Tool for Small Molecules Developed Using Machine Learning and Chemoinformatics Approaches.

    PubMed

    Sharma, Ashok K; Srivastava, Gopal N; Roy, Ankita; Sharma, Vineet K

    2017-01-01

    The experimental methods for the prediction of molecular toxicity are tedious and time-consuming tasks. Thus, the computational approaches could be used to develop alternative methods for toxicity prediction. We have developed a tool for the prediction of molecular toxicity along with the aqueous solubility and permeability of any molecule/metabolite. Using a comprehensive and curated set of toxin molecules as a training set, the different chemical and structural based features such as descriptors and fingerprints were exploited for feature selection, optimization and development of machine learning based classification and regression models. The compositional differences in the distribution of atoms were apparent between toxins and non-toxins, and hence, the molecular features were used for the classification and regression. On 10-fold cross-validation, the descriptor-based, fingerprint-based and hybrid-based classification models showed similar accuracy (93%) and Matthews's correlation coefficient (0.84). The performances of all the three models were comparable (Matthews's correlation coefficient = 0.84-0.87) on the blind dataset. In addition, the regression-based models using descriptors as input features were also compared and evaluated on the blind dataset. Random forest based regression model for the prediction of solubility performed better (R2 = 0.84) than the multi-linear regression (MLR) and partial least square regression (PLSR) models, whereas, the partial least squares based regression model for the prediction of permeability (caco-2) performed better (R2 = 0.68) in comparison to the random forest and MLR based regression models. The performance of final classification and regression models was evaluated using the two validation datasets including the known toxins and commonly used constituents of health products, which attests to its accuracy. The ToxiM web server would be a highly useful and reliable tool for the prediction of toxicity, solubility, and permeability of small molecules.
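
The two classification metrics quoted in the record above, accuracy and the Matthews correlation coefficient (MCC), are both computed from the counts of a binary confusion matrix. The sketch below shows the standard formulas. The function name `accuracy_and_mcc` is hypothetical, and the counts are made up to show the arithmetic; they are not the study's confusion matrix.

```python
from math import sqrt

def accuracy_and_mcc(tp, tn, fp, fn):
    """Accuracy and Matthews correlation coefficient from confusion-matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the MCC is set to 0 when any marginal count is zero.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, mcc

# Made-up counts for 100 molecules (illustration only).
acc, mcc = accuracy_and_mcc(tp=45, tn=48, fp=2, fn=5)
print(acc, mcc)
```

Unlike accuracy, the MCC uses all four cells of the confusion matrix, which is why it is often reported alongside accuracy when the two classes are not equally represented.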

  6. ToxiM: A Toxicity Prediction Tool for Small Molecules Developed Using Machine Learning and Chemoinformatics Approaches

    PubMed Central

    Sharma, Ashok K.; Srivastava, Gopal N.; Roy, Ankita; Sharma, Vineet K.

    2017-01-01

    The experimental methods for the prediction of molecular toxicity are tedious and time-consuming tasks. Thus, the computational approaches could be used to develop alternative methods for toxicity prediction. We have developed a tool for the prediction of molecular toxicity along with the aqueous solubility and permeability of any molecule/metabolite. Using a comprehensive and curated set of toxin molecules as a training set, the different chemical and structural based features such as descriptors and fingerprints were exploited for feature selection, optimization and development of machine learning based classification and regression models. The compositional differences in the distribution of atoms were apparent between toxins and non-toxins, and hence, the molecular features were used for the classification and regression. On 10-fold cross-validation, the descriptor-based, fingerprint-based and hybrid-based classification models showed similar accuracy (93%) and Matthews's correlation coefficient (0.84). The performances of all the three models were comparable (Matthews's correlation coefficient = 0.84–0.87) on the blind dataset. In addition, the regression-based models using descriptors as input features were also compared and evaluated on the blind dataset. Random forest based regression model for the prediction of solubility performed better (R2 = 0.84) than the multi-linear regression (MLR) and partial least square regression (PLSR) models, whereas, the partial least squares based regression model for the prediction of permeability (caco-2) performed better (R2 = 0.68) in comparison to the random forest and MLR based regression models. The performance of final classification and regression models was evaluated using the two validation datasets including the known toxins and commonly used constituents of health products, which attests to its accuracy. 
The ToxiM web server would be a highly useful and reliable tool for the prediction of toxicity, solubility, and permeability of small molecules. PMID:29249969

  7. [Determination of Carbaryl in Rice by Using FT Far-IR and THz-TDS Techniques].

    PubMed

    Sun, Tong; Zhang, Zhuo-yong; Xiang, Yu-hong; Zhu, Ruo-hua

    2016-02-01

    Determination of carbaryl in rice by using Fourier transform far-infrared (FT-Far-IR) and terahertz time-domain spectroscopy (THz-TDS) combined with chemometrics was studied and the spectral characteristics of carbaryl in the terahertz region were investigated. Samples were prepared by mixing carbaryl at different amounts with rice powder, and then a 13 mm diameter, and about 1 mm thick pellet with polyethylene (PE) as matrix was compressed under the pressure of 5-7 tons. Terahertz time domain spectra of the pellets were measured at 0.5-1.5 THz, and the absorption spectra at 1.6. 3 THz were acquired with Fourier transform far-IR spectroscopy. The method of sample preparation is so simple that it does not need separation and enrichment. The absorption peaks in the frequency range of 1.8-6.3 THz have been found at 3.2 and 5.2 THz by Far-IR. There are several weak absorption peaks in the range of 0.5-1.5 THz by THz-TDS. These two kinds of characteristic absorption spectra were randomly divided into calibration set and prediction set by leave-N-out cross-validation, respectively. Finally, the partial least squares regression (PLSR) method was used to establish two quantitative analysis models. The root mean square error of cross-validation (RMSECV), the root mean square error of prediction (RMSEP) and the correlation coefficient of the prediction (R) are used as a basis for the evaluation of model performance. For R, a higher value is better; for the RMSECV and RMSEP, lower is better. The obtained results demonstrated that the predictive accuracy of the two models with PLSR method was satisfactory. For the FT-Far-IR model, the correlation between actual and predicted values of prediction samples (Rv) was 0.99. The root mean square error of prediction set (RMSEP) was 0.0086, and for calibration set (RMSECV) was 0.0077. For the THz-TDS model, Rv was 0.98, RMSEP was 0.0044, and RMSECV was 0.0025. Results proved that the technology of FT-Far-IR and THz-TDS can be a feasible tool for quantitative determination of carbaryl in rice. This paper provides a new method for the quantitative determination of pesticides in other grain samples.

  8. Retrieval and Mapping of Heavy Metal Concentration in Soil Using Time Series Landsat 8 Imagery

    NASA Astrophysics Data System (ADS)

    Fang, Y.; Xu, L.; Peng, J.; Wang, H.; Wong, A.; Clausi, D. A.

    2018-04-01

    Heavy metal pollution is a critical global environmental problem which has always been a concern. The traditional approach to obtaining heavy metal concentrations, which relies on field sampling and lab testing, is expensive and time consuming. Although many related studies use spectrometer data to build a relational model between heavy metal concentration and spectral information, and then use the model to perform prediction on hyperspectral imagery, this approach can hardly map the soil metal concentration of an area quickly and accurately because of the discrepancies between spectrometer data and remote sensing imagery. Taking advantage of the easy accessibility of Landsat 8 data, this study utilizes Landsat 8 imagery to retrieve soil Cu concentration and map its distribution in the study area. To enlarge the spectral information for more accurate retrieval and mapping, 11 single-date Landsat 8 images from 2013-2017 are selected to form a time series. Three regression methods, partial least square regression (PLSR), artificial neural network (ANN) and support vector regression (SVR), are used for model construction. By comparing these models without bias, the best model is selected to map the Cu concentration distribution. The produced distribution map shows good spatial autocorrelation and consistency with the mining area locations.

  9. Fast Detection of Copper Content in Rice by Laser-Induced Breakdown Spectroscopy with Uni- and Multivariate Analysis.

    PubMed

    Liu, Fei; Ye, Lanhan; Peng, Jiyu; Song, Kunlin; Shen, Tingting; Zhang, Chu; He, Yong

    2018-02-27

    Fast detection of heavy metals is very important for ensuring the quality and safety of crops. Laser-induced breakdown spectroscopy (LIBS), coupled with uni- and multivariate analysis, was applied for quantitative analysis of copper in three kinds of rice (Jiangsu rice, regular rice, and Simiao rice). For univariate analysis, three pre-processing methods were applied to reduce fluctuations, including background normalization, the internal standard method, and the standard normal variate (SNV). Linear regression models showed a strong correlation between spectral intensity and Cu content, with an R2 more than 0.97. The limit of detection (LOD) was around 5 ppm, lower than the tolerance limit of copper in foods. For multivariate analysis, partial least squares regression (PLSR) showed its advantage in extracting effective information for prediction, and its sensitivity reached 1.95 ppm, while support vector machine regression (SVMR) performed better in both calibration and prediction sets, where Rc2 and Rp2 reached 0.9979 and 0.9879, respectively. This study showed that LIBS could be considered as a constructive tool for the quantification of copper contamination in rice.

  10. Fast Detection of Copper Content in Rice by Laser-Induced Breakdown Spectroscopy with Uni- and Multivariate Analysis

    PubMed Central

    Ye, Lanhan; Song, Kunlin; Shen, Tingting

    2018-01-01

    Fast detection of heavy metals is very important for ensuring the quality and safety of crops. Laser-induced breakdown spectroscopy (LIBS), coupled with uni- and multivariate analysis, was applied for quantitative analysis of copper in three kinds of rice (Jiangsu rice, regular rice, and Simiao rice). For univariate analysis, three pre-processing methods were applied to reduce fluctuations, including background normalization, the internal standard method, and the standard normal variate (SNV). Linear regression models showed a strong correlation between spectral intensity and Cu content, with an R2 more than 0.97. The limit of detection (LOD) was around 5 ppm, lower than the tolerance limit of copper in foods. For multivariate analysis, partial least squares regression (PLSR) showed its advantage in extracting effective information for prediction, and its sensitivity reached 1.95 ppm, while support vector machine regression (SVMR) performed better in both calibration and prediction sets, where Rc2 and Rp2 reached 0.9979 and 0.9879, respectively. This study showed that LIBS could be considered as a constructive tool for the quantification of copper contamination in rice. PMID:29495445

  11. Rapid profiling of Swiss cheese by attenuated total reflectance (ATR) infrared spectroscopy and descriptive sensory analysis.

    PubMed

    Kocaoglu-Vurma, N A; Eliardi, A; Drake, M A; Rodriguez-Saona, L E; Harper, W J

    2009-08-01

    The acceptability of cheese depends largely on the flavor formed during ripening. The flavor profiles of cheeses are complex and region- or manufacturer-specific which have made it challenging to understand the chemistry of flavor development and its correlation with sensory properties. Infrared spectroscopy is an attractive technology for the rapid, sensitive, and high-throughput analysis of foods, providing information related to its composition and conformation of food components from the spectra. Our objectives were to establish infrared spectral profiles to discriminate Swiss cheeses produced by different manufacturers in the United States and to develop predictive models for determination of sensory attributes based on infrared spectra. Fifteen samples from 3 Swiss cheese manufacturers were received and analyzed using attenuated total reflectance infrared spectroscopy (ATR-IR). The spectra were analyzed using soft independent modeling of class analogy (SIMCA) to build a classification model. The cheeses were profiled by a trained sensory panel using descriptive sensory analysis. The relationship between the descriptive sensory scores and ATR-IR spectra was assessed using partial least square regression (PLSR) analysis. SIMCA discriminated the Swiss cheeses based on manufacturer and production region. PLSR analysis generated prediction models with correlation coefficients of validation (rVal) between 0.69 and 0.96 with standard error of cross-validation (SECV) ranging from 0.04 to 0.29. Implementation of rapid infrared analysis by the Swiss cheese industry would help to streamline quality assurance.

  12. Hyperspectral imaging technique for determination of pork freshness attributes

    NASA Astrophysics Data System (ADS)

    Li, Yongyu; Zhang, Leilei; Peng, Yankun; Tang, Xiuying; Chao, Kuanglin; Dhakal, Sagar

    2011-06-01

    Freshness of pork is an important quality attribute, which can vary greatly in storage and logistics. The specific objectives of this research were to develop a hyperspectral imaging system to predict pork freshness based on quality attributes such as total volatile basic-nitrogen (TVB-N), pH value and color parameters (L*, a*, b*). Pork samples were packed in sealed plastic bags and then stored at 4°C. Every 12 hours, hyperspectral scattering images were collected from the pork surface in the range of 400 nm to 1100 nm. Two different methods were used to extract scattering feature spectra from the hyperspectral scattering images. First, the spectral scattering profiles at individual wavelengths were fitted accurately by a three-parameter Lorentzian distribution (LD) function; second, reflectance spectra were extracted from the scattering images. The Partial Least Square Regression (PLSR) method was used to establish prediction models to predict pork freshness. The results showed that the PLSR models based on reflectance spectra were better than those based on combinations of LD "parameter spectra" in prediction of TVB-N, with a correlation coefficient (r) = 0.90 and a standard error of prediction (SEP) = 7.80 mg/100g. Moreover, a prediction model for pork freshness was established by using a combination of TVB-N, pH and color parameters. It gave good prediction results, with r = 0.91 for pork freshness. The research demonstrated that the hyperspectral scattering technique is a valid tool for real-time and nondestructive detection of pork freshness.

  13. Characterization and authentication of a novel vegetable source of omega-3 fatty acids, sacha inchi (Plukenetia volubilis L.) oil.

    PubMed

    Maurer, Natalie E; Hatta-Sakoda, Beatriz; Pascual-Chagman, Gloria; Rodriguez-Saona, Luis E

    2012-09-15

    Consumption of omega-3 fatty acids (ω-3's), whether from fish oils, flax or supplements, can protect against cardiovascular disease. Finding plant-based sources of the essential ω-3's could provide a sustainable, renewable and inexpensive source of ω-3's, compared to fish oils. Our objective was to develop a rapid test to characterize and detect adulteration in sacha inchi oils, a Peruvian seed containing higher levels of ω-3's in comparison to other oleaginous seeds. A temperature-controlled ZnSe ATR mid-infrared benchtop and diamond ATR mid-infrared portable handheld spectrometers were used to characterize sacha inchi oil and evaluate its oxidative stability compared to commercial oils. A soft independent model of class analogy (SIMCA) and partial least squares regression (PLSR) analyzed the spectral data. Fatty acid profiles showed that sacha inchi oil (44% linolenic acid) had levels of PUFA similar to those of flax oils. PLSR showed good correlation coefficients (R2 > 0.9) between reference tests and spectra from infrared devices, allowing for rapid determination of fatty acid composition and prediction of oxidative stability. Oils formed distinct clusters, allowing the evaluation of commercial sacha inchi oils from Peruvian markets and showed some prevalence of adulteration. Determining oil adulteration and quality parameters, by using the ATR-MIR portable handheld spectrometer, allowed for portability and ease-of-use, making it a great alternative to traditional testing methods. Copyright © 2012 Elsevier Ltd. All rights reserved.

  14. Dough performance, quality and shelf life of flat bread supplemented with fractions of germinated date seed.

    PubMed

    Hejri-Zarifi, Sudiyeh; Ahmadian-Kouchaksaraei, Zahra; Pourfarzad, Amir; Khodaparast, Mohammad Hossein Haddad

    2014-12-01

    Germinated palm date seeds were milled into two fractions: germ and residue. Dough rheological characteristics, baking properties (specific volume and sensory evaluation), and textural properties (on the first day and during storage for 5 days) were determined in Barbari flat bread. Germ and residue fractions were incorporated at various levels ranging from 0.5 to 3 g/100 g of wheat flour. Water absorption, arrival time and gelatinization temperature were decreased by the germ fraction, accompanied by an increase in the mixing tolerance index and degree of softening at most levels. Although an improvement in dough stability was observed, the specific volume of bread was not affected by either fraction. Texture analysis of bread samples during 5 days of storage indicated that both fractions of germinated date seeds were able to diminish bread staling. The Avrami non-linear regression equation was chosen as a useful mathematical model to properly study bread hardening kinetics. In addition, principal component analysis (PCA) allowed discriminating among dough and bread specialties. Partial least squares regression (PLSR) models were applied to determine the relationships between sensory and instrumental data.

  15. Soil fungal diversity in natural grasslands of the Tibetan Plateau: associations with plant diversity and productivity.

    PubMed

    Yang, Teng; Adams, Jonathan M; Shi, Yu; He, Jin-Sheng; Jing, Xin; Chen, Litong; Tedersoo, Leho; Chu, Haiyan

    2017-07-01

    Previous studies have revealed inconsistent correlations between fungal diversity and plant diversity from local to global scales, and there is a lack of information about the diversity-diversity and productivity-diversity relationships for fungi in alpine regions. Here we investigated the internal relationships between soil fungal diversity, plant diversity and productivity across 60 grassland sites on the Tibetan Plateau, using Illumina sequencing of the internal transcribed spacer 2 (ITS2) region for fungal identification. Fungal alpha and beta diversities were best explained by plant alpha and beta diversities, respectively, when accounting for environmental drivers and geographic distance. The best ordinary least squares (OLS) multiple regression models, partial least squares regression (PLSR) and variation partitioning analysis (VPA) indicated that plant richness was positively correlated with fungal richness. However, no correlation between plant richness and fungal richness was evident for fungal functional guilds when analyzed individually. Plant productivity showed a weaker relationship to fungal diversity which was intercorrelated with other factors such as plant diversity, and was thus excluded as a main driver. Our study points to a predominant effect of plant diversity, along with other factors such as carbon : nitrogen (C : N) ratio, soil phosphorus and dissolved organic carbon, on soil fungal richness. © 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.

  16. Discrimination of honeys using colorimetric sensor arrays, sensory analysis and gas chromatography techniques.

    PubMed

    Tahir, Haroon Elrasheid; Xiaobo, Zou; Xiaowei, Huang; Jiyong, Shi; Mariod, Abdalbasit Adam

    2016-09-01

    Aroma profiles of six honey varieties of different botanical origins were investigated using colorimetric sensor array, gas chromatography-mass spectrometry (GC-MS) and descriptive sensory analysis. Fifty-eight aroma compounds were identified, including 2 norisoprenoids, 5 hydrocarbons, 4 terpenes, 6 phenols, 7 ketones, 9 acids, 12 aldehydes and 13 alcohols. Twenty abundant or active compounds were chosen as key compounds to characterize honey aroma. Discrimination of the honeys was subsequently implemented using multivariate analysis, including hierarchical clustering analysis (HCA) and principal component analysis (PCA). Honeys of the same botanical origin were grouped together in the PCA score plot and HCA dendrogram. SPME-GC/MS and colorimetric sensor array were able to discriminate the honeys effectively with the advantages of being rapid, simple and low-cost. Moreover, partial least squares regression (PLSR) was applied to indicate the relationship between sensory descriptors and aroma compounds. Copyright © 2016 Elsevier Ltd. All rights reserved.

  17. Oxidative Stress in Wild Boars Naturally and Experimentally Infected with Mycobacterium bovis

    PubMed Central

    Gassó, Diana; Vicente, Joaquín; Mentaberre, Gregorio; Soriguer, Ramón; Jiménez Rodríguez, Rocío; Navarro-González, Nora; Tvarijonaviciute, Asta; Lavín, Santiago; Fernández-Llario, Pedro; Segalés, Joaquim; Serrano, Emmanuel

    2016-01-01

    Reactive oxygen and nitrogen species (ROS-RNS) are important defence substances involved in the immune response against pathogens. An excessive increase in ROS-RNS, however, can damage the organism causing oxidative stress (OS). The organism is able to neutralise OS by the production of antioxidant enzymes (AE); hence, tissue damage is the result of an imbalance between oxidant and antioxidant status. Though some work has been carried out in humans, there is a lack of information about the oxidant/antioxidant status in the presence of tuberculosis (TB) in wild reservoirs. In the Mediterranean Basin, wild boar (Sus scrofa) is the main reservoir of TB. Wild boar showing severe TB have an increased risk of Mycobacterium spp. shedding, leading to pathogen spreading and persistence. If OS is greater in these individuals, oxidant/antioxidant balance in TB-affected boars could be used as a biomarker of disease severity. The present work had a two-fold objective: i) to study the effects of bovine TB on different OS biomarkers (namely superoxide dismutase (SOD), catalase (CAT), glutathione peroxidase (GPX), glutathione reductase (GR) and thiobarbituric acid reactive substances (TBARS)) in wild boar experimentally challenged with Mycobacterium bovis, and ii) to explore the role of body weight, sex, population and season in explaining the observed variability of OS indicators in two populations of free-ranging wild boar where TB is common. For the first objective, a partial least squares regression (PLSR) approach was used, whereas recursive partitioning with regression tree models (RTM) was applied for the second. A negative relationship between antioxidant enzymes and bovine TB (the more severe lesions, the lower the concentration of antioxidant biomarkers) was observed in experimentally infected animals. The final PLSR model retained the GPX, SOD and GR biomarkers and showed that 17.6% of the observed variability of antioxidant capacity was significantly correlated with the PLSR X’s component represented by both disease status and the age of boars. In the samples from free-ranging wild boar, however, the environmental factors were more relevant to the observed variability of the OS biomarkers than the TB itself. For each OS biomarker, each RTM was defined as a maximum by one node due to the population effect. Along the same lines, the ad hoc tree regression on boars from the population with a higher prevalence of severe TB confirmed that disease status was not the main factor explaining the observed variability in OS biomarkers. It was concluded that oxidative damage caused by TB is significant, but can only be detected in the absence of environmental variation in wild boar. PMID:27682987

  18. Value of Information Analysis for Time-lapse Seismic Data by Simulation-Regression

    NASA Astrophysics Data System (ADS)

    Dutta, G.; Mukerji, T.; Eidsvik, J.

    2016-12-01

    A novel method to estimate the Value of Information (VOI) of time-lapse seismic data in the context of reservoir development is proposed. VOI is a decision analytic metric quantifying the incremental value that would be created by collecting information prior to making a decision under uncertainty. The VOI has to be computed before collecting the information and can be used to justify its collection. Previous work on estimating the VOI of geophysical data has involved explicit approximation of the posterior distribution of reservoir properties given the data and then evaluating the prospect values for that posterior distribution of reservoir properties. Here, we propose to directly estimate the prospect values given the data by building a statistical relationship between them using regression. Various regression techniques such as Partial Least Squares Regression (PLSR), Multivariate Adaptive Regression Splines (MARS) and k-Nearest Neighbors (k-NN) are used to estimate the VOI, and the results compared. For a univariate Gaussian case, the VOI obtained from simulation-regression has been shown to be close to the analytical solution. Estimating VOI by simulation-regression is much less computationally expensive since the posterior distribution of reservoir properties given each possible dataset need not be modeled and the prospect values need not be evaluated for each such posterior distribution of reservoir properties. This method is flexible, since it does not require rigid model specification of posterior but rather fits conditional expectations non-parametrically from samples of values and data.
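
    The simulation-regression idea described in this abstract can be illustrated with a minimal sketch for the univariate Gaussian case. Everything specific here is an assumption made for illustration and is not taken from the record: the decision rule (develop the prospect only when its expected value is positive), the additive Gaussian noise model, the use of an ordinary straight-line fit in place of the PLSR, MARS or k-NN regressions named in the abstract, and all function names.

```python
import random
from statistics import NormalDist, fmean


def simulate(n, mu, sigma, tau, seed=0):
    """Draw prospect values v ~ N(mu, sigma^2) and noisy data d = v + N(0, tau^2)."""
    rng = random.Random(seed)
    values = [rng.gauss(mu, sigma) for _ in range(n)]
    data = [v + rng.gauss(0.0, tau) for v in values]
    return values, data


def fit_line(x, y):
    """Ordinary least squares fit of y on x; returns (intercept, slope)."""
    mx, my = fmean(x), fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def voi_simulation_regression(values, data):
    """Estimate VOI as E[max(0, E[v | d])] - max(0, E[v]) from samples."""
    intercept, slope = fit_line(data, values)
    # The fitted values approximate the conditional expectation E[v | d].
    fitted = [intercept + slope * d for d in data]
    posterior_value = fmean(max(0.0, f) for f in fitted)
    prior_value = max(0.0, fmean(values))
    return posterior_value - prior_value


def voi_analytical(mu, sigma, tau):
    """Closed-form VOI for the same Gaussian model and decision rule."""
    # E[v | d] is Gaussian with mean mu and standard deviation r.
    r = sigma ** 2 / (sigma ** 2 + tau ** 2) ** 0.5
    z = mu / r
    std = NormalDist()
    return mu * std.cdf(z) + r * std.pdf(z) - max(0.0, mu)


if __name__ == "__main__":
    v, d = simulate(20000, mu=0.5, sigma=1.0, tau=1.0)
    print(voi_simulation_regression(v, d), voi_analytical(0.5, 1.0, 1.0))
```

    In this toy setting the regression never models the posterior distribution explicitly; it only needs paired samples of values and data, which is the computational saving the abstract refers to.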

  19. [Inversion of organic matter content of the north fluvo-aquic soil based on hyperspectral and multi-spectra].

    PubMed

    Wang, Yan-Cang; Gu, Xiao-He; Zhu, Jin-Shan; Long, Hui-Ling; Xu, Peng; Liao, Qin-Hong

    2014-01-01

    The present study aims to assess the feasibility of using multi-spectral data to monitor soil organic matter content. The data sources are hyperspectral data measured under laboratory conditions and multi-spectral data simulated from the hyperspectral data. According to the reflectance response functions of Landsat TM and HJ-CCD (the Environment and Disaster Reduction Small Satellites, HJ), the hyperspectra were resampled to the corresponding bands of the multi-spectral sensors. The correlation between hyperspectral and simulated reflectance spectra and organic matter content was calculated, and used to extract the bands sensitive to organic matter in the north fluvo-aquic soil. The partial least square regression (PLSR) method was used to establish empirical models to estimate soil organic matter content. Both root mean squared error (RMSE) and coefficient of determination (R2) were used to test the precision and stability of the models. Results demonstrate that, compared with the hyperspectral data, the best model established with simulated multi-spectral data gives a good result for organic matter content, with R2=0.586 and RMSE=0.280. Therefore, using multi-spectral data to predict fluvo-aquic soil organic matter content is feasible.
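
    The two accuracy measures named in this abstract, RMSE and R2, follow standard definitions and can be computed as in the sketch below. The function names are illustrative, and the abstract does not state which data set (calibration or validation) its reported values refer to.

```python
import math


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot
```

    A model that always predicts the mean of the observations gives R2 = 0 under this definition, so an R2 of 0.586 means the model removes a little over half of the variance around that mean.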

  20. Early detection of germinated wheat grains using terahertz image and chemometrics

    NASA Astrophysics Data System (ADS)

    Jiang, Yuying; Ge, Hongyi; Lian, Feiyu; Zhang, Yuan; Xia, Shanhong

    2016-02-01

    In this paper, we propose a feasible tool that uses a terahertz (THz) imaging system for identifying wheat grains at different stages of germination. The THz spectra of the main changed components of wheat grains, maltose and starch, which were obtained by THz time spectroscopy, were distinctly different. Used for original data compression and feature extraction, principal component analysis (PCA) revealed the changes that occurred in the inner chemical structure during germination. Two thresholds, one indicating the start of the release of α-amylase and the second when it reaches the steady state, were obtained through the first five score images. Thus, the first five PCs were input for the partial least-squares regression (PLSR), least-squares support vector machine (LS-SVM), and back-propagation neural network (BPNN) models, which were used to classify seven different germination times between 0 and 48 h, with a prediction accuracy of 92.85%, 93.57%, and 90.71%, respectively. The experimental results indicated that the combination of THz imaging technology and chemometrics could be a new effective way to discriminate wheat grains at the early germination stage of approximately 6 h.

  1. GEMAS: prediction of solid-solution phase partitioning coefficients (Kd) for oxoanions and boric acid in soils using mid-infrared diffuse reflectance spectroscopy.

    PubMed

    Janik, Leslie J; Forrester, Sean T; Soriano-Disla, José M; Kirby, Jason K; McLaughlin, Michael J; Reimann, Clemens

    2015-02-01

    The authors' aim was to develop rapid and inexpensive regression models for the prediction of partitioning coefficients (Kd), defined as the ratio of the total or surface-bound metal/metalloid concentration of the solid phase to the total concentration in the solution phase. Values of Kd were measured for boric acid (B[OH]3(0)) and selected added soluble oxoanions: molybdate (MoO4(2-)), antimonate (Sb[OH](6-)), selenate (SeO4(2-)), tellurate (TeO4(2-)) and vanadate (VO4(3-)). Models were developed using approximately 500 spectrally representative soils of the Geochemical Mapping of Agricultural Soils of Europe (GEMAS) program. These calibration soils represented the major properties of the entire 4813 soils of the GEMAS project. Multiple linear regression (MLR) from soil properties, partial least-squares regression (PLSR) using mid-infrared diffuse reflectance Fourier-transformed (DRIFT) spectra, and models using DRIFT spectra plus analytical pH values (DRIFT + pH), were compared with predicted log K(d + 1) values. Apart from selenate (R(2) = 0.43), the DRIFT + pH calibrations resulted in marginally better models to predict log K(d + 1) values (R(2) = 0.62-0.79), compared with those from PLSR-DRIFT (R(2) = 0.61-0.72) and MLR (R(2) = 0.54-0.79). The DRIFT + pH calibrations were applied to the prediction of log K(d + 1) values in the remaining 4313 soils. An example map of predicted log K(d + 1) values for added soluble MoO4(2-) in soils across Europe is presented. The DRIFT + pH PLSR models provided a rapid and inexpensive tool to assess the risk of mobility and potential availability of boric acid and selected oxoanions in European soils. For these models to be used in the prediction of log K(d + 1) values in soils globally, additional research will be needed to determine whether soil variability is accounted for in the calibration. © 2014 SETAC.

  2. Multivariate functions for predicting the sorption of 2,4,6-trinitrotoluene (TNT) and 1,3,5-trinitro-1,3,5-tricyclohexane (RDX) among taxonomically distinct soils.

    PubMed

    Katseanes, Chelsea K; Chappell, Mark A; Hopkins, Bryan G; Durham, Brian D; Price, Cynthia L; Porter, Beth E; Miller, Lesley F

    2016-11-01

    After nearly a century of use in numerous munition platforms, TNT and RDX contamination has turned up largely in the environment due to ammunition manufacturing or as part of releases from low-order detonations during training activities. Although the basic knowledge governing the environmental fate of TNT and RDX is known, accurate predictions of TNT and RDX persistence in soil remain elusive, particularly given the universal heterogeneity of pedomorphic soil types. In this work, we proposed a new solution for modeling the sorption and persistence of these munition constituents as multivariate mathematical functions correlating soil attribute data over a variety of taxonomically distinct soil types to contaminant behavior, instead of a single constant or parameter of a specific absolute value. To test this idea, we conducted experiments measuring the sorption of TNT and RDX on taxonomically different soil types that were extensively physically and chemically characterized. Statistical decomposition of the log-transformed and auto-scaled soil characterization data using the dimension-reduction technique PCA (principal component analysis) revealed a strong latent structure based in the multiple pairwise correlations among the soil properties. TNT and RDX sorption partitioning coefficients (KD-TNT and KD-RDX) were regressed against this latent structure using partial least squares regression (PLSR), generating 3-factor multivariate linear functions. Here, PLSR models predicted KD-TNT and KD-RDX values based on attributes contributing to endogenous alkaline/calcareous and soil fertility criteria, respectively, exhibited among the different soil types. We hypothesized that the latent structure arising from the strong covariance of the full multivariate geochemical matrix describing taxonomically distinguished soil types may provide the means for potentially predicting complex phenomena in soils. The development of predictive multivariate models tuned to a local soil's taxonomic designation would have direct benefit to military range managers seeking to anticipate the environmental risks of training activities on impact sites. Published by Elsevier Ltd.

  3. Sensory characteristics and consumer preference for chicken meat in Guinea.

    PubMed

    Sow, T M A; Grongnet, J F

    2010-10-01

    This study identified the sensory characteristics and consumer preference for chicken meat in Guinea. Five chicken samples [live village chicken, live broiler, live spent laying hen, ready-to-cook broiler, and ready-to-cook broiler (imported)] bought from different locations were assessed by 10 trained panelists using 19 sensory attributes. The ANOVA results showed that 3 chicken appearance attributes (brown, yellow, and white), 5 chicken odor attributes (oily, intense, medicine smell, roasted, and mouth persistent), 3 chicken flavor attributes (sweet, bitter, and astringent), and 8 chicken texture attributes (firm, tender, juicy, chew, smooth, springy, hard, and fibrous) were significantly discriminating between the chicken samples (P<0.05). Principal component analysis of the sensory data showed that the first 2 principal components explained 84% of the sensory data variance. The principal component analysis results showed that the live village chicken, the live spent laying hen, and the ready-to-cook broiler (imported) were very well represented and clearly distinguished from the live broiler and the ready-to-cook broiler. One hundred twenty consumers expressed their preferences for the chicken samples using a 5-point Likert scale. The hierarchical cluster analysis of the preference data identified 4 homogenous consumer clusters. The hierarchical cluster analysis results showed that the live village chicken was the most preferred chicken sample, whereas the ready-to-cook broiler was the least preferred one. The partial least squares regression (PLSR) type 1 showed that 72% of the sensory data for the first 2 principal components explained 83% of the chicken preference. The PLSR1 identified that the sensory characteristics juicy, oily, sweet, hard, mouth persistent, and yellow were the most relevant sensory drivers of the Guinean chicken preference. The PLSR2 (with multiple responses) identified the relationship between the chicken samples, their sensory attributes, and the consumer clusters. Our results showed that no chicken category was exclusively preferred over the other chicken samples, and therefore highlight that there is room for the development of all chicken categories in the local market.

  4. Phytoplankton growth rate modelling: can spectroscopic cell chemotyping be superior to physiological predictors?

    PubMed

    Fanesi, Andrea; Wagner, Heiko; Wilhelm, Christian

    2017-02-08

    Climate change has a strong impact on phytoplankton communities and water quality. However, the development of robust techniques to assess phytoplankton growth is still in progress. In this study, the growth rate of phytoplankton cells grown at different temperatures was modelled based on conventional physiological traits (e.g. chlorophyll, carbon and photosynthetic parameters) using the partial least square regression (PLSR) algorithm and compared with a new approach combining Fourier transform infrared-spectroscopy and PLSR. In this second model, it is assumed that the macromolecular composition of phytoplankton cells represents an intracellular marker for growth. The models have comparable high predictive power (R(2) > 0.8) and low error in predicting new observations. Interestingly, not all of the predictors present the same weight in the modelling of growth rate. A set of specific parameters, such as non-photochemical fluorescence quenching (NPQ) and the quantum yield of carbon production in the first model, and lipid, protein and carbohydrate contents for the second one, strongly covary with cell growth rate regardless of the taxonomic position of the phytoplankton species investigated. This reflects a set of specific physiological adjustments covarying with growth rate, conserved among taxonomically distant algal species that might be used as guidelines for the improvement of modern primary production models. The high predictive power of both sets of cellular traits for growth rate is of great importance for applied phycological studies. Our approach may find application as a quality control tool for the monitoring of phytoplankton populations in natural communities or in photobioreactors. © 2017 The Author(s).

  5. Relative importance of climate changes at different time scales on net primary productivity-a case study of the Karst area of northwest Guangxi, China.

    PubMed

    Liu, Huiyu; Zhang, Mingyang; Lin, Zhenshan

    2017-10-05

    Climate changes are considered to significantly impact net primary productivity (NPP). However, there are few studies on how climate changes at multiple time scales impact NPP. Using the MODIS NPP product and station-based observations of sunshine duration, annual average temperature and annual precipitation, the impacts of climate changes at different time scales on annual NPP were studied with the EEMD (ensemble empirical mode decomposition) method in the Karst area of northwest Guangxi, China, during 2000-2013. Moreover, the relative importance of climatic variables for annual NPP was explored with a partial least squares regression (PLSR) model. The results show that (1) only at the quasi 3-year time scale do sunshine duration and temperature have significantly positive relations with NPP; (2) annual precipitation has no significant relation to NPP by direct comparison, but has a significantly positive relation at the 5-year time scale, which is because the 5-year time scale is not the dominant scale of precipitation; (3) the changes of NPP may be dominated by inter-annual variabilities; and (4) multiple time scales analysis will greatly improve the performance of the PLSR model for estimating NPP. The variable importance in projection (VIP) scores of sunshine duration and temperature at the quasi 3-year time scale, and of precipitation at the quasi 5-year time scale, are greater than 0.8, indicating that these variables are important for NPP during 2000-2013. However, sunshine duration and temperature at the quasi 3-year time scale are much more important. Our results underscore the importance of multiple time scales analysis for revealing the relations of NPP to changing climate.

  6. Determination of yolk contamination in liquid egg white using Raman spectroscopy.

    PubMed

    Cluff, K; Konda Naganathan, G; Jonnalagada, D; Mortensen, I; Wehling, R; Subbiah, J

    2016-07-01

    Purified egg white is an important ingredient in a number of baked and confectionary foods because of its foaming properties. However, yolk contamination in amounts as low as 0.01% can impede the foaming ability of egg white. In this study, we used Raman spectroscopy to evaluate the hypothesis that yolk contamination in egg white could be detected based on its molecular optical properties. Yolk contaminated egg white samples (n = 115) with contamination levels ranging from 0% to 0.25% (on weight basis) were prepared. The samples were excited with a 785 nm laser and Raman spectra from 250 to 3,200 cm(-1) were recorded. The Raman spectra were baseline corrected using an optimized piecewise cubic interpolation on each spectrum and then normalized with a standard normal variate transformation. Samples were randomly divided into calibration (n = 77) and validation (n = 38) data sets. A partial least squares regression (PLSR) model was developed to predict yolk contamination levels, based on the Raman spectral fingerprint. Raman spectral peaks, in the spectral region of 1,080 and 1,666 cm(-1), had the largest influence on detecting yolk contamination in egg white. The PLSR model was able to correctly predict yolk contamination levels with an R(2) = 0.90 in the validation data set. These results demonstrate the capability of Raman spectroscopy for detection of yolk contamination at very low levels in egg white and present a strong case for development of an on-line system to be deployed in egg processing plants. © 2016 Poultry Science Association Inc.
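
    Two of the preprocessing steps mentioned in this abstract, standard normal variate (SNV) normalization and the random division into calibration and validation sets, can be sketched as follows. This is a minimal illustration and not the authors' code: the function names and the fixed seed are assumptions, and the baseline correction step is omitted because the abstract does not give enough detail to reproduce it.

```python
import random
import statistics


def snv(spectrum):
    """Standard normal variate: centre one spectrum on its own mean and
    scale it by its own sample standard deviation."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)
    return [(x - mean) / sd for x in spectrum]


def random_split(n_samples, n_calibration, seed=0):
    """Randomly partition sample indices into calibration and validation sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    calibration = sorted(indices[:n_calibration])
    validation = sorted(indices[n_calibration:])
    return calibration, validation


if __name__ == "__main__":
    # Sizes taken from the abstract: 115 samples, 77 for calibration.
    cal, val = random_split(115, 77)
    print(len(cal), len(val))
```

    Because SNV is applied to each spectrum independently, it can be computed before the split without leaking information from the validation set into the calibration set.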

  7. Quantitative determination and classification of energy drinks using near-infrared spectroscopy.

    PubMed

    Rácz, Anita; Héberger, Károly; Fodor, Marietta

    2016-09-01

    Almost a hundred commercially available energy drink samples from Hungary, Slovakia, and Greece were collected for the quantitative determination of their caffeine and sugar content with FT-NIR spectroscopy and high-performance liquid chromatography (HPLC). Calibration models were built with partial least-squares regression (PLSR). An HPLC-UV method was used to measure the reference values for caffeine content, while sugar contents were measured with the Schoorl method. Both the nominal sugar content (as indicated on the cans) and the measured sugar concentration were used as references. Although the Schoorl method has larger error and bias, appropriate models could be developed using both references. The validation of the models was based on sevenfold cross-validation and external validation. FT-NIR analysis is a good candidate to replace the HPLC-UV method, because it is much cheaper than any chromatographic method, while it is also more time-efficient. The combination of FT-NIR with multidimensional chemometric techniques like PLSR can be a good option for the detection of low caffeine concentrations in energy drinks. Moreover, three types of energy drinks that contain (i) taurine, (ii) arginine, and (iii) none of these two components were classified correctly using principal component analysis and linear discriminant analysis. Such classifications are important for the detection of adulterated samples and for quality control, as well. In this case, more than a hundred samples were used for the evaluation. The classification was validated with cross-validation and several randomization tests (X-scrambling).
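
    The sevenfold cross-validation mentioned in this abstract can be sketched as an index-splitting routine. The abstract does not say how samples were assigned to folds; the interleaved assignment below, and the function name, are assumptions chosen for simplicity.

```python
def k_fold_indices(n_samples, k=7):
    """Return k (training, held-out) index pairs covering every sample once.

    Samples are assigned to folds in an interleaved pattern: sample i goes
    to fold i % k. This assignment scheme is an illustrative choice.
    """
    folds = [list(range(i, n_samples, k)) for i in range(k)]
    splits = []
    for held_out in folds:
        held = set(held_out)
        training = [i for i in range(n_samples) if i not in held]
        splits.append((training, held_out))
    return splits


if __name__ == "__main__":
    for training, held_out in k_fold_indices(20, k=7):
        print(len(training), held_out)
```

    Each model is fitted on the training indices and evaluated on the held-out indices, so every sample contributes exactly one cross-validated prediction.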

  8. Modafinil Reverses Phencyclidine-Induced Deficits in Cognitive Flexibility, Cerebral Metabolism, and Functional Brain Connectivity

    PubMed Central

    Dawson, Neil; Thompson, Rhiannon J.; McVie, Allan; Thomson, David M.; Morris, Brian J.; Pratt, Judith A.

    2012-01-01

    Objective: In the present study, we employ mathematical modeling (partial least squares regression, PLSR) to elucidate the functional connectivity signatures of discrete brain regions in order to identify the functional networks subserving PCP-induced disruption of distinct cognitive functions and their restoration by the procognitive drug modafinil. Methods: We examine the functional connectivity signatures of discrete brain regions that show overt alterations in metabolism, as measured by semiquantitative 2-deoxyglucose autoradiography, in an animal model (subchronic phencyclidine [PCP] treatment), which shows cognitive inflexibility with relevance to the cognitive deficits seen in schizophrenia. Results: We identify the specific components of functional connectivity that contribute to the rescue of this cognitive inflexibility and to the restoration of overt cerebral metabolism by modafinil. We demonstrate that modafinil reversed both the PCP-induced deficit in the ability to switch attentional set and the PCP-induced hypometabolism in the prefrontal (anterior prelimbic) and retrosplenial cortices. Furthermore, modafinil selectively enhanced metabolism in the medial prelimbic cortex. The functional connectivity signatures of these regions identified a unifying functional subsystem underlying the influence of modafinil on cerebral metabolism and cognitive flexibility that included the nucleus accumbens core and locus coeruleus. In addition, these functional connectivity signatures identified coupling events specific to each brain region, which relate to known anatomical connectivity. Conclusions: These data support clinical evidence that modafinil may alleviate cognitive deficits in schizophrenia and also demonstrate the benefit of applying PLSR modeling to characterize functional brain networks in translational models relevant to central nervous system dysfunction. PMID:20810469

  9. Spatially dense morphometrics of craniofacial sexual dimorphism in 1-year-olds.

    PubMed

    Matthews, Harold; Penington, Tony; Saey, Ine; Halliday, Jane; Muggli, Evelyn; Claes, Peter

    2016-10-01

    Recent advances in the field of geometric morphometrics allow for powerful statistical hypothesis testing for effects of biological and environmental variables on anatomical shape. This study used partial least-squares regression (PLSR) and the recently developed bootstrapped response-based imputation modelling (BRIM) algorithm to test for sexual dimorphism in the craniofacial shape of 1-year-old humans. We observed a recession of the forehead in boys relative to girls, and differences in the nose, consistent with adult dimorphism. Results also suggest that the degree to which individuals express dimorphic traits is continuous throughout the population. This is also seen in adult dimorphism but in 1-year-olds the amount of overlap between groups is much higher, indicating the strength of dimorphism between sexes is lower. Our results demonstrate early sexual dimorphism that is not attributable to the influx of sex hormones at puberty. This highlights the need to look at very early ontogeny for the origins of sexual dimorphism. We suggest that future work look at potential mediating effects of this early dimorphism on the later impact of puberty. The subtle shape differences we have detected, may also be applied to sexing fossilised crania. A common artefact in 3D images of faces of young children is that they often have their mouths open to varying degrees, introducing variability in the data unrelated to anatomy. We describe two PLSR-based methods of correcting this. These methods may facilitate surgical planning and assessment of young children based on 3D images. © 2016 Anatomical Society.

  10. Quality evaluation of frozen guava and yellow passion fruit pulps by NIR spectroscopy and chemometrics.

    PubMed

    Alamar, Priscila D; Caramês, Elem T S; Poppi, Ronei J; Pallone, Juliana A L

    2016-07-01

    The present study investigated the application of near infrared spectroscopy as a green, quick, and efficient alternative to analytical methods currently used to evaluate the quality (moisture, total sugars, acidity, soluble solids, pH and ascorbic acid) of frozen guava and passion fruit pulps. Fifty samples were analyzed by near infrared spectroscopy (NIR) and reference methods. Partial least square regression (PLSR) was used to develop calibration models to relate the NIR spectra and the reference values. Reference methods indicated adulteration by water addition in 58% of guava pulp samples and 44% of yellow passion fruit pulp samples. The PLS models produced lower values of root mean squares error of calibration (RMSEC), root mean squares error of prediction (RMSEP), and coefficient of determination above 0.7. Moisture and total sugars presented the best calibration models (RMSEP of 0.240 and 0.269, respectively, for guava pulp; RMSEP of 0.401 and 0.413, respectively, for passion fruit pulp) which enables the application of these models to determine adulteration in guava and yellow passion fruit pulp by water or sugar addition. The models constructed for calibration of quality parameters of frozen fruit pulps in this study indicate that NIR spectroscopy coupled with the multivariate calibration technique could be applied to determine the quality of guava and yellow passion fruit pulp. Copyright © 2016 Elsevier Ltd. All rights reserved.

  11. Robust colour calibration of an imaging system using a colour space transform and advanced regression modelling.

    PubMed

    Jackman, Patrick; Sun, Da-Wen; Elmasry, Gamal

    2012-08-01

    A new algorithm for the conversion of device dependent RGB colour data into device independent L*a*b* colour data without introducing noticeable error has been developed. By combining a linear colour space transform and advanced multiple regression methodologies it was possible to predict L*a*b* colour data with less than 2.2 colour units of error (CIE 1976). By transforming the red, green and blue colour components into new variables that better reflect the structure of the L*a*b* colour space, a low colour calibration error was immediately achieved (ΔE(CAL) = 14.1). Application of a range of regression models on the data further reduced the colour calibration error substantially (multilinear regression ΔE(CAL) = 5.4; response surface ΔE(CAL) = 2.9; PLSR ΔE(CAL) = 2.6; LASSO regression ΔE(CAL) = 2.1). Only the PLSR models deteriorated substantially under cross validation. The algorithm is adaptable and can be easily recalibrated to any working computer vision system. The algorithm was tested on a typical working laboratory computer vision system and delivered only a very marginal loss of colour information ΔE(CAL) = 2.35. Colour features derived on this system were able to safely discriminate between three classes of ham with 100% correct classification whereas colour features measured on a conventional colourimeter were not. Copyright © 2012 Elsevier Ltd. All rights reserved.
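
    The colour error unit cited in this abstract (CIE 1976) is the Euclidean distance between two points in L*a*b* space. The sketch below shows that distance and, as an assumption not confirmed by the abstract, treats the calibration error ΔE(CAL) as the mean distance over a set of colour patches; the function names are illustrative.

```python
import math


def delta_e_cie76(lab1, lab2):
    """CIE 1976 colour difference: Euclidean distance in L*a*b* space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(lab1, lab2)))


def mean_delta_e(measured, predicted):
    """Mean CIE76 difference over paired (L*, a*, b*) triples.

    Assumption: the calibration error is summarised as a simple average.
    """
    pairs = list(zip(measured, predicted))
    if not pairs:
        raise ValueError("at least one colour pair is required")
    return sum(delta_e_cie76(m, p) for m, p in pairs) / len(pairs)
```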

  12. Fast detection and visualization of minced lamb meat adulteration using NIR hyperspectral imaging and multivariate image analysis.

    PubMed

    Kamruzzaman, Mohammed; Sun, Da-Wen; ElMasry, Gamal; Allen, Paul

    2013-01-15

    Many studies have been carried out in developing non-destructive technologies for predicting meat adulteration, but there is still no endeavor for non-destructive detection and quantification of adulteration in minced lamb meat. The main goal of this study was to develop and optimize a rapid analytical technique based on near-infrared (NIR) hyperspectral imaging to detect the level of adulteration in minced lamb. Initial investigation was carried out using principal component analysis (PCA) to identify the most likely adulterant in minced lamb. Minced lamb meat samples were then adulterated with minced pork in the range 2-40% (w/w) at approximately 2% increments. Spectral data were used to develop a partial least squares regression (PLSR) model to predict the level of adulteration in minced lamb. A good prediction model was obtained using the whole spectral range (910-1700 nm) with a coefficient of determination (R(2)(cv)) of 0.99 and root-mean-square errors estimated by cross validation (RMSECV) of 1.37%. Four important wavelengths (940, 1067, 1144 and 1217 nm) were selected using weighted regression coefficients (Bw) and a multiple linear regression (MLR) model was then established using these important wavelengths to predict adulteration. The MLR model resulted in a coefficient of determination (R(2)(cv)) of 0.98 and RMSECV of 1.45%. The developed MLR model was then applied to each pixel in the image to obtain prediction maps to visualize the distribution of adulteration of the tested samples. The results demonstrated that the laborious and time-consuming traditional analytical techniques could be replaced by spectral data in order to provide a rapid, low-cost and non-destructive testing technique for adulterant detection in minced lamb meat. Copyright © 2012 Elsevier B.V. All rights reserved.
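
    The step of applying a multiple linear regression model to each pixel can be sketched as follows. This is an illustration only: the cube layout (rows of pixels, each pixel a list of reflectances at the selected wavelengths) and the coefficient values used in any call are hypothetical and are not taken from the study.

```python
def prediction_map(cube, intercept, coefficients):
    """Apply a linear model to every pixel of a reduced hyperspectral cube.

    cube[row][col] is a list of reflectance values, one per selected
    wavelength (for example four values for four selected bands).
    Returns a 2-D list holding the predicted value for each pixel.
    """
    return [
        [
            intercept + sum(c * v for c, v in zip(coefficients, pixel))
            for pixel in row
        ]
        for row in cube
    ]
```

    Rendering the returned 2-D list as a colour-scaled image gives the kind of distribution map the abstract describes.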

  13. Rapid and non-destructive identification of water-injected beef samples using multispectral imaging analysis.

    PubMed

    Liu, Jinxia; Cao, Yue; Wang, Qiu; Pan, Wenjuan; Ma, Fei; Liu, Changhong; Chen, Wei; Yang, Jianbo; Zheng, Lei

    2016-01-01

    Water-injected beef has aroused public concern as a major food-safety issue in meat products. In this study, the potential of multispectral imaging analysis in the visible and near-infrared (405-970 nm) regions was evaluated for identifying water-injected beef. A multispectral vision system was used to acquire images of beef injected with up to 21% content of water, and a partial least squares regression (PLSR) algorithm was employed to establish a prediction model, leading to quantitative estimations of actual water increase with a correlation coefficient (r) of 0.923. Subsequently, an optimized model was achieved by integrating spectral data with feature information extracted from ordinary RGB data, yielding better predictions (r = 0.946). Moreover, the prediction equation was transferred to each pixel within the images for visualizing the distribution of actual water increase. These results demonstrate the capability of multispectral imaging technology as a rapid and non-destructive tool for the identification of water-injected beef. Copyright © 2015 Elsevier Ltd. All rights reserved.

  14. Rapid monitoring of the fermentation process for Korean traditional rice wine 'Makgeolli' using FT-NIR spectroscopy

    NASA Astrophysics Data System (ADS)

    Kim, Dae-Yong; Cho, Byoung-Kwan

    2015-11-01

    The quality parameters of the Korean traditional rice wine "Makgeolli" were monitored using Fourier transform near-infrared (FT-NIR) spectroscopy with multivariate statistical analysis (MSA) during fermentation. Alcohol, reducing sugar, and titratable acid were the parameters assessed to determine the quality index of fermentation substrates and products. The acquired spectra were analyzed with partial least squares regression (PLSR). The best prediction model for alcohol was obtained with maximum normalization, showing a coefficient of determination (Rp2) of 0.973 and a standard error of prediction (SEP) of 0.760%. In addition, the best prediction model for reducing sugar was obtained with no data preprocessing, with a Rp2 value of 0.945 and a SEP of 1.233%. The prediction of titratable acidity was best with mean normalization, showing a Rp2 value of 0.882 and a SEP of 0.045%. These results demonstrate that FT-NIR spectroscopy can be used for rapid measurements of quality parameters during Makgeolli fermentation.

  15. Gas Chromatography Data Classification Based on Complex Coefficients of an Autoregressive Model

    DOE PAGES

    Zhao, Weixiang; Morgan, Joshua T.; Davis, Cristina E.

    2008-01-01

    This paper introduces autoregressive (AR) modeling as a novel method to classify outputs from gas chromatography (GC). The inverse Fourier transformation was applied to the original sensor data, and then an AR model was applied to transform data to generate AR model complex coefficients. This series of coefficients effectively contains a compressed version of all of the information in the original GC signal output. We applied this method to chromatograms resulting from proliferating bacteria species grown in culture. Three types of neural networks were used to classify the AR coefficients: backward propagating neural network (BPNN), radial basis function-principal component analysis (RBF-PCA) approach, and radial basis function-partial least squares regression (RBF-PLSR) approach. This exploratory study demonstrates the feasibility of using complex root coefficient patterns to distinguish various classes of experimental data, such as those from the different bacteria species. This cognition approach also proved to be robust and potentially useful for freeing us from time alignment of GC signals.

  16. Rapid identification of soil cadmium pollution risk at regional scale based on visible and near-infrared spectroscopy.

    PubMed

    Chen, Tao; Chang, Qingrui; Clevers, J G P W; Kooistra, L

    2015-11-01

    Soil heavy metal pollution due to long-term sewage irrigation is a serious environmental problem in many irrigation areas in northern China. Quickly identifying its pollution status is an important basis for remediation. Visible-near-infrared reflectance spectroscopy (VNIRS) provides a useful tool. In a case study, 76 soil samples were collected and their reflectance spectra were used to estimate cadmium (Cd) concentration by partial least squares regression (PLSR) and back propagation neural network (BPNN). To reduce noise, six pre-treatments were compared, in which orthogonal signal correction (OSC) was first used in soil Cd estimation. Spectral analysis and geostatistics were combined to identify Cd pollution hotspots. Results showed that Cd was accumulated in topsoil at the study area. OSC can effectively remove irrelevant information to improve prediction accuracy. More accurate estimation was achieved by applying a BPNN. Soil Cd pollution hotspots could be identified by interpolating the predicted values obtained from spectral estimates. Copyright © 2015 Elsevier Ltd. All rights reserved.

  17. LC-MS based metabolomics and chemometrics study of the toxic effects of copper on Saccharomyces cerevisiae.

    PubMed

    Farrés, Mireia; Piña, Benjamí; Tauler, Romà

    2016-08-01

    Copper containing fungicides are used to protect vineyards from fungal infections. Higher residues of copper in grapes at toxic concentrations are potentially toxic and affect the microorganisms living in vineyards, such as Saccharomyces cerevisiae. In this study, the response of the metabolic profiles of S. cerevisiae at different concentrations of copper sulphate (control, 1 mM, 3 mM and 6 mM) was analysed by liquid chromatography coupled to mass spectrometry (LC-MS) and multivariate curve resolution-alternating least squares (MCR-ALS) using an untargeted metabolomics approach. Peak areas of the MCR-ALS resolved elution profiles in control and in Cu(ii)-treated samples were compared using partial least squares regression (PLSR) and PLS-discriminant analysis (PLS-DA), and the intracellular metabolites best contributing to sample discrimination were selected and identified. Fourteen metabolites showed significant concentration changes upon Cu(ii) exposure, following a dose-response effect. The observed changes were consistent with the expected effects of Cu(ii) toxicity, including oxidative stress and DNA damage. This research confirmed that LC-MS based metabolomics coupled to chemometric methods are a powerful approach for discerning metabolomics changes in S. cerevisiae and for elucidating modes of toxicity of environmental stressors, including heavy metals like Cu(ii).

  18. Spatial patterns of vegetation biomass and soil organic carbon acquired from airborne lidar and hyperspectral imagery at Reynolds Creek Critical Zone Observatory

    NASA Astrophysics Data System (ADS)

    Will, R. M.; Li, A.; Glenn, N. F.; Benner, S. G.; Spaete, L.; Ilangakoon, N. T.

    2015-12-01

    Soil organic carbon distribution and the factors influencing this distribution are important for understanding carbon stores, vegetation dynamics, and the overall carbon cycle. Linking soil organic carbon (SOC) with aboveground vegetation biomass may provide a method to better understand SOC distribution in semiarid ecosystems. The Reynolds Creek Critical Zone Observatory (RC CZO) in Idaho, USA, is approximately 240 square kilometers and is situated in the semiarid Great Basin of the sagebrush-steppe ecosystem. Full waveform airborne lidar data and Next-Generation Airborne Visible/Infrared Imaging Spectrometer (AVIRIS-ng) collected in 2014 across the RC CZO are used to map vegetation biomass and SOC and then explore the relationships between them. Vegetation biomass is estimated by identifying vegetation species, and quantifying distribution and structure with lidar and integrating the field-measured biomass. Spectral data from AVIRIS-ng are used to differentiate non-photosynthetic vegetation (NPV) and soil, which are commonly confused in semiarid ecosystems. The information from lidar and AVIRIS-ng are then used to predict SOC by partial least squares regression (PLSR). An uncertainty analysis is provided, demonstrating the applicability of these approaches to improving our understanding of the distribution and patterns of SOC across the landscape.

  19. Direct comparison of low- and mid-frequency Raman spectroscopy for quantitative solid-state pharmaceutical analysis.

    PubMed

    Lipiäinen, Tiina; Fraser-Miller, Sara J; Gordon, Keith C; Strachan, Clare J

    2018-02-05

    This study considers the potential of low-frequency (terahertz) Raman spectroscopy in the quantitative analysis of ternary mixtures of solid-state forms. Direct comparison between low-frequency and mid-frequency spectral regions for quantitative analysis of crystal form mixtures, without confounding sampling and instrumental variations, is reported for the first time. Piroxicam was used as a model drug, and the low-frequency spectra of piroxicam forms β, α2 and monohydrate are presented for the first time. These forms show clear spectral differences in both the low- and mid-frequency regions. Both spectral regions provided quantitative models suitable for predicting the mixture compositions using partial least squares regression (PLSR), but the low-frequency data gave better models, based on lower errors of prediction (2.7, 3.1 and 3.2% root-mean-square errors of prediction [RMSEP] values for the β, α2 and monohydrate forms, respectively) than the mid-frequency data (6.3, 5.4 and 4.8%, for the β, α2 and monohydrate forms, respectively). The better performance of low-frequency Raman analysis was attributed to larger spectral differences between the solid-state forms, combined with a higher signal-to-noise ratio. Copyright © 2017 Elsevier B.V. All rights reserved.

  20. In-line monitoring of extraction process of scutellarein from Erigeron breviscapus (vant.) Hand-Mazz based on qualitative and quantitative uses of near-infrared spectroscopy.

    PubMed

    Wu, Yongjiang; Jin, Ye; Ding, Haiying; Luan, Lianjun; Chen, Yong; Liu, Xuesong

    2011-09-01

    The application of near-infrared (NIR) spectroscopy for in-line monitoring of extraction process of scutellarein from Erigeron breviscapus (vant.) Hand-Mazz was investigated. For NIR measurements, two fiber optic probes designed to transmit NIR radiation through a 2 mm pathlength flow cell were utilized to collect spectra in real-time. High performance liquid chromatography (HPLC) was used as a reference method to determine scutellarein in extract solution. Partial least squares regression (PLSR) calibration model of Savitzky-Golay smoothing NIR spectra in the 5450-10,000 cm(-1) region gave satisfactory predictive results for scutellarein. The results showed that the correlation coefficients of calibration and cross validation were 0.9967 and 0.9811, respectively, and the root mean square error of calibration and cross validation were 0.044 and 0.105, respectively. Furthermore, both the moving block standard deviation (MBSD) method and conformity test were used to identify the end point of extraction process, providing real-time data and instant feedback about the extraction course. The results obtained in this study indicated that the NIR spectroscopy technique provides an efficient and environmentally friendly approach for fast determination of scutellarein and end point control of extraction process. Copyright © 2011 Elsevier B.V. All rights reserved.
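
    The moving block standard deviation (MBSD) idea named in this abstract can be sketched as below. The sketch assumes the commonly described form of the method: for each block of consecutive spectra, take the standard deviation at every wavenumber and average those values, then declare the end point when the average falls below a threshold. The block size and threshold are hypothetical parameters, not values from the study.

```python
import statistics


def moving_block_std(spectra, block_size):
    """Return one MBSD value per block of `block_size` consecutive spectra.

    spectra is a list of spectra in acquisition order; each spectrum is a
    list of absorbance values on a common wavenumber axis.
    """
    if block_size < 2 or block_size > len(spectra):
        raise ValueError("block_size must be between 2 and len(spectra)")
    values = []
    for start in range(len(spectra) - block_size + 1):
        block = spectra[start:start + block_size]
        # Standard deviation across the block at each wavenumber.
        per_channel = [statistics.stdev(channel) for channel in zip(*block)]
        values.append(sum(per_channel) / len(per_channel))
    return values


def end_point_index(mbsd_values, threshold, block_size):
    """Index of the spectrum closing the first block whose MBSD is below
    the threshold, or None if no block qualifies."""
    for i, value in enumerate(mbsd_values):
        if value < threshold:
            return i + block_size - 1
    return None
```

    A low MBSD means successive spectra have stopped changing, which is taken here as the sign that extraction has levelled off.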

  1. Hyperspectral sensing to detect the impact of herbicide drift on cotton growth and yield

    NASA Astrophysics Data System (ADS)

    Suarez, L. A.; Apan, A.; Werth, J.

    2016-10-01

    Yield loss in crops is often associated with plant disease or external factors such as environment, water supply and nutrient availability. Improper agricultural practices can also introduce risks into the equation. Herbicide drift can be a combination of improper practices and environmental conditions which can create a potential yield loss. As traditional assessment of plant damage is often imprecise and time consuming, the ability of remote and proximal sensing techniques to monitor various bio-chemical alterations in the plant may offer a faster, non-destructive and reliable approach to predict yield loss caused by herbicide drift. This paper examines the prediction capabilities of partial least squares regression (PLS-R) models for estimating yield. Models were constructed with hyperspectral data of a cotton crop sprayed with three simulated doses of the phenoxy herbicide 2,4-D at three different growth stages. Fibre quality, photosynthesis, conductance, and two main hormones, indole acetic acid (IAA) and abscisic acid (ABA) were also analysed. Except for fibre quality and ABA, Spearman correlations have shown that these variables were highly affected by the chemical. Four PLS-R models for predicting yield were developed according to four timings of data collection: 2, 7, 14 and 28 days after the exposure (DAE). As indicated by the model performance, the analysis revealed that 7 DAE was the best time for data collection purposes (RMSEP = 2.6 and R2 = 0.88), followed by 28 DAE (RMSEP = 3.2 and R2 = 0.84). In summary, the results of this study show that it is possible to accurately predict yield after a simulated herbicide drift of 2,4-D on a cotton crop, through the analysis of hyperspectral data, thereby providing a reliable, effective and non-destructive alternative based on the internal response of the cotton leaves.

  2. Raman microspectroscopy for in situ examination of carbon-microbe-mineral interactions

    NASA Astrophysics Data System (ADS)

    Creamer, C.; Foster, A. L.; Lawrence, C. R.; Mcfarland, J. W.; Waldrop, M. P.

    2016-12-01

    The changing paradigm of soil organic matter formation and turnover is focused at the nexus of microbe-carbon-mineral interactions. However, visualizing biotic and abiotic stabilization of C on mineral surfaces is difficult given our current techniques. Therefore we investigated Raman microspectroscopy as a potential tool to examine microbially mediated organo-mineral associations. Raman microspectroscopy is a non-destructive technique that has been used to identify microorganisms and minerals, and to quantify microbial assimilation of 13C labeled substrates in culture. We developed a partial least squares regression (PLSR) model to accurately quantify (within 5%) adsorption of four model 12C substrates (glucose, glutamic acid, oxalic acid, p-hydroxybenzoic acid) on a range of soil minerals. We also developed a PLSR model to quantify the incorporation of 13C into E. coli cells. Using these two models, along with measures of the 13C content of respired CO2, we determined the allocation of glucose-derived C into mineral-associated microbial biomass and respired CO2 in situ and through time. We observed progressive 13C enrichment of microbial biomass with incubation time, as well as 13C enrichment of CO2 indicating preferential decomposition of glucose-derived C. We will also present results on the application of our in situ chamber to quantify the formation of organo-mineral associations under both abiotic and biotic conditions with a variety of C and mineral substrates, as well as the rate of turnover and stabilization of microbial residues. Application of Raman microspectroscopy to microbial-mineral interactions represents a novel method to quantify microbial transformation of C substrates and subsequent mineral stabilization without destructive sampling, and has the potential to provide new insights to our conceptual understanding of carbon-microbe-mineral interactions.

  3. Nondestructive detection of total viable count changes of chilled pork in high oxygen storage condition based on hyperspectral technology

    NASA Astrophysics Data System (ADS)

    Zheng, Xiaochun; Peng, Yankun; Li, Yongyu; Chao, Kuanglin; Qin, Jianwei

    2017-05-01

    The plate count method is commonly used to detect the total viable count (TVC) of bacteria in pork, which is time-consuming and destructive. It has also been used to study the changes of the TVC in pork under different storage conditions. In recent years, many scholars have explored non-destructive methods for detecting TVC by using visible near infrared (VIS/NIR) technology and hyperspectral technology. The TVC in chilled pork was monitored under high oxygen condition in this study by using hyperspectral technology in order to evaluate the changes of total bacterial count during storage, and then evaluate advantages and disadvantages of the storage condition. The VIS/NIR hyperspectral images of samples stored in high oxygen condition were acquired by a hyperspectral system in the range of 400-1100 nm. The actual reference value of total bacteria was measured by the standard plate count method, and the results were obtained in 48 hours. The reflection spectra of the samples were extracted and used for the establishment of a prediction model for TVC. The spectral preprocessing methods of standard normal variate transformation (SNV), multiple scatter correction (MSC) and derivation were applied to the original reflectance spectra of samples. Partial least squares regression (PLSR) of TVC was performed and optimized to be the prediction model. The results show that the near infrared hyperspectral technology based on 400-1100 nm combined with the PLSR model can describe the growth pattern of the total bacteria count of the chilled pork under the condition of high oxygen very vividly and rapidly. The results obtained in this study demonstrate that the nondestructive detection of TVC based on NIR hyperspectral technology has great potential in monitoring of edible safety in processing and storage of meat.
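
    Of the preprocessing methods listed in this abstract, standard normal variate (SNV) transformation is the simplest to illustrate: each spectrum is centred on its own mean and divided by its own standard deviation. The sketch below assumes the sample standard deviation (n - 1 denominator); the abstract does not say which form was used.

```python
import statistics


def snv(spectrum):
    """Standard normal variate transform of a single spectrum.

    Each value is centred on the spectrum mean and scaled by the spectrum
    standard deviation, which reduces multiplicative scatter effects.
    """
    mean = statistics.mean(spectrum)
    sd = statistics.stdev(spectrum)  # assumption: n - 1 denominator
    if sd == 0:
        raise ValueError("a constant spectrum cannot be SNV-scaled")
    return [(value - mean) / sd for value in spectrum]
```

    Because the transform works on one spectrum at a time, it needs no reference spectrum, unlike multiple scatter correction.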

  4. Quantitative estimation of soil salinity by means of different modeling methods and visible-near infrared (VIS–NIR) spectroscopy, Ebinur Lake Wetland, Northwest China

    PubMed Central

    Wang, Jingzhe; Abulimiti, Aerzuna; Cai, Lianghong

    2018-01-01

    Soil salinization is one of the most common forms of land degradation. The detection and assessment of soil salinity is critical for the prevention of environmental deterioration especially in arid and semi-arid areas. This study introduced the fractional derivative in the pretreatment of visible and near infrared (VIS–NIR) spectroscopy. The soil samples (n = 400) collected from the Ebinur Lake Wetland, Xinjiang Uyghur Autonomous Region (XUAR), China, were used as the dataset. After measuring the spectral reflectance and salinity in the laboratory, the raw spectral reflectance was preprocessed by means of the absorbance and the fractional derivative order in the range of 0.0–2.0 order with an interval of 0.1. Two different modeling methods, namely, partial least squares regression (PLSR) and random forest (RF) with preprocessed reflectance were used for quantifying soil salinity. The results showed that more spectral characteristics were refined for the spectrum reflectance treated via fractional derivative. The validation accuracies showed that RF models performed better than those of PLSR. The most effective model was established based on RF with the 1.5 order derivative of absorbance with the optimal values of R2 (0.93), RMSE (4.57 dS m−1), and RPD (2.78 ≥ 2.50). The developed RF model was stable and accurate in the application of spectral reflectance for determining the soil salinity of the Ebinur Lake wetland. The pretreatment of fractional derivative could be useful for monitoring multiple soil parameters with higher accuracy, which could effectively help to analyze the soil salinity. PMID:29736341
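
    A fractional-order derivative of a spectrum can be sketched with the Grünwald-Letnikov finite-difference form. This is an assumption: the abstract does not name the definition used, and the uniform band spacing `step` is a hypothetical parameter. Order 0 returns the spectrum unchanged, and order 1 reduces to the first backward difference.

```python
def gl_weights(order, count):
    """First `count` Grünwald-Letnikov weights for a given derivative order.

    Uses the recurrence w_0 = 1, w_k = w_{k-1} * (1 - (order + 1) / k).
    """
    weights = [1.0]
    for k in range(1, count):
        weights.append(weights[-1] * (1.0 - (order + 1.0) / k))
    return weights


def fractional_derivative(values, order, step=1.0):
    """Grünwald-Letnikov fractional derivative of a uniformly sampled series.

    The sum at position i runs over all earlier samples (truncated at the
    start of the series) and is scaled by step ** order.
    """
    weights = gl_weights(order, len(values))
    scale = step ** order
    return [
        sum(weights[k] * values[i - k] for k in range(i + 1)) / scale
        for i in range(len(values))
    ]
```

    Sweeping `order` from 0.0 to 2.0 in steps of 0.1, as the abstract describes, would give 21 preprocessed versions of each spectrum to compare.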

  5. Rapid and quantitative detection of the microbial spoilage in milk using Fourier transform infrared spectroscopy and chemometrics.

    PubMed

    Nicolaou, Nicoletta; Goodacre, Royston

    2008-10-01

    Microbiological safety plays a very significant part in the quality control of milk and dairy products worldwide. Current methods used in the detection and enumeration of spoilage bacteria in pasteurized milk in the dairy industry, although accurate and sensitive, are time-consuming. FT-IR spectroscopy is a metabolic fingerprinting technique that can potentially be used to deliver results with the same accuracy and sensitivity, within minutes after minimal sample preparation. We tested this hypothesis using attenuated total reflectance (ATR), and high throughput (HT) FT-IR techniques. Three main types of pasteurized milk - whole, semi-skimmed and skimmed - were used and milk was allowed to spoil naturally by incubation at 15 °C. Samples for FT-IR were obtained at frequent, fixed time intervals and pH and total viable counts were also recorded. Multivariate statistical methods, including principal components-discriminant function analysis and partial least squares regression (PLSR), were then used to investigate the relationship between metabolic fingerprints and the total viable counts. FT-IR ATR data for all milks showed reasonable results for bacterial loads above 10^5 cfu ml^-1. By contrast, FT-IR HT provided more accurate results for lower viable bacterial counts down to 10^3 cfu ml^-1 for whole milk and 4 × 10^2 cfu ml^-1 for semi-skimmed and skimmed milk. Using FT-IR with PLSR we were able to acquire a metabolic fingerprint rapidly and quantify the microbial load of milk samples accurately, with very little sample preparation. We believe that metabolic fingerprinting using FT-IR has very good potential for future use in the dairy industry as a rapid method of detection and enumeration.

  6. Rapid determination of sugar level in snack products using infrared spectroscopy.

    PubMed

    Wang, Ting; Rodriguez-Saona, Luis E

    2012-08-01

    Real-time spectroscopic methods can provide a valuable window into food manufacturing to permit optimization of production rate, quality and safety. There is a need for cutting edge sensor technology directed at improving efficiency, throughput and reliability of critical processes. The aim of the research was to evaluate the feasibility of infrared systems combined with chemometric analysis to develop rapid methods for determination of sugars in cereal products. Samples were ground and spectra were collected using a mid-infrared (MIR) spectrometer equipped with a triple-bounce ZnSe MIRacle attenuated total reflectance accessory or Fourier transform near infrared (NIR) system equipped with a diffuse reflection-integrating sphere. Sugar contents were determined using a reference HPLC method. Partial least squares regression (PLSR) was used to create cross-validated calibration models. The predictability of the models was evaluated on an independent set of samples and compared with reference techniques. MIR and NIR spectra showed characteristic absorption bands for sugars, and generated excellent PLSR models (sucrose: SEP < 1.7% and r > 0.96). Multivariate models accurately and precisely predicted sugar level in snacks allowing for rapid analysis. This simple technique allows for reliable prediction of quality parameters, and automation enabling food manufacturers for early corrective actions that will ultimately save time and money while establishing a uniform quality. The U.S. snack food industry generates billions of dollars in revenue each year and vibrational spectroscopic methods combined with pattern recognition analysis could permit optimization of production rate, quality, and safety of many food products. This research showed that infrared spectroscopy is a powerful technique for near real-time (approximately 1 min) assessment of sugar content in various cereal products. © 2012 Institute of Food Technologists®
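
    A cross-validated PLSR calibration of the kind described here can be illustrated with a deliberately small sketch: a single-component PLS1 model (one response, one latent variable) with leave-one-out cross-validation. The study's actual number of latent variables, preprocessing and validation scheme are not given in the abstract, so everything below is an assumption, and the function names are illustrative.

```python
import math


def fit_pls1(X, y):
    """Fit a one-component PLS1 model; returns (x_mean, y_mean, coefficients)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction in X with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    if norm == 0:
        raise ValueError("X and y are uncorrelated; no component found")
    w = [v / norm for v in w]
    # Scores, then the regression of y on the scores.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return x_mean, y_mean, [q * wj for wj in w]


def predict(model, row):
    """Predict the response for one spectrum with a fitted model."""
    x_mean, y_mean, coef = model
    return y_mean + sum(c * (v - m) for c, v, m in zip(coef, row, x_mean))


def loo_rmsecv(X, y):
    """Leave-one-out root mean square error of cross-validation."""
    errors = []
    for i in range(len(X)):
        model = fit_pls1(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        errors.append((predict(model, X[i]) - y[i]) ** 2)
    return math.sqrt(sum(errors) / len(errors))
```

    Real spectra usually need several latent variables, chosen by minimising the cross-validation error; the single component here only keeps the example short.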

  7. Characterization of the biosolids composting process by hyperspectral analysis.

    PubMed

    Ilani, Talli; Herrmann, Ittai; Karnieli, Arnon; Arye, Gilboa

    2016-02-01

    Composted biosolids are widely used as a soil supplement to improve soil quality. However, the application of immature or unstable compost can cause the opposite effect. To date, compost maturation determination is time consuming and cannot be done at the composting site. Hyperspectral spectroscopy was suggested as a simple tool for assessing compost maturity and quality. Nevertheless, there is still a gap in knowledge regarding several compost maturation characteristics, such as dissolved organic carbon, NO3, and NH4 contents. In addition, this approach has not yet been tested on a sample at its natural water content. Therefore, in the current study, hyperspectral analysis was employed in order to characterize the biosolids composting process as a function of composting time. This goal was achieved by correlating the reflectance spectra in the range of 400-2400nm, using the partial least squares-regression (PLS-R) model, with the chemical properties of wet and oven-dried biosolid samples. The results showed that the proposed method can be used as a reliable means to evaluate compost maturity and stability. Specifically, the PLS-R model was found to be an adequate tool to evaluate the biosolids' total carbon and dissolved organic carbon, total nitrogen and dissolved nitrogen, and nitrate content, as well as the absorbance ratio of 254/365nm (E2/E3) and C/N ratios in the dry and wet samples. It failed, however, to predict the ammonium content in the dry samples since the ammonium evaporated during the drying process. It was found that in contrast to what is commonly assumed, the spectral analysis of the wet samples can also be successfully used to build a model for predicting the biosolids' compost maturity. Copyright © 2015 Elsevier Ltd. All rights reserved.

  8. Using continuous intraoperative optical coherence tomography measurements of the aphakic eye for intraocular lens power calculation.

    PubMed

    Hirnschall, Nino; Norrby, Sverker; Weber, Maria; Maedel, Sophie; Amir-Asgari, Sahand; Findl, Oliver

    2015-01-01

    To include intraoperative measurements of the anterior lens capsule of the aphakic eye into the intraocular lens power calculation (IPC) process and to compare the refractive outcome with conventional IPC formulae. In this prospective study, a prototype operating microscope with an integrated continuous optical coherence tomography (OCT) device (Visante attached to OPMI VISU 200, Carl Zeiss Meditec AG, Germany) was used to measure the anterior lens capsule position after implanting a capsular tension ring (CTR). Optical biometry (intraocular lens (IOL) Master 500) and ACMaster measurements (Carl Zeiss Meditec AG, Germany) were performed before surgery. Autorefraction and subjective refraction were performed 3 months after surgery. Conventional IPC formulae were compared with a new intraoperatively measured anterior chamber depth (ACD) (ACDIntraOP) partial least squares regression (PLSR) model for prediction of the postoperative refractive outcome. In total, 70 eyes of 70 patients were included. Mean axial eye length (AL) was 23.3 mm (range: 20.6-29.5 mm). Predictive power of the intraoperative measurements was found to be slightly better compared to conventional IOL power calculations. Refractive error dependency on AL for Holladay I, HofferQ, SRK/T, Haigis and ACDIntraOP PLSR was r(2)=-0.42 (p<0.0001), r(2)=-0.5 (p<0.0001), r(2)=-0.34 (p=0.010), r(2)=-0.28 (p=0.049) and r(2)<0.001 (p=0.866), respectively. ACDIntraOP measurements help to better predict the refractive outcome and could be useful, if implemented in fourth-generation IPC formulae. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  9. Investigating Antibacterial Effects of Garlic (Allium sativum) Concentrate and Garlic-Derived Organosulfur Compounds on Campylobacter jejuni by Using Fourier Transform Infrared Spectroscopy, Raman Spectroscopy, and Electron Microscopy

    PubMed Central

    Lu, Xiaonan; Rasco, Barbara A.; Jabal, Jamie M. F.; Aston, D. Eric; Lin, Mengshi; Konkel, Michael E.

    2011-01-01

    Fourier transform infrared (FT-IR) spectroscopy and Raman spectroscopy were used to study the cell injury and inactivation of Campylobacter jejuni from exposure to antioxidants from garlic. C. jejuni was treated with various concentrations of garlic concentrate and garlic-derived organosulfur compounds in growth media and saline at 4, 22, and 35°C. The antimicrobial activities of the diallyl sulfides increased with the number of sulfur atoms (diallyl sulfide < diallyl disulfide < diallyl trisulfide). FT-IR spectroscopy confirmed that organosulfur compounds are responsible for the substantial antimicrobial activity of garlic, much greater than those of garlic phenolic compounds, as indicated by changes in the spectral features of proteins, lipids, and polysaccharides in the bacterial cell membranes. Confocal Raman microscopy (532-nm-gold-particle substrate) and Raman mapping of a single bacterium confirmed the intracellular uptake of sulfur and phenolic components. Scanning electron microscopy (SEM) and transmission electron microscopy (TEM) were employed to verify cell damage. Principal-component analysis (PCA), discriminant function analysis (DFA), and soft independent modeling of class analogs (SIMCA) were performed, and results were cross validated to differentiate bacteria based upon the degree of cell injury. Partial least-squares regression (PLSR) was employed to quantify and predict actual numbers of healthy and injured bacterial cells remaining following treatment. PLSR-based loading plots were investigated to further verify the changes in the cell membrane of C. jejuni treated with organosulfur compounds. We demonstrated that bacterial injury and inactivation could be accurately investigated by complementary infrared and Raman spectroscopies using a chemical-based, “whole-organism fingerprint” with the aid of chemometrics and electron microscopy. PMID:21642409

  10. Quantification of live Lactobacillus acidophilus in mixed populations of live and killed by application of attenuated reflection Fourier transform infrared spectroscopy combined with chemometrics.

    PubMed

    Toziou, Peristera-Maria; Barmpalexis, Panagiotis; Boukouvala, Paraskevi; Verghese, Susan; Nikolakakis, Ioannis

    2018-05-30

    Since culture-based methods are costly and time consuming, alternative methods are investigated for the quantification of probiotics in commercial products. In this work, ATR-FTIR vibrational spectroscopy was applied for the differentiation and quantification of live Lactobacillus (La 5) in mixed populations of live and killed La 5, in the absence and in the presence of the enteric polymer Eudragit® L 100-55. Suspensions of live bacilli (La 5_L) and bacilli killed in an acidic environment (La 5_K) were prepared, and binary mixtures of different percentages were used to grow cell cultures for colony counting and spectral analysis. The increase in the number of colonies with added % La 5_L to the mixture was log-linear (r² = 0.926). Differentiation of La 5_L from La 5_K was possible directly from the peak area at 1635 cm⁻¹ (amides of proteins and peptides), and a linear relationship between % La 5_L and peak area in the range 0-95% was obtained. Application of partial least squares regression (PLSR) gave reasonable prediction of % La 5_L (RMSEp = 6.48) in binary mixtures of live and killed La 5 but poor prediction (RMSEp = 11.75) when polymer was added to the La 5 mixture. Application of artificial neural networks (ANNs) greatly improved the predictive ability for % La 5_L both in the absence and in the presence of polymer (RMSEp = 8.11 × 10⁻⁸ for La 5 only mixtures and RMSEp = 8.77 × 10⁻⁸ with added polymer) due to their ability to express in the calibration models more hidden spectral information than PLSR. Copyright © 2018 Elsevier B.V. All rights reserved.

  11. Quantitative estimation of soil salinity by means of different modeling methods and visible-near infrared (VIS-NIR) spectroscopy, Ebinur Lake Wetland, Northwest China.

    PubMed

    Wang, Jingzhe; Ding, Jianli; Abulimiti, Aerzuna; Cai, Lianghong

    2018-01-01

    Soil salinization is one of the most common forms of land degradation. The detection and assessment of soil salinity is critical for the prevention of environmental deterioration, especially in arid and semi-arid areas. This study introduced the fractional derivative in the pretreatment of visible and near infrared (VIS-NIR) spectroscopy. The soil samples (n = 400) collected from the Ebinur Lake Wetland, Xinjiang Uyghur Autonomous Region (XUAR), China, were used as the dataset. After measuring the spectral reflectance and salinity in the laboratory, the raw spectral reflectance was preprocessed by means of the absorbance and fractional derivatives of order 0.0-2.0 at intervals of 0.1. Two different modeling methods, namely partial least squares regression (PLSR) and random forest (RF), were used with the preprocessed reflectance for quantifying soil salinity. The results showed that more spectral characteristics were refined for the spectral reflectance treated via fractional derivative. The validation accuracies showed that RF models performed better than PLSR models. The most effective model was established based on RF with the 1.5 order derivative of absorbance, with optimal values of R² (0.93), RMSE (4.57 dS m⁻¹), and RPD (2.78 ≥ 2.50). The developed RF model was stable and accurate in the application of spectral reflectance for determining the soil salinity of the Ebinur Lake wetland. The pretreatment of fractional derivative could be useful for monitoring multiple soil parameters with higher accuracy, which could effectively help to analyze the soil salinity.
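
The validation statistics quoted in the record above (R², RMSE and RPD) can be computed as in the following minimal sketch. It assumes RPD is defined as the sample standard deviation of the reference values divided by the RMSE, which is one common convention and may differ from the authors' exact definition. The function name and the data are hypothetical.

```python
import math

def validation_metrics(observed, predicted):
    """Return (R2, RMSE, RPD) for paired reference and predicted values."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(ss_res / n)
    r2 = 1.0 - ss_res / ss_tot
    # Assumption: RPD = sample SD of the reference values / RMSE.
    sd = math.sqrt(ss_tot / (n - 1))
    rpd = sd / rmse  # undefined (division by zero) for a perfect prediction
    return r2, rmse, rpd

# Hypothetical salinity values, invented purely for illustration.
observed = [2.0, 4.0, 6.0, 8.0, 10.0]
predicted = [2.5, 3.5, 6.0, 8.5, 9.5]
r2, rmse, rpd = validation_metrics(observed, predicted)
print(f"R2={r2:.3f} RMSE={rmse:.3f} RPD={rpd:.2f}")
```

Under this convention, a threshold such as RPD ≥ 2.50 means the prediction error is at most 40% of the spread of the reference values.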

  12. Application of Raman spectroscopy and chemometric techniques to assess sensory characteristics of young dairy bull beef.

    PubMed

    Zhao, Ming; Nian, Yingqun; Allen, Paul; Downey, Gerard; Kerry, Joseph P; O'Donnell, Colm P

    2018-05-01

    This work aims to develop a rapid analytical technique to predict beef sensory attributes using Raman spectroscopy (RS) and to investigate correlations between sensory attributes using chemometric analysis. Beef samples (n = 72) were obtained from young dairy bulls (Holstein-Friesian and Jersey×Holstein-Friesian) slaughtered at 15 and 19 months old. Trained sensory panel evaluation and Raman spectral data acquisition were both carried out on the same longissimus thoracis muscles after ageing for 21 days. The best prediction results were obtained using a Raman frequency range of 1300-2800 cm⁻¹. Prediction performance of partial least squares regression (PLSR) models developed using all samples was moderate to high for all sensory attributes (R²CV values of 0.50-0.84 and RMSECV values of 1.31-9.07) and was particularly high for desirable flavour attributes (R²CVs of 0.80-0.84, RMSECVs of 4.21-4.65). For PLSR models developed on subsets of beef samples, i.e. beef of an identical age or breed type, significant improvements in prediction performance were achieved for overall sensory attributes (R²CVs of 0.63-0.89 and RMSECVs of 0.38-6.88 for each breed type; R²CVs of 0.52-0.89 and RMSECVs of 0.96-6.36 for each age group). Chemometric analysis revealed strong correlations between sensory attributes. Raman spectroscopy combined with chemometric analysis was demonstrated to have high potential as a rapid and non-destructive technique to predict the sensory quality traits of young dairy bull beef. Copyright © 2018. Published by Elsevier Ltd.
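
To illustrate how a cross-validated error such as RMSECV arises, here is a minimal sketch of a single-latent-variable PLS1 model evaluated by leave-one-out cross-validation. It is a simplification for illustration only: the study does not state its number of latent variables or its cross-validation scheme, and all names and data below are hypothetical.

```python
import math

def fit_pls1_one_component(X, y):
    """Fit a one-latent-variable PLS1 model; return (x_means, y_mean, coefficients)."""
    n, p = len(X), len(X[0])
    x_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: the direction in X with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores t = Xc w, then regress y on the scores.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    return x_means, y_mean, [q * wj for wj in w]

def predict(model, x):
    """Predict y for one sample x from a fitted model tuple."""
    x_means, y_mean, coefficients = model
    return y_mean + sum(c * (xj - m) for c, xj, m in zip(coefficients, x, x_means))

def rmsecv_leave_one_out(X, y):
    """Root-mean-square error of leave-one-out cross-validation."""
    sq_errors = []
    for i in range(len(X)):
        model = fit_pls1_one_component(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        sq_errors.append((y[i] - predict(model, X[i])) ** 2)
    return math.sqrt(sum(sq_errors) / len(sq_errors))

# Hypothetical data: three "bands" driven by one latent factor s, with y = 2*s + 1.
scale = [1.0, 2.0, 3.0, 4.0, 5.0]
X = [[s * 1.0, s * 2.0, s * 3.0] for s in scale]
y = [2.0 * s + 1.0 for s in scale]
rmsecv = rmsecv_leave_one_out(X, y)
```

Because the toy data contain exactly one latent factor and no noise, the cross-validated error is numerically zero. Real spectra normally need several components, chosen so that RMSECV is minimal.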

  13. Analysis of volatile compounds in exhaled breath condensate in patients with severe pulmonary arterial hypertension.

    PubMed

    Mansoor, J K; Schelegle, Edward S; Davis, Cristina E; Walby, William F; Zhao, Weixiang; Aksenov, Alexander A; Pasamontes, Alberto; Figueroa, Jennifer; Allen, Roblee

    2014-01-01

    An important challenge to pulmonary arterial hypertension (PAH) diagnosis and treatment is early detection of occult pulmonary vascular pathology. Symptoms are frequently confused with other disease entities that lead to inappropriate interventions and allow for progression to advanced states of disease. There is a significant need to develop new markers for early disease detection and management of PAH. Exhaled breath condensate (EBC) samples were compared from 30 age-matched normal healthy individuals and 27 New York Heart Association functional class III and IV idiopathic pulmonary arterial hypertension (IPAH) patients, a subgroup of PAH. Volatile organic compounds (VOC) in EBC samples were analyzed using gas chromatography/mass spectrometry (GC/MS). Individual peaks in GC profiles were identified in both groups and correlated with pulmonary hemodynamic and clinical endpoints in the IPAH group. Additionally, GC/MS data were analyzed using autoregression followed by partial least squares regression (AR/PLSR) analysis to discriminate between the IPAH and control groups. After correcting for medications, there were 62 unique compounds in the control group, 32 unique compounds in the IPAH group, and 14 in-common compounds between groups. Peak-by-peak analysis of GC profiles of IPAH group EBC samples identified 6 compounds significantly correlated with pulmonary hemodynamic variables important in IPAH diagnosis. AR/PLSR analysis of GC/MS data resulted in a distinct and identifiable metabolic signature for IPAH patients. These findings indicate the utility of EBC VOC analysis to discriminate between severe IPAH and a healthy population; additionally, we identified potential novel biomarkers that correlated with IPAH pulmonary hemodynamic variables that may be important in screening for less severe forms of IPAH.

  14. Analysis of Volatile Compounds in Exhaled Breath Condensate in Patients with Severe Pulmonary Arterial Hypertension

    PubMed Central

    Mansoor, J. K.; Schelegle, Edward S.; Davis, Cristina E.; Walby, William F.; Zhao, Weixiang; Aksenov, Alexander A.; Pasamontes, Alberto; Figueroa, Jennifer; Allen, Roblee

    2014-01-01

    Background: An important challenge to pulmonary arterial hypertension (PAH) diagnosis and treatment is early detection of occult pulmonary vascular pathology. Symptoms are frequently confused with other disease entities that lead to inappropriate interventions and allow for progression to advanced states of disease. There is a significant need to develop new markers for early disease detection and management of PAH. Methodology and Findings: Exhaled breath condensate (EBC) samples were compared from 30 age-matched normal healthy individuals and 27 New York Heart Association functional class III and IV idiopathic pulmonary arterial hypertension (IPAH) patients, a subgroup of PAH. Volatile organic compounds (VOC) in EBC samples were analyzed using gas chromatography/mass spectrometry (GC/MS). Individual peaks in GC profiles were identified in both groups and correlated with pulmonary hemodynamic and clinical endpoints in the IPAH group. Additionally, GC/MS data were analyzed using autoregression followed by partial least squares regression (AR/PLSR) analysis to discriminate between the IPAH and control groups. After correcting for medications, there were 62 unique compounds in the control group, 32 unique compounds in the IPAH group, and 14 in-common compounds between groups. Peak-by-peak analysis of GC profiles of IPAH group EBC samples identified 6 compounds significantly correlated with pulmonary hemodynamic variables important in IPAH diagnosis. AR/PLSR analysis of GC/MS data resulted in a distinct and identifiable metabolic signature for IPAH patients. Conclusions: These findings indicate the utility of EBC VOC analysis to discriminate between severe IPAH and a healthy population; additionally, we identified potential novel biomarkers that correlated with IPAH pulmonary hemodynamic variables that may be important in screening for less severe forms of IPAH. PMID:24748102

  15. Stand-volume estimation from multi-source data for coppiced and high forest Eucalyptus spp. silvicultural systems in KwaZulu-Natal, South Africa

    NASA Astrophysics Data System (ADS)

    Dube, Timothy; Sibanda, Mbulisi; Shoko, Cletah; Mutanga, Onisimo

    2017-10-01

    Forest stand volume is one of the crucial stand parameters, which influences the ability of these forests to provide ecosystem goods and services. This study thus aimed at examining the potential of integrating a multispectral SPOT 5 image with ancillary data (forest age and rainfall metrics) in estimating stand volume between coppiced and planted Eucalyptus spp. in KwaZulu-Natal, South Africa. To achieve this objective, the Partial Least Squares Regression (PLSR) algorithm was used. The PLSR algorithm was implemented in three analysis stages: stage I, using ancillary data as an independent dataset; stage II, using SPOT 5 spectral bands as an independent dataset; and stage III, using combined SPOT 5 spectral bands and ancillary data. The results of the study showed that the use of an independent ancillary dataset better explained the volume of Eucalyptus spp. growing from coppices (adjusted R2 (R2Adj) = 0.54, RMSEP = 44.08 m3/ha) when compared with those that were planted (R2Adj = 0.43, RMSEP = 53.29 m3/ha). Similar results were also observed when SPOT 5 spectral bands were applied as an independent dataset, whereas improved volume estimates were produced when using the combined dataset. For instance, planted Eucalyptus spp. were better predicted (R2 = 0.77, R2Adj = 0.59, RMSEP = 36.02 m3/ha) when compared with those that grow from coppices (R2 = 0.76, R2Adj = 0.46, RMSEP = 40.63 m3/ha). Overall, the findings of this study demonstrated the relevance of multi-source data in ecosystems modelling.

  16. Application of transmission infrared spectroscopy and partial least squares regression to predict immunoglobulin G concentration in dairy and beef cow colostrum.

    PubMed

    Elsohaby, Ibrahim; Windeyer, M Claire; Haines, Deborah M; Homerosky, Elizabeth R; Pearson, Jennifer M; McClure, J Trenton; Keefe, Greg P

    2018-03-06

    The objective of this study was to explore the potential of transmission infrared (TIR) spectroscopy in combination with partial least squares regression (PLSR) for quantification of dairy and beef cow colostral immunoglobulin G (IgG) concentration and assessment of colostrum quality. A total of 430 colostrum samples were collected from dairy (n = 235) and beef (n = 195) cows and tested by a radial immunodiffusion (RID) assay and TIR spectroscopy. Colostral IgG concentrations obtained by the RID assay were linked to the preprocessed spectra and divided into combined and prediction data sets. Three PLSR calibration models were built: one for the dairy cow colostrum only, the second for beef cow colostrum only, and the third for the merged dairy and beef cow colostrum. The predictive performance of each model was evaluated separately using the independent prediction data set. The Pearson correlation coefficients between IgG concentrations as determined by the TIR-based assay and the RID assay were 0.84 for dairy cow colostrum, 0.88 for beef cow colostrum, and 0.92 for the merged set of dairy and beef cow colostrum. The average of the differences between colostral IgG concentrations obtained by the RID- and TIR-based assays were -3.5, 2.7, and 1.4 g/L for dairy, beef, and merged colostrum samples, respectively. Further, the average relative error of the colostral IgG predicted by the TIR spectroscopy from the RID assay was 5% for dairy cow, 1.2% for beef cow, and 0.8% for the merged data set. The average intra-assay CV% of the IgG concentration predicted by the TIR-based method were 3.2%, 2.5%, and 6.9% for dairy cow, beef cow, and merged data set, respectively. The utility of TIR method for assessment of colostrum quality was evaluated using the entire data set and showed that TIR spectroscopy accurately identified the quality status of 91% of dairy cow colostrum, 95% of beef cow colostrum, and 89% and 93% of the merged dairy and beef cow colostrum samples, respectively. The results showed that TIR spectroscopy demonstrates potential as a simple, rapid, and cost-efficient method for use as an estimate of IgG concentration in dairy and beef cow colostrum samples and assessment of colostrum quality. The results also showed that merging the dairy and beef cow colostrum sample data sets improved the predictive ability of the TIR spectroscopy.

  17. Within-field and regional-scale accuracies of topsoil organic carbon content prediction from an airborne visible near-infrared hyperspectral image combined with synchronous field spectra for temperate croplands

    NASA Astrophysics Data System (ADS)

    Vaudour, Emmanuelle; Gilliot, Jean-Marc; Bel, Liliane; Lefevre, Josias; Chehdi, Kacem

    2016-04-01

    This study was carried out in the framework of the TOSCA-PLEIADES-CO of the French Space Agency and benefited from data from the earlier PROSTOCK-Gessol3 project supported by the French Environment and Energy Management Agency (ADEME). It aimed at identifying the potential of airborne hyperspectral visible near-infrared AISA-Eagle data for predicting the topsoil organic carbon (SOC) content of bare cultivated soils over a large peri-urban area (221 km²) with intensive annual crop cultivation and both contrasted soils and SOC contents, located in the western region of Paris, France. Soils comprise hortic or glossic luvisols, calcaric, rendzic cambisols and colluvic cambisols. Airborne AISA-Eagle images (400-1000 nm, 126 bands) with 1 m-resolution were acquired on 17 April 2013 over 13 tracks. Tracks were atmospherically corrected then mosaicked at a 2 m-resolution using a set of 24 synchronous field spectra of bare soils, black and white targets and impervious surfaces. The land use identification system layer (RPG) of 2012 was used to mask non-agricultural areas, then calculation and thresholding of NDVI from an atmospherically corrected SPOT4 image acquired the same day made it possible to map agricultural fields with bare soil. A total of 101 sites, which were sampled either at the regional scale or within one field, were identified as bare by means of this map. Predictions were made from the mosaic AISA spectra, which were related to SOC contents by means of partial least squares regression (PLSR). Regression robustness was evaluated through a series of 1000 bootstrap data sets of calibration-validation samples, considering only those 75 sites outside cloud shadows, and different sampling strategies for selecting calibration samples. Validation root-mean-square errors (RMSE) ranged between 3.73 and 4.49 g kg⁻¹, with a median of ~4 g kg⁻¹. The best-performing models in terms of coefficient of determination (R²) and Residual Prediction Deviation (RPD) values were the calibration models derived either from Kennard-Stone or conditioned Latin Hypercube sampling on smoothed spectra. However, the most generalizable model, leading to the lowest RMSE values of 3.73 g kg⁻¹ at the regional scale and 1.44 g kg⁻¹ at the within-field scale and a low validation bias, was the cross-validated leave-one-out PLSR model constructed with the 28 near-synchronous samples and raw spectra.
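
Kennard-Stone sampling, named in the record above as one strategy for picking calibration samples, can be sketched as follows. The sketch assumes plain Euclidean distance on the raw feature vectors; in practice distances are often computed on principal component scores, and the record does not say which variant was used. The function name and the toy points are hypothetical.

```python
import math

def kennard_stone(samples, n_select):
    """Return indices of n_select samples chosen by the Kennard-Stone rule."""
    n = len(samples)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of samples")
    dist = [[math.dist(a, b) for b in samples] for a in samples]
    # Start with the two most distant samples.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = [i for i in range(n) if i not in selected]
    while len(selected) < n_select:
        # Add the candidate that is farthest from its nearest selected sample.
        best = max(remaining, key=lambda i: min(dist[i][s] for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected

# Hypothetical two-feature "spectra", invented for illustration.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (5.0, 0.0), (10.0, 0.0)]
calibration_idx = kennard_stone(points, 4)
print(calibration_idx)  # [0, 4, 3, 2]
```

The selected indices form the calibration set and the rest can serve for validation. The aim is a calibration set that spans the measured variability evenly.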

  18. Egg embryo development detection with hyperspectral imaging

    NASA Astrophysics Data System (ADS)

    Lawrence, Kurt C.; Smith, Douglas P.; Windham, William R.; Heitschmidt, Gerald W.; Park, Bosoon

    2006-10-01

    In the U.S. egg industry, anywhere from 130 million to over one billion infertile eggs are incubated each year. Some of these infertile eggs explode in the hatching cabinet and can potentially spread molds or bacteria to all the eggs in the cabinet. A method to detect the embryo development of incubated eggs was developed. Twelve brown-shell hatching eggs from two replicates (n=24) were incubated and imaged to identify embryo development. A hyperspectral imaging system was used to collect transmission images from 420 to 840 nm of brown-shell eggs positioned with the air cell vertical and normal to the camera lens. Raw transmission images from about 400 to 900 nm were collected for every egg on days 0, 1, 2, and 3 of incubation. A total of 96 images were collected and eggs were broken out on day 6 to determine fertility. After breakout, all eggs were found to be fertile. Therefore, this paper presents results for egg embryo development, not fertility. The original hyperspectral data and spectral means for each egg were both used to create embryo development models. With the hyperspectral data range reduced to about 500 to 700 nm, a minimum noise fraction transformation was used, along with a Mahalanobis Distance classification model, to predict development. Days 2 and 3 were all correctly classified (100%), while day 0 and day 1 were classified at 95.8% and 91.7%, respectively. Alternatively, the mean spectra from each egg were used to develop a partial least squares regression (PLSR) model. First, a PLSR model was developed with all eggs and all days. The data were multiplicative scatter corrected, spectrally smoothed, and the wavelength range was reduced to 539 - 770 nm. With a one-out cross validation, all eggs for all days were correctly classified (100%). Second, a PLSR model was developed with data from day 0 and day 3, and the model was validated with data from day 1 and 2. For day 1, 22 of 24 eggs were correctly classified (91.7%) and for day 2, all eggs were correctly classified (100%). Although the results are based on relatively small sample sizes, they are encouraging. However, larger sample sizes, from multiple flocks, will be needed to fully validate and verify these models. Additionally, future experiments must also include non-fertile eggs so the fertile / non-fertile effect can be determined.
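
The multiplicative scatter correction step mentioned in the record above can be sketched as follows, assuming the usual formulation in which each spectrum is regressed on the mean spectrum and then rescaled. The record does not describe the authors' exact implementation, and the function name and the toy spectra are hypothetical.

```python
def multiplicative_scatter_correction(spectra):
    """Return (reference, corrected) where the reference is the mean spectrum."""
    n_bands = len(spectra[0])
    reference = [sum(s[k] for s in spectra) / len(spectra) for k in range(n_bands)]
    ref_mean = sum(reference) / n_bands
    ref_ss = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / n_bands
        # Least-squares fit of s = intercept + slope * reference.
        slope = sum((r - ref_mean) * (v - s_mean) for r, v in zip(reference, s)) / ref_ss
        intercept = s_mean - slope * ref_mean
        # Remove the additive offset and the multiplicative scaling.
        corrected.append([(v - intercept) / slope for v in s])
    return reference, corrected

# Hypothetical spectra: one base shape with different offsets and gains.
base = [1.0, 2.0, 4.0, 3.0]
spectra = [
    base,
    [2.0 * b + 1.0 for b in base],
    [0.5 * b - 0.2 for b in base],
]
reference, corrected = multiplicative_scatter_correction(spectra)
```

In this toy case every spectrum is an exact affine transform of the same shape, so all corrected spectra collapse onto the mean spectrum. With real data, the chemical differences between samples remain after correction.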

  19. [Research on Oil Sands Spectral Characteristics and Oil Content by Remote Sensing Estimation].

    PubMed

    You, Jin-feng; Xing, Li-xin; Pan, Jun; Shan, Xuan-long; Liang, Li-heng; Fan, Rui-xue

    2015-04-01

    Visible and near infrared spectroscopy is a proven technology widely used in the identification and exploration of hydrocarbon energy sources, with high spectral resolution for detailed diagnostic absorption characteristics of hydrocarbon groups. The most prominent regions for hydrocarbon absorption bands in the reflectance of oil sands samples are 1,740-1,780, 2,300-2,340 and 2,340-2,360 nm. These spectral ranges are dominated by various overlapping C-H overtones and combination bands. Meanwhile, absorption in the region from 1,700 to 1,730 nm is relatively weak or even absent in the spectra of oil sands samples with low bitumen content. With increasing oil content, obvious hydrocarbon absorption begins to appear in the spectral range of 1,700-1,730 nm. The bitumen content is the critical parameter for oil sands reserves estimation. The absorption depth was used to depict the response intensity of the absorption bands controlled by first-order overtones and combinations of the various C-H stretching and bending fundamentals. According to the Pearson and partial correlation relationships between oil content and absorption depth dominated by hydrocarbon groups in the 1,740-1,780, 2,300-2,340 and 2,340-2,360 nm wavelength ranges, an association scheme was established between the intensity of spectral response and bitumen content, and then unary linear regression (ULR) and partial least squares regression (PLSR) methods were employed to model the relationship between absorption depth attributed to the various C-H bonds and bitumen content. Two calibration equations were obtained: the ULR method was employed to model the relationship between absorption depth near the 2,350 nm region and bitumen content, and the PLSR method was used to model the relationship between absorption depth in the 1,758, 2,310 and 2,350 nm regions and oil content. The calibration models had good predictive ability and high robustness, and they could provide a scientific basis for rapid estimation of oil content in oil sands in the future.

  20. In-line and Real-time Monitoring of Resonant Acoustic Mixing by Near-infrared Spectroscopy Combined with Chemometric Technology for Process Analytical Technology Applications in Pharmaceutical Powder Blending Systems.

    PubMed

    Tanaka, Ryoma; Takahashi, Naoyuki; Nakamura, Yasuaki; Hattori, Yusuke; Ashizawa, Kazuhide; Otsuka, Makoto

    2017-01-01

    Resonant acoustic® mixing (RAM) technology is a system that performs high-speed mixing by vibration through the control of acceleration and frequency. In recent years, real-time process monitoring and prediction has become of increasing interest, and process analytical technology (PAT) systems will be increasingly introduced into actual manufacturing processes. This study examined the application of PAT with the combination of RAM, near-infrared spectroscopy, and chemometric technology as a set of PAT tools for introduction into actual pharmaceutical powder blending processes. Content uniformity was based on a robust partial least squares regression (PLSR) model constructed to manage the RAM configuration parameters and the changing concentration of the components. As a result, real-time monitoring may be possible and could be successfully demonstrated for in-line real-time prediction of active pharmaceutical ingredients and other additives using chemometric technology. This system is expected to be applicable to the RAM method for the risk management of quality.

  1. A study on the use of near-infrared spectroscopy for the rapid quantification of major compounds in Tanreqing injection

    NASA Astrophysics Data System (ADS)

    Li, Wenlong; Cheng, Zhiwei; Wang, Yuefei; Qu, Haibin

    2013-01-01

    In this paper we describe the strategy used in the development and validation of a near infrared spectroscopy method for the rapid determination of baicalin, chlorogenic acid, ursodeoxycholic acid (UDCA), chenodeoxycholic acid (CDCA), and the total solid contents (TSCs) in the Tanreqing injection. To increase the representativeness of the calibration sample set, a concentrating-diluting method was adopted to artificially prepare samples. Partial least square regression (PLSR) was used to establish calibration models, with which the five quality indicators can be determined with satisfactory accuracy and repeatability. In addition, the slope/bias (S/B) method was used for model transfer between two different types of NIR instruments from the same manufacturer, which helps to enlarge the application range of the established models. With the presented method, a great deal of time, effort and money can be saved when large amounts of Tanreqing injection samples need to be analyzed in a relatively short period of time, which is of great significance to the traditional Chinese medicine (TCM) industries.
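
The slope/bias idea mentioned in the record above can be sketched as follows. The sketch assumes the common formulation in which predictions obtained on the second instrument for a few transfer samples are regressed against their reference values, and the fitted line is then applied to later predictions. The function names and the numbers are hypothetical and not taken from the paper.

```python
def fit_slope_bias(secondary_predictions, reference_values):
    """Least-squares line mapping secondary-instrument predictions to reference values."""
    n = len(reference_values)
    mean_x = sum(secondary_predictions) / n
    mean_y = sum(reference_values) / n
    sxx = sum((x - mean_x) ** 2 for x in secondary_predictions)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(secondary_predictions, reference_values))
    slope = sxy / sxx
    bias = mean_y - slope * mean_x
    return slope, bias

def apply_slope_bias(predictions, slope, bias):
    """Correct new secondary-instrument predictions with the fitted line."""
    return [slope * p + bias for p in predictions]

# Hypothetical transfer samples: the second instrument reads systematically low.
secondary = [1.0, 2.0, 3.0, 4.0]
reference = [1.5 * p + 0.2 for p in secondary]
slope, bias = fit_slope_bias(secondary, reference)
corrected = apply_slope_bias([5.0], slope, bias)
```

This correction leaves the calibration model itself unchanged and only post-processes its output, so it suits cases where the instruments differ by a roughly linear systematic shift.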

  2. The Use of Partial Least Square Regression and Spectral Data in UV-Visible Region for Quantification of Adulteration in Indonesian Palm Civet Coffee

    PubMed Central

    Yulia, Meinilwita

    2017-01-01

    Asian palm civet coffee or kopi luwak (Indonesian words for coffee and palm civet) is well known as the world's priciest and rarest coffee. To protect the authenticity of luwak coffee and protect consumers from luwak coffee adulteration, it is very important to develop a robust and simple method for determining the adulteration of luwak coffee. In this research, the use of UV-Visible spectra combined with PLSR was evaluated to establish rapid and simple methods for quantification of adulteration in luwak-arabica coffee blend. Several preprocessing methods were tested, and the results show that most of the preprocessed spectra were effective in improving the quality of calibration models. The best PLS calibration model was obtained with Savitzky-Golay smoothed spectra, which had the lowest RMSECV (0.039) and highest RPDcal value (4.64). Using this PLS model, a prediction for quantification of luwak content was calculated and resulted in satisfactory prediction performance with high RPDp and RER values. PMID:28913348
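
Savitzky-Golay smoothing, the preprocessing that gave the best calibration in the record above, can be illustrated with the classic 5-point quadratic filter. The window length and polynomial order used in the study are not given, so the 5-point window here is an assumption, as is the choice to leave the edge points unchanged. The function name and the data are hypothetical.

```python
def savitzky_golay_5pt(signal):
    """Smooth with the 5-point quadratic Savitzky-Golay filter.

    The first and last two points are copied unchanged (a simplification;
    libraries usually offer several edge-handling modes).
    """
    weights = (-3, 12, 17, 12, -3)  # standard 5-point coefficients, normalised by 35
    smoothed = list(signal)
    for i in range(2, len(signal) - 2):
        window = signal[i - 2:i + 3]
        smoothed[i] = sum(w * v for w, v in zip(weights, window)) / 35
    return smoothed

# Hypothetical absorbance trace with a single noisy spike at index 3.
spiky = [1.0, 1.0, 1.0, 8.0, 1.0, 1.0, 1.0]
smoothed_spiky = savitzky_golay_5pt(spiky)

# A purely quadratic trace passes through the filter unchanged.
quadratic = [float(x * x) for x in range(7)]
smoothed_quadratic = savitzky_golay_5pt(quadratic)
```

The filter fits a local polynomial in each window, so it damps isolated noise (the spike drops from 8.0 to 4.4) while preserving smooth band shapes better than a plain moving average.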

  3. Rapid Measurement of Soil Carbon in Rice Paddy Field of Lombok Island Indonesia Using Near Infrared Technology

    NASA Astrophysics Data System (ADS)

    Kusumo, B. H.; Sukartono, S.; Bustan, B.

    2018-02-01

    Measuring soil organic carbon (C) using conventional analysis is a tedious, time-consuming and expensive procedure. A simple procedure that is cheap and saves time is needed. Near infrared technology offers a rapid procedure, as it works on the basis of soil spectral reflectance and requires no chemicals. The aim of this research is to test whether this technology is able to rapidly measure soil organic C in rice paddy fields. Soil samples were collected from rice paddy fields of Lombok Island, Indonesia, and the coordinates of the samples were recorded. Parts of the samples were analysed using conventional analysis (Walkley and Black), and other parts were scanned using near infrared spectroscopy (NIRS) for soil spectral collection. Partial Least Square Regression (PLSR) models were developed using the soil C data from conventional analysis and the soil spectral reflectance data. The models were moderately successful in measuring soil C in rice paddy fields of Lombok Island. This shows that NIR technology can be further used to monitor C change in rice paddy soil.

  4. Application of Fourier-transform mid infrared spectroscopy for the monitoring of pound cakes quality during storage.

    PubMed

    Nhouchi, Zeineb; Karoui, Romdhane

    2018-06-30

    The aim of the present study was to investigate the ability of mid-infrared (MIR) spectroscopy and a texture analyzer to evaluate the quality of pound cake samples produced with palm oil and rapeseed oil throughout storage. The MIR spectra analyzed by using principal component analysis (PCA) showed a clear separation of pound cakes as a function of the storage time and the nature of the oil used in the recipe. By applying partial least square regression (PLSR), excellent prediction was obtained for hardness (R² = 0.91; RPD = 2.26), while an approximate qualitative prediction was found for springiness (R² = 0.73; RPD = 2.07), cohesiveness (R² = 0.67; RPD = 1.31) and resilience (R² = 0.65; RPD = 1.24). It could be concluded that MIR spectroscopy could be used as a rapid and non-destructive technique for monitoring the texture of pound cakes throughout storage as well as for the prediction of their hardness. Copyright © 2018 Elsevier Ltd. All rights reserved.

  5. Application of handheld and portable spectrometers for screening acrylamide content in commercial potato chips.

    PubMed

    Ayvaz, Huseyin; Rodriguez-Saona, Luis E

    2015-05-01

    The most common methods for acrylamide analysis in foods require the use of LC-MS/MS and GC-MS. Although these methods have great analytical performance, they need intensive sample preparation, highly specialised instrumentation, and are time consuming. In this study, portable and handheld infrared spectrometers were evaluated as rapid methods for screening acrylamide in potato chips and their performances were compared to those of benchtop infrared systems. The acrylamide content of 64 commercial potato chips (169-2453 μg/kg) was determined by LC-MS/MS. Spectral data were collected using mid-infrared (MIR) and near-infrared (NIR) spectrometers. Partial least squares regression (PLSR) calibration models were developed to predict acrylamide levels. Overall, good linear correlation was found between the predicted acrylamide levels and actual measured acrylamide concentrations by LC-MS/MS (rPred > 0.90 and SEP < 100 μg/kg). Our results indicate that portable and handheld spectrometers can be used as simple and rapid alternatives for acrylamide analysis in potato chips. Copyright © 2014 Elsevier Ltd. All rights reserved.

  6. Towards decadal soil salinity mapping using Landsat time series data

    NASA Astrophysics Data System (ADS)

    Fan, Xingwang; Weng, Yongling; Tao, Jinmei

    2016-10-01

    Salinization is one of the major soil problems around the world. However, decadal variation in soil salinization has not yet been extensively reported. This study exploited thirty years (1985-2015) of Landsat sensor data, including Landsat-4/5 TM (Thematic Mapper), Landsat-7 ETM+ (Enhanced Thematic Mapper Plus) and Landsat-8 OLI (Operational Land Imager), for monitoring soil salinity of the Yellow River Delta, China. The data were initially corrected for atmospheric effects, and then matched to the spectral bands of EO-1 (Earth Observing One) ALI (Advanced Land Imager). Subsequently, soil salinity maps were derived with a previously developed PLSR (Partial Least Square Regression) model. On the intra-annual scale, the retrievals showed that soil salinity increased in February, stabilized in March, and decreased in April. On the inter-annual scale, soil salinity decreased within 1985-2000 (-0.74 g kg-1/10a, p < 0.001), and increased within 2000-2015 (0.79 g kg-1/10a, p < 0.001). Our study presents a new perspective for the use of multiple Landsat data in soil salinity retrieval, and furthers the understanding of soil salinization development over the Yellow River Delta.

  7. Prediction of warmed-over flavour development in cooked chicken by colorimetric sensor array.

    PubMed

    Kim, Su-Yeon; Li, Jinglei; Lim, Na-Ri; Kang, Bo-Sik; Park, Hyun-Jin

    2016-11-15

    The aim of this study was to develop a simple and rapid method based on colorimetric sensor array (CSA) for evaluation of warmed-over flavour (WOF) in cooked chicken. All samples were classified according to storage time by CSA coupled with principal component analysis (PCA) or hierarchical cluster analysis (HCA). The CSA data were used to establish prediction models with thiobarbituric acid reactive substances (TBARS), pentanal, hexanal, or heptanal associated with WOF by partial least square regression (PLSR). For the TBARS model, the coefficient of determination (rp(2)) was 0.9997 in the prediction range of 0.28-0.69 mg/kg. In each of the models for pentanal, hexanal, and heptanal, all rp(2) were higher than 0.960 in the range of 0.58-2.10 mg/kg, 5.50-11.69 mg/kg, and 0.09-0.16 mg/kg, respectively. These results demonstrate that the CSA was able to predict WOF development and to distinguish between each storage time. Copyright © 2016 Elsevier Ltd. All rights reserved.

  8. Laser ablation molecular isotopic spectroscopy (LAMIS) towards the determination of multivariate LODs via PLS calibration model of 10B and 11B Boric acid mixtures

    NASA Astrophysics Data System (ADS)

    Harris, C. D.; Profeta, Luisa T. M.; Akpovo, Codjo A.; Johnson, Lewis; Stowe, Ashley C.

    2017-05-01

    A calibration model was created to illustrate the detection capabilities of laser ablation molecular isotopic spectroscopy (LAMIS) discrimination in isotopic analysis. The sample set contained boric acid pellets that varied in isotopic concentrations of 10B and 11B. Each sample set was interrogated with a Q-switched Nd:YAG ablation laser operating at 532 nm. A minimum of four band heads of the β system B 2Σ → X 2Σ transitions were identified and verified with previous literature on BO molecular emission lines. Isotopic shifts were observed in the spectra for each transition and used as the predictors in the calibration model. The spectra along with their respective 10/11B isotopic ratios were analyzed using Partial Least Squares Regression (PLSR). An IUPAC novel approach for determining a multivariate Limit of Detection (LOD) interval was used to predict the detection of the desired isotopic ratios. The predicted multivariate LOD is dependent on the variation of the instrumental signal and other composites in the calibration model space.

  9. [Estimation of organic matter content of north fluvo-aquic soil based on the coupling model of wavelet transform and partial least squares].

    PubMed

    Wang, Yan-Cang; Yang, Gui-Jun; Zhu, Jin-Shan; Gu, Xiao-He; Xu, Peng; Liao, Qin-Hong

    2014-07-01

    To improve the estimation accuracy of the soil organic matter content of the north fluvo-aquic soil, the wavelet transform was introduced. The soil samples were collected from Tongzhou district and Shunyi district in Beijing city, and the data source was soil hyperspectral data obtained under laboratory conditions. First, the discrete wavelet transform decomposed the hyperspectral data into approximate coefficients and detail coefficients. Then, the correlation between the approximate coefficients, the detail coefficients and organic matter content was analyzed, and the bands sensitive to organic matter were screened. Finally, models were established to estimate the soil organic matter content by using partial least squares regression (PLSR). Results show that the NIR bands made more contributions than the visible band in the models estimating organic matter content; the ability of approximate coefficients to estimate organic matter content is better than that of detail coefficients; the estimation precision of the detail coefficients for soil organic matter content decreases as the spectral resolution becomes lower; compared with the three commonly used types of soil spectral reflectance transforms, the wavelet transform can improve the ability of soil spectra to estimate organic matter content; the accuracy of the best model established by the approximate coefficients or detail coefficients is higher, and the coefficient of determination (R2) and the root mean square error (RMSE) of the best model for approximate coefficients are 0.722 and 0.221, respectively. The R2 and RMSE of the best model for detail coefficients are 0.670 and 0.255, respectively.

  10. Prediction of aged red wine aroma properties from aroma chemical composition. Partial least squares regression models.

    PubMed

    Aznar, Margarita; López, Ricardo; Cacho, Juan; Ferreira, Vicente

    2003-04-23

    Partial least squares regression (PLSR) models able to predict some of the wine aroma nuances from its chemical composition have been developed. The aromatic sensory characteristics of 57 Spanish aged red wines were determined by 51 experts from the wine industry. The individual descriptions given by the experts were recorded, and the frequency with which a sensory term was used to define a given wine was taken as a measurement of its intensity. The aromatic chemical composition of the wines was determined by already published gas chromatography (GC)-flame ionization detector and GC-mass spectrometry methods. In total, 69 odorants were analyzed. Both matrices, the sensory and chemical data, were simplified by grouping and rearranging correlated sensory terms or chemical compounds and by the exclusion of secondary aroma terms or of weak aroma chemicals. Finally, models were developed for 18 sensory terms and 27 chemicals or groups of chemicals. Satisfactory models, explaining more than 45% of the original variance, could be found for nine of the most important sensory terms (wood-vanillin-cinnamon, animal-leather-phenolic, toasted-coffee, old wood-reduction, vegetal-pepper, raisin-flowery, sweet-candy-cacao, fruity, and berry fruit). For this set of terms, the correlation coefficients between the measured and predicted Y (determined by cross-validation) ranged from 0.62 to 0.81. Models confirmed the existence of complex multivariate relationships between chemicals and odors. In general, pleasant descriptors were positively correlated to chemicals with pleasant aroma, such as vanillin, beta damascenone, or (E)-beta-methyl-gamma-octalactone, and negatively correlated to compounds showing less favorable odor properties, such as 4-ethyl and vinyl phenols, 3-(methylthio)-1-propanol, or phenylacetaldehyde.

  11. Improved explanation of human intelligence using cortical features with second order moments and regression.

    PubMed

    Park, Hyunjin; Yang, Jin-ju; Seo, Jongbum; Choi, Yu-yong; Lee, Kun-ho; Lee, Jong-min

    2014-04-01

    Cortical features derived from magnetic resonance imaging (MRI) provide important information to account for human intelligence. Cortical thickness, surface area, sulcal depth, and mean curvature were considered to explain human intelligence. One region of interest (ROI) of a cortical structure consisting of thousands of vertices contained thousands of measurements, and typically, one mean value (first order moment), was used to represent a chosen ROI, which led to a potentially significant loss of information. We proposed a technological improvement to account for human intelligence in which a second moment (variance) in addition to the mean value was adopted to represent a chosen ROI, so that the loss of information would be less severe. Two computed moments for the chosen ROIs were analyzed with partial least squares regression (PLSR). Cortical features for 78 adults were measured and analyzed in conjunction with the full-scale intelligence quotient (FSIQ). Our results showed that 45% of the variance of the FSIQ could be explained using the combination of four cortical features using two moments per chosen ROI. Our results showed improvement over using a mean value for each ROI, which explained 37% of the variance of FSIQ using the same set of cortical measurements. Our results suggest that using additional second order moments is potentially better than using mean values of chosen ROIs for regression analysis to account for human intelligence. Copyright © 2014 Elsevier Ltd. All rights reserved.
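
    The two-moment representation described above can be sketched as follows. This is only an illustration of turning per-vertex measurements into a mean and a variance per ROI; the ROI names and values are hypothetical, and the use of the population variance is an assumption, since the abstract does not say which estimator was used.

```python
from statistics import fmean, pvariance


def roi_moment_features(roi_values):
    """Map {roi_name: per-vertex measurements} to [mean, variance, ...]."""
    features = []
    for name in sorted(roi_values):  # fixed ROI order keeps columns aligned
        values = roi_values[name]
        features.append(fmean(values))      # first-order moment
        features.append(pvariance(values))  # second-order (central) moment
    return features


# Hypothetical cortical-thickness samples (mm) for two ROIs of one subject.
subject = {
    "roi_a": [2.1, 2.4, 2.6, 2.3],
    "roi_b": [3.0, 3.0, 3.1, 2.9],
}
print(roi_moment_features(subject))
```

    One such feature vector per subject would form a row of the predictor matrix passed to the regression step.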

  12. Assessing Wheat Traits by Spectral Reflectance: Do We Really Need to Focus on Predicted Trait-Values or Directly Identify the Elite Genotypes Group?

    PubMed Central

    Garriga, Miguel; Romero-Bravo, Sebastián; Estrada, Félix; Escobar, Alejandro; Matus, Iván A.; del Pozo, Alejandro; Astudillo, Cesar A.; Lobos, Gustavo A.

    2017-01-01

    Phenotyping, via remote and proximal sensing techniques, of the agronomic and physiological traits associated with yield potential and drought adaptation could contribute to improvements in breeding programs. In the present study, 384 genotypes of wheat (Triticum aestivum L.) were tested under fully irrigated (FI) and water stress (WS) conditions. The following traits were evaluated and assessed via spectral reflectance: Grain yield (GY), spikes per square meter (SM2), kernels per spike (KPS), thousand-kernel weight (TKW), chlorophyll content (SPAD), stem water soluble carbohydrate concentration and content (WSC and WSCC, respectively), carbon isotope discrimination (Δ13C), and leaf area index (LAI). The performances of spectral reflectance indices (SRIs), four regression algorithms (PCR, PLSR, ridge regression RR, and SVR), and three classification methods (PCA-LDA, PLS-DA, and kNN) were evaluated for the prediction of each trait. For the classification approaches, two classes were established for each trait: The lower 80% of the trait variability range (Class 1) and the remaining 20% (Class 2 or elite genotypes). Both the SRIs and regression methods performed better when data from FI and WS were combined. The traits that were best estimated by SRIs and regression methods were GY and Δ13C. For most traits and conditions, the estimations provided by RR and SVR were the same, or better than, those provided by the SRIs. PLS-DA showed the best performance among the categorical methods and, unlike the SRI and regression models, most traits were relatively well-classified within a specific hydric condition (FI or WS), proving that classification approach is an effective tool to be explored in future studies related to genotype selection. PMID:28337210
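
    The two-class labelling described above can be sketched as below. The sketch assumes that "the lower 80% of the trait variability range" refers to the interval from the minimum to 80% of the way to the maximum, that higher trait values are the desirable ones, and that a value exactly on the threshold falls in Class 1; none of these details is stated in the abstract, and the function name is hypothetical.

```python
def label_elite(trait_values, elite_fraction=0.2):
    """Return 1 (lower part of the range) or 2 (elite) for each trait value."""
    lo, hi = min(trait_values), max(trait_values)
    # Threshold placed elite_fraction of the range below the maximum.
    threshold = hi - elite_fraction * (hi - lo)
    return [2 if value > threshold else 1 for value in trait_values]


# Hypothetical grain-yield values (t/ha), not data from the study.
yields = [0.0, 2.0, 5.0, 8.0, 9.0, 10.0]
print(label_elite(yields))
```

    Because the split is made on the value range rather than on the ranked genotypes, the elite class will usually hold fewer than 20% of the genotypes.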

  13. Quantitative monitoring of sucrose, reducing sugar and total sugar dynamics for phenotyping of water-deficit stress tolerance in rice through spectroscopy and chemometrics

    NASA Astrophysics Data System (ADS)

    Das, Bappa; Sahoo, Rabi N.; Pargal, Sourabh; Krishna, Gopal; Verma, Rakesh; Chinnusamy, Viswanathan; Sehgal, Vinay K.; Gupta, Vinod K.; Dash, Sushanta K.; Swain, Padmini

    2018-03-01

    In the present investigation, the changes in sucrose, reducing and total sugar content due to water-deficit stress in rice leaves were modeled using visible, near infrared (VNIR) and shortwave infrared (SWIR) spectroscopy. The objectives of the study were to identify the best vegetation indices and a suitable multivariate technique based on precise analysis of hyperspectral data (350 to 2500 nm) and sucrose, reducing sugar and total sugar content measured at different stress levels from 16 different rice genotypes. Spectral data analysis was done to identify suitable spectral indices and models for sucrose estimation. Novel spectral indices in the near infrared (NIR) range, viz. the ratio spectral index (RSI) and normalised difference spectral index (NDSI), sensitive to sucrose, reducing sugar and total sugar content were identified, which were subsequently calibrated and validated. The RSI and NDSI models had R2 values of 0.65, 0.71 and 0.67 and RPD values of 1.68, 1.95 and 1.66 for sucrose, reducing sugar and total sugar, respectively, for the validation dataset. Different multivariate spectral models such as artificial neural network (ANN), multivariate adaptive regression splines (MARS), multiple linear regression (MLR), partial least square regression (PLSR), random forest regression (RFR) and support vector machine regression (SVMR) were also evaluated. The best performing multivariate models for sucrose, reducing sugars and total sugars were found to be MARS, ANN and MARS, respectively, with RPD values of 2.08, 2.44, and 1.93. Results indicated that VNIR and SWIR spectroscopy combined with multivariate calibration can be used as a reliable alternative to conventional methods for measurement of sucrose, reducing sugars and total sugars of rice under water-deficit stress, as this technique is fast, economic, and noninvasive.
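
    The abstract does not give the band pairs or the exact formulas used for the indices. The sketch below assumes the usual two-band forms (a simple ratio for RSI and a normalised difference for NDSI); the wavelengths and reflectance values are hypothetical.

```python
def rsi(r_i, r_j):
    """Ratio spectral index of reflectances at two bands."""
    return r_i / r_j


def ndsi(r_i, r_j):
    """Normalised difference spectral index of reflectances at two bands."""
    return (r_i - r_j) / (r_i + r_j)


# Hypothetical leaf reflectance keyed by wavelength in nm.
spectrum = {1100: 0.60, 1300: 0.20}
print(rsi(spectrum[1100], spectrum[1300]))
print(ndsi(spectrum[1100], spectrum[1300]))
```

    In index-screening studies such functions are typically evaluated over many band pairs, and the pair most strongly related to the measured constituent is retained.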

  14. Assessing Wheat Traits by Spectral Reflectance: Do We Really Need to Focus on Predicted Trait-Values or Directly Identify the Elite Genotypes Group?

    PubMed

    Garriga, Miguel; Romero-Bravo, Sebastián; Estrada, Félix; Escobar, Alejandro; Matus, Iván A; Del Pozo, Alejandro; Astudillo, Cesar A; Lobos, Gustavo A

    2017-01-01

    Phenotyping, via remote and proximal sensing techniques, of the agronomic and physiological traits associated with yield potential and drought adaptation could contribute to improvements in breeding programs. In the present study, 384 genotypes of wheat (Triticum aestivum L.) were tested under fully irrigated (FI) and water stress (WS) conditions. The following traits were evaluated and assessed via spectral reflectance: Grain yield (GY), spikes per square meter (SM2), kernels per spike (KPS), thousand-kernel weight (TKW), chlorophyll content (SPAD), stem water soluble carbohydrate concentration and content (WSC and WSCC, respectively), carbon isotope discrimination (Δ13C), and leaf area index (LAI). The performances of spectral reflectance indices (SRIs), four regression algorithms (PCR, PLSR, ridge regression RR, and SVR), and three classification methods (PCA-LDA, PLS-DA, and kNN) were evaluated for the prediction of each trait. For the classification approaches, two classes were established for each trait: The lower 80% of the trait variability range (Class 1) and the remaining 20% (Class 2 or elite genotypes). Both the SRIs and regression methods performed better when data from FI and WS were combined. The traits that were best estimated by SRIs and regression methods were GY and Δ13C. For most traits and conditions, the estimations provided by RR and SVR were the same, or better than, those provided by the SRIs. PLS-DA showed the best performance among the categorical methods and, unlike the SRI and regression models, most traits were relatively well-classified within a specific hydric condition (FI or WS), proving that classification approach is an effective tool to be explored in future studies related to genotype selection.

  15. Quantitative detection of codeine in human plasma using surface-enhanced Raman scattering via adaptation of the isotopic labelling principle.

    PubMed

    Subaihi, Abdu; Muhamadali, Howbeer; Mutter, Shaun T; Blanch, Ewan; Ellis, David I; Goodacre, Royston

    2017-03-27

    In this study surface enhanced Raman scattering (SERS) combined with the isotopic labelling (IL) principle has been used for the quantification of codeine spiked into both water and human plasma. Multivariate statistical approaches were employed for the analysis of these SERS spectral data, particularly partial least squares regression (PLSR) which was used to generate models using the full SERS spectral data for quantification of codeine with, and without, an internal isotopic labelled standard. The PLSR models provided accurate codeine quantification in water and human plasma with high prediction accuracy (Q2). In addition, the employment of codeine-d6 as the internal standard further improved the accuracy of the model, by increasing the Q2 from 0.89 to 0.94 and decreasing the root-mean-square error of predictions (RMSEP) from 11.36 to 8.44. Using the peak area at 1281 cm-1 assigned to C-N stretching, C-H wagging and ring breathing, the limit of detection was calculated in both water and human plasma to be 0.7 μM (209.55 ng mL-1) and 1.39 μM (416.12 ng mL-1), respectively. Due to a lack of definitive codeine vibrational assignments, density functional theory (DFT) calculations have also been used to assign the spectral bands with their corresponding vibrational modes, which were in excellent agreement with our experimental Raman and SERS findings. Thus, we have successfully demonstrated the application of SERS with isotope labelling for the absolute quantification of codeine in human plasma for the first time with a high degree of accuracy and reproducibility. The use of the IL principle which employs an isotopolog (that is to say, a molecule which is only different by the substitution of atoms by isotopes) improves quantification and reproducibility because the competition of the codeine and codeine-d6 for the metal surface used for SERS is equal and this will offset any difference in the number of particles under analysis or any fluctuations in laser fluence. It is our belief that this may open up new exciting opportunities for testing SERS in real-world samples and applications which would be an area of potential future studies.
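
    The limits of detection above are quoted in both μM and ng mL-1. The conversion can be checked with the short sketch below, which assumes a molar mass of 299.36 g/mol for codeine (C18H21NO3); that value is not given in the abstract, and the function name is illustrative.

```python
# Assumed molar mass of codeine (C18H21NO3) in g/mol; not stated in the abstract.
CODEINE_MOLAR_MASS_G_PER_MOL = 299.36


def micromolar_to_ng_per_ml(conc_um, molar_mass=CODEINE_MOLAR_MASS_G_PER_MOL):
    """Convert a concentration in uM to ng/mL.

    1 uM is 1e-6 mol/L; multiplying by g/mol gives ug/L, which equals ng/mL.
    """
    return conc_um * molar_mass


for lod_um in (0.7, 1.39):
    print(lod_um, "uM ->", round(micromolar_to_ng_per_ml(lod_um), 2), "ng/mL")
```

    With this molar mass, 0.7 μM corresponds to the reported 209.55 ng mL-1, and 1.39 μM comes out within about 0.01 ng mL-1 of the reported 416.12, a difference that would be consistent with rounding of the micromolar value.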

  16. Imaging spectroscopy algorithms for mapping canopy foliar chemical and morphological traits and their uncertainties

    DOE PAGES

    Singh, Aditya; Serbin, Shawn P.; McNeil, Brenden E.; ...

    2015-12-01

    A major goal of remote sensing is the development of generalizable algorithms to repeatedly and accurately map ecosystem properties across space and time. Imaging spectroscopy has great potential to map vegetation traits that cannot be retrieved from broadband spectral data, but rarely have such methods been tested across broad regions. Here we illustrate a general approach for estimating key foliar chemical and morphological traits through space and time using NASA's Airborne Visible/Infrared Imaging Spectrometer (AVIRIS-Classic). We apply partial least squares regression (PLSR) to data from 237 field plots within 51 images acquired between 2008 and 2011. Using a series of 500 randomized 50/50 subsets of the original data, we generated spatially explicit maps of seven traits (leaf mass per area (Marea), percentage nitrogen, carbon, fiber, lignin, and cellulose, and isotopic nitrogen concentration, δ15N) as well as pixel-wise uncertainties in their estimates based on error propagation in the analytical methods. Both Marea and %N PLSR models had an R2 > 0.85. Root mean square errors (RMSEs) for both variables were less than 9% of the range of data. Fiber and lignin were predicted with R2 > 0.65 and carbon and cellulose with R2 > 0.45. Although R2 of %C and cellulose were lower than Marea and %N, the measured variability of these constituents (especially %C) was also lower, and their RMSE values were beneath 12% of the range in overall variability. Model performance for δ15N was the lowest (R2 = 0.48, RMSE = 0.95‰), but within 15% of the observed range. The resulting maps of chemical and morphological traits, together with their overall uncertainties, represent a first-of-its-kind approach for examining the spatiotemporal patterns of forest functioning and nutrient cycling across a broad range of temperate and sub-boreal ecosystems. These results offer an alternative to categorical maps of functional or physiognomic types by providing non-discrete maps (i.e., on a continuum) of traits that define those functional types. A key contribution of this work is the ability to assign retrieval uncertainties by pixel, a requirement to enable assimilation of these data products into ecosystem modeling frameworks to constrain carbon and nutrient cycling projections.
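
    The idea of refitting a model on many randomized 50/50 subsets and summarising the spread of its predictions can be sketched as below. This is a simplified illustration only: a one-variable least-squares line stands in for PLSR, the spread across splits stands in for the authors' error-propagation scheme (which the abstract does not detail), and all names and data are hypothetical.

```python
import random
from statistics import fmean, stdev


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x (a stand-in for a PLSR fit)."""
    mean_x, mean_y = fmean(x), fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def split_ensemble_predict(x, y, x_new, n_splits=500, seed=0):
    """Fit on repeated random 50/50 subsets; return mean and SD of predictions."""
    rng = random.Random(seed)
    indices = list(range(len(x)))
    predictions = []
    for _ in range(n_splits):
        train = rng.sample(indices, len(indices) // 2)  # random half of the plots
        intercept, slope = fit_line([x[i] for i in train], [y[i] for i in train])
        predictions.append(intercept + slope * x_new)
    return fmean(predictions), stdev(predictions)


# Hypothetical plot data: one spectral predictor and a noisy trait value.
noise = random.Random(1)
x_plots = [float(i) for i in range(20)]
y_plots = [2.0 * xi + 1.0 + noise.gauss(0.0, 0.5) for xi in x_plots]
estimate, uncertainty = split_ensemble_predict(x_plots, y_plots, x_new=7.5)
print(f"estimate={estimate:.2f} spread={uncertainty:.3f}")
```

    Applied pixel by pixel, the mean across fits would give the trait map and the spread would give a per-pixel uncertainty layer.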

  17. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Singh, Aditya; Serbin, Shawn P.; McNeil, Brenden E.

    A major goal of remote sensing is the development of generalizable algorithms to repeatedly and accurately map ecosystem properties across space and time. Imaging spectroscopy has great potential to map vegetation traits that cannot be retrieved from broadband spectral data, but rarely have such methods been tested across broad regions. Here we illustrate a general approach for estimating key foliar chemical and morphological traits through space and time using NASA's Airborne Visible/Infrared Imaging Spectrometer (AVIRIS-Classic). We apply partial least squares regression (PLSR) to data from 237 field plots within 51 images acquired between 2008 and 2011. Using a series of 500 randomized 50/50 subsets of the original data, we generated spatially explicit maps of seven traits (leaf mass per area (Marea), percentage nitrogen, carbon, fiber, lignin, and cellulose, and isotopic nitrogen concentration, δ15N) as well as pixel-wise uncertainties in their estimates based on error propagation in the analytical methods. Both Marea and %N PLSR models had an R2 > 0.85. Root mean square errors (RMSEs) for both variables were less than 9% of the range of data. Fiber and lignin were predicted with R2 > 0.65 and carbon and cellulose with R2 > 0.45. Although R2 of %C and cellulose were lower than Marea and %N, the measured variability of these constituents (especially %C) was also lower, and their RMSE values were beneath 12% of the range in overall variability. Model performance for δ15N was the lowest (R2 = 0.48, RMSE = 0.95‰), but within 15% of the observed range. The resulting maps of chemical and morphological traits, together with their overall uncertainties, represent a first-of-its-kind approach for examining the spatiotemporal patterns of forest functioning and nutrient cycling across a broad range of temperate and sub-boreal ecosystems. These results offer an alternative to categorical maps of functional or physiognomic types by providing non-discrete maps (i.e., on a continuum) of traits that define those functional types. A key contribution of this work is the ability to assign retrieval uncertainties by pixel, a requirement to enable assimilation of these data products into ecosystem modeling frameworks to constrain carbon and nutrient cycling projections.

  18. Multisensor on-the-go mapping of readily dispersible clay, particle size and soil organic matter

    NASA Astrophysics Data System (ADS)

    Debaene, Guillaume; Niedźwiecki, Jacek; Papierowska, Ewa

    2016-04-01

    Particle size fractions strongly affect the physical and chemical properties of soil. Readily dispersible clay (RDC) is the part of the clay fraction in soils that is easily or potentially dispersible in water when small amounts of mechanical energy are applied to soil. The amount of RDC in the soil is of significant importance for agriculture and the environment because clay dispersion is a cause of poor soil stability in water, which in turn contributes to soil erodibility, mud flows, and cementation. To obtain a detailed map of soil texture, many samples are needed. Moreover, RDC determination is time consuming. The use of a mobile visible and near-infrared (VIS-NIR) platform is proposed here to map those soil properties and obtain the first detailed map of RDC at field level. Soil property prediction was based on a calibration model developed with 10 representative samples selected by a fuzzy logic algorithm. Calibration samples were analysed for soil texture (clay, silt and sand), RDC and soil organic carbon (SOC) using conventional wet chemistry analysis. Moreover, the Veris mobile sensor platform also collects electrical conductivity (EC) data (deep and shallow) and soil temperature. These auxiliary data were combined with VIS-NIR measurements (data fusion) to improve prediction results. EC maps were also produced to help in understanding the RDC data. The resulting maps were visually compared with an orthophotograph of the field taken at the beginning of the plant growing season. Models were developed with partial least square regression (PLSR) and support vector machine regression (SVMR). There were no significant differences between calibration using PLSR or SVMR. Nevertheless, the best models were obtained with PLSR and standard normal variate (SNV) pretreatment and the fusion with deep EC data (e.g. for RDC and clay content: RMSECV = 0.35% and R2 = 0.71; RMSECV = 0.32% and R2 = 0.73, respectively). The best models were used to predict soil properties from the field spectra collected with the VIS-NIR platform. Maps of soil properties were generated using natural neighbour (NN) interpolation. Calibration results were satisfactory for all soil properties and allowed for the generation of detailed maps. The spatial variability of RDC was in accordance with the field orthophotograph. Areas of high RDC content corresponded to areas of poor plant development. Soil texture has been correctly predicted by VIS-NIR spectroscopy (laboratory or on-the-go) before. However, readily dispersible clay (an important parameter for soil stability) has never been investigated before. This study introduces the possibility of using VIS-NIR for predicting readily dispersible clay at field level. The results obtained could be used in preventing soil erosion. Acknowledgement: This research was financed by a National Science Centre grant (NCN - Poland) with decision number UMO-2012/07/B/ST10/04387
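
    The standard normal variate (SNV) pretreatment mentioned above is usually applied to each spectrum separately, by subtracting the spectrum's own mean and dividing by its own standard deviation. The sketch below assumes that usual definition with the sample standard deviation; the reflectance values are hypothetical.

```python
from statistics import fmean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own statistics."""
    mean_value = fmean(spectrum)
    sd_value = stdev(spectrum)  # sample standard deviation (n - 1)
    return [(value - mean_value) / sd_value for value in spectrum]


# Hypothetical VIS-NIR reflectance values for one soil spectrum.
raw = [0.21, 0.25, 0.30, 0.34, 0.40]
print([round(v, 3) for v in snv(raw)])
```

    Because each spectrum is scaled by its own statistics, SNV reduces offset and multiplicative scatter differences between spectra before a model such as PLSR is fitted.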

  19. Unmanned Aerial System (UAS)-based phenotyping of soybean using multi-sensor data fusion and extreme learning machine

    NASA Astrophysics Data System (ADS)

    Maimaitijiang, Maitiniyazi; Ghulam, Abduwasit; Sidike, Paheding; Hartling, Sean; Maimaitiyiming, Matthew; Peterson, Kyle; Shavers, Ethan; Fishman, Jack; Peterson, Jim; Kadam, Suhas; Burken, Joel; Fritschi, Felix

    2017-12-01

    Estimating crop biophysical and biochemical parameters with high accuracy at low cost is imperative for high-throughput phenotyping in precision agriculture. Although fusion of data from multiple sensors is a common application in remote sensing, less is known about the contribution of low-cost RGB, multispectral and thermal sensors to rapid crop phenotyping. This is due to the fact that (1) simultaneous collection of multi-sensor data using satellites is rare and (2) multi-sensor data collected during a single flight have not been accessible until recent developments in Unmanned Aerial Systems (UASs) and UAS-friendly sensors that allow efficient information fusion. The objective of this study was to evaluate the power of high spatial resolution RGB, multispectral and thermal data fusion to estimate soybean (Glycine max) biochemical parameters including chlorophyll content and nitrogen concentration, and biophysical parameters including Leaf Area Index (LAI), above ground fresh and dry biomass. Multiple low-cost sensors integrated on UASs were used to collect RGB, multispectral, and thermal images throughout the growing season at a site established near Columbia, Missouri, USA. From these images, vegetation indices were extracted, a Crop Surface Model (CSM) was advanced, and a model to extract the vegetation fraction was developed. Then, spectral indices/features were combined to model and predict crop biophysical and biochemical parameters using Partial Least Squares Regression (PLSR), Support Vector Regression (SVR), and Extreme Learning Machine based Regression (ELR) techniques. Results showed that: (1) For biochemical variable estimation, multispectral and thermal data fusion provided the best estimate for nitrogen concentration and chlorophyll (Chl) a content (RMSE of 9.9% and 17.1%, respectively) and RGB color information based indices and multispectral data fusion exhibited the largest RMSE 22.6%; the highest accuracy for Chl a + b content estimation was obtained by fusion of information from all three sensors with an RMSE of 11.6%. (2) Among the plant biophysical variables, LAI was best predicted by RGB and thermal data fusion while multispectral and thermal data fusion was found to be best for biomass estimation. (3) For estimation of the above mentioned plant traits of soybean from multi-sensor data fusion, ELR yields promising results compared to PLSR and SVR in this study. This research indicates that fusion of low-cost multiple sensor data within a machine learning framework can provide relatively accurate estimation of plant traits and provide valuable insight for high spatial precision in agriculture and plant stress assessment.

  20. Hyperspectral characteristics of Celosia argentea which lived in manganese stress environment and inversion model for concentration effect of manganese

    NASA Astrophysics Data System (ADS)

    Chen, Sanming; Lin, Gang; Yin, Xianyang; Sun, Xiaolin; Xu, Jiasheng; Liu, Zhiying

    2015-12-01

    Sedimentary manganese deposits are widely distributed in North Guangxi, where Celosia argentea is characteristically present. Celosia argentea is a plant with a strong ability to enrich manganese. In order to study the relationship between the hyperspectral characteristics of Celosia argentea and the concentration effect of manganese in the soil, we used B-layer soil from the mining area, background soil and soil with added MnCl4 reagent to make up experimental sample soils with 10 levels of manganese content for the same batch of Celosia argentea. The levels are 0mg/kg, 4500mg/kg, 9000mg/kg, 13500mg/kg, 18000mg/kg, 18020mg/kg, 18040mg/kg, 18080mg/kg, 18160mg/kg. An ASD FieldSpec-4 was used to measure the abnormal spectra of these Celosia argentea plants through a whole growth cycle. After pretreating the spectral data, we used the Successive Projections Algorithm (SPA) to select the characteristic variables, reducing 1603 bands to 8 bands. Finally, the relationship between the spectral variables and the concentration of manganese was modeled by Partial Least Squares Regression (PLSR). The results show that the coefficients of determination (r2) are 0.8714 and 0.9141 for the two sets of data. The prediction results are satisfactory, but the first 5 groups are closer to the regression line than the last 5 groups.

  1. Open-target sparse sensing of biological agents using DNA microarray

    PubMed Central

    2011-01-01

    Background: Current biosensors are designed to target and react to specific nucleic acid sequences or structural epitopes. These 'target-specific' platforms require creation of new physical capture reagents when new organisms are targeted. An 'open-target' approach to DNA microarray biosensing is proposed and substantiated using laboratory generated data. The microarray consisted of 12,900 25 bp oligonucleotide capture probes derived from a statistical model trained on randomly selected genomic segments of pathogenic prokaryotic organisms. Open-target detection of organisms was accomplished using a reference library of hybridization patterns for three test organisms whose DNA sequences were not included in the design of the microarray probes. Results: A multivariate mathematical model based on partial least squares regression (PLSR) was developed to detect the presence of three test organisms in mixed samples. When all 12,900 probes were used, the model correctly detected the signature of three test organisms in all mixed samples (mean R2 = 0.76, CI = 0.95), with a 6% false positive rate. A sampling algorithm was then developed to sparsely sample the probe space for a minimal number of probes required to capture the hybridization imprints of the test organisms. The PLSR detection model was capable of correctly identifying the presence of the three test organisms in all mixed samples using only 47 probes (mean R2 = 0.77, CI = 0.95) with nearly 100% specificity. Conclusions: We conceived an 'open-target' approach to biosensing, and hypothesized that a relatively small, non-specifically designed, DNA microarray is capable of identifying the presence of multiple organisms in mixed samples. Coupled with a mathematical model applied to laboratory generated data, and sparse sampling of capture probes, the prototype microarray platform was able to capture the signature of each organism in all mixed samples with high sensitivity and specificity. It was demonstrated that this new approach to biosensing closely follows the principles of sparse sensing. PMID:21801424

  2. Evaluation of water-use efficiency in foxtail millet (Setaria italica) using visible-near infrared and thermal spectral sensing techniques.

    PubMed

    Wang, Meng; Ellsworth, Patrick Z; Zhou, Jianfeng; Cousins, Asaph B; Sankaran, Sindhuja

    2016-05-15

    Water limitations decrease stomatal conductance (g(s)) and, in turn, photosynthetic rate (A(net)), resulting in decreased crop productivity. The current techniques for evaluating these physiological responses are limited to leaf-level measures acquired by measuring leaf-level gas exchange. In this regard, proximal sensing techniques can be a useful tool in studying plant biology as they can be used to acquire plant-level measures in a high-throughput manner. However, to confidently utilize the proximal sensing technique for high-throughput physiological monitoring, it is important to assess the relationship between plant physiological parameters and the sensor data. Therefore, in this study, the application of rapid sensing techniques based on thermal imaging and visual-near infrared spectroscopy for assessing water-use efficiency (WUE) in foxtail millet (Setaria italica (L.) P. Beauv) was evaluated. The visible-near infrared spectral reflectance (350-2500 nm) and thermal (7.5-14 µm) data were collected at regular intervals from well-watered and drought-stressed plants in combination with other leaf physiological parameters (transpiration rate-E, A(net), g(s), leaf carbon isotopic signature-δ(13)C(leaf), WUE). Partial least squares regression (PLSR) analysis was used to predict leaf physiological measures based on the spectral data. The PLSR modeling on the hyperspectral data yielded accurate and precise estimates of leaf E, gs, δ(13)C(leaf), and WUE with coefficient of determination in a range of 0.85-0.91. Additionally, significant differences in average leaf temperatures (~1°C) measured with a thermal camera were observed between well-watered plants and drought-stressed plants. In summary, the visible-near infrared reflectance data, and thermal images can be used as a potential rapid technique for evaluating plant physiological responses such as WUE. Copyright © 2016 Elsevier B.V. All rights reserved.
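
    As a rough illustration of how a PLSR calibration of this kind maps spectra to a single measured trait, the following is a minimal sketch of the textbook single-response (PLS1) NIPALS algorithm. It is not the authors' code: the function names, the synthetic data and the choice of three components are assumptions made only for this example.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model; return (coefficients, intercept)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean           # centred copies, deflated in the loop
    n_vars = X.shape[1]
    W = np.zeros((n_vars, n_components))      # weight vectors
    P = np.zeros((n_vars, n_components))      # X loadings
    q = np.zeros(n_components)                # y loadings
    for a in range(n_components):
        w = Xr.T @ yr                         # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xr @ w                            # scores for this component
        tt = t @ t
        p = Xr.T @ t / tt
        q[a] = yr @ t / tt
        Xr = Xr - np.outer(t, p)              # deflate X
        yr = yr - q[a] * t                    # deflate y
        W[:, a], P[:, a] = w, p
    coef = W @ np.linalg.solve(P.T @ W, q)    # coefficients in the original X space
    intercept = y_mean - x_mean @ coef
    return coef, intercept

def pls1_predict(X, coef, intercept):
    """Predict the response for new rows of X."""
    return np.asarray(X, dtype=float) @ coef + intercept

# Synthetic, made-up data: 20 "samples", 3 "wavelengths", noise-free linear response
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(20, 3))
y_demo = X_demo @ np.array([1.0, 2.0, 3.0]) + 5.0
coef_demo, intercept_demo = pls1_fit(X_demo, y_demo, n_components=3)
```

    With as many components as predictors the fit coincides with ordinary least squares, which is why the toy coefficients are recovered. On real spectra the number of components is normally far smaller than the number of wavelengths and is chosen by cross-validation.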

  3. Discharge and suspended sediment patterns in a small mountainous watershed with widely distributed rock fragments

    NASA Astrophysics Data System (ADS)

    Fang, N. F.; Shi, Z. H.; Chen, F. X.; Zhang, H. Y.; Wang, Y. X.

    2015-09-01

    Understanding and quantifying sediment loads is important in watersheds with highly erodible materials, which will eventually cause environmental and ecological problems. Within this context, suspended sediment (SS) transport and its temporal dynamics were studied in a small mountainous watershed with sloping lands containing rock fragments in subtropical China. Soils containing rock fragments with many macro-pores have a high permeability rate. Over a 7-year period, the mean runoff coefficient of this watershed was 0.65. Overall, 30 flood events were monitored and accounted for 95.5%, 27.3%, 17.1% of the total SS load, precipitation and total discharge, respectively, over a 5-year period. The presence of rock fragments in soils can affect soil loss. When comparing the soil loss in the studied watershed with that of other watersheds under similar climatic conditions, rock fragments negatively affect soil loss. However, an extreme event occurred on 14 August 1990, and the sediment load exhibited a phenomenon called "small deposits towards lump withdrawal", which resulted in a soil loss of 20,499 t (4.6 times the mean yearly soil loss). This event exhausted most of the SSs stored by the rock fragments on the slope and channel. Following this event, the mean SS concentration (SSC) of the 11 events was 1.05 kg m-3, and the mean SSC of the 18 previous events was 1.75 kg m-3. Twelve variables were separated using the classical hydrograph separation method. Partial least-squares regression (PLSR) was used to determine the highly co-related variables of the discharge. The results indicated that PLSR could explain runoff well. The relationship between discharge and SSC was highly scattered. During 24 flood events, three types of hysteresis loops were observed: clockwise (17 events), figure-eight (3 events), and complex (4 events).

  4. Predicting heavy metal concentrations in soils and plants using field spectrophotometry

    NASA Astrophysics Data System (ADS)

    Muradyan, V.; Tepanosyan, G.; Asmaryan, Sh.; Sahakyan, L.; Saghatelyan, A.; Warner, T. A.

    2017-09-01

    The aim of this study is to predict heavy metal (HM) concentrations in soils and plants using field remote sensing methods. The studied sites were the industrial town of Kajaran and the city of Yerevan. The research also included sampling of soils and leaves of two tree species exposed to different pollution levels and determination of HM contents under laboratory conditions. The obtained spectral values were then collated with the HM contents in Kajaran soils and in the tree leaves sampled in Yerevan, and statistical analysis was performed. Zn and Pb showed a negative correlation coefficient (p < 0.01) in a 2498 nm spectral range for soils. Pb has a significantly higher correlation at the red edge for plants. Regression models and an artificial neural network (ANN) for HM prediction were developed. Good results were obtained for the best stress-sensitive spectral band ANN (R2 0.9, RPD 2.0), Simple Linear Regression (SLR) and Partial Least Squares Regression (PLSR) (R2 0.7, RPD 1.4) models. The Multiple Linear Regression (MLR) model was not applicable for predicting Pb and Zn concentrations in soils in this research. Almost all full-spectrum PLS models provide good calibration and validation results (RPD > 1.4). Full-spectrum ANN models are characterized by excellent calibration R2, rRMSE and RPD (0.9, 0.1 and >2.5, respectively). For prediction of Pb and Ni contents in plants, SLR and PLS models were used; the two provide almost the same results. Our findings indicate that it is possible to make a coarse direct estimation of HM content in soils and plants using rapid and economical reflectance spectroscopy.

  5. Development of VIS/NIR spectroscopic system for real-time prediction of fresh pork quality

    NASA Astrophysics Data System (ADS)

    Zhang, Haiyun; Peng, Yankun; Zhao, Songwei; Sasao, Akira

    2013-05-01

    Quality attributes of fresh meat influence nutritional value and consumers' purchasing power. The aim of the research was to develop a prototype for real-time detection of quality in meat. It consisted of a hardware system and a software system. A VIS/NIR spectrograph in the range of 350 to 1100 nm was used to collect the spectral data. In order to acquire more potential information from the sample, an optical fiber multiplexer was used. A conveyable, cylindrical device was designed and fabricated to hold the optical fibers from the multiplexer. A high-power halogen tungsten lamp was used as the light source. The spectral data were obtained from the surface of the sample with an exposure time of 2.17 ms by pressing down the trigger switch on the self-developed system. The system could automatically acquire, process, display and save the data. Moreover, the quality could be predicted on-line. A total of 55 fresh pork samples were used to develop the prediction model for real-time detection. The spectral data were pretreated with standard normal variate (SNV), and partial least squares regression (PLSR) was used to develop the prediction model. For the validation set, the correlation coefficient and root mean square error were 0.810 and 0.653 for water content, and 0.803 and 0.098 for pH, respectively. The research shows that the real-time non-destructive detection system based on VIS/NIR spectroscopy can be efficient for predicting the quality of fresh meat.
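
    The SNV pretreatment mentioned in this abstract can be sketched as follows. This is the usual textbook definition (each spectrum is centred and scaled by its own mean and standard deviation), not necessarily the exact variant the authors applied; the reflectance values are made up for illustration.

```python
from statistics import mean, stdev

def snv(spectrum):
    """Standard normal variate: centre and scale ONE spectrum by its own mean and SD."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1 in the denominator)
    if s == 0:
        raise ValueError("flat spectrum: SNV is undefined")
    return [(x - m) / s for x in spectrum]

# Hypothetical reflectance readings at five wavelengths
raw_spectrum = [0.42, 0.45, 0.51, 0.48, 0.44]
snv_spectrum = snv(raw_spectrum)
```

    Because the correction is applied row by row, it removes per-sample offset and scaling differences (for example from scattering) before the regression model sees the data.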

  6. Suitability of hyperspectral imaging for rapid evaluation of thiobarbituric acid (TBA) value in grass carp (Ctenopharyngodon idella) fillet.

    PubMed

    Cheng, Jun-Hu; Sun, Da-Wen; Pu, Hong-Bin; Wang, Qi-Jun; Chen, Yu-Nan

    2015-03-15

    The suitability of hyperspectral imaging technique (400-1000 nm) was investigated to determine the thiobarbituric acid (TBA) value for monitoring lipid oxidation in fish fillets during cold storage at 4°C for 0, 2, 5, and 8 days. The PLSR calibration model was established with full spectral region between the spectral data extracted from the hyperspectral images and the reference TBA values and showed good performance for predicting TBA value with determination coefficients (R(2)P) of 0.8325 and root-mean-square errors of prediction (RMSEP) of 0.1172 mg MDA/kg flesh. Two simplified PLSR and MLR models were built and compared using the selected ten most important wavelengths. The optimised MLR model yielded satisfactory results with R(2)P of 0.8395 and RMSEP of 0.1147 mg MDA/kg flesh, and was used to visualise the distribution of TBA values in fish fillets. Overall, the results confirmed that hyperspectral imaging, used as a rapid and non-destructive tool, is suitable for the determination of TBA values for monitoring lipid oxidation and evaluation of fish freshness. Copyright © 2014 Elsevier Ltd. All rights reserved.

  7. Sarcoptic mange breaks up bottom-up regulation of body condition in a large herbivore population.

    PubMed

    Carvalho, João; Granados, José E; López-Olvera, Jorge R; Cano-Manuel, Francisco Javier; Pérez, Jesús M; Fandos, Paulino; Soriguer, Ramón C; Velarde, Roser; Fonseca, Carlos; Ráez, Arian; Espinosa, José; Pettorelli, Nathalie; Serrano, Emmanuel

    2015-11-06

    Both parasitic load and resource availability can impact individual fitness, yet little is known about the interplay between these parameters in shaping body condition, a key determinant of fitness in wild mammals inhabiting seasonal environments. Using partial least square regressions (PLSR), we explored how temporal variation in climatic conditions, vegetation dynamics and sarcoptic mange (Sarcoptes scabiei) severity impacted body condition of 473 Iberian ibexes (Capra pyrenaica) harvested between 1995 and 2008 in the highly seasonal Alpine ecosystem of Sierra Nevada Natural Space (SNNS), southern Spain. Bottom-up regulation was found to only occur in healthy ibexes; the condition of infected ibexes was independent of primary productivity and snow cover. No link between ibex abundance and ibex body condition could be established when only considering infected individuals. The pernicious effects of mange on Iberian ibexes overcome the benefits of favorable environmental conditions. Even though the increase in primary production exerts a positive effect on the body condition of healthy ibexes, the scabietic individuals do not derive any advantage from increased resource availability. Further applied research coupled with continuous sanitary surveillance are needed to address remaining knowledge gaps associated with the transmission dynamics and management of sarcoptic mange in free-living populations.

  8. Investigating the Moisture Content of Polyamide 6 by Raman-Microscopy and Multivariate Data Analysis

    NASA Astrophysics Data System (ADS)

    Lechner, Tobias; Noack, Kristina; Thöne, Manuel; Amend, Philipp; Schmidt, Michael; Will, Stefan

    Thermal malleability of thermoplastics results in a high product diversity in various industry sectors. However, industrial applications require a constant and high component quality. Hence, material processing such as laser welding has to consider that, e.g., the moisture content of thermoplastics influences the mechanical properties such as the tensile strength. Moreover, water evaporates during laser welding and can form pores and defects. Thus, there is a large need for non-invasive material inspection before processing. To that end, we developed a methodology based on Raman-microscopy and multivariate data analysis (MVD) to determine the moisture content of polyamide (MCP). Further, the impact of the MCP on the mechanical properties was verified. For samples with a defined variation of the MCP, xyz-Raman-scans were carried out and analysed using MVD. For reference purposes, the samples were weighed and tensile tests were performed. An evaluation by means of partial least squares regression analysis (PLSR) resulted in a prediction of the MCP with a correlation coefficient >98%. Consequently, Raman-microscopy shows large potential for developing new techniques for inspection and quality control of plastics before processing. Dedicated to Professor Alfred Leipertz on the occasion of his 70th birthday.

  9. Discrimination of chicken seasonings and beef seasonings using electronic nose and sensory evaluation.

    PubMed

    Tian, Huaixiang; Li, Fenghua; Qin, Lan; Yu, Haiyan; Ma, Xia

    2014-11-01

    This study examines the feasibility of electronic nose as a method to discriminate chicken and beef seasonings and to predict sensory attributes. Sensory evaluation showed that 8 chicken seasonings and 4 beef seasonings could be well discriminated and classified based on 8 sensory attributes. The sensory attributes including chicken/beef, gamey, garlic, spicy, onion, soy sauce, retention, and overall aroma intensity were generated by a trained evaluation panel. Principal component analysis (PCA), discriminant factor analysis (DFA), and cluster analysis (CA) combined with electronic nose were used to discriminate seasoning samples based on the difference of the sensor response signals of chicken and beef seasonings. The correlation between sensory attributes and electronic nose sensors signal was established using partial least squares regression (PLSR) method. The results showed that the seasoning samples were all correctly classified by the electronic nose combined with PCA, DFA, and CA. The electronic nose gave good prediction results for all the sensory attributes with correlation coefficient (r) higher than 0.8. The work indicated that electronic nose is an effective method for discriminating different seasonings and predicting sensory attributes. © 2014 Institute of Food Technologists®

  10. Rapid, simultaneous and non-destructive assessment of the moisture, water activity, firmness and SO2 content of the intact sulphured-dried apricots using FT-NIRS and chemometrics.

    PubMed

    Özdemir, İbrahim Sani; Öztürk, Bülent; Çelik, Belgin; Sarıtepe, Yüksel; Aksoy, Hatice

    2018-08-15

    The potential of using FT-NIR spectroscopy for the rapid and non-destructive measurement of the moisture, water activity, firmness and SO2 content of the intact sulphured-dried apricots (SDA) was investigated for the first time in the literature. The partial least squares regression (PLS-R) models constructed using FT-NIR spectra were very successful in predicting the moisture content (R2p = 0.986, RMSEP = 1.22%, RPD = 9.15) and water activity (R2p = 0.987, RMSEP = 0.016, RPD = 9.37) of SDAs. Satisfactory results were also obtained for the models developed for the prediction of the firmness (R2p = 0.845, RMSEP = 0.445, RPD = 2.55) and SO2 content (R2p = 0.804, RMSEP = 349 mg kg-1, RPD = 2.27). These results clearly demonstrate that the major quality parameters of SDA can be simultaneously measured in a short time by FT-NIR spectroscopy without any need for the sample preparation or skilled laboratory personnel. Copyright © 2018 Elsevier B.V. All rights reserved.
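
    The three figures of merit reported here (prediction R2, RMSEP and RPD) can be computed from reference and predicted values as in the sketch below. It assumes the common definitions, with RPD taken as the standard deviation of the reference values divided by RMSEP; individual papers may use slightly different conventions, and the numbers are invented for illustration.

```python
from math import sqrt
from statistics import mean, stdev

def rmsep(y_true, y_pred):
    """Root-mean-square error of prediction."""
    return sqrt(mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)]))

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rpd(y_true, y_pred):
    """Ratio of performance to deviation: SD of reference values over RMSEP."""
    return stdev(y_true) / rmsep(y_true, y_pred)

# Hypothetical reference and predicted moisture values (%)
y_ref = [10.0, 12.0, 14.0, 16.0, 18.0]
y_hat = [10.5, 11.5, 14.0, 16.5, 17.5]
rmsep_demo = rmsep(y_ref, y_hat)
r2_demo = r_squared(y_ref, y_hat)
rpd_demo = rpd(y_ref, y_hat)
```

    RPD rescales the prediction error by the spread of the reference data, so it allows models for properties with different units to be compared on one scale.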

  11. Development of a Drug-Response Modeling Framework to Identify Cell Line Derived Translational Biomarkers That Can Predict Treatment Outcome to Erlotinib or Sorafenib

    PubMed Central

    Li, Bin; Shin, Hyunjin; Gulbekyan, Georgy; Pustovalova, Olga; Nikolsky, Yuri; Hope, Andrew; Bessarabova, Marina; Schu, Matthew; Kolpakova-Hart, Elona; Merberg, David; Dorner, Andrew; Trepicchio, William L.

    2015-01-01

    Development of drug responsive biomarkers from pre-clinical data is a critical step in drug discovery, as it enables patient stratification in clinical trial design. Such translational biomarkers can be validated in early clinical trial phases and utilized as a patient inclusion parameter in later stage trials. Here we present a study on building accurate and selective drug sensitivity models for Erlotinib or Sorafenib from pre-clinical in vitro data, followed by validation of individual models on corresponding treatment arms from patient data generated in the BATTLE clinical trial. A Partial Least Squares Regression (PLSR) based modeling framework was designed and implemented, using a special splitting strategy and canonical pathways to capture robust information for model building. Erlotinib and Sorafenib predictive models could be used to identify a sub-group of patients that respond better to the corresponding treatment, and these models are specific to the corresponding drugs. The model derived signature genes reflect each drug’s known mechanism of action. Also, the models predict each drug’s potential cancer indications consistent with clinical trial results from a selection of globally normalized GEO expression datasets. PMID:26107615

  12. Development of a Drug-Response Modeling Framework to Identify Cell Line Derived Translational Biomarkers That Can Predict Treatment Outcome to Erlotinib or Sorafenib.

    PubMed

    Li, Bin; Shin, Hyunjin; Gulbekyan, Georgy; Pustovalova, Olga; Nikolsky, Yuri; Hope, Andrew; Bessarabova, Marina; Schu, Matthew; Kolpakova-Hart, Elona; Merberg, David; Dorner, Andrew; Trepicchio, William L

    2015-01-01

    Development of drug responsive biomarkers from pre-clinical data is a critical step in drug discovery, as it enables patient stratification in clinical trial design. Such translational biomarkers can be validated in early clinical trial phases and utilized as a patient inclusion parameter in later stage trials. Here we present a study on building accurate and selective drug sensitivity models for Erlotinib or Sorafenib from pre-clinical in vitro data, followed by validation of individual models on corresponding treatment arms from patient data generated in the BATTLE clinical trial. A Partial Least Squares Regression (PLSR) based modeling framework was designed and implemented, using a special splitting strategy and canonical pathways to capture robust information for model building. Erlotinib and Sorafenib predictive models could be used to identify a sub-group of patients that respond better to the corresponding treatment, and these models are specific to the corresponding drugs. The model derived signature genes reflect each drug's known mechanism of action. Also, the models predict each drug's potential cancer indications consistent with clinical trial results from a selection of globally normalized GEO expression datasets.

  13. Application of Attenuated Total Reflectance-Fourier Transformed Infrared (ATR-FTIR) Spectroscopy To Determine the Chlorogenic Acid Isomer Profile and Antioxidant Capacity of Coffee Beans.

    PubMed

    Liang, Ningjian; Lu, Xiaonan; Hu, Yaxi; Kitts, David D

    2016-01-27

    The chlorogenic acid isomer profile and antioxidant activity of both green and roasted coffee beans are reported herein using ATR-FTIR spectroscopy combined with chemometric analyses. High-performance liquid chromatography (HPLC) quantified different chlorogenic acid isomer contents for reference, whereas ORAC, ABTS, and DPPH were used to determine the antioxidant activity of the same coffee bean extracts. FTIR spectral data and reference data of 42 coffee bean samples were processed to build optimized PLSR models, and 18 samples were used for external validation of constructed PLSR models. In total, six PLSR models were constructed for six chlorogenic acid isomers to predict content, with three PLSR models constructed to forecast the free radical scavenging activities, obtained using different chemical assays. In conclusion, FTIR spectroscopy, coupled with PLSR, serves as a reliable, nondestructive, and rapid analytical method to quantify chlorogenic acids and to assess different free radical-scavenging capacities in coffee beans.

  14. Fast and simultaneously determination of light and heavy rare earth elements in monazite using combination of ultraviolet-visible spectrophotometry and multivariate analysis

    NASA Astrophysics Data System (ADS)

    Anggraeni, Anni; Arianto, Fernando; Mutalib, Abdul; Pratomo, Uji; Bahti, Husein H.

    2017-05-01

    Rare earth elements (REE) have many applications, such as in metallurgy, optical devices, and the manufacture of electronic devices. REE occur in minerals, in which the individual elements have similar properties. Currently, REE content is determined using instruments such as ICP-OES, ICP-MS, XRF, and HPLC, but each of these instruments still has some weaknesses. Therefore, an alternative analytical method for the determination of rare earth metal content is needed; one option is a combination of UV-Visible spectrophotometry and multivariate analysis, including Principal Component Analysis (PCA), Principal Component Regression (PCR), and Partial Least Squares Regression (PLSR). The purpose of this experiment is to determine the content of light and medium rare earth elements in the mineral monazite without chemical separation by using a combination of multivariate analysis and UV-Visible spectrophotometric methods. A training set of 22 concentration variations was prepared and its absorbance was measured using a UV-Vis spectrophotometer; the data were then processed by PCA, PCR, and PLSR. The results were compared and validated to obtain the mathematical equation with the smallest percent error. After validation, the equation obtained with the PLS method performed better than that obtained with PCR, with RMSE values for La, Ce, Pr, Nd, Gd, Sm, Eu, and Tb of 0.095, 0.573, 0.538, 0.440, 3.387, 1.240, 1.870, and 0.639, respectively.

  15. Development of sourdough fermented date seed for improving the quality and shelf life of flat bread: study with univariate and multivariate analyses.

    PubMed

    Habibi Najafi, Mohammad B; Pourfarzad, Amir; Zahedi, Hoda; Ahmadian-Kouchaksaraie, Zahra; Haddad Khodaparast, Mohammad H

    2016-01-01

    The aim of this work was to study the effects of a novel sourdough system, prepared from wheat flour supplemented with a combination of pulverized date seed, Lactobacillus plantarum, and/or Lactobacillus brevis as well as Saccharomyces cerevisiae, on the sourdough characteristics and on the quality, sensory, texture, shelf life and image properties of Barbari flat bread. The highest sourdough acidity and bread specific volume were obtained with the co-culture of Lb. plantarum + Lb. brevis + S. cerevisiae. The results suggest that fermentation is a potential bioprocessing technology for improving sensory aspects of bread supplemented with pulverized date seed as a dietary fiber resource. Texture analysis of bread samples during 7 days of storage indicated that the presence of pulverized date seed in sourdough was able to diminish bread staling. The interaction of baker's yeast and lactic acid bacteria (LAB) increased the average particle size of the bread crumb and decreased the area fraction relative to the LAB samples. It was observed that all sourdough Barbari bread treatments had higher cell wall thickness than the control Barbari bread. The Avrami non-linear regression equation was chosen as a useful mathematical model to study bread hardening kinetics. In addition, principal component analysis (PCA) allowed discrimination among sourdough and bread specialties. Partial least squares regression (PLSR) models were applied to determine the relationships between sensory and instrumental data.

  16. Predictability of Western Himalayan river flow: melt seasonal inflow into Bhakra Reservoir in northern India

    NASA Astrophysics Data System (ADS)

    Pal, I.; Lall, U.; Robertson, A. W.; Cane, M. A.; Bansal, R.

    2013-06-01

    Snowmelt-dominated streamflow of the Western Himalayan rivers is an important water resource during the dry pre-monsoon spring months to meet the irrigation and hydropower needs in northern India. Here we study the seasonal prediction of melt-dominated total inflow into the Bhakra Dam in northern India based on statistical relationships with meteorological variables during the preceding winter. Total inflow into the Bhakra Dam includes the Satluj River flow together with a flow diversion from its tributary, the Beas River. Both are tributaries of the Indus River that originate from the Western Himalayas, which is an under-studied region. Average measured winter snow volume at the upper-elevation stations and corresponding lower-elevation rainfall and temperature of the Satluj River basin were considered as empirical predictors. Akaike information criteria (AIC) and Bayesian information criteria (BIC) were used to select the best subset of inputs from all the possible combinations of predictors for a multiple linear regression framework. To test for potential issues arising due to multicollinearity of the predictor variables, cross-validated prediction skills of the best subset were also compared with the prediction skills of principal component regression (PCR) and partial least squares regression (PLSR) techniques, which yielded broadly similar results. As a whole, the forecasts of the melt season at the end of winter and as the melt season commences were shown to have potential skill for guiding the development of stochastic optimization models to manage the trade-off between irrigation and hydropower releases versus flood control during the annual fill cycle of the Bhakra Reservoir, a major energy and irrigation source in the region.
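
    The best-subset step described in this abstract can be illustrated with a small exhaustive search. The sketch assumes the Gaussian forms of AIC and BIC for an ordinary least squares fit, up to an additive constant; the function names and the synthetic predictors are hypothetical and do not reproduce the authors' data or code.

```python
from itertools import combinations
import numpy as np

def information_criteria(X, y):
    """Gaussian AIC and BIC (up to a constant) for an OLS fit with intercept."""
    n = len(y)
    A = np.column_stack([np.ones(n), X])          # add intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    rss = float(np.sum((y - A @ beta) ** 2))
    k = A.shape[1]                                # number of fitted coefficients
    aic = n * np.log(rss / n) + 2 * k
    bic = n * np.log(rss / n) + k * np.log(n)
    return aic, bic

def best_subset(X, y, criterion="bic"):
    """Try every non-empty subset of columns; return the one with the lowest score."""
    best = None
    for size in range(1, X.shape[1] + 1):
        for cols in combinations(range(X.shape[1]), size):
            aic, bic = information_criteria(X[:, list(cols)], y)
            score = bic if criterion == "bic" else aic
            if best is None or score < best[0]:
                best = (score, cols)
    return best[1]

# Synthetic example: only columns 0 and 2 actually drive the response
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(60, 4))
y_demo = 2.0 * X_demo[:, 0] - 3.0 * X_demo[:, 2] + 0.1 * rng.normal(size=60)
chosen = best_subset(X_demo, y_demo, criterion="bic")
```

    Exhaustive search is only practical for a handful of candidate predictors, since the number of subsets doubles with each added variable.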

  17. Soil moisture dynamics and dominant controls at different spatial scales over semiarid and semi-humid areas

    NASA Astrophysics Data System (ADS)

    Suo, Lizhu; Huang, Mingbin; Zhang, Yongkun; Duan, Liangxia; Shan, Yan

    2018-07-01

    Soil moisture dynamics plays an active role in ecological and hydrological processes, and it depends on a large number of environmental factors, such as topographic attributes, soil properties, land use types, and precipitation. However, studies must still clarify the relative significance of these environmental factors at different soil depths and at different spatial scales. This study aimed: (1) to characterize temporal and spatial variations in soil moisture content (SMC) at four soil layers (0-40, 40-100, 100-200, and 200-500 cm) and three spatial scales (plot, hillslope, and region); and (2) to determine their dominant controls in diverse soil layers at different spatial scales over semiarid and semi-humid areas of the Loess Plateau, China. Given the high co-dependence of environmental factors, partial least squares regression (PLSR) was used to detect relative significance among 15 selected environmental factors that affect SMC. Temporal variation in SMC decreased with increasing soil depth, and vertical changes in the 0-500 cm soil profile were divided into a fast-changing layer (0-40 cm), an active layer (40-100 cm), a sub-active layer (100-200 cm), and a relatively stable layer (200-500 cm). PLSR models simulated SMC accurately in diverse soil layers at different scales; almost all values for variation in response (R2) and goodness of prediction (Q2) were >0.5 and >0.0975, respectively. Upper and lower layer SMCs were the two most important factors that influenced diverse soil layers at three scales, and these SMC variables exhibited the highest importance in projection (VIP) values. The 7-day antecedent precipitation and 7-day antecedent potential evapotranspiration contributed significantly to SMC only at the 0-40 cm soil layer. VIP of soil properties, especially sand and silt content, which influenced SMC strongly, increased significantly after increasing the measured scale. 
Mean annual precipitation and potential evapotranspiration also influenced SMC at the regional scale significantly. Overall, this study indicated that dominant controls of SMC varied among three spatial scales on the Loess Plateau, and VIP was a function of spatial scale and soil depth.
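
    The variable importance in projection (VIP) values used here to rank environmental factors can be computed from a fitted PLS model. The sketch below uses the standard single-response formula, in which each predictor's squared weights are averaged over components in proportion to the response variance each component explains. The PLS1 fit, the function name and the synthetic data are assumptions for illustration only.

```python
import numpy as np

def pls1_vip(X, y, n_components):
    """Return VIP scores (one per predictor) from a single-response PLS fit."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xr, yr = X - X.mean(axis=0), y - y.mean()   # centred copies, deflated below
    n_vars = X.shape[1]
    W = np.zeros((n_vars, n_components))        # unit-length weight vectors
    ss = np.zeros(n_components)                 # response sum of squares per component
    for a in range(n_components):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)
        t = Xr @ w                              # scores
        tt = t @ t
        q = yr @ t / tt                         # y loading
        ss[a] = q * q * tt
        Xr = Xr - np.outer(t, Xr.T @ t / tt)    # deflate X
        yr = yr - q * t                         # deflate y
        W[:, a] = w
    return np.sqrt(n_vars * ((W ** 2) @ ss) / ss.sum())

# Synthetic example: predictor 0 dominates the response
rng = np.random.default_rng(2)
X_demo = rng.normal(size=(50, 5))
y_demo = 3.0 * X_demo[:, 0] + 0.1 * X_demo[:, 1] + 0.05 * rng.normal(size=50)
vip_demo = pls1_vip(X_demo, y_demo, n_components=2)
```

    By construction the squared VIP scores average to one across predictors, which is the basis of the widely used rule of thumb that values above 1 mark the more influential variables.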

  18. Soil Organic Carbon Variability in High-Andean Ecosystems: Bringing Together Machine Learning and Proximal Soil Sensing

    NASA Astrophysics Data System (ADS)

    Gavilan, C.; Grunwald, S.; Quiroz, R.

    2017-12-01

    The Andes represent the largest and highest mountain range in the tropics and is considered an important reserve of biodiversity, water provision and soil organic carbon (SOC) stocks. Nevertheless, limited attention has been given to estimate these stocks due to the lack of recent soil data, the poor accessibility and the wide range of coexistent ecosystems. In addition, conventional methods to determine SOC are usually time consuming and expensive to use in large-scale studies, hindering the possibility to have an accurate SOC assessment in the region. Proximal soil sensing techniques, such as visible near infrared (VNIR) and mid infrared (MIR) spectroscopy, have proven to be useful as an alternative to conventional methods for characterizing SOC but have not been tested in Andean soils. The aim of this study was to evaluate the potential of using VNIR and MIR spectroscopy to predict SOC content in the Central Andean region, using multivariate methods. Three study areas were selected across the Peruvian Central Andes. A total of 400 topsoil samples (0-30 cm) were collected and analyzed for SOC. The VNIR and MIR reflectance of the soil samples was measured in the laboratory. Three modeling approaches: Partial least squares regression (PLSR), random forest (RF) and support vector machine (SVM) were used to predict SOC from VNIR and MIR spectra in the study areas. The data was preprocessed in order to minimize the noise and optimize the accuracy of predictions. The models, for each study area, were assessed using 10-fold cross validation. Independent validation was implemented in the whole dataset (400 observations) by splitting it into calibration (70 %) and validation (30%) sets. Overall, the results indicate potential for both VNIR and MIR spectra to predict SOC content in the Andean soils. SOC content predictions from MIR spectra outperformed those from VNIR spectra. 
The evaluation of model performance shows that RF and SVM provide more accurate SOC predictions compared to PLSR. These findings suggest that integrating VNIR and MIR spectroscopy with machine learning algorithms constitutes a promising approach for assessing SOC content in high-Andean ecosystems.
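
    The validation design described in this abstract (a 70/30 calibration/validation split of 400 observations plus 10-fold cross-validation) can be sketched with index bookkeeping alone. The helper names and the random seed are assumptions, and the sketch applies the folds to the calibration indices purely for illustration, whereas the abstract reports cross-validation per study area.

```python
import random

def split_indices(n, frac_cal=0.7, seed=0):
    """Shuffle 0..n-1 and split into calibration and validation index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_cal = round(n * frac_cal)
    return idx[:n_cal], idx[n_cal:]

def kfold_indices(indices, k=10):
    """Yield (train, held_out) index lists for k-fold cross-validation."""
    folds = [indices[i::k] for i in range(k)]   # k interleaved, near-equal folds
    for i in range(k):
        held_out = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, held_out

cal_idx, val_idx = split_indices(400, frac_cal=0.7)
cv_folds = list(kfold_indices(cal_idx, k=10))
```

    Keeping the validation indices out of every cross-validation fold is what makes the final 30% an independent check rather than a repeat of the tuning data.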

  19. Application of laboratory and portable attenuated total reflectance infrared spectroscopic approaches for rapid quantification of alpaca serum immunoglobulin G

    PubMed Central

    Burns, Jennifer B.; Riley, Christopher B.; Shaw, R. Anthony; McClure, J. Trenton

    2017-01-01

    The objective of this study was to develop and compare the performance of laboratory grade and portable attenuated total reflectance infrared (ATR-IR) spectroscopic approaches in combination with partial least squares regression (PLSR) for the rapid quantification of alpaca serum IgG concentration, and the identification of low IgG (<1000 mg/dL), which is consistent with the diagnosis of failure of transfer of passive immunity (FTPI) in neonates. Serum samples (n = 175) collected from privately owned, healthy alpacas were tested by the reference method of radial immunodiffusion (RID) assay, and laboratory grade and portable ATR-IR spectrometers. Various pre-processing strategies were applied to the ATR-IR spectra that were linked to corresponding RID-IgG concentrations, and then randomly split into two sets: calibration (training) and test sets. PLSR was applied to the calibration set and calibration models were developed, and the test set was used to assess the accuracy of the analytical method. For the test set, the Pearson correlation coefficient between the IgG measured by RID and that predicted by both the laboratory grade and portable ATR-IR spectrometers was 0.91. The average differences between reference serum IgG concentrations and the two IR-based methods were 120.5 mg/dL and 71 mg/dL for the laboratory and portable ATR-IR-based assays, respectively. Adopting an IgG concentration <1000 mg/dL as the cut-point for FTPI cases, the sensitivity, specificity, and accuracy for identifying serum samples below this cut point by laboratory ATR-IR assay were 86, 100 and 98%, respectively (within the entire data set). Corresponding values for the portable ATR-IR assay were 95, 99 and 99%, respectively.
These results suggest that the two different ATR-IR assays performed similarly for rapid qualitative evaluation of alpaca serum IgG and for diagnosis of IgG <1000 mg/dL, the portable ATR-IR spectrometer performed slightly better, and provides more flexibility for potential application in the field. PMID:28651006
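    The sensitivity, specificity and accuracy reported for a cut-point such as IgG <1000 mg/dL can be reproduced from paired reference and predicted values. The following is a minimal sketch, assuming a "positive" is any sample whose reference value lies below the cut-point; the function name and the sample values are hypothetical and are not data from the study.

```python
def diagnostic_metrics(reference, predicted, cut_point=1000.0):
    """Sensitivity, specificity and accuracy for detecting values below cut_point.

    A sample counts as positive (low IgG) when its value is < cut_point.
    """
    tp = fp = tn = fn = 0
    for ref, pred in zip(reference, predicted):
        ref_low = ref < cut_point
        pred_low = pred < cut_point
        if ref_low and pred_low:
            tp += 1          # low by both methods
        elif ref_low:
            fn += 1          # low by reference, missed by prediction
        elif pred_low:
            fp += 1          # normal by reference, flagged by prediction
        else:
            tn += 1          # normal by both methods
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return sensitivity, specificity, accuracy


# Made-up values in mg/dL, for illustration only.
rid_igg = [650.0, 900.0, 1200.0, 1800.0, 2500.0, 980.0]
atr_igg = [700.0, 1050.0, 1150.0, 1700.0, 2400.0, 940.0]
sens, spec, acc = diagnostic_metrics(rid_igg, atr_igg)
```

    In this toy example one truly low sample (900 mg/dL) is predicted above the cut-point, so sensitivity drops while specificity stays at 1.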

  20. Drivers of Preference and Perception of Freshness in Roasted Peanuts (Arachis spp.) for European Consumers.

    PubMed

    Lykomitros, Dimitrios; Fogliano, Vincenzo; Capuano, Edoardo

    2018-04-01

    Roasted peanuts are a popular snack in Europe, but their drivers of liking and perceived freshness have not been previously studied with European consumers. Consumer research to date has been focused on U.S. consumers, and only on specific peanut cultivars. In this study, 26 unique samples were produced from peanuts of different types, cultivars, origins, and with different process technologies (including baking, frying, and maceration). The peanut samples were subjected to sensory (expert panel, Spectrum™) and instrumental analysis (color, headspace volatiles, sugar profile, large deformation compression tests, and graded by size) and were hedonically rated by consumers in The Netherlands, Spain, and Turkey (n > 200 each). Preference Mapping (PREFMAP) on mean liking models revealed that the drivers of liking are similar across the three countries. Sweet taste, roasted peanut, dark roast, and sweet aromas and the color b* value were related to increased liking, and raw bean aroma and bitter taste with decreased liking. Further partial least square regression (PLSR) modeling of liking and perceived freshness against instrumental attributes showed that the color coordinates in combination with sucrose content and a select few headspace volatiles were strong predictors of both preference and perceived freshness. Finally, additional PLSR models focusing on the headspace volatiles only showed that liking and "fresh" attributes were correlated with the presence of several pyrroles in the volatile fraction, and inversely related to "stale" and to hexanal and 2-heptanone. This study provides insight into which flavor, taste, and appearance attributes drive liking and disliking of roasted peanuts for European consumers. The drivers are linked back to analytical attributes that can be measured instrumentally, thereby reducing the reliance on costly sensory panels. Particular emphasis is placed on color as a predictor of preference because the low cost of the measuring equipment makes it available to even smaller producers. In addition to preference, the study also examines whether product attributes that drive perceived freshness exist. The results can be used to design products with high acceptability across several countries within Europe. © 2018 Institute of Food Technologists®.

  1. Application of laboratory and portable attenuated total reflectance infrared spectroscopic approaches for rapid quantification of alpaca serum immunoglobulin G.

    PubMed

    Elsohaby, Ibrahim; Burns, Jennifer B; Riley, Christopher B; Shaw, R Anthony; McClure, J Trenton

    2017-01-01

    The objective of this study was to develop and compare the performance of laboratory grade and portable attenuated total reflectance infrared (ATR-IR) spectroscopic approaches in combination with partial least squares regression (PLSR) for the rapid quantification of alpaca serum IgG concentration, and the identification of low IgG (<1000 mg/dL), which is consistent with the diagnosis of failure of transfer of passive immunity (FTPI) in neonates. Serum samples (n = 175) collected from privately owned, healthy alpacas were tested by the reference method of radial immunodiffusion (RID) assay, and laboratory grade and portable ATR-IR spectrometers. Various pre-processing strategies were applied to the ATR-IR spectra that were linked to corresponding RID-IgG concentrations, and then randomly split into two sets: calibration (training) and test sets. PLSR was applied to the calibration set and calibration models were developed, and the test set was used to assess the accuracy of the analytical method. For the test set, the Pearson correlation coefficient between the IgG measured by RID and that predicted by each of the laboratory grade and portable ATR-IR spectrometers was 0.91. The average differences between reference serum IgG concentrations and the two IR-based methods were 120.5 mg/dL and 71 mg/dL for the laboratory and portable ATR-IR-based assays, respectively. Adopting an IgG concentration <1000 mg/dL as the cut-point for FTPI cases, the sensitivity, specificity, and accuracy for identifying serum samples below this cut point by laboratory ATR-IR assay were 86, 100 and 98%, respectively (within the entire data set). Corresponding values for the portable ATR-IR assay were 95, 99 and 99%, respectively. These results suggest that the two different ATR-IR assays performed similarly for rapid qualitative evaluation of alpaca serum IgG and for diagnosis of IgG <1000 mg/dL; the portable ATR-IR spectrometer performed slightly better and provides more flexibility for potential application in the field.

  2. Selecting minimum dataset soil variables using PLSR as a regressive multivariate method

    NASA Astrophysics Data System (ADS)

    Stellacci, Anna Maria; Armenise, Elena; Castellini, Mirko; Rossi, Roberta; Vitti, Carolina; Leogrande, Rita; De Benedetto, Daniela; Ferrara, Rossana M.; Vivaldi, Gaetano A.

    2017-04-01

    Long-term field experiments and science-based tools that characterize soil status (namely the soil quality indices, SQIs) assume a strategic role in assessing the effect of agronomic techniques and thus in improving soil management especially in marginal environments. Selecting key soil variables able to best represent soil status is a critical step for the calculation of SQIs. Current studies show the effectiveness of statistical methods for variable selection to extract relevant information deriving from multivariate datasets. Principal component analysis (PCA) has been mainly used, however supervised multivariate methods and regressive techniques are progressively being evaluated (Armenise et al., 2013; de Paul Obade et al., 2016; Pulido Moncada et al., 2014). The present study explores the effectiveness of partial least square regression (PLSR) in selecting critical soil variables, using a dataset comparing conventional tillage and sod-seeding on durum wheat. The results were compared to those obtained using PCA and stepwise discriminant analysis (SDA). The soil data derived from a long-term field experiment in Southern Italy. On samples collected in April 2015, the following set of variables was quantified: (i) chemical: total organic carbon and nitrogen (TOC and TN), alkali-extractable C (TEC and humic substances - HA-FA), water extractable N and organic C (WEN and WEOC), Olsen extractable P, exchangeable cations, pH and EC; (ii) physical: texture, dry bulk density (BD), macroporosity (Pmac), air capacity (AC), and relative field capacity (RFC); (iii) biological: carbon of the microbial biomass quantified with the fumigation-extraction method. PCA and SDA were previously applied to the multivariate dataset (Stellacci et al., 2016). PLSR was carried out on mean centered and variance scaled data of predictors (soil variables) and response (wheat yield) variables using the PLS procedure of SAS/STAT. 
In addition, variable importance for projection (VIP) statistics was used to quantitatively assess the predictors most relevant for response variable estimation and then for variable selection (Andersen and Bro, 2010). PCA and SDA returned TOC and RFC as influential variables both on the set of chemical and physical data analyzed separately as well as on the whole dataset (Stellacci et al., 2016). Highly weighted variables in PCA were also TEC, followed by K, and AC, followed by Pmac and BD, in the first PC (41.2% of total variance); Olsen P and HA-FA in the second PC (12.6%), Ca in the third (10.6%) component. Variables enabling maximum discrimination among treatments for SDA were WEOC, on the whole dataset, humic substances, followed by Olsen P, EC and clay, in the separate data analyses. The highest PLS-VIP statistics were recorded for Olsen P and Pmac, followed by TOC, TEC, pH and Mg for chemical variables and clay, RFC and AC for the physical variables. Results show that different methods may provide different ranking of the selected variables and the presence of a response variable, in regressive techniques, may affect variable selection. Further investigation with different response variables and with multi-year datasets would make it possible to better define the advantages and limits of single or combined approaches. Acknowledgment The work was supported by the projects "BIOTILLAGE, approcci innovative per il miglioramento delle performances ambientali e produttive dei sistemi cerealicoli no-tillage", financed by PSR-Basilicata 2007-2013, and "DESERT, Low-cost water desalination and sensor technology compact module" financed by ERANET-WATERWORKS 2014. References Andersen C.M. and Bro R., 2010. Variable selection in regression - a tutorial. Journal of Chemometrics, 24:728-737. Armenise et al., 2013. Developing a soil quality index to compare soil fitness for agricultural use under different managements in the Mediterranean environment.
Soil and Tillage Research, 130:91-98. de Paul Obade et al., 2016. A standardized soil quality index for diverse field conditions. Sci. Total Env. 541:424-434. Pulido Moncada et al., 2014. Data-driven analysis of soil quality indicators using limited data. Geoderma, 235:271-278. Stellacci et al., 2016. Comparison of different multivariate methods to select key soil variables for soil quality indices computation. XLV Congress of the Italian Society of Agronomy (SIA), Sassari, 20-22 September 2016.
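    For readers unfamiliar with the VIP statistic mentioned in this record, a commonly used definition for a single response is sketched below. The notation is an assumption introduced here and does not come from the abstract: p is the number of predictors, A the number of latent components, w_a the weight vector, t_a the score vector and q_a the y-loading of component a.

```latex
\mathrm{VIP}_j
  = \sqrt{\, p \,
      \frac{\sum_{a=1}^{A} SS_a \left( \dfrac{w_{ja}}{\lVert w_a \rVert} \right)^{2}}
           {\sum_{a=1}^{A} SS_a} \,},
\qquad
SS_a = q_a^{2}\, t_a^{\top} t_a .
```

    Under this definition the squared VIP values sum to p over the predictors, so their mean is 1, which is why a value near 1 is often used as a rule-of-thumb threshold for relevance.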

  3. What’s Wrong with the Murals at the Mogao Grottoes: A Near-Infrared Hyperspectral Imaging Method

    PubMed Central

    Sun, Meijun; Zhang, Dong; Wang, Zheng; Ren, Jinchang; Chai, Bolong; Sun, Jizhou

    2015-01-01

    Although a significant amount of work has been performed to preserve the ancient murals in the Mogao Grottoes by Dunhuang Cultural Research, non-contact methods need to be developed to effectively evaluate the degree of flaking of the murals. In this study, we propose to evaluate the flaking by automatically analyzing hyperspectral images that were scanned at the site. Murals with various degrees of flaking were scanned in the 126th cave using a near-infrared (NIR) hyperspectral camera with a spectral range of approximately 900 to 1700 nm. The regions of interest (ROIs) of the murals were manually labeled and grouped into four levels: normal, slight, moderate, and severe. The average spectral data from each ROI and its group label were used to train our classification model. To predict the degree of flaking, we adopted four algorithms: deep belief networks (DBNs), partial least squares regression (PLSR), principal component analysis with a support vector machine (PCA + SVM) and principal component analysis with an artificial neural network (PCA + ANN). The experimental results show the effectiveness of our method. In particular, better results are obtained using DBNs when the training data contain a significant amount of striping noise. PMID:26394926

  4. FTIR microspectroscopy for rapid screening and monitoring of polyunsaturated fatty acid production in commercially valuable marine yeasts and protists.

    PubMed

    Vongsvivut, Jitraporn; Heraud, Philip; Gupta, Adarsha; Puri, Munish; McNaughton, Don; Barrow, Colin J

    2013-10-21

    The increase in polyunsaturated fatty acid (PUFA) consumption has prompted research into alternative resources other than fish oil. In this study, a new approach based on focal-plane-array Fourier transform infrared (FPA-FTIR) microspectroscopy and multivariate data analysis was developed for the characterisation of some marine microorganisms. Cell and lipid compositions in lipid-rich marine yeasts collected from the Australian coast were characterised in comparison to a commercially available PUFA-producing marine fungoid protist, thraustochytrid. Multivariate classification methods provided good discriminative accuracy evidenced from (i) separation of the yeasts from thraustochytrids and distinct spectral clusters among the yeasts that conformed well to their biological identities, and (ii) correct classification of yeasts from a totally independent set using cross-validation testing. The findings further indicated additional capability of the developed FPA-FTIR methodology, when combined with partial least squares regression (PLSR) analysis, for rapid monitoring of lipid production in one of the yeasts during the growth period, which was achieved at a high accuracy compared to the results obtained from the traditional lipid analysis based on gas chromatography. The developed FTIR-based approach when coupled to programmable withdrawal devices and a cytocentrifugation module would have strong potential as a novel online monitoring technology suited for bioprocessing applications and large-scale production.

  5. Sequential (step-by-step) detection, identification and quantitation of extra virgin olive oil adulteration by chemometric treatment of chromatographic profiles.

    PubMed

    Capote, F Priego; Jiménez, J Ruiz; de Castro, M D Luque

    2007-08-01

    An analytical method for the sequential detection, identification and quantitation of extra virgin olive oil adulteration with four edible vegetable oils--sunflower, corn, peanut and coconut oils--is proposed. The only data required for this method are the results obtained from an analysis of the lipid fraction by gas chromatography-mass spectrometry. A total number of 566 samples (pure oils and samples of adulterated olive oil) were used to develop the chemometric models, which were designed to accomplish, step-by-step, the three aims of the method: to detect whether an olive oil sample is adulterated, to identify the type of adulterant used in the fraud, and to determine how much adulterant is in the sample. Qualitative analysis was carried out via two chemometric approaches--soft independent modelling of class analogy (SIMCA) and K nearest neighbours (KNN)--both approaches exhibited prediction abilities that were always higher than 91% for adulterant detection and 88% for type of adulterant identification. Quantitative analysis was based on partial least squares regression (PLSR), which yielded R2 values of >0.90 for calibration and validation sets and thus made it possible to determine adulteration with excellent precision according to the Shenk criteria.

  6. Profiling Taste and Aroma Compound Metabolism during Apricot Fruit Development and Ripening

    PubMed Central

    Xi, Wanpeng; Zheng, Huiwen; Zhang, Qiuyun; Li, Wenhui

    2016-01-01

    Sugars, organic acids and volatiles of apricot were determined by HPLC and GC-MS during fruit development and ripening, and the key taste and aroma components were identified by integrating flavor compound contents with consumers’ evaluation. Sucrose and glucose were the major sugars in apricot fruit. The contents of all sugars increased rapidly, and the accumulation pattern of sugars converted from glucose-predominated to sucrose-predominated during fruit development and ripening. Sucrose synthase (SS), sorbitol oxidase (SO) and sorbitol dehydrogenase (SDH) are under tight developmental control and they might play important roles in sugar accumulation. Almost all organic acids identified increased during early development and then decreased rapidly. During early development, fruit mainly accumulated quinate and malate, with the increase of citrate after maturation, and quinate, malate and citrate were the predominant organic acids at the ripening stage. The odor activity values (OAV) of aroma volatiles showed that 18 aroma compounds were the characteristic components of apricot fruit. Aldehydes and terpenes decreased significantly during the whole development period, whereas lactones and apocarotenoids significantly increased with fruit ripening. The partial least squares regression (PLSR) results revealed that β-ionone, γ-decalactone, sucrose and citrate are the key characteristic flavor factors contributing to consumer acceptance. Carotenoid cleavage dioxygenases (CCD) may be involved in β-ionone formation in apricot fruit. PMID:27347931

  7. Tandem Laser Induced Breakdown Spectroscopy (LIBS), Laser Ablation Inductively Coupled Plasma Mass Spectroscopy (LA-ICP-MS) and/or Laser Ablation Inductively Coupled Plasma Optical Emission Spectroscopy (LA-ICP-OES) for the analysis of samples of geological interest

    NASA Astrophysics Data System (ADS)

    Oropeza, D.

    2016-12-01

    A highly innovative laser ablation sampling instrument (J200 Tandem LA - LIBS) that combines the capabilities and analytical benefits of LIBS, LA-ICP-MS and LA-ICP-OES was used for micrometer-scale, spatially-resolved, elemental analysis of a wide variety of samples of geological interest. Data were collected using ablation systems consisting of nanosecond (Nd:YAG operated at 266 nm) and femtosecond (1030 and 343 nm) lasers. An ICCD LIBS detector and a quadrupole-based mass spectrometer were selected for LIBS and ICP-MS detection, respectively. This tandem instrument allows simultaneous determination of major and minor elements (for example, Si, Ca, Na, and Al) and trace elements (such as Li, Ce, Cr, Sr, Y, Zn, and Zr, among others). The research also focused on elemental mapping and calibration strategies, specifically the use of emission and mass spectra for multivariate data analysis. Partial Least Square Regression (PLSR) is shown to minimize and compensate for matrix effects in the emission and mass spectra, improving quantitative analysis by LIBS and LA-ICP-MS, respectively. The study provides a benchmark to evaluate analytical results for more complex geological sample matrices.

  8. Online low-field NMR spectroscopy for process control of an industrial lithiation reaction-automated data analysis.

    PubMed

    Kern, Simon; Meyer, Klas; Guhl, Svetlana; Gräßer, Patrick; Paul, Andrea; King, Rudibert; Maiwald, Michael

    2018-05-01

    Monitoring specific chemical properties is the key to chemical process control. Today, mainly optical online methods are applied, which require time- and cost-intensive calibration effort. NMR spectroscopy, with its advantage being a direct comparison method without need for calibration, has a high potential for enabling closed-loop process control while exhibiting short set-up times. Compact NMR instruments make NMR spectroscopy accessible in industrial and rough environments for process monitoring and advanced process control strategies. We present a fully automated data analysis approach which is completely based on physically motivated spectral models as first principles information (indirect hard modeling-IHM) and applied it to a given pharmaceutical lithiation reaction in the framework of the European Union's Horizon 2020 project CONSENS. Online low-field NMR (LF NMR) data was analyzed by IHM with low calibration effort, compared to a multivariate PLS-R (partial least squares regression) approach, and both validated using online high-field NMR (HF NMR) spectroscopy. Graphical abstract NMR sensor module for monitoring of the aromatic coupling of 1-fluoro-2-nitrobenzene (FNB) with aniline to 2-nitrodiphenylamine (NDPA) using lithium-bis(trimethylsilyl) amide (Li-HMDS) in continuous operation. Online 43.5 MHz low-field NMR (LF) was compared to 500 MHz high-field NMR spectroscopy (HF) as reference method.

  9. Quantitative Determination of Fusarium proliferatum Concentration in Intact Garlic Cloves Using Near-Infrared Spectroscopy.

    PubMed

    Tamburini, Elena; Mamolini, Elisabetta; De Bastiani, Morena; Marchetti, Maria Gabriella

    2016-07-15

    Fusarium proliferatum is considered to be a pathogen of many economically important plants, including garlic. The objective of this research was to apply near-infrared spectroscopy (NIRS) to rapidly determine fungal concentration in intact garlic cloves, avoiding the laborious and time-consuming procedures of traditional assays. Preventive detection of infection before seeding is of great interest for farmers, because it could avoid serious losses of yield during harvesting and storage. Spectra were collected on 95 garlic cloves, divided into five classes of infection (from 1-healthy to 5-very highly infected) in the range of fungal concentration 0.34-7231.15 ppb. Calibration and cross validation models were developed with partial least squares regression (PLSR) on pretreated spectra (standard normal variate, SNV, and derivatives), providing good accuracy in prediction, with a coefficient of determination (R²) of 0.829 and 0.774, respectively, a standard error of calibration (SEC) of 615.17 ppb, and a standard error of cross validation (SECV) of 717.41 ppb. The calibration model was then used to predict fungal concentration in unknown samples, peeled and unpeeled. The results showed that NIRS could be used as a reliable tool to directly detect and quantify F. proliferatum infection in peeled intact garlic cloves, but the presence of the external peel strongly affected the prediction reliability.
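    The spectral pretreatments named in this record (standard normal variate and derivatives) are simple per-spectrum operations. The sketch below is illustrative only: it assumes SNV with the sample standard deviation and approximates the derivative by a plain first difference, whereas the authors' exact settings are not given in the abstract. The function names and absorbance values are hypothetical.

```python
import math


def snv(spectrum):
    """Standard normal variate: centre one spectrum and scale it by its own SD."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    # Sample standard deviation (n - 1 denominator), an assumption made here.
    sd = math.sqrt(sum((x - mean) ** 2 for x in spectrum) / (n - 1))
    return [(x - mean) / sd for x in spectrum]


def first_difference(spectrum):
    """Crude first derivative: difference between adjacent points."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


raw = [0.52, 0.55, 0.61, 0.70, 0.66]           # made-up absorbance values
corrected = snv(raw)
derivative = first_difference(corrected)

# SNV removes additive offsets and multiplicative scaling of a spectrum.
shifted = snv([2.0 * x + 1.0 for x in raw])
```

    Because SNV is applied to each spectrum independently, it needs no reference spectrum, which is one reason it is a common first step before PLSR calibration.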

  10. The rapid measurement of soil carbon stock using near-infrared technology

    NASA Astrophysics Data System (ADS)

    Kusumo, B. H.; Sukartono; Bustan

    2018-03-01

    Because the soil pool stores three times more carbon (C) than the atmospheric pool, the depletion of C stock in the soil will significantly increase the concentration of CO2 in the atmosphere, causing global warming. However, the monitoring or measurement of soil C stock using conventional procedures is time-consuming and expensive. A rapid, non-destructive technique that is simple and does not need chemical substances is therefore required. This research is aimed at testing whether near-infrared (NIR) technology is able to rapidly measure C stock in the soil. Soil samples were collected from agricultural land in the sub-district of Kayangan, North Lombok, Indonesia. The coordinates of the samples were recorded. Parts of the samples were analyzed using the conventional procedure (Walkley and Black) and other parts were scanned using near-infrared spectroscopy (NIRS) for soil spectral collection. Partial Least Square Regression (PLSR) was used to develop models from soil C data measured by conventional analysis and from spectral data scanned by NIRS. The best model was moderately successful in measuring soil C stock in the study area in North Lombok. This indicates that NIR technology can be further used to monitor changes in soil C stock.

  11. Rapid non-destructive assessment of pork edible quality by using VIS/NIR spectroscopic technique

    NASA Astrophysics Data System (ADS)

    Zhang, Leilei; Peng, Yankun; Dhakal, Sagar; Song, Yulin; Zhao, Juan; Zhao, Songwei

    2013-05-01

    The objective of this research was to develop a rapid non-destructive method to evaluate the edible quality of chilled pork. A total of 42 samples were packed in sealed plastic bags and stored at 4°C for 1 to 21 days. Reflectance spectra were collected with a visible/near-infrared spectroscopy system in the range of 400 nm to 1100 nm. Microbiological, physicochemical and organoleptic characteristics such as the total viable counts (TVC), total volatile basic-nitrogen (TVB-N), pH value and color parameter L* were determined to appraise pork edible quality. Savitzky-Golay (SG) smoothing based on five and eleven smoothing points, multiplicative scatter correction (MSC) and first derivative pre-processing methods were employed to reduce the spectral noise. Support vector machines (SVM) and partial least square regression (PLSR) were applied to establish prediction models using the de-noised spectra. Linear relationships were established between the VIS/NIR spectra and TVC, TVB-N, pH and color parameter L*, giving prediction results with Rv of 0.931, 0.844, 0.805 and 0.852, respectively. The results demonstrated that the VIS/NIR spectroscopy technique combined with SVM possesses a powerful assessment capability. It can provide a potential tool for detecting pork edible quality rapidly and non-destructively.

  12. Determination of persimmon leaf chloride contents using near-infrared spectroscopy (NIRS).

    PubMed

    de Paz, José Miguel; Visconti, Fernando; Chiaravalle, Mara; Quiñones, Ana

    2016-05-01

    Early diagnosis of specific chloride toxicity in persimmon trees requires the reliable and fast determination of the leaf chloride content, which is usually performed by means of a cumbersome, expensive and time-consuming wet analysis. A methodology has been developed in this study as an alternative to determine chloride in persimmon leaves using near-infrared spectroscopy (NIRS) in combination with multivariate calibration techniques. Based on a training dataset of 134 samples, a predictive model was developed from their NIR spectral data. For modelling, the partial least squares regression (PLSR) method was used. The best model was obtained with the first derivative of the apparent absorbance and using just 10 latent components. In the subsequent external validation carried out with 35 external samples this model reached r² = 0.93, RMSE = 0.16% and RPD = 3.6, with standard error of 0.026% and bias of -0.05%. From these results, the model based on NIR spectral readings can be used for speeding up the laboratory determination of chloride in persimmon leaves with only a modest loss of precision. The intermolecular interaction between chloride ions and the peptide bonds in leaf proteins through hydrogen bonding, i.e. N-H···Cl, explains why chloride can be determined on the basis of NIR spectra.
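    The external-validation figures quoted in this record (r², RMSE, RPD and bias) can all be computed from paired reference and predicted values. The sketch below is a hedged illustration: it assumes bias is the mean of predicted minus reference, r² is the squared Pearson correlation, and RPD is the sample standard deviation of the reference values divided by RMSE. Definitions of RPD vary between papers, and the abstract does not state which one was used. The function name and numbers are hypothetical.

```python
import math


def validation_stats(reference, predicted):
    """Bias, RMSE, RPD and squared Pearson correlation for a validation set."""
    n = len(reference)
    residuals = [p - r for r, p in zip(reference, predicted)]
    bias = sum(residuals) / n
    rmse = math.sqrt(sum(e * e for e in residuals) / n)

    mean_ref = sum(reference) / n
    mean_pred = sum(predicted) / n
    ss_ref = sum((r - mean_ref) ** 2 for r in reference)
    ss_pred = sum((p - mean_pred) ** 2 for p in predicted)
    cov = sum((r - mean_ref) * (p - mean_pred)
              for r, p in zip(reference, predicted))

    sd_ref = math.sqrt(ss_ref / (n - 1))       # sample SD of the reference values
    r2 = cov * cov / (ss_ref * ss_pred)        # squared Pearson correlation
    return {"bias": bias, "rmse": rmse, "rpd": sd_ref / rmse, "r2": r2}


# Made-up values: every prediction is 0.1 too high.
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 2.1, 3.1, 4.1, 5.1]
stats = validation_stats(reference, predicted)
```

    The toy data show why several statistics are reported together: a constant offset leaves r² at 1 while bias and RMSE are both non-zero.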

  13. Mapping The Temporal and Spatial Variability of Soil Moisture Content Using Proximal Soil Sensing

    NASA Astrophysics Data System (ADS)

    Virgawati, S.; Mawardi, M.; Sutiarso, L.; Shibusawa, S.; Segah, H.; Kodaira, M.

    2018-05-01

    In studies related to soil optical properties, it has been proven that visual and NIR soil spectral response can predict soil moisture content (SMC) using proper data analysis techniques. SMC is one of the most important soil properties influencing most physical, chemical, and biological soil processes. The problem is how to provide reliable, fast and inexpensive information on SMC in the subsurface from numerous soil samples and repeated measurements. The use of spectroscopy technology has emerged as a rapid and low-cost tool for extensive investigation of soil properties. The objective of this research was to develop calibration models based on laboratory Vis-NIR spectroscopy to estimate the SMC at four different growth stages of the soybean crop in Yogyakarta Province. An ASD Field-spectrophotoradiometer was used to measure the reflectance of soil samples. Partial least square regression (PLSR) was performed to establish the relationship between the SMC and the Vis-NIR soil reflectance spectra. The selected calibration model was used to predict SMC in new samples. The temporal and spatial variability of SMC was presented in digital maps. The results revealed that the calibration model was excellent for SMC prediction. Vis-NIR spectroscopy was a reliable tool for the prediction of SMC.

  14. Applying Fourier Transform Mid Infrared Spectroscopy to Detect the Adulteration of Salmo salar with Oncorhynchus mykiss

    PubMed Central

    Moreira, Maria João

    2018-01-01

    The aim of this study was to evaluate the potential of Fourier transform infrared (FTIR) spectroscopy coupled with chemometric methods to detect fish adulteration. Muscles of Atlantic salmon (Salmo salar) (SS) and salmon trout (Oncorhynchus mykiss) (OM) were mixed in different percentages and transformed into mini-burgers. These were stored at 3 °C, then examined at 0, 72, 160, and 240 h for deteriorative microorganisms. The mini-burgers were submitted to Soxhlet extraction, following which lipid extracts were analyzed by FTIR. The principal component analysis (PCA) described the studied adulteration using four principal components with an explained variance of 95.60%. PCA showed that the absorbances at 721, 1097, 1370, 1464, 1655, 2805, 2935, and 3009 cm−1 may be attributed to biochemical fingerprints related to differences between SS and OM. The partial least squares regression (PLS-R) predicted the presence/absence of adulteration in fish samples of an external set with high accuracy. The proposed methods have the advantage of allowing quick measurements, regardless of the storage time of the adulterated fish. FTIR combined with chemometrics showed that a methodology to identify the adulteration of SS with OM can be established, even when stored for different periods of time. PMID:29621135

  15. Spectroscopic sensitivity of real-time, rapidly induced phytochemical change in response to damage.

    PubMed

    Couture, John J; Serbin, Shawn P; Townsend, Philip A

    2013-04-01

    An ecological consequence of plant-herbivore interactions is the phytochemical induction of defenses in response to insect damage. Here, we used reflectance spectroscopy to characterize the foliar induction profile of cardenolides in Asclepias syriaca in response to damage, tracked in vivo changes and examined the influence of multiple plant traits on cardenolide concentrations. Foliar cardenolide concentrations were measured at specific time points following damage to capture their induction profile. Partial least-squares regression (PLSR) modeling was employed to calibrate cardenolide concentrations to reflectance spectroscopy. In addition, subsets of plants were either repeatedly sampled to track in vivo changes or modified to reduce latex flow to damaged areas. Cardenolide concentrations and the induction profile of A. syriaca were well predicted using models derived from reflectance spectroscopy, and this held true for repeatedly sampled plants. Correlations between cardenolides and other foliar-related variables were weak or not significant. Plant modification for latex reduction inhibited an induced cardenolide response. Our findings show that reflectance spectroscopy can characterize rapid phytochemical changes in vivo. We used reflectance spectroscopy to identify the mechanisms behind the production of plant secondary metabolites, simultaneously characterizing multiple foliar constituents. In this case, cardenolide induction appears to be largely driven by enhanced latex delivery to leaves following damage. © 2013 The Authors. New Phytologist © 2013 New Phytologist Trust.

  16. Near infrared spectroscopy combined with multivariate analysis for monitoring the ethanol precipitation process of fraction I + II + III supernatant in human albumin separation

    NASA Astrophysics Data System (ADS)

    Li, Can; Wang, Fei; Zang, Lixuan; Zang, Hengchang; Alcalà, Manel; Nie, Lei; Wang, Mingyu; Li, Lian

    2017-03-01

    Nowadays, as a powerful process analytical tool, near infrared spectroscopy (NIRS) has been widely applied in process monitoring. In the present work, NIRS combined with multivariate analysis was used to monitor the ethanol precipitation process of fraction I + II + III (FI + II + III) supernatant in human albumin (HA) separation to achieve qualitative and quantitative monitoring at the same time and assure the product's quality. First, a qualitative model was established by using principal component analysis (PCA) with samples from 6 of 8 normal batches, and evaluated by the remaining 2 normal batches and 3 abnormal batches. The results showed that the first principal component (PC1) score chart could be successfully used for fault detection and diagnosis. Then, two quantitative models were built with 6 of 8 normal batches to determine the content of the total protein (TP) and HA separately by using a partial least squares regression (PLS-R) strategy, and the models were validated by the 2 remaining normal batches. The determination coefficient of validation (Rp²), root mean square error of cross validation (RMSECV), root mean square error of prediction (RMSEP) and ratio of performance deviation (RPD) were 0.975, 0.501 g/L, 0.465 g/L and 5.57 for TP, and 0.969, 0.530 g/L, 0.341 g/L and 5.47 for HA, respectively. The results showed that the established models could give a rapid and accurate measurement of the content of TP and HA. The results of this study indicated that NIRS is an effective tool and could be successfully used for simultaneous qualitative and quantitative monitoring of the ethanol precipitation process of FI + II + III supernatant. This research has significant reference value for assuring the quality and improving the recovery ratio of HA at industrial scale by using NIRS.

  17. Near infrared spectroscopy combined with multivariate analysis for monitoring the ethanol precipitation process of fraction I+II+III supernatant in human albumin separation.

    PubMed

    Li, Can; Wang, Fei; Zang, Lixuan; Zang, Hengchang; Alcalà, Manel; Nie, Lei; Wang, Mingyu; Li, Lian

    2017-03-15

    Nowadays, as a powerful process analytical tool, near infrared spectroscopy (NIRS) has been widely applied in process monitoring. In the present work, NIRS combined with multivariate analysis was used to monitor the ethanol precipitation process of fraction I+II+III (FI+II+III) supernatant in human albumin (HA) separation to achieve qualitative and quantitative monitoring at the same time and assure the product's quality. First, a qualitative model was established by using principal component analysis (PCA) with samples from 6 of 8 normal batches, and evaluated with the remaining 2 normal batches and 3 abnormal batches. The results showed that the first principal component (PC1) score chart could be successfully used for fault detection and diagnosis. Then, two quantitative models were built with 6 of 8 normal batches to determine the content of the total protein (TP) and HA separately by using a partial least squares regression (PLS-R) strategy, and the models were validated with the 2 remaining normal batches. The determination coefficient of validation (Rp²), root mean square error of cross validation (RMSECV), root mean square error of prediction (RMSEP) and ratio of performance deviation (RPD) were 0.975, 0.501 g/L, 0.465 g/L and 5.57 for TP, and 0.969, 0.530 g/L, 0.341 g/L and 5.47 for HA, respectively. The results showed that the established models could give a rapid and accurate measurement of the content of TP and HA. The results of this study indicated that NIRS is an effective tool and could be successfully used for simultaneous qualitative and quantitative monitoring of the ethanol precipitation process of FI+II+III supernatant. This research has significant reference value for assuring the quality and improving the recovery ratio of HA at industrial scale by using NIRS. Copyright © 2016 Elsevier B.V. All rights reserved.

  18. What Is the Role of Land-Use Compositions and Spatial Configurations in Sediment Yield from Mountainous Watershed?

    NASA Astrophysics Data System (ADS)

    Shi, Z. H.

    2014-12-01

    There are strong ties between land use and sediment yield in watersheds. Many studies have used multivariate regression techniques to explore the response of sediment yield to land-use compositions and spatial configurations in watersheds. However, one issue with the use of conventional statistical methods to address relationships between land-use compositions and spatial configurations and sediment yield is multicollinearity. This paper examines the combined effects of land-use compositions and land-use spatial configurations of the watershed on the specific sediment yield of the Upper Du River watershed (8,973 km²) in China using the Soil and Water Assessment Tool (SWAT) and partial least-squares regression (PLSR). The land-use compositions and spatial configurations of the watershed were calculated at the sub-watershed scale. The sediment yields from the sub-watersheds were evaluated using the SWAT model. The first-order factors were identified by calculating the variable importance for the projection (VIP). The results revealed that the land-use compositions exerted the largest effects on the specific sediment yield and explained 61.2% of the variation in the specific sediment yield. Land-use spatial configurations were also found to have a large effect on the specific sediment yield and explained 21.7% of the observed variation in the specific sediment yield. The following are the dominant first-order factors of the specific sediment yield at the sub-watershed scale: the areal percentages of agriculture and forest, patch density, the value of Shannon's diversity index, and contagion. The VIP values suggested that Shannon's diversity index and contagion are important factors for sediment delivery.
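
    The abstract does not state how VIP was computed. For orientation, a commonly used definition for a PLSR model with p predictors and A components is sketched below; the symbols are introduced here for illustration and are not taken from the paper. Here w_{ja} is the weight of predictor j on component a, and SSY_a is the sum of squares of the response explained by component a.

```latex
\mathrm{VIP}_j \;=\; \sqrt{\, p \,
  \frac{\displaystyle\sum_{a=1}^{A} \mathrm{SSY}_a
        \left( \frac{w_{ja}}{\lVert \mathbf{w}_a \rVert} \right)^{2}}
       {\displaystyle\sum_{a=1}^{A} \mathrm{SSY}_a} }
```

    Under this definition the squared VIP values average to one across predictors, which is why a threshold near 1 is often used as a rule of thumb for flagging influential variables.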

  19. Leaf aging of Amazonian canopy trees as revealed by spectral and physiochemical measurements.

    PubMed

    Chavana-Bryant, Cecilia; Malhi, Yadvinder; Wu, Jin; Asner, Gregory P; Anastasiou, Athanasios; Enquist, Brian J; Cosio Caravasi, Eric G; Doughty, Christopher E; Saleska, Scott R; Martin, Roberta E; Gerard, France F

    2017-05-01

    Leaf aging is a fundamental driver of changes in leaf traits, thereby regulating ecosystem processes and remotely sensed canopy dynamics. We explore leaf reflectance as a tool to monitor leaf age and develop a spectra-based partial least squares regression (PLSR) model to predict age using data from a phenological study of 1099 leaves from 12 lowland Amazonian canopy trees in southern Peru. Results demonstrated monotonic decreases in leaf water (LWC) and phosphorus (Pmass) contents and an increase in leaf mass per unit area (LMA) with age across trees; leaf nitrogen (Nmass) and carbon (Cmass) contents showed monotonic but tree-specific age responses. We observed large age-related variation in leaf spectra across trees. A spectra-based model was more accurate in predicting leaf age (R² = 0.86; percent root mean square error (%RMSE) = 33) compared with trait-based models using single (R² = 0.07-0.73; %RMSE = 7-38) and multiple (R² = 0.76; %RMSE = 28) predictors. Spectra- and trait-based models established a physiochemical basis for the spectral age model. Vegetation indices (VIs) including the normalized difference vegetation index (NDVI), enhanced vegetation index 2 (EVI2), normalized difference water index (NDWI) and photosynthetic reflectance index (PRI) were all age-dependent. This study highlights the importance of leaf age as a mediator of leaf traits, provides evidence of age-related leaf reflectance changes that have important impacts on VIs used to monitor canopy dynamics and productivity and proposes a new approach to predicting and monitoring leaf age with important implications for remote sensing. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.

  20. Estimating soil zinc concentrations using reflectance spectroscopy

    NASA Astrophysics Data System (ADS)

    Sun, Weichao; Zhang, Xia

    2017-06-01

    Soil contamination by heavy metals has been an increasingly severe threat to the natural environment and human health. Efficient investigation of contamination status is essential to soil protection and remediation. Visible and near-infrared reflectance spectroscopy (VNIRS) has been regarded as an alternative for monitoring soil contamination by heavy metals. Generally, the entire VNIR spectral bands are employed to estimate heavy metal concentration, which lacks interpretability and requires much calculation. In this study, 74 soil samples were collected from Hunan Province, China and their reflectance spectra were used to estimate zinc (Zn) concentration in soil. Organic matter and clay minerals have strong adsorption for Zn in soil. Spectral bands associated with organic matter and clay minerals were used for estimation with genetic algorithm based partial least square regression (GA-PLSR). The entire VNIR spectral bands, the bands associated with organic matter and the bands associated with clay minerals were incorporated as comparisons. Root mean square error of prediction, residual prediction deviation, and coefficient of determination (R²) for the model developed using combined bands of organic matter and clay minerals were 329.65 mg kg⁻¹, 1.96 and 0.73, which is better than 341.88 mg kg⁻¹, 1.89 and 0.71 for the entire VNIR spectral bands, 492.65 mg kg⁻¹, 1.31 and 0.40 for the organic matter, and 430.26 mg kg⁻¹, 1.50 and 0.54 for the clay minerals. Additionally, in consideration of atmospheric water vapor absorption in field spectra measurement, combined bands of organic matter and absorption around 2200 nm were used for estimation and achieved high prediction accuracy, with R² reaching 0.640. The results indicate huge potential of soil reflectance spectroscopy in estimating Zn concentrations in soil.

  1. Robust new NIRS coupled with multivariate methods for the detection and quantification of tallow adulteration in clarified butter samples.

    PubMed

    Mabood, Fazal; Abbas, Ghulam; Jabeen, Farah; Naureen, Zakira; Al-Harrasi, Ahmed; Hamaed, Ahmad M; Hussain, Javid; Al-Nabhani, Mahmood; Al Shukaili, Maryam S; Khan, Alamgir; Manzoor, Suryyia

    2018-03-01

    Cows' butterfat may be adulterated with animal fat materials like tallow which causes increased serum cholesterol and triglycerides levels upon consumption. There is no reliable technique to detect and quantify tallow adulteration in butter samples in a feasible way. In this study a highly sensitive near-infrared (NIR) spectroscopy combined with chemometric methods was developed to detect as well as quantify the level of tallow adulterant in clarified butter samples. For this investigation the pure clarified butter samples were intentionally adulterated with tallow at the following percentage levels: 1%, 3%, 5%, 7%, 9%, 11%, 13%, 15%, 17% and 20% (wt/wt). Altogether 99 clarified butter samples were used including nine pure samples (un-adulterated clarified butter) and 90 clarified butter samples adulterated with tallow. Each sample was analysed by using NIR spectroscopy in the reflection mode in the range 10,000-4000 cm⁻¹, at 2 cm⁻¹ resolution and using the transflectance sample accessory which provided a total path length of 0.5 mm. Chemometric models including principal components analysis (PCA), partial least-squares discriminant analysis (PLSDA), and partial least-squares regressions (PLSR) were applied for statistical treatment of the obtained NIR spectral data. The PLSDA model was employed to differentiate pure butter samples from those adulterated with tallow. The employed model was then externally cross-validated by using a test set which included 30% of the total butter samples. The excellent performance of the model was proved by the low RMSEP value of 1.537% and the high correlation factor of 0.95. This newly developed method is robust, non-destructive, highly sensitive, and economical with very minor sample preparation and good ability to quantify less than 1.5% of tallow adulteration in clarified butter samples.

  2. A chemometrics approach applied to Fourier transform infrared spectroscopy (FTIR) for monitoring the spoilage of fresh salmon (Salmo salar) stored under modified atmospheres.

    PubMed

    Saraiva, C; Vasconcelos, H; de Almeida, José M M M

    2017-01-16

    The aim of this work was to investigate the potential of Fourier transform infrared spectroscopy (FTIR) to detect and predict the bacterial load of salmon fillets (Salmo salar) stored at 3, 8 and 30°C under three packaging conditions: air packaging (AP) and two modified atmospheres constituted by a mixture of 50% N₂/40% CO₂/10% O₂ with lemon juice (MAPL) and without lemon juice (MAP). Fresh salmon samples were periodically examined for total viable counts (TVC), specific spoilage organisms (SSO) counts, pH, FTIR and sensory assessment of freshness. Principal components analysis (PCA) allowed identification of the wavenumbers potentially correlated with the spoilage process. Linear discriminant analysis (LDA) of infrared spectral data was performed to support sensory data and to accurately identify sample freshness. The effect of the packaging atmospheres was assessed by microbial enumeration and LDA was used to determine sample packaging from the measured infrared spectra. It was verified that modified atmospheres can decrease significantly the bacterial load of fresh salmon. Lemon juice combined with MAP showed a more pronounced delay in the growth of Brochothrix thermosphacta, Photobacterium phosphoreum, psychrotrophs and H₂S producers. Partial least squares regression (PLS-R) allowed estimates of TVC and psychrotrophs, lactic acid bacteria, molds and yeasts, Brochothrix thermosphacta, Enterobacteriaceae, Pseudomonas spp. and H₂S producer counts from the infrared spectral data. For TVC, the root mean square error of prediction (RMSEP) value was 0.78 log cfu g⁻¹ for an external set of samples. According to the results, FTIR can be used as a reliable, accurate and fast method for real time freshness evaluation of salmon fillets stored under different temperatures and packaging atmospheres. Copyright © 2016 Elsevier B.V. All rights reserved.

  3. Centrifugal ultrafiltration of human serum for improving immunoglobulin A quantification using attenuated total reflectance infrared spectroscopy.

    PubMed

    Elsohaby, Ibrahim; McClure, J Trenton; Riley, Christopher B; Bryanton, Janet; Bigsby, Kathryn; Shaw, R Anthony

    2018-02-20

    Attenuated total reflectance infrared (ATR-IR) spectroscopy is a simple, rapid and cost-effective method for the analysis of serum. However, the complex nature of serum remains a limiting factor to the reliability of this method. We investigated the benefits of coupling centrifugal ultrafiltration with ATR-IR spectroscopy for quantification of human serum IgA concentration. Human serum samples (n = 196) were analyzed for IgA using an immunoturbidimetric assay. ATR-IR spectra were acquired for whole serum samples and for the retentate (residue) reconstituted with saline following 300 kDa centrifugal ultrafiltration. IR-based analytical methods were developed for each of the two spectroscopic datasets, and the accuracy of each of the two methods compared. Analytical methods were based upon partial least squares regression (PLSR) calibration models - one with 5 PLS factors (for whole serum) and the second with 9 PLS factors (for the reconstituted retentate). Comparison of the two sets of IR-based analytical results to reference IgA values revealed improvements in the Pearson correlation coefficient (from 0.66 to 0.76), and the root mean squared error of prediction in IR-based IgA concentrations (from 102 to 79 mg/dL) for the ultrafiltration retentate-based method as compared to the method built upon whole serum spectra. Depleting human serum low molecular weight proteins using a 300 kDa centrifugal filter thus enhances the accuracy of IgA quantification by ATR-IR spectroscopy. Further evaluation and optimization of this general approach may ultimately lead to routine analysis of a range of high molecular-weight analytical targets that are otherwise unsuitable for IR-based analysis. Copyright © 2017 Elsevier B.V. All rights reserved.

  4. Improving the prediction of African savanna vegetation variables using time series of MODIS products

    NASA Astrophysics Data System (ADS)

    Tsalyuk, Miriam; Kelly, Maggi; Getz, Wayne M.

    2017-09-01

    African savanna vegetation is subject to extensive degradation as a result of rapid climate and land use change. To better understand these changes detailed assessment of vegetation structure is needed across an extensive spatial scale and at a fine temporal resolution. Applying remote sensing techniques to savanna vegetation is challenging due to sparse cover, high background soil signal, and difficulty to differentiate between spectral signals of bare soil and dry vegetation. In this paper, we attempt to resolve these challenges by analyzing time series of four MODIS Vegetation Products (VPs): Normalized Difference Vegetation Index (NDVI), Enhanced Vegetation Index (EVI), Leaf Area Index (LAI), and Fraction of Photosynthetically Active Radiation (FPAR) for Etosha National Park, a semiarid savanna in north-central Namibia. We create models to predict the density, cover, and biomass of the main savanna vegetation forms: grass, shrubs, and trees. To calibrate remote sensing data we developed an extensive and relatively rapid field methodology and measured herbaceous and woody vegetation during both the dry and wet seasons. We compared the efficacy of the four MODIS-derived VPs in predicting vegetation field measured variables. We then compared the optimal time span of VP time series to predict ground-measured vegetation. We found that Multiyear Partial Least Square Regression (PLSR) models were superior to single year or single date models. Our results show that NDVI-based PLSR models yield robust prediction of tree density (R² = 0.79, relative Root Mean Square Error, rRMSE = 1.9%) and tree cover (R² = 0.78, rRMSE = 0.3%). EVI provided the best model for shrub density (R² = 0.82) and shrub cover (R² = 0.83), but was only marginally superior over models based on other VPs. FPAR was the best predictor of vegetation biomass of trees (R² = 0.76), shrubs (R² = 0.83), and grass (R² = 0.91).
Finally, we addressed an enduring challenge in the remote sensing of semiarid vegetation by examining the transferability of predictive models through space and time. Our results show that models created in the wetter part of Etosha could accurately predict trees' and shrubs' variables in the drier part of the reserve and vice versa. Moreover, our results demonstrate that models created for vegetation variables in the dry season of 2011 could be successfully applied to predict vegetation in the wet season of 2012. We conclude that extensive field data combined with multiyear time series of MODIS vegetation products can produce robust predictive models for multiple vegetation forms in the African savanna. These methods advance the monitoring of savanna vegetation dynamics and contribute to improved management and conservation of these valuable ecosystems.

  5. Tree-ring growth of Scots pine, Common beech and Pedunculate oak under future climate in northeastern Germany

    NASA Astrophysics Data System (ADS)

    Jurasinski, Gerald; Scharnweber, Tobias; Schröder, Christian; Lennartz, Bernd; Bauwe, Andreas

    2017-04-01

    Tree growth depends, among other factors, largely on the prevailing climatic conditions. Therefore, tree growth patterns are to be expected under climate change. Here, we analyze the tree-ring growth response of three major European tree species to projected future climate across a climatic (mostly precipitation) gradient in northeastern Germany. We used monthly data for temperature, precipitation, and the standardized precipitation evapotranspiration index (SPEI) over multiple time scales (1, 3, 6, 12, and 24 months) to construct models of tree-ring growth for Scots pine (Pinus sylvestris L.) at three pure stands, and for Common beech (Fagus sylvatica L.) and Pedunculate oak (Quercus robur L.) at three mature mixed stands. The regression models were derived using a two-step approach based on partial least squares regression (PLSR) to extract potentially well explaining variables followed by ordinary least squares regression (OLSR) to consolidate the models to the least number of variables while retaining high explanatory power. The stability of the models was tested with a comprehensive calibration-verification scheme. All models were successfully verified with R² values ranging from 0.21 for the western pine stand to 0.62 for the beech stand in the east. For growth prediction, climate data forecasted until 2100 by the regional climate model WETTREG2010 based on the A1B Intergovernmental Panel on Climate Change (IPCC) emission scenario was used. For beech and oak, growth rates will likely decrease until the end of the 21st century. For pine, modeled growth trends vary and range from a slight growth increase to a weak decrease in growth rates depending on the position along the climatic gradient. The climatic gradient across the study area will possibly affect the future growth of oak with larger growth reductions towards the drier east. For beech, site-specific adaptations seem to override the influence of the climatic gradient.
We conclude that in Northeastern Germany Scots pine has great potential to remain resilient to projected climate change without any greater impairment, whereas Common beech and Pedunculate oak will likely face lesser growth under the expected warmer and dryer climate conditions. The results call for an adaptation of forest management to mitigate the negative effects of climate change for beech and oak in the region.

  6. Linking Stream Dissolved Oxygen with the Dynamic Environmental Drivers across the Pacific Coast of U.S.A.

    NASA Astrophysics Data System (ADS)

    Araya, F. Z.; Abdul-Aziz, O. I.

    2017-12-01

    This study utilized a systematic data analytics approach to determine the relative linkages of stream dissolved oxygen (DO) with the hydro-climatic and biogeochemical drivers across the U.S. Pacific Coast. Multivariate statistical techniques of Pearson correlation matrix, principal component analysis, and factor analysis were applied to a complex water quality dataset (1998-2015) at 35 water quality monitoring stations of USGS NWIS and EPA STORET. Power-law based partial least squares regression (PLSR) models with a bootstrap Monte Carlo procedure (1000 iterations) were developed to reliably estimate the relative linkages by resolving multicollinearity (Nash-Sutcliffe Efficiency, NSE = 0.50-0.94). Based on the dominant drivers, four environmental regimes have been identified and adequately described the system-data variances. In Pacific North West and Southern California, water temperature was the most dominant driver of DO in majority of the streams. However, in Central and Northern California, stream DO was controlled by multiple drivers (i.e., water temperature, pH, stream flow, and total phosphorus), exhibiting a transitional environmental regime. Further, total phosphorus (TP) appeared to be the limiting nutrient for most streams. The estimated linkages and insights would be useful to identify management priorities to achieve healthy coastal stream ecosystems across the Pacific Coast of U.S.A. and similar regions around the world. Keywords: Data analytics, water quality, coastal streams, dissolved oxygen, environmental regimes, Pacific Coast, United States.

  7. Quantitative Determination of Fusarium proliferatum Concentration in Intact Garlic Cloves Using Near-Infrared Spectroscopy

    PubMed Central

    Tamburini, Elena; Mamolini, Elisabetta; De Bastiani, Morena; Marchetti, Maria Gabriella

    2016-01-01

    Fusarium proliferatum is considered to be a pathogen of many economically important plants, including garlic. The objective of this research was to apply near-infrared spectroscopy (NIRS) to rapidly determine fungal concentration in intact garlic cloves, avoiding the laborious and time-consuming procedures of traditional assays. Preventive detection of infection before seeding is of great interest for farmers, because it could avoid serious losses of yield during harvesting and storage. Spectra were collected on 95 garlic cloves, divided into five classes of infection (from 1-healthy to 5-very highly infected) in the range of fungal concentration 0.34–7231.15 ppb. Calibration and cross validation models were developed with partial least squares regression (PLSR) on pretreated spectra (standard normal variate, SNV, and derivatives), providing good accuracy in prediction, with a coefficient of determination (R²) of 0.829 and 0.774, respectively, a standard error of calibration (SEC) of 615.17 ppb, and a standard error of cross validation (SECV) of 717.41 ppb. The calibration model was then used to predict fungal concentration in unknown samples, peeled and unpeeled. The results showed that NIRS could be used as a reliable tool to directly detect and quantify F. proliferatum infection in peeled intact garlic cloves, but the presence of the external peel strongly affected the prediction reliability. PMID:27428978
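
    The standard normal variate (SNV) pretreatment mentioned above can be sketched as follows. This is a generic illustration, not the authors' code: each spectrum (one row) is centred on its own mean and scaled by its own standard deviation. The function name `snv` and the use of the sample standard deviation (ddof=1) are assumptions.

```python
import numpy as np

def snv(spectra):
    """Apply standard normal variate scaling to each row (spectrum).

    spectra: array-like of shape (n_samples, n_wavelengths).
    Assumes no spectrum is perfectly flat (its standard deviation is nonzero).
    """
    spectra = np.asarray(spectra, dtype=float)
    row_mean = spectra.mean(axis=1, keepdims=True)
    row_std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - row_mean) / row_std

# Two toy spectra that differ only by a multiplicative factor
# become identical after SNV.
corrected = snv([[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]])
```

    Because each row is scaled independently, SNV removes additive offsets and multiplicative scaling between spectra, which is the usual motivation for applying it before PLSR.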

  8. Handling of uncertainty due to interference fringe in FT-NIR transmittance spectroscopy - Performance comparison of interference elimination techniques using glucose-water system

    NASA Astrophysics Data System (ADS)

    Beganović, Anel; Beć, Krzysztof B.; Henn, Raphael; Huck, Christian W.

    2018-05-01

    The applicability of two elimination techniques for interferences occurring in measurements with cells of short pathlength using Fourier transform near-infrared (FT-NIR) spectroscopy was evaluated. Due to the growing interest in the field of vibrational spectroscopy in aqueous biological fluids (e.g. glucose in blood), aqueous solutions of D-(+)-glucose were prepared and split into a calibration set and an independent validation set. All samples were measured with two FT-NIR spectrometers at various spectral resolutions. Moving average smoothing (MAS) and fast Fourier transform filter (FFT filter) were applied to the interference affected FT-NIR spectra in order to eliminate the interference pattern. After data pre-treatment, partial least squares regression (PLSR) models using different NIR regions were constructed using untreated (interference affected) spectra and spectra treated with MAS and FFT filter. The prediction of the independent validation set revealed information about the performance of the utilized interference elimination techniques, as well as the different NIR regions. The results showed that the combination band of water at approx. 5200 cm⁻¹ is of great importance since its performance was superior to the one of the so-called first overtone of water at approx. 6800 cm⁻¹. Furthermore, this work demonstrated that MAS and FFT filter are fast and easy-to-use techniques for the elimination of interference fringes in FT-NIR transmittance spectroscopy.
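
    As a rough illustration of moving average smoothing (MAS), the sketch below replaces each point of a spectrum with the mean of a centred window. The window width, the edge handling (repeating the end values) and the name `moving_average_smooth` are assumptions made for this example; the abstract does not give the settings used.

```python
import numpy as np

def moving_average_smooth(spectrum, window=5):
    """Smooth a 1-D spectrum with a centred moving average.

    window must be a positive odd integer so the window is symmetric.
    The ends are padded by repeating the first and last values, so the
    output has the same length as the input.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    x = np.asarray(spectrum, dtype=float)
    half = window // 2
    padded = np.pad(x, half, mode="edge")
    kernel = np.full(window, 1.0 / window)
    return np.convolve(padded, kernel, mode="valid")

smoothed = moving_average_smooth([0.0, 1.0, 2.0, 3.0, 4.0], window=3)
```

    A periodic fringe is attenuated most strongly when the window spans roughly one fringe period, at the cost of broadening genuine absorption bands, so the window is normally tuned against validation error.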

  9. Design and Fabrication of a Real-Time Measurement System for the Capsaicinoid Content of Korean Red Pepper (Capsicum annuum L.) Powder by Visible and Near-Infrared Spectroscopy.

    PubMed

    Lim, Jongguk; Kim, Giyoung; Mo, Changyeun; Kim, Moon S

    2015-10-29

    This research aims to design and fabricate a system to measure the capsaicinoid content of red pepper powder in a non-destructive and rapid method using visible and near infrared spectroscopy (VNIR). The developed system scans a well-leveled powder surface continuously to minimize the influence of the placenta distribution, thus acquiring stable and representative reflectance spectra. The system incorporates flat belts driven by a sample input hopper and stepping motor, a powder surface leveler, charge-coupled device (CCD) image sensor-embedded VNIR spectrometer, fiber optic probe, and tungsten halogen lamp, and an automated reference measuring unit with a reference panel to measure the standard spectrum. The operation program includes device interface, standard reflectivity measurement, and a graphical user interface to measure the capsaicinoid content. A partial least square regression (PLSR) model was developed to predict the capsaicinoid content; 44 red pepper powder samples whose capsaicinoid content, measured by high-performance liquid chromatography (HPLC), ranged from 13.45 to 159.48 mg/100 g and 1242 VNIR absorbance spectra acquired by the pungency measurement system were used. The determination coefficient of validation (Rv²) and standard error of prediction (SEP) for the model with the first-order derivative pretreatment method for Korean red pepper powder were 0.8484 and ±13.6388 mg/100 g, respectively.

  10. Predicting Soil Salinity with Vis–NIR Spectra after Removing the Effects of Soil Moisture Using External Parameter Orthogonalization

    PubMed Central

    Liu, Ya; Pan, Xianzhang; Wang, Changkun; Li, Yanli; Shi, Rongjie

    2015-01-01

    Robust models for predicting soil salinity that use visible and near-infrared (vis–NIR) reflectance spectroscopy are needed to better quantify soil salinity in agricultural fields. Currently available models are not sufficiently robust for variable soil moisture contents. Thus, we used external parameter orthogonalization (EPO), which effectively projects spectra onto the subspace orthogonal to unwanted variation, to remove the variations caused by an external factor, e.g., the influences of soil moisture on spectral reflectance. In this study, 570 spectra between 380 and 2400 nm were obtained from soils with various soil moisture contents and salt concentrations in the laboratory; 3 soil types × 10 salt concentrations × 19 soil moisture levels were used. To examine the effectiveness of EPO, we compared the partial least squares regression (PLSR) results established from spectra with and without EPO correction. The EPO method effectively removed the effects of moisture, and the accuracy and robustness of the soil salt contents (SSCs) prediction model, which was built using the EPO-corrected spectra under various soil moisture conditions, were significantly improved relative to the spectra without EPO correction. This study contributes to the removal of soil moisture effects from soil salinity estimations when using vis–NIR reflectance spectroscopy and can assist others in quantifying soil salinity in the future. PMID:26468645
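
    The projection step of external parameter orthogonalization (EPO) can be sketched as below. This is a generic illustration under stated assumptions, not the authors' implementation: `difference_spectra` is taken to hold moist-minus-dry spectra of the same samples, the moisture subspace is estimated from its leading right singular vectors, and the number of components is a user choice. The name `epo_projection` is hypothetical.

```python
import numpy as np

def epo_projection(difference_spectra, n_components):
    """Build an EPO projection matrix P = I - V V^T.

    difference_spectra: array of shape (n_pairs, n_wavelengths) holding the
        spectral differences attributed to the external factor (assumed
        here to be moist minus dry spectra of the same samples).
    n_components: number of singular vectors treated as unwanted variation.
    """
    d = np.asarray(difference_spectra, dtype=float)
    # Rows of vt are orthonormal directions in wavelength space.
    _, _, vt = np.linalg.svd(d, full_matrices=False)
    v = vt[:n_components].T
    return np.eye(d.shape[1]) - v @ v.T

# Toy case: the external factor only affects the first wavelength.
P = epo_projection([[1.0, 0.0, 0.0], [2.0, 0.0, 0.0]], n_components=1)
corrected = np.array([[5.0, 1.0, 2.0]]) @ P
```

    Spectra are corrected by right-multiplying with P before calibration, and the same P must be applied to any new spectra passed to the resulting PLSR model.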

  11. [Prediction of Encapsulation Temperatures of Copolymer Films in Photovoltaic Cells Using Hyperspectral Imaging Techniques and Chemometrics].

    PubMed

    Lin, Ping; Chen, Yong-ming; Yao, Zhi-lei

    2015-11-01

    A novel method combining chemometrics and hyperspectral imaging techniques was presented to detect the temperatures of Ethylene-Vinyl Acetate copolymer (EVA) films in photovoltaic cells during the thermal encapsulation process. Four varieties of the EVA films which had been heated at the temperatures of 128, 132, 142 and 148 °C during the photovoltaic cells production process were used for investigation in this paper. These copolymer encapsulation films were firstly scanned by the hyperspectral imaging equipment (Spectral Imaging Ltd. Oulu, Finland). The scanning band range of the hyperspectral equipment was set between 904.58 and 1700.01 nm. The hyperspectral dataset of copolymer films was randomly divided into two parts for the training and test purpose. Each type of the training set and test set contained 90 and 10 instances, respectively. The obtained hyperspectral images of EVA films were dealt with by using the ENVI (Exelis Visual Information Solutions, USA) software. The size of region of interest (ROI) of each obtained hyperspectral image of EVA film was set as 150 × 150 pixels. The average of reflectance hyper spectra of all the pixels in the ROI was used as the characteristic curve to represent the instance. Three kinds of chemometrics methods including partial least squares regression (PLSR), multi-class support vector machine (SVM) and large margin nearest neighbor (LMNN) were used to correlate the characteristic hyper spectra with the encapsulation temperatures of copolymer films. The plot of weighted regression coefficients illustrated that both bands of short- and long-wave near infrared hyperspectral data contributed to enhancing the prediction accuracy of the forecast model. Because the attained reflectance hyperspectral data of EVA materials displayed strong nonlinearity, the prediction performance of the linear modeling method of PLSR declined and the prediction precision only reached 95%.
The kernel-based forecast models were introduced to eliminate the impact of nonlinear hyperspectral data to some extent through mapping the original nonlinear hyperspectral data to the high dimensional linear feature space, so the relationship between the nonlinear hyperspectral data and the encapsulation temperatures of EVA films was fully disclosed finally. Compared with the prediction results of three proposed models, the prediction performance of LMNN was superior to the other two, whose final recognition accuracy achieved 100%. The results indicated that the methods of combination of LMNN model with the hyperspectral imaging techniques was the best one for accurately and rapidly determining the encapsulation temperatures of EVA films of photovoltaic cells. In addition, this paper had created the ideal conditions for automatically monitoring and effectively controlling the encapsulation temperatures of EVA films in the photovoltaic cells production process.

  12. Rapid estimation of sugar release from winter wheat straw during bioethanol production using FTIR-photoacoustic spectroscopy

    DOE PAGES

    Bekiaris, Georgios; Lindedam, Jane; Peltre, Clément; ...

    2015-06-18

    Complexity and high cost are the main limitations for high-throughput screening methods for the estimation of the sugar release from plant materials during bioethanol production. In addition, it is important that we improve our understanding of the mechanisms by which different chemical components are affecting the degradability of plant material. In this study, Fourier transform infrared photoacoustic spectroscopy (FTIR-PAS) was combined with advanced chemometrics to develop calibration models predicting the amount of sugars released after pretreatment and enzymatic hydrolysis of wheat straw during bioethanol production, and the spectra were analysed to identify components associated with recalcitrance. A total of 1122 wheat straw samples from nine different locations in Denmark and one location in the United Kingdom, spanning a large variation in genetic material and environmental conditions during growth, were analysed. The FTIR-PAS spectra of non-pretreated wheat straw were correlated with the measured sugar release, determined by a high-throughput pretreatment and enzymatic hydrolysis (HTPH) assay. A partial least squares regression (PLSR) calibration model predicting the glucose and xylose release was developed. The interpretation of the regression coefficients revealed a positive correlation between the released glucose and xylose with easily hydrolysable compounds, such as amorphous cellulose and hemicellulose. Additionally, we observed a negative correlation with crystalline cellulose and lignin, which inhibits cellulose and hemicellulose hydrolysis. FTIR-PAS was used as a reliable method for the rapid estimation of sugar release during bioethanol production. The spectra revealed that lignin inhibited the hydrolysis of polysaccharides into monomers, while the crystallinity of cellulose retarded its hydrolysis into glucose. 
Amorphous cellulose and xylans were found to contribute significantly to the released amounts of glucose and xylose, respectively.
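
    The sign-based reading of regression coefficients described above can be illustrated with a minimal sketch of a single-component PLS1 fit in plain Python. The function name `pls1_one_component` and the toy data are assumptions made for illustration only; the study's actual model, number of components and preprocessing are not given in the abstract.

```python
def pls1_one_component(X, y):
    """Fit a single-component PLS1 model; return (coefficients, intercept).

    X is a list of rows (samples x variables), y a list of responses.
    Illustrative sketch only, not the calibration used in the study.
    """
    n, p = len(X), len(X[0])
    # Mean-centre predictors and response.
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [yi - y_mean for yi in y]
    # Weight vector: direction in X with maximal covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = sum(wj * wj for wj in w) ** 0.5
    w = [wj / norm for wj in w]
    # Scores of each sample on that direction, then the y-loading.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    # With one component the regression coefficients reduce to w * q.
    coef = [wj * q for wj in w]
    intercept = y_mean - sum(c * m for c, m in zip(coef, x_mean))
    return coef, intercept


# Hypothetical toy data: variable 0 rises with the response, variable 1 falls.
X_demo = [[1, 5], [2, 4], [3, 3], [4, 2]]
y_demo = [1, 2, 3, 4]
coef_demo, intercept_demo = pls1_one_component(X_demo, y_demo)
```

    In this toy case the first coefficient is positive and the second negative, which is the kind of pattern the abstract interprets as association with easily hydrolysable versus recalcitrant components.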

  13. Rapid estimation of sugar release from winter wheat straw during bioethanol production using FTIR-photoacoustic spectroscopy

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Bekiaris, Georgios; Lindedam, Jane; Peltre, Clément

    Complexity and high cost are the main limitations for high-throughput screening methods for the estimation of the sugar release from plant materials during bioethanol production. In addition, it is important that we improve our understanding of the mechanisms by which different chemical components are affecting the degradability of plant material. In this study, Fourier transform infrared photoacoustic spectroscopy (FTIR-PAS) was combined with advanced chemometrics to develop calibration models predicting the amount of sugars released after pretreatment and enzymatic hydrolysis of wheat straw during bioethanol production, and the spectra were analysed to identify components associated with recalcitrance. A total of 1122 wheat straw samples from nine different locations in Denmark and one location in the United Kingdom, spanning a large variation in genetic material and environmental conditions during growth, were analysed. The FTIR-PAS spectra of non-pretreated wheat straw were correlated with the measured sugar release, determined by a high-throughput pretreatment and enzymatic hydrolysis (HTPH) assay. A partial least squares regression (PLSR) calibration model predicting the glucose and xylose release was developed. The interpretation of the regression coefficients revealed a positive correlation between the released glucose and xylose with easily hydrolysable compounds, such as amorphous cellulose and hemicellulose. Additionally, we observed a negative correlation with crystalline cellulose and lignin, which inhibits cellulose and hemicellulose hydrolysis. FTIR-PAS was used as a reliable method for the rapid estimation of sugar release during bioethanol production. The spectra revealed that lignin inhibited the hydrolysis of polysaccharides into monomers, while the crystallinity of cellulose retarded its hydrolysis into glucose. 
Amorphous cellulose and xylans were found to contribute significantly to the released amounts of glucose and xylose, respectively.

  14. Hyperspectral Remote Sensing of Terrestrial Ecosystem Productivity from ISS

    NASA Astrophysics Data System (ADS)

    Huemmrich, K. F.; Campbell, P. K. E.; Gao, B. C.; Flanagan, L. B.; Goulden, M.

    2017-12-01

    Data from the Hyperspectral Imager for Coastal Ocean (HICO), mounted on the International Space Station (ISS), were used to develop and test algorithms for remotely retrieving ecosystem productivity. The ISS orbit introduces both limitations and opportunities for observing ecosystem dynamics. Twenty-six HICO images were used from four study sites representing different vegetation types: grasslands, shrubland, and forest. Gross ecosystem production (GEP) data from eddy covariance were matched with HICO-derived spectra. Multiple algorithms were successful in relating spectral reflectance to GEP, including: Spectral Vegetation Indices (SVI), SVI in a light use efficiency model framework, spectral shape characteristics through spectral derivatives and absorption feature analysis, and statistical models leading to Multiband Hyperspectral Indices (MHI) from stepwise regressions and Partial Least Squares Regression (PLSR). Algorithms were able to achieve r2 better than 0.7 for both GEP at the overpass time and daily GEP. These algorithms were successful using a diverse set of observations combining data from multiple years, multiple times during growing season, different times of day, with different view angles, and different vegetation types. The demonstrated robustness of the algorithms presented in this study over these conditions provides some confidence in mapping spatial patterns of GEP, describing variability within fields as well as the regional patterns based only on spectral reflectance information. The ISS orbit provides periods with multiple observations collected at different times of the day within a period of a few days. Diurnal GEP patterns were estimated comparing the half-hourly average GEP from the flux tower against HICO estimates of GEP (r2=0.87) if morning, midday, and afternoon observations were available for average fluxes in the time period.

  15. Rapid and Portable Methods for Identification of Bacterially Influenced Calcite: Application of Laser-Induced Breakdown Spectroscopy and AOTF Reflectance Spectroscopy, Fort Stanton Cave, New Mexico

    NASA Astrophysics Data System (ADS)

    McMillan, N. J.; Chavez, A.; Chanover, N.; Voelz, D.; Uckert, K.; Tawalbeh, R.; Gariano, J.; Dragulin, I.; Xiao, X.; Hull, R.

    2014-12-01

    Rapid, in-situ methods for identification of biologic and non-biologic mineral precipitation sites permit mapping of biological hot spots. Two portable spectrometers, Laser-Induced Breakdown Spectroscopy (LIBS) and Acousto-Optic Tunable Filter Reflectance Spectroscopy (AOTFRS), were used to differentiate between bacterially influenced and inorganically precipitated calcite specimens from Fort Stanton Cave, NM, USA. LIBS collects light emitted from the decay of excited electrons in a laser ablation plasma; the spectrum is a chemical fingerprint of the analyte. AOTFRS collects light reflected from the surface of a specimen and provides structural information about the material (e.g., the presence of O-H bonds). These orthogonal data sets provide a rigorous method to determine the origin of calcite in cave deposits. This study used a set of 48 calcite samples collected from Fort Stanton Cave. Samples were examined in SEM for the presence of biologic markers; these data were used to separate the samples into biologic and non-biologic groups. Spectra were modeled using the multivariate technique Partial Least Squares Regression (PLSR). Half of the spectra were used to train a PLSR model, in which biologic samples were assigned to the independent variable "0" and non-biologic samples were assigned the variable "1". Values of the independent variable were calculated for each of the training samples, which were close to 0 for the biologic samples (-0.09 - 0.23) and close to 1 for the non-biologic samples (0.57 - 1.14). A Value of Apparent Distinction (VAD) of 0.55 was used to numerically distinguish between the two groups; any sample with an independent variable value < 0.55 was classified as having a biologic origin; a sample with a value > 0.55 was determined to be non-biologic in origin. After the model was trained, independent variable values for the remaining half of the samples were calculated. 
Biologic or non-biologic origin was assigned by comparison to the VAD. Using LIBS data alone, the model has a 92% success rate, correctly identifying 23 of 25 samples. Modeling of AOTFRS spectra and the combined LIBS-AOTFRS data set have similar success rates. This study demonstrates that rapid, portable LIBS and AOTFRS instruments can be used to map the spatial distribution of biologic precipitation in caves.
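
    The thresholding step described above can be sketched as follows. The VAD of 0.55 comes from the abstract; the function names and the demo scores are hypothetical, and assigning a score of exactly 0.55 to the non-biologic class is an assumption, since the abstract only covers values strictly below or above the VAD.

```python
VAD = 0.55  # Value of Apparent Distinction reported in the abstract


def classify(score, vad=VAD):
    """Label a PLSR-predicted score: below the VAD is biologic, otherwise non-biologic.

    A score exactly equal to the VAD is treated as non-biologic here (assumption).
    """
    return "biologic" if score < vad else "non-biologic"


def success_rate(scores, true_labels, vad=VAD):
    """Fraction of samples whose thresholded label matches the reference label."""
    hits = sum(classify(s, vad) == t for s, t in zip(scores, true_labels))
    return hits / len(scores)


# Hypothetical test-set scores: 23 correct calls and 2 misclassified samples.
demo_scores = [0.1] * 12 + [0.9] * 11 + [0.7, 0.3]
demo_labels = ["biologic"] * 12 + ["non-biologic"] * 11 + ["biologic", "non-biologic"]
demo_rate = success_rate(demo_scores, demo_labels)
```

    With these made-up scores, 23 of 25 samples are labelled correctly, which reproduces the arithmetic behind the reported 92% success rate.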

  16. Prediction of Ba, Co and Ni for tropical soils using diffuse reflectance spectroscopy and X-ray fluorescence spectroscopy

    NASA Astrophysics Data System (ADS)

    Arantes Camargo, Livia; Marques Júnior, José; Reynaldo Ferracciú Alleoni, Luís; Tadeu Pereira, Gener; De Bortoli Teixeira, Daniel; Santos Rabelo de Souza Bahia, Angélica

    2017-04-01

    Environmental impact assessments may be assisted by spatial characterization of potentially toxic elements (PTEs). Diffuse reflectance spectroscopy (DRS) and X-ray fluorescence spectroscopy (XRF) are rapid, non-destructive, low-cost prediction tools for the simultaneous characterization of different soil attributes. Although low concentrations of PTEs might preclude the observation of spectral features, their contents can be predicted using spectroscopy by exploring the existing relationship between the PTEs and soil attributes with spectral features. This study aimed to evaluate, in three geomorphic surfaces of Oxisols, the capacity for predicting PTEs (Ba, Co, and Ni) and their spatial variability by means of DRS and XRF. For that, soil samples were collected from three geomorphic surfaces and analyzed for chemical, physical, and mineralogical properties, and then analyzed in DRS (visible + near infrared - VIS+NIR and mid-infrared - MIR) and XRF equipment. PTE prediction models were calibrated using partial least squares regression (PLSR). PTE spatial distribution maps were built using the values calculated by the calibrated models that reached the best accuracy using geostatistics. PTE prediction models were satisfactorily calibrated using MIR DRS for Ba and Co (residual prediction deviation - RPD > 3.0), Vis DRS for Ni (RPD > 2.0) and XRF for all the studied PTEs (RPD > 1.8). DRS- and XRF-predicted values allowed the characterization and the understanding of spatial variability of the studied PTEs.
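
    The RPD thresholds quoted above rest on a simple ratio: the standard deviation of the reference values divided by the root-mean-square error of the predictions. A minimal sketch follows; the function names and demo values are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since some authors use the population form.

```python
import statistics


def rmse(reference, predicted):
    """Root-mean-square error between reference and predicted values."""
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return (sum(squared) / len(squared)) ** 0.5


def rpd(reference, predicted):
    """Residual prediction deviation: SD of the reference values over the RMSE."""
    return statistics.stdev(reference) / rmse(reference, predicted)


# Hypothetical values: every prediction is off by 0.5 units.
demo_reference = [1.0, 2.0, 3.0, 4.0, 5.0]
demo_predicted = [1.5, 2.5, 3.5, 4.5, 5.5]
demo_rpd = rpd(demo_reference, demo_predicted)
```

    A higher RPD means the prediction error is small relative to the natural spread of the property, which is why the abstract reports it per element and per sensor.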

  17. Strategies for soil quality assessment using VNIR hyperspectral spectroscopy in a western Kenya Chronosequence

    USGS Publications Warehouse

    Kinoshita, Rintaro; Moebius-Clune, Bianca N.; van Es, Harold M.; Hively, W. Dean; Bilgilis, A. Volkan

    2012-01-01

    Visible and near-infrared reflectance spectroscopy (VNIRS) is a rapid and nondestructive method that can predict multiple soil properties simultaneously, but its application in multidimensional soil quality (SQ) assessment in the tropics still needs to be further assessed. In this study, VNIRS (350–2500 nm) was employed to analyze 227 air-dried soil samples of Ultisols from a soil chronosequence in western Kenya and assess 16 SQ indicators. Partial least squares regression (PLSR) was validated using the full-site cross-validation method by grouping samples from each farm or forest site. Most suitable models successfully predicted SQ indicators (R2 ≥ 0.80; ratio of performance to deviation [RPD] ≥ 2.00) including soil organic matter (OMLOI), active C, Ca, cation exchange capacity (CEC), and clay. Moderately-well predicted indicators (0.50 ≤ R2 pwp), and field capacity (Θfc). Poorly predicted indicators (R2 < 0.50; RPD < 1.40) were EC, S, P, available water capacity (AWC), K, Zn, and penetration resistance. Combining VNIRS with selected field- and laboratory-measured SQ indicator values increased predictability. Furthermore, VNIRS showed moderate to substantial agreement in predicting interpretive SQ scores and a composite soil quality index (CSQI) especially when combined with directly measured SQ indicator values. In conclusion, VNIRS has good potential for low cost, rapid assessment of physical and biological SQ indicators but conventional soil chemical tests may need to be retained to provide comprehensive SQ assessments.
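
    The full-site cross-validation mentioned above groups samples by farm or forest site so that no site contributes to both calibration and validation in the same fold. A minimal sketch of such a split is given below, assuming a leave-one-site-out scheme; the function name and the site labels are hypothetical, and the study's exact fold construction is not stated in the abstract.

```python
def leave_one_site_out(site_ids):
    """Yield (held_out_site, train_indices, test_indices), one fold per distinct site."""
    for site in sorted(set(site_ids)):
        # All samples from the held-out site form the validation fold.
        test = [i for i, s in enumerate(site_ids) if s == site]
        train = [i for i, s in enumerate(site_ids) if s != site]
        yield site, train, test


# Hypothetical site labels for five samples.
demo_sites = ["farm_a", "farm_a", "forest_b", "farm_c", "farm_c"]
demo_folds = list(leave_one_site_out(demo_sites))
```

    Splitting by site rather than by individual sample avoids the optimistic error estimates that arise when nearby, highly similar samples sit on both sides of the split.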

  18. Regional prediction of soil organic carbon content over croplands using airborne hyperspectral data

    NASA Astrophysics Data System (ADS)

    Vaudour, Emmanuelle; Gilliot, Jean-Marc; Bel, Liliane; Lefebvre, Josias; Chehdi, Kacem

    2015-04-01

    This study was carried out in the framework of the Prostock-Gessol3 and the BASC-SOCSENSIT projects, dedicated to the spatial monitoring of the effects of exogenous organic matter land application on soil organic carbon storage. It aims at identifying the potential of airborne hyperspectral AISA-Eagle data for predicting the topsoil organic carbon (SOC) content of bare cultivated soils over a large peri-urban area (221 km2) with both contrasting soils and SOC contents, located in the western region of Paris, France. Soils comprise hortic or glossic luvisols, calcaric, rendzic cambisols and colluvic cambisols. Airborne AISA-Eagle data (400-1000 nm, 126 bands) with 1 m-resolution were acquired on 17 April 2013 over 13 tracks which were georeferenced. Tracks were atmospherically corrected using a set of 22 synchronous field spectra of both bare soils, black and white targets and impervious surfaces. Atmospherically corrected track tiles were mosaicked at a 2 m-resolution resulting in a 66 Gb image. A SPOT4 satellite image was acquired the same day in the framework of the SPOT4-Take Five program of the French Space Agency (CNES) which provided it with atmospheric correction. The land use identification system layer (RPG) of 2012 was used to mask non-agricultural areas, then NDVI calculation and thresholding made it possible to map agricultural fields with bare soil. All 18 sampled sites known to be bare at this very date were correctly included in this map. A total of 85 sites sampled in 2013 or in the 3 previous years were identified as bare by means of this map. Predictions were made from the mosaic spectra which were related to topsoil SOC contents by means of partial least squares regression (PLSR). Regression robustness was evaluated through a series of 1000 bootstrap data sets of calibration-validation samples. The use of the total sample including 27 sites under cloud shadows led to non-significant results. 
Considering 43 sites outside cloud shadows only, median validation root-mean-square errors (RMSE) were ~4-4.5 g kg-1. An additional set of 15 samples with bare soils led to similar RMSE values. Such results are only slightly better than those resulting from an earlier study with multispectral satellite images (Vaudour et al., 2013). The influence of soil surface condition and particularly soil roughness is discussed.
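
    The bootstrap robustness check described above can be sketched in plain Python. Several points here are assumptions rather than details from the abstract: the calibration set is drawn with replacement and the validation set is taken as the out-of-bag samples, a one-predictor least-squares line stands in for the PLSR model to keep the sketch short, and all function names are hypothetical.

```python
import random
import statistics


def fit_line(x, y):
    """Least-squares line y = slope * x + intercept (stand-in for a PLSR model)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:
        return None  # degenerate resample: all predictor values identical
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return slope, my - slope * mx


def bootstrap_median_rmse(x, y, n_boot=1000, seed=0):
    """Median validation RMSE over bootstrap calibration/validation resamples."""
    rng = random.Random(seed)
    n = len(x)
    rmses = []
    for _ in range(n_boot):
        # Calibration indices drawn with replacement; the rest are out-of-bag.
        cal = [rng.randrange(n) for _ in range(n)]
        cal_set = set(cal)
        val = [i for i in range(n) if i not in cal_set]
        model = fit_line([x[i] for i in cal], [y[i] for i in cal])
        if model is None or not val:
            continue  # skip resamples that cannot be fitted or validated
        slope, intercept = model
        squared = [(y[i] - (slope * x[i] + intercept)) ** 2 for i in val]
        rmses.append((sum(squared) / len(squared)) ** 0.5)
    return statistics.median(rmses)


# Hypothetical noise-free data: the median validation RMSE should be near zero.
demo_x = list(range(20))
demo_y = [2 * xi + 1 for xi in demo_x]
demo_median = bootstrap_median_rmse(demo_x, demo_y, n_boot=200)
```

    Reporting the median over many resamples, as the abstract does, makes the error estimate less sensitive to any single favourable or unfavourable split of a small sample.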

  19. Using LUCAS topsoil database to estimate soil organic carbon content in local spectral libraries

    NASA Astrophysics Data System (ADS)

    Castaldi, Fabio; van Wesemael, Bas; Chabrillat, Sabine; Chartin, Caroline

    2017-04-01

    The quantification of the soil organic carbon (SOC) content over large areas is mandatory to obtain accurate soil characterization and classification, which can improve site specific management at local or regional scale exploiting the strong relationship between SOC and crop growth. The estimation of the SOC is not only important for agricultural purposes: in recent years, the increasing attention towards global warming highlighted the crucial role of the soil in the global carbon cycle. In this context, soil spectroscopy is a well consolidated and widespread method to estimate soil variables exploiting the interaction between chromophores and electromagnetic radiation. The importance of spectroscopy in soil science is reflected by the increasing number of large soil spectral libraries collected in the world. These large libraries contain soil samples derived from a consistent number of pedological regions and thus from different parent material and soil types; this heterogeneity entails, in turn, a large variability in terms of mineralogical and organic composition. In the light of the huge variability of the spectral responses to SOC content and composition, a rigorous classification process is necessary to subset large spectral libraries and to avoid the calibration of global models failing to predict local variation in SOC content. In this regard, this study proposes a method to subset the European LUCAS topsoil database into soil classes using a clustering analysis based on a large number of soil properties. The LUCAS database was chosen to apply a standardized multivariate calibration approach valid for large areas without the need for extensive field and laboratory work for calibration of local models. 
Seven soil classes were detected by the clustering analyses and the samples belonging to each class were used to calibrate specific partial least squares regression (PLSR) models to estimate SOC content of three local libraries collected in Belgium (Loam belt and Wallonia) and Luxembourg. The three local libraries only consist of spectral data (199 samples) acquired using the same protocol as the one used for the LUCAS database. SOC was estimated with good accuracy both within each local library (RMSE: 1.2–5.4 g kg-1; RPD: 1.41–2.06) and for the samples of the three libraries together (RMSE: 3.9 g kg-1; RPD: 2.47). The proposed approach could allow SOC to be estimated anywhere in Europe by collecting only spectra, without the need for chemical laboratory analyses, exploiting the potential of the LUCAS database and specific PLSR models.

  20. Investigating bias in squared regression structure coefficients

    PubMed Central

    Nimon, Kim F.; Zientek, Linda R.; Thompson, Bruce

    2015-01-01

    The importance of structure coefficients and analogs of regression weights for analysis within the general linear model (GLM) has been well-documented. The purpose of this study was to investigate bias in squared structure coefficients in the context of multiple regression and to determine if a formula that had been shown to correct for bias in squared Pearson correlation coefficients and coefficients of determination could be used to correct for bias in squared regression structure coefficients. Using data from a Monte Carlo simulation, this study found that squared regression structure coefficients corrected with Pratt's formula produced less biased estimates and might be more accurate and stable estimates of population squared regression structure coefficients than estimates with no such corrections. While our findings are in line with prior literature that identified multicollinearity as a predictor of bias in squared regression structure coefficients but not coefficients of determination, the findings from this study are unique in that the level of predictive power, number of predictors, and sample size were also observed to contribute bias in squared regression structure coefficients. PMID:26217273

  1. Digital soil classification and elemental mapping using imaging Vis-NIR spectroscopy: How to explicitly quantify stagnic properties of a Luvisol under Norway spruce

    NASA Astrophysics Data System (ADS)

    Kriegs, Stefanie; Buddenbaum, Henning; Rogge, Derek; Steffens, Markus

    2015-04-01

    Laboratory imaging Vis-NIR spectroscopy of soil profiles is a novel technique in soil science that can determine quantity and quality of various chemical soil properties with a previously unattained spatial resolution in undisturbed soil profiles. We have applied this technique to soil cores in order to get quantitative proof of redoximorphic processes under two different tree species and to prove tree-soil interactions at microscale. Due to the imaging capabilities of Vis-NIR spectroscopy a spatially explicit understanding of soil processes and properties can be achieved. Spatial heterogeneity of the soil profile can be taken into account. We took six 30 cm long rectangular soil columns of adjacent Luvisols derived from Quaternary aeolian sediments (Loess) in a forest soil near Freising/Bavaria using stainless steel boxes (100×100×300 mm). Three profiles were sampled under Norway spruce and three under European beech. A hyperspectral camera (VNIR, 400-1000 nm in 160 spectral bands) with spatial resolution of 63×63 µm² per pixel was used for data acquisition. Reference samples were taken at representative spots and analysed for organic carbon (OC) quantity and quality with a CN elemental analyser and for iron oxides (Fe) content using dithionite extraction followed by ICP-OES measurement. We compared two supervised classification algorithms, Spectral Angle Mapper and Maximum Likelihood, using different sets of training areas and spectral libraries. As established in chemometrics, we used multivariate analysis such as partial least-squares regression (PLSR) in addition to multivariate adaptive regression splines (MARS) to correlate chemical data with Vis-NIR spectra. As a result, elemental mapping of Fe and OC within the soil core at high spatial resolution has been achieved. The regression model was validated by a new set of reference samples for chemical analysis. Digital soil classification easily visualizes soil properties within the soil profiles. 
By combining both techniques, detailed soil maps, elemental balances and a deeper understanding of soil forming processes at the microscale become feasible for complete soil profiles.

  2. Modeling soil organic matter (SOM) from satellite data using VISNIR-SWIR spectroscopy and PLS regression with step-down variable selection algorithm: case study of Campos Amazonicos National Park savanna enclave, Brazil

    NASA Astrophysics Data System (ADS)

    Rosero-Vlasova, O.; Borini Alves, D.; Vlassova, L.; Perez-Cabello, F.; Montorio Lloveria, R.

    2017-10-01

    Deforestation in Amazon basin due, among other factors, to frequent wildfires demands continuous post-fire monitoring of soil and vegetation. Thus, the study posed two objectives: (1) evaluate the capacity of Visible - Near InfraRed - ShortWave InfraRed (VIS-NIR-SWIR) spectroscopy to estimate soil organic matter (SOM) in fire-affected soils, and (2) assess the feasibility of SOM mapping from satellite images. For this purpose, 30 soil samples (surface layer) were collected in 2016 in areas of grass and riparian vegetation of Campos Amazonicos National Park, Brazil, repeatedly affected by wildfires. Standard laboratory procedures were applied to determine SOM. Reflectance spectra of soils were obtained in controlled laboratory conditions using Fieldspec4 spectroradiometer (spectral range 350nm- 2500nm). Measured spectra were resampled to simulate reflectances for Landsat-8, Sentinel-2 and EnMap spectral bands, used as predictors in SOM models developed using Partial Least Squares regression and step-down variable selection algorithm (PLSR-SD). The best fit was achieved with models based on reflectances simulated for EnMap bands (R2=0.93; R2cv=0.82 and NMSE=0.07; NMSEcv=0.19). The model uses only 8 out of 244 predictors (bands) chosen by the step-down variable selection algorithm. The least reliable estimates (R2=0.55 and R2cv=0.40 and NMSE=0.43; NMSEcv=0.60) resulted from Landsat model, while Sentinel-2 model showed R2=0.68 and R2cv=0.63; NMSE=0.31 and NMSEcv=0.38. The results confirm high potential of VIS-NIR-SWIR spectroscopy for SOM estimation. Application of step-down produces sparser and better-fit models. Finally, SOM can be estimated with an acceptable accuracy (NMSE 0.35) from EnMap and Sentinel-2 data enabling mapping and analysis of impacts of repeated wildfires on soils in the study area.

  3. Integrating proximal soil sensing techniques and terrain indexes to generate 3D maps of soil restrictive layers in the Palouse region, Washington, USA

    NASA Astrophysics Data System (ADS)

    Poggio, Matteo; Brown, David J.; Gasch, Caley K.; Brooks, Erin S.; Yourek, Matt A.

    2015-04-01

    In the Palouse region of eastern Washington and northern Idaho (USA), spatially discontinuous restrictive layers impede root growth and water infiltration. Consequently, accurate maps showing the depth and spatial extent of these restrictive layers are essential for watershed hydrologic modeling appropriate for precision agriculture. In this presentation, we report on the use of a Visible and Near-Infrared (VisNIR) penetrometer fore optic to construct detailed maps of three wheat fields in the Palouse region. The VisNIR penetrometer was used to deliver in situ soil reflectance to an Analytical Spectral Devices (ASD, Boulder, CO, USA) spectrometer and simultaneously acquire insertion force. With a hydraulic push-type soil coring system for insertion (e.g. Giddings), we collected soil spectra and insertion force data along 41m x 41m grid points (2 fields) and 50m x 50m grid points (1 field) to ≈80cm depth, in addition to interrogation points at 36 representative instrumented locations per field. At each of the 36 instrumented locations, two soil cores were extracted for laboratory determination of clay content and bulk density. We developed calibration models of soil clay content and bulk density with spectra and insertion force collected in situ, using partial least squares regression 2 (PLSR2). Applying spline functions, we delineated clay and bulk density profiles at each point (grid and 24 locations). The soil profiles were then used as inputs in a regression-kriging model with terrain indexes and ECa data (derived from an EM38 field survey, Geonics, Mississauga, Ontario, Canada) as covariates to generate 3D soil maps. Preliminary results show that the VisNIR penetrometer can capture the spatial patterns of restrictive layers. Work is ongoing to evaluate the prediction accuracy of penetrometer-derived 3D clay content and restriction layer maps.

  4. Spatial distribution of particulate organic matter pools, quantified and characterized by mid-infrared spectroscopy

    NASA Astrophysics Data System (ADS)

    Bornemann, L.; Welp, G.; Amelung, W.

    2009-04-01

    Comprising more than 60 % of the terrestrial carbon pool, soil organic carbon (SOC) is one of the principal factors regulating the global C-cycle. Against the background of worldwide increasing CO2 emissions, much effort has been put to the modelling of soil-C turnover in order to evaluate its potential for mitigation of climate change. Soil organic matter is an ever changing assemblage of various organic components that interact with the mineral matrix and in dependence of its ecological environment. Carbon storage is thereby assumed to propagate by hierarchical saturation of different carbon pools. A homogeneous distribution of the respective pools within natural environments is unlikely as the controlling soil parameters are subject to spatial and temporal heterogeneity. Several attempts to operationalize this complex soil compartment have been proposed, most of them resting upon a concept of pools with different stability and varying turnover times. Among these pools, particulate organic matter (POM) is considered to be most sensitive to environmental changes and has been shown to explain major parts of the SOC variations. Until today, rather laborious physical and physico-chemical fractionation procedures are most commonly applied for the initialization and validation of POM in C-turnover models. Mid-infrared spectroscopy (MIRS) in combination with partial least squares regression (PLSR) could overcome this problem. The technique is fast, cheap, and requires little sample preparation. All the same, it is an appropriate technique not only for the determination of gross parameters like total soil organic carbon contents, but also for the determination and characterization of minor constituents like black carbon in soils. Basically, the infrared radiation is absorbed by molecules that express a dipole-moment during vibration. 
As virtually all constituents of soil organic matter and also a multitude of inorganic soil constituents express such a dipole-moment, plentiful chemical information can be extracted from absorption spectra of soil samples. In this work we present the development of calibration models for POM quantification via MIRS-PLSR, and the compilation of a raster data set including SOC and POM of three size classes for the testsite of the SFB-TR32 at Selhausen near Jülich (Germany). The studied test site is an orthic luvisol which has been sampled in a ten times ten meter raster from 0-30 cm depth (n=131). For POM fractionation samples were gently sonicated and material from 2000-250 µm was gained by wet sieving. After a second, more intense sonication, intermediate (250-53 µm) and fine (53-20 µm) material was also gained by wet sieving. All fractions were dried at 40 °C, carbon contents were determined by elemental analysis. For calibration of MIRS-PLSR, SOC contents of 87 bulk soil samples were determined by elemental analysis. Contributions of the different POM fractions to bulk SOC as well as the SOC contents within the particular POM fraction were determined for 36 soil samples by physical particle size fractionation as described above. MIRS-PLSR based predictions for the contribution of POM fractions to bulk soil proved to be satisfactory (R² >0.77) and improved with decreasing particle size. For the predictions of SOC contents in bulk soil and the different POM fractions R² even reached values ≥0.97. Root mean squared errors of the cross validations were in the range of standard deviations of the lab analysis or smaller. As physical fractionation methods are intrinsically susceptible to measurement errors, determination of POM fractions by MIRS analysis may even improve data sets for modelling. 
Apart from the generally convincing statistical parameters, further evidence for reliable predictions of the contributions of the different POM fractions to bulk SOC could be drawn from the spectral information itself. The spectral features utilized for the determination of the contribution of the different POM fractions to bulk SOC were matching the features for the prediction of the absolute SOC concentrations within the particular fractions. As these predictions were conducted with independent sample sets (bulk soil for the POM contribution and soil fractions for the SOC content within the fraction) the matching structural information for both features of the individual POM fraction indirectly validates the prediction for the POM pools. The latter is especially true as the observed features coincide with the actual knowledge on chemistry and stabilization of POM in soils. For the compilation of a complete raster data-set, the developed calibrations were applied to all of the 131 topsoil samples taken at the SFB-TR32 testsite. Correlation analysis indicated that the coarse and the intermediate POM fractions are related to each other, to bulk SOC content and textural parameters respectively, while the fine POM fraction seems to be independent from these factors. The observed coherences and the applicability of a C-saturation concept will be discussed by visual map-comparison and geostatistical analysis of the determined parameters.

  5. Multivariate Analysis of Fruit Antioxidant Activities of Blackberry Treated with 1-Methylcyclopropene or Vacuum Precooling

    PubMed Central

    Li, Jian; Ma, Guowei; Ma, Lin; Bao, Xiaolin; Li, Liping; Zhao, Qian

    2018-01-01

    Effects of 1-methylcyclopropene (1-MCP) and vacuum precooling on quality and antioxidant properties of blackberries (Rubus spp.) were evaluated using one-way analysis of variance, principal component analysis (PCA), partial least squares (PLS), and path analysis. Results showed that the activities of antioxidant enzymes were enhanced by both 1-MCP treatment and vacuum precooling. PCA could discriminate 1-MCP treated fruit and the vacuum precooled fruit and showed that the radical-scavenging activities in vacuum precooled fruit were higher than those in 1-MCP treated fruit. The scores of PCA showed that H2O2 content was the most important variable of blackberry fruit. PLSR results showed that peroxidase (POD) activity negatively correlated with H2O2 content. The results of path coefficient analysis indicated that glutathione (GSH) also had an indirect effect on H2O2 content. PMID:29487622

  6. The crux of the method: assumptions in ordinary least squares and logistic regression.

    PubMed

    Long, Rebecca G

    2008-10-01

    Logistic regression has increasingly become the tool of choice when analyzing data with a binary dependent variable. While resources relating to the technique are widely available, clear discussions of why logistic regression should be used in place of ordinary least squares regression are difficult to find. The current paper compares and contrasts the assumptions of ordinary least squares with those of logistic regression and explains why logistic regression's looser assumptions make it adept at handling violations of the more important assumptions in ordinary least squares.

  7. Soil Organic Carbon Estimation and Mapping Using "on-the-go" VisNIR Spectroscopy

    NASA Astrophysics Data System (ADS)

    Brown, D. J.; Bricklemyer, R. S.; Christy, C.

    2007-12-01

    Soil organic carbon (SOC) and other soil properties related to carbon sequestration (e.g., soil clay content and mineralogy) vary spatially across landscapes. To capture this variability cost-effectively, new technologies, such as Visible and Near Infrared (VisNIR) spectroscopy, have been applied to soils for rapid, accurate, and inexpensive estimation of SOC and other soil properties. For this study, we evaluated an "on the go" VisNIR sensor developed by Veris Technologies, Inc. (Salinas, KS) for mapping SOC, soil clay content and mineralogy. The Veris spectrometer spanned 350 to 2224 nm with 8 nm spectral resolution, and 25 spectra were integrated every 2 seconds, resulting in 3-5 m scanning distances on the ground. The unit was mounted to a mobile sensor platform pulled by a tractor, and scanned soils at an average depth of 10 cm through a quartz-sapphire window. We scanned eight 16.2 ha (40 ac) wheat fields in north central Montana (USA), with 15 m transect intervals. Using random sampling with spatial inhibition, 100 soil samples from 0-10 cm depths were extracted along scanned transects from each field and were analyzed for SOC. Neat, sieved (<2 mm) soil sample materials were also scanned in the lab using an Analytical Spectral Devices (ASD, Boulder, CO, USA) Fieldspec Pro FR spectroradiometer with a spectral range of 350-2500 nm and spectral resolution of 2-10 nm. The analyzed samples were used to calibrate and validate a number of partial least squares regression (PLSR) VisNIR models to compare on-the-go scanning vs. higher spectral resolution laboratory spectroscopy vs. standard SOC measurement methods.

  8. Predicting soil properties for sustainable agriculture using vis-NIR spectroscopy: a case study in northern Greece

    NASA Astrophysics Data System (ADS)

    Tsakiridis, Nikolaos L.; Tziolas, Nikolaos; Dimitrakos, Agathoklis; Galanis, Georgios; Ntonou, Eleftheria; Tsirika, Anastasia; Terzopoulou, Evangelia; Kalopesa, Eleni; Zalidis, George C.

    2017-09-01

    Soil Spectral Libraries facilitate agricultural production taking into account the principles of a low-input sustainable agriculture and provide more valuable knowledge to environmental policy makers, enabling improved decision making and effective management of natural resources in the region. In this paper, the predictive performance of two state-of-the-art algorithms employed in soil spectroscopy, one linear (Partial Least Squares Regression) and one non-linear (Cubist), is compared. The comparison was carried out in a regional Soil Spectral Library developed in the Eastern Macedonia and Thrace region of Northern Greece, comprising roughly 450 Entisol soil samples from soil horizons A (0-30 cm) and B (30-60 cm). The soil spectra were acquired in the visible-near infrared region (vis-NIR, 350-2500 nm) using a standard protocol in the laboratory. Three soil properties, which are essential for agriculture, were analyzed and taken into account for the comparison. These were the Organic Matter, the Clay content and the concentration of nitrate-N. Additionally, three different spectral pre-processing techniques were utilized, namely the continuum removal, the absorbance transformation, and the first derivative. Following the removal of outliers using the Mahalanobis distance in the first 5 principal components of the spectra (accounting for 99.8% of the variance), a five-fold cross-validation experiment was considered for all 12 datasets. Statistical comparisons were conducted on the results, which indicate that the Cubist algorithm outperforms PLSR, while the most informative transformation is the first derivative.
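
Two of the pre-processing steps named in the abstract above (the absorbance transformation and the first derivative) can be illustrated with a minimal sketch. It assumes the usual apparent-absorbance definition log10(1/R) and a plain forward finite difference; the authors may well have used a smoothed derivative instead, and the function names and the sample spectrum are hypothetical.

```python
import math

def absorbance(reflectance):
    """Convert reflectance values (0 < R <= 1) to apparent absorbance, log10(1/R)."""
    return [math.log10(1.0 / r) for r in reflectance]

def first_derivative(values, step=1.0):
    """Forward finite-difference derivative along the wavelength axis.

    `step` is the (assumed constant) wavelength spacing between samples.
    """
    return [(b - a) / step for a, b in zip(values, values[1:])]

# Hypothetical reflectance spectrum sampled at a constant wavelength step.
spectrum = [0.10, 0.20, 0.40, 0.50]
absorb = absorbance(spectrum)
deriv = first_derivative(absorb)
```

The derivative has one value fewer than the input spectrum, which matters when different pre-processed datasets are compared band by band.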

  9. Role of Bai-Shao towards the antidepressant effect of Chaihu-Shu-Gan-San using metabonomics integrated with chemical fingerprinting.

    PubMed

    Chang, Xing; Jia, Hongmei; Zhou, Chao; Zhang, Hongwu; Yu, Meng; Yang, Junshan; Zou, Zhongmei

    2015-12-01

    Chaihu-Shu-Gan-San (CSGS) is a classical traditional Chinese medicine formula for the treatment of depression. As one of the single herbs in CSGS, Bai-Shao displayed antidepressant effect. In order to explore the role of Bai-Shao towards the antidepressant effect of CSGS, the metabolic regulation and chemical profiles of CSGS with and without Bai-Shao (QBS) were investigated using metabonomics integrated with chemical fingerprinting. At first, partial least squares regression (PLSR) analysis was applied to characterize the potential biomarkers associated with chronic unpredictable mild stress (CUMS)-induced depression. Among 46 differential metabolites found in the ultra-performance liquid chromatography quadrupole time of flight mass spectrometry (UPLC-Q-TOF/MS) and (1)H NMR-based urinary metabonomics, 20 were significantly correlated with the preferred sucrose consumption observed in behavior experiments and were considered as biomarkers to evaluate the antidepressant effect of CSGS. Based on differential regulation on CUMS-induced metabolic disturbances with CSGS and QBS treatments, we concluded that Bai-Shao made crucial contribution to CSGS in the improvement of the metabolic deviations of six biomarkers (i.e., glutamate, acetoacetic acid, creatinine, xanthurenic acid, kynurenic acid, and N-acetylserotonin) disturbed in CUMS-induced depression. While the chemical constituents of Bai-Shao contributed to CSGS were paeoniflorin, albiflorin, isomaltopaeoniflorin, and benzoylpaeoniflorin based on the multivariate analysis of the UPLC-Q-TOF/MS chemical profiles from CSGS and QBS extracts. These findings suggested that Bai-Shao played an indispensable role in the antidepressant effect of CSGS. Copyright © 2015 Elsevier B.V. All rights reserved.

  10. Hyperspectral Imaging Analysis for the Classification of Soil Types and the Determination of Soil Total Nitrogen

    PubMed Central

    Jia, Shengyao; Li, Hongyang; Wang, Yanjie; Tong, Renyuan; Li, Qing

    2017-01-01

    Soil is an important environment for crop growth. Quick and accurate access to soil nutrient content information is a prerequisite for scientific fertilization. In this work, hyperspectral imaging (HSI) technology was applied for the classification of soil types and the measurement of soil total nitrogen (TN) content. A total of 183 soil samples collected from Shangyu City (People’s Republic of China) were scanned by a near-infrared hyperspectral imaging system with a wavelength range of 874–1734 nm. The soil samples belonged to three major soil types typical of this area, including paddy soil, red soil and seashore saline soil. The successive projections algorithm (SPA) method was utilized to select effective wavelengths from the full spectrum. Pattern texture features (energy, contrast, homogeneity and entropy) were extracted from the gray-scale images at the effective wavelengths. The support vector machines (SVM) and partial least squares regression (PLSR) methods were used to establish classification and prediction models, respectively. The results showed that by using the combined data sets of effective wavelengths and texture features for modelling, an optimal correct classification rate of 91.8% could be achieved. The soil samples were first classified, then the local models were established for soil TN according to soil types, which achieved better prediction results than the general models. The overall results indicated that hyperspectral imaging technology could be used for soil type classification and soil TN determination, and data fusion combining spectral and image texture information showed advantages for the classification of soil types. PMID:28974005

  11. A general structure-property relationship to predict the enthalpy of vaporisation at ambient temperatures.

    PubMed

    Oberg, T

    2007-01-01

    The vapour pressure is the most important property of an anthropogenic organic compound in determining its partitioning between the atmosphere and the other environmental media. The enthalpy of vaporisation quantifies the temperature dependence of the vapour pressure and its value around 298 K is needed for environmental modelling. The enthalpy of vaporisation can be determined by different experimental methods, but estimation methods are needed to extend the current database and several approaches are available from the literature. However, these methods have limitations, such as a need for other experimental results as input data, a limited applicability domain, a lack of domain definition, and a lack of predictive validation. Here we have attempted to develop a quantitative structure-property relationship (QSPR) that has general applicability and is thoroughly validated. Enthalpies of vaporisation at 298 K were collected from the literature for 1835 pure compounds. The three-dimensional (3D) structures were optimised and each compound was described by a set of computationally derived descriptors. The compounds were randomly assigned into a calibration set and a prediction set. Partial least squares regression (PLSR) was used to estimate a low-dimensional QSPR model with 12 latent variables. The predictive performance of this model, within the domain of application, was estimated at n=560, q2Ext=0.968 and s=0.028 (log transformed values). The QSPR model was subsequently applied to a database of 100,000+ structures, after a similar 3D optimisation and descriptor generation. Reliable predictions can be reported for compounds within the previously defined applicability domain.

  12. Sparse partial least squares regression for simultaneous dimension reduction and variable selection

    PubMed Central

    Chun, Hyonho; Keleş, Sündüz

    2010-01-01

    Partial least squares regression has been an alternative to ordinary least squares for handling multicollinearity in several areas of scientific research since the 1960s. It has recently gained much attention in the analysis of high dimensional genomic data. We show that known asymptotic consistency of the partial least squares estimator for a univariate response does not hold with the very large p and small n paradigm. We derive a similar result for a multivariate response regression with partial least squares. We then propose a sparse partial least squares formulation which aims simultaneously to achieve good predictive performance and variable selection by producing sparse linear combinations of the original predictors. We provide an efficient implementation of sparse partial least squares regression and compare it with well-known variable selection and dimension reduction approaches via simulation experiments. We illustrate the practical utility of sparse partial least squares regression in a joint analysis of gene expression and genomewide binding data. PMID:20107611

  13. Mapping within-field variations of soil organic carbon content using UAV multispectral visible near-infrared images

    NASA Astrophysics Data System (ADS)

    Gilliot, Jean-Marc; Vaudour, Emmanuelle; Michelin, Joël

    2016-04-01

    This study was carried out in the framework of the PROSTOCK-Gessol3 project supported by the French Environment and Energy Management Agency (ADEME), the TOSCA-PLEIADES-CO project of the French Space Agency (CNES) and the SOERE PRO network working on environmental impacts of Organic Waste Products recycling on field crops at long time scale. The organic matter is an important soil fertility parameter and previous studies have shown the potential of spectral information measured in the laboratory or directly in the field using field spectro-radiometer or satellite imagery to predict the soil organic carbon (SOC) content. This work proposes a method for a spatial prediction of bare cultivated topsoil SOC content, from Unmanned Aerial Vehicle (UAV) multispectral imagery. An agricultural plot of 13 ha, located in the western region of Paris France, was analysed in April 2013, shortly before sowing while it was still bare soil. Soils comprised haplic luvisols, rendzic cambisols and calcaric or colluvic cambisols. The UAV platform used was a fixed wing provided by Airinov® flying at an altitude of 150m and was equipped with a four channels multispectral visible near-infrared camera MultiSPEC 4C® (550nm, 660nm, 735 nm and 790 nm). Twenty three ground control points (GCP) were sampled within the plot according to soils descriptions. GCP positions were determined with a centimetric DGPS. Different observations and measurements were made synchronously with the drone flight: soil surface description, spectral measurements (with ASD FieldSpec 3® spectroradiometer), roughness measurements by a photogrammetric method. Each of these locations was sampled for both soil standard physico-chemical analysis and soil water content. A Structure From Motion (SFM) processing was done from the UAV imagery to produce a 15 cm resolution multispectral mosaic using the Agisoft Photoscan® software. 
The SOC content was modelled by partial least squares regression (PLSR) between the laboratory analyses and the multispectral information for the 23 plots. The root mean squared error of cross validation (RMSECV) by the LOO (Leave One Out) method was 1.97 g of OC per kg of soil. A second correction of the model, incorporating the effects of moisture and roughness on reflectance, improved the quality of the prediction by 18%, giving an RMSECV of 1.61 g/kg. The model was finally spatialized on the whole plot using ArcGIS® by applying the regression formula on all mosaic pixels. Results are discussed in the light of an additional sampling campaign carried out in October 2015, providing 34 independent samples.
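
The leave-one-out RMSECV reported above can be illustrated with a minimal pure-Python sketch of single-response PLS (PLS1) fitted by NIPALS-style deflation. This is not the authors' pipeline: the function names, the synthetic four-band data and the response coefficients are all assumptions made for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def pls1_fit(X, y, n_components):
    """Fit PLS1 by successive deflation; returns means and per-component parts."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centred X
    f = [yi - y_mean for yi in y]                              # centred y
    comps = []
    for _ in range(n_components):
        # Weight vector: direction of maximal covariance with the residual y.
        w = [dot([E[i][j] for i in range(n)], f) for j in range(p)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:      # nothing left to explain
            break
        w = [wj / norm for wj in w]
        t = [dot(E[i], w) for i in range(n)]                   # scores
        tt = dot(t, t)
        load = [dot([E[i][j] for i in range(n)], t) / tt for j in range(p)]
        q = dot(f, t) / tt
        # Deflate X and y by the part explained by this component.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        comps.append((w, load, q))
    return x_mean, y_mean, comps

def pls1_predict(model, x):
    x_mean, y_mean, comps = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in comps:
        t = dot(e, w)
        y_hat += q * t
        e = [ej - t * lj for ej, lj in zip(e, load)]
    return y_hat

def rmsecv_loo(X, y, n_components):
    """Root mean squared error of leave-one-out cross-validation."""
    sq = 0.0
    for i in range(len(X)):
        model = pls1_fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], n_components)
        sq += (y[i] - pls1_predict(model, X[i])) ** 2
    return math.sqrt(sq / len(X))

# Synthetic reflectances in four bands (hypothetical values, noise-free response).
X = [[0.10, 0.50, 0.90, 0.30], [0.40, 0.20, 0.70, 0.60],
     [0.80, 0.90, 0.10, 0.20], [0.30, 0.70, 0.40, 0.90],
     [0.90, 0.10, 0.50, 0.40], [0.60, 0.60, 0.20, 0.70],
     [0.20, 0.30, 0.80, 0.10], [0.70, 0.40, 0.60, 0.80]]
y = [1.0 + 2.0 * r[0] - 1.0 * r[1] + 0.5 * r[2] + 3.0 * r[3] for r in X]

rmse_one = rmsecv_loo(X, y, 1)
rmse_full = rmsecv_loo(X, y, 4)
```

With as many components as predictors, PLS1 coincides with ordinary least squares, so the noise-free synthetic response is recovered almost exactly; in practice the number of components is usually chosen where RMSECV stops improving.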

  14. Determining quality of caviar from Caspian Sea based on Raman spectroscopy and using artificial neural networks.

    PubMed

    Mohamadi Monavar, H; Afseth, N K; Lozano, J; Alimardani, R; Omid, M; Wold, J P

    2013-07-15

    The purpose of this study was to evaluate the feasibility of Raman spectroscopy for predicting purity of caviars. The 93 wild caviar samples of three different types, namely; Beluga, Asetra and Sevruga were analysed by Raman spectroscopy in the range 1995 cm(-1) to 545 cm(-1). Also, 60 samples from combinations of every two types were examined. The chemical origin of the samples was identified by reference measurements on pure samples. Linear chemometric methods like Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) were used for data visualisation and classification which permitted clear distinction between different caviars. Non-linear methods like Artificial Neural Networks (ANN) were used to classify caviar samples. Two different networks were tested in the classification: Probabilistic Neural Network with Radial-Basis Function (PNN) and Multilayer Feed Forward Networks with Back Propagation (BP-NN). In both cases, scores of principal components (PCs) were chosen as input nodes for the input layer in PC-ANN models in order to reduce the redundancy of data and time of training. Leave One Out (LOO) cross validation was applied in order to check the performance of the networks. Results of PCA indicated that, features like type and purity can be used to discriminate different caviar samples. These findings were also supported by LDA with efficiency between 83.77% and 100%. These results were confirmed with the results obtained by developed PC-ANN models, able to classify pure caviar samples with 93.55% and 71.00% accuracy in BP network and PNN, respectively. In comparison, LDA, PNN and BP-NN models for predicting caviar types have 90.3%, 73.1% and 91.4% accuracy. 
Partial least squares regression (PLSR) models were built under cross validation and tested with different independent data sets, yielding determination coefficients (R(2)) of 0.86, 0.83, 0.92 and 0.91 with root mean square error (RMSE) of validation of 0.32, 0.11, 0.03 and 0.09 for the fatty acids 16:0, 20:5 and 22:6 and for fat, respectively. Crown Copyright © 2013. Published by Elsevier B.V. All rights reserved.

  15. Using Weighted Least Squares Regression for Obtaining Langmuir Sorption Constants

    USDA-ARS's Scientific Manuscript database

    One of the most commonly used models for describing phosphorus (P) sorption to soils is the Langmuir model. To obtain model parameters, the Langmuir model is fit to measured sorption data using least squares regression. Least squares regression is based on several assumptions including normally dist...

  16. Two Enhancements of the Logarithmic Least-Squares Method for Analyzing Subjective Comparisons

    DTIC Science & Technology

    1989-03-25

    error term. For this model, the total sum of squares (SSTO), defined as $\mathrm{SSTO} = \sum_{i=1}^{n} (y_i - \bar{y})^2$, can be partitioned into error and regression sums...of the regression line around the mean value. Mathematically, for the model given by equation A.4, $\mathrm{SSTO} = \mathrm{SSE} + \mathrm{SSR}$ (A.6), where SSTO is the total...sum of squares (i.e., the variance of the $y_i$’s), SSE is error sum of squares, and SSR is the regression sum of squares. SSTO, SSE, and SSR are given
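
The partition SSTO = SSE + SSR quoted in the excerpt above can be checked numerically. The sketch below assumes an ordinary straight-line least-squares fit with an intercept, which may differ from the report's own model (equation A.4 is not shown in the excerpt); the function name and data are hypothetical.

```python
def ols_sums_of_squares(x, y):
    """Fit y = a + b*x by least squares and return (SSTO, SSE, SSR)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    fitted = [intercept + slope * xi for xi in x]
    ssto = sum((yi - y_bar) ** 2 for yi in y)                 # total
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))    # error
    ssr = sum((fi - y_bar) ** 2 for fi in fitted)             # regression
    return ssto, sse, ssr

# Hypothetical data for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
ssto, sse, ssr = ols_sums_of_squares(x, y)
```

The identity holds for least-squares fits that include an intercept; it generally fails for other fitting criteria.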

  17. Ordinary Least Squares and Quantile Regression: An Inquiry-Based Learning Approach to a Comparison of Regression Methods

    ERIC Educational Resources Information Center

    Helmreich, James E.; Krog, K. Peter

    2018-01-01

    We present a short, inquiry-based learning course on concepts and methods underlying ordinary least squares (OLS), least absolute deviation (LAD), and quantile regression (QR). Students investigate squared, absolute, and weighted absolute distance functions (metrics) as location measures. Using differential calculus and properties of convex…
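
The idea of distance functions as location measures can be illustrated with a small sketch: the mean minimizes the sum of squared distances (the OLS criterion), while the median minimizes the sum of absolute distances (the LAD criterion). The data and helper names below are hypothetical and are not taken from the course itself.

```python
import statistics

def sum_squared(data, c):
    """Sum of squared distances from the candidate location c."""
    return sum((v - c) ** 2 for v in data)

def sum_absolute(data, c):
    """Sum of absolute distances from the candidate location c."""
    return sum(abs(v - c) for v in data)

# Hypothetical sample with one large value to separate mean and median.
data = [1.0, 2.0, 2.5, 3.0, 20.0]
mean = statistics.mean(data)
median = statistics.median(data)

# Scan a grid of candidate locations and keep the best one for each criterion.
grid = [i / 10 for i in range(0, 201)]
best_squared = min(grid, key=lambda c: sum_squared(data, c))
best_absolute = min(grid, key=lambda c: sum_absolute(data, c))
```

The large value pulls the squared-distance minimizer toward it but leaves the absolute-distance minimizer unchanged, which is the usual argument for the robustness of LAD.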

  18. Difficulties of biomass estimation over natural grassland

    NASA Astrophysics Data System (ADS)

    Kertész, Péter; Gecse, Bernadett; Pintér, Krisztina; Fóti, Szilvia; Nagy, Zoltán

    2017-04-01

    Estimation of biomass amount in grasslands using remote sensing is a challenge due to the high diversity and different phenologies of the constituting plant species. The aim of this study was to estimate the biomass amount (dry weight per area) during the vegetation period of a diverse semi-natural grassland with remote sensing. A multispectral camera (Tetracam Mini-MCA 6) was used with 3 cm ground resolution. The pre-processing method includes noise reduction, the correction for the vignetting effect and the calculation of the reflectance using an Incident Light Sensor (ILS). Calibration was made with an ASD spectrophotometer as reference. To estimate biomass, the Partial Least Squares Regression (PLSR) method was used with 5 bands and NDVI as input variables. Above-ground biomass was cut in 15 quadrats (50×50 cm) as reference. The best prediction was attained in spring (r2=0.94, RMSE: 26.37 g m-2). The average biomass amount was 167 g m-2. The variability of the biomass is mainly determined by the relief, which causes the high and low biomass patches to be stable. The reliability of biomass estimation was negatively affected by the appearance of flowers and by the senescent plant parts during the summer. To determine the effects of the presence of flowers on the biomass estimation, 20 dominant species with visually dominant flowers in the area were selected and the cover of flowers (%) was estimated in permanent plots during measurement campaigns. If the cover of flowers was low (<25%), the biomass amount estimation was successful (r2 >0.9), while at higher cover of flowers (>30%), the estimation failed (r2 <0.2). This effect restricts the usage of the remote sensing method to the spring - early summer period in diverse grasslands.
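
The abstract above uses five bands plus NDVI as PLSR inputs. A minimal sketch of assembling such a predictor vector follows, assuming the standard definition NDVI = (NIR - Red) / (NIR + Red); the band order, band indices and reflectance values are hypothetical and do not describe the camera configuration used in the study.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red)

def build_features(bands, nir_index, red_index):
    """Append NDVI to the raw band reflectances to form one predictor row."""
    return list(bands) + [ndvi(bands[nir_index], bands[red_index])]

# Hypothetical reflectances for one pixel in five bands.
pixel = [0.05, 0.08, 0.10, 0.30, 0.50]
features = build_features(pixel, nir_index=4, red_index=2)
```

Each pixel or quadrat then contributes one six-element row to the predictor matrix of the regression.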

  19. Soil organic matter composition from correlated thermal analysis and nuclear magnetic resonance data in Australian national inventory of agricultural soils

    NASA Astrophysics Data System (ADS)

    Moore, T. S.; Sanderman, J.; Baldock, J.; Plante, A. F.

    2016-12-01

    National-scale inventories typically include soil organic carbon (SOC) content, but not chemical composition or biogeochemical stability. Australia's Soil Carbon Research Programme (SCaRP) represents a national inventory of SOC content and composition in agricultural systems. The program used physical fractionation followed by 13C nuclear magnetic resonance (NMR) spectroscopy. While these techniques are highly effective, they are typically too expensive and time consuming for use in large-scale SOC monitoring. We seek to understand if analytical thermal analysis is a viable alternative. Coupled differential scanning calorimetry (DSC) and evolved gas analysis (CO2- and H2O-EGA) yields valuable data on SOC composition and stability via ramped combustion. The technique requires little training to use, and does not require fractionation or other sample pre-treatment. We analyzed 300 agricultural samples collected by SCaRP, divided into four fractions: whole soil, coarse particulates (POM), untreated mineral associated (HUM), and hydrofluoric acid (HF)-treated HUM. All samples were analyzed by DSC-EGA, but only the POM and HF-HUM fractions were analyzed by NMR. Multivariate statistical analyses were used to explore natural clustering in SOC composition and stability based on DSC-EGA data. A partial least-squares regression (PLSR) model was used to explore correlations among the NMR and DSC-EGA data. Correlations demonstrated regions of combustion attributable to specific functional groups, which may relate to SOC stability. We are increasingly challenged with developing an efficient technique to assess SOC composition and stability at large spatial and temporal scales. Correlations between NMR and DSC-EGA may demonstrate the viability of using thermal analysis in lieu of more demanding methods in future large-scale surveys, and may provide data that goes beyond chemical composition to better approach quantification of biogeochemical stability.

  20. A portable device for detecting fruit quality by diffuse reflectance Vis/NIR spectroscopy

    NASA Astrophysics Data System (ADS)

    Sun, Hongwei; Peng, Yankun; Li, Peng; Wang, Wenxiu

    2017-05-01

    Soluble solid content (SSC) is a major quality parameter of fruit, which influences its flavor and texture. Some research on on-line non-invasive detection of fruit quality has been published; however, consumers currently want portable devices. This study aimed to develop a portable device for accurate, real-time and nondestructive determination of quality factors of fruit based on diffuse reflectance Vis/NIR spectroscopy (520-950 nm). The hardware of the device consisted of four units: light source unit, spectral acquisition unit, central processing unit, and display unit. A halogen lamp was chosen as the light source. In operation, the hand-held probe was in contact with the surface of the fruit sample, forming a dark environment that shielded it from interfering external light. Diffuse reflectance light was collected and measured by a spectrometer (USB4000). An ARM (Advanced RISC Machines) processor, as central processing unit, controlled all parts of the device and analyzed the spectral data. A Liquid Crystal Display (LCD) touch screen was used to interface with users. To validate its reliability and stability, 63 apples were tested, 47 of which were chosen as the calibration set and the others as the prediction set. Their SSC reference values were measured by refractometer. The spectral data acquired by the portable device were processed by standard normal variate (SNV) and Savitzky-Golay filter (S-G) to eliminate spectral noise. Then partial least squares regression (PLSR) was applied to build prediction models, and the best prediction results were achieved with a correlation coefficient (r) of 0.855 and a standard error of 0.6033 °Brix. The results demonstrated that this device was feasible for quantitative analysis of the soluble solid content of apples.
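
The SNV step mentioned in the abstract above centres and scales each spectrum by its own mean and standard deviation. A minimal sketch follows, assuming the sample (n - 1) standard deviation; the function name and the spectrum values are hypothetical.

```python
import statistics

def snv(spectrum):
    """Standard normal variate: per-spectrum centring and scaling."""
    mean = statistics.mean(spectrum)
    sd = statistics.stdev(spectrum)  # sample standard deviation (n - 1)
    return [(v - mean) / sd for v in spectrum]

# Hypothetical reflectance spectrum for one sample.
raw = [0.42, 0.45, 0.51, 0.48, 0.40]
corrected = snv(raw)
```

Because the correction uses only the spectrum itself, it can be applied on the device to each new measurement without reference to the calibration set.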

  1. Near-infrared chemical imaging (NIR-CI) as a process monitoring solution for a production line of roll compaction and tableting.

    PubMed

    Khorasani, Milad; Amigo, José M; Sun, Changquan Calvin; Bertelsen, Poul; Rantanen, Jukka

    2015-06-01

    In the present study the application of near-infrared chemical imaging (NIR-CI) supported by chemometric modeling as a non-destructive tool for monitoring and assessing the roller compaction and tableting processes was investigated. Based on preliminary risk-assessment, discussion with experts and current work from the literature, the critical process parameters (roll pressure and roll speed) and critical quality attributes (ribbon porosity, granule size, amount of fines, tablet tensile strength) were identified and a design space was established. Five experimental runs with different process settings were carried out, which yielded intermediates (ribbons, granules) and final products (tablets) with different properties. A principal component analysis (PCA) based model of NIR images was applied to map the ribbon porosity distribution. The ribbon porosity distribution gained from the PCA based NIR-CI was used to develop predictive models for granule size fractions. Predictive methods with acceptable R(2) values could be used to predict the granule particle size. A partial least squares regression (PLS-R) based model of the NIR-CI was used to map and predict the chemical distribution and content of active compound for both roller compacted ribbons and corresponding tablets. In order to select the optimal process setting, the standard deviation of tablet tensile strength and tablet weight for each tablet batch was considered. Strong linear correlations were established between tablet tensile strength and amount of fines and between tablet tensile strength and granule size. These approaches are considered to have a potentially large impact on quality monitoring and control of continuously operating manufacturing lines, such as roller compaction and tableting processes. Copyright © 2015 Elsevier B.V. All rights reserved.

  2. Characterization of the Key Aroma Volatile Compounds in Cranberry (Vaccinium macrocarpon Ait.) Using Gas Chromatography-Olfactometry (GC-O) and Odor Activity Value (OAV).

    PubMed

    Zhu, JianCai; Chen, Feng; Wang, LingYing; Niu, YunWei; Chen, HeXing; Wang, HongLin; Xiao, ZuoBing

    2016-06-22

    The volatile compounds of cranberries obtained from four cultivars (Early Black, Y1; Howes, Y2; Searles, Y3; and McFarlin, Y4) were analyzed by gas chromatography-olfactometry (GC-O), gas chromatography-mass spectrometry (GC-MS), and GC-flame photometric detection (FPD). The results showed that a total of thirty-three, thirty-four, thirty-four, and thirty-six odor-active compounds were identified by GC-O in Y1, Y2, Y3, and Y4, respectively. In addition, twenty-two, twenty-two, thirty, and twenty-seven quantified compounds were demonstrated as important odorants according to odor activity values (OAVs > 1). Among these compounds, hexanal (OAV: 27-60), pentanal (OAV: 31-51), (E)-2-heptenal (OAV: 17-66), (E)-2-hexenal (OAV: 18-63), (E)-2-octenal (OAV: 10-28), (E)-2-nonenal (OAV: 8-77), ethyl 2-methylbutyrate (OAV: 10-33), β-ionone (OAV: 8-73), 2-methylbutyric acid (OAV: 18-37), and octanal (OAV: 4-24) contributed greatly to the aroma of cranberry. Partial least-squares regression (PLSR) was used to process the mean data accumulated from sensory evaluation by the panelists, odor-active aroma compounds (OAVs > 1), and samples. Sample Y3 was highly correlated with the sensory descriptors "floral" and "fruity". Sample Y4 was strongly related to the sensory descriptors "mellow" and "green and grass". Finally, an aroma reconstitution (Model A) was prepared by mixing the odor-active aroma compounds (OAVs > 1) based on their measured concentrations in the Y1 sample; the aroma profile of the reconstitution was very similar to that of the original sample.
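
The OAV > 1 screening used in the abstract above can be sketched as follows, assuming the common definition of the odor activity value as the concentration of a compound divided by its odor threshold in the same units. The compound names, concentrations and thresholds are placeholders, not values from the study.

```python
def odor_activity_value(concentration, threshold):
    """OAV: concentration divided by the odor threshold (same units)."""
    return concentration / threshold

# Placeholder (concentration, threshold) pairs in arbitrary matching units.
compounds = {
    "compound_a": (120.0, 4.0),
    "compound_b": (3.0, 10.0),
}

oavs = {name: odor_activity_value(c, t) for name, (c, t) in compounds.items()}

# Compounds with OAV > 1 are treated as contributing to the aroma.
key_odorants = [name for name, value in oavs.items() if value > 1]
```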

  3. Modeling pollen time series using seasonal-trend decomposition procedure based on LOESS smoothing

    NASA Astrophysics Data System (ADS)

    Rojo, Jesús; Rivero, Rosario; Romero-Morte, Jorge; Fernández-González, Federico; Pérez-Badia, Rosa

    2017-02-01

    Analysis of airborne pollen concentrations provides valuable information on plant phenology and is thus a useful tool in agriculture—for predicting harvests in crops such as the olive and for deciding when to apply phytosanitary treatments—as well as in medicine and the environmental sciences. Variations in airborne pollen concentrations, moreover, are indicators of changing plant life cycles. By modeling pollen time series, we can not only identify the variables influencing pollen levels but also predict future pollen concentrations. In this study, airborne pollen time series were modeled using a seasonal-trend decomposition procedure based on LOcally wEighted Scatterplot Smoothing (LOESS) smoothing (STL). The data series—daily Poaceae pollen concentrations over the period 2006-2014—was broken up into seasonal and residual (stochastic) components. The seasonal component was compared with data on Poaceae flowering phenology obtained by field sampling. Residuals were fitted to a model generated from daily temperature and rainfall values, and daily pollen concentrations, using partial least squares regression (PLSR). This method was then applied to predict daily pollen concentrations for 2014 (independent validation data) using results for the seasonal component of the time series and estimates of the residual component for the period 2006-2013. Correlation between predicted and observed values was r = 0.79 (correlation coefficient) for the pre-peak period (i.e., the period prior to the peak pollen concentration) and r = 0.63 for the post-peak period. Separate analysis of each of the components of the pollen data series enables the sources of variability to be identified more accurately than by analysis of the original non-decomposed data series, and for this reason, this procedure has proved to be a suitable technique for analyzing the main environmental factors influencing airborne pollen concentrations.

  4. Relative Linkages of Chlorophyll-a with the Hydroclimatic and Biogeochemical Variables across the Continental U.S. (CONUS)

    NASA Astrophysics Data System (ADS)

    Ahmed, M. H.; Abdul-Aziz, O. I.

    2017-12-01

    Chlorophyll-a (Chl-a) is a key indicator for stream water quality and ecological health. The characterization of interplay between Chl-a and its numerous hydroclimatic and biogeochemical drivers is complex, and often involves multicollinear datasets. A systematic data analytics methodology was employed to determine the relative linkages of stream Chl-a with its dynamic environmental drivers at 50 stream water quality monitoring stations across the continental U.S. Multivariate statistical techniques of principal component analysis (PCA) and factor analysis (FA), in concert with Pearson correlation analysis, were applied to evaluate interrelationships among hydroclimatic, biogeochemical, and biological variables. Power-law based partial least squares regression (PLSR) models were developed with a bootstrap Monte Carlo procedure (1000 iterations) to reliably estimate the comparative linkages of Chl-a by resolving multicollinearity in the data matrices (Nash-Sutcliffe efficiency = 0.50-0.87). The data analytics suggested four environmental regimes of stream Chl-a, as dominated by nutrient, climate, redox, and hydro-atmospheric contributions, respectively. Total phosphorus (TP) was the most dominant driver of stream Chl-a in the nutrient controlled regime. Water temperature demonstrated the strongest control of Chl-a in the climate-dominated regime. Furthermore, pH and stream flow were found to be the most important drivers of Chl-a in the redox and hydro-atmospheric component dominated regimes, respectively. The research led to a significant reduction of dimensionality in the large data matrices, providing quantitative and qualitative insights on the dynamics of stream Chl-a. The findings would be useful to manage stream water quality and ecosystem health in the continental U.S. and around the world under a changing climate and environment.

  6. On Using the Average Intercorrelation Among Predictor Variables and Eigenvector Orientation to Choose a Regression Solution.

    ERIC Educational Resources Information Center

    Mugrage, Beverly; And Others

    Three ridge regression solutions are compared with ordinary least squares regression and with principal components regression using all components. Ridge regression, particularly the Lawless-Wang solution, out-performed ordinary least squares regression and the principal components solution on the criteria of stability of coefficient and closeness…

  7. Technique for estimating the 2- to 500-year flood discharges on unregulated streams in rural Missouri

    USGS Publications Warehouse

    Alexander, Terry W.; Wilson, Gary L.

    1995-01-01

    A generalized least-squares regression technique was used to relate the 2- to 500-year flood discharges from 278 selected streamflow-gaging stations to statistically significant basin characteristics. The regression relations (estimating equations) were defined for three hydrologic regions (I, II, and III) in rural Missouri. Ordinary least-squares regression analyses indicate that drainage area (Regions I, II, and III) and main-channel slope (Regions I and II) are the only basin characteristics needed for computing the 2- to 500-year design-flood discharges at gaged or ungaged stream locations. The resulting generalized least-squares regression equations provide a technique for estimating the 2-, 5-, 10-, 25-, 50-, 100-, and 500-year flood discharges on unregulated streams in rural Missouri. The regression equations for Regions I and II were developed from streamflow-gaging stations with drainage areas ranging from 0.13 to 11,500 square miles and 0.13 to 14,000 square miles, and main-channel slopes ranging from 1.35 to 150 feet per mile and 1.20 to 279 feet per mile. The regression equations for Region III were developed from streamflow-gaging stations with drainage areas ranging from 0.48 to 1,040 square miles. Standard errors of estimate for the generalized least-squares regression equations in Regions I, II, and III ranged from 30 to 49 percent.

  8. Orthogonal Regression: A Teaching Perspective

    ERIC Educational Resources Information Center

    Carr, James R.

    2012-01-01

    A well-known approach to linear least squares regression is that which involves minimizing the sum of squared orthogonal projections of data points onto the best fit line. This form of regression is known as orthogonal regression, and the linear model that it yields is known as the major axis. A similar method, reduced major axis regression, is…
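
    The major axis described above can be sketched with a singular value decomposition of the centered data. This is a minimal illustration, not the article's own procedure; the function name `major_axis_fit` and the sample points are assumptions made for the example.

```python
import numpy as np

def major_axis_fit(x, y):
    """Fit y = intercept + slope * x by minimizing squared perpendicular distances."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    centered = np.column_stack([x - x.mean(), y - y.mean()])
    # The first right singular vector points along the direction of largest
    # variance, which is the major axis of the centered point cloud.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    dx, dy = vt[0]
    if np.isclose(dx, 0.0):
        raise ValueError("major axis is vertical; slope is undefined")
    slope = dy / dx
    # The major axis always passes through the centroid of the data.
    intercept = y.mean() - slope * x.mean()
    return slope, intercept

# Hypothetical, exactly collinear points on y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 2.0 * x + 1.0
slope, intercept = major_axis_fit(x, y)
print(slope, intercept)
```

    Unlike ordinary least squares, which measures error only along y, this fit treats x and y symmetrically, so swapping the two variables yields the same geometric line.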

  9. Application of VNIR diffuse reflectance spectroscopy to estimate soil organic carbon content, and content of different forms of iron and manganese

    NASA Astrophysics Data System (ADS)

    Klement, Ales; Jaksik, Ondrej; Kodesova, Radka; Drabek, Ondrej; Boruvka, Lubos

    2013-04-01

    Visible and near-infrared (VNIR) diffuse reflectance spectroscopy is a progressive method used for prediction of soil properties. The study was performed on soils from agricultural land in the South Moravian municipality of Brumovice. The studied area is characterized by a relatively flat upper part, a tributary valley in the middle and a colluvial fan at the bottom. Haplic Chernozem remained at the flat upper part of the area. Regosols were formed on steep parts of the valley. Colluvial Chernozem and Colluvial soils were formed at the bottom parts of the valley and at the bottom part of the studied field. The goal of the study was to evaluate the relationship between soil spectral curves and organic matter content, and the content of different forms of iron and manganese (Mehlich III extract, ammonium oxalate extract and dithionite-citrate extract). Samples (87) were taken from the topsoil within a regular grid covering the studied area. The soil spectral curves (of air-dried soil sieved through a 2 mm sieve) were measured in the laboratory using a FieldSpec®3 spectrometer (350-2500 nm). The Fe and Mn contents in the different extracts were measured using ICP-OES (with an iCAP 6500 Radial ICP Emission spectrometer; Thermo Scientific, UK) under standard analytical conditions. Partial least squares regression (PLSR) was used for modeling the relationship between spectra and measured soil properties. Prediction ability was evaluated using the R2, root mean square error (RMSE) and normalized root mean square deviation (NRMSD). The results showed the best prediction for Mn (R2 = 0.86, RMSE = 29, NRMSD = 0.11), Fe in ammonium oxalate extract (R2 = 0.82, RMSE = 171, NRMSD = 0.12) and organic matter content (R2 = 0.84, RMSE = 0.13, NRMSD = 0.09). Slightly worse prediction was obtained for Mn and Fe in citrate extract (R2 = 0.82, RMSE = 21, NRMSD = 0.10; R2 = 0.77, RMSE = 522, NRMSD = 0.23). Prediction was poor for Mn and Fe in Mehlich III extract (R2 = 0.43, RMSE = 13, NRMSD = 0.17; R2 = 0.39, RMSE = 13, NRMSD = 0.26). In general, the results confirmed that the measurement of soil spectral characteristics is a promising technology for digital soil mapping and for predicting the studied soil properties. Acknowledgment: Authors acknowledge the financial support of the Ministry of Agriculture of the Czech Republic (grant No. QJ1230319) and the Czech Science Foundation (grant No. GA526/09/1762).
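
    The three evaluation statistics named in this abstract can be computed as in the sketch below. The abstract does not state how NRMSD was normalized or whether R2 is a coefficient of determination or a squared correlation; the code assumes RMSE divided by the observed range and 1 - SS_res/SS_tot, and the input values are made up.

```python
import numpy as np

def prediction_metrics(observed, predicted):
    """Return R2, RMSE and range-normalized RMSE for paired values."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residuals = observed - predicted
    rmse = np.sqrt(np.mean(residuals ** 2))
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    # Assumption: normalize by the range of the observed values.
    nrmsd = rmse / (observed.max() - observed.min())
    return {"r2": r2, "rmse": rmse, "nrmsd": nrmsd}

# Made-up example values, not data from the study.
metrics = prediction_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
print(metrics)
```

    Because RMSE carries the units of the measured property, a normalized form such as NRMSD is what allows the Mn, Fe and organic matter models to be compared on one scale.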

  10. Prediction of soil organic carbon in a coal mining area by Vis-NIR spectroscopy.

    PubMed

    Sun, Wenjuan; Li, Xinju; Niu, Beibei

    2018-01-01

    Coal mining has led to increasingly serious land subsidence, and the reclamation of the subsided land has become a hot topic of concern for governments and scholars. Soil quality of reclaimed land is the key indicator to the evaluation of the reclamation effect; hence, rapid monitoring and evaluation of reclaimed land is of great significance. Visible-near infrared (Vis-NIR) spectroscopy has been shown to be a rapid, timely and efficient tool for the prediction of soil organic carbon (SOC). In this study, 104 soil samples were collected from the Baodian mining area of Shandong province. Vis-NIR reflectance spectra and soil organic carbon content were then measured under laboratory conditions. The spectral data were first denoised using the Savitzky-Golay (SG) convolution smoothing method or the multiple scattering correction (MSC) method, after which the spectral reflectance (R) was subjected to reciprocal, reciprocal logarithm and differential transformations to improve spectral sensitivity. Finally, regression models for estimating the SOC content by the spectral data were constructed using partial least squares regression (PLSR). The results showed that: (1) The SOC content in the mining area was generally low (at the below-average level) and exhibited great variability. (2) The spectral reflectance increased with the decrease of soil organic carbon content. In addition, the sensitivity of the spectrum to the change in SOC content, especially that in the near-infrared band of the original reflectance, decreased when the SOC content was low. (3) The modeling results performed best when the spectral reflectance was preprocessed by Savitzky-Golay (SG) smoothing coupled with multiple scattering correction (MSC) and first-order differential transformation (modeling R2 = 0.86, RMSE = 2.00 g/kg, verification R2 = 0.78, RMSE = 1.81 g/kg, and RPD = 2.69). 
In addition, the first-order differential of R combined with SG, MSC with R, SG together with MSC and R also produced better modeling results than other pretreatment combinations. Vis-NIR modeling with specific spectral preprocessing methods could predict SOC content effectively.
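
    For readers unfamiliar with how a PLSR model is fitted, the following is a minimal single-response (PLS1) sketch using the NIPALS deflation scheme. It is not the authors' code: the function name `pls1_fit`, the random data and the choice of three components are assumptions, and the spectral preprocessing steps (SG smoothing, MSC, derivatives) are not reproduced.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """Fit a PLS1 model with NIPALS; return (coefficients, intercept)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    n_features = X.shape[1]
    W = np.zeros((n_features, n_components))  # weight vectors
    P = np.zeros((n_features, n_components))  # X loadings
    q = np.zeros(n_components)                # y loadings
    for k in range(n_components):
        w = Xk.T @ yk                 # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                    # scores for this component
        tt = t @ t
        p = Xk.T @ t / tt
        q[k] = yk @ t / tt
        Xk = Xk - np.outer(t, p)      # deflate X
        yk = yk - q[k] * t            # deflate y
        W[:, k], P[:, k] = w, p
    coef = W @ np.linalg.solve(P.T @ W, q)
    intercept = y_mean - x_mean @ coef
    return coef, intercept

# Synthetic data standing in for spectra and a measured property.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 4.0 + 0.05 * rng.normal(size=30)
coef, intercept = pls1_fit(X, y, n_components=3)
print(coef, intercept)
```

    With as many components as predictors and a full-rank X, the result coincides with ordinary least squares; the practical value of PLSR for spectra comes from using far fewer components than wavelengths.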

  11. A Simple Introduction to Moving Least Squares and Local Regression Estimation

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Garimella, Rao Veerabhadra

    In this brief note, a highly simplified introduction to estimating functions over a set of particles is presented. The note starts from Global Least Squares fitting, going on to Moving Least Squares estimation (MLS) and finally, Local Regression Estimation (LRE).

  12. Normalization Ridge Regression in Practice I: Comparisons Between Ordinary Least Squares, Ridge Regression and Normalization Ridge Regression.

    ERIC Educational Resources Information Center

    Bulcock, J. W.

    The problem of model estimation when the data are collinear was examined. Though the ridge regression (RR) outperforms ordinary least squares (OLS) regression in the presence of acute multicollinearity, it is not a problem free technique for reducing the variance of the estimates. It is a stochastic procedure when it should be nonstochastic and it…
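
    The contrast between OLS and ridge regression on collinear data can be illustrated with the closed-form ridge solution. This sketch uses the standard ridge estimator, not the normalization ridge variant the report goes on to discuss; the penalty value and the synthetic predictors are assumptions.

```python
import numpy as np

def ridge_coefficients(X, y, penalty):
    """Solve (X'X + penalty * I) b = X'y; penalty = 0 gives OLS."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    gram = X.T @ X
    return np.linalg.solve(gram + penalty * np.eye(X.shape[1]), X.T @ y)

# Two nearly identical predictors (acute multicollinearity).
rng = np.random.default_rng(0)
x1 = rng.normal(size=50)
x2 = x1 + 1e-3 * rng.normal(size=50)
X = np.column_stack([x1, x2])
y = x1 + x2 + 0.1 * rng.normal(size=50)

b_ols = ridge_coefficients(X, y, 0.0)
b_ridge = ridge_coefficients(X, y, 1.0)
print("OLS:  ", b_ols)
print("ridge:", b_ridge)
```

    The penalty shrinks the poorly determined difference between the two coefficients while leaving their well-determined sum almost unchanged, which is the variance reduction that motivates ridge regression here.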

  13. Addressing the identification problem in age-period-cohort analysis: a tutorial on the use of partial least squares and principal components analysis.

    PubMed

    Tu, Yu-Kang; Krämer, Nicole; Lee, Wen-Chung

    2012-07-01

    In the analysis of trends in health outcomes, an ongoing issue is how to separate and estimate the effects of age, period, and cohort. As these 3 variables are perfectly collinear by definition, regression coefficients in a general linear model are not unique. In this tutorial, we review why identification is a problem, and how this problem may be tackled using partial least squares and principal components regression analyses. Both methods produce regression coefficients that fulfill the same collinearity constraint as the variables age, period, and cohort. We show that, because the constraint imposed by partial least squares and principal components regression is inherent in the mathematical relation among the 3 variables, this leads to more interpretable results. We use one dataset from a Taiwanese health-screening program to illustrate how to use partial least squares regression to analyze the trends in body heights with 3 continuous variables for age, period, and cohort. We then use another dataset of hepatocellular carcinoma mortality rates for Taiwanese men to illustrate how to use partial least squares regression to analyze tables with aggregated data. We use the second dataset to show the relation between the intrinsic estimator, a recently proposed method for the age-period-cohort analysis, and partial least squares regression. We also show that the inclusion of all indicator variables provides a more consistent approach. R code for our analyses is provided in the eAppendix.
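
    The identification problem and the constraint mentioned above can be seen in a few lines. This sketch is not the tutorial's R code: it uses a minimum-norm least squares solution (equivalent to principal components regression with all non-zero components) as a stand-in, and the ages, periods and heights are made up.

```python
import numpy as np

# Made-up design: cohort is determined exactly by period and age.
age = np.array([30.0, 40.0, 50.0, 30.0, 40.0, 50.0])
period = np.array([2000.0, 2000.0, 2000.0, 2010.0, 2010.0, 2010.0])
cohort = period - age
height = np.array([170.0, 168.0, 166.0, 172.0, 170.0, 168.0])  # hypothetical

X = np.column_stack([age, period, cohort])
Xc = X - X.mean(axis=0)
yc = height - height.mean()

# Three columns but only two independent directions.
rank = np.linalg.matrix_rank(Xc, tol=1e-8)

# lstsq returns the minimum-norm solution when X is rank deficient.
b = np.linalg.lstsq(Xc, yc, rcond=1e-10)[0]
b_age, b_period, b_cohort = b

print("rank:", rank)
print("coefficients:", b)
# Since age - period + cohort = 0 for every row, the minimum-norm
# coefficients satisfy the same relation.
print("b_age - b_period + b_cohort =", b_age - b_period + b_cohort)
```

    Infinitely many coefficient vectors fit these data equally well; the minimum-norm choice is the one orthogonal to the null direction (1, -1, 1), which is why it inherits the age-period-cohort constraint.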

  14. Use of partial least squares regression to impute SNP genotypes in Italian cattle breeds.

    PubMed

    Dimauro, Corrado; Cellesi, Massimo; Gaspa, Giustino; Ajmone-Marsan, Paolo; Steri, Roberto; Marras, Gabriele; Macciotta, Nicolò P P

    2013-06-05

    The objective of the present study was to test the ability of the partial least squares regression technique to impute genotypes from low density single nucleotide polymorphisms (SNP) panels i.e. 3K or 7K to a high density panel with 50K SNP. No pedigree information was used. Data consisted of 2093 Holstein, 749 Brown Swiss and 479 Simmental bulls genotyped with the Illumina 50K Beadchip. First, a single-breed approach was applied by using only data from Holstein animals. Then, to enlarge the training population, data from the three breeds were combined and a multi-breed analysis was performed. Accuracies of genotypes imputed using the partial least squares regression method were compared with those obtained by using the Beagle software. The impact of genotype imputation on breeding value prediction was evaluated for milk yield, fat content and protein content. In the single-breed approach, the accuracy of imputation using partial least squares regression was around 90 and 94% for the 3K and 7K platforms, respectively; corresponding accuracies obtained with Beagle were around 85% and 90%. Moreover, computing time required by the partial least squares regression method was on average around 10 times lower than computing time required by Beagle. Using the partial least squares regression method in the multi-breed resulted in lower imputation accuracies than using single-breed data. The impact of the SNP-genotype imputation on the accuracy of direct genomic breeding values was small. The correlation between estimates of genetic merit obtained by using imputed versus actual genotypes was around 0.96 for the 7K chip. Results of the present work suggested that the partial least squares regression imputation method could be useful to impute SNP genotypes when pedigree information is not available.

  15. Robust analysis of trends in noisy tokamak confinement data using geodesic least squares regression

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Verdoolaege, G.; Shabbir, A.

    Regression analysis is a very common activity in fusion science for unveiling trends and parametric dependencies, but it can be a difficult matter. We have recently developed the method of geodesic least squares (GLS) regression that is able to handle errors in all variables, is robust against data outliers and uncertainty in the regression model, and can be used with arbitrary distribution models and regression functions. We here report on first results of application of GLS to estimation of the multi-machine scaling law for the energy confinement time in tokamaks, demonstrating improved consistency of the GLS results compared to standard least squares.

  16. Quantitative structure-activity relationship study of antioxidative peptide by using different sets of amino acids descriptors

    NASA Astrophysics Data System (ADS)

    Li, Yao-Wang; Li, Bo; He, Jiguo; Qian, Ping

    2011-07-01

    A database of 214 tripeptides, each containing either a His or a Tyr residue, was used to study quantitative structure-activity relationships (QSAR) of antioxidative tripeptides. Partial least squares regression (PLSR) analysis was conducted separately for each set of amino acid descriptors, including Divided Physico-chemical Property Scores (DPPS), Hydrophobic, Electronic, Steric, and Hydrogen (HESH), Vectors of Hydrophobic, Steric, and Electronic properties (VHSE), Molecular Surface-Weighted Holistic Invariant Molecular (MS-WHIM), isotropic surface area-electronic charge index (ISA-ECI) and Z-scale. The descriptors of the tripeptides served as X-variables, and antioxidant activities measured with the ferric thiocyanate method served as the Y-variable. After elimination of outliers by Hotelling's T2 method and residual analysis, six significant models were obtained describing the entire data set. According to the cumulative squared multiple correlation coefficients (R2), cumulative cross-validation coefficients (Q2) and relative standard deviation for the calibration set (RSDc), the models using DPPS, HESH, ISA-ECI, and VHSE descriptors are of better quality (R2 > 0.6, Q2 > 0.5, RSDc < 0.39) than the models using MS-WHIM and Z-scale descriptors (R2 < 0.6, Q2 < 0.5, RSDc > 0.44). Furthermore, the predictive ability of the models using the DPPS descriptors is the best among the six descriptor systems (cumulative multiple correlation coefficient for the prediction set, R2ext, > 0.7). It was concluded that DPPS describes the amino acids of antioxidative tripeptides best. The DPPS results reveal that the center amino acid and the N-terminal amino acid are far more important than the C-terminal amino acid for antioxidative tripeptides. The hydrophobic (positively related to activity) and electronic (negatively related to activity) properties of the N-terminal amino acid are suggested to contribute most to activity, followed by the hydrogen bond property (positively related to activity) of the center amino acid. The N-terminal amino acid should be a highly hydrophobic and low electronic amino acid (such as Ala, Gly, Val, and Leu); the center amino acid should be one with a high hydrogen bond property (such as the basic amino acids Arg, Lys, and His). The structural characteristics of antioxidative peptides found in this paper may contribute to further research on the antioxidative mechanism.

  17. Retargeted Least Squares Regression Algorithm.

    PubMed

    Zhang, Xu-Yao; Wang, Lingfeng; Xiang, Shiming; Liu, Cheng-Lin

    2015-09-01

    This brief presents a framework of retargeted least squares regression (ReLSR) for multicategory classification. The core idea is to directly learn the regression targets from data rather than using the traditional zero-one matrix as regression targets. The learned target matrix can guarantee a large margin constraint for the requirement of correct classification for each data point. Compared with the traditional least squares regression (LSR) and a recently proposed discriminative LSR model, ReLSR is much more accurate in measuring the classification error of the regression model. Furthermore, ReLSR is a single and compact model, hence there is no need to train two-class (binary) machines that are independent of each other. The convex optimization problem of ReLSR is solved elegantly and efficiently with an alternating procedure including regression and retargeting as substeps. The experimental evaluation over a range of databases demonstrates the validity of our method.

  18. Geodesic least squares regression for scaling studies in magnetic confinement fusion

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Verdoolaege, Geert

    In regression analyses for deriving scaling laws that occur in various scientific disciplines, usually standard regression methods have been applied, of which ordinary least squares (OLS) is the most popular. However, concerns have been raised with respect to several assumptions underlying OLS in its application to scaling laws. We here discuss a new regression method that is robust in the presence of significant uncertainty on both the data and the regression model. The method, which we call geodesic least squares regression (GLS), is based on minimization of the Rao geodesic distance on a probabilistic manifold. We demonstrate the superiority of the method using synthetic data and we present an application to the scaling law for the power threshold for the transition to the high confinement regime in magnetic confinement fusion devices.

  19. Methods for estimating annual exceedance probability discharges for streams in Arkansas, based on data through water year 2013

    USGS Publications Warehouse

    Wagner, Daniel M.; Krieger, Joshua D.; Veilleux, Andrea G.

    2016-08-04

    In 2013, the U.S. Geological Survey initiated a study to update regional skew, annual exceedance probability discharges, and regional regression equations used to estimate annual exceedance probability discharges for ungaged locations on streams in the study area with the use of recent geospatial data, new analytical methods, and available annual peak-discharge data through the 2013 water year. An analysis of regional skew using Bayesian weighted least-squares/Bayesian generalized-least squares regression was performed for Arkansas, Louisiana, and parts of Missouri and Oklahoma. The newly developed constant regional skew of -0.17 was used in the computation of annual exceedance probability discharges for 281 streamgages used in the regional regression analysis. Based on analysis of covariance, four flood regions were identified for use in the generation of regional regression models. Thirty-nine basin characteristics were considered as potential explanatory variables, and ordinary least-squares regression techniques were used to determine the optimum combinations of basin characteristics for each of the four regions. Basin characteristics in candidate models were evaluated based on multicollinearity with other basin characteristics (variance inflation factor < 2.5) and statistical significance at the 95-percent confidence level (p ≤ 0.05). Generalized least-squares regression was used to develop the final regression models for each flood region. Average standard errors of prediction of the generalized least-squares models ranged from 32.76 to 59.53 percent, with the largest range in flood region D. Pseudo coefficients of determination of the generalized least-squares models ranged from 90.29 to 97.28 percent, with the largest range also in flood region D. 
The regional regression equations apply only to locations on streams in Arkansas where annual peak discharges are not substantially affected by regulation, diversion, channelization, backwater, or urbanization. The applicability and accuracy of the regional regression equations depend on the basin characteristics measured for an ungaged location on a stream being within range of those used to develop the equations.

  20. Application of least median of squared orthogonal distance (LMD) and LMD-based reweighted least squares (RLS) methods on the stock-recruitment relationship

    NASA Astrophysics Data System (ADS)

    Wang, Yan-Jun; Liu, Qun

    1999-03-01

    Analysis of stock-recruitment (SR) data is most often done by fitting various SR relationship curves to the data. Fish population dynamics data often have stochastic variations and measurement errors, which usually result in a biased regression analysis. This paper presents a robust regression method, least median of squared orthogonal distance (LMD), which is insensitive to abnormal values in the dependent and independent variables in a regression analysis. Outliers that have significantly different variance from the rest of the data can be identified in a residual analysis. Then, the least squares (LS) method is applied to the SR data with defined outliers being down weighted. The application of LMD and LMD-based Reweighted Least Squares (RLS) method to simulated and real fisheries SR data is explored.

  1. A Weighted Least Squares Approach To Robustify Least Squares Estimates.

    ERIC Educational Resources Information Center

    Lin, Chowhong; Davenport, Ernest C., Jr.

    This study developed a robust linear regression technique based on the idea of weighted least squares. In this technique, a subsample of the full data of interest is drawn, based on a measure of distance, and an initial set of regression coefficients is calculated. The rest of the data points are then taken into the subsample, one after another,…

  2. Validation of Core Temperature Estimation Algorithm

    DTIC Science & Technology

    2016-01-29

    plot of observed versus estimated core temperature with the line of identity (dashed) and the least squares regression line (solid) and line equation...estimated PSI with the line of identity (dashed) and the least squares regression line (solid) and line equation in the top left corner. (b) Bland...for comparison. The root mean squared error (RMSE) was also computed, as given by Equation 2.

  3. Comparing least-squares and quantile regression approaches to analyzing median hospital charges.

    PubMed

    Olsen, Cody S; Clark, Amy E; Thomas, Andrea M; Cook, Lawrence J

    2012-07-01

    Emergency department (ED) and hospital charges obtained from administrative data sets are useful descriptors of injury severity and the burden to EDs and the health care system. However, charges are typically positively skewed due to costly procedures, long hospital stays, and complicated or prolonged treatment for few patients. The median is not affected by extreme observations and is useful in describing and comparing distributions of hospital charges. A least-squares analysis employing a log transformation is one approach for estimating median hospital charges, corresponding confidence intervals (CIs), and differences between groups; however, this method requires certain distributional properties. An alternate method is quantile regression, which allows estimation and inference related to the median without making distributional assumptions. The objective was to compare the log-transformation least-squares method to the quantile regression approach for estimating median hospital charges, differences in median charges between groups, and associated CIs. The authors performed simulations using repeated sampling of observed statewide ED and hospital charges and charges randomly generated from a hypothetical lognormal distribution. The median and 95% CI and the multiplicative difference between the median charges of two groups were estimated using both least-squares and quantile regression methods. Performance of the two methods was evaluated. In contrast to least squares, quantile regression produced estimates that were unbiased and had smaller mean square errors in simulations of observed ED and hospital charges. Both methods performed well in simulations of hypothetical charges that met least-squares method assumptions. When the data did not follow the assumed distribution, least-squares estimates were often biased, and the associated CIs had lower than expected coverage as sample size increased. 
Quantile regression analyses of hospital charges provide unbiased estimates even when lognormal and equal variance assumptions are violated. These methods may be particularly useful in describing and analyzing hospital charges from administrative data sets. © 2012 by the Society for Academic Emergency Medicine.
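
    The two median estimators compared in this abstract can be sketched for the simplest case of a single group with no covariates. The simulated charges and their lognormal parameters are assumptions, not the study's data. With an intercept-only model, the log-transform least-squares estimate reduces to the exponentiated mean of the logs, and median (0.5 quantile) regression reduces to the sample median.

```python
import numpy as np

# Simulated, positively skewed charges (hypothetical lognormal parameters).
rng = np.random.default_rng(42)
charges = rng.lognormal(mean=8.0, sigma=1.0, size=5000)

# Log-transform least squares: fit the mean on the log scale, then
# back-transform. This estimates the median only if the data are lognormal.
log_ls_median = np.exp(np.mean(np.log(charges)))

# Intercept-only quantile regression at tau = 0.5 minimizes the sum of
# absolute deviations, whose solution is the sample median.
sample_median = np.median(charges)

print("log-LS median estimate:", log_ls_median)
print("sample median:         ", sample_median)
print("mean (pulled up by the skew):", charges.mean())
```

    Both estimators agree here because the simulated data satisfy the lognormal assumption; the abstract's point is that only the quantile-based estimate remains unbiased when that assumption fails.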

  4. Analyzing Multilevel Data: Comparing Findings from Hierarchical Linear Modeling and Ordinary Least Squares Regression

    ERIC Educational Resources Information Center

    Rocconi, Louis M.

    2013-01-01

    This study examined the differing conclusions one may come to depending upon the type of analysis chosen, hierarchical linear modeling or ordinary least squares (OLS) regression. To illustrate this point, this study examined the influences of seniors' self-reported critical thinking abilities three ways: (1) an OLS regression with the student…

  5. Least median of squares and iteratively re-weighted least squares as robust linear regression methods for fluorimetric determination of α-lipoic acid in capsules in ideal and non-ideal cases of linearity.

    PubMed

    Korany, Mohamed A; Gazy, Azza A; Khamis, Essam F; Ragab, Marwa A A; Kamal, Miranda F

    2018-06-01

    This study outlines two robust regression approaches, namely least median of squares (LMS) and iteratively re-weighted least squares (IRLS) to investigate their application in instrument analysis of nutraceuticals (that is, fluorescence quenching of merbromin reagent upon lipoic acid addition). These robust regression methods were used to calculate calibration data from the fluorescence quenching reaction (∆F and F-ratio) under ideal or non-ideal linearity conditions. For each condition, data were treated using three regression fittings: Ordinary Least Squares (OLS), LMS and IRLS. Assessment of linearity, limits of detection (LOD) and quantitation (LOQ), accuracy and precision were carefully studied for each condition. LMS and IRLS regression line fittings showed significant improvement in correlation coefficients and all regression parameters for both methods and both conditions. In the ideal linearity condition, the intercept and slope changed insignificantly, but a dramatic change was observed for the non-ideal condition and linearity intercept. Under both linearity conditions, LOD and LOQ values after the robust regression line fitting of data were lower than those obtained before data treatment. The results obtained after statistical treatment indicated that the linearity ranges for drug determination could be expanded to lower limits of quantitation by enhancing the regression equation parameters after data treatment. Analysis results for lipoic acid in capsules, using both fluorimetric methods, treated by parametric OLS and after treatment by robust LMS and IRLS were compared for both linearity conditions. Copyright © 2018 John Wiley & Sons, Ltd.
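
    The IRLS idea can be sketched for a straight-line calibration as below. The abstract does not name the weight function or scale estimate used, so Huber weights with a MAD-based scale are assumed here, and the calibration points (including the single outlier) are made up.

```python
import numpy as np

def irls_line(x, y, tuning=1.345, iterations=50):
    """Fit y = b0 + b1 * x by IRLS with Huber weights (assumed weighting)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # start from OLS
    for _ in range(iterations):
        residuals = y - X @ beta
        # Robust scale from the median absolute deviation.
        scale = np.median(np.abs(residuals - np.median(residuals))) / 0.6745
        if scale == 0.0:
            break
        u = np.abs(residuals) / (tuning * scale)
        weights = 1.0 / np.maximum(u, 1.0)  # 1 for small residuals, <1 otherwise
        sw = np.sqrt(weights)
        new_beta = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)[0]
        converged = np.allclose(new_beta, beta, rtol=1e-10, atol=1e-12)
        beta = new_beta
        if converged:
            break
    return beta

# Made-up calibration data on y = 3 + 2x with one gross outlier.
x = np.arange(10, dtype=float)
y = 3.0 + 2.0 * x + 0.1 * (-1.0) ** np.arange(10)
y[9] += 30.0

ols_intercept, ols_slope = np.linalg.lstsq(
    np.column_stack([np.ones_like(x), x]), y, rcond=None)[0]
irls_intercept, irls_slope = irls_line(x, y)
print("OLS slope: ", ols_slope)
print("IRLS slope:", irls_slope)
```

    Each pass refits a weighted least squares line with weights derived from the previous residuals, so points that fit poorly lose influence gradually instead of being deleted.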

  6. Methods for estimating the magnitude and frequency of peak streamflows at ungaged sites in and near the Oklahoma Panhandle

    USGS Publications Warehouse

    Smith, S. Jerrod; Lewis, Jason M.; Graves, Grant M.

    2015-09-28

    Generalized-least-squares multiple-linear regression analysis was used to formulate regression relations between peak-streamflow frequency statistics and basin characteristics. Contributing drainage area was the only basin characteristic determined to be statistically significant for all percentage of annual exceedance probabilities and was the only basin characteristic used in regional regression equations for estimating peak-streamflow frequency statistics on unregulated streams in and near the Oklahoma Panhandle. The regression model pseudo-coefficient of determination, converted to percent, for the Oklahoma Panhandle regional regression equations ranged from about 38 to 63 percent. The standard errors of prediction and the standard model errors for the Oklahoma Panhandle regional regression equations ranged from about 84 to 148 percent and from about 76 to 138 percent, respectively. These errors were comparable to those reported for regional peak-streamflow frequency regression equations for the High Plains areas of Texas and Colorado. The root mean square errors for the Oklahoma Panhandle regional regression equations (ranging from 3,170 to 92,000 cubic feet per second) were less than the root mean square errors for the Oklahoma statewide regression equations (ranging from 18,900 to 412,000 cubic feet per second); therefore, the Oklahoma Panhandle regional regression equations produce more accurate peak-streamflow statistic estimates for the irrigated period of record in the Oklahoma Panhandle than do the Oklahoma statewide regression equations. The regression equations developed in this report are applicable to streams that are not substantially affected by regulation, impoundment, or surface-water withdrawals. These regression equations are intended for use for stream sites with contributing drainage areas less than or equal to about 2,060 square miles, the maximum value for the independent variable used in the regression analysis.

  7. Prediction of iron oxide contents using diffuse reflectance spectroscopy

    NASA Astrophysics Data System (ADS)

    Marques, José, Jr.; Arantes Camargo, Livia

    2015-04-01

    Determining soil iron oxides using conventional analysis is relatively unfeasible when large areas are mapped with the aim of characterizing spatial variability. Diffuse reflectance spectroscopy (DRS) is rapid, less expensive, non-destructive and sometimes more accurate than conventional analysis. Furthermore, this technique allows the simultaneous characterization of many soil attributes with agronomic and environmental relevance. This study aims to assess the capability of DRS to predict the content of the iron oxides hematite and goethite, characterizing their spatial variability in soils of Brazil. Soil samples collected from an 800-hectare area were scanned in the visible and near-infrared spectral range. Chemometric calibration was obtained through partial least-squares regression (PLSR). Spatial distribution maps of the attributes were then constructed from the values predicted by the calibrated models using geostatistical methods. The studied area presented soils with varied contents of iron oxides, for example Oxisols and Entisols. In the spectra of each soil, reflectance decreases with the content of iron oxides present in the soil. In soils with a high content of iron oxides, more pronounced concavities can be observed between 380 and 1100 nm, which are characteristic of the presence of these oxides. In soils with higher reflectance, concavities characteristic of kaolinite were observed, in agreement with the low iron contents of those soils. The best accuracy of prediction models [residual prediction deviation (RPD) = 1.7] was obtained for goethite within the visible region (380-800 nm), and for hematite (RPD = 2.0) within the visible near infrared (380-2300 nm). The predicted maps of goethite and hematite showed a spatial distribution pattern similar to the maps of clay and of iron extracted by dithionite-citrate-bicarbonate, consistent with the iron oxide contents of the soils present in the study area. These results confirm the value of DRS in the mapping of iron oxides in large areas at detailed scale.

  8. Land-use versus natural controls on soil fertility in the Subandean Amazon, Peru.

    PubMed

    Lindell, Lina; Aström, Mats; Oberg, Tomas

    2010-01-15

    Deforestation to expand the agricultural frontier is a serious threat to the Amazon forest. Strategies to attain and maintain satisfactory soil fertility, which require knowledge of spatial and temporal changes caused by land use, are important for reaching sustainable development. This study highlights these issues by evaluating the relative effects of agricultural land use and natural factors on the chemical fertility of Inceptisols on redbed lithologies in the Subandean Amazon. Macro and micronutrients were determined in topsoil and subsoil in the vicinity of two villages at a total of 80 sites including pastures, coffee plantations, swidden fields, secondary forest and, as a reference, adjacent primary forest. Differences in soil fertility between the land cover classes were investigated by principal component analysis (PCA) and partial least squares regression (PLSR). Primary forest soil was found to be chemically similar to that of coffee plantations, pastures and secondary forests. There were no significant differences between soils of these land cover types in terms of plant nutrients (e.g. N, P, K, Ca, Mg, Mo, Mn, Zn, Cu and Co) or other fertility indicators (OM, pH, BS, EC, CECe and exchangeable acidity). The parent material (as indicated by texture and sample geographical origin) and the slope of the sampled sites were stronger controls on soil fertility than land cover type. Elevated concentrations of a few nutrients (NO3 and K) were, however, detected in soils of swidden fields. Despite being more fertile (higher CECe, Ca and P) than Oxisols and Ultisols in the Amazon lowland, the Subandean soils frequently showed deficiencies in several nutrients (e.g. P, K, NO3, Cu and Zn), and high levels of free Al at acidic sites. This paper concludes that deforestation and agricultural land use have not introduced lasting chemical changes in the studied Subandean soils that are significant in comparison to the natural variability. Copyright 2009 Elsevier B.V. All rights reserved.

  9. Monitoring Powdery Mildew of Winter Wheat by Using Moderate Resolution Multi-Temporal Satellite Imagery

    PubMed Central

    Zhang, Jingcheng; Pu, Ruiliang; Yuan, Lin; Wang, Jihua; Huang, Wenjiang; Yang, Guijun

    2014-01-01

    Powdery mildew is one of the most serious diseases that have a significant impact on the production of winter wheat. As an effective alternative to traditional sampling methods, remote sensing can be a useful tool in disease detection. This study attempted to use multi-temporal moderate resolution satellite-based data of surface reflectances in blue (B), green (G), red (R) and near infrared (NIR) bands from HJ-CCD (CCD sensor on Huanjing satellite) to monitor disease at a regional scale. In a suburban area in Beijing, China, an extensive field campaign for disease intensity survey was conducted at key growth stages of winter wheat in 2010. Meanwhile, corresponding time series of HJ-CCD images were acquired over the study area. In this study, a number of single-stage and multi-stage spectral features, which were sensitive to powdery mildew, were selected by using an independent t-test. With the selected spectral features, four advanced methods (Mahalanobis distance, maximum likelihood classifier, partial least squares regression (PLSR) and mixture tuned matched filtering (MTMF)) were tested and evaluated for their performance in disease mapping. The experimental results showed that all four algorithms could generate disease maps with a generally correct distribution pattern of powdery mildew at the grain filling stage (Zadoks 72). However, comparison of these disease maps with ground survey data (validation samples) showed that all four algorithms also produced a variable degree of error in estimating the disease occurrence and severity. Further, we found that the integration of the MTMF and PLSR algorithms could result in a significant accuracy improvement in identifying and determining the disease intensity (overall accuracy increased from 72% to 78% and the kappa coefficient from 0.49 to 0.59). The experimental results also demonstrated that multi-temporal satellite images have great potential in crop disease mapping at a regional scale. PMID:24691435
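
    Illustrative sketch (not taken from the study): the two map-accuracy figures quoted in this abstract, overall accuracy and the kappa coefficient, are both derived from a confusion matrix of mapped versus ground-surveyed classes. The code below shows the standard definitions, assuming a square matrix with reference classes in rows and mapped classes in columns. The function names and the example counts are hypothetical and do not reproduce the paper's data.

```python
def overall_accuracy(confusion):
    """Fraction of validation samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def cohen_kappa(confusion):
    """Cohen's kappa: agreement beyond what the row/column totals predict by chance."""
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    row_totals = [sum(confusion[i]) for i in range(k)]
    col_totals = [sum(confusion[i][j] for i in range(k)) for j in range(k)]
    observed = overall_accuracy(confusion)
    expected = sum(row_totals[i] * col_totals[i] for i in range(k)) / (total * total)
    return (observed - expected) / (1.0 - expected)


# Hypothetical two-class example (healthy vs. diseased), 100 validation samples.
example = [[40, 10],
           [10, 40]]
example_accuracy = overall_accuracy(example)
example_kappa = cohen_kappa(example)
```

    Because kappa subtracts the agreement expected by chance, it is lower than the overall accuracy for the same map, which is why the abstract reports the two figures side by side.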

  10. Estimating heavy metal concentrations in topsoil from vegetation reflectance spectra of Hyperion images: A case study of Yushu County, Qinghai, China.

    PubMed

    Yang, Ling Yu; Gao, Xiao Hong; Zhang, Wei; Shi, Fei Fei; He, Lin Hua; Jia, Wei

    2016-06-01

    In this study, we explored the feasibility of estimating soil heavy metal concentrations using hyperspectral satellite imagery. The concentrations of As, Pb, Zn and Cd in 48 topsoil samples collected from the field in Yushu County of the Sanjiangyuan region were measured in the laboratory. We then extracted 176 vegetation spectral reflectance bands for the 48 soil samples, as well as five vegetation indices, from two Hyperion images. Following that, the partial least squares regression (PLSR) method was employed to estimate the soil heavy metal concentrations from these two independent sets of Hyperion-derived variables. Two estimation models were constructed separately: one between the 176 vegetation spectral reflectance bands and the soil heavy metal concentrations (the vegetation spectral reflectance-based estimation model), and one between the five vegetation indices, used as independent variables, and the soil heavy metal concentrations (the synthetic vegetation index-based estimation model). Using RPD (the ratio of the standard deviation of the measured values of each of the 4 heavy metals in the validation samples to the RMSE) as the validation criterion, the RPDs of As and Pb concentrations from the two models were both less than 1.4, which suggested that neither model was capable of even roughly estimating As and Pb concentrations; whereas the RPDs of Zn and Cd were 1.53, 1.46 and 1.46, 1.42, respectively, which implied that both models had the ability for rough estimation of Zn and Cd concentrations. Based on those results, the vegetation spectral reflectance-based estimation model was selected to obtain the spatial distribution map of Zn concentration in combination with the Hyperion image. The estimated Zn map showed that the zones with high Zn concentrations were distributed near provincial road 308, national road 214 and towns, which could be influenced by human activities. Our study proved that the spectral reflectance of Hyperion images was useful in estimating the soil concentrations of Zn and Cd.
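
    Illustrative sketch (not taken from the study): the RPD criterion defined in this abstract is the standard deviation of the measured validation values divided by the RMSE of the predictions. The code below computes it with the Python standard library. The use of the sample standard deviation (n - 1 denominator) is an assumption, since the abstract does not say which form was used; the function names and data are hypothetical, and the 1.4 cut-off is the one quoted in the abstract.

```python
import math
from statistics import stdev


def rmse(measured, predicted):
    """Root mean square error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)


def rpd(measured, predicted):
    """Ratio of the standard deviation of the measured values to the RMSE.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    return stdev(measured) / rmse(measured, predicted)


def supports_rough_estimation(rpd_value, threshold=1.4):
    """Apply the cut-off quoted in the abstract: below 1.4 is not usable."""
    return rpd_value >= threshold


# Hypothetical validation data: every prediction is off by 0.5 units.
measured_demo = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted_demo = [1.5, 2.5, 3.5, 4.5, 5.5]
rpd_demo = rpd(measured_demo, predicted_demo)
```

    Under this reading, the reported Zn and Cd values (1.42 to 1.53) fall just above the cut-off, while the As and Pb values fall below it.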

  11. Monitoring and evaluating the quality consistency of Compound Bismuth Aluminate tablets by a simple quantified ratio fingerprint method combined with simultaneous determination of five compounds and correlated with antioxidant activities.

    PubMed

    Liu, Yingchun; Liu, Zhongbo; Sun, Guoxiang; Wang, Yan; Ling, Junhong; Gao, Jiayue; Huang, Jiahao

    2015-01-01

    A combination method of multi-wavelength fingerprinting and multi-component quantification by high performance liquid chromatography (HPLC) coupled with diode array detector (DAD) was developed and validated to monitor and evaluate the quality consistency of herbal medicines (HM) in the classical preparation Compound Bismuth Aluminate tablets (CBAT). The validation results demonstrated that our method met the requirements of fingerprint analysis and quantification analysis with suitable linearity, precision, accuracy, limits of detection (LOD) and limits of quantification (LOQ). In the fingerprint assessments, rather than using conventional qualitative "Similarity" as a criterion, the simple quantified ratio fingerprint method (SQRFM) was recommended, which has an important quantified fingerprint advantage over the "Similarity" approach. SQRFM qualitatively and quantitatively offers the scientific criteria for traditional Chinese medicines (TCM)/HM quality pyramid and warning gate in terms of three parameters. In order to combine the comprehensive characterization of multi-wavelength fingerprints, an integrated fingerprint assessment strategy based on information entropy was set up involving a super-information characteristic digitized parameter of fingerprints, which reveals the total entropy value and absolute information amount about the fingerprints and, thus, offers an excellent method for fingerprint integration. The correlation results between quantified fingerprints and quantitative determination of 5 marker compounds, including glycyrrhizic acid (GLY), liquiritin (LQ), isoliquiritigenin (ILG), isoliquiritin (ILQ) and isoliquiritin apioside (ILA), indicated that multi-component quantification could be replaced by quantified fingerprints. The Fenton reaction was employed to determine the antioxidant activities of CBAT samples in vitro, and they were correlated with HPLC fingerprint components using the partial least squares regression (PLSR) method. 
In summary, the method of multi-wavelength fingerprints combined with antioxidant activities has been proved to be a feasible and scientific procedure for monitoring and evaluating the quality consistency of CBAT.

  12. Imaging spectroscopy calibration and applications for coastal wetland species composition and biomass mapping in the Mississippi Delta

    NASA Astrophysics Data System (ADS)

    Jensen, D.; Cavanaugh, K. C.; Simard, M.

    2016-12-01

    Coastal wetlands provide a wealth of ecosystem services, including improved water quality, protection from storm surges, and wildlife habitat. Louisiana's wetlands, however, are threatened by development, pollution, and relative sea level rise (RSLR)—the combination of sea level rise and subsidence rates. Beyond causing land loss, RSLR impacts Louisiana's wetland ecosystems by altering salinity, nutrient availability, flood duration, and flood frequency in the region. Despite widespread wetland loss, areas such as the Wax Lake and Atchafalaya river deltas are in fact growing due to their sediment loads, resulting in a complex of both degradation and aggradation along the Louisiana coast. In order to understand and model how coastal wetlands are responding to RSLR, there is a need for improved vegetation distribution mapping, biomass estimation, and ecosystem change modeling. To this end, high-resolution imaging spectroscopy offers the ability to accurately develop species-level distribution maps and predictive aboveground biomass (AGB) models. AVIRIS-NG data collected over the Atchafalaya River Delta were calibrated to reduce Bidirectional Reflectance Distribution Function (BRDF) effects and mosaicked, along with other scenes that coincided with field observations. Multiple Endmember Spectral Mixture Analysis (MESMA) was used to map salt marsh at the species level across our study area. Field observations were used to parameterize and validate our MESMA based approach. AGB was then mapped for this region using a partial least squares regression (PLSR) model developed from the same imagery and field measurements. Last, the Sea Level Affecting Marshes Model was applied to predict wetland loss and changes in marsh composition due to sea level rise, which was then paired with the AGB map to estimate carbon storage change. In doing so, this study addresses key concerns for coastal regions and demonstrates the ability of imaging spectroscopy to predict those impacts.

  13. Cellulose, Chitosan and Keratin Composite Materials: Facile and Recyclable Synthesis, Conformation and Properties.

    PubMed

    Tran, Chieu D; Mututuvari, Tamutsiwa M

    2016-03-07

    A method was developed in which cellulose (CEL) and/or chitosan (CS) were added to keratin (KER) to give the resulting [CEL/CS+KER] composites better mechanical strength and wider utilization. Butylmethylimidazolium chloride ([BMIm⁺Cl⁻]), an ionic liquid, was used as the sole solvent, and because the majority of the [BMIm⁺Cl⁻] used (at least 88%) was recovered, the method is green and recyclable. FTIR, XRD, ¹³C CP-MAS NMR and SEM results confirm that KER, CS and CEL remain chemically intact and distributed homogeneously in the composites. We successfully demonstrate that the widely used method based on the deconvolution of the FTIR bands of amide bonds to determine the secondary structure of proteins is relatively subjective, as the conformation obtained is strongly dependent on the choice of parameters selected for curve fitting. A new method, based on partial least squares regression (PLSR) analysis of the amide bands, was developed and proven to be objective and to provide more accurate information. Results obtained with this method agree well with those by XRD, namely they indicate that although KER retains its secondary structure when incorporated into the [CEL+CS] composites, it has relatively lower α-helix, higher β-turn and random form compared to that of the KER in native wool. It seems that during dissolution by [BMIm⁺Cl⁻], the inter- and intramolecular forces in KER were broken, thereby destroying its secondary structure. During regeneration, these interactions were reestablished to partially reform the secondary structure. However, in the presence of either CEL or CS, the chains seem to prefer the extended form, thereby hindering reformation of the α-helix. Consequently, the KER in these matrices may adopt structures with lower content of α-helix and higher β-sheet. As anticipated, results of tensile strength and TGA confirm that adding CEL or CS into KER substantially increases the mechanical strength and thermal stability of the [CS/CEL+KER] composites.

  14. Modeling the hydrological impacts of land use/land cover changes in the Andassa watershed, Blue Nile Basin, Ethiopia.

    PubMed

    Gashaw, Temesgen; Tulu, Taffa; Argaw, Mekuria; Worqlul, Abeyou W

    2018-04-01

    Understanding the hydrological response of a watershed to land use/land cover (LULC) changes is imperative for water resources management planning. The objective of this study was to analyze the hydrological impacts of LULC changes in the Andassa watershed for a period of 1985-2015 and to predict the LULC change impact on the hydrological status in year 2045. The hybrid land use classification technique for classifying Landsat images (1985, 2000 and 2015); Cellular-Automata Markov (CA-Markov) for prediction of the 2030 and 2045 LULC states; the Soil and Water Assessment Tool (SWAT) for hydrological modeling were employed in the analyses. In order to isolate the impacts of LULC changes, the LULC maps were used independently while keeping the other SWAT inputs constant. The contribution of each of the LULC classes was examined with the Partial Least Squares Regression (PLSR) model. The results showed that there was a continuous expansion of cultivated land and built-up area, and withdrawing of forest, shrubland and grassland during the 1985-2015 periods, which are expected to continue in the 2030 and 2045 periods. The LULC changes, which had occurred during the period of 1985 to 2015, had increased the annual flow (2.2%), wet seasonal flow (4.6%), surface runoff (9.3%) and water yield (2.4%). Conversely, the observed changes had reduced dry season flow (2.8%), lateral flow (5.7%), groundwater flow (7.8%) and ET (0.3%). The 2030 and 2045 LULC states are expected to further increase the annual and wet season flow, surface runoff and water yield, and reduce dry season flow, groundwater flow, lateral flow and ET. The change in hydrological components is a direct result of the significant transition from the vegetation to non-vegetation cover in the watershed. This suggests an urgent need to regulate the LULC in order to maintain the hydrological balance. Copyright © 2017 Elsevier B.V. All rights reserved.

  15. Monitoring powdery mildew of winter wheat by using moderate resolution multi-temporal satellite imagery.

    PubMed

    Zhang, Jingcheng; Pu, Ruiliang; Yuan, Lin; Wang, Jihua; Huang, Wenjiang; Yang, Guijun

    2014-01-01

    Powdery mildew is one of the most serious diseases that have a significant impact on the production of winter wheat. As an effective alternative to traditional sampling methods, remote sensing can be a useful tool in disease detection. This study attempted to use multi-temporal moderate resolution satellite-based data of surface reflectances in blue (B), green (G), red (R) and near infrared (NIR) bands from HJ-CCD (CCD sensor on Huanjing satellite) to monitor disease at a regional scale. In a suburban area in Beijing, China, an extensive field campaign for disease intensity survey was conducted at key growth stages of winter wheat in 2010. Meanwhile, corresponding time series of HJ-CCD images were acquired over the study area. In this study, a number of single-stage and multi-stage spectral features, which were sensitive to powdery mildew, were selected by using an independent t-test. With the selected spectral features, four advanced methods (Mahalanobis distance, maximum likelihood classifier, partial least squares regression (PLSR) and mixture tuned matched filtering (MTMF)) were tested and evaluated for their performance in disease mapping. The experimental results showed that all four algorithms could generate disease maps with a generally correct distribution pattern of powdery mildew at the grain filling stage (Zadoks 72). However, comparison of these disease maps with ground survey data (validation samples) showed that all four algorithms also produced a variable degree of error in estimating the disease occurrence and severity. Further, we found that the integration of the MTMF and PLSR algorithms could result in a significant accuracy improvement in identifying and determining the disease intensity (overall accuracy increased from 72% to 78% and the kappa coefficient from 0.49 to 0.59). The experimental results also demonstrated that multi-temporal satellite images have great potential in crop disease mapping at a regional scale.

  16. Imaging volcanic CO2 and SO2

    NASA Astrophysics Data System (ADS)

    Gabrieli, A.; Wright, R.; Lucey, P. G.; Porter, J. N.

    2017-12-01

    Detecting and quantifying volcanic carbon dioxide (CO2) and sulfur dioxide (SO2) emissions is of relevance to volcanologists. Changes in the amount and composition of gases that volcanoes emit are related to subsurface magma movements and the probability of eruptions. Volcanic gases and related acidic aerosols are also an important atmospheric pollution source that creates environmental health hazards for people, animals, plants, and infrastructure. For these reasons, it is important to measure emissions from volcanic plumes during both day and night. We present image measurements of the volcanic plume at Kīlauea volcano, HI, and flux derivation, using a newly developed 8-14 μm hyperspectral imaging spectrometer, the Thermal Hyperspectral Imager (THI). THI acquires images of the scene it views, from which a spectrum can be derived for each pixel. Each spectrum contains 50 wavelength samples between 8 and 14 μm, where CO2 and SO2 volcanic gases have diagnostic absorption/emission features respectively at 8.6 and 14 μm. Plume radiance measurements were carried out during both the day and the night by using the lava lake in the Halema'uma'u crater as a hot source and the sky as a cold background to detect respectively the spectral signatures of volcanic CO2 and SO2 gases. CO2 and SO2 path-concentrations were then obtained from the spectral radiance measurements using a new Partial Least Squares Regression (PLSR)-based inversion algorithm, which was developed as part of this project. Volcanic emission fluxes were determined by combining the path measurements with wind observations derived directly from the images. Time series of volcanic emission fluxes several hours long will be presented, and the SO2 conversion rates into aerosols will be discussed. The imaging and inversion techniques discussed here are novel, allowing for continuous CO2 and SO2 plume mapping during both day and night.

  17. Remotely estimating photosynthetic capacity, and its response to temperature, in vegetation canopies using imaging spectroscopy

    DOE PAGES

    Serbin, Shawn P.; Singh, Aditya; Desai, Ankur R.; ...

    2015-06-11

    To date, the utility of ecosystem and Earth system models (EESMs) has been limited by poor spatial and temporal representation of critical input parameters. For example, EESMs often rely on leaf-scale or literature-derived estimates for a key determinant of canopy photosynthesis, the maximum velocity of RuBP carboxylation (Vcmax, μmol m⁻² s⁻¹). Our recent work (Ainsworth et al., 2014; Serbin et al., 2012) showed that reflectance spectroscopy could be used to estimate Vcmax at the leaf level. Here, we present evidence that imaging spectroscopy data can be used to simultaneously predict Vcmax and its sensitivity to temperature (EV) at the canopy scale. In 2013 and 2014, high-altitude Airborne Visible/Infrared Imaging Spectroscopy (AVIRIS) imagery and contemporaneous ground-based assessments of canopy structure and leaf photosynthesis were acquired across an array of monospecific agroecosystems in central and southern California, USA. A partial least-squares regression (PLSR) modeling approach was employed to characterize the pixel-level variation in canopy Vcmax (at a standardized canopy temperature of 30 °C) and EV, based on visible and shortwave infrared AVIRIS spectra (414–2447 nm). Our approach yielded parsimonious models with strong predictive capability for Vcmax (at 30 °C) and EV (R² of withheld data = 0.94 and 0.92, respectively), both of which varied substantially in the field (≥ 1.7 fold) across the sampled crop types. The models were applied to additional AVIRIS imagery to generate maps of Vcmax and EV, as well as their uncertainties, for agricultural landscapes in California. The spatial patterns exhibited in the maps were consistent with our in-situ observations. As a result, these findings highlight the considerable promise of airborne and, by implication, space-borne imaging spectroscopy, such as the proposed HyspIRI mission, to map spatial and temporal variation in key drivers of photosynthetic metabolism in terrestrial vegetation.

  18. Exploring a volatomic-based strategy for a fingerprinting approach of Vaccinium padifolium L. berries at different ripening stages.

    PubMed

    Porto-Figueira, Priscilla; Figueira, José A; Berenguer, Pedro; Câmara, José S

    2018-04-15

    The effect of ripening on the evolution of the volatomic pattern from endemic Vaccinium padifolium L. (Uveira) berries was investigated using headspace-solid phase microextraction (HS-SPME) followed by gas chromatography/quadrupole-mass spectrometry (GC-qMS) and multivariate statistical analysis (MVA). The most significant HS-SPME parameters, namely fibre polymer, ionic strength and extraction time, were optimized in order to improve extraction efficiency. Under optimal experimental conditions (DVB/CAR/PDMS fibre coating, 40 °C, 30 min extraction time and 5 g of sample amount), a total of 72 volatiles of different functionalities were isolated and identified. Terpenes followed by higher alcohols and esters were the predominant classes in the ripening stages - green, break and ripe. Although significant differences in the volatomic profiles at the three stages were obtained, cis-β-ocimene (2.0-40.0%), trans-2-hexenol (2.4-19.4%), cis-3-hexenol (2.5-16.4%), β-myrcene (1.9-13.8%), 1-hexanol (1.7-13.6%), 2-hexenal (0.7-8.0%), 2-heptanone (0.7-7.7%), and linalool (1.9-6.1%) were the main volatile compounds identified. Higher alcohols, carboxylic acids and ketones gradually increased during ripening, whereas monoterpenes significantly decreased. These trends were dominated by the higher alcohols (1-hexanol, cis-3-hexenol, trans-2-hexenol) and monoterpenes (β-myrcene, cis-β-ocimene and trans-β-ocimene). Partial least squares regression (PLSR) revealed that ethyl caprylate (1.000), trans-geraniol (0.995), ethyl isovalerate (-0.994) and benzyl carbinol (0.993) are the key variables that most contributed to the successful differentiation of Uveira berries according to ripening stage. To the best of our knowledge, no study has been carried out on the volatomic composition of berries from endemic Uveira. Copyright © 2017 Elsevier Ltd. All rights reserved.

  19. A Cross-Lingual Similarity Measure for Detecting Biomedical Term Translations

    PubMed Central

    Bollegala, Danushka; Kontonatsios, Georgios; Ananiadou, Sophia

    2015-01-01

    Bilingual dictionaries for technical terms such as biomedical terms are an important resource for machine translation systems as well as for humans who would like to understand a concept described in a foreign language. Often a biomedical term is first proposed in English and later it is manually translated to other languages. Despite the fact that there are large monolingual lexicons of biomedical terms, only a fraction of those term lexicons are translated to other languages. Manually compiling large-scale bilingual dictionaries for technical domains is a challenging task because it is difficult to find a sufficiently large number of bilingual experts. We propose a cross-lingual similarity measure for detecting most similar translation candidates for a biomedical term specified in one language (source) from another language (target). Specifically, a biomedical term in a language is represented using two types of features: (a) intrinsic features that consist of character n-grams extracted from the term under consideration, and (b) extrinsic features that consist of unigrams and bigrams extracted from the contextual windows surrounding the term under consideration. We propose a cross-lingual similarity measure using each of those feature types. First, to reduce the dimensionality of the feature space in each language, we propose prototype vector projection (PVP)—a non-negative lower-dimensional vector projection method. Second, we propose a method to learn a mapping between the feature spaces in the source and target language using partial least squares regression (PLSR). The proposed method requires only a small number of training instances to learn a cross-lingual similarity measure. The proposed PVP method outperforms popular dimensionality reduction methods such as the singular value decomposition (SVD) and non-negative matrix factorization (NMF) in a nearest neighbor prediction task. 
Moreover, our experimental results covering several language pairs such as English–French, English–Spanish, English–Greek, and English–Japanese show that the proposed method outperforms several other feature projection methods in biomedical term translation prediction tasks. PMID:26030738

  20. Spatial distribution of heterocyclic organic matter compounds at macropore surfaces in Bt-horizons

    NASA Astrophysics Data System (ADS)

    Leue, Martin; Eckhardt, Kai-Uwe; Gerke, Horst H.; Ellerbrock, Ruth H.; Leinweber, Peter

    2017-04-01

    The illuvial Bt-horizon of Luvisols is characterized by coatings of clay and organic matter (OM) at the surfaces of cracks, biopores and inter-aggregate spaces. The OM composition of the coatings that originate from preferential transport of suspended matter in macropores determines the physico-chemical properties of the macropore surfaces. The analysis of the spatial distribution of specific OM components such as heterocyclic N-compounds (NCOMP) and benzonitrile and naphthalene (BN+NA) could enlighten the effect of macropore coatings on the transport of colloids and reactive solutes during preferential flow and on OM turnover processes in subsoils. The objective was to characterize the mm-to-cm scale spatial distribution of NCOMP and BN+NA at intact macropore surfaces from the Bt-horizons of two Luvisols developed on loess and glacial till. In material manually separated from macropore surfaces the proportions of NCOMP and BN+NA were determined by pyrolysis-field ionization mass spectrometry (Py-FIMS). These OM compounds, likely originating from combustion residues, were found increased in crack coatings and pinhole fillings but decreased in biopore walls (worm burrows and root channels). The Py-FIMS data were correlated with signals from C=O and C=C groups and with signals from O-H groups of clay minerals as determined by Fourier transform infrared spectroscopy in diffuse reflectance mode (DRIFT). Intensive signals of C15 to C17 alkanes from long-chain alkenes as main components of diesel and diesel exhaust particulates substantiated the assumption that burning residues were prominent in the subsoil OM. The spatial distribution of NCOMP and BN+NA along the macropores was predicted by partial least squares regression (PLSR) using DRIFT mapping spectra from intact surfaces and was found closely related to the distribution of crack coatings and pinholes. The results emphasize the importance of clay coatings in the subsoil to OM sorption and stabilization. 
Differences between biopores and cracks suggest differences in the mass transport and OM turnover between these macropore types in Luvisols.

  1. Spectroscopic determination of leaf traits using infrared spectra

    NASA Astrophysics Data System (ADS)

    Buitrago, Maria F.; Groen, Thomas A.; Hecker, Christoph A.; Skidmore, Andrew K.

    2018-07-01

    Leaf traits characterise and differentiate single species but can also be used for monitoring vegetation structure and function. Conventional methods to measure leaf traits, especially at the molecular level (e.g. water, lignin and cellulose content), are expensive and time-consuming. Spectroscopic methods to estimate leaf traits can provide an alternative approach. In this study, we investigated high spectral resolution (6612 bands) emissivity measurements from the short to the long wave infrared (1.4-16.0 μm) of leaves from 19 different plant species ranging from herbaceous to woody, and from temperate to tropical types. At the same time, we measured 14 leaf traits to characterise a leaf, including chemical (e.g., leaf water content, nitrogen, cellulose) and physical features (e.g., leaf area and leaf thickness). We fitted partial least squares regression (PLSR) models across the SWIR, MWIR and LWIR for each leaf trait. Then, reduced models (PLSRred) were derived by iteratively reducing the number of bands in the model (using a modified Jackknife resampling method with a Martens and Martens uncertainty test) down to a few bands (4-10 bands) that contribute the most to the variation of the trait. Most leaf traits could be determined from infrared data with a moderate accuracy (65 < Rcv2 < 77% for observed versus predicted plots) based on PLSRred models, while the accuracy using the whole infrared range (6612 bands) presented higher accuracies, 74 < Rcv2 < 90%. Using the full SWIR range (1.4-2.5 μm) shows similarly high accuracies compared to the whole infrared. Leaf thickness, leaf water content, cellulose, lignin and stomata density are the traits that could be estimated most accurately from infrared data (with Rcv2 above 0.80 for the full range models). Leaf thickness, cellulose and lignin were predicted with reasonable accuracy from a combination of single infrared bands. 
Nevertheless, for all leaf traits, a combination of a few bands yields moderate to accurate estimations.

  2. A regional method for craniofacial reconstruction based on coordinate adjustments and a new fusion strategy.

    PubMed

    Deng, Qingqiong; Zhou, Mingquan; Wu, Zhongke; Shui, Wuyang; Ji, Yuan; Wang, Xingce; Liu, Ching Yiu Jessica; Huang, Youliang; Jiang, Haiyan

    2016-02-01

    Craniofacial reconstruction recreates a facial outlook from the cranium based on the relationship between the face and the skull to assist identification. But craniofacial structures are very complex, and this relationship is not the same in different craniofacial regions. Several regional methods have recently been proposed. These methods segment the face and skull into regions and learn the relationship of each region independently; facial regions for a given skull are then estimated and finally glued together to generate a face. Most of these regional methods use vertex coordinates to represent the regions, and they define a uniform coordinate system for all of the regions. Consequently, the inconsistency in the positions of regions between different individuals is not eliminated before learning the relationships between the face and skull regions, and this reduces the accuracy of the craniofacial reconstruction. In order to solve this problem, an improved regional method is proposed in this paper involving two types of coordinate adjustments. One is the global coordinate adjustment performed on the skulls and faces to eliminate the inconsistency of position and pose of the heads; the other is the local coordinate adjustment performed on the skull and face regions to eliminate the inconsistency of position of these regions. After these two coordinate adjustments, partial least squares regression (PLSR) is used to estimate the relationship between the face region and the skull region. In order to obtain a more accurate reconstruction, a new fusion strategy is also proposed in the paper to maintain the reconstructed feature regions when gluing the facial regions together. This is based on the observation that the feature regions usually have fewer reconstruction errors than the rest of the face. The results demonstrate that the coordinate adjustments and the new fusion strategy can significantly improve the craniofacial reconstructions. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  3. Multispectral Imaging for Determination of Astaxanthin Concentration in Salmonids

    PubMed Central

    Dissing, Bjørn S.; Nielsen, Michael E.; Ersbøll, Bjarne K.; Frosch, Stina

    2011-01-01

    Multispectral imaging has been evaluated for characterization of the concentration of a specific carotenoid pigment, astaxanthin. 59 fillets of rainbow trout, Oncorhynchus mykiss, were filleted and imaged using a rapid multispectral imaging device for quantitative analysis. The multispectral imaging device captures reflection properties in 19 distinct wavelength bands, prior to determination of the true concentration of astaxanthin. The samples ranged from 0.20 to 4.34 g per g fish. A PLSR model was calibrated to predict astaxanthin concentration from novel images, and showed good results with a RMSEP of 0.27. For comparison, a similar model was built for normal color images, which yielded a RMSEP of 0.45. The acquisition speed of the multispectral imaging system and the accuracy of the PLSR model obtained suggest this method as a promising technique for rapid in-line estimation of astaxanthin concentration in rainbow trout fillets. PMID:21573000

  4. Application of Partial Least Square (PLS) Regression to Determine Landscape-Scale Aquatic Resources Vulnerability in the Ozark Mountains

    EPA Science Inventory

    Partial least squares (PLS) analysis offers a number of advantages over the more traditionally used regression analyses applied in landscape ecology, particularly for determining the associations among multiple constituents of surface water and landscape configuration. Common dat...

  5. Orthogonalizing EM: A design-based least squares algorithm.

    PubMed

    Xiong, Shifeng; Dai, Bin; Huling, Jared; Qian, Peter Z G

    We introduce an efficient iterative algorithm, intended for various least squares problems, based on a design of experiments perspective. The algorithm, called orthogonalizing EM (OEM), works for ordinary least squares and can be easily extended to penalized least squares. The main idea of the procedure is to orthogonalize a design matrix by adding new rows and then solve the original problem by embedding the augmented design in a missing data framework. We establish several attractive theoretical properties concerning OEM. For the ordinary least squares with a singular regression matrix, an OEM sequence converges to the Moore-Penrose generalized inverse-based least squares estimator. For ordinary and penalized least squares with various penalties, it converges to a point having grouping coherence for fully aliased regression matrices. Convergence and the convergence rate of the algorithm are examined. Finally, we demonstrate that OEM is highly efficient for large-scale least squares and penalized least squares problems, and is considerably faster than competing methods when n is much larger than p. Supplementary materials for this article are available online.
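
    The orthogonalize-then-impute idea in this abstract can be sketched for the ordinary least squares case. The sketch below is a hedged illustration rather than the authors' code: it assumes NumPy, the function name `oem_ols` is hypothetical, and it never builds the added rows explicitly. It assumes the added rows make the augmented design satisfy X_c'X_c = d·I with d at least the largest eigenvalue of X'X, so that imputing the missing responses and refitting collapses to a single update.

```python
import numpy as np

def oem_ols(X, y, n_iter=5000, tol=1e-12):
    """OEM-style fixed-point iteration for ordinary least squares (sketch)."""
    XtX = X.T @ X
    Xty = X.T @ y
    # d must be at least the largest eigenvalue of X'X so that d*I - X'X
    # is positive semidefinite (it plays the role of the added rows).
    d = np.linalg.eigvalsh(XtX)[-1]
    beta = np.zeros(X.shape[1])  # starting from zero is an assumption
    for _ in range(n_iter):
        # Impute-and-refit step: (X'y + (d*I - X'X) beta) / d,
        # written in the equivalent form beta + (X'y - X'X beta) / d.
        new_beta = beta + (Xty - XtX @ beta) / d
        if np.max(np.abs(new_beta - beta)) < tol:
            return new_beta
        beta = new_beta
    return beta
```

    Started from zero, the iterates stay in the row space of X, which is why a rank-deficient design ends up at the minimum-norm (Moore-Penrose) solution in this sketch.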

  6. The Use of Alternative Regression Methods in Social Sciences and the Comparison of Least Squares and M Estimation Methods in Terms of the Determination of Coefficient

    ERIC Educational Resources Information Center

    Coskuntuncel, Orkun

    2013-01-01

    The purpose of this study is two-fold; the first aim being to show the effect of outliers on the widely used least squares regression estimator in social sciences. The second aim is to compare the classical method of least squares with the robust M-estimator using the "determination of coefficient" (R[superscript 2]). For this purpose,…

  7. Application of Partial Least Squares (PLS) Regression to Determine Landscape-Scale Aquatic Resource Vulnerability in the Ozark Mountains

    EPA Science Inventory

    Partial least squares (PLS) analysis offers a number of advantages over the more traditionally used regression analyses applied in landscape ecology to study the associations among constituents of surface water and landscapes. Common data problems in ecological studies include: s...

  8. Identifying maternal and infant factors associated with newborn size in rural Bangladesh by partial least squares (PLS) regression analysis

    PubMed Central

    Rahman, Md. Jahanur; Shamim, Abu Ahmed; Klemm, Rolf D. W.; Labrique, Alain B.; Rashid, Mahbubur; Christian, Parul; West, Keith P.

    2017-01-01

    Birth weight, length and circumferences of the head, chest and arm are key measures of newborn size and health in developing countries. We assessed maternal socio-demographic factors associated with multiple measures of newborn size in a large rural population in Bangladesh using partial least squares (PLS) regression method. PLS regression, combining features from principal component analysis and multiple linear regression, is a multivariate technique with an ability to handle multicollinearity while simultaneously handling multiple dependent variables. We analyzed maternal and infant data from singletons (n = 14,506) born during a double-masked, cluster-randomized, placebo-controlled maternal vitamin A or β-carotene supplementation trial in rural northwest Bangladesh. PLS regression results identified numerous maternal factors (parity, age, early pregnancy MUAC, living standard index, years of education, number of antenatal care visits, preterm delivery and infant sex) significantly (p<0.001) associated with newborn size. Among them, preterm delivery had the largest negative influence on newborn size (Standardized β = -0.29 − -0.19; p<0.001). Scatter plots of the scores of first two PLS components also revealed an interaction between newborn sex and preterm delivery on birth size. PLS regression was found to be more parsimonious than both ordinary least squares regression and principal component regression. It also provided more stable estimates than the ordinary least squares regression and provided the effect measure of the covariates with greater accuracy as it accounts for the correlation among the covariates and outcomes. Therefore, PLS regression is recommended when either there are multiple outcome measurements in the same study, or the covariates are correlated, or both situations exist in a dataset. PMID:29261760
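
    As a rough illustration of how PLS regression handles correlated covariates and several outcomes at once, the following sketch fits a multi-response PLS model with NumPy. It is not the authors' implementation: the function name `pls2_fit` is hypothetical, and taking each weight vector as the leading left singular vector of X'Y is one common variant of the algorithm, chosen here for brevity.

```python
import numpy as np

def pls2_fit(X, Y, n_components):
    """Multi-response PLS regression (sketch). Returns (B, intercept)."""
    x_mean, y_mean = X.mean(axis=0), Y.mean(axis=0)
    Xr, Yr = X - x_mean, Y - y_mean          # work on centered data
    W, P, Q = [], [], []
    for _ in range(n_components):
        # Weight vector: X-direction with maximal covariance with Y.
        w = np.linalg.svd(Xr.T @ Yr, full_matrices=False)[0][:, 0]
        t = Xr @ w                           # scores of this component
        tt = t @ t
        p = Xr.T @ t / tt                    # X loadings
        q = Yr.T @ t / tt                    # Y loadings
        Xr = Xr - np.outer(t, p)             # deflate X
        Yr = Yr - np.outer(t, q)             # deflate Y
        W.append(w)
        P.append(p)
        Q.append(q)
    W, P, Q = np.array(W).T, np.array(P).T, np.array(Q).T
    # Coefficients in the original X space: B = W (P'W)^{-1} Q'.
    B = W @ np.linalg.solve(P.T @ W, Q.T)
    intercept = y_mean - x_mean @ B
    return B, intercept
```

    With as many components as X has independent columns, this sketch reproduces the ordinary least squares fit; using fewer components is what gives stable estimates when covariates are strongly correlated.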

  9. Identifying maternal and infant factors associated with newborn size in rural Bangladesh by partial least squares (PLS) regression analysis.

    PubMed

    Kabir, Alamgir; Rahman, Md Jahanur; Shamim, Abu Ahmed; Klemm, Rolf D W; Labrique, Alain B; Rashid, Mahbubur; Christian, Parul; West, Keith P

    2017-01-01

    Birth weight, length and circumferences of the head, chest and arm are key measures of newborn size and health in developing countries. We assessed maternal socio-demographic factors associated with multiple measures of newborn size in a large rural population in Bangladesh using partial least squares (PLS) regression method. PLS regression, combining features from principal component analysis and multiple linear regression, is a multivariate technique with an ability to handle multicollinearity while simultaneously handling multiple dependent variables. We analyzed maternal and infant data from singletons (n = 14,506) born during a double-masked, cluster-randomized, placebo-controlled maternal vitamin A or β-carotene supplementation trial in rural northwest Bangladesh. PLS regression results identified numerous maternal factors (parity, age, early pregnancy MUAC, living standard index, years of education, number of antenatal care visits, preterm delivery and infant sex) significantly (p<0.001) associated with newborn size. Among them, preterm delivery had the largest negative influence on newborn size (Standardized β = -0.29 - -0.19; p<0.001). Scatter plots of the scores of first two PLS components also revealed an interaction between newborn sex and preterm delivery on birth size. PLS regression was found to be more parsimonious than both ordinary least squares regression and principal component regression. It also provided more stable estimates than the ordinary least squares regression and provided the effect measure of the covariates with greater accuracy as it accounts for the correlation among the covariates and outcomes. Therefore, PLS regression is recommended when either there are multiple outcome measurements in the same study, or the covariates are correlated, or both situations exist in a dataset.

  10. Kernel Partial Least Squares for Nonlinear Regression and Discrimination

    NASA Technical Reports Server (NTRS)

    Rosipal, Roman; Clancy, Daniel (Technical Monitor)

    2002-01-01

    This paper summarizes recent results on applying the method of partial least squares (PLS) in a reproducing kernel Hilbert space (RKHS). A previously proposed kernel PLS regression model was proven to be competitive with other regularized regression methods in RKHS. The family of nonlinear kernel-based PLS models is extended by considering the kernel PLS method for discrimination. Theoretical and experimental results on a two-class discrimination problem indicate usefulness of the method.

  11. Superquantile Regression: Theory, Algorithms, and Applications

    DTIC Science & Technology

    2014-12-01

    Example C: Stack loss data scatterplot matrix. [Table residue; columns: Regression, α, c0, caf, cwt, cac, R̄²α, R̄²α,Adj] Least Squares: NA, -39.9197, 0.7156, 1.2953, -0.1521, 0.9136... This is due to a small [Table residue; columns: Model, Regression, α, c0, cwt, cwt2, R̄²α, R̄²α,Adj] f2, Least Squares: NA, -41.9109, 2.8174, —, 0.7665, 0.7542; Quantile: 0.25, -32.0000

  12. Determination of geographical origin and icariin content of Herba Epimedii using near infrared spectroscopy and chemometrics

    NASA Astrophysics Data System (ADS)

    Yang, Yue; Wu, Yongjiang; Li, Weili; Liu, Xuesong; Zheng, Jiyu; Zhang, Wentao; Chen, Yong

    2018-02-01

    Near infrared (NIR) spectroscopy coupled with chemometrics was used to discriminate the geographical origin of Herba Epimedii in this work. Four different classification models, namely discriminant analysis (DA), back propagation neural network (BPNN), K-nearest neighbor (KNN), and support vector machine (SVM), were constructed, and their performances in terms of recognition accuracy were compared. The results indicated that the SVM model was superior over the other models in the geographical origin identification of Herba Epimedii. The recognition rates of the optimum SVM model were up to 100% for the calibration set and 94.44% for the prediction set, respectively. In addition, the feasibility of NIR spectroscopy with the CARS-PLSR calibration model in prediction of icariin content of Herba Epimedii was also investigated. The determination coefficient (RP2) and root-mean-square error (RMSEP) for prediction set were 0.9269 and 0.0480, respectively. It can be concluded that the NIR spectroscopy technique in combination with chemometrics has great potential in determination of geographical origin and icariin content of Herba Epimedii. This study can provide a valuable reference for rapid quality control of food products.

  13. Confidence Intervals for Squared Semipartial Correlation Coefficients: The Effect of Nonnormality

    ERIC Educational Resources Information Center

    Algina, James; Keselman, H. J.; Penfield, Randall D.

    2010-01-01

    The increase in the squared multiple correlation coefficient ([delta]R[superscript 2]) associated with a variable in a regression equation is a commonly used measure of importance in regression analysis. Algina, Keselman, and Penfield found that intervals based on asymptotic principles were typically very inaccurate, even though the sample size…

  14. Multilevel Modeling and Ordinary Least Squares Regression: How Comparable Are They?

    ERIC Educational Resources Information Center

    Huang, Francis L.

    2018-01-01

    Studies analyzing clustered data sets using both multilevel models (MLMs) and ordinary least squares (OLS) regression have generally concluded that resulting point estimates, but not the standard errors, are comparable with each other. However, the accuracy of the estimates of OLS models is important to consider, as several alternative techniques…

  15. A Comparison of Mean Phase Difference and Generalized Least Squares for Analyzing Single-Case Data

    ERIC Educational Resources Information Center

    Manolov, Rumen; Solanas, Antonio

    2013-01-01

    The present study focuses on single-case data analysis specifically on two procedures for quantifying differences between baseline and treatment measurements. The first technique tested is based on generalized least square regression analysis and is compared to a proposed non-regression technique, which allows obtaining similar information. The…

  16. Orthogonalizing EM: A design-based least squares algorithm

    PubMed Central

    Xiong, Shifeng; Dai, Bin; Huling, Jared; Qian, Peter Z. G.

    2016-01-01

    We introduce an efficient iterative algorithm, intended for various least squares problems, based on a design of experiments perspective. The algorithm, called orthogonalizing EM (OEM), works for ordinary least squares and can be easily extended to penalized least squares. The main idea of the procedure is to orthogonalize a design matrix by adding new rows and then solve the original problem by embedding the augmented design in a missing data framework. We establish several attractive theoretical properties concerning OEM. For the ordinary least squares with a singular regression matrix, an OEM sequence converges to the Moore-Penrose generalized inverse-based least squares estimator. For ordinary and penalized least squares with various penalties, it converges to a point having grouping coherence for fully aliased regression matrices. Convergence and the convergence rate of the algorithm are examined. Finally, we demonstrate that OEM is highly efficient for large-scale least squares and penalized least squares problems, and is considerably faster than competing methods when n is much larger than p. Supplementary materials for this article are available online. PMID:27499558

  17. Geodesic regression on orientation distribution functions with its application to an aging study.

    PubMed

    Du, Jia; Goh, Alvina; Kushnarev, Sergey; Qiu, Anqi

    2014-02-15

    In this paper, we treat orientation distribution functions (ODFs) derived from high angular resolution diffusion imaging (HARDI) as elements of a Riemannian manifold and present a method for geodesic regression on this manifold. In order to find the optimal regression model, we pose this as a least-squares problem involving the sum-of-squared geodesic distances between observed ODFs and their model fitted data. We derive the appropriate gradient terms and employ gradient descent to find the minimizer of this least-squares optimization problem. In addition, we show how to perform statistical testing for determining the significance of the relationship between the manifold-valued regressors and the real-valued regressands. Experiments on both synthetic and real human data are presented. In particular, we examine aging effects on HARDI via geodesic regression of ODFs in normal adults aged 22 years old and above. © 2013 Elsevier Inc. All rights reserved.

  18. Weighted linear regression using D2H and D2 as the independent variables

    Treesearch

    Hans T. Schreuder; Michael S. Williams

    1998-01-01

    Several error structures for weighted regression equations used for predicting volume were examined for 2 large data sets of felled and standing loblolly pine trees (Pinus taeda L.). The generally accepted model with variance of error proportional to the value of the covariate squared ( D2H = diameter squared times height or D...
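
    When the error variance is taken to be proportional to the covariate squared, as in the model mentioned above, the usual weighted least squares choice is a weight of 1/x² per observation. The sketch below shows that calculation for a straight-line model; the function name `weighted_line_fit` and the numbers in the usage lines are made up for illustration and are not data from the study.

```python
def weighted_line_fit(x, y, w):
    """Weighted least squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw   # weighted mean of x
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw   # weighted mean of y
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    a = ybar - b * xbar
    return a, b

# Hypothetical covariate values (e.g. D2H) and volumes, for illustration only.
d2h = [1.0, 2.0, 3.0, 10.0]
volume = [1.0, 2.0, 3.0, 20.0]
# Variance proportional to the covariate squared -> weights 1 / x^2.
weights = [1.0 / v ** 2 for v in d2h]
a_w, b_w = weighted_line_fit(d2h, volume, weights)
a_u, b_u = weighted_line_fit(d2h, volume, [1.0] * len(d2h))
```

    In this toy data the large-covariate point departs from the trend of the others, and the 1/x² weights reduce its pull on the slope relative to the unweighted fit.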

  19. Using partial least squares regression as a predictive tool in describing equine third metacarpal bone shape.

    PubMed

    Liley, Helen; Zhang, Ju; Firth, Elwyn; Fernandez, Justin; Besier, Thor

    2017-11-01

    Population variance in bone shape is an important consideration when applying the results of subject-specific computational models to a population. In this letter, we demonstrate the ability of partial least squares regression to provide an improved shape prediction of the equine third metacarpal epiphysis, using two easily obtained measurements.

  20. The Multivariate Regression Statistics Strategy to Investigate Content-Effect Correlation of Multiple Components in Traditional Chinese Medicine Based on a Partial Least Squares Method.

    PubMed

    Peng, Ying; Li, Su-Ning; Pei, Xuexue; Hao, Kun

    2018-03-01

    A multivariate regression statistic strategy was developed to clarify multi-components content-effect correlation of panax ginseng saponins extract and predict the pharmacological effect by components content. In example 1, firstly, we compared pharmacological effects between panax ginseng saponins extract and individual saponin combinations. Secondly, we examined the anti-platelet aggregation effect in seven different saponin combinations of ginsenoside Rb1, Rg1, Rh, Rd, Ra3 and notoginsenoside R1. Finally, the correlation between anti-platelet aggregation and the content of multiple components was analyzed by a partial least squares algorithm. In example 2, firstly, 18 common peaks were identified in ten different batches of panax ginseng saponins extracts from different origins. Then, we investigated the anti-myocardial ischemia reperfusion injury effects of the ten different panax ginseng saponins extracts. Finally, the correlation between the fingerprints and the cardioprotective effects was analyzed by a partial least squares algorithm. Both in example 1 and 2, the relationship between the components content and pharmacological effect was modeled well by the partial least squares regression equations. Importantly, the predicted effect curve was close to the observed data of dot marked on the partial least squares regression model. This study has given evidences that the multi-component content is a promising information for predicting the pharmacological effects of traditional Chinese medicine.

  1. Multi-analyte quantification in bioprocesses by Fourier-transform-infrared spectroscopy by partial least squares regression and multivariate curve resolution.

    PubMed

    Koch, Cosima; Posch, Andreas E; Goicoechea, Héctor C; Herwig, Christoph; Lendl, Bernhard

    2014-01-07

    This paper presents the quantification of Penicillin V and phenoxyacetic acid, a precursor, inline during Penicillium chrysogenum fermentations by FTIR spectroscopy and partial least squares (PLS) regression and multivariate curve resolution - alternating least squares (MCR-ALS). First, the applicability of an attenuated total reflection FTIR fiber optic probe was assessed offline by measuring standards of the analytes of interest and investigating matrix effects of the fermentation broth. Then measurements were performed inline during four fed-batch fermentations with online HPLC for the determination of Penicillin V and phenoxyacetic acid as reference analysis. PLS and MCR-ALS models were built using these data and validated by comparison of single analyte spectra with the selectivity ratio of the PLS models and the extracted spectral traces of the MCR-ALS models, respectively. The achieved root mean square errors of cross-validation for the PLS regressions were 0.22 g L(-1) for Penicillin V and 0.32 g L(-1) for phenoxyacetic acid and the root mean square errors of prediction for MCR-ALS were 0.23 g L(-1) for Penicillin V and 0.15 g L(-1) for phenoxyacetic acid. A general work-flow for building and assessing chemometric regression models for the quantification of multiple analytes in bioprocesses by FTIR spectroscopy is given. Copyright © 2013 The Authors. Published by Elsevier B.V. All rights reserved.

  2. Evaluation of the Bitterness of Traditional Chinese Medicines using an E-Tongue Coupled with a Robust Partial Least Squares Regression Method.

    PubMed

    Lin, Zhaozhou; Zhang, Qiao; Liu, Ruixin; Gao, Xiaojie; Zhang, Lu; Kang, Bingya; Shi, Junhan; Wu, Zidan; Gui, Xinjing; Li, Xuelin

    2016-01-25

    To accurately, safely, and efficiently evaluate the bitterness of Traditional Chinese Medicines (TCMs), a robust predictor was developed using robust partial least squares (RPLS) regression method based on data obtained from an electronic tongue (e-tongue) system. The data quality was verified by Grubbs' test. Moreover, potential outliers were detected based on both the standardized residual and score distance calculated for each sample. The performance of RPLS on the dataset before and after outlier detection was compared to other state-of-the-art methods including multivariate linear regression, least squares support vector machine, and the plain partial least squares regression. Both R² and root-mean-squares error (RMSE) of cross-validation (CV) were recorded for each model. With four latent variables, a robust RMSECV value of 0.3916 with bitterness values ranging from 0.63 to 4.78 was obtained for the RPLS model that was constructed based on the dataset including outliers. Meanwhile, the RMSECV, which was calculated using the models constructed by other methods, was larger than that of the RPLS model. After six outliers were excluded, the performance of all benchmark methods markedly improved, but the difference between the RPLS model constructed before and after outlier exclusion was negligible. In conclusion, the bitterness of TCM decoctions can be accurately evaluated with the RPLS model constructed using e-tongue data.

  3. A Generalized Least Squares Regression Approach for Computing Effect Sizes in Single-Case Research: Application Examples

    ERIC Educational Resources Information Center

    Maggin, Daniel M.; Swaminathan, Hariharan; Rogers, Helen J.; O'Keeffe, Breda V.; Sugai, George; Horner, Robert H.

    2011-01-01

    A new method for deriving effect sizes from single-case designs is proposed. The strategy is applicable to small-sample time-series data with autoregressive errors. The method uses Generalized Least Squares (GLS) to model the autocorrelation of the data and estimate regression parameters to produce an effect size that represents the magnitude of…
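
    A minimal sketch of a generalized least squares fit with first-order autoregressive errors, of the general kind this abstract refers to, assuming NumPy. The function names `ar1_covariance` and `gls_fit` are hypothetical, and the autocorrelation `rho` is treated as known here, whereas in practice it would have to be estimated from the data; the effect-size calculation itself is not shown.

```python
import numpy as np

def ar1_covariance(n, rho):
    """Correlation matrix of AR(1) errors: entry (i, j) is rho**|i - j|."""
    idx = np.arange(n)
    return rho ** np.abs(idx[:, None] - idx[None, :])

def gls_fit(X, y, omega):
    """GLS estimate (X' Omega^-1 X)^-1 X' Omega^-1 y, without forming Omega^-1."""
    oi_X = np.linalg.solve(omega, X)   # Omega^-1 X
    oi_y = np.linalg.solve(omega, y)   # Omega^-1 y
    return np.linalg.solve(X.T @ oi_X, X.T @ oi_y)

# Hypothetical single-case design: 5 baseline and 5 treatment observations.
phase = np.array([0.0] * 5 + [1.0] * 5)
X = np.column_stack([np.ones(10), phase])      # intercept and level change
y = np.array([2.1, 1.8, 2.2, 2.0, 1.9, 5.2, 4.9, 5.1, 4.8, 5.0])
beta = gls_fit(X, y, ar1_covariance(10, rho=0.5))  # rho = 0.5 is an assumed value
```

    The second coefficient in `beta` is the estimated level change between phases; setting `rho` to zero makes the covariance the identity and reduces the fit to ordinary least squares.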

  4. Use of Thematic Mapper for water quality assessment

    NASA Technical Reports Server (NTRS)

    Horn, E. M.; Morrissey, L. A.

    1984-01-01

    The evaluation of simulated TM data obtained on an ER-2 aircraft at twenty-five predesignated sample sites for mapping water quality factors such as conductivity, pH, suspended solids, turbidity, temperature, and depth, is discussed. Using a multiple regression for the seven TM bands, an equation is developed for the suspended solids. TM bands 1, 2, 3, 4, and 6 are used with logarithm conductivity in a multiple regression. The assessment of regression equations for a high coefficient of determination (R-squared) and statistical significance is considered. Confidence intervals about the mean regression point are calculated in order to assess the robustness of the regressions used for mapping conductivity, turbidity, and suspended solids, and by regressing random subsamples of sites and comparing the resultant range of R-squared, cross validation is conducted.

  5. Spectral distance decay: Assessing species beta-diversity by quantile regression

    USGS Publications Warehouse

    Rocchini, D.; Nagendra, H.; Ghate, R.; Cade, B.S.

    2009-01-01

    Remotely sensed data represents key information for characterizing and estimating biodiversity. Spectral distance among sites has proven to be a powerful approach for detecting species composition variability. Regression analysis of species similarity versus spectral distance may allow us to quantitatively estimate how beta-diversity in species changes with respect to spectral and ecological variability. In classical regression analysis, the residual sum of squares is minimized for the mean of the dependent variable distribution. However, many ecological datasets are characterized by a high number of zeroes that can add noise to the regression model. Quantile regression can be used to evaluate trend in the upper quantiles rather than a mean trend across the whole distribution of the dependent variable. In this paper, we used ordinary least square (ols) and quantile regression to estimate the decay of species similarity versus spectral distance. The achieved decay rates were statistically nonzero (p < 0.05) considering both ols and quantile regression. Nonetheless, ols regression estimate of mean decay rate was only half the decay rate indicated by the upper quantiles. Moreover, the intercept value, representing the similarity reached when spectral distance approaches zero, was very low compared with the intercepts of upper quantiles, which detected high species similarity when habitats are more similar. In this paper we demonstrated the power of using quantile regressions applied to spectral distance decay in order to reveal species diversity patterns otherwise lost or underestimated by ordinary least square regression. © 2009 American Society for Photogrammetry and Remote Sensing.
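
    The contrast between a mean-based fit and an upper-quantile fit can be seen even in an intercept-only model. The sketch below is a hedged illustration, not the authors' analysis: the function names `pinball_loss` and `best_constant` are hypothetical and the similarity values are invented, with many zeroes as in the datasets the abstract describes. Quantile regression minimizes the asymmetric "pinball" loss, whereas least squares returns the mean.

```python
def pinball_loss(residuals, tau):
    """Quantile-regression loss: tau * r for r >= 0, (tau - 1) * r otherwise."""
    return sum(tau * r if r >= 0 else (tau - 1) * r for r in residuals)

def best_constant(values, tau):
    """Constant minimizing the pinball loss.

    For a constant-only model a minimizer is always attained at one of the
    observed values, so searching over the data points is sufficient.
    """
    return min(values, key=lambda c: pinball_loss([v - c for v in values], tau))

# Invented zero-heavy responses, for illustration only.
similarity = [0, 0, 0, 0, 1, 2, 3, 4, 10]
mean_fit = sum(similarity) / len(similarity)   # what least squares estimates
median_fit = best_constant(similarity, 0.5)    # tau = 0.5 gives the median
upper_fit = best_constant(similarity, 0.9)     # tau = 0.9 targets the upper tail
```

    In a full quantile regression the same loss is minimized over slope and intercept together, typically by linear programming rather than by the search used here.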

  6. Least Square Regression Method for Estimating Gas Concentration in an Electronic Nose System

    PubMed Central

    Khalaf, Walaa; Pace, Calogero; Gaudioso, Manlio

    2009-01-01

    We describe an Electronic Nose (ENose) system which is able to identify the type of analyte and to estimate its concentration. The system consists of seven sensors, five of them being gas sensors (supplied with different heater voltage values), the remainder being a temperature and a humidity sensor, respectively. To identify a new analyte sample and then to estimate its concentration, we use both some machine learning techniques and the least square regression principle. In fact, we apply two different training models; the first one is based on the Support Vector Machine (SVM) approach and is aimed at teaching the system how to discriminate among different gases, while the second one uses the least squares regression approach to predict the concentration of each type of analyte. PMID:22573980

  7. Regional regression of flood characteristics employing historical information

    USGS Publications Warehouse

    Tasker, Gary D.; Stedinger, J.R.

    1987-01-01

    Streamflow gauging networks provide hydrologic information for use in estimating the parameters of regional regression models. The regional regression models can be used to estimate flood statistics, such as the 100 yr peak, at ungauged sites as functions of drainage basin characteristics. A recent innovation in regional regression is the use of a generalized least squares (GLS) estimator that accounts for unequal station record lengths and sample cross correlation among the flows. However, this technique does not account for historical flood information. A method is proposed here to adjust this generalized least squares estimator to account for possible information about historical floods available at some stations in a region. The historical information is assumed to be in the form of observations of all peaks above a threshold during a long period outside the systematic record period. A Monte Carlo simulation experiment was performed to compare the GLS estimator adjusted for historical floods with the unadjusted GLS estimator and the ordinary least squares estimator. Results indicate that using the GLS estimator adjusted for historical information significantly improves the regression model. © 1987.

  8. Computation of nonlinear least squares estimator and maximum likelihood using principles in matrix calculus

    NASA Astrophysics Data System (ADS)

    Mahaboob, B.; Venkateswarlu, B.; Sankar, J. Ravi; Balasiddamuni, P.

    2017-11-01

    This paper uses matrix calculus techniques to obtain the Nonlinear Least Squares Estimator (NLSE), the Maximum Likelihood Estimator (MLE) and a linear pseudo model for the nonlinear regression model. David Pollard and Peter Radchenko [1] explained analytic techniques to compute the NLSE. However, the present research paper introduces an innovative method to compute the NLSE using principles in multivariate calculus. This study is concerned with very new optimization techniques used to compute MLE and NLSE. Anh [2] derived NLSE and MLE of a heteroscedastic regression model. Lemcoff [3] discussed a procedure to get a linear pseudo model for the nonlinear regression model. In this research article a new technique is developed to get the linear pseudo model for the nonlinear regression model using multivariate calculus. The linear pseudo model of Edmond Malinvaud [4] has been explained in a very different way in this paper. David Pollard et al. used empirical process techniques to study the asymptotics of the LSE (least-squares estimation) for the fitting of a nonlinear regression function in 2006. In Jae Myung [13] provided a go conceptual for Maximum likelihood estimation in his work “Tutorial on maximum likelihood estimation

  9. Inter-class sparsity based discriminative least square regression.

    PubMed

    Wen, Jie; Xu, Yong; Li, Zuoyong; Ma, Zhongli; Xu, Yuanrong

    2018-06-01

    Least square regression is a very popular supervised classification method. However, two main issues greatly limit its performance. The first one is that it only focuses on fitting the input features to the corresponding output labels while ignoring the correlations among samples. The second one is that the used label matrix, i.e., zero-one label matrix is inappropriate for classification. To solve these problems and improve the performance, this paper presents a novel method, i.e., inter-class sparsity based discriminative least square regression (ICS_DLSR), for multi-class classification. Different from other methods, the proposed method pursues that the transformed samples have a common sparsity structure in each class. For this goal, an inter-class sparsity constraint is introduced to the least square regression model such that the margins of samples from the same class can be greatly reduced while those of samples from different classes can be enlarged. In addition, an error term with row-sparsity constraint is introduced to relax the strict zero-one label matrix, which allows the method to be more flexible in learning the discriminative transformation matrix. These factors encourage the method to learn a more compact and discriminative transformation for regression and thus has the potential to perform better than other methods. Extensive experimental results show that the proposed method achieves the best performance in comparison with other methods for multi-class classification. Copyright © 2018 Elsevier Ltd. All rights reserved.

  10. Least Squares Procedures.

    ERIC Educational Resources Information Center

    Hester, Yvette

    Least squares methods are sophisticated mathematical curve fitting procedures used in all classical parametric methods. The linear least squares approximation is most often associated with finding the "line of best fit" or the regression line. Since all statistical analyses are correlational and all classical parametric methods are least…

  11. Determination of suitable drying curve model for bread moisture loss during baking

    NASA Astrophysics Data System (ADS)

    Soleimani Pour-Damanab, A. R.; Jafary, A.; Rafiee, S.

    2013-03-01

    This study presents mathematical modelling of bread moisture loss or drying during baking in a conventional bread baking process. In order to estimate and select the appropriate moisture loss curve equation, 11 different models, semi-theoretical and empirical, were applied to the experimental data and compared according to their correlation coefficients, chi-squared test and root mean square error, which were predicted by nonlinear regression analysis. Consequently, of all the drying models, a Page model was selected as the best one, according to the correlation coefficients, chi-squared test, and root mean square error values and its simplicity. The mean absolute estimation error of the proposed model by linear regression analysis was 2.43% and 4.74% for natural and forced convection modes, respectively.

  12. Interpreting the Results of Weighted Least-Squares Regression: Caveats for the Statistical Consumer.

    ERIC Educational Resources Information Center

    Willett, John B.; Singer, Judith D.

    In research, data sets often occur in which the variance of the distribution of the dependent variable at given levels of the predictors is a function of the values of the predictors. In this situation, the use of weighted least-squares (WLS) techniques is required. Weights suitable for use in a WLS regression analysis must be estimated. A…
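
    As an illustration of the WLS idea in the record above, the following is a minimal sketch, not the authors' procedure. It assumes the error variance is proportional to x squared, so the weights 1/x**2 are taken as known rather than estimated; the function name `weighted_least_squares` and the synthetic data are hypothetical.

```python
import numpy as np

def weighted_least_squares(X, y, weights):
    """Solve the WLS normal equations (X' W X) beta = X' W y for diagonal W."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(weights, dtype=float)
    Xw = X * w[:, None]  # each row of X scaled by its weight, i.e. W X
    return np.linalg.solve(X.T @ Xw, Xw.T @ y)

# Synthetic heteroscedastic data (assumption): noise s.d. grows linearly with x.
rng = np.random.default_rng(0)
x = np.linspace(1.0, 10.0, 40)
X = np.column_stack([np.ones_like(x), x])       # intercept + slope design
y = 2.0 + 3.0 * x + rng.normal(scale=0.2 * x)

# Weights are the reciprocal of the (assumed) error variance, up to a constant.
beta_wls = weighted_least_squares(X, y, 1.0 / x**2)
print(beta_wls)
```

    In practice the weights are rarely known and have to be estimated, which is the caveat the record raises.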

  13. Using multiple calibration sets to improve the quantitative accuracy of partial least squares (PLS) regression on open-path fourier transform infrared (OP/FT-IR) spectra of ammonia over wide concentration ranges

    USDA-ARS's Scientific Manuscript database

    A technique of using multiple calibration sets in partial least squares regression (PLS) was proposed to improve the quantitative determination of ammonia from open-path Fourier transform infrared spectra. The spectra were measured near animal farms, and the path-integrated concentration of ammonia...

  14. Local Linear Regression for Data with AR Errors.

    PubMed

    Li, Runze; Li, Yan

    2009-07-01

    In many statistical applications, data are collected over time, and they are likely correlated. In this paper, we investigate how to incorporate the correlation information into local linear regression. Under the assumption that the error process is an auto-regressive process, a new estimation procedure is proposed for nonparametric regression using the local linear regression method and profile least squares techniques. We further propose the SCAD penalized profile least squares method to determine the order of the auto-regressive process. Extensive Monte Carlo simulation studies are conducted to examine the finite sample performance of the proposed procedures, and to compare their performance with that of the existing one. From our empirical studies, the newly proposed procedures can dramatically improve the accuracy of naive local linear regression with working-independent error structure. We illustrate the proposed methodology by an analysis of a real data set.

  15. Linear regression in astronomy. II

    NASA Technical Reports Server (NTRS)

    Feigelson, Eric D.; Babu, Gutti J.

    1992-01-01

    A wide variety of least-squares linear regression procedures used in observational astronomy, particularly investigations of the cosmic distance scale, are presented and discussed. The classes of linear models considered are (1) unweighted regression lines, with bootstrap and jackknife resampling; (2) regression solutions when measurement error, in one or both variables, dominates the scatter; (3) methods to apply a calibration line to new data; (4) truncated regression models, which apply to flux-limited data sets; and (5) censored regression models, which apply when nondetections are present. For the calibration problem we develop two new procedures: a formula for the intercept offset between two parallel data sets, which propagates slope errors from one regression to the other; and a generalization of the Working-Hotelling confidence bands to nonstandard least-squares lines. They can provide improved error analysis for Faber-Jackson, Tully-Fisher, and similar cosmic distance scale relations.

  16. Evaluation of the Bitterness of Traditional Chinese Medicines using an E-Tongue Coupled with a Robust Partial Least Squares Regression Method

    PubMed Central

    Lin, Zhaozhou; Zhang, Qiao; Liu, Ruixin; Gao, Xiaojie; Zhang, Lu; Kang, Bingya; Shi, Junhan; Wu, Zidan; Gui, Xinjing; Li, Xuelin

    2016-01-01

    To accurately, safely, and efficiently evaluate the bitterness of Traditional Chinese Medicines (TCMs), a robust predictor was developed using the robust partial least squares (RPLS) regression method based on data obtained from an electronic tongue (e-tongue) system. The data quality was verified by Grubbs' test. Moreover, potential outliers were detected based on both the standardized residual and score distance calculated for each sample. The performance of RPLS on the dataset before and after outlier detection was compared to other state-of-the-art methods including multivariate linear regression, least squares support vector machine, and plain partial least squares regression. Both R2 and root-mean-squares error (RMSE) of cross-validation (CV) were recorded for each model. With four latent variables, a robust RMSECV value of 0.3916, with bitterness values ranging from 0.63 to 4.78, was obtained for the RPLS model that was constructed based on the dataset including outliers. Meanwhile, the RMSECV calculated using the models constructed by the other methods was larger than that of the RPLS model. After six outliers were excluded, the performance of all benchmark methods markedly improved, but the difference between the RPLS models constructed before and after outlier exclusion was negligible. In conclusion, the bitterness of TCM decoctions can be accurately evaluated with the RPLS model constructed using e-tongue data. PMID:26821026

  17. Different sources of soil CO2 respiration from a drained spruce forest and their dependence on environmental factors

    NASA Astrophysics Data System (ADS)

    Nousratpour, A.

    2011-12-01

    The annual CO2 emission from soils corresponds to a large portion of the global carbon cycle and equals 10 percent of the total atmospheric carbon pool. The total forest soil CO2 loss equals the sum of the contributions from autotrophic and heterotrophic organisms. The autotrophic respiration is derived from recent photosynthates from the forest canopy and exudates via the roots. The heterotrophic respiration is less directly dependent on root presence and recently assimilated photosynthates, which points to the possibility of separate mechanisms governing the CO2 emissions. The variation of the CO2 flux from these somewhat overlapping sources in the soil, i.e. rhizospheric and non-rhizospheric, is still not fully understood. Soil temperature and water availability in particular have often been used to explain the variation of soil CO2 efflux by using regression methods. In this experiment around 1000 hours of soil CO2 emission rates from a drained spruce forest were collected from 6 plots, among which 3 were previously root excluded. The emission rates were collected during 5 campaigns throughout the growing season along with continuous above ground and below ground temperature and water properties such as precipitation and VPD (vapor pressure deficit). The resulting matrix was analyzed using the multivariate statistical model PLSr (Partial Least Squares regression). This operation reduces the dimensionality of large datasets with probable multicollinearity and helps clarify the dependence of a response factor on x-variables. In addition, a time series analysis is applied to the dataset to address the time lag between below ground temperature and water properties and the above ground weather conditions such as VPD and air temperature. Mean carbon emission from the control plots (428 mg Carbon m⁻² hr⁻¹) was significantly larger than that from the root excluded plots (136 mg Carbon m⁻² hr⁻¹). During the growing season more than 2/3 of the total CO2 release was estimated to be root contribution. The results show that the activity in the rhizosphere increased with rising soil temperature, VPD and ground water depletion until a certain point. When the level of ground water depth was deeper than about 0.5 m the dependence was reversed. This effect was either the opposite or lacking in the root excluded plots, which reflects the involvement of the tree roots and the separate factors controlling the different sources of CO2.

  18. Fourier transform infrared reflectance spectra of latent fingerprints: a biometric gauge for the age of an individual.

    PubMed

    Hemmila, April; McGill, Jim; Ritter, David

    2008-03-01

    To determine if changes in fingerprint infrared spectra linear with age can be found, partial least squares (PLS1) regression of 155 fingerprint infrared spectra against the person's age was constructed. The regression produced a linear model of age as a function of spectrum with a root mean square error of calibration of less than 4 years, showing an inflection at about 25 years of age. The spectral ranges emphasized by the regression do not correspond to the highest concentration constituents of the fingerprints. Separate linear regression models for old and young people can be constructed with even more statistical rigor. The success of the regression demonstrates that a combination of constituents can be found that changes linearly with age, with a significant shift around puberty.
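
    To make the PLS1 regression named in the record above concrete, here is a minimal NIPALS-style sketch for a single response. It is a generic textbook formulation under stated assumptions, not the authors' implementation: the function name `pls1_fit`, the synthetic "spectra" and the choice of two components are all hypothetical.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """PLS1 via NIPALS-style deflation; returns (coef, intercept) for y ~ X."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yc = X - x_mean, y - y_mean          # centered predictors and response
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yc                        # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = Xk @ w                           # scores for this latent variable
        tt = t @ t
        p = Xk.T @ t / tt                    # X loadings
        q.append(yc @ t / tt)                # y loading (regression of y on t)
        Xk = Xk - np.outer(t, p)             # deflate X before the next component
        W.append(w)
        P.append(p)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)   # coefficients in the original X space
    intercept = y_mean - x_mean @ coef
    return coef, intercept

# Synthetic example (assumption): 40 samples, 8 correlated "spectral" channels.
rng = np.random.default_rng(0)
latent = rng.normal(size=(40, 2))
X = latent @ rng.normal(size=(2, 8)) + 0.05 * rng.normal(size=(40, 8))
y = 30.0 + 5.0 * latent[:, 0] + rng.normal(scale=0.5, size=40)

coef, intercept = pls1_fit(X, y, n_components=2)
y_hat = X @ coef + intercept
rmsec = np.sqrt(np.mean((y - y_hat) ** 2))   # root mean square error of calibration
print(rmsec)
```

    When the number of components equals the rank of X, this fit coincides with ordinary least squares; using fewer components is what gives PLS its stability with collinear spectra.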

  19. Analysis of Learning Curve Fitting Techniques.

    DTIC Science & Technology

    1987-09-01

    1986. 15. Neter, John and others. Applied Linear Regression Models. Homewood IL: Irwin, 19-33. 16. SAS User’s Guide: Basics, Version 5 Edition. SAS... Linear Regression Techniques (15:23-52). Random errors are assumed to be normally distributed when using ordinary least-squares, according to Johnston...lot estimated by the improvement curve formula. For a more detailed explanation of the ordinary least-squares technique, see Neter, et al., Applied

  20. A Simulation-Based Comparison of Several Stochastic Linear Regression Methods in the Presence of Outliers.

    ERIC Educational Resources Information Center

    Rule, David L.

    Several regression methods were examined within the framework of weighted structural regression (WSR), comparing their regression weight stability and score estimation accuracy in the presence of outlier contamination. The methods compared are: (1) ordinary least squares; (2) WSR ridge regression; (3) minimum risk regression; (4) minimum risk 2;…

  1. A portable nondestructive detection device of quality and nutritional parameters of meat using Vis/NIR spectroscopy

    NASA Astrophysics Data System (ADS)

    Wang, Wenxiu; Peng, Yankun; Wang, Fan; Sun, Hongwei

    2017-05-01

    The improvement of living standards has urged consumers to pay more attention to the quality and nutrition of meat, so the development of nondestructive detection device for quality and nutritional parameters is commercioganic undoubtedly. In this research, a portable device equipped with visible (Vis) and near-infrared (NIR) spectrometers, tungsten halogen lamp, optical fiber, ring light guide and embedded computer was developed to realize simultaneous and fast detection of color (L*, a*, b*), pH, total volatile basic nitrogen (TVB-N), intramuscular fat (IF), protein and water content in pork. The wavelength ranges of the dual-band spectrometers were 400-1100 nm and 940-1650 nm, respectively, and the tungsten halogen lamp cooperated with the ring light guide to form a ring light source and provide appropriate illumination intensity for the sample. Software was self-developed to control the functionality of the dual-band spectrometers, set spectrometer parameters, acquire and process Vis/NIR spectra and display the prediction results in real time. In order to obtain a robust and accurate prediction model, fresh longissimus dorsi meat was bought and placed in the refrigerator for 12 days to get pork samples with different freshness degrees. Besides, pork meat from three different parts including longissimus dorsi, haunch and lean meat was collected for the determination of IF, protein and water to make the reference values have a wider distribution range. After acquisition of Vis/NIR spectra, data from 400-1100 nm were pretreated with a Savitzky-Golay (S-G) filter and standard normal variables transform (SNVT), and spectrum data from 940-1650 nm were preprocessed with SNVT. Anomalous samples were eliminated by a Monte Carlo method based on model cluster analysis, and then partial least square regression (PLSR) models based on single band (400-1100 nm or 940-1650 nm) and dual-band were established and compared. The results showed the optimal models for each parameter were built with correlation coefficients in prediction set of 0.9101, 0.9121, 0.8873, 0.9094, 0.9378, 0.9348, 0.9342 and 0.8882, respectively. It indicated this innovative and practical device can be a promising technology for nondestructive, fast and accurate detection of nutritional parameters in meat.

  2. Decay extent evaluation of wood degraded by a fungal community using NIRS: application for ecological engineering structures used for natural hazard mitigation

    NASA Astrophysics Data System (ADS)

    Baptiste Barré, Jean; Bourrier, Franck; Bertrand, David; Rey, Freddy

    2015-04-01

    Ecological engineering corresponds to the design of efficient solutions for protection against natural hazards such as shallow landslides and soil erosion. In particular, bioengineering structures can be composed of a living part, made of plants, cuttings or seeds, and an inert part, a timber logs structure. As wood is not treated by preservatives, fungal degradation can occur from the start of the construction. It results in wood strength loss, which practitioners try to evaluate with non-destructive tools (NDT). Classical NDT are mainly based on density measurements. However, the fungal activity reduces the mechanical properties (modulus of elasticity - MOE) well before a density change could be measured. In this context, it would be useful to provide a tool for assessing the residual mechanical strength at different decay stages due to a fungal community. Near-infrared spectroscopy (NIRS) can be used for that purpose, as it can allow evaluating wood mechanical properties as well as wood chemical changes due to brown and white rots. We monitored 160 silver fir samples (30x30x6000mm) from green state to different levels of decay. The degradation process took place in a greenhouse and samples were inoculated with silver fir decayed debris in order to accelerate the process. For each sample, we calculated the normalized bending modulus of elasticity loss (Dw moe) and defined it as decay extent. Near infrared spectra collected from both green and decayed ground samples were corrected by the subtraction of baseline offset. Spectra of green samples were averaged into one mean spectrum and decayed spectra were subtracted from the mean spectrum to calculate the absorption loss. Partial least square regression (PLSR) has been performed between the normalized MOE loss Dw moe (0 < Dw moe < 1) and the absorption loss, with a correlation coefficient R² equal to 0.85. Finally, the prediction of silver fir biodegradation rate by NIRS was significant (RMSEP = 0.13). This tool improves the evaluation accuracy of wood decay extent in the context of ecological engineering structures used for natural hazard mitigation.

  3. Perception and Modeling of Affective Qualities of Musical Instrument Sounds across Pitch Registers.

    PubMed

    McAdams, Stephen; Douglas, Chelsea; Vempala, Naresh N

    2017-01-01

    Composers often pick specific instruments to convey a given emotional tone in their music, partly due to their expressive possibilities, but also due to their timbres in specific registers and at given dynamic markings. Of interest to both music psychology and music informatics from a computational point of view is the relation between the acoustic properties that give rise to the timbre at a given pitch and the perceived emotional quality of the tone. Musician and nonmusician listeners were presented with 137 tones produced at a fixed dynamic marking (forte) playing tones at pitch class D# across each instrument's entire pitch range and with different playing techniques for standard orchestral instruments drawn from the brass, woodwind, string, and pitched percussion families. They rated each tone on six analogical-categorical scales in terms of emotional valence (positive/negative and pleasant/unpleasant), energy arousal (awake/tired), tension arousal (excited/calm), preference (like/dislike), and familiarity. Linear mixed models revealed interactive effects of musical training, instrument family, and pitch register, with non-linear relations between pitch register and several dependent variables. Twenty-three audio descriptors from the Timbre Toolbox were computed for each sound and analyzed in two ways: linear partial least squares regression (PLSR) and nonlinear artificial neural net modeling. These two analyses converged in terms of the importance of various spectral, temporal, and spectrotemporal audio descriptors in explaining the emotion ratings, but some differences also emerged. Different combinations of audio descriptors make major contributions to the three emotion dimensions, suggesting that they are carried by distinct acoustic properties. Valence is more positive with lower spectral slopes, a greater emergence of strong partials, and an amplitude envelope with a sharper attack and earlier decay. Higher tension arousal is carried by brighter sounds, more spectral variation and more gentle attacks. Greater energy arousal is associated with brighter sounds, with higher spectral centroids and slower decrease of the spectral slope, as well as with greater spectral emergence. The divergences between linear and nonlinear approaches are discussed.

  4. Seasonal variability in the persistence of dissolved environmental DNA (eDNA) in a marine system: The role of microbial nutrient limitation.

    PubMed

    Salter, Ian

    2018-01-01

    Environmental DNA (eDNA) can be defined as the DNA pool recovered from an environmental sample that includes both extracellular and intracellular DNA. There has been a significant increase in the number of recent studies that have demonstrated the possibility to detect macroorganisms using eDNA. Despite the enormous potential of eDNA to serve as a biomonitoring and conservation tool in aquatic systems, there remain some important limitations concerning its application. One significant factor is the variable persistence of eDNA over natural environmental gradients, which imposes a critical constraint on the temporal and spatial scales of species detection. In the present study, a radiotracer bioassay approach was used to quantify the kinetic parameters of dissolved eDNA (d-eDNA), a component of extracellular DNA, over an annual cycle in the coastal Northwest Mediterranean. Significant seasonal variability in the biological uptake and turnover of d-eDNA was observed, the latter ranging from several hours to over one month. Maximum uptake rates of d-eDNA occurred in summer during a period of intense phosphate limitation (turnover <5 hrs). Corresponding increases in bacterial production and uptake of adenosine triphosphate (ATP) demonstrated the microbial utilization of d-eDNA as an organic phosphorus substrate. Higher temperatures during summer may amplify this effect through a general enhancement of microbial metabolism. A partial least squares regression (PLSR) model was able to reproduce the seasonal cycle in d-eDNA persistence and explained 60% of the variance in the observations. Rapid phosphate turnover and low concentrations of bioavailable phosphate, both indicative of phosphate limitation, were the most important parameters in the model. Abiotic factors such as pH, salinity and oxygen exerted minimal influence. The present study demonstrates significant seasonal variability in the persistence of d-eDNA in a natural marine environment that can be linked to the metabolic response of microbial communities to nutrient limitation. Future studies should consider the effect of natural environmental gradients on the seasonal persistence of eDNA, which will be of particular relevance for time-series biomonitoring programs.

  5. Seasonal variability in the persistence of dissolved environmental DNA (eDNA) in a marine system: The role of microbial nutrient limitation

    PubMed Central

    2018-01-01

    Environmental DNA (eDNA) can be defined as the DNA pool recovered from an environmental sample that includes both extracellular and intracellular DNA. There has been a significant increase in the number of recent studies that have demonstrated the possibility to detect macroorganisms using eDNA. Despite the enormous potential of eDNA to serve as a biomonitoring and conservation tool in aquatic systems, there remain some important limitations concerning its application. One significant factor is the variable persistence of eDNA over natural environmental gradients, which imposes a critical constraint on the temporal and spatial scales of species detection. In the present study, a radiotracer bioassay approach was used to quantify the kinetic parameters of dissolved eDNA (d-eDNA), a component of extracellular DNA, over an annual cycle in the coastal Northwest Mediterranean. Significant seasonal variability in the biological uptake and turnover of d-eDNA was observed, the latter ranging from several hours to over one month. Maximum uptake rates of d-eDNA occurred in summer during a period of intense phosphate limitation (turnover <5 hrs). Corresponding increases in bacterial production and uptake of adenosine triphosphate (ATP) demonstrated the microbial utilization of d-eDNA as an organic phosphorus substrate. Higher temperatures during summer may amplify this effect through a general enhancement of microbial metabolism. A partial least squares regression (PLSR) model was able to reproduce the seasonal cycle in d-eDNA persistence and explained 60% of the variance in the observations. Rapid phosphate turnover and low concentrations of bioavailable phosphate, both indicative of phosphate limitation, were the most important parameters in the model. Abiotic factors such as pH, salinity and oxygen exerted minimal influence. The present study demonstrates significant seasonal variability in the persistence of d-eDNA in a natural marine environment that can be linked to the metabolic response of microbial communities to nutrient limitation. Future studies should consider the effect of natural environmental gradients on the seasonal persistence of eDNA, which will be of particular relevance for time-series biomonitoring programs. PMID:29474423

  6. Perception and Modeling of Affective Qualities of Musical Instrument Sounds across Pitch Registers

    PubMed Central

    McAdams, Stephen; Douglas, Chelsea; Vempala, Naresh N.

    2017-01-01

    Composers often pick specific instruments to convey a given emotional tone in their music, partly due to their expressive possibilities, but also due to their timbres in specific registers and at given dynamic markings. Of interest to both music psychology and music informatics from a computational point of view is the relation between the acoustic properties that give rise to the timbre at a given pitch and the perceived emotional quality of the tone. Musician and nonmusician listeners were presented with 137 tones produced at a fixed dynamic marking (forte) playing tones at pitch class D# across each instrument's entire pitch range and with different playing techniques for standard orchestral instruments drawn from the brass, woodwind, string, and pitched percussion families. They rated each tone on six analogical-categorical scales in terms of emotional valence (positive/negative and pleasant/unpleasant), energy arousal (awake/tired), tension arousal (excited/calm), preference (like/dislike), and familiarity. Linear mixed models revealed interactive effects of musical training, instrument family, and pitch register, with non-linear relations between pitch register and several dependent variables. Twenty-three audio descriptors from the Timbre Toolbox were computed for each sound and analyzed in two ways: linear partial least squares regression (PLSR) and nonlinear artificial neural net modeling. These two analyses converged in terms of the importance of various spectral, temporal, and spectrotemporal audio descriptors in explaining the emotion ratings, but some differences also emerged. Different combinations of audio descriptors make major contributions to the three emotion dimensions, suggesting that they are carried by distinct acoustic properties. Valence is more positive with lower spectral slopes, a greater emergence of strong partials, and an amplitude envelope with a sharper attack and earlier decay. Higher tension arousal is carried by brighter sounds, more spectral variation and more gentle attacks. Greater energy arousal is associated with brighter sounds, with higher spectral centroids and slower decrease of the spectral slope, as well as with greater spectral emergence. The divergences between linear and nonlinear approaches are discussed. PMID:28228741

  7. Detecting outliers when fitting data with nonlinear regression – a new method based on robust nonlinear regression and the false discovery rate

    PubMed Central

    Motulsky, Harvey J; Brown, Ronald E

    2006-01-01

    Background: Nonlinear regression, like linear regression, assumes that the scatter of data around the ideal curve follows a Gaussian or normal distribution. This assumption leads to the familiar goal of regression: to minimize the sum of the squares of the vertical or Y-value distances between the points and the curve. Outliers can dominate the sum-of-the-squares calculation, and lead to misleading results. However, we know of no practical method for routinely identifying outliers when fitting curves with nonlinear regression. Results: We describe a new method for identifying outliers when fitting data with nonlinear regression. We first fit the data using a robust form of nonlinear regression, based on the assumption that scatter follows a Lorentzian distribution. We devised a new adaptive method that gradually becomes more robust as the method proceeds. To define outliers, we adapted the false discovery rate approach to handling multiple comparisons. We then remove the outliers, and analyze the data using ordinary least-squares regression. Because the method combines robust regression and outlier removal, we call it the ROUT method. When analyzing simulated data, where all scatter is Gaussian, our method detects (falsely) one or more outliers in only about 1–3% of experiments. When analyzing data contaminated with one or several outliers, the ROUT method performs well at outlier identification, with an average False Discovery Rate less than 1%. Conclusion: Our method, which combines a new method of robust nonlinear regression with a new method of outlier identification, identifies outliers from nonlinear curve fits with reasonable power and few false positives. PMID:16526949

  8. Ridge: a computer program for calculating ridge regression estimates

    Treesearch

    Donald E. Hilt; Donald W. Seegrist

    1977-01-01

    Least-squares coefficients for multiple-regression models may be unstable when the independent variables are highly correlated. Ridge regression is a biased estimation procedure that produces stable estimates of the coefficients. Ridge regression is discussed, and a computer program for calculating the ridge coefficients is presented.
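
    The stabilizing effect of ridge regression described in the record above can be sketched in a few lines. This is an illustrative sketch using the standard ridge formula on standardized predictors, not a reconstruction of the program in the record; the function name `ridge_coefficients`, the ridge constant k = 1.0 and the synthetic data are assumptions.

```python
import numpy as np

def ridge_coefficients(X, y, k):
    """Ridge estimate (Xs'Xs + k I)^{-1} Xs'yc on standardized X and centered y."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize each predictor
    yc = y - y.mean()                           # center the response
    p = Xs.shape[1]
    return np.linalg.solve(Xs.T @ Xs + k * np.eye(p), Xs.T @ yc)

# Two nearly collinear predictors (assumption): x2 is x1 plus a little noise.
rng = np.random.default_rng(0)
n = 50
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)
X = np.column_stack([x1, x2])
y = x1 + x2 + rng.normal(scale=0.5, size=n)

beta_ols = ridge_coefficients(X, y, 0.0)    # k = 0 reduces to least squares
beta_ridge = ridge_coefficients(X, y, 1.0)  # k > 0 shrinks the coefficients
print(beta_ols, beta_ridge)
```

    Comparing the two coefficient vectors shows the trade-off the record describes: the ridge estimate is biased, but it does not swing wildly when the predictors are highly correlated.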

  9. An improved partial least-squares regression method for Raman spectroscopy

    NASA Astrophysics Data System (ADS)

    Momenpour Tehran Monfared, Ali; Anis, Hanan

    2017-10-01

    It is known that the performance of partial least-squares (PLS) regression analysis can be improved using the backward variable selection method (BVSPLS). In this paper, we further improve the BVSPLS based on a novel selection mechanism. The proposed method is based on sorting the weighted regression coefficients, and then the importance of each variable of the sorted list is evaluated using the root mean square error of prediction (RMSEP) criterion in each iteration step. Our Improved BVSPLS (IBVSPLS) method has been applied to leukemia and heparin data sets and led to an improvement in limit of detection of Raman biosensing ranging from 10% to 43% compared to PLS. Our IBVSPLS was also compared to the jack-knifing (simpler) and Genetic Algorithm (more complex) methods. Our method was consistently better than the jack-knifing method and showed either a similar or a better performance compared to the genetic algorithm.

  10. Exact Analysis of Squared Cross-Validity Coefficient in Predictive Regression Models

    ERIC Educational Resources Information Center

    Shieh, Gwowen

    2009-01-01

    In regression analysis, the notion of population validity is of theoretical interest for describing the usefulness of the underlying regression model, whereas the presumably more important concept of population cross-validity represents the predictive effectiveness for the regression equation in future research. It appears that the inference…

  11. Spatial Autocorrelation Approaches to Testing Residuals from Least Squares Regression.

    PubMed

    Chen, Yanguang

    2016-01-01

    In geo-statistics, the Durbin-Watson test is frequently employed to detect the presence of residual serial correlation from least squares regression analyses. However, the Durbin-Watson statistic is only suitable for ordered time or spatial series. If the variables comprise cross-sectional data coming from spatial random sampling, the test will be ineffectual because the value of Durbin-Watson's statistic depends on the sequence of data points. This paper develops two new statistics for testing serial correlation of residuals from least squares regression based on spatial samples. By analogy with the new form of Moran's index, an autocorrelation coefficient is defined with a standardized residual vector and a normalized spatial weight matrix. Then by analogy with the Durbin-Watson statistic, two types of new serial correlation indices are constructed. As a case study, the two newly presented statistics are applied to a spatial sample of 29 China's regions. These results show that the new spatial autocorrelation models can be used to test the serial correlation of residuals from regression analysis. In practice, the new statistics can make up for the deficiencies of the Durbin-Watson test.
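
    The order dependence mentioned in this abstract can be seen with a small sketch. It computes only the classical Durbin-Watson statistic, not the new spatial statistics the paper proposes, and the residual values are hypothetical.

```python
def durbin_watson(residuals):
    """Classical Durbin-Watson statistic: sum((e[t] - e[t-1])**2) / sum(e[t]**2)."""
    num = sum((b - a) ** 2 for a, b in zip(residuals, residuals[1:]))
    den = sum(e * e for e in residuals)
    return num / den


# Hypothetical residuals; the second list holds the same values in another order.
residuals = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
reordered = [-2.0, 2.0, -1.0, 1.0, -0.5, 0.5]

dw_sorted = durbin_watson(residuals)
dw_reordered = durbin_watson(reordered)
print(dw_sorted, dw_reordered)
```

    Because cross-sectional spatial samples have no natural ordering, the same set of residuals can yield values on either side of 2, which is the difficulty the abstract describes.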

  12. Estimating current and future streamflow characteristics at ungaged sites, central and eastern Montana, with application to evaluating effects of climate change on fish populations

    USGS Publications Warehouse

    Sando, Roy; Chase, Katherine J.

    2017-03-23

    A common statistical procedure for estimating streamflow statistics at ungaged locations is to develop a relational model between streamflow and drainage basin characteristics at gaged locations using least squares regression analysis; however, least squares regression methods are parametric and make constraining assumptions about the data distribution. The random forest regression method provides an alternative nonparametric method for estimating streamflow characteristics at ungaged sites and requires that the data meet fewer statistical conditions than least squares regression methods. Random forest regression analysis was used to develop predictive models for 89 streamflow characteristics using Precipitation-Runoff Modeling System simulated streamflow data and drainage basin characteristics at 179 sites in central and eastern Montana. The predictive models were developed from streamflow data simulated for current (baseline, water years 1982–99) conditions and three future periods (water years 2021–38, 2046–63, and 2071–88) under three different climate-change scenarios. These predictive models were then used to predict streamflow characteristics for baseline conditions and three future periods at 1,707 fish sampling sites in central and eastern Montana. The average root mean square error for all predictive models was about 50 percent. When streamflow predictions at 23 fish sampling sites were compared to nearby locations with simulated data, the mean relative percent difference was about 43 percent. When predictions were compared to streamflow data recorded at 21 U.S. Geological Survey streamflow-gaging stations outside of the calibration basins, the average mean absolute percent error was about 73 percent.

  13. Comparison of Logistic Regression and Artificial Neural Network in Low Back Pain Prediction: Second National Health Survey

    PubMed Central

    Parsaeian, M; Mohammad, K; Mahmoudi, M; Zeraati, H

    2012-01-01

    Background: The purpose of this investigation was to compare empirically the predictive ability of an artificial neural network with that of a logistic regression in the prediction of low back pain. Methods: Data from the second national health survey were considered in this investigation. These data include information on low back pain and its associated risk factors among Iranian people aged 15 years and older. Artificial neural network and logistic regression models were developed using a set of 17294 data and they were validated in a test set of 17295 data. The Hosmer and Lemeshow recommendation for model selection was used in fitting the logistic regression. A three-layer perceptron with 9 inputs, 3 hidden and 1 output neurons was employed. The efficiency of the two models was compared by receiver operating characteristic analysis, root mean square and -2 Loglikelihood criteria. Results: The area under the ROC curve (SE), root mean square and -2 Loglikelihood of the logistic regression were 0.752 (0.004), 0.3832 and 14769.2, respectively. The area under the ROC curve (SE), root mean square and -2 Loglikelihood of the artificial neural network were 0.754 (0.004), 0.3770 and 14757.6, respectively. Conclusions: Based on these three criteria, the artificial neural network would give better performance than logistic regression. Although the difference is statistically significant, it does not seem to be clinically significant. PMID:23113198

  14. Comparison of logistic regression and artificial neural network in low back pain prediction: second national health survey.

    PubMed

    Parsaeian, M; Mohammad, K; Mahmoudi, M; Zeraati, H

    2012-01-01

    The purpose of this investigation was to compare empirically the predictive ability of an artificial neural network with that of a logistic regression in the prediction of low back pain. Data from the second national health survey were considered in this investigation. These data include information on low back pain and its associated risk factors among Iranian people aged 15 years and older. Artificial neural network and logistic regression models were developed using a set of 17294 data and they were validated in a test set of 17295 data. The Hosmer and Lemeshow recommendation for model selection was used in fitting the logistic regression. A three-layer perceptron with 9 inputs, 3 hidden and 1 output neurons was employed. The efficiency of the two models was compared by receiver operating characteristic analysis, root mean square and -2 Loglikelihood criteria. The area under the ROC curve (SE), root mean square and -2 Loglikelihood of the logistic regression were 0.752 (0.004), 0.3832 and 14769.2, respectively. The area under the ROC curve (SE), root mean square and -2 Loglikelihood of the artificial neural network were 0.754 (0.004), 0.3770 and 14757.6, respectively. Based on these three criteria, the artificial neural network would give better performance than logistic regression. Although the difference is statistically significant, it does not seem to be clinically significant.

  15. A Model Comparison for Count Data with a Positively Skewed Distribution with an Application to the Number of University Mathematics Courses Completed

    ERIC Educational Resources Information Center

    Liou, Pey-Yan

    2009-01-01

    The current study examines three regression models: OLS (ordinary least square) linear regression, Poisson regression, and negative binomial regression for analyzing count data. Simulation results show that the OLS regression model performed better than the others, since it did not produce more false statistically significant relationships than…

  16. Using Quantile and Asymmetric Least Squares Regression for Optimal Risk Adjustment.

    PubMed

    Lorenz, Normann

    2017-06-01

    In this paper, we analyze optimal risk adjustment for direct risk selection (DRS). Integrating insurers' activities for risk selection into a discrete choice model of individuals' health insurance choice shows that DRS has the structure of a contest. For the contest success function (csf) used in most of the contest literature (the Tullock-csf), optimal transfers for a risk adjustment scheme have to be determined by means of a restricted quantile regression, irrespective of whether insurers are primarily engaged in positive DRS (attracting low risks) or negative DRS (repelling high risks). This is at odds with the common practice of determining transfers by means of a least squares regression. However, this common practice can be rationalized for a new csf, but only if positive and negative DRSs are equally important; if they are not, optimal transfers have to be calculated by means of a restricted asymmetric least squares regression. Using data from German and Swiss health insurers, we find considerable differences between the three types of regressions. Optimal transfers therefore critically depend on which csf represents insurers' incentives for DRS and, if it is not the Tullock-csf, whether insurers are primarily engaged in positive or negative DRS. Copyright © 2016 John Wiley & Sons, Ltd.

  17. Examination of Parameters Affecting the House Prices by Multiple Regression Analysis and its Contributions to Earthquake-Based Urban Transformation

    NASA Astrophysics Data System (ADS)

    Denli, H. H.; Durmus, B.

    2016-12-01

    The purpose of this study is to examine the factors that may affect apartment prices using multiple linear regression models and to visualize the results with value maps. The study focuses on a county of Istanbul, Turkey. In total, 390 apartments around the county of Umraniye are evaluated according to their physical and locational conditions. Identifying the factors affecting apartment prices in the county, which has a population of approximately 600k, is expected to contribute significantly to the apartment market. The physical factors selected are the age, number of rooms, size, number of floors of the building and the floor on which the apartment is located. The positional factors selected are the distances to the nearest hospital, school, park and police station. In total, ten physical and locational parameters are examined by regression analysis. After the regression analysis, value maps are composed from the parameters age, price and price per square meter. The most significant of the composed maps is the price per square meter map. Results show that the location of the apartment has the most influence on its price per square meter. A different practice is developed from the composed maps by exploring the use of the price per square meter map in urban transformation. By marking the buildings older than 15 years on the price per square meter map, a new interpretation is made to determine which buildings should be given priority during an urban transformation in the county. This county is very close to the North Anatolian Fault zone and is under the threat of earthquakes. By marking the apartments older than 15 years on the price per square meter map, a list of apartments that are both old and expensive per square meter can be gathered. With the help of this list, priority could be given to the selected higher-valued old apartments to support the economy of the country during an earthquake loss. We may call this earthquake-based urban transformation.

  18. A weighted least squares estimation of the polynomial regression model on paddy production in the area of Kedah and Perlis

    NASA Astrophysics Data System (ADS)

    Musa, Rosliza; Ali, Zalila; Baharum, Adam; Nor, Norlida Mohd

    2017-08-01

    The linear regression model assumes that all random error components are identically and independently distributed with constant variance. Hence, each data point provides equally precise information about the deterministic part of the total variation. In other words, the standard deviations of the error terms are constant over all values of the predictor variables. When the assumption of constant variance is violated, the ordinary least squares estimator of the regression coefficients loses its property of minimum variance in the class of linear and unbiased estimators. Weighted least squares estimation is often used to maximize the efficiency of parameter estimation. A procedure that treats all of the data equally would give less precisely measured points more influence than they should have and would give highly precise points too little influence. Optimizing the weighted fitting criterion to find the parameter estimates allows the weights to determine the contribution of each observation to the final parameter estimates. This study used a polynomial model with weighted least squares estimation to investigate paddy production of different paddy lots based on paddy cultivation characteristics and environmental characteristics in the area of Kedah and Perlis. The results indicated that the factors affecting paddy production are mixture fertilizer application cycle, average temperature, the squared effect of average rainfall, the squared effect of pest and disease, the interaction between acreage and amount of mixture fertilizer, the interaction between paddy variety and NPK fertilizer application cycle, and the interaction between pest and disease and NPK fertilizer application cycle.
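
    The weighting argument in this abstract can be sketched for the simplest case, a straight line. The study itself used a polynomial model; the function name, the data, and the assumed standard deviations below are hypothetical, with weights taken as the reciprocal of the assumed variances.

```python
def weighted_line_fit(x, y, w):
    """Return (a, b) minimizing sum(w_i * (y_i - a - b * x_i) ** 2)."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    b = sxy / sxx
    a = ybar - b * xbar
    return a, b


# Hypothetical data near y = 2x; the last point is measured far less precisely.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 14.0]
sd = [0.1, 0.1, 0.1, 0.1, 2.0]          # assumed error standard deviations
weights = [1.0 / s ** 2 for s in sd]    # weight = 1 / variance

a_ols, b_ols = weighted_line_fit(x, y, [1.0] * len(x))  # equal weights = OLS
a_wls, b_wls = weighted_line_fit(x, y, weights)
print("OLS slope:", b_ols, "WLS slope:", b_wls)
```

    With equal weights the imprecise fifth point pulls the slope upward; down-weighting it lets the four precise points determine the fit.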

  19. Model Estimation Using Ridge Regression with the Variance Normalization Criterion. Interim Report No. 2. The Education and Inequality in Canada Project.

    ERIC Educational Resources Information Center

    Lee, Wan-Fung; Bulcock, Jeffrey Wilson

    The purposes of this study are: (1) to demonstrate the superiority of simple ridge regression over ordinary least squares regression through theoretical argument and empirical example; (2) to modify ridge regression through use of the variance normalization criterion; and (3) to demonstrate the superiority of simple ridge regression based on the…

  20. Determining Sample Size for Accurate Estimation of the Squared Multiple Correlation Coefficient.

    ERIC Educational Resources Information Center

    Algina, James; Olejnik, Stephen

    2000-01-01

    Discusses determining sample size for estimation of the squared multiple correlation coefficient and presents regression equations that permit determination of the sample size for estimating this parameter for up to 20 predictor variables. (SLD)

  1. Functional Relationships and Regression Analysis.

    ERIC Educational Resources Information Center

    Preece, Peter F. W.

    1978-01-01

    Using a degenerate multivariate normal model for the distribution of organismic variables, the form of least-squares regression analysis required to estimate a linear functional relationship between variables is derived. It is suggested that the two conventional regression lines may be considered to describe functional, not merely statistical,…

  2. Tobacco alkaloids reduction by casings added/enzymatic hydrolysis treatments assessed through PLSR analysis.

    PubMed

    Lin, Shunshun; Zhang, Xiaoming; Song, Shiqing; Hayat, Khizar; Eric, Karangwa; Majeed, Hamid

    2016-03-01

    Based on encouraged development of potential reduced-exposure products (PREPs) by the US Institute of Medicine, casings (glucose and peptides) added treatments (CAT) and enzymatic (protease and xylanase) hydrolysis treatments (EHT) were developed to study their effect on alkaloids reduction in tobacco and cigarette mainstream smoke (MS) and further investigate the correlation between sensory attributes and alkaloids. Results showed that the developed treatments reduced nicotine by 14.5% and 24.4% in tobacco and cigarette MS, respectively, indicating that both CAT and EHT are potentially effective for developing lower-risk cigarettes. Sensory and electronic nose analysis confirmed the significant influence of treatments on sensory and cigarette MS components. PLSR analysis demonstrated that tobacco alkaloids were positively correlated to the off-taste, irritation and impact attributes, and negatively correlated to the aroma and softness attributes. Additionally, nicotine and anabasine from tobacco leaves positively contributed to the impact attribute, while they negatively contributed to the aroma attribute (P<0.05). Meanwhile, most alkaloids in cigarette MS positively contributed to the impact and irritation attributes (P<0.05). Hence, this study paved a way to better understand the correlation between tobacco alkaloids and sensory attributes. Copyright © 2015 Elsevier Inc. All rights reserved.

  3. Weighted regression analysis and interval estimators

    Treesearch

    Donald W. Seegrist

    1974-01-01

    A method for deriving the weighted least squares estimators for the parameters of a multiple regression model. Confidence intervals for expected values, and prediction intervals for the means of future samples are given.

  4. Spatial Autocorrelation Approaches to Testing Residuals from Least Squares Regression

    PubMed Central

    Chen, Yanguang

    2016-01-01

    In geo-statistics, the Durbin-Watson test is frequently employed to detect the presence of residual serial correlation from least squares regression analyses. However, the Durbin-Watson statistic is only suitable for ordered time or spatial series. If the variables comprise cross-sectional data coming from spatial random sampling, the test will be ineffectual because the value of Durbin-Watson’s statistic depends on the sequence of data points. This paper develops two new statistics for testing serial correlation of residuals from least squares regression based on spatial samples. By analogy with the new form of Moran’s index, an autocorrelation coefficient is defined with a standardized residual vector and a normalized spatial weight matrix. Then by analogy with the Durbin-Watson statistic, two types of new serial correlation indices are constructed. As a case study, the two newly presented statistics are applied to a spatial sample of 29 China’s regions. These results show that the new spatial autocorrelation models can be used to test the serial correlation of residuals from regression analysis. In practice, the new statistics can make up for the deficiencies of the Durbin-Watson test. PMID:26800271

  5. Membrane Introduction Mass Spectrometry Combined with an Orthogonal Partial-Least Squares Calibration Model for Mixture Analysis.

    PubMed

    Li, Min; Zhang, Lu; Yao, Xiaolong; Jiang, Xingyu

    2017-01-01

    The emerging membrane introduction mass spectrometry technique has been successfully used to detect benzene, toluene, ethyl benzene and xylene (BTEX), while overlapped spectra have unfortunately hindered its further application to the analysis of mixtures. Multivariate calibration, an efficient method to analyze mixtures, has been widely applied. In this paper, we compared univariate and multivariate analyses for quantification of the individual components of mixture samples. The results showed that the univariate analysis creates poor models with regression coefficients of 0.912, 0.867, 0.440 and 0.351 for BTEX, respectively. For multivariate analysis, a comparison to the partial-least squares (PLS) model shows that the orthogonal partial-least squares (OPLS) regression exhibits an optimal performance with regression coefficients of 0.995, 0.999, 0.980 and 0.976, favorable calibration parameters (RMSEC and RMSECV) and a favorable validation parameter (RMSEP). Furthermore, the OPLS exhibits a good recovery of 73.86 - 122.20% and relative standard deviation (RSD) of the repeatability of 1.14 - 4.87%. Thus, MIMS coupled with the OPLS regression provides an optimal approach for a quantitative BTEX mixture analysis in monitoring and predicting water pollution.

  6. Incorporation of prior information on parameters into nonlinear regression groundwater flow models: 2. Applications

    USGS Publications Warehouse

    Cooley, Richard L.

    1983-01-01

    This paper investigates factors influencing the degree of improvement in estimates of parameters of a nonlinear regression groundwater flow model by incorporating prior information of unknown reliability. Consideration of expected behavior of the regression solutions and results of a hypothetical modeling problem lead to several general conclusions. First, if the parameters are properly scaled, linearized expressions for the mean square error (MSE) in parameter estimates of a nonlinear model will often behave very nearly as if the model were linear. Second, by using prior information, the MSE in properly scaled parameters can be reduced greatly over the MSE of ordinary least squares estimates of parameters. Third, plots of estimated MSE and the estimated standard deviation of MSE versus an auxiliary parameter (the ridge parameter) specifying the degree of influence of the prior information on regression results can help determine the potential for improvement of parameter estimates. Fourth, proposed criteria can be used to make appropriate choices for the ridge parameter and another parameter expressing degree of overall bias in the prior information. Results of a case study of Truckee Meadows, Reno-Sparks area, Washoe County, Nevada, conform closely to the results of the hypothetical problem. In the Truckee Meadows case, incorporation of prior information did not greatly change the parameter estimates from those obtained by ordinary least squares. However, the analysis showed that both sets of estimates are more reliable than suggested by the standard errors from ordinary least squares.

  7. Applying Regression Analysis to Problems in Institutional Research.

    ERIC Educational Resources Information Center

    Bohannon, Tom R.

    1988-01-01

    Regression analysis is one of the most frequently used statistical techniques in institutional research. Principles of least squares, model building, residual analysis, influence statistics, and multi-collinearity are described and illustrated. (Author/MSE)

  8. On the calibration process of film dosimetry: OLS inverse regression versus WLS inverse prediction.

    PubMed

    Crop, F; Van Rompaye, B; Paelinck, L; Vakaet, L; Thierens, H; De Wagter, C

    2008-07-21

    The purpose of this study was both putting forward a statistically correct model for film calibration and the optimization of this process. A reliable calibration is needed in order to perform accurate reference dosimetry with radiographic (Gafchromic) film. Sometimes, an ordinary least squares simple linear (in the parameters) regression is applied to the dose-optical-density (OD) curve with the dose as a function of OD (inverse regression) or sometimes OD as a function of dose (inverse prediction). The application of a simple linear regression fit is an invalid method because heteroscedasticity of the data is not taken into account. This could lead to erroneous results originating from the calibration process itself and thus to a lower accuracy. In this work, we compare the ordinary least squares (OLS) inverse regression method with the correct weighted least squares (WLS) inverse prediction method to create calibration curves. We found that the OLS inverse regression method could lead to a prediction bias of up to 7.3 cGy at 300 cGy and total prediction errors of 3% or more for Gafchromic EBT film. Application of the WLS inverse prediction method resulted in a maximum prediction bias of 1.4 cGy and total prediction errors below 2% in a 0-400 cGy range. We developed a Monte-Carlo-based process to optimize calibrations, depending on the needs of the experiment. This type of thorough analysis can lead to a higher accuracy for film dosimetry.

  9. Who Will Win?: Predicting the Presidential Election Using Linear Regression

    ERIC Educational Resources Information Center

    Lamb, John H.

    2007-01-01

    This article outlines a linear regression activity that engages learners, uses technology, and fosters cooperation. Students generated least-squares linear regression equations using TI-83 Plus[TM] graphing calculators, Microsoft[C] Excel, and paper-and-pencil calculations using derived normal equations to predict the 2004 presidential election.…

  10. The Variance Normalization Method of Ridge Regression Analysis.

    ERIC Educational Resources Information Center

    Bulcock, J. W.; And Others

    The testing of contemporary sociological theory often calls for the application of structural-equation models to data which are inherently collinear. It is shown that simple ridge regression, which is commonly used for controlling the instability of ordinary least squares regression estimates in ill-conditioned data sets, is not a legitimate…

  11. A graphical method to evaluate spectral preprocessing in multivariate regression calibrations: example with Savitzky-Golay filters and partial least squares regression

    USDA-ARS's Scientific Manuscript database

    In multivariate regression analysis of spectroscopy data, spectral preprocessing is often performed to reduce unwanted background information (offsets, sloped baselines) or accentuate absorption features in intrinsically overlapping bands. These procedures, also known as pretreatments, are commonly ...

  12. Neither fixed nor random: weighted least squares meta-regression.

    PubMed

    Stanley, T D; Doucouliagos, Hristos

    2017-03-01

    Our study revisits and challenges two core conventional meta-regression estimators: the prevalent use of 'mixed-effects' or random-effects meta-regression analysis and the correction of standard errors that defines fixed-effects meta-regression analysis (FE-MRA). We show how and explain why an unrestricted weighted least squares MRA (WLS-MRA) estimator is superior to conventional random-effects (or mixed-effects) meta-regression when there is publication (or small-sample) bias that is as good as FE-MRA in all cases and better than fixed effects in most practical applications. Simulations and statistical theory show that WLS-MRA provides satisfactory estimates of meta-regression coefficients that are practically equivalent to mixed effects or random effects when there is no publication bias. When there is publication selection bias, WLS-MRA always has smaller bias than mixed effects or random effects. In practical applications, an unrestricted WLS meta-regression is likely to give practically equivalent or superior estimates to fixed-effects, random-effects, and mixed-effects meta-regression approaches. However, random-effects meta-regression remains viable and perhaps somewhat preferable if selection for statistical significance (publication bias) can be ruled out and when random, additive normal heterogeneity is known to directly affect the 'true' regression coefficient. Copyright © 2016 John Wiley & Sons, Ltd.

  13. Modelling Fourier regression for time series data- a case study: modelling inflation in foods sector in Indonesia

    NASA Astrophysics Data System (ADS)

    Prahutama, Alan; Suparti; Wahyu Utami, Tiani

    2018-03-01

    Regression analysis models the relationship between response variables and predictor variables. The parametric approach to regression relies on strict assumptions about the form of the model, whereas nonparametric regression does not require such assumptions. Time series data are observations of a variable recorded over time, so if time series data are to be modeled by regression, the response and predictor variables must be determined first. The response variable in a time series is the variable at time t (yt), while the predictor variables are its significant lags. In nonparametric regression modeling, one developing approach is the Fourier series approach. One advantage of nonparametric regression using Fourier series is its ability to handle data that follow a trigonometric pattern. Modeling with Fourier series requires a parameter K, whose value can be determined by the Generalized Cross Validation method. In inflation modeling for the transportation, communication and financial services sector, the Fourier series approach yields an optimal K of 120 parameters with an R-square of 99%, whereas multiple linear regression yields an R-square of 90%.

  14. Peak flow regression equations for small, ungaged streams in Maine: Comparing map-based to field-based variables

    USGS Publications Warehouse

    Lombard, Pamela J.; Hodgkins, Glenn A.

    2015-01-01

    Regression equations to estimate peak streamflows with 1- to 500-year recurrence intervals (annual exceedance probabilities from 99 to 0.2 percent, respectively) were developed for small, ungaged streams in Maine. Equations presented here are the best available equations for estimating peak flows at ungaged basins in Maine with drainage areas from 0.3 to 12 square miles (mi²). Previously developed equations continue to be the best available equations for estimating peak flows for basin areas greater than 12 mi². New equations presented here are based on streamflow records at 40 U.S. Geological Survey streamgages with a minimum of 10 years of recorded peak flows between 1963 and 2012. Ordinary least-squares regression techniques were used to determine the best explanatory variables for the regression equations. Traditional map-based explanatory variables were compared to variables requiring field measurements. Two field-based variables—culvert rust lines and bankfull channel widths—either were not commonly found or did not explain enough of the variability in the peak flows to warrant inclusion in the equations. The best explanatory variables were drainage area and percent basin wetlands; values for these variables were determined with a geographic information system. Generalized least-squares regression was used with these two variables to determine the equation coefficients and estimates of accuracy for the final equations.

  15. Online measurement of urea concentration in spent dialysate during hemodialysis.

    PubMed

    Olesberg, Jonathon T; Arnold, Mark A; Flanigan, Michael J

    2004-01-01

    We describe online optical measurements of urea in the effluent dialysate line during regular hemodialysis treatment of several patients. Monitoring urea removal can provide valuable information about dialysis efficiency. Spectral measurements were performed with a Fourier-transform infrared spectrometer equipped with a flow-through cell. Spectra were recorded across the 5000-4000 cm⁻¹ (2.0-2.5 µm) wavelength range at 1-min intervals. Savitzky-Golay filtering was used to remove baseline variations attributable to the temperature dependence of the water absorption spectrum. Urea concentrations were extracted from the filtered spectra by use of partial least-squares regression and the net analyte signal of urea. Urea concentrations predicted by partial least-squares regression matched concentrations obtained from standard chemical assays with a root mean square error of 0.30 mmol/L (0.84 mg/dL urea nitrogen) over an observed concentration range of 0-11 mmol/L. The root mean square error obtained with the net analyte signal of urea was 0.43 mmol/L with a calibration based only on a set of pure-component spectra. The error decreased to 0.23 mmol/L when a slope and offset correction were used. Urea concentrations can be continuously monitored during hemodialysis by near-infrared spectroscopy. Calibrations based on the net analyte signal of urea are particularly appealing because they do not require a training step, as do statistical multivariate calibration procedures such as partial least-squares regression.
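
    One way Savitzky-Golay filtering can suppress baseline variation is through a derivative filter, sketched below. The abstract does not state the window width, polynomial order, or derivative order used, so the 5-point quadratic first-derivative filter and the spectrum values here are assumptions for illustration only.

```python
# 5-point, quadratic-fit Savitzky-Golay first-derivative weights (divide by 10).
SG_DERIV_5 = (-2.0, -1.0, 0.0, 1.0, 2.0)


def sg_first_derivative(signal, spacing=1.0):
    """First derivative at interior points; the two points at each end are dropped."""
    out = []
    for i in range(2, len(signal) - 2):
        window = signal[i - 2:i + 3]
        total = sum(c * v for c, v in zip(SG_DERIV_5, window))
        out.append(total / (10.0 * spacing))
    return out


# Hypothetical absorbance values and the same band on a raised constant baseline.
spectrum = [0.10, 0.12, 0.18, 0.30, 0.45, 0.52, 0.47, 0.33, 0.20, 0.13]
shifted = [v + 0.25 for v in spectrum]

d_plain = sg_first_derivative(spectrum)
d_shifted = sg_first_derivative(shifted)
print(d_plain)
print(d_shifted)
```

    Because the derivative weights sum to zero, a constant offset drops out of the filtered spectrum; a sloped baseline would need a second-derivative filter to be removed in the same way.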

  16. A portable device for rapid nondestructive detection of fresh meat quality

    NASA Astrophysics Data System (ADS)

    Lin, Wan; Peng, Yankun

    2014-05-01

    Quality attributes of fresh meat influence nutritional value and consumers' purchasing decisions. To meet inspection departments' demand for a portable device, a rapid, nondestructive detection device for fresh meat quality was designed based on an ARM (Advanced RISC Machines) processor and VIS/NIR technology. The working principle, hardware composition, software system, and functional tests are introduced. The hardware system consisted of an ARM processing unit, a light source unit, a detection probe unit, a spectral data acquisition unit, an LCD (Liquid Crystal Display) touch screen display unit, a power unit, and a cooling unit. A Linux operating system and an application for acquiring and processing quality parameters were designed. The system integrates spectral signal collection, storage, display, and processing in a device weighing 3.5 kg. Forty pieces of beef were used in an experiment to validate its stability and reliability. The results indicated that the prediction model developed using the PLSR method with SNV pre-processing performed well, with a correlation coefficient and root mean square error for the validation set of 0.90 and 1.56 for L*, 0.95 and 1.74 for a*, 0.94 and 0.59 for b*, 0.88 and 0.13 for pH, 0.79 and 12.46 for tenderness, and 0.89 and 0.91 for water content, respectively. The experimental results show that this device can be a useful tool for detecting meat quality.
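
    The SNV (standard normal variate) pre-processing named in this record can be sketched as follows. This is a generic illustration of the transform, not the device's software: each spectrum is centred on its own mean and scaled by its own standard deviation. The function name and the sample values are hypothetical.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own statistics."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    if s == 0:
        raise ValueError("a constant spectrum cannot be scaled")
    return [(x - m) / s for x in spectrum]


# Hypothetical absorbance values for one spectrum (illustrative only)
raw = [0.52, 0.61, 0.75, 0.70, 0.58]
corrected = snv(raw)

# A multiplicative and additive distortion of the same spectrum gives the same result
distorted = [2.0 * x + 1.0 for x in raw]
corrected_distorted = snv(distorted)
```

    Because the transform removes per-spectrum offset and scale, `corrected` and `corrected_distorted` agree up to rounding, which is why SNV is commonly applied before a PLSR calibration.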

  17. Low-flow, base-flow, and mean-flow regression equations for Pennsylvania streams

    USGS Publications Warehouse

    Stuckey, Marla H.

    2006-01-01

    Low-flow, base-flow, and mean-flow characteristics are an important part of assessing water resources in a watershed. These streamflow characteristics can be used by watershed planners and regulators to determine water availability, water-use allocations, assimilative capacities of streams, and aquatic-habitat needs. Streamflow characteristics are commonly predicted by use of regression equations when a nearby streamflow-gaging station is not available. Regression equations for predicting low-flow, base-flow, and mean-flow characteristics for Pennsylvania streams were developed from data collected at 293 continuous- and partial-record streamflow-gaging stations with flow unaffected by upstream regulation, diversion, or mining. Continuous-record stations used in the regression analysis had 9 years or more of data, and partial-record stations used had seven or more measurements collected during base-flow conditions. The state was divided into five low-flow regions and regional regression equations were developed for the 7-day, 10-year; 7-day, 2-year; 30-day, 10-year; 30-day, 2-year; and 90-day, 10-year low flows using generalized least-squares regression. Statewide regression equations were developed for the 10-year, 25-year, and 50-year base flows using generalized least-squares regression. Statewide regression equations were developed for harmonic mean and mean annual flow using weighted least-squares regression. Basin characteristics found to be significant explanatory variables at the 95-percent confidence level for one or more regression equations were drainage area, basin slope, thickness of soil, stream density, mean annual precipitation, mean elevation, and the percentage of glaciation, carbonate bedrock, forested area, and urban area within a basin. Standard errors of prediction ranged from 33 to 66 percent for the n-day, T-year low flows; 21 to 23 percent for the base flows; and 12 to 38 percent for the mean annual flow and harmonic mean, respectively. 
    The regression equations are not valid in watersheds with upstream regulation, diversions, or mining activities. Watersheds with karst features need close examination as to the applicability of the regression-equation results.

  18. On estimating gravity anomalies - A comparison of least squares collocation with conventional least squares techniques

    NASA Technical Reports Server (NTRS)

    Argentiero, P.; Lowrey, B.

    1977-01-01

    The least squares collocation algorithm for estimating gravity anomalies from geodetic data is shown to be an application of the well known regression equations which provide the mean and covariance of a random vector (gravity anomalies) given a realization of a correlated random vector (geodetic data). It is also shown that the collocation solution for gravity anomalies is equivalent to the conventional least-squares-Stokes' function solution when the conventional solution utilizes properly weighted zero a priori estimates. The mathematical and physical assumptions underlying the least squares collocation estimator are described.

  19. Radon-222 concentrations in ground water and soil gas on Indian reservations in Wisconsin

    USGS Publications Warehouse

    DeWild, John F.; Krohelski, James T.

    1995-01-01

    For sites with wells finished in the sand and gravel aquifer, the coefficient of determination (R2) of the regression of concentration of radon-222 in ground water as a function of well depth is 0.003 and the significance level is 0.32, which indicates that there is not a statistically significant relation between radon-222 concentrations in ground water and well depth. The coefficient of determination of the regression of radon-222 in ground water and soil gas is 0.19 and the root mean square error of the regression line is 271 picocuries per liter. Even though the significance level (0.036) indicates a statistical relation, the root mean square error of the regression is so large that the regression equation would not give reliable predictions. Because of an inadequate number of samples, similar statistical analyses could not be performed for sites with wells finished in the crystalline and sedimentary bedrock aquifers.

  20. Quantum State Tomography via Linear Regression Estimation

    PubMed Central

    Qi, Bo; Hou, Zhibo; Li, Li; Dong, Daoyi; Xiang, Guoyong; Guo, Guangcan

    2013-01-01

    A simple yet efficient state reconstruction algorithm of linear regression estimation (LRE) is presented for quantum state tomography. In this method, quantum state reconstruction is converted into a parameter estimation problem of a linear regression model and the least-squares method is employed to estimate the unknown parameters. An asymptotic mean squared error (MSE) upper bound for all possible states to be estimated is given analytically, which depends explicitly upon the involved measurement bases. This analytical MSE upper bound can guide one to choose optimal measurement sets. The computational complexity of LRE is O(d^4) where d is the dimension of the quantum state. Numerical examples show that LRE is much faster than maximum-likelihood estimation for quantum state tomography. PMID:24336519

  1. Sampling system for wheat (Triticum aestivum L) area estimation using digital LANDSAT MSS data and aerial photographs. [Brazil]

    NASA Technical Reports Server (NTRS)

    Parada, N. D. J. (Principal Investigator); Moreira, M. A.; Chen, S. C.; Batista, G. T.

    1984-01-01

    A procedure to estimate wheat (Triticum aestivum L) area using a sampling technique based on aerial photographs and digital LANDSAT MSS data is developed. Aerial photographs covering 720 square km are visually analyzed. To estimate wheat area, a regression approach is applied using different sample sizes and various sampling units. As the size of the sampling unit decreased, the percentage of sampled area required to obtain similar estimation performance also decreased. The lowest percentage of the area sampled for wheat estimation with relatively high precision and accuracy through regression estimation is 13.90%, using 10 square km as the sampling unit. Wheat area estimation using only aerial photographs is less precise and accurate than that obtained by regression estimation.

  2. Modeling Group Differences in OLS and Orthogonal Regression: Implications for Differential Validity Studies

    ERIC Educational Resources Information Center

    Kane, Michael T.; Mroch, Andrew A.

    2010-01-01

    In evaluating the relationship between two measures across different groups (i.e., in evaluating "differential validity") it is necessary to examine differences in correlation coefficients and in regression lines. Ordinary least squares (OLS) regression is the standard method for fitting lines to data, but its criterion for optimal fit…

  3. Tutorial on Using Regression Models with Count Outcomes Using R

    ERIC Educational Resources Information Center

    Beaujean, A. Alexander; Morgan, Grant B.

    2016-01-01

    Education researchers often study count variables, such as times a student reached a goal, discipline referrals, and absences. Most researchers that study these variables use typical regression methods (i.e., ordinary least-squares) either with or without transforming the count variables. In either case, using typical regression for count data can…

  4. Teaching the Concept of Breakdown Point in Simple Linear Regression.

    ERIC Educational Resources Information Center

    Chan, Wai-Sum

    2001-01-01

    Most introductory textbooks on simple linear regression analysis mention the fact that extreme data points have a great influence on ordinary least-squares regression estimation; however, not many textbooks provide a rigorous mathematical explanation of this phenomenon. Suggests a way to fill this gap by teaching students the concept of breakdown…

  5. Principles of Quantile Regression and an Application

    ERIC Educational Resources Information Center

    Chen, Fang; Chalhoub-Deville, Micheline

    2014-01-01

    Newer statistical procedures are typically introduced to help address the limitations of those already in practice or to deal with emerging research needs. Quantile regression (QR) is introduced in this paper as a relatively new methodology, which is intended to overcome some of the limitations of least squares mean regression (LMR). QR is more…

  6. The concept of psychological regression: metaphors, mapping, Queen Square, and Tavistock Square.

    PubMed

    Mercer, Jean

    2011-05-01

    The term "regression" refers to events in which an individual changes from his or her present level of maturity and regains mental and behavioral characteristics shown at an earlier point in development. This definition has remained constant for over a century, but the implications of the concept have changed systematically from a perspective in which regression was considered pathological, to a current view in which regression may be seen as a positive step in psychotherapy or as a part of normal development. The concept of regression, famously employed by Sigmund Freud and others in his circle, derived from ideas suggested by Herbert Spencer and by John Hughlings Jackson. By the 1940s and '50s, the regression concept was applied by Winnicott and others in treatment of disturbed children and in adult psychotherapy. In addition, behavioral regression came to be seen as a part of a normal developmental trajectory, with a focus on expectable variability. The present article examines historical changes in the regression concept in terms of mapping to biomedical or other metaphors, in terms of a movement from earlier nativism toward an increased environmentalism in psychology, and with respect to other historical factors such as wartime events. The role of dominant metaphors in shifting perspectives on regression is described.

  7. Recursive least squares method of regression coefficients estimation as a special case of Kalman filter

    NASA Astrophysics Data System (ADS)

    Borodachev, S. M.

    2016-06-01

    A simple derivation of the recursive least squares (RLS) method equations is given as a special case of Kalman filter estimation of a constant system state under changing observation conditions. A numerical example illustrates the application of RLS to the multicollinearity problem.
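
    The correspondence described in this record can be sketched in a few lines. The sketch below is not the author's derivation; it assumes the textbook form in which the regression coefficients are a constant Kalman state, each observation row is the measurement matrix, and `r` is the measurement-noise variance. All names and the data are hypothetical.

```python
def rls_update(beta, P, x, y, r=1.0):
    """One RLS step, written as a Kalman measurement update for a constant state.

    beta: current coefficient estimate, P: its covariance (symmetric),
    x: regressor row, y: observed response, r: assumed noise variance.
    """
    n = len(beta)
    Px = [sum(P[i][j] * x[j] for j in range(n)) for i in range(n)]
    s = r + sum(x[i] * Px[i] for i in range(n))          # innovation variance
    gain = [Px[i] / s for i in range(n)]                 # Kalman gain
    innovation = y - sum(x[i] * beta[i] for i in range(n))
    beta = [beta[i] + gain[i] * innovation for i in range(n)]
    # Covariance update P - gain * x^T P (x^T P equals Px because P is symmetric)
    P = [[P[i][j] - gain[i] * Px[j] for j in range(n)] for i in range(n)]
    return beta, P


# Hypothetical data generated from y = 2 + 3 * t
ts = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 + 3.0 * t for t in ts]

beta = [0.0, 0.0]                       # vague prior mean
P = [[1e6, 0.0], [0.0, 1e6]]            # large prior covariance
for t, y in zip(ts, ys):
    beta, P = rls_update(beta, P, [1.0, t], y)

print(beta)
```

    With a large prior covariance the recursion approaches the batch least-squares solution; a smaller prior covariance acts like a ridge penalty, which is one way such a scheme can be applied to nearly collinear regressors.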

  8. Parameter estimation of Monod model by the Least-Squares method for microalgae Botryococcus Braunii sp

    NASA Astrophysics Data System (ADS)

    See, J. J.; Jamaian, S. S.; Salleh, R. M.; Nor, M. E.; Aman, F.

    2018-04-01

    This research aims to estimate the parameters of the Monod model of growth of the microalga Botryococcus Braunii sp by the Least-Squares method. The Monod equation is a non-linear equation that can be transformed into a linear form and solved by the Least-Squares linear regression method. Alternatively, the Gauss-Newton method solves the non-linear Least-Squares problem directly, obtaining the parameter values of the Monod model by minimizing the sum of squared errors (SSE). As a result, the parameters of the Monod model for Botryococcus Braunii sp can be estimated by the Least-Squares method. However, the parameter values estimated by the non-linear Least-Squares method are more accurate than those from the linear Least-Squares method, since the SSE of the non-linear method is lower.
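
    The two fitting routes compared in this record can be sketched as follows. The sketch assumes the usual Monod form mu = mu_max * S / (Ks + S) and a double-reciprocal linearization, 1/mu = (Ks/mu_max)(1/S) + 1/mu_max; the abstract does not state which linear transform was used. The non-linear fit is a Gauss-Newton iteration with step halving added as a safeguard, which is an assumption of this sketch. The data are made up.

```python
def monod(S, mu_max, Ks):
    """Monod growth rate for substrate concentration S."""
    return mu_max * S / (Ks + S)


def sse(S, mu, mu_max, Ks):
    """Sum of squared errors of the Monod model on the original scale."""
    return sum((m - monod(s, mu_max, Ks)) ** 2 for s, m in zip(S, mu))


def fit_linearized(S, mu):
    """Ordinary least squares on 1/mu versus 1/S (double-reciprocal form)."""
    x = [1.0 / s for s in S]
    y = [1.0 / m for m in mu]
    n = len(x)
    xb, yb = sum(x) / n, sum(y) / n
    slope = (sum((a - xb) * (b - yb) for a, b in zip(x, y))
             / sum((a - xb) ** 2 for a in x))
    intercept = yb - slope * xb
    mu_max = 1.0 / intercept
    return mu_max, slope * mu_max


def fit_gauss_newton(S, mu, mu_max, Ks, iters=50):
    """Gauss-Newton on the original scale, halving the step if the SSE would rise."""
    for _ in range(iters):
        r = [m - monod(s, mu_max, Ks) for s, m in zip(S, mu)]
        # Jacobian of the model with respect to (mu_max, Ks)
        J = [(s / (Ks + s), -mu_max * s / (Ks + s) ** 2) for s in S]
        a = sum(j[0] * j[0] for j in J)
        b = sum(j[0] * j[1] for j in J)
        c = sum(j[1] * j[1] for j in J)
        g0 = sum(j[0] * ri for j, ri in zip(J, r))
        g1 = sum(j[1] * ri for j, ri in zip(J, r))
        det = a * c - b * b
        if det == 0.0:
            break
        d0 = (c * g0 - b * g1) / det
        d1 = (a * g1 - b * g0) / det
        current = sse(S, mu, mu_max, Ks)
        step = 1.0
        while step > 1e-8 and sse(S, mu, mu_max + step * d0, Ks + step * d1) > current:
            step /= 2.0
        if step <= 1e-8:
            break
        mu_max, Ks = mu_max + step * d0, Ks + step * d1
    return mu_max, Ks


# Hypothetical substrate concentrations and noisy growth rates (illustrative only)
S = [0.2, 0.5, 1.0, 2.0, 5.0, 10.0]
mu = [0.36, 0.58, 0.82, 0.93, 1.11, 1.13]

lin = fit_linearized(S, mu)
nl = fit_gauss_newton(S, mu, *lin)      # start from the linearized estimate
sse_lin = sse(S, mu, *lin)
sse_nl = sse(S, mu, *nl)
print(f"linearized SSE: {sse_lin:.5f}, non-linear SSE: {sse_nl:.5f}")
```

    The linearized fit minimizes error in 1/mu rather than in mu, so its SSE on the original scale is generally not minimal. Starting Gauss-Newton from that estimate and accepting only non-increasing steps means the non-linear SSE cannot end up higher.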

  9. On estimating gravity anomalies: A comparison of least squares collocation with least squares techniques

    NASA Technical Reports Server (NTRS)

    Argentiero, P.; Lowrey, B.

    1976-01-01

    The least squares collocation algorithm for estimating gravity anomalies from geodetic data is shown to be an application of the well known regression equations which provide the mean and covariance of a random vector (gravity anomalies) given a realization of a correlated random vector (geodetic data). It is also shown that the collocation solution for gravity anomalies is equivalent to the conventional least-squares-Stokes' function solution when the conventional solution utilizes properly weighted zero a priori estimates. The mathematical and physical assumptions underlying the least squares collocation estimator are described, and its numerical properties are compared with the numerical properties of the conventional least squares estimator.

  10. A comparison between the use of Cox regression and the use of partial least squares-Cox regression to predict the survival of kidney-transplant patients

    NASA Astrophysics Data System (ADS)

    Solimun

    2017-05-01

    The aim of this research is to model survival data from kidney-transplant patients using the partial least squares (PLS)-Cox regression, which can both meet and not meet the no-multicollinearity assumption. The secondary data were obtained from research entitled "Factors affecting the survival of kidney-transplant patients". The research subjects comprised 250 patients. The predictor variables consisted of: age (X1), sex (X2); two categories, prior hemodialysis duration (X3), diabetes (X4); two categories, prior transplantation number (X5), number of blood transfusions (X6), discrepancy score (X7), use of antilymphocyte globulin(ALG) (X8); two categories, while the response variable was patient survival time (in months). Partial least squares regression is a model that connects the predictor variables X and the response variable y and it initially aims to determine the relationship between them. Results of the above analyses suggest that the survival of kidney transplant recipients ranged from 0 to 55 months, with 62% of the patients surviving until they received treatment that lasted for 55 months. The PLS-Cox regression analysis results revealed that patients' age and the use of ALG significantly affected the survival time of patients. The factor of patients' age (X1) in the PLS-Cox regression model merely affected the failure probability by 1.201. This indicates that the probability of dying for elderly patients with a kidney transplant is 1.152 times higher than that for younger patients.

  11. The relationship between air pollution, fossil fuel energy consumption, and water resources in the panel of selected Asia-Pacific countries.

    PubMed

    Rafindadi, Abdulkadir Abdulrashid; Yusof, Zarinah; Zaman, Khalid; Kyophilavong, Phouphet; Akhmat, Ghulam

    2014-10-01

    The objective of the study is to examine the relationship between air pollution, fossil fuel energy consumption, water resources, and natural resource rents in a panel of selected Asia-Pacific countries over the period 1975-2012. The study includes a number of variables in the model for robust analysis. The results of the cross-sectional analysis show that there is a significant relationship between air pollution, energy consumption, and water productivity in the individual countries of Asia-Pacific. However, the results for each country vary according to time-invariant shocks. For this purpose, the study employed the panel least square technique, which includes panel least square regression, panel fixed effect regression, and panel two-stage least square regression. In general, all the panel tests indicate that there is a significant and positive relationship between air pollution, energy consumption, and water resources in the region. Fossil fuel energy consumption has a dominant impact on changes in air pollution in the region.

  12. Estimation of Flood-Frequency Discharges for Rural, Unregulated Streams in West Virginia

    USGS Publications Warehouse

    Wiley, Jeffrey B.; Atkins, John T.

    2010-01-01

    Flood-frequency discharges were determined for 290 streamgage stations having a minimum of 9 years of record in West Virginia and surrounding states through the 2006 or 2007 water year. No trend was determined in the annual peaks used to calculate the flood-frequency discharges. Multiple and simple least-squares regression equations for the 100-year (1-percent annual-occurrence probability) flood discharge with independent variables that describe the basin characteristics were developed for 290 streamgage stations in West Virginia and adjacent states. The regression residuals for the models were evaluated and used to define three regions of the State, designated as Eastern Panhandle, Central Mountains, and Western Plateaus. Exploratory data analysis procedures identified 44 streamgage stations that were excluded from the development of regression equations representative of rural, unregulated streams in West Virginia. Regional equations for the 1.1-, 1.5-, 2-, 5-, 10-, 25-, 50-, 100-, 200-, and 500-year flood discharges were determined by generalized least-squares regression using data from the remaining 246 streamgage stations. Drainage area was the only significant independent variable determined for all equations in all regions. Procedures developed to estimate flood-frequency discharges on ungaged streams were based on (1) regional equations and (2) drainage-area ratios between gaged and ungaged locations on the same stream. The procedures are applicable only to rural, unregulated streams within the boundaries of West Virginia that have drainage areas within the limits of the stations used to develop the regional equations (from 0.21 to 1,461 square miles in the Eastern Panhandle, from 0.10 to 1,619 square miles in the Central Mountains, and from 0.13 to 1,516 square miles in the Western Plateaus). 
    The accuracy of the equations is quantified by measuring the average prediction error (from 21.7 to 56.3 percent) and equivalent years of record (from 2.0 to 70.9 years).

  13. A New Test of Linear Hypotheses in OLS Regression under Heteroscedasticity of Unknown Form

    ERIC Educational Resources Information Center

    Cai, Li; Hayes, Andrew F.

    2008-01-01

    When the errors in an ordinary least squares (OLS) regression model are heteroscedastic, hypothesis tests involving the regression coefficients can have Type I error rates that are far from the nominal significance level. Asymptotically, this problem can be rectified with the use of a heteroscedasticity-consistent covariance matrix (HCCM)…

  14. Deriving the Regression Equation without Using Calculus

    ERIC Educational Resources Information Center

    Gordon, Sheldon P.; Gordon, Florence S.

    2004-01-01

    Probably the one "new" mathematical topic that is most responsible for modernizing courses in college algebra and precalculus over the last few years is the idea of fitting a function to a set of data in the sense of a least squares fit. Whether it be simple linear regression or nonlinear regression, this topic opens the door to applying the…

  15. The Collinearity Free and Bias Reduced Regression Estimation Project: The Theory of Normalization Ridge Regression. Report No. 2.

    ERIC Educational Resources Information Center

    Bulcock, J. W.; And Others

    Multicollinearity refers to the presence of highly intercorrelated independent variables in structural equation models, that is, models estimated by using techniques such as least squares regression and maximum likelihood. There is a problem of multicollinearity in both the natural and social sciences where theory formulation and estimation is in…

  16. Independent contrasts and PGLS regression estimators are equivalent.

    PubMed

    Blomberg, Simon P; Lefevre, James G; Wells, Jessie A; Waterhouse, Mary

    2012-05-01

    We prove that the slope parameter of the ordinary least squares regression of phylogenetically independent contrasts (PICs) conducted through the origin is identical to the slope parameter of the method of generalized least squares (GLSs) regression under a Brownian motion model of evolution. This equivalence has several implications: 1. Understanding the structure of the linear model for GLS regression provides insight into when and why phylogeny is important in comparative studies. 2. The limitations of the PIC regression analysis are the same as the limitations of the GLS model. In particular, phylogenetic covariance applies only to the response variable in the regression and the explanatory variable should be regarded as fixed. Calculation of PICs for explanatory variables should be treated as a mathematical idiosyncrasy of the PIC regression algorithm. 3. Since the GLS estimator is the best linear unbiased estimator (BLUE), the slope parameter estimated using PICs is also BLUE. 4. If the slope is estimated using different branch lengths for the explanatory and response variables in the PIC algorithm, the estimator is no longer the BLUE, so this is not recommended. Finally, we discuss whether or not and how to accommodate phylogenetic covariance in regression analyses, particularly in relation to the problem of phylogenetic uncertainty. This discussion is from both frequentist and Bayesian perspectives.

  17. Understanding Scaling Relations in Fracture and Mechanical Deformation of Single Crystal and Polycrystalline Silicon by Performing Atomistic Simulations at Mesoscale

    DTIC Science & Technology

    2009-07-16

    ... Statistical analysis: coefficient of correlation. $R^2 = \frac{SSR}{SSTO} = 1 - \frac{SSE}{SSTO}$, where $SSR = \sum_i (\hat{Y}_i - \bar{Y})^2$ is the regression sum of squares, $SSE = \sum_i (Y_i - \hat{Y}_i)^2$ is the error sum of squares, $SSTO = SSE + SSR$ is the total sum of squares, $\bar{Y}$ is the mean value, and $\hat{Y}_i$ is the value from the fitted line ...
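
    The sums of squares named in this record can be checked numerically with a short sketch. It assumes a simple straight-line fit by ordinary least squares with an intercept, for which the decomposition SSTO = SSE + SSR holds; the data and function name are hypothetical and unrelated to the report.

```python
def ols_line(x, y):
    """Intercept and slope of the ordinary least-squares line through (x, y)."""
    n = len(x)
    xb, yb = sum(x) / n, sum(y) / n
    slope = (sum((a - xb) * (b - yb) for a, b in zip(x, y))
             / sum((a - xb) ** 2 for a in x))
    return yb - slope * xb, slope


# Hypothetical data (illustrative only)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = ols_line(x, y)
y_hat = [b0 + b1 * xi for xi in x]      # values from the fitted line
y_bar = sum(y) / len(y)                 # mean value

ssr = sum((yh - y_bar) ** 2 for yh in y_hat)           # regression sum of squares
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # error sum of squares
ssto = sum((yi - y_bar) ** 2 for yi in y)              # total sum of squares

r2 = ssr / ssto
print(f"R^2 = {r2:.4f}")
```

    Both expressions for R^2, SSR/SSTO and 1 - SSE/SSTO, agree here up to rounding because the fitted line includes an intercept.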

  18. A regression-kriging model for estimation of rainfall in the Laohahe basin

    NASA Astrophysics Data System (ADS)

    Wang, Hong; Ren, Li L.; Liu, Gao H.

    2009-10-01

    This paper presents a multivariate geostatistical algorithm called regression-kriging (RK) for predicting the spatial distribution of rainfall by incorporating five topographic/geographic factors: latitude, longitude, altitude, slope and aspect. The technique is illustrated using rainfall data collected at 52 rain gauges from the Laohahe basin in northeast China during 1986-2005. Rainfall data from 44 stations were selected for modeling and the remaining 8 stations were used for model validation. To eliminate multicollinearity, the five explanatory factors were first transformed using factor analysis, with three Principal Components (PCs) extracted. The rainfall data were then fitted using step-wise regression and the residuals interpolated using simple kriging (SK). The regression coefficients were estimated by generalized least squares (GLS), which takes the spatial heteroskedasticity between rainfall and PCs into account. Finally, the rainfall prediction based on RK was compared with that predicted from ordinary kriging (OK) and ordinary least squares (OLS) multiple regression (MR). Because correlated topographic factors are taken into account, RK improves the efficiency of predictions. RK achieved a lower relative root mean square error (RMSE) (44.67%) than MR (49.23%) and OK (73.60%) and a lower bias than MR and OK (23.82 versus 30.89 and 32.15 mm) for annual rainfall. It is much more effective for the wet season than for the dry season. RK is suitable for estimation of rainfall in areas where there are no stations nearby and where topography has a major influence on rainfall.

  19. Use of partial least squares regression for the multivariate calibration of hazardous air pollutants in open-path FT-IR spectrometry

    NASA Astrophysics Data System (ADS)

    Hart, Brian K.; Griffiths, Peter R.

    1998-06-01

    Partial least squares (PLS) regression has been evaluated as a robust calibration technique for over 100 hazardous air pollutants (HAPs) measured by open path Fourier transform infrared (OP/FT-IR) spectrometry. PLS has the advantage over the current recommended calibration method of classical least squares (CLS), in that it can look at the whole useable spectrum (700-1300 cm-1, 2000-2150 cm-1, and 2400-3000 cm-1), and detect several analytes simultaneously. Up to one hundred HAPs synthetically added to OP/FT-IR backgrounds have been simultaneously calibrated and detected using PLS. PLS also has the advantage in requiring less preprocessing of spectra than that which is required in CLS calibration schemes, allowing PLS to provide user independent real-time analysis of OP/FT-IR spectra.

  20. A consensus least squares support vector regression (LS-SVR) for analysis of near-infrared spectra of plant samples.

    PubMed

    Li, Yankun; Shao, Xueguang; Cai, Wensheng

    2007-04-15

    Consensus modeling, which combines the results of multiple independent models to produce a single prediction, avoids the instability of a single model. Based on the principle of consensus modeling, a consensus least squares support vector regression (LS-SVR) method for calibrating near-infrared (NIR) spectra was proposed. In the proposed approach, NIR spectra of plant samples were first preprocessed using the discrete wavelet transform (DWT) to filter the spectral background and noise; the consensus LS-SVR technique was then used to build the calibration model. After optimization of the parameters involved in the modeling, a satisfactory model was achieved for predicting the content of reducing sugar in plant samples. The predicted results show that the consensus LS-SVR model is more robust and reliable than the conventional partial least squares (PLS) and LS-SVR methods.

  1. Intrinsic Raman spectroscopy for quantitative biological spectroscopy Part II

    PubMed Central

    Bechtel, Kate L.; Shih, Wei-Chuan; Feld, Michael S.

    2009-01-01

    We demonstrate the effectiveness of intrinsic Raman spectroscopy (IRS) at reducing errors caused by absorption and scattering. Physical tissue models, solutions of varying absorption and scattering coefficients with known concentrations of Raman scatterers, are studied. We show significant improvement in prediction error by implementing IRS to predict concentrations of Raman scatterers using both ordinary least squares regression (OLS) and partial least squares regression (PLS). In particular, we show that IRS provides a robust calibration model that does not increase in error when applied to samples with optical properties outside the range of calibration. PMID:18711512

  2. Estimation of Ordinary Differential Equation Parameters Using Constrained Local Polynomial Regression.

    PubMed

    Ding, A Adam; Wu, Hulin

    2014-10-01

    We propose a new method to use a constrained local polynomial regression to estimate the unknown parameters in ordinary differential equation models with a goal of improving the smoothing-based two-stage pseudo-least squares estimate. The equation constraints are derived from the differential equation model and are incorporated into the local polynomial regression in order to estimate the unknown parameters in the differential equation model. We also derive the asymptotic bias and variance of the proposed estimator. Our simulation studies show that our new estimator is clearly better than the pseudo-least squares estimator in estimation accuracy with a small price of computational cost. An application example on immune cell kinetics and trafficking for influenza infection further illustrates the benefits of the proposed new method.

  3. Estimation of Ordinary Differential Equation Parameters Using Constrained Local Polynomial Regression

    PubMed Central

    Ding, A. Adam; Wu, Hulin

    2015-01-01

    We propose a new method to use a constrained local polynomial regression to estimate the unknown parameters in ordinary differential equation models with a goal of improving the smoothing-based two-stage pseudo-least squares estimate. The equation constraints are derived from the differential equation model and are incorporated into the local polynomial regression in order to estimate the unknown parameters in the differential equation model. We also derive the asymptotic bias and variance of the proposed estimator. Our simulation studies show that our new estimator is clearly better than the pseudo-least squares estimator in estimation accuracy with a small price of computational cost. An application example on immune cell kinetics and trafficking for influenza infection further illustrates the benefits of the proposed new method. PMID:26401093

  4. A method for the selection of a functional form for a thermodynamic equation of state using weighted linear least squares stepwise regression

    NASA Technical Reports Server (NTRS)

    Jacobsen, R. T.; Stewart, R. B.; Crain, R. W., Jr.; Rose, G. L.; Myers, A. F.

    1976-01-01

    A method was developed for establishing a rational choice of the terms to be included in an equation of state with a large number of adjustable coefficients. The methods presented were developed for use in the determination of an equation of state for oxygen and nitrogen. However, a general application of the methods is possible in studies involving the determination of an optimum polynomial equation for fitting a large number of data points. The data considered in the least squares problem are experimental thermodynamic pressure-density-temperature data. Attention is given to a description of stepwise multiple regression and the use of stepwise regression in the determination of an equation of state for oxygen and nitrogen.

  5. Application of near-infrared spectroscopy for the rapid quality assessment of Radix Paeoniae Rubra

    NASA Astrophysics Data System (ADS)

    Zhan, Hao; Fang, Jing; Tang, Liying; Yang, Hongjun; Li, Hua; Wang, Zhuju; Yang, Bin; Wu, Hongwei; Fu, Meihong

    2017-08-01

    Near-infrared (NIR) spectroscopy with multivariate analysis was used to quantify gallic acid, catechin, albiflorin, and paeoniflorin in Radix Paeoniae Rubra, and the feasibility of classifying samples originating from different areas was investigated. A new high-performance liquid chromatography method was developed and validated to analyze gallic acid, catechin, albiflorin, and paeoniflorin in Radix Paeoniae Rubra as the reference. Partial least squares (PLS), principal component regression (PCR), and stepwise multivariate linear regression (SMLR) were performed to calibrate the regression model. Different data pretreatments such as derivatives (1st and 2nd), multiplicative scatter correction, standard normal variate, Savitzky-Golay filter, and Norris derivative filter were applied to remove the systematic errors. The performance of the model was evaluated according to the root mean square error of calibration (RMSEC), root mean square error of prediction (RMSEP), root mean square error of cross-validation (RMSECV), and correlation coefficient (r). The results show that, compared to PCR and SMLR, PLS had a lower RMSEC, RMSECV, and RMSEP and a higher r for all four analytes. PLS coupled with proper pretreatments showed good performance in both the fitting and predicting results. Furthermore, the areas of origin of the Radix Paeoniae Rubra samples were partly distinguished by principal component analysis. This study shows that NIR with PLS is a reliable, inexpensive, and rapid tool for the quality assessment of Radix Paeoniae Rubra.

  6. Methods for estimating selected low-flow frequency statistics and harmonic mean flows for streams in Iowa

    USGS Publications Warehouse

    Eash, David A.; Barnes, Kimberlee K.

    2017-01-01

    A statewide study was conducted to develop regression equations for estimating six selected low-flow frequency statistics and harmonic mean flows for ungaged stream sites in Iowa. The estimation equations developed for the six low-flow frequency statistics include: the annual 1-, 7-, and 30-day mean low flows for a recurrence interval of 10 years, the annual 30-day mean low flow for a recurrence interval of 5 years, and the seasonal (October 1 through December 31) 1- and 7-day mean low flows for a recurrence interval of 10 years. Estimation equations also were developed for the harmonic-mean-flow statistic. Estimates of these seven selected statistics are provided for 208 U.S. Geological Survey continuous-record streamgages using data through September 30, 2006. The study area comprises streamgages located within Iowa and 50 miles beyond the State's borders. Because trend analyses indicated statistically significant positive trends when considering the entire period of record for the majority of the streamgages, the longest, most recent period of record without a significant trend was determined for each streamgage for use in the study. The median number of years of record used to compute each of these seven selected statistics was 35. Geographic information system software was used to measure 54 selected basin characteristics for each streamgage. Following the removal of two streamgages from the initial data set, data collected for 206 streamgages were compiled to investigate three approaches for regionalization of the seven selected statistics. Regionalization, a process using statistical regression analysis, provides a relation for efficiently transferring information from a group of streamgages in a region to ungaged sites in the region. The three regionalization approaches tested included statewide, regional, and region-of-influence regressions. 
For the regional regression, the study area was divided into three low-flow regions on the basis of hydrologic characteristics, landform regions, and soil regions. A comparison of root mean square errors and average standard errors of prediction for the statewide, regional, and region-of-influence regressions determined that the regional regression provided the best estimates of the seven selected statistics at ungaged sites in Iowa. Because a significant number of streams in Iowa reach zero flow as their minimum flow during low-flow years, four different types of regression analyses were used: left-censored, logistic, generalized-least-squares, and weighted-least-squares regression. A total of 192 streamgages were included in the development of 27 regression equations for the three low-flow regions. For the northeast and northwest regions, a censoring threshold was used to develop 12 left-censored regression equations to estimate the 6 low-flow frequency statistics for each region. For the southern region a total of 12 regression equations were developed; 6 logistic regression equations were developed to estimate the probability of zero flow for the 6 low-flow frequency statistics and 6 generalized least-squares regression equations were developed to estimate the 6 low-flow frequency statistics, if nonzero flow is estimated first by use of the logistic equations. A weighted-least-squares regression equation was developed for each region to estimate the harmonic-mean-flow statistic. Average standard errors of estimate for the left-censored equations for the northeast region range from 64.7 to 88.1 percent and for the northwest region range from 85.8 to 111.8 percent. Misclassification percentages for the logistic equations for the southern region range from 5.6 to 14.0 percent. 
Average standard errors of prediction for generalized least-squares equations for the southern region range from 71.7 to 98.9 percent and pseudo coefficients of determination for the generalized-least-squares equations range from 87.7 to 91.8 percent. Average standard errors of prediction for weighted-least-squares equations developed for estimating the harmonic-mean-flow statistic for each of the three regions range from 66.4 to 80.4 percent. The regression equations are applicable only to stream sites in Iowa with low flows not significantly affected by regulation, diversion, or urbanization and with basin characteristics within the range of those used to develop the equations. If the equations are used at ungaged sites on regulated streams, or on streams affected by water-supply and agricultural withdrawals, then the estimates will need to be adjusted by the amount of regulation or withdrawal to estimate the actual flow conditions if that is of interest. Caution is advised when applying the equations for basins with characteristics near the applicable limits of the equations and for basins located in karst topography. A test of two drainage-area ratio methods using 31 pairs of streamgages, for the annual 7-day mean low-flow statistic for a recurrence interval of 10 years, indicates a weighted drainage-area ratio method provides better estimates than regional regression equations for an ungaged site on a gaged stream in Iowa when the drainage-area ratio is between 0.5 and 1.4. These regression equations will be implemented within the U.S. Geological Survey StreamStats web-based geographic-information-system tool. StreamStats allows users to click on any ungaged site on a river and compute estimates of the seven selected statistics; in addition, 90-percent prediction intervals and the measured basin characteristics for the ungaged sites also are provided. 
StreamStats also allows users to click on any streamgage in Iowa and estimates computed for these seven selected statistics are provided for the streamgage.

  7. Partial Least Squares Regression Can Aid in Detecting Differential Abundance of Multiple Features in Sets of Metagenomic Samples

    PubMed Central

    Libiger, Ondrej; Schork, Nicholas J.

    2015-01-01

    It is now feasible to examine the composition and diversity of microbial communities (i.e., “microbiomes”) that populate different human organs and orifices using DNA sequencing and related technologies. To explore the potential links between changes in microbial communities and various diseases in the human body, it is essential to test associations involving different species within and across microbiomes, environmental settings and disease states. Although a number of statistical techniques exist for carrying out relevant analyses, it is unclear which of these techniques exhibit the greatest statistical power to detect associations given the complexity of most microbiome datasets. We compared the statistical power of principal component regression, partial least squares regression, regularized regression, distance-based regression, Hill's diversity measures, and a modified test implemented in the popular and widely used microbiome analysis methodology “Metastats” across a wide range of simulated scenarios involving changes in feature abundance between two sets of metagenomic samples. For this purpose, simulation studies were used to change the abundance of microbial species in a real dataset from a published study examining human hands. Each technique was applied to the same data, and its ability to detect the simulated change in abundance was assessed. We hypothesized that a small subset of methods would outperform the rest in terms of the statistical power. Indeed, we found that the Metastats technique modified to accommodate multivariate analysis and partial least squares regression yielded high power under the models and data sets we studied. The statistical power of diversity measure-based tests, distance-based regression and regularized regression was significantly lower. 
Our results provide insight into powerful analysis strategies that utilize information on species counts from large microbiome data sets exhibiting skewed frequency distributions obtained on a small to moderate number of samples. PMID:26734061

  8. Potential of SENTINEL-2 images for predicting common topsoil properties over Temperate and Mediterranean agroecosystems

    NASA Astrophysics Data System (ADS)

    Vaudour, Emmanuelle; Gomez, Cécile; Fouad, Youssef; Gilliot, Jean-Marc; Lagacherie, Philippe

    2017-04-01

    This study aimed at exploring the potential of SENTINEL-2 (S2A) multispectral satellite images for predicting several topsoil properties in two contrasting environments: a temperate region marked by intensive annual crop cultivation and soils derived from either loess or colluvium and/or marine limestone or chalk for one part (Versailles Plain, 221 km²), and a Mediterranean region marked by vineyard cultivation and soils derived from either lacustrine limestone, calcareous sandstones, colluvium, or alluvial deposits (La Peyne catchment, 48 km²) for the other part. Two S2A images (acquired in mid-March 2016 over each site) were atmospherically corrected. Then NDVI was computed and thresholded (0.35) in order to extract bare soils. Prediction models of soil properties based on partial least squares regressions (PLSR) were built from S2A spectra of 72 and 143 sampling locations in the Versailles Plain and La Peyne catchment, respectively. Ten soil properties were investigated in both regions: pH, cation exchange capacity (CEC), five texture fractions (clay, coarse silt, fine silt, coarse sand and fine sand), iron, calcium carbonate and soil organic carbon (SOC) in the tilled horizon. Predictive abilities were studied according to R²cv and ratio of performance to deviation (RPD). Intermediate to near intermediate performances of prediction (R²cv and RPD between 0.28-0.70 and 1.19-1.85 respectively) were obtained for 6 topsoil properties: clay, iron, SOC, CEC, pH, coarse silt. In the Versailles Plain, 5 out of these properties could be predicted (by decreasing performance, CEC, SOC, pH, clay, coarse silt), while there were 4 predictable properties for La Peyne catchment (iron, clay, CEC, coarse silt). The coarse fragment content appeared to impact prediction error for iron content over La Peyne, while it influenced prediction error for SOC content over the Versailles Plain along with calcium carbonate content. 
A spatial structure of the estimated soil properties for bare soils pixels was highlighted, which promises further improvements in spatial prediction models for these properties. This work was carried out in the framework of both the TOSCA-CES "Cartographie Numérique des sols" and the PLEIADES-CO projects of the French Space Agency (CNES).
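
    The two figures of merit quoted in this abstract, the coefficient of determination and the ratio of performance to deviation (RPD), can be computed from paired observed and predicted values. The following is a minimal sketch, not the authors' code: the data are invented for illustration, and the RPD convention used here (sample standard deviation of the observations divided by the RMSE) is an assumption, since the abstract does not say which variant was applied.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def rpd(observed, predicted):
    """RPD, assumed here to be sample SD of the observations divided by RMSE."""
    n = len(observed)
    mean_obs = sum(observed) / n
    sd = math.sqrt(sum((o - mean_obs) ** 2 for o in observed) / (n - 1))
    return sd / rmse(observed, predicted)

# Hypothetical cross-validated predictions for one soil property.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 1.5, 3.0, 4.5, 4.5]
print(r_squared(observed, predicted), rpd(observed, predicted))
```

    Both statistics compare the prediction error with the spread of the reference data, so under this convention an RPD close to 1 means the model predicts little better than the mean of the observations would.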

  9. Restoration of Circum-Arctic Upper Jurassic source rock paleolatitude based on crude oil geochemistry

    USGS Publications Warehouse

    Peters, K.E.; Ramos, L.S.; Zumberge, J.E.; Valin, Z.C.; Scotese, C.R.

    2008-01-01

    Tectonic geochemical paleolatitude (TGP) models were developed to predict the paleolatitude of petroleum source rock from the geochemical composition of crude oil. The results validate studies designed to reconstruct ancient source rock depositional environments using oil chemistry and tectonic reconstruction of paleogeography from coordinates of the present day collection site. TGP models can also be used to corroborate tectonic paleolatitude in cases where the predicted paleogeography conflicts with the depositional setting predicted by the oil chemistry, or to predict paleolatitude when the present day collection locality is far removed from the source rock, as might occur due to long distance subsurface migration or transport of tarballs by ocean currents. Biomarker and stable carbon isotope ratios were measured for 496 crude oil samples inferred to originate from Upper Jurassic source rock in West Siberia, the North Sea and offshore Labrador. First, a unique, multi-tiered chemometric (multivariate statistics) decision tree was used to classify these samples into seven oil families and infer the type of organic matter, lithology and depositional environment of each organofacies of source rock [Peters, K.E., Ramos, L.S., Zumberge, J.E., Valin, Z.C., Scotese, C.R., Gautier, D.L., 2007. Circum-Arctic petroleum systems identified using decision-tree chemometrics. American Association of Petroleum Geologists Bulletin 91, 877-913]. Second, present day geographic locations for each sample were used to restore the tectonic paleolatitude of the source rock during Late Jurassic time (~150 Ma). Third, partial least squares regression (PLSR) was used to construct linear TGP models that relate tectonic and geochemical paleolatitude, where the latter is based on 19 source-related biomarker and isotope ratios for each oil family. The TGP models were calibrated using 70% of the samples in each family and the remaining 30% of samples were used for model validation. 
Positive relationships exist between tectonic and geochemical paleolatitude for each family. Standard error of prediction for geochemical paleolatitude ranges from 0.9° to 2.6° of tectonic paleolatitude, which translates to a relative standard error of prediction in the range 1.5-4.8%. The results suggest that the observed effect of source rock paleolatitude on crude oil composition is caused by (i) stable carbon isotope fractionation during photosynthetic fixation of carbon and (ii) species diversity at different latitudes during Late Jurassic time. © 2008 Elsevier Ltd. All rights reserved.

  10. Applicability of Monte Carlo cross validation technique for model development and validation using generalised least squares regression

    NASA Astrophysics Data System (ADS)

    Haddad, Khaled; Rahman, Ataur; A Zaman, Mohammad; Shrestha, Surendra

    2013-03-01

    In regional hydrologic regression analysis, model selection and validation are regarded as important steps. Here, the model selection is usually based on some measurements of goodness-of-fit between the model prediction and observed data. In Regional Flood Frequency Analysis (RFFA), leave-one-out (LOO) validation or a fixed percentage leave out validation (e.g., 10%) is commonly adopted to assess the predictive ability of regression-based prediction equations. This paper develops a Monte Carlo Cross Validation (MCCV) technique (which has widely been adopted in Chemometrics and Econometrics) in RFFA using Generalised Least Squares Regression (GLSR) and compares it with the most commonly adopted LOO validation approach. The study uses simulated and regional flood data from the state of New South Wales in Australia. It is found that when developing hydrologic regression models, application of the MCCV is likely to result in a more parsimonious model than the LOO. It has also been found that the MCCV can provide a more realistic estimate of a model's predictive ability when compared with the LOO.
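
    The difference between the two validation schemes compared in this abstract can be sketched in a few lines. This is an illustration under stated assumptions rather than the authors' procedure: an ordinary least squares straight-line fit stands in for generalised least squares regression, the data are simulated, and the function names and the split settings (`n_splits`, `test_fraction`) are hypothetical.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def mean_sq_error(model, xs, ys):
    """Mean squared prediction error of a fitted line on (xs, ys)."""
    a, b = model
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys)) / len(xs)

def loo_cv(xs, ys):
    """Leave-one-out: every observation is held out exactly once."""
    errors = []
    for i in range(len(xs)):
        model = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        errors.append(mean_sq_error(model, [xs[i]], [ys[i]]))
    return sum(errors) / len(errors)

def mc_cv(xs, ys, n_splits=200, test_fraction=0.3, seed=0):
    """Monte Carlo CV: many random train/test splits, errors averaged."""
    rng = random.Random(seed)
    n = len(xs)
    n_test = max(1, int(round(test_fraction * n)))
    errors = []
    for _ in range(n_splits):
        test_idx = set(rng.sample(range(n), n_test))
        train_x = [xs[i] for i in range(n) if i not in test_idx]
        train_y = [ys[i] for i in range(n) if i not in test_idx]
        test_x = [xs[i] for i in sorted(test_idx)]
        test_y = [ys[i] for i in sorted(test_idx)]
        errors.append(mean_sq_error(fit_line(train_x, train_y), test_x, test_y))
    return sum(errors) / len(errors)

# Simulated data: a linear trend plus Gaussian noise.
noise = random.Random(1)
xs = [float(i) for i in range(20)]
ys = [2.0 + 0.5 * x + noise.gauss(0.0, 1.0) for x in xs]
print("LOO :", loo_cv(xs, ys))
print("MCCV:", mc_cv(xs, ys))
```

    LOO trains on all but one observation each time, whereas MCCV holds out a larger random subset on every repetition, so each fitted model is tested on data it is less similar to.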

  11. Least squares regression methods for clustered ROC data with discrete covariates.

    PubMed

    Tang, Liansheng Larry; Zhang, Wei; Li, Qizhai; Ye, Xuan; Chan, Leighton

    2016-07-01

    The receiver operating characteristic (ROC) curve is a popular tool to evaluate and compare the accuracy of diagnostic tests to distinguish the diseased group from the nondiseased group when test results from tests are continuous or ordinal. A complicated data setting occurs when multiple tests are measured on abnormal and normal locations from the same subject and the measurements are clustered within the subject. Although least squares regression methods can be used for the estimation of ROC curve from correlated data, how to develop the least squares methods to estimate the ROC curve from the clustered data has not been studied. Also, the statistical properties of the least squares methods under the clustering setting are unknown. In this article, we develop the least squares ROC methods to allow the baseline and link functions to differ, and more importantly, to accommodate clustered data with discrete covariates. The methods can generate smooth ROC curves that satisfy the inherent continuous property of the true underlying curve. The least squares methods are shown to be more efficient than the existing nonparametric ROC methods under appropriate model assumptions in simulation studies. We apply the methods to a real example in the detection of glaucomatous deterioration. We also derive the asymptotic properties of the proposed methods. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  12. Assessment of parametric uncertainty for groundwater reactive transport modeling

    USGS Publications Warehouse

    Shi, Xiaoqing; Ye, Ming; Curtis, Gary P.; Miller, Geoffery L.; Meyer, Philip D.; Kohler, Matthias; Yabusaki, Steve; Wu, Jichun

    2014-01-01

    The validity of using Gaussian assumptions for model residuals in uncertainty quantification of a groundwater reactive transport model was evaluated in this study. Least squares regression methods explicitly assume Gaussian residuals, and the assumption leads to Gaussian likelihood functions, model parameters, and model predictions. While the Bayesian methods do not explicitly require the Gaussian assumption, Gaussian residuals are widely used. This paper shows that the residuals of the reactive transport model are non-Gaussian, heteroscedastic, and correlated in time; characterizing them requires using a generalized likelihood function such as the formal generalized likelihood function developed by Schoups and Vrugt (2010). For the surface complexation model considered in this study for simulating uranium reactive transport in groundwater, parametric uncertainty is quantified using the least squares regression methods and Bayesian methods with both Gaussian and formal generalized likelihood functions. While the least squares methods and Bayesian methods with Gaussian likelihood function produce similar Gaussian parameter distributions, the parameter distributions of Bayesian uncertainty quantification using the formal generalized likelihood function are non-Gaussian. In addition, predictive performance of formal generalized likelihood function is superior to that of least squares regression and Bayesian methods with Gaussian likelihood function. The Bayesian uncertainty quantification is conducted using the differential evolution adaptive metropolis (DREAM(zs)) algorithm; as a Markov chain Monte Carlo (MCMC) method, it is a robust tool for quantifying uncertainty in groundwater reactive transport models. For the surface complexation model, the regression-based local sensitivity analysis and Morris- and DREAM(ZS)-based global sensitivity analysis yield almost identical ranking of parameter importance. 
The uncertainty analysis may help select appropriate likelihood functions, improve model calibration, and reduce predictive uncertainty in other groundwater reactive transport and environmental modeling.

  13. Rex fortran 4 system for combinatorial screening or conventional analysis of multivariate regressions

    Treesearch

    L.R. Grosenbaugh

    1967-01-01

    Describes an expansible computerized system that provides data needed in regression or covariance analysis of as many as 50 variables, 8 of which may be dependent. Alternatively, it can screen variously generated combinations of independent variables to find the regression with the smallest mean-squared-residual, which will be fitted if desired. The user can easily...

  14. Evaluation of three statistical prediction models for forensic age prediction based on DNA methylation.

    PubMed

    Smeers, Inge; Decorte, Ronny; Van de Voorde, Wim; Bekaert, Bram

    2018-05-01

    DNA methylation is a promising biomarker for forensic age prediction. A challenge that has emerged in recent studies is the fact that prediction errors become larger with increasing age due to interindividual differences in epigenetic ageing rates. This phenomenon of non-constant variance or heteroscedasticity violates an assumption of the often used method of ordinary least squares (OLS) regression. The aim of this study was to evaluate alternative statistical methods that do take heteroscedasticity into account in order to provide more accurate, age-dependent prediction intervals. A weighted least squares (WLS) regression is proposed as well as a quantile regression model. Their performances were compared against an OLS regression model based on the same dataset. Both models provided age-dependent prediction intervals which account for the increasing variance with age, but WLS regression performed better in terms of success rate in the current dataset. However, quantile regression might be a preferred method when dealing with a variance that is not only non-constant, but also not normally distributed. Ultimately the choice of which model to use should depend on the observed characteristics of the data. Copyright © 2018 Elsevier B.V. All rights reserved.
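
    The contrast between OLS and WLS drawn in this abstract comes down to how much each observation counts in the fit. A minimal sketch for a single predictor follows; it is not the authors' model. The data are invented, and the weighting scheme (weights proportional to 1/x², which corresponds to a standard deviation that grows in proportion to x) is an assumption chosen only to illustrate increasing variance.

```python
def weighted_line_fit(xs, ys, weights):
    """Weighted least squares fit of y = a + b*x; returns (a, b).

    Minimises sum(w_i * (y_i - a - b*x_i)**2). Equal weights give OLS.
    """
    sw = sum(weights)
    mx = sum(w * x for w, x in zip(weights, xs)) / sw   # weighted mean of x
    my = sum(w * y for w, y in zip(weights, ys)) / sw   # weighted mean of y
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(weights, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(weights, xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical data in which the scatter grows with x.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.8, 6.5, 7.2, 11.0, 10.5]

ols = weighted_line_fit(xs, ys, [1.0] * len(xs))
# Assumed variance model: SD proportional to x, so weight = 1 / x**2.
wls = weighted_line_fit(xs, ys, [1.0 / x ** 2 for x in xs])
print("OLS intercept, slope:", ols)
print("WLS intercept, slope:", wls)
```

    Down-weighting the noisier observations lets the more precise ones dominate the fit; prediction intervals built from the same variance model then widen with x instead of staying constant.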

  15. Production of deerbrush and mountain whitethorn related to shrub volume and overstory crown closure

    Treesearch

    John G. Kie

    1985-01-01

    Annual production by deerbrush (Ceanothus integerrimus) and mountain whitethorn shrubs (C. cordulatus) in the south-central Sierra Nevada of California was related to shrub volume, volume squared, and overstory crown closure by regression models. Production increased as shrub volume and volume squared increased, and decreased as...

  16. Post-processing through linear regression

    NASA Astrophysics Data System (ADS)

    van Schaeybroeck, B.; Vannitsem, S.

    2011-03-01

    Various post-processing techniques are compared for both deterministic and ensemble forecasts, all based on linear regression between forecast data and observations. In order to evaluate the quality of the regression methods, three criteria are proposed, related to the effective correction of forecast error, the optimal variability of the corrected forecast and multicollinearity. The regression schemes under consideration include the ordinary least-square (OLS) method, a new time-dependent Tikhonov regularization (TDTR) method, the total least-square method, a new geometric-mean regression (GM), a recently introduced error-in-variables (EVMOS) method and, finally, a "best member" OLS method. The advantages and drawbacks of each method are clarified. These techniques are applied in the context of the Lorenz 63 system, whose model version is affected by both initial condition and model errors. For short forecast lead times, the number and choice of predictors play an important role. Contrary to the other techniques, GM degrades when the number of predictors increases. At intermediate lead times, linear regression is unable to provide corrections to the forecast and can sometimes degrade the performance (GM and the best member OLS with noise). At long lead times the regression schemes (EVMOS, TDTR) which yield the correct variability and the largest correlation between ensemble error and spread should be preferred.

  17. Multiple concurrent recursive least squares identification with application to on-line spacecraft mass-property identification

    NASA Technical Reports Server (NTRS)

    Wilson, Edward (Inventor)

    2006-01-01

    The present invention is a method for identifying unknown parameters in a system having a set of governing equations describing its behavior that cannot be put into regression form with the unknown parameters linearly represented. In this method, the vector of unknown parameters is segmented into a plurality of groups where each individual group of unknown parameters may be isolated linearly by manipulation of said equations. Multiple concurrent and independent recursive least squares identification of each said group run, treating other unknown parameters appearing in their regression equation as if they were known perfectly, with said values provided by recursive least squares estimation from the other groups, thereby enabling the use of fast, compact, efficient linear algorithms to solve problems that would otherwise require nonlinear solution approaches. This invention is presented with application to identification of mass and thruster properties for a thruster-controlled spacecraft.
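
    The building block that this abstract runs several copies of is the recursive least squares (RLS) update for a model that is linear in its unknown parameters. The sketch below shows a single textbook RLS estimator only; it does not implement the multiple concurrent scheme described in the patent. The model, the regressor sequence and the initial covariance value are assumptions made for illustration.

```python
def rls_update(theta, P, phi, y, forgetting=1.0):
    """One recursive least squares step for the model y = phi . theta.

    theta: current parameter estimate (list of n floats)
    P:     current n x n covariance-like matrix (symmetric, list of lists)
    phi:   regressor vector for this sample
    y:     measured output for this sample
    """
    n = len(theta)
    P_phi = [sum(P[i][j] * phi[j] for j in range(n)) for i in range(n)]
    denom = forgetting + sum(phi[i] * P_phi[i] for i in range(n))
    gain = [v / denom for v in P_phi]
    error = y - sum(phi[i] * theta[i] for i in range(n))   # prediction error
    new_theta = [theta[i] + gain[i] * error for i in range(n)]
    # P is symmetric, so phi^T P equals (P phi)^T.
    new_P = [[(P[i][j] - gain[i] * P_phi[j]) / forgetting for j in range(n)]
             for i in range(n)]
    return new_theta, new_P

# Hypothetical system: y = 3*u1 - 2*u2, measured without noise.
true_theta = [3.0, -2.0]
theta = [0.0, 0.0]
P = [[1e6, 0.0], [0.0, 1e6]]   # large initial P: little trust in theta = 0
for k in range(20):
    phi = [1.0 + (k % 5), float((k * 3) % 7) - 3.0]
    y = sum(p * t for p, t in zip(phi, true_theta))
    theta, P = rls_update(theta, P, phi, y)
print(theta)
```

    Each step costs a fixed amount of arithmetic and needs no stored history, which is why a parameter group that can be isolated linearly may be estimated this way on line.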

  18. Regression Analysis: Instructional Resource for Cost/Managerial Accounting

    ERIC Educational Resources Information Center

    Stout, David E.

    2015-01-01

    This paper describes a classroom-tested instructional resource, grounded in principles of active learning and constructivism, that embraces two primary objectives: "demystify" for accounting students technical material from statistics regarding ordinary least-squares (OLS) regression analysis--material that students may find obscure or…

  19. Exact and Approximate Statistical Inference for Nonlinear Regression and the Estimating Equation Approach.

    PubMed

    Demidenko, Eugene

    2017-09-01

    The exact density distribution of the nonlinear least squares estimator in the one-parameter regression model is derived in closed form and expressed through the cumulative distribution function of the standard normal variable. Several proposals to generalize this result are discussed. The exact density is extended to the estimating equation (EE) approach and the nonlinear regression with an arbitrary number of linear parameters and one intrinsically nonlinear parameter. For a very special nonlinear regression model, the derived density coincides with the distribution of the ratio of two normally distributed random variables previously obtained by Fieller (1932), unlike other approximations previously suggested by other authors. Approximations to the density of the EE estimators are discussed in the multivariate case. Numerical complications associated with the nonlinear least squares are illustrated, such as nonexistence and/or multiple solutions, as major factors contributing to poor density approximation. The nonlinear Markov-Gauss theorem is formulated based on the near exact EE density approximation.

  20. An index of effluent aquatic toxicity designed by partial least squares regression, using acute and chronic tests and expert judgements.

    PubMed

    Vindimian, Éric; Garric, Jeanne; Flammarion, Patrick; Thybaud, Éric; Babut, Marc

    1999-10-01

    The evaluation of the ecotoxicity of effluents requires a battery of biological tests on several species. In order to derive a summary parameter from such a battery, a single endpoint was calculated for all the tests: the EC10, obtained by nonlinear regression, with bootstrap evaluation of the confidence intervals. Principal component analysis was used to characterize and visualize the correlation between the tests. The table of the toxicity of the effluents was then submitted to a panel of experts, who classified the effluents according to the test results. Partial least squares (PLS) regression was used to fit the average value of the experts' judgements to the toxicity data, using a simple equation. Furthermore, PLS regression on partial data sets and other considerations resulted in an optimum battery, with two chronic tests and one acute test. The index is intended to be used for the classification of effluents based on their toxicity to aquatic species. Copyright © 1999 SETAC.

  1. An index of effluent aquatic toxicity designed by partial least squares regression, using acute and chronic tests and expert judgments

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Vindimian, E.; Garric, J.; Flammarion, P.

    1999-10-01

    The evaluation of the ecotoxicity of effluents requires a battery of biological tests on several species. In order to derive a summary parameter from such a battery, a single endpoint was calculated for all the tests: the EC10, obtained by nonlinear regression, with bootstrap evaluation of the confidence intervals. Principal component analysis was used to characterize and visualize the correlation between the tests. The table of the toxicity of the effluents was then submitted to a panel of experts, who classified the effluents according to the test results. Partial least squares (PLS) regression was used to fit the average value of the experts' judgments to the toxicity data, using a simple equation. Furthermore, PLS regression on partial data sets and other considerations resulted in an optimum battery, with two chronic tests and one acute test. The index is intended to be used for the classification of effluents based on their toxicity to aquatic species.

  2. Electricity Consumption in the Industrial Sector of Jordan: Application of Multivariate Linear Regression and Adaptive Neuro-Fuzzy Techniques

    NASA Astrophysics Data System (ADS)

    Samhouri, M.; Al-Ghandoor, A.; Fouad, R. H.

    2009-08-01

    In this study two techniques, for modeling electricity consumption of the Jordanian industrial sector, are presented: (i) multivariate linear regression and (ii) neuro-fuzzy models. Electricity consumption is modeled as function of different variables such as number of establishments, number of employees, electricity tariff, prevailing fuel prices, production outputs, capacity utilizations, and structural effects. It was found that industrial production and capacity utilization are the most important variables that have significant effect on future electrical power demand. The results showed that both the multivariate linear regression and neuro-fuzzy models are generally comparable and can be used adequately to simulate industrial electricity consumption. However, comparison that is based on the square root average squared error of data suggests that the neuro-fuzzy model performs slightly better for future prediction of electricity consumption than the multivariate linear regression model. Such results are in full agreement with similar work, using different methods, for other countries.

  3. Quantile regression applied to spectral distance decay

    USGS Publications Warehouse

    Rocchini, D.; Cade, B.S.

    2008-01-01

    Remotely sensed imagery has long been recognized as a powerful support for characterizing and estimating biodiversity. Spectral distance among sites has proven to be a powerful approach for detecting species composition variability. Regression analysis of species similarity versus spectral distance allows us to quantitatively estimate the amount of turnover in species composition with respect to spectral and ecological variability. In classical regression analysis, the residual sum of squares is minimized for the mean of the dependent variable distribution. However, many ecological data sets are characterized by a high number of zeroes that add noise to the regression model. Quantile regressions can be used to evaluate trend in the upper quantiles rather than a mean trend across the whole distribution of the dependent variable. In this letter, we used ordinary least squares (OLS) and quantile regressions to estimate the decay of species similarity versus spectral distance. The achieved decay rates were statistically nonzero (p < 0.01), considering both OLS and quantile regressions. Nonetheless, the OLS regression estimate of the mean decay rate was only half the decay rate indicated by the upper quantiles. Moreover, the intercept value, representing the similarity reached when the spectral distance approaches zero, was very low compared with the intercepts of the upper quantiles, which detected high species similarity when habitats are more similar. In this letter, we demonstrated the power of using quantile regressions applied to spectral distance decay to reveal species diversity patterns otherwise lost or underestimated by OLS regression. © 2008 IEEE.
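
    Quantile regression replaces the squared-error criterion of OLS with an asymmetric absolute loss, often called the pinball or check loss. The sketch below is a deliberately reduced illustration and not the authors' analysis: it fits only a constant (no spectral-distance predictor) to invented similarity values with many zeroes, to show that minimising the pinball loss for a level `tau` recovers the corresponding sample quantile while the mean is pulled down by the zeroes.

```python
def pinball_loss(residual, tau):
    """Check loss: under-predictions cost tau, over-predictions cost 1 - tau."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual

def total_loss(values, q, tau):
    """Summed pinball loss when the constant q is used to predict every value."""
    return sum(pinball_loss(v - q, tau) for v in values)

# Hypothetical similarity values, zero-inflated as in many ecological data sets.
values = [0.0, 0.0, 0.0, 0.1, 0.2, 0.4, 0.5, 0.7, 0.9]

mean_value = sum(values) / len(values)
# Search the observed values for the constant that minimises each loss.
best_median = min(values, key=lambda q: total_loss(values, q, 0.5))
best_upper = min(values, key=lambda q: total_loss(values, q, 0.9))
print(mean_value, best_median, best_upper)
```

    In a full quantile regression the constant is replaced by a linear function of the predictors and the same loss is minimised over its coefficients, typically by linear programming.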

  4. The potential of UAS imagery for soil mapping at the agricultural plot scale

    NASA Astrophysics Data System (ADS)

    Gilliot, Jean-Marc; Michelin, Joël; Becu, Maxime; Cissé, Moustapha; Hadjar, Dalila; Vaudour, Emmanuelle

    2017-04-01

    Soil mapping is expensive and time consuming. Airborne and satellite remote sensing data have already been used to predict some soil properties, but Unmanned Aerial Systems (UAS) now allow many image acquisitions under various field conditions, which favours the development of methods for building better prediction models. This study proposes an operational method for spatial prediction of soil properties (organic carbon, clay) at the scale of the agricultural plot by using UAS imagery. An agricultural plot of 28 ha, located in the western region of Paris, France, was studied from March to May 2016. An area of 3.6 ha was delimited within the plot and a total of 16 flights were completed. The UAS platforms used were the eBee fixed wing provided by Sensefly® flying at an altitude from 60 m to 130 m and the iris+ 3DR® Quadcopter (from 30 m to 100 m). Two multispectral visible near-infrared cameras were used: the AirInov® MultiSPEC 4C® and the Micasense® RedEdge®. 42 ground control points (GCP) were sampled within the 3.6 ha plot. A centimetric Trimble Geo 7x DGPS was used to determine precise GCP positions. On each GCP the soil horizons were described and the topsoil was sampled for standard physico-chemical analysis. Ground spectral measurements with a Spectral Evolution® SR-3500 spectroradiometer were made synchronously with the drone flights. 22 additional GCP were placed around the 3.6 ha area in order to achieve precise georeferencing. The multispectral mosaics were calculated using the Agisoft Photoscan® software and all mapping processing was done with the ESRI ArcGIS® 10.3 software. The soil properties were estimated by partial least squares regression (PLSR) between the laboratory analyses and the multispectral information of the UAS images, with the PLS package of the R software. The objective was to establish a model that would achieve an acceptable prediction quality using a minimum number of points. For this, we tested 5 models with a decreasing number of calibration points: 20, 15, 10, 5 and 3 points. The remaining points were used to validate the models. The point positions were determined on the basis of a soil brightness index map calculated from the UAS image, in order to distribute the points in areas of contrasted brightness. The root mean squared errors of prediction (RMSEP) obtained by cross-validation were 1.6 g.kg-1 and 28 g.kg-1 for organic carbon and clay respectively, with 20 points. Results showed that acceptable precision (2 g.kg-1 and 48 g.kg-1) could be obtained with only 3 points. This work was supported by the SolFIT research network of the BASC LabEx (Laboratory of Excellence) and by the TOSCA-PLEIADES-CO project of the French Space Agency (CNES).

  5. Error Covariance Penalized Regression: A novel multivariate model combining penalized regression with multivariate error structure.

    PubMed

    Allegrini, Franco; Braga, Jez W B; Moreira, Alessandro C O; Olivieri, Alejandro C

    2018-06-29

    A new multivariate regression model, named Error Covariance Penalized Regression (ECPR) is presented. Following a penalized regression strategy, the proposed model incorporates information about the measurement error structure of the system, using the error covariance matrix (ECM) as a penalization term. Results are reported from both simulations and experimental data based on replicate mid and near infrared (MIR and NIR) spectral measurements. The results for ECPR are better under non-iid conditions when compared with traditional first-order multivariate methods such as ridge regression (RR), principal component regression (PCR) and partial least-squares regression (PLS). Copyright © 2018 Elsevier B.V. All rights reserved.

  6. Principal components and iterative regression analysis of geophysical series: Application to Sunspot number (1750-2004)

    NASA Astrophysics Data System (ADS)

    Nordemann, D. J. R.; Rigozo, N. R.; de Souza Echer, M. P.; Echer, E.

    2008-11-01

    We present here an implementation of a least squares iterative regression method applied to the sine functions embedded in the principal components extracted from geophysical time series. This method seems to represent a useful improvement for the non-stationary time series periodicity quantitative analysis. The principal components determination followed by the least squares iterative regression method was implemented in an algorithm written in the Scilab (2006) language. The main result of the method is to obtain the set of sine functions embedded in the series analyzed in decreasing order of significance, from the most important ones, likely to represent the physical processes involved in the generation of the series, to the less important ones that represent noise components. Taking into account the need of a deeper knowledge of the Sun's past history and its implication to global climate change, the method was applied to the Sunspot Number series (1750-2004). With the threshold and parameter values used here, the application of the method leads to a total of 441 explicit sine functions, among which 65 were considered as being significant and were used for a reconstruction that gave a normalized mean squared error of 0.146.

  7. On Quantile Regression in Reproducing Kernel Hilbert Spaces with Data Sparsity Constraint

    PubMed Central

    Zhang, Chong; Liu, Yufeng; Wu, Yichao

    2015-01-01

    For spline regressions, it is well known that the choice of knots is crucial for the performance of the estimator. As a general learning framework covering the smoothing splines, learning in a Reproducing Kernel Hilbert Space (RKHS) has a similar issue. However, the selection of training data points for kernel functions in the RKHS representation has not been carefully studied in the literature. In this paper we study quantile regression as an example of learning in a RKHS. In this case, the regular squared norm penalty does not perform training data selection. We propose a data sparsity constraint that imposes thresholding on the kernel function coefficients to achieve a sparse kernel function representation. We demonstrate that the proposed data sparsity method can have competitive prediction performance for certain situations, and have comparable performance in other cases compared to that of the traditional squared norm penalty. Therefore, the data sparsity method can serve as a competitive alternative to the squared norm penalty method. Some theoretical properties of our proposed method using the data sparsity constraint are obtained. Both simulated and real data sets are used to demonstrate the usefulness of our data sparsity constraint. PMID:27134575

  8. Mixed geographically weighted regression (MGWR) model with weighted adaptive bi-square for case of dengue hemorrhagic fever (DHF) in Surakarta

    NASA Astrophysics Data System (ADS)

    Astuti, H. N.; Saputro, D. R. S.; Susanti, Y.

    2017-06-01

    The MGWR model combines a linear regression model with a geographically weighted regression (GWR) model; it therefore produces some parameter estimates that are global and others that are local to each observation location. The linkage between observation locations is expressed through a specific weighting, here the adaptive bi-square. In this research, we applied the MGWR model with adaptive bi-square weighting to cases of DHF in Surakarta, based on 10 factors (variables) that are supposed to influence the number of people with DHF. The observation units in the research are 51 urban villages and the variables are number of inhabitants, number of houses, house index, many public places, number of healthy homes, number of Posyandu, area width, level population density, welfare of the family, and high-region. Based on this research, we obtained 51 MGWR models. The MGWR models were divided into 4 groups, with house index significant as a global variable, area width significant as a local variable, and the remaining variables varying in each group. Global variables are variables that significantly affect all locations, while local variables are variables that significantly affect a specific location.

  9. Ordinary least squares regression is indicated for studies of allometry.

    PubMed

    Kilmer, J T; Rodríguez, R L

    2017-01-01

    When it comes to fitting simple allometric slopes through measurement data, evolutionary biologists have been torn between regression methods. On the one hand, there is the ordinary least squares (OLS) regression, which is commonly used across many disciplines of biology to fit lines through data, but which has a reputation for underestimating slopes when measurement error is present. On the other hand, there is the reduced major axis (RMA) regression, which is often recommended as a substitute for OLS regression in studies of allometry, but which has several weaknesses of its own. Here, we review statistical theory as it applies to evolutionary biology and studies of allometry. We point out that the concerns that arise from measurement error for OLS regression are small and straightforward to deal with, whereas RMA has several key properties that make it unfit for use in the field of allometry. The recommended approach for researchers interested in allometry is to use OLS regression on measurements taken with low (but realistically achievable) measurement error. If measurement error is unavoidable and relatively large, it is preferable to correct for slope attenuation rather than to turn to RMA regression, or to take the expected amount of attenuation into account when interpreting the data. © 2016 European Society For Evolutionary Biology. Journal of Evolutionary Biology © 2016 European Society For Evolutionary Biology.
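
    The difference between the two slope estimators discussed above, and the attenuation correction the authors recommend, can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names are hypothetical, and the correction uses the classical errors-in-variables form in which the OLS slope is divided by a reliability ratio (true variance over observed variance of x) that is assumed to be known from repeated measurements.

```python
from statistics import mean


def _sums(x, y):
    """Centered sums of squares and cross-products."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxx, syy, sxy


def ols_slope(x, y):
    """Ordinary least squares slope of y on x."""
    sxx, _, sxy = _sums(x, y)
    return sxy / sxx


def rma_slope(x, y):
    """Reduced major axis slope: sign of the covariance times sd(y) / sd(x)."""
    sxx, syy, sxy = _sums(x, y)
    sign = 1.0 if sxy >= 0 else -1.0
    return sign * (syy / sxx) ** 0.5


def corrected_ols_slope(x, y, reliability):
    """OLS slope corrected for attenuation; `reliability` is an assumed, known ratio in (0, 1]."""
    return ols_slope(x, y) / reliability


# Hypothetical log-transformed body size (x) and trait size (y).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
```

    Because the RMA slope equals the OLS slope divided by the absolute correlation, it is never smaller in magnitude than the OLS slope, whatever the source of scatter; the correction above instead rescales only by the share of variance in x attributed to measurement error.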

  10. Quality of semen: a 6-year single experience study on 5680 patients.

    PubMed

    Cozzolino, Mauro; Coccia, Maria E; Picone, Rita

    2018-02-08

    The aim of our study was to evaluate the quality of semen of a large sample from general healthy population living in Italy, in order to identify possible variables that could influence several parameters of spermiogram. We conducted a cross-sectional study from February 2010 to March 2015, collecting semen samples from the general population. Semen analysis was performed according to the WHO guidelines. The collected data were inserted in a database and processed using the software Stata 12. The Mann-Whitney test was used to assess the relationship of dichotomous variables with the parameters of the spermiogram; Kruskal-Wallis test for variables with more than two categories. We used also Robust regression and Spearman correlation to analyze the relationship between age and the parameters. We collected 5680 samples of semen. The mean age of our patients was 41.4 years old. Mann-Whitney test showed that the citizenship (codified as "Italian/Foreign") influences some parameters: pH, vitality, number of spermatozoa, sperm concentration, with worse results for the Italian group. Kruskal-Wallis test showed that the single nationality influences pH, volume, Sperm motility A-B-C-D, vitality, morphology, number of spermatozoa, sperm concentration. Robust regression showed a relationship between age and several parameters: volume (p=0.04, R squared= 0.0007 β: - 0.06); sperm motility A (p<0.01; R squared 0.0051 β: 0.02); sperm motility B (p<0.01; R squared 0.02 β: -0.35); sperm motility C (p<0.01; R squared 0.01 β: 0.12); sperm motility D (p<0.01; R squared 0.006 β: 0.2); vitality (p<0.01; R squared 0.01 β: -0.32); sperm concentration (p=0.01; R squared 0.001 β: 0.19). Our patients had spermiogram's results quite better than the standard guidelines. Our study showed that the country of origin could be a factor influencing several parameters of the spermiogram in healthy population and through Robust regression confirmed a strict correlation between age and these parameters.

  11. Use of Empirical Estimates of Shrinkage in Multiple Regression: A Caution.

    ERIC Educational Resources Information Center

    Kromrey, Jeffrey D.; Hines, Constance V.

    1995-01-01

    The accuracy of four empirical techniques to estimate shrinkage in multiple regression was studied through Monte Carlo simulation. None of the techniques provided unbiased estimates of the population squared multiple correlation coefficient, but the normalized jackknife and bootstrap techniques demonstrated marginally acceptable performance with…

  12. Enhance-Synergism and Suppression Effects in Multiple Regression

    ERIC Educational Resources Information Center

    Lipovetsky, Stan; Conklin, W. Michael

    2004-01-01

    Relations between pairwise correlations and the coefficient of multiple determination in regression analysis are considered. The conditions for the occurrence of enhance-synergism and suppression effects when multiple determination becomes bigger than the total of squared correlations of the dependent variable with the regressors are discussed. It…

  13. Method for nonlinear exponential regression analysis

    NASA Technical Reports Server (NTRS)

    Junkin, B. G.

    1972-01-01

    Two computer programs developed according to two general types of exponential models for conducting nonlinear exponential regression analysis are described. Least squares procedure is used in which the nonlinear problem is linearized by expanding in a Taylor series. Program is written in FORTRAN 5 for the Univac 1108 computer.
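
    The linearization described above, in which the nonlinear least squares problem is expanded in a Taylor series and solved iteratively, corresponds to what is now usually called the Gauss-Newton method. The following is a minimal sketch for one assumed model form, y = a * exp(b * x); it is not a reconstruction of the FORTRAN programs, and the function name, starting values and data are hypothetical.

```python
import math


def fit_exponential(x, y, a, b, iterations=50, tol=1e-12):
    """Gauss-Newton fit of y = a * exp(b * x), starting from the guesses a and b."""
    for _ in range(iterations):
        # Accumulate the 2x2 normal equations (J^T J) delta = J^T r of the
        # first-order Taylor expansion around the current (a, b).
        s11 = s12 = s22 = g1 = g2 = 0.0
        for xi, yi in zip(x, y):
            e = math.exp(b * xi)
            r = yi - a * e    # residual at the current parameters
            ja = e            # d(model)/da
            jb = a * xi * e   # d(model)/db
            s11 += ja * ja
            s12 += ja * jb
            s22 += jb * jb
            g1 += ja * r
            g2 += jb * r
        det = s11 * s22 - s12 * s12
        da = (s22 * g1 - s12 * g2) / det
        db = (s11 * g2 - s12 * g1) / det
        a += da
        b += db
        if abs(da) + abs(db) < tol:
            break
    return a, b


# Noise-free hypothetical data generated from a = 2, b = 0.5.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * math.exp(0.5 * xi) for xi in xs]
a_hat, b_hat = fit_exponential(xs, ys, a=1.5, b=0.4)
```

    The iteration needs reasonable starting values; a common choice, assumed here rather than taken from the report, is to fit a straight line to log(y) against x first.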

  14. A Comparative Investigation of the Combined Effects of Pre-Processing, Wavelength Selection, and Regression Methods on Near-Infrared Calibration Model Performance.

    PubMed

    Wan, Jian; Chen, Yi-Chieh; Morris, A Julian; Thennadil, Suresh N

    2017-07-01

    Near-infrared (NIR) spectroscopy is being widely used in various fields ranging from pharmaceutics to the food industry for analyzing chemical and physical properties of the substances concerned. Its advantages over other analytical techniques include available physical interpretation of spectral data, nondestructive nature and high speed of measurements, and little or no need for sample preparation. The successful application of NIR spectroscopy relies on three main aspects: pre-processing of spectral data to eliminate nonlinear variations due to temperature, light scattering effects and many others, selection of those wavelengths that contribute useful information, and identification of suitable calibration models using linear/nonlinear regression. Several methods have been developed for each of these three aspects and many comparative studies of different methods exist for an individual aspect or some combinations. However, there is still a lack of comparative studies for the interactions among these three aspects, which can shed light on what role each aspect plays in the calibration and how to combine various methods of each aspect together to obtain the best calibration model. This paper aims to provide such a comparative study based on four benchmark data sets using three typical pre-processing methods, namely, orthogonal signal correction (OSC), extended multiplicative signal correction (EMSC) and optical path-length estimation and correction (OPLEC); two existing wavelength selection methods, namely, stepwise forward selection (SFS) and genetic algorithm optimization combined with partial least squares regression for spectral data (GAPLSSP); four popular regression methods, namely, partial least squares (PLS), least absolute shrinkage and selection operator (LASSO), least squares support vector machine (LS-SVM), and Gaussian process regression (GPR). The comparative study indicates that, in general, pre-processing of spectral data can play a significant role in the calibration while wavelength selection plays a marginal role and the combination of certain pre-processing, wavelength selection, and nonlinear regression methods can achieve superior performance over traditional linear regression-based calibration.

  15. Inverse models: A necessary next step in ground-water modeling

    USGS Publications Warehouse

    Poeter, E.P.; Hill, M.C.

    1997-01-01

    Inverse models using, for example, nonlinear least-squares regression, provide capabilities that help modelers take full advantage of the insight available from ground-water models. However, lack of information about the requirements and benefits of inverse models is an obstacle to their widespread use. This paper presents a simple ground-water flow problem to illustrate the requirements and benefits of the nonlinear least-squares regression method of inverse modeling and discusses how these attributes apply to field problems. The benefits of inverse modeling include: (1) expedited determination of best fit parameter values; (2) quantification of the (a) quality of calibration, (b) data shortcomings and needs, and (c) confidence limits on parameter estimates and predictions; and (3) identification of issues that are easily overlooked during nonautomated calibration.

  16. The Impact of School Socioeconomic Status on Student-Generated Teacher Ratings

    ERIC Educational Resources Information Center

    Agnew, Steve

    2011-01-01

    This paper uses ordinary least squares, logit and probit regressions, along with chi-square analysis applied to nationwide data from the New Zealand ratemyteacher website to establish if there is any correlation between student ratings of their teachers and the socioeconomic status of the school the students attend. The results show that students…

  17. Least-squares sequential parameter and state estimation for large space structures

    NASA Technical Reports Server (NTRS)

    Thau, F. E.; Eliazov, T.; Montgomery, R. C.

    1982-01-01

    This paper presents the formulation of simultaneous state and parameter estimation problems for flexible structures in terms of least-squares minimization problems. The approach combines an on-line order determination algorithm, with least-squares algorithms for finding estimates of modal approximation functions, modal amplitudes, and modal parameters. The approach combines previous results on separable nonlinear least squares estimation with a regression analysis formulation of the state estimation problem. The technique makes use of sequential Householder transformations. This allows for sequential accumulation of matrices required during the identification process. The technique is used to identify the modal parameters of a flexible beam.

  18. Quantification of brain lipids by FTIR spectroscopy and partial least squares regression

    NASA Astrophysics Data System (ADS)

    Dreissig, Isabell; Machill, Susanne; Salzer, Reiner; Krafft, Christoph

    2009-01-01

    Brain tissue is characterized by high lipid content. Its content decreases and the lipid composition changes during transformation from normal brain tissue to tumors. Therefore, the analysis of brain lipids might complement the existing diagnostic tools to determine the tumor type and tumor grade. Objective of this work is to extract lipids from gray matter and white matter of porcine brain tissue, record infrared (IR) spectra of these extracts and develop a quantification model for the main lipids based on partial least squares (PLS) regression. IR spectra of the pure lipids cholesterol, cholesterol ester, phosphatidic acid, phosphatidylcholine, phosphatidylethanolamine, phosphatidylserine, phosphatidylinositol, sphingomyelin, galactocerebroside and sulfatide were used as references. Two lipid mixtures were prepared for training and validation of the quantification model. The composition of lipid extracts that were predicted by the PLS regression of IR spectra was compared with lipid quantification by thin layer chromatography.

  19. Fast determination of total ginsenosides content in ginseng powder by near infrared reflectance spectroscopy

    NASA Astrophysics Data System (ADS)

    Chen, Hua-cai; Chen, Xing-dan; Lu, Yong-jun; Cao, Zhi-qiang

    2006-01-01

    Near infrared (NIR) reflectance spectroscopy was used to develop a fast determination method for total ginsenosides in Ginseng (Panax Ginseng) powder. The spectra were analyzed with the multiplicative signal correction (MSC) correlation method. The spectral regions best correlated with the total ginsenosides content were 1660 nm~1880 nm and 2230 nm~2380 nm. The NIR calibration models of ginsenosides were built with multiple linear regression (MLR), principal component regression (PCR) and partial least squares (PLS) regression respectively. The results showed that the calibration model built with PLS combined with MSC and the optimal spectrum region was the best one. The correlation coefficient and the root mean square error of correction validation (RMSEC) of the best calibration model were 0.98 and 0.15% respectively. The optimal spectrum region for calibration was 1204 nm~2014 nm. The results suggested that using NIR to rapidly determine the total ginsenosides content in ginseng powder was feasible.

  20. Hypothesis Testing Using Factor Score Regression

    PubMed Central

    Devlieger, Ines; Mayer, Axel; Rosseel, Yves

    2015-01-01

    In this article, an overview is given of four methods to perform factor score regression (FSR), namely regression FSR, Bartlett FSR, the bias avoiding method of Skrondal and Laake, and the bias correcting method of Croon. The bias correcting method is extended to include a reliable standard error. The four methods are compared with each other and with structural equation modeling (SEM) by using analytic calculations and two Monte Carlo simulation studies to examine their finite sample characteristics. Several performance criteria are used, such as the bias using the unstandardized and standardized parameterization, efficiency, mean square error, standard error bias, type I error rate, and power. The results show that the bias correcting method, with the newly developed standard error, is the only suitable alternative for SEM. While it has a higher standard error bias than SEM, it has a comparable bias, efficiency, mean square error, power, and type I error rate. PMID:29795886

  1. Prediction of clinical depression scores and detection of changes in whole-brain using resting-state functional MRI data with partial least squares regression

    PubMed Central

    Shimizu, Yu; Yoshimoto, Junichiro; Takamura, Masahiro; Okada, Go; Okamoto, Yasumasa; Yamawaki, Shigeto; Doya, Kenji

    2017-01-01

    In diagnostic applications of statistical machine learning methods to brain imaging data, common problems include data high-dimensionality and co-linearity, which often cause over-fitting and instability. To overcome these problems, we applied partial least squares (PLS) regression to resting-state functional magnetic resonance imaging (rs-fMRI) data, creating a low-dimensional representation that relates symptoms to brain activity and that predicts clinical measures. Our experimental results, based upon data from clinically depressed patients and healthy controls, demonstrated that PLS and its kernel variants provided significantly better prediction of clinical measures than ordinary linear regression. Subsequent classification using predicted clinical scores distinguished depressed patients from healthy controls with 80% accuracy. Moreover, loading vectors for latent variables enabled us to identify brain regions relevant to depression, including the default mode network, the right superior frontal gyrus, and the superior motor area. PMID:28700672

  2. Raman spectroscopy compared against traditional predictors of shear force in lamb m. longissimus lumborum.

    PubMed

    Fowler, Stephanie M; Schmidt, Heinar; van de Ven, Remy; Wynn, Peter; Hopkins, David L

    2014-12-01

    A Raman spectroscopic hand held device was used to predict shear force (SF) of 80 fresh lamb m. longissimus lumborum (LL) at 1 and 5 days post mortem (PM). Traditional predictors of SF including sarcomere length (SL), particle size (PS), cooking loss (CL), percentage myofibrillar breaks and pH were also measured. SF values were regressed against Raman spectra using partial least squares regression and against the traditional predictors using linear regression. The best prediction of shear force values used spectra at 1 day PM to predict shear force at 1 day, which gave a root mean square error of prediction (RMSEP) of 13.6 (Null=14.0) and the R(2) between observed and cross validated predicted values was 0.06 (R(2)cv). Overall, for fresh LL, the predictability of SF by either the Raman hand held probe or traditional predictors was low. Copyright © 2014 Elsevier Ltd. All rights reserved.
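
    Reporting an RMSEP next to a "Null" value, as above, is informative because it shows how little the model improves on a trivial predictor. The sketch below assumes that the null model simply predicts the mean of the observed values; the abstract does not define it, so this reading, the function names and the numbers are all hypothetical.

```python
from statistics import mean


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    return (sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)) ** 0.5


def null_rmse(observed):
    """RMSE of a null model that always predicts the mean of the observations (assumed definition)."""
    m = mean(observed)
    return rmse(observed, [m] * len(observed))


# Hypothetical shear force values and cross-validated predictions.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 39.0]

model_error = rmse(observed, predicted)
baseline_error = null_rmse(observed)
```

    A model error close to the baseline error, as with 13.6 against 14.0 in the abstract, indicates that the predictors explain very little of the variation.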

  3. Robust regression on noisy data for fusion scaling laws

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Verdoolaege, Geert, E-mail: geert.verdoolaege@ugent.be; Laboratoire de Physique des Plasmas de l'ERM - Laboratorium voor Plasmafysica van de KMS

    2014-11-15

    We introduce the method of geodesic least squares (GLS) regression for estimating fusion scaling laws. Based on straightforward principles, the method is easily implemented, yet it clearly outperforms established regression techniques, particularly in cases of significant uncertainty on both the response and predictor variables. We apply GLS for estimating the scaling of the L-H power threshold, resulting in estimates for ITER that are somewhat higher than predicted earlier.

  4. Application of nonlinear least-squares regression to ground-water flow modeling, west-central Florida

    USGS Publications Warehouse

    Yobbi, D.K.

    2000-01-01

    A nonlinear least-squares regression technique for estimation of ground-water flow model parameters was applied to an existing model of the regional aquifer system underlying west-central Florida. The regression technique minimizes the differences between measured and simulated water levels. Regression statistics, including parameter sensitivities and correlations, were calculated for reported parameter values in the existing model. Optimal parameter values for selected hydrologic variables of interest are estimated by nonlinear regression. Optimal estimates of parameter values are about 140 times greater than and about 0.01 times less than reported values. Independently estimating all parameters by nonlinear regression was impossible, given the existing zonation structure and number of observations, because of parameter insensitivity and correlation. Although the model yields parameter values similar to those estimated by other methods and reproduces the measured water levels reasonably accurately, a simpler parameter structure should be considered. Some possible ways of improving model calibration are to: (1) modify the defined parameter-zonation structure by omitting and/or combining parameters to be estimated; (2) carefully eliminate observation data based on evidence that they are likely to be biased; (3) collect additional water-level data; (4) assign values to insensitive parameters, and (5) estimate the most sensitive parameters first, then, using the optimized values for these parameters, estimate the entire data set.

  5. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kwon, Deukwoo; Little, Mark P.; Miller, Donald L.

    Purpose: To determine more accurate regression formulas for estimating peak skin dose (PSD) from reference air kerma (RAK) or kerma-area product (KAP). Methods: After grouping of the data from 21 procedures into 13 clinically similar groups, assessments were made of optimal clustering using the Bayesian information criterion to obtain the optimal linear regressions of (log-transformed) PSD vs RAK, PSD vs KAP, and PSD vs RAK and KAP. Results: Three clusters of clinical groups were optimal in regression of PSD vs RAK, seven clusters of clinical groups were optimal in regression of PSD vs KAP, and six clusters of clinical groups were optimal in regression of PSD vs RAK and KAP. Prediction of PSD using both RAK and KAP is significantly better than prediction of PSD with either RAK or KAP alone. The regression of PSD vs RAK provided better predictions of PSD than the regression of PSD vs KAP. The partial-pooling (clustered) method yields smaller mean squared errors compared with the complete-pooling method. Conclusion: PSD distributions for interventional radiology procedures are log-normal. Estimates of PSD derived from RAK and KAP jointly are most accurate, followed closely by estimates derived from RAK alone. Estimates of PSD derived from KAP alone are the least accurate. Using a stochastic search approach, it is possible to cluster together certain dissimilar types of procedures to minimize the total error sum of squares.

  6. Two biased estimation techniques in linear regression: Application to aircraft

    NASA Technical Reports Server (NTRS)

    Klein, Vladislav

    1988-01-01

    Several ways for detection and assessment of collinearity in measured data are discussed. Because data collinearity usually results in poor least squares estimates, two estimation techniques which can limit a damaging effect of collinearity are presented. These two techniques, the principal components regression and mixed estimation, belong to a class of biased estimation techniques. Detection and assessment of data collinearity and the two biased estimation techniques are demonstrated in two examples using flight test data from longitudinal maneuvers of an experimental aircraft. The eigensystem analysis and parameter variance decomposition appeared to be a promising tool for collinearity evaluation. The biased estimators had far better accuracy than the results from the ordinary least squares technique.

  7. Diabetic Prevalence in Bangladesh: The Role of Some Associated Demographic and Socioeconomic Characteristics

    NASA Astrophysics Data System (ADS)

    Imam, Tasneem

    2012-12-01

    The study attempts to examine the association of a few selected socio-economic and demographic characteristics with diabetic prevalence. Nationally representative data from BIRDEM 2000 have been used to meet the objectives of the study. Cross tabulation, Chi-square and logistic regression analysis have been used to portray the necessary associations. Chi-square reveals a significant relationship between diabetic prevalence and all the selected demographic and socio-economic variables except "education", while logistic regression analysis shows no significant contribution of "age" and "education" to diabetic prevalence. It has to be noted that this paper dealt with all three types of diabetes: Type 1, Type 2 and Gestational.

  8. Using the Criterion-Predictor Factor Model to Compute the Probability of Detecting Prediction Bias with Ordinary Least Squares Regression

    ERIC Educational Resources Information Center

    Culpepper, Steven Andrew

    2012-01-01

    The study of prediction bias is important and the last five decades include research studies that examined whether test scores differentially predict academic or employment performance. Previous studies used ordinary least squares (OLS) to assess whether groups differ in intercepts and slopes. This study shows that OLS yields inaccurate inferences…

  9. A New Global Regression Analysis Method for the Prediction of Wind Tunnel Model Weight Corrections

    NASA Technical Reports Server (NTRS)

    Ulbrich, Norbert Manfred; Bridge, Thomas M.; Amaya, Max A.

    2014-01-01

    A new global regression analysis method is discussed that predicts wind tunnel model weight corrections for strain-gage balance loads during a wind tunnel test. The method determines corrections by combining "wind-on" model attitude measurements with least squares estimates of the model weight and center of gravity coordinates that are obtained from "wind-off" data points. The method treats the least squares fit of the model weight separate from the fit of the center of gravity coordinates. Therefore, it performs two fits of "wind-off" data points and uses the least squares estimator of the model weight as an input for the fit of the center of gravity coordinates. Explicit equations for the least squares estimators of the weight and center of gravity coordinates are derived that simplify the implementation of the method in the data system software of a wind tunnel. In addition, recommendations for sets of "wind-off" data points are made that take typical model support system constraints into account. Explicit equations of the confidence intervals on the model weight and center of gravity coordinates and two different error analyses of the model weight prediction are also discussed in the appendices of the paper.

  10. Modelling of the batch biosorption system: study on exchange of protons with cell wall-bound mineral ions.

    PubMed

    Mishra, Vishal

    2015-01-01

    The interchange of the protons with the cell wall-bound calcium and magnesium ions at the interface of solution/bacterial cell surface in the biosorption system at various concentrations of protons has been studied in the present work. A mathematical model for establishing the correlation between concentration of protons and active sites was developed and optimized. The sporadic limited residence time reactor was used to titrate the calcium and magnesium ions at the individual data point. The accuracy of the proposed mathematical model was estimated using error functions such as nonlinear regression, adjusted nonlinear regression coefficient, the chi-square test, P-test and F-test. The values of the chi-square test (0.042-0.017), P-test (<0.001-0.04), sum of square errors (0.061-0.016), root mean square error (0.01-0.04) and F-test (2.22-19.92) reported in the present research indicated the suitability of the model over a wide range of proton concentrations. The zeta potential of the bacterium surface at various concentrations of protons was observed to validate the denaturation of active sites.

  11. Support vector machine regression (SVR/LS-SVM)--an alternative to neural networks (ANN) for analytical chemistry? Comparison of nonlinear methods on near infrared (NIR) spectroscopy data.

    PubMed

    Balabin, Roman M; Lomakina, Ekaterina I

    2011-04-21

    In this study, we make a general comparison of the accuracy and robustness of five multivariate calibration models: partial least squares (PLS) regression or projection to latent structures, polynomial partial least squares (Poly-PLS) regression, artificial neural networks (ANNs), and two novel techniques based on support vector machines (SVMs) for multivariate data analysis: support vector regression (SVR) and least-squares support vector machines (LS-SVMs). The comparison is based on fourteen (14) different datasets: seven sets of gasoline data (density, benzene content, and fractional composition/boiling points), two sets of ethanol gasoline fuel data (density and ethanol content), one set of diesel fuel data (total sulfur content), three sets of petroleum (crude oil) macromolecules data (weight percentages of asphaltenes, resins, and paraffins), and one set of petroleum resins data (resins content). Vibrational (near-infrared, NIR) spectroscopic data are used to predict the properties and quality coefficients of gasoline, biofuel/biodiesel, diesel fuel, and other samples of interest. The four systems presented here range greatly in composition, properties, strength of intermolecular interactions (e.g., van der Waals forces, H-bonds), colloid structure, and phase behavior. Due to the high diversity of chemical systems studied, general conclusions about SVM regression methods can be made. We try to answer the following question: to what extent can SVM-based techniques replace ANN-based approaches in real-world (industrial/scientific) applications? The results show that both SVR and LS-SVM methods are comparable to ANNs in accuracy. Due to the much higher robustness of the former, the SVM-based approaches are recommended for practical (industrial) application. This has been shown to be especially true for complicated, highly nonlinear objects.

  12. Using Remote Sensing Data to Evaluate Surface Soil Properties in Alabama Ultisols

    NASA Technical Reports Server (NTRS)

    Sullivan, Dana G.; Shaw, Joey N.; Rickman, Doug; Mask, Paul L.; Luvall, Jeff

    2005-01-01

    Evaluation of surface soil properties via remote sensing could facilitate soil survey mapping, erosion prediction and allocation of agrochemicals for precision management. The objective of this study was to evaluate the relationship between soil spectral signature and surface soil properties in conventionally managed row crop systems. High-resolution RS data were acquired over bare fields in the Coastal Plain, Appalachian Plateau, and Ridge and Valley provinces of Alabama using the Airborne Terrestrial Applications Sensor multispectral scanner. Soils ranged from sandy Kandiudults to fine textured Rhodudults. Surface soil samples (0-1 cm) were collected from 163 sampling points for soil organic carbon, particle size distribution, and citrate dithionite extractable iron content. Surface roughness, soil water content, and crusting were also measured during sampling. Two methods of analysis were evaluated: 1) multiple linear regression using common spectral band ratios, and 2) partial least squares regression. Our data show that thermal infrared spectra are highly, linearly related to soil organic carbon, sand and clay content. Soil organic carbon content was the most difficult to quantify in these highly weathered systems, where soil organic carbon was generally less than 1.2%. Estimates of sand and clay content were best using partial least squares regression at the Valley site, explaining 42-59% of the variability. In the Coastal Plain, sandy surfaces prone to crusting limited estimates of sand and clay content via partial least squares and regression with common band ratios. Estimates of iron oxide content were a function of mineralogy and best accomplished using specific band ratios, with regression explaining 36-65% of the variability at the Valley and Coastal Plain sites, respectively.

  13. Impact of multicollinearity on small sample hydrologic regression models

    NASA Astrophysics Data System (ADS)

    Kroll, Charles N.; Song, Peter

    2013-06-01

    Often hydrologic regression models are developed with ordinary least squares (OLS) procedures. The use of OLS with highly correlated explanatory variables produces multicollinearity, which creates highly sensitive parameter estimators with inflated variances and improper model selection. It is not clear how to best address multicollinearity in hydrologic regression models. Here a Monte Carlo simulation is developed to compare four techniques to address multicollinearity: OLS, OLS with variance inflation factor screening (VIF), principal component regression (PCR), and partial least squares regression (PLS). The performance of these four techniques was observed for varying sample sizes, correlation coefficients between the explanatory variables, and model error variances consistent with hydrologic regional regression models. The negative effects of multicollinearity are magnified at smaller sample sizes, higher correlations between the variables, and larger model error variances (smaller R²). The Monte Carlo simulation indicates that if the true model is known, multicollinearity is present, and the estimation and statistical testing of regression parameters are of interest, then PCR or PLS should be employed. If the model is unknown, or if the interest is solely on model predictions, it is recommended that OLS be employed since using more complicated techniques did not produce any improvement in model performance. A leave-one-out cross-validation case study was also performed using low-streamflow data sets from the eastern United States. Results indicate that OLS with stepwise selection generally produces models across study regions with varying levels of multicollinearity that are as good as biased regression techniques such as PCR and PLS.
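
    As an illustration of the variance inflation factor mentioned in this abstract, the sketch below computes VIF_j = 1 / (1 - R_j²), where R_j² comes from regressing explanatory variable j on the remaining ones. The function name and the synthetic data are hypothetical, and the abstract does not state which screening cutoff the authors applied, so none is assumed here.

```python
import numpy as np

def variance_inflation_factors(X):
    """VIF for each column of X: regress it on the other columns (plus an
    intercept) and return 1 / (1 - R^2) of that auxiliary regression."""
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        r2 = 1.0 - (resid @ resid) / np.sum((target - target.mean()) ** 2)
        vifs[j] = 1.0 / (1.0 - r2)
    return vifs

# Synthetic explanatory variables (illustrative only, not streamflow data):
# the first two are strongly correlated, the third is independent.
rng = np.random.default_rng(1)
v1 = rng.normal(size=300)
v2 = 0.9 * v1 + 0.3 * rng.normal(size=300)
v3 = rng.normal(size=300)
X_vif = np.column_stack([v1, v2, v3])

print(variance_inflation_factors(X_vif))
```

    The same values appear on the diagonal of the inverse of the correlation matrix of the explanatory variables, which offers a quick cross-check.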

  14. Linear Least Squares for Correlated Data

    NASA Technical Reports Server (NTRS)

    Dean, Edwin B.

    1988-01-01

    Throughout the literature authors have consistently discussed the suspicion that regression results were less than satisfactory when the independent variables were correlated. Camm, Gulledge, and Womer, and Womer and Marcotte provide excellent applied examples of these concerns. Many authors have obtained partial solutions for this problem as discussed by Womer and Marcotte and Wonnacott and Wonnacott, which result in generalized least squares algorithms to solve restrictive cases. This paper presents a simple but relatively general multivariate method for obtaining linear least squares coefficients which are free of the statistical distortion created by correlated independent variables.

  15. Wind Tunnel Strain-Gage Balance Calibration Data Analysis Using a Weighted Least Squares Approach

    NASA Technical Reports Server (NTRS)

    Ulbrich, N.; Volden, T.

    2017-01-01

    A new approach is presented that uses a weighted least squares fit to analyze wind tunnel strain-gage balance calibration data. The weighted least squares fit is specifically designed to increase the influence of single-component loadings during the regression analysis. The weighted least squares fit also reduces the impact of calibration load schedule asymmetries on the predicted primary sensitivities of the balance gages. A weighting factor between zero and one is assigned to each calibration data point that depends on a simple count of its intentionally loaded load components or gages. The greater the number of a data point's intentionally loaded load components or gages is, the smaller its weighting factor becomes. The proposed approach is applicable to both the Iterative and Non-Iterative Methods that are used for the analysis of strain-gage balance calibration data in the aerospace testing community. The Iterative Method uses a reasonable estimate of the tare corrected load set as input for the determination of the weighting factors. The Non-Iterative Method, on the other hand, uses gage output differences relative to the natural zeros as input for the determination of the weighting factors. Machine calibration data of a six-component force balance is used to illustrate benefits of the proposed weighted least squares fit. In addition, a detailed derivation of the PRESS residuals associated with a weighted least squares fit is given in the appendices of the paper as this information could not be found in the literature. These PRESS residuals may be needed to evaluate the predictive capabilities of the final regression models that result from a weighted least squares fit of the balance calibration data.
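
    The weighting idea described above can be sketched as follows. The abstract states only that each weighting factor lies between zero and one and shrinks as the number of intentionally loaded components grows; the specific rule used here (the reciprocal of the count), the function names, and the toy load matrix are assumptions for illustration and should not be read as the authors' formula.

```python
import numpy as np

def weights_from_load_count(loads, tol=0.0):
    """Hypothetical weighting rule: 1 / (number of loaded components).
    Points with no loaded component receive a weight of one."""
    counts = np.count_nonzero(np.abs(loads) > tol, axis=1)
    return 1.0 / np.maximum(counts, 1)

def weighted_least_squares(A, y, w):
    """Minimize sum_i w_i * (y_i - A_i @ coef)^2 by scaling each row of the
    system with sqrt(w_i) and solving an ordinary least squares problem."""
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y * sw, rcond=None)
    return coef

# Toy calibration loads (illustrative only): three single-component
# loadings followed by two combined loadings.
loads = np.array([
    [1.0, 0.0, 0.0],
    [0.0, 2.0, 0.0],
    [0.0, 0.0, 3.0],
    [1.0, 2.0, 0.0],
    [1.0, 2.0, 3.0],
])
true_coef = np.array([2.0, -1.0, 0.5])
outputs = loads @ true_coef            # noise-free gage output for the demo

w = weights_from_load_count(loads)
print("weights:", w)
print("coefficients:", weighted_least_squares(loads, outputs, w))
```

    With this rule, single-component loadings keep full weight while combined loadings are down-weighted, which mirrors the stated goal of increasing the influence of single-component loadings on the fit.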

  16. America's Democracy Colleges: The Civic Engagement of Community College Students

    ERIC Educational Resources Information Center

    Angeli Newell, Mallory

    2014-01-01

    This study explored the civic engagement of current two- and four-year students to explore whether differences exist between the groups and what may explain the differences. Using binary logistic regression and Ordinary Least Squares regression it was found that community-based engagement was lower for two- than four-year students, though…

  17. A Simple and Convenient Method of Multiple Linear Regression to Calculate Iodine Molecular Constants

    ERIC Educational Resources Information Center

    Cooper, Paul D.

    2010-01-01

    A new procedure using a student-friendly least-squares multiple linear-regression technique utilizing a function within Microsoft Excel is described that enables students to calculate molecular constants from the vibronic spectrum of iodine. This method is advantageous pedagogically as it calculates molecular constants for ground and excited…

  18. Revisiting the Scale-Invariant, Two-Dimensional Linear Regression Method

    ERIC Educational Resources Information Center

    Patzer, A. Beate C.; Bauer, Hans; Chang, Christian; Bolte, Jan; Sülzle, Detlev

    2018-01-01

    The scale-invariant way to analyze two-dimensional experimental and theoretical data with statistical errors in both the independent and dependent variables is revisited by using what we call the triangular linear regression method. This is compared to the standard least-squares fit approach by applying it to typical simple sets of example data…

  19. Robust Regression for Slope Estimation in Curriculum-Based Measurement Progress Monitoring

    ERIC Educational Resources Information Center

    Mercer, Sterett H.; Lyons, Alina F.; Johnston, Lauren E.; Millhoff, Courtney L.

    2015-01-01

    Although ordinary least-squares (OLS) regression has been identified as a preferred method to calculate rates of improvement for individual students during curriculum-based measurement (CBM) progress monitoring, OLS slope estimates are sensitive to the presence of extreme values. Robust estimators have been developed that are less biased by…

  20. Pick Your Poisson: A Tutorial on Analyzing Counts of Student Victimization Data

    ERIC Educational Resources Information Center

    Huang, Francis L.; Cornell, Dewey G.

    2012-01-01

    School violence research is often concerned with infrequently occurring events such as counts of the number of bullying incidents or fights a student may experience. Analyzing count data using ordinary least squares regression may produce improbable predicted values, and as a result of regression assumption violations, result in higher Type I…
