NASA Astrophysics Data System (ADS)
Li, Jiangtong; Luo, Yongdao; Dai, Honglin
2018-01-01
Water is the source and essential foundation of all life. With industrialization, water pollution is becoming increasingly frequent, directly affecting human survival and development. Water quality detection is one of the necessary measures for protecting water resources. Ultraviolet (UV) spectral analysis is an important research method in water quality detection, and partial least squares regression (PLSR) is becoming the predominant analysis technique; however, in some special cases PLSR produces considerable errors. To solve this problem, the traditional principal component regression (PCR) method was improved in this paper by using the principle of PLSR. The experimental results show that, for some special experimental data sets, the improved PCR method performs better than PLSR. PCR and PLSR are the focus of this paper. Firstly, principal component analysis (PCA) is performed in MATLAB to reduce the dimensionality of the spectral data; on the basis of a large number of experiments, the optimized principal components, which carry most of the original data information, are extracted by using the principle of PLSR. Secondly, linear regression analysis of the principal components is carried out with the Statistical Package for the Social Sciences (SPSS), from which the coefficients and relations of the principal components are obtained. Finally, the same water spectral data set is analyzed by PLSR and by improved PCR and the two results are compared: improved PCR and PLSR give similar results for most data, but improved PCR is better than PLSR for data near the detection limit. Both PLSR and improved PCR can be used in UV spectral analysis of water, but for data near the detection limit, improved PCR gives better results than PLSR.
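The two-step workflow described in this abstract (PCA for dimensionality reduction, then linear regression on the principal components) can be sketched as follows. This is a minimal illustration of classical PCR only, not the authors' improved variant, whose component-selection rule is not given in the abstract; the function names and the synthetic data are assumptions made for the example.

```python
import numpy as np

def fit_pcr(X, y, n_components):
    """Classical PCR: PCA on centered X, then least squares on the scores."""
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    # Rows of Vt are the principal directions (loadings) of the centered spectra.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T                      # (n_wavelengths, n_components)
    T = Xc @ P                                   # principal component scores
    gamma, *_ = np.linalg.lstsq(T, y - y_mean, rcond=None)
    coef = P @ gamma                             # coefficients in wavelength space
    intercept = y_mean - x_mean @ coef
    return coef, intercept

def predict(X, coef, intercept):
    return X @ coef + intercept

# Synthetic "spectra" driven by two latent components (hypothetical data).
rng = np.random.default_rng(0)
latent = rng.normal(size=(40, 2))
loadings = rng.normal(size=(2, 50))
X = latent @ loadings + 0.01 * rng.normal(size=(40, 50))
y = 3.0 * latent[:, 0] - 1.5 * latent[:, 1] + 2.0

coef, intercept = fit_pcr(X, y, n_components=2)
rmse = float(np.sqrt(np.mean((predict(X, coef, intercept) - y) ** 2)))
```

Note that PCR chooses its components from the variance of X alone, whereas PLSR also uses y when extracting components; that difference is presumably what "using the principle of PLSR" refers to.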
Partial Least Squares Regression Models for the Analysis of Kinase Signaling.
Bourgeois, Danielle L; Kreeger, Pamela K
2017-01-01
Partial least squares regression (PLSR) is a data-driven modeling approach that can be used to analyze multivariate relationships between kinase networks and cellular decisions or patient outcomes. In PLSR, a linear model relating an X matrix of independent variables and a Y matrix of dependent variables is generated by extracting the factors with the strongest covariation. While the identified relationship is correlative, PLSR models can be used to generate quantitative predictions for new conditions or perturbations to the network, allowing for mechanisms to be identified. This chapter will provide a brief explanation of PLSR and provide an instructive example to demonstrate the use of PLSR to analyze kinase signaling.
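The idea of "extracting the factors with the strongest covariation" can be illustrated with a compact single-response PLS (PLS1) fit using the NIPALS-style deflation scheme. This is a sketch under stated assumptions, not the chapter's own example: the function name `fit_pls1` and the random data are hypothetical, and a real analysis would choose the number of components by cross-validation.

```python
import numpy as np

def fit_pls1(X, y, n_components):
    """PLS1: each weight vector w maximizes covariance between X @ w and y."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk                    # direction of strongest covariation with y
        w = w / np.linalg.norm(w)
        t = Xk @ w                       # scores (latent factor)
        tt = t @ t
        p = Xk.T @ t / tt                # X loadings
        qa = (yk @ t) / tt               # y loading
        Xk = Xk - np.outer(t, p)         # deflate X
        yk = yk - qa * t                 # deflate y
        W.append(w)
        P.append(p)
        q.append(qa)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    intercept = y_mean - x_mean @ coef
    return coef, intercept

# Hypothetical data: 30 conditions, 3 measured signals, one response.
rng = np.random.default_rng(1)
X = rng.normal(size=(30, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=30) + 4.0
coef, intercept = fit_pls1(X, y, n_components=3)
```

With as many components as there are (linearly independent) predictors, the PLS1 coefficients coincide with ordinary least squares; the benefit of PLSR appears when fewer components are kept and the predictors are numerous or correlated.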
Lin, Lixin; Wang, Yunjia; Teng, Jiyao; Wang, Xuchen
2016-02-01
Hyperspectral estimation of soil organic matter (SOM) in coal mining regions is an important tool for enhancing fertilization in soil restoration programs. The correlation-partial least squares regression (PLSR) method effectively solves the information loss problem of correlation-multiple linear stepwise regression, but results of the correlation analysis must be optimized to improve precision. This study considers the relationship between spectral reflectance and SOM based on spectral reflectance curves of soil samples collected from coal mining regions. Based on the major absorption troughs in the 400-1006 nm spectral range, PLSR analysis was performed using 289 independent bands of the second derivative (SDR) with three levels and measured SOM values. A wavelet-correlation-PLSR (W-C-PLSR) model was then constructed. By amplifying useful information that was previously obscured by noise, the W-C-PLSR model was optimal for estimating SOM content, with smaller prediction errors in both calibration (R² = 0.970, root mean square error (RMSEC) = 3.10, and mean relative error (MREC) = 8.75) and validation (RMSEV = 5.85 and MREV = 14.32) analyses, as compared with other models. Results indicate that W-C-PLSR has great potential to estimate SOM in coal mining regions.
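The error measures reported here (root mean square error and mean relative error) can be computed as in the short sketch below. The abstract does not give the exact formulas, so the definitions used (in particular, mean relative error as a percentage of the observed value) and the sample numbers are assumptions.

```python
import math

def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def mean_relative_error(observed, predicted):
    """Mean of |observed - predicted| / |observed|, expressed in percent."""
    n = len(observed)
    return 100.0 * sum(abs(o - p) / abs(o) for o, p in zip(observed, predicted)) / n

# Hypothetical measured and predicted SOM values.
observed = [10.0, 20.0, 40.0]
predicted = [11.0, 18.0, 40.0]
print(rmse(observed, predicted), mean_relative_error(observed, predicted))
```

Applying the same functions to the calibration set and to the held-out validation set gives the RMSEC/MREC and RMSEV/MREV style of figures quoted above.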
Perez-Guaita, David; Kuligowski, Julia; Quintás, Guillermo; Garrigues, Salvador; Guardia, Miguel de la
2013-03-30
Locally weighted partial least squares regression (LW-PLSR) has been applied to the determination of four clinical parameters in human serum samples (total protein, triglyceride, glucose and urea contents) by Fourier transform infrared (FTIR) spectroscopy. Classical LW-PLSR models were constructed using different spectral regions. For the selection of parameters by LW-PLSR modeling, a multi-parametric study was carried out employing the minimum root-mean square error of cross validation (RMSCV) as objective function. In order to overcome the effect of strong matrix interferences on the predictive accuracy of LW-PLSR models, this work focuses on sample selection. Accordingly, a novel strategy for the development of local models is proposed. It was based on the use of: (i) principal component analysis (PCA) performed on an analyte-specific spectral region for identifying the most similar sample spectra and (ii) partial least squares regression (PLSR) constructed using the whole spectrum. Results found by using this strategy were compared to those provided by PLSR using the same spectral intervals as for LW-PLSR. Prediction errors found by both classical and modified LW-PLSR improved on those obtained by PLSR. Hence, both proposed approaches were useful for the determination of analytes present in a complex matrix as in the case of human serum samples. Copyright © 2013 Elsevier B.V. All rights reserved.
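The sample-selection step of such a local model can be sketched as a nearest-neighbour search in PCA score space: the calibration spectra closest to a new sample are picked, and a local regression would then be fitted on only those samples. The function name, the Euclidean distance measure and the random data are assumptions for illustration; the abstract does not specify the similarity criterion.

```python
import numpy as np

def nearest_calibration_samples(X_cal, x_new, n_components, k):
    """Indices of the k calibration spectra closest to x_new in PCA score space."""
    mean = X_cal.mean(axis=0)
    _, _, Vt = np.linalg.svd(X_cal - mean, full_matrices=False)
    P = Vt[:n_components].T
    scores_cal = (X_cal - mean) @ P
    score_new = (x_new - mean) @ P
    dist = np.linalg.norm(scores_cal - score_new, axis=1)
    return np.argsort(dist)[:k]

# Hypothetical calibration set: 20 spectra restricted to an analyte-specific region.
rng = np.random.default_rng(2)
X_cal = rng.normal(size=(20, 10))
x_new = X_cal[7].copy()          # a query identical to calibration sample 7
idx = nearest_calibration_samples(X_cal, x_new, n_components=3, k=5)
```

In the strategy described above, the PCA would be computed on the analyte-specific region while the subsequent PLSR model for the selected samples would use the whole spectrum.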
Hao, Yong; Sun, Xu-Dong; Yang, Qiang
2012-12-01
A variable selection strategy combined with local linear embedding (LLE) was introduced for the analysis of complex samples by near infrared spectroscopy (NIRS). Three methods, namely Monte Carlo uninformative variable elimination (MCUVE), the successive projections algorithm (SPA), and MCUVE combined with SPA, were used to eliminate redundant spectral variables. Partial least squares regression (PLSR) and LLE-PLSR were used for modeling complex samples. The results showed that MCUVE can both extract informative variables and improve the precision of models. Compared with PLSR models, LLE-PLSR models can achieve more accurate analysis results. MCUVE combined with LLE-PLSR is an effective modeling method for NIRS quantitative analysis.
Bian, Xihui; Li, Shujuan; Lin, Ligang; Tan, Xiaoyao; Fan, Qingjie; Li, Ming
2016-06-21
Accurate model prediction is fundamental to the successful analysis of complex samples. To utilize the abundant information embedded in the frequency and time domains, a novel regression model is presented for quantitative analysis of hydrocarbon contents in fuel oil samples. The proposed method, named high and low frequency unfolded PLSR (HLUPLSR), integrates empirical mode decomposition (EMD) and an unfolding strategy with partial least squares regression (PLSR). In the proposed method, the original signals are first decomposed by EMD into a finite number of intrinsic mode functions (IMFs) and a residue. Secondly, the former high frequency IMFs are summed as a high frequency matrix, and the latter IMFs and the residue are summed as a low frequency matrix. Finally, the two matrices are unfolded into an extended matrix along the variable dimension, and the PLSR model is then built between the extended matrix and the target values. Coupled with ultraviolet (UV) spectroscopy, HLUPLSR has been applied to determine hydrocarbon contents of light gas oil and diesel fuel samples. Compared with single PLSR and other signal processing techniques, the proposed method shows superior prediction ability and better model interpretation. Therefore, the HLUPLSR method provides a promising tool for quantitative analysis of complex samples. Copyright © 2016 Elsevier B.V. All rights reserved.
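The summing and unfolding steps can be sketched as below, assuming the EMD has already been computed elsewhere and its output is stored as an array of IMFs per sample. The array layout, the function name `unfold_high_low` and the split index `n_high` are assumptions; the abstract does not say how the boundary between high and low frequency IMFs is chosen.

```python
import numpy as np

def unfold_high_low(imfs, residue, n_high):
    """Build the extended matrix [high | low] along the variable dimension.

    imfs:    (n_samples, n_imfs, n_variables), ordered from high to low frequency
    residue: (n_samples, n_variables)
    n_high:  number of leading IMFs treated as high frequency
    """
    high = imfs[:, :n_high, :].sum(axis=1)
    low = imfs[:, n_high:, :].sum(axis=1) + residue
    return np.hstack([high, low])

# Hypothetical decomposition: 4 spectra, 5 IMFs each, 100 wavelengths.
rng = np.random.default_rng(3)
imfs = rng.normal(size=(4, 5, 100))
residue = rng.normal(size=(4, 100))
extended = unfold_high_low(imfs, residue, n_high=2)   # shape (4, 200)
```

The extended matrix would then take the place of the raw spectra as the predictor matrix of an ordinary PLSR model.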
Wang, Hai-Xia; Suo, Tong-Chuan; Yu, He-Shui; Li, Zheng
2016-10-01
The manufacture of traditional Chinese medicine (TCM) products always involves processing complex raw materials and real-time monitoring of the manufacturing process. In this study, we investigated different modeling strategies for the extraction process of licorice. Near-infrared spectra associated with the extraction time were used to determine the states of the extraction processes. Three modeling approaches, i.e., principal component analysis (PCA), partial least squares regression (PLSR) and parallel factor analysis-PLSR (PARAFAC-PLSR), were adopted for the prediction of the real-time status of the process. The overall results indicated that PCA, PLSR and PARAFAC-PLSR can effectively detect errors in the extraction procedure and predict the process trajectories, which is of great significance for monitoring and controlling the extraction processes. Copyright© by the Chinese Pharmaceutical Association.
Quantitative Analysis of Single and Mix Food Antiseptics Basing on SERS Spectra with PLSR Method
NASA Astrophysics Data System (ADS)
Hou, Mengjing; Huang, Yu; Ma, Lingwei; Zhang, Zhengjun
2016-06-01
The usage and dosage of food antiseptics are of great concern because of their decisive influence on food safety. The surface-enhanced Raman scattering (SERS) effect was employed in this research to detect trace potassium sorbate (PS) and sodium benzoate (SB). An HfO2 ultrathin film-coated Ag NR array was fabricated as the SERS substrate. Protected by the HfO2 film, the SERS substrate possesses good acid resistance, which makes it applicable in the acidic environments where PS and SB work. The regression relationship between the SERS spectra of 0.3 to 10 mg/L PS solutions and their concentrations was calibrated by the partial least squares regression (PLSR) method, and the concentration prediction performance was quite satisfactory. Furthermore, mixture solutions of PS and SB were also quantitatively analyzed by the PLSR method. Spectral data from the characteristic peak sections corresponding to PS and SB were used to establish the regression models of these two solutes, respectively, and their concentrations were determined accurately despite the overlap of their characteristic peak sections. It is possible that the unique modeling process of the PLSR method prevented the overlapped Raman signal from reducing the model accuracy.
NASA Astrophysics Data System (ADS)
Peterson, K. T.; Wulamu, A.
2017-12-01
Water, essential to all living organisms, is one of the Earth's most precious resources. Remote sensing offers an ideal approach to monitoring water quality compared with traditional in-situ techniques, which are highly time and resource consuming. Utilizing a multi-scale approach, data from handheld spectroscopy, UAS-based hyperspectral imaging, and satellite multispectral images were collected in coordination with in-situ water quality samples for two midwestern watersheds. The remote sensing data were modeled and correlated to the in-situ water quality variables, including chlorophyll content (Chl), turbidity, and total dissolved solids (TDS), using Normalized Difference Spectral Indices (NDSI) and Partial Least Squares Regression (PLSR). The results of the study supported the original hypothesis that correlating water quality variables with remotely sensed data benefits greatly from the use of more complex modeling and regression techniques such as PLSR. The PLSR analysis yielded much higher R2 values for all variables than NDSI. The combination of NDSI and PLSR analysis also identified key wavelengths that aligned with the findings of previous studies. This research demonstrates the advantages and future potential of complex modeling and machine learning techniques for improving water quality variable estimation from spectral data.
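A normalized difference spectral index is commonly defined for a pair of bands i and j as (R_i - R_j) / (R_i + R_j), where R is reflectance. The sketch below assumes that definition; the band indices and reflectance values are hypothetical, and the abstract does not state which band pairs the authors used.

```python
import numpy as np

def ndsi(reflectance, i, j):
    """Normalized difference of bands i and j along the last axis."""
    ri = reflectance[..., i]
    rj = reflectance[..., j]
    return (ri - rj) / (ri + rj)

# Hypothetical two-band reflectance spectrum.
spectrum = np.array([0.2, 0.6])
index_value = ndsi(spectrum, 1, 0)   # (0.6 - 0.2) / (0.6 + 0.2)
```

Because the function indexes the last axis, the same call works on a single spectrum, a table of spectra, or a full image cube. An NDSI uses two bands at a time, whereas PLSR can draw on the whole spectrum, which is consistent with the higher R2 values reported for PLSR.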
A deep belief network with PLSR for nonlinear system modeling.
Qiao, Junfei; Wang, Gongming; Li, Wenjing; Li, Xiaoli
2018-08-01
Nonlinear system modeling plays an important role in practical engineering, and the deep learning-based deep belief network (DBN) is now popular in nonlinear system modeling and identification because of its strong learning ability. However, the existing weight optimization for DBN is gradient-based, which always leads to a local optimum and a poor training result. In this paper, a DBN with partial least squares regression (PLSR-DBN) is proposed for nonlinear system modeling, which addresses the problem of weight optimization for DBN using PLSR. Firstly, the unsupervised contrastive divergence (CD) algorithm is used for weight initialization. Secondly, the initial weights derived from the CD algorithm are optimized through layer-by-layer PLSR modeling from the top layer to the bottom layer. Instead of a gradient method, PLSR-DBN determines the optimal weights using several PLSR models, so that better performance is achieved. Then, a convergence analysis is given theoretically to guarantee the effectiveness of the proposed PLSR-DBN model. Finally, the proposed PLSR-DBN is tested on two benchmark nonlinear systems and an actual wastewater treatment system, as well as on handwritten digit recognition (nonlinear mapping and modeling) with high-dimension input data. The experimental results show that the proposed PLSR-DBN performs better in both time and accuracy on nonlinear system modeling than other methods. Copyright © 2017 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Sarkar, Arnab; Karki, Vijay; Aggarwal, Suresh K.; Maurya, Gulab S.; Kumar, Rohit; Rai, Awadhesh K.; Mao, Xianglei; Russo, Richard E.
2015-06-01
Laser induced breakdown spectroscopy (LIBS) was applied for elemental characterization of high alloy steel using partial least squares regression (PLSR) with the objective of evaluating the analytical performance of this multivariate approach. The optimization of the number of principal components for minimizing error in the PLSR algorithm was investigated. The effect of different pre-treatment procedures on the raw spectral data before PLSR analysis was evaluated based on several statistical parameters (standard error of prediction, percentage relative error of prediction, etc.). The pre-treatment with the "NORM" parameter gave the optimum statistical results. The analytical performance of the PLSR model improved by increasing the number of laser pulses accumulated per spectrum as well as by truncating the spectrum to an appropriate wavelength region. It was found that the statistical benefit of truncating the spectrum can also be accomplished by increasing the number of laser pulses per accumulation without spectral truncation. The constituents (Co and Mo) present in hundreds of ppm were determined with a relative precision of 4-9% (2σ), whereas the major constituents Cr and Ni (present at a few percent levels) were determined with a relative precision of ~2% (2σ).
Qiu, Shanshan; Wang, Jun; Gao, Liping
2014-07-09
An electronic nose (E-nose) and an electronic tongue (E-tongue) have been used to characterize five types of strawberry juices based on processing approaches (i.e., microwave pasteurization, steam blanching, high temperature short time pasteurization, frozen-thawed, and freshly squeezed). Juice quality parameters (vitamin C, pH, total soluble solid, total acid, and sugar/acid ratio) were detected by traditional measuring methods. Multivariate statistical methods (linear discriminant analysis (LDA) and partial least squares regression (PLSR)) and neural networks (Random Forest (RF) and Support Vector Machines) were employed for qualitative classification and quantitative regression. The E-tongue system reached higher accuracy rates than the E-nose, and using the two simultaneously had an advantage in LDA classification and PLSR regression. According to cross-validation, RF showed outstanding performance in both the qualitative and quantitative analysis. This work indicates that the simultaneous utilization of E-nose and E-tongue can discriminate processed fruit juices and predict quality parameters successfully for the beverage industry.
Tchabo, William; Ma, Yongkun; Kwaw, Emmanuel; Zhang, Haining; Xiao, Lulu; Tahir, Haroon Elrasheid
2017-10-01
The present study was undertaken to assess accelerating aging effects of high pressure, ultrasound and manosonication on the aromatic profile and sensorial attributes of aged mulberry wines (AMW). A total of 166 volatile compounds were found amongst the AMW. The outcomes of the investigation were presented by means of geometric mean (GM), cluster analysis (CA), principal component analysis (PCA), partial least squares regressions (PLSR) and principal component regression (PCR). GM highlighted 24 organoleptic attributes responsible for the sensorial profile of the AMW. Moreover, CA revealed that the volatile composition of the non-thermal accelerated aged wines differs from that of the conventional aged wines. Besides, PCA discriminated the AMW on the basis of their main sensorial characteristics. Furthermore, PLSR identified 75 aroma compounds which were mainly responsible for the olfactory notes of the AMW. Finally, the overall quality of the AMW was noted to be better predicted by PLSR than PCR. Copyright © 2017 Elsevier Ltd. All rights reserved.
Soil salt content estimation in the Yellow River delta with satellite hyperspectral data
Weng, Yongling; Gong, Peng; Zhu, Zhi-Liang
2008-01-01
Soil salinization is one of the most common land degradation processes and is a severe environmental hazard. The primary objective of this study is to investigate the potential of predicting salt content in soils with hyperspectral data acquired with EO-1 Hyperion. Both partial least-squares regression (PLSR) and conventional multiple linear regression (MLR), such as stepwise regression (SWR), were tested as the prediction model. PLSR is commonly used to overcome the problem caused by high-dimensional and correlated predictors. Chemical analysis of 95 samples collected from the top layer of soils in the Yellow River delta area shows that salt content was high on average, and the dominant chemicals in the saline soil were NaCl and MgCl2. Multivariate models were established between soil contents and hyperspectral data. Our results indicate that the PLSR technique with laboratory spectral data has a strong prediction capacity. Spectral bands at 1487-1527, 1971-1991, 2032-2092, and 2163-2355 nm possessed large absolute values of regression coefficients, with the largest coefficient at 2203 nm. We obtained a root mean squared error (RMSE) for calibration (with 61 samples) of RMSEC = 0.753 (R2 = 0.893) and a root mean squared error for validation (with 30 samples) of RMSEV = 0.574. The prediction model was applied on a pixel-by-pixel basis to a Hyperion reflectance image to yield a quantitative surface distribution map of soil salt content. The result was validated successfully from 38 sampling points. We obtained an RMSE estimate of 1.037 (R2 = 0.784) for the soil salt content map derived by the PLSR model. The salinity map derived from the SWR model shows that the predicted value is higher than the true value. These results demonstrate that the PLSR method is a more suitable technique than stepwise regression for quantitative estimation of soil salt content in a large area. © 2008 CASI.
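Applying a fitted linear model "on a pixel-by-pixel basis" to a hyperspectral image amounts to one matrix product over the flattened image cube. The sketch below assumes the model is available as a coefficient vector and an intercept; the cube layout (rows, columns, bands), the function name and the numbers are hypothetical.

```python
import numpy as np

def predict_map(cube, coef, intercept):
    """Apply y = x @ coef + intercept to every pixel spectrum of an image cube."""
    rows, cols, bands = cube.shape
    pixels = cube.reshape(-1, bands)            # one row per pixel
    return (pixels @ coef + intercept).reshape(rows, cols)

# Hypothetical 2 x 3 image with 4 bands and made-up regression coefficients.
cube = np.ones((2, 3, 4))
coef = np.array([1.0, 2.0, 3.0, 4.0])
salt_map = predict_map(cube, coef, intercept=0.5)
```

The resulting two-dimensional array has the same spatial layout as the image and can be validated against field samples at known pixel locations, as the abstract describes.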
Application of near-infrared spectroscopy in the detection of fat-soluble vitamins in premix feed
NASA Astrophysics Data System (ADS)
Jia, Lian Ping; Tian, Shu Li; Zheng, Xue Cong; Jiao, Peng; Jiang, Xun Peng
2018-02-01
Vitamins are organic compounds necessary for animal physiological maintenance. The rapid determination of the content of different vitamins in premix feed can help to achieve accurate diets and efficient feeding. Compared with high-performance liquid chromatography and other wet chemical methods, near-infrared spectroscopy is a fast, non-destructive, non-polluting method. A total of 168 samples of premix feed were collected, and the contents of vitamin A, vitamin E and vitamin D3 were determined by the standard method. The near-infrared spectra of the samples were obtained over the range 10 000 to 4 000 cm⁻¹. Partial least squares regression (PLSR) and support vector machine regression (SVMR) were used to construct the quantitative models. The results showed that the RMSEP values of the PLSR models for vitamin A, vitamin E and vitamin D3 were 0.43×10⁷ IU/kg, 0.09×10⁵ IU/kg and 0.17×10⁷ IU/kg, respectively. The RMSEP values of the SVMR models were 0.45×10⁷ IU/kg, 0.11×10⁵ IU/kg and 0.18×10⁷ IU/kg. Compared with the nonlinear regression method (SVMR), the linear regression method (PLSR) is more suitable for the quantitative analysis of vitamins in premix feed.
Quantitative determination of Auramine O by terahertz spectroscopy with 2DCOS-PLSR model
NASA Astrophysics Data System (ADS)
Zhang, Huo; Li, Zhi; Chen, Tao; Qin, Binyi
2017-09-01
Residues of harmful dyes such as Auramine O (AO) in herbal and food products threaten human health, so fast and sensitive techniques for detecting these residues are needed. As a powerful tool for substance detection, terahertz (THz) spectroscopy was combined in this paper with an improved partial least-squares regression (PLSR) model for the quantitative determination of AO. The absorbance of herbal samples with different concentrations was obtained by THz-TDS in the band between 0.2 THz and 1.6 THz. We applied two-dimensional correlation spectroscopy (2DCOS) to improve the PLSR model. This method highlighted the spectral differences between concentrations, provided a clear criterion for selecting the input interval, and improved the accuracy of the detection result. The experimental result indicated that the combination of THz spectroscopy and 2DCOS-PLSR is an excellent quantitative analysis method.
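In generalized two-dimensional correlation spectroscopy, the synchronous spectrum is the covariance of the mean-centered (dynamic) spectra across the perturbation, here the concentration series. The sketch below computes it with NumPy; how the authors turned the 2DCOS map into an input-interval criterion is not stated in the abstract, so using the diagonal to rank frequencies is only an assumption, and the data are random placeholders.

```python
import numpy as np

def synchronous_2dcos(spectra):
    """Synchronous 2D correlation spectrum.

    spectra: (n_perturbations, n_frequencies), one row per concentration.
    Returns an (n_frequencies, n_frequencies) symmetric matrix.
    """
    dynamic = spectra - spectra.mean(axis=0)     # subtract the reference (mean) spectrum
    return dynamic.T @ dynamic / (spectra.shape[0] - 1)

# Hypothetical absorbance spectra: 6 concentrations, 8 frequency points.
rng = np.random.default_rng(4)
spectra = rng.random((6, 8))
sync = synchronous_2dcos(spectra)
autopeaks = np.diag(sync)                        # variance of each frequency point
```

Frequencies with large diagonal values (autopeaks) are the ones whose absorbance changes most with concentration, which makes them natural candidates for the regression input interval.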
Tøndel, Kristin; Indahl, Ulf G; Gjuvsland, Arne B; Vik, Jon Olav; Hunter, Peter; Omholt, Stig W; Martens, Harald
2011-06-01
Background: Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Results: Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback loops.
Conclusions: HC-PLSR is a promising approach for metamodelling in systems biology, especially for highly nonlinear or non-monotone parameter to phenotype maps. The algorithm can be flexibly adjusted to suit the complexity of the dynamic model behaviour, inviting automation in the metamodelling of complex systems. PMID:21627852
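The clustering step that HC-PLSR relies on can be illustrated with a small fuzzy C-means routine. This is a generic textbook formulation written for illustration, not the authors' implementation; the fuzzifier m = 2, the iteration count, the random initialization and the two-blob test data are all assumptions.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, seed=0):
    """Return cluster centers and the fuzzy membership matrix U (rows sum to 1)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]       # membership-weighted means
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)                            # avoid division by zero
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)            # standard membership update
    return centers, U

# Hypothetical data: two well-separated groups of 20 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, size=(20, 2)),
               rng.normal(10.0, 0.5, size=(20, 2))])
centers, U = fuzzy_c_means(X, n_clusters=2)
labels = U.argmax(axis=1)
```

In a hierarchical scheme of the kind described, a separate local regression model would then be fitted within each cluster, and new observations would be routed to a local model (or a membership-weighted combination of models) according to U.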
NASA Astrophysics Data System (ADS)
Polat, Esra; Gunay, Suleyman
2013-10-01
One of the problems encountered in Multiple Linear Regression (MLR) is multicollinearity, which causes overestimation of the regression parameters and increases their variance. Hence, when multicollinearity is present, biased estimation procedures such as classical Principal Component Regression (CPCR) and Partial Least Squares Regression (PLSR) are performed. The SIMPLS algorithm is the leading PLSR algorithm because of its speed and efficiency and because its results are easier to interpret. However, both CPCR and SIMPLS yield very unreliable results when the data set contains outlying observations. Therefore, Hubert and Vanden Branden (2003) presented a robust PCR (RPCR) method and a robust PLSR (RPLSR) method called RSIMPLS. In RPCR, a robust Principal Component Analysis (PCA) method for high-dimensional data is first applied to the independent variables; the dependent variables are then regressed on the scores using a robust regression method. RSIMPLS is constructed from a robust covariance matrix for high-dimensional data and robust linear regression. The purpose of this study is to show the usage of the RPCR and RSIMPLS methods on an econometric data set and thereby to compare the two methods on an inflation model of Turkey. The methods are compared in terms of predictive ability and goodness of fit by using a robust Root Mean Squared Error of Cross-validation (R-RMSECV), a robust R2 value and the Robust Component Selection (RCS) statistic.
Kandpal, Lalit Mohan; Lee, Hoonsoo; Kim, Moon S.; Mo, Changyeun; Cho, Byoung-Kwan
2013-01-01
Spectroscopy has proven to be an efficient tool for measuring the properties of meat. In this article, hyperspectral imaging (HSI) techniques are used to determine the moisture content in cooked chicken breast over the VIS/NIR (400–1,000 nm) spectral range. Moisture measurements were performed using an oven drying method. A partial least squares regression (PLSR) model was developed to extract a relationship between the HSI spectra and the moisture content. In the full wavelength range, the PLSR model possessed a maximum R2p of 0.90 and an SEP of 0.74%. For the NIR range, the PLSR model yielded an R2p of 0.94 and an SEP of 0.71%. The majority of the absorption peaks occurred around 760 and 970 nm, representing the water content in the samples. Finally, PLSR images were constructed to visualize the dehydration and water distribution within different sample regions. The high correlation coefficient and low prediction error from the PLSR analysis validates that HSI is an effective tool for visualizing the chemical properties of meat. PMID:24084119
NASA Astrophysics Data System (ADS)
Mai, W.; Zhang, J.-F.; Zhao, X.-M.; Li, Z.; Xu, Z.-W.
2017-11-01
Wastewater from the dye industry is typically analyzed using a standard method for measurement of chemical oxygen demand (COD) or by a single-wavelength spectroscopic method. To overcome the disadvantages of these methods, ultraviolet-visible (UV-Vis) spectroscopy was combined with principal component regression (PCR) and partial least squares regression (PLSR) in this study. Unlike the standard method, this method does not require digestion of the samples for preparation. Experiments showed that the PLSR model offered high prediction performance for COD, with a mean relative error of about 5% for two dyes. This error is similar to that obtained with the standard method. In this study, the precision of the PLSR model decreased with the number of dye compounds present. It is likely that multiple models will be required in reality, and the complexity of a COD monitoring system would be greatly reduced if the PLSR model is used because it can include several dyes. UV-Vis spectroscopy with PLSR successfully enhanced the performance of COD prediction for dye wastewater and showed good potential for application in on-line water quality monitoring.
How to predict the sugariness and hardness of melons: A near-infrared hyperspectral imaging method.
Sun, Meijun; Zhang, Dong; Liu, Li; Wang, Zheng
2017-03-01
Hyperspectral imaging (HSI) in the near-infrared (NIR) region (900-1700nm) was used for non-intrusive quality measurements (of sweetness and texture) in melons. First, HSI data from melon samples were acquired to extract the spectral signatures. The corresponding sample sweetness and hardness values were recorded using traditional intrusive methods. Partial least squares regression (PLSR), principal component analysis (PCA), support vector machine (SVM), and artificial neural network (ANN) models were created to predict melon sweetness and hardness values from the hyperspectral data. Experimental results for the three types of melons show that PLSR produces the most accurate results. To reduce the high dimensionality of the hyperspectral data, the weighted regression coefficients of the resulting PLSR models were used to identify the most important wavelengths. On the basis of these wavelengths, each image pixel was used to visualize the sweetness and hardness in all the portions of each sample. Copyright © 2016 Elsevier Ltd. All rights reserved.
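Selecting important wavelengths from the regression coefficients of a fitted model can be sketched as ranking bands by absolute coefficient size. The ranking rule, the value of k and the example coefficients are assumptions; the abstract does not give the threshold used, and the coefficients are assumed to have been computed on suitably scaled (weighted) variables so that their magnitudes are comparable.

```python
import numpy as np

def top_wavelengths(wavelengths, coef, k):
    """Return the k wavelengths with the largest absolute coefficients, in spectral order."""
    order = np.argsort(-np.abs(coef))[:k]
    return wavelengths[np.sort(order)]

# Hypothetical bands (nm) in the 900-1700 nm range with made-up coefficients.
wavelengths = np.array([900.0, 1000.0, 1100.0, 1200.0, 1300.0])
coef = np.array([0.1, -2.0, 0.3, 1.5, -0.2])
selected = top_wavelengths(wavelengths, coef, k=2)
```

A reduced model refitted on only the selected bands is cheaper to apply to every pixel, which suits the per-pixel visualization mentioned in the abstract.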
Rapid Detection of Volatile Oil in Mentha haplocalyx by Near-Infrared Spectroscopy and Chemometrics.
Yan, Hui; Guo, Cheng; Shao, Yang; Ouyang, Zhen
2017-01-01
Near-infrared spectroscopy combined with partial least squares regression (PLSR) and support vector machine (SVM) was applied for the rapid determination of the volatile oil content in Mentha haplocalyx. The effects of data pre-processing methods on the accuracy of the PLSR calibration models were investigated. The performance of the final model was evaluated according to the correlation coefficient (R) and root mean square error of prediction (RMSEP). For the PLSR model, the best preprocessing combination was first-order derivative, standard normal variate transformation (SNV), and mean centering, which gave correlation coefficients of 0.8805 and 0.8719, with RMSEC of 0.091 and RMSEP of 0.097. By analyzing the loading weights and variable importance in projection (VIP) scores, the wavenumber variables linked to volatile oil were found to lie between 5500 and 4000 cm⁻¹. For the SVM model, six LVs (fewer than the seven LVs in the PLSR model) were adopted, and the result was better than that of the PLSR model. The correlation coefficients were 0.9232 and 0.9202, with RMSEC and RMSEP of 0.084 and 0.082, respectively, which indicated that the predicted values were accurate and reliable. This work demonstrated that near infrared reflectance spectroscopy with chemometrics could be used to rapidly detect the main content, volatile oil, in M. haplocalyx. The quality of medicine is directly linked to clinical efficacy; thus, it is important to control the quality of Mentha haplocalyx.
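The preprocessing chain named here (first-order derivative, SNV, mean centering) can be sketched as below. The derivative is approximated by a simple first difference, which is an assumption; the study may well have used a smoothed derivative such as Savitzky-Golay. The function names and the synthetic spectra are also hypothetical.

```python
import numpy as np

def first_difference(spectra):
    """Crude first-order derivative along the wavenumber axis."""
    return np.diff(spectra, axis=1)

def snv(spectra):
    """Standard normal variate: center and scale each spectrum (row) individually."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def mean_center(spectra):
    """Subtract the mean spectrum (column means) of the data set."""
    return spectra - spectra.mean(axis=0)

def preprocess(spectra):
    return mean_center(snv(first_difference(spectra)))

# Hypothetical smooth-looking spectra: 5 samples, 50 wavenumber points.
rng = np.random.default_rng(5)
raw = np.cumsum(rng.normal(size=(5, 50)), axis=1)
processed = preprocess(raw)          # shape (5, 49)
```

When predicting new samples, the mean spectrum subtracted in the last step should be the one computed from the calibration set, not from the new data.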
Abbreviations used: 1st der: First-order derivative; 2nd der: Second-order derivative; LOO: Leave-one-out; LVs: Latent variables; MC: Mean centering; NIR: Near-infrared; NIRS: Near infrared spectroscopy; PCR: Principal component regression; PLSR: Partial least squares regression; RBF: Radial basis function; RMSECV: Root mean square error of cross validation; RMSEC: Root mean square error of calibration; RMSEP: Root mean square error of prediction; SNV: Standard normal variate transformation; SVM: Support vector machine; VIP: Variable importance in projection.
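The preprocessing chain named in the abstract above (first-order derivative, SNV, mean centering) can be sketched as follows. This is a minimal illustration under assumptions, not the authors' procedure: the derivative is a plain finite difference (the abstract does not say how the derivative was computed), the function names are hypothetical, and the data are synthetic.

```python
import numpy as np

def first_derivative(spectra):
    """First-order derivative along the wavenumber axis (finite differences)."""
    return np.gradient(spectra, axis=1)

def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) on its own."""
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def mean_center(spectra):
    """Subtract the per-channel mean computed over all samples."""
    return spectra - spectra.mean(axis=0, keepdims=True)

def preprocess(spectra):
    # Order assumed from the abstract: derivative, then SNV, then mean centering.
    return mean_center(snv(first_derivative(spectra)))

# Synthetic stand-in for NIR spectra: 5 samples x 50 channels (not real data).
rng = np.random.default_rng(0)
toy = rng.normal(size=(5, 50)).cumsum(axis=1)
processed = preprocess(toy)
```

In practice the mean used for centering would be computed on the calibration set and reused for new samples.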
NASA Astrophysics Data System (ADS)
Lespinats, S.; Meyer-Bäse, Anke; He, Huan; Marshall, Alan G.; Conrad, Charles A.; Emmett, Mark R.
2009-05-01
Partial Least Square Regression (PLSR) and Data-Driven High Dimensional Scaling (DD-HDS) are employed for the prediction and the visualization of changes in polar lipid expression induced by different combinations of wild-type (wt) p53 gene therapy and SN38 chemotherapy of U87 MG glioblastoma cells. A very detailed analysis of the gangliosides reveals that certain gangliosides of GM3 or GD1-type have unique properties not shared by the others. In summary, this preliminary work shows that data mining techniques are able to determine the modulation of gangliosides by different treatment combinations.
Discrimination of serum Raman spectroscopy between normal and colorectal cancer
NASA Astrophysics Data System (ADS)
Li, Xiaozhou; Yang, Tianyue; Yu, Ting; Li, Siqi
2011-07-01
Raman spectroscopy of tissues has been widely studied for the diagnosis of various cancers, but biofluids were seldom used as the analyte because of the low concentration. Herein, Raman spectra of serum from 30 normal people, 46 colon cancer patients, and 44 rectum cancer patients were measured and analyzed. The information of the Raman peaks (intensity and width) and that of the fluorescence background (baseline function coefficients) were selected as parameters for statistical analysis. Principal component regression (PCR) and partial least square regression (PLSR) were applied to the selected parameters separately to assess the performance of the parameters. PCR performed better than PLSR on our spectral data. Linear discriminant analysis (LDA) was then applied to the principal components (PCs) of the two regression methods on the selected parameters, and diagnostic accuracies of 88% and 83% were obtained. The conclusion is that the selected features preserve the information of the original spectra well and that Raman spectroscopy of serum has potential for the diagnosis of colorectal cancer.
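For readers unfamiliar with PCR, the following is a minimal sketch of principal component regression: project the centred predictors onto the leading principal axes, then run ordinary least squares on the scores. It is a generic illustration on synthetic data, with hypothetical function names, and does not reproduce the feature set or the LDA step of the study above.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Principal component regression; returns (coef, intercept) in the original X space."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Rows of Vt are the principal axes of the centred data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T
    scores = Xc @ V
    gamma, *_ = np.linalg.lstsq(scores, yc, rcond=None)
    coef = V @ gamma
    return coef, y_mean - x_mean @ coef

def pcr_predict(X, coef, intercept):
    return X @ coef + intercept

# Synthetic example: 20 samples, 6 predictors, exactly linear response.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 6))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0, 1.5]) + 2.0
coef, intercept = pcr_fit(X, y, n_components=6)
```

With all six components retained, PCR coincides with ordinary least squares; keeping fewer components trades some fit for stability.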
Fusing face-verification algorithms and humans.
O'Toole, Alice J; Abdi, Hervé; Jiang, Fang; Phillips, P Jonathon
2007-10-01
It has been demonstrated recently that state-of-the-art face-recognition algorithms can surpass human accuracy at matching faces over changes in illumination. The ranking of algorithms and humans by accuracy, however, does not provide information about whether algorithms and humans perform the task comparably or whether algorithms and humans can be fused to improve performance. In this paper, we fused humans and algorithms using partial least square regression (PLSR). In the first experiment, we applied PLSR to face-pair similarity scores generated by seven algorithms participating in the Face Recognition Grand Challenge. The PLSR produced an optimal weighting of the similarity scores, which we tested for generality with a jackknife procedure. Fusing the algorithms' similarity scores using the optimal weights produced a twofold reduction of error rate over the most accurate algorithm. Next, human-subject-generated similarity scores were added to the PLSR analysis. Fusing humans and algorithms increased the performance to near-perfect classification accuracy. These results are discussed in terms of maximizing face-verification accuracy with hybrid systems consisting of multiple algorithms and humans.
NASA Astrophysics Data System (ADS)
Chen, Pengfei; Jing, Qi
2017-02-01
The assumption that a non-linear method is more reasonable than a linear method when canopy reflectance is used to establish a yield prediction model was proposed and tested in this study. For this purpose, partial least squares regression (PLSR) and artificial neural networks (ANN), representing linear and non-linear analysis methods, respectively, were applied and compared for wheat yield prediction. Multi-period Landsat-8 OLI images were collected at two different wheat growth stages, and a field campaign was conducted to obtain grain yields at selected sampling sites in 2014. The field data were divided into a calibration database and a testing database. Using the calibration data, cross-validation was introduced into the PLSR and ANN model construction to prevent over-fitting. All models were tested using the test data. The ANN yield-prediction model produced R2, RMSE and RMSE% values of 0.61, 979 kg ha-1, and 10.38%, respectively, in the testing phase, performing better than the PLSR yield-prediction model, which produced R2, RMSE, and RMSE% values of 0.39, 1211 kg ha-1, and 12.84%, respectively. The non-linear method was suggested as the better method for yield prediction.
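The three evaluation statistics quoted above can be computed as in the sketch below. Two assumptions are made because the abstract does not define them: R2 is taken as 1 - SS_res/SS_tot, and RMSE% as RMSE divided by the mean observed value. The yield numbers are invented for illustration.

```python
import numpy as np

def regression_metrics(observed, predicted):
    """Return (R2, RMSE, RMSE%) under the definitions assumed in the lead-in."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residuals = observed - predicted
    rmse = np.sqrt(np.mean(residuals ** 2))
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((observed - observed.mean()) ** 2)
    r2 = 1.0 - ss_res / ss_tot
    rmse_percent = 100.0 * rmse / observed.mean()
    return r2, rmse, rmse_percent

# Invented yields in kg/ha, for illustration only.
r2, rmse, rmse_pct = regression_metrics(
    [8000, 9000, 10000, 11000],
    [8500, 8800, 10400, 10700],
)
```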
NASA Astrophysics Data System (ADS)
Cheng, Jun-Hu; Jin, Huali; Liu, Zhiwei
2018-01-01
The feasibility of developing a multispectral imaging method using important wavelengths from hyperspectral images selected by genetic algorithm (GA), successive projection algorithm (SPA) and regression coefficient (RC) methods for modeling and predicting protein content in peanut kernel was investigated for the first time. A partial least squares regression (PLSR) calibration model was established between the spectral data from the selected optimal wavelengths and the reference measured protein content, which ranged from 23.46% to 28.43%. The RC-PLSR model established using eight key wavelengths (1153, 1567, 1972, 2143, 2288, 2339, 2389 and 2446 nm) showed the best predictive results, with a coefficient of determination of prediction (R2P) of 0.901, a root mean square error of prediction (RMSEP) of 0.108, and a residual predictive deviation (RPD) of 2.32. Based on the best model obtained and image processing algorithms, distribution maps of protein content were generated. The overall results of this study indicated that developing a rapid and online multispectral imaging system using the feature wavelengths and PLSR analysis is feasible for determination of the protein content in peanut kernels.
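The residual predictive deviation reported above is commonly defined as the standard deviation of the reference values divided by the RMSEP. The sketch below uses that common definition, since the abstract does not give its formula; the protein values are invented.

```python
import numpy as np

def rpd(reference, predicted):
    """Residual predictive deviation: SD of reference values / RMSEP."""
    reference = np.asarray(reference, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    rmsep = np.sqrt(np.mean((reference - predicted) ** 2))
    return np.std(reference, ddof=1) / rmsep

# Invented protein contents (%), for illustration only.
reference = np.array([23.5, 24.8, 26.1, 27.0, 28.4])
predicted = np.array([23.9, 24.5, 26.4, 26.8, 28.1])
value = rpd(reference, predicted)
```

A larger RPD means the prediction error is small relative to the natural spread of the reference data.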
Marabel, Miguel; Alvarez-Taboada, Flor
2013-01-01
Aboveground biomass (AGB) is one of the strategic biophysical variables of interest in vegetation studies. The main objective of this study was to evaluate the Support Vector Machine (SVM) and Partial Least Squares Regression (PLSR) for estimating the AGB of grasslands from field spectrometer data and to find out which data pre-processing approach was the most suitable. The most accurate model to predict the total AGB involved PLSR and the Maximum Band Depth index derived from the continuum removed reflectance in the absorption features between 916–1,120 nm and 1,079–1,297 nm (R2 = 0.939, RMSE = 7.120 g/m2). Regarding the green fraction of the AGB, the Area Over the Minimum index derived from the continuum removed spectra provided the most accurate model overall (R2 = 0.939, RMSE = 3.172 g/m2). Identifying the appropriate absorption features was proved to be crucial to improve the performance of PLSR to estimate the total and green aboveground biomass, by using the indices derived from those spectral regions. Ordinary Least Square Regression could be used as a surrogate for the PLSR approach with the Area Over the Minimum index as the independent variable, although the resulting model would not be as accurate. PMID:23925082
Hao, Z Q; Li, C M; Shen, M; Yang, X Y; Li, K H; Guo, L B; Li, X Y; Lu, Y F; Zeng, X Y
2015-03-23
Laser-induced breakdown spectroscopy (LIBS) with partial least squares regression (PLSR) has been applied to measuring the acidity of iron ore, which can be defined by the concentrations of oxides: CaO, MgO, Al₂O₃, and SiO₂. With the conventional internal standard calibration, it is difficult to establish the calibration curves of CaO, MgO, Al₂O₃, and SiO₂ in iron ore due to the serious matrix effects. PLSR is effective to address this problem due to its excellent performance in compensating the matrix effects. In this work, fifty samples were used to construct the PLSR calibration models for the above-mentioned oxides. These calibration models were validated by the 10-fold cross-validation method with the minimum root-mean-square errors (RMSE). Another ten samples were used as a test set. The acidities were calculated according to the estimated concentrations of CaO, MgO, Al₂O₃, and SiO₂ using the PLSR models. The average relative error (ARE) and RMSE of the acidity achieved 3.65% and 0.0048, respectively, for the test samples.
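The workflow described above (fit PLSR calibration models, choose the model by k-fold cross-validated RMSE) can be sketched with a small PLS1 implementation based on the NIPALS algorithm. This is an illustrative sketch on synthetic data, not the authors' code: the function names, the fold assignment, and the data are assumptions.

```python
import numpy as np

def pls1_fit(X, y, n_components):
    """PLS1 via NIPALS; returns (coef, intercept) so that y is approximated by X @ coef + intercept."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    E, f = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = E.T @ f
        w /= np.linalg.norm(w)          # weight vector
        t = E @ w                       # scores
        tt = t @ t
        p = E.T @ t / tt                # X loadings
        q_a = (f @ t) / tt              # y loading
        E = E - np.outer(t, p)          # deflate X
        f = f - q_a * t                 # deflate y
        W.append(w); P.append(p); q.append(q_a)
    W, P, q = np.column_stack(W), np.column_stack(P), np.array(q)
    coef = W @ np.linalg.solve(P.T @ W, q)
    return coef, y_mean - x_mean @ coef

def cv_rmse(X, y, n_components, n_folds=10, seed=0):
    """Root-mean-square error of k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    sq_err = []
    for test_idx in folds:
        train = np.setdiff1d(np.arange(len(y)), test_idx)
        coef, b0 = pls1_fit(X[train], y[train], n_components)
        sq_err.append((y[test_idx] - (X[test_idx] @ coef + b0)) ** 2)
    return np.sqrt(np.mean(np.concatenate(sq_err)))

# Synthetic stand-in: 50 "samples", 30 "channels", two latent factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(50, 2))
loadings = rng.normal(size=(2, 30))
X = latent @ loadings + rng.normal(scale=0.01, size=(50, 30))
y = latent @ np.array([1.0, -0.5]) + rng.normal(scale=0.01, size=50)
rmse_by_lv = {a: cv_rmse(X, y, a) for a in range(1, 5)}
best_lv = min(rmse_by_lv, key=rmse_by_lv.get)
```

A separate model of this kind would be fitted for each oxide, and the held-out test samples would be predicted only after the number of latent variables is fixed.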
Barmeier, Gero; Schmidhalter, Urs
2017-01-01
To optimize plant architecture (e.g., photosynthetic active leaf area, leaf-stem ratio), plant physiologists and plant breeders rely on destructively and tediously harvested biomass samples. A fast and non-destructive method for obtaining information about different plant organs could be vehicle-based spectral proximal sensing. In this 3-year study, the mobile phenotyping platform PhenoTrac 4 was used to compare the measurements from active and passive spectral proximal sensors of leaves, leaf sheaths, culms and ears of 34 spring barley cultivars at anthesis and dough ripeness. Published vegetation indices (VI), partial least square regression (PLSR) models and contour map analysis were compared to assess these traits. Contour maps are matrices consisting of coefficients of determination for all of the binary combinations of wavelengths and the biomass parameters. The PLSR models of leaves, leaf sheaths and culms showed strong correlations ( R 2 = 0.61-0.76). Published vegetation indices depicted similar coefficients of determination; however, their RMSEs were higher. No wavelength combination could be found by the contour map analysis to improve the results of the PLSR or published VIs. The best results were obtained for the dry weight and N uptake of leaves and culms. The PLSR models yielded satisfactory relationships for leaf sheaths at anthesis ( R 2 = 0.69), whereas only a low performance for all of sensors and methods was observed at dough ripeness. No relationships with ears were observed. Active and passive sensors performed comparably, with slight advantages observed for the passive spectrometer. The results indicate that tractor-based proximal sensing in combination with optimized spectral indices or PLSR models may represent a suitable tool for plant breeders to assess relevant morphological traits, allowing for a better understanding of plant architecture, which is closely linked to the physiological performance. 
Further validation of PLSR models is required in independent studies. Organ specific phenotyping represents a first step toward breeding by design. PMID:29163629
Rapid analysis of pharmaceutical drugs using LIBS coupled with multivariate analysis.
Tiwari, P K; Awasthi, S; Kumar, R; Anand, R K; Rai, P K; Rai, A K
2018-02-01
Type 2 diabetes drug tablets containing voglibose having dose strengths of 0.2 and 0.3 mg of various brands have been examined, using laser-induced breakdown spectroscopy (LIBS) technique. The statistical methods such as the principal component analysis (PCA) and the partial least square regression analysis (PLSR) have been employed on LIBS spectral data for classifying and developing the calibration models of drug samples. We have developed the ratio-based calibration model applying PLSR in which relative spectral intensity ratios H/C, H/N and O/N are used. Further, the developed model has been employed to predict the relative concentration of element in unknown drug samples. The experiment has been performed in air and argon atmosphere, respectively, and the obtained results have been compared. The present model provides rapid spectroscopic method for drug analysis with high statistical significance for online control and measurement process in a wide variety of pharmaceutical industrial applications.
NASA Astrophysics Data System (ADS)
Braga, Jez Willian Batista; Trevizan, Lilian Cristina; Nunes, Lidiane Cristina; Rufini, Iolanda Aparecida; Santos, Dário, Jr.; Krug, Francisco José
2010-01-01
The application of laser induced breakdown spectrometry (LIBS) to the direct analysis of plant materials is a great challenge that still needs efforts for its development and validation. To this end, a series of experimental approaches has been carried out in order to show that LIBS can be used as an alternative to methods based on wet acid digestion for the analysis of agricultural and environmental samples. The large amount of information provided by LIBS spectra for these complex samples increases the difficulty of selecting the most appropriate wavelengths for each analyte. Some applications have suggested that improvements in both accuracy and precision can be achieved by applying multivariate calibration to LIBS data, compared with univariate regression developed with line emission intensities. In the present work, the performance of univariate and multivariate calibration, based on partial least squares regression (PLSR), was compared for analysis of pellets of plant materials made from an appropriate mixture of cryogenically ground samples with cellulose as the binding agent. The development of a specific PLSR model for each analyte and the selection of spectral regions containing only lines of the analyte of interest were the best conditions for the analysis. In this particular application, these models showed a similar performance, but PLSR seemed to be more robust due to a lower occurrence of outliers in comparison to the univariate method. The data suggest that efforts dealing with sample presentation and fitness of standards for LIBS analysis must be made in order to fulfill the boundary conditions for matrix independent development and validation.
Liu, Wei; Wang, Zhen-Zhong; Qing, Jian-Ping; Li, Hong-Juan; Xiao, Wei
2014-01-01
Background: Peach kernels, which contain various fatty acids, play an important role in the regulation of a variety of physiological and biological functions. Objective: To establish an innovative and rapid diffuse reflectance near-infrared spectroscopy (DR-NIR) analysis method along with chemometric techniques for the qualitative and quantitative determination of a peach kernel. Materials and Methods: Peach kernel samples from nine different origins were analyzed with high-performance liquid chromatography (HPLC) as a reference method. DR-NIR spectra were acquired in the spectral range 1100-2300 nm. Principal component analysis (PCA) and the partial least squares regression (PLSR) algorithm were applied to obtain prediction models. The Savitzky-Golay derivative and first derivative were adopted for the spectral pre-processing. PCA was applied to classify the varieties of the samples. For the quantitative calibration, the models of linoleic and oleinic acids were established with the PLSR algorithm, and the optimal principal component (PC) numbers were selected with leave-one-out (LOO) cross-validation. The established models were evaluated with the root mean square error of deviation (RMSED) and corresponding correlation coefficients (R2). Results: The PCA results of DR-NIR spectra yield clear classification of the two varieties of peach kernel. PLSR had a better predictive ability. The correlation coefficients of the two calibration models were above 0.99, and the RMSED of linoleic and oleinic acids were 1.266% and 1.412%, respectively. Conclusion: DR-NIR combined with the PCA and PLSR algorithms could be used efficiently to identify and quantify peach kernels and also help to solve the variety problem. PMID:25422544
Lim, Jongguk; Kim, Giyoung; Mo, Changyeun; Kim, Moon S; Chao, Kuanglin; Qin, Jianwei; Fu, Xiaping; Baek, Insuck; Cho, Byoung-Kwan
2016-05-01
Illegal use of nitrogen-rich melamine (C3H6N6) to boost perceived protein content of food products such as milk, infant formula, frozen yogurt, pet food, biscuits, and coffee drinks has caused serious food safety problems. Conventional methods to detect melamine in foods, such as Enzyme-linked immunosorbent assay (ELISA), High-performance liquid chromatography (HPLC), and Gas chromatography-mass spectrometry (GC-MS), are sensitive but they are time-consuming, expensive, and labor-intensive. In this research, near-infrared (NIR) hyperspectral imaging technique combined with regression coefficient of partial least squares regression (PLSR) model was used to detect melamine particles in milk powders easily and quickly. NIR hyperspectral reflectance imaging data in the spectral range of 990-1700nm were acquired from melamine-milk powder mixture samples prepared at various concentrations ranging from 0.02% to 1%. PLSR models were developed to correlate the spectral data (independent variables) with melamine concentration (dependent variables) in melamine-milk powder mixture samples. PLSR models applying various pretreatment methods were used to reconstruct the two-dimensional PLS images. PLS images were converted to the binary images to detect the suspected melamine pixels in milk powder. As the melamine concentration was increased, the numbers of suspected melamine pixels of binary images were also increased. These results suggested that NIR hyperspectral imaging technique and the PLSR model can be regarded as an effective tool to detect melamine particles in milk powders. Copyright © 2016 Elsevier B.V. All rights reserved.
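The imaging step described above (apply the PLSR regression coefficients to every pixel spectrum, then threshold the resulting PLS image into a binary image) can be sketched as follows. The cube, the coefficient vector, and the threshold are made-up values for illustration; the abstract does not report the ones actually used.

```python
import numpy as np

def pls_image(cube, coef, intercept=0.0):
    """Apply a linear model to each pixel spectrum of a (rows, cols, bands) cube."""
    return cube @ coef + intercept      # result has shape (rows, cols)

def binarize(image, threshold):
    """Mark pixels whose predicted value exceeds the threshold."""
    return image > threshold

# Toy 4 x 4 image with 3 bands; one pixel mimics a suspect particle.
cube = np.zeros((4, 4, 3))
cube[1, 2] = [1.0, 1.0, 1.0]
coef = np.array([0.2, 0.3, 0.5])        # hypothetical regression coefficients
image = pls_image(cube, coef)
mask = binarize(image, threshold=0.5)   # hypothetical threshold
n_suspect = int(mask.sum())
```

Counting the flagged pixels per sample gives the quantity that the abstract reports as increasing with melamine concentration.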
Lee, Byeong-Ju; Zhou, Yaoyao; Lee, Jae Soung; Shin, Byeung Kon; Seo, Jeong-Ah; Lee, Doyup; Kim, Young-Suk
2018-01-01
The ability to determine the origin of soybeans is an important issue following the inclusion of this information in the labeling of agricultural food products becoming mandatory in South Korea in 2017. This study was carried out to construct a prediction model for discriminating Chinese and Korean soybeans using Fourier-transform infrared (FT-IR) spectroscopy and multivariate statistical analysis. The optimal prediction models for discriminating soybean samples were obtained by selecting appropriate scaling methods, normalization methods, variable influence on projection (VIP) cutoff values, and wave-number regions. The factors for constructing the optimal partial-least-squares regression (PLSR) prediction model were using second derivatives, vector normalization, unit variance scaling, and the 4000–400 cm–1 region (excluding water vapor and carbon dioxide). The PLSR model for discriminating Chinese and Korean soybean samples had the best predictability when a VIP cutoff value was not applied. When Chinese soybean samples were identified, a PLSR model that has the lowest root-mean-square error of the prediction value was obtained using a VIP cutoff value of 1.5. The optimal PLSR prediction model for discriminating Korean soybean samples was also obtained using a VIP cutoff value of 1.5. This is the first study that has combined FT-IR spectroscopy with normalization methods, VIP cutoff values, and selected wave-number regions for discriminating Chinese and Korean soybeans. PMID:29689113
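The VIP cutoff mentioned above relies on variable-importance scores derived from the PLS weights. The sketch below computes VIP for a single-response PLS model using the commonly used formula; it is an assumption that the study used this same definition, and the data, function names, and two-component model are illustrative only. The cutoff of 1.5 is the value quoted in the abstract.

```python
import numpy as np

def pls1_components(X, y, n_components):
    """NIPALS PLS1; returns weights W, scores T and y-loadings q."""
    E = X - X.mean(axis=0)
    f = y - y.mean()
    W, T, q = [], [], []
    for _ in range(n_components):
        w = E.T @ f
        w /= np.linalg.norm(w)
        t = E @ w
        tt = t @ t
        p = E.T @ t / tt
        q_a = (f @ t) / tt
        E = E - np.outer(t, p)
        f = f - q_a * t
        W.append(w); T.append(t); q.append(q_a)
    return np.column_stack(W), np.column_stack(T), np.array(q)

def vip_scores(W, T, q):
    """VIP_j = sqrt(n_vars * sum_a SS_a * w_ja^2 / sum_a SS_a), with unit-norm weights."""
    n_vars = W.shape[0]
    ss = q ** 2 * np.sum(T ** 2, axis=0)            # y-variance explained per component
    w_sq = (W / np.linalg.norm(W, axis=0)) ** 2
    return np.sqrt(n_vars * (w_sq @ ss) / ss.sum())

# Synthetic data: only the first of 8 variables drives the response.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 8))
y = 3.0 * X[:, 0] + rng.normal(scale=0.1, size=40)
W, T, q = pls1_components(X, y, n_components=2)
vip = vip_scores(W, T, q)
selected = np.flatnonzero(vip > 1.5)                # variables kept at the cutoff
```

By construction the squared VIP scores average to one, which is why cutoffs near or above 1 are used to flag influential variables.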
NASA Astrophysics Data System (ADS)
Vasat, Radim; Klement, Ales; Jaksik, Ondrej; Kodesova, Radka; Drabek, Ondrej; Boruvka, Lubos
2014-05-01
Visible and near-infrared diffuse reflectance spectroscopy (VNIR-DRS) provides a rapid and inexpensive tool for simultaneous prediction of a variety of soil properties. Usually, some sophisticated multivariate mathematical or statistical methods are employed in order to extract the required information from the raw spectra measurement. For this purpose especially the Partial least squares regression (PLSR) and Support vector machines (SVM) are the most frequently used. These methods generally benefit from the complexity with which the soil spectra are treated. But it is interesting that also techniques that focus only on a single spectral feature, such as a simple linear regression with selected continuum-removed spectra (CRS) characteristic (e.g. peak depth), can often provide competitive results. Therefore, we decided to enhance the potential of CRS taking into account all possible CRS peak parameters (area, width and depth) and develop a comprehensive methodology based on multiple linear regression approach. The eight considered soil properties were oxidizable carbon content (Cox), exchangeable (pHex) and active soil pH (pHa), particle and bulk density, CaCO3 content, crystalline and amorphous (Fed) and amorphous Fe (Feox) forms. In four cases (pHa, bulk density, Fed and Feox), of which two (Fed and Feox) were predicted reliably accurately (0.50 < R2cv < 0.80) and the other two (pHa and bulk density) only poorly (R2cv < 0.50), we obtained slightly better results than with PLSR and SVM. In one case (pHex) we achieved a significantly higher, although just reliable, accuracy (R2cv = 0.601) than with PLSR and SVM (R2cv = 0.448 and 0.442, resp.). But most interestingly, in the case of particle density, the presented approach outperformed the PLSR and SVM dramatically offering a fairly accurate prediction (R2cv = 0.827) against two failures (R2cv = 0.034 and 0.121 for PLSR and SVM, resp.). 
In the last two cases (Cox and CaCO3), slightly worse results were achieved than with PLSR and SVM, with overall fairly accurate prediction (R2cv > 0.80). Acknowledgment: Authors acknowledge the financial support of the Ministry of Agriculture of the Czech Republic (grant No. QJ1230319).
Kuriakose, Saji; Joe, I Hubert
2013-11-01
Determination of the authenticity of essential oils has become more significant, in recent years, following some illegal adulteration and contamination scandals. The present investigative study focuses on the application of near infrared spectroscopy to detect sample authenticity and quantify economic adulteration of sandalwood oils. Several data pre-treatments are investigated for calibration and prediction using partial least square regression (PLSR). The quantitative data analysis is done using a new spectral approach - full spectrum or sequential spectrum. The optimum number of PLS components is obtained according to the lowest root mean square error of calibration (RMSEC=0.00009% v/v). The lowest root mean square error of prediction (RMSEP=0.00016% v/v) in the test set and the highest coefficient of determination (R(2)=0.99989) are used as the evaluation tools for the best model. A nonlinear method, locally weighted regression (LWR), is added to extract nonlinear information and to compare with the linear PLSR model. Copyright © 2013 Elsevier B.V. All rights reserved.
Hyperspectral imaging using a color camera and its application for pathogen detection
NASA Astrophysics Data System (ADS)
Yoon, Seung-Chul; Shin, Tae-Sung; Heitschmidt, Gerald W.; Lawrence, Kurt C.; Park, Bosoon; Gamble, Gary
2015-02-01
This paper reports the results of a feasibility study for the development of a hyperspectral image recovery (reconstruction) technique using a RGB color camera and regression analysis in order to detect and classify colonies of foodborne pathogens. The target bacterial pathogens were the six representative non-O157 Shiga-toxin producing Escherichia coli (STEC) serogroups (O26, O45, O103, O111, O121, and O145) grown in Petri dishes of Rainbow agar. The purpose of the feasibility study was to evaluate whether a DSLR camera (Nikon D700) could be used to predict hyperspectral images in the wavelength range from 400 to 1,000 nm and even to predict the types of pathogens using a hyperspectral STEC classification algorithm that was previously developed. Unlike many other studies using color charts with known and noise-free spectra for training reconstruction models, this work used hyperspectral and color images, separately measured by a hyperspectral imaging spectrometer and the DSLR color camera. The color images were calibrated (i.e. normalized) to relative reflectance, subsampled and spatially registered to match with counterpart pixels in hyperspectral images that were also calibrated to relative reflectance. Polynomial multivariate least-squares regression (PMLR) was previously developed with simulated color images. In this study, partial least squares regression (PLSR) was also evaluated as a spectral recovery technique to minimize multicollinearity and overfitting. The two spectral recovery models (PMLR and PLSR) and their parameters were evaluated by cross-validation. The QR decomposition was used to find a numerically more stable solution of the regression equation. The preliminary results showed that PLSR was more effective especially with higher order polynomial regressions than PMLR. The best classification accuracy measured with an independent test set was about 90%. 
The results suggest the potential of cost-effective color imaging using hyperspectral image classification algorithms for rapidly differentiating pathogens in agar plates.
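The idea of recovering spectra from RGB values by polynomial least-squares regression, solved through a QR decomposition, can be sketched as below. The second-order feature set, the five output bands, and the synthetic data are assumptions made for illustration; the abstract does not specify the polynomial terms used.

```python
import numpy as np

def poly2_features(rgb):
    """Second-order polynomial expansion of RGB triplets (hypothetical feature set)."""
    r, g, b = rgb[:, 0], rgb[:, 1], rgb[:, 2]
    return np.column_stack(
        [np.ones(len(rgb)), r, g, b, r * r, g * g, b * b, r * g, r * b, g * b]
    )

def fit_qr(A, Y):
    """Least-squares solution of A @ coef = Y using the reduced QR factorisation A = Q R."""
    Q, R = np.linalg.qr(A)
    return np.linalg.solve(R, Q.T @ Y)

# Synthetic, noise-free example: 200 pixels mapped to 5 hypothetical spectral bands.
rng = np.random.default_rng(1)
rgb = rng.uniform(size=(200, 3))
true_coef = rng.normal(size=(10, 5))
spectra = poly2_features(rgb) @ true_coef
coef = fit_qr(poly2_features(rgb), spectra)
```

Solving through QR avoids forming the normal equations, whose condition number is the square of that of the design matrix.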
Genkawa, Takuma; Shinzawa, Hideyuki; Kato, Hideaki; Ishikawa, Daitaro; Murayama, Kodai; Komiyama, Makoto; Ozaki, Yukihiro
2015-12-01
An alternative baseline correction method for diffuse reflection near-infrared (NIR) spectra, searching region standard normal variate (SRSNV), was proposed. Standard normal variate (SNV) is an effective pretreatment method for baseline correction of diffuse reflection NIR spectra of powder and granular samples; however, its baseline correction performance depends on the NIR region used for SNV calculation. To search for an optimal NIR region for baseline correction using SNV, SRSNV employs moving window partial least squares regression (MWPLSR), and an optimal NIR region is identified based on the root mean square error (RMSE) of cross-validation of the partial least squares regression (PLSR) models with the first latent variable (LV). The performance of SRSNV was evaluated using diffuse reflection NIR spectra of mixture samples consisting of wheat flour and granular glucose (0-100% glucose at 5% intervals). From the obtained NIR spectra of the mixture in the 10 000-4000 cm(-1) region at 4 cm intervals (1501 spectral channels), a series of spectral windows consisting of 80 spectral channels was constructed, and then SNV spectra were calculated for each spectral window. Using these SNV spectra, a series of PLSR models with the first LV for glucose concentration was built. A plot of RMSE versus the spectral window position obtained using the PLSR models revealed that the 8680–8364 cm(-1) region was optimal for baseline correction using SNV. In the SNV spectra calculated using the 8680–8364 cm(-1) region (SRSNV spectra), a remarkable relative intensity change between a band due to wheat flour at 8500 cm(-1) and that due to glucose at 8364 cm(-1) was observed owing to successful baseline correction using SNV. 
A PLSR model with the first LV based on the SRSNV spectra yielded a determination coefficient (R2) of 0.999 and an RMSE of 0.70%, while a PLSR model with three LVs based on SNV spectra calculated in the full spectral region gave an R2 of 0.995 and an RMSE of 2.29%. Additional evaluation of SRSNV was carried out using diffuse reflection NIR spectra of marzipan and corn samples, and PLSR models based on SRSNV spectra showed good prediction results. These evaluation results indicate that SRSNV is effective in baseline correction of diffuse reflection NIR spectra and provides regression models with good prediction accuracy.
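The window search described above can be sketched roughly as follows: for each spectral window, compute SNV within that window only, fit a one-latent-variable PLS model, and record the cross-validated RMSE. This is a simplified sketch under assumptions: leave-one-out cross-validation is used (the abstract does not state the scheme), the spectra are synthetic Gaussian bands with random baseline offsets, and all names and window settings are hypothetical.

```python
import numpy as np

def snv(block):
    """Standard normal variate computed row-wise within the given window."""
    mean = block.mean(axis=1, keepdims=True)
    std = block.std(axis=1, ddof=1, keepdims=True)
    return (block - mean) / std

def one_lv_pls(X, y):
    """PLS1 with a single latent variable; returns (coef, intercept)."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    w = Xc.T @ yc
    w /= np.linalg.norm(w)
    t = Xc @ w
    coef = w * ((t @ yc) / (t @ t))
    return coef, y_mean - x_mean @ coef

def loo_rmse(X, y):
    """Leave-one-out cross-validated RMSE of the one-LV model."""
    errors = []
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        coef, b0 = one_lv_pls(X[keep], y[keep])
        errors.append(y[i] - (X[i] @ coef + b0))
    return np.sqrt(np.mean(np.square(errors)))

def window_search(spectra, y, width, step=1):
    """RMSE of cross-validation for SNV computed in each moving window."""
    starts = np.arange(0, spectra.shape[1] - width + 1, step)
    rmse = np.array([loo_rmse(snv(spectra[:, s:s + width]), y) for s in starts])
    return starts, rmse

# Synthetic two-component mixtures: 21 samples, 200 channels.
rng = np.random.default_rng(2)
y = np.linspace(0.0, 100.0, 21)                       # analyte content in %
channels = np.arange(200)
band_a = np.exp(-0.5 * ((channels - 60) / 8.0) ** 2)  # hypothetical matrix band
band_b = np.exp(-0.5 * ((channels - 90) / 8.0) ** 2)  # hypothetical analyte band
spectra = (np.outer(100.0 - y, band_a) + np.outer(y, band_b)) / 100.0
spectra += rng.normal(scale=0.01, size=spectra.shape)
spectra += rng.uniform(0.0, 1.0, size=(21, 1))        # random baseline offsets
starts, rmse = window_search(spectra, y, width=80, step=20)
best_start = int(starts[np.argmin(rmse)])
```

Plotting `rmse` against `starts` gives the kind of RMSE-versus-window-position curve from which the abstract reports the optimal region was read off.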
Li, Jie; Sun, Jin; He, Zhonggui
2007-01-26
We aimed to establish quantitative structure-retention relationship (QSRR) with immobilized artificial membrane (IAM) chromatography using easily understood and obtained physicochemical molecular descriptors and to elucidate which descriptors are critical to affect the interaction process between solutes and immobilized phospholipid membranes. The retention indices (logk(IAM)) of 55 structurally diverse drugs were determined on an immobilized artificial membrane column (IAM.PC.DD2) directly or obtained by extrapolation method for highly hydrophobic compounds. Ten simple physicochemical property descriptors (clogP, rings, rotatory bond, hydro-bond counting, etc.) of these drugs were collected and used to establish QSRR and predict the retention data by partial least squares regression (PLSR). Five descriptors, clogP, rotatory bond (RotB), rings, molecular weight (MW) and total surface area (TSA), were reserved by using the Variable Importance for Projection (VIP) values as criterion to build the final PLSR model. An external test set was employed to verify the QSRR based on the training set with the five variables, and QSRR by PLSR exhibited a satisfying predictive ability with R(p)=0.902 and RMSE(p)=0.400. Comparison of coefficients of centered and scaled variables by PLSR demonstrated that, for the descriptors studied, clogP and TSA have the most significant positive effect but the rotatable bond has significant negative effect on drug IAM chromatographic retention.
Naguib, Ibrahim A; Abdelrahman, Maha M; El Ghobashy, Mohamed R; Ali, Nesma A
2016-01-01
Two accurate, sensitive, and selective stability-indicating methods are developed and validated for simultaneous quantitative determination of agomelatine (AGM) and its forced degradation products (Deg I and Deg II), whether in pure forms or in pharmaceutical formulations. Partial least-squares regression (PLSR) and spectral residual augmented classical least-squares (SRACLS) are two chemometric models that are being subjected to a comparative study through handling UV spectral data in range (215-350 nm). For proper analysis, a three-factor, four-level experimental design was established, resulting in a training set consisting of 16 mixtures containing different ratios of interfering species. An independent test set consisting of eight mixtures was used to validate the prediction ability of the suggested models. The results presented indicate the ability of mentioned multivariate calibration models to analyze AGM, Deg I, and Deg II with high selectivity and accuracy. The analysis results of the pharmaceutical formulations were statistically compared to the reference HPLC method, with no significant differences observed regarding accuracy and precision. The SRACLS model gives comparable results to the PLSR model; however, it keeps the qualitative spectral information of the classical least-squares algorithm for analyzed components.
NASA Astrophysics Data System (ADS)
Qu, Yonghua; Jiao, Siong; Lin, Xudong
2008-10-01
The Hetao Irrigation District, located in Inner Mongolia, is one of the three largest irrigated areas in China. Because far more effort has been put into irrigation than into drainage in this agricultural region, large amounts of salt that would normally remain dissolved in water have been deposited in the surface soil. As a result, soil salinity has become a chief factor causing land degradation in the district. Remote sensing is an efficient way to map salinity at the regional scale. In remote sensing, the soil spectrum is one of the most important indicators of soil salinity status. In past decades, many efforts have been made to reveal the spectral characteristics of salinized soil, for example with traditional statistical regression methods. However, traditional regression methods cannot handle hyperspectral reflectance data, because such data usually contain too many spectral bands. In this paper, a partial least squares regression (PLSR) model was established based on a statistical analysis of soil salinity and hyperspectral reflectance. The dataset was collected from field soil samples taken in the Hetao irrigation region from the end of July to the beginning of August. Independent validation using data not included in the calibration model reveals that the proposed model can predict the main soil properties, such as total ion content (S%) and pH, with determination coefficients (R2) of 0.728 and 0.715, respectively. The ratio of performance to deviation (RPD) of these predicted values is larger than 1.6, which indicates that the calibrated PLSR model can be used as a tool to retrieve soil salinity with accurate results. 
When the PLSR model's regression coefficients were aggregated according to the wavelengths of the visible (blue, green, red) and near-infrared bands of the Landsat Thematic Mapper (TM) sensor, some significant response values were observed, which indicates that the proposed method can be used to analyze remotely sensed data from spaceborne platforms.
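The RPD statistic quoted above is commonly computed as the standard deviation of the reference values divided by the root-mean-square error of prediction. A minimal sketch, assuming that definition (the abstract does not spell it out) and using made-up numbers and hypothetical function names:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root-mean-square error between reference and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

def rpd(y_true, y_pred):
    """Sample standard deviation of the reference values divided by the RMSE."""
    return np.std(np.asarray(y_true, dtype=float), ddof=1) / rmse(y_true, y_pred)

# Made-up validation values (assumption), not data from the study.
measured = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 1.5, 3.5, 3.5, 5.5]
print(rmse(measured, predicted), rpd(measured, predicted))
```

An RPD well above 1 means the prediction error is small relative to the natural spread of the property; the acceptance thresholds applied to it vary between studies.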
Multivariate analysis of ATR-FTIR spectra for assessment of oil shale organic geochemical properties
Washburn, Kathryn E.; Birdwell, Justin E.
2013-01-01
In this study, attenuated total reflectance (ATR) Fourier transform infrared spectroscopy (FTIR) was coupled with partial least squares regression (PLSR) analysis to relate spectral data to parameters from total organic carbon (TOC) analysis and programmed pyrolysis to assess the feasibility of developing predictive models to estimate important organic geochemical parameters. The advantage of ATR-FTIR over traditional analytical methods is that source rocks can be analyzed in the laboratory or field in seconds, facilitating more rapid and thorough screening than would be possible using other tools. ATR-FTIR spectra, TOC concentrations and Rock–Eval parameters were measured for a set of oil shales from deposits around the world and several pyrolyzed oil shale samples. PLSR models were developed to predict the measured geochemical parameters from infrared spectra. Application of the resulting models to a set of test spectra excluded from the training set generated accurate predictions of TOC and most Rock–Eval parameters. The critical region of the infrared spectrum for assessing S1, S2, Hydrogen Index and TOC consisted of aliphatic organic moieties (2800–3000 cm−1) and the models generated a better correlation with measured values of TOC and S2 than did integrated aliphatic peak areas. The results suggest that combining ATR-FTIR with PLSR is a reliable approach for estimating useful geochemical parameters of oil shales that is faster and requires less sample preparation than current screening methods.
Rapid Isolation and Detection for RNA Biomarkers for TBI Diagnostics
2015-10-01
V., Grape and wine sensory attributes correlate with pattern- based discrimination of Cabernet Sauvignon wines by a peptidic sensor array, Tetrahedron... wine samples. Partial Least Squares Regression (PLSR) was used for the correlation of wine sensory attributes to the peptide-based receptor...responses. Data analysis was done using the software XLSTAT Addinsoft, NewYork) and R.Absorbance values due to wine without the sensing ensembles were
Oberg, Tomas
2004-01-01
Halogenated aliphatic compounds have many technical uses, but substances within this group are also ubiquitous environmental pollutants that can affect the ozone layer and contribute to global warming. The establishment of quantitative structure-property relationships is of interest not only to fill in gaps in the available database but also to validate experimental data already acquired. The three-dimensional structures of 240 compounds were modeled with molecular mechanics prior to the generation of empirical descriptors. Two bilinear projection methods, principal component analysis (PCA) and partial-least-squares regression (PLSR), were used to identify outliers. PLSR was subsequently used to build a multivariate calibration model by extracting the latent variables that describe most of the covariation between the molecular structure and the boiling point. Boiling points were also estimated with an extension of the group contribution method of Stein and Brown.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tripathi, Markandey M.; Krishnan, Sundar R.; Srinivasan, Kalyan K.
Chemiluminescence emissions from OH*, CH*, C2, and CO2 formed within the reaction zone of premixed flames depend upon the fuel-air equivalence ratio in the burning mixture. In the present paper, a new partial least square regression (PLS-R) based multivariate sensing methodology is investigated and compared with an OH*/CH* intensity ratio-based calibration model for sensing equivalence ratio in atmospheric methane-air premixed flames. Five replications of spectral data at nine different equivalence ratios ranging from 0.73 to 1.48 were used in the calibration of both models. During model development, the PLS-R model was initially validated with the calibration data set using the leave-one-out cross validation technique. Since the PLS-R model used the entire raw spectral intensities, it did not need the nonlinear background subtraction of CO2 emission that is required for typical OH*/CH* intensity ratio calibrations. An unbiased spectral data set (not used in the PLS-R model development), for 28 different equivalence ratio conditions ranging from 0.71 to 1.67, was used to predict equivalence ratios using the PLS-R and the intensity ratio calibration models. It was found that the equivalence ratios predicted with the PLS-R based multivariate calibration model matched the experimentally measured equivalence ratios within 7%, whereas the OH*/CH* intensity ratio calibration grossly underpredicted equivalence ratios in comparison to measured equivalence ratios, especially under rich conditions (equivalence ratio > 1.2). The practical implications of the chemiluminescence-based multivariate equivalence ratio sensing methodology are also discussed.
2009-01-01
Background Genomic selection (GS) uses molecular breeding values (MBV) derived from dense markers across the entire genome for selection of young animals. The accuracy of MBV prediction is important for a successful application of GS. Recently, several methods have been proposed to estimate MBV. Initial simulation studies have shown that these methods can accurately predict MBV. In this study we compared the accuracies and possible bias of five different regression methods in an empirical application in dairy cattle. Methods Genotypes of 7,372 SNP and highly accurate EBV of 1,945 dairy bulls were used to predict MBV for protein percentage (PPT) and a profit index (Australian Selection Index, ASI). Marker effects were estimated by least squares regression (FR-LS), Bayesian regression (Bayes-R), random regression best linear unbiased prediction (RR-BLUP), partial least squares regression (PLSR) and nonparametric support vector regression (SVR) in a training set of 1,239 bulls. Accuracy and bias of MBV prediction were calculated from cross-validation of the training set and tested against a test team of 706 young bulls. Results For both traits, FR-LS using a subset of SNP was significantly less accurate than all other methods which used all SNP. Accuracies obtained by Bayes-R, RR-BLUP, PLSR and SVR were very similar for ASI (0.39-0.45) and for PPT (0.55-0.61). Overall, SVR gave the highest accuracy. All methods resulted in biased MBV predictions for ASI, for PPT only RR-BLUP and SVR predictions were unbiased. A significant decrease in accuracy of prediction of ASI was seen in young test cohorts of bulls compared to the accuracy derived from cross-validation of the training set. This reduction was not apparent for PPT. Combining MBV predictions with pedigree based predictions gave 1.05 - 1.34 times higher accuracies compared to predictions based on pedigree alone. 
Some methods have largely different computational requirements, with PLSR and RR-BLUP requiring the least computing time. Conclusions The four methods which use information from all SNP namely RR-BLUP, Bayes-R, PLSR and SVR generate similar accuracies of MBV prediction for genomic selection, and their use in the selection of immediate future generations in dairy cattle will be comparable. The use of FR-LS in genomic selection is not recommended. PMID:20043835
NASA Astrophysics Data System (ADS)
Chen, Hui; Tan, Chao; Lin, Zan; Wu, Tong
2018-01-01
Milk is among the most popular nutrient sources worldwide and is of great interest due to its beneficial medicinal properties. The feasibility of classifying milk powder samples by brand and of determining their protein concentration by NIR spectroscopy combined with chemometrics is investigated. Two datasets were prepared for the experiments: one contains 179 samples of four brands for classification, and the other contains 30 samples for quantitative analysis. Principal component analysis (PCA) was used for exploratory analysis. Based on an effective model-independent variable selection method, minimal-redundancy maximal-relevance (MRMR), only 18 variables were selected to construct a partial least-square discriminant analysis (PLS-DA) model. On the test set, the PLS-DA model based on the selected variable set was compared with the full-spectrum PLS-DA model, and both achieved 100% accuracy. In quantitative analysis, the partial least-square regression (PLSR) model constructed from the selected subset of 260 variables significantly outperforms the full-spectrum model. The combination of NIR spectroscopy, MRMR and PLS-DA or PLSR appears to be a powerful tool for classifying different brands of milk powder and determining the protein content.
NASA Astrophysics Data System (ADS)
Arshad, Muhammad; Ullah, Saleem; Khurshid, Khurram; Ali, Asad
2017-10-01
Leaf Water Content (LWC) is an essential constituent of plant leaves that determines vegetation health and productivity. Accurate and timely measurement of water content is crucial for planning irrigation, forecasting drought and predicting woodland fire. The retrieval of LWC from the Visible to Shortwave Infrared (VSWIR: 0.4-2.5 μm) has been extensively investigated, but little has been done in the Mid and Thermal Infrared (MIR and TIR: 2.50-14.0 μm) windows of the electromagnetic spectrum. This study focuses on the retrieval of LWC from the Mid and Thermal Infrared using a Genetic Algorithm integrated with Partial Least Square Regression (PLSR). The Genetic Algorithm fused with PLSR (GA-PLSR) selects spectral wavebands with high predictive performance, i.e., those that yield a high adjusted-R2 and a low RMSE. In our case, GA-PLSR selected eight variables (bands) and yielded highly accurate models with an adjusted-R2 of 0.93 and an RMSEcv of 7.1 %. The study also demonstrated that the MIR is more sensitive to variation in LWC than the TIR. However, the combined use of MIR and TIR spectra enhances the predictive performance in the retrieval of LWC. The integration of the Genetic Algorithm and PLSR not only increases the estimation precision by selecting the most sensitive spectral bands but also helps in identifying the important spectral regions for quantifying water stress in vegetation. The findings of this study will allow future space missions (such as HyspIRI) to position wavebands at sensitive regions for characterizing vegetation stress.
NASA Astrophysics Data System (ADS)
Nawar, Said; Buddenbaum, Henning; Hill, Joachim
2014-05-01
A rapid and inexpensive soil analytical technique is needed for soil quality assessment and accurate mapping. This study investigated a method for improved estimation of soil clay (SC) and organic matter (OM) using reflectance spectroscopy. Seventy soil samples were collected from the Sinai peninsula in Egypt to relate soil clay and organic matter to the soil spectra. Soil samples were scanned with an Analytical Spectral Devices (ASD) spectrometer (350-2500 nm). Three spectral formats were used in the calibration models derived from the spectra and the soil properties: (1) original reflectance spectra (OR), (2) first-derivative spectra smoothed using the Savitzky-Golay technique (FD-SG) and (3) continuum-removed reflectance (CR). Partial least-squares regression (PLSR) models using the CR of the 400-2500 nm spectral region resulted in R2 = 0.76 and 0.57, and RPD = 2.1 and 1.5 for estimating SC and OM, respectively, indicating better performance than that obtained using OR and FD-SG. The multivariate adaptive regression splines (MARS) calibration model with the CR spectra resulted in an improved performance (R2 = 0.89 and 0.83, RPD = 3.1 and 2.4) for estimating SC and OM, respectively. The results show that the MARS models have great potential for estimating SC and OM compared with PLSR models. The results obtained in this study have potential value in the field of soil spectroscopy because they can be applied directly to the mapping of soil properties using remote sensing imagery in arid environment conditions. Key Words: soil clay, organic matter, PLSR, MARS, reflectance spectroscopy.
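The first-derivative Savitzky-Golay preprocessing named above can be sketched with SciPy. The window length, polynomial order, 1 nm sampling and the synthetic spectrum are all assumptions chosen for illustration; the abstract does not report the settings used, and the function name `sg_first_derivative` is hypothetical.

```python
import numpy as np
from scipy.signal import savgol_filter

def sg_first_derivative(reflectance, window_length=11, polyorder=2, delta=1.0):
    """Smoothed first derivative along the wavelength axis (last axis).

    delta is the wavelength spacing, so the result is in reflectance per nm
    when the spectrum is sampled every `delta` nm.
    """
    return savgol_filter(np.asarray(reflectance, dtype=float),
                         window_length=window_length, polyorder=polyorder,
                         deriv=1, delta=delta, axis=-1)

# Synthetic reflectance spectrum (assumption): rising baseline with one
# absorption feature near 1900 nm, sampled every 1 nm from 400 to 2500 nm.
wavelengths = np.arange(400.0, 2501.0, 1.0)
spectrum = (0.3 + 1e-4 * (wavelengths - 400.0)
            - 0.05 * np.exp(-((wavelengths - 1900.0) / 30.0) ** 2))
fd_sg = sg_first_derivative(spectrum)
```

The derivative suppresses constant baseline offsets and emphasizes the flanks of absorption features, at the cost of amplifying noise; a wider window trades feature sharpness for smoothing.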
Hattori, Yusuke; Otsuka, Makoto
2017-05-30
In the pharmaceutical industry, the implementation of continuous manufacturing has been widely promoted in lieu of the traditional batch manufacturing approach. More specifically, in recent years, the innovative concept of feed-forward control has been introduced in relation to process analytical technology. In the present study, we successfully developed a feed-forward control model for the tablet compression process by integrating data obtained from near-infrared (NIR) spectra and the physical properties of granules. In the pharmaceutical industry, batch manufacturing routinely allows for the preparation of granules with the desired properties through the manual control of process parameters. On the other hand, continuous manufacturing demands the automatic determination of these process parameters. Here, we proposed the development of a control model using the partial least squares regression (PLSR) method. The most significant feature of this method is the use of a dataset integrating both the NIR spectra and the physical properties of the granules. Using our model, we determined that the properties of products, such as tablet weight and thickness, need to be included as independent variables in the PLSR analysis in order to predict unknown process parameters. Copyright © 2017 Elsevier B.V. All rights reserved.
On the prediction of threshold friction velocity of wind erosion using soil reflectance spectroscopy
NASA Astrophysics Data System (ADS)
Li, Junran; Flagg, Cody; Okin, Gregory S.; Painter, Thomas H.; Dintwe, Kebonye; Belnap, Jayne
2015-12-01
Current approaches to estimating the threshold friction velocity (TFV) of soil particle movement, including both experimental and empirical methods, suffer from various disadvantages and are particularly ineffective for estimating TFVs at regional to global scales. Reflectance spectroscopy has been widely used to obtain TFV-related soil properties (e.g., moisture, texture, crust); however, no studies have attempted to directly relate soil TFV to spectral reflectance. The objective of this study was to investigate the relationship between soil TFV and soil reflectance in the visible and near infrared (VIS-NIR, 350-2500 nm) spectral region, and to identify the best range of wavelengths or combinations of wavelengths to predict TFV. The threshold friction velocities of 31 soils, along with their reflectance spectra and texture, were measured in the Mojave Desert, California and Moab, Utah. A correlation analysis between TFV and soil reflectance identified a number of isolated, narrow spectral domains that largely fell into two spectral regions, the VIS area (400-700 nm) and the short-wavelength infrared (SWIR) area (1100-2500 nm). A partial least squares regression (PLSR) analysis confirmed the significant bands that were identified by correlation analysis. The PLSR further identified a strong relationship between the first-difference transformation and TFV at several narrow regions around 1400, 1900, and 2200 nm. The use of PLSR allowed us to identify a total of 17 key wavelengths in the investigated spectral range, which may be used as the optimal spectral settings for estimating TFV in the laboratory and field, or for mapping TFV using airborne/satellite sensors.
Palomba, M. Lia; Piersanti, Kelly; Ziegler, Carly G. K.; Decker, Hugo; Cotari, Jesse W.; Bantilan, Kurt; Rijo, Ivelise; Gardner, Jeff R.; Heaney, Mark; Bemis, Debra; Balderas, Robert; Malek, Sami N.; Seymour, Erlene; Zelenetz, Andrew D.
2014-01-01
Purpose Chronic Lymphocytic Leukemia (CLL) is defined by a perturbed B-cell receptor-mediated signaling machinery. We aimed to model differential signaling behavior between B cells from CLL and healthy individuals to pinpoint modes of dysregulation. Experimental Design We developed an experimental methodology combining immunophenotyping, multiplexed phosphospecific flow cytometry, and multifactorial statistical modeling. Utilizing patterns of signaling network covariance, we modeled BCR signaling in 67 CLL patients using Partial Least Squares Regression (PLSR). Results from multidimensional modeling were validated using an independent test cohort of 38 patients. Results We identified a dynamic and variable imbalance between proximal (pSYK, pBTK) and distal (pPLCγ2, pBLNK, ppERK) phosphoresponses. PLSR identified the relationship between upstream tyrosine kinase SYK and its target, PLCγ2, as maximally predictive and sufficient to distinguish CLL from healthy samples, pointing to this juncture in the signaling pathway as a hallmark of CLL B cells. Specific BCR pathway signaling signatures that correlate with the disease and its degree of aggressiveness were identified. Heterogeneity in the PLSR response variable within the B cell population is both a characteristic mark of healthy samples and predictive of disease aggressiveness. Conclusion Single-cell multidimensional analysis of BCR signaling permitted focused analysis of the variability and heterogeneity of signaling behavior from patient-to-patient, and from cell-to-cell. Disruption of the pSYK/pPLCγ2 relationship is uncovered as a robust hallmark of CLL B cell signaling behavior. Together, these observations implicate novel elements of the BCR signal transduction as potential therapeutic targets. PMID:24489640
Estimation of Nitrogen Vertical Distribution by Bi-Directional Canopy Reflectance in Winter Wheat
Huang, Wenjiang; Yang, Qinying; Pu, Ruiliang; Yang, Shaoyuan
2014-01-01
Timely measurement of vertical foliage nitrogen distribution is critical for increasing crop yield and reducing environmental impact. In this study, a novel method with partial least square regression (PLSR) and vegetation indices was developed to determine optimal models for extracting vertical foliage nitrogen distribution of winter wheat by using bi-directional reflectance distribution function (BRDF) data. The BRDF data were collected from ground-based hyperspectral reflectance measurements recorded at the Xiaotangshan Precision Agriculture Experimental Base in 2003, 2004 and 2007. The view zenith angles (1) at nadir, 40° and 50°; (2) at nadir, 30° and 40°; and (3) at nadir, 20° and 30° were selected as optical view angles to estimate foliage nitrogen density (FND) at an upper, middle and bottom layer, respectively. For each layer, three optimal PLSR analysis models with FND as a dependent variable and two vegetation indices (nitrogen reflectance index (NRI), normalized pigment chlorophyll index (NPCI) or a combination of NRI and NPCI) at corresponding angles as explanatory variables were established. The experimental results from an independent model verification demonstrated that the PLSR analysis models with the combination of NRI and NPCI as the explanatory variables were the most accurate in estimating FND for each layer. The coefficients of determination (R2) of this model between upper layer-, middle layer- and bottom layer-derived and laboratory-measured foliage nitrogen density were 0.7335, 0.7336, 0.6746, respectively. PMID:25353983
Chen, Baisheng; Wu, Huanan; Li, Sam Fong Yau
2014-03-01
To overcome the challenging task to select an appropriate pathlength for wastewater chemical oxygen demand (COD) monitoring with high accuracy by UV-vis spectroscopy in wastewater treatment process, a variable pathlength approach combined with partial-least squares regression (PLSR) was developed in this study. Two new strategies were proposed to extract relevant information of UV-vis spectral data from variable pathlength measurements. The first strategy was by data fusion with two data fusion levels: low-level data fusion (LLDF) and mid-level data fusion (MLDF). Predictive accuracy was found to improve, indicated by the lower root-mean-square errors of prediction (RMSEP) compared with those obtained for single pathlength measurements. Both fusion levels were found to deliver very robust PLSR models with residual predictive deviations (RPD) greater than 3 (i.e. 3.22 and 3.29, respectively). The second strategy involved calculating the slopes of absorbance against pathlength at each wavelength to generate slope-derived spectra. Without the requirement to select the optimal pathlength, the predictive accuracy (RMSEP) was improved by 20-43% as compared to single pathlength spectroscopy. Comparing to nine-factor models from fusion strategy, the PLSR model from slope-derived spectroscopy was found to be more parsimonious with only five factors and more robust with residual predictive deviation (RPD) of 3.72. It also offered excellent correlation of predicted and measured COD values with R(2) of 0.936. In sum, variable pathlength spectroscopy with the two proposed data analysis strategies proved to be successful in enhancing prediction performance of COD in wastewater and showed high potential to be applied in on-line water quality monitoring. Copyright © 2013 Elsevier B.V. All rights reserved.
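The second strategy above, computing the slope of absorbance against pathlength at each wavelength, can be sketched with an ordinary least-squares line fit per wavelength. The pathlengths, absorptivity values and function name `slope_spectrum` are hypothetical; the abstract does not give the instrument settings.

```python
import numpy as np

def slope_spectrum(pathlengths, absorbance):
    """Least-squares slope of absorbance vs. pathlength at every wavelength.

    absorbance has shape (n_pathlengths, n_wavelengths); the returned
    slope-derived spectrum has shape (n_wavelengths,).
    """
    pathlengths = np.asarray(pathlengths, dtype=float)
    absorbance = np.asarray(absorbance, dtype=float)
    # polyfit accepts a 2-D y and fits one line per column
    slopes, _intercepts = np.polyfit(pathlengths, absorbance, deg=1)
    return slopes

# Hypothetical measurement (assumption): Beer-Lambert absorbance at three
# wavelengths plus a constant pathlength-independent offset.
pathlengths_mm = np.array([1.0, 2.0, 5.0, 10.0])
absorptivity_per_mm = np.array([0.02, 0.05, 0.01])
offset = 0.03
measured = np.outer(pathlengths_mm, absorptivity_per_mm) + offset
slopes = slope_spectrum(pathlengths_mm, measured)
```

Because the slope discards the intercept, pathlength-independent offsets drop out of the slope-derived spectrum, and no single optimal pathlength has to be chosen before regression.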
NASA Astrophysics Data System (ADS)
Arantes Camargo, Livia; Marques, José, Jr.
2015-04-01
The prediction of erodibility using indirect methods such as diffuse reflectance spectroscopy could facilitate the characterization of spatial variability over large areas and optimize the implementation of conservation practices. The aim of this study was to evaluate the prediction of interrill erodibility (Ki) and rill erodibility (Kr) from iron oxide content and soil color using multiple linear regression, and from diffuse reflectance spectroscopy (DRS) using partial least squares regression (PLSR). The soils were collected from three geomorphic surfaces, analyzed for chemical, physical and mineralogical properties, and scanned in the visible and infrared spectral range. Maps of the spatial distribution of Ki and Kr were built by geostatistics from the values calculated by the calibrated models that obtained the best accuracy. Interrill and rill erodibility presented negative correlations with iron extracted by dithionite-citrate-bicarbonate, hematite, and chroma, confirming the influence of iron oxides on soil structural stability. Hematite and hue were the attributes that contributed most to the multiple linear regression calibration models for the prediction of Ki (R2 = 0.55) and Kr (R2 = 0.53). Diffuse reflectance spectroscopy via PLSR allowed prediction of interrill and rill erodibility with high accuracy (R2adj = 0.76 and 0.81, respectively, and RPD > 2.0) in the visible range of the spectrum (380-800 nm), as well as the characterization of the spatial variability of these attributes by geostatistics.
NASA Astrophysics Data System (ADS)
Darvishzadeh, R.; Skidmore, A. K.; Mirzaie, M.; Atzberger, C.; Schlerf, M.
2014-12-01
Accurate estimation of grassland biomass at peak productivity can provide crucial information regarding the functioning and productivity of rangelands. Hyperspectral remote sensing has proved to be valuable for estimating vegetation biophysical parameters such as biomass using different statistical techniques. However, in statistical analysis of hyperspectral data, multicollinearity is a common problem due to the large number of correlated hyperspectral reflectance measurements. The aim of this study was to examine the prospect of above ground biomass estimation in a heterogeneous Mediterranean rangeland employing multivariate calibration methods. Canopy spectral measurements were made in the field using a GER 3700 spectroradiometer, along with concomitant in situ measurements of above ground biomass for 170 sample plots. Multivariate calibrations including partial least squares regression (PLSR), principal component regression (PCR), and least squares support vector machine (LS-SVM) were used to estimate the above ground biomass. The prediction accuracy of the multivariate calibration methods was assessed using cross-validated R2 and RMSE. The best model performance was obtained using LS-SVM and then PLSR, both calibrated with the first-derivative reflectance dataset, with R2cv = 0.88 and 0.86 and RMSEcv = 1.15 and 1.07, respectively. The weakest prediction accuracy was obtained when PCR was used (R2cv = 0.31 and RMSEcv = 2.48). The obtained results highlight the importance of multivariate calibration methods for biomass estimation when hyperspectral data are used.
Feng, Chao-Hui; Makino, Yoshio; Yoshimura, Masatoshi; Thuyet, Dang Quoc; García-Martín, Juan Francisco
2018-02-01
The potential of hyperspectral imaging at wavelengths of 380 to 1000 nm was investigated for determining the pH of cooked sausages after different storage conditions (4 °C for 1 d, 35 °C for 1, 3, and 5 d). The mean spectra of the sausages were extracted from the hyperspectral images, and a partial least squares regression (PLSR) model was developed to relate the spectral profiles to the pH of the cooked sausages. Eleven important wavelengths were selected based on the regression coefficient values. The PLSR model established using the optimal wavelengths showed good precision, with a prediction coefficient of determination (R2p) of 0.909 and a root mean square error of prediction of 0.035. A prediction map illustrating pH indices in sausages was developed for the first time using R statistics. The overall results suggest that hyperspectral imaging combined with PLSR and R statistics is capable of quantifying and visualizing the evolution of sausage pH under different storage conditions. In this paper, hyperspectral imaging is used for the first time to detect pH in cooked sausages using R statistics, which offers a useful alternative for researchers who do not have access to Matlab. Eleven optimal wavelengths were successfully selected and used to simplify the PLSR model established on the full wavelength range. This simplified model achieved a high R2p (0.909) and a low root mean square error of prediction (0.035), which can be useful for the design of multispectral imaging systems. © 2017 Institute of Food Technologists®.
NASA Astrophysics Data System (ADS)
Lorenzi, Marco; Simpson, Ivor J.; Mendelson, Alex F.; Vos, Sjoerd B.; Cardoso, M. Jorge; Modat, Marc; Schott, Jonathan M.; Ourselin, Sebastien
2016-04-01
The joint analysis of brain atrophy measured with magnetic resonance imaging (MRI) and hypometabolism measured with positron emission tomography with fluorodeoxyglucose (FDG-PET) is of primary importance in developing models of pathological changes in Alzheimer’s disease (AD). Most of the current multimodal analyses in AD assume a local (spatially overlapping) relationship between MR and FDG-PET intensities. However, it is well known that atrophy and hypometabolism are prominent in different anatomical areas. The aim of this work is to describe the relationship between atrophy and hypometabolism by means of a data-driven statistical model of non-overlapping intensity correlations. For this purpose, FDG-PET and MRI signals are jointly analyzed through a computationally tractable formulation of partial least squares regression (PLSR). The PLSR model is estimated and validated on a large clinical cohort of 1049 individuals from the ADNI dataset. Results show that the proposed non-local analysis outperforms classical local approaches in terms of predictive accuracy while providing a plausible description of disease dynamics: early AD is characterised by non-overlapping temporal atrophy and temporo-parietal hypometabolism, while the later disease stages show overlapping brain atrophy and hypometabolism spread in temporal, parietal and cortical areas.
Kang, Bo-Sik; Lee, Jang-Eun; Park, Hyun-Jin
2014-05-15
A commercial electronic tongue was used to discriminate Korean rice wines (makgeolli) brewed from nine cultivars of rice with different amino acid and fatty acid compositions. The E-tongue was applied to establish prediction models with sensory evaluation or LC-MS/MS by partial least squares regression (PLSR). All makgeollis were classified into three groups by principal components analysis, and the separation pattern was affected by rice qualities and yeast fermentation. Makgeolli taste changed from the complicated comprising sweetness, saltiness, and umami to the uncomplicated, such as bitterness and then, sourness, with a decrease of amino acids and fatty acids in the rice. The quantitative correlation between E-tongue and sensory scores or LC-MS/MS by PLSR demonstrated that E-tongue could well predict most of the sensory attributes with relatively acceptable r(2), except for bitterness, but could not predict most of the chemical compounds responsible for taste attributes, except for ribose, lactate, succinate, and tryptophan. Copyright © 2013 Elsevier Ltd. All rights reserved.
Liu, Xuesong; Wu, Chunyan; Geng, Shu; Jin, Ye; Luan, Lianjun; Chen, Yong; Wu, Yongjiang
2015-01-01
This paper used near-infrared (NIR) spectroscopy for the on-line quantitative monitoring of water precipitation during Danhong injection. For these NIR measurements, two fiber optic probes designed to transmit NIR radiation through a 2 mm flow cell were used to collect spectra in real-time. Partial least squares regression (PLSR) was developed as the preferred chemometrics quantitative analysis of the critical intermediate qualities: the danshensu (DSS, (R)-3, 4-dihydroxyphenyllactic acid), protocatechuic aldehyde (PA), rosmarinic acid (RA), and salvianolic acid B (SAB) concentrations. Optimized PLSR models were successfully built and used for on-line detecting of the concentrations of DSS, PA, RA, and SAB of water precipitation during Danhong injection. Besides, the information of DSS, PA, RA, and SAB concentrations would be instantly fed back to site technical personnel for control and adjustment timely. The verification experiments determined that the predicted values agreed with the actual homologic value.
On the prediction of threshold friction velocity of wind erosion using soil reflectance spectroscopy
Li, Junran; Flagg, Cody B.; Okin, Gregory S.; Painter, Thomas H.; Dintwe, Kebonye; Belnap, Jayne
2015-01-01
Current approaches to estimate threshold friction velocity (TFV) of soil particle movement, including both experimental and empirical methods, suffer from various disadvantages, and they are particularly not effective to estimate TFVs at regional to global scales. Reflectance spectroscopy has been widely used to obtain TFV-related soil properties (e.g., moisture, texture, crust, etc.), however, no studies have attempted to directly relate soil TFV to their spectral reflectance. The objective of this study was to investigate the relationship between soil TFV and soil reflectance in the visible and near infrared (VIS–NIR, 350–2500 nm) spectral region, and to identify the best range of wavelengths or combinations of wavelengths to predict TFV. Threshold friction velocity of 31 soils, along with their reflectance spectra and texture were measured in the Mojave Desert, California and Moab, Utah. A correlation analysis between TFV and soil reflectance identified a number of isolated, narrow spectral domains that largely fell into two spectral regions, the VIS area (400–700 nm) and the short-wavelength infrared (SWIR) area (1100–2500 nm). A partial least squares regression analysis (PLSR) confirmed the significant bands that were identified by correlation analysis. The PLSR further identified the strong relationship between the first-difference transformation and TFV at several narrow regions around 1400, 1900, and 2200 nm. The use of PLSR allowed us to identify a total of 17 key wavelengths in the investigated spectrum range, which may be used as the optimal spectral settings for estimating TFV in the laboratory and field, or mapping of TFV using airborne/satellite sensors.
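The band-wise correlation analysis mentioned above can be illustrated with a short sketch: compute the Pearson correlation between the target (here TFV) and the reflectance at each band, then keep the bands whose absolute correlation exceeds a cutoff. The data, the 0.8 cutoff and the function names are hypothetical and are not taken from the study.

```python
def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def band_correlations(spectra, target):
    """Correlation between the target and the reflectance at every band.
    `spectra` is a list of samples, each a list of per-band reflectances."""
    n_bands = len(spectra[0])
    return [pearson([s[j] for s in spectra], target) for j in range(n_bands)]


# Hypothetical data: 4 soils x 3 bands, and one TFV value per soil.
spectra = [
    [0.1, 0.9, 0.5],
    [0.2, 0.8, 0.4],
    [0.3, 0.7, 0.5],
    [0.4, 0.6, 0.4],
]
tfv = [1.0, 2.0, 3.0, 4.0]

r = band_correlations(spectra, tfv)
selected = [j for j, value in enumerate(r) if abs(value) >= 0.8]  # assumed cutoff
print(r, selected)
```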
NASA Astrophysics Data System (ADS)
Jiang, Hao; Lu, Jiangang
2018-05-01
Corn starch is an important material which has been traditionally used in the fields of food and chemical industry. In order to enhance the rapidness and reliability of the determination for starch content in corn, a methodology is proposed in this work, using an optimal CC-PLSR-RBFNN calibration model and near-infrared (NIR) spectroscopy. The proposed model was developed based on the optimal selection of crucial parameters and the combination of correlation coefficient method (CC), partial least squares regression (PLSR) and radial basis function neural network (RBFNN). To test the performance of the model, a standard NIR spectroscopy data set was introduced, containing spectral information and chemical reference measurements of 80 corn samples. For comparison, several other models based on the identical data set were also briefly discussed. In this process, the root mean square error of prediction (RMSEP) and coefficient of determination (R²p) in the prediction set were used to make evaluations. As a result, the proposed model presented the best predictive performance with the smallest RMSEP (0.0497%) and the highest R²p (0.9968). Therefore, the proposed method combining NIR spectroscopy with the optimal CC-PLSR-RBFNN model can be helpful to determine starch content in corn.
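The two evaluation metrics named above can be computed as in the sketch below. It assumes the common definitions (RMSEP as the root of the mean squared prediction residual, and the prediction-set coefficient of determination as 1 minus the ratio of residual to total sum of squares); some papers instead divide by n - 1 or report the squared correlation, and the abstract does not say which convention was used. The sample values are invented for illustration.

```python
import math


def rmsep(y_true, y_pred):
    """Root mean square error of prediction over a held-out prediction set."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)


def r2_prediction(y_true, y_pred):
    """Coefficient of determination on the prediction set: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical reference and predicted starch contents (%), not from the paper.
reference = [62.0, 63.5, 64.0, 65.5, 66.0]
predicted = [62.1, 63.4, 64.2, 65.4, 66.0]

error = rmsep(reference, predicted)
r2p = r2_prediction(reference, predicted)
print(round(error, 4), round(r2p, 4))
```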
Qualitative Analysis of Dairy and Powder Milk Using Laser-Induced Breakdown Spectroscopy (LIBS).
Alfarraj, Bader A; Sanghapi, Herve K; Bhatt, Chet R; Yueh, Fang Y; Singh, Jagdish P
2018-01-01
Laser-induced breakdown spectroscopy (LIBS) technique was used to compare various types of commercial milk products. Laser-induced breakdown spectroscopy spectra were investigated for the determination of the elemental composition of soy and rice milk powder, dairy milk, and lactose-free dairy milk. The analysis was performed using radiative transitions. Atomic emissions from Ca, K, Na, and Mg lines observed in LIBS spectra of dairy milk were compared. In addition, proteins and fat level in milks can be determined using molecular emissions such as CN bands. Ca concentrations were calculated to be 2.165 ± 0.203 g/L in 1% of dairy milk fat samples and 2.809 ± 0.172 g/L in 2% of dairy milk fat samples using the standard addition method (SAM) with LIBS spectra. Univariate and multivariate statistical analysis methods showed that the contents of major mineral elements were higher in lactose-free dairy milk than those in dairy milk. The principal component analysis (PCA) method was used to discriminate four milk samples depending on their mineral elements concentration. In addition, proteins and fat level in dairy milks were determined using molecular emissions such as CN band. We applied partial least squares regression (PLSR) and simple linear regression (SLR) models to predict levels of milk fat in dairy milk samples. The PLSR model was successfully used to predict levels of milk fat in dairy milk sample with the relative accuracy (RA%) less than 6.62% using CN (0,0) band.
Feature reconstruction of LFP signals based on PLSR in the neural information decoding study.
Yonghui Dong; Zhigang Shang; Mengmeng Li; Xinyu Liu; Hong Wan
2017-07-01
To solve the problems of Signal-to-Noise Ratio (SNR) and multicollinearity when the Local Field Potential (LFP) signals is used for the decoding of animal motion intention, a feature reconstruction of LFP signals based on partial least squares regression (PLSR) in the neural information decoding study is proposed in this paper. Firstly, the feature information of LFP coding band is extracted based on wavelet transform. Then the PLSR model is constructed by the extracted LFP coding features. According to the multicollinearity characteristics among the coding features, several latent variables which contribute greatly to the steering behavior are obtained, and the new LFP coding features are reconstructed. Finally, the K-Nearest Neighbor (KNN) method is used to classify the reconstructed coding features to verify the decoding performance. The results show that the proposed method can achieve the highest accuracy compared to the other three methods and the decoding effect of the proposed method is robust.
Wenjun, Ji; Zhou, Shi; Jingyi, Huang; Shuo, Li
2014-01-01
In situ measurements with visible and near-infrared spectroscopy (vis-NIR) provide an efficient way for acquiring soil information of paddy soils in the short time gap between the harvest and following rotation. The aim of this study was to evaluate its feasibility to predict a series of soil properties including organic matter (OM), organic carbon (OC), total nitrogen (TN), available nitrogen (AN), available phosphorus (AP), available potassium (AK) and pH of paddy soils in Zhejiang province, China. Firstly, the linear partial least squares regression (PLSR) was performed on the in situ spectra and the predictions were compared to those with laboratory-based recorded spectra. Then, the non-linear least-square support vector machine (LS-SVM) algorithm was carried out aiming to extract more useful information from the in situ spectra and improve predictions. Results show that, in terms of OC, OM, TN, AN and pH, (i) the predictions were worse using in situ spectra compared to laboratory-based spectra with the PLSR algorithm; (ii) the prediction accuracy using LS-SVM (R² > 0.75, RPD > 1.90) was obviously improved with in situ vis-NIR spectra compared to the PLSR algorithm, and comparable to or even better than results generated using laboratory-based spectra with PLSR; and that (iii) in terms of AP and AK, poor predictions were obtained with in situ spectra (R² < 0.5, RPD < 1.50) using either PLSR or LS-SVM. The results highlight the use of LS-SVM for in situ vis-NIR spectroscopic estimation of soil properties of paddy soils. PMID:25153132
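The RPD figure quoted above is commonly computed as the standard deviation of the reference values divided by the prediction error. The sketch below assumes that definition (sample standard deviation over RMSE); the abstract does not spell out its exact formula, and the numbers are made up.

```python
import math
import statistics


def rpd(y_true, y_pred):
    """Residual prediction deviation: SD of reference values / RMSE of prediction."""
    n = len(y_true)
    rmse = math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)
    return statistics.stdev(y_true) / rmse


# Hypothetical measured and predicted soil property values.
measured = [1.0, 2.0, 3.0, 4.0, 5.0]
estimated = [1.5, 1.5, 3.5, 3.5, 5.0]

rpd_value = rpd(measured, estimated)
print(round(rpd_value, 3))
```

A larger RPD means the model error is small relative to the natural spread of the property, which is why the abstract pairs RPD thresholds with R² thresholds.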
Douglas, R K; Nawar, S; Alamar, M C; Mouazen, A M; Coulon, F
2018-03-01
Visible and near infrared spectrometry (vis-NIRS) coupled with data mining techniques can offer fast and cost-effective quantitative measurement of total petroleum hydrocarbons (TPH) in contaminated soils. The literature, however, shows significant differences in the performance of vis-NIRS between linear and non-linear calibration methods. This study compared the performance of linear partial least squares regression (PLSR) with a nonlinear random forest (RF) regression for the calibration of vis-NIRS when analysing TPH in soils. 88 soil samples (3 uncontaminated and 85 contaminated) collected from three sites located in the Niger Delta were scanned using an analytical spectral device (ASD) spectrophotometer (350-2500 nm) in diffuse reflectance mode. Sequential ultrasonic solvent extraction-gas chromatography (SUSE-GC) was used as the reference quantification method for TPH, which equals the sum of aliphatic and aromatic fractions ranging between C10 and C35. Prior to model development, spectra were subjected to pre-processing including noise cut, maximum normalization, first derivative and smoothing. Then 65 samples were selected as calibration set and the remaining 20 samples as validation set. Both vis-NIR spectrometry and gas chromatography profiles of the 85 soil samples were subjected to RF and PLSR with leave-one-out cross-validation (LOOCV) for the calibration models. Results showed that the RF calibration model, with a coefficient of determination (R²) of 0.85, a root mean square error of prediction (RMSEP) of 68.43 mg kg⁻¹, and a residual prediction deviation (RPD) of 2.61, outperformed PLSR (R² = 0.63, RMSEP = 107.54 mg kg⁻¹ and RPD = 2.55) in cross-validation. These results indicate that the RF modelling approach accounts for the nonlinearity of the soil spectral responses, hence providing significantly higher prediction accuracy compared to the linear PLSR. It is recommended to adopt vis-NIRS coupled with the RF modelling approach as a portable and cost-effective method for the rapid quantification of TPH in soils. Copyright © 2017 Elsevier B.V. All rights reserved.
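The spectral pre-processing chain named in this abstract (maximum normalization, first derivative, smoothing) might look like the following minimal sketch. The abstract gives no parameters, so the choices here are assumptions: the derivative is a simple first difference, the smoother is a centered moving average with a window of 3, and the noise-cut step is omitted.

```python
def max_normalize(spectrum):
    """Scale a spectrum so that its largest absolute value becomes 1."""
    peak = max(abs(v) for v in spectrum)
    return [v / peak for v in spectrum]


def first_difference(spectrum):
    """Approximate the first derivative by differences of adjacent bands."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


def moving_average(spectrum, window=3):
    """Centered moving average; the window shrinks at the two edges."""
    half = window // 2
    out = []
    for i in range(len(spectrum)):
        lo, hi = max(0, i - half), min(len(spectrum), i + half + 1)
        out.append(sum(spectrum[lo:hi]) / (hi - lo))
    return out


raw = [2.0, 4.0, 8.0, 6.0, 4.0]            # hypothetical reflectance values
normalized = max_normalize(raw)             # [0.25, 0.5, 1.0, 0.75, 0.5]
derivative = first_difference(normalized)   # [0.25, 0.5, -0.25, -0.25]
smoothed = moving_average(derivative, window=3)
print(smoothed)
```

Published workflows often use Savitzky-Golay filters for the derivative and smoothing steps; the moving average is used here only to keep the example dependency-free.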
Radioecological modelling of Polonium-210 and Caesium-137 in lichen-reindeer-man and top predators.
Persson, Bertil R R; Gjelsvik, Runhild; Holm, Elis
2018-06-01
This work deals with analysis and modelling of the radionuclides ²¹⁰Pb and ²¹⁰Po in the food-chain lichen-reindeer-man in addition to ²¹⁰Po and ¹³⁷Cs in top predators. By using the methods of Partial Least Square Regression (PLSR) the atmospheric deposition of ²¹⁰Pb and ²¹⁰Po is predicted at the sample locations. Dynamic modelling of the activity concentration with differential equations is fitted to the sample data. Reindeer lichen consumption, gastrointestinal absorption, organ distribution and elimination are derived from information in the literature. Dynamic modelling of transfer of ²¹⁰Pb and ²¹⁰Po to reindeer meat, liver and bone from lichen consumption fitted well with data from Sweden and Finland from 1966 to 1971. The activity concentration of ²¹⁰Pb in the skeleton in man is modelled by using the results of studying the kinetics of lead in skeleton and blood in lead-workers after end of occupational exposure. The result of modelling ²¹⁰Pb and ²¹⁰Po activity in skeleton matched well with concentrations of ²¹⁰Pb and ²¹⁰Po in teeth from reindeer-breeders and autopsy bone samples in Finland. The previously published results of ²¹⁰Po and ¹³⁷Cs in different tissues of wolf, wolverine and lynx are analysed with multivariate data processing methods such as Principal Component Analysis (PCA), and modelled with the method of Projection to Latent Structures (PLS), or Partial Least Square Regression (PLSR). Copyright © 2017 Elsevier Ltd. All rights reserved.
Zhang, Ni; Liu, Xu; Jin, Xiaoduo; Li, Chen; Wu, Xuan; Yang, Shuqin; Ning, Jifeng; Yanne, Paul
2017-12-15
Phenolics contents in wine grapes are key indicators for assessing ripeness. Near-infrared hyperspectral images during ripening have been explored to achieve an effective method for predicting phenolics contents. Principal component regression (PCR), partial least squares regression (PLSR) and support vector regression (SVR) models were built, respectively. The results show that SVR behaves globally better than PLSR and PCR, except in predicting tannins content of seeds. For the best prediction results, the squared correlation coefficient and root mean square error reached 0.8960 and 0.1069g/L (+)-catechin equivalents (CE), respectively, for tannins in skins, 0.9065 and 0.1776 (g/L CE) for total iron-reactive phenolics (TIRP) in skins, 0.8789 and 0.1442 (g/L M3G) for anthocyanins in skins, 0.9243 and 0.2401 (g/L CE) for tannins in seeds, and 0.8790 and 0.5190 (g/L CE) for TIRP in seeds. Our results indicated that NIR hyperspectral imaging has good prospects for evaluation of phenolics in wine grapes. Copyright © 2017 Elsevier Ltd. All rights reserved.
Han, Fu Liang; Li, Zheng; Xu, Yan
2015-12-01
Monomeric anthocyanin contributions to young red wine color were investigated using partial least square regression (PLSR) and aqueous alcohol solutions in this study. Results showed that the correlation between the anthocyanin concentration and the solution color fitted in a quadratic regression rather than linear or cubic regression. Malvidin-3-O-glucoside was estimated to show the highest contribution to young red wine color according to its concentration in wine, whereas peonidin-3-O-glucoside in its concentration contributed the least. The PLSR suggested that delphinidin-3-O-glucoside and peonidin-3-O-glucoside under the same concentration resulted in a stronger color of young red wine compared with malvidin-3-O-glucoside. These estimates were further confirmed by their color in aqueous alcohol solutions. These results suggested that delphinidin-3-O-glucoside and peonidin-3-O-glucoside were primary anthocyanins to enhance young red wine color by increasing their concentrations. This study could provide an alternative approach to improve young red wine color by adjusting anthocyanin composition and concentration. © 2015 Institute of Food Technologists®
Yao, Mingyin; Yang, Hui; Huang, Lin; Chen, Tianbing; Rao, Gangfu; Liu, Muhua
2017-05-10
In seeking a novel method with the ability of green analysis in monitoring toxic heavy metals residue in fresh leafy vegetables, laser-induced breakdown spectroscopy (LIBS) was applied to prove its capability in performing this work. The spectra of fresh vegetable samples polluted in the lab were collected by optimized LIBS experimental setup, and the reference concentrations of cadmium (Cd) from samples were obtained by conventional atomic absorption spectroscopy after wet digestion. The direct calibration employing intensity of single Cd line and Cd concentration exposed the weakness of this calibration method. Furthermore, the accuracy of linear calibration can be improved a little by triple Cd lines as characteristic variables, especially after the spectra were pretreated. However, it is not enough in predicting Cd in samples. Therefore, partial least-squares regression (PLSR) was utilized to enhance the robustness of quantitative analysis. The results of the PLSR model showed that the prediction accuracy of the Cd target can meet the requirement of determination in food safety. This investigation presented that LIBS is a promising and emerging method in analyzing toxic compositions in agricultural products, especially combined with suitable chemometrics.
Mercury and water level fluctuations in lakes of northern Minnesota
Larson, James H.; Maki, Ryan P; Christensen, Victoria G.; Sandheinrich, Mark B.; LeDuc, Jaime F.; Kissane, Claire; Knights, Brent C.
2017-01-01
Large lake ecosystems support a variety of ecosystem services in surrounding communities, including recreational and commercial fishing. However, many northern temperate fisheries are contaminated by mercury. Annual variation in mercury accumulation in fish has previously been linked to water level (WL) fluctuations, opening the possibility of regulating water levels in a manner that minimizes or reduces mercury contamination in fisheries. Here, we compiled a long-term dataset (1997-2015) of mercury content in young-of-year Yellow Perch (Perca flavescens) from six lakes on the border between the U.S. and Canada and examined whether mercury content appeared to be related to several metrics of WL fluctuation (e.g., spring WL rise, annual maximum WL, and year-to-year change in maximum WL). Using simple correlation analysis, several WL metrics appear to be strongly correlated to Yellow Perch mercury content, although the strength of these correlations varies by lake. We also used many WL metrics, water quality measurements, temperature and annual deposition data to build predictive models using partial least squared regression (PLSR) analysis for each lake. These PLSR models showed some variation among lakes, but also supported strong associations between WL fluctuations and annual variation in Yellow Perch mercury content. The study lakes underwent a modest change in WL management in 2000, when winter WL minimums were increased by about 1 m in five of the six study lakes. Using the PLSR models, we estimated how this change in WL management would have affected Yellow Perch mercury content. For four of the study lakes, the change in WL management that occurred in 2000 likely reduced Yellow Perch mercury content, relative to the previous WL management regime.
Sakudo, Akikazu; Kato, Yukiko Hakariya; Kuratsune, Hirohiko; Ikuta, Kazuyoshi
2009-10-01
After blood donation, in some individuals having polycythemia, dehydration causes anemia. Although the hematocrit (Ht) level is closely related to anemia, the current method of measuring Ht is performed after blood drawing. Furthermore, the monitoring of Ht levels contributes to a healthy life. Therefore, a non-invasive test for Ht is warranted for the safe donation of blood and good quality of life. A non-invasive procedure for the prediction of hematocrit levels was developed on the basis of a chemometric analysis of visible and near-infrared (Vis-NIR) spectra of the thumbs using portable spectrophotometer. Transmittance spectra in the 600- to 1100-nm region from thumbs of Japanese volunteers were subjected to a partial least squares regression (PLSR) analysis and leave-out cross-validation to develop chemometric models for predicting Ht levels. Ht levels of masked samples predicted by this model from Vis-NIR spectra provided a coefficient of determination in prediction of 0.6349 with a standard error of prediction of 3.704% and a detection limit in prediction of 17.14%, indicating that the model is applicable for normal and abnormal value in Ht level. These results suggest portable Vis-NIR spectrophotometer to have potential for the non-invasive measurement of Ht levels with a combination of PLSR analysis.
USDA-ARS's Scientific Manuscript database
Purpose: The aim of this study was to develop a technique for the non-destructive and rapid prediction of the moisture content in red pepper powder using near-infrared (NIR) spectroscopy and a partial least squares regression (PLSR) model. Methods: Three red pepper powder products were separated in...
Xia, Qing; Liu, Changhong; Liu, Jinxia; Pan, Wenjuan; Lu, Xuzhong; Yang, Jianbo; Chen, Wei; Zheng, Lei
2016-03-30
Rancidity is an important attribute for quality assessment of butter cookies, while traditional methods for rancidity measurement are usually laborious, destructive and prone to operational error. In the present paper, the potential of applying multi-spectral imaging (MSI) technology with 19 wavelengths in the range of 405-970 nm to evaluate the rancidity in butter cookies was investigated. Moisture content, acid value and peroxide value were determined by traditional methods and then related with the spectral information by partial least squares regression (PLSR) and back-propagation artificial neural network (BP-ANN). The optimal models for predicting moisture content, acid value and peroxide value were obtained by PLSR. The correlation coefficient (r) obtained by PLSR models revealed that MSI had a perfect ability to predict moisture content (r = 0.909), acid value (r = 0.944) and peroxide value (r = 0.971). The study demonstrated that the rancidity level of butter cookies can be continuously monitored and evaluated in real-time by the multi-spectral imaging, which is of great significance for developing online food safety monitoring solutions. © 2015 Society of Chemical Industry.
Li, Shuifang; Zhang, Xin; Shan, Yang; Su, Donglin; Ma, Qiang; Wen, Ruizhi; Li, Jiaojuan
2017-03-01
Near-infrared spectroscopy (NIR) was used for qualitative and quantitative detection of honey adulterated with high-fructose corn syrup (HFCS) or maltose syrup (MS). Competitive adaptive reweighted sampling (CARS) was employed to select key variables. Partial least squares linear discriminant analysis (PLS-LDA) was adopted to classify the adulterated honey samples. The CARS-PLS-LDA models showed an accuracy of 86.3% (honey vs. adulterated honey with HFCS) and 96.1% (honey vs. adulterated honey with MS), respectively. PLS regression (PLSR) was used to predict the extent of adulteration in the honeys. The results showed that NIR combined with PLSR could not be used to quantify adulteration with HFCS, but could be used to quantify adulteration with MS: the coefficient of determination of prediction (R²p) and root mean square error of prediction (RMSEP) were 0.901 and 4.041 for MS-adulterated samples from different floral origins, and 0.981 and 1.786 for MS-adulterated samples from the same floral origin (Brassica spp.), respectively. Copyright © 2016. Published by Elsevier Ltd.
The prediction of food additives in the fruit juice based on electronic nose with chemometrics.
Qiu, Shanshan; Wang, Jun
2017-09-01
Food additives are added to products to enhance their taste, and preserve flavor or appearance. While their use should be restricted to achieve a technological benefit, the contents of food additives should be also strictly controlled. In this study, E-nose was applied as an alternative to traditional monitoring technologies for determining two food additives, namely benzoic acid and chitosan. For quantitative monitoring, support vector machine (SVM), random forest (RF), extreme learning machine (ELM) and partial least squares regression (PLSR) were applied to establish regression models between E-nose signals and the amount of food additives in fruit juices. The monitoring models based on ELM and RF reached higher correlation coefficients (R²) and lower root mean square errors (RMSEs) than models based on PLSR and SVM. This work indicates that E-nose combined with RF or ELM can be a cost-effective, easy-to-build and rapid detection system for food additive monitoring. Copyright © 2017 Elsevier Ltd. All rights reserved.
Bhatt, Chet R; Jain, Jinesh C; Goueguel, Christian L; McIntyre, Dustin L; Singh, Jagdish P
2018-01-01
Laser-induced breakdown spectroscopy (LIBS) was used to detect rare earth elements (REEs) in natural geological samples. Low and high intensity emission lines of Ce, La, Nd, Y, Pr, Sm, Eu, Gd, and Dy were identified in the spectra recorded from the samples to claim the presence of these REEs. Multivariate analysis was executed by developing partial least squares regression (PLS-R) models for the quantification of Ce, La, and Nd. Analysis of unknown samples indicated that the prediction results of these samples were found comparable to those obtained by inductively coupled plasma mass spectrometry analysis. Data support that LIBS has potential to quantify REEs in geological minerals/ores.
Jackman, Patrick; Sun, Da-Wen; Elmasry, Gamal
2012-08-01
A new algorithm for the conversion of device dependent RGB colour data into device independent L*a*b* colour data without introducing noticeable error has been developed. By combining a linear colour space transform and advanced multiple regression methodologies it was possible to predict L*a*b* colour data with less than 2.2 colour units of error (CIE 1976). By transforming the red, green and blue colour components into new variables that better reflect the structure of the L*a*b* colour space, a low colour calibration error was immediately achieved (ΔE(CAL) = 14.1). Application of a range of regression models on the data further reduced the colour calibration error substantially (multilinear regression ΔE(CAL) = 5.4; response surface ΔE(CAL) = 2.9; PLSR ΔE(CAL) = 2.6; LASSO regression ΔE(CAL) = 2.1). Only the PLSR models deteriorated substantially under cross validation. The algorithm is adaptable and can be easily recalibrated to any working computer vision system. The algorithm was tested on a typical working laboratory computer vision system and delivered only a very marginal loss of colour information ΔE(CAL) = 2.35. Colour features derived on this system were able to safely discriminate between three classes of ham with 100% correct classification whereas colour features measured on a conventional colourimeter were not. Copyright © 2012 Elsevier Ltd. All rights reserved.
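The colour errors quoted above are expressed in CIE 1976 colour-difference units, which is the Euclidean distance between two points in L*a*b* space. The sketch below shows that calculation and its mean over a set of patches. The helper names and sample values are hypothetical, and averaging over patches is an assumption about how a calibration error such as ΔE(CAL) might be aggregated, not something the abstract states.

```python
import math


def delta_e_cie76(lab1, lab2):
    """CIE 1976 colour difference: Euclidean distance in L*a*b* space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(lab1, lab2)))


def mean_delta_e(predicted, reference):
    """Average colour difference over paired predicted/reference patches."""
    pairs = list(zip(predicted, reference))
    return sum(delta_e_cie76(p, r) for p, r in pairs) / len(pairs)


# Hypothetical L*a*b* triples: model output vs. colourimeter reference.
predicted_lab = [(50.0, 10.0, 10.0), (60.0, 0.0, 0.0)]
reference_lab = [(52.0, 11.0, 8.0), (60.0, 3.0, 4.0)]

single = delta_e_cie76(predicted_lab[0], reference_lab[0])  # sqrt(4 + 1 + 4)
average = mean_delta_e(predicted_lab, reference_lab)        # (3 + 5) / 2
print(single, average)
```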
NASA Astrophysics Data System (ADS)
Qiao, T.; Ren, J.; Craigie, C.; Zabalza, J.; Maltin, Ch.; Marshall, S.
2015-03-01
It is well known that the eating quality of beef has a significant influence on the repurchase behavior of consumers. There are several key factors that affect the perception of quality, including color, tenderness, juiciness, and flavor. To support consumer repurchase choices, there is a need for an objective measurement of quality that could be applied to meat prior to its sale. Objective approaches such as offered by spectral technologies may be useful, but the analytical algorithms used remain to be optimized. For visible and near infrared (VISNIR) spectroscopy, Partial Least Squares Regression (PLSR) is a widely used technique for meat related quality modeling and prediction. In this paper, a Support Vector Machine (SVM) based machine learning approach is presented to predict beef eating quality traits. Although SVM has been successfully used in various disciplines, it has not been applied extensively to the analysis of meat quality parameters. To this end, the performance of PLSR and SVM as tools for the analysis of meat tenderness is evaluated, using a large dataset acquired under industrial conditions. The spectral dataset was collected using VISNIR spectroscopy with the wavelength ranging from 350 to 1800 nm on 234 beef M. longissimus thoracis steaks from heifers, steers, and young bulls. As the dimensionality with the VISNIR data is very high (over 1600 spectral bands), the Principal Component Analysis (PCA) technique was applied for feature extraction and data reduction. The extracted principal components (less than 100) were then used for data modeling and prediction. The prediction results showed that SVM has a greater potential to predict beef eating quality than PLSR, especially for the prediction of tenderness. The influence of animal gender on beef quality prediction was also investigated, and it was found that beef quality traits were predicted most accurately in beef from young bulls.
Liu, Mingyue; Du, Baojia; Zhang, Bai
2018-01-01
Soil salinity and sodicity can significantly reduce the value and the productivity of affected lands, posing degradation and threats to sustainable development of natural resources on earth. This research attempted to map soil salinity/sodicity via disentangling the relationships between Landsat 8 Operational Land Imager (OLI) imagery and in-situ measurements (EC, pH) over the west Jilin of China. We established the retrieval models for soil salinity and sodicity using Partial Least Square Regression (PLSR). Spatial distribution of the soils that were subjected to hybridized salinity and sodicity (HSS) was obtained by overlay analysis using maps of soil salinity and sodicity in geographical information system (GIS) environment. We analyzed the severity and occurring sizes of soil salinity, sodicity, and HSS with regard to specified soil types and land cover. Results indicated that the models' accuracy was improved by combining the reflectance bands and spectral indices that were mathematically transformed. Therefore, our results indicated that the OLI imagery and PLSR method are applicable to mapping soil salinity and sodicity in the region. The mapping results revealed that the areas of soil salinity, sodicity, and HSS were 1.61 × 10⁶ hm², 1.46 × 10⁶ hm², and 1.36 × 10⁶ hm², respectively. Also, the occurring area of moderate and intensive sodicity was larger than that of salinity. This research may underpin efficiently mapping regional salinity/sodicity occurrences, understanding the linkages between spectral reflectance and ground measurements of soil salinity and sodicity, and provide tools for soil salinity monitoring and the sustainable utilization of land resources. PMID:29614727
Kocaoglu-Vurma, N A; Eliardi, A; Drake, M A; Rodriguez-Saona, L E; Harper, W J
2009-08-01
The acceptability of cheese depends largely on the flavor formed during ripening. The flavor profiles of cheeses are complex and region- or manufacturer-specific which have made it challenging to understand the chemistry of flavor development and its correlation with sensory properties. Infrared spectroscopy is an attractive technology for the rapid, sensitive, and high-throughput analysis of foods, providing information related to its composition and conformation of food components from the spectra. Our objectives were to establish infrared spectral profiles to discriminate Swiss cheeses produced by different manufacturers in the United States and to develop predictive models for determination of sensory attributes based on infrared spectra. Fifteen samples from 3 Swiss cheese manufacturers were received and analyzed using attenuated total reflectance infrared spectroscopy (ATR-IR). The spectra were analyzed using soft independent modeling of class analogy (SIMCA) to build a classification model. The cheeses were profiled by a trained sensory panel using descriptive sensory analysis. The relationship between the descriptive sensory scores and ATR-IR spectra was assessed using partial least square regression (PLSR) analysis. SIMCA discriminated the Swiss cheeses based on manufacturer and production region. PLSR analysis generated prediction models with correlation coefficients of validation (rVal) between 0.69 and 0.96 with standard error of cross-validation (SECV) ranging from 0.04 to 0.29. Implementation of rapid infrared analysis by the Swiss cheese industry would help to streamline quality assurance.
USDA-ARS's Scientific Manuscript database
Spectral scattering is useful for nondestructive sensing of fruit firmness. Prediction models, however, are typically built using multivariate statistical methods such as partial least squares regression (PLSR), whose performance generally depends on the characteristics of the data. The aim of this ...
Spatial assessment of soluble solid contents on apple slices using hyperspectral imaging
USDA-ARS's Scientific Manuscript database
A partial least squares regression (PLSR) model to map internal soluble solids content (SSC) of apples using visible/near-infrared (VNIR) hyperspectral imaging was developed. The reflectance spectra of sliced apples were extracted from hyperspectral absorbance images obtained in the 400-1000 nm rang...
Wei, Zhebo; Xiao, Xize
2017-01-01
In this study, a portable electronic nose (E-nose) was developed in-house to identify rice wines with different marked ages; all operations of the E-nose were controlled by a dedicated smartphone application. The sensor array of the E-nose comprised 12 MOS sensors, and the obtained response values were transmitted to the smartphone through a wireless communication module. Then, Aliyun worked as a cloud storage platform for the storage of responses and identification models. The measurement of the E-nose was composed of the taste information obtained phase (TIOP) and the aftertaste information obtained phase (AIOP). The area feature data obtained from the TIOP and the feature data obtained from the TIOP-AIOP were applied to identify rice wines by using pattern recognition methods. Principal component analysis (PCA), locally linear embedding (LLE) and linear discriminant analysis (LDA) were applied for the classification of those wine samples. LDA based on the area feature data obtained from the TIOP-AIOP proved a powerful tool and showed the best classification results. Partial least-squares regression (PLSR) and support vector machine (SVM) were applied for the predictions of marked ages and SVM (R² = 0.9942) worked much better than PLSR. PMID:29088076
Use of Standing Gold Nanorods for Detection of Malachite Green and Crystal Violet in Fish by SERS.
Chen, Xiaowei; Nguyen, Trang H D; Gu, Liqun; Lin, Mengshi
2017-07-01
With growing consumption of aquaculture products, there is increasing demand for rapid and sensitive techniques that can detect prohibited substances in seafood products. This study aimed to develop a novel surface-enhanced Raman spectroscopy (SERS) method coupled with a simplified extraction protocol and novel gold nanorod (AuNR) substrates to detect banned aquaculture substances (malachite green [MG] and crystal violet [CV]) and their mixture (1:1) in aqueous solution and fish samples. Multivariate statistical tools such as principal component analysis (PCA) and partial least squares regression (PLSR) were used in data analysis. PCA results demonstrate that SERS can distinguish MG, CV and their mixture (1:1) in aqueous solution and in fish samples. The detection limit of SERS coupled with standing AuNR substrates is 1 ppb for both MG and CV in fish samples. A good linear relationship between the actual and predicted concentrations of analytes was obtained from the PLSR models, with R² values from 0.87 to 0.99, indicating satisfactory quantification results for this method. These results demonstrate that the SERS method coupled with AuNR substrates can be used for rapid and accurate detection of MG and CV in fish samples. © 2017 Institute of Food Technologists®.
Wei, Zhebo; Xiao, Xize; Wang, Jun; Wang, Hui
2017-10-31
In this study, a portable electronic nose (E-nose) was developed in-house to identify rice wines with different marked ages; all operations of the E-nose were controlled by a dedicated smartphone application. The sensor array of the E-nose comprised 12 MOS sensors, and the obtained response values were transmitted to the smartphone through a wireless communication module. Then, Aliyun worked as a cloud storage platform for the storage of responses and identification models. The measurement of the E-nose was composed of the taste information obtained phase (TIOP) and the aftertaste information obtained phase (AIOP). The area feature data obtained from the TIOP and the feature data obtained from the TIOP-AIOP were applied to identify rice wines by using pattern recognition methods. Principal component analysis (PCA), locally linear embedding (LLE) and linear discriminant analysis (LDA) were applied for the classification of those wine samples. LDA based on the area feature data obtained from the TIOP-AIOP proved a powerful tool and showed the best classification results. Partial least-squares regression (PLSR) and support vector machine (SVM) were applied for the predictions of marked ages and SVM (R² = 0.9942) worked much better than PLSR.
Soil-Bacterium Compatibility Model as a Decision-Making Tool for Soil Bioremediation.
Horemans, Benjamin; Breugelmans, Philip; Saeys, Wouter; Springael, Dirk
2017-02-07
Bioremediation of organic pollutant contaminated soil involving bioaugmentation with dedicated bacteria specialized in degrading the pollutant is suggested as a green and economically sound alternative to physico-chemical treatment. However, intrinsic soil characteristics impact the success of bioaugmentation. The feasibility of using partial least-squares regression (PLSR) to predict the success of bioaugmentation in contaminated soil based on the intrinsic physico-chemical soil characteristics and, hence, to improve the success of bioaugmentation, was examined. As a proof of principle, PLSR was used to build soil-bacterium compatibility models to predict the bioaugmentation success of the phenanthrene-degrading Novosphingobium sp. LH128. The survival and biodegradation activity of strain LH128 were measured in 20 soils and correlated with the soil characteristics. PLSR was able to predict the strain's survival using 12 variables or less while the PAH-degrading activity of strain LH128 in soils that show survival was predicted using 9 variables. A three-step approach using the developed soil-bacterium compatibility models is proposed as a decision making tool and first estimation to select compatible soils and organisms and increase the chance of success of bioaugmentation.
NASA Astrophysics Data System (ADS)
Paul, Andrea; Meyer, Klas; Ruiken, Jan-Paul; Illner, Markus; Müller, David-Nicolas; Esche, Erik; Wozny, Günther; Westad, Frank; Maiwald, Michael
2017-03-01
A major industrial reaction based on homogeneous catalysis is hydroformylation for the production of aldehydes from alkenes and syngas. Hydroformylation in microemulsions, which is currently under investigation at Technische Universität Berlin on a mini-plant scale, was identified as a cost efficient approach which also enhances product selectivity. Herein, we present the application of online Raman spectroscopy on the reaction of 1-dodecene to 1-tridecanal within a microemulsion. To achieve a good representation of the operation range in the mini-plant with regard to concentrations of the reactants a design of experiments was used. Based on initial Raman spectra partial least squares regression (PLSR) models were calibrated for the prediction of 1-dodecene and 1-tridecanal. Limits of predictions arise from nonlinear correlations between Raman intensity and mass fractions of compounds in the microemulsion system. Furthermore, the prediction power of PLSR models becomes limited due to unexpected by-product formation. Application of the lab-scale derived calibration spectra and PLSR models on online spectra from a mini-plant operation yielded promising estimations of 1-tridecanal and acceptable predictions of 1-dodecene mass fractions suggesting Raman spectroscopy as a suitable technique for process analytics in microemulsions.
[Measurement of soil organic matter and available K based on SPA-LS-SVM].
Zhang, Hai-Liang; Liu, Xue-Mei; He, Yong
2014-05-01
Visible and short wave near infrared spectroscopy (Vis/SW-NIRS) was investigated in the present study for measurement of soil organic matter (OM) and available potassium (K). Four types of pretreatments, including smoothing, SNV, MSC and SG smoothing + first derivative, were adopted to eliminate system noise and external disturbances. Then partial least squares regression (PLSR) and least squares-support vector machine (LS-SVM) models were implemented as calibration models. The LS-SVM model was built using characteristic wavelengths selected by the successive projections algorithm (SPA). The performance of the LS-SVM models was also compared with that of the PLSR models. The results indicated that LS-SVM models using SPA-selected characteristic wavelengths as inputs outperformed PLSR models. For the optimal SPA-LS-SVM models, the correlation coefficient (r) and RMSEP were 0.8602 and 2.98 for OM and 0.7305 and 15.78 for K, respectively. The results indicated that visible and short wave near infrared spectroscopy (Vis/SW-NIRS) (325-1075 nm) combined with LS-SVM based on SPA could be utilized as a precise method for the determination of soil properties.
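Two of the pretreatments named in the abstract above, SNV and MSC, can be illustrated with a minimal NumPy sketch. The spectra here are synthetic (one Gaussian band with invented gains and offsets), and the function names `snv` and `msc` are illustrative choices; the abstract does not describe the authors' implementation or settings.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) on its own."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

def msc(spectra, reference=None):
    """Multiplicative scatter correction against a reference spectrum.

    Each row is regressed on the reference (default: the mean spectrum),
    then the fitted offset and gain are removed.
    """
    spectra = np.asarray(spectra, dtype=float)
    ref = spectra.mean(axis=0) if reference is None else np.asarray(reference, dtype=float)
    corrected = np.empty_like(spectra)
    for i, row in enumerate(spectra):
        slope, intercept = np.polyfit(ref, row, 1)  # row ~ intercept + slope * ref
        corrected[i] = (row - intercept) / slope
    return corrected

# Synthetic spectra (assumption): one band seen with different gain and offset.
wavelengths = np.linspace(325, 1075, 200)           # nm, range taken from the abstract
base = np.exp(-((wavelengths - 700) / 120) ** 2)    # invented band shape
gains = np.array([0.8, 1.0, 1.3])
offsets = np.array([0.05, 0.0, -0.02])
raw = gains[:, None] * base + offsets[:, None]

snv_spectra = snv(raw)
msc_spectra = msc(raw)
```

Because the synthetic rows differ only by gain and offset, both corrections map them onto a single common curve, which is the scatter-removal behaviour these pretreatments are meant to provide.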
NASA Astrophysics Data System (ADS)
Yan, Ling; Liu, Changhong; Qu, Hao; Liu, Wei; Zhang, Yan; Yang, Jianbo; Zheng, Lei
2018-03-01
The terahertz (THz) technique, a recently developed spectral method, has been researched and used for the rapid discrimination and measurement of food compositions due to its low-energy and non-ionizing characteristics. In this study, THz spectroscopy combined with chemometrics has been utilized for qualitative and quantitative analysis of myricetin, quercetin, and kaempferol with concentrations of 0.025, 0.05, and 0.1 mg/mL. The qualitative discrimination was achieved by KNN, ELM, and RF models with spectral pre-treatments. An excellent discrimination (100% CCR in the prediction set) could be achieved using the RF model. Furthermore, the quantitative analyses were performed by partial least square regression (PLSR) and least squares support vector machine (LS-SVM). Compared with the PLSR models, the LS-SVM yielded better results, with lower RMSEP (0.0044, 0.0039, and 0.0048), higher Rp (0.9601, 0.9688, and 0.9359), and higher RPD (8.6272, 9.6333, and 7.9083) for myricetin, quercetin, and kaempferol, respectively. Our results demonstrate that the THz spectroscopy technique is a powerful tool for identification of three flavonols with similar chemical structures and quantitative determination of their concentrations.
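The three figures of merit quoted above (RMSEP, Rp and RPD) can be computed as in the sketch below. The definitions used are the common chemometric ones, with RPD taken as the sample standard deviation of the reference values divided by RMSEP; the abstract does not state which variant the authors used, and the prediction values are invented for illustration.

```python
import numpy as np

def prediction_metrics(y_true, y_pred):
    """Return (RMSEP, Rp, RPD) for a prediction set.

    Assumed definitions:
      RMSEP = root mean squared error of prediction
      Rp    = Pearson correlation between reference and predicted values
      RPD   = sample standard deviation of the reference values / RMSEP
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    rmsep = np.sqrt(np.mean((y_true - y_pred) ** 2))
    rp = np.corrcoef(y_true, y_pred)[0, 1]
    rpd = np.std(y_true, ddof=1) / rmsep
    return rmsep, rp, rpd

# Invented example using the three concentration levels from the abstract (mg/mL).
y_true = np.array([0.025, 0.05, 0.1, 0.025, 0.05, 0.1])
y_pred = np.array([0.03, 0.045, 0.095, 0.02, 0.055, 0.105])
rmsep, rp, rpd = prediction_metrics(y_true, y_pred)
```

A lower RMSEP together with a higher Rp and RPD is the pattern the abstract reports for LS-SVM relative to PLSR.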
Sensory characteristics and consumer preference for chicken meat in Guinea.
Sow, T M A; Grongnet, J F
2010-10-01
This study identified the sensory characteristics and consumer preference for chicken meat in Guinea. Five chicken samples [live village chicken, live broiler, live spent laying hen, ready-to-cook broiler, and ready-to-cook broiler (imported)] bought from different locations were assessed by 10 trained panelists using 19 sensory attributes. The ANOVA results showed that 3 chicken appearance attributes (brown, yellow, and white), 5 chicken odor attributes (oily, intense, medicine smell, roasted, and mouth persistent), 3 chicken flavor attributes (sweet, bitter, and astringent), and 8 chicken texture attributes (firm, tender, juicy, chew, smooth, springy, hard, and fibrous) were significantly discriminating between the chicken samples (P<0.05). Principal component analysis of the sensory data showed that the first 2 principal components explained 84% of the sensory data variance. The principal component analysis results showed that the live village chicken, the live spent laying hen, and the ready-to-cook broiler (imported) were very well represented and clearly distinguished from the live broiler and the ready-to-cook broiler. One hundred twenty consumers expressed their preferences for the chicken samples using a 5-point Likert scale. The hierarchical cluster analysis of the preference data identified 4 homogenous consumer clusters. The hierarchical cluster analysis results showed that the live village chicken was the most preferred chicken sample, whereas the ready-to-cook broiler was the least preferred one. The partial least squares regression (PLSR) type 1 showed that 72% of the sensory data for the first 2 principal components explained 83% of the chicken preference. The PLSR1 identified that the sensory characteristics juicy, oily, sweet, hard, mouth persistent, and yellow were the most relevant sensory drivers of the Guinean chicken preference. 
The PLSR2 (with multiple responses) identified the relationship between the chicken samples, their sensory attributes, and the consumer clusters. Our results showed that there was not a chicken category that was exclusively preferred from the other chicken samples and therefore highlight the existence of place for development of all chicken categories in the local market.
Quality Detection of Litchi Stored in Different Environments Using an Electronic Nose
Xu, Sai; Lü, Enli; Lu, Huazhong; Zhou, Zhiyan; Wang, Yu; Yang, Jing; Wang, Yajuan
2016-01-01
The purpose of this paper was to explore the utility of an electronic nose to detect the quality of litchi fruit stored in different environments. In this study, a PEN3 electronic nose was adopted to test the storage time and hardness of litchi that were stored in three different types of environment (room temperature, refrigerator and controlled-atmosphere). After acquiring data about the hardness of the sample and from the electronic nose, linear discriminant analysis (LDA), canonical correlation analysis (CCA), BP neural network (BPNN) and BP neural network-partial least squares regression (BPNN-PLSR) were employed for data processing. The experimental results showed that the hardness of litchi fruits stored in all three environments decreased during storage. The litchi stored at room temperature had the fastest rate of decrease in hardness, followed by those stored in a refrigerator environment and under a controlled-atmosphere. LDA has a poor ability to classify the storage time of the three environments in which litchi was stored. BPNN can effectively recognize the storage time of litchi stored in a refrigerator and a controlled-atmosphere environment. However, the BPNN classification of the effect of room temperature storage on litchi was poor. CCA results show a significant correlation between electronic nose data and hardness data at room temperature, and the correlation is more obvious for the refrigerator and controlled-atmosphere environments. The BPNN-PLSR can effectively predict the hardness of litchi under refrigerator storage conditions and a controlled-atmosphere environment. However, the BPNN-PLSR predictions for room temperature storage and for all storage environments combined were poor. Thus, this experiment proved that an electronic nose can detect the quality of litchi under refrigerated storage and a controlled-atmosphere environment. 
These results provide a useful reference for future studies on nondestructive and intelligent monitoring of fruit quality. PMID:27338391
Liu, Fei; Ye, Lanhan; Peng, Jiyu; Song, Kunlin; Shen, Tingting; Zhang, Chu; He, Yong
2018-02-27
Fast detection of heavy metals is very important for ensuring the quality and safety of crops. Laser-induced breakdown spectroscopy (LIBS), coupled with uni- and multivariate analysis, was applied for quantitative analysis of copper in three kinds of rice (Jiangsu rice, regular rice, and Simiao rice). For univariate analysis, three pre-processing methods were applied to reduce fluctuations, including background normalization, the internal standard method, and the standard normal variate (SNV). Linear regression models showed a strong correlation between spectral intensity and Cu content, with an R² of more than 0.97. The limit of detection (LOD) was around 5 ppm, lower than the tolerance limit of copper in foods. For multivariate analysis, partial least squares regression (PLSR) showed its advantage in extracting effective information for prediction, and its sensitivity reached 1.95 ppm, while support vector machine regression (SVMR) performed better in both calibration and prediction sets, where Rc² and Rp² reached 0.9979 and 0.9879, respectively. This study showed that LIBS could be considered as a constructive tool for the quantification of copper contamination in rice.
Ye, Lanhan; Song, Kunlin; Shen, Tingting
2018-01-01
Fast detection of heavy metals is very important for ensuring the quality and safety of crops. Laser-induced breakdown spectroscopy (LIBS), coupled with uni- and multivariate analysis, was applied for quantitative analysis of copper in three kinds of rice (Jiangsu rice, regular rice, and Simiao rice). For univariate analysis, three pre-processing methods were applied to reduce fluctuations, including background normalization, the internal standard method, and the standard normal variate (SNV). Linear regression models showed a strong correlation between spectral intensity and Cu content, with an R² of more than 0.97. The limit of detection (LOD) was around 5 ppm, lower than the tolerance limit of copper in foods. For multivariate analysis, partial least squares regression (PLSR) showed its advantage in extracting effective information for prediction, and its sensitivity reached 1.95 ppm, while support vector machine regression (SVMR) performed better in both calibration and prediction sets, where Rc² and Rp² reached 0.9979 and 0.9879, respectively. This study showed that LIBS could be considered as a constructive tool for the quantification of copper contamination in rice. PMID:29495445
Computerized pigment design based on property hypersurfaces
NASA Astrophysics Data System (ADS)
Halova, Jaroslava; Sulcova, Petra; Kupka, Karel
2007-05-01
Competition is tough in the pigment market. Rational pigment design has therefore a competitive advantage, saving time and money. The aim of this work is to provide methods that can assist in designing pigments with defined properties. These methods include partial least squares regression (PLSR), neural network (NN) and generalized regression ANOVA model. Authors show how PLS bi-plot can be used to identify market gaps poorly covered by pigment manufacturers, thus giving an opportunity to develop pigments with potentially profitable properties.
Multivariate analysis relating oil shale geochemical properties to NMR relaxometry
Birdwell, Justin E.; Washburn, Kathryn E.
2015-01-01
Low-field nuclear magnetic resonance (NMR) relaxometry has been used to provide insight into shale composition by separating relaxation responses from the various hydrogen-bearing phases present in shales in a noninvasive way. Previous low-field NMR work using solid-echo methods provided qualitative information on organic constituents associated with raw and pyrolyzed oil shale samples, but uncertainty in the interpretation of longitudinal-transverse (T1–T2) relaxometry correlation results indicated further study was required. Qualitative confirmation of peaks attributed to kerogen in oil shale was achieved by comparing T1–T2 correlation measurements made on oil shale samples to measurements made on kerogen isolated from those shales. Quantitative relationships between T1–T2 correlation data and organic geochemical properties of raw and pyrolyzed oil shales were determined using partial least-squares regression (PLSR). Relaxometry results were also compared to infrared spectra, and the results not only provided further confidence in the organic matter peak interpretations but also confirmed attribution of T1–T2 peaks to clay hydroxyls. In addition, PLSR analysis was applied to correlate relaxometry data to trace element concentrations with good success. The results of this work show that NMR relaxometry measurements using the solid-echo approach produce T1–T2 peak distributions that correlate well with geochemical properties of raw and pyrolyzed oil shales.
NASA Astrophysics Data System (ADS)
Yan, B.; Fang, N. F.; Zhang, P. C.; Shi, Z. H.
2013-03-01
Understanding how changes in individual land use types influence the dynamics of streamflow and sediment yield would greatly improve the predictability of the hydrological consequences of land use changes and could thus help stakeholders to make better decisions. Multivariate statistics are commonly used to compare how individual land use types control the dynamics of streamflow or sediment yields. However, one issue with the use of conventional statistical methods to address relationships between land use types and streamflow or sediment yield is multicollinearity. In this study, an integrated approach involving hydrological modelling and partial least squares regression (PLSR) was used to quantify the contributions of changes in individual land use types to changes in streamflow and sediment yield. In a case study, hydrological modelling was conducted using land use maps from four time periods (1978, 1987, 1999, and 2007) for the Upper Du watershed (8973 km²) in China using the Soil and Water Assessment Tool (SWAT). Changes in streamflow and sediment yield across the two simulations conducted using the land use maps from 2007 and 1978 were found to be related to land use changes according to a PLSR, which was used to quantify the effect of this influence at the sub-basin scale. The major land use changes that affected streamflow in the studied catchment areas were related to changes in the farmland, forest and urban areas between 1978 and 2007; the corresponding regression coefficients were 0.232, -0.147 and 1.256, respectively, and the Variable Influence on Projection (VIP) was greater than 1. The dominant first-order factors affecting the changes in sediment yield in our study were: farmland (the VIP and regression coefficient were 1.762 and 14.343, respectively) and forest (the VIP and regression coefficient were 1.517 and -7.746, respectively). 
The PLSR methodology presented in this paper is beneficial and novel, as it partially eliminates the co-dependency of the variables and facilitates a more unbiased view of the contribution of the changes in individual land use types to changes in streamflow and sediment yield. This practicable and simple approach could be applied to a variety of other watersheds for which time-sequenced digital land use maps are available.
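To make the VIP statistic mentioned above concrete, the following sketch fits a single-response PLS model with the NIPALS algorithm and derives VIP scores from the weight vectors. It is a generic textbook formulation written with NumPy only, not the software or settings used in the study. The data are synthetic, with four predictors of which only the first two drive the response. With unit-length weight vectors the squared VIP scores sum to the number of predictors, which is why VIP > 1 is commonly read as above-average importance.

```python
import numpy as np

def pls1_vip(X, y, n_components):
    """Fit PLS1 by NIPALS on autoscaled X and centred y; return VIP scores."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xr = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)   # autoscale predictors
    yr = y - y.mean()                                    # centre response
    n_features = Xr.shape[1]
    W = np.zeros((n_features, n_components))             # unit-length weight vectors
    ss = np.zeros(n_components)                          # y variation explained per component
    for a in range(n_components):
        w = Xr.T @ yr
        w /= np.linalg.norm(w)
        t = Xr @ w                                       # scores
        tt = t @ t
        p = Xr.T @ t / tt                                # X loadings
        q = (yr @ t) / tt                                # y loading
        Xr = Xr - np.outer(t, p)                         # deflate X
        yr = yr - q * t                                  # deflate y
        W[:, a] = w
        ss[a] = q ** 2 * tt
    # VIP_j = sqrt(p * sum_a(ss_a * w_ja^2) / sum_a(ss_a))
    return np.sqrt(n_features * (W ** 2 @ ss) / ss.sum())

# Synthetic data (assumption): columns 0 and 1 matter, columns 2 and 3 do not.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 4))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.1 * rng.normal(size=200)
vip = pls1_vip(X, y, n_components=2)
```

Here the predictors are nearly uncorrelated, so the example shows only how VIP ranks variables; it does not reproduce the multicollinear land use data discussed in the abstract.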
NASA Astrophysics Data System (ADS)
Princz, S.; Wenzel, U.; Miller, R.; Hessling, M.
2014-11-01
One aerobic and four anaerobic batch fermentations of the yeast Saccharomyces cerevisiae were conducted in a stirred bioreactor and monitored inline by NIR spectroscopy and a transflectance dip probe. From the acquired NIR spectra, chemometric partial least squares regression (PLSR) models for predicting biomass, glucose and ethanol were constructed. The spectra were directly measured in the fermentation broth and successfully inspected for adulteration using our novel data pre-processing method. These adulterations manifested as strong fluctuations in the shape and offset of the absorption spectra. They resulted from cells, cell clusters, or gas bubbles intercepting the optical path of the dip probe. In the proposed data pre-processing method, adulterated signals are removed by passing the time-scanned non-averaged spectra through two filter algorithms with a 5% quantile cutoff. The filtered spectra containing meaningful data are then averaged. A second step checks whether the whole time scan is analyzable. If true, the average is calculated and used to prepare the PLSR models. This new method distinctly improved the prediction results. To dissociate possible correlations between analyte concentrations, such as glucose and ethanol, the feeding analytes were alternately supplied at different concentrations (spiking) at the end of the four anaerobic fermentations. This procedure yielded low-error (anaerobic) PLSR models for predicting analyte concentrations of 0.31 g/l for biomass, 3.41 g/l for glucose, and 2.17 g/l for ethanol. The maximum concentrations were 14 g/l biomass, 167 g/l glucose, and 80 g/l ethanol. Data from the aerobic fermentation, carried out under high agitation and high aeration, were incorporated to realize combined PLSR models, which have not been previously reported to our knowledge.
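The pre-processing idea described above (discard disturbed scans with a 5% quantile cutoff, average the rest, then decide whether the time scan is usable) might look roughly like the sketch below. The abstract does not specify the two filter algorithms or the analyzability criterion, so everything here is an assumption made for illustration: the offset and shape filters, the `min_fraction` threshold, and the function name `filter_and_average`.

```python
import numpy as np

def filter_and_average(scans, cutoff=0.05, min_fraction=0.5):
    """Drop the most deviant scans, then average the remainder.

    Filter 1 (assumed): deviation of each scan's mean level from the median spectrum.
    Filter 2 (assumed): deviation of each scan's mean-centred shape from the median shape.
    A scan is kept only if it lies below the (1 - cutoff) quantile of both measures.
    """
    scans = np.asarray(scans, dtype=float)
    median_spectrum = np.median(scans, axis=0)
    offset_dev = np.abs(scans.mean(axis=1) - median_spectrum.mean())
    centred = scans - scans.mean(axis=1, keepdims=True)
    shape_dev = np.linalg.norm(centred - (median_spectrum - median_spectrum.mean()), axis=1)
    keep = (offset_dev <= np.quantile(offset_dev, 1 - cutoff)) & \
           (shape_dev <= np.quantile(shape_dev, 1 - cutoff))
    analyzable = keep.mean() >= min_fraction   # hypothetical second-step check
    averaged = scans[keep].mean(axis=0) if analyzable else None
    return averaged, keep, analyzable

# Synthetic time scan (assumption): 40 repeats of one band, two disturbed by a "bubble".
rng = np.random.default_rng(0)
wavenumbers = np.linspace(1000, 2000, 100)
clean = np.exp(-((wavenumbers - 1500) / 200) ** 2)
scans = clean + rng.normal(0, 0.005, size=(40, 100))
scans[3] = 1.5 * clean + 0.5     # strong offset and shape change
scans[17] = 1.5 * clean + 0.5
averaged, keep, analyzable = filter_and_average(scans)
```

In this toy case the two disturbed scans are the ones removed, and the average of the remaining scans stays close to the undisturbed spectrum.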
Estimation of water quality by UV/Vis spectrometry in the framework of treated wastewater reuse.
Carré, Erwan; Pérot, Jean; Jauzein, Vincent; Lin, Liming; Lopez-Ferber, Miguel
2017-07-01
The aim of this study is to investigate the potential of ultraviolet/visible (UV/Vis) spectrometry as a complementary method for routine monitoring of reclaimed water production. Robustness of the models and compliance of their sensitivity with current quality limits are investigated. The following indicators are studied: total suspended solids (TSS), turbidity, chemical oxygen demand (COD) and nitrate. Partial least squares regression (PLSR) is used to find linear correlations between absorbances and indicators of interest. Artificial samples are made by simulating a sludge leak on the wastewater treatment plant and added to the original dataset, then divided into calibration and prediction datasets. The models are built on the calibration set, and then tested on the prediction set. The best models are developed with: PLSR for COD (R²_pred = 0.80), TSS (R²_pred = 0.86) and turbidity (R²_pred = 0.96), and with a simple linear regression from absorbance at 208 nm (R²_pred = 0.95) for nitrate concentration. The input of artificial data significantly enhances the robustness of the models. The sensitivity of the UV/Vis spectrometry monitoring system developed is compatible with quality requirements of reclaimed water production processes.
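The calibrate-then-predict workflow described above can be sketched for the simplest case in the abstract, a single-wavelength linear regression for nitrate. All numbers below are invented: the concentration range, the absorbance response at 208 nm, the noise level and the 40/20 split are assumptions, and R²_pred is computed as 1 - SS_res/SS_tot on the prediction set.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic samples (assumption): absorbance at 208 nm roughly linear in nitrate.
nitrate = rng.uniform(1.0, 30.0, size=60)                       # mg/L, invented range
a208 = 0.04 * nitrate + 0.10 + rng.normal(0, 0.02, size=60)     # invented response

# Split into calibration and prediction sets.
idx = rng.permutation(60)
cal, pred = idx[:40], idx[40:]

# Calibrate on the calibration set only: nitrate ~ slope * A208 + intercept.
slope, intercept = np.polyfit(a208[cal], nitrate[cal], 1)

# Evaluate on the held-out prediction set.
nitrate_hat = slope * a208[pred] + intercept
ss_res = np.sum((nitrate[pred] - nitrate_hat) ** 2)
ss_tot = np.sum((nitrate[pred] - nitrate[pred].mean()) ** 2)
r2_pred = 1.0 - ss_res / ss_tot
```

The same split-and-score pattern applies when the single absorbance is replaced by a full spectrum and the linear fit by a PLSR model.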
Zhang, Yong-Hong; Xia, Zhi-Ning; Qin, Li-Tang; Liu, Shu-Shen
2010-09-01
The objective of this paper is to build a reliable model based on the molecular electronegativity distance vector (MEDV) descriptors for predicting the blood-brain barrier (BBB) permeability and to reveal the effects of the molecular structural segments on the BBB permeability. Using 70 structurally diverse compounds, the partial least squares regression (PLSR) models between the BBB permeability and the MEDV descriptors were developed and validated by the variable selection and modeling based on prediction (VSMP) technique. The estimation ability, stability, and predictive power of a model are evaluated by the estimated correlation coefficient (r), leave-one-out (LOO) cross-validation correlation coefficient (q), and predictive correlation coefficient (Rp). The PLSR model was found to be of good quality, with r=0.9202, q=0.7956, and Rp=0.6649 for the M1 model based on the training set of 57 samples. To search for the most important structural factors affecting the BBB permeability of compounds, we performed a variable importance in projection (VIP) analysis for the MEDV descriptors. It was found that some structural fragments in compounds, such as -CH3, -CH2-, =CH-, =C, ≡C-, -CH<, =C<, =N-, -NH-, =O, and -OH, are the most important factors affecting the BBB permeability. (c) 2010. Published by Elsevier Inc.
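Leave-one-out cross-validation, used above to obtain q, can be sketched as follows. To keep the example short and dependency-free, ordinary least squares (`numpy.linalg.lstsq`) stands in for the PLSR model, which is a deliberate substitution and not the authors' method. The descriptors are random numbers, and the statistic is computed as q² = 1 - PRESS/SS_tot, one common convention that the abstract does not confirm.

```python
import numpy as np

def loo_q2(X, y):
    """Leave-one-out cross-validated q^2 for a linear model with intercept.

    Ordinary least squares is used here as a stand-in for PLSR (assumption).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i                               # leave sample i out
        A = np.column_stack([np.ones(mask.sum()), X[mask]])
        coef, *_ = np.linalg.lstsq(A, y[mask], rcond=None)
        preds[i] = np.concatenate(([1.0], X[i])) @ coef         # predict the held-out sample
    press = np.sum((y - preds) ** 2)
    q2 = 1.0 - press / np.sum((y - y.mean()) ** 2)
    return q2, preds

# Synthetic training set (assumption): 57 samples, as in the abstract, 3 invented descriptors.
rng = np.random.default_rng(7)
X = rng.normal(size=(57, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.3 * rng.normal(size=57)
q2, loo_predictions = loo_q2(X, y)
```

Each sample is predicted by a model that never saw it, so q² is typically lower than the fitted r², in line with the gap between r and q reported above.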
Tahir, Haroon Elrasheid; Xiaobo, Zou; Xiaowei, Huang; Jiyong, Shi; Mariod, Abdalbasit Adam
2016-09-01
Aroma profiles of six honey varieties of different botanical origins were investigated using colorimetric sensor array, gas chromatography-mass spectrometry (GC-MS) and descriptive sensory analysis. Fifty-eight aroma compounds were identified, including 2 norisoprenoids, 5 hydrocarbons, 4 terpenes, 6 phenols, 7 ketones, 9 acids, 12 aldehydes and 13 alcohols. Twenty abundant or active compounds were chosen as key compounds to characterize honey aroma. Discrimination of the honeys was subsequently implemented using multivariate analysis, including hierarchical clustering analysis (HCA) and principal component analysis (PCA). Honeys of the same botanical origin were grouped together in the PCA score plot and HCA dendrogram. SPME-GC/MS and colorimetric sensor array were able to discriminate the honeys effectively with the advantages of being rapid, simple and low-cost. Moreover, partial least squares regression (PLSR) was applied to indicate the relationship between sensory descriptors and aroma compounds. Copyright © 2016 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Liu, Yande; Ying, Yibin; Lu, Huishan; Fu, Xiaping
2005-11-01
A new method is proposed to eliminate the varying background and noise simultaneously for multivariate calibration of Fourier transform near infrared (FT-NIR) spectral signals. An ideal spectrum signal prototype was constructed based on the FT-NIR spectrum of fruit sugar content measurement. The performances of wavelet based threshold de-noising approaches via different combinations of wavelet base functions were compared. Three families of wavelet base function (Daubechies, Symlets and Coiflets) were applied to estimate the performance of those wavelet bases and threshold selection rules by a series of experiments. The experimental results show that the best de-noising performance is reached with the Daubechies 4 or Symlet 4 wavelet base function. Based on the optimized parameters, wavelet regression models for the sugar content of pear were also developed and resulted in a smaller prediction error than a traditional Partial Least Squares Regression (PLSR) model.
Niu, Yunwei; Zhang, Xiaoming; Xiao, Zuobing; Song, Shiqing; Jia, Chengsheng; Yu, Haiyan; Fang, Lingling; Xu, Chunhua
2012-08-01
Five cherry wines exhibiting marked differences in taste and mouthfeel were selected for the study. The taste and mouthfeel of cherry wines were described by four sensory terms: sour, sweet, bitter and astringent. Eight organic acids, seventeen amino acids, three sugars and tannic acid were determined by high performance liquid chromatography (HPLC). Five phenolic acids were determined by ultra performance liquid chromatography coupled with mass spectrometry (UPLC-MS). The relationship between these taste-active compounds, wine samples and sensory attributes was modeled by partial least squares regression (PLSR). The regression analysis indicated that tartaric acid, methionine, proline, sucrose, glucose, fructose, asparagine, serine, glycine, threonine, phenylalanine, leucine, gallic acid, chlorogenic acid, vanillic acid, arginine and tannic acid made a great contribution to the characteristic taste or mouthfeel of cherry wines. Copyright © 2012 Elsevier B.V. All rights reserved.
Mansoor, J K; Schelegle, Edward S; Davis, Cristina E; Walby, William F; Zhao, Weixiang; Aksenov, Alexander A; Pasamontes, Alberto; Figueroa, Jennifer; Allen, Roblee
2014-01-01
An important challenge to pulmonary arterial hypertension (PAH) diagnosis and treatment is early detection of occult pulmonary vascular pathology. Symptoms are frequently confused with other disease entities that lead to inappropriate interventions and allow for progression to advanced states of disease. There is a significant need to develop new markers for early disease detection and management of PAH. Exhaled breath condensate (EBC) samples were compared from 30 age-matched normal healthy individuals and 27 New York Heart Association functional class III and IV idiopathic pulmonary arterial hypertension (IPAH) patients, a subgroup of PAH. Volatile organic compounds (VOC) in EBC samples were analyzed using gas chromatography/mass spectrometry (GC/MS). Individual peaks in GC profiles were identified in both groups and correlated with pulmonary hemodynamic and clinical endpoints in the IPAH group. Additionally, GC/MS data were analyzed using autoregression followed by partial least squares regression (AR/PLSR) analysis to discriminate between the IPAH and control groups. After correcting for medications, there were 62 unique compounds in the control group, 32 unique compounds in the IPAH group, and 14 in-common compounds between groups. Peak-by-peak analysis of GC profiles of IPAH group EBC samples identified 6 compounds significantly correlated with pulmonary hemodynamic variables important in IPAH diagnosis. AR/PLSR analysis of GC/MS data resulted in a distinct and identifiable metabolic signature for IPAH patients. These findings indicate the utility of EBC VOC analysis to discriminate between severe IPAH and a healthy population; additionally, we identified potential novel biomarkers that correlated with IPAH pulmonary hemodynamic variables that may be important in screening for less severe forms of IPAH.
Mansoor, J. K.; Schelegle, Edward S.; Davis, Cristina E.; Walby, William F.; Zhao, Weixiang; Aksenov, Alexander A.; Pasamontes, Alberto; Figueroa, Jennifer; Allen, Roblee
2014-01-01
Background An important challenge to pulmonary arterial hypertension (PAH) diagnosis and treatment is early detection of occult pulmonary vascular pathology. Symptoms are frequently confused with other disease entities that lead to inappropriate interventions and allow for progression to advanced states of disease. There is a significant need to develop new markers for early disease detection and management of PAH. Methodology and Findings Exhaled breath condensate (EBC) samples were compared from 30 age-matched normal healthy individuals and 27 New York Heart Association functional class III and IV idiopathic pulmonary arterial hypertension (IPAH) patients, a subgroup of PAH. Volatile organic compounds (VOC) in EBC samples were analyzed using gas chromatography/mass spectrometry (GC/MS). Individual peaks in GC profiles were identified in both groups and correlated with pulmonary hemodynamic and clinical endpoints in the IPAH group. Additionally, GC/MS data were analyzed using autoregression followed by partial least squares regression (AR/PLSR) analysis to discriminate between the IPAH and control groups. After correcting for medications, there were 62 unique compounds in the control group, 32 unique compounds in the IPAH group, and 14 in-common compounds between groups. Peak-by-peak analysis of GC profiles of IPAH group EBC samples identified 6 compounds significantly correlated with pulmonary hemodynamic variables important in IPAH diagnosis. AR/PLSR analysis of GC/MS data resulted in a distinct and identifiable metabolic signature for IPAH patients. Conclusions These findings indicate the utility of EBC VOC analysis to discriminate between severe IPAH and a healthy population; additionally, we identified potential novel biomarkers that correlated with IPAH pulmonary hemodynamic variables that may be important in screening for less severe forms of IPAH. PMID:24748102
Lee, Byeong-Ju; Kim, Hye-Youn; Lim, Sa Rang; Huang, Linfang; Choi, Hyung-Kyoon
2017-01-01
Panax ginseng C.A. Meyer is a herb used for medicinal purposes, and its discrimination according to cultivation age has been an important and practical issue. This study employed Fourier-transform infrared (FT-IR) spectroscopy with multivariate statistical analysis to obtain a prediction model for discriminating cultivation ages (5 and 6 years) and three different parts (rhizome, tap root, and lateral root) of P. ginseng. The optimal partial-least-squares regression (PLSR) models for discriminating ginseng samples were determined by selecting normalization methods, number of partial-least-squares (PLS) components, and variable influence on projection (VIP) cutoff values. The best prediction model for discriminating 5- and 6-year-old ginseng was developed using tap root, vector normalization applied after the second differentiation, one PLS component, and a VIP cutoff of 1.0 (based on the lowest root-mean-square error of prediction value). In addition, for discriminating among the three parts of P. ginseng, optimized PLSR models were established using data sets obtained from vector normalization, two PLS components, and VIP cutoff values of 1.5 (for 5-year-old ginseng) and 1.3 (for 6-year-old ginseng). To our knowledge, this is the first study to provide a novel strategy for rapidly discriminating the cultivation ages and parts of P. ginseng using FT-IR by selected normalization methods, number of PLS components, and VIP cutoff values.
Lim, Sa Rang; Huang, Linfang
2017-01-01
Panax ginseng C.A. Meyer is a herb used for medicinal purposes, and its discrimination according to cultivation age has been an important and practical issue. This study employed Fourier-transform infrared (FT-IR) spectroscopy with multivariate statistical analysis to obtain a prediction model for discriminating cultivation ages (5 and 6 years) and three different parts (rhizome, tap root, and lateral root) of P. ginseng. The optimal partial-least-squares regression (PLSR) models for discriminating ginseng samples were determined by selecting normalization methods, number of partial-least-squares (PLS) components, and variable influence on projection (VIP) cutoff values. The best prediction model for discriminating 5- and 6-year-old ginseng was developed using tap root, vector normalization applied after the second differentiation, one PLS component, and a VIP cutoff of 1.0 (based on the lowest root-mean-square error of prediction value). In addition, for discriminating among the three parts of P. ginseng, optimized PLSR models were established using data sets obtained from vector normalization, two PLS components, and VIP cutoff values of 1.5 (for 5-year-old ginseng) and 1.3 (for 6-year-old ginseng). To our knowledge, this is the first study to provide a novel strategy for rapidly discriminating the cultivation ages and parts of P. ginseng using FT-IR by selected normalization methods, number of PLS components, and VIP cutoff values. PMID:29049369
Quantitative determination and classification of energy drinks using near-infrared spectroscopy.
Rácz, Anita; Héberger, Károly; Fodor, Marietta
2016-09-01
Almost a hundred commercially available energy drink samples from Hungary, Slovakia, and Greece were collected for the quantitative determination of their caffeine and sugar content with FT-NIR spectroscopy and high-performance liquid chromatography (HPLC). Calibration models were built with partial least-squares regression (PLSR). An HPLC-UV method was used to measure the reference values for caffeine content, while sugar contents were measured with the Schoorl method. Both the nominal sugar content (as indicated on the cans) and the measured sugar concentration were used as references. Although the Schoorl method has larger error and bias, appropriate models could be developed using both references. The validation of the models was based on sevenfold cross-validation and external validation. FT-NIR analysis is a good candidate to replace the HPLC-UV method, because it is much cheaper than any chromatographic method, while it is also more time-efficient. The combination of FT-NIR with multidimensional chemometric techniques like PLSR can be a good option for the detection of low caffeine concentrations in energy drinks. Moreover, three types of energy drinks that contain (i) taurine, (ii) arginine, and (iii) none of these two components were classified correctly using principal component analysis and linear discriminant analysis. Such classifications are important for the detection of adulterated samples and for quality control, as well. In this case, more than a hundred samples were used for the evaluation. The classification was validated with cross-validation and several randomization tests (X-scrambling). Graphical Abstract The way of energy drinks from cans to appropriate chemometric models.
Nondestructive detection of pork quality based on dual-band VIS/NIR spectroscopy
NASA Astrophysics Data System (ADS)
Wang, Wenxiu; Peng, Yankun; Li, Yongyu; Tang, Xiuying; Liu, Yuanyuan
2015-05-01
With the continuous improvement of living standards and changes in dietary structure, consumers' demand for better-quality meat keeps rising. Colour, pH value, and cooking loss are important quality attributes when evaluating meat. Simultaneous nondestructive detection of multiple meat quality parameters is popular in the production and processing of meat and meat products. The objective of this research was to compare the effectiveness of two bands for rapid, nondestructive, and simultaneous detection of pork quality attributes. Reflectance spectra of 60 chilled pork samples were collected with a dual-band visible/near-infrared spectroscopy system covering 350-1100 nm and 1000-2600 nm. Colour, pH value and cooking loss were then determined by standard methods as reference values. Standard normal variate transformation (SNVT) was employed to eliminate the spectral noise. A spectrum connection method was put forward to integrate the dual-band spectra effectively and make full use of the available information. Partial least squares regression (PLSR) and principal component analysis (PCA) were applied to establish prediction models based on single-band and dual-band spectra, respectively. The experimental results showed that the PLSR model based on dual-band spectral information was superior to the models based on single-band spectral information, with lower root mean square error (RMSE) and higher accuracy. The PLSR model based on dual-band data (using the overlapping part of the first band) yielded the best prediction results, with correlation coefficients of validation (Rv) of 0.9469, 0.9495, 0.9180, 0.9054 and 0.8789 for L*, a*, b*, pH value and cooking loss, respectively. This is mainly because the dual-band spectrum provides sufficient and comprehensive information reflecting the quality attributes.
Data fusion from dual-band spectrum could significantly improve pork quality parameters prediction performance. The research also indicated that multi-band spectral information fusion has potential to comprehensively evaluate other quality and safety attributes of pork.
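The preprocessing and band-connection steps named in this abstract can be illustrated with a minimal sketch. It assumes the usual definition of the standard normal variate transform (each spectrum centred and scaled by its own mean and standard deviation). The function `fuse_bands` is a hypothetical stand-in for the spectrum connection method: the abstract does not say how the overlapping 1000-1100 nm region was handled, so plain concatenation is used here.

```python
from statistics import mean, stdev

def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation (n - 1 denominator)
    return [(x - m) / s for x in spectrum]

def fuse_bands(band_a, band_b):
    """Hypothetical band fusion: SNV each band separately, then concatenate.

    Assumption: the overlap between the two bands is ignored in this sketch.
    """
    return snv(band_a) + snv(band_b)

if __name__ == "__main__":
    vis_nir = [0.41, 0.43, 0.47, 0.52]   # made-up reflectance values, band 1
    nir = [0.30, 0.28, 0.25, 0.21]       # made-up reflectance values, band 2
    print(fuse_bands(vis_nir, nir))
```

The fused vector would then be used as one row of the predictor matrix for a PLSR model.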
Miloudi, Lynda; Bonnier, Franck; Bertrand, Dominique; Byrne, Hugh J; Perse, Xavier; Chourpa, Igor; Munnier, Emilie
2017-07-01
Core-shell nanocarriers are increasingly being adapted in cosmetic and dermatological fields, aiming to provide an increased penetration of the active pharmaceutical or cosmetic ingredients (API and ACI) through the skin. In the final form, the nanocarriers (NC) are usually prepared in hydrogels, conferring desired viscous properties for topical application. Combined with the high chemical complexity of the encapsulating system itself, which involves numerous ingredients to form a stable core, quantifying the NC and/or the encapsulated active without labor-intensive and destructive methods remains challenging. In this respect, the specific molecular fingerprint obtained from vibrational spectroscopy analysis could unambiguously overcome current obstacles in the development of fast and cost-effective quality control tools for NC-based products. The present study demonstrates the feasibility of accurately quantifying the concentrations of curcumin (ACI)-loaded alginate nanocarriers in hydrogel matrices by coupling partial least squares regression (PLSR) to infrared (IR) absorption and Raman spectroscopic analyses. With respective root mean square errors of 0.1469 ± 0.0175% w/w and 0.4462 ± 0.0631% w/w, both approaches offer acceptable precision. Further investigation of the PLSR results highlighted the different selectivity of each approach, indicating that only IR analysis delivers direct monitoring of the NC through the quantification of Labrafac®, the main NC ingredient. Raman analyses are instead dominated by the contribution of the ACI, which opens numerous perspectives for quantifying the active molecules without interference from the complex core-shell encapsulating systems, thus positioning the technique as a powerful analytical tool for industrial screening of cosmetic and pharmaceutical products. Graphical abstract Quantitative analysis of encapsulated active molecules in hydrogel-based samples by means of infrared and Raman spectroscopy.
Tan, Jin; Li, Rong; Jiang, Zi-Tao; Tang, Shu-Hua; Wang, Ying; Shi, Meng; Xiao, Yi-Qian; Jia, Bin; Lu, Tian-Xiang; Wang, Hao
2017-02-15
Synchronous front-face fluorescence spectroscopy has been developed for the discrimination of used frying oil (UFO) from edible vegetable oil (EVO), the estimation of the usage time of UFO, and the determination of the adulteration of EVO with UFO. Both the heating time of laboratory-prepared UFO and the adulteration of EVO with UFO could be determined by partial least squares regression (PLSR). To simulate the adulteration of EVO with UFO, fifty adulterated samples with adulterant amounts ranging from 1% to 50% were prepared for each kind of oil. PLSR was then adopted to build the model, and both full (leave-one-out) cross-validation and external validation were performed to evaluate the predictive ability. Under the optimum conditions, the plots of observed versus predicted values exhibited high linearity (R² > 0.96). The root mean square error of cross-validation (RMSECV) and root mean square error of prediction (RMSEP) were both lower than 3%. Copyright © 2016 Elsevier Ltd. All rights reserved.
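Full (leave-one-out) cross-validation and the resulting RMSECV can be sketched as follows. To stay self-contained, the sketch swaps PLSR for an ordinary univariate least-squares line, which is an assumption made purely for illustration and not the model used in the study. The cross-validation loop is the same for any model. The helper names `fit_line` and `rmsecv_loo` are hypothetical.

```python
from math import sqrt

def fit_line(xs, ys):
    """Ordinary least-squares line (stand-in for the PLSR calibration step)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx

def rmsecv_loo(xs, ys):
    """Leave each sample out once, refit on the rest, and pool the squared errors."""
    sq_errors = []
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        pred = slope * xs[i] + intercept
        sq_errors.append((ys[i] - pred) ** 2)
    return sqrt(sum(sq_errors) / len(sq_errors))

if __name__ == "__main__":
    signal = [0.9, 2.1, 2.9, 4.2, 5.0]        # made-up fluorescence feature
    adulterant = [1.0, 2.0, 3.0, 4.0, 5.0]    # made-up adulteration level (%)
    print(rmsecv_loo(signal, adulterant))
```

RMSEP is computed the same way, except that the model is fitted once on the calibration set and the errors come from an external validation set.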
NASA Astrophysics Data System (ADS)
Hussain, Javid; Mabood, Fazal; Al-Harrasi, Ahmed; Ali, Liaqat; Rizvi, Tania Shamim; Jabeen, Farah; Gilani, Syed Abdullah; Shinwari, Shehla; Ahmad, Mushtaq; Alabri, Zahra Khalfan; Al Ghawi, Said Hamood Salim
2018-04-01
Flavonoids are natural antioxidants derived from plants and commonly found in a variety of foods to sequester free radicals. Quercetin, belonging to the flavonol subclass of flavonoids, has received considerable attention because of its wide use as a nutritional supplement as well as a phytochemical remedy for a number of diseases. In the current study, quantification of quercetin was carried out in two medicinally important flavonoid-rich plants, Ziziphus mucronata and Ziziphus sativa. Emission spectroscopy was utilized as a new method coupled with partial least squares regression (PLSR), and the cross validation was done by UV-Visible spectroscopy. The results indicated a higher quercetin content in Z. mucronata (1.50 ± 0.034%) than in Z. sativa (1.21 ± 0.052%), and were further verified through the Folin-Ciocalteu colorimetric method (Z. mucronata: 1.41 ± 0.26%; Z. sativa: 1.13 ± 0.136%). In this study the sensitivity was expressed in terms of the slope (slope = 0.9973).
Liang, Ningjian; Lu, Xiaonan; Hu, Yaxi; Kitts, David D
2016-01-27
The chlorogenic acid isomer profile and antioxidant activity of both green and roasted coffee beans are reported herein using ATR-FTIR spectroscopy combined with chemometric analyses. High-performance liquid chromatography (HPLC) quantified different chlorogenic acid isomer contents for reference, whereas ORAC, ABTS, and DPPH were used to determine the antioxidant activity of the same coffee bean extracts. FTIR spectral data and reference data of 42 coffee bean samples were processed to build optimized PLSR models, and 18 samples were used for external validation of constructed PLSR models. In total, six PLSR models were constructed for six chlorogenic acid isomers to predict content, with three PLSR models constructed to forecast the free radical scavenging activities, obtained using different chemical assays. In conclusion, FTIR spectroscopy, coupled with PLSR, serves as a reliable, nondestructive, and rapid analytical method to quantify chlorogenic acids and to assess different free radical-scavenging capacities in coffee beans.
Rapid determination of sugar level in snack products using infrared spectroscopy.
Wang, Ting; Rodriguez-Saona, Luis E
2012-08-01
Real-time spectroscopic methods can provide a valuable window into food manufacturing to permit optimization of production rate, quality and safety. There is a need for cutting edge sensor technology directed at improving efficiency, throughput and reliability of critical processes. The aim of the research was to evaluate the feasibility of infrared systems combined with chemometric analysis to develop rapid methods for determination of sugars in cereal products. Samples were ground and spectra were collected using a mid-infrared (MIR) spectrometer equipped with a triple-bounce ZnSe MIRacle attenuated total reflectance accessory or a Fourier transform near infrared (NIR) system equipped with a diffuse reflection-integrating sphere. Sugar contents were determined using a reference HPLC method. Partial least squares regression (PLSR) was used to create cross-validated calibration models. The predictability of the models was evaluated on an independent set of samples and compared with reference techniques. MIR and NIR spectra showed characteristic absorption bands for sugars, and generated excellent PLSR models (sucrose: SEP < 1.7% and r > 0.96). Multivariate models accurately and precisely predicted sugar level in snacks, allowing for rapid analysis. This simple technique allows for reliable prediction of quality parameters and for automation, enabling food manufacturers to take early corrective actions that will ultimately save time and money while establishing a uniform quality. The U.S. snack food industry generates billions of dollars in revenue each year and vibrational spectroscopic methods combined with pattern recognition analysis could permit optimization of production rate, quality, and safety of many food products. This research showed that infrared spectroscopy is a powerful technique for near real-time (approximately 1 min) assessment of sugar content in various cereal products. © 2012 Institute of Food Technologists®
Characterization of the biosolids composting process by hyperspectral analysis.
Ilani, Talli; Herrmann, Ittai; Karnieli, Arnon; Arye, Gilboa
2016-02-01
Composted biosolids are widely used as a soil supplement to improve soil quality. However, the application of immature or unstable compost can cause the opposite effect. To date, compost maturation determination is time consuming and cannot be done at the composting site. Hyperspectral spectroscopy was suggested as a simple tool for assessing compost maturity and quality. Nevertheless, there is still a gap in knowledge regarding several compost maturation characteristics, such as dissolved organic carbon, NO3, and NH4 contents. In addition, this approach has not yet been tested on a sample at its natural water content. Therefore, in the current study, hyperspectral analysis was employed in order to characterize the biosolids composting process as a function of composting time. This goal was achieved by correlating the reflectance spectra in the range of 400-2400nm, using the partial least squares-regression (PLS-R) model, with the chemical properties of wet and oven-dried biosolid samples. The results showed that the proposed method can be used as a reliable means to evaluate compost maturity and stability. Specifically, the PLS-R model was found to be an adequate tool to evaluate the biosolids' total carbon and dissolved organic carbon, total nitrogen and dissolved nitrogen, and nitrate content, as well as the absorbance ratio of 254/365nm (E2/E3) and C/N ratios in the dry and wet samples. It failed, however, to predict the ammonium content in the dry samples since the ammonium evaporated during the drying process. It was found that in contrast to what is commonly assumed, the spectral analysis of the wet samples can also be successfully used to build a model for predicting the biosolids' compost maturity. Copyright © 2015 Elsevier Ltd. All rights reserved.
Zhao, Ming; Nian, Yingqun; Allen, Paul; Downey, Gerard; Kerry, Joseph P; O'Donnell, Colm P
2018-05-01
This work aims to develop a rapid analytical technique to predict beef sensory attributes using Raman spectroscopy (RS) and to investigate correlations between sensory attributes using chemometric analysis. Beef samples (n = 72) were obtained from young dairy bulls (Holstein-Friesian and Jersey×Holstein-Friesian) slaughtered at 15 and 19 months old. Trained sensory panel evaluation and Raman spectral data acquisition were both carried out on the same longissimus thoracis muscles after ageing for 21 days. The best prediction results were obtained using a Raman frequency range of 1300-2800 cm⁻¹. Prediction performance of partial least squares regression (PLSR) models developed using all samples was moderate to high for all sensory attributes (R²CV values of 0.50-0.84 and RMSECV values of 1.31-9.07) and was particularly high for desirable flavour attributes (R²CVs of 0.80-0.84, RMSECVs of 4.21-4.65). For PLSR models developed on subsets of beef samples, i.e. beef of an identical age or breed type, significant improvements in prediction performance were achieved for overall sensory attributes (R²CVs of 0.63-0.89 and RMSECVs of 0.38-6.88 for each breed type; R²CVs of 0.52-0.89 and RMSECVs of 0.96-6.36 for each age group). Chemometric analysis revealed strong correlations between sensory attributes. Raman spectroscopy combined with chemometric analysis was demonstrated to have high potential as a rapid and non-destructive technique to predict the sensory quality traits of young dairy bull beef. Copyright © 2018. Published by Elsevier Ltd.
Liu, Huiyu; Zhang, Mingyang; Lin, Zhenshan
2017-10-05
Climate changes are considered to significantly impact net primary productivity (NPP). However, there are few studies on how climate changes at multiple time scales impact NPP. Using the MODIS NPP product and station-based observations of sunshine duration, annual average temperature and annual precipitation, the impacts of climate changes at different time scales on annual NPP were studied with the ensemble empirical mode decomposition (EEMD) method in the Karst area of northwest Guangxi, China, during 2000-2013. Moreover, the relative importance of climatic variables for annual NPP was explored with a partial least squares regression (PLSR) model. The results show that (1) only at the quasi-3-year time scale do sunshine duration and temperature have significantly positive relations with NPP; (2) annual precipitation has no significant relation to NPP by direct comparison, but a significantly positive relation at the 5-year time scale, which is because the 5-year time scale is not the dominant scale of precipitation; (3) the changes in NPP may be dominated by inter-annual variabilities; and (4) multiple time scales analysis will greatly improve the performance of the PLSR model for estimating NPP. The variable importance in projection (VIP) scores of sunshine duration and temperature at the quasi-3-year time scale, and of precipitation at the quasi-5-year time scale, are greater than 0.8, indicating that these variables were important for NPP during 2000-2013. However, sunshine duration and temperature at the quasi-3-year time scale are much more important. Our results underscore the importance of multiple time scales analysis for revealing the relations of NPP to changing climate.
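The VIP scores used above to rank the climatic variables can be computed from a fitted PLS model's weight vectors and the response variance explained by each component. The sketch below assumes the commonly used VIP definition; whether the study used exactly this variant is not stated in the abstract. The function name `vip_scores` and the example numbers are hypothetical.

```python
from math import sqrt

def vip_scores(weights, ssy):
    """Variable importance in projection.

    weights: one weight vector per PLS component (each of length p)
    ssy:     response sum of squares explained by each component
    Assumed definition: VIP_j = sqrt(p * sum_a(ssy_a * w_ja^2 / |w_a|^2) / sum_a(ssy_a))
    """
    p = len(weights[0])
    total = sum(ssy)
    scores = []
    for j in range(p):
        acc = 0.0
        for w, ss in zip(weights, ssy):
            norm_sq = sum(v * v for v in w)
            acc += ss * (w[j] ** 2) / norm_sq
        scores.append(sqrt(p * acc / total))
    return scores

if __name__ == "__main__":
    # Made-up weights for three predictors and two components
    w = [[0.7, 0.6, 0.2], [0.1, 0.3, 0.9]]
    explained = [8.0, 2.0]
    for name, v in zip(["sunshine", "temperature", "precipitation"], vip_scores(w, explained)):
        print(name, round(v, 3), "important" if v > 0.8 else "minor")
```

Under this definition the squared VIP scores average to one across predictors, which is why cutoffs near 1.0 (or the more lenient 0.8 used here) are typical.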
Hejri-Zarifi, Sudiyeh; Ahmadian-Kouchaksaraei, Zahra; Pourfarzad, Amir; Khodaparast, Mohammad Hossein Haddad
2014-12-01
Germinated palm date seeds were milled into two fractions: germ and residue. Dough rheological characteristics, baking properties (specific volume and sensory evaluation), and textural properties (on the first day and during storage for 5 days) were determined in Barbari flat bread. Germ and residue fractions were incorporated at levels ranging from 0.5 to 3 g/100 g of wheat flour. Water absorption, arrival time and gelatinization temperature were decreased by the germ fraction, accompanied by an increase in the mixing tolerance index and degree of softening at most levels. Although an improvement in dough stability was observed, the specific volume of bread was not affected by either fraction. Texture analysis of bread samples during 5 days of storage indicated that both fractions of germinated date seeds were able to diminish bread staling. The Avrami non-linear regression equation was chosen as a useful mathematical model to study bread hardening kinetics. In addition, principal component analysis (PCA) allowed discrimination among dough and bread specialties. Partial least squares regression (PLSR) models were applied to determine the relationships between sensory and instrumental data.
Accuracy of direct genomic values in Holstein bulls and cows using subsets of SNP markers
2010-01-01
Background At the current price, the use of high-density single nucleotide polymorphism (SNP) genotyping assays in genomic selection of dairy cattle is limited to applications involving elite sires and dams. The objective of this study was to evaluate the use of low-density assays to predict direct genomic value (DGV) on five milk production traits, an overall conformation trait, a survival index, and two profit index traits (APR, ASI). Methods Dense SNP genotypes were available for 42,576 SNP for 2,114 Holstein bulls and 510 cows. A subset of 1,847 bulls born between 1955 and 2004 was used as a training set to fit models with various sets of pre-selected SNP. A group of 297 bulls born between 2001 and 2004 and all cows born between 1992 and 2004 were used to evaluate the accuracy of DGV prediction. Ridge regression (RR) and partial least squares regression (PLSR) were used to derive prediction equations and to rank SNP based on the absolute value of the regression coefficients. Four alternative strategies were applied to select subsets of SNP, namely: subsets of the highest ranked SNP for each individual trait, or a single subset of evenly spaced SNP, where SNP were selected based on their rank for ASI, APR or minor allele frequency within intervals of approximately equal length. Results RR and PLSR performed very similarly to predict DGV, with PLSR performing better for low-density assays and RR for higher-density SNP sets. When using all SNP, DGV predictions for production traits, which have a higher heritability, were more accurate (0.52-0.64) than for survival (0.19-0.20), which has a low heritability. The gain in accuracy using subsets that included the highest ranked SNP for each trait was marginal (5-6%) over a common set of evenly spaced SNP when at least 3,000 SNP were used. Subsets containing 3,000 SNP provided more than 90% of the accuracy that could be achieved with a high-density assay for cows, and 80% of the high-density assay for young bulls.
Conclusions Accurate genomic evaluation of the broader bull and cow population can be achieved with a single genotyping assay containing ~ 3,000 to 5,000 evenly spaced SNP. PMID:20950478
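The "evenly spaced" selection strategy described in the Methods can be sketched as picking, within each interval of roughly equal length, the SNP with the highest ranking score (for example the absolute regression coefficient for ASI). This is a simplified illustration on a single made-up coordinate axis. The function name `pick_evenly_spaced` is hypothetical, and the study's actual procedure (for instance per-chromosome handling) is not detailed in the abstract.

```python
def pick_evenly_spaced(positions, scores, n_intervals):
    """Return the index of the best-scoring SNP in each equal-length interval.

    positions: map position of each SNP (any consistent unit)
    scores:    ranking score per SNP, e.g. |regression coefficient| (assumption)
    Assumes positions are not all identical.
    """
    lo, hi = min(positions), max(positions)
    width = (hi - lo) / n_intervals
    best = {}
    for idx, (pos, score) in enumerate(zip(positions, scores)):
        # The last SNP sits exactly on the upper edge, so clamp it into the final bin
        bin_id = min(int((pos - lo) / width), n_intervals - 1)
        if bin_id not in best or score > scores[best[bin_id]]:
            best[bin_id] = idx
    return [best[b] for b in sorted(best)]

if __name__ == "__main__":
    pos = [0, 5, 10, 15, 20]
    coef = [0.1, 0.9, 0.2, 0.3, 0.8]
    print(pick_evenly_spaced(pos, coef, 2))
```

Intervals that contain no SNP simply contribute nothing, so the returned subset can be smaller than `n_intervals`.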
Xu, Shengxiang; Shi, Xuezheng; Wang, Meiyan; Zhao, Yongcun
2016-01-01
Assessment and monitoring of soil organic matter (SOM) quality are important for understanding SOM dynamics and developing management practices that will enhance and maintain the productivity of agricultural soils. Visible and near-infrared (Vis–NIR) diffuse reflectance spectroscopy (350–2500 nm) has received increasing attention over the recent decades as a promising technique for SOM analysis. While heterogeneity of sample sets is one critical factor that complicates the prediction of soil properties from Vis–NIR spectra, a spectral library representing the local soil diversity needs to be constructed. The study area, covering a surface of 927 km² and located in Yujiang County of Jiangsu Province, is characterized by a hilly area with different soil parent materials (e.g., red sandstone, shale, Quaternary red clay, and river alluvium). In total, 232 topsoil (0–20 cm) samples were collected for SOM analysis and scanned with a Vis–NIR spectrometer in the laboratory. Reflectance data were related to surface SOM content by means of a partial least squares regression (PLSR) method and several data pre-processing techniques, such as first and second derivatives with a smoothing filter. The performance of the PLSR model was tested under different combinations of calibration/validation sets (global and local calibrations stratified according to parent materials). The results showed that the models based on the global calibrations can only make approximate predictions for SOM content (RMSE (root mean squared error) = 4.23–4.69 g kg−1; R² (coefficient of determination) = 0.80–0.84; RPD (ratio of standard deviation to RMSE) = 2.19–2.44; RPIQ (ratio of performance to inter-quartile distance) = 2.88–3.08). Under the local calibrations, the individual PLSR models for each parent material improved SOM predictions (RMSE = 2.55–3.49 g kg−1; R² = 0.87–0.93; RPD = 2.67–3.12; RPIQ = 3.15–4.02).
Among the four different parent materials, the largest R2 and the smallest RMSE were observed for the shale soils, which had the lowest coefficient of variation (CV) values for clay (18.95%), free iron oxides (15.93%), and pH (1.04%). This demonstrates the importance of a practical subsetting strategy for the continued improvement of SOM prediction with Vis–NIR spectroscopy. PMID:26974821
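The validation statistics quoted in this abstract follow directly from the definitions given in its parentheses: RPD is the standard deviation of the observed values divided by the RMSE, and RPIQ is the interquartile range of the observed values divided by the RMSE. The sketch below is a minimal illustration with made-up numbers. Note that quartile conventions differ between software packages, and the one used by Python's `statistics.quantiles` default is an assumption here.

```python
from math import sqrt
from statistics import stdev, quantiles

def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))

def rpd(observed, predicted):
    """Ratio of the standard deviation of the observations to the RMSE."""
    return stdev(observed) / rmse(observed, predicted)

def rpiq(observed, predicted):
    """Ratio of the interquartile range of the observations to the RMSE."""
    q1, _, q3 = quantiles(observed, n=4)  # default 'exclusive' quartile method
    return (q3 - q1) / rmse(observed, predicted)

if __name__ == "__main__":
    som_obs = [12.0, 18.5, 22.1, 27.4, 31.0, 35.8]    # made-up SOM, g/kg
    som_pred = [13.1, 17.2, 23.0, 26.0, 32.5, 34.9]   # made-up predictions
    print(rmse(som_obs, som_pred), rpd(som_obs, som_pred), rpiq(som_obs, som_pred))
```

Because RPIQ relies on quartiles instead of the standard deviation, it is less sensitive to skewed distributions of the reference values, which is one reason both statistics are reported side by side.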
Tahir, Haroon Elrasheid; Xiaobo, Zou; Zhihua, Li; Jiyong, Shi; Zhai, Xiaodong; Wang, Sheng; Mariod, Abdalbasit Adam
2017-07-01
Fourier transform infrared with attenuated total reflectance (FTIR-ATR) and Raman spectroscopy combined with partial least squares regression (PLSR) were applied for the prediction of phenolic compounds and antioxidant activity in honey. Standards of catechin, syringic, vanillic, and chlorogenic acids were used for the identification and quantification of the individual phenolic compounds in six honey varieties using HPLC-DAD. Total antioxidant activity (TAC) and ferrous chelating capacity were measured spectrophotometrically. For the establishment of the PLSR model, Raman spectra with Savitzky-Golay smoothing in the wavenumber region 1500-400 cm⁻¹ were used, while for FTIR-ATR the wavenumber regions of 1800-700 and 3000-2800 cm⁻¹ with multiplicative scattering correction (MSC) and Savitzky-Golay smoothing were used. The determination coefficients (R²) ranged from 0.9272 to 0.9992 for Raman and from 0.9461 to 0.9988 for FTIR-ATR. FTIR-ATR and Raman spectroscopy proved to be simple, rapid and nondestructive methods to quantify phenolic compounds and antioxidant activities in honey. Copyright © 2017 Elsevier Ltd. All rights reserved.
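Multiplicative scattering correction, one of the pre-processing steps named in this abstract, is commonly implemented by regressing each spectrum on a reference spectrum (usually the mean of the set) and then removing the fitted offset and slope. The sketch below follows that common formulation as an assumption, since the abstract does not describe the exact variant or software used. The function name `msc` is hypothetical.

```python
def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum.

    Each spectrum x is modelled as x = offset + slope * reference, and the
    corrected spectrum is (x - offset) / slope.
    """
    n_pts = len(spectra[0])
    ref = [sum(s[k] for s in spectra) / len(spectra) for k in range(n_pts)]
    ref_mean = sum(ref) / n_pts
    sxx = sum((r - ref_mean) ** 2 for r in ref)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / n_pts
        slope = sum((r - ref_mean) * (x - s_mean) for r, x in zip(ref, s)) / sxx
        offset = s_mean - slope * ref_mean
        corrected.append([(x - offset) / slope for x in s])
    return corrected

if __name__ == "__main__":
    # Two made-up spectra that differ only by an additive and a multiplicative effect
    raw = [[1.0, 2.0, 3.0, 2.0], [3.0, 5.0, 7.0, 5.0]]
    for row in msc(raw):
        print([round(v, 6) for v in row])
```

In this toy case both corrected spectra collapse onto the mean spectrum, which is the intended effect when the only differences between samples are scatter-induced offset and scaling.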
Pérez-Castaño, Estefanía; Sánchez-Viñas, Mercedes; Gázquez-Evangelista, Domingo; Bagur-González, M Gracia
2018-01-15
This paper describes and discusses the application of chromatographic fingerprints of trimethylsilyl (TMS)-4,4'-desmethylsterol derivatives (obtained from an off-line HPLC-GC-FID system) for the quantification of extra virgin olive oil in commercial vinaigrettes, salad dressings and in-house reference materials (i-HRM) using two different Partial Least Squares Regression (PLS-R) multivariate quantification methods. Several data pre-processing steps were applied, the complete sequence being: (i) internal normalization; (ii) sampling based on the Nyquist theorem; (iii) interval correlation optimized shifting (icoshift); (iv) baseline correction; (v) mean centering; and (vi) zone selection. The first model corresponds to a matrix of dimensions 'n×911' variables and the second one to a matrix of dimensions 'n×431' variables. It should be highlighted that the two proposed PLS-R models allow the quantification of extra virgin olive oil in binary blends, foodstuffs, etc., when its percentage is greater than 25%. Copyright © 2017 Elsevier Ltd. All rights reserved.
Hyperspectral sensing to detect the impact of herbicide drift on cotton growth and yield
NASA Astrophysics Data System (ADS)
Suarez, L. A.; Apan, A.; Werth, J.
2016-10-01
Yield loss in crops is often associated with plant disease or external factors such as environment, water supply and nutrient availability. Improper agricultural practices can also introduce risks into the equation. Herbicide drift can be a combination of improper practices and environmental conditions which can create a potential yield loss. As traditional assessment of plant damage is often imprecise and time consuming, the ability of remote and proximal sensing techniques to monitor various bio-chemical alterations in the plant may offer a faster, non-destructive and reliable approach to predict yield loss caused by herbicide drift. This paper examines the prediction capabilities of partial least squares regression (PLS-R) models for estimating yield. Models were constructed with hyperspectral data of a cotton crop sprayed with three simulated doses of the phenoxy herbicide 2,4-D at three different growth stages. Fibre quality, photosynthesis, conductance, and two main hormones, indole acetic acid (IAA) and abscisic acid (ABA) were also analysed. Except for fibre quality and ABA, Spearman correlations have shown that these variables were highly affected by the chemical. Four PLS-R models for predicting yield were developed according to four timings of data collection: 2, 7, 14 and 28 days after the exposure (DAE). As indicated by the model performance, the analysis revealed that 7 DAE was the best time for data collection purposes (RMSEP = 2.6 and R2 = 0.88), followed by 28 DAE (RMSEP = 3.2 and R2 = 0.84). In summary, the results of this study show that it is possible to accurately predict yield after a simulated herbicide drift of 2,4-D on a cotton crop, through the analysis of hyperspectral data, thereby providing a reliable, effective and non-destructive alternative based on the internal response of the cotton leaves.
Lu, Xiaonan; Rasco, Barbara A.; Jabal, Jamie M. F.; Aston, D. Eric; Lin, Mengshi; Konkel, Michael E.
2011-01-01
Fourier transform infrared (FT-IR) spectroscopy and Raman spectroscopy were used to study the cell injury and inactivation of Campylobacter jejuni from exposure to antioxidants from garlic. C. jejuni was treated with various concentrations of garlic concentrate and garlic-derived organosulfur compounds in growth media and saline at 4, 22, and 35°C. The antimicrobial activities of the diallyl sulfides increased with the number of sulfur atoms (diallyl sulfide < diallyl disulfide < diallyl trisulfide). FT-IR spectroscopy confirmed that organosulfur compounds are responsible for the substantial antimicrobial activity of garlic, much greater than those of garlic phenolic compounds, as indicated by changes in the spectral features of proteins, lipids, and polysaccharides in the bacterial cell membranes. Confocal Raman microscopy (532-nm-gold-particle substrate) and Raman mapping of a single bacterium confirmed the intracellular uptake of sulfur and phenolic components. Scanning electron microscopy (SEM) and transmission electron microscopy (TEM) were employed to verify cell damage. Principal-component analysis (PCA), discriminant function analysis (DFA), and soft independent modeling of class analogs (SIMCA) were performed, and results were cross validated to differentiate bacteria based upon the degree of cell injury. Partial least-squares regression (PLSR) was employed to quantify and predict actual numbers of healthy and injured bacterial cells remaining following treatment. PLSR-based loading plots were investigated to further verify the changes in the cell membrane of C. jejuni treated with organosulfur compounds. We demonstrated that bacterial injury and inactivation could be accurately investigated by complementary infrared and Raman spectroscopies using a chemical-based, “whole-organism fingerprint” with the aid of chemometrics and electron microscopy. PMID:21642409
Katseanes, Chelsea K; Chappell, Mark A; Hopkins, Bryan G; Durham, Brian D; Price, Cynthia L; Porter, Beth E; Miller, Lesley F
2016-11-01
After nearly a century of use in numerous munition platforms, TNT and RDX contamination has turned up in the environment, largely due to ammunition manufacturing or as part of releases from low-order detonations during training activities. Although the basic principles governing the environmental fate of TNT and RDX are known, accurate predictions of TNT and RDX persistence in soil remain elusive, particularly given the universal heterogeneity of pedomorphic soil types. In this work, we proposed a new solution for modeling the sorption and persistence of these munition constituents as multivariate mathematical functions correlating soil attribute data over a variety of taxonomically distinct soil types to contaminant behavior, instead of a single constant or parameter of a specific absolute value. To test this idea, we conducted experiments measuring the sorption of TNT and RDX on taxonomically different soil types that were extensively characterized physically and chemically. Statistical decomposition of the log-transformed and auto-scaled soil characterization data using the dimension-reduction technique PCA (principal component analysis) revealed a strong latent structure based on the multiple pairwise correlations among the soil properties. TNT and RDX sorption partitioning coefficients (KD-TNT and KD-RDX) were regressed against this latent structure using partial least squares regression (PLSR), generating 3-factor multivariate linear functions. Here, PLSR models predicted KD-TNT and KD-RDX values based on attributes contributing to endogenous alkaline/calcareous and soil fertility criteria, respectively, exhibited among the different soil types. We hypothesized that the latent structure arising from the strong covariance of the full multivariate geochemical matrix describing taxonomically distinguished soil types may provide the means for potentially predicting complex phenomena in soils.
The development of predictive multivariate models tuned to a local soil's taxonomic designation would have direct benefit to military range managers seeking to anticipate the environmental risks of training activities on impact sites. Published by Elsevier Ltd.
NASA Astrophysics Data System (ADS)
Pullanagari, R. R.; Kereszturi, Gábor; Yule, I. J.
2016-07-01
On-farm assessment of mixed pasture nutrient concentrations is important for animal production and pasture management. Hyperspectral imaging is recognized as a potential tool to quantify the nutrient content of vegetation. However, it is a great challenge to estimate macro and micro nutrients in heterogeneous mixed pastures. In this study, canopy reflectance data were measured using a high-resolution airborne visible-to-shortwave infrared (Vis-SWIR) imaging spectrometer covering the wavelength region 380-2500 nm to predict concentrations of the nutrients nitrogen (N), phosphorus (P), potassium (K), sulfur (S), zinc (Zn), sodium (Na), manganese (Mn), copper (Cu) and magnesium (Mg) in heterogeneous mixed pastures across a sheep and beef farm in hill country in New Zealand. Prediction models were developed using four different methods, namely partial least squares regression (PLSR), kernel PLSR, support vector regression (SVR) and random forest regression (RFR), and their performance was compared using the test data. The results revealed that RFR produced the highest accuracy (0.55 ⩽ R2CV ⩽ 0.78; 6.68% ⩽ nRMSECV ⩽ 26.47%) compared to all other algorithms for the majority of nutrients (N, P, K, Zn, Na, Cu and Mg), and the remaining nutrients (S and Mn) were predicted with high accuracy (0.68 ⩽ R2CV ⩽ 0.86; 13.00% ⩽ nRMSECV ⩽ 14.64%) using SVR. The best training models were used to extrapolate over the whole farm to predict those pasture nutrients, and the predictions were expressed as pixel-based spatial maps. These spatially registered nutrient maps demonstrate the range and geographical location of often large differences in pasture nutrient values, which are normally not measured and therefore not included in decision making when considering more effective ways to utilize pasture.
NASA Astrophysics Data System (ADS)
Tuukkanen, T.; Marttila, H.; Kløve, B.
2017-07-01
Organic matter and nutrient export from drained peatlands is affected by complex hydrological and biogeochemical interactions. Here partial least squares regression (PLSR) was used to relate various soil and catchment characteristics to variations in chemical oxygen demand (COD), total nitrogen (TN), and total phosphorus (TP) concentrations in runoff. Peat core samples and water quality data were collected from 15 peat extraction sites in Finland. PLSR models constructed by cross-validation and variable selection routines predicted 92, 88, and 95% of the variation in mean COD, TN, and TP concentration in runoff, respectively. The results showed that variations in COD were mainly related to net production (temperature and water-extractable dissolved organic carbon (DOC)), hydrology (topographical relief), and solubility of dissolved organic matter (peat sulfur (S) and calcium (Ca) concentrations). Negative correlations for peat S and runoff COD indicated that acidity from oxidation of organic S stored in peat may be an important mechanism suppressing organic matter leaching. Moreover, runoff COD was associated with peat aluminum (Al), P, and sodium (Na) concentrations. Hydrological controls on TN and COD were similar (i.e., related to topography), whereas degree of humification, bulk density, and water-extractable COD and Al provided additional explanations for TN concentration. Variations in runoff TP concentration were attributed to erosion of particulate P, as indicated by a positive correlation with suspended sediment concentration (SSC), and factors associated with metal-humic complexation and P adsorption (peat Al, water-extractable P, and water-extractable iron (Fe)).
Canopy Spectral Reflectance as a Predictor of Soil Water Potential in Rice
NASA Astrophysics Data System (ADS)
Panigrahi, N.; Das, B. S.
2018-04-01
Soil water potential (SWP) is a key parameter for characterizing water stress. Typically, a tensiometer is used to measure SWP. However, the measurement range of commercially available tensiometers is limited to -90 kPa, and a tensiometer can only provide an estimate of SWP at a single location. In this study, a new approach was developed for estimating SWP from spectral reflectance data of a standing rice crop over the visible to shortwave-infrared region (wavelength: 350-2,500 nm). Five water stress treatments corresponding to targeted SWP of -30, -50, -70, -120, and -140 kPa were examined by withholding irrigation during the vegetative growth stage of three rice varieties. Tensiometers and a mechanistic water flow model were used for monitoring SWP. Spectral models for SWP were developed using partial-least-squares regression (PLSR), support vector regression (SVR), and coupled PLSR and feature selection (PLSRFS) approaches. Results showed that SVR was the best approach for estimating SWP from spectral reflectance data, with coefficient of determination values of 0.71 and 0.55 for the calibration and validation data sets, respectively. Observed root-mean-squared residuals for the predicted SWPs were in the range of -7 to -19 kPa. A new spectral water stress index was also developed using the reflectance values at 745 and 2,002 nm, which showed strong correlation with relative water content and electrolyte leakage. This new approach is rapid and noninvasive and may be used for estimating SWP over large areas.
Value of Information Analysis for Time-lapse Seismic Data by Simulation-Regression
NASA Astrophysics Data System (ADS)
Dutta, G.; Mukerji, T.; Eidsvik, J.
2016-12-01
A novel method to estimate the Value of Information (VOI) of time-lapse seismic data in the context of reservoir development is proposed. VOI is a decision analytic metric quantifying the incremental value that would be created by collecting information prior to making a decision under uncertainty. The VOI has to be computed before collecting the information and can be used to justify its collection. Previous work on estimating the VOI of geophysical data has involved explicit approximation of the posterior distribution of reservoir properties given the data and then evaluating the prospect values for that posterior distribution of reservoir properties. Here, we propose to directly estimate the prospect values given the data by building a statistical relationship between them using regression. Various regression techniques such as Partial Least Squares Regression (PLSR), Multivariate Adaptive Regression Splines (MARS) and k-Nearest Neighbors (k-NN) are used to estimate the VOI, and the results compared. For a univariate Gaussian case, the VOI obtained from simulation-regression has been shown to be close to the analytical solution. Estimating VOI by simulation-regression is much less computationally expensive since the posterior distribution of reservoir properties given each possible dataset need not be modeled and the prospect values need not be evaluated for each such posterior distribution of reservoir properties. This method is flexible, since it does not require rigid model specification of posterior but rather fits conditional expectations non-parametrically from samples of values and data.
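The simulation-regression idea for the univariate Gaussian case can be sketched as follows. This is an illustrative setup assumed here, not the authors' reservoir model: a prospect value v ~ N(mu, sigma^2), a noisy measurement y = v + e with e ~ N(0, tau^2), and a decision to develop only when the expected value is positive. Under these assumptions the VOI has a closed form, and an ordinary least-squares line (standing in for the PLSR, MARS or k-NN fits named in the abstract) estimates the conditional expectation from simulated pairs. All function and parameter names are hypothetical.

```python
import math
import random


def analytic_voi(mu, sigma, tau):
    """Closed-form VOI for the assumed Gaussian develop-or-not decision."""
    # Standard deviation of the conditional mean E[v | y].
    r = sigma ** 2 / math.sqrt(sigma ** 2 + tau ** 2)
    z = mu / r
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    posterior_value = mu * cdf + r * pdf  # E[max(0, E[v | y])]
    prior_value = max(0.0, mu)            # value of deciding without data
    return posterior_value - prior_value


def simulation_regression_voi(mu, sigma, tau, n=20000, seed=0):
    """Estimate VOI by regressing simulated values on simulated data."""
    rng = random.Random(seed)
    values = [rng.gauss(mu, sigma) for _ in range(n)]
    data = [v + rng.gauss(0.0, tau) for v in values]

    # Ordinary least-squares fit of value on data (assumed stand-in regressor).
    mean_v = sum(values) / n
    mean_y = sum(data) / n
    cov = sum((y - mean_y) * (v - mean_v) for y, v in zip(data, values))
    var = sum((y - mean_y) ** 2 for y in data)
    slope = cov / var
    intercept = mean_v - slope * mean_y

    # Average the value of the best decision given each simulated dataset.
    posterior_value = sum(max(0.0, intercept + slope * y) for y in data) / n
    prior_value = max(0.0, mean_v)
    return posterior_value - prior_value
```

The regression only needs joint samples of values and data, so no posterior distribution is modeled explicitly; in this toy case the estimate can be checked against `analytic_voi`.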
Madhavan, Dinesh B; Baldock, Jeff A; Read, Zoe J; Murphy, Simon C; Cunningham, Shaun C; Perring, Michael P; Herrmann, Tim; Lewis, Tom; Cavagnaro, Timothy R; England, Jacqueline R; Paul, Keryn I; Weston, Christopher J; Baker, Thomas G
2017-05-15
Reforestation of agricultural lands with mixed-species environmental plantings can effectively sequester C. While accurate and efficient methods for predicting soil organic C content and composition have recently been developed for soils under agricultural land uses, such methods under forested land uses are currently lacking. This study aimed to develop a method using infrared spectroscopy for accurately predicting total organic C (TOC) and its fractions (particulate, POC; humus, HOC; and resistant, ROC organic C) in soils under environmental plantings. Soils were collected from 117 paired agricultural-reforestation sites across Australia. TOC fractions were determined in a subset of 38 reforested soils using physical fractionation by automated wet-sieving and 13C nuclear magnetic resonance (NMR) spectroscopy. Mid- and near-infrared spectra (MNIRS, 6000-450 cm-1) were acquired from finely-ground soils from environmental plantings and agricultural land. Satisfactory prediction models based on MNIRS and partial least squares regression (PLSR) were developed for TOC and its fractions. Leave-one-out cross-validations of MNIRS-PLSR models indicated accurate predictions (R2 > 0.90, negligible bias, ratio of performance to deviation > 3) and fraction-specific functional group contributions to beta coefficients in the models. TOC and its fractions were predicted using the cross-validated models and soil spectra for 3109 reforested and agricultural soils. The reliability of predictions determined using k-nearest neighbour score distance indicated that >80% of predictions were within the satisfactory inlier limit. The study demonstrated the utility of infrared spectroscopy (MNIRS-PLSR) to rapidly and economically determine TOC and its fractions and thereby accurately describe the effects of land use change such as reforestation on agricultural soils. Copyright © 2017 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Shi, Tiezhu; Wang, Junjie; Chen, Yiyun; Wu, Guofeng
2016-10-01
Visible and near-infrared reflectance spectroscopy provides a beneficial tool for investigating soil heavy metal contamination. This study aimed to investigate mechanisms of soil arsenic prediction using laboratory based soil and leaf spectra, compare the prediction of arsenic content using soil spectra with that using rice plant spectra, and determine whether the combination of both could improve the prediction of soil arsenic content. A total of 100 samples were collected and the reflectance spectra of soils and rice plants were measured using a FieldSpec3 portable spectroradiometer (350-2500 nm). After eliminating spectral outliers, the reflectance spectra were divided into calibration (n = 62) and validation (n = 32) data sets using the Kennard-Stone algorithm. Genetic algorithm (GA) was used to select useful spectral variables for soil arsenic prediction. Thereafter, the GA-selected spectral variables of the soil and leaf spectra were individually and jointly employed to calibrate the partial least squares regression (PLSR) models using the calibration data set. The regression models were validated and compared using independent validation data set. Furthermore, the correlation coefficients of soil arsenic against soil organic matter, leaf arsenic and leaf chlorophyll were calculated, and the important wavelengths for PLSR modeling were extracted. Results showed that arsenic prediction using the leaf spectra (coefficient of determination in validation, Rv2 = 0.54; root mean square error in validation, RMSEv = 12.99 mg kg-1; and residual prediction deviation in validation, RPDv = 1.35) was slightly better than using the soil spectra (Rv2 = 0.42, RMSEv = 13.35 mg kg-1, and RPDv = 1.31). However, results also showed that the combinational use of soil and leaf spectra resulted in higher arsenic prediction (Rv2 = 0.63, RMSEv = 11.94 mg kg-1, RPDv = 1.47) compared with either soil or leaf spectra alone. 
Soil spectral bands near 480, 600, 670, 810, 1980, 2050 and 2290 nm, leaf spectral bands near 700, 890 and 900 nm in PLSR models were important wavelengths for soil arsenic prediction. Moreover, soil arsenic showed significantly positive correlations with soil organic matter (r = 0.62, p < 0.01) and leaf arsenic (r = 0.77, p < 0.01), and a significantly negative correlation with leaf chlorophyll (r = -0.67, p < 0.01). The results showed that the prediction of arsenic contents using soil and leaf spectra may be based on their relationships with soil organic matter and leaf chlorophyll contents, respectively. Although RPD of 1.47 was below the recommended RPD of >2 for soil analysis, arsenic prediction in agricultural soils can be improved by combining the leaf and soil spectra.
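The validation statistics quoted above (R2, RMSE and RPD) can be computed as in the sketch below. It assumes the common conventions R2 = 1 - SS_res/SS_tot and RPD = (sample standard deviation of the observed values) / RMSE; the paper may use slightly different definitions, and the function names are hypothetical. Under this RPD definition, the reported RPDv = 1.47 with RMSEv = 11.94 mg kg-1 would imply a validation-set standard deviation of roughly 17.6 mg kg-1.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def sample_sd(values):
    """Sample standard deviation (n - 1 in the denominator)."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))


def r_squared(observed, predicted):
    """Coefficient of determination, assumed here as 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def rpd(observed, predicted):
    """Residual prediction deviation, assumed here as SD(observed) / RMSE."""
    return sample_sd(observed) / rmse(observed, predicted)


# Toy numbers for illustration only (not data from the study).
observed = [10.0, 20.0, 30.0, 40.0, 50.0]
predicted = [12.0, 18.0, 33.0, 37.0, 52.0]
print(rmse(observed, predicted), r_squared(observed, predicted), rpd(observed, predicted))
```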
Jiménez-Carvelo, Ana M; González-Casado, Antonio; Cuadros-Rodríguez, Luis
2017-03-01
A new analytical method was developed for the quantification of olive oil and palm oil in blends with other edible vegetable oils (canola, safflower, corn, peanut, seeds, grapeseed, linseed, sesame and soybean) using normal phase liquid chromatography and chemometric tools. The procedure for obtaining the chromatographic fingerprint from the methyl-transesterified fraction of each blend is described. The multivariate quantification methods used were Partial Least Squares Regression (PLS-R) and Support Vector Regression (SVR). The quantification results were evaluated by several parameters, such as the Root Mean Square Error of Validation (RMSEV), Mean Absolute Error of Validation (MAEV) and Median Absolute Error of Validation (MdAEV). Notably, the chromatographic analysis in the proposed method takes only eight minutes, and the results obtained show the potential of the method for quantifying mixtures of olive oil and palm oil with other vegetable oils. Copyright © 2016 Elsevier B.V. All rights reserved.
Ding, Guoyu; Li, Baiqing; Han, Yanqi; Liu, Aina; Zhang, Jingru; Peng, Jiamin; Jiang, Min; Hou, Yuanyuan; Bai, Gang
2016-11-30
For quality control of herbal medicines or functional foods, integral activity evaluation has become more popular in recent studies. The majority of researchers focus on the relationship between chromatography/mass spectroscopy and bioactivity, but the connection with spectrum-activity is easily ignored. In this paper, the near infrared reflection spectra (NIRS) of Flos Chrysanthemi samples were collected as a representative spectrum technology, and corresponding anti-inflammation activities were utilized to illustrate the spectrum-activity study. HPLC/Q-TOF-MS identification and heat map clustering were used to select the quality markers (Q-marker) from five cultivars of Flos Chrysanthemi. Using boxplot analysis and the interval limits of detection (LODs) theory, six crucial markers, namely, chlorogenic acid, 3,5-dicaffeoylquinic acid, 1,5-dicaffeoylquinic acid, luteoloside, apigenin-7-O-β-d-glucoside, and luteolin-7-O-6-malonylglucoside were screened out. Then partial least squares regression (PLSR) calibration models combined with synergy interval partial least squares (siPLS) and 12 different spectral pretreatment methods were developed for the parameters optimization of these Q-markers in Flos Chrysanthemi powder. After comparing the relationship between Q-marker contents and anti-inflammation activity via three machine learning approaches and PLSR, back-propagation neural network (BP-ANN) displayed a more excellent non-linear fitting effect, as its R for new batches reached 0.89. These results indicated that the integrated NIRS and bioactive strategy was suitable for fast quality management in Flos Chrysanthemi, and also applied to other botanical food quality control. Copyright © 2016 Elsevier B.V. All rights reserved.
Liu, Jinxia; Cao, Yue; Wang, Qiu; Pan, Wenjuan; Ma, Fei; Liu, Changhong; Chen, Wei; Yang, Jianbo; Zheng, Lei
2016-01-01
Water-injected beef has aroused public concern as a major food-safety issue in meat products. In this study, the potential of multispectral imaging analysis in the visible and near-infrared (405-970 nm) regions was evaluated for identifying water-injected beef. A multispectral vision system was used to acquire images of beef injected with up to 21% water content, and a partial least squares regression (PLSR) algorithm was employed to establish a prediction model, leading to quantitative estimates of actual water increase with a correlation coefficient (r) of 0.923. Subsequently, an optimized model was obtained by integrating spectral data with feature information extracted from ordinary RGB data, yielding better predictions (r = 0.946). Moreover, the prediction equation was applied to each pixel within the images to visualize the distribution of actual water increase. These results demonstrate the capability of multispectral imaging technology as a rapid and non-destructive tool for the identification of water-injected beef. Copyright © 2015 Elsevier Ltd. All rights reserved.
Determination of elemental composition of shale rocks by laser induced breakdown spectroscopy
NASA Astrophysics Data System (ADS)
Sanghapi, Hervé K.; Jain, Jinesh; Bol'shakov, Alexander; Lopano, Christina; McIntyre, Dustin; Russo, Richard
2016-08-01
In this study laser induced breakdown spectroscopy (LIBS) is used for elemental characterization of outcrop samples from the Marcellus Shale. Powdered samples were pressed to form pellets and used for LIBS analysis. Partial least squares regression (PLS-R) and univariate calibration curves were used for quantification of analytes. The matrix effect is substantially reduced using the partial least squares calibration method. Predicted results with LIBS are compared to ICP-OES results for Si, Al, Ti, Mg, and Ca. As for C, its results are compared to those obtained by a carbon analyzer. Relative errors of the LIBS measurements are in the range of 1.7 to 12.6%. The limits of detection (LODs) obtained for Si, Al, Ti, Mg and Ca are 60.9, 33.0, 15.6, 4.2 and 0.03 ppm, respectively. An LOD of 0.4 wt.% was obtained for carbon. This study shows that the LIBS method can provide a rapid analysis of shale samples and can potentially benefit depleted gas shale carbon storage research.
Selecting minimum dataset soil variables using PLSR as a regressive multivariate method
NASA Astrophysics Data System (ADS)
Stellacci, Anna Maria; Armenise, Elena; Castellini, Mirko; Rossi, Roberta; Vitti, Carolina; Leogrande, Rita; De Benedetto, Daniela; Ferrara, Rossana M.; Vivaldi, Gaetano A.
2017-04-01
Long-term field experiments and science-based tools that characterize soil status (namely the soil quality indices, SQIs) assume a strategic role in assessing the effect of agronomic techniques and thus in improving soil management especially in marginal environments. Selecting key soil variables able to best represent soil status is a critical step for the calculation of SQIs. Current studies show the effectiveness of statistical methods for variable selection to extract relevant information deriving from multivariate datasets. Principal component analysis (PCA) has been mainly used, however supervised multivariate methods and regressive techniques are progressively being evaluated (Armenise et al., 2013; de Paul Obade et al., 2016; Pulido Moncada et al., 2014). The present study explores the effectiveness of partial least square regression (PLSR) in selecting critical soil variables, using a dataset comparing conventional tillage and sod-seeding on durum wheat. The results were compared to those obtained using PCA and stepwise discriminant analysis (SDA). The soil data derived from a long-term field experiment in Southern Italy. On samples collected in April 2015, the following set of variables was quantified: (i) chemical: total organic carbon and nitrogen (TOC and TN), alkali-extractable C (TEC and humic substances - HA-FA), water extractable N and organic C (WEN and WEOC), Olsen extractable P, exchangeable cations, pH and EC; (ii) physical: texture, dry bulk density (BD), macroporosity (Pmac), air capacity (AC), and relative field capacity (RFC); (iii) biological: carbon of the microbial biomass quantified with the fumigation-extraction method. PCA and SDA were previously applied to the multivariate dataset (Stellacci et al., 2016). PLSR was carried out on mean centered and variance scaled data of predictors (soil variables) and response (wheat yield) variables using the PLS procedure of SAS/STAT. 
In addition, variable importance for projection (VIP) statistics was used to quantitatively assess the predictors most relevant for response variable estimation and then for variable selection (Andersen and Bro, 2010). PCA and SDA returned TOC and RFC as influential variables both on the set of chemical and physical data analyzed separately as well as on the whole dataset (Stellacci et al., 2016). Highly weighted variables in PCA were also TEC, followed by K, and AC, followed by Pmac and BD, in the first PC (41.2% of total variance); Olsen P and HA-FA in the second PC (12.6%), Ca in the third (10.6%) component. Variables enabling maximum discrimination among treatments for SDA were WEOC, on the whole dataset, humic substances, followed by Olsen P, EC and clay, in the separate data analyses. The highest PLS-VIP statistics were recorded for Olsen P and Pmac, followed by TOC, TEC, pH and Mg for chemical variables and clay, RFC and AC for the physical variables. Results show that different methods may provide different ranking of the selected variables and the presence of a response variable, in regressive techniques, may affect variable selection. Further investigation with different response variables and with multi-year datasets would allow to better define advantages and limits of single or combined approaches. Acknowledgment The work was supported by the projects "BIOTILLAGE, approcci innovative per il miglioramento delle performances ambientali e produttive dei sistemi cerealicoli no-tillage", financed by PSR-Basilicata 2007-2013, and "DESERT, Low-cost water desalination and sensor technology compact module" financed by ERANET-WATERWORKS 2014. References Andersen C.M. and Bro R., 2010. Variable selection in regression - a tutorial. Journal of Chemometrics, 24 728-737. Armenise et al., 2013. Developing a soil quality index to compare soil fitness for agricultural use under different managements in the mediterranean environment. 
Soil and Tillage Research, 130:91-98. de Paul Obade et al., 2016. A standardized soil quality index for diverse field conditions. Sci. Total Env. 541:424-434. Pulido Moncada et al., 2014. Data-driven analysis of soil quality indicators using limited data. Geoderma, 235:271-278. Stellacci et al., 2016. Comparison of different multivariate methods to select key soil variables for soil quality indices computation. XLV Congress of the Italian Society of Agronomy (SIA), Sassari, 20-22 September 2016.
Cao, Xueren; Luo, Yong; Zhou, Yilin; Fan, Jieru; Xu, Xiangming; West, Jonathan S.; Duan, Xiayu; Cheng, Dengfa
2015-01-01
To determine the influence of plant density and powdery mildew infection of winter wheat and to predict grain yield, hyperspectral canopy reflectance of winter wheat was measured for two plant densities at Feekes growth stage (GS) 10.5.3, 10.5.4, and 11.1 in the 2009–2010 and 2010–2011 seasons. Reflectance in near infrared (NIR) regions was significantly correlated with disease index at GS 10.5.3, 10.5.4, and 11.1 at two plant densities in both seasons. For the two plant densities, the area of the red edge peak (Σdr 680–760 nm), difference vegetation index (DVI), and triangular vegetation index (TVI) were significantly correlated negatively with disease index at three GSs in two seasons. Compared with other parameters Σdr 680–760 nm was the most sensitive parameter for detecting powdery mildew. Linear regression models relating mildew severity to Σdr 680–760 nm were constructed at three GSs in two seasons for the two plant densities, demonstrating no significant difference in the slope estimates between the two plant densities at three GSs. Σdr 680–760 nm was correlated with grain yield at three GSs in two seasons. The accuracies of partial least square regression (PLSR) models were consistently higher than those of models based on Σdr 680–760 nm for disease index and grain yield. PLSR can, therefore, provide more accurate estimation of disease index of wheat powdery mildew and grain yield using canopy reflectance. PMID:25815468
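Two of the spectral parameters named above can be sketched in a few lines. The definitions used here are common conventions assumed for illustration, and the abstract does not state the exact formulas or bands: the red edge area is taken as the sum of first-derivative reflectance values between 680 and 760 nm, and DVI as near-infrared reflectance minus red reflectance. Function names and the sample spectrum are hypothetical.

```python
def red_edge_area(wavelengths, reflectance, start=680.0, stop=760.0):
    """Sum of finite-difference first-derivative values inside [start, stop].

    Assumes wavelengths are sorted in increasing order (nm).
    """
    total = 0.0
    for k in range(len(wavelengths) - 1):
        left, right = wavelengths[k], wavelengths[k + 1]
        if start <= left and right <= stop:
            # First derivative of reflectance between two adjacent bands.
            total += (reflectance[k + 1] - reflectance[k]) / (right - left)
    return total


def dvi(nir, red):
    """Difference vegetation index, assumed here as NIR minus red reflectance."""
    return nir - red


# Made-up canopy spectrum sampled every 40 nm (illustration only).
wavelengths = [640.0, 680.0, 720.0, 760.0, 800.0]
reflectance = [0.04, 0.05, 0.25, 0.45, 0.47]
print(red_edge_area(wavelengths, reflectance), dvi(reflectance[4], reflectance[1]))
```

A flatter red edge, as expected for a diseased canopy, gives a smaller sum, which is consistent with the negative correlation with disease index reported in the abstract.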
NASA Astrophysics Data System (ADS)
Kusumo, B. H.; Sukartono, S.; Bustan, B.
2018-02-01
Measuring soil organic carbon (C) by conventional analysis is a tedious, time-consuming and expensive procedure. A simple procedure that is cheap and saves time is needed. Near infrared technology offers a rapid procedure, as it is based on soil spectral reflectance and requires no chemicals. The aim of this research was to test whether this technology is able to rapidly measure soil organic C in rice paddy fields. Soil samples were collected from rice paddy fields on Lombok Island, Indonesia, and the coordinates of the samples were recorded. One part of each sample was analysed using conventional analysis (Walkley and Black) and another part was scanned using near infrared spectroscopy (NIRS) to collect soil spectra. Partial Least Square Regression (PLSR) models were developed using soil C data from the conventional analysis and soil spectral reflectance data. The models were moderately successful in measuring soil C in the rice paddy fields of Lombok Island. This shows that NIR technology can be further used to monitor C change in rice paddy soil.
NASA Astrophysics Data System (ADS)
Das, Bappa; Sahoo, Rabi N.; Pargal, Sourabh; Krishna, Gopal; Verma, Rakesh; Chinnusamy, Viswanathan; Sehgal, Vinay K.; Gupta, Vinod K.; Dash, Sushanta K.; Swain, Padmini
2018-03-01
In the present investigation, the changes in sucrose, reducing and total sugar content due to water-deficit stress in rice leaves were modeled using visible, near infrared (VNIR) and shortwave infrared (SWIR) spectroscopy. The objectives of the study were to identify the best vegetation indices and suitable multivariate technique based on precise analysis of hyperspectral data (350 to 2500 nm) and sucrose, reducing sugar and total sugar content measured at different stress levels from 16 different rice genotypes. Spectral data analysis was done to identify suitable spectral indices and models for sucrose estimation. Novel spectral indices in near infrared (NIR) range viz. ratio spectral index (RSI) and normalised difference spectral indices (NDSI) sensitive to sucrose, reducing sugar and total sugar content were identified which were subsequently calibrated and validated. The RSI and NDSI models had R2 values of 0.65, 0.71 and 0.67; RPD values of 1.68, 1.95 and 1.66 for sucrose, reducing sugar and total sugar, respectively for validation dataset. Different multivariate spectral models such as artificial neural network (ANN), multivariate adaptive regression splines (MARS), multiple linear regression (MLR), partial least square regression (PLSR), random forest regression (RFR) and support vector machine regression (SVMR) were also evaluated. The best performing multivariate models for sucrose, reducing sugars and total sugars were found to be, MARS, ANN and MARS, respectively with respect to RPD values of 2.08, 2.44, and 1.93. Results indicated that VNIR and SWIR spectroscopy combined with multivariate calibration can be used as a reliable alternative to conventional methods for measurement of sucrose, reducing sugars and total sugars of rice under water-deficit stress as this technique is fast, economic, and noninvasive.
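The two index families named above have simple standard forms, sketched below: RSI is the ratio of reflectance at two bands, and NDSI is their normalised difference. The band pair in the example (1100 and 1300 nm) and the reflectance values are hypothetical placeholders, not the bands identified in the study.

```python
def ratio_spectral_index(spectrum, band_i, band_j):
    """RSI = R_i / R_j for a spectrum given as {wavelength_nm: reflectance}."""
    return spectrum[band_i] / spectrum[band_j]


def normalised_difference_spectral_index(spectrum, band_i, band_j):
    """NDSI = (R_i - R_j) / (R_i + R_j)."""
    r_i, r_j = spectrum[band_i], spectrum[band_j]
    return (r_i - r_j) / (r_i + r_j)


# Made-up leaf reflectance at two NIR bands (illustration only).
spectrum = {1100: 0.40, 1300: 0.20}
print(ratio_spectral_index(spectrum, 1100, 1300))
print(normalised_difference_spectral_index(spectrum, 1100, 1300))
```

In practice an index like this is then regressed against the measured sugar content, and the fit is judged with statistics such as R2 and RPD.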
Oxidative Stress in Wild Boars Naturally and Experimentally Infected with Mycobacterium bovis
Gassó, Diana; Vicente, Joaquín; Mentaberre, Gregorio; Soriguer, Ramón; Jiménez Rodríguez, Rocío; Navarro-González, Nora; Tvarijonaviciute, Asta; Lavín, Santiago; Fernández-Llario, Pedro; Segalés, Joaquim; Serrano, Emmanuel
2016-01-01
Reactive oxygen and nitrogen species (ROS-RNS) are important defence substances involved in the immune response against pathogens. An excessive increase in ROS-RNS, however, can damage the organism, causing oxidative stress (OS). The organism is able to neutralise OS by the production of antioxidant enzymes (AE); hence, tissue damage is the result of an imbalance between oxidant and antioxidant status. Though some work has been carried out in humans, there is a lack of information about the oxidant/antioxidant status in the presence of tuberculosis (TB) in wild reservoirs. In the Mediterranean Basin, wild boar (Sus scrofa) is the main reservoir of TB. Wild boar showing severe TB have an increased risk of shedding Mycobacterium spp., leading to pathogen spread and persistence. If OS is greater in these individuals, oxidant/antioxidant balance in TB-affected boars could be used as a biomarker of disease severity. The present work had a two-fold objective: i) to study the effects of bovine TB on different OS biomarkers (namely superoxide dismutase (SOD), catalase (CAT), glutathione peroxidase (GPX), glutathione reductase (GR) and thiobarbituric acid reactive substances (TBARS)) in wild boar experimentally challenged with Mycobacterium bovis, and ii) to explore the role of body weight, sex, population and season in explaining the observed variability of OS indicators in two populations of free-ranging wild boar where TB is common. For the first objective, a partial least squares regression (PLSR) approach was used, whereas recursive partitioning with regression tree models (RTM) was applied for the second. A negative relationship between antioxidant enzymes and bovine TB (the more severe the lesions, the lower the concentration of antioxidant biomarkers) was observed in experimentally infected animals.
The final PLSR model retained the GPX, SOD and GR biomarkers and showed that 17.6% of the observed variability of antioxidant capacity was significantly correlated with the PLSR X’s component represented by both disease status and the age of boars. In the samples from free-ranging wild boar, however, the environmental factors were more relevant to the observed variability of the OS biomarkers than the TB itself. For each OS biomarker, each RTM was defined as a maximum by one node due to the population effect. Along the same lines, the ad hoc tree regression on boars from the population with a higher prevalence of severe TB confirmed that disease status was not the main factor explaining the observed variability in OS biomarkers. It was concluded that oxidative damage caused by TB is significant, but can only be detected in the absence of environmental variation in wild boar. PMID:27682987
Petersen, Nanna; Stocks, Stuart; Gernaey, Krist V
2008-05-01
The main purpose of this article is to demonstrate that principal component analysis (PCA) and partial least squares regression (PLSR) can be used to extract information from particle size distribution data and predict rheological properties. Samples from commercially relevant Aspergillus oryzae fermentations conducted in 550 L pilot scale tanks were characterized with respect to particle size distribution, biomass concentration, and rheological properties. The rheological properties were described using the Herschel-Bulkley model. Estimation of all three parameters in the Herschel-Bulkley model (yield stress (τ(y)), consistency index (K), and flow behavior index (n)) resulted in a large standard deviation of the parameter estimates. The flow behavior index was not found to be correlated with any of the other measured variables and previous studies have suggested a constant value of the flow behavior index in filamentous fermentations. It was therefore chosen to fix this parameter to the average value, thereby decreasing the standard deviation of the estimates of the remaining rheological parameters significantly. Using a PLSR model, a reasonable prediction of apparent viscosity (μ(app)), yield stress (τ(y)), and consistency index (K) could be made from the size distributions, biomass concentration, and process information. This provides a method with high predictive power for the rheology of fermentation broth, with the advantage over previous models that τ(y) and K can be predicted as well as μ(app). Validation on an independent test set yielded a root mean square error of 1.21 Pa for τ(y), 0.209 Pa s^n for K, and 0.0288 Pa s for μ(app), corresponding to R² = 0.95, R² = 0.94, and R² = 0.95 respectively. Copyright 2007 Wiley Periodicals, Inc.
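As an illustration of the Herschel-Bulkley model named in the abstract above, the following minimal Python sketch evaluates shear stress and apparent viscosity from the three parameters (yield stress, consistency index, flow behavior index). The function names and the example parameter values are assumptions chosen for illustration; they are not taken from the study.

```python
def herschel_bulkley_stress(shear_rate, tau_y, K, n):
    """Shear stress (Pa) from the Herschel-Bulkley model: tau = tau_y + K * shear_rate**n.

    shear_rate is in 1/s, tau_y in Pa, K in Pa s^n, n is dimensionless.
    """
    if shear_rate < 0:
        raise ValueError("shear_rate must be non-negative")
    return tau_y + K * shear_rate ** n


def apparent_viscosity(shear_rate, tau_y, K, n):
    """Apparent viscosity (Pa s), defined as shear stress divided by shear rate."""
    if shear_rate <= 0:
        raise ValueError("shear_rate must be positive")
    return herschel_bulkley_stress(shear_rate, tau_y, K, n) / shear_rate


# Hypothetical parameter values, for illustration only.
stress = herschel_bulkley_stress(shear_rate=10.0, tau_y=2.0, K=0.5, n=1.0)
viscosity = apparent_viscosity(shear_rate=10.0, tau_y=2.0, K=0.5, n=1.0)
```

Fixing n to a constant, as the abstract describes, leaves only the yield stress and the consistency index to be estimated from the flow curve.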
Nicolaou, Nicoletta; Goodacre, Royston
2008-10-01
Microbiological safety plays a very significant part in the quality control of milk and dairy products worldwide. Current methods used in the detection and enumeration of spoilage bacteria in pasteurized milk in the dairy industry, although accurate and sensitive, are time-consuming. FT-IR spectroscopy is a metabolic fingerprinting technique that can potentially be used to deliver results with the same accuracy and sensitivity, within minutes after minimal sample preparation. We tested this hypothesis using attenuated total reflectance (ATR) and high throughput (HT) FT-IR techniques. Three main types of pasteurized milk (whole, semi-skimmed and skimmed) were used and milk was allowed to spoil naturally by incubation at 15 °C. Samples for FT-IR were obtained at frequent, fixed time intervals and pH and total viable counts were also recorded. Multivariate statistical methods, including principal components-discriminant function analysis and partial least squares regression (PLSR), were then used to investigate the relationship between metabolic fingerprints and the total viable counts. FT-IR ATR data for all milks showed reasonable results for bacterial loads above 10^5 cfu ml^-1. By contrast, FT-IR HT provided more accurate results for lower viable bacterial counts, down to 10^3 cfu ml^-1 for whole milk and 4 × 10^2 cfu ml^-1 for semi-skimmed and skimmed milk. Using FT-IR with PLSR we were able to acquire a metabolic fingerprint rapidly and quantify the microbial load of milk samples accurately, with very little sample preparation. We believe that metabolic fingerprinting using FT-IR has very good potential for future use in the dairy industry as a rapid method of detection and enumeration.
Toziou, Peristera-Maria; Barmpalexis, Panagiotis; Boukouvala, Paraskevi; Verghese, Susan; Nikolakakis, Ioannis
2018-05-30
Since culture-based methods are costly and time consuming, alternative methods are investigated for the quantification of probiotics in commercial products. In this work ATR-FTIR vibration spectroscopy was applied for the differentiation and quantification of live Lactobacillus (La 5) in mixed populations of live and killed La 5, in the absence and in the presence of the enteric polymer Eudragit® L 100-55. Suspensions of live bacillus (La 5_L) and bacillus killed in an acidic environment (La 5_K) were prepared, and binary mixtures of different percentages were used to grow cell cultures for colony counting and spectral analysis. The increase in the number of colonies with % La 5_L added to the mixture was log-linear (r² = 0.926). Differentiation of La 5_L from La 5_K was possible directly from the peak area at 1635 cm⁻¹ (amides of proteins and peptides), and a linear relationship between % La 5_L and peak area was obtained in the range 0-95%. Application of partial least squares regression (PLSR) gave reasonable prediction of % La 5_L (RMSEp = 6.48) in binary mixtures of live and killed La 5 but poor prediction (RMSEp = 11.75) when polymer was added to the La 5 mixture. Application of artificial neural networks (ANNs) greatly improved the predictive ability for % La 5_L both in the absence and in the presence of polymer (RMSEp = 8.11 × 10⁻⁸ for La 5 only mixtures and RMSEp = 8.77 × 10⁻⁸ with added polymer) due to their ability to express more hidden spectral information in the calibration models than PLSR. Copyright © 2018 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Dube, Timothy; Sibanda, Mbulisi; Shoko, Cletah; Mutanga, Onisimo
2017-10-01
Forest stand volume is one of the crucial stand parameters, which influences the ability of these forests to provide ecosystem goods and services. This study thus aimed at examining the potential of integrating a multispectral SPOT 5 image with ancillary data (forest age and rainfall metrics) in estimating the stand volume of coppiced and planted Eucalyptus spp. in KwaZulu-Natal, South Africa. To achieve this objective, the Partial Least Squares Regression (PLSR) algorithm was used. The PLSR algorithm was implemented in three analysis stages: stage I, using ancillary data as an independent dataset; stage II, using SPOT 5 spectral bands as an independent dataset; and stage III, using combined SPOT 5 spectral bands and ancillary data. The results of the study showed that the use of an independent ancillary dataset better explained the volume of Eucalyptus spp. growing from coppices (adjusted R² (R²Adj) = 0.54, RMSEP = 44.08 m³/ha) when compared with those that were planted (R²Adj = 0.43, RMSEP = 53.29 m³/ha). Similar results were also observed when SPOT 5 spectral bands were applied as an independent dataset, whereas improved volume estimates were produced when using the combined dataset. For instance, planted Eucalyptus spp. were better predicted (R² = 0.77, R²Adj = 0.59, RMSEP = 36.02 m³/ha) when compared with those that grow from coppices (R² = 0.76, R²Adj = 0.46, RMSEP = 40.63 m³/ha). Overall, the findings of this study demonstrated the relevance of multi-source data in ecosystems modelling.
Wang, Jie; Shen, Changwei; Liu, Na; Jin, Xin; Fan, Xueshan; Dong, Caixia; Xu, Yangchun
2017-03-08
Non-destructive and timely determination of leaf nitrogen (N) concentration is urgently needed for N management in pear orchards. A two-year field experiment was conducted in a commercial pear orchard with five N application rates: 0 (N0), 165 (N1), 330 (N2), 660 (N3), and 990 (N4) kg N ha⁻¹. The mid-portion leaves on the year's shoot were selected first for the spectral measurement and then for N concentration determination in the laboratory at 50 and 80 days after full bloom (DAB). Three methods of in-field spectral measurement (25° bare fibre under solar conditions, black background attached to plant probe, and white background attached to plant probe) were compared. We also investigated the modelling performances of four chemometric techniques (principal components regression, PCR; partial least squares regression, PLSR; stepwise multiple linear regression, SMLR; and back propagation neural network, BPNN) and three vegetation indices (difference spectral index, normalized difference spectral index, and ratio spectral index). Due to the low correlation of reflectance obtained by the 25° field of view method, all of the modelling was performed on two spectral datasets, both acquired by a plant probe. Results showed that the best modelling and prediction accuracy were found in the model established by PLSR and spectra measured with a black background. The randomly-separated subsets of calibration (n = 1000) and validation (n = 420) of this model resulted in high R² values of 0.86 and 0.85, respectively, as well as a low mean relative error (<6%). Furthermore, a higher coefficient of determination between the leaf N concentration and fruit yield was found at 50 DAB samplings in both 2015 (R² = 0.77) and 2014 (R² = 0.59). Thus, the leaf N concentration was suggested to be determined at 50 DAB by visible/near-infrared spectroscopy and the threshold should be 24-27 g/kg.
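The three vegetation indices listed in the abstract above are commonly defined on a pair of band reflectances. The sketch below uses those common two-band definitions; the abstract does not state the exact formulas or wavelengths used in the study, so both the definitions and the example reflectance values are assumptions.

```python
def difference_spectral_index(r1, r2):
    """DSI: plain difference of the reflectances at two wavelengths."""
    return r1 - r2


def normalized_difference_spectral_index(r1, r2):
    """NDSI: difference of two reflectances divided by their sum."""
    total = r1 + r2
    if total == 0:
        raise ValueError("sum of reflectances must be non-zero")
    return (r1 - r2) / total


def ratio_spectral_index(r1, r2):
    """RSI: ratio of the reflectances at two wavelengths."""
    if r2 == 0:
        raise ValueError("denominator reflectance must be non-zero")
    return r1 / r2


# Hypothetical reflectances at two arbitrary bands, for illustration only.
r_band_a, r_band_b = 0.6, 0.2
indices = (
    difference_spectral_index(r_band_a, r_band_b),
    normalized_difference_spectral_index(r_band_a, r_band_b),
    ratio_spectral_index(r_band_a, r_band_b),
)
```

In practice such indices are usually computed for many band pairs and the pair with the strongest relationship to the measured trait is retained.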
Xu, Yun; Muhamadali, Howbeer; Sayqal, Ali; Dixon, Neil; Goodacre, Royston
2016-10-28
Partial least squares (PLS) is one of the most commonly used supervised modelling approaches for analysing multivariate metabolomics data. PLS is typically employed as either a regression model (PLS-R) or a classification model (PLS-DA). However, in metabolomics studies it is common to investigate multiple, potentially interacting, factors simultaneously following a specific experimental design. Such data often cannot be considered as a "pure" regression or a classification problem. Nevertheless, these data have often still been treated as a regression or classification problem and this could lead to ambiguous results. In this study, we investigated the feasibility of designing a hybrid target matrix Y that better reflects the experimental design than simple regression or binary class membership coding commonly used in PLS modelling. The new design of Y coding was based on the same principle used by structural modelling in machine learning techniques. Two real metabolomics datasets were used as examples to illustrate how the new Y coding can improve the interpretability of the PLS model compared to classic regression/classification coding.
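For readers unfamiliar with how a PLS regression model is fitted, the following is a minimal pure-Python sketch of the single-response case (often called PLS1), using successive deflation of the centred data. It illustrates plain PLS-R only and does not implement the hybrid Y coding proposed in the abstract above; the function names and the toy data are assumptions made for illustration.

```python
def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model by successive deflation.

    X is a list of rows (samples x variables); y is a list of responses.
    Returns the column means of X, the mean of y, and one
    (weights, loadings, inner coefficient) tuple per extracted component.
    """
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centred X
    f = [v - y_mean for v in y]                                # centred y
    components = []
    for _ in range(n_components):
        # Weight vector: direction in X space with maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = sum(v * v for v in w) ** 0.5
        if norm < 1e-12:  # nothing left to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]
        q = sum(t[i] * f[i] for i in range(n)) / tt
        # Deflate X and y by the part explained by this component.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict the response for one new sample x (a list of variables)."""
    x_mean, y_mean, components = model
    e = [x[j] - x_mean[j] for j in range(len(x))]
    y_hat = y_mean
    for w, p, q in components:
        t = sum(e[j] * w[j] for j in range(len(e)))
        y_hat += q * t
        e = [e[j] - t * p[j] for j in range(len(e))]
    return y_hat


# Toy data following y = 2*x1 - x2 + 1 exactly, for illustration only.
X_toy = [[1, 2], [2, 1], [3, 5], [4, 3], [5, 8]]
y_toy = [1, 4, 2, 6, 3]
model = pls1_fit(X_toy, y_toy, n_components=2)
prediction = pls1_predict(model, [6, 4])
```

With as many components as the rank of the centred X, PLS1 coincides with ordinary least squares, which is why the toy example can recover the exact linear relation; in real applications the number of components is normally chosen by cross-validation.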
Kang, Bo-Sik; Lee, Jang-Eun; Park, Hyun-Jin
2014-06-01
Using a Korean rice wine (makgeolli) model, we sought to develop a prediction model that captures the quantitative relationship between the initial amino acids in makgeolli mash and the major aromatic compounds, such as fusel alcohols, their acetate esters, and ethyl esters of fatty acids, in the brewed makgeolli. A mass-spectrometry-based electronic nose (MS-EN) was used to qualitatively discriminate between makgeollis made from mashes with different amino acid compositions. Following this measurement, headspace solid-phase microextraction coupled to gas chromatography-mass spectrometry (GC-MS), combined with the partial least-squares regression (PLSR) method, was employed to quantitatively correlate the amino acid composition of makgeolli mash with the major aromatic compounds evolved during makgeolli fermentation. In qualitative prediction with MS-EN analysis, the makgeollis were well discriminated according to the volatile compounds derived from the amino acids of the mash. Twenty-seven ion fragments with mass-to-charge ratio (m/z) of 55 to 98 amu were responsible for the discrimination. In GC-MS combined with PLSR, the quantitative relationship between the initial amino acids of makgeolli mash and the fusel compounds of makgeolli showed a good correlation for most of the fusel compounds, with coefficients of determination (R²) ranging from 0.77 to 0.94, except for 2-phenylethanol (R² = 0.21), whereas the correlation was poor for ethyl esters of MCFAs, including ethyl caproate, ethyl caprylate, and ethyl caprate (R² = 0.17 to 0.40). Amino acids are known to affect the aroma of alcoholic beverages.
In this study, we demonstrated that an electronic nose rapidly and reproducibly differentiated Korean rice wines (makgeollis), qualitatively, by the volatile compounds evolved from amino acids and, subsequently, that a quantitative correlation with acceptable R² between amino acids and fusel compounds could be established via HS-SPME GC-MS combined with partial least-squares regression. Our approach for predicting the quantities of volatile compounds in the finished product from the initial conditions of fermentation will give food researchers insight into how to modify and optimize the qualities of the corresponding products. © 2014 Institute of Food Technologists®
Liu, Jinbao; Han, Jichang; Zhang, Yang; Wang, Huanyuan; Kong, Hui; Shi, Lei
2018-06-05
The storage of soil organic carbon (SOC) should improve soil fertility. Conventional determination of SOC is expensive and tedious. Visible-near infrared reflectance spectroscopy is a practical and cost-effective approach that has been successfully used to estimate SOC concentration, and a soil spectral inversion model can determine SOC content quickly and efficiently. This paper presents a study of SOC estimation through the combination of soil spectroscopy with stepwise multiple linear regression (SMLR), partial least squares regression (PLSR), and principal component regression (PCR). Spectral measurements for 106 soil samples were acquired using an ASD FieldSpec 4 standard-res spectroradiometer (350-2500 nm). Six types of transformations and three regression methods were applied to build models for quantifying SOC in soils developed from different parent materials. The results show that (1) for soil developed from basaltic volcanic clastics, the SOC spectral response bands are located at 500 nm and 800 nm; for soil developed from trachytic volcanic clastics, they are located at 405 nm, 465 nm, 575 nm, and 1105 nm. (2) For soil developed from basaltic volcanic clastics, the maximum correlation coefficient, obtained with the first derivative, is 0.8898; for soil developed from trachytic volcanic clastics, the maximum correlation coefficient, obtained with the first derivative of the logarithm of reciprocal reflectance, is 0.9029. (3) For soil developed from basaltic volcanic clastics, the optimal prediction model for soil organic matter content is SMLR on the first derivative of the logarithm of reciprocal reflectance, with 7 independent variables, Rv² = 0.9720, RMSEP = 2.0590, sig = 0.003. For soil developed from trachytic volcanic clastics, the optimal prediction model for soil organic matter content is PLSR on the first derivative of the logarithm of reciprocal reflectance.
This model has Pc = 5 independent variables, Rc = 0.9872, Rc² = 0.9745, RMSEC = 0.4821, SEC = 0.4906, a prediction coefficient of determination Rv² = 0.9702, RMSEP = 0.9563, SEP = 0.9711, and Bias = 0.0637. Copyright © 2018 Elsevier B.V. All rights reserved.
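The validation statistics quoted above (RMSEP, SEP, Bias) are related quantities computed from prediction residuals. The sketch below uses the definitions most common in chemometrics: bias as the mean residual, RMSEP as the root mean squared residual, and SEP as the bias-corrected standard deviation of the residuals with an n - 1 denominator. Whether the study used exactly these definitions is an assumption, and the example values are hypothetical.

```python
import math


def prediction_errors(observed, predicted):
    """Return (bias, rmsep, sep) for paired observed and predicted values.

    bias  = mean of (predicted - observed)
    rmsep = sqrt(mean of squared residuals)
    sep   = sqrt(sum of squared (residual - bias) / (n - 1))
    """
    if len(observed) != len(predicted) or len(observed) < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    n = len(observed)
    residuals = [p - o for o, p in zip(observed, predicted)]
    bias = sum(residuals) / n
    rmsep = math.sqrt(sum(r * r for r in residuals) / n)
    sep = math.sqrt(sum((r - bias) ** 2 for r in residuals) / (n - 1))
    return bias, rmsep, sep


# Hypothetical reference and predicted values, for illustration only.
bias, rmsep, sep = prediction_errors([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5])
```

Under these definitions a model that is off by a constant amount has a non-zero bias and RMSEP but an SEP of zero, which is why the three figures are usually reported together.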
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yang, Xi; Tang, Jianwu; Mustard, John F.; ...
2016-04-02
Understanding the temporal patterns of leaf traits is critical in determining the seasonality and magnitude of terrestrial carbon, water, and energy fluxes. However, we lack robust and efficient ways to monitor the temporal dynamics of leaf traits. Here we assessed the potential of leaf spectroscopy to predict and monitor leaf traits across their entire life cycle at different forest sites and light environments (sunlit vs. shaded) using a weekly sampled dataset across the entire growing season at two temperate deciduous forests. In addition, the dataset includes field measured leaf-level directional-hemispherical reflectance/transmittance together with seven important leaf traits [total chlorophyll (chlorophyll a and b), carotenoids, mass-based nitrogen concentration (N mass), mass-based carbon concentration (C mass), and leaf mass per area (LMA)]. All leaf traits varied significantly throughout the growing season, and displayed trait-specific temporal patterns. We used a Partial Least Square Regression (PLSR) modeling approach to estimate leaf traits from spectra, and found that PLSR was able to capture the variability across time, sites, and light environments of all leaf traits investigated (R² = 0.6–0.8 for temporal variability; R² = 0.3–0.7 for cross-site variability; R² = 0.4–0.8 for variability from light environments). We also tested alternative field sampling designs and found that for most leaf traits, biweekly leaf sampling throughout the growing season enabled accurate characterization of the seasonal patterns. Compared with the estimation of foliar pigments, the performance of N mass, C mass and LMA PLSR models improved more significantly with sampling frequency. Our results demonstrate that leaf spectra-trait relationships vary with time, and thus tracking the seasonality of leaf traits requires statistical models calibrated with data sampled throughout the growing season.
In conclusion, our results have broad implications for future research that uses vegetation spectra to infer leaf traits at different growing stages.
Multisensor on-the-go mapping of readily dispersible clay, particle size and soil organic matter
NASA Astrophysics Data System (ADS)
Debaene, Guillaume; Niedźwiecki, Jacek; Papierowska, Ewa
2016-04-01
Particle size fractions strongly affect the physical and chemical properties of soil. Readily dispersible clay (RDC) is the part of the clay fraction in soils that is easily or potentially dispersible in water when small amounts of mechanical energy are applied to soil. The amount of RDC in the soil is of significant importance for agriculture and the environment because clay dispersion is a cause of poor soil stability in water, which in turn contributes to soil erodibility, mud flows, and cementation. To obtain a detailed map of soil texture, many samples are needed. Moreover, RDC determination is time consuming. The use of a mobile visible and near-infrared (VIS-NIR) platform is proposed here to map those soil properties and obtain the first detailed map of RDC at field level. Soil property prediction was based on a calibration model developed with 10 representative samples selected by a fuzzy logic algorithm. Calibration samples were analysed for soil texture (clay, silt and sand), RDC and soil organic carbon (SOC) using conventional wet chemistry analysis. The Veris mobile sensor platform also collects electrical conductivity (EC) data (deep and shallow) and soil temperature. These auxiliary data were combined with VIS-NIR measurements (data fusion) to improve prediction results. EC maps were also produced to help in understanding the RDC data. The resulting maps were visually compared with an orthophotograph of the field taken at the beginning of the plant growing season. Models were developed with partial least square regression (PLSR) and support vector machine regression (SVMR). There were no significant differences between calibration using PLSR or SVMR. Nevertheless, the best models were obtained with PLSR, standard normal variate (SNV) pretreatment, and fusion with deep EC data (e.g. for RDC and clay content: RMSECV = 0.35% and R2 = 0.71; RMSECV = 0.32% and R2 = 0.73 respectively).
The best models were used to predict soil properties from the field spectra collected with the VIS-NIR platform. Maps of soil properties were generated using natural neighbour (NN) interpolation. Calibration results were satisfactory for all soil properties and allowed for the generation of detailed maps. The spatial variability of RDC was in accordance with the field orthophotograph. Areas of high RDC content corresponded to areas of poor plant development. Soil texture has been correctly predicted by VIS-NIR spectroscopy (laboratory or on-the-go) before. However, readily dispersible clay (an important parameter for soil stability) has never been investigated before. This study introduces the possibility of using VIS-NIR for predicting readily dispersible clay at field level. The results obtained could be used in preventing soil erosion. Acknowledgement: This research was financed by a National Science Centre grant (NCN - Poland) with decision number UMO-2012/07/B/ST10/04387
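The standard normal variate (SNV) pretreatment mentioned in this abstract is usually applied to each spectrum separately: the spectrum's own mean is subtracted and the result is divided by the spectrum's own standard deviation. A minimal Python sketch follows; the use of the sample standard deviation (n - 1 denominator) and the example spectrum are assumptions, since the abstract gives no implementation details.

```python
import math


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own statistics."""
    n = len(spectrum)
    if n < 2:
        raise ValueError("spectrum needs at least two points")
    mean = sum(spectrum) / n
    # Sample standard deviation (n - 1 denominator), an assumed convention.
    std = math.sqrt(sum((v - mean) ** 2 for v in spectrum) / (n - 1))
    if std == 0:
        raise ValueError("constant spectrum cannot be scaled")
    return [(v - mean) / std for v in spectrum]


# Hypothetical reflectance values for one spectrum, for illustration only.
corrected = snv([0.10, 0.20, 0.30])
```

Because every spectrum is scaled independently, SNV reduces multiplicative and offset differences between spectra before a regression such as PLSR is fitted.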
NASA Astrophysics Data System (ADS)
Anggraeni, Anni; Arianto, Fernando; Mutalib, Abdul; Pratomo, Uji; Bahti, Husein H.
2017-05-01
Rare earth elements (REE) have many uses, for example in metallurgy, optical devices, and the manufacture of electronic devices. REE occur in minerals, in which the individual elements have similar properties. Currently, REE content is determined with instruments such as ICP-OES, ICP-MS, XRF, and HPLC, but each of these instruments has some weaknesses. An alternative analytical method for determining rare earth metal content is therefore needed; one option is a combination of UV-Visible spectrophotometry and multivariate analysis, including Principal Component Analysis (PCA), Principal Component Regression (PCR), and Partial Least Square Regression (PLSR). The purpose of this experiment is to determine the content of light and medium rare earth elements in the mineral monazite without chemical separation by using a combination of multivariate analysis and UV-Visible spectrophotometric methods. A training set of 22 concentration variations was prepared and its absorbance was measured using a UV-Vis spectrophotometer; the data were then processed by PCA, PCR, and PLSR. The results were compared and validated to obtain the mathematical equation with the smallest percent error. After validation, the equation obtained with the PLS method performed better than that obtained with PCR, with RMSE values for La, Ce, Pr, Nd, Gd, Sm, Eu, and Tb of 0.095, 0.573, 0.538, 0.440, 3.387, 1.240, 1.870, and 0.639, respectively.
Multispectral Imaging for Determination of Astaxanthin Concentration in Salmonids
Dissing, Bjørn S.; Nielsen, Michael E.; Ersbøll, Bjarne K.; Frosch, Stina
2011-01-01
Multispectral imaging has been evaluated for characterization of the concentration of a specific carotenoid pigment, astaxanthin. 59 fillets of rainbow trout, Oncorhynchus mykiss, were imaged using a rapid multispectral imaging device for quantitative analysis. The multispectral imaging device captured reflection properties in 19 distinct wavelength bands, prior to determination of the true concentration of astaxanthin. The samples ranged from 0.20 to 4.34 μg per g fish. A PLSR model was calibrated to predict astaxanthin concentration from novel images, and showed good results with a RMSEP of 0.27. For comparison, a similar model was built for normal color images, which yielded a RMSEP of 0.45. The acquisition speed of the multispectral imaging system and the accuracy of the PLSR model obtained suggest this method as a promising technique for rapid in-line estimation of astaxanthin concentration in rainbow trout fillets. PMID:21573000
Ayvaz, Huseyin; Rodriguez-Saona, Luis E
2015-05-01
The most common methods for acrylamide analysis in foods require the use of LC-MS/MS and GC-MS. Although these methods have great analytical performance, they need intensive sample preparation, highly specialised instrumentation, and are time consuming. In this study, portable and handheld infrared spectrometers were evaluated as rapid methods for screening acrylamide in potato chips and their performances were compared to those of benchtop infrared systems. The acrylamide content of 64 commercial potato chips (169-2453 μg/kg) was determined by LC-MS/MS. Spectral data were collected using mid-infrared (MIR) and near-infrared (NIR) spectrometers. Partial least squares regression (PLSR) calibration models were developed to predict acrylamide levels. Overall, good linear correlation was found between the predicted acrylamide levels and actual measured acrylamide concentrations by LC-MS/MS (rPred > 0.90 and SEP < 100 μg/kg). Our results indicate that portable and handheld spectrometers can be used as simple and rapid alternatives for acrylamide analysis in potato chips. Copyright © 2014 Elsevier Ltd. All rights reserved.
Prediction of warmed-over flavour development in cooked chicken by colorimetric sensor array.
Kim, Su-Yeon; Li, Jinglei; Lim, Na-Ri; Kang, Bo-Sik; Park, Hyun-Jin
2016-11-15
The aim of this study was to develop a simple and rapid method based on a colorimetric sensor array (CSA) for evaluation of warmed-over flavour (WOF) in cooked chicken. All samples were classified according to storage time by CSA coupled with principal component analysis (PCA) or hierarchical cluster analysis (HCA). The CSA data were used to establish prediction models for thiobarbituric acid reactive substances (TBARS), pentanal, hexanal, or heptanal associated with WOF by partial least square regression (PLSR). For the TBARS model, the coefficient of determination (rp²) was 0.9997 in the prediction range of 0.28-0.69 mg/kg. In the models for pentanal, hexanal, and heptanal, all rp² values were higher than 0.960 in the ranges of 0.58-2.10 mg/kg, 5.50-11.69 mg/kg, and 0.09-0.16 mg/kg, respectively. These results demonstrate that the CSA was able to predict WOF development and to distinguish between storage times. Copyright © 2016 Elsevier Ltd. All rights reserved.
New type of dry substances content meter using microwaves for application in biogas plants.
Nacke, Thomas; Brückner, Kathleen; Göller, Arndt; Kaufhold, Sebastian; Nakos, Xenia; Noack, Stephan; Stöber, Heinrich; Beckmann, Dieter
2005-11-01
Dry substances (DS) are an important index for monitoring and controlling anaerobic co-digestion in biogas plants. We have developed and tested an online meter that measures suspended solids by means of the reflection coefficient of an exiting microwave signal, which is dependent on the dielectric properties of the suspensions. Intelligent models based on partial least squares regression (PLSR) and artificial neural network (ANN) for calibration allow exact and reproducible measurements under different circumstances. This measuring method is appropriate for contactless and online measurements of dry substance contents in biogas plants in a large range from 2-14%.
Lin, Shunshun; Zhang, Xiaoming; Song, Shiqing; Hayat, Khizar; Eric, Karangwa; Majeed, Hamid
2016-03-01
Based on encouraged development of potential reduced-exposure products (PREPs) by the US Institute of Medicine, casings (glucose and peptides) added treatments (CAT) and enzymatic (protease and xylanase) hydrolysis treatments (EHT) were developed to study their effect on alkaloids reduction in tobacco and cigarette mainstream smoke (MS) and further investigate the correlation between sensory attributes and alkaloids. Results showed that the developed treatments reduced nicotine by 14.5% and 24.4% in tobacco and cigarette MS, respectively, indicating that both CAT and EHT are potentially effective for developing lower-risk cigarettes. Sensory and electronic nose analysis confirmed the significant influence of treatments on sensory and cigarette MS components. PLSR analysis demonstrated that tobacco alkaloids were positively correlated to the off-taste, irritation and impact attributes, and negatively correlated to the aroma and softness attributes. Additionally, nicotine and anabasine from tobacco leaves positively contributed to the impact attribute, while they negatively contributed to the aroma attribute (P<0.05). Meanwhile, most alkaloids in cigarette MS positively contributed to the impact and irritation attributes (P<0.05). Hence, this study paved a way to better understand the correlation between tobacco alkaloids and sensory attributes. Copyright © 2015 Elsevier Inc. All rights reserved.
NASA Astrophysics Data System (ADS)
Kim, Dae-Yong; Cho, Byoung-Kwan
2015-11-01
The quality parameters of the Korean traditional rice wine "Makgeolli" were monitored using Fourier transform near-infrared (FT-NIR) spectroscopy with multivariate statistical analysis (MSA) during fermentation. Alcohol, reducing sugar, and titratable acid were the parameters assessed to determine the quality index of fermentation substrates and products. The acquired spectra were analyzed with partial least squares regression (PLSR). The best prediction model for alcohol was obtained with maximum normalization, showing a coefficient of determination (Rp²) of 0.973 and a standard error of prediction (SEP) of 0.760%. In addition, the best prediction model for reducing sugar was obtained with no data preprocessing, with an Rp² value of 0.945 and a SEP of 1.233%. The prediction of titratable acidity was best with mean normalization, showing an Rp² value of 0.882 and a SEP of 0.045%. These results demonstrate that FT-NIR spectroscopy can be used for rapid measurements of quality parameters during Makgeolli fermentation.
Chen, Tao; Chang, Qingrui; Clevers, J G P W; Kooistra, L
2015-11-01
Soil heavy metal pollution due to long-term sewage irrigation is a serious environmental problem in many irrigation areas in northern China. Quickly identifying its pollution status is an important basis for remediation. Visible-near-infrared reflectance spectroscopy (VNIRS) provides a useful tool. In a case study, 76 soil samples were collected and their reflectance spectra were used to estimate cadmium (Cd) concentration by partial least squares regression (PLSR) and back propagation neural network (BPNN). To reduce noise, six pre-treatments were compared, in which orthogonal signal correction (OSC) was first used in soil Cd estimation. Spectral analysis and geostatistics were combined to identify Cd pollution hotspots. Results showed that Cd was accumulated in topsoil at the study area. OSC can effectively remove irrelevant information to improve prediction accuracy. More accurate estimation was achieved by applying a BPNN. Soil Cd pollution hotspots could be identified by interpolating the predicted values obtained from spectral estimates. Copyright © 2015 Elsevier Ltd. All rights reserved.
Investigating the Moisture Content of Polyamide 6 by Raman-Microscopy and Multivariate Data Analysis
NASA Astrophysics Data System (ADS)
Lechner, Tobias; Noack, Kristina; Thöne, Manuel; Amend, Philipp; Schmidt, Michael; Will, Stefan
Thermal malleability of thermoplastics results in a high product diversity in various industry sectors. However, industrial applications require a constant and high component quality. Hence, material processing such as laser welding has to consider that, e.g., the moisture content of thermoplastics influences mechanical properties such as the tensile strength. Moreover, water evaporates during laser welding and can form pores and defects. Thus, there is a large need for non-invasive material inspection before processing. To that end, we developed a methodology based on Raman-microscopy and multivariate data analysis (MVD) to determine the moisture content of polyamide (MCP). Further, the impact of the MCP on the mechanical properties was verified. For samples with a defined variation of the MCP, xyz-Raman-scans were carried out and analysed using MVD. For reference purposes, the samples were weighed and tensile tests were performed. An evaluation by means of partial least squares regression analysis (PLSR) resulted in a prediction of the MCP with a correlation coefficient >98%. Consequently, Raman-microscopy shows large potential for developing new techniques for inspection and quality control of plastics before processing. Dedicated to Professor Alfred Leipertz on the occasion of his 70th birthday.
Tian, Huaixiang; Li, Fenghua; Qin, Lan; Yu, Haiyan; Ma, Xia
2014-11-01
This study examines the feasibility of an electronic nose as a method to discriminate chicken and beef seasonings and to predict sensory attributes. Sensory evaluation showed that 8 chicken seasonings and 4 beef seasonings could be well discriminated and classified based on 8 sensory attributes. The sensory attributes, including chicken/beef, gamey, garlic, spicy, onion, soy sauce, retention, and overall aroma intensity, were generated by a trained evaluation panel. Principal component analysis (PCA), discriminant factor analysis (DFA), and cluster analysis (CA) combined with the electronic nose were used to discriminate seasoning samples based on differences in the sensor response signals of chicken and beef seasonings. The correlation between sensory attributes and electronic nose sensor signals was established using the partial least squares regression (PLSR) method. The results showed that the seasoning samples were all correctly classified by the electronic nose combined with PCA, DFA, and CA. The electronic nose gave good prediction results for all the sensory attributes, with correlation coefficients (r) higher than 0.8. The work indicated that the electronic nose is an effective method for discriminating different seasonings and predicting sensory attributes. © 2014 Institute of Food Technologists®
Nie, Pengcheng; Wu, Di; Sun, Da-Wen; Cao, Fang; Bao, Yidan; He, Yong
2013-01-01
Notoginseng is a classical traditional Chinese medical herb, which is of high economic and medical value. Notoginseng powder (NP) could be easily adulterated with Sophora flavescens powder (SFP) or corn flour (CF), because of their similar tastes and appearances and much lower cost for these adulterants. The objective of this study is to quantify the NP content in adulterated NP by using a rapid and non-destructive visible and near infrared (Vis-NIR) spectroscopy method. Three wavelength ranges of visible spectra, short-wave near infrared spectra (SNIR) and long-wave near infrared spectra (LNIR) were separately used to establish the model based on two calibration methods of partial least square regression (PLSR) and least-squares support vector machines (LS-SVM), respectively. Competitive adaptive reweighted sampling (CARS) was conducted to identify the most important wavelengths/variables that had the greatest influence on the adulterant quantification throughout the whole wavelength range. The CARS-PLSR models based on LNIR were determined as the best models for the quantification of NP adulterated with SFP, CF, and their mixtures, in which the rP values were 0.940, 0.939, and 0.867 for the three models respectively. The research demonstrated the potential of the Vis-NIR spectroscopy technique for the rapid and non-destructive quantification of NP containing adulterants. PMID:24129019
Janik, Leslie J; Forrester, Sean T; Soriano-Disla, José M; Kirby, Jason K; McLaughlin, Michael J; Reimann, Clemens
2015-02-01
The authors' aim was to develop rapid and inexpensive regression models for the prediction of partitioning coefficients (Kd), defined as the ratio of the total or surface-bound metal/metalloid concentration of the solid phase to the total concentration in the solution phase. Values of Kd were measured for boric acid (B[OH]3(0)) and selected added soluble oxoanions: molybdate (MoO4(2-)), antimonate (Sb[OH](6-)), selenate (SeO4(2-)), tellurate (TeO4(2-)) and vanadate (VO4(3-)). Models were developed using approximately 500 spectrally representative soils of the Geochemical Mapping of Agricultural Soils of Europe (GEMAS) program. These calibration soils represented the major properties of the entire 4813 soils of the GEMAS project. Multiple linear regression (MLR) from soil properties, partial least-squares regression (PLSR) using mid-infrared diffuse reflectance Fourier-transformed (DRIFT) spectra, and models using DRIFT spectra plus analytical pH values (DRIFT + pH), were compared with predicted log K(d + 1) values. Apart from selenate (R(2) = 0.43), the DRIFT + pH calibrations resulted in marginally better models to predict log K(d + 1) values (R(2) = 0.62-0.79), compared with those from PLSR-DRIFT (R(2) = 0.61-0.72) and MLR (R(2) = 0.54-0.79). The DRIFT + pH calibrations were applied to the prediction of log K(d + 1) values in the remaining 4313 soils. An example map of predicted log K(d + 1) values for added soluble MoO4(2-) in soils across Europe is presented. The DRIFT + pH PLSR models provided a rapid and inexpensive tool to assess the risk of mobility and potential availability of boric acid and selected oxoanions in European soils. For these models to be used in the prediction of log K(d + 1) values in soils globally, additional research will be needed to determine whether soil variability is accounted for in the calibration. © 2014 SETAC.
NASA Astrophysics Data System (ADS)
Oropeza, D.
2016-12-01
A highly innovative laser ablation sampling instrument (J200 Tandem LA - LIBS) that combines the capabilities and analytical benefits of LIBS, LA-ICP-MS and LA-ICP-OES was used for micrometer-scale, spatially resolved elemental analysis of a wide variety of samples of geological interest. Data were collected using ablation systems consisting of a nanosecond laser (Nd:YAG operated at 266 nm) and femtosecond lasers (1030 and 343 nm). An ICCD LIBS detector and a quadrupole-based mass spectrometer were selected for LIBS and ICP-MS detection, respectively. This tandem instrument allows simultaneous determination of major and minor elements (for example, Si, Ca, Na, and Al) and trace elements (such as Li, Ce, Cr, Sr, Y, Zn, and Zr, among others). The research also focused on elemental mapping and calibration strategies, specifically the use of emission and mass spectra for multivariate data analysis. Partial Least Squares Regression (PLSR) is shown to minimize and compensate for matrix effects in the emission and mass spectra, improving quantitative analysis by LIBS and LA-ICP-MS, respectively. The study provides a benchmark to evaluate analytical results for more complex geological sample matrices.
Lipiäinen, Tiina; Fraser-Miller, Sara J; Gordon, Keith C; Strachan, Clare J
2018-02-05
This study considers the potential of low-frequency (terahertz) Raman spectroscopy in the quantitative analysis of ternary mixtures of solid-state forms. Direct comparison between low-frequency and mid-frequency spectral regions for quantitative analysis of crystal form mixtures, without confounding sampling and instrumental variations, is reported for the first time. Piroxicam was used as a model drug, and the low-frequency spectra of piroxicam forms β, α2 and monohydrate are presented for the first time. These forms show clear spectral differences in both the low- and mid-frequency regions. Both spectral regions provided quantitative models suitable for predicting the mixture compositions using partial least squares regression (PLSR), but the low-frequency data gave better models, based on lower errors of prediction (2.7, 3.1 and 3.2% root-mean-square errors of prediction [RMSEP] values for the β, α2 and monohydrate forms, respectively) than the mid-frequency data (6.3, 5.4 and 4.8%, for the β, α2 and monohydrate forms, respectively). The better performance of low-frequency Raman analysis was attributed to larger spectral differences between the solid-state forms, combined with a higher signal-to-noise ratio. Copyright © 2017 Elsevier B.V. All rights reserved.
Park, Hyunjin; Yang, Jin-ju; Seo, Jongbum; Choi, Yu-yong; Lee, Kun-ho; Lee, Jong-min
2014-04-01
Cortical features derived from magnetic resonance imaging (MRI) provide important information to account for human intelligence. Cortical thickness, surface area, sulcal depth, and mean curvature were considered to explain human intelligence. One region of interest (ROI) of a cortical structure consisting of thousands of vertices contained thousands of measurements, and typically, one mean value (first order moment), was used to represent a chosen ROI, which led to a potentially significant loss of information. We proposed a technological improvement to account for human intelligence in which a second moment (variance) in addition to the mean value was adopted to represent a chosen ROI, so that the loss of information would be less severe. Two computed moments for the chosen ROIs were analyzed with partial least squares regression (PLSR). Cortical features for 78 adults were measured and analyzed in conjunction with the full-scale intelligence quotient (FSIQ). Our results showed that 45% of the variance of the FSIQ could be explained using the combination of four cortical features using two moments per chosen ROI. Our results showed improvement over using a mean value for each ROI, which explained 37% of the variance of FSIQ using the same set of cortical measurements. Our results suggest that using additional second order moments is potentially better than using mean values of chosen ROIs for regression analysis to account for human intelligence. Copyright © 2014 Elsevier Ltd. All rights reserved.
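The two-moment ROI representation described in this abstract can be sketched in a few lines. This is only an illustration under stated assumptions, not the authors' pipeline: the function name `roi_moment_features` and the toy thickness values are hypothetical, and the population variance is assumed because the abstract does not say which variance estimator was used.

```python
from statistics import fmean, pvariance

def roi_moment_features(roi_values):
    """Return [mean_1, var_1, mean_2, var_2, ...] for a list of ROIs.

    roi_values: list of lists, one inner list of per-vertex measurements per ROI.
    Assumption: population variance (divide by N); the source does not specify.
    """
    features = []
    for values in roi_values:
        features.append(fmean(values))      # first-order moment (mean)
        features.append(pvariance(values))  # second-order central moment (variance)
    return features

# Hypothetical per-vertex cortical thickness values (mm) for two ROIs.
rois = [[2.0, 2.5, 3.0], [1.0, 1.0, 1.0, 1.0]]
features = roi_moment_features(rois)
```

The resulting feature vector has two entries per ROI and could then be passed to any regression method, such as the PLSR mentioned in the abstract.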
Wang, Meng; Ellsworth, Patrick Z; Zhou, Jianfeng; Cousins, Asaph B; Sankaran, Sindhuja
2016-05-15
Water limitations decrease stomatal conductance (g(s)) and, in turn, photosynthetic rate (A(net)), resulting in decreased crop productivity. The current techniques for evaluating these physiological responses are limited to leaf-level measures acquired by measuring leaf-level gas exchange. In this regard, proximal sensing techniques can be a useful tool in studying plant biology as they can be used to acquire plant-level measures in a high-throughput manner. However, to confidently utilize the proximal sensing technique for high-throughput physiological monitoring, it is important to assess the relationship between plant physiological parameters and the sensor data. Therefore, in this study, the application of rapid sensing techniques based on thermal imaging and visual-near infrared spectroscopy for assessing water-use efficiency (WUE) in foxtail millet (Setaria italica (L.) P. Beauv) was evaluated. The visible-near infrared spectral reflectance (350-2500 nm) and thermal (7.5-14 µm) data were collected at regular intervals from well-watered and drought-stressed plants in combination with other leaf physiological parameters (transpiration rate-E, A(net), g(s), leaf carbon isotopic signature-δ(13)C(leaf), WUE). Partial least squares regression (PLSR) analysis was used to predict leaf physiological measures based on the spectral data. The PLSR modeling on the hyperspectral data yielded accurate and precise estimates of leaf E, gs, δ(13)C(leaf), and WUE with coefficient of determination in a range of 0.85-0.91. Additionally, significant differences in average leaf temperatures (~1°C) measured with a thermal camera were observed between well-watered plants and drought-stressed plants. In summary, the visible-near infrared reflectance data, and thermal images can be used as a potential rapid technique for evaluating plant physiological responses such as WUE. Copyright © 2016 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Liu, Ronghua; Sun, Qiaofeng; Hu, Tian; Li, Lian; Nie, Lei; Wang, Jiayue; Zhou, Wanhui; Zang, Hengchang
2018-03-01
As a powerful process analytical technology (PAT) tool, near infrared (NIR) spectroscopy has been widely used in real-time monitoring. In this study, NIR spectroscopy was applied to monitor multiple parameters of the traditional Chinese medicine (TCM) Shenzhiling oral liquid during the concentration process to guarantee the quality of products. Five lab-scale batches were employed to construct quantitative models to determine five chemical ingredients and one physical property (sample density) during the concentration process. Paeoniflorin, albiflorin, liquiritin and sample density were modeled by partial least squares regression (PLSR), while the contents of glycyrrhizic acid and cinnamic acid were modeled by support vector machine regression (SVMR). Standard normal variate (SNV) and/or Savitzky-Golay (SG) smoothing with derivative methods were adopted for spectra pretreatment. Variable selection methods including correlation coefficient (CC), competitive adaptive reweighted sampling (CARS) and interval partial least squares regression (iPLS) were performed for optimizing the models. The results indicated that NIR spectroscopy was an effective tool for monitoring the concentration process of Shenzhiling oral liquid.
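As an illustration of the standard normal variate (SNV) pretreatment named in this abstract, the sketch below centres and scales each spectrum by its own mean and standard deviation. The function name `snv`, the toy spectra, and the use of the sample standard deviation (`ddof=1`) are assumptions; the abstract does not give implementation details.

```python
import numpy as np

def snv(spectra):
    """Standard normal variate: centre and scale each spectrum (row) individually.

    Assumption: sample standard deviation (ddof=1) is used for scaling.
    """
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, ddof=1, keepdims=True)
    return (spectra - mean) / std

# Two toy "spectra" that differ only by a multiplicative scaling factor.
raw = np.array([[1.0, 2.0, 3.0, 4.0],
                [10.0, 20.0, 30.0, 40.0]])
corrected = snv(raw)
```

Because SNV removes per-spectrum offset and scale, the two toy rows become identical after correction, which is the kind of multiplicative effect this pretreatment is commonly used to suppress.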
Habibi Najafi, Mohammad B; Pourfarzad, Amir; Zahedi, Hoda; Ahmadian-Kouchaksaraie, Zahra; Haddad Khodaparast, Mohammad H
2016-01-01
The aim of this work was to study the effects of a novel sourdough system prepared from wheat flour supplemented with a combination of pulverized date seed, Lactobacillus plantarum, and/or Lactobacillus brevis as well as Saccharomyces cerevisiae on the sourdough characteristics and on the quality, sensory, texture, shelf life and image properties of Barbari flat bread. The highest sourdough acidity and bread specific volume were obtained with the co-culture of Lb. plantarum + Lb. brevis + S. cerevisiae. The results suggest that fermentation is a potential bioprocessing technology for improving sensory aspects of bread supplemented with pulverized date seed as a dietary fiber resource. Texture analysis of bread samples during 7 days of storage indicated that the presence of pulverized date seed in sourdough was able to diminish bread staling. The interaction of baker's yeast and lactic acid bacteria (LAB) increased the average particle size of the bread crumb and decreased the area fraction relative to the LAB samples. It was observed that all treatments of sourdough Barbari breads had higher cell wall thickness than the control Barbari bread. The Avrami non-linear regression equation was chosen as a useful mathematical model to study bread hardening kinetics. In addition, principal component analysis (PCA) allowed discrimination among sourdough and bread specialties. Partial least squares regression (PLSR) models were applied to determine the relationships between sensory and instrumental data.
Kamruzzaman, Mohammed; Sun, Da-Wen; ElMasry, Gamal; Allen, Paul
2013-01-15
Many studies have been carried out in developing non-destructive technologies for predicting meat adulteration, but no attempt has yet been made at non-destructive detection and quantification of adulteration in minced lamb meat. The main goal of this study was to develop and optimize a rapid analytical technique based on near-infrared (NIR) hyperspectral imaging to detect the level of adulteration in minced lamb. An initial investigation was carried out using principal component analysis (PCA) to identify the most likely adulterant in minced lamb. Minced lamb meat samples were then adulterated with minced pork in the range 2-40% (w/w) at approximately 2% increments. Spectral data were used to develop a partial least squares regression (PLSR) model to predict the level of adulteration in minced lamb. A good prediction model was obtained using the whole spectral range (910-1700 nm), with a coefficient of determination (R(2)(cv)) of 0.99 and a root-mean-square error estimated by cross validation (RMSECV) of 1.37%. Four important wavelengths (940, 1067, 1144 and 1217 nm) were selected using weighted regression coefficients (Bw), and a multiple linear regression (MLR) model was then established using these wavelengths to predict adulteration. The MLR model resulted in a coefficient of determination (R(2)(cv)) of 0.98 and an RMSECV of 1.45%. The developed MLR model was then applied to each pixel in the image to obtain prediction maps that visualize the distribution of adulteration in the tested samples. The results demonstrated that laborious and time-consuming traditional analytical techniques could be replaced by spectral data in order to provide a rapid, low-cost and non-destructive testing technique for adulterant detection in minced lamb meat. Copyright © 2012 Elsevier B.V. All rights reserved.
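The step of fitting an MLR model on a few selected wavelengths and applying it to every pixel can be sketched as follows. This is a minimal example on synthetic data, not the authors' implementation: the helper names `fit_mlr` and `predict_map`, the coefficient values, and the random reflectance cube are all hypothetical.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares with intercept; returns [b0, b1, ..., bk]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_map(cube, coef):
    """Apply the fitted model to every pixel of a (rows, cols, bands) cube."""
    return coef[0] + cube @ coef[1:]

rng = np.random.default_rng(0)

# Hypothetical "true" model: intercept plus one weight per selected band.
true_coef = np.array([5.0, 40.0, -25.0, 10.0, 15.0])

# Synthetic mean reflectance of 30 calibration samples at 4 selected bands.
X_cal = rng.uniform(0.1, 0.9, size=(30, 4))
y_cal = true_coef[0] + X_cal @ true_coef[1:]  # noise-free adulteration level (%)

coef = fit_mlr(X_cal, y_cal)

# Synthetic hyperspectral image reduced to the same 4 bands.
cube = rng.uniform(0.1, 0.9, size=(8, 6, 4))
adulteration_map = predict_map(cube, coef)  # one predicted value per pixel
```

Restricting the model to a handful of bands keeps the per-pixel prediction to a single dot product, which is one reason a reduced MLR model is convenient for generating distribution maps.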
Peng, Jiyu; He, Yong; Ye, Lanhan; Shen, Tingting; Liu, Fei; Kong, Wenwen; Liu, Xiaodan; Zhao, Yun
2017-07-18
Fast detection of heavy metals in plant materials is crucial for environmental remediation and ensuring food safety. However, most plant materials have a high moisture content, the influence of which cannot simply be ignored. Hence, we proposed a method for reducing the influence of moisture in the fast detection of heavy metals using laser-induced breakdown spectroscopy (LIBS). First, we investigated the effect of moisture content on signal intensity, stability, and plasma parameters (temperature and electron density) and determined the main influential factors (experimental parameters F and the change of analyte concentration) on the variations of signal. For chromium content detection, the rice leaves were subjected to a quick drying procedure, and two strategies were further used to reduce the effect of moisture content and shot-to-shot fluctuation. An exponential model based on the intensity of the background was used to correct the actual element concentration in the analyte. In addition, the signal-to-background ratio for univariate calibration and partial least squares regression (PLSR) for multivariate calibration were used to compensate for the prediction deviations. The PLSR calibration model obtained the best result, with a correlation coefficient of 0.9669 and a root-mean-square error of 4.75 mg/kg in the prediction set. The preliminary results indicated that the proposed method allowed for the detection of heavy metals in plant materials using LIBS, and it could possibly be used for element mapping in future work.
Hyperspectral imaging technique for determination of pork freshness attributes
NASA Astrophysics Data System (ADS)
Li, Yongyu; Zhang, Leilei; Peng, Yankun; Tang, Xiuying; Chao, Kuanglin; Dhakal, Sagar
2011-06-01
Freshness of pork is an important quality attribute, which can vary greatly in storage and logistics. The specific objectives of this research were to develop a hyperspectral imaging system to predict pork freshness based on quality attributes such as total volatile basic-nitrogen (TVB-N), pH value and color parameters (L*,a*,b*). Pork samples were packed in sealed plastic bags and then stored at 4°C. Every 12 hours, hyperspectral scattering images were collected from the pork surface in the range of 400 nm to 1100 nm. Two different methods were used to extract scattering feature spectra from the hyperspectral scattering images. First, the spectral scattering profiles at individual wavelengths were fitted accurately by a three-parameter Lorentzian distribution (LD) function; second, reflectance spectra were extracted from the scattering images. The Partial Least Squares Regression (PLSR) method was used to establish prediction models for pork freshness. The results showed that the PLSR models based on reflectance spectra were better than those based on combinations of LD "parameter spectra" in the prediction of TVB-N, with a correlation coefficient (r) of 0.90 and a standard error of prediction (SEP) of 7.80 mg/100g. Moreover, a prediction model for pork freshness was established by using a combination of TVB-N, pH and color parameters. It gave good prediction results, with r = 0.91 for pork freshness. The research demonstrated that the hyperspectral scattering technique is a valid tool for real-time and nondestructive detection of pork freshness.
Maurer, Natalie E; Hatta-Sakoda, Beatriz; Pascual-Chagman, Gloria; Rodriguez-Saona, Luis E
2012-09-15
Consumption of omega-3 fatty acids (ω-3's), whether from fish oils, flax or supplements, can protect against cardiovascular disease. Finding plant-based sources of the essential ω-3's could provide a sustainable, renewable and inexpensive source of ω-3's, compared to fish oils. Our objective was to develop a rapid test to characterize and detect adulteration in sacha inchi oils, a Peruvian seed containing higher levels of ω-3's in comparison to other oleaginous seeds. A temperature-controlled ZnSe ATR mid-infrared benchtop and diamond ATR mid-infrared portable handheld spectrometers were used to characterize sacha inchi oil and evaluate its oxidative stability compared to commercial oils. A soft independent model of class analogy (SIMCA) and partial least squares regression (PLSR) analyzed the spectral data. Fatty acid profiles showed that sacha inchi oil (44% linolenic acid) had levels of PUFA similar to those of flax oils. PLSR showed good correlation coefficients (R(2)>0.9) between reference tests and spectra from infrared devices, allowing for rapid determination of fatty acid composition and prediction of oxidative stability. Oils formed distinct clusters, allowing the evaluation of commercial sacha inchi oils from Peruvian markets and showed some prevalence of adulteration. Determining oil adulteration and quality parameters, by using the ATR-MIR portable handheld spectrometer, allowed for portability and ease-of-use, making it a great alternative to traditional testing methods. Copyright © 2012 Elsevier Ltd. All rights reserved.
Vongsvivut, Jitraporn; Heraud, Philip; Gupta, Adarsha; Puri, Munish; McNaughton, Don; Barrow, Colin J
2013-10-21
The increase in polyunsaturated fatty acid (PUFA) consumption has prompted research into alternative resources other than fish oil. In this study, a new approach based on focal-plane-array Fourier transform infrared (FPA-FTIR) microspectroscopy and multivariate data analysis was developed for the characterisation of some marine microorganisms. Cell and lipid compositions in lipid-rich marine yeasts collected from the Australian coast were characterised in comparison to a commercially available PUFA-producing marine fungoid protist, thraustochytrid. Multivariate classification methods provided good discriminative accuracy evidenced from (i) separation of the yeasts from thraustochytrids and distinct spectral clusters among the yeasts that conformed well to their biological identities, and (ii) correct classification of yeasts from a totally independent set using cross-validation testing. The findings further indicated additional capability of the developed FPA-FTIR methodology, when combined with partial least squares regression (PLSR) analysis, for rapid monitoring of lipid production in one of the yeasts during the growth period, which was achieved at a high accuracy compared to the results obtained from the traditional lipid analysis based on gas chromatography. The developed FTIR-based approach when coupled to programmable withdrawal devices and a cytocentrifugation module would have strong potential as a novel online monitoring technology suited for bioprocessing applications and large-scale production.
Capote, F Priego; Jiménez, J Ruiz; de Castro, M D Luque
2007-08-01
An analytical method for the sequential detection, identification and quantitation of extra virgin olive oil adulteration with four edible vegetable oils--sunflower, corn, peanut and coconut oils--is proposed. The only data required for this method are the results obtained from an analysis of the lipid fraction by gas chromatography-mass spectrometry. A total of 566 samples (pure oils and samples of adulterated olive oil) were used to develop the chemometric models, which were designed to accomplish, step-by-step, the three aims of the method: to detect whether an olive oil sample is adulterated, to identify the type of adulterant used in the fraud, and to determine how much adulterant is in the sample. Qualitative analysis was carried out via two chemometric approaches, soft independent modelling of class analogy (SIMCA) and K nearest neighbours (KNN); both approaches exhibited prediction abilities that were always higher than 91% for adulterant detection and 88% for identification of the type of adulterant. Quantitative analysis was based on partial least squares regression (PLSR), which yielded R2 values of >0.90 for calibration and validation sets and thus made it possible to determine adulteration with excellent precision according to the Shenk criteria.
Luo, Yu; Li, Wen-Long; Huang, Wen-Hua; Liu, Xue-Hua; Song, Yan-Gang; Qu, Hai-Bin
2017-05-01
A near infrared spectroscopy (NIRS) approach was established for quality control of the alcohol precipitation liquid in the manufacture of Codonopsis Radix. By applying NIRS with multivariate analysis, it was possible to build variation into the calibration sample set, and the Plackett-Burman design, Box-Behnken design, and a concentrating-diluting method were used to obtain the sample set covered with sufficient fluctuation of process parameters and extended concentration information. NIR data were calibrated to predict the four quality indicators using partial least squares regression (PLSR). In the four calibration models, the root mean squares errors of prediction (RMSEPs) were 1.22 μg/ml, 10.5 μg/ml, 1.43 μg/ml, and 0.433% for lobetyolin, total flavonoids, pigments, and total solid contents, respectively. The results indicated that multi-components quantification of the alcohol precipitation liquid of Codonopsis Radix could be achieved with an NIRS-based method, which offers a useful tool for real-time release testing (RTRT) of intermediates in the manufacture of Codonopsis Radix.
Nhouchi, Zeineb; Karoui, Romdhane
2018-06-30
The aim of the present study was to investigate the ability of mid-infrared (MIR) spectroscopy and a texture analyzer to evaluate the quality of pound cake samples produced with palm oil and rapeseed oil throughout storage. The MIR spectra analyzed by principal component analysis (PCA) showed a clear separation of pound cakes as a function of the storage time and the nature of the oil used in the recipe. By applying partial least squares regression (PLSR), excellent prediction was obtained for hardness (R2 = 0.91; RPD = 2.26), while an approximate qualitative prediction was found for springiness (R2 = 0.73; RPD = 2.07), cohesiveness (R2 = 0.67; RPD = 1.31) and resilience (R2 = 0.65; RPD = 1.24). It could be concluded that MIR spectroscopy could be used as a rapid and non-destructive technique for monitoring the texture of pound cakes throughout storage as well as for the prediction of their hardness. Copyright © 2018 Elsevier Ltd. All rights reserved.
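The R2 and RPD figures quoted in this abstract are standard validation metrics. The sketch below computes them for a toy prediction set, assuming the common definition of RPD as the standard deviation of the reference values divided by the root-mean-square error of prediction; the abstract does not state which variant was used, and the function names and numbers here are hypothetical.

```python
from math import sqrt
from statistics import stdev

def rmsep(y_true, y_pred):
    """Root-mean-square error of prediction."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def rpd(y_true, y_pred):
    """Ratio of performance to deviation (assumed: sample SD / RMSEP)."""
    return stdev(y_true) / rmsep(y_true, y_pred)

# Toy reference and predicted hardness values (arbitrary units).
reference = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.5, 1.5, 3.0, 4.5, 4.5]

r2_value = r_squared(reference, predicted)
rpd_value = rpd(reference, predicted)
rmsep_value = rmsep(reference, predicted)
```

A higher RPD means the prediction error is small relative to the natural spread of the reference values, which is why it is often reported alongside R2.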
NASA Astrophysics Data System (ADS)
Harris, C. D.; Profeta, Luisa T. M.; Akpovo, Codjo A.; Johnson, Lewis; Stowe, Ashley C.
2017-05-01
A calibration model was created to illustrate the detection capabilities of laser ablation molecular isotopic spectroscopy (LAMIS) discrimination in isotopic analysis. The sample set contained boric acid pellets that varied in isotopic concentrations of 10B and 11B. Each sample set was interrogated with a Q-switched Nd:YAG ablation laser operating at 532 nm. A minimum of four band heads of the β system (B ²Σ → X ²Σ) transitions were identified and verified against previous literature on BO molecular emission lines. Isotopic shifts were observed in the spectra for each transition and used as the predictors in the calibration model. The spectra, along with their respective 10/11B isotopic ratios, were analyzed using Partial Least Squares Regression (PLSR). A novel IUPAC approach for determining a multivariate Limit of Detection (LOD) interval was used to predict the detection of the desired isotopic ratios. The predicted multivariate LOD is dependent on the variation of the instrumental signal and other composites in the calibration model space.
Predicting heavy metal concentrations in soils and plants using field spectrophotometry
NASA Astrophysics Data System (ADS)
Muradyan, V.; Tepanosyan, G.; Asmaryan, Sh.; Sahakyan, L.; Saghatelyan, A.; Warner, T. A.
2017-09-01
The aim of this study is to predict heavy metal (HM) concentrations in soils and plants using field remote sensing methods. The studied sites were the industrial town of Kajaran and the city of Yerevan. The research also included sampling of soils and leaves of two tree species exposed to different pollution levels and determination of HM contents under laboratory conditions. The obtained spectral values were then collated with the HM contents in Kajaran soils and in the tree leaves sampled in Yerevan, and statistical analysis was performed. Zn and Pb have a negative correlation coefficient (p <0.01) in a 2498 nm spectral range for soils. Pb has a significantly higher correlation at the red edge for plants. Regression models and an artificial neural network (ANN) for HM prediction were developed. Good results were obtained for the best stress-sensitive spectral band ANN (R2 0.9, RPD 2.0), Simple Linear Regression (SLR) and Partial Least Squares Regression (PLSR) (R2 0.7, RPD 1.4) models. The Multiple Linear Regression (MLR) model was not applicable for predicting Pb and Zn concentrations in soils in this research. Almost all full-spectrum PLS models provide good calibration and validation results (RPD>1.4). Full-spectrum ANN models are characterized by excellent calibration R2, rRMSE and RPD (0.9; 0.1 and >2.5 respectively). For prediction of Pb and Ni contents in plants, SLR and PLS models were used; they provide almost the same results. Our findings indicate that it is possible to make a coarse direct estimation of HM content in soils and plants using rapid and economical reflectance spectroscopy.
Structure-activity relationships between sterols and their thermal stability in oil matrix.
Hu, Yinzhou; Xu, Junli; Huang, Weisu; Zhao, Yajing; Li, Maiquan; Wang, Mengmeng; Zheng, Lufei; Lu, Baiyi
2018-08-30
Structure-activity relationships between 20 sterols and their thermal stabilities were studied in a model oil system. All sterol degradations were found to be consistent with a first-order kinetic model, with coefficients of determination (R2) higher than 0.9444. The number of double bonds in the sterol structure was negatively correlated with the thermal stability of the sterol, whereas the length of the branch chain was positively correlated with it. A quantitative structure-activity relationship (QSAR) model to predict the thermal stability of sterols was developed by using partial least squares regression (PLSR) combined with a genetic algorithm (GA). A regression model was built with an R2 of 0.806. Almost all sterol degradation constants can be predicted accurately, with a cross-validation R2 of 0.680. Four important variables were selected in the optimal QSAR model, and the selected variables were observed to be related to information indices, RDF descriptors, and 3D-MoRSE descriptors. Copyright © 2018 Elsevier Ltd. All rights reserved.
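To make the first-order kinetic model concrete: under first-order degradation the concentration follows C(t) = C0·exp(-k·t), so ln C is linear in t and the rate constant k is the negative slope. The sketch below fits k by least squares on the log scale. The function name, the time points, and the rate constant are hypothetical, and the R2 it reports is computed on ln C, which may differ from how the authors evaluated their fits.

```python
from math import exp, log

def fit_first_order(times, concentrations):
    """Fit ln C = ln C0 - k*t by least squares; return (k, C0, R2 on the log scale)."""
    y = [log(c) for c in concentrations]
    n = len(times)
    t_mean = sum(times) / n
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (v - y_mean) for t, v in zip(times, y))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    ss_res = sum((v - (intercept + slope * t)) ** 2 for t, v in zip(times, y))
    ss_tot = sum((v - y_mean) ** 2 for v in y)
    return -slope, exp(intercept), 1.0 - ss_res / ss_tot

# Synthetic, noise-free degradation curve with a hypothetical k = 0.01 per minute.
times = [0, 30, 60, 90, 120]
conc = [100.0 * exp(-0.01 * t) for t in times]

k, c0, r2 = fit_first_order(times, conc)
```

With real measurements the fitted k values, one per sterol, would be the quantities that a QSAR model then tries to predict from structural descriptors.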
NASA Astrophysics Data System (ADS)
Singh, A.; Serbin, S. P.; Kingdon, C.; Townsend, P. A.
2013-12-01
A major goal of remote sensing, and imaging spectroscopy in particular, is the development of generalizable algorithms to repeatedly and accurately map ecosystem properties such as canopy chemistry across space and time. Existing methods must therefore be tested across a range of measurement approaches to identify and overcome limits to the consistent retrieval of such properties from spectroscopic imagery. Here we illustrate a general approach for the estimation of key foliar biochemical and morphological traits from spectroscopic imagery derived from the AVIRIS instrument and the propagation of errors from the leaf to the image scale using partial least squares regression (PLSR) techniques. Our method involves the integration of three types of data representing different scales of observation. At the image scale, the images were normalized for atmospheric, illumination and BRDF effects. Spectra from field plot locations were extracted from the 51 AVIRIS images and were averaged when the field plot was larger than a single pixel. At the plot level, the scaling was conducted using multiple replicates (1000) derived from the leaf-level uncertainty estimates to generate plot-level estimates with their associated uncertainties. Leaf-level estimates of foliar traits (%N, %C, %Fiber, %Cellulose, %Lignin, LMA) were scaled to the canopy based on the relative species composition of each plot. Image spectra were iteratively split into 50/50 randomized calibration-validation datasets and multiple (500) trait-predictive PLSR models were generated, this time sampling from within the plot-level uncertainty distribution. This allowed the propagation of uncertainty from the leaf-level dependent variables to the plot level, and finally to models built using AVIRIS image spectra. Moreover, this method allows us to generate spatially explicit maps of uncertainty in our sampled traits.
Both LMA and %N PLSR models had an R2 greater than 0.8, and root mean square errors (RMSEs) for both variables were less than 6% of the range of data. Fiber and lignin were predicted with R2 > 0.65, and carbon and cellulose with R2 greater than 0.5. Although the R2 values of these variables were lower than those for LMA and %N, their RMSE values were below 9% of the range of data. The comparatively lower R2 values for %C and cellulose in particular were related to the low amount of natural variability in these constituents. Further, coefficients from the randomized set of PLSR models were applied to imagery and aggregated to obtain pixel-wise predicted means and uncertainty estimates for each foliar trait. The resulting maps of nutritional and morphological properties together with their overall uncertainties represent a first-of-its-kind data product for examining the spatio-temporal patterns of forest functioning and nutrient cycling. These data are now being used to relate foliar traits with ecosystem processes such as streamwater nutrient export and insect herbivory. In addition, the ability to assign a retrieval uncertainty enables more efficient assimilation of these data products into ecosystem models to help constrain carbon and nutrient cycling projections.
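The resampling idea described in this abstract (many PLSR models fitted on randomized 50/50 calibration splits, then aggregated into a predicted mean and an uncertainty) can be sketched roughly as follows. This is a simplified illustration, not the authors' pipeline. It uses a textbook single-response PLS (PLS1, NIPALS-style deflation) written in plain Python, it omits the sampling from the plot-level uncertainty distribution, and every name and data value is hypothetical.

```python
import math
import random
import statistics


def pls1_fit(X, y, n_components):
    """Fit a single-response PLS model by sequential deflation of X and y."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(p)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                # centered y
    components = []
    for _ in range(n_components):
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm < 1e-12:  # nothing left in y for X to explain
            break
        w = [v / norm for v in w]
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]  # scores
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, load, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict one sample by repeating the same deflation steps on it."""
    x_mean, y_mean, components = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, load, q in components:
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_hat += q * t
        e = [ej - t * lj for ej, lj in zip(e, load)]
    return y_hat


def ensemble_predict(X, y, x_new, n_components=2, n_models=50, seed=0):
    """Mean and SD of predictions from models fitted on random half-splits."""
    rng = random.Random(seed)
    n = len(X)
    predictions = []
    for _ in range(n_models):
        calib = rng.sample(range(n), n // 2)
        model = pls1_fit([X[i] for i in calib], [y[i] for i in calib], n_components)
        predictions.append(pls1_predict(model, x_new))
    return statistics.fmean(predictions), statistics.stdev(predictions)


# Synthetic, noise-free data: y = 2*x1 - x2 + 1 (for illustration only).
X = [[float(i), float((i * i) % 7)] for i in range(12)]
y = [2.0 * a - b + 1.0 for a, b in X]
mean_pred, sd_pred = ensemble_predict(X, y, [6.0, 4.0])
```

Because the toy data are noise-free, every half-split recovers the same relation and the spread across models is essentially zero; with real spectra the standard deviation across models is what would be mapped as per-pixel uncertainty.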
Yang, Teng; Adams, Jonathan M; Shi, Yu; He, Jin-Sheng; Jing, Xin; Chen, Litong; Tedersoo, Leho; Chu, Haiyan
2017-07-01
Previous studies have revealed inconsistent correlations between fungal diversity and plant diversity from local to global scales, and there is a lack of information about the diversity-diversity and productivity-diversity relationships for fungi in alpine regions. Here we investigated the internal relationships between soil fungal diversity, plant diversity and productivity across 60 grassland sites on the Tibetan Plateau, using Illumina sequencing of the internal transcribed spacer 2 (ITS2) region for fungal identification. Fungal alpha and beta diversities were best explained by plant alpha and beta diversities, respectively, when accounting for environmental drivers and geographic distance. The best ordinary least squares (OLS) multiple regression models, partial least squares regression (PLSR) and variation partitioning analysis (VPA) indicated that plant richness was positively correlated with fungal richness. However, no correlation between plant richness and fungal richness was evident for fungal functional guilds when analyzed individually. Plant productivity showed a weaker relationship to fungal diversity which was intercorrelated with other factors such as plant diversity, and was thus excluded as a main driver. Our study points to a predominant effect of plant diversity, along with other factors such as carbon : nitrogen (C : N) ratio, soil phosphorus and dissolved organic carbon, on soil fungal richness. © 2017 The Authors. New Phytologist © 2017 New Phytologist Trust.
Egg embryo development detection with hyperspectral imaging
NASA Astrophysics Data System (ADS)
Lawrence, Kurt C.; Smith, Douglas P.; Windham, William R.; Heitschmidt, Gerald W.; Park, Bosoon
2006-10-01
In the U.S. egg industry, anywhere from 130 million to over one billion infertile eggs are incubated each year. Some of these infertile eggs explode in the hatching cabinet and can potentially spread molds or bacteria to all the eggs in the cabinet. A method to detect the embryo development of incubated eggs was developed. Twelve brown-shell hatching eggs from two replicates (n=24) were incubated and imaged to identify embryo development. A hyperspectral imaging system was used to collect transmission images from 420 to 840 nm of brown-shell eggs positioned with the air cell vertical and normal to the camera lens. Raw transmission images from about 400 to 900 nm were collected for every egg on days 0, 1, 2, and 3 of incubation. A total of 96 images were collected and eggs were broken out on day 6 to determine fertility. After breakout, all eggs were found to be fertile. Therefore, this paper presents results for egg embryo development, not fertility. The original hyperspectral data and spectral means for each egg were both used to create embryo development models. With the hyperspectral data range reduced to about 500 to 700 nm, a minimum noise fraction transformation was used, along with a Mahalanobis Distance classification model, to predict development. Days 2 and 3 were all correctly classified (100%), while day 0 and day 1 were classified at 95.8% and 91.7%, respectively. Alternatively, the mean spectra from each egg were used to develop a partial least squares regression (PLSR) model. First, a PLSR model was developed with all eggs and all days. The data were multiplicative scatter corrected, spectrally smoothed, and the wavelength range was reduced to 539 - 770 nm. With a leave-one-out cross-validation, all eggs for all days were correctly classified (100%). Second, a PLSR model was developed with data from day 0 and day 3, and the model was validated with data from day 1 and 2.
For day 1, 22 of 24 eggs were correctly classified (91.7%) and for day 2, all eggs were correctly classified (100%). Although the results are based on relatively small sample sizes, they are encouraging. However, larger sample sizes, from multiple flocks, will be needed to fully validate and verify these models. Additionally, future experiments must also include non-fertile eggs so the fertile / non-fertile effect can be determined.
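The multiplicative scatter correction mentioned in this abstract is commonly implemented by regressing each spectrum on a reference spectrum (often the mean of the set) and then removing the fitted offset and slope. The following is a minimal sketch of that common formulation, assuming the mean spectrum as reference; the abstract does not give the authors' exact settings, and the function name and spectra are hypothetical.

```python
def msc(spectra):
    """Multiplicative scatter correction against the mean spectrum.

    Each spectrum s is modeled as s = a + b * reference, and the corrected
    spectrum is (s - a) / b.
    """
    n, p = len(spectra), len(spectra[0])
    reference = [sum(s[j] for s in spectra) / n for j in range(p)]
    ref_mean = sum(reference) / p
    sxx = sum((r - ref_mean) ** 2 for r in reference)
    corrected = []
    for s in spectra:
        s_mean = sum(s) / p
        slope = sum((r - ref_mean) * (v - s_mean) for r, v in zip(reference, s)) / sxx
        intercept = s_mean - slope * ref_mean
        corrected.append([(v - intercept) / slope for v in s])
    return corrected


# Three made-up spectra that differ only by an offset and a scale factor.
base = [1.0, 2.0, 4.0, 3.0]
raw_spectra = [
    base,
    [2.0 * v + 1.0 for v in base],
    [0.5 * v + 0.2 for v in base],
]
msc_spectra = msc(raw_spectra)
```

In this toy case the three corrected spectra coincide, since the only differences between them were additive and multiplicative effects of the kind the correction is designed to remove.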
Retrieval and Mapping of Heavy Metal Concentration in Soil Using Time Series Landsat 8 Imagery
NASA Astrophysics Data System (ADS)
Fang, Y.; Xu, L.; Peng, J.; Wang, H.; Wong, A.; Clausi, D. A.
2018-04-01
Heavy metal pollution is a critical global environmental problem of long-standing concern. The traditional approach to obtaining heavy metal concentrations, which relies on field sampling and lab testing, is expensive and time consuming. Many related studies use spectrometer data to build a relational model between heavy metal concentration and spectral information and then apply the model to hyperspectral imagery for prediction, but this approach can hardly map the soil metal concentration of an area quickly and accurately because of the discrepancies between spectrometer data and remote sensing imagery. Taking advantage of the easy accessibility of Landsat 8 data, this study uses Landsat 8 imagery to retrieve soil Cu concentration and map its distribution in the study area. To enlarge the spectral information for more accurate retrieval and mapping, 11 single-date Landsat 8 images from 2013-2017 are selected to form a time series. Three regression methods, partial least square regression (PLSR), artificial neural network (ANN) and support vector regression (SVR), are used for model construction. After an unbiased comparison of these models, the best model is selected to map the Cu concentration distribution. The resulting distribution map shows good spatial autocorrelation and consistency with the mining area locations.
Fanesi, Andrea; Wagner, Heiko; Wilhelm, Christian
2017-02-08
Climate change has a strong impact on phytoplankton communities and water quality. However, the development of robust techniques to assess phytoplankton growth is still in progress. In this study, the growth rate of phytoplankton cells grown at different temperatures was modelled based on conventional physiological traits (e.g. chlorophyll, carbon and photosynthetic parameters) using the partial least square regression (PLSR) algorithm and compared with a new approach combining Fourier transform infrared-spectroscopy and PLSR. In this second model, it is assumed that the macromolecular composition of phytoplankton cells represents an intracellular marker for growth. The models have comparable high predictive power (R2 > 0.8) and low error in predicting new observations. Interestingly, not all of the predictors present the same weight in the modelling of growth rate. A set of specific parameters, such as non-photochemical fluorescence quenching (NPQ) and the quantum yield of carbon production in the first model, and lipid, protein and carbohydrate contents for the second one, strongly covary with cell growth rate regardless of the taxonomic position of the phytoplankton species investigated. This reflects a set of specific physiological adjustments covarying with growth rate, conserved among taxonomically distant algal species that might be used as guidelines for the improvement of modern primary production models. The high predictive power of both sets of cellular traits for growth rate is of great importance for applied phycological studies. Our approach may find application as a quality control tool for the monitoring of phytoplankton populations in natural communities or in photobioreactors. © 2017 The Author(s).
Determination of yolk contamination in liquid egg white using Raman spectroscopy.
Cluff, K; Konda Naganathan, G; Jonnalagada, D; Mortensen, I; Wehling, R; Subbiah, J
2016-07-01
Purified egg white is an important ingredient in a number of baked and confectionery foods because of its foaming properties. However, yolk contamination in amounts as low as 0.01% can impede the foaming ability of egg white. In this study, we used Raman spectroscopy to evaluate the hypothesis that yolk contamination in egg white could be detected based on its molecular optical properties. Yolk-contaminated egg white samples (n = 115) with contamination levels ranging from 0% to 0.25% (on a weight basis) were prepared. The samples were excited with a 785 nm laser and Raman spectra from 250 to 3,200 cm-1 were recorded. The Raman spectra were baseline corrected using an optimized piecewise cubic interpolation on each spectrum and then normalized with a standard normal variate transformation. Samples were randomly divided into calibration (n = 77) and validation (n = 38) data sets. A partial least squares regression (PLSR) model was developed to predict yolk contamination levels, based on the Raman spectral fingerprint. Raman spectral peaks in the spectral regions of 1,080 and 1,666 cm-1 had the largest influence on detecting yolk contamination in egg white. The PLSR model was able to correctly predict yolk contamination levels with an R2 of 0.90 in the validation data set. These results demonstrate the capability of Raman spectroscopy for detection of yolk contamination at very low levels in egg white and present a strong case for development of an on-line system to be deployed in egg processing plants. © 2016 Poultry Science Association Inc.
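The standard normal variate (SNV) transformation named in this abstract is usually applied spectrum by spectrum: subtract the spectrum's own mean and divide by its own standard deviation. A minimal sketch under that usual definition follows; the function name and the intensity values are hypothetical, and the use of the sample standard deviation is an assumption.

```python
import statistics


def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own statistics."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)  # sample standard deviation (assumed convention)
    if sd == 0:
        raise ValueError("a constant spectrum cannot be SNV-normalized")
    return [(value - mean) / sd for value in spectrum]


# Two made-up spectra with the same shape but different offset and scale.
snv_a = snv([1.0, 2.0, 3.0])
snv_b = snv([12.0, 14.0, 16.0])
```

After the transformation both toy spectra are identical, which illustrates why SNV is used to suppress intensity differences that are unrelated to composition.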
Dawson, Neil; Thompson, Rhiannon J.; McVie, Allan; Thomson, David M.; Morris, Brian J.; Pratt, Judith A.
2012-01-01
Objective: In the present study, we employ mathematical modeling (partial least squares regression, PLSR) to elucidate the functional connectivity signatures of discrete brain regions in order to identify the functional networks subserving PCP-induced disruption of distinct cognitive functions and their restoration by the procognitive drug modafinil. Methods: We examine the functional connectivity signatures of discrete brain regions that show overt alterations in metabolism, as measured by semiquantitative 2-deoxyglucose autoradiography, in an animal model (subchronic phencyclidine [PCP] treatment), which shows cognitive inflexibility with relevance to the cognitive deficits seen in schizophrenia. Results: We identify the specific components of functional connectivity that contribute to the rescue of this cognitive inflexibility and to the restoration of overt cerebral metabolism by modafinil. We demonstrate that modafinil reversed both the PCP-induced deficit in the ability to switch attentional set and the PCP-induced hypometabolism in the prefrontal (anterior prelimbic) and retrosplenial cortices. Furthermore, modafinil selectively enhanced metabolism in the medial prelimbic cortex. The functional connectivity signatures of these regions identified a unifying functional subsystem underlying the influence of modafinil on cerebral metabolism and cognitive flexibility that included the nucleus accumbens core and locus coeruleus. In addition, these functional connectivity signatures identified coupling events specific to each brain region, which relate to known anatomical connectivity. Conclusions: These data support clinical evidence that modafinil may alleviate cognitive deficits in schizophrenia and also demonstrate the benefit of applying PLSR modeling to characterize functional brain networks in translational models relevant to central nervous system dysfunction. PMID:20810469
Spatially dense morphometrics of craniofacial sexual dimorphism in 1-year-olds.
Matthews, Harold; Penington, Tony; Saey, Ine; Halliday, Jane; Muggli, Evelyn; Claes, Peter
2016-10-01
Recent advances in the field of geometric morphometrics allow for powerful statistical hypothesis testing for effects of biological and environmental variables on anatomical shape. This study used partial least-squares regression (PLSR) and the recently developed bootstrapped response-based imputation modelling (BRIM) algorithm to test for sexual dimorphism in the craniofacial shape of 1-year-old humans. We observed a recession of the forehead in boys relative to girls, and differences in the nose, consistent with adult dimorphism. Results also suggest that the degree to which individuals express dimorphic traits is continuous throughout the population. This is also seen in adult dimorphism, but in 1-year-olds the amount of overlap between groups is much higher, indicating that the strength of dimorphism between sexes is lower. Our results demonstrate early sexual dimorphism that is not attributable to the influx of sex hormones at puberty. This highlights the need to look at very early ontogeny for the origins of sexual dimorphism. We suggest that future work look at potential mediating effects of this early dimorphism on the later impact of puberty. The subtle shape differences we have detected may also be applied to sexing fossilised crania. A common artefact in 3D images of faces of young children is that they often have their mouths open to varying degrees, introducing variability in the data unrelated to anatomy. We describe two PLSR-based methods of correcting this. These methods may facilitate surgical planning and assessment of young children based on 3D images. © 2016 Anatomical Society.
What’s Wrong with the Murals at the Mogao Grottoes: A Near-Infrared Hyperspectral Imaging Method
Sun, Meijun; Zhang, Dong; Wang, Zheng; Ren, Jinchang; Chai, Bolong; Sun, Jizhou
2015-01-01
Although a significant amount of work has been performed to preserve the ancient murals in the Mogao Grottoes by Dunhuang Cultural Research, non-contact methods need to be developed to effectively evaluate the degree of flaking of the murals. In this study, we propose to evaluate the flaking by automatically analyzing hyperspectral images that were scanned at the site. Murals with various degrees of flaking were scanned in the 126th cave using a near-infrared (NIR) hyperspectral camera with a spectral range of approximately 900 to 1700 nm. The regions of interest (ROIs) of the murals were manually labeled and grouped into four levels: normal, slight, moderate, and severe. The average spectral data from each ROI and its group label were used to train our classification model. To predict the degree of flaking, we adopted four algorithms: deep belief networks (DBNs), partial least squares regression (PLSR), principal component analysis with a support vector machine (PCA + SVM) and principal component analysis with an artificial neural network (PCA + ANN). The experimental results show the effectiveness of our method. In particular, better results are obtained using DBNs when the training data contain a significant amount of striping noise. PMID:26394926
Kern, Simon; Meyer, Klas; Guhl, Svetlana; Gräßer, Patrick; Paul, Andrea; King, Rudibert; Maiwald, Michael
2018-05-01
Monitoring specific chemical properties is the key to chemical process control. Today, mainly optical online methods are applied, which require time- and cost-intensive calibration effort. NMR spectroscopy, which has the advantage of being a direct comparison method with no need for calibration, has a high potential for enabling closed-loop process control while exhibiting short set-up times. Compact NMR instruments make NMR spectroscopy accessible in industrial and rough environments for process monitoring and advanced process control strategies. We present a fully automated data analysis approach which is completely based on physically motivated spectral models as first principles information (indirect hard modeling, IHM) and apply it to a given pharmaceutical lithiation reaction in the framework of the European Union's Horizon 2020 project CONSENS. Online low-field NMR (LF NMR) data were analyzed by IHM with low calibration effort and compared to a multivariate PLS-R (partial least squares regression) approach, and both were validated using online high-field NMR (HF NMR) spectroscopy. Graphical abstract: NMR sensor module for monitoring of the aromatic coupling of 1-fluoro-2-nitrobenzene (FNB) with aniline to 2-nitrodiphenylamine (NDPA) using lithium-bis(trimethylsilyl) amide (Li-HMDS) in continuous operation. Online 43.5 MHz low-field NMR (LF) was compared to 500 MHz high-field NMR spectroscopy (HF) as reference method.
Balabin, Roman M; Smirnov, Sergey V
2011-04-29
During the past several years, near-infrared (near-IR/NIR) spectroscopy has increasingly been adopted as an analytical tool in various fields from petroleum to biomedical sectors. The NIR spectrum (above 4000 cm-1) of a sample is typically measured by modern instruments at a few hundred wavelengths. Recently, considerable effort has been directed towards developing procedures to identify variables (wavelengths) that contribute useful information. Variable selection (VS) or feature selection, also called frequency selection or wavelength selection, is a critical step in data analysis for vibrational spectroscopy (infrared, Raman, or NIRS). In this paper, we compare the performance of 16 different feature selection methods for the prediction of properties of biodiesel fuel, including density, viscosity, methanol content, and water concentration. The feature selection algorithms tested include stepwise multiple linear regression (MLR-step), interval partial least squares regression (iPLS), backward iPLS (BiPLS), forward iPLS (FiPLS), moving window partial least squares regression (MWPLS), (modified) changeable size moving window partial least squares (CSMWPLS/MCSMWPLSR), searching combination moving window partial least squares (SCMWPLS), successive projections algorithm (SPA), uninformative variable elimination (UVE, including UVE-SPA), simulated annealing (SA), back-propagation artificial neural networks (BP-ANN), Kohonen artificial neural network (K-ANN), and genetic algorithms (GAs, including GA-iPLS). Two linear techniques for calibration model building, namely multiple linear regression (MLR) and partial least squares regression/projection to latent structures (PLS/PLSR), are used for the evaluation of biofuel properties. A comparison with a non-linear calibration model, artificial neural networks (ANN-MLP), is also provided. Discussion of gasoline, ethanol-gasoline (bioethanol), and diesel fuel data is presented.
The results of applying other spectroscopic techniques, such as Raman, ultraviolet-visible (UV-vis), or nuclear magnetic resonance (NMR) spectroscopy, can be greatly improved by an appropriate choice of feature selection method. Copyright © 2011 Elsevier B.V. All rights reserved.
Subaihi, Abdu; Muhamadali, Howbeer; Mutter, Shaun T; Blanch, Ewan; Ellis, David I; Goodacre, Royston
2017-03-27
In this study, surface-enhanced Raman scattering (SERS) combined with the isotopic labelling (IL) principle has been used for the quantification of codeine spiked into both water and human plasma. Multivariate statistical approaches were employed for the analysis of these SERS spectral data, particularly partial least squares regression (PLSR), which was used to generate models using the full SERS spectral data for quantification of codeine with, and without, an internal isotopically labelled standard. The PLSR models provided accurate codeine quantification in water and human plasma with high prediction accuracy (Q2). In addition, the employment of codeine-d6 as the internal standard further improved the accuracy of the model, by increasing the Q2 from 0.89 to 0.94 and decreasing the root-mean-square error of prediction (RMSEP) from 11.36 to 8.44. Using the peak area at 1281 cm-1 assigned to C-N stretching, C-H wagging and ring breathing, the limit of detection was calculated in both water and human plasma to be 0.7 μM (209.55 ng mL-1) and 1.39 μM (416.12 ng mL-1), respectively. Due to a lack of definitive codeine vibrational assignments, density functional theory (DFT) calculations have also been used to assign the spectral bands with their corresponding vibrational modes, which were in excellent agreement with our experimental Raman and SERS findings. Thus, we have successfully demonstrated the application of SERS with isotope labelling for the absolute quantification of codeine in human plasma for the first time with a high degree of accuracy and reproducibility.
The use of the IL principle which employs an isotopolog (that is to say, a molecule which is only different by the substitution of atoms by isotopes) improves quantification and reproducibility because the competition of the codeine and codeine-d6 for the metal surface used for SERS is equal and this will offset any difference in the number of particles under analysis or any fluctuations in laser fluence. It is our belief that this may open up new exciting opportunities for testing SERS in real-world samples and applications which would be an area of potential future studies.
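The two ways of stating the detection limits in this abstract (μM and ng per mL) are related by the molar mass: 1 μM of a compound with molar mass M g/mol corresponds to M ng per mL. The short check below reproduces the quoted figures under the assumption that the molar mass of codeine is about 299.36 g/mol; that value is not given in the abstract, and the function and constant names are hypothetical.

```python
# Assumed molar mass of codeine in g/mol (not stated in the abstract).
CODEINE_MOLAR_MASS = 299.36


def micromolar_to_ng_per_ml(conc_micromolar, molar_mass):
    """Convert a molar concentration in micromol/L to a mass concentration in ng/mL.

    1 micromol/L * M g/mol = M microgram/L = M ng/mL.
    """
    return conc_micromolar * molar_mass


lod_water = micromolar_to_ng_per_ml(0.7, CODEINE_MOLAR_MASS)
lod_plasma = micromolar_to_ng_per_ml(1.39, CODEINE_MOLAR_MASS)
```

Both results agree with the values in the abstract to within rounding of the assumed molar mass.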
Lykomitros, Dimitrios; Fogliano, Vincenzo; Capuano, Edoardo
2018-04-01
Roasted peanuts are a popular snack in Europe, but their drivers of liking and perceived freshness have not been previously studied with European consumers. Consumer research to date has been focused on U.S. consumers, and only on specific peanut cultivars. In this study, 26 unique samples were produced from peanuts of different types, cultivars, origins, and with different process technologies (including baking, frying, and maceration). The peanut samples were subjected to sensory (expert panel, Spectrum™) and instrumental analysis (color, headspace volatiles, sugar profile, large deformation compression tests, and graded by size) and were hedonically rated by consumers in The Netherlands, Spain, and Turkey (n > 200 each). Preference Mapping (PREFMAP) on mean liking models revealed that the drivers of liking are similar across the three countries. Sweet taste, roasted peanut, dark roast, and sweet aromas and the color b* value were related to increased liking, and raw bean aroma and bitter taste with decreased liking. Further partial least square regression (PLSR) modeling of liking and perceived freshness against instrumental attributes showed that the color coordinates in combination with sucrose content and a select few headspace volatiles were strong predictors of both preference and perceived freshness. Finally, additional PLSR models focusing on the headspace volatiles only showed that liking and "fresh" attributes were correlated with the presence of several pyrroles in the volatile fraction, and inversely related to "stale" and to hexanal and 2-heptanone. This study provides insight into which flavor, taste, and appearance attributes drive liking and disliking of roasted peanuts for European consumers. The drivers are linked back to analytical attributes that can be measured instrumentally, thereby reducing the reliance on costly sensory panels.
Particular emphasis is placed on color as a predictor of preference because the low cost of the measuring equipment makes it available even to smaller producers. In addition to preference, the study also examines whether there are product attributes that drive perceived freshness. The results can be used to design products with high acceptability across several countries within Europe. © 2018 Institute of Food Technologists®.
NASA Astrophysics Data System (ADS)
Vaudour, E.; Gilliot, J. M.; Bel, L.; Lefevre, J.; Chehdi, K.
2016-07-01
This study aimed at identifying the potential of Vis-NIR airborne hyperspectral AISA-Eagle data for predicting the topsoil organic carbon (SOC) content of bare cultivated soils over a large peri-urban area (221 km2) with contrasting soils and SOC contents, located in the western region of Paris, France. Soil types comprised haplic luvisols, calcaric cambisols and colluvic cambisols. Airborne AISA-Eagle data (400-1000 nm, 126 bands) with 1 m resolution were acquired on 17 April 2013 over 13 tracks. Tracks were atmospherically corrected then mosaicked at a 2 m resolution using a set of 24 synchronous field spectra of bare soils, black and white targets and impervious surfaces. The land use identification system layer (RPG) of 2012 was used to mask non-agricultural areas; calculation and thresholding of NDVI from an atmospherically corrected SPOT image acquired the same day then made it possible to map agricultural fields with bare soil. A total of 101 sites sampled either in 2013 or in the 3 previous years and in 2015 were identified as bare by means of this map. Predictions were made from the mosaic AISA spectra, which were related to topsoil SOC contents by means of partial least squares regression (PLSR). Regression robustness was evaluated through a series of 1000 bootstrap data sets of calibration-validation samples, considering 74 sites outside cloud shadows only, and different sampling strategies for selecting calibration samples. Validation root-mean-square errors (RMSE) ranged from 3.73 to 4.49 g kg-1, with a median of ∼4 g kg-1. The best-performing models in terms of coefficient of determination (R2) and Residual Prediction Deviation (RPD) values were the calibration models derived either from Kennard-Stone or conditioned Latin Hypercube sampling on smoothed spectra.
The most generalizable model, which gave the lowest RMSE values of 3.73 g kg-1 at the regional scale and 1.44 g kg-1 at the within-field scale together with low bias, was the leave-one-out cross-validated PLSR model constructed with the 28 near-synchronous samples and raw spectra.
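Kennard-Stone sampling, one of the calibration-set selection strategies named in this abstract, picks samples that spread evenly over the predictor space: it starts from the two most distant samples and then repeatedly adds the sample whose nearest already-selected neighbour is farthest away. The sketch below follows that classic description using Euclidean distance; the authors' implementation details are not given in the abstract, and the function name and points are hypothetical.

```python
import math


def kennard_stone(samples, n_select):
    """Return indices of n_select samples chosen by the Kennard-Stone rule."""
    n = len(samples)
    if not 2 <= n_select <= n:
        raise ValueError("n_select must be between 2 and the number of samples")
    dist = [[math.dist(a, b) for b in samples] for a in samples]
    # Start with the pair of samples that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = [k for k in range(n) if k not in selected]
    while len(selected) < n_select:
        # Add the candidate whose closest selected sample is farthest away.
        best = max(remaining, key=lambda k: min(dist[k][s] for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected


# Made-up one-band "spectra" placed along a line, for illustration only.
points = [[0.0, 0.0], [1.0, 0.0], [10.0, 0.0], [5.0, 0.0], [9.0, 0.0]]
calibration_idx = kennard_stone(points, 3)
```

Here the two end points are chosen first and the mid-point third, which is the even coverage the method is meant to give; samples not selected would typically form the validation set.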
[Research on Oil Sands Spectral Characteristics and Oil Content by Remote Sensing Estimation].
You, Jin-feng; Xing, Li-xin; Pan, Jun; Shan, Xuan-long; Liang, Li-heng; Fan, Rui-xue
2015-04-01
Visible and near-infrared spectroscopy is a proven technology widely used in the identification and exploration of hydrocarbon energy sources, with high spectral resolution that captures the detailed diagnostic absorption characteristics of hydrocarbon groups. In the reflectance spectra of oil sands samples, the most prominent hydrocarbon absorption bands lie at 1,740-1,780, 2,300-2,340 and 2,340-2,360 nm. These spectral ranges are dominated by various overlapping C-H overtones and combination bands. Meanwhile, there is relatively weak or even no absorption in the region from 1,700 to 1,730 nm in the spectra of oil sands samples with low bitumen content. As the oil content increases, obvious hydrocarbon absorption begins to appear in the 1,700-1,730 nm range. The bitumen content is the critical parameter for oil sands reserves estimation. The absorption depth was used to describe the response intensity of the absorption bands controlled by first-order overtones and combinations of the various C-H stretching and bending fundamentals. Based on the Pearson and partial correlations between oil content and the absorption depth of the hydrocarbon bands in the 1,740-1,780, 2,300-2,340 and 2,340-2,360 nm wavelength ranges, an association between the intensity of the spectral response and bitumen content was established, and then unary linear regression (ULR) and partial least squares regression (PLSR) methods were employed to model the relationship between the absorption depth attributed to the various C-H bonds and bitumen content. Two calibration equations were obtained: the ULR method was used to model the relationship between absorption depth near the 2,350 nm region and bitumen content, and the PLSR method was used to model the relationship between the absorption depths of the 1,758, 2,310 and 2,350 nm regions and oil content.
The calibration models showed good predictive ability and high robustness, and they could provide a scientific basis for rapid estimation of oil content in oil sands in the future.
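The abstract uses absorption depth as its spectral feature but does not say how it was computed. One common choice, assumed here, is the continuum-removed band depth: a straight-line continuum is drawn between the two shoulders of the absorption feature, and the depth is one minus the ratio of the measured reflectance to the continuum at the band centre. The function name, wavelengths and reflectance values below are hypothetical.

```python
def band_depth(wl_left, r_left, wl_center, r_center, wl_right, r_right):
    """Continuum-removed band depth at wl_center (assumed definition).

    The continuum is the straight line joining the two shoulder points;
    depth = 1 - R_center / R_continuum(wl_center).
    """
    if not wl_left < wl_center < wl_right:
        raise ValueError("band centre must lie between the two shoulders")
    fraction = (wl_center - wl_left) / (wl_right - wl_left)
    continuum = r_left + fraction * (r_right - r_left)
    return 1.0 - r_center / continuum


# Made-up reflectance values around a feature near 2,310 nm.
depth_2310 = band_depth(2280.0, 0.40, 2310.0, 0.30, 2340.0, 0.60)
```

Depths computed this way for several bands could then serve as the predictor variables in a ULR or PLSR calibration against measured bitumen content.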
Early detection of germinated wheat grains using terahertz image and chemometrics
NASA Astrophysics Data System (ADS)
Jiang, Yuying; Ge, Hongyi; Lian, Feiyu; Zhang, Yuan; Xia, Shanhong
2016-02-01
In this paper, we propose a feasible tool that uses a terahertz (THz) imaging system for identifying wheat grains at different stages of germination. The THz spectra of the main changed components of wheat grains, maltose and starch, which were obtained by THz time spectroscopy, were distinctly different. Used for original data compression and feature extraction, principal component analysis (PCA) revealed the changes that occurred in the inner chemical structure during germination. Two thresholds, one indicating the start of the release of α-amylase and the second when it reaches the steady state, were obtained through the first five score images. Thus, the first five PCs were input for the partial least-squares regression (PLSR), least-squares support vector machine (LS-SVM), and back-propagation neural network (BPNN) models, which were used to classify seven different germination times between 0 and 48 h, with a prediction accuracy of 92.85%, 93.57%, and 90.71%, respectively. The experimental results indicated that the combination of THz imaging technology and chemometrics could be a new effective way to discriminate wheat grains at the early germination stage of approximately 6 h.
Liu, Xue-song; Sun, Fen-fang; Jin, Ye; Wu, Yong-jiang; Gu, Zhi-xin; Zhu, Li; Yan, Dong-lan
2015-12-01
A novel method was developed for the rapid determination of multiple indicators in Corni Fructus by means of near infrared (NIR) spectroscopy. A particle swarm optimization (PSO) based least squares support vector machine (LS-SVM) was investigated to improve quality control. Calibration models for moisture, extractum, morroniside and loganin were established using the PSO-LS-SVM algorithm. The performance of the PSO-LS-SVM models was compared with partial least squares regression (PLSR) and back propagation artificial neural network (BP-ANN) models. The calibration and validation results of PSO-LS-SVM were superior to those of both PLSR and BP-ANN. For the PSO-LS-SVM models, the correlation coefficients (r) of calibration were all above 0.942. The best prediction results were also achieved by the PSO-LS-SVM models, with RMSEP (root mean square error of prediction) and RSEP (relative standard error of prediction) below 1.176 and 15.5%, respectively. The results suggest that the PSO-LS-SVM algorithm offers good model performance and high prediction accuracy. NIR has potential value for the rapid determination of multiple indicators in Corni Fructus.
Raman microspectroscopy for in situ examination of carbon-microbe-mineral interactions
NASA Astrophysics Data System (ADS)
Creamer, C.; Foster, A. L.; Lawrence, C. R.; Mcfarland, J. W.; Waldrop, M. P.
2016-12-01
The changing paradigm of soil organic matter formation and turnover is focused at the nexus of microbe-carbon-mineral interactions. However, visualizing biotic and abiotic stabilization of C on mineral surfaces is difficult given our current techniques. Therefore we investigated Raman microspectroscopy as a potential tool to examine microbially mediated organo-mineral associations. Raman microspectroscopy is a non-destructive technique that has been used to identify microorganisms and minerals, and to quantify microbial assimilation of 13C labeled substrates in culture. We developed a partial least squares regression (PLSR) model to accurately quantify (within 5%) adsorption of four model 12C substrates (glucose, glutamic acid, oxalic acid, p-hydroxybenzoic acid) on a range of soil minerals. We also developed a PLSR model to quantify the incorporation of 13C into E. coli cells. Using these two models, along with measures of the 13C content of respired CO2, we determined the allocation of glucose-derived C into mineral-associated microbial biomass and respired CO2 in situ and through time. We observed progressive 13C enrichment of microbial biomass with incubation time, as well as 13C enrichment of CO2 indicating preferential decomposition of glucose-derived C. We will also present results on the application of our in situ chamber to quantify the formation of organo-mineral associations under both abiotic and biotic conditions with a variety of C and mineral substrates, as well as the rate of turnover and stabilization of microbial residues. Application of Raman microspectroscopy to microbial-mineral interactions represents a novel method to quantify microbial transformation of C substrates and subsequent mineral stabilization without destructive sampling, and has the potential to provide new insights to our conceptual understanding of carbon-microbe-mineral interactions.
NASA Astrophysics Data System (ADS)
Zheng, Xiaochun; Peng, Yankun; Li, Yongyu; Chao, Kuanglin; Qin, Jianwei
2017-05-01
The plate count method is commonly used to determine the total viable count (TVC) of bacteria in pork, but it is time-consuming and destructive. It has also been used to study changes in the TVC of pork under different storage conditions. In recent years, many researchers have explored non-destructive methods for detecting TVC using visible/near-infrared (VIS/NIR) and hyperspectral technology. In this study, the TVC in chilled pork stored under high-oxygen conditions was monitored using hyperspectral technology in order to evaluate the changes in total bacterial count during storage and, from these, the advantages and disadvantages of the storage condition. VIS/NIR hyperspectral images of the samples stored under high-oxygen conditions were acquired with a hyperspectral system in the range of 400-1100 nm. The reference value of total bacteria was measured by the standard plate count method, and the results were obtained in 48 hours. The reflectance spectra of the samples were extracted and used to establish a prediction model for TVC. Standard normal variate transformation (SNV), multiplicative scatter correction (MSC) and derivative preprocessing were applied to the original reflectance spectra. Partial least squares regression (PLSR) of TVC was performed and optimized as the prediction model. The results show that hyperspectral technology in the 400-1100 nm range combined with a PLSR model can describe the growth pattern of the total bacterial count of chilled pork under high-oxygen conditions clearly and rapidly. The results demonstrate that non-destructive determination of TVC based on NIR hyperspectral imaging has great potential for monitoring edible safety during the processing and storage of meat.
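The SNV and derivative preprocessing steps named in this abstract can be illustrated with a minimal sketch. It assumes the usual per-spectrum definition of SNV (subtract the spectrum's own mean, divide by its own standard deviation) and uses a simple first difference as the derivative; the abstract does not state which derivative or SD convention the authors used, and the reflectance values are hypothetical.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: centre and scale one spectrum by its own mean and SD."""
    mu = mean(spectrum)
    sd = stdev(spectrum)  # sample SD (n - 1); an assumption of this sketch
    if sd == 0:
        raise ValueError("flat spectrum: SNV is undefined")
    return [(v - mu) / sd for v in spectrum]


def first_difference(spectrum):
    """Crude first derivative: difference between neighbouring bands."""
    return [b - a for a, b in zip(spectrum, spectrum[1:])]


# Hypothetical reflectance values at five bands (not data from the study)
raw = [0.42, 0.45, 0.51, 0.48, 0.40]
corrected = snv(raw)
slope = first_difference(raw)
```

After SNV every spectrum has zero mean and unit standard deviation, which removes much of the sample-to-sample offset and scaling caused by scatter before a regression model is fitted.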
Wang, Jingzhe; Abulimiti, Aerzuna; Cai, Lianghong
2018-01-01
Soil salinization is one of the most common forms of land degradation. The detection and assessment of soil salinity is critical for the prevention of environmental deterioration especially in arid and semi-arid areas. This study introduced the fractional derivative in the pretreatment of visible and near infrared (VIS–NIR) spectroscopy. The soil samples (n = 400) collected from the Ebinur Lake Wetland, Xinjiang Uyghur Autonomous Region (XUAR), China, were used as the dataset. After measuring the spectral reflectance and salinity in the laboratory, the raw spectral reflectance was preprocessed by means of the absorbance and the fractional derivative order in the range of 0.0–2.0 order with an interval of 0.1. Two different modeling methods, namely, partial least squares regression (PLSR) and random forest (RF) with preprocessed reflectance were used for quantifying soil salinity. The results showed that more spectral characteristics were refined for the spectrum reflectance treated via fractional derivative. The validation accuracies showed that RF models performed better than those of PLSR. The most effective model was established based on RF with the 1.5 order derivative of absorbance with the optimal values of R2 (0.93), RMSE (4.57 dS m−1), and RPD (2.78 ≥ 2.50). The developed RF model was stable and accurate in the application of spectral reflectance for determining the soil salinity of the Ebinur Lake wetland. The pretreatment of fractional derivative could be useful for monitoring multiple soil parameters with higher accuracy, which could effectively help to analyze the soil salinity. PMID:29736341
Hirnschall, Nino; Norrby, Sverker; Weber, Maria; Maedel, Sophie; Amir-Asgari, Sahand; Findl, Oliver
2015-01-01
To include intraoperative measurements of the anterior lens capsule of the aphakic eye into the intraocular lens power calculation (IPC) process and to compare the refractive outcome with conventional IPC formulae. In this prospective study, a prototype operating microscope with an integrated continuous optical coherence tomography (OCT) device (Visante attached to OPMI VISU 200, Carl Zeiss Meditec AG, Germany) was used to measure the anterior lens capsule position after implanting a capsular tension ring (CTR). Optical biometry (intraocular lens (IOL) Master 500) and ACMaster measurements (Carl Zeiss Meditec AG, Germany) were performed before surgery. Autorefraction and subjective refraction were performed 3 months after surgery. Conventional IPC formulae were compared with a new intraoperatively measured anterior chamber depth (ACD) (ACDIntraOP) partial least squares regression (PLSR) model for prediction of the postoperative refractive outcome. In total, 70 eyes of 70 patients were included. Mean axial eye length (AL) was 23.3 mm (range: 20.6-29.5 mm). Predictive power of the intraoperative measurements was found to be slightly better compared to conventional IOL power calculations. Refractive error dependency on AL for Holladay I, HofferQ, SRK/T, Haigis and ACDIntraOP PLSR was r(2)=-0.42 (p<0.0001), r(2)=-0.5 (p<0.0001), r(2)=-0.34 (p=0.010), r(2)=-0.28 (p=0.049) and r(2)<0.001 (p=0.866), respectively. ACDIntraOP measurements help to better predict the refractive outcome and could be useful if implemented in fourth-generation IPC formulae.
Wang, Jingzhe; Ding, Jianli; Abulimiti, Aerzuna; Cai, Lianghong
2018-01-01
Soil salinization is one of the most common forms of land degradation. The detection and assessment of soil salinity is critical for the prevention of environmental deterioration especially in arid and semi-arid areas. This study introduced the fractional derivative in the pretreatment of visible and near infrared (VIS-NIR) spectroscopy. The soil samples (n = 400) collected from the Ebinur Lake Wetland, Xinjiang Uyghur Autonomous Region (XUAR), China, were used as the dataset. After measuring the spectral reflectance and salinity in the laboratory, the raw spectral reflectance was preprocessed by means of the absorbance and the fractional derivative order in the range of 0.0-2.0 order with an interval of 0.1. Two different modeling methods, namely, partial least squares regression (PLSR) and random forest (RF) with preprocessed reflectance were used for quantifying soil salinity. The results showed that more spectral characteristics were refined for the spectrum reflectance treated via fractional derivative. The validation accuracies showed that RF models performed better than those of PLSR. The most effective model was established based on RF with the 1.5 order derivative of absorbance with the optimal values of R2 (0.93), RMSE (4.57 dS m−1), and RPD (2.78 ≥ 2.50). The developed RF model was stable and accurate in the application of spectral reflectance for determining the soil salinity of the Ebinur Lake wetland. The pretreatment of fractional derivative could be useful for monitoring multiple soil parameters with higher accuracy, which could effectively help to analyze the soil salinity.
Tiyip, Tashpolat; Ding, Jianli; Zhang, Dong; Liu, Wei; Wang, Fei; Tashpolat, Nigara
2017-01-01
Effective pretreatment of spectral reflectance is vital to model accuracy in soil parameter estimation. However, the classic integer derivative has some disadvantages, including spectral information loss and the introduction of high-frequency noise. In this paper, the fractional order derivative algorithm was applied to the pretreatment and partial least squares regression (PLSR) was used to assess the clay content of desert soils. Overall, 103 soil samples were collected from the Ebinur Lake basin in the Xinjiang Uighur Autonomous Region of China, and used as data sets for calibration and validation. Following laboratory measurements of spectral reflectance and clay content, the raw spectral reflectance and absorbance data were treated using the fractional derivative order from the 0.0 to the 2.0 order (order interval: 0.2). The ratio of performance to deviation (RPD), determinant coefficients of calibration (Rc2), root mean square errors of calibration (RMSEC), determinant coefficients of prediction (Rp2), and root mean square errors of prediction (RMSEP) were applied to assess the performance of predicting models. The results showed that models built on the fractional derivative order performed better than when using the classic integer derivative. Comparison of the predictive effects of 22 models for estimating clay content, calibrated by PLSR, showed that those models based on the fractional derivative 1.8 order of spectral reflectance (Rc2 = 0.907, RMSEC = 0.425%, Rp2 = 0.916, RMSEP = 0.364%, and RPD = 2.484 ≥ 2.000) and absorbance (Rc2 = 0.888, RMSEC = 0.446%, Rp2 = 0.918, RMSEP = 0.383% and RPD = 2.511 ≥ 2.000) were most effective. Furthermore, they performed well in quantitative estimations of the clay content of soils in the study area. PMID:28934274
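To make the idea of a fractional-order derivative concrete, here is a minimal sketch based on the Grünwald-Letnikov definition, which is a common choice for spectral pretreatment. The abstract does not say which definition the authors used, so that choice, the function names and the unit band spacing are all assumptions of this example.

```python
def gl_weights(order, n):
    """First n Grünwald-Letnikov weights: w0 = 1, wk = w(k-1) * (1 - (order + 1) / k)."""
    w = [1.0]
    for k in range(1, n):
        w.append(w[-1] * (1.0 - (order + 1.0) / k))
    return w


def gl_derivative(signal, order, step=1.0):
    """Fractional derivative of a sampled signal (backward sum over all earlier points)."""
    w = gl_weights(order, len(signal))
    scale = step ** order
    return [
        sum(w[k] * signal[i - k] for k in range(i + 1)) / scale
        for i in range(len(signal))
    ]


# Hypothetical reflectance spectrum; orders 0.0 to 2.0 in steps of 0.2 as in the abstract
spectrum = [0.30, 0.32, 0.37, 0.41, 0.40, 0.36]
orders = [round(0.2 * i, 1) for i in range(11)]
treated = {a: gl_derivative(spectrum, a) for a in orders}
```

Order 0 returns the spectrum unchanged and orders 1 and 2 reduce to the ordinary first and second backward differences, so the non-integer orders interpolate between the classic integer derivatives that the abstract compares against.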
Sirisomboon, Panmanas; Chowbankrang, Rawiphan; Williams, Phil
2012-05-01
Near-infrared spectroscopy in diffuse reflection mode was used to evaluate the apparent viscosity of Para rubber field latex and concentrated latex over the wavelength range of 1100 to 2500 nm, using partial least squares regression (PLSR). The model with ten principal components (PCs) developed using the raw spectra accurately predicted the apparent viscosity with correlation coefficient (r), standard error of prediction (SEP), and bias of 0.974, 8.6 cP, and -0.4 cP, respectively. The ratio of the standard deviation to the SEP (RPD) and the ratio of the range to the SEP (RER) for the prediction were 4.4 and 16.7, respectively. Therefore, the model can be used for measurement of the apparent viscosity of field latex and concentrated latex in quality assurance and process control in the factory.
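The validation statistics quoted here can be computed as in the following sketch. It assumes the conventional chemometric definitions: bias is the mean residual, SEP is the bias-corrected standard deviation of the residuals, RPD = SD of the reference values / SEP, and RER = range of the reference values / SEP. The viscosity values are hypothetical, not data from the study.

```python
from math import sqrt
from statistics import mean, stdev


def prediction_stats(reference, predicted):
    """Return bias, SEP, RPD and RER for a validation set."""
    residuals = [p - r for p, r in zip(predicted, reference)]
    bias = mean(residuals)
    n = len(residuals)
    # SEP: standard deviation of the residuals after removing the bias
    sep = sqrt(sum((e - bias) ** 2 for e in residuals) / (n - 1))
    rpd = stdev(reference) / sep
    rer = (max(reference) - min(reference)) / sep
    return {"bias": bias, "sep": sep, "rpd": rpd, "rer": rer}


# Hypothetical reference and predicted viscosities in cP
stats = prediction_stats(
    [10.0, 20.0, 30.0, 40.0, 50.0],
    [12.0, 19.0, 33.0, 41.0, 50.0],
)
```

Under these definitions, the reported SEP of 8.6 cP together with RPD = 4.4 and RER = 16.7 would correspond to a reference standard deviation of roughly 38 cP and a reference range of roughly 144 cP.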
Feng, Yao-Ze; Elmasry, Gamal; Sun, Da-Wen; Scannell, Amalia G M; Walsh, Des; Morcy, Noha
2013-06-01
Bacterial pathogens are the main culprits for outbreaks of food-borne illnesses. This study aimed to use the hyperspectral imaging technique as a non-destructive tool for quantitative and direct determination of Enterobacteriaceae loads on chicken fillets. Partial least squares regression (PLSR) models were established and the best model using full wavelengths was obtained in the spectral range 930-1450 nm with coefficients of determination R² ≥ 0.82 and root mean squared errors (RMSEs) ≤ 0.47 log10 CFU g⁻¹. In further development of simplified models, second derivative spectra and weighted PLS regression coefficients (BW) were utilised to select important wavelengths. The three wavelengths (930, 1121 and 1345 nm) selected from BW proved competent and were preferred for predicting Enterobacteriaceae loads, with R² of 0.89, 0.86 and 0.87 and RMSEs of 0.33, 0.40 and 0.45 log10 CFU g⁻¹ for calibration, cross-validation and prediction, respectively. In addition, the constructed prediction map showed the distribution of Enterobacteriaceae on chicken fillets, which cannot be achieved by conventional methods. It was demonstrated that hyperspectral imaging is a potential tool for determining food sanitation and detecting bacterial pathogens on food matrices without complicated laboratory procedures.
NASA Astrophysics Data System (ADS)
Maimaitijiang, Maitiniyazi; Ghulam, Abduwasit; Sidike, Paheding; Hartling, Sean; Maimaitiyiming, Matthew; Peterson, Kyle; Shavers, Ethan; Fishman, Jack; Peterson, Jim; Kadam, Suhas; Burken, Joel; Fritschi, Felix
2017-12-01
Estimating crop biophysical and biochemical parameters with high accuracy at low-cost is imperative for high-throughput phenotyping in precision agriculture. Although fusion of data from multiple sensors is a common application in remote sensing, less is known on the contribution of low-cost RGB, multispectral and thermal sensors to rapid crop phenotyping. This is due to the fact that (1) simultaneous collection of multi-sensor data using satellites are rare and (2) multi-sensor data collected during a single flight have not been accessible until recent developments in Unmanned Aerial Systems (UASs) and UAS-friendly sensors that allow efficient information fusion. The objective of this study was to evaluate the power of high spatial resolution RGB, multispectral and thermal data fusion to estimate soybean (Glycine max) biochemical parameters including chlorophyll content and nitrogen concentration, and biophysical parameters including Leaf Area Index (LAI), above ground fresh and dry biomass. Multiple low-cost sensors integrated on UASs were used to collect RGB, multispectral, and thermal images throughout the growing season at a site established near Columbia, Missouri, USA. From these images, vegetation indices were extracted, a Crop Surface Model (CSM) was advanced, and a model to extract the vegetation fraction was developed. Then, spectral indices/features were combined to model and predict crop biophysical and biochemical parameters using Partial Least Squares Regression (PLSR), Support Vector Regression (SVR), and Extreme Learning Machine based Regression (ELR) techniques. 
Results showed that: (1) For biochemical variable estimation, multispectral and thermal data fusion provided the best estimate for nitrogen concentration and chlorophyll (Chl) a content (RMSE of 9.9% and 17.1%, respectively) and RGB color information based indices and multispectral data fusion exhibited the largest RMSE 22.6%; the highest accuracy for Chl a + b content estimation was obtained by fusion of information from all three sensors with an RMSE of 11.6%. (2) Among the plant biophysical variables, LAI was best predicted by RGB and thermal data fusion while multispectral and thermal data fusion was found to be best for biomass estimation. (3) For estimation of the above mentioned plant traits of soybean from multi-sensor data fusion, ELR yields promising results compared to PLSR and SVR in this study. This research indicates that fusion of low-cost multiple sensor data within a machine learning framework can provide relatively accurate estimation of plant traits and provide valuable insight for high spatial precision in agriculture and plant stress assessment.
NASA Astrophysics Data System (ADS)
Araya, F. Z.; Abdul-Aziz, O. I.
2017-12-01
This study utilized a systematic data analytics approach to determine the relative linkages of stream dissolved oxygen (DO) with the hydro-climatic and biogeochemical drivers across the U.S. Pacific Coast. Multivariate statistical techniques (Pearson correlation matrix, principal component analysis, and factor analysis) were applied to a complex water quality dataset (1998-2015) from 35 water quality monitoring stations of USGS NWIS and EPA STORET. Power-law based partial least squares regression (PLSR) models with a bootstrap Monte Carlo procedure (1000 iterations) were developed to reliably estimate the relative linkages by resolving multicollinearity (Nash-Sutcliffe Efficiency, NSE = 0.50-0.94). Based on the dominant drivers, four environmental regimes were identified that adequately described the system-data variances. In the Pacific Northwest and Southern California, water temperature was the most dominant driver of DO in the majority of the streams. However, in Central and Northern California, stream DO was controlled by multiple drivers (i.e., water temperature, pH, stream flow, and total phosphorus), exhibiting a transitional environmental regime. Further, total phosphorus (TP) appeared to be the limiting nutrient for most streams. The estimated linkages and insights would be useful to identify management priorities to achieve healthy coastal stream ecosystems across the Pacific Coast of the U.S.A. and similar regions around the world. Keywords: Data analytics, water quality, coastal streams, dissolved oxygen, environmental regimes, Pacific Coast, United States.
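The Nash-Sutcliffe Efficiency (NSE) used here to report model fit has a simple standard definition: one minus the ratio of the squared prediction errors to the squared deviations of the observations from their mean. The sketch below shows that calculation; the dissolved oxygen values are hypothetical and the function name is chosen for illustration only.

```python
def nash_sutcliffe(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2)."""
    mean_obs = sum(observed) / len(observed)
    error_ss = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    total_ss = sum((o - mean_obs) ** 2 for o in observed)
    if total_ss == 0:
        raise ValueError("observations are constant: NSE is undefined")
    return 1.0 - error_ss / total_ss


# Hypothetical observed and modelled dissolved oxygen (mg/L)
observed_do = [6.0, 8.0, 10.0, 12.0]
modelled_do = [6.5, 7.5, 10.5, 11.5]
nse = nash_sutcliffe(observed_do, modelled_do)
```

An NSE of 1 means a perfect fit and an NSE of 0 means the model does no better than predicting the mean of the observations, which puts the reported range of 0.50-0.94 in context.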
NASA Astrophysics Data System (ADS)
Bornemann, L.; Welp, G.; Amelung, W.
2009-04-01
Comprising more than 60 % of the terrestrial carbon pool, soil organic carbon (SOC) is one of the principal factors regulating the global C-cycle. Against the background of worldwide increasing CO2 emissions, much effort has been put to the modelling of soil-C turnover in order to evaluate its potential for mitigation of climate change. Soil organic matter is an ever changing assemblage of various organic components that interact with the mineral matrix and in dependence of its ecological environment. Carbon storage is thereby assumed to propagate by hierarchical saturation of different carbon pools. A homogeneous distribution of the respective pools within natural environments is unlikely as the controlling soil parameters are subject to spatial and temporal heterogeneity. Several attempts to operationalize this complex soil compartment have been proposed, most of them resting upon a concept of pools with different stability and varying turnover times. Among these pools, particulate organic matter (POM) is considered to be most sensitive to environmental changes and has been shown to explain major parts of the SOC variations. Until today, rather laborious physical and physico-chemical fractionation procedures are most commonly applied for the initialization and validation of POM in C-turnover models. Mid-infrared spectroscopy (MIRS) in combination with partial least squares regression (PLSR) could overcome this problem. The technique is fast, cheap, and requires little sample preparation. All the same, it is an appropriate technique not only for the determination of gross parameters like total soil organic carbon contents, but also for the determination and characterization of minor constituents like black carbon in soils. Basically, the infrared radiation is absorbed by molecules that express a dipole-moment during vibration. 
As virtually all constituents of soil organic matter and also a multitude of inorganic soil constituents express such a dipole-moment, plentiful chemical information can be extracted from absorption spectra of soil samples. In this work we present the development of calibration models for POM quantification via MIRS-PLSR, and the compilation of a raster data set including SOC and POM of three size classes for the testsite of the SFB-TR32 at Selhausen near Jülich (Germany). The studied test site is an orthic luvisol which has been sampled in a ten times ten meter raster from 0-30 cm depth (n=131). For POM fractionation samples were gently sonicated and material from 2000-250 µm was gained by wet sieving. After a second, more intense sonication, intermediate (250-53 µm) and fine (53-20 µm) material was also gained by wet sieving. All fractions were dried at 40 °C, carbon contents were determined by elemental analysis. For calibration of MIRS-PLSR, SOC contents of 87 bulk soil samples were determined by elemental analysis. Contributions of the different POM fractions to bulk SOC as well as the SOC contents within the particular POM fraction were determined for 36 soil samples by physical particle size fractionation as described above. MIRS-PLSR based predictions for the contribution of POM fractions to bulk soil proved to be satisfactory (R² >0.77) and improved with decreasing particle size. For the predictions of SOC contents in bulk soil and the different POM fractions R² even reached values ≥0.97. Root mean squared errors of the cross validations were in the range of standard deviations of the lab analysis or smaller. As physical fractionation methods are intrinsically susceptible to measurement errors, determination of POM fractions by MIRS analysis may even improve data sets for modelling. 
Apart from the generally convincing statistical parameters, further evidence for reliable predictions of the contributions of the different POM fractions to bulk SOC could be drawn from the spectral information itself. The spectral features utilized for the determination of the contribution of the different POM fractions to bulk SOC were matching the features for the prediction of the absolute SOC concentrations within the particular fractions. As these predictions were conducted with independent sample sets (bulk soil for the POM contribution and soil fractions for the SOC content within the fraction) the matching structural information for both features of the individual POM fraction indirectly validates the prediction for the POM pools. The latter is especially true as the observed features coincide with the actual knowledge on chemistry and stabilization of POM in soils. For the compilation of a complete raster data-set, the developed calibrations were applied to all of the 131 topsoil samples taken at the SFB-TR32 testsite. Correlation analysis indicated that the coarse and the intermediate POM fractions are related to each other, to bulk SOC content and textural parameters respectively, while the fine POM fraction seems to be independent from these factors. The observed coherences and the applicability of a C-saturation concept will be discussed by visual map-comparison and geostatistical analysis of the determined parameters.
Sharma, Ashok K.; Srivastava, Gopal N.; Roy, Ankita; Sharma, Vineet K.
2017-01-01
The experimental methods for the prediction of molecular toxicity are tedious and time-consuming tasks. Thus, the computational approaches could be used to develop alternative methods for toxicity prediction. We have developed a tool for the prediction of molecular toxicity along with the aqueous solubility and permeability of any molecule/metabolite. Using a comprehensive and curated set of toxin molecules as a training set, the different chemical and structural based features such as descriptors and fingerprints were exploited for feature selection, optimization and development of machine learning based classification and regression models. The compositional differences in the distribution of atoms were apparent between toxins and non-toxins, and hence, the molecular features were used for the classification and regression. On 10-fold cross-validation, the descriptor-based, fingerprint-based and hybrid-based classification models showed similar accuracy (93%) and Matthews's correlation coefficient (0.84). The performances of all the three models were comparable (Matthews's correlation coefficient = 0.84–0.87) on the blind dataset. In addition, the regression-based models using descriptors as input features were also compared and evaluated on the blind dataset. Random forest based regression model for the prediction of solubility performed better (R2 = 0.84) than the multi-linear regression (MLR) and partial least square regression (PLSR) models, whereas, the partial least squares based regression model for the prediction of permeability (caco-2) performed better (R2 = 0.68) in comparison to the random forest and MLR based regression models. The performance of final classification and regression models was evaluated using the two validation datasets including the known toxins and commonly used constituents of health products, which attests to its accuracy. 
The ToxiM web server would be a highly useful and reliable tool for the prediction of toxicity, solubility, and permeability of small molecules. PMID:29249969
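Because this abstract reports both accuracy and Matthews's correlation coefficient (MCC), a short sketch of how the two are obtained from a binary confusion matrix may help. The formula is the standard one; the confusion-matrix counts below are hypothetical and are not taken from the ToxiM evaluation.

```python
from math import sqrt


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for a binary classifier."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom


# Hypothetical counts: 100 toxins and 100 non-toxins
acc = accuracy(tp=90, tn=90, fp=10, fn=10)
score = mcc(tp=90, tn=90, fp=10, fn=10)
```

MCC uses all four cells of the confusion matrix, so it stays informative when the toxin and non-toxin classes are unbalanced, which is one reason it is often reported alongside accuracy.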
Farrés, Mireia; Piña, Benjamí; Tauler, Romà
2016-08-01
Copper-containing fungicides are used to protect vineyards from fungal infections. High copper residues in grapes are potentially toxic and affect the microorganisms living in vineyards, such as Saccharomyces cerevisiae. In this study, the response of the metabolic profiles of S. cerevisiae to different concentrations of copper sulphate (control, 1 mM, 3 mM and 6 mM) was analysed by liquid chromatography coupled to mass spectrometry (LC-MS) and multivariate curve resolution-alternating least squares (MCR-ALS) using an untargeted metabolomics approach. Peak areas of the MCR-ALS resolved elution profiles in control and Cu(II)-treated samples were compared using partial least squares regression (PLSR) and PLS-discriminant analysis (PLS-DA), and the intracellular metabolites contributing most to sample discrimination were selected and identified. Fourteen metabolites showed significant concentration changes upon Cu(II) exposure, following a dose-response effect. The observed changes were consistent with the expected effects of Cu(II) toxicity, including oxidative stress and DNA damage. This research confirmed that LC-MS based metabolomics coupled to chemometric methods is a powerful approach for discerning metabolic changes in S. cerevisiae and for elucidating modes of toxicity of environmental stressors, including heavy metals like Cu(II).
NASA Astrophysics Data System (ADS)
Will, R. M.; Li, A.; Glenn, N. F.; Benner, S. G.; Spaete, L.; Ilangakoon, N. T.
2015-12-01
Soil organic carbon distribution and the factors influencing this distribution are important for understanding carbon stores, vegetation dynamics, and the overall carbon cycle. Linking soil organic carbon (SOC) with aboveground vegetation biomass may provide a method to better understand SOC distribution in semiarid ecosystems. The Reynolds Creek Critical Zone Observatory (RC CZO) in Idaho, USA, is approximately 240 square kilometers and is situated in the semiarid Great Basin of the sagebrush-steppe ecosystem. Full waveform airborne lidar data and Next-Generation Airborne Visible/Infrared Imaging Spectrometer (AVIRIS-ng) collected in 2014 across the RC CZO are used to map vegetation biomass and SOC and then explore the relationships between them. Vegetation biomass is estimated by identifying vegetation species, and quantifying distribution and structure with lidar and integrating the field-measured biomass. Spectral data from AVIRIS-ng are used to differentiate non-photosynthetic vegetation (NPV) and soil, which are commonly confused in semiarid ecosystems. The information from lidar and AVIRIS-ng are then used to predict SOC by partial least squares regression (PLSR). An uncertainty analysis is provided, demonstrating the applicability of these approaches to improving our understanding of the distribution and patterns of SOC across the landscape.
The rapid measurement of soil carbon stock using near-infrared technology
NASA Astrophysics Data System (ADS)
Kusumo, B. H.; Sukartono; Bustan
2018-03-01
Because the soil pool stores three times as much carbon (C) as the atmospheric pool, depletion of the soil C stock will significantly increase the concentration of CO2 in the atmosphere and contribute to global warming. However, monitoring or measuring soil C stock with conventional procedures is time-consuming and expensive. A rapid, non-destructive technique that is simple and requires no chemical reagents is therefore needed. This research tested whether near-infrared (NIR) technology can rapidly measure C stock in the soil. Soil samples were collected from agricultural land in the sub-district of Kayangan, North Lombok, Indonesia, and the coordinates of the samples were recorded. Portions of the samples were analyzed using the conventional procedure (Walkley and Black) and other portions were scanned using near-infrared spectroscopy (NIRS) to collect soil spectra. Partial least squares regression (PLSR) was used to develop models from the soil C data measured by conventional analysis and the spectral data scanned by NIRS. The best model was moderately successful in measuring soil C stock in the study area in North Lombok. This indicates that NIR technology can be further used to monitor changes in soil C stock.
Determination of persimmon leaf chloride contents using near-infrared spectroscopy (NIRS).
de Paz, José Miguel; Visconti, Fernando; Chiaravalle, Mara; Quiñones, Ana
2016-05-01
Early diagnosis of specific chloride toxicity in persimmon trees requires the reliable and fast determination of the leaf chloride content, which is usually performed by means of a cumbersome, expensive and time-consuming wet analysis. A methodology has been developed in this study as an alternative to determine chloride in persimmon leaves using near-infrared spectroscopy (NIRS) in combination with multivariate calibration techniques. Based on a training dataset of 134 samples, a predictive model was developed from their NIR spectral data. For modelling, the partial least squares regression (PLSR) method was used. The best model was obtained with the first derivative of the apparent absorbance and using just 10 latent components. In the subsequent external validation carried out with 35 external samples, this model reached r² = 0.93, RMSE = 0.16% and RPD = 3.6, with a standard error of 0.026% and a bias of -0.05%. From these results, the model based on NIR spectral readings can be used for speeding up the laboratory determination of chloride in persimmon leaves with only a modest loss of precision. The intermolecular interaction between chloride ions and the peptide bonds in leaf proteins through hydrogen bonding, i.e. N-H···Cl, explains why chloride can be determined on the basis of NIR spectra.
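The validation statistics quoted above (r², RMSE, bias, RPD) can be computed from paired reference and predicted values. The sketch below uses common chemometric conventions as assumptions, since the abstract does not define them: r² is taken as 1 - SS_res/SS_tot, bias as the mean of (predicted - reference), and RPD as the sample standard deviation of the reference values divided by RMSE. The function name `validation_metrics` and the example numbers are hypothetical.

```python
import math

def validation_metrics(reference, predicted):
    """Return r2, RMSE, bias and RPD for an external validation set."""
    n = len(reference)
    residuals = [p - r for r, p in zip(reference, predicted)]
    bias = sum(residuals) / n                       # mean signed error
    ss_res = sum(e * e for e in residuals)
    rmse = math.sqrt(ss_res / n)                    # root mean square error
    mean_ref = sum(reference) / n
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    r2 = 1.0 - ss_res / ss_tot                      # assumed definition of r2
    sd_ref = math.sqrt(ss_tot / (n - 1))            # sample SD of reference values
    rpd = sd_ref / rmse                             # ratio of performance to deviation
    return {"r2": r2, "rmse": rmse, "bias": bias, "rpd": rpd}

# Made-up numbers, for illustration only.
metrics = validation_metrics([1.0, 2.0, 3.0, 4.0, 5.0],
                             [1.1, 1.9, 3.1, 3.9, 5.1])
```

Some authors instead report r² as the squared Pearson correlation, so values computed this way may differ slightly from published figures.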
Mapping The Temporal and Spatial Variability of Soil Moisture Content Using Proximal Soil Sensing
NASA Astrophysics Data System (ADS)
Virgawati, S.; Mawardi, M.; Sutiarso, L.; Shibusawa, S.; Segah, H.; Kodaira, M.
2018-05-01
In studies related to soil optical properties, it has been shown that the visible and NIR soil spectral response can predict soil moisture content (SMC) when proper data analysis techniques are used. SMC is one of the most important soil properties influencing most physical, chemical, and biological soil processes. The problem is how to provide reliable, fast and inexpensive information on SMC in the subsurface from numerous soil samples and repeated measurements. Spectroscopy has emerged as a rapid and low-cost tool for extensive investigation of soil properties. The objective of this research was to develop calibration models based on laboratory Vis-NIR spectroscopy to estimate the SMC at four different growth stages of the soybean crop in Yogyakarta Province. An ASD Field-spectrophotoradiometer was used to measure the reflectance of soil samples. Partial least square regression (PLSR) was performed to establish the relationship between the SMC and Vis-NIR soil reflectance spectra. The selected calibration model was used to predict the SMC of new samples. The temporal and spatial variability of SMC was presented in digital maps. The results revealed that the calibration model was excellent for SMC prediction. Vis-NIR spectroscopy was a reliable tool for the prediction of SMC.
Moreira, Maria João
2018-01-01
The aim of this study was to evaluate the potential of Fourier transform infrared (FTIR) spectroscopy coupled with chemometric methods to detect fish adulteration. Muscles of Atlantic salmon (Salmo salar) (SS) and salmon trout (Oncorhynchus mykiss) (OM) were mixed in different percentages and transformed into mini-burgers. These were stored at 3 °C, then examined at 0, 72, 160, and 240 h for deteriorative microorganisms. The mini-burgers were submitted to Soxhlet extraction, following which lipid extracts were analyzed by FTIR. Principal component analysis (PCA) described the studied adulteration using four principal components with an explained variance of 95.60%. PCA showed that the absorbance in the spectral regions around 721, 1097, 1370, 1464, 1655, 2805, 2935, and 3009 cm−1 may be attributed to biochemical fingerprints related to differences between SS and OM. Partial least squares regression (PLS-R) predicted the presence/absence of adulteration in fish samples of an external set with high accuracy. The proposed methods have the advantage of allowing quick measurements, regardless of the storage time of the adulterated fish. FTIR combined with chemometrics showed that a methodology to identify the adulteration of SS with OM can be established, even when samples are stored for different periods of time. PMID:29621135
[Determination of Carbaryl in Rice by Using FT Far-IR and THz-TDS Techniques].
Sun, Tong; Zhang, Zhuo-yong; Xiang, Yu-hong; Zhu, Ruo-hua
2016-02-01
Determination of carbaryl in rice by using Fourier transform far-infrared (FT-Far-IR) and terahertz time-domain spectroscopy (THz-TDS) combined with chemometrics was studied, and the spectral characteristics of carbaryl in the terahertz region were investigated. Samples were prepared by mixing carbaryl at different amounts with rice powder; each mixture was then compressed, with polyethylene (PE) as the matrix, into a pellet 13 mm in diameter and about 1 mm thick under a pressure of 5-7 tons. Terahertz time-domain spectra of the pellets were measured at 0.5-1.5 THz, and the absorption spectra at 1.6. 3 THz were acquired with Fourier transform far-IR spectroscopy. The method of sample preparation is so simple that it does not need separation and enrichment. Absorption peaks in the frequency range of 1.8-6.3 THz were found at 3.2 and 5.2 THz by Far-IR. There are several weak absorption peaks in the range of 0.5-1.5 THz by THz-TDS. These two kinds of characteristic absorption spectra were each randomly divided into a calibration set and a prediction set by leave-N-out cross-validation. Finally, the partial least squares regression (PLSR) method was used to establish two quantitative analysis models. The root mean square error of cross-validation (RMSECV), the root mean square error of prediction (RMSEP) and the correlation coefficient of prediction (R) were used as the basis for evaluating model performance. For R, a higher value is better; for RMSECV and RMSEP, lower is better. The results demonstrated that the predictive accuracy of the two PLSR models was satisfactory. For the FT-Far-IR model, the correlation between actual and predicted values of prediction samples (Rv) was 0.99. The root mean square error of the prediction set (RMSEP) was 0.0086, and that of the calibration set (RMSECV) was 0.0077. For the THz-TDS model, R was 0.98, RMSEP was 0.0044, and RMSECV was 0.0025. The results showed that FT-Far-IR and THz-TDS can be a feasible tool for the quantitative determination of carbaryl in rice. This paper provides a new method for the quantitative determination of pesticides in other grain samples.
NASA Astrophysics Data System (ADS)
Kriegs, Stefanie; Buddenbaum, Henning; Rogge, Derek; Steffens, Markus
2015-04-01
Laboratory imaging Vis-NIR spectroscopy of soil profiles is a novel technique in soil science that can determine quantity and quality of various chemical soil properties with a hitherto unreached spatial resolution in undisturbed soil profiles. We have applied this technique to soil cores in order to obtain quantitative proof of redoximorphic processes under two different tree species and to demonstrate tree-soil interactions at the microscale. Due to the imaging capabilities of Vis-NIR spectroscopy, a spatially explicit understanding of soil processes and properties can be achieved. Spatial heterogeneity of the soil profile can be taken into account. We took six 30 cm long rectangular soil columns of adjacent Luvisols derived from quaternary aeolian sediments (Loess) in a forest soil near Freising/Bavaria using stainless steel boxes (100×100×300 mm). Three profiles were sampled under Norway spruce and three under European beech. A hyperspectral camera (VNIR, 400-1000 nm in 160 spectral bands) with a spatial resolution of 63×63 µm² per pixel was used for data acquisition. Reference samples were taken at representative spots and analysed for organic carbon (OC) quantity and quality with a CN elemental analyser and for iron oxides (Fe) content using dithionite extraction followed by ICP-OES measurement. We compared two supervised classification algorithms, Spectral Angle Mapper and Maximum Likelihood, using different sets of training areas and spectral libraries. As established in chemometrics, we used multivariate analysis such as partial least-squares regression (PLSR) in addition to multivariate adaptive regression splines (MARS) to correlate chemical data with Vis-NIR spectra. As a result, elemental mapping of Fe and OC within the soil core at high spatial resolution has been achieved. The regression model was validated by a new set of reference samples for chemical analysis. Digital soil classification easily visualizes soil properties within the soil profiles. By combining both techniques, detailed soil maps, elemental balances and a deeper understanding of soil forming processes at the microscale become feasible for complete soil profiles.
Garriga, Miguel; Romero-Bravo, Sebastián; Estrada, Félix; Escobar, Alejandro; Matus, Iván A.; del Pozo, Alejandro; Astudillo, Cesar A.; Lobos, Gustavo A.
2017-01-01
Phenotyping, via remote and proximal sensing techniques, of the agronomic and physiological traits associated with yield potential and drought adaptation could contribute to improvements in breeding programs. In the present study, 384 genotypes of wheat (Triticum aestivum L.) were tested under fully irrigated (FI) and water stress (WS) conditions. The following traits were evaluated and assessed via spectral reflectance: grain yield (GY), spikes per square meter (SM2), kernels per spike (KPS), thousand-kernel weight (TKW), chlorophyll content (SPAD), stem water soluble carbohydrate concentration and content (WSC and WSCC, respectively), carbon isotope discrimination (Δ13C), and leaf area index (LAI). The performances of spectral reflectance indices (SRIs), four regression algorithms (PCR, PLSR, ridge regression RR, and SVR), and three classification methods (PCA-LDA, PLS-DA, and kNN) were evaluated for the prediction of each trait. For the classification approaches, two classes were established for each trait: the lower 80% of the trait variability range (Class 1) and the remaining 20% (Class 2 or elite genotypes). Both the SRIs and regression methods performed better when data from FI and WS were combined. The traits that were best estimated by SRIs and regression methods were GY and Δ13C. For most traits and conditions, the estimations provided by RR and SVR were the same as, or better than, those provided by the SRIs. PLS-DA showed the best performance among the categorical methods and, unlike the SRI and regression models, most traits were relatively well-classified within a specific hydric condition (FI or WS), suggesting that the classification approach is an effective tool to be explored in future studies related to genotype selection. PMID:28337210
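The two-class labelling described above can be sketched as follows. This is one possible reading, stated as an assumption: the cutoff is placed at 80% of the observed range (minimum plus 0.8 times the range), and values above it form Class 2. The function name `label_by_range` and the sample values are hypothetical, and for traits where lower values are desirable the direction would need to be reversed.

```python
def label_by_range(values, lower_fraction=0.8):
    """Label each value 1 (lower part of the range) or 2 (upper part, 'elite')."""
    lo, hi = min(values), max(values)
    cutoff = lo + lower_fraction * (hi - lo)   # boundary between the two classes
    return [2 if v > cutoff else 1 for v in values]

# Made-up trait values, for illustration only.
trait_values = [2.0, 4.0, 6.0, 9.0, 11.0, 12.0]
trait_classes = label_by_range(trait_values)
```

Because the split is defined on the range rather than on quantiles, the two classes generally do not contain 80% and 20% of the genotypes.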
Open-target sparse sensing of biological agents using DNA microarray
2011-01-01
Background: Current biosensors are designed to target and react to specific nucleic acid sequences or structural epitopes. These 'target-specific' platforms require creation of new physical capture reagents when new organisms are targeted. An 'open-target' approach to DNA microarray biosensing is proposed and substantiated using laboratory generated data. The microarray consisted of 12,900 25 bp oligonucleotide capture probes derived from a statistical model trained on randomly selected genomic segments of pathogenic prokaryotic organisms. Open-target detection of organisms was accomplished using a reference library of hybridization patterns for three test organisms whose DNA sequences were not included in the design of the microarray probes. Results: A multivariate mathematical model based on partial least squares regression (PLSR) was developed to detect the presence of three test organisms in mixed samples. When all 12,900 probes were used, the model correctly detected the signature of three test organisms in all mixed samples (mean R² = 0.76, CI = 0.95), with a 6% false positive rate. A sampling algorithm was then developed to sparsely sample the probe space for a minimal number of probes required to capture the hybridization imprints of the test organisms. The PLSR detection model was capable of correctly identifying the presence of the three test organisms in all mixed samples using only 47 probes (mean R² = 0.77, CI = 0.95) with nearly 100% specificity. Conclusions: We conceived an 'open-target' approach to biosensing, and hypothesized that a relatively small, non-specifically designed, DNA microarray is capable of identifying the presence of multiple organisms in mixed samples. Coupled with a mathematical model applied to laboratory generated data, and sparse sampling of capture probes, the prototype microarray platform was able to capture the signature of each organism in all mixed samples with high sensitivity and specificity. It was demonstrated that this new approach to biosensing closely follows the principles of sparse sensing. PMID:21801424
NASA Astrophysics Data System (ADS)
Fang, N. F.; Shi, Z. H.; Chen, F. X.; Zhang, H. Y.; Wang, Y. X.
2015-09-01
Understanding and quantifying sediment loads is important in watersheds with highly erodible materials, which will eventually cause environmental and ecological problems. Within this context, suspended sediment (SS) transport and its temporal dynamics were studied in a small mountainous watershed with sloping lands containing rock fragments in subtropical China. Soils containing rock fragments with many macro-pores have a high permeability rate. Over a 7-year period, the mean runoff coefficient of this watershed was 0.65. Overall, 30 flood events were monitored and accounted for 95.5%, 27.3%, and 17.1% of the total SS load, precipitation and total discharge, respectively, over a 5-year period. The presence of rock fragments in soils can affect soil loss. When comparing the soil loss in the studied watershed with that of other watersheds under similar climatic conditions, rock fragments negatively affect soil loss. However, an extreme event occurred on 14 August 1990, and the sediment load exhibited a phenomenon called "small deposits towards lump withdrawal", which resulted in a soil loss of 20,499 t (4.6 times the mean yearly soil loss). This event exhausted most of the SS stored by the rock fragments on the slope and channel. Following this event, the mean SS concentration (SSC) of the 11 events was 1.05 kg m-3, and the mean SSC of the 18 previous events was 1.75 kg m-3. Twelve variables were separated using the classical hydrograph separation method. Partial least-squares regression (PLSR) was used to determine the variables most highly correlated with the discharge. The results indicated that PLSR could explain runoff well. The relationship between discharge and SSC was highly scattered. During 24 flood events, three types of hysteresis loops were observed: clockwise (17 events), figure-eight (3 events), and complex (4 events).
Hyperspectral Remote Sensing of Terrestrial Ecosystem Productivity from ISS
NASA Astrophysics Data System (ADS)
Huemmrich, K. F.; Campbell, P. K. E.; Gao, B. C.; Flanagan, L. B.; Goulden, M.
2017-12-01
Data from the Hyperspectral Imager for Coastal Ocean (HICO), mounted on the International Space Station (ISS), were used to develop and test algorithms for remotely retrieving ecosystem productivity. The ISS orbit introduces both limitations and opportunities for observing ecosystem dynamics. Twenty-six HICO images were used from four study sites representing different vegetation types: grasslands, shrubland, and forest. Gross ecosystem production (GEP) data from eddy covariance were matched with HICO-derived spectra. Multiple algorithms were successful in relating spectral reflectance to GEP, including: Spectral Vegetation Indices (SVI), SVI in a light use efficiency model framework, spectral shape characteristics through spectral derivatives and absorption feature analysis, and statistical models leading to Multiband Hyperspectral Indices (MHI) from stepwise regressions and Partial Least Squares Regression (PLSR). Algorithms were able to achieve r2 better than 0.7 for both GEP at the overpass time and daily GEP. These algorithms were successful using a diverse set of observations combining data from multiple years, multiple times during the growing season, different times of day, different view angles, and different vegetation types. The demonstrated robustness of the algorithms presented in this study over these conditions provides some confidence in mapping spatial patterns of GEP, describing variability within fields as well as regional patterns based only on spectral reflectance information. The ISS orbit provides periods with multiple observations collected at different times of the day within a period of a few days. Diurnal GEP patterns were estimated by comparing the half-hourly average GEP from the flux tower against HICO estimates of GEP (r2=0.87) if morning, midday, and afternoon observations were available for average fluxes in the time period.
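Many spectral vegetation indices of the kind mentioned above are normalized differences of reflectance in two bands. The sketch below shows that generic form only; the abstract does not say which indices or bands were used, so the function names, the wavelengths (800 and 650 nm) and the reflectance values are all hypothetical.

```python
def band_value(wavelengths, reflectance, target_nm):
    """Return the reflectance of the band closest to the requested wavelength."""
    idx = min(range(len(wavelengths)), key=lambda i: abs(wavelengths[i] - target_nm))
    return reflectance[idx]

def normalized_difference(wavelengths, reflectance, band_a_nm, band_b_nm):
    """Generic normalized-difference index: (a - b) / (a + b)."""
    a = band_value(wavelengths, reflectance, band_a_nm)
    b = band_value(wavelengths, reflectance, band_b_nm)
    return (a - b) / (a + b)

# Made-up spectrum, for illustration only.
wl = [650, 700, 750, 800, 850]
refl = [0.05, 0.08, 0.30, 0.45, 0.47]
index_value = normalized_difference(wl, refl, 800, 650)
```

An index computed this way for each pixel or site could then be regressed against flux-tower GEP, which is the general idea behind the SVI-based algorithms listed in the abstract.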
NASA Astrophysics Data System (ADS)
Moore, T. S.; Sanderman, J.; Baldock, J.; Plante, A. F.
2016-12-01
National-scale inventories typically include soil organic carbon (SOC) content, but not chemical composition or biogeochemical stability. Australia's Soil Carbon Research Programme (SCaRP) represents a national inventory of SOC content and composition in agricultural systems. The program used physical fractionation followed by 13C nuclear magnetic resonance (NMR) spectroscopy. While these techniques are highly effective, they are typically too expensive and time consuming for use in large-scale SOC monitoring. We seek to understand if analytical thermal analysis is a viable alternative. Coupled differential scanning calorimetry (DSC) and evolved gas analysis (CO2- and H2O-EGA) yields valuable data on SOC composition and stability via ramped combustion. The technique requires little training to use, and does not require fractionation or other sample pre-treatment. We analyzed 300 agricultural samples collected by SCaRP, divided into four fractions: whole soil, coarse particulates (POM), untreated mineral associated (HUM), and hydrofluoric acid (HF)-treated HUM. All samples were analyzed by DSC-EGA, but only the POM and HF-HUM fractions were analyzed by NMR. Multivariate statistical analyses were used to explore natural clustering in SOC composition and stability based on DSC-EGA data. A partial least-squares regression (PLSR) model was used to explore correlations among the NMR and DSC-EGA data. Correlations demonstrated regions of combustion attributable to specific functional groups, which may relate to SOC stability. We are increasingly challenged with developing an efficient technique to assess SOC composition and stability at large spatial and temporal scales. Correlations between NMR and DSC-EGA may demonstrate the viability of using thermal analysis in lieu of more demanding methods in future large-scale surveys, and may provide data that goes beyond chemical composition to better approach quantification of biogeochemical stability.
Detection and quantification of adulteration in sandalwood oil through near infrared spectroscopy.
Kuriakose, Saji; Thankappan, Xavier; Joe, Hubert; Venkataraman, Venkateswaran
2010-10-01
The confirmation of authenticity of essential oils and the detection of adulteration are problems of increasing importance in the perfume, pharmaceutical, flavor and fragrance industries. This is especially true for 'value added' products like sandalwood oil. A methodical study is conducted here to demonstrate the potential use of Near Infrared (NIR) spectroscopy along with multivariate calibration models like principal component regression (PCR) and partial least square regression (PLSR) as rapid analytical techniques for the qualitative and quantitative determination of adulterants in sandalwood oil. After suitable pre-processing of the raw NIR spectral data, the models are built by cross-validation. The lowest Root Mean Square Error of Cross-Validation and Calibration (RMSECV and RMSEC % v/v) are used as a decision support criterion to fix the optimal number of factors. The coefficient of determination (R²) and the Root Mean Square Error of Prediction (RMSEP % v/v) in the prediction sets are used as the evaluation parameters (R² = 0.9999 and RMSEP = 0.01355). The overall result leads to the conclusion that NIR spectroscopy with chemometric techniques could be successfully used as a rapid, simple, instant and non-destructive method for the detection of adulterants, even 1% of the low-grade oils, in the high quality form of sandalwood oil.
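Choosing the number of factors by the lowest cross-validation error, as described above, reduces to a minimum search once RMSECV has been computed for each candidate. The sketch below assumes those errors are already available in a dictionary; the function name `optimal_factors`, the tie-breaking rule (prefer fewer factors) and the numbers are hypothetical and not taken from the study.

```python
def optimal_factors(rmsecv_by_factors):
    """Return the factor count with the lowest RMSECV; ties go to fewer factors."""
    return min(rmsecv_by_factors, key=lambda k: (rmsecv_by_factors[k], k))

# Made-up RMSECV values (% v/v) per number of factors, for illustration only.
rmsecv_example = {1: 0.90, 2: 0.40, 3: 0.25, 4: 0.25, 5: 0.30}
best_n_factors = optimal_factors(rmsecv_example)
```

Preferring the smaller model on ties is a common parsimony heuristic that guards against overfitting.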
Estimating Biochemical Parameters of Tea (camellia Sinensis (L.)) Using Hyperspectral Techniques
NASA Astrophysics Data System (ADS)
Bian, M.; Skidmore, A. K.; Schlerf, M.; Liu, Y.; Wang, T.
2012-07-01
Tea (Camellia sinensis (L.)) is an important economic crop, and the market price of tea depends largely on its quality. This research aims to explore the potential of hyperspectral remote sensing for predicting the concentration of biochemical components, namely total tea polyphenols, as indicators of tea quality at canopy scale. Experiments were carried out for tea plants growing in the field and greenhouse. Partial least squares regression (PLSR), which has proven to be one of the most successful empirical approaches, was performed to establish the relationship between reflectance and biochemical concentration across six tea varieties in the field. Moreover, a novel integrated approach involving the successive projections algorithm as a band selection method and neural networks was developed and applied to detect the concentration of total tea polyphenols for one tea variety, in order to explore and model complex nonlinear relationships between independent (wavebands) and dependent (biochemicals) variables. The good prediction accuracies (r2 > 0.8 and relative RMSEP < 10 %) achieved for tea plants using both linear (partial least squares regression) and nonlinear (artificial neural networks) modelling approaches in this study demonstrate the feasibility of using airborne and spaceborne sensors to cover wide areas of tea plantation for monitoring tea quality cheaply and rapidly.
Gas Chromatography Data Classification Based on Complex Coefficients of an Autoregressive Model
Zhao, Weixiang; Morgan, Joshua T.; Davis, Cristina E.
2008-01-01
This paper introduces autoregressive (AR) modeling as a novel method to classify outputs from gas chromatography (GC). The inverse Fourier transformation was applied to the original sensor data, and then an AR model was applied to transform data to generate AR model complex coefficients. This series of coefficients effectively contains a compressed version of all of the information in the original GC signal output. We applied this method to chromatograms resulting from proliferating bacteria species grown in culture. Three types of neural networks were used to classify the AR coefficients: backward propagating neural network (BPNN), radial basis function-principal component analysis (RBF-PCA) approach, and radial basis function-partial least squares regression (RBF-PLSR) approach. This exploratory study demonstrates the feasibility of using complex root coefficient patterns to distinguish various classes of experimental data, such as those from the different bacteria species. This cognition approach also proved to be robust and potentially useful for freeing us from time alignment of GC signals.
Elsohaby, Ibrahim; Windeyer, M Claire; Haines, Deborah M; Homerosky, Elizabeth R; Pearson, Jennifer M; McClure, J Trenton; Keefe, Greg P
2018-03-06
The objective of this study was to explore the potential of transmission infrared (TIR) spectroscopy in combination with partial least squares regression (PLSR) for quantification of dairy and beef cow colostral immunoglobulin G (IgG) concentration and assessment of colostrum quality. A total of 430 colostrum samples were collected from dairy (n = 235) and beef (n = 195) cows and tested by a radial immunodiffusion (RID) assay and TIR spectroscopy. Colostral IgG concentrations obtained by the RID assay were linked to the preprocessed spectra and divided into combined and prediction data sets. Three PLSR calibration models were built: one for the dairy cow colostrum only, the second for beef cow colostrum only, and the third for the merged dairy and beef cow colostrum. The predictive performance of each model was evaluated separately using the independent prediction data set. The Pearson correlation coefficients between IgG concentrations as determined by the TIR-based assay and the RID assay were 0.84 for dairy cow colostrum, 0.88 for beef cow colostrum, and 0.92 for the merged set of dairy and beef cow colostrum. The average of the differences between colostral IgG concentrations obtained by the RID- and TIR-based assays were -3.5, 2.7, and 1.4 g/L for dairy, beef, and merged colostrum samples, respectively. Further, the average relative error of the colostral IgG predicted by the TIR spectroscopy from the RID assay was 5% for dairy cow, 1.2% for beef cow, and 0.8% for the merged data set. The average intra-assay CV% of the IgG concentration predicted by the TIR-based method were 3.2%, 2.5%, and 6.9% for dairy cow, beef cow, and merged data set, respectively. The utility of TIR method for assessment of colostrum quality was evaluated using the entire data set and showed that TIR spectroscopy accurately identified the quality status of 91% of dairy cow colostrum, 95% of beef cow colostrum, and 89% and 93% of the merged dairy and beef cow colostrum samples, respectively. The results showed that TIR spectroscopy demonstrates potential as a simple, rapid, and cost-efficient method for use as an estimate of IgG concentration in dairy and beef cow colostrum samples and assessment of colostrum quality. The results also showed that merging the dairy and beef cow colostrum sample data sets improved the predictive ability of the TIR spectroscopy.
NASA Astrophysics Data System (ADS)
Vaudour, Emmanuelle; Gilliot, Jean-Marc; Bel, Liliane; Lefevre, Josias; Chehdi, Kacem
2016-04-01
This study was carried out in the framework of the TOSCA-PLEIADES-CO of the French Space Agency and benefited from data from the earlier PROSTOCK-Gessol3 project supported by the French Environment and Energy Management Agency (ADEME). It aimed at identifying the potential of airborne hyperspectral visible near-infrared AISA-Eagle data for predicting the topsoil organic carbon (SOC) content of bare cultivated soils over a large peri-urban area (221 km2) with intensive annual crop cultivation and both contrasted soils and SOC contents, located in the western region of Paris, France. Soils comprise hortic or glossic luvisols, calcaric, rendzic cambisols and colluvic cambisols. Airborne AISA-Eagle images (400-1000 nm, 126 bands) with 1 m resolution were acquired on 17 April 2013 over 13 tracks. Tracks were atmospherically corrected then mosaicked at a 2 m resolution using a set of 24 synchronous field spectra of bare soils, black and white targets and impervious surfaces. The land use identification system layer (RPG) of 2012 was used to mask non-agricultural areas; calculation and thresholding of NDVI from an atmospherically corrected SPOT4 image acquired the same day then made it possible to map agricultural fields with bare soil. A total of 101 sites, which were sampled either at the regional scale or within one field, were identified as bare by means of this map. Predictions were made from the mosaic AISA spectra, which were related to SOC contents by means of partial least squares regression (PLSR). Regression robustness was evaluated through a series of 1000 bootstrap data sets of calibration-validation samples, considering only those 75 sites outside cloud shadows, and different sampling strategies for selecting calibration samples. Validation root-mean-square errors (RMSE) ranged between 3.73 and 4.49 g kg-1, with a median of about 4 g kg-1. The best-performing models in terms of coefficient of determination (R²) and Residual Prediction Deviation (RPD) values were the calibration models derived either from Kennard-Stone or conditioned Latin Hypercube sampling on smoothed spectra. However, the most generalizable model, leading to the lowest RMSE value of 3.73 g kg-1 at the regional scale and 1.44 g kg-1 at the within-field scale and to low validation bias, was the cross-validated leave-one-out PLSR model constructed with the 28 near-synchronous samples and raw spectra.
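Kennard-Stone sampling, named above as one of the calibration-set selection strategies, can be sketched in a few lines. This is the textbook form of the algorithm using Euclidean distance on raw feature vectors, offered as an assumption: the abstract does not state the distance metric or the feature space (for example spectra versus principal component scores) used in the study, and the function name and toy points are hypothetical.

```python
import math

def kennard_stone(samples, n_select):
    """Return indices of n_select samples chosen to cover the feature space evenly."""
    n = len(samples)
    dist = [[math.dist(samples[i], samples[j]) for j in range(n)] for i in range(n)]
    # Start with the two samples that are farthest apart.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = [k for k in range(n) if k not in selected]
    # Repeatedly add the candidate whose nearest selected neighbour is farthest away.
    while len(selected) < n_select and remaining:
        nxt = max(remaining, key=lambda k: min(dist[k][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected[:n_select]

# Toy one-dimensional "spectra", for illustration only.
toy_samples = [[0.0], [1.0], [2.0], [9.0], [10.0]]
calibration_idx = kennard_stone(toy_samples, 3)
```

The samples not selected would typically serve as the validation set, so the calibration set spans the extremes of the spectral space while validation samples fall inside it.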
Li, Jian; Ma, Guowei; Ma, Lin; Bao, Xiaolin; Li, Liping; Zhao, Qian
2018-01-01
Effects of 1-methylcyclopropene (1-MCP) and vacuum precooling on quality and antioxidant properties of blackberries (Rubus spp.) were evaluated using one-way analysis of variance, principal component analysis (PCA), partial least squares (PLS), and path analysis. Results showed that the activities of antioxidant enzymes were enhanced by both 1-MCP treatment and vacuum precooling. PCA could discriminate between 1-MCP treated fruit and vacuum precooled fruit and showed that the radical-scavenging activities in vacuum precooled fruit were higher than those in 1-MCP treated fruit. The PCA scores showed that H2O2 content was the most important variable for blackberry fruit. PLSR results showed that peroxidase (POD) activity was negatively correlated with H2O2 content. The results of path coefficient analysis indicated that glutathione (GSH) also had an indirect effect on H2O2 content. PMID:29487622
Using LUCAS topsoil database to estimate soil organic carbon content in local spectral libraries
NASA Astrophysics Data System (ADS)
Castaldi, Fabio; van Wesemael, Bas; Chabrillat, Sabine; Chartin, Caroline
2017-04-01
The quantification of the soil organic carbon (SOC) content over large areas is mandatory to obtain accurate soil characterization and classification, which can improve site specific management at local or regional scale exploiting the strong relationship between SOC and crop growth. The estimation of the SOC is not only important for agricultural purposes: in recent years, the increasing attention towards global warming highlighted the crucial role of the soil in the global carbon cycle. In this context, soil spectroscopy is a well consolidated and widespread method to estimate soil variables exploiting the interaction between chromophores and electromagnetic radiation. The importance of spectroscopy in soil science is reflected by the increasing number of large soil spectral libraries collected in the world. These large libraries contain soil samples derived from a consistent number of pedological regions and thus from different parent material and soil types; this heterogeneity entails, in turn, a large variability in terms of mineralogical and organic composition. In the light of the huge variability of the spectral responses to SOC content and composition, a rigorous classification process is necessary to subset large spectral libraries and to avoid the calibration of global models failing to predict local variation in SOC content. In this regard, this study proposes a method to subset the European LUCAS topsoil database into soil classes using a clustering analysis based on a large number of soil properties. The LUCAS database was chosen to apply a standardized multivariate calibration approach valid for large areas without the need for extensive field and laboratory work for calibration of local models. 
Seven soil classes were detected by the clustering analyses and the samples belonging to each class were used to calibrate specific partial least square regression (PLSR) models to estimate SOC content of three local libraries collected in Belgium (Loam belt and Wallonia) and Luxembourg. The three local libraries only consist of spectral data (199 samples) acquired using the same protocol as the one used for the LUCAS database. SOC was estimated with good accuracy both within each local library (RMSE: 1.2-5.4 g kg-1; RPD: 1.41-2.06) and for the samples of the three libraries together (RMSE: 3.9 g kg-1; RPD: 2.47). The proposed approach could allow SOC to be estimated anywhere in Europe from spectra alone, without the need for chemical laboratory analyses, by exploiting the LUCAS database and specific PLSR models.
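The two accuracy figures quoted above, RMSE and RPD, can be illustrated with a minimal sketch. It assumes the common definition of RPD as the standard deviation of the reference values divided by the RMSE; the abstract does not state which convention the authors used, and the SOC values below are hypothetical.

```python
import math
import statistics


def rmse(observed, predicted):
    """Root mean squared error between reference and predicted values."""
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def rpd(observed, predicted):
    """Ratio of performance to deviation: sample SD of the reference / RMSE.

    Assumption: sample (n - 1) standard deviation; some authors use other variants.
    """
    return statistics.stdev(observed) / rmse(observed, predicted)


# Hypothetical SOC values in g/kg, for illustration only (not data from the study).
soc_reference = [8.0, 12.0, 15.0, 20.0, 25.0]
soc_predicted = [9.0, 11.0, 16.0, 19.0, 26.0]

print("RMSE:", rmse(soc_reference, soc_predicted))
print("RPD:", rpd(soc_reference, soc_predicted))
```

A higher RPD means the prediction error is small relative to the natural spread of the property being predicted.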
Nissen, Lise R; Byrne, Derek V; Bertelsen, Grete; Skibsted, Leif H
2004-11-01
Antioxidative efficiency of extracts of rosemary, green tea, coffee and grape skin in precooked pork patties was investigated during storage under retail conditions (10 days, 4 °C, atmospheric air), using descriptive sensory profiling following reheating and quantitative measurements of hexanal, thiobarbituric acid reactive substances (TBARS) and vitamin E as indicators of lipid oxidation. The initial oxidative status of pork patties (evaluated by ANOVA) showed a significantly lower level of secondary oxidation products and higher levels of vitamin E in patties with extracts incorporated, indicating that the extracts retarded lipid oxidation during processing of the meat. Data analysis for the storage study was based on a qualitative overview of sensory/chemical variation by principal component analysis (PCA), quantitative ANOVA-PLSR for determining the relationship between design variables (days of chill-storage, extract treatment) and sensory-chemical variables, and PLSR for elucidating the predictive ability of the chemical methods for sensory terms. Lipid oxidation was seen to involve a decrease in perception of meat flavour/odour and a concomitant increase in the off-flavour/odours linseed and rancid. TBARS, hexanal and vitamin E were all significant predictive indices (P<0.05) for the majority of the sensory terms, while vitamin E, through negative correlation with TBARS and hexanal, displayed its antioxidative effect and thus its ability to preserve sensory fresh meat flavour/odour. The effect of the various extracts incorporated in the product was clearly related to the degree of lipid oxidation and an overall ranking of the antioxidative efficiency of extracts in declining order became apparent: Rosemary>Grape skin>Tea>Coffee>Reference. Furthermore, the relation between extracts and vitamin E indicated that the extracts, to some extent, interacted with the vitamin and prevented it from degrading. 
In conclusion, the rosemary extract displayed potential for maintaining sensory eating quality in processed pork products.
Modeling pollen time series using seasonal-trend decomposition procedure based on LOESS smoothing
NASA Astrophysics Data System (ADS)
Rojo, Jesús; Rivero, Rosario; Romero-Morte, Jorge; Fernández-González, Federico; Pérez-Badia, Rosa
2017-02-01
Analysis of airborne pollen concentrations provides valuable information on plant phenology and is thus a useful tool in agriculture—for predicting harvests in crops such as the olive and for deciding when to apply phytosanitary treatments—as well as in medicine and the environmental sciences. Variations in airborne pollen concentrations, moreover, are indicators of changing plant life cycles. By modeling pollen time series, we can not only identify the variables influencing pollen levels but also predict future pollen concentrations. In this study, airborne pollen time series were modeled using a seasonal-trend decomposition procedure based on LOcally wEighted Scatterplot Smoothing (LOESS) smoothing (STL). The data series—daily Poaceae pollen concentrations over the period 2006-2014—was broken up into seasonal and residual (stochastic) components. The seasonal component was compared with data on Poaceae flowering phenology obtained by field sampling. Residuals were fitted to a model generated from daily temperature and rainfall values, and daily pollen concentrations, using partial least squares regression (PLSR). This method was then applied to predict daily pollen concentrations for 2014 (independent validation data) using results for the seasonal component of the time series and estimates of the residual component for the period 2006-2013. Correlation between predicted and observed values was r = 0.79 (correlation coefficient) for the pre-peak period (i.e., the period prior to the peak pollen concentration) and r = 0.63 for the post-peak period. Separate analysis of each of the components of the pollen data series enables the sources of variability to be identified more accurately than by analysis of the original non-decomposed data series, and for this reason, this procedure has proved to be a suitable technique for analyzing the main environmental factors influencing airborne pollen concentrations.
NASA Astrophysics Data System (ADS)
Ahmed, M. H.; Abdul-Aziz, O. I.
2017-12-01
Chlorophyll-a (Chl-a) is a key indicator for stream water quality and ecological health. The characterization of interplay between Chl-a and its numerous hydroclimatic and biogeochemical drivers is complex, and often involves multicollinear datasets. A systematic data analytics methodology was employed to determine the relative linkages of stream Chl-a with its dynamic environmental drivers at 50 stream water quality monitoring stations across the continental U.S. Multivariate statistical techniques of principal component analysis (PCA) and factor analysis (FA), in concert with Pearson correlation analysis, were applied to evaluate interrelationships among hydroclimatic, biogeochemical, and biological variables. Power-law based partial least square regression (PLSR) models were developed with a bootstrap Monte Carlo procedure (1000 iterations) to reliably estimate the comparative linkages of Chl-a by resolving multicollinearity in the data matrices (Nash-Sutcliffe efficiency = 0.50-0.87). The data analytics suggested four environmental regimes of stream Chl-a, dominated by nutrient, climate, redox, and hydro-atmospheric contributions, respectively. Total phosphorus (TP) was the most dominant driver of stream Chl-a in the nutrient-controlled regime. Water temperature demonstrated the strongest control of Chl-a in the climate-dominated regime. Furthermore, pH and stream flow were found to be the most important drivers of Chl-a in the redox and hydro-atmospheric component dominated regimes, respectively. The research led to a significant reduction of dimensionality in the large data matrices, providing quantitative and qualitative insights on the dynamics of stream Chl-a. The findings would be useful to manage stream water quality and ecosystem health in the continental U.S. and around the world under a changing climate and environment.
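The Nash-Sutcliffe efficiency (NSE) reported for these models compares the model's squared error with the variance of the observations around their mean. The following is a minimal sketch of the standard formula; the function name and the sample values are illustrative assumptions, not taken from the study.

```python
def nash_sutcliffe(observed, simulated):
    """NSE = 1 - sum((obs - sim)^2) / sum((obs - mean(obs))^2).

    1.0 is a perfect fit; 0.0 means the model is no better than
    predicting the mean of the observations.
    """
    mean_obs = sum(observed) / len(observed)
    error_ss = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    total_ss = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - error_ss / total_ss


# Hypothetical observed and simulated Chl-a values (illustrative only).
chl_observed = [1.0, 2.0, 3.0, 4.0, 5.0]
chl_simulated = [1.5, 2.0, 2.5, 4.0, 5.5]
print(nash_sutcliffe(chl_observed, chl_simulated))
```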
Elsohaby, Ibrahim; McClure, J Trenton; Riley, Christopher B; Bryanton, Janet; Bigsby, Kathryn; Shaw, R Anthony
2018-02-20
Attenuated total reflectance infrared (ATR-IR) spectroscopy is a simple, rapid and cost-effective method for the analysis of serum. However, the complex nature of serum remains a limiting factor to the reliability of this method. We investigated the benefits of coupling centrifugal ultrafiltration with ATR-IR spectroscopy for quantification of human serum IgA concentration. Human serum samples (n = 196) were analyzed for IgA using an immunoturbidimetric assay. ATR-IR spectra were acquired for whole serum samples and for the retentate (residue) reconstituted with saline following 300 kDa centrifugal ultrafiltration. IR-based analytical methods were developed for each of the two spectroscopic datasets, and the accuracy of the two methods was compared. Analytical methods were based upon partial least squares regression (PLSR) calibration models - one with 5 PLS factors (for whole serum) and the second with 9 PLS factors (for the reconstituted retentate). Comparison of the two sets of IR-based analytical results to reference IgA values revealed improvements in the Pearson correlation coefficient (from 0.66 to 0.76), and the root mean squared error of prediction in IR-based IgA concentrations (from 102 to 79 mg/dL) for the ultrafiltration retentate-based method as compared to the method built upon whole serum spectra. Depleting human serum low molecular weight proteins using a 300 kDa centrifugal filter thus enhances the accuracy of IgA quantification by ATR-IR spectroscopy. Further evaluation and optimization of this general approach may ultimately lead to routine analysis of a range of high molecular-weight analytical targets that are otherwise unsuitable for IR-based analysis. Copyright © 2017 Elsevier B.V. All rights reserved.
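To make the notion of a PLSR calibration model with a chosen number of PLS factors concrete, here is a minimal single-response (PLS1) sketch using the NIPALS deflation scheme in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the function names `pls1_fit` and `pls1_predict` are hypothetical, the data are synthetic, and in practice the number of factors would be chosen by cross-validation.

```python
import math


def pls1_fit(X, y, n_components):
    """Fit a PLS1 model; returns (x_mean, y_mean, [(w, p, q), ...])."""
    n, m = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(m)]
    y_mean = sum(y) / n
    E = [[row[j] - x_mean[j] for j in range(m)] for row in X]  # centered X
    f = [yi - y_mean for yi in y]                              # centered y
    components = []
    for _ in range(n_components):
        # Weight vector: direction in X-space of maximal covariance with y.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(m)]
        norm = math.sqrt(sum(wj * wj for wj in w))
        if norm < 1e-12:  # nothing left in y to explain
            break
        w = [wj / norm for wj in w]
        t = [sum(E[i][j] * w[j] for j in range(m)) for i in range(n)]  # scores
        tt = sum(ti * ti for ti in t)
        p = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(m)]  # X loadings
        q = sum(f[i] * t[i] for i in range(n)) / tt                          # y loading
        # Deflate X and y before extracting the next factor.
        E = [[E[i][j] - t[i] * p[j] for j in range(m)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        components.append((w, p, q))
    return x_mean, y_mean, components


def pls1_predict(model, x):
    """Predict y for one sample by repeating the deflation steps on x."""
    x_mean, y_mean, components = model
    e = [xj - mj for xj, mj in zip(x, x_mean)]
    y_hat = y_mean
    for w, p, q in components:
        t = sum(ej * wj for ej, wj in zip(e, w))
        y_hat += q * t
        e = [ej - t * pj for ej, pj in zip(e, p)]
    return y_hat


# Synthetic "spectra" (3 variables) with an exactly linear response.
X_demo = [[1, 2, 0], [2, 1, 1], [3, 5, 2], [4, 3, 5], [5, 8, 3], [6, 4, 7]]
y_demo = [2 * a - b + 0.5 * c + 3 for a, b, c in X_demo]
model_demo = pls1_fit(X_demo, y_demo, n_components=3)
print(pls1_predict(model_demo, [2, 2, 2]))
```

With as many factors as the rank of the centered data, PLS1 coincides with ordinary least squares; using fewer factors trades some fit for stability on collinear spectra.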
Chang, Xing; Jia, Hongmei; Zhou, Chao; Zhang, Hongwu; Yu, Meng; Yang, Junshan; Zou, Zhongmei
2015-12-01
Chaihu-Shu-Gan-San (CSGS) is a classical traditional Chinese medicine formula for the treatment of depression. Bai-Shao, one of the single herbs in CSGS, displays an antidepressant effect. In order to explore the role of Bai-Shao in the antidepressant effect of CSGS, the metabolic regulation and chemical profiles of CSGS with and without Bai-Shao (QBS) were investigated using metabonomics integrated with chemical fingerprinting. First, partial least squares regression (PLSR) analysis was applied to characterize the potential biomarkers associated with chronic unpredictable mild stress (CUMS)-induced depression. Among 46 differential metabolites found in the ultra-performance liquid chromatography quadrupole time of flight mass spectrometry (UPLC-Q-TOF/MS) and (1)H NMR-based urinary metabonomics, 20 were significantly correlated with the preferred sucrose consumption observed in behavior experiments and were considered as biomarkers to evaluate the antidepressant effect of CSGS. Based on differential regulation of CUMS-induced metabolic disturbances by CSGS and QBS treatments, we concluded that Bai-Shao made a crucial contribution to CSGS in improving the metabolic deviations of six biomarkers (i.e., glutamate, acetoacetic acid, creatinine, xanthurenic acid, kynurenic acid, and N-acetylserotonin) disturbed in CUMS-induced depression. The chemical constituents that Bai-Shao contributed to CSGS were paeoniflorin, albiflorin, isomaltopaeoniflorin, and benzoylpaeoniflorin, as determined by multivariate analysis of the UPLC-Q-TOF/MS chemical profiles of CSGS and QBS extracts. These findings suggested that Bai-Shao played an indispensable role in the antidepressant effect of CSGS. Copyright © 2015 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Chen, Sanming; Lin, Gang; Yin, Xianyang; Sun, Xiaolin; Xu, Jiasheng; Liu, Zhiying
2015-12-01
Sedimentary manganese deposits are widely distributed in North Guangxi and are characteristically accompanied by Celosia argentea, a plant with a strong ability to accumulate manganese. In order to study the relationship between the hyperspectral characteristics of Celosia argentea and the concentration of manganese in the soil, we combined B-layer soil from the mining area, background soil, and soil amended with MnCl4 reagent to prepare experimental soils with 10 levels of manganese content for a single batch of Celosia argentea. The levels are 0mg/kg, 4500mg/kg, 9000mg/kg, 13500mg/kg, 18000mg/kg, 18020mg/kg, 18040mg/kg, 18080mg/kg, 18160mg/kg. An ASD FieldSpec-4 was used to measure the anomalous spectra of these plants over a whole growth cycle. After pretreating the spectral data, we used the Successive Projections Algorithm (SPA) to select characteristic variables, reducing 1603 bands to 8 bands. Finally, the relationship between the spectral variables and the concentration of manganese was modeled by Partial Least Squares Regression (PLSR). The results show that the coefficients of determination (r2) are 0.8714 and 0.9141 for the two sets of data. The prediction results are satisfactory, but the first 5 groups lie closer to the regression line than the last 5 groups.
NASA Astrophysics Data System (ADS)
Suo, Lizhu; Huang, Mingbin; Zhang, Yongkun; Duan, Liangxia; Shan, Yan
2018-07-01
Soil moisture dynamics plays an active role in ecological and hydrological processes, and it depends on a large number of environmental factors, such as topographic attributes, soil properties, land use types, and precipitation. However, studies must still clarify the relative significance of these environmental factors at different soil depths and at different spatial scales. This study aimed: (1) to characterize temporal and spatial variations in soil moisture content (SMC) at four soil layers (0-40, 40-100, 100-200, and 200-500 cm) and three spatial scales (plot, hillslope, and region); and (2) to determine their dominant controls in diverse soil layers at different spatial scales over semiarid and semi-humid areas of the Loess Plateau, China. Given the high co-dependence of environmental factors, partial least squares regression (PLSR) was used to detect relative significance among 15 selected environmental factors that affect SMC. Temporal variation in SMC decreased with increasing soil depth, and vertical changes in the 0-500 cm soil profile were divided into a fast-changing layer (0-40 cm), an active layer (40-100 cm), a sub-active layer (100-200 cm), and a relatively stable layer (200-500 cm). PLSR models simulated SMC accurately in diverse soil layers at different scales; almost all values for variation in response (R2) and goodness of prediction (Q2) were >0.5 and >0.0975, respectively. Upper and lower layer SMCs were the two most important factors that influenced diverse soil layers at three scales, and these SMC variables exhibited the highest importance in projection (VIP) values. The 7-day antecedent precipitation and 7-day antecedent potential evapotranspiration contributed significantly to SMC only at the 0-40 cm soil layer. VIP of soil properties, especially sand and silt content, which influenced SMC strongly, increased significantly after increasing the measured scale. 
Mean annual precipitation and potential evapotranspiration also influenced SMC at the regional scale significantly. Overall, this study indicated that dominant controls of SMC varied among three spatial scales on the Loess Plateau, and VIP was a function of spatial scale and soil depth.
NASA Astrophysics Data System (ADS)
Gavilan, C.; Grunwald, S.; Quiroz, R.
2017-12-01
The Andes are the largest and highest mountain range in the tropics and are considered an important reserve of biodiversity, water provision and soil organic carbon (SOC) stocks. Nevertheless, limited attention has been given to estimating these stocks due to the lack of recent soil data, the poor accessibility and the wide range of coexistent ecosystems. In addition, conventional methods to determine SOC are usually too time consuming and expensive for large-scale studies, hindering accurate SOC assessment in the region. Proximal soil sensing techniques, such as visible near infrared (VNIR) and mid infrared (MIR) spectroscopy, have proven to be useful as an alternative to conventional methods for characterizing SOC but have not been tested in Andean soils. The aim of this study was to evaluate the potential of using VNIR and MIR spectroscopy to predict SOC content in the Central Andean region, using multivariate methods. Three study areas were selected across the Peruvian Central Andes. A total of 400 topsoil samples (0-30 cm) were collected and analyzed for SOC. The VNIR and MIR reflectance of the soil samples was measured in the laboratory. Three modeling approaches, partial least squares regression (PLSR), random forest (RF) and support vector machine (SVM), were used to predict SOC from VNIR and MIR spectra in the study areas. The data were preprocessed in order to minimize the noise and optimize the accuracy of predictions. The models, for each study area, were assessed using 10-fold cross validation. Independent validation was implemented in the whole dataset (400 observations) by splitting it into calibration (70%) and validation (30%) sets. Overall, the results indicate potential for both VNIR and MIR spectra to predict SOC content in the Andean soils. SOC content predictions from MIR spectra outperformed those from VNIR spectra. 
The evaluation of model performance shows that RF and SVM provide more accurate SOC predictions compared to PLSR. These findings suggest that integrating VNIR and MIR spectroscopy with machine learning algorithms constitutes a promising approach for assessing SOC content in high-Andean ecosystems.
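The two validation schemes described above, a 70/30 calibration/validation split and 10-fold cross validation, can be sketched as index bookkeeping in plain Python. The function names, the shuffling, and the seed are assumptions made for illustration; the abstract does not say how samples were assigned.

```python
import random


def split_calibration_validation(n_samples, validation_fraction=0.3, seed=0):
    """Randomly split sample indices into calibration and validation lists."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_validation = round(n_samples * validation_fraction)
    return indices[n_validation:], indices[:n_validation]


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k nearly equal-sized folds
    for i in range(k):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, folds[i]


# 400 observations, as in the study; the assignment itself is arbitrary here.
calibration, validation = split_calibration_validation(400, 0.3, seed=1)
print(len(calibration), len(validation))
for train, test in k_fold_indices(400, k=10, seed=1):
    pass  # fit a model on `train`, evaluate on `test`
```

Each sample appears in exactly one test fold, so every observation contributes once to the cross-validated error.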
Burns, Jennifer B.; Riley, Christopher B.; Shaw, R. Anthony; McClure, J. Trenton
2017-01-01
The objective of this study was to develop and compare the performance of laboratory grade and portable attenuated total reflectance infrared (ATR-IR) spectroscopic approaches in combination with partial least squares regression (PLSR) for the rapid quantification of alpaca serum IgG concentration, and the identification of low IgG (<1000 mg/dL), which is consistent with the diagnosis of failure of transfer of passive immunity (FTPI) in neonates. Serum samples (n = 175) collected from privately owned, healthy alpacas were tested by the reference method of radial immunodiffusion (RID) assay, and laboratory grade and portable ATR-IR spectrometers. Various pre-processing strategies were applied to the ATR-IR spectra that were linked to corresponding RID-IgG concentrations, and then randomly split into two sets: calibration (training) and test sets. PLSR was applied to the calibration set and calibration models were developed, and the test set was used to assess the accuracy of the analytical method. For the test set, the Pearson correlation coefficient between the IgG measured by RID and that predicted by ATR-IR was 0.91 for both the laboratory grade and the portable spectrometer. The average differences between reference serum IgG concentrations and the two IR-based methods were 120.5 mg/dL and 71 mg/dL for the laboratory and portable ATR-IR-based assays, respectively. Adopting an IgG concentration <1000 mg/dL as the cut-point for FTPI cases, the sensitivity, specificity, and accuracy for identifying serum samples below this cut point by laboratory ATR-IR assay were 86, 100 and 98%, respectively (within the entire data set). Corresponding values for the portable ATR-IR assay were 95, 99 and 99%, respectively. 
These results suggest that the two different ATR-IR assays performed similarly for rapid qualitative evaluation of alpaca serum IgG. For diagnosis of IgG <1000 mg/dL, the portable ATR-IR spectrometer performed slightly better, and it provides more flexibility for potential application in the field. PMID:28651006
Elsohaby, Ibrahim; Burns, Jennifer B; Riley, Christopher B; Shaw, R Anthony; McClure, J Trenton
2017-01-01
The objective of this study was to develop and compare the performance of laboratory grade and portable attenuated total reflectance infrared (ATR-IR) spectroscopic approaches in combination with partial least squares regression (PLSR) for the rapid quantification of alpaca serum IgG concentration, and the identification of low IgG (<1000 mg/dL), which is consistent with the diagnosis of failure of transfer of passive immunity (FTPI) in neonates. Serum samples (n = 175) collected from privately owned, healthy alpacas were tested by the reference method of radial immunodiffusion (RID) assay, and laboratory grade and portable ATR-IR spectrometers. Various pre-processing strategies were applied to the ATR-IR spectra that were linked to corresponding RID-IgG concentrations, and then randomly split into two sets: calibration (training) and test sets. PLSR was applied to the calibration set and calibration models were developed, and the test set was used to assess the accuracy of the analytical method. For the test set, the Pearson correlation coefficient between the IgG measured by RID and that predicted by ATR-IR was 0.91 for both the laboratory grade and the portable spectrometer. The average differences between reference serum IgG concentrations and the two IR-based methods were 120.5 mg/dL and 71 mg/dL for the laboratory and portable ATR-IR-based assays, respectively. Adopting an IgG concentration <1000 mg/dL as the cut-point for FTPI cases, the sensitivity, specificity, and accuracy for identifying serum samples below this cut point by laboratory ATR-IR assay were 86, 100 and 98%, respectively (within the entire data set). Corresponding values for the portable ATR-IR assay were 95, 99 and 99%, respectively. 
These results suggest that the two different ATR-IR assays performed similarly for rapid qualitative evaluation of alpaca serum IgG. For diagnosis of IgG <1000 mg/dL, the portable ATR-IR spectrometer performed slightly better, and it provides more flexibility for potential application in the field.
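The sensitivity, specificity, and accuracy figures above follow from classifying each sample as below or above the 1000 mg/dL cut-point by both the reference assay and the IR-based prediction. A minimal sketch, with a hypothetical function name and made-up IgG values:

```python
def diagnostic_metrics(reference, predicted, cut_point=1000.0):
    """Sensitivity, specificity, accuracy for detecting values below cut_point.

    A sample is a true case (e.g. FTPI) when its reference value is below the
    cut-point, and a test-positive when its predicted value is below it.
    """
    tp = fn = tn = fp = 0
    for ref, pred in zip(reference, predicted):
        is_case = ref < cut_point
        flagged = pred < cut_point
        if is_case and flagged:
            tp += 1
        elif is_case:
            fn += 1
        elif flagged:
            fp += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical IgG concentrations in mg/dL (illustrative only).
igg_reference = [800, 950, 1200, 1500, 990, 2000]
igg_predicted = [850, 1020, 1100, 1400, 900, 1900]
print(diagnostic_metrics(igg_reference, igg_predicted))
```

The sketch assumes the data contain at least one sample on each side of the cut-point; otherwise one of the ratios is undefined.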
Tanaka, Ryoma; Takahashi, Naoyuki; Nakamura, Yasuaki; Hattori, Yusuke; Ashizawa, Kazuhide; Otsuka, Makoto
2017-01-01
Resonant acoustic® mixing (RAM) technology is a system that performs high-speed mixing by vibration through the control of acceleration and frequency. In recent years, real-time process monitoring and prediction has become of increasing interest, and process analytical technology (PAT) systems will be increasingly introduced into actual manufacturing processes. This study examined the application of PAT with the combination of RAM, near-infrared spectroscopy, and chemometric technology as a set of PAT tools for introduction into actual pharmaceutical powder blending processes. Content uniformity was based on a robust partial least squares regression (PLSR) model constructed to manage the RAM configuration parameters and the changing concentration of the components. As a result, in-line real-time prediction of active pharmaceutical ingredients and other additives using chemometric technology was successfully demonstrated, indicating that real-time monitoring may be possible. This system is expected to be applicable to the RAM method for the risk management of quality.
NASA Astrophysics Data System (ADS)
Li, Wenlong; Cheng, Zhiwei; Wang, Yuefei; Qu, Haibin
2013-01-01
In this paper we describe the strategy used in the development and validation of a near infrared spectroscopy method for the rapid determination of baicalin, chlorogenic acid, ursodeoxycholic acid (UDCA), chenodeoxycholic acid (CDCA), and the total solid contents (TSCs) in the Tanreqing injection. To increase the representativeness of the calibration sample set, a concentrating-diluting method was adopted to artificially prepare samples. Partial least square regression (PLSR) was used to establish calibration models, with which the five quality indicators can be determined with satisfactory accuracy and repeatability. In addition, the slope/bias (S/B) method was used for model transfer between two different types of NIR instruments from the same manufacturer, which helps extend the application range of the established models. With the presented method, a great deal of time, effort and money can be saved when large amounts of Tanreqing injection samples need to be analyzed in a relatively short period of time, which is of great significance to the traditional Chinese medicine (TCM) industries.
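The slope/bias (S/B) correction mentioned above is commonly described as a univariate linear correction: predictions obtained on the secondary instrument are regressed against the corresponding values from the primary instrument (or the reference method), and the fitted slope and bias are then applied to new predictions. The sketch below follows that common description; the abstract does not give the exact procedure used, and the names and numbers are hypothetical.

```python
def fit_slope_bias(secondary_predictions, primary_values):
    """Least-squares line mapping secondary-instrument predictions to primary values."""
    n = len(secondary_predictions)
    mean_x = sum(secondary_predictions) / n
    mean_y = sum(primary_values) / n
    sxx = sum((x - mean_x) ** 2 for x in secondary_predictions)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(secondary_predictions, primary_values))
    slope = sxy / sxx
    bias = mean_y - slope * mean_x
    return slope, bias


def apply_slope_bias(prediction, slope, bias):
    """Correct a new secondary-instrument prediction."""
    return slope * prediction + bias


# Hypothetical transfer-set values (illustrative only).
secondary = [1.0, 2.0, 3.0, 4.0]
primary = [2.5, 4.5, 6.5, 8.5]
slope_demo, bias_demo = fit_slope_bias(secondary, primary)
print(slope_demo, bias_demo, apply_slope_bias(5.0, slope_demo, bias_demo))
```

Because only two parameters are estimated, the correction needs few transfer samples, but it can only compensate for differences between instruments that are linear in the predicted value.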
Yulia, Meinilwita
2017-01-01
Asian palm civet coffee or kopi luwak (Indonesian words for coffee and palm civet) is well known as the world's priciest and rarest coffee. To protect the authenticity of luwak coffee and protect consumers from luwak coffee adulteration, it is very important to develop a robust and simple method for determining the adulteration of luwak coffee. In this research, the use of UV-Visible spectra combined with PLSR was evaluated to establish rapid and simple methods for quantification of adulteration in luwak-arabica coffee blend. Several preprocessing methods were tested, and the results show that most of them were effective in improving the quality of the calibration models. The best PLS calibration model was obtained with Savitzky-Golay smoothed spectra, which had the lowest RMSECV (0.039) and highest RPDcal value (4.64). Using this PLS model, luwak content was predicted with satisfactory performance, with both high RPDp and high RER values. PMID:28913348
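Savitzky-Golay smoothing replaces each spectral point with the value of a low-order polynomial fitted to a small window around it. The sketch below uses the classic 5-point quadratic kernel (-3, 12, 17, 12, -3)/35; the window size and polynomial order actually used in the study are not stated in the abstract, so these are assumptions, as is the choice to leave the two points at each edge unsmoothed.

```python
# Classic Savitzky-Golay coefficients for a 5-point window, quadratic fit.
SG_COEFFICIENTS = (-3, 12, 17, 12, -3)
SG_NORMALIZER = 35


def savgol_smooth_5(spectrum):
    """Smooth a 1-D spectrum; the two points at each end are left unchanged."""
    smoothed = list(spectrum)
    for i in range(2, len(spectrum) - 2):
        window = spectrum[i - 2:i + 3]
        smoothed[i] = sum(c * v for c, v in zip(SG_COEFFICIENTS, window)) / SG_NORMALIZER
    return smoothed


# A quadratic signal passes through unchanged; an isolated spike is damped.
parabola = [x * x for x in range(7)]
spike = [0, 0, 0, 35, 0, 0, 0]
print(savgol_smooth_5(parabola))
print(savgol_smooth_5(spike))
```

Unlike a plain moving average, the polynomial fit preserves peak shapes that are locally quadratic, which is why the filter is popular for absorbance spectra.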
Wang, Pei; Zhang, Hui; Yang, Hailong; Nie, Lei; Zang, Hengchang
2015-02-25
Near-infrared (NIR) spectroscopy has developed into an indispensable tool for both academic research and industrial quality control in a wide field of applications. The feasibility of NIR spectroscopy to monitor the concentration of puerarin, daidzin, daidzein and total isoflavonoid (TIF) during the extraction process of kudzu (Pueraria lobata) was verified in this work. NIR spectra were collected in transmission mode and pretreated with smoothing and derivative transformations. Partial least square regression (PLSR) was used to establish calibration models. Three different variable selection methods, including the correlation coefficient method, interval partial least squares (iPLS), and the successive projections algorithm (SPA), were applied and compared with models based on all of the variables. The results showed that the approach was very efficient and environmentally friendly for rapid determination of the four quality indices (QIs) in the kudzu extraction process. The established method may have the potential to be used as a process analytical technology (PAT) tool in the future. Copyright © 2014 Elsevier B.V. All rights reserved.
Towards decadal soil salinity mapping using Landsat time series data
NASA Astrophysics Data System (ADS)
Fan, Xingwang; Weng, Yongling; Tao, Jinmei
2016-10-01
Salinization is one of the major soil problems around the world. However, decadal variation in soil salinization has not yet been extensively reported. This study exploited thirty years (1985-2015) of Landsat sensor data, including Landsat-4/5 TM (Thematic Mapper), Landsat-7 ETM+ (Enhanced Thematic Mapper Plus) and Landsat-8 OLI (Operational Land Imager), for monitoring soil salinity of the Yellow River Delta, China. The data were initially corrected for atmospheric effects, and then matched to the spectral bands of EO-1 (Earth Observing One) ALI (Advanced Land Imager). Subsequently, soil salinity maps were derived with a previously developed PLSR (Partial Least Square Regression) model. On the intra-annual scale, the retrievals showed that soil salinity increased in February, stabilized in March, and decreased in April. On the inter-annual scale, soil salinity decreased within 1985-2000 (-0.74 g kg-1/10a, p < 0.001), and increased within 2000-2015 (0.79 g kg-1/10a, p < 0.001). Our study presents a new perspective for the use of multiple Landsat data in soil salinity retrieval, and furthers the understanding of soil salinization development over the Yellow River Delta.
Mabood, Fazal; Abbas, Ghulam; Jabeen, Farah; Naureen, Zakira; Al-Harrasi, Ahmed; Hamaed, Ahmad M; Hussain, Javid; Al-Nabhani, Mahmood; Al Shukaili, Maryam S; Khan, Alamgir; Manzoor, Suryyia
2018-03-01
Cows' butterfat may be adulterated with animal fat materials like tallow which causes increased serum cholesterol and triglycerides levels upon consumption. There is no reliable technique to detect and quantify tallow adulteration in butter samples in a feasible way. In this study a highly sensitive near-infrared (NIR) spectroscopy combined with chemometric methods was developed to detect as well as quantify the level of tallow adulterant in clarified butter samples. For this investigation the pure clarified butter samples were intentionally adulterated with tallow at the following percentage levels: 1%, 3%, 5%, 7%, 9%, 11%, 13%, 15%, 17% and 20% (wt/wt). Altogether 99 clarified butter samples were used including nine pure samples (un-adulterated clarified butter) and 90 clarified butter samples adulterated with tallow. Each sample was analysed by using NIR spectroscopy in the reflection mode in the range 10,000-4000 cm-1, at 2 cm-1 resolution and using the transflectance sample accessory which provided a total path length of 0.5 mm. Chemometric models including principal components analysis (PCA), partial least-squares discriminant analysis (PLSDA), and partial least-squares regressions (PLSR) were applied for statistical treatment of the obtained NIR spectral data. The PLSDA model was employed to differentiate pure butter samples from those adulterated with tallow. The employed model was then externally cross-validated by using a test set which included 30% of the total butter samples. The excellent performance of the model was proved by the low RMSEP value of 1.537% and the high correlation factor of 0.95. This newly developed method is robust, non-destructive, highly sensitive, and economical with very minor sample preparation and good ability to quantify less than 1.5% of tallow adulteration in clarified butter samples.
Saraiva, C; Vasconcelos, H; de Almeida, José M M M
2017-01-16
The aim of this work was to investigate the potential of Fourier transform infrared spectroscopy (FTIR) to detect and predict the bacterial load of salmon fillets (Salmo salar) stored at 3, 8 and 30°C under three packaging conditions: air packaging (AP) and two modified atmospheres constituted by a mixture of 50% N2/40% CO2/10% O2 with lemon juice (MAPL) and without lemon juice (MAP). Fresh salmon samples were periodically examined for total viable counts (TVC), specific spoilage organisms (SSO) counts, pH, FTIR and sensory assessment of freshness. Principal components analysis (PCA) allowed identification of the wavenumbers potentially correlated with the spoilage process. Linear discriminant analysis (LDA) of infrared spectral data was performed to support sensory data and to accurately identify sample freshness. The effect of the packaging atmospheres was assessed by microbial enumeration and LDA was used to determine sample packaging from the measured infrared spectra. It was verified that modified atmospheres can significantly decrease the bacterial load of fresh salmon. Lemon juice combined with MAP showed a more pronounced delay in the growth of Brochothrix thermosphacta, Photobacterium phosphoreum, psychrotrophs and H2S producers. Partial least squares regression (PLS-R) allowed estimates of TVC and psychrotrophs, lactic acid bacteria, molds and yeasts, Brochothrix thermosphacta, Enterobacteriaceae, Pseudomonas spp. and H2S producer counts from the infrared spectral data. For TVC, the root mean square error of prediction (RMSEP) value was 0.78 log cfu g-1 for an external set of samples. According to the results, FTIR can be used as a reliable, accurate and fast method for real time freshness evaluation of salmon fillets stored under different temperatures and packaging atmospheres. Copyright © 2016 Elsevier B.V. All rights reserved.
Lin, Ping; Chen, Yong-ming; Yao, Zhi-lei
2015-11-01
A novel method combining chemometrics and hyperspectral imaging was presented to detect the temperatures of Ethylene-Vinyl Acetate copolymer (EVA) films in photovoltaic cells during the thermal encapsulation process. Four varieties of EVA films, which had been heated at temperatures of 128, 132, 142 and 148 °C during the photovoltaic cell production process, were investigated in this paper. These copolymer encapsulation films were first scanned by the hyperspectral imaging equipment (Spectral Imaging Ltd. Oulu, Finland). The scanning band range of the hyperspectral equipment was set between 904.58 and 1700.01 nm. The hyperspectral dataset of copolymer films was randomly divided into two parts for training and testing. For each film type, the training set and test set contained 90 and 10 instances, respectively. The obtained hyperspectral images of EVA films were processed using the ENVI (Exelis Visual Information Solutions, USA) software. The size of the region of interest (ROI) of each obtained hyperspectral image of EVA film was set as 150 x 150 pixels. The average of the reflectance hyperspectra of all the pixels in the ROI was used as the characteristic curve to represent the instance. Three kinds of chemometric methods, including partial least squares regression (PLSR), multi-class support vector machine (SVM) and large margin nearest neighbor (LMNN), were used to correlate the characteristic hyperspectra with the encapsulation temperatures of the copolymer films. The plot of weighted regression coefficients illustrated that both the short- and long-wave near infrared bands of the hyperspectral data contributed to enhancing the prediction accuracy of the forecast model. Because the attained reflectance hyperspectral data of EVA materials displayed strong nonlinearity, the prediction performance of the linear PLSR modeling method declined and its prediction precision reached only 95%.
The kernel-based forecast models were introduced to reduce the impact of the nonlinearity by mapping the original nonlinear hyperspectral data to a high dimensional linear feature space, so that the relationship between the nonlinear hyperspectral data and the encapsulation temperatures of EVA films was fully disclosed. Among the three proposed models, LMNN gave the best prediction performance, with a final recognition accuracy of 100%. The results indicated that the combination of the LMNN model with hyperspectral imaging was the best method for accurately and rapidly determining the encapsulation temperatures of EVA films of photovoltaic cells. In addition, this paper creates favorable conditions for automatically monitoring and effectively controlling the encapsulation temperatures of EVA films in the photovoltaic cell production process.
NASA Astrophysics Data System (ADS)
Rosero-Vlasova, O.; Borini Alves, D.; Vlassova, L.; Perez-Cabello, F.; Montorio Lloveria, R.
2017-10-01
Deforestation in the Amazon basin due, among other factors, to frequent wildfires demands continuous post-fire monitoring of soil and vegetation. Thus, the study posed two objectives: (1) evaluate the capacity of Visible - Near InfraRed - ShortWave InfraRed (VIS-NIR-SWIR) spectroscopy to estimate soil organic matter (SOM) in fire-affected soils, and (2) assess the feasibility of SOM mapping from satellite images. For this purpose, 30 soil samples (surface layer) were collected in 2016 in areas of grass and riparian vegetation of Campos Amazonicos National Park, Brazil, repeatedly affected by wildfires. Standard laboratory procedures were applied to determine SOM. Reflectance spectra of soils were obtained in controlled laboratory conditions using a Fieldspec4 spectroradiometer (spectral range 350-2500 nm). Measured spectra were resampled to simulate reflectances for Landsat-8, Sentinel-2 and EnMap spectral bands, which were used as predictors in SOM models developed using Partial Least Squares regression and a step-down variable selection algorithm (PLSR-SD). The best fit was achieved with models based on reflectances simulated for EnMap bands (R2=0.93; R2cv=0.82 and NMSE=0.07; NMSEcv=0.19). The model uses only 8 out of 244 predictors (bands) chosen by the step-down variable selection algorithm. The least reliable estimates (R2=0.55 and R2cv=0.40 and NMSE=0.43; NMSEcv=0.60) resulted from the Landsat model, while the Sentinel-2 model showed R2=0.68 and R2cv=0.63; NMSE=0.31 and NMSEcv=0.38. The results confirm the high potential of VIS-NIR-SWIR spectroscopy for SOM estimation. Application of step-down selection produces sparser and better-fit models. Finally, SOM can be estimated with an acceptable accuracy (NMSE 0.35) from EnMap and Sentinel-2 data, enabling mapping and analysis of the impacts of repeated wildfires on soils in the study area.
NASA Astrophysics Data System (ADS)
Pal, I.; Lall, U.; Robertson, A. W.; Cane, M. A.; Bansal, R.
2013-06-01
Snowmelt-dominated streamflow of the Western Himalayan rivers is an important water resource during the dry pre-monsoon spring months to meet the irrigation and hydropower needs in northern India. Here we study the seasonal prediction of melt-dominated total inflow into the Bhakra Dam in northern India based on statistical relationships with meteorological variables during the preceding winter. Total inflow into the Bhakra Dam includes the Satluj River flow together with a flow diversion from its tributary, the Beas River. Both are tributaries of the Indus River that originate from the Western Himalayas, which is an under-studied region. Average measured winter snow volume at the upper-elevation stations and corresponding lower-elevation rainfall and temperature of the Satluj River basin were considered as empirical predictors. Akaike information criteria (AIC) and Bayesian information criteria (BIC) were used to select the best subset of inputs from all the possible combinations of predictors for a multiple linear regression framework. To test for potential issues arising due to multicollinearity of the predictor variables, cross-validated prediction skills of the best subset were also compared with the prediction skills of principal component regression (PCR) and partial least squares regression (PLSR) techniques, which yielded broadly similar results. As a whole, the forecasts of the melt season at the end of winter and as the melt season commences were shown to have potential skill for guiding the development of stochastic optimization models to manage the trade-off between irrigation and hydropower releases versus flood control during the annual fill cycle of the Bhakra Reservoir, a major energy and irrigation source in the region.
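Why PLSR is a natural check against multicollinearity can be seen in a one-component PLS1 fit. The sketch below is a minimal illustration under stated assumptions, not the method used in the study: the function names and the toy data are hypothetical, only a single latent component is extracted, and the predictors are deliberately made perfectly collinear, a case in which the ordinary least squares normal equations are singular.

```python
import math


def pls1_one_component(X, y):
    """Fit a single-component PLS1 model; returns (coefficients, intercept)."""
    n, p = len(X), len(X[0])
    x_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    # Centre predictors and response.
    Xc = [[row[j] - x_means[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction of maximum covariance with y (X^T y, normalised).
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores of each sample on the latent component.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    # Regress y on the scores.
    q = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)
    # With one component the coefficients in predictor space are q * w.
    coef = [q * wj for wj in w]
    intercept = y_mean - sum(c * m for c, m in zip(coef, x_means))
    return coef, intercept


def predict(coef, intercept, row):
    return intercept + sum(c * v for c, v in zip(coef, row))


# Toy data (invented): the second predictor is exactly twice the first.
X = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
y = [3.0, 6.0, 9.0, 12.0]

coef, intercept = pls1_one_component(X, y)
print(coef, intercept, predict(coef, intercept, [5.0, 10.0]))
```

Because the regression is carried out on the latent score rather than on the raw predictors, the fit remains well defined even though the two columns carry the same information.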
Markiewicz-Keszycka, Maria; Casado-Gavalda, Maria P; Cama-Moncunill, Xavier; Cama-Moncunill, Raquel; Dixit, Yash; Cullen, Patrick J; Sullivan, Carl
2018-04-01
Gluten free (GF) diets are prone to mineral deficiency, thus effective monitoring of the elemental composition of GF products is important to ensure a balanced micronutrient diet. The objective of this study was to test the potential of laser-induced breakdown spectroscopy (LIBS) analysis combined with chemometrics for at-line monitoring of ash, potassium and magnesium content of GF flours: tapioca, potato, maize, buckwheat, brown rice and a GF flour mixture. Concentrations of ash, potassium and magnesium were determined with reference methods and LIBS. PCA was performed and showed potential for discrimination of the six GF flours. For the quantification analysis PLSR models were developed; R²cal values were 0.99 for magnesium and potassium and 0.97 for ash. The study revealed that LIBS combined with chemometrics is a convenient method to quantify concentrations of ash, potassium and magnesium and shows potential to classify different types of flours. Copyright © 2017 Elsevier Ltd. All rights reserved.
Cheng, Jun-Hu; Sun, Da-Wen; Pu, Hong-Bin; Wang, Qi-Jun; Chen, Yu-Nan
2015-03-15
The suitability of the hyperspectral imaging technique (400-1000 nm) was investigated to determine the thiobarbituric acid (TBA) value for monitoring lipid oxidation in fish fillets during cold storage at 4°C for 0, 2, 5, and 8 days. The PLSR calibration model was established over the full spectral region between the spectral data extracted from the hyperspectral images and the reference TBA values, and showed good performance for predicting the TBA value, with a determination coefficient (R²P) of 0.8325 and a root-mean-square error of prediction (RMSEP) of 0.1172 mg MDA/kg flesh. Two simplified PLSR and MLR models were built and compared using the selected ten most important wavelengths. The optimised MLR model yielded satisfactory results with R²P of 0.8395 and RMSEP of 0.1147 mg MDA/kg flesh, and was used to visualise the distribution of TBA values in fish fillets. Overall, the results confirmed that hyperspectral imaging is suitable as a rapid and non-destructive tool for the determination of TBA values for monitoring lipid oxidation and evaluation of fish freshness. Copyright © 2014 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Li, Can; Wang, Fei; Zang, Lixuan; Zang, Hengchang; Alcalà, Manel; Nie, Lei; Wang, Mingyu; Li, Lian
2017-03-01
Nowadays, as a powerful process analytical tool, near infrared spectroscopy (NIRS) has been widely applied in process monitoring. In the present work, NIRS combined with multivariate analysis was used to monitor the ethanol precipitation process of fraction I + II + III (FI + II + III) supernatant in human albumin (HA) separation to achieve qualitative and quantitative monitoring at the same time and assure the product's quality. First, a qualitative model was established by using principal component analysis (PCA) with samples from 6 of 8 normal batches, and evaluated by the remaining 2 normal batches and 3 abnormal batches. The results showed that the first principal component (PC1) score chart could be successfully used for fault detection and diagnosis. Then, two quantitative models were built with 6 of 8 normal batches to determine the content of the total protein (TP) and HA separately by using a partial least squares regression (PLS-R) strategy, and the models were validated by the 2 remaining normal batches. The determination coefficient of validation (Rp2), root mean square error of cross validation (RMSECV), root mean square error of prediction (RMSEP) and ratio of performance deviation (RPD) were 0.975, 0.501 g/L, 0.465 g/L and 5.57 for TP, and 0.969, 0.530 g/L, 0.341 g/L and 5.47 for HA, respectively. The results showed that the established models could give a rapid and accurate measurement of the content of TP and HA. The results of this study indicated that NIRS is an effective tool and could be successfully used for simultaneous qualitative and quantitative monitoring of the ethanol precipitation process of FI + II + III supernatant. This research has significant reference value for assuring the quality and improving the recovery ratio of HA at industrial scale by using NIRS.
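The ratio of performance deviation reported above is commonly defined as the standard deviation of the reference values in the validation set divided by RMSEP. The abstract does not state which definition was used, so the sketch below assumes that one; the function names and the sample values are hypothetical. Under that assumption, the reported RPD and RMSEP imply a reference standard deviation of roughly 2.59 g/L for TP and 1.87 g/L for HA.

```python
import statistics


def rpd(reference, rmsep_value):
    """Ratio of performance deviation: SD of reference values / RMSEP (assumed definition)."""
    return statistics.stdev(reference) / rmsep_value


def implied_sd(rpd_value, rmsep_value):
    """Reference SD implied by a reported RPD and RMSEP under the same definition."""
    return rpd_value * rmsep_value


# Values reported in the abstract (RPD, RMSEP in g/L).
print("TP implied SD:", implied_sd(5.57, 0.465))
print("HA implied SD:", implied_sd(5.47, 0.341))

# Hypothetical validation-set reference values, for illustration only.
print("RPD:", rpd([1.0, 2.0, 3.0, 4.0, 5.0], 0.5))
```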
Li, Can; Wang, Fei; Zang, Lixuan; Zang, Hengchang; Alcalà, Manel; Nie, Lei; Wang, Mingyu; Li, Lian
2017-03-15
Nowadays, as a powerful process analytical tool, near infrared spectroscopy (NIRS) has been widely applied in process monitoring. In the present work, NIRS combined with multivariate analysis was used to monitor the ethanol precipitation process of fraction I+II+III (FI+II+III) supernatant in human albumin (HA) separation to achieve qualitative and quantitative monitoring at the same time and assure the product's quality. First, a qualitative model was established by using principal component analysis (PCA) with samples from 6 of 8 normal batches, and evaluated by the remaining 2 normal batches and 3 abnormal batches. The results showed that the first principal component (PC1) score chart could be successfully used for fault detection and diagnosis. Then, two quantitative models were built with 6 of 8 normal batches to determine the content of the total protein (TP) and HA separately by using a partial least squares regression (PLS-R) strategy, and the models were validated by the 2 remaining normal batches. The determination coefficient of validation (Rp2), root mean square error of cross validation (RMSECV), root mean square error of prediction (RMSEP) and ratio of performance deviation (RPD) were 0.975, 0.501 g/L, 0.465 g/L and 5.57 for TP, and 0.969, 0.530 g/L, 0.341 g/L and 5.47 for HA, respectively. The results showed that the established models could give a rapid and accurate measurement of the content of TP and HA. The results of this study indicated that NIRS is an effective tool and could be successfully used for simultaneous qualitative and quantitative monitoring of the ethanol precipitation process of FI+II+III supernatant. This research has significant reference value for assuring the quality and improving the recovery ratio of HA at industrial scale by using NIRS. Copyright © 2016 Elsevier B.V. All rights reserved.
Jia, Shengyao; Li, Hongyang; Wang, Yanjie; Tong, Renyuan; Li, Qing
2017-01-01
Soil is an important environment for crop growth. Quick and accurate access to soil nutrient content information is a prerequisite for scientific fertilization. In this work, hyperspectral imaging (HSI) technology was applied for the classification of soil types and the measurement of soil total nitrogen (TN) content. A total of 183 soil samples collected from Shangyu City (People’s Republic of China) were scanned by a near-infrared hyperspectral imaging system with a wavelength range of 874–1734 nm. The soil samples belonged to three major soil types typical of this area: paddy soil, red soil and seashore saline soil. The successive projections algorithm (SPA) method was utilized to select effective wavelengths from the full spectrum. Pattern texture features (energy, contrast, homogeneity and entropy) were extracted from the gray-scale images at the effective wavelengths. The support vector machines (SVM) and partial least squares regression (PLSR) methods were used to establish classification and prediction models, respectively. The results showed that by using the combined data sets of effective wavelengths and texture features for modelling, an optimal correct classification rate of 91.8% could be achieved. The soil samples were first classified, then local models were established for soil TN according to soil type, which achieved better prediction results than the general models. The overall results indicated that hyperspectral imaging technology could be used for soil type classification and soil TN determination, and data fusion combining spectral and image texture information showed advantages for the classification of soil types. PMID:28974005
Sarcoptic mange breaks up bottom-up regulation of body condition in a large herbivore population.
Carvalho, João; Granados, José E; López-Olvera, Jorge R; Cano-Manuel, Francisco Javier; Pérez, Jesús M; Fandos, Paulino; Soriguer, Ramón C; Velarde, Roser; Fonseca, Carlos; Ráez, Arian; Espinosa, José; Pettorelli, Nathalie; Serrano, Emmanuel
2015-11-06
Both parasitic load and resource availability can impact individual fitness, yet little is known about the interplay between these parameters in shaping body condition, a key determinant of fitness in wild mammals inhabiting seasonal environments. Using partial least square regressions (PLSR), we explored how temporal variation in climatic conditions, vegetation dynamics and sarcoptic mange (Sarcoptes scabiei) severity impacted body condition of 473 Iberian ibexes (Capra pyrenaica) harvested between 1995 and 2008 in the highly seasonal Alpine ecosystem of Sierra Nevada Natural Space (SNNS), southern Spain. Bottom-up regulation was found to only occur in healthy ibexes; the condition of infected ibexes was independent of primary productivity and snow cover. No link between ibex abundance and ibex body condition could be established when only considering infected individuals. The pernicious effects of mange on Iberian ibexes overcome the benefits of favorable environmental conditions. Even though the increase in primary production exerts a positive effect on the body condition of healthy ibexes, the scabietic individuals do not derive any advantage from increased resource availability. Further applied research coupled with continuous sanitary surveillance are needed to address remaining knowledge gaps associated with the transmission dynamics and management of sarcoptic mange in free-living populations.
Wang, Yan-Cang; Gu, Xiao-He; Zhu, Jin-Shan; Long, Hui-Ling; Xu, Peng; Liao, Qin-Hong
2014-01-01
The present study aims to assess the feasibility of using multi-spectral data to monitor soil organic matter content. The data sources are hyperspectral data measured under laboratory conditions and multi-spectral data simulated from the hyperspectral data. According to the reflectance response functions of Landsat TM and HJ-CCD (the Environment and Disaster Reduction Small Satellites, HJ), the hyperspectra were resampled to the corresponding bands of the multi-spectral sensors. The correlation between the hyperspectral and simulated reflectance spectra and organic matter content was calculated, and used to extract the bands sensitive to organic matter in the north fluvo-aquic soil. The partial least square regression (PLSR) method was used to establish empirical models to estimate soil organic matter content. Both root mean squared error (RMSE) and coefficient of determination (R2) were introduced to test the precision and stability of the models. Results demonstrate that, compared with the hyperspectral data, the best model established with simulated multi-spectral data gives a good result for organic matter content, with R2=0.586 and RMSE=0.280. Therefore, using multi-spectral data to predict the organic matter content of fluvo-aquic soil is feasible.
Özdemir, İbrahim Sani; Öztürk, Bülent; Çelik, Belgin; Sarıtepe, Yüksel; Aksoy, Hatice
2018-08-15
The potential of using FT-NIR spectroscopy for the rapid and non-destructive measurement of the moisture, water activity, firmness and SO₂ content of intact sulphured-dried apricots (SDA) was investigated for the first time in the literature. The partial least squares regression (PLS-R) models constructed using FT-NIR spectra were very successful in predicting the moisture content (R²p = 0.986, RMSEP = 1.22%, RPD = 9.15) and water activity (R²p = 0.987, RMSEP = 0.016, RPD = 9.37) of SDAs. Satisfactory results were also obtained for the models developed for the prediction of the firmness (R²p = 0.845, RMSEP = 0.445, RPD = 2.55) and SO₂ content (R²p = 0.804, RMSEP = 349 mg kg⁻¹, RPD = 2.27). These results clearly demonstrate that the major quality parameters of SDA can be simultaneously measured in a short time by FT-NIR spectroscopy without any need for sample preparation or skilled laboratory personnel. Copyright © 2018 Elsevier B.V. All rights reserved.
Li, Bin; Shin, Hyunjin; Gulbekyan, Georgy; Pustovalova, Olga; Nikolsky, Yuri; Hope, Andrew; Bessarabova, Marina; Schu, Matthew; Kolpakova-Hart, Elona; Merberg, David; Dorner, Andrew; Trepicchio, William L.
2015-01-01
Development of drug responsive biomarkers from pre-clinical data is a critical step in drug discovery, as it enables patient stratification in clinical trial design. Such translational biomarkers can be validated in early clinical trial phases and utilized as a patient inclusion parameter in later stage trials. Here we present a study on building accurate and selective drug sensitivity models for Erlotinib or Sorafenib from pre-clinical in vitro data, followed by validation of individual models on corresponding treatment arms from patient data generated in the BATTLE clinical trial. A Partial Least Squares Regression (PLSR) based modeling framework was designed and implemented, using a special splitting strategy and canonical pathways to capture robust information for model building. Erlotinib and Sorafenib predictive models could be used to identify a sub-group of patients that respond better to the corresponding treatment, and these models are specific to the corresponding drugs. The model derived signature genes reflect each drug’s known mechanism of action. Also, the models predict each drug’s potential cancer indications consistent with clinical trial results from a selection of globally normalized GEO expression datasets. PMID:26107615
Singh, Aditya; Serbin, Shawn P.; McNeil, Brenden E.; ...
2015-12-01
A major goal of remote sensing is the development of generalizable algorithms to repeatedly and accurately map ecosystem properties across space and time. Imaging spectroscopy has great potential to map vegetation traits that cannot be retrieved from broadband spectral data, but rarely have such methods been tested across broad regions. Here we illustrate a general approach for estimating key foliar chemical and morphological traits through space and time using NASA's Airborne Visible/Infrared Imaging Spectrometer (AVIRIS-Classic). We apply partial least squares regression (PLSR) to data from 237 field plots within 51 images acquired between 2008 and 2011. Using a series of 500 randomized 50/50 subsets of the original data, we generated spatially explicit maps of seven traits (leaf mass per area (Marea), percentage nitrogen, carbon, fiber, lignin, and cellulose, and isotopic nitrogen concentration, δ¹⁵N) as well as pixel-wise uncertainties in their estimates based on error propagation in the analytical methods. Both Marea and %N PLSR models had R² > 0.85. Root mean square errors (RMSEs) for both variables were less than 9% of the range of data. Fiber and lignin were predicted with R² > 0.65 and carbon and cellulose with R² > 0.45. Although R² of %C and cellulose were lower than Marea and %N, the measured variability of these constituents (especially %C) was also lower, and their RMSE values were beneath 12% of the range in overall variability. Model performance for δ¹⁵N was the lowest (R² = 0.48, RMSE = 0.95‰), but within 15% of the observed range. The resulting maps of chemical and morphological traits, together with their overall uncertainties, represent a first-of-its-kind approach for examining the spatiotemporal patterns of forest functioning and nutrient cycling across a broad range of temperate and sub-boreal ecosystems.
These results offer an alternative to categorical maps of functional or physiognomic types by providing non-discrete maps (i.e., on a continuum) of traits that define those functional types. A key contribution of this work is the ability to assign retrieval uncertainties by pixel, a requirement to enable assimilation of these data products into ecosystem modeling frameworks to constrain carbon and nutrient cycling projections.
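The idea of fitting many models on randomized 50/50 subsets and reading a per-prediction uncertainty from their spread can be sketched as below. This is a simplified illustration under loud assumptions: a one-predictor least-squares line stands in for PLSR, the spread of the ensemble predictions stands in for the authors' error-propagation procedure (which the abstract does not detail), and all names and data are hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least-squares line; a stand-in for the PLSR model."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    return slope, my - slope * mx


def split_ensemble(xs, ys, n_splits=500, seed=0):
    """Fit one model per randomized 50/50 subset of the calibration data."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    models = []
    for _ in range(n_splits):
        rng.shuffle(idx)
        half = idx[: len(idx) // 2]
        models.append(fit_line([xs[i] for i in half], [ys[i] for i in half]))
    return models


def predict_with_uncertainty(models, x):
    """Mean prediction and its spread (SD) across the ensemble."""
    preds = [slope * x + intercept for slope, intercept in models]
    return statistics.mean(preds), statistics.stdev(preds)


# Hypothetical, noise-free calibration data: y = 2x + 1.
xs = [float(v) for v in range(10)]
ys = [2.0 * v + 1.0 for v in xs]

models = split_ensemble(xs, ys)
mean_pred, sd_pred = predict_with_uncertainty(models, 4.0)
print(len(models), mean_pred, sd_pred)
```

With noise-free data every subset recovers the same line, so the spread collapses to zero; with real field data the spread at each pixel would give a map-level indication of how sensitive the estimate is to the choice of calibration plots.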
NASA Astrophysics Data System (ADS)
McMillan, N. J.; Chavez, A.; Chanover, N.; Voelz, D.; Uckert, K.; Tawalbeh, R.; Gariano, J.; Dragulin, I.; Xiao, X.; Hull, R.
2014-12-01
Rapid, in-situ methods for identification of biologic and non-biologic mineral precipitation sites permit mapping of biological hot spots. Two portable spectrometers, Laser-Induced Breakdown Spectroscopy (LIBS) and Acoustic-Optic Tunable Filter Reflectance Spectroscopy (AOTFRS) were used to differentiate between bacterially influenced and inorganically precipitated calcite specimens from Fort Stanton Cave, NM, USA. LIBS collects light emitted from the decay of excited electrons in a laser ablation plasma; the spectrum is a chemical fingerprint of the analyte. AOTFRS collects light reflected from the surface of a specimen and provides structural information about the material (i.e., the presence of O-H bonds). These orthogonal data sets provide a rigorous method to determine the origin of calcite in cave deposits. This study used a set of 48 calcite samples collected from Fort Stanton cave. Samples were examined in SEM for the presence of biologic markers; these data were used to separate the samples into biologic and non-biologic groups. Spectra were modeled using the multivariate technique Partial Least Squares Regression (PLSR). Half of the spectra were used to train a PLSR model, in which biologic samples were assigned to the independent variable "0" and non-biologic samples were assigned the variable "1". Values of the independent variable were calculated for each of the training samples, which were close to 0 for the biologic samples (-0.09 - 0.23) and close to 1 for the non-biologic samples (0.57 - 1.14). A Value of Apparent Distinction (VAD) of 0.55 was used to numerically distinguish between the two groups; any sample with an independent variable value < 0.55 was classified as having a biologic origin; a sample with a value > 0.55 was determined to be non-biologic in origin. After the model was trained, independent variable values for the remaining half of the samples were calculated. Biologic or non-biologic origin was assigned by comparison to the VAD. 
Using LIBS data alone, the model has a 92% success rate, correctly identifying 23 of 25 samples. Modeling of AOTFRS spectra and the combined LIBS-AOTFRS data set have similar success rates. This study demonstrates that rapid, portable LIBS and AOTFRS instruments can be used to map the spatial distribution of biologic precipitation in caves.
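The classification rule described above (a PLSR output compared against the Value of Apparent Distinction) and the reported success rate can be sketched as follows. The threshold of 0.55 and the count of 23 correct out of 25 come from the abstract; the function names, the example scores and labels, and the handling of a score exactly equal to the threshold (which the abstract does not specify) are assumptions.

```python
VAD = 0.55  # Value of Apparent Distinction reported in the abstract


def classify(score, threshold=VAD):
    """Scores below the threshold are biologic; a score equal to the
    threshold is assigned to non-biologic here (an assumption)."""
    return "biologic" if score < threshold else "non-biologic"


def success_rate(scores, labels, threshold=VAD):
    """Fraction of samples whose predicted class matches the reference label."""
    correct = sum(classify(s, threshold) == lab for s, lab in zip(scores, labels))
    return correct / len(scores)


# Hypothetical PLSR outputs and SEM-based reference labels, for illustration only.
scores = [-0.09, 0.23, 0.57, 1.14, 0.60]
labels = ["biologic", "biologic", "non-biologic", "non-biologic", "biologic"]
print(success_rate(scores, labels))

# Arithmetic behind the reported figure: 23 of 25 test samples correct.
print(round(23 / 25, 2))
```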
NASA Astrophysics Data System (ADS)
Shi, Z. H.
2014-12-01
There are strong ties between land use and sediment yield in watersheds. Many studies have used multivariate regression techniques to explore the response of sediment yield to land-use compositions and spatial configurations in watersheds. However, one issue with the use of conventional statistical methods to address relationships between land-use compositions and spatial configurations and sediment yield is multicollinearity. This paper examines the combined effects of land-use compositions and land-use spatial configurations of the watershed on the specific sediment yield of the Upper Du River watershed (8,973 km²) in China using the Soil and Water Assessment Tool (SWAT) and partial least-squares regression (PLSR). The land-use compositions and spatial configurations of the watershed were calculated at the sub-watershed scale. The sediment yields from the sub-watersheds were evaluated using the SWAT model. The first-order factors were identified by calculating the variable importance for the projection (VIP). The results revealed that the land-use compositions exerted the largest effects on the specific sediment yield and explained 61.2% of the variation in the specific sediment yield. Land-use spatial configurations were also found to have a large effect on the specific sediment yield and explained 21.7% of the observed variation in the specific sediment yield. The following are the dominant first-order factors of the specific sediment yield at the sub-watershed scale: the areal percentages of agriculture and forest, patch density, the value of Shannon's diversity index, and contagion. The VIP values suggested that Shannon's diversity index and contagion are important factors for sediment delivery.
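The variable importance for the projection can be computed from the weight vectors of a fitted PLS model and the amount of response variation each component explains. The sketch below uses a widely cited form of the VIP formula; whether the study used exactly this form is not stated in the abstract, and the function name and input values are hypothetical. A common rule of thumb treats predictors with VIP above 1 as influential, which follows from the fact that the squared VIP values average to 1.

```python
import math


def vip_scores(weights, explained_ss):
    """VIP per predictor.

    weights: one weight vector per PLS component, each of length p.
    explained_ss: response sum of squares explained by each component.
    """
    p = len(weights[0])
    total_ss = sum(explained_ss)
    scores = []
    for j in range(p):
        acc = 0.0
        for w, ss in zip(weights, explained_ss):
            norm_sq = sum(v * v for v in w)
            # Share of component importance attributed to predictor j.
            acc += ss * (w[j] ** 2) / norm_sq
        scores.append(math.sqrt(p * acc / total_ss))
    return scores


# Hypothetical two-component model with three predictors.
weights = [[3.0, 4.0, 0.0], [0.0, 0.0, 1.0]]
explained_ss = [8.0, 2.0]

vip = vip_scores(weights, explained_ss)
print(vip)
```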
VenuGopal, K S; Cherita, Chris; Anu-Appaiah, K A
2018-03-01
The role of grape seed tannins in improving organoleptic properties and their involvement in color stabilization in red wine are well established. The addition of grape seeds as a source of condensed tannins in fruit wine may provide a solution for its color instability and improve its sensory attributes. Syzygium cumini is traditionally known for its therapeutic properties. In the current study, the influence of yeasts and grape seed addition during fermentation on the chromatic, phenolic and sensory attributes of the wine was assessed. Grape seed addition improved the color characteristics of the wine and increased the overall phenolic composition. Analysis by HPLC revealed 6 major anthocyanins, among which the 3,5-diglucoside forms of delphinidin and petunidin were found to be the major components. Cluster and PLSR analysis explained the impact of seed addition on the yeasts, as well as on the perception of panelists, with bitterness and astringency as the dominating attributes. Copyright © 2017 Elsevier Ltd. All rights reserved.
Herrero, Paula; Sáenz-Navajas, Pilar; Culleré, Laura; Ferreira, Vicente; Chatin, Amelie; Chaperon, Vincent; Litoux-Desrues, François; Escudero, Ana
2016-09-15
Five different methodologies were applied for the quantitative analysis of 86 volatile molecules in 32 Chardonnay and 30 Pinot Noir Champagne white base wines. Sensory characterization was carried out by descriptive analysis. Pinot Noir wines had more constitutive compounds while Chardonnay wines had more discriminant compounds. Only four compounds predominated in Chardonnay wines: 4-vinylphenol, guaiacol, sotolon and 4-methyl-4-mercapto-2-pentanone. Correlation studies and PLSR models were calculated with sensory and chemical variables. For Pinot Noir wines, they were not as revealing as for Chardonnay base wines. Sulfur-related compounds were suggested to be involved in tropical fruit, dried fruit and citric sensory notes. This family of compounds seemed to be responsible for discriminant sensory terms in Champagne base wines. Fermentative compounds (aromatic buffer) were found at significantly higher levels in Pinot Noir wines, which would explain the fact that these wines were more difficult to describe in comparison with Chardonnay base wines. Copyright © 2016 Elsevier Ltd. All rights reserved.
Khorasani, Milad; Amigo, José M; Sun, Changquan Calvin; Bertelsen, Poul; Rantanen, Jukka
2015-06-01
In the present study the application of near-infrared chemical imaging (NIR-CI) supported by chemometric modeling as a non-destructive tool for monitoring and assessing the roller compaction and tableting processes was investigated. Based on preliminary risk-assessment, discussion with experts and current work from the literature, the critical process parameters (roll pressure and roll speed) and critical quality attributes (ribbon porosity, granule size, amount of fines, tablet tensile strength) were identified and a design space was established. Five experimental runs with different process settings were carried out, which yielded intermediates (ribbons, granules) and final products (tablets) with different properties. A principal component analysis (PCA) based model of NIR images was applied to map the ribbon porosity distribution. The ribbon porosity distribution gained from the PCA based NIR-CI was used to develop predictive models for granule size fractions. Predictive methods with acceptable R² values could be used to predict the granule particle size. A partial least squares regression (PLS-R) based model of the NIR-CI was used to map and predict the chemical distribution and content of active compound for both roller compacted ribbons and corresponding tablets. In order to select the optimal process setting, the standard deviation of tablet tensile strength and tablet weight for each tablet batch was considered. Strong linear correlations were established between tablet tensile strength and both the amount of fines and granule size. These approaches are considered to have a potentially large impact on quality monitoring and control of continuously operating manufacturing lines, such as roller compaction and tableting processes. Copyright © 2015 Elsevier B.V. All rights reserved.
Prediction of iron oxide contents using diffuse reflectance spectroscopy
NASA Astrophysics Data System (ADS)
Marques, José, Jr.; Arantes Camargo, Livia
2015-04-01
Determining soil iron oxides using conventional analysis is relatively unfeasible when large areas are mapped with the aim of characterizing spatial variability. Diffuse reflectance spectroscopy (DRS) is rapid, less expensive, non-destructive and sometimes more accurate than conventional analysis. Furthermore, this technique allows the simultaneous characterization of many soil attributes with agronomic and environmental relevance. This study aims to assess the capability of DRS to predict the contents of iron oxides (hematite and goethite) and to characterize their spatial variability in soils of Brazil. Soil samples collected from an 800-hectare area were scanned in the visible and near-infrared spectral range. Chemometric calibration was obtained through partial least-squares regression (PLSR). Then, spatial distribution maps of the attributes were constructed from the values predicted by the calibrated models using geostatistical methods. The studied area presented soils with varied contents of iron oxides, for example Oxisols and Entisols. In the spectra of each soil, reflectance decreased with increasing iron oxide content. Soils with a high content of iron oxides showed more pronounced concavities between 380 and 1100 nm, which are characteristic of the presence of these oxides. Soils with higher reflectance showed concavities characteristic of kaolinite, in agreement with the low iron contents of those soils. The best prediction accuracy for goethite [residual prediction deviation (RPD) = 1.7] was obtained within the visible region (380-800 nm), and for hematite (RPD = 2.0) within the visible and near-infrared region (380-2300 nm). The predicted maps of goethite and hematite showed spatial distribution patterns similar to the maps of clay and of iron extracted by dithionite-citrate-bicarbonate, consistent with the iron oxide contents of the soils present in the study area. These results confirm the value of DRS in the mapping of iron oxides in large areas at detailed scale.
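The residual prediction deviation (RPD) reported in this abstract is commonly defined as the standard deviation of the reference values divided by the root mean square error of prediction. The sketch below follows that common definition; the function name `rpd` and the use of the sample standard deviation (n - 1 denominator) are assumptions, since the abstract does not state the exact convention.

```python
import math

def rpd(reference, predicted):
    """Residual prediction deviation: SD of reference values / RMSEP (illustrative sketch)."""
    n = len(reference)
    mean_ref = sum(reference) / n
    # Sample standard deviation of the laboratory reference values
    sd = math.sqrt(sum((y - mean_ref) ** 2 for y in reference) / (n - 1))
    # Root mean square error of prediction
    rmsep = math.sqrt(sum((y - p) ** 2 for y, p in zip(reference, predicted)) / n)
    return sd / rmsep
```

Under this definition, an RPD of 2.0 means the prediction error is half the natural spread of the reference data.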
Liu, Yingchun; Liu, Zhongbo; Sun, Guoxiang; Wang, Yan; Ling, Junhong; Gao, Jiayue; Huang, Jiahao
2015-01-01
A combination method of multi-wavelength fingerprinting and multi-component quantification by high performance liquid chromatography (HPLC) coupled with diode array detector (DAD) was developed and validated to monitor and evaluate the quality consistency of herbal medicines (HM) in the classical preparation Compound Bismuth Aluminate tablets (CBAT). The validation results demonstrated that our method met the requirements of fingerprint analysis and quantification analysis with suitable linearity, precision, accuracy, limits of detection (LOD) and limits of quantification (LOQ). In the fingerprint assessments, rather than using conventional qualitative "Similarity" as a criterion, the simple quantified ratio fingerprint method (SQRFM) was recommended, which has an important quantified fingerprint advantage over the "Similarity" approach. SQRFM qualitatively and quantitatively offers the scientific criteria for traditional Chinese medicines (TCM)/HM quality pyramid and warning gate in terms of three parameters. In order to combine the comprehensive characterization of multi-wavelength fingerprints, an integrated fingerprint assessment strategy based on information entropy was set up involving a super-information characteristic digitized parameter of fingerprints, which reveals the total entropy value and absolute information amount about the fingerprints and, thus, offers an excellent method for fingerprint integration. The correlation results between quantified fingerprints and quantitative determination of 5 marker compounds, including glycyrrhizic acid (GLY), liquiritin (LQ), isoliquiritigenin (ILG), isoliquiritin (ILQ) and isoliquiritin apioside (ILA), indicated that multi-component quantification could be replaced by quantified fingerprints. The Fenton reaction was employed to determine the antioxidant activities of CBAT samples in vitro, and they were correlated with HPLC fingerprint components using the partial least squares regression (PLSR) method. 
In summary, the method of multi-wavelength fingerprints combined with antioxidant activities has been proved to be a feasible and scientific procedure for monitoring and evaluating the quality consistency of CBAT.
Profiling Taste and Aroma Compound Metabolism during Apricot Fruit Development and Ripening
Xi, Wanpeng; Zheng, Huiwen; Zhang, Qiuyun; Li, Wenhui
2016-01-01
Sugars, organic acids and volatiles of apricot were determined by HPLC and GC-MS during fruit development and ripening, and the key taste and aroma components were identified by integrating flavor compound contents with consumers’ evaluation. Sucrose and glucose were the major sugars in apricot fruit. The contents of all sugars increased rapidly, and the accumulation pattern of sugars converted from glucose-predominated to sucrose-predominated during fruit development and ripening. Sucrose synthase (SS), sorbitol oxidase (SO) and sorbitol dehydrogenase (SDH) are under tight developmental control and they might play important roles in sugar accumulation. Almost all organic acids identified increased during early development and then decreased rapidly. During early development, fruit mainly accumulated quinate and malate, with citrate increasing after maturation, and quinate, malate and citrate were the predominant organic acids at the ripening stage. The odor activity values (OAV) of aroma volatiles showed that 18 aroma compounds were the characteristic components of apricot fruit. Aldehydes and terpenes decreased significantly during the whole development period, whereas lactones and apocarotenoids significantly increased with fruit ripening. The partial least squares regression (PLSR) results revealed that β-ionone, γ-decalactone, sucrose and citrate are the key characteristic flavor factors contributing to consumer acceptance. Carotenoid cleavage dioxygenases (CCD) may be involved in β-ionone formation in apricot fruit. PMID:27347931
Tamburini, Elena; Mamolini, Elisabetta; De Bastiani, Morena; Marchetti, Maria Gabriella
2016-07-15
Fusarium proliferatum is considered to be a pathogen of many economically important plants, including garlic. The objective of this research was to apply near-infrared spectroscopy (NIRS) to rapidly determine fungal concentration in intact garlic cloves, avoiding the laborious and time-consuming procedures of traditional assays. Preventive detection of infection before seeding is of great interest for farmers, because it could avoid serious losses of yield during harvesting and storage. Spectra were collected on 95 garlic cloves, divided in five classes of infection (from 1-healthy to 5-very highly infected) in the range of fungal concentration 0.34-7231.15 ppb. Calibration and cross validation models were developed with partial least squares regression (PLSR) on pretreated spectra (standard normal variate, SNV, and derivatives), providing good accuracy in prediction, with a coefficient of determination (R²) of 0.829 and 0.774, respectively, a standard error of calibration (SEC) of 615.17 ppb, and a standard error of cross validation (SECV) of 717.41 ppb. The calibration model was then used to predict fungal concentration in unknown samples, peeled and unpeeled. The results showed that NIRS could be used as a reliable tool to directly detect and quantify F. proliferatum infection in peeled intact garlic cloves, but the presence of the external peel strongly affected the prediction reliability.
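The standard normal variate (SNV) pretreatment named in this abstract centers and scales each spectrum individually. A minimal sketch follows, assuming one spectrum is given as a plain list of absorbance values; the function name `snv` is hypothetical and the sample standard deviation is used, as is common practice.

```python
import math

def snv(spectrum):
    """Standard normal variate: subtract the spectrum's own mean, divide by its own SD."""
    n = len(spectrum)
    mean = sum(spectrum) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in spectrum) / (n - 1))
    return [(x - mean) / sd for x in spectrum]
```

Because each spectrum is normalized on its own, a constant offset or a multiplicative scaling of the whole spectrum (typical scatter effects) leaves the SNV-transformed result unchanged.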
Rapid non-destructive assessment of pork edible quality by using VIS/NIR spectroscopic technique
NASA Astrophysics Data System (ADS)
Zhang, Leilei; Peng, Yankun; Dhakal, Sagar; Song, Yulin; Zhao, Juan; Zhao, Songwei
2013-05-01
The objective of this research was to develop a rapid non-destructive method to evaluate the edible quality of chilled pork. A total of 42 samples were packed in sealed plastic bags and stored at 4°C for 1 to 21 days. Reflectance spectra were collected with a visible/near-infrared spectroscopy system in the range of 400 nm to 1100 nm. Microbiological, physicochemical and organoleptic characteristics such as the total viable counts (TVC), total volatile basic-nitrogen (TVB-N), pH value and color parameter L* were determined to appraise pork edible quality. Savitzky-Golay (SG) smoothing with five and eleven points, multiplicative scatter correction (MSC) and first derivative pre-processing methods were employed to reduce the spectral noise. Support vector machines (SVM) and partial least squares regression (PLSR) were applied to establish prediction models using the de-noised spectra. A linear correlation was developed between the VIS/NIR spectra and parameters such as TVC, TVB-N, pH and color parameter L*, giving prediction results with Rv of 0.931, 0.844, 0.805 and 0.852, respectively. The results demonstrated that the VIS/NIR spectroscopy technique combined with SVM possesses a powerful assessment capability. It can provide a potential tool for detecting pork edible quality rapidly and non-destructively.
Wu, Yongjiang; Jin, Ye; Ding, Haiying; Luan, Lianjun; Chen, Yong; Liu, Xuesong
2011-09-01
The application of near-infrared (NIR) spectroscopy for in-line monitoring of the extraction process of scutellarein from Erigeron breviscapus (vant.) Hand-Mazz was investigated. For NIR measurements, two fiber optic probes designed to transmit NIR radiation through a 2 mm pathlength flow cell were utilized to collect spectra in real-time. High performance liquid chromatography (HPLC) was used as a reference method to determine scutellarein in extract solution. A partial least squares regression (PLSR) calibration model of Savitzky-Golay smoothed NIR spectra in the 5450-10,000 cm⁻¹ region gave satisfactory predictive results for scutellarein. The results showed that the correlation coefficients of calibration and cross validation were 0.9967 and 0.9811, respectively, and the root mean square errors of calibration and cross validation were 0.044 and 0.105, respectively. Furthermore, both the moving block standard deviation (MBSD) method and conformity test were used to identify the end point of the extraction process, providing real-time data and instant feedback about the extraction course. The results obtained in this study indicated that the NIR spectroscopy technique provides an efficient and environmentally friendly approach for fast determination of scutellarein and end point control of the extraction process. Copyright © 2011 Elsevier B.V. All rights reserved.
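The moving block standard deviation (MBSD) approach to end-point detection can be sketched as follows. This is an illustrative version under stated assumptions, not the authors' implementation: spectra are lists of equal length acquired in time order, the block size, threshold and number of consecutive sub-threshold blocks are user-chosen, and the names `mbsd` and `end_point` are hypothetical.

```python
import math

def mbsd(spectra, block_size):
    """For each window of consecutive spectra, average the per-wavelength SD."""
    n_wavelengths = len(spectra[0])
    values = []
    for start in range(len(spectra) - block_size + 1):
        block = spectra[start:start + block_size]
        total = 0.0
        for j in range(n_wavelengths):
            column = [s[j] for s in block]
            mean = sum(column) / block_size
            total += math.sqrt(sum((v - mean) ** 2 for v in column) / (block_size - 1))
        values.append(total / n_wavelengths)
    return values

def end_point(mbsd_values, threshold, consecutive=3):
    """Index of the first block at which MBSD has stayed below threshold
    for `consecutive` blocks in a row, or None if that never happens."""
    run = 0
    for i, value in enumerate(mbsd_values):
        run = run + 1 if value < threshold else 0
        if run >= consecutive:
            return i
    return None
```

The idea is that once successive spectra stop changing, the spectral variability within a block falls to the noise level, which is taken as a sign that the process has reached its end point.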
Spectroscopic sensitivity of real-time, rapidly induced phytochemical change in response to damage.
Couture, John J; Serbin, Shawn P; Townsend, Philip A
2013-04-01
An ecological consequence of plant-herbivore interactions is the phytochemical induction of defenses in response to insect damage. Here, we used reflectance spectroscopy to characterize the foliar induction profile of cardenolides in Asclepias syriaca in response to damage, tracked in vivo changes and examined the influence of multiple plant traits on cardenolide concentrations. Foliar cardenolide concentrations were measured at specific time points following damage to capture their induction profile. Partial least-squares regression (PLSR) modeling was employed to calibrate cardenolide concentrations to reflectance spectroscopy. In addition, subsets of plants were either repeatedly sampled to track in vivo changes or modified to reduce latex flow to damaged areas. Cardenolide concentrations and the induction profile of A. syriaca were well predicted using models derived from reflectance spectroscopy, and this held true for repeatedly sampled plants. Correlations between cardenolides and other foliar-related variables were weak or not significant. Plant modification for latex reduction inhibited an induced cardenolide response. Our findings show that reflectance spectroscopy can characterize rapid phytochemical changes in vivo. We used reflectance spectroscopy to identify the mechanisms behind the production of plant secondary metabolites, simultaneously characterizing multiple foliar constituents. In this case, cardenolide induction appears to be largely driven by enhanced latex delivery to leaves following damage. © 2013 The Authors. New Phytologist © 2013 New Phytologist Trust.
Bekiaris, Georgios; Lindedam, Jane; Peltre, Clément; ...
2015-06-18
Complexity and high cost are the main limitations for high-throughput screening methods for the estimation of the sugar release from plant materials during bioethanol production. In addition, it is important that we improve our understanding of the mechanisms by which different chemical components are affecting the degradability of plant material. In this study, Fourier transform infrared photoacoustic spectroscopy (FTIR-PAS) was combined with advanced chemometrics to develop calibration models predicting the amount of sugars released after pretreatment and enzymatic hydrolysis of wheat straw during bioethanol production, and the spectra were analysed to identify components associated with recalcitrance. A total of 1122 wheat straw samples from nine different locations in Denmark and one location in the United Kingdom, spanning a large variation in genetic material and environmental conditions during growth, were analysed. The FTIR-PAS spectra of non-pretreated wheat straw were correlated with the measured sugar release, determined by a high-throughput pretreatment and enzymatic hydrolysis (HTPH) assay. A partial least square regression (PLSR) calibration model predicting the glucose and xylose release was developed. The interpretation of the regression coefficients revealed a positive correlation between the released glucose and xylose with easily hydrolysable compounds, such as amorphous cellulose and hemicellulose. Additionally, we observed a negative correlation with crystalline cellulose and lignin, which inhibits cellulose and hemicellulose hydrolysis. FTIR-PAS was used as a reliable method for the rapid estimation of sugar release during bioethanol production. The spectra revealed that lignin inhibited the hydrolysis of polysaccharides into monomers, while the crystallinity of cellulose retarded its hydrolysis into glucose. Amorphous cellulose and xylans were found to contribute significantly to the released amounts of glucose and xylose, respectively.
Aznar, Margarita; López, Ricardo; Cacho, Juan; Ferreira, Vicente
2003-04-23
Partial least squares regression (PLSR) models able to predict some of the wine aroma nuances from its chemical composition have been developed. The aromatic sensory characteristics of 57 Spanish aged red wines were determined by 51 experts from the wine industry. The individual descriptions given by the experts were recorded, and the frequency with which a sensory term was used to define a given wine was taken as a measurement of its intensity. The aromatic chemical composition of the wines was determined by already published gas chromatography (GC)-flame ionization detector and GC-mass spectrometry methods. In the whole, 69 odorants were analyzed. Both matrixes, the sensory and chemical data, were simplified by grouping and rearranging correlated sensory terms or chemical compounds and by the exclusion of secondary aroma terms or of weak aroma chemicals. Finally, models were developed for 18 sensory terms and 27 chemicals or groups of chemicals. Satisfactory models, explaining more than 45% of the original variance, could be found for nine of the most important sensory terms (wood-vanillin-cinnamon, animal-leather-phenolic, toasted-coffee, old wood-reduction, vegetal-pepper, raisin-flowery, sweet-candy-cacao, fruity, and berry fruit). For this set of terms, the correlation coefficients between the measured and predicted Y (determined by cross-validation) ranged from 0.62 to 0.81. Models confirmed the existence of complex multivariate relationships between chemicals and odors. In general, pleasant descriptors were positively correlated to chemicals with pleasant aroma, such as vanillin, beta damascenone, or (E)-beta-methyl-gamma-octalactone, and negatively correlated to compounds showing less favorable odor properties, such as 4-ethyl and vinyl phenols, 3-(methylthio)-1-propanol, or phenylacetaldehyde.
Hyperspectral estimation of soil heavy metals in Guanzhong area, Shaanxi province
NASA Astrophysics Data System (ADS)
Liu, Jinbao; Cheng, Jie; Wang, Huanyuan; Tong, Wei; Ma, Zenghui
2017-10-01
In this study, the contents of Cr, Mn, Ni, Cu, Zn, As, Cd, Hg and Pb in 44 soil samples collected from Fufeng County, Yangling County and Wugong County, Shaanxi Province, were used as data sources. Reflectance spectra were measured with an ASD Field Spec HR (350-2500 nm), pretreated with NOR, MSC and SNV, and then subjected to first-derivative, second-derivative and reciprocal-logarithm transformations. The optimal hyperspectral estimation model for each of the nine heavy metal elements (Cr, Mn, Ni, Cu, Zn, As, Cd, Hg and Pb) was established by a regression method. The reflection characteristics of different heavy metal contents and the effects of different pretreatment methods on the establishment of the soil heavy metal spectral inversion model were compared. The results show that: (1) the NOR, MSC and SNV transformations improve the signal-to-noise ratio of the reflectance spectrum. Combining them with differential transformation can enhance the information on heavy metal elements in the soil, and using the correlated bands can significantly improve the stability and predictive ability of the model. (2) The modeling accuracies of the optimal spectral models obtained by the PLSR method for Cr, Mn, Ni, Cu, Zn, As, Cd, Hg and Pb were 0.7002, 0.7852, 0.687, 0.8036, 0.8619, 0.5765, 0.5451, 0.9912, and 0.6182, respectively.
Tamburini, Elena; Mamolini, Elisabetta; De Bastiani, Morena; Marchetti, Maria Gabriella
2016-01-01
Fusarium proliferatum is considered to be a pathogen of many economically important plants, including garlic. The objective of this research was to apply near-infrared spectroscopy (NIRS) to rapidly determine fungal concentration in intact garlic cloves, avoiding the laborious and time-consuming procedures of traditional assays. Preventive detection of infection before seeding is of great interest for farmers, because it could avoid serious losses of yield during harvesting and storage. Spectra were collected on 95 garlic cloves, divided in five classes of infection (from 1-healthy to 5-very highly infected) in the range of fungal concentration 0.34–7231.15 ppb. Calibration and cross validation models were developed with partial least squares regression (PLSR) on pretreated spectra (standard normal variate, SNV, and derivatives), providing good accuracy in prediction, with a coefficient of determination (R²) of 0.829 and 0.774, respectively, a standard error of calibration (SEC) of 615.17 ppb, and a standard error of cross validation (SECV) of 717.41 ppb. The calibration model was then used to predict fungal concentration in unknown samples, peeled and unpeeled. The results showed that NIRS could be used as a reliable tool to directly detect and quantify F. proliferatum infection in peeled intact garlic cloves, but the presence of the external peel strongly affected the prediction reliability. PMID:27428978
NASA Astrophysics Data System (ADS)
Beganović, Anel; Beć, Krzysztof B.; Henn, Raphael; Huck, Christian W.
2018-05-01
The applicability of two elimination techniques for interferences occurring in measurements with cells of short pathlength using Fourier transform near-infrared (FT-NIR) spectroscopy was evaluated. Due to the growing interest in the field of vibrational spectroscopy in aqueous biological fluids (e.g. glucose in blood), aqueous solutions of D-(+)-glucose were prepared and split into a calibration set and an independent validation set. All samples were measured with two FT-NIR spectrometers at various spectral resolutions. Moving average smoothing (MAS) and fast Fourier transform filter (FFT filter) were applied to the interference-affected FT-NIR spectra in order to eliminate the interference pattern. After data pre-treatment, partial least squares regression (PLSR) models using different NIR regions were constructed using untreated (interference-affected) spectra and spectra treated with MAS and FFT filter. The prediction of the independent validation set revealed information about the performance of the utilized interference elimination techniques, as well as the different NIR regions. The results showed that the combination band of water at approx. 5200 cm⁻¹ is of great importance since its performance was superior to that of the so-called first overtone of water at approx. 6800 cm⁻¹. Furthermore, this work demonstrated that MAS and FFT filter are fast and easy-to-use techniques for the elimination of interference fringes in FT-NIR transmittance spectroscopy.
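Moving average smoothing (MAS), one of the two fringe-elimination techniques named in this abstract, can be sketched in a few lines. This is a generic centered moving average under assumed conventions (odd window width, shrinking window at the edges); the function name `moving_average` and the edge handling are illustrative choices, not details taken from the study.

```python
def moving_average(signal, window):
    """Centered moving average; the window shrinks symmetrically near the edges."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(signal)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half)
        hi = min(n, i + half + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed
```

A window width close to the period of the interference fringe tends to average the oscillation out, at the cost of also broadening genuine narrow absorption features.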
Alamar, Priscila D; Caramês, Elem T S; Poppi, Ronei J; Pallone, Juliana A L
2016-07-01
The present study investigated the application of near infrared spectroscopy as a green, quick, and efficient alternative to analytical methods currently used to evaluate the quality (moisture, total sugars, acidity, soluble solids, pH and ascorbic acid) of frozen guava and passion fruit pulps. Fifty samples were analyzed by near infrared spectroscopy (NIR) and reference methods. Partial least square regression (PLSR) was used to develop calibration models to relate the NIR spectra and the reference values. Reference methods indicated adulteration by water addition in 58% of guava pulp samples and 44% of yellow passion fruit pulp samples. The PLS models produced lower values of root mean squares error of calibration (RMSEC), root mean squares error of prediction (RMSEP), and coefficient of determination above 0.7. Moisture and total sugars presented the best calibration models (RMSEP of 0.240 and 0.269, respectively, for guava pulp; RMSEP of 0.401 and 0.413, respectively, for passion fruit pulp) which enables the application of these models to determine adulteration in guava and yellow passion fruit pulp by water or sugar addition. The models constructed for calibration of quality parameters of frozen fruit pulps in this study indicate that NIR spectroscopy coupled with the multivariate calibration technique could be applied to determine the quality of guava and yellow passion fruit pulp. Copyright © 2016 Elsevier Ltd. All rights reserved.
Lim, Jongguk; Kim, Giyoung; Mo, Changyeun; Kim, Moon S
2015-10-29
This research aims to design and fabricate a system to measure the capsaicinoid content of red pepper powder non-destructively and rapidly using visible and near infrared spectroscopy (VNIR). The developed system scans a well-leveled powder surface continuously to minimize the influence of the placenta distribution, thus acquiring stable and representative reflectance spectra. The system incorporates flat belts driven by a sample input hopper and stepping motor, a powder surface leveler, charge-coupled device (CCD) image sensor-embedded VNIR spectrometer, fiber optic probe, and tungsten halogen lamp, and an automated reference measuring unit with a reference panel to measure the standard spectrum. The operation program includes device interface, standard reflectivity measurement, and a graphical user interface to measure the capsaicinoid content. A partial least square regression (PLSR) model was developed to predict the capsaicinoid content; 44 red pepper powder samples whose capsaicinoid content, measured by high-performance liquid chromatography (HPLC), ranged from 13.45 to 159.48 mg/100 g, and 1242 VNIR absorbance spectra acquired by the pungency measurement system were used. The determination coefficient of validation (Rv²) and standard error of prediction (SEP) for the model with the first-order derivative pretreatment method for Korean red pepper powder were 0.8484 and ±13.6388 mg/100 g, respectively.
Liu, Ya; Pan, Xianzhang; Wang, Changkun; Li, Yanli; Shi, Rongjie
2015-01-01
Robust models for predicting soil salinity that use visible and near-infrared (vis–NIR) reflectance spectroscopy are needed to better quantify soil salinity in agricultural fields. Currently available models are not sufficiently robust for variable soil moisture contents. Thus, we used external parameter orthogonalization (EPO), which effectively projects spectra onto the subspace orthogonal to unwanted variation, to remove the variations caused by an external factor, e.g., the influences of soil moisture on spectral reflectance. In this study, 570 spectra between 380 and 2400 nm were obtained from soils with various soil moisture contents and salt concentrations in the laboratory; 3 soil types × 10 salt concentrations × 19 soil moisture levels were used. To examine the effectiveness of EPO, we compared the partial least squares regression (PLSR) results established from spectra with and without EPO correction. The EPO method effectively removed the effects of moisture, and the accuracy and robustness of the soil salt contents (SSCs) prediction model, which was built using the EPO-corrected spectra under various soil moisture conditions, were significantly improved relative to the spectra without EPO correction. This study contributes to the removal of soil moisture effects from soil salinity estimations when using vis–NIR reflectance spectroscopy and can assist others in quantifying soil salinity in the future. PMID:26468645
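The external parameter orthogonalization (EPO) step described in this abstract can be sketched with NumPy. The sketch assumes the commonly described formulation in which a projection matrix is built from the difference between spectra of the same samples measured wet and dry; the function name `epo_projection`, the argument names and the choice of `n_components` are hypothetical and not taken from the study.

```python
import numpy as np

def epo_projection(spectra_wet, spectra_dry, n_components):
    """Projection matrix P that removes the main directions of moisture-induced variation.

    spectra_wet, spectra_dry : (n_samples, n_wavelengths), same samples in the same order
    """
    difference = np.asarray(spectra_wet, dtype=float) - np.asarray(spectra_dry, dtype=float)
    # Right singular vectors of the difference matrix span the unwanted (moisture) subspace
    _, _, vt = np.linalg.svd(difference, full_matrices=False)
    v = vt[:n_components].T                      # (n_wavelengths, n_components)
    # Project onto the orthogonal complement of that subspace
    return np.eye(difference.shape[1]) - v @ v.T
```

Corrected spectra would then be obtained as `X @ P` before fitting the PLSR model, so that calibration and prediction both operate in the moisture-insensitive subspace.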
NASA Astrophysics Data System (ADS)
Gottlieb, C.; Millar, S.; Günther, T.; Wilsch, G.
2017-06-01
For the damage assessment of reinforced concrete structures the quantified ingress profiles of harmful species like chlorides, sulfates and alkali need to be determined. In order to provide on-site analysis of concrete a fast and reliable method is necessary. Low transition probabilities as well as the high ionization energies for chlorine and sulfur in the near-infrared range make the detection of Cl I and S I in low concentrations a difficult task. For the on-site analysis a mobile LIBS-system (λ = 1064 nm, Epulse ≤ 3 mJ, τ = 1.5 ns) with an automated scanner has been developed at BAM. Weak chlorine and sulfur signal intensities do not allow classical univariate analysis for process data derived from the mobile system. In order to improve the analytical performance, multivariate analysis like PLS-R will be presented in this work. A comparison to standard univariate analysis will be carried out and results covering important parameters like detection and quantification limits (LOD, LOQ) as well as processing variances will be discussed (Allegrini and Olivieri, 2014 [1]; Ostra et al., 2008 [2]). It will be shown that for the first time a low-cost mobile system is capable of providing reproducible chlorine and sulfur analysis on concrete by using a low-sensitivity system in combination with multivariate evaluation.
Improving the prediction of African savanna vegetation variables using time series of MODIS products
NASA Astrophysics Data System (ADS)
Tsalyuk, Miriam; Kelly, Maggi; Getz, Wayne M.
2017-09-01
African savanna vegetation is subject to extensive degradation as a result of rapid climate and land use change. To better understand these changes, detailed assessment of vegetation structure is needed across an extensive spatial scale and at a fine temporal resolution. Applying remote sensing techniques to savanna vegetation is challenging due to sparse cover, high background soil signal, and the difficulty of differentiating between spectral signals of bare soil and dry vegetation. In this paper, we attempt to resolve these challenges by analyzing time series of four MODIS Vegetation Products (VPs): Normalized Difference Vegetation Index (NDVI), Enhanced Vegetation Index (EVI), Leaf Area Index (LAI), and Fraction of Photosynthetically Active Radiation (FPAR) for Etosha National Park, a semiarid savanna in north-central Namibia. We create models to predict the density, cover, and biomass of the main savanna vegetation forms: grass, shrubs, and trees. To calibrate remote sensing data we developed an extensive and relatively rapid field methodology and measured herbaceous and woody vegetation during both the dry and wet seasons. We compared the efficacy of the four MODIS-derived VPs in predicting field-measured vegetation variables. We then compared the optimal time span of VP time series to predict ground-measured vegetation. We found that multiyear Partial Least Squares Regression (PLSR) models were superior to single-year or single-date models. Our results show that NDVI-based PLSR models yield robust prediction of tree density (R2 = 0.79, relative Root Mean Square Error, rRMSE = 1.9%) and tree cover (R2 = 0.78, rRMSE = 0.3%). EVI provided the best model for shrub density (R2 = 0.82) and shrub cover (R2 = 0.83), but was only marginally superior to models based on other VPs. FPAR was the best predictor of vegetation biomass of trees (R2 = 0.76), shrubs (R2 = 0.83), and grass (R2 = 0.91).
Finally, we addressed an enduring challenge in the remote sensing of semiarid vegetation by examining the transferability of predictive models through space and time. Our results show that models created in the wetter part of Etosha could accurately predict trees' and shrubs' variables in the drier part of the reserve and vice versa. Moreover, our results demonstrate that models created for vegetation variables in the dry season of 2011 could be successfully applied to predict vegetation in the wet season of 2012. We conclude that extensive field data combined with multiyear time series of MODIS vegetation products can produce robust predictive models for multiple vegetation forms in the African savanna. These methods advance the monitoring of savanna vegetation dynamics and contribute to improved management and conservation of these valuable ecosystems.
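The two accuracy measures quoted in this abstract, R2 and rRMSE, can be computed along the following lines. This is a sketch under an assumption: rRMSE is taken here as RMSE divided by the mean of the observed values, which is one common definition; the abstract does not say which definition the authors used. The data are invented for illustration.

```python
import math

def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def relative_rmse(observed, predicted):
    """RMSE as a percentage of the observed mean (assumed definition)."""
    n = len(observed)
    rmse = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return 100.0 * rmse / (sum(observed) / n)

# Hypothetical field measurements and model predictions.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]

r2 = r_squared(observed, predicted)
rrmse = relative_rmse(observed, predicted)
```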
Slavchev, Aleksandar; Kovacs, Zoltan; Koshiba, Haruki; Nagai, Airi; Bázár, György; Krastanov, Albert; Kubota, Yousuke; Tsenkova, Roumiana
2015-01-01
Development of an efficient screening method coupled with cell functionality evaluation is highly needed in contemporary microbiology. The presented novel concept and fast non-destructive method brings into play the water spectral pattern of the solution as a molecular fingerprint of the cell culture system. To elucidate the concept, NIR spectroscopy with Aquaphotomics was applied to monitor the growth of sixteen Lactobacillus bulgaricus, one Lactobacillus pentosus and one Lactobacillus gasseri bacteria strains. Their growth rate, maximal optical density, low pH and bile tolerances were measured and further used as reference data for analysis of the simultaneously acquired spectral data. The acquired spectral data in the region of 1100-1850 nm were subjected to various multivariate data analyses: PCA, OPLS-DA and PLSR. The results showed high accuracy of bacteria strain classification according to their probiotic strength. The most informative spectral fingerprints covered the first overtone of water, emphasizing the relation of the water molecular system to cell functionality.
Mohamadi Monavar, H; Afseth, N K; Lozano, J; Alimardani, R; Omid, M; Wold, J P
2013-07-15
The purpose of this study was to evaluate the feasibility of Raman spectroscopy for predicting purity of caviars. The 93 wild caviar samples of three different types, namely; Beluga, Asetra and Sevruga were analysed by Raman spectroscopy in the range 1995 cm(-1) to 545 cm(-1). Also, 60 samples from combinations of every two types were examined. The chemical origin of the samples was identified by reference measurements on pure samples. Linear chemometric methods like Principal Component Analysis (PCA) and Linear Discriminant Analysis (LDA) were used for data visualisation and classification which permitted clear distinction between different caviars. Non-linear methods like Artificial Neural Networks (ANN) were used to classify caviar samples. Two different networks were tested in the classification: Probabilistic Neural Network with Radial-Basis Function (PNN) and Multilayer Feed Forward Networks with Back Propagation (BP-NN). In both cases, scores of principal components (PCs) were chosen as input nodes for the input layer in PC-ANN models in order to reduce the redundancy of data and time of training. Leave One Out (LOO) cross validation was applied in order to check the performance of the networks. Results of PCA indicated that, features like type and purity can be used to discriminate different caviar samples. These findings were also supported by LDA with efficiency between 83.77% and 100%. These results were confirmed with the results obtained by developed PC-ANN models, able to classify pure caviar samples with 93.55% and 71.00% accuracy in BP network and PNN, respectively. In comparison, LDA, PNN and BP-NN models for predicting caviar types have 90.3%, 73.1% and 91.4% accuracy. 
Partial least squares regression (PLSR) models were built under cross validation and tested with different independent data sets, yielding determination coefficients (R(2)) of 0.86, 0.83, 0.92 and 0.91 with root mean square error (RMSE) of validation of 0.32, 0.11, 0.03 and 0.09 for fatty acids of 16.0, 20.5, 22.6 and fat, respectively. Crown Copyright © 2013. Published by Elsevier B.V. All rights reserved.
Land-use versus natural controls on soil fertility in the Subandean Amazon, Peru.
Lindell, Lina; Aström, Mats; Oberg, Tomas
2010-01-15
Deforestation to expand the agricultural frontier is a serious threat to the Amazon forest. Strategies to attain and maintain satisfactory soil fertility, which requires knowledge of spatial and temporal changes caused by land-use, are important for reaching sustainable development. This study highlights these issues by evaluating the relative effects of agricultural land-use and natural factors on chemical fertility of Inceptisols on redbed lithologies in the Subandean Amazon. Macro and micronutrients were determined in topsoil and subsoil in the vicinity of two villages at a total of 80 sites including pastures, coffee plantations, swidden fields, secondary forest and, as a reference, adjacent primary forest. Differences in soil fertility between the land cover classes were investigated by principal component analysis (PCA) and partial least squares regression (PLSR). Primary forest soil was found to be chemically similar to that of coffee plantations, pastures and secondary forests. There were no significant differences between soils of these land cover types in terms of plant nutrients (e.g. N, P, K, Ca, Mg, Mo, Mn, Zn, Cu and Co) or other fertility indicators (OM, pH, BS, EC, CECe and exchangeable acidity). The parent material (as indicated by texture and sample geographical origin) and the slope of the sampled sites were stronger controls on soil fertility than land cover type. Elevated concentrations of a few nutrients (NO(3) and K) were, however, detected in soils of swidden fields. Despite being fertile (higher CECe, Ca and P) compared to Oxisols and Ultisols in the Amazon lowland, the Subandean soils frequently showed deficiencies in several nutrients (e.g. P, K, NO(3), Cu and Zn), and high levels of free Al at acidic sites. This paper concludes that deforestation and agricultural land-use have not introduced lasting chemical changes in the studied Subandean soils that are significant in comparison to the natural variability. Copyright 2009 Elsevier B.V. 
All rights reserved.
NASA Astrophysics Data System (ADS)
Jensen, D.; Cavanaugh, K. C.; Simard, M.
2016-12-01
Coastal wetlands provide a wealth of ecosystem services, including improved water quality, protection from storm surges, and wildlife habitat. Louisiana's wetlands, however, are threatened by development, pollution, and relative sea level rise (RSLR)—the combination of sea level rise and subsidence rates. Beyond causing land loss, RSLR impacts Louisiana's wetland ecosystems by altering salinity, nutrient availability, flood duration, and flood frequency in the region. Despite widespread wetland loss, areas such as the Wax Lake and Atchafalaya river deltas are in fact growing due to their sediment loads, resulting in a complex of both degradation and aggradation along the Louisiana coast. In order to understand and model how coastal wetlands are responding to RSLR, there is a need for improved vegetation distribution mapping, biomass estimation, and ecosystem change modeling. To this end, high-resolution imaging spectroscopy offers the ability to accurately develop species-level distribution maps and predictive aboveground biomass (AGB) models. AVIRIS-NG data collected over the Atchafalaya River Delta were calibrated to reduce Bidirectional Reflectance Distribution Function (BRDF) effects and mosaicked, along with other scenes that coincided with field observations. Multiple Endmember Spectral Mixture Analysis (MESMA) was used to map salt marsh at the species level across our study area. Field observations were used to parameterize and validate our MESMA based approach. AGB was then mapped for this region using a partial least squares regression (PLSR) model developed from the same imagery and field measurements. Last, the Sea Level Affecting Marshes Model was applied to predict wetland loss and changes in marsh composition due to sea level rise, which was then paired with the AGB map to estimate carbon storage change. In doing so, this study addresses key concerns for coastal regions and demonstrates the ability of imaging spectroscopy to predict those impacts.
Tran, Chieu D; Mututuvari, Tamutsiwa M
2016-03-07
A method was developed in which cellulose (CEL) and/or chitosan (CS) were added to keratin (KER) to enable the [CEL/CS+KER] composites formed to have better mechanical strength and wider utilization. Butylmethylimidazolium chloride ([BMIm+Cl−]), an ionic liquid, was used as the sole solvent, and because the majority of the [BMIm+Cl−] used (at least 88%) was recovered, the method is green and recyclable. FTIR, XRD, 13C CP-MAS NMR and SEM results confirm that KER, CS and CEL remain chemically intact and are distributed homogeneously in the composites. We successfully demonstrate that the widely used method based on the deconvolution of the FTIR bands of amide bonds to determine the secondary structure of proteins is relatively subjective, as the conformation obtained is strongly dependent on the choice of parameters selected for curve fitting. A new method, based on partial least squares regression (PLSR) analysis of the amide bands, was developed and proven to be objective and to provide more accurate information. Results obtained with this method agree well with those by XRD, namely they indicate that although KER retains its secondary structure when incorporated into the [CEL+CS] composites, it has relatively lower α-helix and higher β-turn and random form contents compared to those of the KER in native wool. It seems that during dissolution by [BMIm+Cl−], the inter- and intramolecular forces in KER were broken, thereby destroying its secondary structure. During regeneration, these interactions were reestablished to partially reform the secondary structure. However, in the presence of either CEL or CS, the chains seem to prefer the extended form, thereby hindering reformation of the α-helix. Consequently, the KER in these matrices may adopt structures with lower content of α-helix and higher content of β-sheet. 
As anticipated, results of tensile strength and TGA confirm that adding CEL or CS into KER substantially increases the mechanical strength and thermal stability of the [CS/CEL+KER] composites.
Liu, Jinbao; Zhang, Yang; Wang, Huanyuan; Du, Yichun
2018-06-15
Estimating the heavy metal content of soils can reflect the condition of the surface environment and lays a theoretical foundation for using vegetation cover to monitor the environment and survey resources. In this study, 44 soil samples were collected from Fufeng County, Yangling County and Wugong County, Shaanxi Province, and their contents of Cr, Mn, Ni, Cu, Zn, As, Cd, Hg and Pb were used as data sources. Reflectance spectra were measured with an ASD FieldSpec HR (350-2500 nm) and pretreated with NOR, MSC and SNV, and then first-derivative, second-derivative and reciprocal-logarithm transformations of the reflectance were carried out. The optimal spectral estimation model for each of the nine heavy metal elements (Cr, Mn, Ni, Cu, Zn, As, Cd, Hg and Pb) was established by regression. The diffuse reflectance characteristics of soils with different heavy metal contents were compared, as was the effect of the different pretreatment methods on the soil heavy metal spectral inversion model. The chemical analysis shows that there was serious Hg pollution in the study area and that the Cd content was close to the critical value. The results show that: (1) NOR, MSC and SNV were adopted to pretreat the visible and near-infrared spectra. Combining them with differential transformation can enhance the information on heavy metal elements in the soil, and using the correlated bands can significantly improve the stability and predictive ability of the model. (2) The modeling accuracies of the optimal PLSR models for Cr, Mn, Ni, Cu, Zn, As, Cd, Hg and Pb were 0.70, 0.79, 0.69, 0.81, 0.86, 0.58, 0.55, 0.99 and 0.62, respectively. (3) The optimal estimation models, which use different treatment methods for different elements, have better stability and higher precision and can realize rapid prediction of the nine heavy metal elements in this region. Copyright © 2018 Elsevier B.V. All rights reserved.
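Two of the transformations named in this abstract, the reciprocal-logarithm transform and the derivative transforms, can be sketched as below. This is only an illustration under stated assumptions: a base-10 logarithm and a simple finite difference are assumed, since the abstract does not specify either, and the function names and reflectance values are hypothetical.

```python
import math

def reciprocal_log(reflectance):
    """Reciprocal-logarithm transform, log10(1/R), of a reflectance spectrum."""
    return [math.log10(1.0 / r) for r in reflectance]

def first_difference(values, step=1.0):
    """Finite-difference approximation of the first derivative
    (assumes evenly spaced wavelengths with spacing `step`)."""
    return [(b - a) / step for a, b in zip(values, values[1:])]

# Hypothetical reflectance values at three evenly spaced wavelengths.
reflectance = [0.1, 0.01, 0.001]

absorbance = reciprocal_log(reflectance)        # approximately [1, 2, 3]
first_derivative = first_difference(absorbance)
second_derivative = first_difference(first_derivative)
```

In practice a smoothing derivative filter is often preferred over a raw difference because differencing amplifies noise; which filter was used in the study is not stated.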
Porto-Figueira, Priscilla; Figueira, José A; Berenguer, Pedro; Câmara, José S
2018-04-15
The effect of ripening on the evolution of the volatomic pattern from endemic Vaccinium padifolium L. (Uveira) berries was investigated using headspace-solid phase microextraction (HS-SPME) followed by gas chromatography/quadrupole-mass spectrometry (GC-qMS) and multivariate statistical analysis (MVA). The most significant HS-SPME parameters, namely fibre polymer, ionic strength and extraction time, were optimized in order to improve extraction efficiency. Under optimal experimental conditions (DVB/CAR/PDMS fibre coating, 40 °C, 30 min extraction time and 5 g of sample), a total of 72 volatiles of different functionalities were isolated and identified. Terpenes followed by higher alcohols and esters were the predominant classes in the ripening stages - green, break and ripe. Although significant differences in the volatomic profiles at the three stages were obtained, cis-β-ocimene (2.0-40.0%), trans-2-hexenol (2.4-19.4%), cis-3-hexenol (2.5-16.4%), β-myrcene (1.9-13.8%), 1-hexanol (1.7-13.6%), 2-hexenal (0.7-8.0%), 2-heptanone (0.7-7.7%), and linalool (1.9-6.1%) were the main volatile compounds identified. Higher alcohols, carboxylic acids and ketones gradually increased during ripening, whereas monoterpenes significantly decreased. These trends were dominated by the higher alcohols (1-hexanol, cis-3-hexenol, trans-2-hexenol) and monoterpenes (β-myrcene, cis-β-ocimene and trans-β-ocimene). Partial least squares regression (PLSR) revealed that ethyl caprylate (1.000), trans-geraniol (0.995), ethyl isovalerate (-0.994) and benzyl carbinol (0.993) are the key variables that most contributed to the successful differentiation of Uveira berries according to ripening stage. To the best of our knowledge, no study has been carried out on the volatomic composition of berries from endemic Uveira. Copyright © 2017 Elsevier Ltd. All rights reserved.
Spatial distribution of heterocyclic organic matter compounds at macropore surfaces in Bt-horizons
NASA Astrophysics Data System (ADS)
Leue, Martin; Eckhardt, Kai-Uwe; Gerke, Horst H.; Ellerbrock, Ruth H.; Leinweber, Peter
2017-04-01
The illuvial Bt-horizon of Luvisols is characterized by coatings of clay and organic matter (OM) at the surfaces of cracks, biopores and inter-aggregate spaces. The OM composition of the coatings that originate from preferential transport of suspended matter in macropores determines the physico-chemical properties of the macropore surfaces. The analysis of the spatial distribution of specific OM components such as heterocyclic N-compounds (NCOMP) and benzonitrile and naphthalene (BN+NA) could shed light on the effect of macropore coatings on the transport of colloids and reactive solutes during preferential flow and on OM turnover processes in subsoils. The objective was to characterize the mm-to-cm scale spatial distribution of NCOMP and BN+NA at intact macropore surfaces from the Bt-horizons of two Luvisols developed on loess and glacial till. In material manually separated from macropore surfaces, the proportions of NCOMP and BN+NA were determined by pyrolysis-field ionization mass spectrometry (Py-FIMS). These OM compounds, likely originating from combustion residues, were found to be increased in crack coatings and pinhole fillings but decreased in biopore walls (worm burrows and root channels). The Py-FIMS data were correlated with signals from C=O and C=C groups and with signals from O-H groups of clay minerals as determined by Fourier transform infrared spectroscopy in diffuse reflectance mode (DRIFT). Intensive signals of C15 to C17 alkanes from long-chain alkenes as main components of diesel and diesel exhaust particulates substantiated the assumption that burning residues were prominent in the subsoil OM. The spatial distribution of NCOMP and BN+NA along the macropores was predicted by partial least squares regression (PLSR) using DRIFT mapping spectra from intact surfaces and was found to be closely related to the distribution of crack coatings and pinholes. The results emphasize the importance of clay coatings in the subsoil to OM sorption and stabilization. 
Differences between biopores and cracks suggest differences in the mass transport and OM turnover between these macropore types in Luvisols.
NASA Astrophysics Data System (ADS)
Poggio, Matteo; Brown, David J.; Gasch, Caley K.; Brooks, Erin S.; Yourek, Matt A.
2015-04-01
In the Palouse region of eastern Washington and northern Idaho (USA), spatially discontinuous restrictive layers impede root growth and water infiltration. Consequently, accurate maps showing the depth and spatial extent of these restrictive layers are essential for watershed hydrologic modeling appropriate for precision agriculture. In this presentation, we report on the use of a Visible and Near-Infrared (VisNIR) penetrometer fore optic to construct detailed maps of three wheat fields in the Palouse region. The VisNIR penetrometer was used to deliver in situ soil reflectance to an Analytical Spectral Devices (ASD, Boulder, CO, USA) spectrometer and simultaneously acquire insertion force. Using a hydraulic push-type soil coring system for insertion (e.g. Giddings), we collected soil spectra and insertion force data along 41 m × 41 m grid points (2 fields) and 50 m × 50 m grid points (1 field) to ≈80 cm depth, in addition to interrogation points at 36 representative instrumented locations per field. At each of the 36 instrumented locations, two soil cores were extracted for laboratory determination of clay content and bulk density. We developed calibration models of soil clay content and bulk density with spectra and insertion force collected in situ, using partial least squares regression 2 (PLSR2). Applying spline functions, we delineated clay and bulk density profiles at each point (grid and 24 locations). The soil profiles were then used as inputs in a regression-kriging model with terrain indexes and ECa data (derived from an EM38 field survey, Geonics, Mississauga, Ontario, Canada) as covariates to generate 3D soil maps. Preliminary results show that the VisNIR penetrometer can capture the spatial patterns of restrictive layers. Work is ongoing to evaluate the prediction accuracy of penetrometer-derived 3D clay content and restrictive layer maps.
Development of VIS/NIR spectroscopic system for real-time prediction of fresh pork quality
NASA Astrophysics Data System (ADS)
Zhang, Haiyun; Peng, Yankun; Zhao, Songwei; Sasao, Akira
2013-05-01
Quality attributes of fresh meat influence its nutritional value and consumers' purchasing decisions. The aim of the research was to develop a prototype for real-time detection of meat quality. It consisted of a hardware system and a software system. A VIS/NIR spectrograph in the range of 350 to 1100 nm was used to collect the spectral data. In order to acquire more information about the sample, an optical fiber multiplexer was used. A conveyable, cylindrical device was designed and fabricated to hold the optical fibers from the multiplexer. A high-power tungsten halogen lamp was selected as the light source. The spectral data were obtained from the surface of the sample with an exposure time of 2.17 ms by pressing down the trigger switch on the self-developed system. The system could automatically acquire, process, display and save the data. Moreover, the quality could be predicted on-line. A total of 55 fresh pork samples were used to develop the prediction model for real-time detection. The spectral data were pretreated with standard normal variate (SNV), and partial least squares regression (PLSR) was used to develop the prediction model. The correlation coefficient and root mean square error of the validation set for water content and pH were 0.810, 0.653, and 0.803, 0.098 respectively. The research shows that the real-time non-destructive detection system based on VIS/NIR spectroscopy can efficiently predict the quality of fresh meat.
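The SNV pretreatment mentioned in this abstract can be sketched as follows: each spectrum is centred on its own mean and scaled by its own standard deviation. This is a generic illustration of the transform, not the authors' software; the spectrum values are invented, and the sample standard deviation is assumed.

```python
import statistics

def snv(spectrum):
    """Centre a single spectrum on its own mean and scale it by its own
    (sample) standard deviation."""
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)
    return [(x - mean) / sd for x in spectrum]

# Hypothetical spectrum, and the same spectrum with a multiplicative
# scatter effect (x2) and a baseline offset (+0.1).
spectrum = [0.2, 0.4, 0.6, 0.8]
distorted = [2.0 * x + 0.1 for x in spectrum]

snv_original = snv(spectrum)
snv_distorted = snv(distorted)
```

Both versions map to the same corrected spectrum, which is why SNV is commonly applied before PLSR to reduce scatter-related differences between samples.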
NASA Astrophysics Data System (ADS)
Arantes Camargo, Livia; Marques Júnior, José; Reynaldo Ferracciú Alleoni, Luís; Tadeu Pereira, Gener; De Bortoli Teixeira, Daniel; Santos Rabelo de Souza Bahia, Angélica
2017-04-01
Environmental impact assessments may be assisted by spatial characterization of potentially toxic elements (PTEs). Diffuse reflectance spectroscopy (DRS) and X-ray fluorescence spectroscopy (XRF) are rapid, non-destructive, low-cost prediction tools for the simultaneous characterization of different soil attributes. Although low concentrations of PTEs might preclude the observation of spectral features, their contents can be predicted using spectroscopy by exploring the existing relationship between the PTEs and soil attributes with spectral features. This study aimed to evaluate, in three geomorphic surfaces of Oxisols, the capacity for predicting PTEs (Ba, Co, and Ni) and their spatial variability by means of DRS and XRF. To that end, soil samples were collected from three geomorphic surfaces and analyzed for chemical, physical, and mineralogical properties, and then analyzed with DRS (visible + near infrared - VIS+NIR and medium infrared - MIR) and XRF equipment. PTE prediction models were calibrated using partial least squares regression (PLSR). PTE spatial distribution maps were built with geostatistics, using the values calculated by the calibrated models that reached the best accuracy. PTE prediction models were satisfactorily calibrated using MIR DRS for Ba and Co (residual prediction deviation - RPD > 3.0), Vis DRS for Ni (RPD > 2.0) and XRF for all the studied PTEs (RPD > 1.8). DRS- and XRF-predicted values allowed the characterization and the understanding of spatial variability of the studied PTEs.
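The RPD values quoted in this abstract can be illustrated with a short calculation. The sketch assumes the common definition of RPD as the standard deviation of the reference values divided by the root mean square error of prediction; the abstract does not spell out the formula, and the numbers below are invented.

```python
import math
import statistics

def rpd(observed, predicted):
    """Residual prediction deviation: SD of reference values / RMSEP
    (assumed definition, using the sample standard deviation)."""
    n = len(observed)
    rmsep = math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)
    return statistics.stdev(observed) / rmsep

# Hypothetical reference concentrations and model predictions.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]

rpd_value = rpd(observed, predicted)
```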
Kinoshita, Rintaro; Moebius-Clune, Bianca N.; van Es, Harold M.; Hively, W. Dean; Bilgilis, A. Volkan
2012-01-01
Visible and near-infrared reflectance spectroscopy (VNIRS) is a rapid and nondestructive method that can predict multiple soil properties simultaneously, but its application in multidimensional soil quality (SQ) assessment in the tropics still needs to be further assessed. In this study, VNIRS (350–2500 nm) was employed to analyze 227 air-dried soil samples of Ultisols from a soil chronosequence in western Kenya and assess 16 SQ indicators. Partial least squares regression (PLSR) was validated using the full-site cross-validation method by grouping samples from each farm or forest site. Most suitable models successfully predicted SQ indicators (R2 ≥ 0.80; ratio of performance to deviation [RPD] ≥ 2.00) including soil organic matter (OMLOI), active C, Ca, cation exchange capacity (CEC), and clay. Moderately-well predicted indicators (0.50 ≤ R2 pwp), and field capacity (Θfc). Poorly predicted indicators (R2 < 0.50; RPD < 1.40) were EC, S, P, available water capacity (AWC), K, Zn, and penetration resistance. Combining VNIRS with selected field- and laboratory-measured SQ indicator values increased predictability. Furthermore, VNIRS showed moderate to substantial agreement in predicting interpretive SQ scores and a composite soil quality index (CSQI) especially when combined with directly measured SQ indicator values. In conclusion, VNIRS has good potential for low cost, rapid assessment of physical and biological SQ indicators but conventional soil chemical tests may need to be retained to provide comprehensive SQ assessments.
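The full-site cross-validation described in this abstract, in which all samples from one farm or forest site are held out together, can be sketched as a simple index splitter. The function name and site labels are hypothetical, and the model fitting that would run inside each fold is omitted.

```python
def leave_one_site_out(site_labels):
    """Yield (held_out_site, train_indices, test_indices), holding out
    every sample of one site at a time."""
    for held_out in sorted(set(site_labels)):
        train = [i for i, s in enumerate(site_labels) if s != held_out]
        test = [i for i, s in enumerate(site_labels) if s == held_out]
        yield held_out, train, test

# Hypothetical site label for each of five soil samples.
sites = ["farm_a", "farm_a", "forest_b", "farm_c", "forest_b"]
folds = list(leave_one_site_out(sites))
```

Grouping by site keeps samples from the same location out of both the calibration and validation sets at once, which tends to give a more conservative accuracy estimate than random splitting.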
Henn, Raphael; Kirchler, Christian G; Grossgut, Maria-Elisabeth; Huck, Christian W
2017-05-01
This study compared three commercially available spectrometers, two of which were miniaturized, in terms of their ability to predict melamine in milk powder (infant formula). To this end, all spectra were split into calibration and validation sets using the Kennard-Stone and Duplex algorithms for comparison. For each instrument the three best performing PLSR models were constructed using SNV and Savitzky-Golay derivatives. The best RMSEP values were 0.28 g/100 g, 0.33 g/100 g and 0.27 g/100 g for the NIRFlex N-500, the microPHAZIR and the microNIR2200 respectively. Furthermore, the multivariate LOD interval [LODmin, LODmax] was calculated for all the PLSR models, unveiling significant differences among the spectrometers, with values of 0.20 g/100 g - 0.27 g/100 g, 0.28 g/100 g - 0.54 g/100 g and 0.44 g/100 g - 1.01 g/100 g for the NIRFlex N-500, the microPHAZIR and the microNIR2200 respectively. To assess the robustness of all models, artificial white noise, baseline shift, multiplicative effect, spectral shrink and stretch, stray light and spectral shift were introduced. Monitoring the RMSEP as a function of the perturbation gave an indication of the robustness of the models and helped to compare the performances of the spectrometers. Without taking the additional information from the LOD calculations into account, one could falsely assume that all the spectrometers perform equally well, which is not the case when the multivariate evaluation and robustness data are considered. Copyright © 2017 Elsevier B.V. All rights reserved.
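The Kennard-Stone split mentioned in this abstract can be sketched as below. This is a plain, unoptimized illustration of the general algorithm using Euclidean distances on toy one-band "spectra"; it is not the authors' code, the function name is hypothetical, and at least two samples are assumed to be selected.

```python
import math

def kennard_stone(samples, n_select):
    """Return (selected, remaining) sample indices. Selection starts with
    the two most distant samples, then repeatedly adds the sample whose
    nearest already-selected neighbour is farthest away."""
    n = len(samples)
    dist = [[math.dist(a, b) for b in samples] for a in samples]
    pairs = ((i, j) for i in range(n) for j in range(i + 1, n))
    first, second = max(pairs, key=lambda ij: dist[ij[0]][ij[1]])
    selected = [first, second]
    remaining = [k for k in range(n) if k not in selected]
    while len(selected) < n_select and remaining:
        nxt = max(remaining, key=lambda k: min(dist[k][s] for s in selected))
        selected.append(nxt)
        remaining.remove(nxt)
    return selected, remaining

# Toy data (assumed): five samples described by a single variable.
samples = [(0.0,), (1.0,), (10.0,), (5.0,), (4.9,)]
calibration, validation = kennard_stone(samples, n_select=3)
```

Here the calibration set takes the two extremes and the sample at 5.0, so it spans the data range, while the remaining samples form the validation set.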
NASA Astrophysics Data System (ADS)
Jurasinski, Gerald; Scharnweber, Tobias; Schröder, Christian; Lennartz, Bernd; Bauwe, Andreas
2017-04-01
Tree growth depends, among other factors, largely on the prevailing climatic conditions. Therefore, changes in tree growth patterns are to be expected under climate change. Here, we analyze the tree-ring growth response of three major European tree species to projected future climate across a climatic (mostly precipitation) gradient in northeastern Germany. We used monthly data for temperature, precipitation, and the standardized precipitation evapotranspiration index (SPEI) over multiple time scales (1, 3, 6, 12, and 24 months) to construct models of tree-ring growth for Scots pine (Pinus sylvestris L.) at three pure stands, and for Common beech (Fagus sylvatica L.) and Pedunculate oak (Quercus robur L.) at three mature mixed stands. The regression models were derived using a two-step approach based on partial least squares regression (PLSR) to extract variables with potentially high explanatory power, followed by ordinary least squares regression (OLSR) to consolidate the models to the least number of variables while retaining high explanatory power. The stability of the models was tested with a comprehensive calibration-verification scheme. All models were successfully verified with R2s ranging from 0.21 for the western pine stand to 0.62 for the beech stand in the east. For growth prediction, climate data forecasted until 2100 by the regional climate model WETTREG2010 based on the A1B Intergovernmental Panel on Climate Change (IPCC) emission scenario were used. For beech and oak, growth rates will likely decrease until the end of the 21st century. For pine, modeled growth trends vary and range from a slight growth increase to a weak decrease in growth rates depending on the position along the climatic gradient. The climatic gradient across the study area will possibly affect the future growth of oak with larger growth reductions towards the drier east. For beech, site-specific adaptations seem to override the influence of the climatic gradient. 
We conclude that in northeastern Germany Scots pine has great potential to remain resilient to projected climate change without any major impairment, whereas Common beech and Pedunculate oak will likely face reduced growth under the expected warmer and drier climate conditions. The results call for an adaptation of forest management to mitigate the negative effects of climate change for beech and oak in the region.
Yu, Min; He, Shudong; Tang, Mingming; Zhang, Zuoyong; Zhu, Yongsheng; Sun, Hanju
2018-03-15
Four peptide fractions, PF1 (>5 kDa), PF2 (3-5 kDa), PF3 (1-3 kDa) and PF4 (<1 kDa), were isolated from soybean hydrolysate using the ultrafiltration method. Then, d-xylose and l-cysteine were reacted with each peptide solution at 120 °C for 2 h, and the molecular weight distribution (MWD), pH, colour, browning intensity, DPPH radical-scavenging activity, free amino acids and sensory characteristics of the corresponding Maillard reaction products (MRPF1, MRPF2, MRPF3 and MRPF4) were evaluated. Peptides with low molecular weight contributed more to the changes in pH, colour and browning intensity during the Maillard reaction. The DPPH radical-scavenging activity of PF4 was significantly improved after the Maillard reaction. Aroma volatiles and PLSR analysis suggested that MRPF3 had the best sensory characteristics, with higher contents of umami amino acids and lower contents of bitter amino acids; it could therefore be deduced that the umami and meaty characteristics were correlated with the peptides of 1-3 kDa. Copyright © 2017 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Gilliot, Jean-Marc; Vaudour, Emmanuelle; Michelin, Joël
2016-04-01
This study was carried out in the framework of the PROSTOCK-Gessol3 project supported by the French Environment and Energy Management Agency (ADEME), the TOSCA-PLEIADES-CO project of the French Space Agency (CNES) and the SOERE PRO network working on the long-term environmental impacts of recycling Organic Waste Products on field crops. Organic matter is an important soil fertility parameter, and previous studies have shown the potential of spectral information measured in the laboratory or directly in the field, using a field spectroradiometer or satellite imagery, to predict the soil organic carbon (SOC) content. This work proposes a method for the spatial prediction of bare cultivated topsoil SOC content from Unmanned Aerial Vehicle (UAV) multispectral imagery. An agricultural plot of 13 ha, located in the western region of Paris, France, was analysed in April 2013, shortly before sowing, while the soil was still bare. Soils comprised haplic luvisols, rendzic cambisols and calcaric or colluvic cambisols. The UAV platform was a fixed-wing aircraft provided by Airinov®, flying at an altitude of 150 m and equipped with a four-channel multispectral visible near-infrared camera, MultiSPEC 4C® (550 nm, 660 nm, 735 nm and 790 nm). Twenty-three ground control points (GCP) were sampled within the plot according to soil descriptions. GCP positions were determined with a centimetric DGPS. Different observations and measurements were made synchronously with the drone flight: soil surface description, spectral measurements (with an ASD FieldSpec 3® spectroradiometer), and roughness measurements by a photogrammetric method. Each of these locations was sampled for both standard soil physico-chemical analysis and soil water content. Structure From Motion (SFM) processing of the UAV imagery was carried out with the Agisoft Photoscan® software to produce a 15 cm resolution multispectral mosaic.
The SOC content was modelled by partial least squares regression (PLSR) between the laboratory analyses and the multispectral information for the 23 plots. The root mean squared error of cross-validation (RMSECV) by the leave-one-out (LOO) method was 1.97 g of OC per kg of soil. A second correction of the model, incorporating the effects of moisture and roughness on reflectance, improved the quality of the prediction by 18%, giving an RMSECV of 1.61 g/kg. The model was finally spatialized over the whole plot using ArcGIS® by applying the regression formula to all mosaic pixels. Results are discussed in the light of an additional sampling campaign carried out in October 2015, providing 34 independent samples.
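The leave-one-out RMSECV described in this abstract can be illustrated with a minimal sketch. It assumes a PLS1 model with a single latent variable written in plain Python; the function names, the toy band reflectances and the SOC values are hypothetical and are not taken from the study, whose number of components and software are not stated here.

```python
import math

def fit_pls1_one_component(X, y):
    """Fit a PLS1 model with one latent variable (illustrative only)."""
    n, p = len(X), len(X[0])
    x_mean = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    Xc = [[row[j] - x_mean[j] for j in range(p)] for row in X]
    yc = [v - y_mean for v in y]
    # Weight vector: direction in X with maximum covariance with y.
    w = [sum(Xc[i][j] * yc[i] for i in range(n)) for j in range(p)]
    norm = math.sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores of the samples on the latent variable.
    t = [sum(Xc[i][j] * w[j] for j in range(p)) for i in range(n)]
    # Regression of y on the scores.
    q = sum(t[i] * yc[i] for i in range(n)) / sum(v * v for v in t)
    return x_mean, y_mean, w, q

def predict(model, x):
    x_mean, y_mean, w, q = model
    score = sum((x[j] - x_mean[j]) * w[j] for j in range(len(x)))
    return y_mean + q * score

def rmsecv_loo(X, y):
    """Root mean squared error of leave-one-out cross-validation."""
    squared_errors = []
    for i in range(len(X)):
        model = fit_pls1_one_component(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        squared_errors.append((y[i] - predict(model, X[i])) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))

def relative_reduction(before, after):
    """Fractional reduction of an error metric."""
    return (before - after) / before

# Hypothetical reflectances in four bands and SOC contents (g/kg).
bands = [[0.12, 0.18, 0.25, 0.30], [0.10, 0.15, 0.22, 0.27],
         [0.15, 0.22, 0.30, 0.36], [0.09, 0.13, 0.19, 0.24],
         [0.13, 0.20, 0.27, 0.33], [0.11, 0.16, 0.23, 0.29]]
soc = [14.2, 16.8, 11.5, 18.1, 13.0, 15.6]
print("toy RMSECV:", rmsecv_loo(bands, soc))

# The two RMSECV values quoted in the abstract: 1.97 and 1.61 g/kg.
print(round(100 * relative_reduction(1.97, 1.61)))  # 18
```

The last line only checks the arithmetic of the abstract: going from 1.97 to 1.61 g/kg is a reduction of about 18%.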
NASA Astrophysics Data System (ADS)
Nousratpour, A.
2011-12-01
The annual CO2 emission from soils corresponds to a large portion of the global carbon cycle and equals 10 percent of the total atmospheric carbon pool. The total forest soil CO2 loss equals the sum of the contributions from autotrophic and heterotrophic organisms. The autotrophic respiration is derived from recent photosynthates from the forest canopy and exudates via the roots. The heterotrophic respiration is less directly dependent on root presence and recently assimilated photosynthates, which points to the possibility of separate mechanisms governing the CO2 emissions. The variation of the CO2 flux from these somewhat overlapping sources in the soil, i.e. rhizospheric and non-rhizospheric, is still not fully understood. Soil temperature and water availability in particular have often been used to explain the variation of soil CO2 efflux by using regression methods. In this experiment around 1000 hours of soil CO2 emission rates from a drained spruce forest were collected from 6 plots, among which 3 were previously root excluded. The emission rates were collected during 5 campaigns throughout the growing season, along with continuous above-ground and below-ground temperature and water properties such as precipitation and VPD (vapor pressure deficit). The resulting matrix was analyzed using the multivariate statistical method PLSR (partial least squares regression). This operation reduces the dimensionality of large datasets with probable multicollinearity and helps clarify the dependence of a response factor on the x-variables. In addition, a time series analysis is applied to the dataset to address the time lag between below-ground temperature and water properties and the above-ground weather conditions such as VPD and air temperature. Mean carbon emission from the control plots (428 mg carbon m⁻² hr⁻¹) was significantly larger than that from the root-excluded plots (136 mg carbon m⁻² hr⁻¹).
During the growing season more than 2/3 of the total CO2 release was estimated to be root contribution. The results show that the activity in the rhizosphere increased with rising soil temperature, VPD and ground water depletion until a certain point. When the level of ground water depth was deeper than about 0.5 m the dependence was reversed. This effect was either the opposite or lacking in the root excluded plots, which reflects the involvement of the tree roots and the separate factors controlling the different sources of CO2.
Soil Organic Carbon Estimation and Mapping Using "on-the-go" VisNIR Spectroscopy
NASA Astrophysics Data System (ADS)
Brown, D. J.; Bricklemyer, R. S.; Christy, C.
2007-12-01
Soil organic carbon (SOC) and other soil properties related to carbon sequestration (e.g. soil clay content and mineralogy) vary spatially across landscapes. To cost-effectively capture this variability, new technologies, such as Visible and Near Infrared (VisNIR) spectroscopy, have been applied to soils for rapid, accurate, and inexpensive estimation of SOC and other soil properties. For this study, we evaluated an "on-the-go" VisNIR sensor developed by Veris Technologies, Inc. (Salinas, KS) for mapping SOC, soil clay content and mineralogy. The Veris spectrometer spanned 350 to 2224 nm with 8 nm spectral resolution, and 25 spectra were integrated every 2 seconds, resulting in 3-5 m scanning distances on the ground. The unit was mounted to a mobile sensor platform pulled by a tractor, and scanned soils at an average depth of 10 cm through a quartz-sapphire window. We scanned eight 16.2 ha (40 ac) wheat fields in north central Montana (USA), with 15 m transect intervals. Using random sampling with spatial inhibition, 100 soil samples from 0-10 cm depths were extracted along scanned transects from each field and were analyzed for SOC. Neat, sieved (<2 mm) soil sample materials were also scanned in the lab using an Analytical Spectral Devices (ASD, Boulder, CO, USA) Fieldspec Pro FR spectroradiometer with a spectral range of 350-2500 nm and spectral resolution of 2-10 nm. The analyzed samples were used to calibrate and validate a number of partial least squares regression (PLSR) VisNIR models to compare on-the-go scanning vs. higher spectral resolution laboratory spectroscopy vs. standard SOC measurement methods.
NASA Astrophysics Data System (ADS)
Tsakiridis, Nikolaos L.; Tziolas, Nikolaos; Dimitrakos, Agathoklis; Galanis, Georgios; Ntonou, Eleftheria; Tsirika, Anastasia; Terzopoulou, Evangelia; Kalopesa, Eleni; Zalidis, George C.
2017-09-01
Soil Spectral Libraries facilitate agricultural production, taking into account the principles of low-input sustainable agriculture, and provide more valuable knowledge to environmental policy makers, enabling improved decision making and effective management of natural resources in the region. In this paper, the predictive performance of two state-of-the-art algorithms employed in soil spectroscopy, one linear (Partial Least Squares Regression) and one non-linear (Cubist), is compared. The comparison was carried out on a regional Soil Spectral Library developed in the Eastern Macedonia and Thrace region of Northern Greece, comprised of roughly 450 Entisol soil samples from soil horizons A (0-30 cm) and B (30-60 cm). The soil spectra were acquired in the visible-near infrared region (vis-NIR, 350-2500 nm) using a standard protocol in the laboratory. Three soil properties that are essential for agriculture were analyzed and taken into account for the comparison: the organic matter, the clay content and the concentration of nitrate-N. Additionally, three different spectral pre-processing techniques were utilized, namely continuum removal, the absorbance transformation, and the first derivative. Following the removal of outliers using the Mahalanobis distance in the first 5 principal components of the spectra (accounting for 99.8% of the variance), a five-fold cross-validation experiment was considered for all 12 datasets. Statistical comparisons were conducted on the results, which indicate that the Cubist algorithm outperforms PLSR, while the most informative transformation is the first derivative.
Wang, Yan-Cang; Yang, Gui-Jun; Zhu, Jin-Shan; Gu, Xiao-He; Xu, Peng; Liao, Qin-Hong
2014-07-01
To improve the estimation accuracy of the soil organic matter content of the northern fluvo-aquic soil, wavelet transform technology is introduced. The soil samples were collected from Tongzhou district and Shunyi district in Beijing city, and the data source is soil hyperspectral data obtained under laboratory conditions. First, the discrete wavelet transform decomposes the hyperspectral data into approximation coefficients and detail coefficients. Then, the correlation between the approximation coefficients, the detail coefficients and the organic matter content was analyzed, and the bands sensitive to organic matter were screened. Finally, models were established to estimate the soil organic matter content by using partial least squares regression (PLSR). Results show that the NIR bands contributed more than the visible bands to the models estimating organic matter content; the ability of the approximation coefficients to estimate organic matter content is better than that of the detail coefficients; the estimation precision of the detail coefficients for soil organic matter content decreases as the spectral resolution becomes lower; compared with the three commonly used types of soil spectral reflectance transforms, the wavelet transform can improve the ability of soil spectra to estimate organic matter content; the accuracy of the best model established with the approximation coefficients or the detail coefficients is higher, and the coefficient of determination (R²) and the root mean square error (RMSE) of the best model for approximation coefficients are 0.722 and 0.221, respectively. The R² and RMSE of the best model for detail coefficients are 0.670 and 0.255, respectively.
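The split of a spectrum into approximation and detail coefficients can be sketched with a one-level discrete wavelet transform. The abstract does not say which wavelet family or decomposition level was used, so the Haar wavelet below is an assumption chosen only because it is the simplest case; the function name and the reflectance values are hypothetical.

```python
import math

def haar_dwt(signal):
    """One-level Haar DWT: returns (approximation, detail) coefficients."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2.0)
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    # Approximation: scaled pairwise sums (smooth, low-frequency part).
    approximation = [(a + b) / scale for a, b in pairs]
    # Detail: scaled pairwise differences (local, high-frequency part).
    detail = [(a - b) / scale for a, b in pairs]
    return approximation, detail

# Hypothetical reflectance values at eight consecutive wavelengths.
reflectance = [0.21, 0.23, 0.26, 0.25, 0.30, 0.33, 0.35, 0.34]
approx, detail = haar_dwt(reflectance)
print(approx)
print(detail)
```

Applying the function again to the approximation output gives the next, coarser level, which is one way to read the abstract's remark that estimation from detail coefficients degrades as the spectral resolution decreases.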
Oberg, T
2007-01-01
The vapour pressure is the most important property of an anthropogenic organic compound in determining its partitioning between the atmosphere and the other environmental media. The enthalpy of vaporisation quantifies the temperature dependence of the vapour pressure and its value around 298 K is needed for environmental modelling. The enthalpy of vaporisation can be determined by different experimental methods, but estimation methods are needed to extend the current database and several approaches are available from the literature. However, these methods have limitations, such as a need for other experimental results as input data, a limited applicability domain, a lack of domain definition, and a lack of predictive validation. Here we have attempted to develop a quantitative structure-property relationship (QSPR) that has general applicability and is thoroughly validated. Enthalpies of vaporisation at 298 K were collected from the literature for 1835 pure compounds. The three-dimensional (3D) structures were optimised and each compound was described by a set of computationally derived descriptors. The compounds were randomly assigned into a calibration set and a prediction set. Partial least squares regression (PLSR) was used to estimate a low-dimensional QSPR model with 12 latent variables. The predictive performance of this model, within the domain of application, was estimated at n=560, q2Ext=0.968 and s=0.028 (log transformed values). The QSPR model was subsequently applied to a database of 100,000+ structures, after a similar 3D optimisation and descriptor generation. Reliable predictions can be reported for compounds within the previously defined applicability domain.
NASA Astrophysics Data System (ADS)
Li, Yao-Wang; Li, Bo; He, Jiguo; Qian, Ping
2011-07-01
A database consisting of 214 tripeptides which contain either a His or a Tyr residue was used to study quantitative structure-activity relationships (QSAR) of antioxidative tripeptides. Partial least-squares regression (PLSR) analysis was conducted using the parameters of each amino acid descriptor set individually, including Divided Physico-chemical Property Scores (DPPS), Hydrophobic, Electronic, Steric, and Hydrogen (HESH), Vectors of Hydrophobic, Steric, and Electronic properties (VHSE), Molecular Surface-Weighted Holistic Invariant Molecular (MS-WHIM), isotropic surface area-electronic charge index (ISA-ECI) and Z-scale, to describe the antioxidative tripeptides as X-variables, with the antioxidant activities measured by the ferric thiocyanate method as the Y-variable. After elimination of outliers by Hotelling's T² method and residual analysis, six significant models were obtained describing the entire data set. According to the cumulative squared multiple correlation coefficients (R²), cumulative cross-validation coefficients (Q²) and relative standard deviation for the calibration set (RSDc), the qualities of the models using the DPPS, HESH, ISA-ECI, and VHSE descriptors are better (R² > 0.6, Q² > 0.5, RSDc < 0.39) than those of the models using the MS-WHIM and Z-scale descriptors (R² < 0.6, Q² < 0.5, RSDc > 0.44). Furthermore, the predictive ability of the models using the DPPS descriptor is the best among the six descriptor systems (cumulative multiple correlation coefficient for the prediction set (R²ext) > 0.7). It was concluded that DPPS is better suited to describing the amino acids of antioxidative tripeptides. The results for the DPPS descriptor reveal that the center amino acid and the N-terminal amino acid are far more important than the C-terminal amino acid for antioxidative tripeptides.
The hydrophobic (positively related to activity) and electronic (negatively related to activity) properties of the N-terminal amino acid are suggested to be the most significant for activity, followed by the hydrogen bond property (positively related to activity) of the center amino acid. The N-terminal amino acid should be a highly hydrophobic and low electronic amino acid (such as Ala, Gly, Val, and Leu); the center amino acid should be an amino acid with a high hydrogen bond property (such as the basic amino acids Arg, Lys, and His). The structural characteristics of antioxidative peptides found in this paper may contribute to further research on the antioxidative mechanism.
NASA Astrophysics Data System (ADS)
Wang, Wenxiu; Peng, Yankun; Wang, Fan; Sun, Hongwei
2017-05-01
The improvement of living standards has urged consumers to pay more attention to the quality and nutrition of meat, so the development of a nondestructive detection device for quality and nutritional parameters is undoubtedly of commercial interest. In this research, a portable device equipped with visible (Vis) and near-infrared (NIR) spectrometers, a tungsten halogen lamp, optical fiber, a ring light guide and an embedded computer was developed to realize simultaneous and fast detection of color (L*, a*, b*), pH, total volatile basic nitrogen (TVB-N), intramuscular fat (IF), protein and water content in pork. The wavelength ranges of the dual-band spectrometers were 400-1100 nm and 940-1650 nm, respectively, and the tungsten halogen lamp worked with the ring light guide to form a ring light source and provide appropriate illumination intensity for the sample. Software was self-developed to control the functionality of the dual-band spectrometers, set spectrometer parameters, acquire and process Vis/NIR spectra and display the prediction results in real time. In order to obtain a robust and accurate prediction model, fresh longissimus dorsi meat was bought and placed in the refrigerator for 12 days to get pork samples with different degrees of freshness. In addition, pork meat from three different parts, including longissimus dorsi, haunch and lean meat, was collected for the determination of IF, protein and water so that the reference values had a wider distribution range. After acquisition of the Vis/NIR spectra, data from 400-1100 nm were pretreated with a Savitzky-Golay (S-G) filter and the standard normal variate transform (SNVT), and spectral data from 940-1650 nm were preprocessed with SNVT. Anomalous samples were eliminated by a Monte Carlo method based on model cluster analysis, and then partial least squares regression (PLSR) models based on a single band (400-1100 nm or 940-1650 nm) and on the dual band were established and compared.
The results showed the optimal models for each parameter were built with correlation coefficients in prediction set of 0.9101, 0.9121, 0.8873, 0.9094, 0.9378, 0.9348, 0.9342 and 0.8882, respectively. It indicated this innovative and practical device can be a promising technology for nondestructive, fast and accurate detection of nutritional parameters in meat.
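The Savitzky-Golay pretreatment mentioned in this abstract can be sketched as follows. The window length and polynomial order used by the authors are not given, so the 5-point quadratic filter below, with the standard convolution coefficients (-3, 12, 17, 12, -3)/35, is an assumption; leaving the two points at each end unsmoothed is also a simplification, and the spectrum values are hypothetical.

```python
# Standard 5-point quadratic/cubic Savitzky-Golay smoothing coefficients.
SG_COEFFS = (-3, 12, 17, 12, -3)
SG_NORM = 35

def savgol_smooth_5pt(values):
    """Smooth a spectrum with a 5-point Savitzky-Golay filter.

    The first and last two points are returned unchanged (simplification).
    """
    smoothed = list(values)
    for i in range(2, len(values) - 2):
        window = values[i - 2:i + 3]
        smoothed[i] = sum(c * v for c, v in zip(SG_COEFFS, window)) / SG_NORM
    return smoothed

# Hypothetical noisy reflectance spectrum.
spectrum = [0.40, 0.43, 0.41, 0.46, 0.48, 0.47, 0.52, 0.55]
print(savgol_smooth_5pt(spectrum))
```

Because the filter is a local least-squares polynomial fit, it reproduces any quadratic signal exactly at interior points, which makes a convenient sanity check.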
Prediction of soil organic carbon in a coal mining area by Vis-NIR spectroscopy.
Sun, Wenjuan; Li, Xinju; Niu, Beibei
2018-01-01
Coal mining has led to increasingly serious land subsidence, and the reclamation of the subsided land has become a hot topic of concern for governments and scholars. Soil quality of reclaimed land is the key indicator for evaluating the reclamation effect; hence, rapid monitoring and evaluation of reclaimed land is of great significance. Visible-near infrared (Vis-NIR) spectroscopy has been shown to be a rapid, timely and efficient tool for the prediction of soil organic carbon (SOC). In this study, 104 soil samples were collected from the Baodian mining area of Shandong province. Vis-NIR reflectance spectra and soil organic carbon content were then measured under laboratory conditions. The spectral data were first denoised using the Savitzky-Golay (SG) convolution smoothing method or the multiplicative scatter correction (MSC) method, after which the spectral reflectance (R) was subjected to reciprocal, reciprocal logarithm and differential transformations to improve spectral sensitivity. Finally, regression models for estimating the SOC content from the spectral data were constructed using partial least squares regression (PLSR). The results showed that: (1) The SOC content in the mining area was generally low (at the below-average level) and exhibited great variability. (2) The spectral reflectance increased with the decrease of soil organic carbon content. In addition, the sensitivity of the spectrum to the change in SOC content, especially that in the near-infrared band of the original reflectance, decreased when the SOC content was low. (3) The modeling results were best when the spectral reflectance was preprocessed by Savitzky-Golay (SG) smoothing coupled with multiplicative scatter correction (MSC) and first-order differential transformation (modeling R² = 0.86, RMSE = 2.00 g/kg, verification R² = 0.78, RMSE = 1.81 g/kg, and RPD = 2.69).
In addition, the first-order differential of R combined with SG, MSC with R, SG together with MSC and R also produced better modeling results than other pretreatment combinations. Vis-NIR modeling with specific spectral preprocessing methods could predict SOC content effectively.
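The three figures of merit quoted in this abstract (R², RMSE and RPD) can be computed as in the sketch below. It assumes the common definition of RPD as the standard deviation of the reference values divided by the RMSE of prediction, using the sample standard deviation; the abstract does not state which variant was used, and the function names and numbers are hypothetical.

```python
import math
import statistics

def rmse(y_true, y_pred):
    """Root mean squared error between reference and predicted values."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SSE/SST."""
    mean = sum(y_true) / len(y_true)
    sse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    sst = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - sse / sst

def rpd(y_true, y_pred):
    """Ratio of performance to deviation: SD(reference) / RMSE (assumed form)."""
    return statistics.stdev(y_true) / rmse(y_true, y_pred)

# Hypothetical SOC reference values and model predictions (g/kg).
reference = [10.0, 12.0, 14.0, 16.0, 18.0]
predicted = [11.0, 11.0, 15.0, 15.0, 19.0]
print(rmse(reference, predicted))       # 1.0
print(r_squared(reference, predicted))  # 0.875
print(rpd(reference, predicted))
```

Under that definition, the reported verification values (RMSE = 1.81 g/kg, RPD = 2.69) would correspond to a standard deviation of roughly 4.9 g/kg in the verification set.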
Taradolsirithitikul, Panchita; Sirisomboon, Panmanas; Dachoupakan Sirisomboon, Cheewanun
2017-03-01
Ochratoxin A (OTA) contamination is highly prevalent in a variety of agricultural products including the commercially important coffee bean. As such, rapid and accurate detection methods are considered necessary for the identification of OTA in green coffee beans. The goal of this research was to apply Fourier transform near infrared spectroscopy to detect and classify OTA contamination in green coffee beans in both a quantitative and qualitative manner. PLSR models were generated using pretreated spectroscopic data to predict the OTA concentration. The best model displayed a correlation coefficient (r) of 0.814, and a standard error of prediction (SEP) and bias of 1.965 µg kg⁻¹ and 0.358 µg kg⁻¹, respectively. Additionally, a PLS-DA model was also generated, displaying a classification accuracy of 96.83% for a non-OTA contaminated model and 80.95% for an OTA contaminated model, with an overall classification accuracy of 88.89%. The results demonstrate that the developed model could be used for detecting OTA contamination in green coffee beans in either a quantitative or qualitative manner. © 2016 Society of Chemical Industry.
NASA Astrophysics Data System (ADS)
Yang, Yue; Wu, Yongjiang; Li, Weili; Liu, Xuesong; Zheng, Jiyu; Zhang, Wentao; Chen, Yong
2018-02-01
Near infrared (NIR) spectroscopy coupled with chemometrics was used to discriminate the geographical origin of Herba Epimedii in this work. Four different classification models, namely discriminant analysis (DA), back propagation neural network (BPNN), K-nearest neighbor (KNN), and support vector machine (SVM), were constructed, and their performances in terms of recognition accuracy were compared. The results indicated that the SVM model was superior to the other models in the geographical origin identification of Herba Epimedii. The recognition rates of the optimum SVM model were up to 100% for the calibration set and 94.44% for the prediction set. In addition, the feasibility of NIR spectroscopy with the CARS-PLSR calibration model for predicting the icariin content of Herba Epimedii was also investigated. The determination coefficient (R²P) and root-mean-square error (RMSEP) for the prediction set were 0.9269 and 0.0480, respectively. It can be concluded that the NIR spectroscopy technique in combination with chemometrics has great potential for determining the geographical origin and icariin content of Herba Epimedii. This study can provide a valuable reference for rapid quality control of food products.
Leaf aging of Amazonian canopy trees as revealed by spectral and physiochemical measurements.
Chavana-Bryant, Cecilia; Malhi, Yadvinder; Wu, Jin; Asner, Gregory P; Anastasiou, Athanasios; Enquist, Brian J; Cosio Caravasi, Eric G; Doughty, Christopher E; Saleska, Scott R; Martin, Roberta E; Gerard, France F
2017-05-01
Leaf aging is a fundamental driver of changes in leaf traits, thereby regulating ecosystem processes and remotely sensed canopy dynamics. We explore leaf reflectance as a tool to monitor leaf age and develop a spectra-based partial least squares regression (PLSR) model to predict age using data from a phenological study of 1099 leaves from 12 lowland Amazonian canopy trees in southern Peru. Results demonstrated monotonic decreases in leaf water (LWC) and phosphorus (Pmass) contents and an increase in leaf mass per unit area (LMA) with age across trees; leaf nitrogen (Nmass) and carbon (Cmass) contents showed monotonic but tree-specific age responses. We observed large age-related variation in leaf spectra across trees. A spectra-based model was more accurate in predicting leaf age (R² = 0.86; percent root mean square error (%RMSE) = 33) compared with trait-based models using single (R² = 0.07-0.73; %RMSE = 7-38) and multiple (R² = 0.76; %RMSE = 28) predictors. Spectra- and trait-based models established a physiochemical basis for the spectral age model. Vegetation indices (VIs) including the normalized difference vegetation index (NDVI), enhanced vegetation index 2 (EVI2), normalized difference water index (NDWI) and photosynthetic reflectance index (PRI) were all age-dependent. This study highlights the importance of leaf age as a mediator of leaf traits, provides evidence of age-related leaf reflectance changes that have important impacts on VIs used to monitor canopy dynamics and productivity and proposes a new approach to predicting and monitoring leaf age with important implications for remote sensing. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.
Difficulties of biomass estimation over natural grassland
NASA Astrophysics Data System (ADS)
Kertész, Péter; Gecse, Bernadett; Pintér, Krisztina; Fóti, Szilvia; Nagy, Zoltán
2017-04-01
Estimation of biomass amount in grasslands using remote sensing is a challenge due to the high diversity and different phenologies of the constituent plant species. The aim of this study was to estimate the biomass amount (dry weight per area) during the vegetation period of a diverse semi-natural grassland with remote sensing. A multispectral camera (Tetracam Mini-MCA 6) was used with 3 cm ground resolution. The pre-processing method includes noise reduction, correction for the vignetting effect and calculation of the reflectance using an Incident Light Sensor (ILS). Calibration was made with an ASD spectrophotometer as reference. To estimate biomass, the Partial Least Squares Regression (PLSR) statistical method was used with 5 bands and NDVI as input variables. Above-ground biomass was cut in 15 quadrats (50×50 cm) as reference. The best prediction was attained in spring (r² = 0.94, RMSE: 26.37 g m⁻²). The average biomass amount was 167 g m⁻². The variability of the biomass is mainly determined by the relief, which causes the high and low biomass patches to be stable. The reliability of biomass estimation was negatively affected by the appearance of flowers and by the senescent plant parts during the summer. To determine the effects of the presence of flowers on the biomass estimation, 20 dominant species with visually dominant flowers in the area were selected and the cover of flowers (%) was estimated in permanent plots during measurement campaigns. If the cover of flowers was low (<25%), the biomass amount estimation was successful (r² > 0.9), while at higher cover of flowers (>30%), the estimation failed (r² < 0.2). This effect restricts the usage of the remote sensing method to the spring to early summer period in diverse grasslands.
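NDVI, used in this abstract as one of the PLSR input variables, is the normalized difference between near-infrared and red reflectance. A minimal sketch, with hypothetical reflectance values (the abstract does not say which camera bands were used to compute it):

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    if nir + red == 0:
        raise ValueError("NIR and red reflectance sum to zero")
    return (nir - red) / (nir + red)

# Hypothetical per-quadrat mean reflectances (NIR, red).
quadrats = [(0.50, 0.10), (0.42, 0.12), (0.35, 0.20)]
for nir, red in quadrats:
    print(round(ndvi(nir, red), 3))
```

Dense green vegetation gives values towards 1, while equal reflectance in both bands gives 0, which is consistent with flowers and senescent material weakening the relationship between the index and biomass.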
A portable device for detecting fruit quality by diffuse reflectance Vis/NIR spectroscopy
NASA Astrophysics Data System (ADS)
Sun, Hongwei; Peng, Yankun; Li, Peng; Wang, Wenxiu
2017-05-01
Soluble solid content (SSC) is a major quality parameter of fruit, which influences its flavor and texture. Some research on on-line non-invasive detection of fruit quality has been published. However, consumers currently desire portable devices. This study aimed to develop a portable device for accurate, real-time and nondestructive determination of quality factors of fruit based on diffuse reflectance Vis/NIR spectroscopy (520-950 nm). The hardware of the device consisted of four units: a light source unit, a spectral acquisition unit, a central processing unit and a display unit. A halogen lamp was chosen as the light source. When working, the hand-held probe was in contact with the surface of the fruit sample, thus forming a dark environment that shielded it from outside light. Diffusely reflected light was collected and measured by a spectrometer (USB4000). An ARM (Advanced RISC Machines) processor, as the central processing unit, controlled all parts of the device and analyzed the spectral data. A Liquid Crystal Display (LCD) touch screen was used to interface with users. To validate its reliability and stability, 63 apples were tested in the experiment, 47 of which were chosen as the calibration set, and the others as the prediction set. Their SSC reference values were measured by a refractometer. The samples' spectral data acquired by the portable device were processed by standard normal variate (SNV) transformation and a Savitzky-Golay (S-G) filter to eliminate spectral noise. Then partial least squares regression (PLSR) was applied to build prediction models, and the best prediction results were achieved with a correlation coefficient (r) of 0.855 and a standard error of 0.6033 °Brix. The results demonstrated that this device was feasible for quantitative analysis of the soluble solid content of apples.
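The SNV step named in this abstract centers and scales each spectrum by its own mean and standard deviation. The sketch below assumes the usual form with the sample standard deviation; the function name and the reflectance values are hypothetical.

```python
import statistics

def snv(spectrum):
    """Standard normal variate: center and scale one spectrum by its own statistics."""
    mean = statistics.mean(spectrum)
    sd = statistics.stdev(spectrum)  # sample standard deviation (n - 1)
    return [(value - mean) / sd for value in spectrum]

# Hypothetical diffuse reflectance values for one apple.
raw = [0.31, 0.35, 0.42, 0.47, 0.44, 0.39]
corrected = snv(raw)
print(corrected)
```

Each spectrum is treated independently, so the correction removes sample-to-sample offset and scaling differences (for example from probe contact or fruit size) without using any information from the other samples.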
Estimating soil zinc concentrations using reflectance spectroscopy
NASA Astrophysics Data System (ADS)
Sun, Weichao; Zhang, Xia
2017-06-01
Soil contamination by heavy metals has been an increasingly severe threat to the natural environment and human health. Efficient investigation of contamination status is essential to soil protection and remediation. Visible and near-infrared reflectance spectroscopy (VNIRS) has been regarded as an alternative for monitoring soil contamination by heavy metals. Generally, the entire set of VNIR spectral bands is employed to estimate heavy metal concentration, which lacks interpretability and requires much calculation. In this study, 74 soil samples were collected from Hunan Province, China, and their reflectance spectra were used to estimate the zinc (Zn) concentration in soil. Organic matter and clay minerals strongly adsorb Zn in soil. Spectral bands associated with organic matter and clay minerals were used for estimation with genetic algorithm based partial least squares regression (GA-PLSR). The entire VNIR spectral bands, the bands associated with organic matter and the bands associated with clay minerals were included as comparisons. The root mean square error of prediction, residual prediction deviation, and coefficient of determination (R²) for the model developed using the combined bands of organic matter and clay minerals were 329.65 mg kg⁻¹, 1.96 and 0.73, which is better than 341.88 mg kg⁻¹, 1.89 and 0.71 for the entire VNIR spectral bands, 492.65 mg kg⁻¹, 1.31 and 0.40 for the organic matter bands, and 430.26 mg kg⁻¹, 1.50 and 0.54 for the clay mineral bands. Additionally, in consideration of atmospheric water vapor absorption in field spectral measurements, the combined bands of organic matter and the absorption around 2200 nm were used for estimation and achieved high prediction accuracy, with R² reaching 0.640. The results indicate the great potential of soil reflectance spectroscopy for estimating Zn concentrations in soil.
Zhu, JianCai; Chen, Feng; Wang, LingYing; Niu, YunWei; Chen, HeXing; Wang, HongLin; Xiao, ZuoBing
2016-06-22
The volatile compounds of cranberries obtained from four cultivars (Early Black, Y1; Howes, Y2; Searles, Y3; and McFarlin, Y4) were analyzed by gas chromatography-olfactometry (GC-O), gas chromatography-mass spectrometry (GC-MS), and GC-flame photometric detection (FPD). The results showed that a total of thirty-three, thirty-four, thirty-four, and thirty-six odor-active compounds were identified by GC-O in Y1, Y2, Y3, and Y4, respectively. In addition, twenty-two, twenty-two, thirty, and twenty-seven quantified compounds were shown to be important odorants according to their odor activity values (OAVs > 1). Among these compounds, hexanal (OAV: 27-60), pentanal (OAV: 31-51), (E)-2-heptenal (OAV: 17-66), (E)-2-hexenal (OAV: 18-63), (E)-2-octenal (OAV: 10-28), (E)-2-nonenal (OAV: 8-77), ethyl 2-methylbutyrate (OAV: 10-33), β-ionone (OAV: 8-73), 2-methylbutyric acid (OAV: 18-37), and octanal (OAV: 4-24) contributed greatly to the aroma of cranberry. Partial least-squares regression (PLSR) was used to process the mean data accumulated from sensory evaluation by the panelists, the odor-active aroma compounds (OAVs > 1), and the samples. Sample Y3 was highly correlated with the sensory descriptors "floral" and "fruity". Sample Y4 was strongly related to the sensory descriptors "mellow" and "green and grass". Finally, an aroma reconstitution (Model A) was prepared by mixing the odor-active aroma compounds (OAVs > 1) based on their measured concentrations in the Y1 sample; the aroma profile of the reconstitution was very similar to that of the original sample.
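The odor activity value used above to select important odorants is the ratio of a compound's concentration to its odor threshold, with OAV > 1 taken as the cut-off. In the sketch below the compound names come from the abstract, but every concentration and threshold is a made-up placeholder in arbitrary consistent units, not a value from the study.

```python
def odor_activity_value(concentration, odor_threshold):
    """OAV = concentration / odor threshold (same units for both)."""
    return concentration / odor_threshold

def important_odorants(compounds, cutoff=1.0):
    """Return {name: OAV} for compounds whose OAV exceeds the cutoff."""
    result = {}
    for name, (concentration, threshold) in compounds.items():
        oav = odor_activity_value(concentration, threshold)
        if oav > cutoff:
            result[name] = oav
    return result

# Placeholder (concentration, threshold) pairs; NOT measured values.
sample = {
    "hexanal": (120.0, 4.0),
    "octanal": (6.0, 1.0),
    "hypothetical trace compound": (0.5, 2.0),
}
print(important_odorants(sample))
```

Only the compounds that pass this filter would then enter the PLSR against the sensory descriptors, as the abstract describes.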
NASA Astrophysics Data System (ADS)
Camargo, Livia; Marques, José, Jr.
2014-05-01
Traditional techniques for measuring adsorbed phosphorus (Pads) and other soil attributes of agronomic importance are relatively impractical when the aim is to map large areas by characterizing the spatial variability of soil attributes. Such mapping requires a large number of samples, which makes it expensive at detailed mapping scales. This creates a need in the scientific community for methodologies that can assess these attributes across the landscape quickly, nondestructively, and inexpensively. Diffuse reflectance spectroscopy (DRS) has been used to aid the characterization of soil attributes in view of these requirements. In this sense, the objective of this study was to evaluate the ability of DRS to estimate Pads, clay, Fe extracted by dithionite-citrate-bicarbonate (Fedcb), the contents of goethite (Gt) and hematite (Hm), and the ratio Gt/(Gt + Hm) in Oxisols in the northeastern region of São Paulo State. Soil samples were collected along transects every 25 m (100 samples). Geomorphic surfaces (GSs) were mapped in detail to support soil mapping. The soil in GS I was a Typic Hapludox, that in GS II a Typic Hapludox and Typic Eutrudox, and that in GS III a Typic Eutrudox. The soil samples were taken to the laboratory for chemical, physical and mineralogical analysis, and DRS spectra were obtained over 380-2300 nm. Chemometric calibration and validation (using a leave-one-out cross-validation procedure) were done on absorbance measurements [Log10 (1/Reflectance)] by partial least-squares regression (PLSR) analysis. The calibration accuracy was evaluated via the coefficient of determination (R2), RMSE and the ratio of performance to deviation (RPD). A Variable Importance in the Projection (VIP) plot was built for Pads.
DRS was effective in predicting the attributes studied. The models that predicted clay, Fedcb and Gt with greater accuracy (RPD > 1.4) were calibrated in the visible (380-800 nm), whereas the models for Pads, the ratio Gt/(Gt + Hm) and Hm were calibrated in the visible + near infrared (801-2300 nm). The highest VIP peaks for Pads were found at wavelengths of 480-580 nm and 780-980 nm, which are assigned to crystalline iron oxides, mainly Gt and Hm. This result demonstrates the influence of these oxides on P adsorption. In weathered soils, P adsorption is mainly correlated with the iron and aluminum oxides of the clay fraction because phosphate interacts with the functional groups of these oxides.
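The workflow in this abstract (absorbance transform, PLSR, VIP scores) can be illustrated with a small Python sketch. It is not the authors' software or settings: it assumes a single response variable, the classic NIPALS PLS1 algorithm, and the common VIP definition based on the y-variance explained by each component. The function names `absorbance` and `pls1_vip` are hypothetical.

```python
import numpy as np

def absorbance(reflectance):
    """Apparent absorbance, Log10(1/Reflectance), as used in the abstract."""
    return np.log10(1.0 / np.asarray(reflectance, dtype=float))

def pls1_vip(X, y, n_components):
    """Fit PLS1 by NIPALS; return (coefficients for centered X, VIP scores)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xk = X - X.mean(axis=0)          # centered predictors (deflated in the loop)
    yk = y - y.mean()                # centered response (deflated in the loop)
    n, p = Xk.shape
    W = np.zeros((p, n_components))  # unit-norm weight vectors
    P = np.zeros((p, n_components))  # X loadings
    q = np.zeros(n_components)       # y loadings
    ss = np.zeros(n_components)      # y sum of squares explained per component
    for a in range(n_components):
        w = Xk.T @ yk
        w /= np.linalg.norm(w)
        t = Xk @ w                   # score vector for this component
        tt = t @ t
        P[:, a] = Xk.T @ t / tt
        q[a] = (yk @ t) / tt
        W[:, a] = w
        ss[a] = q[a] ** 2 * tt
        Xk = Xk - np.outer(t, P[:, a])
        yk = yk - q[a] * t
    coef = W @ np.linalg.solve(P.T @ W, q)
    # VIP_j = sqrt(p * sum_a(ss_a * w_ja^2) / sum_a(ss_a)); squared VIPs sum to p.
    vip = np.sqrt(p * ((W ** 2) @ ss) / ss.sum())
    return coef, vip
```

Bands with VIP above about 1 are conventionally read as influential, which is how peaks such as those reported for 480-580 nm and 780-980 nm would be identified in a VIP plot.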
Regional prediction of soil organic carbon content over croplands using airborne hyperspectral data
NASA Astrophysics Data System (ADS)
Vaudour, Emmanuelle; Gilliot, Jean-Marc; Bel, Liliane; Lefebvre, Josias; Chehdi, Kacem
2015-04-01
This study was carried out in the framework of the Prostock-Gessol3 and the BASC-SOCSENSIT projects, dedicated to the spatial monitoring of the effects of exogenous organic matter land application on soil organic carbon storage. It aims at identifying the potential of airborne hyperspectral AISA-Eagle data for predicting the topsoil organic carbon (SOC) content of bare cultivated soils over a large peri-urban area (221 km2) with contrasting soils and SOC contents, located in the western region of Paris, France. Soils comprise hortic or glossic luvisols, calcaric, rendzic cambisols and colluvic cambisols. Airborne AISA-Eagle data (400-1000 nm, 126 bands) with 1 m resolution were acquired on 17 April 2013 over 13 tracks, which were georeferenced. Tracks were atmospherically corrected using a set of 22 synchronous field spectra of bare soils, black and white targets, and impervious surfaces. Atmospherically corrected track tiles were mosaicked at a 2 m resolution, resulting in a 66 Gb image. A SPOT4 satellite image was acquired the same day in the framework of the SPOT4-Take Five program of the French Space Agency (CNES), which provided it with atmospheric correction. The land use identification system layer (RPG) of 2012 was used to mask non-agricultural areas; NDVI calculation and thresholding then made it possible to map agricultural fields with bare soil. All 18 sampled sites known to be bare on that date were correctly included in this map. A total of 85 sites sampled in 2013 or in the 3 previous years were identified as bare by means of this map. Predictions were made from the mosaic spectra, which were related to topsoil SOC contents by means of partial least squares regression (PLSR). Regression robustness was evaluated through a series of 1000 bootstrap data sets of calibration-validation samples. The use of the total sample, including 27 sites under cloud shadows, led to non-significant results.
Considering only the 43 sites outside cloud shadows, median validation root-mean-square errors (RMSE) were ~4-4.5 g kg-1. An additional set of 15 samples with bare soils led to similar RMSE values. Such results are only slightly better than those resulting from an earlier study with multispectral satellite images (Vaudour et al., 2013). The influence of soil surface condition and particularly soil roughness is discussed.
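The bootstrap evaluation mentioned in this abstract can be sketched in Python as follows. The abstract does not say how the calibration-validation sets were drawn, so this sketch assumes that calibration samples are drawn with replacement and that the samples left out of each draw serve as validation. Ordinary least squares stands in for PLSR to keep the example self-contained, and the function name is hypothetical.

```python
import numpy as np

def bootstrap_median_rmse(X, y, n_boot=1000, seed=0):
    """Median out-of-bag validation RMSE over bootstrap calibration sets."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    A = np.column_stack([np.ones(n), X])   # design matrix with intercept
    rmses = []
    for _ in range(n_boot):
        cal = rng.integers(0, n, size=n)            # calibration: drawn with replacement
        val = np.setdiff1d(np.arange(n), cal)       # validation: samples never drawn
        if val.size == 0:
            continue
        # Stand-in model (assumption): least squares instead of PLSR.
        coef, *_ = np.linalg.lstsq(A[cal], y[cal], rcond=None)
        resid = y[val] - A[val] @ coef
        rmses.append(np.sqrt(np.mean(resid ** 2)))
    return float(np.median(rmses))
```

Reporting the median across resamples, as the abstract does, makes the error estimate less sensitive to a few unlucky calibration-validation splits than a single split would be.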
Principal component regression analysis with SPSS.
Liu, R X; Kuang, J; Gong, Q; Hou, X L
2003-06-01
The paper introduces the indices used for diagnosing multicollinearity, the basic principle of principal component regression, and a method for determining the 'best' equation. It uses an example to describe how to perform principal component regression analysis with SPSS 10.0, including all calculation steps of the principal component regression and all operations of the linear regression, factor analysis, descriptives, compute variable and bivariate correlations procedures in SPSS 10.0. Principal component regression analysis can be used to overcome the disturbance caused by multicollinearity. Principal component regression analysis with SPSS yields a simpler, faster and accurate statistical analysis.
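The paper itself works in SPSS 10.0. As an illustration only, the same principal component regression idea can be sketched in Python with NumPy: standardize the predictors, keep the leading principal components, regress the response on their scores, and map the coefficients back to the original variables. The function name and the choice of SVD for the decomposition are assumptions, not the paper's procedure.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Principal component regression; returns (intercept, coefficients on raw X)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean = X.mean(axis=0)
    x_std = X.std(axis=0, ddof=1)
    Z = (X - x_mean) / x_std                 # standardized predictors
    # Rows of Vt are the principal axes of Z, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    V = Vt[:n_components].T
    scores = Z @ V                           # uncorrelated component scores
    y_mean = y.mean()
    # Regress the centered response on the retained scores.
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    beta = (V @ gamma) / x_std               # back-transform to raw-variable scale
    intercept = y_mean - x_mean @ beta
    return intercept, beta
```

Dropping the low-variance components is what counters multicollinearity: those directions are where ordinary least squares coefficients become unstable. With all components retained, the result coincides with ordinary least squares.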
Regression Analysis by Example. 5th Edition
ERIC Educational Resources Information Center
Chatterjee, Samprit; Hadi, Ali S.
2012-01-01
Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. "Regression Analysis by Example, Fifth Edition" has been expanded and thoroughly…
Zhang, Jingcheng; Pu, Ruiliang; Yuan, Lin; Wang, Jihua; Huang, Wenjiang; Yang, Guijun
2014-01-01
Powdery mildew is one of the most serious diseases affecting the production of winter wheat. As an effective alternative to traditional sampling methods, remote sensing can be a useful tool in disease detection. This study attempted to use multi-temporal, moderate-resolution satellite data of surface reflectance in the blue (B), green (G), red (R) and near infrared (NIR) bands from HJ-CCD (the CCD sensor on the Huanjing satellite) to monitor the disease at a regional scale. In a suburban area in Beijing, China, an extensive field campaign to survey disease intensity was conducted at key growth stages of winter wheat in 2010. Meanwhile, the corresponding time series of HJ-CCD images was acquired over the study area. In this study, a number of single-stage and multi-stage spectral features that were sensitive to powdery mildew were selected by using an independent t-test. With the selected spectral features, four advanced methods, namely Mahalanobis distance, maximum likelihood classifier, partial least squares regression (PLSR) and mixture tuned matched filtering (MTMF), were tested and evaluated for their performance in disease mapping. The experimental results showed that all four algorithms could generate disease maps with a generally correct distribution pattern of powdery mildew at the grain filling stage (Zadoks 72). However, comparison of these disease maps with ground survey data (validation samples) showed that all four algorithms also produced a variable degree of error in estimating disease occurrence and severity. Further, we found that integrating the MTMF and PLSR algorithms could significantly improve the accuracy of identifying and determining disease intensity (overall accuracy increased from 72% to 78% and the kappa coefficient from 0.49 to 0.59). The experimental results also demonstrated that multi-temporal satellite images have great potential for crop disease mapping at a regional scale. PMID:24691435
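The two map-accuracy figures quoted in this abstract, overall accuracy and the kappa coefficient, are both derived from a confusion matrix of mapped versus surveyed classes. A minimal Python sketch follows; the function name is hypothetical, and the standard Cohen's kappa definition is assumed because the abstract does not spell out the formula.

```python
import numpy as np

def accuracy_and_kappa(confusion):
    """Overall accuracy and Cohen's kappa from a square confusion matrix."""
    cm = np.asarray(confusion, dtype=float)
    n = cm.sum()
    # Observed agreement: share of samples on the diagonal.
    p_observed = np.trace(cm) / n
    # Chance agreement: product of row and column marginals, summed over classes.
    p_expected = np.sum(cm.sum(axis=0) * cm.sum(axis=1)) / n ** 2
    kappa = (p_observed - p_expected) / (1.0 - p_expected)
    return p_observed, kappa
```

Kappa discounts the agreement expected by chance, so it is always lower than overall accuracy for an imperfect map, consistent with the paired values reported above (72% with 0.49, 78% with 0.59).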
Yang, Ling Yu; Gao, Xiao Hong; Zhang, Wei; Shi, Fei Fei; He, Lin Hua; Jia, Wei
2016-06-01
In this study, we explored the feasibility of estimating soil heavy metal concentrations using hyperspectral satellite images. The concentrations of As, Pb, Zn and Cd in 48 topsoil samples collected in the field in Yushu County of the Sanjiangyuan region were measured in the laboratory. We then extracted 176 vegetation spectral reflectance bands for the 48 soil samples, as well as five vegetation indices, from two Hyperion images. Following that, the partial least squares regression (PLSR) method was employed to estimate the soil heavy metal concentrations using these two independent sets of Hyperion-derived variables separately: one estimation model was constructed between the 176 vegetation spectral reflectance bands and the soil heavy metal concentrations (called the vegetation spectral reflectance-based estimation model), and another between the five vegetation indices, used as independent variables, and the soil heavy metal concentrations (called the synthetic vegetation index-based estimation model). Using RPD (the ratio of the standard deviation of the measured values of each of the 4 heavy metals in the validation samples to the RMSE) as the validation criterion, the RPDs of As and Pb concentrations from the two models were both less than 1.4, which suggested that neither model was capable of even roughly estimating As and Pb concentrations; whereas the RPDs of Zn and Cd were 1.53, 1.46 and 1.46, 1.42, respectively, which implied that both models were able to roughly estimate Zn and Cd concentrations. Based on those results, the vegetation spectral reflectance-based estimation model was selected to obtain the spatial distribution map of Zn concentration in combination with the Hyperion image. The estimated Zn map showed that the zones with high Zn concentrations were distributed near provincial road 308, national road 214 and towns, which could reflect the influence of human activities.
Our study showed that the spectral reflectance of the Hyperion images was useful in estimating the soil concentrations of Zn and Cd.
Gashaw, Temesgen; Tulu, Taffa; Argaw, Mekuria; Worqlul, Abeyou W
2018-04-01
Understanding the hydrological response of a watershed to land use/land cover (LULC) changes is imperative for water resources management planning. The objective of this study was to analyze the hydrological impacts of LULC changes in the Andassa watershed for the period 1985-2015 and to predict the impact of LULC change on the hydrological status in the year 2045. The analyses employed a hybrid land use classification technique for classifying Landsat images (1985, 2000 and 2015), Cellular-Automata Markov (CA-Markov) for prediction of the 2030 and 2045 LULC states, and the Soil and Water Assessment Tool (SWAT) for hydrological modeling. In order to isolate the impacts of LULC changes, the LULC maps were used independently while keeping the other SWAT inputs constant. The contribution of each of the LULC classes was examined with the Partial Least Squares Regression (PLSR) model. The results showed a continuous expansion of cultivated land and built-up area, and a decline of forest, shrubland and grassland during the 1985-2015 period, trends that are expected to continue in the 2030 and 2045 periods. The LULC changes that occurred from 1985 to 2015 increased the annual flow (2.2%), wet season flow (4.6%), surface runoff (9.3%) and water yield (2.4%). Conversely, the observed changes reduced dry season flow (2.8%), lateral flow (5.7%), groundwater flow (7.8%) and ET (0.3%). The 2030 and 2045 LULC states are expected to further increase the annual and wet season flow, surface runoff and water yield, and reduce dry season flow, groundwater flow, lateral flow and ET. The change in hydrological components is a direct result of the significant transition from vegetation to non-vegetation cover in the watershed. This suggests an urgent need to regulate LULC in order to maintain the hydrological balance. Copyright © 2017 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Gabrieli, A.; Wright, R.; Lucey, P. G.; Porter, J. N.
2017-12-01
Detecting and quantifying volcanic carbon dioxide (CO2) and sulfur dioxide (SO2) emissions is of relevance to volcanologists. Changes in the amount and composition of gases that volcanoes emit are related to subsurface magma movements and the probability of eruptions. Volcanic gases and related acidic aerosols are also an important source of atmospheric pollution that creates environmental health hazards for people, animals, plants, and infrastructure. For these reasons, it is important to measure emissions from volcanic plumes during both day and night. We present image measurements of the volcanic plume at Kīlauea volcano, HI, and flux derivation, using a newly developed 8-14 μm hyperspectral imaging spectrometer, the Thermal Hyperspectral Imager (THI). THI acquires images of the scene it views, from which a spectrum can be derived for each pixel. Each spectrum contains 50 wavelength samples between 8 and 14 μm, where CO2 and SO2 volcanic gases have diagnostic absorption/emission features respectively at 8.6 and 14 μm. Plume radiance measurements were carried out during both day and night, using the lava lake in the Halema'uma'u crater as a hot source and the sky as a cold background to detect respectively the spectral signatures of volcanic CO2 and SO2 gases. CO2 and SO2 path-concentrations were then obtained from the spectral radiance measurements using a new Partial Least Squares Regression (PLSR)-based inversion algorithm, which was developed as part of this project. Volcanic emission fluxes were determined by combining the path measurements with wind observations derived directly from the images. Time series of volcanic emission fluxes several hours long will be presented, and the rates of SO2 conversion into aerosols will be discussed. The new imaging and inversion techniques discussed here allow continuous CO2 and SO2 plume mapping during both day and night.
Serbin, Shawn P.; Singh, Aditya; Desai, Ankur R.; ...
2015-06-11
To date, the utility of ecosystem and Earth system models (EESMs) has been limited by poor spatial and temporal representation of critical input parameters. For example, EESMs often rely on leaf-scale or literature-derived estimates for a key determinant of canopy photosynthesis, the maximum velocity of RuBP carboxylation (Vcmax, μmol m-2 s-1). Our recent work (Ainsworth et al., 2014; Serbin et al., 2012) showed that reflectance spectroscopy could be used to estimate Vcmax at the leaf level. Here, we present evidence that imaging spectroscopy data can be used to simultaneously predict Vcmax and its sensitivity to temperature (EV) at the canopy scale. In 2013 and 2014, high-altitude Airborne Visible/Infrared Imaging Spectroscopy (AVIRIS) imagery and contemporaneous ground-based assessments of canopy structure and leaf photosynthesis were acquired across an array of monospecific agroecosystems in central and southern California, USA. A partial least-squares regression (PLSR) modeling approach was employed to characterize the pixel-level variation in canopy Vcmax (at a standardized canopy temperature of 30 °C) and EV, based on visible and shortwave infrared AVIRIS spectra (414–2447 nm). Our approach yielded parsimonious models with strong predictive capability for Vcmax (at 30 °C) and EV (R2 of withheld data = 0.94 and 0.92, respectively), both of which varied substantially in the field (≥ 1.7 fold) across the sampled crop types. The models were applied to additional AVIRIS imagery to generate maps of Vcmax and EV, as well as their uncertainties, for agricultural landscapes in California. The spatial patterns exhibited in the maps were consistent with our in-situ observations. As a result, these findings highlight the considerable promise of airborne and, by implication, space-borne imaging spectroscopy, such as the proposed HyspIRI mission, to map spatial and temporal variation in key drivers of photosynthetic metabolism in terrestrial vegetation.
A Cross-Lingual Similarity Measure for Detecting Biomedical Term Translations
Bollegala, Danushka; Kontonatsios, Georgios; Ananiadou, Sophia
2015-01-01
Bilingual dictionaries for technical terms such as biomedical terms are an important resource for machine translation systems as well as for humans who would like to understand a concept described in a foreign language. Often a biomedical term is first proposed in English and later it is manually translated to other languages. Despite the fact that there are large monolingual lexicons of biomedical terms, only a fraction of those term lexicons are translated to other languages. Manually compiling large-scale bilingual dictionaries for technical domains is a challenging task because it is difficult to find a sufficiently large number of bilingual experts. We propose a cross-lingual similarity measure for detecting most similar translation candidates for a biomedical term specified in one language (source) from another language (target). Specifically, a biomedical term in a language is represented using two types of features: (a) intrinsic features that consist of character n-grams extracted from the term under consideration, and (b) extrinsic features that consist of unigrams and bigrams extracted from the contextual windows surrounding the term under consideration. We propose a cross-lingual similarity measure using each of those feature types. First, to reduce the dimensionality of the feature space in each language, we propose prototype vector projection (PVP)—a non-negative lower-dimensional vector projection method. Second, we propose a method to learn a mapping between the feature spaces in the source and target language using partial least squares regression (PLSR). The proposed method requires only a small number of training instances to learn a cross-lingual similarity measure. The proposed PVP method outperforms popular dimensionality reduction methods such as the singular value decomposition (SVD) and non-negative matrix factorization (NMF) in a nearest neighbor prediction task. 
Moreover, our experimental results covering several language pairs such as English–French, English–Spanish, English–Greek, and English–Japanese show that the proposed method outperforms several other feature projection methods in biomedical term translation prediction tasks. PMID:26030738
Spectroscopic determination of leaf traits using infrared spectra
NASA Astrophysics Data System (ADS)
Buitrago, Maria F.; Groen, Thomas A.; Hecker, Christoph A.; Skidmore, Andrew K.
2018-07-01
Leaf traits characterise and differentiate single species but can also be used for monitoring vegetation structure and function. Conventional methods to measure leaf traits, especially at the molecular level (e.g. water, lignin and cellulose content), are expensive and time-consuming. Spectroscopic methods to estimate leaf traits can provide an alternative approach. In this study, we investigated high spectral resolution (6612 bands) emissivity measurements from the short to the long wave infrared (1.4-16.0 μm) of leaves from 19 different plant species ranging from herbaceous to woody, and from temperate to tropical types. At the same time, we measured 14 leaf traits to characterise a leaf, including chemical (e.g., leaf water content, nitrogen, cellulose) and physical features (e.g., leaf area and leaf thickness). We fitted partial least squares regression (PLSR) models across the SWIR, MWIR and LWIR for each leaf trait. Then, reduced models (PLSRred) were derived by iteratively reducing the number of bands in the model (using a modified Jackknife resampling method with a Martens and Martens uncertainty test) down to a few bands (4-10 bands) that contribute the most to the variation of the trait. Most leaf traits could be determined from infrared data with a moderate accuracy (65 < Rcv2 < 77% for observed versus predicted plots) based on PLSRred models, while the accuracy using the whole infrared range (6612 bands) presented higher accuracies, 74 < Rcv2 < 90%. Using the full SWIR range (1.4-2.5 μm) shows similarly high accuracies compared to the whole infrared. Leaf thickness, leaf water content, cellulose, lignin and stomata density are the traits that could be estimated most accurately from infrared data (with Rcv2 above 0.80 for the full range models). Leaf thickness, cellulose and lignin were predicted with reasonable accuracy from a combination of single infrared bands. 
Nevertheless, for all leaf traits, a combination of a few bands yields moderate to accurate estimations.
Deng, Qingqiong; Zhou, Mingquan; Wu, Zhongke; Shui, Wuyang; Ji, Yuan; Wang, Xingce; Liu, Ching Yiu Jessica; Huang, Youliang; Jiang, Haiyan
2016-02-01
Craniofacial reconstruction recreates a facial appearance from the cranium, based on the relationship between the face and the skull, to assist identification. However, craniofacial structures are very complex, and this relationship is not the same in different craniofacial regions. Several regional methods have recently been proposed. These methods segment the face and skull into regions and learn the relationship for each region independently; facial regions for a given skull are then estimated and finally glued together to generate a face. Most of these regional methods use vertex coordinates to represent the regions, and they define a uniform coordinate system for all of the regions. Consequently, the inconsistency in the positions of regions between different individuals is not eliminated before learning the relationships between the face and skull regions, and this reduces the accuracy of the craniofacial reconstruction. In order to solve this problem, an improved regional method is proposed in this paper involving two types of coordinate adjustments. One is the global coordinate adjustment, performed on the skulls and faces to eliminate the inconsistency in position and pose of the heads; the other is the local coordinate adjustment, performed on the skull and face regions to eliminate the inconsistency in position of these regions. After these two coordinate adjustments, partial least squares regression (PLSR) is used to estimate the relationship between the face region and the skull region. In order to obtain a more accurate reconstruction, a new fusion strategy is also proposed in the paper to maintain the reconstructed feature regions when gluing the facial regions together. This is based on the observation that the feature regions usually have smaller reconstruction errors than the rest of the face.
The results demonstrate that the coordinate adjustments and the new fusion strategy can significantly improve the craniofacial reconstructions. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Visible luminescence of Dy3+ doped PbF2-Li2O-SrO-ZnO-B2O3 glasses for yellow light applications
NASA Astrophysics Data System (ADS)
Anjaiah, G.; Sasikala, T.; Kistaiah, P.
2018-05-01
The present studies on PLSrZFB glasses doped with various concentrations of Dy3+ ions were carried out through optical absorption, photoluminescence and decay time measurements. The Judd-Ofelt (JO) intensity parameters Ωλ (λ = 2, 4, 6) can be utilized to evaluate the emission properties. The decay curves for the 4F9/2 level were measured; they become non-exponential at higher concentrations (> 0.1 mol%) owing to dipole-dipole energy transfer between Dy3+ ions through cross-relaxation channels. The CIE chromaticity color coordinates were calculated, and they were all located in the vicinity of the white region of the chromaticity diagram. The Inokuti-Hirayama model was used to analyze the energy transfer process, and the energy transfer parameters were calculated and discussed.
Taljaard, Monica; McKenzie, Joanne E; Ramsay, Craig R; Grimshaw, Jeremy M
2014-06-19
An interrupted time series design is a powerful quasi-experimental approach for evaluating effects of interventions introduced at a specific point in time. To utilize the strength of this design, a modification to standard regression analysis, such as segmented regression, is required. In segmented regression analysis, the change in intercept and/or slope from pre- to post-intervention is estimated and used to test causal hypotheses about the intervention. We illustrate segmented regression using data from a previously published study that evaluated the effectiveness of a collaborative intervention to improve quality in pre-hospital ambulance care for acute myocardial infarction (AMI) and stroke. In the original analysis, a standard regression model was used with time as a continuous variable. We contrast the results from this standard regression analysis with those from segmented regression analysis. We discuss the limitations of the former and advantages of the latter, as well as the challenges of using segmented regression in analysing complex quality improvement interventions. Based on the estimated change in intercept and slope from pre- to post-intervention using segmented regression, we found insufficient evidence of a statistically significant effect on quality of care for stroke, although potential clinically important effects for AMI cannot be ruled out. Segmented regression analysis is the recommended approach for analysing data from an interrupted time series study. Several modifications to the basic segmented regression analysis approach are available to deal with challenges arising in the evaluation of complex quality improvement interventions.
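The segmented regression described in this abstract estimates a change in intercept (level) and in slope at the intervention point. A minimal Python sketch of one common parameterization is given below; the variable names and the function are hypothetical, and the sketch ignores autocorrelation and the other complications that the abstract says may require modifications to the basic approach.

```python
import numpy as np

def segmented_regression(t, y, t0):
    """Fit y = b0 + b1*t + b2*post + b3*post*(t - t0) by least squares.

    post is 1 for observations at or after the intervention time t0, else 0.
    """
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    post = (t >= t0).astype(float)
    X = np.column_stack([
        np.ones_like(t),     # b0: pre-intervention intercept
        t,                   # b1: pre-intervention slope
        post,                # b2: change in level at t0
        post * (t - t0),     # b3: change in slope after t0
    ])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    names = ["intercept", "pre_slope", "level_change", "slope_change"]
    return dict(zip(names, coef))
```

A standard regression with time as a single continuous variable, the approach the abstract contrasts against, corresponds to fitting only the first two columns, which cannot separate the pre-existing trend from the effect of the intervention.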
Standardized Regression Coefficients as Indices of Effect Sizes in Meta-Analysis
ERIC Educational Resources Information Center
Kim, Rae Seon
2011-01-01
When conducting a meta-analysis, it is common to find many collected studies that report regression analyses, because multiple regression analysis is widely used in many fields. Meta-analysis uses effect sizes drawn from individual studies as a means of synthesizing a collection of results. However, indices of effect size from regression analyses…
Using Dominance Analysis to Determine Predictor Importance in Logistic Regression
ERIC Educational Resources Information Center
Azen, Razia; Traxel, Nicole
2009-01-01
This article proposes an extension of dominance analysis that allows researchers to determine the relative importance of predictors in logistic regression models. Criteria for choosing logistic regression R[superscript 2] analogues were determined and measures were selected that can be used to perform dominance analysis in logistic regression. A…
Regression: The Apple Does Not Fall Far From the Tree.
Vetter, Thomas R; Schober, Patrick
2018-05-15
Researchers and clinicians are frequently interested in either: (1) assessing whether there is a relationship or association between 2 or more variables and quantifying this association; or (2) determining whether 1 or more variables can predict another variable. The strength of such an association is mainly described by the correlation. However, regression analysis and regression models can be used not only to identify whether there is a significant relationship or association between variables but also to generate estimations of such a predictive relationship between variables. This basic statistical tutorial discusses the fundamental concepts and techniques related to the most common types of regression analysis and modeling, including simple linear regression, multiple regression, logistic regression, ordinal regression, and Poisson regression, as well as the common yet often underrecognized phenomenon of regression toward the mean. The various types of regression analysis are powerful statistical techniques, which when appropriately applied, can allow for the valid interpretation of complex, multifactorial data. Regression analysis and models can assess whether there is a relationship or association between 2 or more observed variables and estimate the strength of this association, as well as determine whether 1 or more variables can predict another variable. Regression is thus being applied more commonly in anesthesia, perioperative, critical care, and pain research. However, it is crucial to note that regression can identify plausible risk factors; it does not prove causation (a definitive cause and effect relationship). The results of a regression analysis instead identify independent (predictor) variable(s) associated with the dependent (outcome) variable. As with other statistical methods, applying regression requires that certain assumptions be met, which can be tested with specific diagnostics.
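The link between correlation, simple linear regression, and regression toward the mean that this tutorial abstract mentions can be made concrete with a short Python sketch. The function name is hypothetical. It uses the textbook identity that the least-squares slope equals the correlation coefficient times the ratio of standard deviations.

```python
import numpy as np

def simple_linear_regression(x, y):
    """Return (intercept, slope, r) for the least-squares line of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.corrcoef(x, y)[0, 1]                    # Pearson correlation
    slope = r * y.std(ddof=1) / x.std(ddof=1)      # slope = r * s_y / s_x
    intercept = y.mean() - slope * x.mean()        # line passes through the means
    return intercept, slope, r
```

In standardized units the fitted line is z_y = r * z_x. Because |r| is at most 1, a predictor value k standard deviations from its mean yields a prediction only r*k standard deviations from the outcome mean, which is regression toward the mean.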
Applied Multiple Linear Regression: A General Research Strategy
ERIC Educational Resources Information Center
Smith, Brandon B.
1969-01-01
Illustrates some of the basic concepts and procedures for using regression analysis in experimental design, analysis of variance, analysis of covariance, and curvilinear regression. Applications to evaluation of instruction and vocational education programs are illustrated. (GR)
NASA Technical Reports Server (NTRS)
Parsons, Vickie S.
2009-01-01
The request to conduct an independent review of regression models, developed for determining the expected Launch Commit Criteria (LCC) External Tank (ET)-04 cycle count for the Space Shuttle ET tanking process, was submitted to the NASA Engineering and Safety Center (NESC) on September 20, 2005. The NESC team performed an independent review of the regression models documented in Prepress Regression Analysis, Tom Clark and Angela Krenn, 10/27/05. This consultation consisted of a peer review by statistical experts of the proposed regression models provided in the Prepress Regression Analysis. This document is the consultation's final report.
Ju, Se-Young; Park, Jong-Hwan; Kwak, Tong-Kyoung; Kim, Kyu-Earn
2015-10-01
The objective of this study was to investigate food allergens and prevalence rates of food allergies, followed by comparison of consumer attitudes and preferences regarding food allergy labeling by diagnosis of food allergies. A total of 543 individuals living in Seoul and Gyeonggi area participated in the survey from October 15 to 22 in 2013. The results show that the prevalence of doctor-diagnosed food allergies was 17.5%, whereas 6.4% of respondents self-reported food allergies. The most common allergens of doctor-diagnosed and self-reported food allergy respondents were peaches (30.3%) and eggs (33.3%), respectively, followed by peanuts, cow's milk, and crab. Regarding consumer attitudes toward food labeling, checking food allergens as an item was only significantly different between allergic and non-allergic respondents among all five items (P < 0.001). All respondents reported that all six items (bold font, font color, box frame, warning statement, front label, and addition of potential allergens) were necessary for an improved food allergen labeling system. PLSR analysis determined that the doctor-diagnosed group and checking of food allergens were positively correlated, whereas the non-allergy group was more concerned with checking product brands. An effective food labeling system is very important for health protection of allergic consumers. Additionally, government agencies must develop policies regarding prevalence of food allergies in Korea. Based on this information, the food industry and government agencies should provide clear and accurate food labeling practices for consumers.
Maintenance Operations in Mission Oriented Protective Posture Level IV (MOPPIV)
1987-10-01
Repair FADAC Printed Circuit Board; 3. Data Analysis Techniques; a. Multiple Linear Regression... ANALYSIS/DISCUSSION; 1. Example of Regression Analysis; 2. Regression results for all tasks... TABLE 9. Task Grouping for Analysis; TABLE 10. Remove/Replace H60A3 Power Pack; TABLE
NASA Technical Reports Server (NTRS)
Rummler, D. R.
1976-01-01
The results are presented of investigations to apply regression techniques to the development of methodology for creep-rupture data analysis. Regression analysis techniques are applied to the explicit description of the creep behavior of materials for space shuttle thermal protection systems. A regression analysis technique is compared with five parametric methods for analyzing three simulated and twenty real data sets, and a computer program for the evaluation of creep-rupture data is presented.
Resting-state functional magnetic resonance imaging: the impact of regression analysis.
Yeh, Chia-Jung; Tseng, Yu-Sheng; Lin, Yi-Ru; Tsai, Shang-Yueh; Huang, Teng-Yi
2015-01-01
To investigate the impact of regression methods on resting-state functional magnetic resonance imaging (rsfMRI). During rsfMRI preprocessing, regression analysis is considered effective for reducing the interference of physiological noise on the signal time course. However, it is unclear whether the regression method benefits rsfMRI analysis. Twenty volunteers (10 men and 10 women; aged 23.4 ± 1.5 years) participated in the experiments. We used node analysis and functional connectivity mapping to assess the brain default mode network by using five combinations of regression methods. The results show that regressing the global mean plays a major role in the preprocessing steps. When a global regression method is applied, the values of functional connectivity are significantly lower (P ≤ .01) than those calculated without a global regression. This step increases inter-subject variation and produces anticorrelated brain areas. rsfMRI data processed using regression should be interpreted carefully. The significance of the anticorrelated brain areas produced by global signal removal is unclear. Copyright © 2014 by the American Society of Neuroimaging.
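The global regression step discussed above can be sketched as follows: each voxel's time course is regressed on the global mean time course and only the residuals are kept. This is a minimal illustration under stated assumptions, not the authors' pipeline; the function name and the toy time courses are hypothetical.

```python
from statistics import mean

def regress_out(signal, regressor):
    """Return the residuals of `signal` after least-squares regression on `regressor` (with intercept)."""
    ms, mr = mean(signal), mean(regressor)
    srr = sum((r - mr) ** 2 for r in regressor)
    beta = sum((r - mr) * (s - ms) for r, s in zip(regressor, signal)) / srr
    return [(s - ms) - beta * (r - mr) for r, s in zip(regressor, signal)]

# Hypothetical toy time courses for three voxels (rows) over four time points.
voxels = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 1.0, 4.0, 3.0],
    [0.0, 3.0, 2.0, 5.0],
]

# Global signal: mean across voxels at each time point.
global_mean = [mean(col) for col in zip(*voxels)]

# Remove the global signal from every voxel.
cleaned = [regress_out(v, global_mean) for v in voxels]
```

In this toy case the cleaned time courses sum to zero across voxels at every time point, so the second and third voxels become exact mirror images of each other. This is one way to see why global signal removal can produce anticorrelated areas whose significance is hard to interpret.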
Standards for Standardized Logistic Regression Coefficients
ERIC Educational Resources Information Center
Menard, Scott
2011-01-01
Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…
Linear regression analysis: part 14 of a series on evaluation of scientific publications.
Schneider, Astrid; Hommel, Gerhard; Blettner, Maria
2010-11-01
Regression analysis is an important statistical method for the analysis of medical data. It enables the identification and characterization of relationships among multiple factors. It also enables the identification of prognostically relevant risk factors and the calculation of risk scores for individual prognostication. This article is based on selected textbooks of statistics, a selective review of the literature, and our own experience. After a brief introduction of the uni- and multivariable regression models, illustrative examples are given to explain what the important considerations are before a regression analysis is performed, and how the results should be interpreted. The reader should then be able to judge whether the method has been used correctly and interpret the results appropriately. The performance and interpretation of linear regression analysis are subject to a variety of pitfalls, which are discussed here in detail. The reader is made aware of common errors of interpretation through practical examples. Both the opportunities for applying linear regression analysis and its limitations are presented.
An improved multiple linear regression and data analysis computer program package
NASA Technical Reports Server (NTRS)
Sidik, S. M.
1972-01-01
NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.
[A SAS macro program for batch processing of univariate Cox regression analysis for large databases].
Yang, Rendong; Xiong, Jie; Peng, Yangqin; Peng, Xiaoning; Zeng, Xiaomin
2015-02-01
To realize batch processing of univariate Cox regression analysis for large databases using a SAS macro program. We wrote a SAS macro program in SAS 9.2 that can filter, integrate, and export P values to Excel. The program was used to screen survival-correlated RNA molecules in ovarian cancer. The SAS macro program completed the batch processing of univariate Cox regression analysis as well as the selection and export of the results. The SAS macro program has potential applications in reducing the workload of statistical analysis and providing a basis for batch processing of univariate Cox regression analysis.
Exact Analysis of Squared Cross-Validity Coefficient in Predictive Regression Models
ERIC Educational Resources Information Center
Shieh, Gwowen
2009-01-01
In regression analysis, the notion of population validity is of theoretical interest for describing the usefulness of the underlying regression model, whereas the presumably more important concept of population cross-validity represents the predictive effectiveness for the regression equation in future research. It appears that the inference…
USDA-ARS's Scientific Manuscript database
Selective principal component regression analysis (SPCR) uses a subset of the original image bands for principal component transformation and regression. For optimal band selection before the transformation, this paper used genetic algorithms (GA). In this case, the GA process used the regression co...
Development of a User Interface for a Regression Analysis Software Tool
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert Manfred; Volden, Thomas R.
2010-01-01
An easy-to-use user interface was implemented in a highly automated regression analysis tool. The user interface was developed from the start to run on computers that use the Windows, Macintosh, Linux, or UNIX operating system. Many user interface features were specifically designed such that a novice or inexperienced user can apply the regression analysis tool with confidence. Therefore, the user interface's design minimizes interactive input from the user. In addition, reasonable default combinations are assigned to those analysis settings that influence the outcome of the regression analysis. These default combinations will lead to a successful regression analysis result for most experimental data sets. The user interface comes in two versions. The text user interface version is used for the ongoing development of the regression analysis tool. The official release of the regression analysis tool, on the other hand, has a graphical user interface that is more efficient to use. This graphical user interface displays all input file names, output file names, and analysis settings for a specific software application mode on a single screen, which makes it easier to generate reliable analysis results and to perform input parameter studies. An object-oriented approach was used for the development of the graphical user interface. This choice keeps future software maintenance costs to a reasonable limit. Examples of both the text user interface and graphical user interface are discussed in order to illustrate the user interface's overall design approach.
Regression Analysis and the Sociological Imagination
ERIC Educational Resources Information Center
De Maio, Fernando
2014-01-01
Regression analysis is an important aspect of most introductory statistics courses in sociology but is often presented in contexts divorced from the central concerns that bring students into the discipline. Consequently, we present five lesson ideas that emerge from a regression analysis of income inequality and mortality in the USA and Canada.
Multivariate Regression Analysis and Slaughter Livestock,
(AGRICULTURE, *ECONOMICS), (*MEAT, PRODUCTION), MULTIVARIATE ANALYSIS, REGRESSION ANALYSIS, ANIMALS, WEIGHT, COSTS, PREDICTIONS, STABILITY, MATHEMATICAL MODELS, STORAGE, BEEF, PORK, FOOD, STATISTICAL DATA, ACCURACY
Regression Analysis: Legal Applications in Institutional Research
ERIC Educational Resources Information Center
Frizell, Julie A.; Shippen, Benjamin S., Jr.; Luna, Andrew L.
2008-01-01
This article reviews multiple regression analysis, describes how its results should be interpreted, and instructs institutional researchers on how to conduct such analyses using an example focused on faculty pay equity between men and women. The use of multiple regression analysis will be presented as a method with which to compare salaries of…
RAWS II: A MULTIPLE REGRESSION ANALYSIS PROGRAM,
This memorandum gives instructions for the use and operation of a revised version of RAWS, a multiple regression analysis program. The program...of preprocessed data, the directed retention of variable, listing of the matrix of the normal equations and its inverse, and the bypassing of the regression analysis to provide the input variable statistics only. (Author)
Wang, D Z; Wang, C; Shen, C F; Zhang, Y; Zhang, H; Song, G D; Xue, X D; Xu, Z L; Zhang, S; Jiang, G H
2017-05-10
We described the time trend of acute myocardial infarction (AMI) incidence in Tianjin from 1999 to 2013 using the Cochran-Armitage trend (CAT) test and linear regression analysis, and compared the results. Based on the actual population, the CAT test had much stronger statistical power than linear regression analysis for both the overall incidence trend and the age-specific incidence trends (Cochran-Armitage trend P value
A primer for biomedical scientists on how to execute model II linear regression analysis.
Ludbrook, John
2012-04-01
1. There are two very different ways of executing linear regression analysis. One is Model I, when the x-values are fixed by the experimenter. The other is Model II, in which the x-values are free to vary and are subject to error. 2. I have received numerous complaints from biomedical scientists that they have great difficulty in executing Model II linear regression analysis. This may explain the results of a Google Scholar search, which showed that the authors of articles in journals of physiology, pharmacology and biochemistry rarely use Model II regression analysis. 3. I repeat my previous arguments in favour of using least products linear regression analysis for Model II regressions. I review three methods for executing ordinary least products (OLP) and weighted least products (WLP) regression analysis: (i) scientific calculator and/or computer spreadsheet; (ii) specific purpose computer programs; and (iii) general purpose computer programs. 4. Using a scientific calculator and/or computer spreadsheet, it is easy to obtain correct values for OLP slope and intercept, but the corresponding 95% confidence intervals (CI) are inaccurate. 5. Using specific purpose computer programs, the freeware computer program smatr gives the correct OLP regression coefficients and obtains 95% CI by bootstrapping. In addition, smatr can be used to compare the slopes of OLP lines. 6. When using general purpose computer programs, I recommend the commercial programs systat and Statistica for those who regularly undertake linear regression analysis and I give step-by-step instructions in the Supplementary Information as to how to use loss functions. © 2011 The Author. Clinical and Experimental Pharmacology and Physiology. © 2011 Blackwell Publishing Asia Pty Ltd.
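The point estimates for ordinary least products (OLP) regression mentioned in item 4 can be sketched as follows. The sketch assumes the standard OLP (also called reduced major axis or geometric mean) formulas, in which the slope is the ratio of the standard deviations of y and x, signed by their covariance. The function name and data are hypothetical, and no confidence intervals are computed here, since the abstract notes that these need more care (for example, bootstrapping).

```python
from math import copysign, sqrt
from statistics import mean

def olp_regression(x, y):
    """Return (slope, intercept) of the ordinary least products (reduced major axis) line."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    # Slope magnitude is sd(y) / sd(x); its sign follows the covariance.
    slope = copysign(sqrt(syy / sxx), sxy)
    intercept = my - slope * mx
    return slope, intercept

# Hypothetical paired measurements, both subject to error (Model II situation).
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
olp_slope, olp_intercept = olp_regression(x, y)
```

For positively correlated data the OLP slope is never smaller than the ordinary least squares (Model I) slope, because it equals the Model I slope divided by the correlation coefficient.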
Water quality parameter measurement using spectral signatures
NASA Technical Reports Server (NTRS)
White, P. E.
1973-01-01
Regression analysis is applied to the problem of measuring water quality parameters from remote sensing spectral signature data. The equations necessary to perform regression analysis are presented and methods of testing the strength and reliability of a regression are described. An efficient algorithm for selecting an optimal subset of the independent variables available for a regression is also presented.
Krishan, Kewal; Kanchan, Tanuj; Sharma, Abhilasha
2012-05-01
Estimation of stature is an important step in the identification of human remains in forensic examinations. The present study aims to compare the reliability and accuracy of stature estimation, and to demonstrate the variability between estimated and actual stature, using the multiplication factor and regression analysis methods. The study is based on a sample of 246 subjects (123 males and 123 females) from North India aged between 17 and 20 years. Four anthropometric measurements (hand length, hand breadth, foot length, and foot breadth), taken on the left side in each subject, were included in the study. Stature was measured using standard anthropometric techniques. Multiplication factors were calculated and linear regression models were derived for estimation of stature from hand and foot dimensions. The derived multiplication factors and regression formulae were applied to the hand and foot measurements in the study sample. The stature estimated from the multiplication factors and from regression analysis was compared with the actual stature to find the error in estimated stature. The results indicate that the range of error in stature estimation is smaller for the regression analysis method than for the multiplication factor method, confirming that regression analysis is the better of the two methods for stature estimation. Copyright © 2012 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
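The two estimation approaches compared above can be sketched side by side. This is a minimal illustration, assuming the multiplication factor is the mean ratio of stature to the measured dimension; the measurements are hypothetical values constructed to lie on a line with a nonzero intercept and are not data from the study.

```python
from statistics import mean

# Hypothetical measurements in cm (not from the study).
foot_length = [23.0, 24.0, 25.0, 26.0, 27.0]
stature = [160.0, 164.0, 168.0, 172.0, 176.0]

# Multiplication factor method: stature is estimated as MF * dimension,
# which forces the relation through the origin.
mf = mean(s / f for s, f in zip(stature, foot_length))
mf_errors = [mf * f - s for f, s in zip(foot_length, stature)]

# Regression method: stature is estimated as intercept + slope * dimension.
mean_f, mean_s = mean(foot_length), mean(stature)
slope = (sum((f - mean_f) * (s - mean_s) for f, s in zip(foot_length, stature))
         / sum((f - mean_f) ** 2 for f in foot_length))
intercept = mean_s - slope * mean_f
reg_errors = [intercept + slope * f - s for f, s in zip(foot_length, stature)]

mf_error_range = max(mf_errors) - min(mf_errors)
reg_error_range = max(reg_errors) - min(reg_errors)
```

Because the multiplication factor has no intercept term, its errors grow toward the extremes of the measured dimension whenever the true relation does not pass through the origin, which is consistent with the narrower error range reported for regression.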
Handling nonnormality and variance heterogeneity for quantitative sublethal toxicity tests.
Ritz, Christian; Van der Vliet, Leana
2009-09-01
The advantages of using regression-based techniques to derive endpoints from environmental toxicity data are clear, and slowly, this superior analytical technique is gaining acceptance. As use of regression-based analysis becomes more widespread, some of the associated nuances and potential problems come into sharper focus. Looking at data sets that cover a broad spectrum of standard test species, we noticed that some model fits to data failed to meet two key assumptions (variance homogeneity and normality) that are necessary for correct statistical analysis via regression-based techniques. Failure to meet these assumptions often is caused by reduced variance at the concentrations showing severe adverse effects. Although commonly used with linear regression analysis, transformation of the response variable only is not appropriate when fitting data using nonlinear regression techniques. Through analysis of sample data sets, including Lemna minor, Eisenia andrei (terrestrial earthworm), and algae, we show that both the so-called Box-Cox transformation and use of the Poisson distribution can help to correct variance heterogeneity and nonnormality and so allow nonlinear regression analysis to be implemented. Both the Box-Cox transformation and the Poisson distribution can be readily implemented into existing protocols for statistical analysis. By correcting for nonnormality and variance heterogeneity, these two statistical tools can be used to encourage the transition to regression-based analysis and the depreciation of less-desirable and less-flexible analytical techniques, such as linear interpolation.
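For reference, the Box-Cox transformation named above can be written in a few lines. The sketch below also shows a residual computed with the transformation applied to both the observed response and the model prediction, which is an assumption about how the abstract's remark on not transforming the response alone would be put into practice; the function names are hypothetical.

```python
from math import log

def box_cox(y, lam):
    """Box-Cox transform of a positive value y with parameter lam."""
    if y <= 0:
        raise ValueError("Box-Cox requires strictly positive values")
    # lam == 0 is the limiting case and corresponds to the natural logarithm.
    return log(y) if lam == 0 else (y ** lam - 1.0) / lam

def transformed_residual(observed, predicted, lam):
    """Residual on the transformed scale, transforming both data and model prediction."""
    return box_cox(observed, lam) - box_cox(predicted, lam)

# Hypothetical values: lam = 0.5 is a square-root-type transformation.
example = transformed_residual(4.0, 1.0, 0.5)
```

Transforming both sides keeps the fitted dose-response curve on its original scale while changing how deviations from it are weighted.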
NASA Astrophysics Data System (ADS)
Baptiste Barré, Jean; Bourrier, Franck; Bertrand, David; Rey, Freddy
2015-04-01
Ecological engineering corresponds to the design of efficient solutions for protection against natural hazards such as shallow landslides and soil erosion. In particular, bioengineering structures can be composed of a living part, made of plants, cuttings or seeds, and an inert part, a timber log structure. As the wood is not treated with preservatives, fungal degradation can occur from the start of the construction. It results in wood strength loss, which practitioners try to evaluate with non-destructive tools (NDT). Classical NDT are mainly based on density measurements. However, fungal activity reduces the mechanical properties (modulus of elasticity - MOE) well before a density change can be measured. In this context, it would be useful to provide a tool for assessing the residual mechanical strength at different decay stages due to a fungal community. Near-infrared spectroscopy (NIRS) can be used for that purpose, as it allows evaluation of wood mechanical properties as well as wood chemical changes due to brown and white rots. We monitored 160 silver fir samples (30x30x6000mm) from green state to different levels of decay. The degradation process took place in a greenhouse and samples were inoculated with silver fir decayed debris in order to accelerate the process. For each sample, we calculated the normalized bending modulus of elasticity loss (Dw moe) and defined it as decay extent. Near infrared spectra collected from both green and decayed ground samples were corrected by the subtraction of baseline offset. Spectra of green samples were averaged into one mean spectrum and decayed spectra were subtracted from the mean spectrum to calculate the absorption loss. Partial least squares regression (PLSR) was performed between the normalized MOE loss Dw moe (0 < Dw moe < 1) and the absorption loss, giving an R² of 0.85. Finally, the prediction of silver fir biodegradation rate by NIRS was significant (RMSEP = 0.13). 
This tool improves the evaluation accuracy of wood decay extent in the context of ecological engineering structures used for natural hazard mitigation.
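The RMSEP figure quoted above is the root mean square error of prediction, which can be computed as in the following minimal sketch. The decay-extent values are hypothetical and are not the study's measurements.

```python
from math import sqrt

def rmsep(observed, predicted):
    """Root mean square error of prediction over paired observed and predicted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical normalized MOE losses (scale 0 to 1) for a small validation set.
observed_loss = [0.2, 0.5, 0.9]
predicted_loss = [0.3, 0.4, 1.0]
error = rmsep(observed_loss, predicted_loss)
```

Since the decay extent is normalized to lie between 0 and 1, an RMSEP of 0.13 corresponds to a typical prediction error of about 13% of the full scale.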
Salter, Ian
2018-01-01
Environmental DNA (eDNA) can be defined as the DNA pool recovered from an environmental sample that includes both extracellular and intracellular DNA. There has been a significant increase in the number of recent studies that have demonstrated the possibility to detect macroorganisms using eDNA. Despite the enormous potential of eDNA to serve as a biomonitoring and conservation tool in aquatic systems, there remain some important limitations concerning its application. One significant factor is the variable persistence of eDNA over natural environmental gradients, which imposes a critical constraint on the temporal and spatial scales of species detection. In the present study, a radiotracer bioassay approach was used to quantify the kinetic parameters of dissolved eDNA (d-eDNA), a component of extracellular DNA, over an annual cycle in the coastal Northwest Mediterranean. Significant seasonal variability in the biological uptake and turnover of d-eDNA was observed, the latter ranging from several hours to over one month. Maximum uptake rates of d-eDNA occurred in summer during a period of intense phosphate limitation (turnover <5 hrs). Corresponding increases in bacterial production and uptake of adenosine triphosphate (ATP) demonstrated the microbial utilization of d-eDNA as an organic phosphorus substrate. Higher temperatures during summer may amplify this effect through a general enhancement of microbial metabolism. A partial least squares regression (PLSR) model was able to reproduce the seasonal cycle in d-eDNA persistence and explained 60% of the variance in the observations. Rapid phosphate turnover and low concentrations of bioavailable phosphate, both indicative of phosphate limitation, were the most important parameters in the model. Abiotic factors such as pH, salinity and oxygen exerted minimal influence. 
The present study demonstrates significant seasonal variability in the persistence of d-eDNA in a natural marine environment that can be linked to the metabolic response of microbial communities to nutrient limitation. Future studies should consider the effect of natural environmental gradients on the seasonal persistence of eDNA, which will be of particular relevance for time-series biomonitoring programs. PMID:29474423
Perception and Modeling of Affective Qualities of Musical Instrument Sounds across Pitch Registers
McAdams, Stephen; Douglas, Chelsea; Vempala, Naresh N.
2017-01-01
Composers often pick specific instruments to convey a given emotional tone in their music, partly due to their expressive possibilities, but also due to their timbres in specific registers and at given dynamic markings. Of interest to both music psychology and music informatics from a computational point of view is the relation between the acoustic properties that give rise to the timbre at a given pitch and the perceived emotional quality of the tone. Musician and nonmusician listeners were presented with 137 tones produced at a fixed dynamic marking (forte) playing tones at pitch class D# across each instrument's entire pitch range and with different playing techniques for standard orchestral instruments drawn from the brass, woodwind, string, and pitched percussion families. They rated each tone on six analogical-categorical scales in terms of emotional valence (positive/negative and pleasant/unpleasant), energy arousal (awake/tired), tension arousal (excited/calm), preference (like/dislike), and familiarity. Linear mixed models revealed interactive effects of musical training, instrument family, and pitch register, with non-linear relations between pitch register and several dependent variables. Twenty-three audio descriptors from the Timbre Toolbox were computed for each sound and analyzed in two ways: linear partial least squares regression (PLSR) and nonlinear artificial neural net modeling. These two analyses converged in terms of the importance of various spectral, temporal, and spectrotemporal audio descriptors in explaining the emotion ratings, but some differences also emerged. Different combinations of audio descriptors make major contributions to the three emotion dimensions, suggesting that they are carried by distinct acoustic properties. Valence is more positive with lower spectral slopes, a greater emergence of strong partials, and an amplitude envelope with a sharper attack and earlier decay. 
Higher tension arousal is carried by brighter sounds, more spectral variation and more gentle attacks. Greater energy arousal is associated with brighter sounds, with higher spectral centroids and slower decrease of the spectral slope, as well as with greater spectral emergence. The divergences between linear and nonlinear approaches are discussed. PMID:28228741
Using Robust Standard Errors to Combine Multiple Regression Estimates with Meta-Analysis
ERIC Educational Resources Information Center
Williams, Ryan T.
2012-01-01
Combining multiple regression estimates with meta-analysis has continued to be a difficult task. A variety of methods have been proposed and used to combine multiple regression slope estimates with meta-analysis; however, most of these methods have serious methodological and practical limitations. The purpose of this study was to explore the use…
A Quality Assessment Tool for Non-Specialist Users of Regression Analysis
ERIC Educational Resources Information Center
Argyrous, George
2015-01-01
This paper illustrates the use of a quality assessment tool for regression analysis. It is designed for non-specialist "consumers" of evidence, such as policy makers. The tool provides a series of questions such consumers of evidence can ask to interrogate regression analysis, and is illustrated with reference to a recent study published…
Park, Ji Hyun; Kim, Hyeon-Young; Lee, Hanna; Yun, Eun Kyoung
2015-12-01
This study compares the performance of the logistic regression and decision tree analysis methods for assessing the risk factors for infection in cancer patients undergoing chemotherapy. The subjects were 732 cancer patients who were receiving chemotherapy at K university hospital in Seoul, Korea. The data were collected between March 2011 and February 2013 and were processed for descriptive analysis, logistic regression and decision tree analysis using the IBM SPSS Statistics 19 and Modeler 15.1 programs. The most common risk factors for infection in cancer patients receiving chemotherapy were identified as alkylating agents, vinca alkaloid and underlying diabetes mellitus. The logistic regression explained 66.7% of the variation in the data in terms of sensitivity and 88.9% in terms of specificity. The decision tree analysis accounted for 55.0% of the variation in the data in terms of sensitivity and 89.0% in terms of specificity. As for the overall classification accuracy, the logistic regression explained 88.0% and the decision tree analysis explained 87.2%. The logistic regression analysis showed a higher degree of sensitivity and classification accuracy. Therefore, logistic regression analysis is concluded to be the more effective and useful method for establishing an infection prediction model for patients undergoing chemotherapy. Copyright © 2015 Elsevier Ltd. All rights reserved.
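The three performance measures used above to compare the two models (sensitivity, specificity, and overall classification accuracy) all derive from a confusion matrix. The following is a minimal sketch; the counts are hypothetical and are not taken from the study.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)              # infected patients correctly flagged
    specificity = tn / (tn + fp)              # non-infected patients correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy

# Hypothetical counts for 100 patients: 12 with infection, 88 without.
sens, spec, acc = classification_metrics(tp=8, fn=4, tn=80, fp=8)
print(f"sensitivity={sens:.3f}, specificity={spec:.3f}, accuracy={acc:.3f}")
```

When infections are rare, overall accuracy is dominated by specificity, so two models can have similar accuracy while differing noticeably in sensitivity, as in the comparison reported above.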
Zarb, Francis; McEntee, Mark F; Rainford, Louise
2015-06-01
To evaluate visual grading characteristics (VGC) and ordinal regression analysis during head CT optimisation as a potential alternative to visual grading assessment (VGA), traditionally employed to score anatomical visualisation. Patient images (n = 66) were obtained using current and optimised imaging protocols from two CT suites: a 16-slice scanner at the national Maltese centre for trauma and a 64-slice scanner in a private centre. Local resident radiologists (n = 6) performed VGA followed by VGC and ordinal regression analysis. VGC alone indicated that optimised protocols had similar image quality as current protocols. Ordinal logistic regression analysis provided an in-depth evaluation, criterion by criterion allowing the selective implementation of the protocols. The local radiology review panel supported the implementation of optimised protocols for brain CT examinations (including trauma) in one centre, achieving radiation dose reductions ranging from 24 % to 36 %. In the second centre a 29 % reduction in radiation dose was achieved for follow-up cases. The combined use of VGC and ordinal logistic regression analysis led to clinical decisions being taken on the implementation of the optimised protocols. This improved method of image quality analysis provided the evidence to support imaging protocol optimisation, resulting in significant radiation dose savings. • There is need for scientifically based image quality evaluation during CT optimisation. • VGC and ordinal regression analysis in combination led to better informed clinical decisions. • VGC and ordinal regression analysis led to dose reductions without compromising diagnostic efficacy.
REGRESSION ANALYSIS OF SEA-SURFACE-TEMPERATURE PATTERNS FOR THE NORTH PACIFIC OCEAN.
SEA WATER, *SURFACE TEMPERATURE, *OCEANOGRAPHIC DATA, PACIFIC OCEAN, REGRESSION ANALYSIS, STATISTICAL ANALYSIS, UNDERWATER EQUIPMENT, DETECTION, UNDERWATER COMMUNICATIONS, DISTRIBUTION, THERMAL PROPERTIES, COMPUTERS.
The process and utility of classification and regression tree methodology in nursing research
Kuhn, Lisa; Page, Karen; Ward, John; Worrall-Carter, Linda
2014-01-01
Aim This paper presents a discussion of classification and regression tree analysis and its utility in nursing research. Background Classification and regression tree analysis is an exploratory research method used to illustrate associations between variables not suited to traditional regression analysis. Complex interactions are demonstrated between covariates and variables of interest in inverted tree diagrams. Design Discussion paper. Data sources English language literature was sourced from eBooks, Medline Complete and CINAHL Plus databases, Google and Google Scholar, hard copy research texts and retrieved reference lists for terms including classification and regression tree* and derivatives and recursive partitioning from 1984–2013. Discussion Classification and regression tree analysis is an important method used to identify previously unknown patterns amongst data. Whilst there are several reasons to embrace this method as a means of exploratory quantitative research, issues regarding quality of data as well as the usefulness and validity of the findings should be considered. Implications for Nursing Research Classification and regression tree analysis is a valuable tool to guide nurses to reduce gaps in the application of evidence to practice. With the ever-expanding availability of data, it is important that nurses understand the utility and limitations of the research method. Conclusion Classification and regression tree analysis is an easily interpreted method for modelling interactions between health-related variables that would otherwise remain obscured. Knowledge is presented graphically, providing insightful understanding of complex and hierarchical relationships in an accessible and useful way to nursing and other health professions. PMID:24237048
The process and utility of classification and regression tree methodology in nursing research.
Kuhn, Lisa; Page, Karen; Ward, John; Worrall-Carter, Linda
2014-06-01
This paper presents a discussion of classification and regression tree analysis and its utility in nursing research. Classification and regression tree analysis is an exploratory research method used to illustrate associations between variables not suited to traditional regression analysis. Complex interactions are demonstrated between covariates and variables of interest in inverted tree diagrams. Discussion paper. English language literature was sourced from eBooks, Medline Complete and CINAHL Plus databases, Google and Google Scholar, hard copy research texts and retrieved reference lists for terms including classification and regression tree* and derivatives and recursive partitioning from 1984-2013. Classification and regression tree analysis is an important method used to identify previously unknown patterns amongst data. Whilst there are several reasons to embrace this method as a means of exploratory quantitative research, issues regarding quality of data as well as the usefulness and validity of the findings should be considered. Classification and regression tree analysis is a valuable tool to guide nurses to reduce gaps in the application of evidence to practice. With the ever-expanding availability of data, it is important that nurses understand the utility and limitations of the research method. Classification and regression tree analysis is an easily interpreted method for modelling interactions between health-related variables that would otherwise remain obscured. Knowledge is presented graphically, providing insightful understanding of complex and hierarchical relationships in an accessible and useful way to nursing and other health professions. © 2013 The Authors. Journal of Advanced Nursing Published by John Wiley & Sons Ltd.
Hoch, Jeffrey S; Dewa, Carolyn S
2014-04-01
Economic evaluations commonly accompany trials of new treatments or interventions; however, regression methods and their corresponding advantages for the analysis of cost-effectiveness data are not well known. To illustrate regression-based economic evaluation, we present a case study investigating the cost-effectiveness of a collaborative mental health care program for people receiving short-term disability benefits for psychiatric disorders. We implement net benefit regression to illustrate its strengths and limitations. Net benefit regression offers a simple option for cost-effectiveness analyses of person-level data. By placing economic evaluation in a regression framework, regression-based techniques can facilitate the analysis and provide simple solutions to commonly encountered challenges. Economic evaluations of person-level data (eg, from a clinical trial) should use net benefit regression to facilitate analysis and enhance results.
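The net benefit idea in the abstract above can be sketched in a few lines. This is a minimal illustration, not the authors' analysis: the data, the willingness-to-pay value `WTP`, and the helper names are all hypothetical. It assumes the usual definition of person-level net benefit (willingness to pay times effect, minus cost) and regresses it on a treatment indicator, so the slope is the incremental net benefit.

```python
def net_benefit(effect, cost, wtp):
    """Person-level net benefit: willingness to pay times effect, minus cost."""
    return wtp * effect - cost


def ols_slope_intercept(x, y):
    """Ordinary least squares for a single predictor; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical person-level data (not taken from the study).
treated = [0, 0, 0, 1, 1, 1]
effects = [0.50, 0.55, 0.60, 0.65, 0.70, 0.75]
costs = [1000.0, 1200.0, 1100.0, 1500.0, 1400.0, 1600.0]
WTP = 20000.0  # assumed willingness to pay per unit of effect

nb = [net_benefit(e, c, WTP) for e, c in zip(effects, costs)]
incremental_nb, control_mean_nb = ols_slope_intercept(treated, nb)
print(incremental_nb, control_mean_nb)
```

With a single binary regressor the slope equals the difference in mean net benefit between the two arms; adding covariates to the regression is what makes the framework useful in practice.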
CADDIS Volume 4. Data Analysis: Basic Analyses
Use of statistical tests to determine if an observation is outside the normal range of expected values. Details of CART, regression analysis, use of quantile regression analysis, CART in causal analysis, simplifying or pruning resulting trees.
Population heterogeneity in the salience of multiple risk factors for adolescent delinquency.
Lanza, Stephanie T; Cooper, Brittany R; Bray, Bethany C
2014-03-01
To present mixture regression analysis as an alternative to more standard regression analysis for predicting adolescent delinquency. We demonstrate how mixture regression analysis allows for the identification of population subgroups defined by the salience of multiple risk factors. We identified population subgroups (i.e., latent classes) of individuals based on their coefficients in a regression model predicting adolescent delinquency from eight previously established risk indices drawn from the community, school, family, peer, and individual levels. The study included N = 37,763 10th-grade adolescents who participated in the Communities That Care Youth Survey. Standard, zero-inflated, and mixture Poisson and negative binomial regression models were considered. Standard and mixture negative binomial regression models were selected as optimal. The five-class regression model was interpreted based on the class-specific regression coefficients, indicating that risk factors had varying salience across classes of adolescents. Standard regression showed that all risk factors were significantly associated with delinquency. Mixture regression provided more nuanced information, suggesting a unique set of risk factors that were salient for different subgroups of adolescents. Implications for the design of subgroup-specific interventions are discussed. Copyright © 2014 Society for Adolescent Health and Medicine. Published by Elsevier Inc. All rights reserved.
ERIC Educational Resources Information Center
Dolan, Conor V.; Wicherts, Jelte M.; Molenaar, Peter C. M.
2004-01-01
We consider the question of how variation in the number and reliability of indicators affects the power to reject the hypothesis that the regression coefficients are zero in latent linear regression analysis. We show that power remains constant as long as the coefficient of determination remains unchanged. Any increase in the number of indicators…
Ju, Se-young; Park, Jong-Hwan; Kim, Kyu-earn
2015-01-01
BACKGROUND/OBJECTIVES The objective of this study was to investigate food allergens and prevalence rates of food allergies, followed by comparison of consumer attitudes and preferences regarding food allergy labeling by diagnosis of food allergies. SUBJECTS/METHODS A total of 543 individuals living in Seoul and Gyeonggi area participated in the survey from October 15 to 22 in 2013. RESULTS The results show that the prevalence of doctor-diagnosed food allergies was 17.5%, whereas 6.4% of respondents self-reported food allergies. The most common allergens of doctor-diagnosed and self-reported food allergy respondents were peaches (30.3%) and eggs (33.3%), respectively, followed by peanuts, cow's milk, and crab. Regarding consumer attitudes toward food labeling, checking food allergens as an item was only significantly different between allergic and non-allergic respondents among all five items (P < 0.001). All respondents reported that all six items (bold font, font color, box frame, warning statement, front label, and addition of potential allergens) were necessary for an improved food allergen labeling system. PLSR analysis determined that the doctor-diagnosed group and checking of food allergens were positively correlated, whereas the non-allergy group was more concerned with checking product brands. CONCLUSIONS An effective food labeling system is very important for health protection of allergic consumers. Additionally, government agencies must develop policies regarding prevalence of food allergies in Korea. Based on this information, the food industry and government agencies should provide clear and accurate food labeling practices for consumers. PMID:26425282
Moderation analysis using a two-level regression model.
Yuan, Ke-Hai; Cheng, Ying; Maxwell, Scott
2014-10-01
Moderation analysis is widely used in social and behavioral research. The most commonly used model for moderation analysis is moderated multiple regression (MMR) in which the explanatory variables of the regression model include product terms, and the model is typically estimated by least squares (LS). This paper argues for a two-level regression model in which the regression coefficients of a criterion variable on predictors are further regressed on moderator variables. An algorithm for estimating the parameters of the two-level model by normal-distribution-based maximum likelihood (NML) is developed. Formulas for the standard errors (SEs) of the parameter estimates are provided and studied. Results indicate that, when heteroscedasticity exists, NML with the two-level model gives more efficient and more accurate parameter estimates than the LS analysis of the MMR model. When error variances are homoscedastic, NML with the two-level model leads to essentially the same results as LS with the MMR model. Most importantly, the two-level regression model permits estimating the percentage of variance of each regression coefficient that is due to moderator variables. When applied to data from General Social Surveys 1991, NML with the two-level model identified a significant moderation effect of race on the regression of job prestige on years of education while LS with the MMR model did not. An R package is also developed and documented to facilitate the application of the two-level model.
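The conventional baseline described in the abstract above, moderated multiple regression estimated by least squares, can be sketched as follows. This is only an illustration of the product-term model y = b0 + b1·x + b2·z + b3·x·z; it does not implement the two-level model or the NML estimator the paper proposes, and the function names and data are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_mmr(x, z, y):
    """Least-squares fit of y = b0 + b1*x + b2*z + b3*x*z via the normal equations."""
    rows = [[1.0, xi, zi, xi * zi] for xi, zi in zip(x, z)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical noise-free data generated from known coefficients.
x = [0.0, 1.0, 2.0, 3.0, 0.0, 1.0, 2.0, 3.0]
z = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]  # binary moderator
y = [1.0 + 2.0 * xi + 3.0 * zi + 0.5 * xi * zi for xi, zi in zip(x, z)]

coefficients = fit_mmr(x, z, y)
print(coefficients)
```

The coefficient on the product term (`b3`) is the moderation effect: here the slope of y on x is 2.0 when z = 0 and 2.5 when z = 1.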
Multiple Correlation versus Multiple Regression.
ERIC Educational Resources Information Center
Huberty, Carl J.
2003-01-01
Describes differences between multiple correlation analysis (MCA) and multiple regression analysis (MRA), showing how these approaches involve different research questions and study designs, different inferential approaches, different analysis strategies, and different reported information. (SLD)
Functional Relationships and Regression Analysis.
ERIC Educational Resources Information Center
Preece, Peter F. W.
1978-01-01
Using a degenerate multivariate normal model for the distribution of organismic variables, the form of least-squares regression analysis required to estimate a linear functional relationship between variables is derived. It is suggested that the two conventional regression lines may be considered to describe functional, not merely statistical,…
Isolating and Examining Sources of Suppression and Multicollinearity in Multiple Linear Regression
ERIC Educational Resources Information Center
Beckstead, Jason W.
2012-01-01
The presence of suppression (and multicollinearity) in multiple regression analysis complicates interpretation of predictor-criterion relationships. The mathematical conditions that produce suppression in regression analysis have received considerable attention in the methodological literature but until now nothing in the way of an analytic…
General Nature of Multicollinearity in Multiple Regression Analysis.
ERIC Educational Resources Information Center
Liu, Richard
1981-01-01
Discusses multiple regression, a very popular statistical technique in the field of education. One of the basic assumptions in regression analysis requires that independent variables in the equation should not be highly correlated. The problem of multicollinearity and some of the solutions to it are discussed. (Author)
Logistic Regression: Concept and Application
ERIC Educational Resources Information Center
Cokluk, Omay
2010-01-01
The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…
Applying Regression Analysis to Problems in Institutional Research.
ERIC Educational Resources Information Center
Bohannon, Tom R.
1988-01-01
Regression analysis is one of the most frequently used statistical techniques in institutional research. Principles of least squares, model building, residual analysis, influence statistics, and multi-collinearity are described and illustrated. (Author/MSE)
Multicollinearity in Regression Analyses Conducted in Epidemiologic Studies
Vatcheva, Kristina P.; Lee, MinJae; McCormick, Joseph B.; Rahbar, Mohammad H.
2016-01-01
The adverse impact of ignoring multicollinearity on findings and data interpretation in regression analysis is very well documented in the statistical literature. The failure to identify and report multicollinearity could result in misleading interpretations of the results. A review of epidemiological literature in PubMed from January 2004 to December 2013, illustrated the need for a greater attention to identifying and minimizing the effect of multicollinearity in analysis of data from epidemiologic studies. We used simulated datasets and real life data from the Cameron County Hispanic Cohort to demonstrate the adverse effects of multicollinearity in the regression analysis and encourage researchers to consider the diagnostic for multicollinearity as one of the steps in regression analysis. PMID:27274911
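One commonly used diagnostic of the kind the abstract above recommends is the variance inflation factor (VIF). The abstract does not say which diagnostic the authors used, so treating VIF as the example is an assumption, as are the function names and data. The sketch covers only the two-predictor case, where the VIF of either predictor is 1 / (1 − r²), with r their Pearson correlation.

```python
from math import sqrt


def pearson_r(a, b):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    saa = sum((ai - mean_a) ** 2 for ai in a)
    sbb = sum((bi - mean_b) ** 2 for bi in b)
    sab = sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b))
    return sab / sqrt(saa * sbb)


def vif_two_predictors(x1, x2):
    """VIF shared by both predictors when the model has exactly two of them."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)


# Hypothetical predictor values with correlation 0.8.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]
print(pearson_r(x1, x2), vif_two_predictors(x1, x2))
```

With more than two predictors, r² is replaced by the R² from regressing each predictor on all the others; the thresholds used to flag a problem are conventions rather than fixed rules.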
Multicollinearity in Regression Analyses Conducted in Epidemiologic Studies.
Vatcheva, Kristina P; Lee, MinJae; McCormick, Joseph B; Rahbar, Mohammad H
2016-04-01
The adverse impact of ignoring multicollinearity on findings and data interpretation in regression analysis is very well documented in the statistical literature. The failure to identify and report multicollinearity could result in misleading interpretations of the results. A review of epidemiological literature in PubMed from January 2004 to December 2013, illustrated the need for a greater attention to identifying and minimizing the effect of multicollinearity in analysis of data from epidemiologic studies. We used simulated datasets and real life data from the Cameron County Hispanic Cohort to demonstrate the adverse effects of multicollinearity in the regression analysis and encourage researchers to consider the diagnostic for multicollinearity as one of the steps in regression analysis.
Stepwise versus Hierarchical Regression: Pros and Cons
ERIC Educational Resources Information Center
Lewis, Mitzi
2007-01-01
Multiple regression is commonly used in social and behavioral data analysis. In multiple regression contexts, researchers are very often interested in determining the "best" predictors in the analysis. This focus may stem from a need to identify those predictors that are supportive of theory. Alternatively, the researcher may simply be interested…
Interpreting Bivariate Regression Coefficients: Going beyond the Average
ERIC Educational Resources Information Center
Halcoussis, Dennis; Phillips, G. Michael
2010-01-01
Statistics, econometrics, investment analysis, and data analysis classes often review the calculation of several types of averages, including the arithmetic mean, geometric mean, harmonic mean, and various weighted averages. This note shows how each of these can be computed using a basic regression framework. By recognizing when a regression model…
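A minimal sketch of the idea in the abstract above: an intercept-only least-squares regression returns the mean of whatever response it is given, so transforming the response recovers other averages. The abstract is truncated and does not give the authors' formulations, so the three constructions below (regress y, ln y, or 1/y on a constant) are assumptions chosen for illustration, as are the function names.

```python
from math import exp, log


def intercept_only_ols(response):
    """Least-squares estimate of b in response = b + error.

    The design matrix is a single column of ones, so the normal equation
    (X'X) b = X'y reduces to n * b = sum(response).
    """
    xtx = sum(1.0 * 1.0 for _ in response)
    xty = sum(1.0 * value for value in response)
    return xty / xtx


def arithmetic_mean(values):
    return intercept_only_ols(values)


def geometric_mean(values):
    # Regress ln(y) on a constant, then back-transform the intercept.
    return exp(intercept_only_ols([log(v) for v in values]))


def harmonic_mean(values):
    # Regress 1/y on a constant, then take the reciprocal of the intercept.
    return 1.0 / intercept_only_ols([1.0 / v for v in values])


data = [1.0, 2.0, 4.0]  # hypothetical positive values
print(arithmetic_mean(data), geometric_mean(data), harmonic_mean(data))
```

The geometric and harmonic versions require strictly positive values, since they rely on the logarithm and the reciprocal.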
Regression Commonality Analysis: A Technique for Quantitative Theory Building
ERIC Educational Resources Information Center
Nimon, Kim; Reio, Thomas G., Jr.
2011-01-01
When it comes to multiple linear regression analysis (MLR), it is common for social and behavioral science researchers to rely predominately on beta weights when evaluating how predictors contribute to a regression model. Presenting an underutilized statistical technique, this article describes how organizational researchers can use commonality…
Precision Efficacy Analysis for Regression.
ERIC Educational Resources Information Center
Brooks, Gordon P.
When multiple linear regression is used to develop a prediction model, sample size must be large enough to ensure stable coefficients. If the derivation sample size is inadequate, the model may not predict well for future subjects. The precision efficacy analysis for regression (PEAR) method uses a cross-validity approach to select sample sizes…
Kanada, Yoshikiyo; Sakurai, Hiroaki; Sugiura, Yoshito; Arai, Tomoaki; Koyama, Soichiro; Tanabe, Shigeo
2017-11-01
[Purpose] To create a regression formula in order to estimate 1RM for knee extensors, based on the maximal isometric muscle strength measured using a hand-held dynamometer and data regarding the body composition. [Subjects and Methods] Measurement was performed in 21 healthy males in their twenties to thirties. Single regression analysis was performed, with measurement values representing 1RM and the maximal isometric muscle strength as dependent and independent variables, respectively. Furthermore, multiple regression analysis was performed, with data regarding the body composition incorporated as another independent variable, in addition to the maximal isometric muscle strength. [Results] Through single regression analysis with the maximal isometric muscle strength as an independent variable, the following regression formula was created: 1RM (kg)=0.714 + 0.783 × maximal isometric muscle strength (kgf). On multiple regression analysis, only the total muscle mass was extracted. [Conclusion] A highly accurate regression formula to estimate 1RM was created based on both the maximal isometric muscle strength and body composition. Using a hand-held dynamometer and body composition analyzer, it was possible to measure these items in a short time, and obtain clinically useful results.
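The single-regression formula reported in the abstract above can be applied directly. The sketch below only wraps that formula in a function; the function name and the example input of 40 kgf are hypothetical, and the formula was derived from 21 healthy males in their twenties to thirties, so applying it outside that group is an extrapolation.

```python
def estimate_1rm_kg(isometric_strength_kgf):
    """Estimate knee-extensor 1RM (kg) from maximal isometric strength (kgf).

    Coefficients are those reported in the abstract:
    1RM (kg) = 0.714 + 0.783 * maximal isometric muscle strength (kgf).
    """
    intercept = 0.714
    slope = 0.783
    return intercept + slope * isometric_strength_kgf


# Hypothetical measurement of 40 kgf from a hand-held dynamometer.
print(round(estimate_1rm_kg(40.0), 3))
```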
Regression Model Optimization for the Analysis of Experimental Data
NASA Technical Reports Server (NTRS)
Ulbrich, N.
2009-01-01
A candidate math model search algorithm was developed at Ames Research Center that determines a recommended math model for the multivariate regression analysis of experimental data. The search algorithm is applicable to classical regression analysis problems as well as wind tunnel strain gage balance calibration analysis applications. The algorithm compares the predictive capability of different regression models using the standard deviation of the PRESS residuals of the responses as a search metric. This search metric is minimized during the search. Singular value decomposition is used during the search to reject math models that lead to a singular solution of the regression analysis problem. Two threshold dependent constraints are also applied. The first constraint rejects math models with insignificant terms. The second constraint rejects math models with near-linear dependencies between terms. The math term hierarchy rule may also be applied as an optional constraint during or after the candidate math model search. The final term selection of the recommended math model depends on the regressor and response values of the data set, the user's function class combination choice, the user's constraint selections, and the result of the search metric minimization. A frequently used regression analysis example from the literature is used to illustrate the application of the search algorithm to experimental data.
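The search metric named in the abstract above is built from PRESS (leave-one-out prediction) residuals. The sketch below is not the Ames algorithm: it covers only a straight-line model with one regressor, uses hypothetical data and function names, and assumes the sample standard deviation as the summary statistic. It shows the standard shortcut e_i / (1 − h_i), where h_i is the leverage, and checks it against an explicit leave-one-out refit.

```python
from statistics import stdev


def fit_line(x, y):
    """Least-squares straight line; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def press_residuals(x, y):
    """PRESS residuals via the leverage shortcut e_i / (1 - h_i)."""
    n = len(x)
    intercept, slope = fit_line(x, y)
    mean_x = sum(x) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    residuals = []
    for xi, yi in zip(x, y):
        leverage = 1.0 / n + (xi - mean_x) ** 2 / sxx
        residuals.append((yi - (intercept + slope * xi)) / (1.0 - leverage))
    return residuals


def press_residuals_brute_force(x, y):
    """Same quantity obtained by refitting with each point left out."""
    residuals = []
    for i in range(len(x)):
        x_rest = x[:i] + x[i + 1:]
        y_rest = y[:i] + y[i + 1:]
        intercept, slope = fit_line(x_rest, y_rest)
        residuals.append(y[i] - (intercept + slope * x[i]))
    return residuals


# Hypothetical calibration-style data.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1]

shortcut = press_residuals(x, y)
brute = press_residuals_brute_force(x, y)
press_metric = stdev(shortcut)  # assumed form of the search metric
print(shortcut, press_metric)
```

A model search would compute this metric for each candidate model and keep the one with the smallest value, subject to the constraints the abstract lists.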
Cao, Qingqing; Wu, Zhenqiang; Sun, Ying; Wang, Tiezhu; Han, Tengwei; Gu, Chaomei; Sun, Yehuan
2011-11-01
To explore the application of negative binomial regression and modified Poisson regression in analyzing the factors that influence injury frequency and the risk factors that increase it. A total of 2917 primary and secondary school students were selected from Hefei by cluster random sampling and surveyed by questionnaire. Modified Poisson regression and negative binomial regression models were fitted to the injury count data. The risk factors for increased unintentional injury frequency among juvenile students were explored in order to assess the efficiency of the two models in studying the factors that influence injury frequency. A Lagrange multiplier test showed that the Poisson model was over-dispersed (P < 0.0001). The over-dispersed data were therefore fitted better by the modified Poisson regression and negative binomial regression models. Both models showed that male gender, younger age, a father working outside the hometown, a guardian educated above junior high school level, and smoking might be associated with higher injury frequencies. For injury event count data that tend to cluster, both modified Poisson regression and negative binomial regression can be used. However, based on our data, the modified Poisson regression fitted better and could give a more accurate interpretation of the factors affecting injury frequency.
Deng, Yingyuan; Wang, Tianfu; Chen, Siping; Liu, Weixiang
2017-01-01
The aim of the study is to screen the significant sonographic features by logistic regression analysis and fit a model to diagnose thyroid nodules. A total of 525 pathological thyroid nodules were retrospectively analyzed. All the nodules underwent conventional ultrasonography (US), strain elastosonography (SE), and contrast-enhanced ultrasound (CEUS). Those nodules’ 12 suspicious sonographic features were used to assess thyroid nodules. The significant features of diagnosing thyroid nodules were picked out by logistic regression analysis. All variables that were statistically related to diagnosis of thyroid nodules, at a level of p < 0.05 were embodied in a logistic regression analysis model. The significant features in the logistic regression model of diagnosing thyroid nodules were calcification, suspected cervical lymph node metastasis, hypoenhancement pattern, margin, shape, vascularity, posterior acoustic, echogenicity, and elastography score. According to the results of logistic regression analysis, the formula that could predict whether or not thyroid nodules are malignant was established. The area under the receiver operating characteristic (ROC) curve was 0.930 and the sensitivity, specificity, accuracy, positive predictive value, and negative predictive value were 83.77%, 89.56%, 87.05%, 86.04%, and 87.79% respectively. PMID:29228030
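The five performance measures quoted in the abstract above all derive from a two-by-two table of predicted against pathological diagnosis. The abstract does not report the underlying counts, so the counts below are hypothetical and will not reproduce the published percentages; the sketch only shows how each measure is defined.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard measures from true/false positive and negative counts."""
    return {
        "sensitivity": tp / (tp + fn),           # malignant nodules correctly flagged
        "specificity": tn / (tn + fp),           # benign nodules correctly cleared
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "ppv": tp / (tp + fp),                   # positive predictive value
        "npv": tn / (tn + fn),                   # negative predictive value
    }


# Hypothetical counts (not taken from the study).
metrics = diagnostic_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Sensitivity and specificity depend only on the classifier and its threshold, whereas the predictive values also depend on how common malignancy is in the sample.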
Pang, Tiantian; Huang, Leidan; Deng, Yingyuan; Wang, Tianfu; Chen, Siping; Gong, Xuehao; Liu, Weixiang
2017-01-01
The aim of the study is to screen the significant sonographic features by logistic regression analysis and fit a model to diagnose thyroid nodules. A total of 525 pathological thyroid nodules were retrospectively analyzed. All the nodules underwent conventional ultrasonography (US), strain elastosonography (SE), and contrast-enhanced ultrasound (CEUS). Those nodules' 12 suspicious sonographic features were used to assess thyroid nodules. The significant features of diagnosing thyroid nodules were picked out by logistic regression analysis. All variables that were statistically related to diagnosis of thyroid nodules, at a level of p < 0.05 were embodied in a logistic regression analysis model. The significant features in the logistic regression model of diagnosing thyroid nodules were calcification, suspected cervical lymph node metastasis, hypoenhancement pattern, margin, shape, vascularity, posterior acoustic, echogenicity, and elastography score. According to the results of logistic regression analysis, the formula that could predict whether or not thyroid nodules are malignant was established. The area under the receiver operating characteristic (ROC) curve was 0.930 and the sensitivity, specificity, accuracy, positive predictive value, and negative predictive value were 83.77%, 89.56%, 87.05%, 86.04%, and 87.79% respectively.
Bennett, Bradley C; Husby, Chad E
2008-03-28
Botanical pharmacopoeias are non-random subsets of floras, with some taxonomic groups over- or under-represented. Moerman [Moerman, D.E., 1979. Symbols and selectivity: a statistical analysis of Native American medical ethnobotany, Journal of Ethnopharmacology 1, 111-119] introduced linear regression/residual analysis to examine these patterns. However, regression, the commonly-employed analysis, suffers from several statistical flaws. We use contingency table and binomial analyses to examine patterns of Shuar medicinal plant use (from Amazonian Ecuador). We first analyzed the Shuar data using Moerman's approach, modified to better meet requirements of linear regression analysis. Second, we assessed the exact randomization contingency table test for goodness of fit. Third, we developed a binomial model to test for non-random selection of plants in individual families. Modified regression models (which accommodated assumptions of linear regression) reduced R² from 0.59 to 0.38, but did not eliminate all problems associated with regression analyses. Contingency table analyses revealed that the entire flora departs from the null model of equal proportions of medicinal plants in all families. In the binomial analysis, only 10 angiosperm families (of 115) differed significantly from the null model. These 10 families are largely responsible for patterns seen at higher taxonomic levels. Contingency table and binomial analyses offer an easy and statistically valid alternative to the regression approach.
The Precision Efficacy Analysis for Regression Sample Size Method.
ERIC Educational Resources Information Center
Brooks, Gordon P.; Barcikowski, Robert S.
The general purpose of this study was to examine the efficiency of the Precision Efficacy Analysis for Regression (PEAR) method for choosing appropriate sample sizes in regression studies used for precision. The PEAR method, which is based on the algebraic manipulation of an accepted cross-validity formula, essentially uses an effect size to…
Effect of Contact Damage on the Strength of Ceramic Materials.
1982-10-01
variables that are important to erosion, and a multivariate, linear regression analysis is used to fit the data to the dimensional analysis. The...of Equations 7 and 8 by a multivariable regression analysis (room temperature data) Exponent Regression Standard error Computed coefficient of...
Li, Chunhui; Yu, Chuanhua
2013-01-01
To provide a reference for evaluating public non-profit hospitals in the new environment of medical reform, we established a performance evaluation system for public non-profit hospitals. The new “input-output” performance model for public non-profit hospitals is based on four primary indexes (input, process, output and effect) that include 11 sub-indexes and 41 items. The indicator weights were determined using the analytic hierarchy process (AHP) and entropy weight method. The BP neural network was applied to evaluate the performance of 14 level-3 public non-profit hospitals located in Hubei Province. The most stable BP neural network was produced by comparing different numbers of neurons in the hidden layer and using the “Leave-one-out” Cross Validation method. The performance evaluation system we established for public non-profit hospitals could reflect the basic goal of the new medical health system reform in China. Compared with PLSR, the result indicated that the BP neural network could be used effectively for evaluating the performance of public non-profit hospitals. PMID:23955238
Contribution of low molecular weight phenols to bitter taste and mouthfeel properties in red wines.
Gonzalo-Diago, Ana; Dizy, Marta; Fernández-Zurbano, Purificación
2014-07-01
The aim of this study was to explore the relationship between low molecular weight compounds present in wines and their sensory contribution. Six young red wines were fractionated by gel permeation chromatography and subsequently each fraction obtained was separated from sugars and acids by solid phase extraction. Wines and both fractions were in-mouth evaluated by a trained sensory panel and UPLC-MS analyses were performed. The lack of ethanol and proanthocyanidins greatly increased the acidity perceived. The elimination of organic acids enabled the description of the samples, which were evaluated as bitter, persistent and slightly astringent. Coutaric acid and quercetin-3-O-rutinoside appear to be relevant astringent compounds in the absence of proanthocyanidins. Bitter taste was highly correlated with the in-mouth persistence. A significant predictive model for bitter taste was built by means of PLSR. Further research must be carried out to validate the sensory contribution of the compounds involved in bitterness and astringency and to verify the sensory interactions observed. Copyright © 2014 Elsevier Ltd. All rights reserved.
Common pitfalls in statistical analysis: Linear regression analysis
Aggarwal, Rakesh; Ranganathan, Priya
2017-01-01
In a previous article in this series, we explained correlation analysis which describes the strength of relationship between two continuous variables. In this article, we deal with linear regression analysis which predicts the value of one continuous variable from another. We also discuss the assumptions and pitfalls associated with this analysis. PMID:28447022
Quality of life in breast cancer patients--a quantile regression analysis.
Pourhoseingholi, Mohamad Amin; Safaee, Azadeh; Moghimi-Dehkordi, Bijan; Zeighami, Bahram; Faghihzadeh, Soghrat; Tabatabaee, Hamid Reza; Pourhoseingholi, Asma
2008-01-01
Quality of life studies have an important role in health care, especially in chronic diseases, in clinical judgment and in the supply of medical resources. Statistical tools like linear regression are widely used to assess the predictors of quality of life, but when the response is not normal the results are misleading. The aim of this study is to determine the predictors of quality of life in breast cancer patients using a quantile regression model and to compare the results to linear regression. A cross-sectional study was conducted on 119 breast cancer patients who were admitted and treated in the chemotherapy ward of Namazi hospital in Shiraz. We used the QLQ-C30 questionnaire to assess quality of life in these patients. A quantile regression was employed to assess the associated factors and the results were compared to linear regression. All analyses were carried out using SAS. The mean score for the global health status for breast cancer patients was 64.92+/-11.42. Linear regression showed that only grade of tumor, occupational status, menopausal status, financial difficulties and dyspnea were statistically significant. In contrast to linear regression, financial difficulties were not significant in the quantile regression analysis and dyspnea was only significant for the first quartile. Also, emotional functioning and duration of disease statistically predicted the QOL score in the third quartile. The results demonstrate that using quantile regression leads to better interpretation and richer inference about predictors of breast cancer patients' quality of life.
The microcomputer scientific software series 2: general linear model--regression.
Harold M. Rauscher
1983-01-01
The general linear model regression (GLMR) program provides the microcomputer user with a sophisticated regression analysis capability. The output provides a regression ANOVA table, estimators of the regression model coefficients, their confidence intervals, confidence intervals around the predicted Y-values, residuals for plotting, a check for multicollinearity, a...
USAF (United States Air Force) Stability and Control DATCOM (Data Compendium)
1978-04-01
regression analysis involves the study of a group of variables to determine their effect on a given parameter. Because of the empirical nature of this...regression analysis of mathematical statistics. In general, a regression analysis involves the study of a group of variables to determine their effect on a...Experiment, OSR TN 58-114, MIT Fluid Dynamics Research Group Rept. 57-5, 1957. (U) 90. Kennet, H., and Ashley, H.: Review of Unsteady Aerodynamic Studies in
Tokunaga, Makoto; Watanabe, Susumu; Sonoda, Shigeru
2017-09-01
Multiple linear regression analysis is often used to predict the outcome of stroke rehabilitation. However, the predictive accuracy may not be satisfactory. The objective of this study was to elucidate the predictive accuracy of a method of calculating motor Functional Independence Measure (mFIM) at discharge from mFIM effectiveness predicted by multiple regression analysis. The subjects were 505 patients with stroke who were hospitalized in a convalescent rehabilitation hospital. The formula "mFIM at discharge = mFIM effectiveness × (91 points - mFIM at admission) + mFIM at admission" was used. By including the predicted mFIM effectiveness obtained through multiple regression analysis in this formula, we obtained the predicted mFIM at discharge (A). We also used multiple regression analysis to directly predict mFIM at discharge (B). The correlation between the predicted and the measured values of mFIM at discharge was compared between A and B. The correlation coefficients were .916 for A and .878 for B. Calculating mFIM at discharge from mFIM effectiveness predicted by multiple regression analysis had a higher degree of predictive accuracy of mFIM at discharge than that directly predicted. Copyright © 2017 National Stroke Association. Published by Elsevier Inc. All rights reserved.
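The discharge formula quoted in this abstract can be written as a small function. The sketch below is illustrative only: the function names and the sample patient values are hypothetical and do not come from the study. Only the formula itself and the 91-point maximum are taken from the abstract, and the second function is simply the algebraic inverse of the first.

```python
MFIM_MAX = 91  # maximum motor FIM score, as used in the abstract's formula


def mfim_at_discharge(effectiveness: float, mfim_admission: float) -> float:
    """Convert a predicted mFIM effectiveness into a predicted mFIM at discharge."""
    return effectiveness * (MFIM_MAX - mfim_admission) + mfim_admission


def mfim_effectiveness(mfim_admission: float, mfim_discharge: float) -> float:
    """Algebraic inverse of the formula: achieved gain divided by available gain."""
    return (mfim_discharge - mfim_admission) / (MFIM_MAX - mfim_admission)


# Hypothetical patient: admitted at 41 points with a predicted effectiveness of 0.5.
predicted = mfim_at_discharge(0.5, 41)
print(predicted)
```

Because effectiveness is a fraction of the remaining headroom, a prediction made on that scale cannot exceed the 91-point ceiling as long as the predicted effectiveness stays at or below 1, which a direct regression on mFIM at discharge does not guarantee.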
John W. Edwards; Susan C. Loeb; David C. Guynn
1994-01-01
Multiple regression and use-availability analyses are two methods for examining habitat selection. Use-availability analysis is commonly used to evaluate macrohabitat selection whereas multiple regression analysis can be used to determine microhabitat selection. We compared these techniques using behavioral observations (n = 5534) and telemetry locations (n = 2089) of...
Anderson, Carl A; McRae, Allan F; Visscher, Peter M
2006-07-01
Standard quantitative trait loci (QTL) mapping techniques commonly assume that the trait is both fully observed and normally distributed. When considering survival or age-at-onset traits these assumptions are often incorrect. Methods have been developed to map QTL for survival traits; however, they are both computationally intensive and not available in standard genome analysis software packages. We propose a grouped linear regression method for the analysis of continuous survival data. Using simulation we compare this method to both the Cox and Weibull proportional hazards models and a standard linear regression method that ignores censoring. The grouped linear regression method is of equivalent power to both the Cox and Weibull proportional hazards methods and is significantly better than the standard linear regression method when censored observations are present. The method is also robust to the proportion of censored individuals and the underlying distribution of the trait. On the basis of linear regression methodology, the grouped linear regression model is computationally simple and fast and can be implemented readily in freely available statistical software.
Nie, Z Q; Ou, Y Q; Zhuang, J; Qu, Y J; Mai, J Z; Chen, J M; Liu, X Q
2016-05-01
Conditional and unconditional logistic regression analyses are commonly used in case-control studies, whereas the Cox proportional hazards model is often used in survival data analysis. Most of the literature refers only to the main-effect model; however, the generalized linear model differs from the general linear model, and interaction comprises multiplicative interaction and additive interaction. The former has only statistical significance, but the latter has biological significance. In this paper, macros were written in SAS 9.4 to calculate the contrast ratio, the attributable proportion due to interaction and the synergy index together with the interaction terms of logistic and Cox regression, and Wald, delta and profile likelihood confidence intervals were used to evaluate additive interaction, as a reference for big data analysis in clinical epidemiology and for the analysis of genetic multiplicative and additive interactions.
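The three additive-interaction measures named in this abstract have standard textbook definitions, sketched below in Python. This is not the authors' SAS macro. It assumes that the "contrast ratio" refers to the interaction contrast ratio (also called the relative excess risk due to interaction, RERI), it takes relative risks (or odds or hazard ratios used as approximations) as inputs, and it computes point estimates only, without the Wald, delta or profile likelihood intervals. The function name and the input values are hypothetical.

```python
def additive_interaction(rr10: float, rr01: float, rr11: float) -> dict:
    """Point estimates of additive interaction between two binary exposures.

    rr10: relative risk for exposure A only (reference: neither exposure)
    rr01: relative risk for exposure B only
    rr11: relative risk for both exposures together
    """
    reri = rr11 - rr10 - rr01 + 1                 # interaction contrast ratio
    ap = reri / rr11                              # attributable proportion due to interaction
    s = (rr11 - 1) / ((rr10 - 1) + (rr01 - 1))    # synergy index
    return {"RERI": reri, "AP": ap, "S": s}


# Hypothetical relative risks, for illustration only.
result = additive_interaction(rr10=2.0, rr01=3.0, rr11=8.0)
print(result)
```

Under these definitions, the absence of additive interaction corresponds to RERI = 0, AP = 0 and S = 1.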
Prediction by regression and intrarange data scatter in surface-process studies
Toy, T.J.; Osterkamp, W.R.; Renard, K.G.
1993-01-01
Modeling is a major component of contemporary earth science, and regression analysis occupies a central position in the parameterization, calibration, and validation of geomorphic and hydrologic models. Although this methodology can be used in many ways, we are primarily concerned with the prediction of values for one variable from another variable. Examination of the literature reveals considerable inconsistency in the presentation of the results of regression analysis and the occurrence of patterns in the scatter of data points about the regression line. Both circumstances confound utilization and evaluation of the models. Statisticians are well aware of various problems associated with the use of regression analysis and offer improved practices; often, however, their guidelines are not followed. After a review of the aforementioned circumstances and until standard criteria for model evaluation become established, we recommend, as a minimum, inclusion of scatter diagrams, the standard error of the estimate, and sample size in reporting the results of regression analyses for most surface-process studies. © 1993 Springer-Verlag.
Quantile regression for the statistical analysis of immunological data with many non-detects.
Eilers, Paul H C; Röder, Esther; Savelkoul, Huub F J; van Wijk, Roy Gerth
2012-07-07
Immunological parameters are hard to measure. A well-known problem is the occurrence of values below the detection limit, the non-detects. Non-detects are a nuisance, because classical statistical analyses, like ANOVA and regression, cannot be applied. The more advanced statistical techniques currently available for the analysis of datasets with non-detects can only be used if a small percentage of the data are non-detects. Quantile regression, a generalization of percentiles to regression models, models the median or higher percentiles and tolerates very high numbers of non-detects. We present a non-technical introduction and illustrate it with an implementation to real data from a clinical trial. We show that by using quantile regression, groups can be compared and that meaningful linear trends can be computed, even if more than half of the data consists of non-detects. Quantile regression is a valuable addition to the statistical methods that can be used for the analysis of immunological datasets with non-detects.
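The reason a quantile-based analysis tolerates non-detects can be seen with a plain median, without any regression. The sketch below uses made-up measurements and a hypothetical detection limit; it is not the authors' implementation. It shows that, as long as the quantile of interest lies above the block of non-detects, the value substituted for them does not change the quantile, whereas it does change the mean.

```python
import statistics

DETECTION_LIMIT = 1.0  # hypothetical detection limit

# Made-up sample; None marks a non-detect (true value somewhere below the limit).
measured = [None, None, None, 1.2, 1.5, 2.0, 3.1]


def fill_nondetects(values, fill):
    """Replace every non-detect by an arbitrary substitute below the detection limit."""
    return [fill if v is None else v for v in values]


filled_zero = fill_nondetects(measured, 0.0)
filled_half = fill_nondetects(measured, DETECTION_LIMIT / 2)

# The median ignores how the non-detects are filled in ...
median_zero = statistics.median(filled_zero)
median_half = statistics.median(filled_half)

# ... while the mean depends on the arbitrary substitute.
mean_zero = statistics.mean(filled_zero)
mean_half = statistics.mean(filled_half)

print(median_zero, median_half, mean_zero, mean_half)
```

Here three of seven values are non-detects, so the median is still a measured value. With a larger share of non-detects the same argument applies to a higher percentile, which is consistent with the abstract's statement that the median or higher percentiles are modelled.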
NASA Astrophysics Data System (ADS)
Nishidate, Izumi; Wiswadarma, Aditya; Hase, Yota; Tanaka, Noriyuki; Maeda, Takaaki; Niizeki, Kyuichi; Aizu, Yoshihisa
2011-08-01
In order to visualize melanin and blood concentrations and oxygen saturation in human skin tissue, a simple imaging technique based on multispectral diffuse reflectance images acquired at six wavelengths (500, 520, 540, 560, 580 and 600nm) was developed. The technique utilizes multiple regression analysis aided by Monte Carlo simulation for diffuse reflectance spectra. Using the absorbance spectrum as a response variable and the extinction coefficients of melanin, oxygenated hemoglobin, and deoxygenated hemoglobin as predictor variables, multiple regression analysis provides regression coefficients. Concentrations of melanin and total blood are then determined from the regression coefficients using conversion vectors that are deduced numerically in advance, while oxygen saturation is obtained directly from the regression coefficients. Experiments with a tissue-like agar gel phantom validated the method. In vivo experiments with human skin of the human hand during upper limb occlusion and of the inner forearm exposed to UV irradiation demonstrated the ability of the method to evaluate physiological reactions of human skin tissue.
CADDIS Volume 4. Data Analysis: PECBO Appendix - R Scripts for Non-Parametric Regressions
Script for computing nonparametric regression analysis. Overview of using scripts to infer environmental conditions from biological observations, statistically estimating species-environment relationships, statistical scripts.
Kovalska, M P; Bürki, E; Schoetzau, A; Orguel, S F; Orguel, S; Grieshaber, M C
2011-04-01
The distinction of real progression from test variability in visual field (VF) series may be based on clinical judgment, on trend analysis based on follow-up of test parameters over time, or on identification of a significant change related to the mean of baseline exams (event analysis). The aim of this study was to compare a new population-based method (Octopus field analysis, OFA) with classic regression analyses and clinical judgment for detecting glaucomatous VF changes. 240 VF series of 240 patients with at least 9 consecutive examinations available were included into this study. They were independently classified by two experienced investigators. The results of such a classification served as a reference for comparison for the following statistical tests: (a) t-test global, (b) r-test global, (c) regression analysis of 10 VF clusters and (d) point-wise linear regression analysis. 32.5 % of the VF series were classified as progressive by the investigators. The sensitivity and specificity were 89.7 % and 92.0 % for r-test, and 73.1 % and 93.8 % for the t-test, respectively. In the point-wise linear regression analysis, the specificity was comparable (89.5 % versus 92 %), but the sensitivity was clearly lower than in the r-test (22.4 % versus 89.7 %) at a significance level of p = 0.01. A regression analysis for the 10 VF clusters showed a markedly higher sensitivity for the r-test (37.7 %) than the t-test (14.1 %) at a similar specificity (88.3 % versus 93.8 %) for a significant trend (p = 0.005). In regard to the cluster distribution, the paracentral clusters and the superior nasal hemifield progressed most frequently. The population-based regression analysis seems to be superior to the trend analysis in detecting VF progression in glaucoma, and may eliminate the drawbacks of the event analysis. 
Further, it may assist the clinician in the evaluation of VF series and may allow better visualization of the correlation between function and structure owing to VF clusters. © Georg Thieme Verlag KG Stuttgart · New York.
ERIC Educational Resources Information Center
Berenson, Mark L.
2013-01-01
There is consensus in the statistical literature that severe departures from its assumptions invalidate the use of regression modeling for purposes of inference. The assumptions of regression modeling are usually evaluated subjectively through visual, graphic displays in a residual analysis but such an approach, taken alone, may be insufficient…
L.R. Grosenbaugh
1967-01-01
Describes an expansible computerized system that provides data needed in regression or covariance analysis of as many as 50 variables, 8 of which may be dependent. Alternatively, it can screen variously generated combinations of independent variables to find the regression with the smallest mean-squared-residual, which will be fitted if desired. The user can easily...
Karabatsos, George
2017-02-01
Most of applied statistics involves regression analysis of data. In practice, it is important to specify a regression model that has minimal assumptions which are not violated by data, to ensure that statistical inferences from the model are informative and not misleading. This paper presents a stand-alone and menu-driven software package, Bayesian Regression: Nonparametric and Parametric Models, constructed from MATLAB Compiler. Currently, this package gives the user a choice from 83 Bayesian models for data analysis. They include 47 Bayesian nonparametric (BNP) infinite-mixture regression models; 5 BNP infinite-mixture models for density estimation; and 31 normal random effects models (HLMs), including normal linear models. Each of the 78 regression models handles either a continuous, binary, or ordinal dependent variable, and can handle multi-level (grouped) data. All 83 Bayesian models can handle the analysis of weighted observations (e.g., for meta-analysis), and the analysis of left-censored, right-censored, and/or interval-censored data. Each BNP infinite-mixture model has a mixture distribution assigned one of various BNP prior distributions, including priors defined by either the Dirichlet process, Pitman-Yor process (including the normalized stable process), beta (two-parameter) process, normalized inverse-Gaussian process, geometric weights prior, dependent Dirichlet process, or the dependent infinite-probits prior. The software user can mouse-click to select a Bayesian model and perform data analysis via Markov chain Monte Carlo (MCMC) sampling. After the sampling completes, the software automatically opens text output that reports MCMC-based estimates of the model's posterior distribution and model predictive fit to the data. Additional text and/or graphical output can be generated by mouse-clicking other menu options. 
This includes output of MCMC convergence analyses, and estimates of the model's posterior predictive distribution, for selected functionals and values of covariates. The software is illustrated through the BNP regression analysis of real data.
Applications of statistics to medical science, III. Correlation and regression.
Watanabe, Hiroshi
2012-01-01
In this third part of a series surveying medical statistics, the concepts of correlation and regression are reviewed. In particular, methods of linear regression and logistic regression are discussed. Arguments related to survival analysis will be made in a subsequent paper.
Optimizing methods for linking cinematic features to fMRI data.
Kauttonen, Janne; Hlushchuk, Yevhen; Tikka, Pia
2015-04-15
One of the challenges of naturalistic neurosciences using movie-viewing experiments is how to interpret observed brain activations in relation to the multiplicity of time-locked stimulus features. As previous studies have shown less inter-subject synchronization across viewers of random video footage than story-driven films, new methods need to be developed for analysis of less story-driven contents. To optimize the linkage between our fMRI data collected during viewing of a deliberately non-narrative silent film 'At Land' by Maya Deren (1944) and its annotated content, we combined the method of elastic-net regularization with the model-driven linear regression and the well-established data-driven independent component analysis (ICA) and inter-subject correlation (ISC) methods. In the linear regression analysis, both IC and region-of-interest (ROI) time-series were fitted with time-series of a total of 36 binary-valued and one real-valued tactile annotation of film features. The elastic-net regularization and cross-validation were applied in the ordinary least-squares linear regression in order to avoid over-fitting due to the multicollinearity of regressors, the results were compared against both the partial least-squares (PLS) regression and the un-regularized full-model regression. Non-parametric permutation testing scheme was applied to evaluate the statistical significance of regression. We found statistically significant correlation between the annotation model and 9 ICs out of 40 ICs. Regression analysis was also repeated for a large set of cubic ROIs covering the grey matter. Both IC- and ROI-based regression analyses revealed activations in parietal and occipital regions, with additional smaller clusters in the frontal lobe. Furthermore, we found elastic-net based regression more sensitive than PLS and un-regularized regression since it detected a larger number of significant ICs and ROIs. 
Along with the ISC ranking methods, our regression analysis proved a feasible method for ordering the ICs based on their functional relevance to the annotated cinematic features. The novelty of our method is - in comparison to the hypothesis-driven manual pre-selection and observation of some individual regressors biased by choice - in applying data-driven approach to all content features simultaneously. We found especially the combination of regularized regression and ICA useful when analyzing fMRI data obtained using non-narrative movie stimulus with a large set of complex and correlated features. Copyright © 2015. Published by Elsevier Inc.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Seong W. Lee
During this reporting period, the literature survey, covering gasifier temperature measurement, ultrasonic application and its background in cleaning applications, and the spray coating process, was completed. The gasifier simulator (cold model) testing has been successfully conducted. Four factors (blower voltage, ultrasonic application, injection time intervals, particle weight) were considered as significant factors that affect the temperature measurement. Analysis of variance (ANOVA) was applied to analyze the test data. The analysis shows that all four factors are significant to the temperature measurements in the gasifier simulator (cold model). The regression analysis for the case with the normalized room temperature shows that the linear model fits the temperature data with 82% accuracy (18% error). The regression analysis for the case without the normalized room temperature shows 72.5% accuracy (27.5% error). The nonlinear regression analysis indicates a better fit than that of the linear regression. The nonlinear regression model's accuracy is 88.7% (11.3% error) for the normalized room temperature case, which is better than the linear regression analysis. The hot model thermocouple sleeve design and fabrication are completed. The gasifier simulator (hot model) design and fabrication are completed. The system tests of the gasifier simulator (hot model) have been conducted and some modifications have been made. Based on the system tests and results analysis, the gasifier simulator (hot model) has met the proposed design requirement and is ready for system test. The ultrasonic cleaning method is under evaluation and will be further studied for the gasifier simulator (hot model) application. The progress of this project has been on schedule.
The Economic Value of Mangroves: A Meta-Analysis
Marwa Salem; D. Evan Mercer
2012-01-01
This paper presents a synthesis of the mangrove ecosystem valuation literature through a meta-regression analysis. The main contribution of this study is that it is the first meta-analysis focusing solely on mangrove forests, whereas previous studies have included different types of wetlands. The number of studies included in the regression analysis is 44 for a total...
2014-01-01
Background: Support vector regression (SVR) and Gaussian process regression (GPR) were used for the analysis of electroanalytical experimental data to estimate diffusion coefficients. Results: For simulated cyclic voltammograms based on the EC, Eqr, and EqrC mechanisms these regression algorithms in combination with nonlinear kernel/covariance functions yielded diffusion coefficients with higher accuracy as compared to the standard approach of calculating diffusion coefficients relying on the Nicholson-Shain equation. The level of accuracy achieved by SVR and GPR is virtually independent of the rate constants governing the respective reaction steps. Further, the reduction of high-dimensional voltammetric signals by manual selection of typical voltammetric peak features decreased the performance of both regression algorithms compared to a reduction by downsampling or principal component analysis. After training on simulated data sets, diffusion coefficients were estimated by the regression algorithms for experimental data comprising voltammetric signals for three organometallic complexes. Conclusions: Estimated diffusion coefficients closely matched the values determined by the parameter fitting method, but reduced the required computational time considerably for one of the reaction mechanisms. The automated processing of voltammograms according to the regression algorithms yields better results than the conventional analysis of peak-related data. PMID:24987463
Zhang, Chao; Jia, Pengli; Yu, Liu; Xu, Chang
2018-05-01
Dose-response meta-analysis (DRMA) is widely applied to investigate the dose-specific relationship between independent and dependent variables. Such methods have been in use for over 30 years and are increasingly employed in healthcare and clinical decision-making. In this article, we give an overview of the methodology used in DRMA. We summarize the commonly used regression model and the pooled method in DRMA. We also use an example to illustrate how to employ a DRMA by these methods. Five regression models, linear regression, piecewise regression, natural polynomial regression, fractional polynomial regression, and restricted cubic spline regression, were illustrated in this article to fit the dose-response relationship. And two types of pooling approaches, that is, one-stage approach and two-stage approach are illustrated to pool the dose-response relationship across studies. The example showed similar results among these models. Several dose-response meta-analysis methods can be used for investigating the relationship between exposure level and the risk of an outcome. However the methodology of DRMA still needs to be improved. © 2018 Chinese Cochrane Center, West China Hospital of Sichuan University and John Wiley & Sons Australia, Ltd.
Suzuki, Taku; Iwamoto, Takuji; Shizu, Kanae; Suzuki, Katsuji; Yamada, Harumoto; Sato, Kazuki
2017-05-01
This retrospective study was designed to investigate prognostic factors for postoperative outcomes for cubital tunnel syndrome (CubTS) using multiple logistic regression analysis with a large number of patients. Eighty-three patients with CubTS who underwent surgeries were enrolled. The following potential prognostic factors for disease severity were selected according to previous reports: sex, age, type of surgery, disease duration, body mass index, cervical lesion, presence of diabetes mellitus, Workers' Compensation status, preoperative severity, and preoperative electrodiagnostic testing. Postoperative severity of disease was assessed 2 years after surgery by Messina's criteria which is an outcome measure specifically for CubTS. Bivariate analysis was performed to select candidate prognostic factors for multiple linear regression analyses. Multiple logistic regression analysis was conducted to identify the association between postoperative severity and selected prognostic factors. Both bivariate and multiple linear regression analysis revealed only preoperative severity as an independent risk factor for poor prognosis, while other factors did not show any significant association. Although conflicting results exist regarding prognosis of CubTS, this study supports evidence from previous studies and concludes early surgical intervention portends the most favorable prognosis. Copyright © 2017 The Japanese Orthopaedic Association. Published by Elsevier B.V. All rights reserved.
Brunetti, Natale Daniele; Santoro, Francesco; De Gennaro, Luisa; Correale, Michele; Gaglione, Antonio; Di Biase, Matteo
2016-07-01
In a recent paper, Singh et al. analyzed the effect of drug treatment on recurrence of takotsubo cardiomyopathy (TTC) in a comprehensive meta-analysis. The study found that recurrence rates were independent of clinical utilization of BB prescription, but inversely correlated with ACEi/ARB prescription; the authors therefore conclude that ACEi/ARB rather than BB may reduce the risk of recurrence. We aimed to re-analyze the data reported in the study, now weighted for population size, in a meta-regression analysis. After multiple meta-regression analysis, we found a significant regression between rates of prescription of ACEi and rates of recurrence of TTC; regression was not statistically significant for BBs. On the basis of our re-analysis, we confirm that rates of recurrence of TTC are lower in populations of patients with higher rates of treatment with ACEi/ARB. This does not necessarily imply that ACEi may prevent recurrence of TTC, but merely that, for example, rates of recurrence are lower in cohorts more compliant with therapy or more often prescribed ACEi because they are more carefully followed. Randomized prospective studies are surely warranted. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
A general framework for the use of logistic regression models in meta-analysis.
Simmonds, Mark C; Higgins, Julian Pt
2016-12-01
Where individual participant data are available for every randomised trial in a meta-analysis of dichotomous event outcomes, "one-stage" random-effects logistic regression models have been proposed as a way to analyse these data. Such models can also be used even when individual participant data are not available and we have only summary contingency table data. One benefit of this one-stage regression model over conventional meta-analysis methods is that it maximises the correct binomial likelihood for the data and so does not require the common assumption that effect estimates are normally distributed. A second benefit of using this model is that it may be applied, with only minor modification, in a range of meta-analytic scenarios, including meta-regression, network meta-analyses and meta-analyses of diagnostic test accuracy. This single model can potentially replace the variety of often complex methods used in these areas. This paper considers, with a range of meta-analysis examples, how random-effects logistic regression models may be used in a number of different types of meta-analyses. This one-stage approach is compared with widely used meta-analysis methods including Bayesian network meta-analysis and the bivariate and hierarchical summary receiver operating characteristic (ROC) models for meta-analyses of diagnostic test accuracy. © The Author(s) 2014.
Examination of influential observations in penalized spline regression
NASA Astrophysics Data System (ADS)
Türkan, Semra
2013-10-01
In parametric or nonparametric regression models, the results of regression analysis are affected by some anomalous observations in the data set. Thus, detection of these observations is one of the major steps in regression analysis. These observations are precisely detected by well-known influence measures. Pena's statistic is one of them. In this study, Pena's approach is formulated for penalized spline regression in terms of ordinary residuals and leverages. Real data and artificial data are used to illustrate the effectiveness of Pena's statistic relative to Cook's distance in detecting influential observations. The results of the study clearly reveal that the proposed measure is superior to Cook's distance for detecting these observations in large data sets.
Robust analysis of trends in noisy tokamak confinement data using geodesic least squares regression
DOE Office of Scientific and Technical Information (OSTI.GOV)
Verdoolaege, G., E-mail: geert.verdoolaege@ugent.be; Laboratory for Plasma Physics, Royal Military Academy, B-1000 Brussels; Shabbir, A.
Regression analysis is a very common activity in fusion science for unveiling trends and parametric dependencies, but it can be a difficult matter. We have recently developed the method of geodesic least squares (GLS) regression that is able to handle errors in all variables, is robust against data outliers and uncertainty in the regression model, and can be used with arbitrary distribution models and regression functions. We here report on first results of application of GLS to estimation of the multi-machine scaling law for the energy confinement time in tokamaks, demonstrating improved consistency of the GLS results compared to standard least squares.
Retro-regression--another important multivariate regression improvement.
Randić, M
2001-01-01
We review the serious problem associated with instabilities of the coefficients of regression equations, referred to as the MRA (multivariate regression analysis) "nightmare of the first kind". This is manifested when in a stepwise regression a descriptor is included or excluded from a regression. The consequence is an unpredictable change of the coefficients of the descriptors that remain in the regression equation. We follow with consideration of an even more serious problem, referred to as the MRA "nightmare of the second kind", arising when optimal descriptors are selected from a large pool of descriptors. This process typically causes at different steps of the stepwise regression a replacement of several previously used descriptors by new ones. We describe a procedure that resolves these difficulties. The approach is illustrated on boiling points of nonanes which are considered (1) by using an ordered connectivity basis; (2) by using an ordering resulting from application of greedy algorithm; and (3) by using an ordering derived from an exhaustive search for optimal descriptors. A novel variant of multiple regression analysis, called retro-regression (RR), is outlined showing how it resolves the ambiguities associated with both "nightmares" of the first and the second kind of MRA.
Quantitative detection of settled coal dust over green canopy
NASA Astrophysics Data System (ADS)
Brook, Anna; Sahar, Nir
2017-04-01
The main task of environmental and geoscience applications is efficient and accurate quantitative classification of earth surfaces and spatial phenomena. In the past decade, there has been significant interest in employing spectral unmixing to retrieve accurate quantitative information latent in in situ data. Recently, ground-truth and laboratory-measured spectral signatures, promoted by advanced algorithms, have been proposed as a new path toward solving the unmixing problem in a semi-supervised fashion. This study presents a practical implementation of field spectroscopy as a quantitative tool to detect settled coal dust over green canopy in a free/open environment. Coal dust is a fine powdered form of coal, which is created by the crushing, grinding, and pulverizing of coal. Owing to the inelastic nature of coal, coal dust can be created during transportation or by mechanical handling of coal. Coal dust, categorized at silt-clay particle size, is of particular concern due to heavy metals (lead, mercury, nickel, tin, cadmium, antimony, arsenic, isotopes of thorium and strontium) which are toxic even at low concentrations. This hazard poses a risk to both the environment and public health. It has been identified by medical scientists around the world as causing a range of diseases and health problems, mainly heart and respiratory diseases like asthma and lung cancer. This is because the fine invisible coal dust particles (less than 2.5 microns) lodge in the lungs for long periods and are not naturally expelled, so long-term exposure increases the risk of health problems. Numerous studies have reported that the data needed to study the geographic distribution of very fine coal dust (smaller than PM 2.5) and related health impacts from coal exports are not being collected. Sediment dust load in an indoor environment can be spectrally assessed using reflectance spectroscopy (Chudnovsky and Ben-Dor, 2009).
Small amounts of particulate pollution that may carry a signature of a forthcoming environmental hazard are of key interest when considering the effects of pollution. According to the most basic distribution dynamics, dust consists of suspended particulate matter in a fine state of subdivision that is raised and carried by wind. In this context, it is increasingly important first to understand the distribution dynamics of pollutants, and subsequently to develop dedicated tools and measures to control and monitor pollutants in the free environment. The earliest effect of settled polluted dust particles is not always reflected through poor conditions of vegetation or soils, or any visible damage. In most cases, there is a long accumulation process that progresses from a polluted condition to a long-term environmental and health-related hazard. Although experiments conducted with pollutant analog powders under controlled conditions have tended to confirm the findings from field studies (Brook, 2014; Brook and Ben-Dor 2016; Brook, 2016), a major criticism of all these experiments is their short duration. The resulting conclusion is that it is difficult, if not impossible, to determine the implications of long-term exposure to realistic concentrations of pollutants from such short-term studies. In general, the task of unmixing is to decompose the reflectance spectrum into a set of endmembers or principal combined spectra and their corresponding abundances (Bioucas-Dias et al., 2012). This study suggests that the sensitivity of sparse unmixing techniques provides an ideal approach to extract and identify coal dust settled over/upon green vegetation canopy using in situ spectral data collected by portable spectrometer. The optimal NMF algorithms, such as ALS and LPG, are assumed to be the simplest methods that achieve the minimum error. The suggested practical approach includes the following stages: 1. In situ spectral measurements, 2. Near-real-time spectral data analysis, 3. Estimated concentration of coal dust reported as mg/sq m. Stage 2 is completed by: 1. Unmixing the green canopy and the settled dust to extract only the coal dust fraction, 2. Converting the spectral feature of coal dust to concentration via a PLSR spectral model. The spectral model was a PLSR model trained and validated in the laboratory using spectra across the MIR (FTIR reflectance spectra) and NIR regions and XRD analysis. The obtained RMSE was satisfactory for both spectral regions. Thus, it was concluded that field spectroscopy can be used for this purpose, and that it can provide fully quantitative measures of settled coal dust. This approach (both spectrometer and algorithm) has now been accepted as a practical operational tool for environmental monitoring near the Orot Rabin power station in Hadera and will be used by the Sharon-Carmel Districts Municipal Association for Environmental Protection, Israel, as a regulatory tool. In summary, this work shows that coal dust can be assessed using in situ spectroscopy, making it a potentially powerful tool for environmental studies.
References: Chudnovsky, A., & Ben-Dor, E. (2009). Reflectance spectroscopy as a tool for settled dust monitoring in office environment. International Journal of Environment and Waste Management, 4(1), 32-49. Brook, A. (2014). Quantitative Detection of Settled dust over Green Canopy using Sparse Unmixing of Airborne Hyperspectral Data. IEEE-Whispers 6th Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing, 2014, Switzerland, 4-8. Brook, A., & Ben-Dor, E. (2016). Quantitative detection of settled dust over Green Canopy using sparse unmixing of airborne hyperspectral data. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 9(2), 884-897. Brook, A. (2016). Quantitative Detection and Long-Term Monitoring of Settle Dust Using Semisupervised Learning for Spectral Data. Water, Air, & Soil Pollution, 227(3), 1-9. Bioucas-Dias, J.M., Plaza, A., Dobigeon, N., Parente, M., Du, Q., Gader, P., & Chanussot, J. (2012). Hyperspectral unmixing overview: Geometrical, statistical, and sparse regression-based approaches. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 5(2), 354-379. Keshava, N., & Mustard, J. (2002). Spectral unmixing. IEEE Signal Processing Magazine, 19(1), 44-57.
Regression analysis for solving diagnosis problem of children's health
NASA Astrophysics Data System (ADS)
Cherkashina, Yu A.; Gerget, O. M.
2016-04-01
The paper presents the results of research on applying statistical techniques, namely regression analysis, to assess the health status of children in the neonatal period based on medical data (hemostatic parameters, blood test parameters, gestational age, vascular-endothelial growth factor) measured at 3-5 days of life. A detailed description of the studied medical data is given, and a binary logistic regression procedure is discussed. Basic results of the research are presented. A classification table of predicted and observed values is shown, and the overall percentage of correct recognition is determined. Regression equation coefficients are calculated, and the general regression equation is written based on them. Based on the results of logistic regression, ROC analysis was performed: the sensitivity and specificity of the model are calculated and ROC curves are constructed. These mathematical techniques allow the health of children to be diagnosed with a high quality of recognition. The results make a significant contribution to the development of evidence-based medicine and have high practical importance in the professional activity of the author.
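The ROC quantities named in the abstract above can be illustrated with a minimal sketch. It assumes a fitted logistic model has already produced predicted probabilities; the labels and scores below are invented for illustration and are not the study's data, and the function names `confusion_rates` and `roc_auc` are hypothetical.

```python
def confusion_rates(y_true, scores, threshold=0.5):
    """Return (sensitivity, specificity) at a given probability cutoff."""
    tp = sum(1 for y, s in zip(y_true, scores) if y == 1 and s >= threshold)
    fn = sum(1 for y, s in zip(y_true, scores) if y == 1 and s < threshold)
    tn = sum(1 for y, s in zip(y_true, scores) if y == 0 and s < threshold)
    fp = sum(1 for y, s in zip(y_true, scores) if y == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """Area under the ROC curve, computed as the probability that a random
    positive case scores higher than a random negative case (ties count 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented example: 1 = condition present, 0 = absent.
y_true = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.55, 0.4, 0.3, 0.2, 0.1]

sens, spec = confusion_rates(y_true, scores)
auc = roc_auc(y_true, scores)
print(sens, spec, auc)
```

Sweeping `threshold` from 0 to 1 and plotting sensitivity against 1 - specificity traces the ROC curve itself; the pairwise-comparison form of the AUC is used here only because it needs no plotting or integration.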
Regression analysis using dependent Polya trees.
Schörgendorfer, Angela; Branscum, Adam J
2013-11-30
Many commonly used models for linear regression analysis force overly simplistic shape and scale constraints on the residual structure of data. We propose a semiparametric Bayesian model for regression analysis that produces data-driven inference by using a new type of dependent Polya tree prior to model arbitrary residual distributions that are allowed to evolve across increasing levels of an ordinal covariate (e.g., time, in repeated measurement studies). By modeling residual distributions at consecutive covariate levels or time points using separate, but dependent Polya tree priors, distributional information is pooled while allowing for broad pliability to accommodate many types of changing residual distributions. We can use the proposed dependent residual structure in a wide range of regression settings, including fixed-effects and mixed-effects linear and nonlinear models for cross-sectional, prospective, and repeated measurement data. A simulation study illustrates the flexibility of our novel semiparametric regression model to accurately capture evolving residual distributions. In an application to immune development data on immunoglobulin G antibodies in children, our new model outperforms several contemporary semiparametric regression models based on a predictive model selection criterion. Copyright © 2013 John Wiley & Sons, Ltd.
A comparison of methods for the analysis of binomial clustered outcomes in behavioral research.
Ferrari, Alberto; Comelli, Mario
2016-12-01
In behavioral research, data consisting of a per-subject proportion of "successes" and "failures" over a finite number of trials often arise. These clustered binary data are usually non-normally distributed, which can distort inference if the usual general linear model is applied and sample size is small. A number of more advanced methods are available, but they are often technically challenging and a comparative assessment of their performances in behavioral setups has not been performed. We studied the performances of some methods applicable to the analysis of proportions, namely linear regression, Poisson regression, beta-binomial regression and Generalized Linear Mixed Models (GLMMs). We report on a simulation study evaluating power and Type I error rate of these models in hypothetical scenarios met by behavioral researchers; in addition, we describe results from the application of these methods to data from real experiments. Our results show that, while GLMMs are powerful instruments for the analysis of clustered binary outcomes, beta-binomial regression can outperform them in a range of scenarios. Linear regression gave results consistent with the nominal level of significance, but was overall less powerful. Poisson regression, instead, mostly led to anticonservative inference. GLMMs and beta-binomial regression are generally more powerful than linear regression; yet linear regression is robust to model misspecification in some conditions, whereas Poisson regression suffers heavily from violations of the assumptions when used to model proportion data. We conclude by providing directions to behavioral scientists dealing with clustered binary data and small sample sizes. Copyright © 2016 Elsevier B.V. All rights reserved.
Regression analysis of informative current status data with the additive hazards model.
Zhao, Shishun; Hu, Tao; Ma, Ling; Wang, Peijie; Sun, Jianguo
2015-04-01
This paper discusses regression analysis of current status failure time data arising from the additive hazards model in the presence of informative censoring. Many methods have been developed for regression analysis of current status data under various regression models if the censoring is noninformative, and there also exists a large literature on parametric analysis of informative current status data in the context of tumorigenicity experiments. In this paper, a semiparametric maximum likelihood estimation procedure is presented in which the copula model is employed to describe the relationship between the failure time of interest and the censoring time. Furthermore, I-splines are used to approximate the nonparametric functions involved, and the asymptotic consistency and normality of the proposed estimators are established. A simulation study is conducted and indicates that the proposed approach works well for practical situations. An illustrative example is also provided.
Comparison of cranial sex determination by discriminant analysis and logistic regression.
Amores-Ampuero, Anabel; Alemán, Inmaculada
2016-04-05
Various methods have been proposed for estimating dimorphism. The objective of this study was to compare sex determination results from cranial measurements using discriminant analysis or logistic regression. The study sample comprised 130 individuals (70 males) of known sex, age, and cause of death from San José cemetery in Granada (Spain). Measurements of 19 neurocranial dimensions and 11 splanchnocranial dimensions were subjected to discriminant analysis and logistic regression, and the percentages of correct classification were compared between the sex functions obtained with each method. The discriminant capacity of the selected variables was evaluated with a cross-validation procedure. The percentage accuracy with discriminant analysis was 78.2% for the neurocranium (82.4% in females and 74.6% in males) and 73.7% for the splanchnocranium (79.6% in females and 68.8% in males). These percentages were higher with logistic regression analysis: 85.7% for the neurocranium (in both sexes) and 94.1% for the splanchnocranium (100% in females and 91.7% in males).
Building Regression Models: The Importance of Graphics.
ERIC Educational Resources Information Center
Dunn, Richard
1989-01-01
Points out reasons for using graphical methods to teach simple and multiple regression analysis. Argues that a graphically oriented approach has considerable pedagogic advantages in the exposition of simple and multiple regression. Shows that graphical methods may play a central role in the process of building regression models. (Author/LS)
Testing Different Model Building Procedures Using Multiple Regression.
ERIC Educational Resources Information Center
Thayer, Jerome D.
The stepwise regression method of selecting predictors for computer assisted multiple regression analysis was compared with forward, backward, and best subsets regression, using 16 data sets. The results indicated the stepwise method was preferred because of its practical nature, when the models chosen by different selection methods were similar…
Yu, Xiaojin; Liu, Pei; Min, Jie; Chen, Qiguang
2009-01-01
To explore the application of regression on order statistics (ROS) in estimating nondetects for food exposure assessment. Regression on order statistics was applied to a cadmium residue data set from global food contaminant monitoring; the mean residue was estimated using SAS programming and compared with the results from substitution methods. The results show that the ROS method clearly performs better than substitution methods, being robust and convenient for subsequent analysis. Regression on order statistics is worth adopting, but more effort should be made on the details of applying this method.
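The idea behind ROS can be sketched as follows. This is a deliberately simplified version, not the SAS procedure used in the study: it assumes a single detection limit, a lognormal concentration distribution, and Blom plotting positions, whereas full ROS implementations also handle multiple detection limits. The function name `ros_mean` and the concentration values are hypothetical.

```python
import math
from statistics import NormalDist, mean


def ros_mean(detects, n_nondetects):
    """Simplified ROS estimate of the mean with one detection limit.

    detects: measured values above the detection limit.
    n_nondetects: number of results reported only as "below the limit".
    Returns (estimated mean, list of imputed values for the nondetects).
    """
    detects = sorted(detects)
    n = len(detects) + n_nondetects
    # Blom plotting positions for ranks 1..n, mapped to normal quantiles.
    z = [NormalDist().inv_cdf((i - 0.375) / (n + 0.25)) for i in range(1, n + 1)]
    # With a single limit, the nondetects occupy the lowest ranks.
    z_cens, z_det = z[:n_nondetects], z[n_nondetects:]
    logs = [math.log(v) for v in detects]
    # Ordinary least squares of log-concentration on normal quantiles,
    # fitted to the detected values only.
    z_bar, log_bar = mean(z_det), mean(logs)
    slope = (sum((a - z_bar) * (b - log_bar) for a, b in zip(z_det, logs))
             / sum((a - z_bar) ** 2 for a in z_det))
    intercept = log_bar - slope * z_bar
    # Impute the nondetects from the fitted line, then average everything.
    imputed = [math.exp(intercept + slope * q) for q in z_cens]
    return mean(imputed + detects), imputed


# Invented residue concentrations (mg/kg); three results were nondetects.
detects = [0.012, 0.020, 0.035, 0.050, 0.090]
est_mean, imputed = ros_mean(detects, 3)
print(est_mean, imputed)
```

Substitution methods would instead replace every nondetect with one fixed value (for example zero, half the limit, or the limit itself). The sketch shows why ROS is attractive by comparison: each nondetect receives its own imputed value consistent with the distribution of the detected results.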
Ebrahimzadeh, Farzad; Hajizadeh, Ebrahim; Vahabi, Nasim; Almasian, Mohammad; Bakhteyar, Katayoon
2015-01-01
Background: Unwanted pregnancy not intended by at least one of the parents has undesirable consequences for the family and the society. In the present study, three classification models were used and compared to predict unwanted pregnancies in an urban population. Methods: In this cross-sectional study, 887 pregnant mothers referring to health centers in Khorramabad, Iran, in 2012 were selected by the stratified and cluster sampling; relevant variables were measured and for prediction of unwanted pregnancy, logistic regression, discriminant analysis, and probit regression models and SPSS software version 21 were used. To compare these models, indicators such as sensitivity, specificity, the area under the ROC curve, and the percentage of correct predictions were used. Results: The prevalence of unwanted pregnancies was 25.3%. The logistic and probit regression models indicated that parity and pregnancy spacing, contraceptive methods, household income and number of living male children were related to unwanted pregnancy. The performance of the models based on the area under the ROC curve was 0.735, 0.733, and 0.680 for logistic regression, probit regression, and linear discriminant analysis, respectively. Conclusion: Given the relatively high prevalence of unwanted pregnancies in Khorramabad, it seems necessary to revise family planning programs. Despite the similar accuracy of the models, if the researcher is interested in the interpretability of the results, the use of the logistic regression model is recommended. PMID:26793655
Regression Analysis of Physician Distribution to Identify Areas of Need: Some Preliminary Findings.
ERIC Educational Resources Information Center
Morgan, Bruce B.; And Others
A regression analysis was conducted of factors that help to explain the variance in physician distribution and which identify those factors that influence the maldistribution of physicians. Models were developed for different geographic areas to determine the most appropriate unit of analysis for the Western Missouri Area Health Education Center…
Criteria for the use of regression analysis for remote sensing of sediment and pollutants
NASA Technical Reports Server (NTRS)
Whitlock, C. H.; Kuo, C. Y.; Lecroy, S. R. (Principal Investigator)
1982-01-01
Data analysis procedures for quantification of water quality parameters that are already identified and are known to exist within the water body are considered. The linear multiple-regression technique was examined as a procedure for defining and calibrating data analysis algorithms for such instruments as spectrometers and multispectral scanners.
The Analysis of the Regression-Discontinuity Design in R
ERIC Educational Resources Information Center
Thoemmes, Felix; Liao, Wang; Jin, Ze
2017-01-01
This article describes the analysis of regression-discontinuity designs (RDDs) using the R packages rdd, rdrobust, and rddtools. We discuss similarities and differences between these packages and provide directions on how to use them effectively. We use real data from the Carolina Abecedarian Project to show how an analysis of an RDD can be…
Tu, Yu-Kang; Krämer, Nicole; Lee, Wen-Chung
2012-07-01
In the analysis of trends in health outcomes, an ongoing issue is how to separate and estimate the effects of age, period, and cohort. As these 3 variables are perfectly collinear by definition, regression coefficients in a general linear model are not unique. In this tutorial, we review why identification is a problem, and how this problem may be tackled using partial least squares and principal components regression analyses. Both methods produce regression coefficients that fulfill the same collinearity constraint as the variables age, period, and cohort. We show that, because the constraint imposed by partial least squares and principal components regression is inherent in the mathematical relation among the 3 variables, this leads to more interpretable results. We use one dataset from a Taiwanese health-screening program to illustrate how to use partial least squares regression to analyze the trends in body heights with 3 continuous variables for age, period, and cohort. We then use another dataset of hepatocellular carcinoma mortality rates for Taiwanese men to illustrate how to use partial least squares regression to analyze tables with aggregated data. We use the second dataset to show the relation between the intrinsic estimator, a recently proposed method for the age-period-cohort analysis, and partial least squares regression. We also show that the inclusion of all indicator variables provides a more consistent approach. R code for our analyses is provided in the eAppendix.
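The identification problem described above can be made concrete with a small sketch. It assumes cohort is coded as period minus age; the records and coefficient values are invented, and the helper `linear_predictor` is hypothetical. Because age - period + cohort = 0 for every record, shifting the three coefficients along that direction leaves every fitted value unchanged, so the data cannot distinguish the two coefficient sets.

```python
# Each record is (age, period); cohort (birth year) follows from the other two.
records = [(30, 1990), (40, 1990), (30, 2000), (50, 2010)]
rows = [(age, period, period - age) for age, period in records]


def linear_predictor(coefs, row):
    """Linear predictor b_age*age + b_period*period + b_cohort*cohort."""
    b_age, b_period, b_cohort = coefs
    age, period, cohort = row
    return b_age * age + b_period * period + b_cohort * cohort


coefs = (0.5, -0.25, 0.125)   # hypothetical coefficients
delta = 3.0                   # any shift along the collinear direction
shifted = (coefs[0] + delta, coefs[1] - delta, coefs[2] + delta)

fits_a = [linear_predictor(coefs, r) for r in rows]
fits_b = [linear_predictor(shifted, r) for r in rows]
print(fits_a)
print(fits_b)  # identical to fits_a despite very different coefficients
```

Any estimator for age-period-cohort models therefore has to add a constraint that selects one member of this family of equivalent solutions; the abstract's point is that partial least squares and principal components regression impose a constraint that mirrors the collinearity itself.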
MODELING SNAKE MICROHABITAT FROM RADIOTELEMETRY STUDIES USING POLYTOMOUS LOGISTIC REGRESSION
Multivariate analysis of snake microhabitat has historically used techniques that were derived under assumptions of normality and common covariance structure (e.g., discriminant function analysis, MANOVA). In this study, polytomous logistic regression (PLR which does not require ...
NASA Technical Reports Server (NTRS)
Kalton, G.
1983-01-01
A number of surveys were conducted to study the relationship between the level of aircraft or traffic noise exposure experienced by people living in a particular area and their annoyance with it. These surveys generally employ a clustered sample design which affects the precision of the survey estimates. Regression analysis of annoyance on noise measures and other variables is often an important component of the survey analysis. Formulae are presented for estimating the standard errors of regression coefficients and ratio of regression coefficients that are applicable with a two- or three-stage clustered sample design. Using a simple cost function, they also determine the optimum allocation of the sample across the stages of the sample design for the estimation of a regression coefficient.
NASA Astrophysics Data System (ADS)
Bae, Gihyun; Huh, Hoon; Park, Sungho
This paper deals with a regression model for light weight and crashworthiness enhancement design of automotive parts in frontal car crash. The ULSAB-AVC model is employed for the crash analysis and effective parts are selected based on the amount of energy absorption during the crash behavior. Finite element analyses are carried out for designated design cases in order to investigate the crashworthiness and weight according to the material and thickness of main energy absorption parts. Based on simulations results, a regression analysis is performed to construct a regression model utilized for light weight and crashworthiness enhancement design of automotive parts. An example for weight reduction of main energy absorption parts demonstrates the validity of a regression model constructed.
Ai, Zi-Sheng; Gao, You-Shui; Sun, Yuan; Liu, Yue; Zhang, Chang-Qing; Jiang, Cheng-Hua
2013-03-01
Risk factors for femoral neck fracture-induced avascular necrosis of the femoral head have not been elucidated clearly in middle-aged and elderly patients. Moreover, the high incidence of screw removal in China and its effect on the fate of the involved femoral head require statistical methods to reflect their intrinsic relationship. Ninety-nine patients older than 45 years with femoral neck fracture were treated by internal fixation between May 1999 and April 2004. Descriptive analysis, interaction analysis between associated factors, single factor logistic regression, multivariate logistic regression, and detailed interaction analysis were employed to explore potential relationships among associated factors. Avascular necrosis of the femoral head was found in 15 cases (15.2%). Age × the status of implants (removal vs. maintenance) and gender × the timing of reduction were interactive according to two-factor interactive analysis. Age, the displacement of fractures, the quality of reduction, and the status of implants were found to be significant factors in single factor logistic regression analysis. Age, age × the status of implants, and the quality of reduction were found to be significant factors in multivariate logistic regression analysis. In fine interaction analysis after multivariate logistic regression analysis, implant removal was the most important risk factor for avascular necrosis in 56-to-85-year-old patients, with a risk ratio of 26.00 (95% CI = 3.076-219.747). The middle-aged and elderly have a lower incidence of avascular necrosis of the femoral head following femoral neck fractures treated by cannulated screws. The removal of cannulated screws can induce a significantly high incidence of avascular necrosis of the femoral head in elderly patients, while a high-quality reduction helps to reduce avascular necrosis.
Two Paradoxes in Linear Regression Analysis.
Feng, Ge; Peng, Jing; Tu, Dongke; Zheng, Julia Z; Feng, Changyong
2016-12-25
Regression is one of the favorite tools in applied statistics. However, misuse and misinterpretation of results from regression analysis are common in biomedical research. In this paper we use statistical theory and simulation studies to clarify some paradoxes around this popular statistical method. In particular, we show that a widely used model selection procedure employed in many publications in top medical journals is wrong. Formal procedures based on solid statistical theory should be used in model selection.
NASA Astrophysics Data System (ADS)
Vaudour, Emmanuelle; Gomez, Cécile; Fouad, Youssef; Gilliot, Jean-Marc; Lagacherie, Philippe
2017-04-01
This study aimed at exploring the potential of SENTINEL-2 (S2A) multispectral satellite images for predicting several topsoil properties in two contrasting environments: on the one hand, a temperate region marked by intensive annual crop cultivation and soils derived from either loess or colluvium and/or marine limestone or chalk (Versailles Plain, 221 km²), and on the other hand, a Mediterranean region marked by vineyard cultivation and soils derived from either lacustrine limestone, calcareous sandstones, colluvium, or alluvial deposits (La Peyne catchment, 48 km²). Two S2A images (acquired in mid-March 2016 over each site) were atmospherically corrected. NDVI was then computed and thresholded (0.35) in order to extract bare soils. Prediction models of soil properties based on partial least squares regression (PLSR) were built from S2A spectra of 72 and 143 sampling locations in the Versailles Plain and La Peyne catchment, respectively. Ten soil properties were investigated in both regions: pH, cation exchange capacity (CEC), five texture fractions (clay, coarse silt, fine silt, coarse sand and fine sand), iron, calcium carbonate and soil organic carbon (SOC) in the tilled horizon. Predictive abilities were assessed using the cross-validated coefficient of determination (R²cv) and the ratio of performance to deviation (RPD). Intermediate to near-intermediate prediction performances (R²cv and RPD of 0.28-0.70 and 1.19-1.85, respectively) were obtained for 6 topsoil properties: clay, iron, SOC, CEC, pH, coarse silt. In the Versailles Plain, 5 of these properties could be predicted (by decreasing performance: CEC, SOC, pH, clay, coarse silt), while there were 4 predictable properties for La Peyne catchment (iron, clay, CEC, coarse silt). The coarse fragment content appeared to impact prediction error for iron content over La Peyne, while it influenced prediction error for SOC content over the Versailles Plain along with calcium carbonate content.
A spatial structure of the estimated soil properties for bare soil pixels was highlighted, which promises further improvements in spatial prediction models for these properties. This work was carried out in the framework of both the TOSCA-CES "Cartographie Numérique des sols" and the PLEIADES-CO projects of the French Space Agency (CNES).
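Two of the quantities used in this abstract, the NDVI bare-soil mask and the RPD score, can be sketched in a few lines. The sketch assumes that pixels with NDVI below the 0.35 threshold are the ones kept as bare soil, and that RPD is the sample standard deviation of the observed values divided by the RMSE of prediction; both are common conventions rather than details confirmed by the abstract. The reflectances, soil values, and function names are invented.

```python
import math
from statistics import stdev


def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def bare_soil_mask(pixels, threshold=0.35):
    """pixels: list of (nir, red) pairs. True where NDVI is below threshold
    (assumed here to indicate bare soil)."""
    return [ndvi(nir, red) < threshold for nir, red in pixels]


def rpd(observed, predicted):
    """Ratio of performance to deviation: SD of observations over RMSE."""
    rmse = math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )
    return stdev(observed) / rmse


# Invented reflectances: sparse cover, dense vegetation, dry bare soil.
pixels = [(0.30, 0.20), (0.45, 0.08), (0.25, 0.22)]
mask = bare_soil_mask(pixels)

# Invented observed vs. cross-validated predicted values of a soil property.
observed = [10, 12, 14, 16, 18]
predicted = [11, 11, 15, 15, 19]
score = rpd(observed, predicted)
print(mask, score)
```

An RPD close to 1 means the model predicts little better than the mean of the observations, which is why the 1.19-1.85 range reported above is described as intermediate.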
Peters, K.E.; Ramos, L.S.; Zumberge, J.E.; Valin, Z.C.; Scotese, C.R.
2008-01-01
Tectonic geochemical paleolatitude (TGP) models were developed to predict the paleolatitude of petroleum source rock from the geochemical composition of crude oil. The results validate studies designed to reconstruct ancient source rock depositional environments using oil chemistry and tectonic reconstruction of paleogeography from coordinates of the present day collection site. TGP models can also be used to corroborate tectonic paleolatitude in cases where the predicted paleogeography conflicts with the depositional setting predicted by the oil chemistry, or to predict paleolatitude when the present day collection locality is far removed from the source rock, as might occur due to long distance subsurface migration or transport of tarballs by ocean currents. Biomarker and stable carbon isotope ratios were measured for 496 crude oil samples inferred to originate from Upper Jurassic source rock in West Siberia, the North Sea and offshore Labrador. First, a unique, multi-tiered chemometric (multivariate statistics) decision tree was used to classify these samples into seven oil families and infer the type of organic matter, lithology and depositional environment of each organofacies of source rock [Peters, K.E., Ramos, L.S., Zumberge, J.E., Valin, Z.C., Scotese, C.R., Gautier, D.L., 2007. Circum-Arctic petroleum systems identified using decision-tree chemometrics. American Association of Petroleum Geologists Bulletin 91, 877-913]. Second, present day geographic locations for each sample were used to restore the tectonic paleolatitude of the source rock during Late Jurassic time (~150 Ma). Third, partial least squares regression (PLSR) was used to construct linear TGP models that relate tectonic and geochemical paleolatitude, where the latter is based on 19 source-related biomarker and isotope ratios for each oil family. The TGP models were calibrated using 70% of the samples in each family and the remaining 30% of samples were used for model validation.
Positive relationships exist between tectonic and geochemical paleolatitude for each family. Standard error of prediction for geochemical paleolatitude ranges from 0.9° to 2.6° of tectonic paleolatitude, which translates to a relative standard error of prediction in the range 1.5-4.8%. The results suggest that the observed effect of source rock paleolatitude on crude oil composition is caused by (i) stable carbon isotope fractionation during photosynthetic fixation of carbon and (ii) species diversity at different latitudes during Late Jurassic time. © 2008 Elsevier Ltd. All rights reserved.
NASA Astrophysics Data System (ADS)
Denli, H. H.; Koc, Z.
2015-12-01
Valuation of real property based on fixed standards is difficult to apply across time and location. Regression analysis constructs mathematical models that describe or explain relationships that may exist between variables. The problem of identifying price differences among properties to obtain a price index can be converted into a regression problem, and standard techniques of regression analysis can be used to estimate the index. When regression analysis is applied to real estate valuation using properties offered on the real market with their current characteristics and quantifiers, the method helps to find the factors or variables that are effective in the formation of value. In this study, prices of housing for sale in Zeytinburnu, a district in Istanbul, are associated with housing characteristics to find a price index, based on information received from a real estate web page. The variables used for the analysis are age, size in m², number of floors in the building, floor number of the property and number of rooms. The price of the property is the dependent variable, whereas the rest are independent variables. Prices of 60 properties were used for the analysis. Locations of equal value were found and plotted on the map, and equivalence curves were drawn identifying the zones of equal value as lines.
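A price regression of the kind described above can be sketched with ordinary least squares. The sketch uses only three of the study's five predictors (size, age, rooms) for brevity, and the six listings are synthetic, noise-free values generated from known coefficients so that the fit can be checked by eye; they are not the Zeytinburnu data. The function name `ols` is hypothetical.

```python
def ols(X, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y,
    solved by Gauss-Jordan elimination with partial pivoting."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(r[i] * r[j] for r in X) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(X, y))]
        for i in range(k)
    ]
    for c in range(k):
        # Swap in the row with the largest pivot to limit rounding error.
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                f = A[r][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]


# Columns: intercept, size in m², age in years, number of rooms.
X = [
    [1, 80, 10, 3],
    [1, 100, 5, 4],
    [1, 60, 20, 2],
    [1, 120, 2, 5],
    [1, 90, 15, 3],
    [1, 70, 30, 3],
]
# Synthetic prices (thousands): 50 + 2*size - 1*age + 10*rooms, no noise.
y = [230, 285, 170, 338, 245, 190]

coefs = ols(X, y)
print(coefs)
```

Each fitted coefficient is the implied price contribution of one unit of that characteristic with the others held fixed, which is what allows properties with different characteristics to be placed on a common price index. With real listings the fit would not be exact, and a library routine based on QR or SVD would be numerically safer than the normal equations used here.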
Schörgendorfer, Angela; Branscum, Adam J; Hanson, Timothy E
2013-06-01
Logistic regression is a popular tool for risk analysis in medical and population health science. With continuous response data, it is common to create a dichotomous outcome for logistic regression analysis by specifying a threshold for positivity. Fitting a linear regression to the nondichotomized response variable assuming a logistic sampling model for the data has been empirically shown to yield more efficient estimates of odds ratios than ordinary logistic regression of the dichotomized endpoint. We illustrate that risk inference is not robust to departures from the parametric logistic distribution. Moreover, the model assumption of proportional odds is generally not satisfied when the condition of a logistic distribution for the data is violated, leading to biased inference from a parametric logistic analysis. We develop novel Bayesian semiparametric methodology for testing goodness of fit of parametric logistic regression with continuous measurement data. The testing procedures hold for any cutoff threshold and our approach simultaneously provides the ability to perform semiparametric risk estimation. Bayes factors are calculated using the Savage-Dickey ratio for testing the null hypothesis of logistic regression versus a semiparametric generalization. We propose a fully Bayesian and a computationally efficient empirical Bayesian approach to testing, and we present methods for semiparametric estimation of risks, relative risks, and odds ratios when parametric logistic regression fails. Theoretical results establish the consistency of the empirical Bayes test. Results from simulated data show that the proposed approach provides accurate inference irrespective of whether parametric assumptions hold or not. Evaluation of risk factors for obesity shows that different inferences are derived from an analysis of a real data set when deviations from a logistic distribution are permissible in a flexible semiparametric framework. 
© 2013, The International Biometric Society.
Zhang, Hong-guang; Lu, Jian-gang
2016-02-01
To overcome the problems of significant differences among samples and nonlinearity between the property and spectra of samples in spectral quantitative analysis, a local regression algorithm is proposed in this paper. In this algorithm, the net analyte signal (NAS) method was first used to obtain the net analyte signal of the calibration samples and unknown samples; then the Euclidean distance between the net analyte signal of each unknown sample and the net analyte signals of the calibration samples was calculated and used as a similarity index. According to the defined similarity index, a local calibration set was selected individually for each unknown sample. Finally, a local PLS regression model was built on the local calibration set of each unknown sample. The proposed method was applied to a set of near infrared spectra of meat samples. The results demonstrate that the prediction precision and model complexity of the proposed method are superior to those of the global PLS regression method and of the conventional local regression algorithm based on spectral Euclidean distance.
Multilayer Perceptron for Robust Nonlinear Interval Regression Analysis Using Genetic Algorithms
Hu, Yi-Chung
2014-01-01
On the basis of fuzzy regression, computational models in intelligence such as neural networks have the capability to be applied to nonlinear interval regression analysis for dealing with uncertain and imprecise data. When training data are not contaminated by outliers, computational models perform well by including almost all given training data in the data interval. Nevertheless, since training data are often corrupted by outliers, robust learning algorithms employed to resist outliers for interval regression analysis have been an interesting area of research. Several approaches involving computational intelligence are effective for resisting outliers, but the required parameters for these approaches are related to whether the collected data contain outliers or not. Since it seems difficult to prespecify the degree of contamination beforehand, this paper uses multilayer perceptron to construct the robust nonlinear interval regression model using the genetic algorithm. Outliers beyond or beneath the data interval will impose slight effect on the determination of data interval. Simulation results demonstrate that the proposed method performs well for contaminated datasets. PMID:25110755
The use of cognitive ability measures as explanatory variables in regression analysis.
Junker, Brian; Schofield, Lynne Steuerle; Taylor, Lowell J
2012-12-01
Cognitive ability measures are often taken as explanatory variables in regression analysis, e.g., as a factor affecting a market outcome such as an individual's wage, or a decision such as an individual's education acquisition. Cognitive ability is a latent construct; its true value is unobserved. Nonetheless, researchers often assume that a test score, constructed via standard psychometric practice from individuals' responses to test items, can be safely used in regression analysis. We examine problems that can arise, and suggest that an alternative approach, a "mixed effects structural equations" (MESE) model, may be more appropriate in many circumstances.
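One well-known problem that can arise when a noisy proxy replaces a latent regressor is attenuation: the least squares slope shrinks toward zero by roughly the reliability of the proxy. The simulation below is a minimal sketch of that effect only. It does not implement the MESE model, and the variable names, noise levels and the slope of 2.0 are assumptions chosen for illustration.

```python
import numpy as np

def ols_slope(x, y):
    """Slope of the simple least squares regression of y on x."""
    x_c = x - x.mean()
    return float(x_c @ (y - y.mean()) / (x_c @ x_c))

rng = np.random.default_rng(1)
n = 200_000
ability = rng.normal(0.0, 1.0, n)            # latent, unobserved in practice
score = ability + rng.normal(0.0, 1.0, n)    # test score = ability + measurement noise
wage = 2.0 * ability + rng.normal(0.0, 1.0, n)

# Reliability = var(ability) / var(score) = 1 / (1 + 1) in this toy set-up.
reliability = 1.0 / (1.0 + 1.0)

slope_latent = ols_slope(ability, wage)   # close to the true slope 2.0
slope_score = ols_slope(score, wage)      # close to 2.0 * reliability
```

In this toy set-up the regression on the observed score recovers only about half of the true effect, which is why treating a test score as if it were the latent ability can mislead.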
Factor analysis and multiple regression between topography and precipitation on Jeju Island, Korea
NASA Astrophysics Data System (ADS)
Um, Myoung-Jin; Yun, Hyeseon; Jeong, Chang-Sam; Heo, Jun-Haeng
2011-11-01
In this study, new factors that influence precipitation were extracted from geographic variables using factor analysis, which allows for an accurate estimation of orographic precipitation. Correlation analysis was also used to examine the relationship between nine topographic variables from digital elevation models (DEMs) and the precipitation in Jeju Island. In addition, a spatial analysis was performed in order to verify the validity of the regression model. From the results of the correlation analysis, it was found that all of the topographic variables had a positive correlation with the precipitation. The relations between the variables also changed in accordance with a change in the precipitation duration. However, upon examining the correlation matrix, no significant relationship between the latitude and the aspect was found. According to the factor analysis, eight topographic variables (latitude being the exception) were found to have a direct influence on the precipitation. Three factors were then extracted from the eight topographic variables. By directly comparing the multiple regression model with the factors (model 1) to the multiple regression model with the topographic variables (model 3), it was found that model 1 did not violate the limits of statistical significance and multicollinearity. As such, model 1 was considered to be appropriate for estimating the precipitation when taking into account the topography. In the study of model 1, the multiple regression model using factor analysis was found to be the best method for estimating the orographic precipitation on Jeju Island.
2014-01-01
Background: Meta-regression is becoming increasingly used to model study level covariate effects. However this type of statistical analysis presents many difficulties and challenges. Here two methods for calculating confidence intervals for the magnitude of the residual between-study variance in random effects meta-regression models are developed. A further suggestion for calculating credible intervals using informative prior distributions for the residual between-study variance is presented. Methods: Two recently proposed and, under the assumptions of the random effects model, exact methods for constructing confidence intervals for the between-study variance in random effects meta-analyses are extended to the meta-regression setting. The use of Generalised Cochran heterogeneity statistics is extended to the meta-regression setting and a Newton-Raphson procedure is developed to implement the Q profile method for meta-analysis and meta-regression. WinBUGS is used to implement informative priors for the residual between-study variance in the context of Bayesian meta-regressions. Results: Results are obtained for two contrasting examples, where the first example involves a binary covariate and the second involves a continuous covariate. Intervals for the residual between-study variance are wide for both examples. Conclusions: Statistical methods, and R computer software, are available to compute exact confidence intervals for the residual between-study variance under the random effects model for meta-regression. These frequentist methods are almost as easily implemented as their established counterparts for meta-analysis. Bayesian meta-regressions are also easily performed by analysts who are comfortable using WinBUGS. Estimates of the residual between-study variance in random effects meta-regressions should be routinely reported and accompanied by some measure of their uncertainty. Confidence and/or credible intervals are well-suited to this purpose. PMID:25196829
Choi, Seung Hoan; Labadorf, Adam T; Myers, Richard H; Lunetta, Kathryn L; Dupuis, Josée; DeStefano, Anita L
2017-02-06
Next generation sequencing provides a count of RNA molecules in the form of short reads, yielding discrete, often highly non-normally distributed gene expression measurements. Although Negative Binomial (NB) regression has been generally accepted in the analysis of RNA sequencing (RNA-Seq) data, its appropriateness has not been exhaustively evaluated. We explore logistic regression as an alternative method for RNA-Seq studies designed to compare cases and controls, where disease status is modeled as a function of RNA-Seq reads using simulated and Huntington disease data. We evaluate the effect of adjusting for covariates that have an unknown relationship with gene expression. Finally, we incorporate the data adaptive method in order to compare false positive rates. When the sample size is small or the expression levels of a gene are highly dispersed, the NB regression shows inflated Type-I error rates but the Classical logistic and Bayes logistic (BL) regressions are conservative. Firth's logistic (FL) regression performs well or is slightly conservative. Large sample size and low dispersion generally make Type-I error rates of all methods close to nominal alpha levels of 0.05 and 0.01. However, Type-I error rates are controlled after applying the data adaptive method. The NB, BL, and FL regressions gain increased power with large sample size, large log2 fold-change, and low dispersion. The FL regression has comparable power to NB regression. We conclude that implementing the data adaptive method appropriately controls Type-I error rates in RNA-Seq analysis. Firth's logistic regression provides a concise statistical inference process and reduces spurious associations from inaccurately estimated dispersion parameters in the negative binomial framework.
Lewis, Jason M.
2010-01-01
Peak-streamflow regression equations were determined for estimating flows with exceedance probabilities from 50 to 0.2 percent for the state of Oklahoma. These regression equations incorporate basin characteristics to estimate peak-streamflow magnitude and frequency throughout the state by use of a generalized least squares regression analysis. The most statistically significant independent variables required to estimate peak-streamflow magnitude and frequency for unregulated streams in Oklahoma are contributing drainage area, mean-annual precipitation, and main-channel slope. The regression equations are applicable for watershed basins with drainage areas less than 2,510 square miles that are not affected by regulation. The resulting regression equations had a standard model error ranging from 31 to 46 percent. Annual-maximum peak flows observed at 231 streamflow-gaging stations through water year 2008 were used for the regression analysis. Gage peak-streamflow estimates were used from previous work unless 2008 gaging-station data were available, in which case new peak-streamflow estimates were calculated. The U.S. Geological Survey StreamStats web application was used to obtain the independent variables required for the peak-streamflow regression equations. Limitations on the use of the regression equations and the reliability of regression estimates for natural unregulated streams are described. Log-Pearson Type III analysis information, basin and climate characteristics, and the peak-streamflow frequency estimates for the 231 gaging stations in and near Oklahoma are listed. Methodologies are presented to estimate peak streamflows at ungaged sites by using estimates from gaging stations on unregulated streams. For ungaged sites on urban streams and streams regulated by small floodwater retarding structures, an adjustment of the statewide regression equations for natural unregulated streams can be used to estimate peak-streamflow magnitude and frequency.
Neither fixed nor random: weighted least squares meta-regression.
Stanley, T D; Doucouliagos, Hristos
2017-03-01
Our study revisits and challenges two core conventional meta-regression estimators: the prevalent use of 'mixed-effects' or random-effects meta-regression analysis and the correction of standard errors that defines fixed-effects meta-regression analysis (FE-MRA). We show how and explain why an unrestricted weighted least squares MRA (WLS-MRA) estimator is superior to conventional random-effects (or mixed-effects) meta-regression when there is publication (or small-sample) bias that is as good as FE-MRA in all cases and better than fixed effects in most practical applications. Simulations and statistical theory show that WLS-MRA provides satisfactory estimates of meta-regression coefficients that are practically equivalent to mixed effects or random effects when there is no publication bias. When there is publication selection bias, WLS-MRA always has smaller bias than mixed effects or random effects. In practical applications, an unrestricted WLS meta-regression is likely to give practically equivalent or superior estimates to fixed-effects, random-effects, and mixed-effects meta-regression approaches. However, random-effects meta-regression remains viable and perhaps somewhat preferable if selection for statistical significance (publication bias) can be ruled out and when random, additive normal heterogeneity is known to directly affect the 'true' regression coefficient. Copyright © 2016 John Wiley & Sons, Ltd.
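The contrast between the fixed-effects and unrestricted WLS estimators can be sketched as follows, assuming (as the reference to a "correction of standard errors" suggests) that both use inverse-variance weights and share the same point estimates, and that they differ only in whether the coefficient covariance is scaled by the weighted residual mean square. The function name `wls_meta_regression`, the single moderator and the simulated effect sizes are all hypothetical; this is an illustration under those assumptions, not the authors' code.

```python
import numpy as np

def wls_meta_regression(effects, std_errors, moderators):
    """Inverse-variance weighted meta-regression with two kinds of standard error."""
    effects = np.asarray(effects, dtype=float)
    weights = 1.0 / np.asarray(std_errors, dtype=float) ** 2
    design = np.column_stack([np.ones(len(effects)), moderators])

    xtwx_inv = np.linalg.inv(design.T @ (weights[:, None] * design))
    beta = xtwx_inv @ (design.T @ (weights * effects))

    resid = effects - design @ beta
    dof = len(effects) - design.shape[1]
    mse = float(weights @ resid**2) / dof      # weighted residual mean square

    se_fixed = np.sqrt(np.diag(xtwx_inv))      # fixed-effects: scale fixed at 1
    se_wls = np.sqrt(mse) * se_fixed           # unrestricted WLS: scale estimated
    return beta, se_fixed, se_wls, mse

# Simulated studies with between-study heterogeneity (assumed values).
rng = np.random.default_rng(2)
k = 40
std_errors = rng.uniform(0.05, 0.5, k)
moderator = rng.normal(size=k)
true_effects = 0.3 + 0.2 * moderator + rng.normal(0.0, 0.3, k)
effects = true_effects + rng.normal(0.0, std_errors)

beta, se_fixed, se_wls, mse = wls_meta_regression(effects, std_errors, moderator)
```

When heterogeneity is present the residual mean square exceeds one, so the unrestricted WLS standard errors are wider than the fixed-effects ones while the coefficients themselves are unchanged.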
[How to fit and interpret multilevel models using SPSS].
Pardo, Antonio; Ruiz, Miguel A; San Martín, Rafael
2007-05-01
Hierarchical or multilevel models are used to analyse data when cases belong to known groups and sample units are selected both from the individual level and from the group level. In this work, the multilevel models most commonly discussed in the statistical literature are described, explaining how to fit these models using the SPSS program (any version from the 11th onward) and how to interpret the outcomes of the analysis. Five particular models are described, fitted, and interpreted: (1) one-way analysis of variance with random effects, (2) regression analysis with means-as-outcomes, (3) one-way analysis of covariance with random effects, (4) regression analysis with random coefficients, and (5) regression analysis with means- and slopes-as-outcomes. All models are explained, trying to make them understandable to researchers in health and behaviour sciences.
Correlative and multivariate analysis of increased radon concentration in underground laboratory.
Maletić, Dimitrije M; Udovičić, Vladimir I; Banjanac, Radomir M; Joković, Dejan R; Dragić, Aleksandar L; Veselinović, Nikola B; Filipović, Jelena
2014-11-01
The results of an analysis using correlative and multivariate methods, as developed for data analysis in high-energy physics and implemented in the Toolkit for Multivariate Analysis software package, of the relations between variations of increased radon concentration and climate variables in a shallow underground laboratory are presented. Multivariate regression analysis identified a number of multivariate methods which can give a good evaluation of increased radon concentrations based on climate variables. The use of the multivariate regression methods will enable the investigation of the relations of specific climate variables with increased radon concentrations by analysis of regression methods resulting in 'mapped' underlying functional behaviour of radon concentrations depending on a wide spectrum of climate variables. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
Brenn, T; Arnesen, E
1985-01-01
For comparative evaluation, discriminant analysis, logistic regression and Cox's model were used to select risk factors for total and coronary deaths among 6595 men aged 20-49 followed for 9 years. Groups with mortality between 5 and 93 per 1000 were considered. Discriminant analysis selected variable sets only marginally different from the logistic and Cox methods which always selected the same sets. A time-saving option, offered for both the logistic and Cox selection, showed no advantage compared with discriminant analysis. Analysing more than 3800 subjects, the logistic and Cox methods consumed, respectively, 80 and 10 times more computer time than discriminant analysis. When including the same set of variables in non-stepwise analyses, all methods estimated coefficients that in most cases were almost identical. In conclusion, discriminant analysis is advocated for preliminary or stepwise analysis, otherwise Cox's method should be used.
Bias due to two-stage residual-outcome regression analysis in genetic association studies.
Demissie, Serkalem; Cupples, L Adrienne
2011-11-01
Association studies of risk factors and complex diseases require careful assessment of potential confounding factors. Two-stage regression analysis, sometimes referred to as residual- or adjusted-outcome analysis, has been increasingly used in association studies of single nucleotide polymorphisms (SNPs) and quantitative traits. In this analysis, first, a residual-outcome is calculated from a regression of the outcome variable on covariates and then the relationship between the adjusted-outcome and the SNP is evaluated by a simple linear regression of the adjusted-outcome on the SNP. In this article, we examine the performance of this two-stage analysis as compared with multiple linear regression (MLR) analysis. Our findings show that when a SNP and a covariate are correlated, the two-stage approach results in biased genotypic effect and loss of power. Bias is always toward the null and increases with the squared-correlation between the SNP and the covariate (r²). For example, for r² = 0, 0.1, and 0.5, two-stage analysis results in, respectively, 0, 10, and 50% attenuation in the SNP effect. As expected, MLR was always unbiased. Since individual SNPs often show little or no correlation with covariates, a two-stage analysis is expected to perform as well as MLR in many genetic studies; however, it produces considerably different results from MLR and may lead to incorrect conclusions when independent variables are highly correlated. While a useful alternative to MLR under r² = 0, the two-stage approach has serious limitations. Its use as a simple substitute for MLR should be avoided. © 2011 Wiley Periodicals, Inc.
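The attenuation figures quoted above (0, 10, and 50%) can be reproduced with a small simulation. The sketch below is illustrative only: the SNP is replaced by a continuous standard-normal stand-in for simplicity, and the effect sizes, sample size and the function name `two_stage_vs_mlr` are assumptions, not taken from the article.

```python
import numpy as np

def slope(x, y):
    """Slope of the simple least squares regression of y on x."""
    x_c = x - x.mean()
    return float(x_c @ (y - y.mean()) / (x_c @ x_c))

def two_stage_vs_mlr(r2, beta_snp=1.0, n=200_000, seed=0):
    """Return (two-stage estimate, MLR estimate) of the SNP effect."""
    rng = np.random.default_rng(seed)
    covariate = rng.normal(size=n)
    # Continuous stand-in for genotype with squared correlation r2 to the covariate.
    snp = np.sqrt(r2) * covariate + np.sqrt(1.0 - r2) * rng.normal(size=n)
    outcome = beta_snp * snp + 0.5 * covariate + rng.normal(size=n)

    # Stage 1: residual-outcome from regressing the outcome on the covariate.
    resid = (outcome - outcome.mean()
             - slope(covariate, outcome) * (covariate - covariate.mean()))
    # Stage 2: simple regression of the residual-outcome on the SNP.
    two_stage = slope(snp, resid)

    # Multiple linear regression with SNP and covariate entered together.
    design = np.column_stack([np.ones(n), snp, covariate])
    mlr = float(np.linalg.lstsq(design, outcome, rcond=None)[0][1])
    return two_stage, mlr

results = {r2: two_stage_vs_mlr(r2) for r2 in (0.0, 0.1, 0.5)}
```

With a true effect of 1.0, the two-stage estimate is expected to sit near 1 - r², while the MLR estimate stays near 1.0 for every value of r².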
Variable Selection in Logistic Regression.
1987-06-01
Z. D. Bai, P. R. Krishnaiah and L. C. Zhao. Variable Selection in Logistic Regression. Center for Multivariate Analysis, University of Pittsburgh. Contract or grant number F49620-85-C-0008.
Two Paradoxes in Linear Regression Analysis
FENG, Ge; PENG, Jing; TU, Dongke; ZHENG, Julia Z.; FENG, Changyong
2016-01-01
Regression is one of the favorite tools in applied statistics. However, misuse and misinterpretation of results from regression analysis are common in biomedical research. In this paper we use statistical theory and simulation studies to clarify some paradoxes around this popular statistical method. In particular, we show that a widely used model selection procedure employed in many publications in top medical journals is wrong. Formal procedures based on solid statistical theory should be used in model selection. PMID:28638214
Wang, Wen-Cheng; Cho, Wen-Chien; Chen, Yin-Jen
2014-01-01
It is estimated that mainland Chinese tourists travelling to Taiwan can bring annual revenues of 400 billion NTD to the Taiwan economy. Thus, how the Taiwanese Government formulates relevant measures to satisfy both sides is the focus of most concern. Taiwan must improve the facilities and service quality of its tourism industry so as to attract more mainland tourists. This paper conducted a questionnaire survey of mainland tourists and used grey relational analysis in grey mathematics to analyze the satisfaction performance of all satisfaction question items. The first eight satisfaction items were used as independent variables, and the overall satisfaction performance was used as a dependent variable for quantile regression model analysis to discuss the relationship between the dependent variable under different quantiles and independent variables. Finally, this study further discussed the predictive accuracy of the least mean regression model and each quantile regression model, as a reference for research personnel. The analysis results showed that other variables could also affect the overall satisfaction performance of mainland tourists, in addition to occupation and age. The overall predictive accuracy of quantile regression model Q0.25 was higher than that of the other three models. PMID:24574916
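For readers unfamiliar with quantile regression, the idea behind a model such as Q0.25 can be sketched with the check (pinball) loss: minimising it targets a chosen quantile of the response instead of the mean. The example below is a minimal illustration on synthetic, skewed data with an intercept-only model and a simple grid search; the data, the grid and the variable names are assumptions and have no connection to the survey results.

```python
import numpy as np

def pinball_loss(y, pred, q):
    """Mean check loss for quantile level q (0 < q < 1)."""
    diff = y - pred
    # Under-predictions are weighted by q, over-predictions by (1 - q).
    return float(np.mean(np.maximum(q * diff, (q - 1.0) * diff)))

rng = np.random.default_rng(3)
satisfaction = rng.exponential(1.0, 5000)   # skewed synthetic response

# Intercept-only "model": pick the constant that minimises the check loss.
candidates = np.linspace(0.0, 5.0, 2001)
losses = [pinball_loss(satisfaction, c, 0.25) for c in candidates]
best_q25 = float(candidates[int(np.argmin(losses))])

sample_q25 = float(np.quantile(satisfaction, 0.25))
```

The constant that minimises the loss at level 0.25 lands on the sample 25th percentile (up to the grid spacing). A full quantile regression replaces the constant with a linear function of the independent variables.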
Composite marginal quantile regression analysis for longitudinal adolescent body mass index data.
Yang, Chi-Chuan; Chen, Yi-Hau; Chang, Hsing-Yi
2017-09-20
Childhood and adolescenthood overweight or obesity, which may be quantified through the body mass index (BMI), is strongly associated with adult obesity and other health problems. Motivated by the child and adolescent behaviors in long-term evolution (CABLE) study, we are interested in individual, family, and school factors associated with marginal quantiles of longitudinal adolescent BMI values. We propose a new method for composite marginal quantile regression analysis for longitudinal outcome data, which performs marginal quantile regressions at multiple quantile levels simultaneously. The proposed method extends the quantile regression coefficient modeling method introduced by Frumento and Bottai (Biometrics 2016; 72:74-84) to longitudinal data accounting suitably for the correlation structure in longitudinal observations. A goodness-of-fit test for the proposed modeling is also developed. Simulation results show that the proposed method can be much more efficient than the analysis without taking correlation into account and the analysis performing separate quantile regressions at different quantile levels. The application to the longitudinal adolescent BMI data from the CABLE study demonstrates the practical utility of our proposal. Copyright © 2017 John Wiley & Sons, Ltd.
Cervical Vertebral Body's Volume as a New Parameter for Predicting the Skeletal Maturation Stages.
Choi, Youn-Kyung; Kim, Jinmi; Yamaguchi, Tetsutaro; Maki, Koutaro; Ko, Ching-Chang; Kim, Yong-Il
2016-01-01
This study aimed to determine the correlation between the volumetric parameters derived from the images of the second, third, and fourth cervical vertebrae by using cone beam computed tomography with skeletal maturation stages and to propose a new formula for predicting skeletal maturation by using regression analysis. We obtained the estimation of skeletal maturation levels from hand-wrist radiographs and volume parameters derived from the second, third, and fourth cervical vertebrae bodies from 102 Japanese patients (54 women and 48 men, 5-18 years of age). We performed Pearson's correlation coefficient analysis and simple regression analysis. All volume parameters derived from the second, third, and fourth cervical vertebrae exhibited statistically significant correlations (P < 0.05). The simple regression model with the greatest R-square indicated the fourth-cervical-vertebra volume as an independent variable with a variance inflation factor less than ten. The explanation power was 81.76%. Volumetric parameters of cervical vertebrae using cone beam computed tomography are useful in regression models. The derived regression model has the potential for clinical application as it enables a simple and quantitative analysis to evaluate skeletal maturation level. PMID:27340668
Analysis of Sting Balance Calibration Data Using Optimized Regression Models
NASA Technical Reports Server (NTRS)
Ulbrich, N.; Bader, Jon B.
2010-01-01
Calibration data of a wind tunnel sting balance was processed using a candidate math model search algorithm that recommends an optimized regression model for the data analysis. During the calibration the normal force and the moment at the balance moment center were selected as independent calibration variables. The sting balance itself had two moment gages. Therefore, after analyzing the connection between calibration loads and gage outputs, it was decided to choose the difference and the sum of the gage outputs as the two responses that best describe the behavior of the balance. The math model search algorithm was applied to these two responses. An optimized regression model was obtained for each response. Classical strain gage balance load transformations and the equations of the deflection of a cantilever beam under load are used to show that the search algorithm's two optimized regression models are supported by a theoretical analysis of the relationship between the applied calibration loads and the measured gage outputs. The analysis of the sting balance calibration data set is a rare example of a situation when terms of a regression model of a balance can directly be derived from first principles of physics. In addition, it is interesting to note that the search algorithm recommended the correct regression model term combinations using only a set of statistical quality metrics that were applied to the experimental data during the algorithm's term selection process.
Covariate Imbalance and Adjustment for Logistic Regression Analysis of Clinical Trial Data
Ciolino, Jody D.; Martin, Reneé H.; Zhao, Wenle; Jauch, Edward C.; Hill, Michael D.; Palesch, Yuko Y.
2014-01-01
In logistic regression analysis for binary clinical trial data, adjusted treatment effect estimates are often not equivalent to unadjusted estimates in the presence of influential covariates. This paper uses simulation to quantify the benefit of covariate adjustment in logistic regression. However, International Conference on Harmonization guidelines suggest that covariate adjustment be pre-specified. Unplanned adjusted analyses should be considered secondary. Results suggest that if adjustment is not possible or unplanned in a logistic setting, balance in continuous covariates can alleviate some (but never all) of the shortcomings of unadjusted analyses. The case of log binomial regression is also explored. PMID:24138438
Linear regression analysis of survival data with missing censoring indicators.
Wang, Qihua; Dinse, Gregg E
2011-04-01
Linear regression analysis has been studied extensively in a random censorship setting, but typically all of the censoring indicators are assumed to be observed. In this paper, we develop synthetic data methods for estimating regression parameters in a linear model when some censoring indicators are missing. We define estimators based on regression calibration, imputation, and inverse probability weighting techniques, and we prove all three estimators are asymptotically normal. The finite-sample performance of each estimator is evaluated via simulation. We illustrate our methods by assessing the effects of sex and age on the time to non-ambulatory progression for patients in a brain cancer clinical trial.
Kuiper, Gerhardus J A J M; Houben, Rik; Wetzels, Rick J H; Verhezen, Paul W M; Oerle, Rene van; Ten Cate, Hugo; Henskens, Yvonne M C; Lancé, Marcus D
2017-11-01
Low platelet counts and hematocrit levels hinder whole blood point-of-care testing of platelet function. Thus far, no reference ranges for MEA (multiple electrode aggregometry) and PFA-100 (platelet function analyzer 100) devices exist for low ranges. Through dilution methods of volunteer whole blood, platelet function at low ranges of platelet count and hematocrit levels was assessed on MEA for four agonists and for PFA-100 in two cartridges. Using (multiple) regression analysis, 95% reference intervals were computed for these low ranges. Low platelet counts affected MEA in a positive correlation (all agonists showed r² ≥ 0.75) and PFA-100 in an inverse correlation (closure times were prolonged with lower platelet counts). Lowered hematocrit did not affect MEA testing, except for arachidonic acid activation (ASPI), which showed a weak positive correlation (r² = 0.14). Closure time on PFA-100 testing was inversely correlated with hematocrit for both cartridges. Regression analysis revealed different 95% reference intervals in comparison with originally established intervals for both MEA and PFA-100 in low platelet or hematocrit conditions. Multiple regression analysis of ASPI and both tests on the PFA-100 for combined low platelet and hematocrit conditions revealed that only PFA-100 testing should be adjusted for both thrombocytopenia and anemia. 95% reference intervals were calculated using multiple regression analysis. However, coefficients of determination of PFA-100 were poor, and some variance remained unexplained. Thus, in this pilot study using (multiple) regression analysis, we could establish reference intervals of platelet function in anemia and thrombocytopenia conditions on PFA-100 and in thrombocytopenia conditions on MEA.
Clustering performance comparison using K-means and expectation maximization algorithms.
Jung, Yong Gyu; Kang, Min Soo; Heo, Jun
2014-11-14
Clustering is an important means of data mining based on separating data categories by similar features. Unlike the classification algorithm, clustering belongs to the unsupervised type of algorithms. Two representatives of the clustering algorithms are the K-means and the expectation maximization (EM) algorithm. Linear regression analysis was extended to the category-type dependent variable, while logistic regression was achieved using a linear combination of independent variables. To predict the possibility of occurrence of an event, a statistical approach is used. However, the classification of all data by means of logistic regression analysis cannot guarantee the accuracy of the results. In this paper, the logistic regression analysis is applied to EM clusters and the K-means clustering method for quality assessment of red wine, and a method is proposed for ensuring the accuracy of the classification results.
Farno, E; Coventry, K; Slatter, P; Eshtiaghi, N
2018-06-15
Sludge pumps in wastewater treatment plants are often oversized due to uncertainty in the calculation of pressure drop. This issue costs industry millions of dollars to purchase and operate the oversized pumps. Besides costs, higher electricity consumption is associated with extra CO2 emissions, which create a large environmental impact. Calculation of pressure drop via current pipe flow theory requires model estimation of flow curve data, which depends on regression analysis and also varies with the natural variation of rheological data. This study investigates the impact of variation in rheological data and regression analysis on the variation of pressure drop calculated via current pipe flow theories. Results compare the variation of calculated pressure drop between different models and regression methods and indicate the suitability of each method. Copyright © 2018 Elsevier Ltd. All rights reserved.
Henrard, S; Speybroeck, N; Hermans, C
2015-11-01
Haemophilia is a rare genetic haemorrhagic disease characterized by partial or complete deficiency of coagulation factor VIII, for haemophilia A, or IX, for haemophilia B. As in any other medical research domain, the field of haemophilia research is increasingly concerned with finding factors associated with binary or continuous outcomes through multivariable models. Traditional models include multiple logistic regressions, for binary outcomes, and multiple linear regressions for continuous outcomes. Yet these regression models are at times difficult to implement, especially for non-statisticians, and can be difficult to interpret. The present paper sought to didactically explain how, why, and when to use classification and regression tree (CART) analysis for haemophilia research. The CART method is non-parametric and non-linear, based on the repeated partitioning of a sample into subgroups based on a certain criterion. Breiman developed this method in 1984. Classification trees (CTs) are used to analyse categorical outcomes and regression trees (RTs) to analyse continuous ones. The CART methodology has become increasingly popular in the medical field, yet only a few examples of studies using this methodology specifically in haemophilia have to date been published. Two previously published examples using CART analysis in this field are didactically explained in detail. There is increasing interest in using CART analysis in the health domain, primarily due to its ease of implementation, use, and interpretation, thus facilitating medical decision-making. This method should be promoted for analysing continuous or categorical outcomes in haemophilia, when applicable. © 2015 John Wiley & Sons Ltd.
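The repeated partitioning that CART relies on can be illustrated with a minimal sketch of a single regression-tree split on one predictor. The abstract does not name the splitting criterion; the sum of squared errors used here, and the function names `sse` and `best_split`, are assumptions made for illustration only.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, total_sse) for the best binary split on one predictor.

    Candidate thresholds are midpoints between consecutive distinct x values.
    Returns (None, sse(ys)) when no split reduces the error.
    """
    pairs = sorted(zip(xs, ys))
    best_threshold, best_cost = None, sse(ys)
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal x values
        threshold = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        cost = sse(left) + sse(right)
        if cost < best_cost:
            best_threshold, best_cost = threshold, cost
    return best_threshold, best_cost
```

A full tree would apply `best_split` recursively to each resulting subgroup, over all predictors, until a stopping rule is met.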
The potential of UAS imagery for soil mapping at the agricultural plot scale
NASA Astrophysics Data System (ADS)
Gilliot, Jean-Marc; Michelin, Joël; Becu, Maxime; Cissé, Moustapha; Hadjar, Dalila; Vaudour, Emmanuelle
2017-04-01
Soil mapping is expensive and time consuming. Airborne and satellite remote sensing data have already been used to predict some soil properties, but Unmanned Aerial Systems (UAS) now allow many image acquisitions in various field conditions, which favours the development of methods for building better prediction models. This study proposes an operational method for spatial prediction of soil properties (organic carbon, clay) at the scale of the agricultural plot by using UAS imagery. An agricultural plot of 28 ha, located in the western region of Paris, France, was studied from March to May 2016. An area of 3.6 ha was delimited within the plot and a total of 16 flights were completed. The UAS platforms used were the eBee fixed wing provided by Sensefly® flying at an altitude from 60m to 130m and the iris+ 3DR® Quadcopter (from 30m to 100m). Two multispectral visible near-infrared cameras were used: the AirInov® MultiSPEC 4C® and the Micasense® RedEdge®. 42 ground control points (GCP) were sampled within the 3.6 ha plot. A centimetric Trimble Geo 7x DGPS was used to determine precise GCP positions. On each GCP the soil horizons were described and the topsoil was sampled for standard physico-chemical analysis. Ground spectral measurements with a Spectral Evolution® SR-3500 spectroradiometer were made synchronously with the drone flights. 22 additional GCP were placed around the 3.6 ha area to achieve precise georeferencing. The multispectral mosaics were calculated using the Agisoft Photoscan® software and all mapping processing was done with the ESRI ArcGIS® 10.3 software. The soil properties were estimated by partial least squares regression (PLSR) between the laboratory analyses and the multispectral information of the UAS images, with the PLS package of the R software. The objective was to establish a model that would achieve an acceptable prediction quality using a minimum number of points.
For this, we tested 5 models with a decreasing number of calibration points: 20, 15, 10, 5 and 3 points. The remaining points were used to validate the models. The point positions were determined on the basis of a soil brightness index map calculated from the UAS image, in order to distribute the points in areas of contrasting brightness. The Root Mean Squared Errors of Prediction (RMSEP) obtained by cross-validation were 1.6 g.kg-1 and 28 g.kg-1 for organic carbon and clay respectively, with 20 points. Results showed that acceptable precision (2 g.kg-1 and 48 g.kg-1) could be obtained with only 3 points. This work was supported by the SolFIT research network of the BASC LabEx (Laboratory of Excellence) and by the TOSCA-PLEIADES-CO project of the French Space Agency (CNES).
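The RMSEP figure of merit reported above can be sketched in a few lines. This is a generic illustration of the usual definition (square root of the mean squared difference between observed and predicted values), not the authors' R code; the function name `rmsep` is hypothetical.

```python
import math


def rmsep(observed, predicted):
    """Root mean squared error of prediction for paired observations."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

In a validation setting such as the one described, `observed` would hold the laboratory values at the held-out points and `predicted` the corresponding model predictions, so the result carries the same units as the soil property.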
Using Refined Regression Analysis To Assess The Ecological Services Of Restored Wetlands
A hierarchical approach to regression analysis of wetland water treatment was conducted to determine which factors are the most appropriate for characterizing wetlands of differing structure and function. We used this approach in an effort to identify the types and characteristi...
Regression Analysis: Instructional Resource for Cost/Managerial Accounting
ERIC Educational Resources Information Center
Stout, David E.
2015-01-01
This paper describes a classroom-tested instructional resource, grounded in principles of active learning and constructivism, that embraces two primary objectives: "demystify" for accounting students technical material from statistics regarding ordinary least-squares (OLS) regression analysis--material that students may find obscure or…
Ultrasound-enhanced bioscouring of greige cotton: regression analysis of process factors
USDA-ARS's Scientific Manuscript database
Process factors of enzyme concentration, time, power and frequency were investigated for ultrasound-enhanced bioscouring of greige cotton. A fractional factorial experimental design and subsequent regression analysis of the process factors were employed to determine the significance of each factor a...
London Measure of Unplanned Pregnancy: guidance for its use as an outcome measure
Hall, Jennifer A; Barrett, Geraldine; Copas, Andrew; Stephenson, Judith
2017-01-01
Background The London Measure of Unplanned Pregnancy (LMUP) is a psychometrically validated measure of the degree of intention of a current or recent pregnancy. The LMUP is increasingly being used worldwide, and can be used to evaluate family planning or preconception care programs. However, beyond recommending the use of the full LMUP scale, there is no published guidance on how to use the LMUP as an outcome measure. Ordinal logistic regression has been recommended informally, but studies published to date have all used binary logistic regression and dichotomized the scale at different cut points. There is thus a need for evidence-based guidance to provide a standardized methodology for multivariate analysis and to enable comparison of results. This paper makes recommendations for the regression method for analysis of the LMUP as an outcome measure. Materials and methods Data collected from 4,244 pregnant women in Malawi were used to compare five regression methods: linear, logistic with two cut points, and ordinal logistic with either the full or grouped LMUP score. The recommendations were then tested on the original UK LMUP data. Results There were small but no important differences in the findings across the regression models. Logistic regression resulted in the largest loss of information, and assumptions were violated for the linear and ordinal logistic regression. Consequently, robust standard errors were used for linear regression and a partial proportional odds ordinal logistic regression model attempted. The latter could only be fitted for grouped LMUP score. Conclusion We recommend the linear regression model with robust standard errors to make full use of the LMUP score when analyzed as an outcome measure. Ordinal logistic regression could be considered, but a partial proportional odds model with grouped LMUP score may be required. Logistic regression is the least-favored option, due to the loss of information. 
For logistic regression, the cut point for un/planned pregnancy should be between nine and ten. These recommendations will standardize the analysis of LMUP data and enhance comparability of results across studies. PMID:28435343
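For readers who do need the binary outcome, the recommended cut point can be expressed as a small sketch. It assumes integer LMUP scores and treats a score of 10 or more as planned, which is one reading of a cut "between nine and ten"; the function names are hypothetical.

```python
def is_planned(lmup_score, cut_point=10):
    """Return True when the score is at or above the cut point.

    With the default, scores of 9 or less map to False and scores of
    10 or more map to True, so the cut lies between nine and ten.
    """
    return lmup_score >= cut_point


def proportion_planned(scores, cut_point=10):
    """Share of pregnancies classified as planned in a list of scores."""
    return sum(is_planned(s, cut_point) for s in scores) / len(scores)
```

Collapsing the score this way discards the ordering within each group, which is the loss of information the authors cite as the reason logistic regression is their least-favored option.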
Afantitis, Antreas; Melagraki, Georgia; Sarimveis, Haralambos; Koutentis, Panayiotis A; Markopoulos, John; Igglessi-Markopoulou, Olga
2006-08-01
A quantitative structure-activity relationship was obtained by applying Multiple Linear Regression Analysis to a series of 80 1-[2-hydroxyethoxy-methyl]-6-(phenylthio) thymine (HEPT) derivatives with significant anti-HIV activity. For the selection of the best among 37 different descriptors, the Elimination Selection Stepwise Regression Method (ES-SWR) was utilized. The resulting QSAR model (R²(CV) = 0.8160; S(PRESS) = 0.5680) proved to be very accurate both in training and predictive stages.
The use of cognitive ability measures as explanatory variables in regression analysis
Junker, Brian; Schofield, Lynne Steuerle; Taylor, Lowell J
2015-01-01
Cognitive ability measures are often taken as explanatory variables in regression analysis, e.g., as a factor affecting a market outcome such as an individual’s wage, or a decision such as an individual’s education acquisition. Cognitive ability is a latent construct; its true value is unobserved. Nonetheless, researchers often assume that a test score, constructed via standard psychometric practice from individuals’ responses to test items, can be safely used in regression analysis. We examine problems that can arise, and suggest that an alternative approach, a “mixed effects structural equations” (MESE) model, may be more appropriate in many circumstances. PMID:26998417
Suzuki, Hideaki; Tabata, Takahisa; Koizumi, Hiroki; Hohchi, Nobusuke; Takeuchi, Shoko; Kitamura, Takuro; Fujino, Yoshihisa; Ohbuchi, Toyoaki
2014-12-01
This study aimed to create a multiple regression model for predicting hearing outcomes of idiopathic sudden sensorineural hearing loss (ISSNHL). The participants were 205 consecutive patients (205 ears) with ISSNHL (hearing level ≥ 40 dB, interval between onset and treatment ≤ 30 days). They received systemic steroid administration combined with intratympanic steroid injection. Data were examined by simple and multiple regression analyses. Three hearing indices (percentage hearing improvement, hearing gain, and posttreatment hearing level [HLpost]) and 7 prognostic factors (age, days from onset to treatment, initial hearing level, initial hearing level at low frequencies, initial hearing level at high frequencies, presence of vertigo, and contralateral hearing level) were included in the multiple regression analysis as dependent and explanatory variables, respectively. In the simple regression analysis, the percentage hearing improvement, hearing gain, and HLpost showed significant correlation with 2, 5, and 6 of the 7 prognostic factors, respectively. The multiple correlation coefficients were 0.396, 0.503, and 0.714 for the percentage hearing improvement, hearing gain, and HLpost, respectively. Predicted values of HLpost calculated by the multiple regression equation were reliable with 70% probability with a 40-dB-width prediction interval. Prediction of HLpost by the multiple regression model may be useful to estimate the hearing prognosis of ISSNHL. © The Author(s) 2014.
MULGRES: a computer program for stepwise multiple regression analysis
A. Jeff Martin
1971-01-01
MULGRES is a computer program source deck that is designed for multiple regression analysis employing the technique of stepwise deletion in the search for most significant variables. The features of the program, along with inputs and outputs, are briefly described, with a note on machine compatibility.
CatReg Software for Categorical Regression Analysis (May 2016)
CatReg 3.0 is a Microsoft Windows enhanced version of the Agency’s categorical regression analysis (CatReg) program. CatReg complements EPA’s existing Benchmark Dose Software (BMDS) by greatly enhancing a risk assessor’s ability to determine whether data from separate toxicologic...
Method for nonlinear exponential regression analysis
NASA Technical Reports Server (NTRS)
Junkin, B. G.
1972-01-01
Two computer programs developed according to two general types of exponential models for conducting nonlinear exponential regression analysis are described. Least squares procedure is used in which the nonlinear problem is linearized by expanding in a Taylor series. Program is written in FORTRAN 5 for the Univac 1108 computer.
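The linearize-and-iterate idea can be sketched for one simple exponential model, y = a·exp(b·x). This is an illustration only and not the FORTRAN programs described: the model form, the function names, and the log-linear starting values (which require y > 0) are all assumptions. Each pass expands the model to first order in a Taylor series around the current estimates and solves the resulting 2×2 linear least squares problem for the correction.

```python
import math


def log_linear_start(xs, ys):
    """Starting values from a straight-line fit of ln(y) on x (requires y > 0)."""
    n = len(xs)
    logs = [math.log(y) for y in ys]
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxl = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
    b = sxl / sxx
    return math.exp(mean_l - b * mean_x), b


def fit_exponential(xs, ys, start=None, max_iter=50, tol=1e-12):
    """Fit y = a * exp(b * x) by repeated first-order Taylor linearization."""
    a, b = start if start is not None else log_linear_start(xs, ys)
    for _ in range(max_iter):
        s_aa = s_ab = s_bb = g_a = g_b = 0.0
        for x, y in zip(xs, ys):
            e = math.exp(b * x)
            residual = y - a * e
            d_a = e          # partial derivative of the model w.r.t. a
            d_b = a * x * e  # partial derivative of the model w.r.t. b
            s_aa += d_a * d_a
            s_ab += d_a * d_b
            s_bb += d_b * d_b
            g_a += d_a * residual
            g_b += d_b * residual
        det = s_aa * s_bb - s_ab * s_ab
        if det == 0.0:
            break  # normal equations are singular; keep current estimates
        step_a = (s_bb * g_a - s_ab * g_b) / det
        step_b = (s_aa * g_b - s_ab * g_a) / det
        a += step_a
        b += step_b
        if abs(step_a) + abs(step_b) < tol:
            break
    return a, b
```

The iteration has no step-size control, so it is only expected to behave well when the starting values are reasonably close to the solution.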
ERIC Educational Resources Information Center
Koon, Sharon; Petscher, Yaacov
2015-01-01
The purpose of this report was to explicate the use of logistic regression and classification and regression tree (CART) analysis in the development of early warning systems. It was motivated by state education leaders' interest in maintaining high classification accuracy while simultaneously improving practitioner understanding of the rules by…
Interrupted Time Series Versus Statistical Process Control in Quality Improvement Projects.
Andersson Hagiwara, Magnus; Andersson Gäre, Boel; Elg, Mattias
2016-01-01
To measure the effect of quality improvement interventions, it is appropriate to use analysis methods that measure data over time. Examples of such methods include statistical process control analysis and interrupted time series with segmented regression analysis. This article compares the use of statistical process control analysis and interrupted time series with segmented regression analysis for evaluating the longitudinal effects of quality improvement interventions, using an example study on an evaluation of a computerized decision support system.
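A minimal sketch of segmented regression for a single interruption follows. It assumes the common four-parameter model with a level change and a slope change at the intervention time `t0`; under that model the least squares fit separates into one straight line before and one after the interruption, which is what the code exploits. All names are hypothetical, and autocorrelation, which a real interrupted time series analysis would need to address, is ignored.

```python
def fit_line(ts, ys):
    """Ordinary least squares intercept and slope for one segment."""
    n = len(ts)
    mean_t = sum(ts) / n
    mean_y = sum(ys) / n
    stt = sum((t - mean_t) ** 2 for t in ts)
    slope = sum((t - mean_t) * (y - mean_y) for t, y in zip(ts, ys)) / stt
    return mean_y - slope * mean_t, slope


def segmented_regression(ts, ys, t0):
    """Estimate level and slope changes at intervention time t0.

    Points with t < t0 form the pre-intervention segment, the rest the
    post-intervention segment; each segment needs at least two time points.
    """
    pre = [(t, y) for t, y in zip(ts, ys) if t < t0]
    post = [(t, y) for t, y in zip(ts, ys) if t >= t0]
    pre_intercept, pre_slope = fit_line(*zip(*pre))
    post_intercept, post_slope = fit_line(*zip(*post))
    return {
        "pre_slope": pre_slope,
        # Gap between the two fitted lines evaluated at t0.
        "level_change": (post_intercept + post_slope * t0)
        - (pre_intercept + pre_slope * t0),
        "slope_change": post_slope - pre_slope,
    }
```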
NASA Astrophysics Data System (ADS)
Prahutama, Alan; Suparti; Wahyu Utami, Tiani
2018-03-01
Regression analysis models the relationship between response variables and predictor variables. The parametric approach to regression imposes strict assumptions on the form of the model, whereas nonparametric regression does not require an assumed model form. Time series data are observations of a variable recorded over time, so to model a time series by regression, the response and predictor variables must be determined first. The response variable is the value at time t (y_t), while the predictor variables are its significant lags. In nonparametric regression modeling, one developing approach is the Fourier series approach. One advantage of nonparametric regression using Fourier series is its ability to handle data that follow a trigonometric pattern. Modeling with Fourier series requires the parameter K, whose value can be determined by the Generalized Cross Validation method. In inflation modeling for the transportation, communication and financial services sector, the Fourier series approach yields an optimal K of 120 parameters with an R-square of 99%, whereas multiple linear regression yields an R-square of 90%.
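Choosing K by Generalized Cross Validation can be sketched as follows. The basis used here (an intercept plus cosine and sine terms up to harmonic K) is an assumption, since the abstract does not give the exact form of the Fourier estimator, and all function names are hypothetical. For a full-rank least squares fit with p coefficients the trace of the hat matrix equals p, so the score is GCV(K) = (RSS/n) / (1 − p/n)².

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def design_row(x, k_max):
    """Basis values at x: intercept, then cos(kx) and sin(kx) for k = 1..k_max."""
    row = [1.0]
    for k in range(1, k_max + 1):
        row += [math.cos(k * x), math.sin(k * x)]
    return row


def gcv(xs, ys, k_max):
    """Least squares fit with k_max harmonics; returns the GCV score."""
    rows = [design_row(x, k_max) for x in xs]
    n, p = len(xs), len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(p)]
    coef = solve(xtx, xty)
    rss = sum((y - sum(c * v for c, v in zip(coef, r))) ** 2
              for r, y in zip(rows, ys))
    return (rss / n) / (1.0 - p / n) ** 2


def select_k(xs, ys, candidates):
    """Return the candidate number of harmonics with the smallest GCV score."""
    return min(candidates, key=lambda k: gcv(xs, ys, k))
```

The score is only defined while the number of coefficients stays below the number of observations, so the candidate list should be capped accordingly.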
Westreich, Daniel; Lessler, Justin; Funk, Michele Jonsson
2010-08-01
Propensity scores for the analysis of observational data are typically estimated using logistic regression. Our objective in this review was to assess machine learning alternatives to logistic regression, which may accomplish the same goals but with fewer assumptions or greater accuracy. We identified alternative methods for propensity score estimation and/or classification from the public health, biostatistics, discrete mathematics, and computer science literature, and evaluated these algorithms for applicability to the problem of propensity score estimation, potential advantages over logistic regression, and ease of use. We identified four techniques as alternatives to logistic regression: neural networks, support vector machines, decision trees (classification and regression trees [CART]), and meta-classifiers (in particular, boosting). Although the assumptions of logistic regression are well understood, those assumptions are frequently ignored. All four alternatives have advantages and disadvantages compared with logistic regression. Boosting (meta-classifiers) and, to a lesser extent, decision trees (particularly CART), appear to be most promising for use in the context of propensity score analysis, but extensive simulation studies are needed to establish their utility in practice. Copyright (c) 2010 Elsevier Inc. All rights reserved.
Analysis of Sting Balance Calibration Data Using Optimized Regression Models
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert; Bader, Jon B.
2009-01-01
Calibration data of a wind tunnel sting balance was processed using a search algorithm that identifies an optimized regression model for the data analysis. The selected sting balance had two moment gages that were mounted forward and aft of the balance moment center. The difference and the sum of the two gage outputs were fitted in the least squares sense using the normal force and the pitching moment at the balance moment center as independent variables. The regression model search algorithm predicted that the difference of the gage outputs should be modeled using the intercept and the normal force. The sum of the two gage outputs, on the other hand, should be modeled using the intercept, the pitching moment, and the square of the pitching moment. Equations of the deflection of a cantilever beam are used to show that the search algorithm's two recommended math models can also be obtained after performing a rigorous theoretical analysis of the deflection of the sting balance under load. The analysis of the sting balance calibration data set is a rare example of a situation when regression models of balance calibration data can directly be derived from first principles of physics and engineering. In addition, it is interesting to see that the search algorithm recommended the same regression models for the data analysis using only a set of statistical quality metrics.
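Written out, the two recommended models take the following form. The symbols are assumptions introduced here for illustration and do not come from the abstract: $\Delta R$ and $\Sigma R$ denote the difference and the sum of the two gage outputs, $N$ the normal force, $M$ the pitching moment at the balance moment center, and $a_i$, $b_i$ the fitted regression coefficients.

```latex
\begin{align}
\Delta R &= a_0 + a_1 N \\
\Sigma R &= b_0 + b_1 M + b_2 M^2
\end{align}
```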
Meta-regression analysis of commensal and pathogenic Escherichia coli survival in soil and water.
Franz, Eelco; Schijven, Jack; de Roda Husman, Ana Maria; Blaak, Hetty
2014-06-17
The extent to which pathogenic and commensal E. coli (respectively PEC and CEC) can survive, and which factors predominantly determine the rate of decline, are crucial issues from a public health point of view. The goal of this study was to provide a quantitative summary of the variability in E. coli survival in soil and water over a broad range of individual studies and to identify the most important sources of variability. To that end, a meta-regression analysis on available literature data was conducted. The considerable variation in reported decline rates indicated that the persistence of E. coli is not easily predictable. The meta-analysis demonstrated that for soil and water, the type of experiment (laboratory or field), the matrix subtype (type of water and soil), and temperature were the main factors included in the regression analysis. A higher average decline rate in soil of PEC compared with CEC was observed. The regression models explained at best 57% of the variation in decline rate in soil and 41% of the variation in decline rate in water. This indicates that additional factors, not included in the current meta-regression analysis, are of importance but rarely reported. More complete reporting of experimental conditions may allow future inference on the global effects of these variables on the decline rate of E. coli.