squared prediction errors: Topics by Science.gov

Sample records for squared prediction errors

Discordance between net analyte signal theory and practical multivariate calibration.

PubMed

Brown, Christopher D

2004-08-01

Lorber's concept of net analyte signal is reviewed in the context of classical and inverse least-squares approaches to multivariate calibration. It is shown that, in the presence of device measurement error, the classical and inverse calibration procedures have radically different theoretical prediction objectives, and the assertion that the popular inverse least-squares procedures (including partial least squares, principal components regression) approximate Lorber's net analyte signal vector in the limit is disproved. Exact theoretical expressions for the prediction error bias, variance, and mean-squared error are given under general measurement error conditions, which reinforce the very discrepant behavior between these two predictive approaches, and Lorber's net analyte signal theory. Implications for multivariate figures of merit and numerous recently proposed preprocessing treatments involving orthogonal projections are also discussed.
Estimating Model Prediction Error: Should You Treat Predictions as Fixed or Random?

NASA Technical Reports Server (NTRS)

Wallach, Daniel; Thorburn, Peter; Asseng, Senthold; Challinor, Andrew J.; Ewert, Frank; Jones, James W.; Rotter, Reimund; Ruane, Alexander

2016-01-01

Crop models are important tools for impact assessment of climate change, as well as for exploring management options under current climate. It is essential to evaluate the uncertainty associated with predictions of these models. We compare two criteria of prediction error; MSEP fixed, which evaluates mean squared error of prediction for a model with fixed structure, parameters and inputs, and MSEP uncertain( X), which evaluates mean squared error averaged over the distributions of model structure, inputs and parameters. Comparison of model outputs with data can be used to estimate the former. The latter has a squared bias term, which can be estimated using hindcasts, and a model variance term, which can be estimated from a simulation experiment. The separate contributions to MSEP uncertain (X) can be estimated using a random effects ANOVA. It is argued that MSEP uncertain (X) is the more informative uncertainty criterion, because it is specific to each prediction situation.
Chemical library subset selection algorithms: a unified derivation using spatial statistics.

PubMed

Hamprecht, Fred A; Thiel, Walter; van Gunsteren, Wilfred F

2002-01-01

If similar compounds have similar activity, rational subset selection becomes superior to random selection in screening for pharmacological lead discovery programs. Traditional approaches to this experimental design problem fall into two classes: (i) a linear or quadratic response function is assumed (ii) some space filling criterion is optimized. The assumptions underlying the first approach are clear but not always defendable; the second approach yields more intuitive designs but lacks a clear theoretical foundation. We model activity in a bioassay as realization of a stochastic process and use the best linear unbiased estimator to construct spatial sampling designs that optimize the integrated mean square prediction error, the maximum mean square prediction error, or the entropy. We argue that our approach constitutes a unifying framework encompassing most proposed techniques as limiting cases and sheds light on their underlying assumptions. In particular, vector quantization is obtained, in dimensions up to eight, in the limiting case of very smooth response surfaces for the integrated mean square error criterion. Closest packing is obtained for very rough surfaces under the integrated mean square error and entropy criteria. We suggest to use either the integrated mean square prediction error or the entropy as optimization criteria rather than approximations thereof and propose a scheme for direct iterative minimization of the integrated mean square prediction error. Finally, we discuss how the quality of chemical descriptors manifests itself and clarify the assumptions underlying the selection of diverse or representative subsets.
Does the sensorimotor system minimize prediction error or select the most likely prediction during object lifting?

PubMed Central

McGregor, Heather R.; Pun, Henry C. H.; Buckingham, Gavin; Gribble, Paul L.

2016-01-01

The human sensorimotor system is routinely capable of making accurate predictions about an object's weight, which allows for energetically efficient lifts and prevents objects from being dropped. Often, however, poor predictions arise when the weight of an object can vary and sensory cues about object weight are sparse (e.g., picking up an opaque water bottle). The question arises, what strategies does the sensorimotor system use to make weight predictions when one is dealing with an object whose weight may vary? For example, does the sensorimotor system use a strategy that minimizes prediction error (minimal squared error) or one that selects the weight that is most likely to be correct (maximum a posteriori)? In this study we dissociated the predictions of these two strategies by having participants lift an object whose weight varied according to a skewed probability distribution. We found, using a small range of weight uncertainty, that four indexes of sensorimotor prediction (grip force rate, grip force, load force rate, and load force) were consistent with a feedforward strategy that minimizes the square of prediction errors. These findings match research in the visuomotor system, suggesting parallels in underlying processes. We interpret our findings within a Bayesian framework and discuss the potential benefits of using a minimal squared error strategy. NEW & NOTEWORTHY Using a novel experimental model of object lifting, we tested whether the sensorimotor system models the weight of objects by minimizing lifting errors or by selecting the statistically most likely weight. We found that the sensorimotor system minimizes the square of prediction errors for object lifting. This parallels the results of studies that investigated visually guided reaching, suggesting an overlap in the underlying mechanisms between tasks that involve different sensory systems. PMID:27760821
Mitigating Errors in External Respiratory Surrogate-Based Models of Tumor Position

DOE Office of Scientific and Technical Information (OSTI.GOV)

Malinowski, Kathleen T.; Fischell Department of Bioengineering, University of Maryland, College Park, MD; McAvoy, Thomas J.

2012-04-01

Purpose: To investigate the effect of tumor site, measurement precision, tumor-surrogate correlation, training data selection, model design, and interpatient and interfraction variations on the accuracy of external marker-based models of tumor position. Methods and Materials: Cyberknife Synchrony system log files comprising synchronously acquired positions of external markers and the tumor from 167 treatment fractions were analyzed. The accuracy of Synchrony, ordinary-least-squares regression, and partial-least-squares regression models for predicting the tumor position from the external markers was evaluated. The quantity and timing of the data used to build the predictive model were varied. The effects of tumor-surrogate correlation and the precisionmore » in both the tumor and the external surrogate position measurements were explored by adding noise to the data. Results: The tumor position prediction errors increased during the duration of a fraction. Increasing the training data quantities did not always lead to more accurate models. Adding uncorrelated noise to the external marker-based inputs degraded the tumor-surrogate correlation models by 16% for partial-least-squares and 57% for ordinary-least-squares. External marker and tumor position measurement errors led to tumor position prediction changes 0.3-3.6 times the magnitude of the measurement errors, varying widely with model algorithm. The tumor position prediction errors were significantly associated with the patient index but not with the fraction index or tumor site. Partial-least-squares was as accurate as Synchrony and more accurate than ordinary-least-squares. Conclusions: The accuracy of surrogate-based inferential models of tumor position was affected by all the investigated factors, except for the tumor site and fraction index.« less
Response Surface Modeling Using Multivariate Orthogonal Functions

NASA Technical Reports Server (NTRS)

Morelli, Eugene A.; DeLoach, Richard

2001-01-01

A nonlinear modeling technique was used to characterize response surfaces for non-dimensional longitudinal aerodynamic force and moment coefficients, based on wind tunnel data from a commercial jet transport model. Data were collected using two experimental procedures - one based on modem design of experiments (MDOE), and one using a classical one factor at a time (OFAT) approach. The nonlinear modeling technique used multivariate orthogonal functions generated from the independent variable data as modeling functions in a least squares context to characterize the response surfaces. Model terms were selected automatically using a prediction error metric. Prediction error bounds computed from the modeling data alone were found to be- a good measure of actual prediction error for prediction points within the inference space. Root-mean-square model fit error and prediction error were less than 4 percent of the mean response value in all cases. Efficacy and prediction performance of the response surface models identified from both MDOE and OFAT experiments were investigated.
Quality assessment of gasoline using comprehensive two-dimensional gas chromatography combined with unfolded partial least squares: A reliable approach for the detection of gasoline adulteration.

PubMed

Parastar, Hadi; Mostafapour, Sara; Azimi, Gholamhasan

2016-01-01

Comprehensive two-dimensional gas chromatography and flame ionization detection combined with unfolded-partial least squares is proposed as a simple, fast and reliable method to assess the quality of gasoline and to detect its potential adulterants. The data for the calibration set are first baseline corrected using a two-dimensional asymmetric least squares algorithm. The number of significant partial least squares components to build the model is determined using the minimum value of root-mean square error of leave-one out cross validation, which was 4. In this regard, blends of gasoline with kerosene, white spirit and paint thinner as frequently used adulterants are used to make calibration samples. Appropriate statistical parameters of regression coefficient of 0.996-0.998, root-mean square error of prediction of 0.005-0.010 and relative error of prediction of 1.54-3.82% for the calibration set show the reliability of the developed method. In addition, the developed method is externally validated with three samples in validation set (with a relative error of prediction below 10.0%). Finally, to test the applicability of the proposed strategy for the analysis of real samples, five real gasoline samples collected from gas stations are used for this purpose and the gasoline proportions were in range of 70-85%. Also, the relative standard deviations were below 8.5% for different samples in the prediction set. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Measurement System Characterization in the Presence of Measurement Errors

NASA Technical Reports Server (NTRS)

Commo, Sean A.

2012-01-01

In the calibration of a measurement system, data are collected in order to estimate a mathematical model between one or more factors of interest and a response. Ordinary least squares is a method employed to estimate the regression coefficients in the model. The method assumes that the factors are known without error; yet, it is implicitly known that the factors contain some uncertainty. In the literature, this uncertainty is known as measurement error. The measurement error affects both the estimates of the model coefficients and the prediction, or residual, errors. There are some methods, such as orthogonal least squares, that are employed in situations where measurement errors exist, but these methods do not directly incorporate the magnitude of the measurement errors. This research proposes a new method, known as modified least squares, that combines the principles of least squares with knowledge about the measurement errors. This knowledge is expressed in terms of the variance ratio - the ratio of response error variance to measurement error variance.
Very-short-term wind power prediction by a hybrid model with single- and multi-step approaches

NASA Astrophysics Data System (ADS)

Mohammed, E.; Wang, S.; Yu, J.

2017-05-01

Very-short-term wind power prediction (VSTWPP) has played an essential role for the operation of electric power systems. This paper aims at improving and applying a hybrid method of VSTWPP based on historical data. The hybrid method is combined by multiple linear regressions and least square (MLR&LS), which is intended for reducing prediction errors. The predicted values are obtained through two sub-processes:1) transform the time-series data of actual wind power into the power ratio, and then predict the power ratio;2) use the predicted power ratio to predict the wind power. Besides, the proposed method can include two prediction approaches: single-step prediction (SSP) and multi-step prediction (MSP). WPP is tested comparatively by auto-regressive moving average (ARMA) model from the predicted values and errors. The validity of the proposed hybrid method is confirmed in terms of error analysis by using probability density function (PDF), mean absolute percent error (MAPE) and means square error (MSE). Meanwhile, comparison of the correlation coefficients between the actual values and the predicted values for different prediction times and window has confirmed that MSP approach by using the hybrid model is the most accurate while comparing to SSP approach and ARMA. The MLR&LS is accurate and promising for solving problems in WPP.
A Canonical Ensemble Correlation Prediction Model for Seasonal Precipitation Anomaly

NASA Technical Reports Server (NTRS)

Shen, Samuel S. P.; Lau, William K. M.; Kim, Kyu-Myong; Li, Guilong

2001-01-01

This report describes an optimal ensemble forecasting model for seasonal precipitation and its error estimation. Each individual forecast is based on the canonical correlation analysis (CCA) in the spectral spaces whose bases are empirical orthogonal functions (EOF). The optimal weights in the ensemble forecasting crucially depend on the mean square error of each individual forecast. An estimate of the mean square error of a CCA prediction is made also using the spectral method. The error is decomposed onto EOFs of the predictand and decreases linearly according to the correlation between the predictor and predictand. This new CCA model includes the following features: (1) the use of area-factor, (2) the estimation of prediction error, and (3) the optimal ensemble of multiple forecasts. The new CCA model is applied to the seasonal forecasting of the United States precipitation field. The predictor is the sea surface temperature.
Intrinsic Raman spectroscopy for quantitative biological spectroscopy Part II

PubMed Central

Bechtel, Kate L.; Shih, Wei-Chuan; Feld, Michael S.

2009-01-01

We demonstrate the effectiveness of intrinsic Raman spectroscopy (IRS) at reducing errors caused by absorption and scattering. Physical tissue models, solutions of varying absorption and scattering coefficients with known concentrations of Raman scatterers, are studied. We show significant improvement in prediction error by implementing IRS to predict concentrations of Raman scatterers using both ordinary least squares regression (OLS) and partial least squares regression (PLS). In particular, we show that IRS provides a robust calibration model that does not increase in error when applied to samples with optical properties outside the range of calibration. PMID:18711512
Water quality management using statistical analysis and time-series prediction model

NASA Astrophysics Data System (ADS)

Parmar, Kulwinder Singh; Bhardwaj, Rashmi

2014-12-01

This paper deals with water quality management using statistical analysis and time-series prediction model. The monthly variation of water quality standards has been used to compare statistical mean, median, mode, standard deviation, kurtosis, skewness, coefficient of variation at Yamuna River. Model validated using R-squared, root mean square error, mean absolute percentage error, maximum absolute percentage error, mean absolute error, maximum absolute error, normalized Bayesian information criterion, Ljung-Box analysis, predicted value and confidence limits. Using auto regressive integrated moving average model, future water quality parameters values have been estimated. It is observed that predictive model is useful at 95 % confidence limits and curve is platykurtic for potential of hydrogen (pH), free ammonia, total Kjeldahl nitrogen, dissolved oxygen, water temperature (WT); leptokurtic for chemical oxygen demand, biochemical oxygen demand. Also, it is observed that predicted series is close to the original series which provides a perfect fit. All parameters except pH and WT cross the prescribed limits of the World Health Organization /United States Environmental Protection Agency, and thus water is not fit for drinking, agriculture and industrial use.
Some Results on Mean Square Error for Factor Score Prediction

ERIC Educational Resources Information Center

Krijnen, Wim P.

2006-01-01

For the confirmatory factor model a series of inequalities is given with respect to the mean square error (MSE) of three main factor score predictors. The eigenvalues of these MSE matrices are a monotonic function of the eigenvalues of the matrix gamma[subscript rho] = theta[superscript 1/2] lambda[subscript rho] 'psi[subscript rho] [superscript…
Weighted linear regression using D2H and D2 as the independent variables

Treesearch

Hans T. Schreuder; Michael S. Williams

1998-01-01

Several error structures for weighted regression equations used for predicting volume were examined for 2 large data sets of felled and standing loblolly pine trees (Pinus taeda L.). The generally accepted model with variance of error proportional to the value of the covariate squared ( D2H = diameter squared times height or D...
Hypoglycemia early alarm systems based on recursive autoregressive partial least squares models.

PubMed

Bayrak, Elif Seyma; Turksoy, Kamuran; Cinar, Ali; Quinn, Lauretta; Littlejohn, Elizabeth; Rollins, Derrick

2013-01-01

Hypoglycemia caused by intensive insulin therapy is a major challenge for artificial pancreas systems. Early detection and prevention of potential hypoglycemia are essential for the acceptance of fully automated artificial pancreas systems. Many of the proposed alarm systems are based on interpretation of recent values or trends in glucose values. In the present study, subject-specific linear models are introduced to capture glucose variations and predict future blood glucose concentrations. These models can be used in early alarm systems of potential hypoglycemia. A recursive autoregressive partial least squares (RARPLS) algorithm is used to model the continuous glucose monitoring sensor data and predict future glucose concentrations for use in hypoglycemia alarm systems. The partial least squares models constructed are updated recursively at each sampling step with a moving window. An early hypoglycemia alarm algorithm using these models is proposed and evaluated. Glucose prediction models based on real-time filtered data has a root mean squared error of 7.79 and a sum of squares of glucose prediction error of 7.35% for six-step-ahead (30 min) glucose predictions. The early alarm systems based on RARPLS shows good performance. A sensitivity of 86% and a false alarm rate of 0.42 false positive/day are obtained for the early alarm system based on six-step-ahead predicted glucose values with an average early detection time of 25.25 min. The RARPLS models developed provide satisfactory glucose prediction with relatively smaller error than other proposed algorithms and are good candidates to forecast and warn about potential hypoglycemia unless preventive action is taken far in advance. © 2012 Diabetes Technology Society.
Hypoglycemia Early Alarm Systems Based on Recursive Autoregressive Partial Least Squares Models

PubMed Central

Bayrak, Elif Seyma; Turksoy, Kamuran; Cinar, Ali; Quinn, Lauretta; Littlejohn, Elizabeth; Rollins, Derrick

2013-01-01

Background Hypoglycemia caused by intensive insulin therapy is a major challenge for artificial pancreas systems. Early detection and prevention of potential hypoglycemia are essential for the acceptance of fully automated artificial pancreas systems. Many of the proposed alarm systems are based on interpretation of recent values or trends in glucose values. In the present study, subject-specific linear models are introduced to capture glucose variations and predict future blood glucose concentrations. These models can be used in early alarm systems of potential hypoglycemia. Methods A recursive autoregressive partial least squares (RARPLS) algorithm is used to model the continuous glucose monitoring sensor data and predict future glucose concentrations for use in hypoglycemia alarm systems. The partial least squares models constructed are updated recursively at each sampling step with a moving window. An early hypoglycemia alarm algorithm using these models is proposed and evaluated. Results Glucose prediction models based on real-time filtered data has a root mean squared error of 7.79 and a sum of squares of glucose prediction error of 7.35% for six-step-ahead (30 min) glucose predictions. The early alarm systems based on RARPLS shows good performance. A sensitivity of 86% and a false alarm rate of 0.42 false positive/day are obtained for the early alarm system based on six-step-ahead predicted glucose values with an average early detection time of 25.25 min. Conclusions The RARPLS models developed provide satisfactory glucose prediction with relatively smaller error than other proposed algorithms and are good candidates to forecast and warn about potential hypoglycemia unless preventive action is taken far in advance. PMID:23439179
Application of Rapid Visco Analyser (RVA) viscograms and chemometrics for maize hardness characterisation.

PubMed

Guelpa, Anina; Bevilacqua, Marta; Marini, Federico; O'Kennedy, Kim; Geladi, Paul; Manley, Marena

2015-04-15

It has been established in this study that the Rapid Visco Analyser (RVA) can describe maize hardness, irrespective of the RVA profile, when used in association with appropriate multivariate data analysis techniques. Therefore, the RVA can complement or replace current and/or conventional methods as a hardness descriptor. Hardness modelling based on RVA viscograms was carried out using seven conventional hardness methods (hectoliter mass (HLM), hundred kernel mass (HKM), particle size index (PSI), percentage vitreous endosperm (%VE), protein content, percentage chop (%chop) and near infrared (NIR) spectroscopy) as references and three different RVA profiles (hard, soft and standard) as predictors. An approach using locally weighted partial least squares (LW-PLS) was followed to build the regression models. The resulted prediction errors (root mean square error of cross-validation (RMSECV) and root mean square error of prediction (RMSEP)) for the quantification of hardness values were always lower or in the same order of the laboratory error of the reference method. Copyright © 2014 Elsevier Ltd. All rights reserved.
Forecasting Error Calculation with Mean Absolute Deviation and Mean Absolute Percentage Error

NASA Astrophysics Data System (ADS)

Khair, Ummul; Fahmi, Hasanul; Hakim, Sarudin Al; Rahim, Robbi

2017-12-01

Prediction using a forecasting method is one of the most important things for an organization, the selection of appropriate forecasting methods is also important but the percentage error of a method is more important in order for decision makers to adopt the right culture, the use of the Mean Absolute Deviation and Mean Absolute Percentage Error to calculate the percentage of mistakes in the least square method resulted in a percentage of 9.77% and it was decided that the least square method be worked for time series and trend data.
Determination of suitable drying curve model for bread moisture loss during baking

NASA Astrophysics Data System (ADS)

Soleimani Pour-Damanab, A. R.; Jafary, A.; Rafiee, S.

2013-03-01

This study presents mathematical modelling of bread moisture loss or drying during baking in a conventional bread baking process. In order to estimate and select the appropriate moisture loss curve equation, 11 different models, semi-theoretical and empirical, were applied to the experimental data and compared according to their correlation coefficients, chi-squared test and root mean square error which were predicted by nonlinear regression analysis. Consequently, of all the drying models, a Page model was selected as the best one, according to the correlation coefficients, chi-squared test, and root mean square error values and its simplicity. Mean absolute estimation error of the proposed model by linear regression analysis for natural and forced convection modes was 2.43, 4.74%, respectively.
Navy Fuel Composition and Screening Tool (FCAST) v2.8

DTIC Science & Technology

2016-05-10

allowed us to develop partial least squares (PLS) models based on gas chromatography–mass spectrometry (GC-MS) data that predict fuel properties. The...Chemometric property modeling Partial least squares PLS Compositional profiler Naval Air Systems Command Air-4.4.5 Patuxent River Naval Air Station Patuxent...Cumulative predicted residual error sum of squares DiEGME Diethylene glycol monomethyl ether FCAST Fuel Composition and Screening Tool FFP Fit for

Nondestructive quantification of the soluble-solids content and the available acidity of apples by Fourier-transform near-infrared spectroscopy

NASA Astrophysics Data System (ADS)

Ying, Yibin; Liu, Yande; Tao, Yang

2005-09-01

This research evaluated the feasibility of using Fourier-transform near-infrared (FT-NIR) spectroscopy to quantify the soluble-solids content (SSC) and the available acidity (VA) in intact apples. Partial least-squares calibration models, obtained from several preprocessing techniques (smoothing, derivative, etc.) in several wave-number ranges were compared. The best models were obtained with the high coefficient determination (r) 0.940 for the SSC and a moderate r of 0.801 for the VA, root-mean-square errors of prediction of 0.272% and 0.053%, and root-mean-square errors of calibration of 0.261% and 0.046%, respectively. The results indicate that the FT-NIR spectroscopy yields good predictions of the SSC and also showed the feasibility of using it to predict the VA of apples.
Selective Weighted Least Squares Method for Fourier Transform Infrared Quantitative Analysis.

PubMed

Wang, Xin; Li, Yan; Wei, Haoyun; Chen, Xia

2017-06-01

Classical least squares (CLS) regression is a popular multivariate statistical method used frequently for quantitative analysis using Fourier transform infrared (FT-IR) spectrometry. Classical least squares provides the best unbiased estimator for uncorrelated residual errors with zero mean and equal variance. However, the noise in FT-IR spectra, which accounts for a large portion of the residual errors, is heteroscedastic. Thus, if this noise with zero mean dominates in the residual errors, the weighted least squares (WLS) regression method described in this paper is a better estimator than CLS. However, if bias errors, such as the residual baseline error, are significant, WLS may perform worse than CLS. In this paper, we compare the effect of noise and bias error in using CLS and WLS in quantitative analysis. Results indicated that for wavenumbers with low absorbance, the bias error significantly affected the error, such that the performance of CLS is better than that of WLS. However, for wavenumbers with high absorbance, the noise significantly affected the error, and WLS proves to be better than CLS. Thus, we propose a selective weighted least squares (SWLS) regression that processes data with different wavenumbers using either CLS or WLS based on a selection criterion, i.e., lower or higher than an absorbance threshold. The effects of various factors on the optimal threshold value (OTV) for SWLS have been studied through numerical simulations. These studies reported that: (1) the concentration and the analyte type had minimal effect on OTV; and (2) the major factor that influences OTV is the ratio between the bias error and the standard deviation of the noise. The last part of this paper is dedicated to quantitative analysis of methane gas spectra, and methane/toluene mixtures gas spectra as measured using FT-IR spectrometry and CLS, WLS, and SWLS. The standard error of prediction (SEP), bias of prediction (bias), and the residual sum of squares of the errors (RSS) from the three quantitative analyses were compared. In methane gas analysis, SWLS yielded the lowest SEP and RSS among the three methods. In methane/toluene mixture gas analysis, a modification of the SWLS has been presented to tackle the bias error from other components. The SWLS without modification presents the lowest SEP in all cases but not bias and RSS. The modification of SWLS reduced the bias, which showed a lower RSS than CLS, especially for small components.
Nondestructive quantification of the soluble-solids content and the available acidity of apples by Fourier-transform near-infrared spectroscopy

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ying Yibin; Liu Yande; Tao Yang

2005-09-01

This research evaluated the feasibility of using Fourier-transform near-infrared (FT-NIR) spectroscopy to quantify the soluble-solids content (SSC) and the available acidity (VA) in intact apples. Partial least-squares calibration models, obtained from several preprocessing techniques (smoothing, derivative, etc.) in several wave-number ranges were compared. The best models were obtained with the high coefficient determination (r{sup 2}) 0.940 for the SSC and a moderate r{sup 2} of 0.801 for the VA, root-mean-square errors of prediction of 0.272% and 0.053%, and root-mean-square errors of calibration of 0.261% and 0.046%, respectively. The results indicate that the FT-NIR spectroscopy yields good predictions of the SSCmore » and also showed the feasibility of using it to predict the VA of apples.« less
Limited Sampling Strategy for Accurate Prediction of Pharmacokinetics of Saroglitazar: A 3-point Linear Regression Model Development and Successful Prediction of Human Exposure.

PubMed

Joshi, Shuchi N; Srinivas, Nuggehally R; Parmar, Deven V

2018-03-01

Our aim was to develop and validate the extrapolative performance of a regression model using a limited sampling strategy for accurate estimation of the area under the plasma concentration versus time curve for saroglitazar. Healthy subject pharmacokinetic data from a well-powered food-effect study (fasted vs fed treatments; n = 50) was used in this work. The first 25 subjects' serial plasma concentration data up to 72 hours and corresponding AUC 0-t (ie, 72 hours) from the fasting group comprised a training dataset to develop the limited sampling model. The internal datasets for prediction included the remaining 25 subjects from the fasting group and all 50 subjects from the fed condition of the same study. The external datasets included pharmacokinetic data for saroglitazar from previous single-dose clinical studies. Limited sampling models were composed of 1-, 2-, and 3-concentration-time points' correlation with AUC 0-t of saroglitazar. Only models with regression coefficients (R 2 ) >0.90 were screened for further evaluation. The best R 2 model was validated for its utility based on mean prediction error, mean absolute prediction error, and root mean square error. Both correlations between predicted and observed AUC 0-t of saroglitazar and verification of precision and bias using Bland-Altman plot were carried out. None of the evaluated 1- and 2-concentration-time points models achieved R 2 > 0.90. Among the various 3-concentration-time points models, only 4 equations passed the predefined criterion of R 2 > 0.90. Limited sampling models with time points 0.5, 2, and 8 hours (R 2 = 0.9323) and 0.75, 2, and 8 hours (R 2 = 0.9375) were validated. Mean prediction error, mean absolute prediction error, and root mean square error were <30% (predefined criterion) and correlation (r) was at least 0.7950 for the consolidated internal and external datasets of 102 healthy subjects for the AUC 0-t prediction of saroglitazar. The same models, when applied to the AUC 0-t prediction of saroglitazar sulfoxide, showed mean prediction error, mean absolute prediction error, and root mean square error <30% and correlation (r) was at least 0.9339 in the same pool of healthy subjects. A 3-concentration-time points limited sampling model predicts the exposure of saroglitazar (ie, AUC 0-t ) within predefined acceptable bias and imprecision limit. Same model was also used to predict AUC 0-∞ . The same limited sampling model was found to predict the exposure of saroglitazar sulfoxide within predefined criteria. This model can find utility during late-phase clinical development of saroglitazar in the patient population. Copyright © 2018 Elsevier HS Journals, Inc. All rights reserved.
Determination of main fruits in adulterated nectars by ATR-FTIR spectroscopy combined with multivariate calibration and variable selection methods.

PubMed

Miaw, Carolina Sheng Whei; Assis, Camila; Silva, Alessandro Rangel Carolino Sales; Cunha, Maria Luísa; Sena, Marcelo Martins; de Souza, Scheilla Vitorino Carvalho

2018-07-15

Grape, orange, peach and passion fruit nectars were formulated and adulterated by dilution with syrup, apple and cashew juices at 10 levels for each adulterant. Attenuated total reflectance Fourier transform mid infrared (ATR-FTIR) spectra were obtained. Partial least squares (PLS) multivariate calibration models allied to different variable selection methods, such as interval partial least squares (iPLS), ordered predictors selection (OPS) and genetic algorithm (GA), were used to quantify the main fruits. PLS improved by iPLS-OPS variable selection showed the highest predictive capacity to quantify the main fruit contents. The selected variables in the final models varied from 72 to 100; the root mean square errors of prediction were estimated from 0.5 to 2.6%; the correlation coefficients of prediction ranged from 0.948 to 0.990; and, the mean relative errors of prediction varied from 3.0 to 6.7%. All of the developed models were validated. Copyright © 2018 Elsevier Ltd. All rights reserved.
Comparison of structural and least-squares lines for estimating geologic relations

USGS Publications Warehouse

Williams, G.P.; Troutman, B.M.

1990-01-01

Two different goals in fitting straight lines to data are to estimate a "true" linear relation (physical law) and to predict values of the dependent variable with the smallest possible error. Regarding the first goal, a Monte Carlo study indicated that the structural-analysis (SA) method of fitting straight lines to data is superior to the ordinary least-squares (OLS) method for estimating "true" straight-line relations. Number of data points, slope and intercept of the true relation, and variances of the errors associated with the independent (X) and dependent (Y) variables influence the degree of agreement. For example, differences between the two line-fitting methods decrease as error in X becomes small relative to error in Y. Regarding the second goal-predicting the dependent variable-OLS is better than SA. Again, the difference diminishes as X takes on less error relative to Y. With respect to estimation of slope and intercept and prediction of Y, agreement between Monte Carlo results and large-sample theory was very good for sample sizes of 100, and fair to good for sample sizes of 20. The procedures and error measures are illustrated with two geologic examples. ?? 1990 International Association for Mathematical Geology.
Application of Adaptive Neuro-Fuzzy Inference System for Prediction of Neutron Yield of IR-IECF Facility in High Voltages

NASA Astrophysics Data System (ADS)

Adineh-Vand, A.; Torabi, M.; Roshani, G. H.; Taghipour, M.; Feghhi, S. A. H.; Rezaei, M.; Sadati, S. M.

2013-09-01

This paper presents a soft computing based artificial intelligent technique, adaptive neuro-fuzzy inference system (ANFIS) to predict the neutron production rate (NPR) of IR-IECF device in wide discharge current and voltage ranges. A hybrid learning algorithm consists of back-propagation and least-squares estimation is used for training the ANFIS model. The performance of the proposed ANFIS model is tested using the experimental data using four performance measures: correlation coefficient, mean absolute error, mean relative error percentage (MRE%) and root mean square error. The obtained results show that the proposed ANFIS model has achieved good agreement with the experimental results. In comparison to the experimental data the proposed ANFIS model has MRE% <1.53 and 2.85 % for training and testing data respectively. Therefore, this model can be used as an efficient tool to predict the NPR in the IR-IECF device.
Application of the Polynomial-Based Least Squares and Total Least Squares Models for the Attenuated Total Reflection Fourier Transform Infrared Spectra of Binary Mixtures of Hydroxyl Compounds.

PubMed

Shan, Peng; Peng, Silong; Zhao, Yuhui; Tang, Liang

2016-03-01

An analysis of binary mixtures of hydroxyl compound by Attenuated Total Reflection Fourier transform infrared spectroscopy (ATR FT-IR) and classical least squares (CLS) yield large model error due to the presence of unmodeled components such as H-bonded components. To accommodate these spectral variations, polynomial-based least squares (LSP) and polynomial-based total least squares (TLSP) are proposed to capture the nonlinear absorbance-concentration relationship. LSP is based on assuming that only absorbance noise exists; while TLSP takes both absorbance noise and concentration noise into consideration. In addition, based on different solving strategy, two optimization algorithms (limited-memory Broyden-Fletcher-Goldfarb-Shanno (LBFGS) algorithm and Levenberg-Marquardt (LM) algorithm) are combined with TLSP and then two different TLSP versions (termed as TLSP-LBFGS and TLSP-LM) are formed. The optimum order of each nonlinear model is determined by cross-validation. Comparison and analyses of the four models are made from two aspects: absorbance prediction and concentration prediction. The results for water-ethanol solution and ethanol-ethyl lactate solution show that LSP, TLSP-LBFGS, and TLSP-LM can, for both absorbance prediction and concentration prediction, obtain smaller root mean square error of prediction than CLS. Additionally, they can also greatly enhance the accuracy of estimated pure component spectra. However, from the view of concentration prediction, the Wilcoxon signed rank test shows that there is no statistically significant difference between each nonlinear model and CLS. © The Author(s) 2016.
Prediction of CO concentrations based on a hybrid Partial Least Square and Support Vector Machine model

NASA Astrophysics Data System (ADS)

Yeganeh, B.; Motlagh, M. Shafie Pour; Rashidi, Y.; Kamalan, H.

2012-08-01

Due to the health impacts caused by exposures to air pollutants in urban areas, monitoring and forecasting of air quality parameters have become popular as an important topic in atmospheric and environmental research today. The knowledge on the dynamics and complexity of air pollutants behavior has made artificial intelligence models as a useful tool for a more accurate pollutant concentration prediction. This paper focuses on an innovative method of daily air pollution prediction using combination of Support Vector Machine (SVM) as predictor and Partial Least Square (PLS) as a data selection tool based on the measured values of CO concentrations. The CO concentrations of Rey monitoring station in the south of Tehran, from Jan. 2007 to Feb. 2011, have been used to test the effectiveness of this method. The hourly CO concentrations have been predicted using the SVM and the hybrid PLS-SVM models. Similarly, daily CO concentrations have been predicted based on the aforementioned four years measured data. Results demonstrated that both models have good prediction ability; however the hybrid PLS-SVM has better accuracy. In the analysis presented in this paper, statistic estimators including relative mean errors, root mean squared errors and the mean absolute relative error have been employed to compare performances of the models. It has been concluded that the errors decrease after size reduction and coefficients of determination increase from 56 to 81% for SVM model to 65-85% for hybrid PLS-SVM model respectively. Also it was found that the hybrid PLS-SVM model required lower computational time than SVM model as expected, hence supporting the more accurate and faster prediction ability of hybrid PLS-SVM model.
[Application of near infrared spectroscopy combined with particle swarm optimization based least square support vactor machine to rapid quantitative analysis of Corni Fructus].

PubMed

Liu, Xue-song; Sun, Fen-fang; Jin, Ye; Wu, Yong-jiang; Gu, Zhi-xin; Zhu, Li; Yan, Dong-lan

2015-12-01

A novel method was developed for the rapid determination of multi-indicators in corni fructus by means of near infrared (NIR) spectroscopy. Particle swarm optimization (PSO) based least squares support vector machine was investigated to increase the levels of quality control. The calibration models of moisture, extractum, morroniside and loganin were established using the PSO-LS-SVM algorithm. The performance of PSO-LS-SVM models was compared with partial least squares regression (PLSR) and back propagation artificial neural network (BP-ANN). The calibration and validation results of PSO-LS-SVM were superior to both PLS and BP-ANN. For PSO-LS-SVM models, the correlation coefficients (r) of calibrations were all above 0.942. The optimal prediction results were also achieved by PSO-LS-SVM models with the RMSEP (root mean square error of prediction) and RSEP (relative standard errors of prediction) less than 1.176 and 15.5% respectively. The results suggest that PSO-LS-SVM algorithm has a good model performance and high prediction accuracy. NIR has a potential value for rapid determination of multi-indicators in Corni Fructus.
A Sensor Dynamic Measurement Error Prediction Model Based on NAPSO-SVM.

PubMed

Jiang, Minlan; Jiang, Lan; Jiang, Dingde; Li, Fei; Song, Houbing

2018-01-15

Dynamic measurement error correction is an effective way to improve sensor precision. Dynamic measurement error prediction is an important part of error correction, and support vector machine (SVM) is often used for predicting the dynamic measurement errors of sensors. Traditionally, the SVM parameters were always set manually, which cannot ensure the model's performance. In this paper, a SVM method based on an improved particle swarm optimization (NAPSO) is proposed to predict the dynamic measurement errors of sensors. Natural selection and simulated annealing are added in the PSO to raise the ability to avoid local optima. To verify the performance of NAPSO-SVM, three types of algorithms are selected to optimize the SVM's parameters: the particle swarm optimization algorithm (PSO), the improved PSO optimization algorithm (NAPSO), and the glowworm swarm optimization (GSO). The dynamic measurement error data of two sensors are applied as the test data. The root mean squared error and mean absolute percentage error are employed to evaluate the prediction models' performances. The experimental results show that among the three tested algorithms the NAPSO-SVM method has a better prediction precision and a less prediction errors, and it is an effective method for predicting the dynamic measurement errors of sensors.
Nondestructive evaluation of soluble solid content in strawberry by near infrared spectroscopy

NASA Astrophysics Data System (ADS)

Guo, Zhiming; Huang, Wenqian; Chen, Liping; Wang, Xiu; Peng, Yankun

This paper indicates the feasibility to use near infrared (NIR) spectroscopy combined with synergy interval partial least squares (siPLS) algorithms as a rapid nondestructive method to estimate the soluble solid content (SSC) in strawberry. Spectral preprocessing methods were optimized selected by cross-validation in the model calibration. Partial least squares (PLS) algorithm was conducted on the calibration of regression model. The performance of the final model was back-evaluated according to root mean square error of calibration (RMSEC) and correlation coefficient (R2 c) in calibration set, and tested by mean square error of prediction (RMSEP) and correlation coefficient (R2 p) in prediction set. The optimal siPLS model was obtained with after first derivation spectra preprocessing. The measurement results of best model were achieved as follow: RMSEC = 0.2259, R2 c = 0.9590 in the calibration set; and RMSEP = 0.2892, R2 p = 0.9390 in the prediction set. This work demonstrated that NIR spectroscopy and siPLS with efficient spectral preprocessing is a useful tool for nondestructively evaluation SSC in strawberry.
Moments and Root-Mean-Square Error of the Bayesian MMSE Estimator of Classification Error in the Gaussian Model.

PubMed

Zollanvari, Amin; Dougherty, Edward R

2014-06-01

The most important aspect of any classifier is its error rate, because this quantifies its predictive capacity. Thus, the accuracy of error estimation is critical. Error estimation is problematic in small-sample classifier design because the error must be estimated using the same data from which the classifier has been designed. Use of prior knowledge, in the form of a prior distribution on an uncertainty class of feature-label distributions to which the true, but unknown, feature-distribution belongs, can facilitate accurate error estimation (in the mean-square sense) in circumstances where accurate completely model-free error estimation is impossible. This paper provides analytic asymptotically exact finite-sample approximations for various performance metrics of the resulting Bayesian Minimum Mean-Square-Error (MMSE) error estimator in the case of linear discriminant analysis (LDA) in the multivariate Gaussian model. These performance metrics include the first, second, and cross moments of the Bayesian MMSE error estimator with the true error of LDA, and therefore, the Root-Mean-Square (RMS) error of the estimator. We lay down the theoretical groundwork for Kolmogorov double-asymptotics in a Bayesian setting, which enables us to derive asymptotic expressions of the desired performance metrics. From these we produce analytic finite-sample approximations and demonstrate their accuracy via numerical examples. Various examples illustrate the behavior of these approximations and their use in determining the necessary sample size to achieve a desired RMS. The Supplementary Material contains derivations for some equations and added figures.
[Influence of Spectral Pre-Processing on PLS Quantitative Model of Detecting Cu in Navel Orange by LIBS].

PubMed

Li, Wen-bing; Yao, Lin-tao; Liu, Mu-hua; Huang, Lin; Yao, Ming-yin; Chen, Tian-bing; He, Xiu-wen; Yang, Ping; Hu, Hui-qin; Nie, Jiang-hui

2015-05-01

Cu in navel orange was detected rapidly by laser-induced breakdown spectroscopy (LIBS) combined with partial least squares (PLS) for quantitative analysis, then the effect on the detection accuracy of the model with different spectral data ptetreatment methods was explored. Spectral data for the 52 Gannan navel orange samples were pretreated by different data smoothing, mean centralized and standard normal variable transform. Then 319~338 nm wavelength section containing characteristic spectral lines of Cu was selected to build PLS models, the main evaluation indexes of models such as regression coefficient (r), root mean square error of cross validation (RMSECV) and the root mean square error of prediction (RMSEP) were compared and analyzed. Three indicators of PLS model after 13 points smoothing and processing of the mean center were found reaching 0. 992 8, 3. 43 and 3. 4 respectively, the average relative error of prediction model is only 5. 55%, and in one word, the quality of calibration and prediction of this model are the best results. The results show that selecting the appropriate data pre-processing method, the prediction accuracy of PLS quantitative model of fruits and vegetables detected by LIBS can be improved effectively, providing a new method for fast and accurate detection of fruits and vegetables by LIBS.
Criterion Predictability: Identifying Differences Between [r-squares

ERIC Educational Resources Information Center

Malgady, Robert G.

1976-01-01

An analysis of variance procedure for testing differences in r-squared, the coefficient of determination, across independent samples is proposed and briefly discussed. The principal advantage of the procedure is to minimize Type I error for follow-up tests of pairwise differences. (Author/JKS)
Integrating Map Algebra and Statistical Modeling for Spatio- Temporal Analysis of Monthly Mean Daily Incident Photosynthetically Active Radiation (PAR) over a Complex Terrain.

PubMed

Evrendilek, Fatih

2007-12-12

This study aims at quantifying spatio-temporal dynamics of monthly mean dailyincident photosynthetically active radiation (PAR) over a vast and complex terrain such asTurkey. The spatial interpolation method of universal kriging, and the combination ofmultiple linear regression (MLR) models and map algebra techniques were implemented togenerate surface maps of PAR with a grid resolution of 500 x 500 m as a function of fivegeographical and 14 climatic variables. Performance of the geostatistical and MLR modelswas compared using mean prediction error (MPE), root-mean-square prediction error(RMSPE), average standard prediction error (ASE), mean standardized prediction error(MSPE), root-mean-square standardized prediction error (RMSSPE), and adjustedcoefficient of determination (R² adj. ). The best-fit MLR- and universal kriging-generatedmodels of monthly mean daily PAR were validated against an independent 37-year observeddataset of 35 climate stations derived from 160 stations across Turkey by the Jackknifingmethod. The spatial variability patterns of monthly mean daily incident PAR were moreaccurately reflected in the surface maps created by the MLR-based models than in thosecreated by the universal kriging method, in particular, for spring (May) and autumn(November). The MLR-based spatial interpolation algorithms of PAR described in thisstudy indicated the significance of the multifactor approach to understanding and mappingspatio-temporal dynamics of PAR for a complex terrain over meso-scales.
Modeling Multiplicative Error Variance: An Example Predicting Tree Diameter from Stump Dimensions in Baldcypress

Treesearch

Bernard R. Parresol

1993-01-01

In the context of forest modeling, it is often reasonable to assume a multiplicative heteroscedastic error structure to the data. Under such circumstances ordinary least squares no longer provides minimum variance estimates of the model parameters. Through study of the error structure, a suitable error variance model can be specified and its parameters estimated. This...
Prediction of BP reactivity to talking using hybrid soft computing approaches.

PubMed

Kaur, Gurmanik; Arora, Ajat Shatru; Jain, Vijender Kumar

2014-01-01

High blood pressure (BP) is associated with an increased risk of cardiovascular diseases. Therefore, optimal precision in measurement of BP is appropriate in clinical and research studies. In this work, anthropometric characteristics including age, height, weight, body mass index (BMI), and arm circumference (AC) were used as independent predictor variables for the prediction of BP reactivity to talking. Principal component analysis (PCA) was fused with artificial neural network (ANN), adaptive neurofuzzy inference system (ANFIS), and least square-support vector machine (LS-SVM) model to remove the multicollinearity effect among anthropometric predictor variables. The statistical tests in terms of coefficient of determination (R (2)), root mean square error (RMSE), and mean absolute percentage error (MAPE) revealed that PCA based LS-SVM (PCA-LS-SVM) model produced a more efficient prediction of BP reactivity as compared to other models. This assessment presents the importance and advantages posed by PCA fused prediction models for prediction of biological variables.
pKa prediction of monoprotic small molecules the SMARTS way.

PubMed

Lee, Adam C; Yu, Jing-Yu; Crippen, Gordon M

2008-10-01

Realizing favorable absorption, distribution, metabolism, elimination, and toxicity profiles is a necessity due to the high attrition rate of lead compounds in drug development today. The ability to accurately predict bioavailability can help save time and money during the screening and optimization processes. As several robust programs already exist for predicting logP, we have turned our attention to the fast and robust prediction of pK(a) for small molecules. Using curated data from the Beilstein Database and Lange's Handbook of Chemistry, we have created a decision tree based on a novel set of SMARTS strings that can accurately predict the pK(a) for monoprotic compounds with R(2) of 0.94 and root mean squared error of 0.68. Leave-some-out (10%) cross-validation achieved Q(2) of 0.91 and root mean squared error of 0.80.
Simultaneous estimation of cross-validation errors in least squares collocation applied for statistical testing and evaluation of the noise variance components

NASA Astrophysics Data System (ADS)

Behnabian, Behzad; Mashhadi Hossainali, Masoud; Malekzadeh, Ahad

2018-02-01

The cross-validation technique is a popular method to assess and improve the quality of prediction by least squares collocation (LSC). We present a formula for direct estimation of the vector of cross-validation errors (CVEs) in LSC which is much faster than element-wise CVE computation. We show that a quadratic form of CVEs follows Chi-squared distribution. Furthermore, a posteriori noise variance factor is derived by the quadratic form of CVEs. In order to detect blunders in the observations, estimated standardized CVE is proposed as the test statistic which can be applied when noise variances are known or unknown. We use LSC together with the methods proposed in this research for interpolation of crustal subsidence in the northern coast of the Gulf of Mexico. The results show that after detection and removing outliers, the root mean square (RMS) of CVEs and estimated noise standard deviation are reduced about 51 and 59%, respectively. In addition, RMS of LSC prediction error at data points and RMS of estimated noise of observations are decreased by 39 and 67%, respectively. However, RMS of LSC prediction error on a regular grid of interpolation points covering the area is only reduced about 4% which is a consequence of sparse distribution of data points for this case study. The influence of gross errors on LSC prediction results is also investigated by lower cutoff CVEs. It is indicated that after elimination of outliers, RMS of this type of errors is also reduced by 19.5% for a 5 km radius of vicinity. We propose a method using standardized CVEs for classification of dataset into three groups with presumed different noise variances. The noise variance components for each of the groups are estimated using restricted maximum-likelihood method via Fisher scoring technique. Finally, LSC assessment measures were computed for the estimated heterogeneous noise variance model and compared with those of the homogeneous model. The advantage of the proposed method is the reduction in estimated noise levels for those groups with the fewer number of noisy data points.

A partial least squares based spectrum normalization method for uncertainty reduction for laser-induced breakdown spectroscopy measurements

NASA Astrophysics Data System (ADS)

Li, Xiongwei; Wang, Zhe; Lui, Siu-Lung; Fu, Yangting; Li, Zheng; Liu, Jianming; Ni, Weidou

2013-10-01

A bottleneck of the wide commercial application of laser-induced breakdown spectroscopy (LIBS) technology is its relatively high measurement uncertainty. A partial least squares (PLS) based normalization method was proposed to improve pulse-to-pulse measurement precision for LIBS based on our previous spectrum standardization method. The proposed model utilized multi-line spectral information of the measured element and characterized the signal fluctuations due to the variation of plasma characteristic parameters (plasma temperature, electron number density, and total number density) for signal uncertainty reduction. The model was validated by the application of copper concentration prediction in 29 brass alloy samples. The results demonstrated an improvement on both measurement precision and accuracy over the generally applied normalization as well as our previously proposed simplified spectrum standardization method. The average relative standard deviation (RSD), average of the standard error (error bar), the coefficient of determination (R2), the root-mean-square error of prediction (RMSEP), and average value of the maximum relative error (MRE) were 1.80%, 0.23%, 0.992, 1.30%, and 5.23%, respectively, while those for the generally applied spectral area normalization were 3.72%, 0.71%, 0.973, 1.98%, and 14.92%, respectively.
Evaluation of Fast-Time Wake Vortex Prediction Models

NASA Technical Reports Server (NTRS)

Proctor, Fred H.; Hamilton, David W.

2009-01-01

Current fast-time wake models are reviewed and three basic types are defined. Predictions from several of the fast-time models are compared. Previous statistical evaluations of the APA-Sarpkaya and D2P fast-time models are discussed. Root Mean Square errors between fast-time model predictions and Lidar wake measurements are examined for a 24 hr period at Denver International Airport. Shortcomings in current methodology for evaluating wake errors are also discussed.
Synchronous front-face fluorescence spectroscopy for authentication of the adulteration of edible vegetable oil with refined used frying oil.

PubMed

Tan, Jin; Li, Rong; Jiang, Zi-Tao; Tang, Shu-Hua; Wang, Ying; Shi, Meng; Xiao, Yi-Qian; Jia, Bin; Lu, Tian-Xiang; Wang, Hao

2017-02-15

Synchronous front-face fluorescence spectroscopy has been developed for the discrimination of used frying oil (UFO) from edible vegetable oil (EVO), the estimation of the using time of UFO, and the determination of the adulteration of EVO with UFO. Both the heating time of laboratory prepared UFO and the adulteration of EVO with UFO could be determined by partial least squares regression (PLSR). To simulate the EVO adulteration with UFO, for each kind of oil, fifty adulterated samples at the adulterant amounts range of 1-50% were prepared. PLSR was then adopted to build the model and both full (leave-one-out) cross-validation and external validation were performed to evaluate the predictive ability. Under the optimum condition, the plots of observed versus predicted values exhibited high linearity (R(2)>0.96). The root mean square error of cross-validation (RMSECV) and root mean square error of prediction (RMSEP) were both lower than 3%. Copyright © 2016 Elsevier Ltd. All rights reserved.
Performance of statistical models to predict mental health and substance abuse cost.

PubMed

Montez-Rath, Maria; Christiansen, Cindy L; Ettner, Susan L; Loveland, Susan; Rosen, Amy K

2006-10-26

Providers use risk-adjustment systems to help manage healthcare costs. Typically, ordinary least squares (OLS) models on either untransformed or log-transformed cost are used. We examine the predictive ability of several statistical models, demonstrate how model choice depends on the goal for the predictive model, and examine whether building models on samples of the data affects model choice. Our sample consisted of 525,620 Veterans Health Administration patients with mental health (MH) or substance abuse (SA) diagnoses who incurred costs during fiscal year 1999. We tested two models on a transformation of cost: a Log Normal model and a Square-root Normal model, and three generalized linear models on untransformed cost, defined by distributional assumption and link function: Normal with identity link (OLS); Gamma with log link; and Gamma with square-root link. Risk-adjusters included age, sex, and 12 MH/SA categories. To determine the best model among the entire dataset, predictive ability was evaluated using root mean square error (RMSE), mean absolute prediction error (MAPE), and predictive ratios of predicted to observed cost (PR) among deciles of predicted cost, by comparing point estimates and 95% bias-corrected bootstrap confidence intervals. To study the effect of analyzing a random sample of the population on model choice, we re-computed these statistics using random samples beginning with 5,000 patients and ending with the entire sample. The Square-root Normal model had the lowest estimates of the RMSE and MAPE, with bootstrap confidence intervals that were always lower than those for the other models. The Gamma with square-root link was best as measured by the PRs. The choice of best model could vary if smaller samples were used and the Gamma with square-root link model had convergence problems with small samples. Models with square-root transformation or link fit the data best. This function (whether used as transformation or as a link) seems to help deal with the high comorbidity of this population by introducing a form of interaction. The Gamma distribution helps with the long tail of the distribution. However, the Normal distribution is suitable if the correct transformation of the outcome is used.
A Sensor Dynamic Measurement Error Prediction Model Based on NAPSO-SVM

PubMed Central

Jiang, Minlan; Jiang, Lan; Jiang, Dingde; Li, Fei

2018-01-01

Dynamic measurement error correction is an effective way to improve sensor precision. Dynamic measurement error prediction is an important part of error correction, and support vector machine (SVM) is often used for predicting the dynamic measurement errors of sensors. Traditionally, the SVM parameters were always set manually, which cannot ensure the model’s performance. In this paper, a SVM method based on an improved particle swarm optimization (NAPSO) is proposed to predict the dynamic measurement errors of sensors. Natural selection and simulated annealing are added in the PSO to raise the ability to avoid local optima. To verify the performance of NAPSO-SVM, three types of algorithms are selected to optimize the SVM’s parameters: the particle swarm optimization algorithm (PSO), the improved PSO optimization algorithm (NAPSO), and the glowworm swarm optimization (GSO). The dynamic measurement error data of two sensors are applied as the test data. The root mean squared error and mean absolute percentage error are employed to evaluate the prediction models’ performances. The experimental results show that among the three tested algorithms the NAPSO-SVM method has a better prediction precision and a less prediction errors, and it is an effective method for predicting the dynamic measurement errors of sensors. PMID:29342942
Prediction of valid acidity in intact apples with Fourier transform near infrared spectroscopy.

PubMed

Liu, Yan-De; Ying, Yi-Bin; Fu, Xia-Ping

2005-03-01

To develop nondestructive acidity prediction for intact Fuji apples, the potential of Fourier transform near infrared (FT-NIR) method with fiber optics in interactance mode was investigated. Interactance in the 800 nm to 2619 nm region was measured for intact apples, harvested from early to late maturity stages. Spectral data were analyzed by two multivariate calibration techniques including partial least squares (PLS) and principal component regression (PCR) methods. A total of 120 Fuji apples were tested and 80 of them were used to form a calibration data set. The influences of different data preprocessing and spectra treatments were also quantified. Calibration models based on smoothing spectra were slightly worse than that based on derivative spectra, and the best result was obtained when the segment length was 5 nm and the gap size was 10 points. Depending on data preprocessing and PLS method, the best prediction model yielded correlation coefficient of determination (r2) of 0.759, low root mean square error of prediction (RMSEP) of 0.0677, low root mean square error of calibration (RMSEC) of 0.0562. The results indicated the feasibility of FT-NIR spectral analysis for predicting apple valid acidity in a nondestructive way.
Prediction of valid acidity in intact apples with Fourier transform near infrared spectroscopy*

PubMed Central

Liu, Yan-de; Ying, Yi-bin; Fu, Xia-ping

2005-01-01

To develop nondestructive acidity prediction for intact Fuji apples, the potential of Fourier transform near infrared (FT-NIR) method with fiber optics in interactance mode was investigated. Interactance in the 800 nm to 2619 nm region was measured for intact apples, harvested from early to late maturity stages. Spectral data were analyzed by two multivariate calibration techniques including partial least squares (PLS) and principal component regression (PCR) methods. A total of 120 Fuji apples were tested and 80 of them were used to form a calibration data set. The influences of different data preprocessing and spectra treatments were also quantified. Calibration models based on smoothing spectra were slightly worse than that based on derivative spectra, and the best result was obtained when the segment length was 5 nm and the gap size was 10 points. Depending on data preprocessing and PLS method, the best prediction model yielded correlation coefficient of determination (r 2) of 0.759, low root mean square error of prediction (RMSEP) of 0.0677, low root mean square error of calibration (RMSEC) of 0.0562. The results indicated the feasibility of FT-NIR spectral analysis for predicting apple valid acidity in a nondestructive way. PMID:15682498
Simple Forest Canopy Thermal Exitance Model

NASA Technical Reports Server (NTRS)

Smith J. A.; Goltz, S. M.

1999-01-01

We describe a model to calculate brightness temperature and surface energy balance for a forest canopy system. The model is an extension of an earlier vegetation only model by inclusion of a simple soil layer. The root mean square error in brightness temperature for a dense forest canopy was 2.5 C. Surface energy balance predictions were also in good agreement. The corresponding root mean square errors for net radiation, latent, and sensible heat were 38.9, 30.7, and 41.4 W/sq m respectively.
Automated body weight prediction of dairy cows using 3-dimensional vision.

PubMed

Song, X; Bokkers, E A M; van der Tol, P P J; Groot Koerkamp, P W G; van Mourik, S

2018-05-01

The objectives of this study were to quantify the error of body weight prediction using automatically measured morphological traits in a 3-dimensional (3-D) vision system and to assess the influence of various sources of uncertainty on body weight prediction. In this case study, an image acquisition setup was created in a cow selection box equipped with a top-view 3-D camera. Morphological traits of hip height, hip width, and rump length were automatically extracted from the raw 3-D images taken of the rump area of dairy cows (n = 30). These traits combined with days in milk, age, and parity were used in multiple linear regression models to predict body weight. To find the best prediction model, an exhaustive feature selection algorithm was used to build intermediate models (n = 63). Each model was validated by leave-one-out cross-validation, giving the root mean square error and mean absolute percentage error. The model consisting of hip width (measurement variability of 0.006 m), days in milk, and parity was the best model, with the lowest errors of 41.2 kg of root mean square error and 5.2% mean absolute percentage error. Our integrated system, including the image acquisition setup, image analysis, and the best prediction model, predicted the body weights with a performance similar to that achieved using semi-automated or manual methods. Moreover, the variability of our simplified morphological trait measurement showed a negligible contribution to the uncertainty of body weight prediction. We suggest that dairy cow body weight prediction can be improved by incorporating more predictive morphological traits and by improving the prediction model structure. The Authors. Published by FASS Inc. and Elsevier Inc. on behalf of the American Dairy Science Association®. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/3.0/).
Theoretical and experimental studies of error in square-law detector circuits

NASA Technical Reports Server (NTRS)

Stanley, W. D.; Hearn, C. P.; Williams, J. B.

1984-01-01

Square law detector circuits to determine errors from the ideal input/output characteristic function were investigated. The nonlinear circuit response is analyzed by a power series expansion containing terms through the fourth degree, from which the significant deviation from square law can be predicted. Both fixed bias current and flexible bias current configurations are considered. The latter case corresponds with the situation where the mean current can change with the application of a signal. Experimental investigations of the circuit arrangements are described. Agreement between the analytical models and the experimental results are established. Factors which contribute to differences under certain conditions are outlined.
Near infra red spectroscopy as a multivariate process analytical tool for predicting pharmaceutical co-crystal concentration.

PubMed

Wood, Clive; Alwati, Abdolati; Halsey, Sheelagh; Gough, Tim; Brown, Elaine; Kelly, Adrian; Paradkar, Anant

2016-09-10

The use of near infra red spectroscopy to predict the concentration of two pharmaceutical co-crystals; 1:1 ibuprofen-nicotinamide (IBU-NIC) and 1:1 carbamazepine-nicotinamide (CBZ-NIC) has been evaluated. A partial least squares (PLS) regression model was developed for both co-crystal pairs using sets of standard samples to create calibration and validation data sets with which to build and validate the models. Parameters such as the root mean square error of calibration (RMSEC), root mean square error of prediction (RMSEP) and correlation coefficient were used to assess the accuracy and linearity of the models. Accurate PLS regression models were created for both co-crystal pairs which can be used to predict the co-crystal concentration in a powder mixture of the co-crystal and the active pharmaceutical ingredient (API). The IBU-NIC model had smaller errors than the CBZ-NIC model, possibly due to the complex CBZ-NIC spectra which could reflect the different arrangement of hydrogen bonding associated with the co-crystal compared to the IBU-NIC co-crystal. These results suggest that NIR spectroscopy can be used as a PAT tool during a variety of pharmaceutical co-crystal manufacturing methods and the presented data will facilitate future offline and in-line NIR studies involving pharmaceutical co-crystals. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
Environmental fate model for ultra-low-volume insecticide applications used for adult mosquito management

USGS Publications Warehouse

Schleier, Jerome J.; Peterson, Robert K.D.; Irvine, Kathryn M.; Marshall, Lucy M.; Weaver, David K.; Preftakes, Collin J.

2012-01-01

One of the more effective ways of managing high densities of adult mosquitoes that vector human and animal pathogens is ultra-low-volume (ULV) aerosol applications of insecticides. The U.S. Environmental Protection Agency uses models that are not validated for ULV insecticide applications and exposure assumptions to perform their human and ecological risk assessments. Currently, there is no validated model that can accurately predict deposition of insecticides applied using ULV technology for adult mosquito management. In addition, little is known about the deposition and drift of small droplets like those used under conditions encountered during ULV applications. The objective of this study was to perform field studies to measure environmental concentrations of insecticides and to develop a validated model to predict the deposition of ULV insecticides. The final regression model was selected by minimizing the Bayesian Information Criterion and its prediction performance was evaluated using k-fold cross validation. Density of the formulation and the density and CMD interaction coefficients were the largest in the model. The results showed that as density of the formulation decreases, deposition increases. The interaction of density and CMD showed that higher density formulations and larger droplets resulted in greater deposition. These results are supported by the aerosol physics literature. A k-fold cross validation demonstrated that the mean square error of the selected regression model is not biased, and the mean square error and mean square prediction error indicated good predictive ability.
Quantitative determination of additive Chlorantraniliprole in Abamectin preparation: Investigation of bootstrapping soft shrinkage approach by mid-infrared spectroscopy

NASA Astrophysics Data System (ADS)

Yan, Hong; Song, Xiangzhong; Tian, Kuangda; Chen, Yilin; Xiong, Yanmei; Min, Shungeng

2018-02-01

A novel method, mid-infrared (MIR) spectroscopy, which enables the determination of Chlorantraniliprole in Abamectin within minutes, is proposed. We further evaluate the prediction ability of four wavelength selection methods, including bootstrapping soft shrinkage approach (BOSS), Monte Carlo uninformative variable elimination (MCUVE), genetic algorithm partial least squares (GA-PLS) and competitive adaptive reweighted sampling (CARS) respectively. The results showed that BOSS method obtained the lowest root mean squared error of cross validation (RMSECV) (0.0245) and root mean squared error of prediction (RMSEP) (0.0271), as well as the highest coefficient of determination of cross-validation (Qcv2) (0.9998) and the coefficient of determination of test set (Q2test) (0.9989), which demonstrated that the mid infrared spectroscopy can be used to detect Chlorantraniliprole in Abamectin conveniently. Meanwhile, a suitable wavelength selection method (BOSS) is essential to conducting a component spectral analysis.
Achievable accuracy of hip screw holding power estimation by insertion torque measurement.

PubMed

Erani, Paolo; Baleani, Massimiliano

2018-02-01

To ensure stability of proximal femoral fractures, the hip screw must firmly engage into the femoral head. Some studies suggested that screw holding power into trabecular bone could be evaluated, intraoperatively, through measurement of screw insertion torque. However, those studies used synthetic bone, instead of trabecular bone, as host material or they did not evaluate accuracy of predictions. We determined prediction accuracy, also assessing the impact of screw design and host material. We measured, under highly-repeatable experimental conditions, disregarding clinical procedure complexities, insertion torque and pullout strength of four screw designs, both in 120 synthetic and 80 trabecular bone specimens of variable density. For both host materials, we calculated the root-mean-square error and the mean-absolute-percentage error of predictions based on the best fitting model of torque-pullout data, in both single-screw and merged dataset. Predictions based on screw-specific regression models were the most accurate. Host material impacts on prediction accuracy: the replacement of synthetic with trabecular bone decreased both root-mean-square errors, from 0.54 ÷ 0.76 kN to 0.21 ÷ 0.40 kN, and mean-absolute-percentage errors, from 14 ÷ 21% to 10 ÷ 12%. However, holding power predicted on low insertion torque remained inaccurate, with errors up to 40% for torques below 1 Nm. In poor-quality trabecular bone, tissue inhomogeneities likely affect pullout strength and insertion torque to different extents, limiting the predictive power of the latter. This bias decreases when the screw engages good-quality bone. Under this condition, predictions become more accurate although this result must be confirmed by close in-vitro simulation of the clinical procedure. Copyright © 2018 Elsevier Ltd. All rights reserved.
Rovibrational spectra of ammonia. I. Unprecedented accuracy of a potential energy surface used with nonadiabatic corrections.

PubMed

Huang, Xinchuan; Schwenke, David W; Lee, Timothy J

2011-01-28

In this work, we build upon our previous work on the theoretical spectroscopy of ammonia, NH(3). Compared to our 2008 study, we include more physics in our rovibrational calculations and more experimental data in the refinement procedure, and these enable us to produce a potential energy surface (PES) of unprecedented accuracy. We call this the HSL-2 PES. The additional physics we include is a second-order correction for the breakdown of the Born-Oppenheimer approximation, and we find it to be critical for improved results. By including experimental data for higher rotational levels in the refinement procedure, we were able to greatly reduce our systematic errors for the rotational dependence of our predictions. These additions together lead to a significantly improved total angular momentum (J) dependence in our computed rovibrational energies. The root-mean-square error between our predictions using the HSL-2 PES and the reliable energy levels from the HITRAN database for J = 0-6 and J = 7∕8 for (14)NH(3) is only 0.015 cm(-1) and 0.020∕0.023 cm(-1), respectively. The root-mean-square errors for the characteristic inversion splittings are approximately 1∕3 smaller than those for energy levels. The root-mean-square error for the 6002 J = 0-8 transition energies is 0.020 cm(-1). Overall, for J = 0-8, the spectroscopic data computed with HSL-2 is roughly an order of magnitude more accurate relative to our previous best ammonia PES (denoted HSL-1). These impressive numbers are eclipsed only by the root-mean-square error between our predictions for purely rotational transition energies of (15)NH(3) and the highly accurate Cologne database (CDMS): 0.00034 cm(-1) (10 MHz), in other words, 2 orders of magnitude smaller. In addition, we identify a deficiency in the (15)NH(3) energy levels determined from a model of the experimental data.
Prediction of Soil pH Hyperspectral Spectrum in Guanzhong Area of Shaanxi Province Based on PLS

NASA Astrophysics Data System (ADS)

Liu, Jinbao; Zhang, Yang; Wang, Huanyuan; Cheng, Jie; Tong, Wei; Wei, Jing

2017-12-01

The soil pH of Fufeng County, Yangling County and Wugong County in Shaanxi Province was studied. The spectral reflectance was measured by ASD Field Spec HR portable terrain spectrum, and its spectral characteristics were analyzed. The first deviation of the original spectral reflectance of the soil, the second deviation, the logarithm of the reciprocal logarithm, the first order differential of the reciprocal logarithm and the second order differential of the reciprocal logarithm were used to establish the soil pH Spectral prediction model. The results showed that the correlation between the reflectance spectra after SNV pre-treatment and the soil pH was significantly improved. The optimal prediction model of soil pH established by partial least squares method was a prediction model based on the first order differential of the reciprocal logarithm of spectral reflectance. The principal component factor was 10, the decision coefficient Rc2 = 0.9959, the model root means square error RMSEC = 0.0076, the correction deviation SEC = 0.0077; the verification decision coefficient Rv2 = 0.9893, the predicted root mean square error RMSEP = 0.0157, The deviation of SEP = 0.0160, the model was stable, the fitting ability and the prediction ability were high, and the soil pH can be measured quickly.
Distribution of kriging errors, the implications and how to communicate them

NASA Astrophysics Data System (ADS)

Li, Hong Yi; Milne, Alice; Webster, Richard

2016-04-01

Kriging in one form or another has become perhaps the most popular method for spatial prediction in environmental science. Each prediction is unbiased and of minimum variance, which itself is estimated. The kriging variances depend on the mathematical model chosen to describe the spatial variation; different models, however plausible, give rise to different minimized variances. Practitioners often compare models by so-called cross-validation before finally choosing the most appropriate for their kriging. One proceeds as follows. One removes a unit (a sampling point) from the whole set, kriges the value there and compares the kriged value with the value observed to obtain the deviation or error. One repeats the process for each and every point in turn and for all plausible models. One then computes the mean errors (MEs) and the mean of the squared errors (MSEs). Ideally a squared error should equal the corresponding kriging variance (σK2), and so one is advised to choose the model for which on average the squared errors most nearly equal the kriging variances, i.e. the ratio MSDR = MSE/σK2 ≈ 1. Maximum likelihood estimation of models almost guarantees that the MSDR equals 1, and so the kriging variances are unbiased predictors of the squared error across the region. The method is based on the assumption that the errors have a normal distribution. The squared deviation ratio (SDR) should therefore be distributed as χ2 with one degree of freedom with a median of 0.455. We have found that often the median of the SDR (MedSDR) is less, in some instances much less, than 0.455 even though the mean of the SDR is close to 1. It seems that in these cases the distributions of the errors are leptokurtic, i.e. they have an excess of predictions close to the true values, excesses near the extremes and a dearth of predictions in between. In these cases the kriging variances are poor measures of the uncertainty at individual sites. The uncertainty is typically under-estimated for the extreme observations and compensated for by over estimating for other observations. Statisticians must tell users when they present maps of predictions. We illustrate the situation with results from mapping salinity in land reclaimed from the Yangtze delta in the Gulf of Hangzhou, China. There the apparent electrical conductivity (ECa) of the topsoil was measured at 525 points in a field of 2.3 ha. The marginal distribution of the observations was strongly positively skewed, and so the observed ECas were transformed to their logarithms to give an approximately symmetric distribution. That distribution was strongly platykurtic with short tails and no evident outliers. The logarithms were analysed as a mixed model of quadratic drift plus correlated random residuals with a spherical variogram. The kriged predictions that deviated from their true values with an MSDR of 0.993, but with a medSDR=0.324. The coefficient of kurtosis of the deviations was 1.45, i.e. substantially larger than 0 for a normal distribution. The reasons for this behaviour are being sought. The most likely explanation is that there are spatial outliers, i.e. points at which the observed values that differ markedly from those at their their closest neighbours.
Distribution of kriging errors, the implications and how to communicate them

NASA Astrophysics Data System (ADS)

Li, HongYi; Milne, Alice; Webster, Richard

2015-04-01

Kriging in one form or another has become perhaps the most popular method for spatial prediction in environmental science. Each prediction is unbiased and of minimum variance, which itself is estimated. The kriging variances depend on the mathematical model chosen to describe the spatial variation; different models, however plausible, give rise to different minimized variances. Practitioners often compare models by so-called cross-validation before finally choosing the most appropriate for their kriging. One proceeds as follows. One removes a unit (a sampling point) from the whole set, kriges the value there and compares the kriged value with the value observed to obtain the deviation or error. One repeats the process for each and every point in turn and for all plausible models. One then computes the mean errors (MEs) and the mean of the squared errors (MSEs). Ideally a squared error should equal the corresponding kriging variance (σ_K^2), and so one is advised to choose the model for which on average the squared errors most nearly equal the kriging variances, i.e. the ratio MSDR=MSE/ σ_K2 ≈1. Maximum likelihood estimation of models almost guarantees that the MSDR equals 1, and so the kriging variances are unbiased predictors of the squared error across the region. The method is based on the assumption that the errors have a normal distribution. The squared deviation ratio (SDR) should therefore be distributed as χ2 with one degree of freedom with a median of 0.455. We have found that often the median of the SDR (MedSDR) is less, in some instances much less, than 0.455 even though the mean of the SDR is close to 1. It seems that in these cases the distributions of the errors are leptokurtic, i.e. they have an excess of predictions close to the true values, excesses near the extremes and a dearth of predictions in between. In these cases the kriging variances are poor measures of the uncertainty at individual sites. The uncertainty is typically under-estimated for the extreme observations and compensated for by over estimating for other observations. Statisticians must tell users when they present maps of predictions. We illustrate the situation with results from mapping salinity in land reclaimed from the Yangtze delta in the Gulf of Hangzhou, China. There the apparent electrical conductivity (EC_a) of the topsoil was measured at 525 points in a field of 2.3~ha. The marginal distribution of the observations was strongly positively skewed, and so the observed EC_as were transformed to their logarithms to give an approximately symmetric distribution. That distribution was strongly platykurtic with short tails and no evident outliers. The logarithms were analysed as a mixed model of quadratic drift plus correlated random residuals with a spherical variogram. The kriged predictions that deviated from their true values with an MSDR of 0.993, but with a medSDR=0.324. The coefficient of kurtosis of the deviations was 1.45, i.e. substantially larger than 0 for a normal distribution. The reasons for this behaviour are being sought. The most likely explanation is that there are spatial outliers, i.e. points at which the observed values that differ markedly from those at their their closest neighbours.
An analysis of input errors in precipitation-runoff models using regression with errors in the independent variables

USGS Publications Warehouse

Troutman, Brent M.

1982-01-01

Errors in runoff prediction caused by input data errors are analyzed by treating precipitation-runoff models as regression (conditional expectation) models. Independent variables of the regression consist of precipitation and other input measurements; the dependent variable is runoff. In models using erroneous input data, prediction errors are inflated and estimates of expected storm runoff for given observed input variables are biased. This bias in expected runoff estimation results in biased parameter estimates if these parameter estimates are obtained by a least squares fit of predicted to observed runoff values. The problems of error inflation and bias are examined in detail for a simple linear regression of runoff on rainfall and for a nonlinear U.S. Geological Survey precipitation-runoff model. Some implications for flood frequency analysis are considered. A case study using a set of data from Turtle Creek near Dallas, Texas illustrates the problems of model input errors.
New equations improve NIR prediction of body fat among high school wrestlers.

PubMed

Oppliger, R A; Clark, R R; Nielsen, D H

2000-09-01

Methodologic study to derive prediction equations for percent body fat (%BF). To develop valid regression equations using NIR to assess body composition among high school wrestlers. Clinicians need a portable, fast, and simple field method for assessing body composition among wrestlers. Near-infrared photospectrometry (NIR) meets these criteria, but its efficacy has been challenged. Subjects were 150 high school wrestlers from 2 Midwestern states with mean +/- SD age of 16.3 +/- 1.1 yrs, weight of 69.5 +/- 11.7 kg, and height of 174.4 +/- 7.0 cm. Relative body fatness (%BF) determined from hydrostatic weighing was the criterion measure, and NIR optical density (OD) measurements at multiple sites, plus height, weight, and body mass index (BMI) were the predictor variables. Four equations were developed with multiple R2s that varied from .530 to .693, root mean squared errors varied from 2.8% BF to 3.4% BF, and prediction errors varied from 2.9% BF to 3.1% BF. The best equation used OD measurements at the biceps, triceps, and thigh sites, BMI, and age. The root mean squared error and prediction error for all 4 equations were equal to or smaller than for a skinfold equation commonly used with wrestlers. The results substantiate the validity of NIR for predicting % BF among high school wrestlers. Cross-validation of these equations is warranted.

NIR spectroscopic measurement of moisture content in Scots pine seeds.

PubMed

Lestander, Torbjörn A; Geladi, Paul

2003-04-01

When tree seeds are used for seedling production it is important that they are of high quality in order to be viable. One of the factors influencing viability is moisture content and an ideal quality control system should be able to measure this factor quickly for each seed. Seed moisture content within the range 3-34% was determined by near-infrared (NIR) spectroscopy on Scots pine (Pinus sylvestris L.) single seeds and on bulk seed samples consisting of 40-50 seeds. The models for predicting water content from the spectra were made by partial least squares (PLS) and ordinary least squares (OLS) regression. Different conditions were simulated involving both using less wavelengths and going from samples to single seeds. Reflectance and transmission measurements were used. Different spectral pretreatment methods were tested on the spectra. Including bias, the lowest prediction errors for PLS models based on reflectance within 780-2280 nm from bulk samples and single seeds were 0.8% and 1.9%, respectively. Reduction of the single seed reflectance spectrum to 850-1048 nm gave higher biases and prediction errors in the test set. In transmission (850-1048 nm) the prediction error was 2.7% for single seeds. OLS models based on simulated 4-sensor single seed system consisting of optical filters with Gaussian transmission indicated more than 3.4% error in prediction. A practical F-test based on test sets to differentiate models is introduced.
Demand forecasting of electricity in Indonesia with limited historical data

NASA Astrophysics Data System (ADS)

Dwi Kartikasari, Mujiati; Rohmad Prayogi, Arif

2018-03-01

Demand forecasting of electricity is an important activity for electrical agents to know the description of electricity demand in future. Prediction of demand electricity can be done using time series models. In this paper, double moving average model, Holt’s exponential smoothing model, and grey model GM(1,1) are used to predict electricity demand in Indonesia under the condition of limited historical data. The result shows that grey model GM(1,1) has the smallest value of MAE (mean absolute error), MSE (mean squared error), and MAPE (mean absolute percentage error).
Estimating current and future streamflow characteristics at ungaged sites, central and eastern Montana, with application to evaluating effects of climate change on fish populations

USGS Publications Warehouse

Sando, Roy; Chase, Katherine J.

2017-03-23

A common statistical procedure for estimating streamflow statistics at ungaged locations is to develop a relational model between streamflow and drainage basin characteristics at gaged locations using least squares regression analysis; however, least squares regression methods are parametric and make constraining assumptions about the data distribution. The random forest regression method provides an alternative nonparametric method for estimating streamflow characteristics at ungaged sites and requires that the data meet fewer statistical conditions than least squares regression methods.Random forest regression analysis was used to develop predictive models for 89 streamflow characteristics using Precipitation-Runoff Modeling System simulated streamflow data and drainage basin characteristics at 179 sites in central and eastern Montana. The predictive models were developed from streamflow data simulated for current (baseline, water years 1982–99) conditions and three future periods (water years 2021–38, 2046–63, and 2071–88) under three different climate-change scenarios. These predictive models were then used to predict streamflow characteristics for baseline conditions and three future periods at 1,707 fish sampling sites in central and eastern Montana. The average root mean square error for all predictive models was about 50 percent. When streamflow predictions at 23 fish sampling sites were compared to nearby locations with simulated data, the mean relative percent difference was about 43 percent. When predictions were compared to streamflow data recorded at 21 U.S. Geological Survey streamflow-gaging stations outside of the calibration basins, the average mean absolute percent error was about 73 percent.
Error Estimation of An Ensemble Statistical Seasonal Precipitation Prediction Model

NASA Technical Reports Server (NTRS)

Shen, Samuel S. P.; Lau, William K. M.; Kim, Kyu-Myong; Li, Gui-Long

2001-01-01

This NASA Technical Memorandum describes an optimal ensemble canonical correlation forecasting model for seasonal precipitation. Each individual forecast is based on the canonical correlation analysis (CCA) in the spectral spaces whose bases are empirical orthogonal functions (EOF). The optimal weights in the ensemble forecasting crucially depend on the mean square error of each individual forecast. An estimate of the mean square error of a CCA prediction is made also using the spectral method. The error is decomposed onto EOFs of the predictand and decreases linearly according to the correlation between the predictor and predictand. Since new CCA scheme is derived for continuous fields of predictor and predictand, an area-factor is automatically included. Thus our model is an improvement of the spectral CCA scheme of Barnett and Preisendorfer. The improvements include (1) the use of area-factor, (2) the estimation of prediction error, and (3) the optimal ensemble of multiple forecasts. The new CCA model is applied to the seasonal forecasting of the United States (US) precipitation field. The predictor is the sea surface temperature (SST). The US Climate Prediction Center's reconstructed SST is used as the predictor's historical data. The US National Center for Environmental Prediction's optimally interpolated precipitation (1951-2000) is used as the predictand's historical data. Our forecast experiments show that the new ensemble canonical correlation scheme renders a reasonable forecasting skill. For example, when using September-October-November SST to predict the next season December-January-February precipitation, the spatial pattern correlation between the observed and predicted are positive in 46 years among the 50 years of experiments. The positive correlations are close to or greater than 0.4 in 29 years, which indicates excellent performance of the forecasting model. The forecasting skill can be further enhanced when several predictors are used.
[Study on predicting sugar content and valid acidity of apples by near infrared diffuse reflectance technique].

PubMed

Liu, Yan-de; Ying, Yi-bin; Fu, Xia-ping

2005-11-01

The nondestructive method for quantifying sugar content (SC) and available acid (VA) of intact apples using diffuse near infrared reflectance and optical fiber sensing techniques were explored in the present research. The standard sample sets and prediction models were established by partial least squares analysis (PLS). A total of 120 Shandong Fuji apples were tested in the wave number of 12,500 - 4000 cm(-1) using Fourier transform near infrared spectroscopy. The results of the research indicated that the nondestructive quantification of SC and VA, gave a high correlation coefficient 0.970 and 0.906, a low root mean square error of prediction (RMSEP) 0.272 and 0.056 2, a low root mean square error of calibration (RMSEC) 0.261 and 0.0677, and a small difference between RMSEP and RMSEC 0.011 a nd 0.0115. It was suggested that the diffuse nearinfrared reflectance technique be feasible for nondestructive determination of apple sugar content in the wave number range of 10,341 - 5461 cm(-1) and for available acid in the wave number range of 10,341 - 3818 cm(-1).
Detection of Butter Adulteration with Lard by Employing (1)H-NMR Spectroscopy and Multivariate Data Analysis.

PubMed

Fadzillah, Nurrulhidayah Ahmad; Man, Yaakob bin Che; Rohman, Abdul; Rosman, Arieff Salleh; Ismail, Amin; Mustafa, Shuhaimi; Khatib, Alfi

2015-01-01

The authentication of food products from the presence of non-allowed components for certain religion like lard is very important. In this study, we used proton Nuclear Magnetic Resonance ((1)H-NMR) spectroscopy for the analysis of butter adulterated with lard by simultaneously quantification of all proton bearing compounds, and consequently all relevant sample classes. Since the spectra obtained were too complex to be analyzed visually by the naked eyes, the classification of spectra was carried out.The multivariate calibration of partial least square (PLS) regression was used for modelling the relationship between actual value of lard and predicted value. The model yielded a highest regression coefficient (R(2)) of 0.998 and the lowest root mean square error calibration (RMSEC) of 0.0091% and root mean square error prediction (RMSEP) of 0.0090, respectively. Cross validation testing evaluates the predictive power of the model. PLS model was shown as good models as the intercept of R(2)Y and Q(2)Y were 0.0853 and -0.309, respectively.
Rapid Detection of Volatile Oil in Mentha haplocalyx by Near-Infrared Spectroscopy and Chemometrics.

PubMed

Yan, Hui; Guo, Cheng; Shao, Yang; Ouyang, Zhen

2017-01-01

Near-infrared spectroscopy combined with partial least squares regression (PLSR) and support vector machine (SVM) was applied for the rapid determination of chemical component of volatile oil content in Mentha haplocalyx . The effects of data pre-processing methods on the accuracy of the PLSR calibration models were investigated. The performance of the final model was evaluated according to the correlation coefficient ( R ) and root mean square error of prediction (RMSEP). For PLSR model, the best preprocessing method combination was first-order derivative, standard normal variate transformation (SNV), and mean centering, which had of 0.8805, of 0.8719, RMSEC of 0.091, and RMSEP of 0.097, respectively. The wave number variables linking to volatile oil are from 5500 to 4000 cm-1 by analyzing the loading weights and variable importance in projection (VIP) scores. For SVM model, six LVs (less than seven LVs in PLSR model) were adopted in model, and the result was better than PLSR model. The and were 0.9232 and 0.9202, respectively, with RMSEC and RMSEP of 0.084 and 0.082, respectively, which indicated that the predicted values were accurate and reliable. This work demonstrated that near infrared reflectance spectroscopy with chemometrics could be used to rapidly detect the main content volatile oil in M. haplocalyx . The quality of medicine directly links to clinical efficacy, thus, it is important to control the quality of Mentha haplocalyx . Near-infrared spectroscopy combined with partial least squares regression (PLSR) and support vector machine (SVM) was applied for the rapid determination of chemical component of volatile oil content in Mentha haplocalyx . For SVM model, 6 LVs (less than 7 LVs in PLSR model) were adopted in model, and the result was better than PLSR model. It demonstrated that near infrared reflectance spectroscopy with chemometrics could be used to rapidly detect the main content volatile oil in Mentha haplocalyx . Abbreviations used: 1 st der: First-order derivative; 2 nd der: Second-order derivative; LOO: Leave-one-out; LVs: Latent variables; MC: Mean centering, NIR: Near-infrared; NIRS: Near infrared spectroscopy; PCR: Principal component regression, PLSR: Partial least squares regression; RBF: Radial basis function; RMSEC: Root mean square error of cross validation, RMSEC: Root mean square error of calibration; RMSEP: Root mean square error of prediction; SNV: Standard normal variate transformation; SVM: Support vector machine; VIP: Variable Importance in projection.
Evaluating the utility of mid-infrared spectral subspaces for predicting soil properties.

PubMed

Sila, Andrew M; Shepherd, Keith D; Pokhariyal, Ganesh P

2016-04-15

We propose four methods for finding local subspaces in large spectral libraries. The proposed four methods include (a) cosine angle spectral matching; (b) hit quality index spectral matching; (c) self-organizing maps and (d) archetypal analysis methods. Then evaluate prediction accuracies for global and subspaces calibration models. These methods were tested on a mid-infrared spectral library containing 1907 soil samples collected from 19 different countries under the Africa Soil Information Service project. Calibration models for pH, Mehlich-3 Ca, Mehlich-3 Al, total carbon and clay soil properties were developed for the whole library and for the subspace. Root mean square error of prediction was used to evaluate predictive performance of subspace and global models. The root mean square error of prediction was computed using a one-third-holdout validation set. Effect of pretreating spectra with different methods was tested for 1st and 2nd derivative Savitzky-Golay algorithm, multiplicative scatter correction, standard normal variate and standard normal variate followed by detrending methods. In summary, the results show that global models outperformed the subspace models. We, therefore, conclude that global models are more accurate than the local models except in few cases. For instance, sand and clay root mean square error values from local models from archetypal analysis method were 50% poorer than the global models except for subspace models obtained using multiplicative scatter corrected spectra with which were 12% better. However, the subspace approach provides novel methods for discovering data pattern that may exist in large spectral libraries.
Comparing Parameter Estimation Techniques for an Electrical Power Transformer Oil Temperature Prediction Model

NASA Technical Reports Server (NTRS)

Morris, A. Terry

1999-01-01

This paper examines various sources of error in MIT's improved top oil temperature rise over ambient temperature model and estimation process. The sources of error are the current parameter estimation technique, quantization noise, and post-processing of the transformer data. Results from this paper will show that an output error parameter estimation technique should be selected to replace the current least squares estimation technique. The output error technique obtained accurate predictions of transformer behavior, revealed the best error covariance, obtained consistent parameter estimates, and provided for valid and sensible parameters. This paper will also show that the output error technique should be used to minimize errors attributed to post-processing (decimation) of the transformer data. Models used in this paper are validated using data from a large transformer in service.
A new approach to measure visual field progression in glaucoma patients using variational bayes linear regression.

PubMed

Murata, Hiroshi; Araie, Makoto; Asaoka, Ryo

2014-11-20

We generated a variational Bayes model to predict visual field (VF) progression in glaucoma patients. This retrospective study included VF series from 911 eyes of 547 glaucoma patients as test data, and VF series from 5049 eyes of 2858 glaucoma patients as training data. Using training data, variational Bayes linear regression (VBLR) was created to predict VF progression. The performance of VBLR was compared against ordinary least-squares linear regression (OLSLR) by predicting VFs in the test dataset. The total deviation (TD) values of test patients' 11th VFs were predicted using TD values from their second to 10th VFs (VF2-10), the root mean squared error (RMSE) associated with each approach then was calculated. Similarly, mean TD (mTD) of test patients' 11th VFs was predicted using VBLR and OLSLR, and the absolute prediction errors compared. The RMSE resulting from VBLR averaged 3.9 ± 2.1 (SD) and 4.9 ± 2.6 dB for prediction based on the second to 10th VFs (VF2-10) and the second to fourth VFs (VF2-4), respectively. The RMSE resulting from OLSLR was 4.1 ± 2.0 (VF2-10) and 19.9 ± 12.0 (VF2-4) dB. The absolute prediction error (SD) for mTD using VBLR was 1.2 ± 1.3 (VF2-10) and 1.9 ± 2.0 (VF2-4) dB, while the prediction error resulting from OLSLR was 1.2 ± 1.3 (VF2-10) and 6.2 ± 6.6 (VF2-4) dB. The VBLR more accurately predicts future VF progression in glaucoma patients compared to conventional OLSLR, especially in short VF series. © ARVO.
[Gaussian process regression and its application in near-infrared spectroscopy analysis].

PubMed

Feng, Ai-Ming; Fang, Li-Min; Lin, Min

2011-06-01

Gaussian process (GP) is applied in the present paper as a chemometric method to explore the complicated relationship between the near infrared (NIR) spectra and ingredients. After the outliers were detected by Monte Carlo cross validation (MCCV) method and removed from dataset, different preprocessing methods, such as multiplicative scatter correction (MSC), smoothing and derivate, were tried for the best performance of the models. Furthermore, uninformative variable elimination (UVE) was introduced as a variable selection technique and the characteristic wavelengths obtained were further employed as input for modeling. A public dataset with 80 NIR spectra of corn was introduced as an example for evaluating the new algorithm. The optimal models for oil, starch and protein were obtained by the GP regression method. The performance of the final models were evaluated according to the root mean square error of calibration (RMSEC), root mean square error of cross-validation (RMSECV), root mean square error of prediction (RMSEP) and correlation coefficient (r). The models give good calibration ability with r values above 0.99 and the prediction ability is also satisfactory with r values higher than 0.96. The overall results demonstrate that GP algorithm is an effective chemometric method and is promising for the NIR analysis.
Damage level prediction of non-reshaped berm breakwater using ANN, SVM and ANFIS models

NASA Astrophysics Data System (ADS)

Mandal, Sukomal; Rao, Subba; N., Harish; Lokesha

2012-06-01

The damage analysis of coastal structure is very important as it involves many design parameters to be considered for the better and safe design of structure. In the present study experimental data for non-reshaped berm breakwater are collected from Marine Structures Laboratory, Department of Applied Mechanics and Hydraulics, NITK, Surathkal, India. Soft computing techniques like Artificial Neural Network (ANN), Support Vector Machine (SVM) and Adaptive Neuro Fuzzy Inference system (ANFIS) models are constructed using experimental data sets to predict the damage level of non-reshaped berm breakwater. The experimental data are used to train ANN, SVM and ANFIS models and results are determined in terms of statistical measures like mean square error, root mean square error, correla-tion coefficient and scatter index. The result shows that soft computing techniques i.e., ANN, SVM and ANFIS can be efficient tools in predicting damage levels of non reshaped berm breakwater.
MO-G-18C-05: Real-Time Prediction in Free-Breathing Perfusion MRI

DOE Office of Scientific and Technical Information (OSTI.GOV)

Song, H; Liu, W; Ruan, D

Purpose: The aim is to minimize frame-wise difference errors caused by respiratory motion and eliminate the need for breath-holds in magnetic resonance imaging (MRI) sequences with long acquisitions and repeat times (TRs). The technique is being applied to perfusion MRI using arterial spin labeling (ASL). Methods: Respiratory motion prediction (RMP) using navigator echoes was implemented in ASL. A least-square method was used to extract the respiratory motion information from the 1D navigator. A generalized artificial neutral network (ANN) with three layers was developed to simultaneously predict 10 time points forward in time and correct for respiratory motion during MRI acquisition.more » During the training phase, the parameters of the ANN were optimized to minimize the aggregated prediction error based on acquired navigator data. During realtime prediction, the trained ANN was applied to the most recent estimated displacement trajectory to determine in real-time the amount of spatial Results: The respiratory motion information extracted from the least-square method can accurately represent the navigator profiles, with a normalized chi-square value of 0.037±0.015 across the training phase. During the 60-second training phase, the ANN successfully learned the respiratory motion pattern from the navigator training data. During real-time prediction, the ANN received displacement estimates and predicted the motion in the continuum of a 1.0 s prediction window. The ANN prediction was able to provide corrections for different respiratory states (i.e., inhalation/exhalation) during real-time scanning with a mean absolute error of < 1.8 mm. Conclusion: A new technique enabling free-breathing acquisition during MRI is being developed. A generalized ANN development has demonstrated its efficacy in predicting a continuum of motion profile for volumetric imaging based on navigator inputs. Future work will enhance the robustness of ANN and verify its effectiveness with human subjects. Research supported by National Institutes of Health National Cancer Institute Grant R01 CA159471-01.« less
Application of near-infrared spectroscopy for the rapid quality assessment of Radix Paeoniae Rubra

NASA Astrophysics Data System (ADS)

Zhan, Hao; Fang, Jing; Tang, Liying; Yang, Hongjun; Li, Hua; Wang, Zhuju; Yang, Bin; Wu, Hongwei; Fu, Meihong

2017-08-01

Near-infrared (NIR) spectroscopy with multivariate analysis was used to quantify gallic acid, catechin, albiflorin, and paeoniflorin in Radix Paeoniae Rubra, and the feasibility to classify the samples originating from different areas was investigated. A new high-performance liquid chromatography method was developed and validated to analyze gallic acid, catechin, albiflorin, and paeoniflorin in Radix Paeoniae Rubra as the reference. Partial least squares (PLS), principal component regression (PCR), and stepwise multivariate linear regression (SMLR) were performed to calibrate the regression model. Different data pretreatments such as derivatives (1st and 2nd), multiplicative scatter correction, standard normal variate, Savitzky-Golay filter, and Norris derivative filter were applied to remove the systematic errors. The performance of the model was evaluated according to the root mean square of calibration (RMSEC), root mean square error of prediction (RMSEP), root mean square error of cross-validation (RMSECV), and correlation coefficient (r). The results show that compared to PCR and SMLR, PLS had a lower RMSEC, RMSECV, and RMSEP and higher r for all the four analytes. PLS coupled with proper pretreatments showed good performance in both the fitting and predicting results. Furthermore, the original areas of Radix Paeoniae Rubra samples were partly distinguished by principal component analysis. This study shows that NIR with PLS is a reliable, inexpensive, and rapid tool for the quality assessment of Radix Paeoniae Rubra.
Green method by diffuse reflectance infrared spectroscopy and spectral region selection for the quantification of sulphamethoxazole and trimethoprim in pharmaceutical formulations.

PubMed

da Silva, Fabiana E B; Flores, Érico M M; Parisotto, Graciele; Müller, Edson I; Ferrão, Marco F

2016-03-01

An alternative method for the quantification of sulphametoxazole (SMZ) and trimethoprim (TMP) using diffuse reflectance infrared Fourier-transform spectroscopy (DRIFTS) and partial least square regression (PLS) was developed. Interval Partial Least Square (iPLS) and Synergy Partial Least Square (siPLS) were applied to select a spectral range that provided the lowest prediction error in comparison to the full-spectrum model. Fifteen commercial tablet formulations and forty-nine synthetic samples were used. The ranges of concentration considered were 400 to 900 mg g-1SMZ and 80 to 240 mg g-1 TMP. Spectral data were recorded between 600 and 4000 cm-1 with a 4 cm-1 resolution by Diffuse Reflectance Infrared Fourier Transform Spectroscopy (DRIFTS). The proposed procedure was compared to high performance liquid chromatography (HPLC). The results obtained from the root mean square error of prediction (RMSEP), during the validation of the models for samples of sulphamethoxazole (SMZ) and trimethoprim (TMP) using siPLS, demonstrate that this approach is a valid technique for use in quantitative analysis of pharmaceutical formulations. The selected interval algorithm allowed building regression models with minor errors when compared to the full spectrum PLS model. A RMSEP of 13.03 mg g-1for SMZ and 4.88 mg g-1 for TMP was obtained after the selection the best spectral regions by siPLS.
Feasibility of using near infrared spectroscopy to detect and quantify an adulterant in high quality sandalwood oil.

PubMed

Kuriakose, Saji; Joe, I Hubert

2013-11-01

Determination of the authenticity of essential oils has become more significant, in recent years, following some illegal adulteration and contamination scandals. The present investigative study focuses on the application of near infrared spectroscopy to detect sample authenticity and quantify economic adulteration of sandalwood oils. Several data pre-treatments are investigated for calibration and prediction using partial least square regression (PLSR). The quantitative data analysis is done using a new spectral approach - full spectrum or sequential spectrum. The optimum number of PLS components is obtained according to the lowest root mean square error of calibration (RMSEC=0.00009% v/v). The lowest root mean square error of prediction (RMSEP=0.00016% v/v) in the test set and the highest coefficient of determination (R(2)=0.99989) are used as the evaluation tools for the best model. A nonlinear method, locally weighted regression (LWR), is added to extract nonlinear information and to compare with the linear PLSR model. Copyright © 2013 Elsevier B.V. All rights reserved.
The effect of constraints on the analytical figures of merit achieved by extended multivariate curve resolution-alternating least-squares.

PubMed

Pellegrino Vidal, Rocío B; Allegrini, Franco; Olivieri, Alejandro C

2018-03-20

Multivariate curve resolution-alternating least-squares (MCR-ALS) is the model of choice when dealing with some non-trilinear arrays, specifically when the data are of chromatographic origin. To drive the iterative procedure to chemically interpretable solutions, the use of constraints becomes essential. In this work, both simulated and experimental data have been analyzed by MCR-ALS, applying chemically reasonable constraints, and investigating the relationship between selectivity, analytical sensitivity (γ) and root mean square error of prediction (RMSEP). As the selectivity in the instrumental modes decreases, the estimated values for γ did not fully represent the predictive model capabilities, judged from the obtained RMSEP values. Since the available sensitivity expressions have been developed by error propagation theory in unconstrained systems, there is a need of developing new expressions or analytical indicators. They should not only consider the specific profiles retrieved by MCR-ALS, but also the constraints under which the latter ones have been obtained. Copyright © 2017 Elsevier B.V. All rights reserved.
Feasibility of using near infrared spectroscopy to detect and quantify an adulterant in high quality sandalwood oil

NASA Astrophysics Data System (ADS)

Kuriakose, Saji; Joe, I. Hubert

2013-11-01

Determination of the authenticity of essential oils has become more significant, in recent years, following some illegal adulteration and contamination scandals. The present investigative study focuses on the application of near infrared spectroscopy to detect sample authenticity and quantify economic adulteration of sandalwood oils. Several data pre-treatments are investigated for calibration and prediction using partial least square regression (PLSR). The quantitative data analysis is done using a new spectral approach - full spectrum or sequential spectrum. The optimum number of PLS components is obtained according to the lowest root mean square error of calibration (RMSEC = 0.00009% v/v). The lowest root mean square error of prediction (RMSEP = 0.00016% v/v) in the test set and the highest coefficient of determination (R2 = 0.99989) are used as the evaluation tools for the best model. A nonlinear method, locally weighted regression (LWR), is added to extract nonlinear information and to compare with the linear PLSR model.
On the calibration process of film dosimetry: OLS inverse regression versus WLS inverse prediction.

PubMed

Crop, F; Van Rompaye, B; Paelinck, L; Vakaet, L; Thierens, H; De Wagter, C

2008-07-21

The purpose of this study was both putting forward a statistically correct model for film calibration and the optimization of this process. A reliable calibration is needed in order to perform accurate reference dosimetry with radiographic (Gafchromic) film. Sometimes, an ordinary least squares simple linear (in the parameters) regression is applied to the dose-optical-density (OD) curve with the dose as a function of OD (inverse regression) or sometimes OD as a function of dose (inverse prediction). The application of a simple linear regression fit is an invalid method because heteroscedasticity of the data is not taken into account. This could lead to erroneous results originating from the calibration process itself and thus to a lower accuracy. In this work, we compare the ordinary least squares (OLS) inverse regression method with the correct weighted least squares (WLS) inverse prediction method to create calibration curves. We found that the OLS inverse regression method could lead to a prediction bias of up to 7.3 cGy at 300 cGy and total prediction errors of 3% or more for Gafchromic EBT film. Application of the WLS inverse prediction method resulted in a maximum prediction bias of 1.4 cGy and total prediction errors below 2% in a 0-400 cGy range. We developed a Monte-Carlo-based process to optimize calibrations, depending on the needs of the experiment. This type of thorough analysis can lead to a higher accuracy for film dosimetry.
Quantitative Modelling of Trace Elements in Hard Coal.

PubMed

Smoliński, Adam; Howaniec, Natalia

2016-01-01

The significance of coal in the world economy remains unquestionable for decades. It is also expected to be the dominant fossil fuel in the foreseeable future. The increased awareness of sustainable development reflected in the relevant regulations implies, however, the need for the development and implementation of clean coal technologies on the one hand, and adequate analytical tools on the other. The paper presents the application of the quantitative Partial Least Squares method in modeling the concentrations of trace elements (As, Ba, Cd, Co, Cr, Cu, Mn, Ni, Pb, Rb, Sr, V and Zn) in hard coal based on the physical and chemical parameters of coal, and coal ash components. The study was focused on trace elements potentially hazardous to the environment when emitted from coal processing systems. The studied data included 24 parameters determined for 132 coal samples provided by 17 coal mines of the Upper Silesian Coal Basin, Poland. Since the data set contained outliers, the construction of robust Partial Least Squares models for contaminated data set and the correct identification of outlying objects based on the robust scales were required. These enabled the development of the correct Partial Least Squares models, characterized by good fit and prediction abilities. The root mean square error was below 10% for all except for one the final Partial Least Squares models constructed, and the prediction error (root mean square error of cross-validation) exceeded 10% only for three models constructed. The study is of both cognitive and applicative importance. It presents the unique application of the chemometric methods of data exploration in modeling the content of trace elements in coal. In this way it contributes to the development of useful tools of coal quality assessment.

Quantitative Modelling of Trace Elements in Hard Coal

PubMed Central

Smoliński, Adam; Howaniec, Natalia

2016-01-01

The significance of coal in the world economy remains unquestionable for decades. It is also expected to be the dominant fossil fuel in the foreseeable future. The increased awareness of sustainable development reflected in the relevant regulations implies, however, the need for the development and implementation of clean coal technologies on the one hand, and adequate analytical tools on the other. The paper presents the application of the quantitative Partial Least Squares method in modeling the concentrations of trace elements (As, Ba, Cd, Co, Cr, Cu, Mn, Ni, Pb, Rb, Sr, V and Zn) in hard coal based on the physical and chemical parameters of coal, and coal ash components. The study was focused on trace elements potentially hazardous to the environment when emitted from coal processing systems. The studied data included 24 parameters determined for 132 coal samples provided by 17 coal mines of the Upper Silesian Coal Basin, Poland. Since the data set contained outliers, the construction of robust Partial Least Squares models for contaminated data set and the correct identification of outlying objects based on the robust scales were required. These enabled the development of the correct Partial Least Squares models, characterized by good fit and prediction abilities. The root mean square error was below 10% for all except for one the final Partial Least Squares models constructed, and the prediction error (root mean square error of cross–validation) exceeded 10% only for three models constructed. The study is of both cognitive and applicative importance. It presents the unique application of the chemometric methods of data exploration in modeling the content of trace elements in coal. In this way it contributes to the development of useful tools of coal quality assessment. PMID:27438794
Spectrophotometric determination of ternary mixtures of thiamin, riboflavin and pyridoxal in pharmaceutical and human plasma by least-squares support vector machines.

PubMed

Niazi, Ali; Zolgharnein, Javad; Afiuni-Zadeh, Somaie

2007-11-01

Ternary mixtures of thiamin, riboflavin and pyridoxal have been simultaneously determined in synthetic and real samples by applications of spectrophotometric and least-squares support vector machines. The calibration graphs were linear in the ranges of 1.0 - 20.0, 1.0 - 10.0 and 1.0 - 20.0 microg ml(-1) with detection limits of 0.6, 0.5 and 0.7 microg ml(-1) for thiamin, riboflavin and pyridoxal, respectively. The experimental calibration matrix was designed with 21 mixtures of these chemicals. The concentrations were varied between calibration graph concentrations of vitamins. The simultaneous determination of these vitamin mixtures by using spectrophotometric methods is a difficult problem, due to spectral interferences. The partial least squares (PLS) modeling and least-squares support vector machines were used for the multivariate calibration of the spectrophotometric data. An excellent model was built using LS-SVM, with low prediction errors and superior performance in relation to PLS. The root mean square errors of prediction (RMSEP) for thiamin, riboflavin and pyridoxal with PLS and LS-SVM were 0.6926, 0.3755, 0.4322 and 0.0421, 0.0318, 0.0457, respectively. The proposed method was satisfactorily applied to the rapid simultaneous determination of thiamin, riboflavin and pyridoxal in commercial pharmaceutical preparations and human plasma samples.
Comparison of genetic algorithm and imperialist competitive algorithms in predicting bed load transport in clean pipe.

PubMed

Ebtehaj, Isa; Bonakdari, Hossein

2014-01-01

The existence of sediments in wastewater greatly affects the performance of the sewer and wastewater transmission systems. Increased sedimentation in wastewater collection systems causes problems such as reduced transmission capacity and early combined sewer overflow. The article reviews the performance of the genetic algorithm (GA) and imperialist competitive algorithm (ICA) in minimizing the target function (mean square error of observed and predicted Froude number). To study the impact of bed load transport parameters, using four non-dimensional groups, six different models have been presented. Moreover, the roulette wheel selection method is used to select the parents. The ICA with root mean square error (RMSE) = 0.007, mean absolute percentage error (MAPE) = 3.5% show better results than GA (RMSE = 0.007, MAPE = 5.6%) for the selected model. All six models return better results than the GA. Also, the results of these two algorithms were compared with multi-layer perceptron and existing equations.
Resolving Mixed Algal Species in Hyperspectral Images

PubMed Central

Mehrubeoglu, Mehrube; Teng, Ming Y.; Zimba, Paul V.

2014-01-01

We investigated a lab-based hyperspectral imaging system's response from pure (single) and mixed (two) algal cultures containing known algae types and volumetric combinations to characterize the system's performance. The spectral response to volumetric changes in single and combinations of algal mixtures with known ratios were tested. Constrained linear spectral unmixing was applied to extract the algal content of the mixtures based on abundances that produced the lowest root mean square error. Percent prediction error was computed as the difference between actual percent volumetric content and abundances at minimum RMS error. Best prediction errors were computed as 0.4%, 0.4% and 6.3% for the mixed spectra from three independent experiments. The worst prediction errors were found as 5.6%, 5.4% and 13.4% for the same order of experiments. Additionally, Beer-Lambert's law was utilized to relate transmittance to different volumes of pure algal suspensions demonstrating linear logarithmic trends for optical property measurements. PMID:24451451
Predicting Soil Organic Carbon and Total Nitrogen in the Russian Chernozem from Depth and Wireless Color Sensor Measurements

NASA Astrophysics Data System (ADS)

Mikhailova, E. A.; Stiglitz, R. Y.; Post, C. J.; Schlautman, M. A.; Sharp, J. L.; Gerard, P. D.

2017-12-01

Color sensor technologies offer opportunities for affordable and rapid assessment of soil organic carbon (SOC) and total nitrogen (TN) in the field, but the applicability of these technologies may vary by soil type. The objective of this study was to use an inexpensive color sensor to develop SOC and TN prediction models for the Russian Chernozem (Haplic Chernozem) in the Kursk region of Russia. Twenty-one dried soil samples were analyzed using a Nix Pro™ color sensor that is controlled through a mobile application and Bluetooth to collect CIEL*a*b* (darkness to lightness, green to red, and blue to yellow) color data. Eleven samples were randomly selected to be used to construct prediction models and the remaining ten samples were set aside for cross validation. The root mean squared error (RMSE) was calculated to determine each model's prediction error. The data from the eleven soil samples were used to develop the natural log of SOC (lnSOC) and TN (lnTN) prediction models using depth, L*, a*, and b* for each sample as predictor variables in regression analyses. Resulting residual plots, root mean square errors (RMSE), mean squared prediction error (MSPE) and coefficients of determination ( R 2, adjusted R 2) were used to assess model fit for each of the SOC and total N prediction models. Final models were fit using all soil samples, which included depth and color variables, for lnSOC ( R 2 = 0.987, Adj. R 2 = 0.981, RMSE = 0.003, p-value < 0.001, MSPE = 0.182) and lnTN ( R 2 = 0.980 Adj. R 2 = 0.972, RMSE = 0.004, p-value < 0.001, MSPE = 0.001). Additionally, final models were fit for all soil samples, which included only color variables, for lnSOC ( R 2 = 0.959 Adj. R 2 = 0.949, RMSE = 0.007, p-value < 0.001, MSPE = 0.536) and lnTN ( R 2 = 0.912 Adj. R 2 = 0.890, RMSE = 0.015, p-value < 0.001, MSPE = 0.001). The results suggest that soil color may be used for rapid assessment of SOC and TN in these agriculturally important soils.
Model-based mean square error estimators for k-nearest neighbour predictions and applications using remotely sensed data for forest inventories

Treesearch

Steen Magnussen; Ronald E. McRoberts; Erkki O. Tomppo

2009-01-01

New model-based estimators of the uncertainty of pixel-level and areal k-nearest neighbour (knn) predictions of attribute Y from remotely-sensed ancillary data X are presented. Non-parametric functions predict Y from scalar 'Single Index Model' transformations of X. Variance functions generated...
A component prediction method for flue gas of natural gas combustion based on nonlinear partial least squares method.

PubMed

Cao, Hui; Yan, Xingyu; Li, Yaojiang; Wang, Yanxia; Zhou, Yan; Yang, Sanchun

2014-01-01

Quantitative analysis for the flue gas of natural gas-fired generator is significant for energy conservation and emission reduction. The traditional partial least squares method may not deal with the nonlinear problems effectively. In the paper, a nonlinear partial least squares method with extended input based on radial basis function neural network (RBFNN) is used for components prediction of flue gas. For the proposed method, the original independent input matrix is the input of RBFNN and the outputs of hidden layer nodes of RBFNN are the extension term of the original independent input matrix. Then, the partial least squares regression is performed on the extended input matrix and the output matrix to establish the components prediction model of flue gas. A near-infrared spectral dataset of flue gas of natural gas combustion is used for estimating the effectiveness of the proposed method compared with PLS. The experiments results show that the root-mean-square errors of prediction values of the proposed method for methane, carbon monoxide, and carbon dioxide are, respectively, reduced by 4.74%, 21.76%, and 5.32% compared to those of PLS. Hence, the proposed method has higher predictive capabilities and better robustness.
Predicting the random drift of MEMS gyroscope based on K-means clustering and OLS RBF Neural Network

NASA Astrophysics Data System (ADS)

Wang, Zhen-yu; Zhang, Li-jie

2017-10-01

Measure error of the sensor can be effectively compensated with prediction. Aiming at large random drift error of MEMS(Micro Electro Mechanical System))gyroscope, an improved learning algorithm of Radial Basis Function(RBF) Neural Network(NN) based on K-means clustering and Orthogonal Least-Squares (OLS) is proposed in this paper. The algorithm selects the typical samples as the initial cluster centers of RBF NN firstly, candidates centers with K-means algorithm secondly, and optimizes the candidate centers with OLS algorithm thirdly, which makes the network structure simpler and makes the prediction performance better. Experimental results show that the proposed K-means clustering OLS learning algorithm can predict the random drift of MEMS gyroscope effectively, the prediction error of which is 9.8019e-007°/s and the prediction time of which is 2.4169e-006s
Error-Based Design Space Windowing

NASA Technical Reports Server (NTRS)

Papila, Melih; Papila, Nilay U.; Shyy, Wei; Haftka, Raphael T.; Fitz-Coy, Norman

2002-01-01

Windowing of design space is considered in order to reduce the bias errors due to low-order polynomial response surfaces (RS). Standard design space windowing (DSW) uses a region of interest by setting a requirement on response level and checks it by a global RS predictions over the design space. This approach, however, is vulnerable since RS modeling errors may lead to the wrong region to zoom on. The approach is modified by introducing an eigenvalue error measure based on point-to-point mean squared error criterion. Two examples are presented to demonstrate the benefit of the error-based DSW.
Modified locally weighted--partial least squares regression improving clinical predictions from infrared spectra of human serum samples.

PubMed

Perez-Guaita, David; Kuligowski, Julia; Quintás, Guillermo; Garrigues, Salvador; Guardia, Miguel de la

2013-03-30

Locally weighted partial least squares regression (LW-PLSR) has been applied to the determination of four clinical parameters in human serum samples (total protein, triglyceride, glucose and urea contents) by Fourier transform infrared (FTIR) spectroscopy. Classical LW-PLSR models were constructed using different spectral regions. For the selection of parameters by LW-PLSR modeling, a multi-parametric study was carried out employing the minimum root-mean square error of cross validation (RMSCV) as objective function. In order to overcome the effect of strong matrix interferences on the predictive accuracy of LW-PLSR models, this work focuses on sample selection. Accordingly, a novel strategy for the development of local models is proposed. It was based on the use of: (i) principal component analysis (PCA) performed on an analyte specific spectral region for identifying most similar sample spectra and (ii) partial least squares regression (PLSR) constructed using the whole spectrum. Results found by using this strategy were compared to those provided by PLSR using the same spectral intervals as for LW-PLSR. Prediction errors found by both, classical and modified LW-PLSR improved those obtained by PLSR. Hence, both proposed approaches were useful for the determination of analytes present in a complex matrix as in the case of human serum samples. Copyright © 2013 Elsevier B.V. All rights reserved.
Filter Tuning Using the Chi-Squared Statistic

NASA Technical Reports Server (NTRS)

Lilly-Salkowski, Tyler B.

2017-01-01

This paper examines the use of the Chi-square statistic as a means of evaluating filter performance. The goal of the process is to characterize the filter performance in the metric of covariance realism. The Chi-squared statistic is the value calculated to determine the realism of a covariance based on the prediction accuracy and the covariance values at a given point in time. Once calculated, it is the distribution of this statistic that provides insight on the accuracy of the covariance. The process of tuning an Extended Kalman Filter (EKF) for Aqua and Aura support is described, including examination of the measurement errors of available observation types, and methods of dealing with potentially volatile atmospheric drag modeling. Predictive accuracy and the distribution of the Chi-squared statistic, calculated from EKF solutions, are assessed.
Modeling number of claims and prediction of total claim amount

NASA Astrophysics Data System (ADS)

Acar, Aslıhan Şentürk; Karabey, Uǧur

2017-07-01

In this study we focus on annual number of claims of a private health insurance data set which belongs to a local insurance company in Turkey. In addition to Poisson model and negative binomial model, zero-inflated Poisson model and zero-inflated negative binomial model are used to model the number of claims in order to take into account excess zeros. To investigate the impact of different distributional assumptions for the number of claims on the prediction of total claim amount, predictive performances of candidate models are compared by using root mean square error (RMSE) and mean absolute error (MAE) criteria.
Development of a Nonlinear Soft-Sensor Using a GMDH Network for a Refinery Crude Distillation Tower

NASA Astrophysics Data System (ADS)

Fujii, Kenzo; Yamamoto, Toru

In atmospheric distillation processes, the stabilization of processes is required in order to optimize the crude-oil composition that corresponds to product market conditions. However, the process control systems sometimes fall into unstable states in the case where unexpected disturbances are introduced, and these unusual phenomena have had an undesirable affect on certain products. Furthermore, a useful chemical engineering model has not yet been established for these phenomena. This remains a serious problem in the atmospheric distillation process. This paper describes a new modeling scheme to predict unusual phenomena in the atmospheric distillation process using the GMDH (Group Method of Data Handling) network which is one type of network model. According to the GMDH network, the model structure can be determined systematically. However, the least squares method has been commonly utilized in determining weight coefficients (model parameters). Estimation accuracy is not entirely expected, because the sum of squared errors between the measured values and estimates is evaluated. Therefore, instead of evaluating the sum of squared errors, the sum of absolute value of errors is introduced and the Levenberg-Marquardt method is employed in order to determine model parameters. The effectiveness of the proposed method is evaluated by the foaming prediction in the crude oil switching operation in the atmospheric distillation process.
Chemometrics resolution and quantification power evaluation: Application on pharmaceutical quaternary mixture of Paracetamol, Guaifenesin, Phenylephrine and p-aminophenol

NASA Astrophysics Data System (ADS)

Yehia, Ali M.; Mohamed, Heba M.

2016-01-01

Three advanced chemmometric-assisted spectrophotometric methods namely; Concentration Residuals Augmented Classical Least Squares (CRACLS), Multivariate Curve Resolution-Alternating Least Squares (MCR-ALS) and Principal Component Analysis-Artificial Neural Networks (PCA-ANN) were developed, validated and benchmarked to PLS calibration; to resolve the severely overlapped spectra and simultaneously determine; Paracetamol (PAR), Guaifenesin (GUA) and Phenylephrine (PHE) in their ternary mixture and in presence of p-aminophenol (AP) the main degradation product and synthesis impurity of Paracetamol. The analytical performance of the proposed methods was described by percentage recoveries, root mean square error of calibration and standard error of prediction. The four multivariate calibration methods could be directly used without any preliminary separation step and successfully applied for pharmaceutical formulation analysis, showing no excipients' interference.
Evaluation of the prediction precision capability of partial least squares regression approach for analysis of high alloy steel by laser induced breakdown spectroscopy

NASA Astrophysics Data System (ADS)

Sarkar, Arnab; Karki, Vijay; Aggarwal, Suresh K.; Maurya, Gulab S.; Kumar, Rohit; Rai, Awadhesh K.; Mao, Xianglei; Russo, Richard E.

2015-06-01

Laser induced breakdown spectroscopy (LIBS) was applied for elemental characterization of high alloy steel using partial least squares regression (PLSR) with an objective to evaluate the analytical performance of this multivariate approach. The optimization of the number of principle components for minimizing error in PLSR algorithm was investigated. The effect of different pre-treatment procedures on the raw spectral data before PLSR analysis was evaluated based on several statistical (standard error of prediction, percentage relative error of prediction etc.) parameters. The pre-treatment with "NORM" parameter gave the optimum statistical results. The analytical performance of PLSR model improved by increasing the number of laser pulses accumulated per spectrum as well as by truncating the spectrum to appropriate wavelength region. It was found that the statistical benefit of truncating the spectrum can also be accomplished by increasing the number of laser pulses per accumulation without spectral truncation. The constituents (Co and Mo) present in hundreds of ppm were determined with relative precision of 4-9% (2σ), whereas the major constituents Cr and Ni (present at a few percent levels) were determined with a relative precision of ~ 2%(2σ).
Prediction of ethanol in bottled Chinese rice wine by NIR spectroscopy

NASA Astrophysics Data System (ADS)

Ying, Yibin; Yu, Haiyan; Pan, Xingxiang; Lin, Tao

2006-10-01

To evaluate the applicability of non-invasive visible and near infrared (VIS-NIR) spectroscopy for determining ethanol concentration of Chinese rice wine in square brown glass bottle, transmission spectra of 100 bottled Chinese rice wine samples were collected in the spectral range of 350-1200 nm. Statistical equations were established between the reference data and VIS-NIR spectra by partial least squares (PLS) regression method. Performance of three kinds of mathematical treatment of spectra (original spectra, first derivative spectra and second derivative spectra) were also discussed. The PLS models of original spectra turned out better results, with higher correlation coefficient in calibration (R cal) of 0.89, lower root mean standard error of calibration (RMSEC) of 0.165, and lower root mean standard error of cross validation (RMSECV) of 0.179. Using original spectra, PLS models for ethanol concentration prediction were developed. The R cal and the correlation coefficient in validation (R val) were 0.928 and 0.875, respectively; and the RMSEC and the root mean standard error of validation (RMSEP) were 0.135 (%, v v -1) and 0.177 (%, v v -1), respectively. The results demonstrated that VIS-NIR spectroscopy could be used to predict ethanol concentration in bottled Chinese rice wine.
Spindle Thermal Error Optimization Modeling of a Five-axis Machine Tool

NASA Astrophysics Data System (ADS)

Guo, Qianjian; Fan, Shuo; Xu, Rufeng; Cheng, Xiang; Zhao, Guoyong; Yang, Jianguo

2017-05-01

Aiming at the problem of low machining accuracy and uncontrollable thermal errors of NC machine tools, spindle thermal error measurement, modeling and compensation of a two turntable five-axis machine tool are researched. Measurement experiment of heat sources and thermal errors are carried out, and GRA(grey relational analysis) method is introduced into the selection of temperature variables used for thermal error modeling. In order to analyze the influence of different heat sources on spindle thermal errors, an ANN (artificial neural network) model is presented, and ABC(artificial bee colony) algorithm is introduced to train the link weights of ANN, a new ABC-NN(Artificial bee colony-based neural network) modeling method is proposed and used in the prediction of spindle thermal errors. In order to test the prediction performance of ABC-NN model, an experiment system is developed, the prediction results of LSR (least squares regression), ANN and ABC-NN are compared with the measurement results of spindle thermal errors. Experiment results show that the prediction accuracy of ABC-NN model is higher than LSR and ANN, and the residual error is smaller than 3 μm, the new modeling method is feasible. The proposed research provides instruction to compensate thermal errors and improve machining accuracy of NC machine tools.
Effect on the partial least-squares prediction of yarn properties combining raman and infrared measurements and applying wavelength selection.

PubMed

de Groot, P J; Swierenga, H; Postma, G J; Melssen, W J; Buydens, L M C

2003-06-01

The combination of Raman and infrared spectroscopy on the one hand and wavelength selection on the other hand is used to improve the partial least-squares (PLS) prediction of seven selected yarn properties. These properties are important for on-line quality control during production. From 71 yarn samples, the Raman and infrared spectra are measured and reference methods are used to determine the selected properties. Making separate PLS models for all yarn properties using the Raman and infrared spectra, prior to wavelength selection, reveals that Raman spectroscopy outperforms infrared spectroscopy. If wavelength selection is applied, the PLS prediction error decreases and the correlation coefficient increases for all properties. However, a substantial wavelength selection effect is present for the infrared spectra compared to the Raman spectra. For the infrared spectra, wavelength selection results in PLS prediction errors comparable with the prediction performance of the Raman spectra prior to wavelength selection. Concatenating the Raman and infrared spectra does not enhance the PLS prediction performance, not even after wavelength selection. It is concluded that an infrared spectrometer, combined with a wavelength selection procedure, can be used if no (suitable) Raman instrument is available.
Prediction Model for Predicting Powdery Mildew using ANN for Medicinal Plant— Picrorhiza kurrooa

NASA Astrophysics Data System (ADS)

Shivling, V. D.; Ghanshyam, C.; Kumar, Rakesh; Kumar, Sanjay; Sharma, Radhika; Kumar, Dinesh; Sharma, Atul; Sharma, Sudhir Kumar

2017-02-01

Plant disease fore casting system is an important system as it can be used for prediction of disease, further it can be used as an alert system to warn the farmers in advance so as to protect their crop from being getting infected. Fore casting system will predict the risk of infection for crop by using the environmental factors that favor in germination of disease. In this study an artificial neural network based system for predicting the risk of powdery mildew in Picrorhiza kurrooa was developed. For development, Levenberg-Marquardt backpropagation algorithm was used having a single hidden layer of ten nodes. Temperature and duration of wetness are the major environmental factors that favor infection. Experimental data was used as a training set and some percentage of data was used for testing and validation. The performance of the system was measured in the form of the coefficient of correlation (R), coefficient of determination (R2), mean square error and root mean square error. For simulating the network an inter face was developed. Using this interface the network was simulated by putting temperature and wetness duration so as to predict the level of risk at that particular value of the input data.
Soil-pipe interaction modeling for pipe behavior prediction with super learning based methods

NASA Astrophysics Data System (ADS)

Shi, Fang; Peng, Xiang; Liu, Huan; Hu, Yafei; Liu, Zheng; Li, Eric

2018-03-01

Underground pipelines are subject to severe distress from the surrounding expansive soil. To investigate the structural response of water mains to varying soil movements, field data, including pipe wall strains in situ soil water content, soil pressure and temperature, was collected. The research on monitoring data analysis has been reported, but the relationship between soil properties and pipe deformation has not been well-interpreted. To characterize the relationship between soil property and pipe deformation, this paper presents a super learning based approach combining feature selection algorithms to predict the water mains structural behavior in different soil environments. Furthermore, automatic variable selection method, e.i. recursive feature elimination algorithm, were used to identify the critical predictors contributing to the pipe deformations. To investigate the adaptability of super learning to different predictive models, this research employed super learning based methods to three different datasets. The predictive performance was evaluated by R-squared, root-mean-square error and mean absolute error. Based on the prediction performance evaluation, the superiority of super learning was validated and demonstrated by predicting three types of pipe deformations accurately. In addition, a comprehensive understand of the water mains working environments becomes possible.

Sampling Errors of SSM/I and TRMM Rainfall Averages: Comparison with Error Estimates from Surface Data and a Sample Model

NASA Technical Reports Server (NTRS)

Bell, Thomas L.; Kundu, Prasun K.; Kummerow, Christian D.; Einaudi, Franco (Technical Monitor)

2000-01-01

Quantitative use of satellite-derived maps of monthly rainfall requires some measure of the accuracy of the satellite estimates. The rainfall estimate for a given map grid box is subject to both remote-sensing error and, in the case of low-orbiting satellites, sampling error due to the limited number of observations of the grid box provided by the satellite. A simple model of rain behavior predicts that Root-mean-square (RMS) random error in grid-box averages should depend in a simple way on the local average rain rate, and the predicted behavior has been seen in simulations using surface rain-gauge and radar data. This relationship was examined using satellite SSM/I data obtained over the western equatorial Pacific during TOGA COARE. RMS error inferred directly from SSM/I rainfall estimates was found to be larger than predicted from surface data, and to depend less on local rain rate than was predicted. Preliminary examination of TRMM microwave estimates shows better agreement with surface data. A simple method of estimating rms error in satellite rainfall estimates is suggested, based on quantities that can be directly computed from the satellite data.
[Prediction of schistosomiasis infection rates of population based on ARIMA-NARNN model].

PubMed

Ke-Wei, Wang; Yu, Wu; Jin-Ping, Li; Yu-Yu, Jiang

2016-07-12

To explore the effect of the autoregressive integrated moving average model-nonlinear auto-regressive neural network (ARIMA-NARNN) model on predicting schistosomiasis infection rates of population. The ARIMA model, NARNN model and ARIMA-NARNN model were established based on monthly schistosomiasis infection rates from January 2005 to February 2015 in Jiangsu Province, China. The fitting and prediction performances of the three models were compared. Compared to the ARIMA model and NARNN model, the mean square error (MSE), mean absolute error (MAE) and mean absolute percentage error (MAPE) of the ARIMA-NARNN model were the least with the values of 0.011 1, 0.090 0 and 0.282 4, respectively. The ARIMA-NARNN model could effectively fit and predict schistosomiasis infection rates of population, which might have a great application value for the prevention and control of schistosomiasis.
EVALUATING PREDICTIVE ERRORS OF A COMPLEX ENVIRONMENTAL MODEL USING A GENERAL LINEAR MODEL AND LEAST SQUARE MEANS

EPA Science Inventory

A General Linear Model (GLM) was used to evaluate the deviation of predicted values from expected values for a complex environmental model. For this demonstration, we used the default level interface of the Regional Mercury Cycling Model (R-MCM) to simulate epilimnetic total mer...
Objective Analysis of Oceanic Data for Coast Guard Trajectory Models Phase II

DTIC Science & Technology

1997-12-01

as outliers depends on the desired probability of false alarm, Pfa values, which is the probability of marking a valid point as an outlier. Table 2-2...constructed to minimize the mean-squared prediction error of the grid point estimate under the constraint that the estimate is unbiased . The...prediction error, e= Zl(S) _oizl(Si)+oC1iZz(S) (2.44) subject to the constraints of unbiasedness , • c/1 = 1,and (2.45) i SCC12 = 0. (2.46) Denoting
Prediction of Soil Organic Carbon at the European Scale by Visible and Near InfraRed Reflectance Spectroscopy.

PubMed

Stevens, Antoine; Nocita, Marco; Tóth, Gergely; Montanarella, Luca; van Wesemael, Bas

2013-01-01

Soil organic carbon is a key soil property related to soil fertility, aggregate stability and the exchange of CO2 with the atmosphere. Existing soil maps and inventories can rarely be used to monitor the state and evolution in soil organic carbon content due to their poor spatial resolution, lack of consistency and high updating costs. Visible and Near Infrared diffuse reflectance spectroscopy is an alternative method to provide cheap and high-density soil data. However, there are still some uncertainties on its capacity to produce reliable predictions for areas characterized by large soil diversity. Using a large-scale EU soil survey of about 20,000 samples and covering 23 countries, we assessed the performance of reflectance spectroscopy for the prediction of soil organic carbon content. The best calibrations achieved a root mean square error ranging from 4 to 15 g C kg(-1) for mineral soils and a root mean square error of 50 g C kg(-1) for organic soil materials. Model errors are shown to be related to the levels of soil organic carbon and variations in other soil properties such as sand and clay content. Although errors are ∼5 times larger than the reproducibility error of the laboratory method, reflectance spectroscopy provides unbiased predictions of the soil organic carbon content. Such estimates could be used for assessing the mean soil organic carbon content of large geographical entities or countries. This study is a first step towards providing uniform continental-scale spectroscopic estimations of soil organic carbon, meeting an increasing demand for information on the state of the soil that can be used in biogeochemical models and the monitoring of soil degradation.
Prediction of Soil Organic Carbon at the European Scale by Visible and Near InfraRed Reflectance Spectroscopy

PubMed Central

Stevens, Antoine; Nocita, Marco; Tóth, Gergely; Montanarella, Luca; van Wesemael, Bas

2013-01-01

Soil organic carbon is a key soil property related to soil fertility, aggregate stability and the exchange of CO2 with the atmosphere. Existing soil maps and inventories can rarely be used to monitor the state and evolution in soil organic carbon content due to their poor spatial resolution, lack of consistency and high updating costs. Visible and Near Infrared diffuse reflectance spectroscopy is an alternative method to provide cheap and high-density soil data. However, there are still some uncertainties on its capacity to produce reliable predictions for areas characterized by large soil diversity. Using a large-scale EU soil survey of about 20,000 samples and covering 23 countries, we assessed the performance of reflectance spectroscopy for the prediction of soil organic carbon content. The best calibrations achieved a root mean square error ranging from 4 to 15 g C kg−1 for mineral soils and a root mean square error of 50 g C kg−1 for organic soil materials. Model errors are shown to be related to the levels of soil organic carbon and variations in other soil properties such as sand and clay content. Although errors are ∼5 times larger than the reproducibility error of the laboratory method, reflectance spectroscopy provides unbiased predictions of the soil organic carbon content. Such estimates could be used for assessing the mean soil organic carbon content of large geographical entities or countries. This study is a first step towards providing uniform continental-scale spectroscopic estimations of soil organic carbon, meeting an increasing demand for information on the state of the soil that can be used in biogeochemical models and the monitoring of soil degradation. PMID:23840459
[Determination of Carbaryl in Rice by Using FT Far-IR and THz-TDS Techniques].

PubMed

Sun, Tong; Zhang, Zhuo-yong; Xiang, Yu-hong; Zhu, Ruo-hua

2016-02-01

Determination of carbaryl in rice by using Fourier transform far-infrared (FT- Far-IR) and terahertz time-domain spectroscopy (THz-TDS) combined with chemometrics was studied and the spectral characteristics of carbaryl in terahertz region was investigated. Samples were prepared by mixing carbaryl at different amounts with rice powder, and then a 13 mm diameter, and about 1 mm thick pellet with polyethylene (PE) as matrix was compressed under the pressure of 5-7 tons. Terahertz time domain spectra of the pellets were measured at 0.5~1.5 THz, and the absorption spectra at 1.6. 3 THz were acquired with Fourier transform far-IR spectroscopy. The method of sample preparation is so simple that it does not need separation and enrichment. The absorption peaks in the frequency range of 1.8-6.3 THz have been found at 3.2 and 5.2 THz by Far-IR. There are several weak absorption peaks in the range of 0.5-1.5 THz by THz-TDS. These two kinds of characteristic absorption spectra were randomly divided into calibration set and prediction set by leave-N-out cross-validation, respectively. Finally, the partial least squares regression (PLSR) method was used to establish two quantitative analysis models. The root mean square error (RMSECV), the root mean square errors of prediction (RMSEP) and the correlation coefficient of the prediction are used as a basis for the model of performance evaluation. For the R,, a higher value is better; for the RMSEC and RMSEP, lower is better. The obtained results demonstrated that the predictive accuracy of. the two models with PLSR method were satisfactory. For the FT-Far-IR model, the correlation between actual and predicted values of prediction samples (Rv) was 0.99. The root mean square error of prediction set (RMSEP) was 0.008 6, and for calibration set (RMSECV) was 0.007 7. For the THz-TDS model, R. was 0. 98, RMSEP was 0.004 4, and RMSECV was 0.002 5. Results proved that the technology of FT-Far-IR and THz- TDS can be a feasible tool for quantitative determination of carbaryl in rice. This paper provides a new method for the quantitative determination pesticide in other grain samples.
The Role of Multimodel Combination in Improving Streamflow Prediction

NASA Astrophysics Data System (ADS)

Arumugam, S.; Li, W.

2008-12-01

Model errors are the inevitable part in any prediction exercise. One approach that is currently gaining attention to reduce model errors is by optimally combining multiple models to develop improved predictions. The rationale behind this approach primarily lies on the premise that optimal weights could be derived for each model so that the developed multimodel predictions will result in improved predictability. In this study, we present a new approach to combine multiple hydrological models by evaluating their predictability contingent on the predictor state. We combine two hydrological models, 'abcd' model and Variable Infiltration Capacity (VIC) model, with each model's parameter being estimated by two different objective functions to develop multimodel streamflow predictions. The performance of multimodel predictions is compared with individual model predictions using correlation, root mean square error and Nash-Sutcliffe coefficient. To quantify precisely under what conditions the multimodel predictions result in improved predictions, we evaluate the proposed algorithm by testing it against streamflow generated from a known model ('abcd' model or VIC model) with errors being homoscedastic or heteroscedastic. Results from the study show that streamflow simulated from individual models performed better than multimodels under almost no model error. Under increased model error, the multimodel consistently performed better than the single model prediction in terms of all performance measures. The study also evaluates the proposed algorithm for streamflow predictions in two humid river basins from NC as well as in two arid basins from Arizona. Through detailed validation in these four sites, the study shows that multimodel approach better predicts the observed streamflow in comparison to the single model predictions.
Functional Mixed Effects Model for Small Area Estimation.

PubMed

Maiti, Tapabrata; Sinha, Samiran; Zhong, Ping-Shou

2016-09-01

Functional data analysis has become an important area of research due to its ability of handling high dimensional and complex data structures. However, the development is limited in the context of linear mixed effect models, and in particular, for small area estimation. The linear mixed effect models are the backbone of small area estimation. In this article, we consider area level data, and fit a varying coefficient linear mixed effect model where the varying coefficients are semi-parametrically modeled via B-splines. We propose a method of estimating the fixed effect parameters and consider prediction of random effects that can be implemented using a standard software. For measuring prediction uncertainties, we derive an analytical expression for the mean squared errors, and propose a method of estimating the mean squared errors. The procedure is illustrated via a real data example, and operating characteristics of the method are judged using finite sample simulation studies.
Gene expression programming approach for the estimation of moisture ratio in herbal plants drying with vacuum heat pump dryer

NASA Astrophysics Data System (ADS)

Dikmen, Erkan; Ayaz, Mahir; Gül, Doğan; Şahin, Arzu Şencan

2017-07-01

The determination of drying behavior of herbal plants is a complex process. In this study, gene expression programming (GEP) model was used to determine drying behavior of herbal plants as fresh sweet basil, parsley and dill leaves. Time and drying temperatures are input parameters for the estimation of moisture ratio of herbal plants. The results of the GEP model are compared with experimental drying data. The statistical values as mean absolute percentage error, root-mean-squared error and R-square are used to calculate the difference between values predicted by the GEP model and the values actually observed from the experimental study. It was found that the results of the GEP model and experimental study are in moderately well agreement. The results have shown that the GEP model can be considered as an efficient modelling technique for the prediction of moisture ratio of herbal plants.
Chemometrics resolution and quantification power evaluation: Application on pharmaceutical quaternary mixture of Paracetamol, Guaifenesin, Phenylephrine and p-aminophenol.

PubMed

Yehia, Ali M; Mohamed, Heba M

2016-01-05

Three advanced chemmometric-assisted spectrophotometric methods namely; Concentration Residuals Augmented Classical Least Squares (CRACLS), Multivariate Curve Resolution-Alternating Least Squares (MCR-ALS) and Principal Component Analysis-Artificial Neural Networks (PCA-ANN) were developed, validated and benchmarked to PLS calibration; to resolve the severely overlapped spectra and simultaneously determine; Paracetamol (PAR), Guaifenesin (GUA) and Phenylephrine (PHE) in their ternary mixture and in presence of p-aminophenol (AP) the main degradation product and synthesis impurity of Paracetamol. The analytical performance of the proposed methods was described by percentage recoveries, root mean square error of calibration and standard error of prediction. The four multivariate calibration methods could be directly used without any preliminary separation step and successfully applied for pharmaceutical formulation analysis, showing no excipients' interference. Copyright © 2015 Elsevier B.V. All rights reserved.
Least-squares model-based halftoning

NASA Astrophysics Data System (ADS)

Pappas, Thrasyvoulos N.; Neuhoff, David L.

1992-08-01

A least-squares model-based approach to digital halftoning is proposed. It exploits both a printer model and a model for visual perception. It attempts to produce an 'optimal' halftoned reproduction, by minimizing the squared error between the response of the cascade of the printer and visual models to the binary image and the response of the visual model to the original gray-scale image. Conventional methods, such as clustered ordered dither, use the properties of the eye only implicitly, and resist printer distortions at the expense of spatial and gray-scale resolution. In previous work we showed that our printer model can be used to modify error diffusion to account for printer distortions. The modified error diffusion algorithm has better spatial and gray-scale resolution than conventional techniques, but produces some well known artifacts and asymmetries because it does not make use of an explicit eye model. Least-squares model-based halftoning uses explicit eye models and relies on printer models that predict distortions and exploit them to increase, rather than decrease, both spatial and gray-scale resolution. We have shown that the one-dimensional least-squares problem, in which each row or column of the image is halftoned independently, can be implemented with the Viterbi's algorithm. Unfortunately, no closed form solution can be found in two dimensions. The two-dimensional least squares solution is obtained by iterative techniques. Experiments show that least-squares model-based halftoning produces more gray levels and better spatial resolution than conventional techniques. We also show that the least- squares approach eliminates the problems associated with error diffusion. Model-based halftoning can be especially useful in transmission of high quality documents using high fidelity gray-scale image encoders. As we have shown, in such cases halftoning can be performed at the receiver, just before printing. Apart from coding efficiency, this approach permits the halftoner to be tuned to the individual printer, whose characteristics may vary considerably from those of other printers, for example, write-black vs. write-white laser printers.
Mapping the EORTC QLQ-C30 onto the EQ-5D-3L: assessing the external validity of existing mapping algorithms.

PubMed

Doble, Brett; Lorgelly, Paula

2016-04-01

To determine the external validity of existing mapping algorithms for predicting EQ-5D-3L utility values from EORTC QLQ-C30 responses and to establish their generalizability in different types of cancer. A main analysis (pooled) sample of 3560 observations (1727 patients) and two disease severity patient samples (496 and 93 patients) with repeated observations over time from Cancer 2015 were used to validate the existing algorithms. Errors were calculated between observed and predicted EQ-5D-3L utility values using a single pooled sample and ten pooled tumour type-specific samples. Predictive accuracy was assessed using mean absolute error (MAE) and standardized root-mean-squared error (RMSE). The association between observed and predicted EQ-5D utility values and other covariates across the distribution was tested using quantile regression. Quality-adjusted life years (QALYs) were calculated using observed and predicted values to test responsiveness. Ten 'preferred' mapping algorithms were identified. Two algorithms estimated via response mapping and ordinary least-squares regression using dummy variables performed well on number of validation criteria, including accurate prediction of the best and worst QLQ-C30 health states, predicted values within the EQ-5D tariff range, relatively small MAEs and RMSEs, and minimal differences between estimated QALYs. Comparison of predictive accuracy across ten tumour type-specific samples highlighted that algorithms are relatively insensitive to grouping by tumour type and affected more by differences in disease severity. Two of the 'preferred' mapping algorithms suggest more accurate predictions, but limitations exist. We recommend extensive scenario analyses if mapped utilities are used in cost-utility analyses.
Top-of-Climb Matching Method for Reducing Aircraft Trajectory Prediction Errors.

PubMed

Thipphavong, David P

2016-09-01

The inaccuracies of the aircraft performance models utilized by trajectory predictors with regard to takeoff weight, thrust, climb profile, and other parameters result in altitude errors during the climb phase that often exceed the vertical separation standard of 1000 feet. This study investigates the potential reduction in altitude trajectory prediction errors that could be achieved for climbing flights if just one additional parameter is made available: top-of-climb (TOC) time. The TOC-matching method developed and evaluated in this paper is straightforward: a set of candidate trajectory predictions is generated using different aircraft weight parameters, and the one that most closely matches TOC in terms of time is selected. This algorithm was tested using more than 1000 climbing flights in Fort Worth Center. Compared to the baseline trajectory predictions of a real-time research prototype (Center/TRACON Automation System), the TOC-matching method reduced the altitude root mean square error (RMSE) for a 5-minute prediction time by 38%. It also decreased the percentage of flights with absolute altitude error greater than the vertical separation standard of 1000 ft for the same look-ahead time from 55% to 30%.
Top-of-Climb Matching Method for Reducing Aircraft Trajectory Prediction Errors

PubMed Central

Thipphavong, David P.

2017-01-01

The inaccuracies of the aircraft performance models utilized by trajectory predictors with regard to takeoff weight, thrust, climb profile, and other parameters result in altitude errors during the climb phase that often exceed the vertical separation standard of 1000 feet. This study investigates the potential reduction in altitude trajectory prediction errors that could be achieved for climbing flights if just one additional parameter is made available: top-of-climb (TOC) time. The TOC-matching method developed and evaluated in this paper is straightforward: a set of candidate trajectory predictions is generated using different aircraft weight parameters, and the one that most closely matches TOC in terms of time is selected. This algorithm was tested using more than 1000 climbing flights in Fort Worth Center. Compared to the baseline trajectory predictions of a real-time research prototype (Center/TRACON Automation System), the TOC-matching method reduced the altitude root mean square error (RMSE) for a 5-minute prediction time by 38%. It also decreased the percentage of flights with absolute altitude error greater than the vertical separation standard of 1000 ft for the same look-ahead time from 55% to 30%. PMID:28684883
Top-of-Climb Matching Method for Reducing Aircraft Trajectory Prediction Errors

NASA Technical Reports Server (NTRS)

Thipphavong, David P.

2016-01-01

The inaccuracies of the aircraft performance models utilized by trajectory predictors with regard to takeoff weight, thrust, climb profile, and other parameters result in altitude errors during the climb phase that often exceed the vertical separation standard of 1000 feet. This study investigates the potential reduction in altitude trajectory prediction errors that could be achieved for climbing flights if just one additional parameter is made available: top-of-climb (TOC) time. The TOC-matching method developed and evaluated in this paper is straightforward: a set of candidate trajectory predictions is generated using different aircraft weight parameters, and the one that most closely matches TOC in terms of time is selected. This algorithm was tested using more than 1000 climbing flights in Fort Worth Center. Compared to the baseline trajectory predictions of a real-time research prototype (Center/TRACON Automation System), the TOC-matching method reduced the altitude root mean square error (RMSE) for a 5-minute prediction time by 38%. It also decreased the percentage of flights with absolute altitude error greater than the vertical separation standard of 1000 ft for the same look-ahead time from 55% to 30%.
Quantitative transmission Raman spectroscopy of pharmaceutical tablets and capsules.

PubMed

Johansson, Jonas; Sparén, Anders; Svensson, Olof; Folestad, Staffan; Claybourn, Mike

2007-11-01

Quantitative analysis of pharmaceutical formulations using the new approach of transmission Raman spectroscopy has been investigated. For comparison, measurements were also made in conventional backscatter mode. The experimental setup consisted of a Raman probe-based spectrometer with 785 nm excitation for measurements in backscatter mode. In transmission mode the same system was used to detect the Raman scattered light, while an external diode laser of the same type was used as excitation source. Quantitative partial least squares models were developed for both measurement modes. The results for tablets show that the prediction error for an independent test set was lower for the transmission measurements with a relative root mean square error of about 2.2% as compared with 2.9% for the backscatter mode. Furthermore, the models were simpler in the transmission case, for which only a single partial least squares (PLS) component was required to explain the variation. The main reason for the improvement using the transmission mode is a more representative sampling of the tablets compared with the backscatter mode. Capsules containing mixtures of pharmaceutical powders were also assessed by transmission only. The quantitative results for the capsules' contents were good, with a prediction error of 3.6% w/w for an independent test set. The advantage of transmission Raman over backscatter Raman spectroscopy has been demonstrated for quantitative analysis of pharmaceutical formulations, and the prospects for reliable, lean calibrations for pharmaceutical analysis is discussed.
Artificial Intelligence Techniques for Predicting and Mapping Daily Pan Evaporation

NASA Astrophysics Data System (ADS)

Arunkumar, R.; Jothiprakash, V.; Sharma, Kirty

2017-09-01

In this study, Artificial Intelligence techniques such as Artificial Neural Network (ANN), Model Tree (MT) and Genetic Programming (GP) are used to develop daily pan evaporation time-series (TS) prediction and cause-effect (CE) mapping models. Ten years of observed daily meteorological data such as maximum temperature, minimum temperature, relative humidity, sunshine hours, dew point temperature and pan evaporation are used for developing the models. For each technique, several models are developed by changing the number of inputs and other model parameters. The performance of each model is evaluated using standard statistical measures such as Mean Square Error, Mean Absolute Error, Normalized Mean Square Error and correlation coefficient (R). The results showed that daily TS-GP (4) model predicted better with a correlation coefficient of 0.959 than other TS models. Among various CE models, CE-ANN (6-10-1) resulted better than MT and GP models with a correlation coefficient of 0.881. Because of the complex non-linear inter-relationship among various meteorological variables, CE mapping models could not achieve the performance of TS models. From this study, it was found that GP performs better for recognizing single pattern (time series modelling), whereas ANN is better for modelling multiple patterns (cause-effect modelling) in the data.
Hyperspectral Analysis of Soil Total Nitrogen in Subsided Land Using the Local Correlation Maximization-Complementary Superiority (LCMCS) Method.

PubMed

Lin, Lixin; Wang, Yunjia; Teng, Jiyao; Xi, Xiuxiu

2015-07-23

The measurement of soil total nitrogen (TN) by hyperspectral remote sensing provides an important tool for soil restoration programs in areas with subsided land caused by the extraction of natural resources. This study used the local correlation maximization-complementary superiority method (LCMCS) to establish TN prediction models by considering the relationship between spectral reflectance (measured by an ASD FieldSpec 3 spectroradiometer) and TN based on spectral reflectance curves of soil samples collected from subsided land which is determined by synthetic aperture radar interferometry (InSAR) technology. Based on the 1655 selected effective bands of the optimal spectrum (OSP) of the first derivate differential of reciprocal logarithm ([log{1/R}]'), (correlation coefficients, p < 0.01), the optimal model of LCMCS method was obtained to determine the final model, which produced lower prediction errors (root mean square error of validation [RMSEV] = 0.89, mean relative error of validation [MREV] = 5.93%) when compared with models built by the local correlation maximization (LCM), complementary superiority (CS) and partial least squares regression (PLS) methods. The predictive effect of LCMCS model was optional in Cangzhou, Renqiu and Fengfeng District. Results indicate that the LCMCS method has great potential to monitor TN in subsided lands caused by the extraction of natural resources including groundwater, oil and coal.
Medium-range Performance of the Global NWP Model

NASA Astrophysics Data System (ADS)

Kim, J.; Jang, T.; Kim, J.; Kim, Y.

2017-12-01

The medium-range performance of the global numerical weather prediction (NWP) model in the Korea Meteorological Administration (KMA) is investigated. The performance is based on the prediction of the extratropical circulation. The mean square error is expressed by sum of spatial variance of discrepancy between forecasts and observations and the square of the mean error (ME). Thus, it is important to investigate the ME effect in order to understand the model performance. The ME is expressed by the subtraction of an anomaly from forecast difference against the real climatology. It is found that the global model suffers from a severe systematic ME in medium-range forecasts. The systematic ME is dominant in the entire troposphere in all months. Such ME can explain at most 25% of root mean square error. We also compare the extratropical ME distribution with that from other NWP centers. NWP models exhibit similar spatial ME structure each other. It is found that the spatial ME pattern is highly correlated to that of an anomaly, implying that the ME varies with seasons. For example, the correlation coefficient between ME and anomaly ranges from -0.51 to -0.85 by months. The pattern of the extratropical circulation also has a high correlation to an anomaly. The global model has trouble in faithfully simulating extratropical cyclones and blockings in the medium-range forecast. In particular, the model has a hard to simulate an anomalous event in medium-range forecasts. If we choose an anomalous period for a test-bed experiment, we will suffer from a large error due to an anomaly.

Comparative Analysis of Hybrid Models for Prediction of BP Reactivity to Crossed Legs.

PubMed

Kaur, Gurmanik; Arora, Ajat Shatru; Jain, Vijender Kumar

2017-01-01

Crossing the legs at the knees, during BP measurement, is one of the several physiological stimuli that considerably influence the accuracy of BP measurements. Therefore, it is paramount to develop an appropriate prediction model for interpreting influence of crossed legs on BP. This research work described the use of principal component analysis- (PCA-) fused forward stepwise regression (FSWR), artificial neural network (ANN), adaptive neuro fuzzy inference system (ANFIS), and least squares support vector machine (LS-SVM) models for prediction of BP reactivity to crossed legs among the normotensive and hypertensive participants. The evaluation of the performance of the proposed prediction models using appropriate statistical indices showed that the PCA-based LS-SVM (PCA-LS-SVM) model has the highest prediction accuracy with coefficient of determination ( R 2 ) = 93.16%, root mean square error (RMSE) = 0.27, and mean absolute percentage error (MAPE) = 5.71 for SBP prediction in normotensive subjects. Furthermore, R 2 = 96.46%, RMSE = 0.19, and MAPE = 1.76 for SBP prediction and R 2 = 95.44%, RMSE = 0.21, and MAPE = 2.78 for DBP prediction in hypertensive subjects using the PCA-LSSVM model. This assessment presents the importance and advantages posed by hybrid computing models for the prediction of variables in biomedical research studies.
Baseline correction combined partial least squares algorithm and its application in on-line Fourier transform infrared quantitative analysis.

PubMed

Peng, Jiangtao; Peng, Silong; Xie, Qiong; Wei, Jiping

2011-04-01

In order to eliminate the lower order polynomial interferences, a new quantitative calibration algorithm "Baseline Correction Combined Partial Least Squares (BCC-PLS)", which combines baseline correction and conventional PLS, is proposed. By embedding baseline correction constraints into PLS weights selection, the proposed calibration algorithm overcomes the uncertainty in baseline correction and can meet the requirement of on-line attenuated total reflectance Fourier transform infrared (ATR-FTIR) quantitative analysis. The effectiveness of the algorithm is evaluated by the analysis of glucose and marzipan ATR-FTIR spectra. BCC-PLS algorithm shows improved prediction performance over PLS. The root mean square error of cross-validation (RMSECV) on marzipan spectra for the prediction of the moisture is found to be 0.53%, w/w (range 7-19%). The sugar content is predicted with a RMSECV of 2.04%, w/w (range 33-68%). Copyright © 2011 Elsevier B.V. All rights reserved.
Maximum Likelihood Time-of-Arrival Estimation of Optical Pulses via Photon-Counting Photodetectors

NASA Technical Reports Server (NTRS)

Erkmen, Baris I.; Moision, Bruce E.

2010-01-01

Many optical imaging, ranging, and communications systems rely on the estimation of the arrival time of an optical pulse. Recently, such systems have been increasingly employing photon-counting photodetector technology, which changes the statistics of the observed photocurrent. This requires time-of-arrival estimators to be developed and their performances characterized. The statistics of the output of an ideal photodetector, which are well modeled as a Poisson point process, were considered. An analytical model was developed for the mean-square error of the maximum likelihood (ML) estimator, demonstrating two phenomena that cause deviations from the minimum achievable error at low signal power. An approximation was derived to the threshold at which the ML estimator essentially fails to provide better than a random guess of the pulse arrival time. Comparing the analytic model performance predictions to those obtained via simulations, it was verified that the model accurately predicts the ML performance over all regimes considered. There is little prior art that attempts to understand the fundamental limitations to time-of-arrival estimation from Poisson statistics. This work establishes both a simple mathematical description of the error behavior, and the associated physical processes that yield this behavior. Previous work on mean-square error characterization for ML estimators has predominantly focused on additive Gaussian noise. This work demonstrates that the discrete nature of the Poisson noise process leads to a distinctly different error behavior.
Mortality risk score prediction in an elderly population using machine learning.

PubMed

Rose, Sherri

2013-03-01

Standard practice for prediction often relies on parametric regression methods. Interesting new methods from the machine learning literature have been introduced in epidemiologic studies, such as random forest and neural networks. However, a priori, an investigator will not know which algorithm to select and may wish to try several. Here I apply the super learner, an ensembling machine learning approach that combines multiple algorithms into a single algorithm and returns a prediction function with the best cross-validated mean squared error. Super learning is a generalization of stacking methods. I used super learning in the Study of Physical Performance and Age-Related Changes in Sonomans (SPPARCS) to predict death among 2,066 residents of Sonoma, California, aged 54 years or more during the period 1993-1999. The super learner for predicting death (risk score) improved upon all single algorithms in the collection of algorithms, although its performance was similar to that of several algorithms. Super learner outperformed the worst algorithm (neural networks) by 44% with respect to estimated cross-validated mean squared error and had an R2 value of 0.201. The improvement of super learner over random forest with respect to R2 was approximately 2-fold. Alternatives for risk score prediction include the super learner, which can provide improved performance.
Seasonality and Trend Forecasting of Tuberculosis Prevalence Data in Eastern Cape, South Africa, Using a Hybrid Model.

PubMed

Azeez, Adeboye; Obaromi, Davies; Odeyemi, Akinwumi; Ndege, James; Muntabayi, Ruffin

2016-07-26

Tuberculosis (TB) is a deadly infectious disease caused by Mycobacteria tuberculosis. Tuberculosis as a chronic and highly infectious disease is prevalent in almost every part of the globe. More than 95% of TB mortality occurs in low/middle income countries. In 2014, approximately 10 million people were diagnosed with active TB and two million died from the disease. In this study, our aim is to compare the predictive powers of the seasonal autoregressive integrated moving average (SARIMA) and neural network auto-regression (SARIMA-NNAR) models of TB incidence and analyse its seasonality in South Africa. TB incidence cases data from January 2010 to December 2015 were extracted from the Eastern Cape Health facility report of the electronic Tuberculosis Register (ERT.Net). A SARIMA model and a combined model of SARIMA model and a neural network auto-regression (SARIMA-NNAR) model were used in analysing and predicting the TB data from 2010 to 2015. Simulation performance parameters of mean square error (MSE), root mean square error (RMSE), mean absolute error (MAE), mean percent error (MPE), mean absolute scaled error (MASE) and mean absolute percentage error (MAPE) were applied to assess the better performance of prediction between the models. Though practically, both models could predict TB incidence, the combined model displayed better performance. For the combined model, the Akaike information criterion (AIC), second-order AIC (AICc) and Bayesian information criterion (BIC) are 288.56, 308.31 and 299.09 respectively, which were lower than the SARIMA model with corresponding values of 329.02, 327.20 and 341.99, respectively. The seasonality trend of TB incidence was forecast to have a slightly increased seasonal TB incidence trend from the SARIMA-NNAR model compared to the single model. The combined model indicated a better TB incidence forecasting with a lower AICc. The model also indicates the need for resolute intervention to reduce infectious disease transmission with co-infection with HIV and other concomitant diseases, and also at festival peak periods.
Use of machine learning methods to reduce predictive error of groundwater models.

PubMed

Xu, Tianfang; Valocchi, Albert J; Choi, Jaesik; Amir, Eyal

2014-01-01

Quantitative analyses of groundwater flow and transport typically rely on a physically-based model, which is inherently subject to error. Errors in model structure, parameter and data lead to both random and systematic error even in the output of a calibrated model. We develop complementary data-driven models (DDMs) to reduce the predictive error of physically-based groundwater models. Two machine learning techniques, the instance-based weighting and support vector regression, are used to build the DDMs. This approach is illustrated using two real-world case studies of the Republican River Compact Administration model and the Spokane Valley-Rathdrum Prairie model. The two groundwater models have different hydrogeologic settings, parameterization, and calibration methods. In the first case study, cluster analysis is introduced for data preprocessing to make the DDMs more robust and computationally efficient. The DDMs reduce the root-mean-square error (RMSE) of the temporal, spatial, and spatiotemporal prediction of piezometric head of the groundwater model by 82%, 60%, and 48%, respectively. In the second case study, the DDMs reduce the RMSE of the temporal prediction of piezometric head of the groundwater model by 77%. It is further demonstrated that the effectiveness of the DDMs depends on the existence and extent of the structure in the error of the physically-based model. © 2013, National GroundWater Association.
Predicting protein concentrations with ELISA microarray assays, monotonic splines and Monte Carlo simulation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Daly, Don S.; Anderson, Kevin K.; White, Amanda M.

Background: A microarray of enzyme-linked immunosorbent assays, or ELISA microarray, predicts simultaneously the concentrations of numerous proteins in a small sample. These predictions, however, are uncertain due to processing error and biological variability. Making sound biological inferences as well as improving the ELISA microarray process require require both concentration predictions and creditable estimates of their errors. Methods: We present a statistical method based on monotonic spline statistical models, penalized constrained least squares fitting (PCLS) and Monte Carlo simulation (MC) to predict concentrations and estimate prediction errors in ELISA microarray. PCLS restrains the flexible spline to a fit of assay intensitymore » that is a monotone function of protein concentration. With MC, both modeling and measurement errors are combined to estimate prediction error. The spline/PCLS/MC method is compared to a common method using simulated and real ELISA microarray data sets. Results: In contrast to the rigid logistic model, the flexible spline model gave credible fits in almost all test cases including troublesome cases with left and/or right censoring, or other asymmetries. For the real data sets, 61% of the spline predictions were more accurate than their comparable logistic predictions; especially the spline predictions at the extremes of the prediction curve. The relative errors of 50% of comparable spline and logistic predictions differed by less than 20%. Monte Carlo simulation rendered acceptable asymmetric prediction intervals for both spline and logistic models while propagation of error produced symmetric intervals that diverged unrealistically as the standard curves approached horizontal asymptotes. Conclusions: The spline/PCLS/MC method is a flexible, robust alternative to a logistic/NLS/propagation-of-error method to reliably predict protein concentrations and estimate their errors. The spline method simplifies model selection and fitting, and reliably estimates believable prediction errors. For the 50% of the real data sets fit well by both methods, spline and logistic predictions are practically indistinguishable, varying in accuracy by less than 15%. The spline method may be useful when automated prediction across simultaneous assays of numerous proteins must be applied routinely with minimal user intervention.« less
A new approach for modeling patient overall radiosensitivity and predicting multiple toxicity endpoints for breast cancer patients.

PubMed

Mbah, Chamberlain; De Ruyck, Kim; De Schrijver, Silke; De Sutter, Charlotte; Schiettecatte, Kimberly; Monten, Chris; Paelinck, Leen; De Neve, Wilfried; Thierens, Hubert; West, Catharine; Amorim, Gustavo; Thas, Olivier; Veldeman, Liv

2018-05-01

Evaluation of patient characteristics inducing toxicity in breast radiotherapy, using simultaneous modeling of multiple endpoints. In 269 early-stage breast cancer patients treated with whole-breast irradiation (WBI) after breast-conserving surgery, toxicity was scored, based on five dichotomized endpoints. Five logistic regression models were fitted, one for each endpoint and the effect sizes of all variables were estimated using maximum likelihood (MLE). The MLEs are improved with James-Stein estimates (JSEs). The method combines all the MLEs, obtained for the same variable but from different endpoints. Misclassification errors were computed using MLE- and JSE-based prediction models. For associations, p-values from the sum of squares of MLEs were compared with p-values from the Standardized Total Average Toxicity (STAT) Score. With JSEs, 19 highest ranked variables were predictive of the five different endpoints. Important variables increasing radiation-induced toxicity were chemotherapy, age, SATB2 rs2881208 SNP and nodal irradiation. Treatment position (prone position) was most protective and ranked eighth. Overall, the misclassification errors were 45% and 34% for the MLE- and JSE-based models, respectively. p-Values from the sum of squares of MLEs and p-values from STAT score led to very similar conclusions, except for the variables nodal irradiation and treatment position, for which STAT p-values suggested an association with radiosensitivity, whereas p-values from the sum of squares indicated no association. Breast volume was ranked as the most significant variable in both strategies. The James-Stein estimator was used for selecting variables that are predictive for multiple toxicity endpoints. With this estimator, 19 variables were predictive for all toxicities of which four were significantly associated with overall radiosensitivity. JSEs led to almost 25% reduction in the misclassification error rate compared to conventional MLEs. Finally, patient characteristics that are associated with radiosensitivity were identified without explicitly quantifying radiosensitivity.
Certification in Structural Health Monitoring Systems

DTIC Science & Technology

2011-09-01

validation [3,8]. This may be accomplished by computing the sum of squares of pure error ( SSPE ) and its associated squared correlation [3,8]. To compute...these values, a cross- validation sample must be established. In general, if the SSPE is high, the model does not predict well on independent data...plethora of cross- validation methods, some of which are more useful for certain models than others [3,8]. When possible, a disclosure of the SSPE
Methods for estimating flood frequency in Montana based on data through water year 1998

USGS Publications Warehouse

Parrett, Charles; Johnson, Dave R.

2004-01-01

Annual peak discharges having recurrence intervals of 2, 5, 10, 25, 50, 100, 200, and 500 years (T-year floods) were determined for 660 gaged sites in Montana and in adjacent areas of Idaho, Wyoming, and Canada, based on data through water year 1998. The updated flood-frequency information was subsequently used in regression analyses, either ordinary or generalized least squares, to develop equations relating T-year floods to various basin and climatic characteristics, equations relating T-year floods to active-channel width, and equations relating T-year floods to bankfull width. The equations can be used to estimate flood frequency at ungaged sites. Montana was divided into eight regions, within which flood characteristics were considered to be reasonably homogeneous, and the three sets of regression equations were developed for each region. A measure of the overall reliability of the regression equations is the average standard error of prediction. The average standard errors of prediction for the equations based on basin and climatic characteristics ranged from 37.4 percent to 134.1 percent. Average standard errors of prediction for the equations based on active-channel width ranged from 57.2 percent to 141.3 percent. Average standard errors of prediction for the equations based on bankfull width ranged from 63.1 percent to 155.5 percent. In most regions, the equations based on basin and climatic characteristics generally had smaller average standard errors of prediction than equations based on active-channel or bankfull width. An exception was the Southeast Plains Region, where all equations based on active-channel width had smaller average standard errors of prediction than equations based on basin and climatic characteristics or bankfull width. Methods for weighting estimates derived from the basin- and climatic-characteristic equations and the channel-width equations also were developed. The weights were based on the cross correlation of residuals from the different methods and the average standard errors of prediction. When all three methods were combined, the average standard errors of prediction ranged from 37.4 percent to 120.2 percent. Weighting of estimates reduced the standard errors of prediction for all T-year flood estimates in four regions, reduced the standard errors of prediction for some T-year flood estimates in two regions, and provided no reduction in average standard error of prediction in two regions. A computer program for solving the regression equations, weighting estimates, and determining reliability of individual estimates was developed and placed on the USGS Montana District World Wide Web page. A new regression method, termed Region of Influence regression, also was tested. Test results indicated that the Region of Influence method was not as reliable as the regional equations based on generalized least squares regression. Two additional methods for estimating flood frequency at ungaged sites located on the same streams as gaged sites also are described. The first method, based on a drainage-area-ratio adjustment, is intended for use on streams where the ungaged site of interest is located near a gaged site. The second method, based on interpolation between gaged sites, is intended for use on streams that have two or more streamflow-gaging stations.
Comparison of various error functions in predicting the optimum isotherm by linear and non-linear regression analysis for the sorption of basic red 9 by activated carbon.

PubMed

Kumar, K Vasanth; Porkodi, K; Rocha, F

2008-01-15

A comparison of linear and non-linear regression method in selecting the optimum isotherm was made to the experimental equilibrium data of basic red 9 sorption by activated carbon. The r(2) was used to select the best fit linear theoretical isotherm. In the case of non-linear regression method, six error functions namely coefficient of determination (r(2)), hybrid fractional error function (HYBRID), Marquardt's percent standard deviation (MPSD), the average relative error (ARE), sum of the errors squared (ERRSQ) and sum of the absolute errors (EABS) were used to predict the parameters involved in the two and three parameter isotherms and also to predict the optimum isotherm. Non-linear regression was found to be a better way to obtain the parameters involved in the isotherms and also the optimum isotherm. For two parameter isotherm, MPSD was found to be the best error function in minimizing the error distribution between the experimental equilibrium data and predicted isotherms. In the case of three parameter isotherm, r(2) was found to be the best error function to minimize the error distribution structure between experimental equilibrium data and theoretical isotherms. The present study showed that the size of the error function alone is not a deciding factor to choose the optimum isotherm. In addition to the size of error function, the theory behind the predicted isotherm should be verified with the help of experimental data while selecting the optimum isotherm. A coefficient of non-determination, K(2) was explained and was found to be very useful in identifying the best error function while selecting the optimum isotherm.
[Effect of stock abundance and environmental factors on the recruitment success of small yellow croaker in the East China Sea].

PubMed

Liu, Zun-lei; Yuan, Xing-wei; Yang, Lin-lin; Yan, Li-ping; Zhang, Hui; Cheng, Jia-hua

2015-02-01

Multiple hypotheses are available to explain recruitment rate. Model selection methods can be used to identify the best model that supports a particular hypothesis. However, using a single model for estimating recruitment success is often inadequate for overexploited population because of high model uncertainty. In this study, stock-recruitment data of small yellow croaker in the East China Sea collected from fishery dependent and independent surveys between 1992 and 2012 were used to examine density-dependent effects on recruitment success. Model selection methods based on frequentist (AIC, maximum adjusted R2 and P-values) and Bayesian (Bayesian model averaging, BMA) methods were applied to identify the relationship between recruitment and environment conditions. Interannual variability of the East China Sea environment was indicated by sea surface temperature ( SST) , meridional wind stress (MWS), zonal wind stress (ZWS), sea surface pressure (SPP) and runoff of Changjiang River ( RCR). Mean absolute error, mean squared predictive error and continuous ranked probability score were calculated to evaluate the predictive performance of recruitment success. The results showed that models structures were not consistent based on three kinds of model selection methods, predictive variables of models were spawning abundance and MWS by AIC, spawning abundance by P-values, spawning abundance, MWS and RCR by maximum adjusted R2. The recruitment success decreased linearly with stock abundance (P < 0.01), suggesting overcompensation effect in the recruitment success might be due to cannibalism or food competition. Meridional wind intensity showed marginally significant and positive effects on the recruitment success (P = 0.06), while runoff of Changjiang River showed a marginally negative effect (P = 0.07). Based on mean absolute error and continuous ranked probability score, predictive error associated with models obtained from BMA was the smallest amongst different approaches, while that from models selected based on the P-value of the independent variables was the highest. However, mean squared predictive error from models selected based on the maximum adjusted R2 was highest. We found that BMA method could improve the prediction of recruitment success, derive more accurate prediction interval and quantitatively evaluate model uncertainty.
Prediction of thermal conductivity of polyvinylpyrrolidone (PVP) electrospun nanocomposite fibers using artificial neural network and prey-predator algorithm.

PubMed

Khan, Waseem S; Hamadneh, Nawaf N; Khan, Waqar A

2017-01-01

In this study, multilayer perception neural network (MLPNN) was employed to predict thermal conductivity of PVP electrospun nanocomposite fibers with multiwalled carbon nanotubes (MWCNTs) and Nickel Zinc ferrites [(Ni0.6Zn0.4) Fe2O4]. This is the second attempt on the application of MLPNN with prey predator algorithm for the prediction of thermal conductivity of PVP electrospun nanocomposite fibers. The prey predator algorithm was used to train the neural networks to find the best models. The best models have the minimal of sum squared error between the experimental testing data and the corresponding models results. The minimal error was found to be 0.0028 for MWCNTs model and 0.00199 for Ni-Zn ferrites model. The predicted artificial neural networks (ANNs) responses were analyzed statistically using z-test, correlation coefficient, and the error functions for both inclusions. The predicted ANN responses for PVP electrospun nanocomposite fibers were compared with the experimental data and were found in good agreement.
[On-line monitoring of biomass in 1,3-propanediol fermentation by Fourier-transformed near-infrared spectra analysis].

PubMed

Wang, Lu; Liu, Tao; Chen, Yang; Sun, Yaqin; Xiu, Zhilong

2017-01-25

Biomass is an important parameter reflecting the fermentation dynamics. Real-time monitoring of biomass can be used to control and optimize a fermentation process. To overcome the deficiencies of measurement delay and manual errors from offline measurement, we designed an experimental platform for online monitoring the biomass during a 1,3-propanediol fermentation process, based on using the fourier-transformed near-infrared (FT-NIR) spectra analysis. By pre-processing the real-time sampled spectra and analyzing the sensitive spectra bands, a partial least-squares algorithm was proposed to establish a dynamic prediction model for the biomass change during a 1,3-propanediol fermentation process. The fermentation processes with substrate glycerol concentrations of 60 g/L and 40 g/L were used as the external validation experiments. The root mean square error of prediction (RMSEP) obtained by analyzing experimental data was 0.341 6 and 0.274 3, respectively. These results showed that the established model gave good prediction and could be effectively used for on-line monitoring the biomass during a 1,3-propanediol fermentation process.
Predicting the digestible energy of corn determined with growing swine from nutrient composition and cross-species measurements.

PubMed

Smith, B; Hassen, A; Hinds, M; Rice, D; Jones, D; Sauber, T; Iiams, C; Sevenich, D; Allen, R; Owens, F; McNaughton, J; Parsons, C

2015-03-01

The DE values of corn grain for pigs will differ among corn sources. More accurate prediction of DE may improve diet formulation and reduce diet cost. Corn grain sources ( = 83) were assayed with growing swine (20 kg) in DE experiments with total collection of feces, with 3-wk-old broiler chick in nitrogen-corrected apparent ME (AME) trials and with cecectomized adult roosters in nitrogen-corrected true ME (TME) studies. Additional AME data for the corn grain source set was generated based on an existing near-infrared transmittance prediction model (near-infrared transmittance-predicted AME [NIT-AME]). Corn source nutrient composition was determined by wet chemistry methods. These data were then used to 1) test the accuracy of predicting swine DE of individual corn sources based on available literature equations and nutrient composition and 2) develop models for predicting DE of sources from nutrient composition and the cross-species information gathered above (AME, NIT-AME, and TME). The overall measured DE, AME, NIT-AME, and TME values were 4,105 ± 11, 4,006 ± 10, 4,004 ± 10, and 4,086 ± 12 kcal/kg DM, respectively. Prediction models were developed using 80% of the corn grain sources; the remaining 20% was reserved for validation of the developed prediction equation. Literature equations based on nutrient composition proved imprecise for predicting corn DE; the root mean square error of prediction ranged from 105 to 331 kcal/kg, an equivalent of 2.6 to 8.8% error. Yet among the corn composition traits, 4-variable models developed in the current study provided adequate prediction of DE (model ranging from 0.76 to 0.79 and root mean square error [RMSE] of 50 kcal/kg). When prediction equations were tested using the validation set, these models had a 1 to 1.2% error of prediction. Simple linear equations from AME, NIT-AME, or TME provided an accurate prediction of DE for individual sources ( ranged from 0.65 to 0.73 and RMSE ranged from 50 to 61 kcal/kg). Percentage error of prediction based on the validation data set was greater (1.4%) for the TME model than for the NIT-AME or AME models (1 and 1.2%, respectively), indicating that swine DE values could be accurately predicted by using AME or NIT-AME. In conclusion, regression equations developed from broiler measurements or from analyzed nutrient composition proved adequate to reliably predict the DE of commercially available corn hybrids for growing pigs.
Quantitative analysis of bayberry juice acidity based on visible and near-infrared spectroscopy

DOE Office of Scientific and Technical Information (OSTI.GOV)

Shao Yongni; He Yong; Mao Jingyuan

Visible and near-infrared (Vis/NIR) reflectance spectroscopy has been investigated for its ability to nondestructively detect acidity in bayberry juice. What we believe to be a new, better mathematic model is put forward, which we have named principal component analysis-stepwise regression analysis-backpropagation neural network (PCA-SRA-BPNN), to build a correlation between the spectral reflectivity data and the acidity of bayberry juice. In this model, the optimum network parameters,such as the number of input nodes, hidden nodes, learning rate, and momentum, are chosen by the value of root-mean-square (rms) error. The results show that its prediction statistical parameters are correlation coefficient (r) ofmore » 0.9451 and root-mean-square error of prediction(RMSEP) of 0.1168. Partial least-squares (PLS) regression is also established to compare with this model. Before doing this, the influences of various spectral pretreatments (standard normal variate, multiplicative scatter correction, S. Golay first derivative, and wavelet package transform) are compared. The PLS approach with wavelet package transform preprocessing spectra is found to provide the best results, and its prediction statistical parameters are correlation coefficient (r) of 0.9061 and RMSEP of 0.1564. Hence, these two models are both desirable to analyze the data from Vis/NIR spectroscopy and to solve the problem of the acidity prediction of bayberry juice. This supplies basal research to ultimately realize the online measurements of the juice's internal quality through this Vis/NIR spectroscopy technique.« less
PLS-LS-SVM based modeling of ATR-IR as a robust method in detection and qualification of alprazolam

NASA Astrophysics Data System (ADS)

Parhizkar, Elahehnaz; Ghazali, Mohammad; Ahmadi, Fatemeh; Sakhteman, Amirhossein

2017-02-01

According to the United States pharmacopeia (USP), Gold standard technique for Alprazolam determination in dosage forms is HPLC, an expensive and time-consuming method that is not easy to approach. In this study chemometrics assisted ATR-IR was introduced as an alternative method that produce similar results in fewer time and energy consumed manner. Fifty-eight samples containing different concentrations of commercial alprazolam were evaluated by HPLC and ATR-IR method. A preprocessing approach was applied to convert raw data obtained from ATR-IR spectra to normal matrix. Finally, a relationship between alprazolam concentrations achieved by HPLC and ATR-IR data was established using PLS-LS-SVM (partial least squares least squares support vector machines). Consequently, validity of the method was verified to yield a model with low error values (root mean square error of cross validation equal to 0.98). The model was able to predict about 99% of the samples according to R2 of prediction set. Response permutation test was also applied to affirm that the model was not assessed by chance correlations. At conclusion, ATR-IR can be a reliable method in manufacturing process in detection and qualification of alprazolam content.
Noncontact analysis of the fiber weight per unit area in prepreg by near-infrared spectroscopy.

PubMed

Jiang, B; Huang, Y D

2008-05-26

The fiber weight per unit area in prepreg is an important factor to ensure the quality of the composite products. Near-infrared spectroscopy (NIRS) technology together with a noncontact reflectance sources has been applied for quality analysis of the fiber weight per unit area. The range of the unit area fiber weight was 13.39-14.14mgcm(-2). The regression method was employed by partial least squares (PLS) and principal components regression (PCR). The calibration model was developed by 55 samples to determine the fiber weight per unit area in prepreg. The determination coefficient (R(2)), root mean square error of calibration (RMSEC) and root mean square error of prediction (RMSEP) were 0.82, 0.092, 0.099, respectively. The predicted values of the fiber weight per unit area in prepreg measured by NIRS technology were comparable to the values obtained by the reference method. For this technology, the noncontact reflectance sources focused directly on the sample with neither previous treatment nor manipulation. The results of the paired t-test revealed that there was no significant difference between the NIR method and the reference method. Besides, the prepreg could be analyzed one time within 20s without sample destruction.
Raman spectroscopy compared against traditional predictors of shear force in lamb m. longissimus lumborum.

PubMed

Fowler, Stephanie M; Schmidt, Heinar; van de Ven, Remy; Wynn, Peter; Hopkins, David L

2014-12-01

A Raman spectroscopic hand held device was used to predict shear force (SF) of 80 fresh lamb m. longissimus lumborum (LL) at 1 and 5days post mortem (PM). Traditional predictors of SF including sarcomere length (SL), particle size (PS), cooking loss (CL), percentage myofibrillar breaks and pH were also measured. SF values were regressed against Raman spectra using partial least squares regression and against the traditional predictors using linear regression. The best prediction of shear force values used spectra at 1day PM to predict shear force at 1day which gave a root mean square error of prediction (RMSEP) of 13.6 (Null=14.0) and the R(2) between observed and cross validated predicted values was 0.06 (R(2)cv). Overall, for fresh LL, the predictability SF, by either the Raman hand held probe or traditional predictors was low. Copyright © 2014 Elsevier Ltd. All rights reserved.
An augmented classical least squares method for quantitative Raman spectral analysis against component information loss.

PubMed

Zhou, Yan; Cao, Hui

2013-01-01

We propose an augmented classical least squares (ACLS) calibration method for quantitative Raman spectral analysis against component information loss. The Raman spectral signals with low analyte concentration correlations were selected and used as the substitutes for unknown quantitative component information during the CLS calibration procedure. The number of selected signals was determined by using the leave-one-out root-mean-square error of cross-validation (RMSECV) curve. An ACLS model was built based on the augmented concentration matrix and the reference spectral signal matrix. The proposed method was compared with partial least squares (PLS) and principal component regression (PCR) using one example: a data set recorded from an experiment of analyte concentration determination using Raman spectroscopy. A 2-fold cross-validation with Venetian blinds strategy was exploited to evaluate the predictive power of the proposed method. The one-way variance analysis (ANOVA) was used to access the predictive power difference between the proposed method and existing methods. Results indicated that the proposed method is effective at increasing the robust predictive power of traditional CLS model against component information loss and its predictive power is comparable to that of PLS or PCR.

Practical guidance on representing the heteroscedasticity of residual errors of hydrological predictions

NASA Astrophysics Data System (ADS)

McInerney, David; Thyer, Mark; Kavetski, Dmitri; Kuczera, George

2016-04-01

Appropriate representation of residual errors in hydrological modelling is essential for accurate and reliable probabilistic streamflow predictions. In particular, residual errors of hydrological predictions are often heteroscedastic, with large errors associated with high runoff events. Although multiple approaches exist for representing this heteroscedasticity, few if any studies have undertaken a comprehensive evaluation and comparison of these approaches. This study fills this research gap by evaluating a range of approaches for representing heteroscedasticity in residual errors. These approaches include the 'direct' weighted least squares approach and 'transformational' approaches, such as logarithmic, Box-Cox (with and without fitting the transformation parameter), logsinh and the inverse transformation. The study reports (1) theoretical comparison of heteroscedasticity approaches, (2) empirical evaluation of heteroscedasticity approaches using a range of multiple catchments / hydrological models / performance metrics and (3) interpretation of empirical results using theory to provide practical guidance on the selection of heteroscedasticity approaches. Importantly, for hydrological practitioners, the results will simplify the choice of approaches to represent heteroscedasticity. This will enhance their ability to provide hydrological probabilistic predictions with the best reliability and precision for different catchment types (e.g. high/low degree of ephemerality).
[Establishment of the Mathematical Model for PMI Estimation Using FTIR Spectroscopy and Data Mining Method].

PubMed

Wang, L; Qin, X C; Lin, H C; Deng, K F; Luo, Y W; Sun, Q R; Du, Q X; Wang, Z Y; Tuo, Y; Sun, J H

2018-02-01

To analyse the relationship between Fourier transform infrared （FTIR） spectrum of rat's spleen tissue and postmortem interval （PMI） for PMI estimation using FTIR spectroscopy combined with data mining method. Rats were sacrificed by cervical dislocation, and the cadavers were placed at 20 ℃. The FTIR spectrum data of rats' spleen tissues were taken and measured at different time points. After pretreatment, the data was analysed by data mining method. The absorption peak intensity of rat's spleen tissue spectrum changed with the PMI, while the absorption peak position was unchanged. The results of principal component analysis （PCA） showed that the cumulative contribution rate of the first three principal components was 96%. There was an obvious clustering tendency for the spectrum sample at each time point. The methods of partial least squares discriminant analysis （PLS-DA） and support vector machine classification （SVMC） effectively divided the spectrum samples with different PMI into four categories （0-24 h, 48-72 h, 96-120 h and 144-168 h）. The determination coefficient （ R ²） of the PMI estimation model established by PLS regression analysis was 0.96, and the root mean square error of calibration （RMSEC） and root mean square error of cross validation （RMSECV） were 9.90 h and 11.39 h respectively. In prediction set, the R ² was 0.97, and the root mean square error of prediction （RMSEP） was 10.49 h. The FTIR spectrum of the rat's spleen tissue can be effectively analyzed qualitatively and quantitatively by the combination of FTIR spectroscopy and data mining method, and the classification and PLS regression models can be established for PMI estimation. Copyright© by the Editorial Department of Journal of Forensic Medicine.
Measurement of non-sugar solids content in Chinese rice wine using near infrared spectroscopy combined with an efficient characteristic variables selection algorithm.

PubMed

Ouyang, Qin; Zhao, Jiewen; Chen, Quansheng

2015-01-01

The non-sugar solids (NSS) content is one of the most important nutrition indicators of Chinese rice wine. This study proposed a rapid method for the measurement of NSS content in Chinese rice wine using near infrared (NIR) spectroscopy. We also systemically studied the efficient spectral variables selection algorithms that have to go through modeling. A new algorithm of synergy interval partial least square with competitive adaptive reweighted sampling (Si-CARS-PLS) was proposed for modeling. The performance of the final model was back-evaluated using root mean square error of calibration (RMSEC) and correlation coefficient (Rc) in calibration set and similarly tested by mean square error of prediction (RMSEP) and correlation coefficient (Rp) in prediction set. The optimum model by Si-CARS-PLS algorithm was achieved when 7 PLS factors and 18 variables were included, and the results were as follows: Rc=0.95 and RMSEC=1.12 in the calibration set, Rp=0.95 and RMSEP=1.22 in the prediction set. In addition, Si-CARS-PLS algorithm showed its superiority when compared with the commonly used algorithms in multivariate calibration. This work demonstrated that NIR spectroscopy technique combined with a suitable multivariate calibration algorithm has a high potential in rapid measurement of NSS content in Chinese rice wine. Copyright © 2015 Elsevier B.V. All rights reserved.
Analysis of pork adulteration in beef meatball using Fourier transform infrared (FTIR) spectroscopy.

PubMed

Rohman, A; Sismindari; Erwanto, Y; Che Man, Yaakob B

2011-05-01

Meatball is one of the favorite foods in Indonesia. The adulteration of pork in beef meatball is frequently occurring. This study was aimed to develop a fast and non destructive technique for the detection and quantification of pork in beef meatball using Fourier transform infrared (FTIR) spectroscopy and partial least square (PLS) calibration. The spectral bands associated with pork fat (PF), beef fat (BF), and their mixtures in meatball formulation were scanned, interpreted, and identified by relating them to those spectroscopically representative to pure PF and BF. For quantitative analysis, PLS regression was used to develop a calibration model at the selected fingerprint regions of 1200-1000 cm(-1). The equation obtained for the relationship between actual PF value and FTIR predicted values in PLS calibration model was y = 0.999x + 0.004, with coefficient of determination (R(2)) and root mean square error of calibration are 0.999 and 0.442, respectively. The PLS calibration model was subsequently used for the prediction of independent samples using laboratory made meatball samples containing the mixtures of BF and PF. Using 4 principal components, root mean square error of prediction is 0.742. The results showed that FTIR spectroscopy can be used for the detection and quantification of pork in beef meatball formulation for Halal verification purposes. Copyright © 2010 The American Meat Science Association. Published by Elsevier Ltd. All rights reserved.
Optical diagnosis of malaria infection in human plasma using Raman spectroscopy

NASA Astrophysics Data System (ADS)

Bilal, Muhammad; Saleem, Muhammad; Amanat, Samina Tufail; Shakoor, Huma Abdul; Rashid, Rashad; Mahmood, Arshad; Ahmed, Mushtaq

2015-01-01

We present the prediction of malaria infection in human plasma using Raman spectroscopy. Raman spectra of malaria-infected samples are compared with those of healthy and dengue virus infected ones for disease recognition. Raman spectra were acquired using a laser at 532 nm as an excitation source and 10 distinct spectral signatures that statistically differentiated malaria from healthy and dengue-infected cases were found. A multivariate regression model has been developed that utilized Raman spectra of 20 malaria-infected, 10 non-malarial with fever, 10 healthy, and 6 dengue-infected samples to optically predict the malaria infection. The model yields the correlation coefficient r2 value of 0.981 between the predicted values and clinically known results of trainee samples, and the root mean square error in cross validation was found to be 0.09; both these parameters validated the model. The model was further blindly tested for 30 unknown suspected samples and found to be 86% accurate compared with the clinical results, with the inaccuracy due to three samples which were predicted in the gray region. Standard deviation and root mean square error in prediction for unknown samples were found to be 0.150 and 0.149, which are accepted for the clinical validation of the model.
A comparative study of kinetic and connectionist modeling for shelf-life prediction of Basundi mix.

PubMed

Ruhil, A P; Singh, R R B; Jain, D K; Patel, A A; Patil, G R

2011-04-01

A ready-to-reconstitute formulation of Basundi, a popular Indian dairy dessert was subjected to storage at various temperatures (10, 25 and 40 °C) and deteriorative changes in the Basundi mix were monitored using quality indices like pH, hydroxyl methyl furfural (HMF), bulk density (BD) and insolubility index (II). The multiple regression equations and the Arrhenius functions that describe the parameters' dependence on temperature for the four physico-chemical parameters were integrated to develop mathematical models for predicting sensory quality of Basundi mix. Connectionist model using multilayer feed forward neural network with back propagation algorithm was also developed for predicting the storage life of the product employing artificial neural network (ANN) tool box of MATLAB software. The quality indices served as the input parameters whereas the output parameters were the sensorily evaluated flavour and total sensory score. A total of 140 observations were used and the prediction performance was judged on the basis of per cent root mean square error. The results obtained from the two approaches were compared. Relatively lower magnitudes of percent root mean square error for both the sensory parameters indicated that the connectionist models were better fitted than kinetic models for predicting storage life.
A nonlinear model of gold production in Malaysia

NASA Astrophysics Data System (ADS)

Ramli, Norashikin; Muda, Nora; Umor, Mohd Rozi

2014-06-01

Malaysia is a country which is rich in natural resources and one of it is a gold. Gold has already become an important national commodity. This study is conducted to determine a model that can be well fitted with the gold production in Malaysia from the year 1995-2010. Five nonlinear models are presented in this study which are Logistic model, Gompertz, Richard, Weibull and Chapman-Richard model. These model are used to fit the cumulative gold production in Malaysia. The best model is then selected based on the model performance. The performance of the fitted model is measured by sum squares error, root mean squares error, coefficient of determination, mean relative error, mean absolute error and mean absolute percentage error. This study has found that a Weibull model is shown to have significantly outperform compare to the other models. To confirm that Weibull is the best model, the latest data are fitted to the model. Once again, Weibull model gives the lowest readings at all types of measurement error. We can concluded that the future gold production in Malaysia can be predicted according to the Weibull model and this could be important findings for Malaysia to plan their economic activities.
Analysis of the Mean Absolute Error (MAE) and the Root Mean Square Error (RMSE) in Assessing Rounding Model

NASA Astrophysics Data System (ADS)

Wang, Weijie; Lu, Yanmin

2018-03-01

Most existing Collaborative Filtering (CF) algorithms predict a rating as the preference of an active user toward a given item, which is always a decimal fraction. Meanwhile, the actual ratings in most data sets are integers. In this paper, we discuss and demonstrate why rounding can bring different influences to these two metrics; prove that rounding is necessary in post-processing of the predicted ratings, eliminate of model prediction bias, improving the accuracy of the prediction. In addition, we also propose two new rounding approaches based on the predicted rating probability distribution, which can be used to round the predicted rating to an optimal integer rating, and get better prediction accuracy compared to the Basic Rounding approach. Extensive experiments on different data sets validate the correctness of our analysis and the effectiveness of our proposed rounding approaches.
Physiologically grounded metrics of model skill: a case study estimating heat stress in intertidal populations

PubMed Central

Kish, Nicole E.; Helmuth, Brian; Wethey, David S.

2016-01-01

Models of ecological responses to climate change fundamentally assume that predictor variables, which are often measured at large scales, are to some degree diagnostic of the smaller-scale biological processes that ultimately drive patterns of abundance and distribution. Given that organisms respond physiologically to stressors, such as temperature, in highly non-linear ways, small modelling errors in predictor variables can potentially result in failures to predict mortality or severe stress, especially if an organism exists near its physiological limits. As a result, a central challenge facing ecologists, particularly those attempting to forecast future responses to environmental change, is how to develop metrics of forecast model skill (the ability of a model to predict defined events) that are biologically meaningful and reflective of underlying processes. We quantified the skill of four simple models of body temperature (a primary determinant of physiological stress) of an intertidal mussel, Mytilus californianus, using common metrics of model performance, such as root mean square error, as well as forecast verification skill scores developed by the meteorological community. We used a physiologically grounded framework to assess each model's ability to predict optimal, sub-optimal, sub-lethal and lethal physiological responses. Models diverged in their ability to predict different levels of physiological stress when evaluated using skill scores, even though common metrics, such as root mean square error, indicated similar accuracy overall. Results from this study emphasize the importance of grounding assessments of model skill in the context of an organism's physiology and, especially, of considering the implications of false-positive and false-negative errors when forecasting the ecological effects of environmental change. PMID:27729979
Statistical variability comparison in MODIS and AERONET derived aerosol optical depth over Indo-Gangetic Plains using time series modeling.

PubMed

Soni, Kirti; Parmar, Kulwinder Singh; Kapoor, Sangeeta; Kumar, Nishant

2016-05-15

A lot of studies in the literature of Aerosol Optical Depth (AOD) done by using Moderate Resolution Imaging Spectroradiometer (MODIS) derived data, but the accuracy of satellite data in comparison to ground data derived from ARrosol Robotic NETwork (AERONET) has been always questionable. So to overcome from this situation, comparative study of a comprehensive ground based and satellite data for the period of 2001-2012 is modeled. The time series model is used for the accurate prediction of AOD and statistical variability is compared to assess the performance of the model in both cases. Root mean square error (RMSE), mean absolute percentage error (MAPE), stationary R-squared, R-squared, maximum absolute percentage error (MAPE), normalized Bayesian information criterion (NBIC) and Ljung-Box methods are used to check the applicability and validity of the developed ARIMA models revealing significant precision in the model performance. It was found that, it is possible to predict the AOD by statistical modeling using time series obtained from past data of MODIS and AERONET as input data. Moreover, the result shows that MODIS data can be formed from AERONET data by adding 0.251627 ± 0.133589 and vice-versa by subtracting. From the forecast available for AODs for the next four years (2013-2017) by using the developed ARIMA model, it is concluded that the forecasted ground AOD has increased trend. Copyright © 2016 Elsevier B.V. All rights reserved.
[NIR Assignment of Magnolol by 2D-COS Technology and Model Application Huoxiangzhengqi Oral Liduid].

PubMed

Pei, Yan-ling; Wu, Zhi-sheng; Shi, Xin-yuan; Pan, Xiao-ning; Peng, Yan-fang; Qiao, Yan-jiang

2015-08-01

Near infrared (NIR) spectroscopy assignment of Magnolol was performed using deuterated chloroform solvent and two-dimensional correlation spectroscopy (2D-COS) technology. According to the synchronous spectra of deuterated chloroform solvent and Magnolol, 1365~1455, 1600~1720, 2000~2181 and 2275~2465 nm were the characteristic absorption of Magnolol. Connected with the structure of Magnolol, 1440 nm was the stretching vibration of phenolic group O-H, 1679 nm was the stretching vibration of aryl and methyl which connected with aryl, 2117, 2304, 2339 and 2370 nm were the combination of the stretching vibration, bending vibration and deformation vibration for aryl C-H, 2445 nm were the bending vibration of methyl which linked with aryl group, these bands attribut to the characteristics of Magnolol. Huoxiangzhengqi Oral Liduid was adopted to study the Magnolol, the characteristic band by spectral assignment and the band by interval Partial Least Squares (iPLS) and Synergy interval Partial Least Squares (SiPLS) were used to establish Partial Least Squares (PLS) quantitative model, the coefficient of determination Rcal(2) and Rpre(2) were greater than 0.99, the Root Mean of Square Error of Calibration (RM-SEC), Root Mean of Square Error of Cross Validation (RMSECV) and Root Mean of Square Error of Prediction (RMSEP) were very small. It indicated that the characteristic band by spectral assignment has the same results with the Chemometrics in PLS model. It provided a reference for NIR spectral assignment of chemical compositions in Chinese Materia Medica, and the band filters of NIR were interpreted.
Variable complexity online sequential extreme learning machine, with applications to streamflow prediction

NASA Astrophysics Data System (ADS)

Lima, Aranildo R.; Hsieh, William W.; Cannon, Alex J.

2017-12-01

In situations where new data arrive continually, online learning algorithms are computationally much less costly than batch learning ones in maintaining the model up-to-date. The extreme learning machine (ELM), a single hidden layer artificial neural network with random weights in the hidden layer, is solved by linear least squares, and has an online learning version, the online sequential ELM (OSELM). As more data become available during online learning, information on the longer time scale becomes available, so ideally the model complexity should be allowed to change, but the number of hidden nodes (HN) remains fixed in OSELM. A variable complexity VC-OSELM algorithm is proposed to dynamically add or remove HN in the OSELM, allowing the model complexity to vary automatically as online learning proceeds. The performance of VC-OSELM was compared with OSELM in daily streamflow predictions at two hydrological stations in British Columbia, Canada, with VC-OSELM significantly outperforming OSELM in mean absolute error, root mean squared error and Nash-Sutcliffe efficiency at both stations.
Predicting tree species presence and basal area in Utah: A comparison of stochastic gradient boosting, generalized additive models, and tree-based methods

USGS Publications Warehouse

Moisen, Gretchen G.; Freeman, E.A.; Blackard, J.A.; Frescino, T.S.; Zimmermann, N.E.; Edwards, T.C.

2006-01-01

Many efforts are underway to produce broad-scale forest attribute maps by modelling forest class and structure variables collected in forest inventories as functions of satellite-based and biophysical information. Typically, variants of classification and regression trees implemented in Rulequest's?? See5 and Cubist (for binary and continuous responses, respectively) are the tools of choice in many of these applications. These tools are widely used in large remote sensing applications, but are not easily interpretable, do not have ties with survey estimation methods, and use proprietary unpublished algorithms. Consequently, three alternative modelling techniques were compared for mapping presence and basal area of 13 species located in the mountain ranges of Utah, USA. The modelling techniques compared included the widely used See5/Cubist, generalized additive models (GAMs), and stochastic gradient boosting (SGB). Model performance was evaluated using independent test data sets. Evaluation criteria for mapping species presence included specificity, sensitivity, Kappa, and area under the curve (AUC). Evaluation criteria for the continuous basal area variables included correlation and relative mean squared error. For predicting species presence (setting thresholds to maximize Kappa), SGB had higher values for the majority of the species for specificity and Kappa, while GAMs had higher values for the majority of the species for sensitivity. In evaluating resultant AUC values, GAM and/or SGB models had significantly better results than the See5 models where significant differences could be detected between models. For nine out of 13 species, basal area prediction results for all modelling techniques were poor (correlations less than 0.5 and relative mean squared errors greater than 0.8), but SGB provided the most stable predictions in these instances. SGB and Cubist performed equally well for modelling basal area for three species with moderate prediction success, while all three modelling tools produced comparably good predictions (correlation of 0.68 and relative mean squared error of 0.56) for one species. ?? 2006 Elsevier B.V. All rights reserved.
Hyperspectral Analysis of Soil Total Nitrogen in Subsided Land Using the Local Correlation Maximization-Complementary Superiority (LCMCS) Method

PubMed Central

Lin, Lixin; Wang, Yunjia; Teng, Jiyao; Xi, Xiuxiu

2015-01-01

The measurement of soil total nitrogen (TN) by hyperspectral remote sensing provides an important tool for soil restoration programs in areas with subsided land caused by the extraction of natural resources. This study used the local correlation maximization-complementary superiority method (LCMCS) to establish TN prediction models by considering the relationship between spectral reflectance (measured by an ASD FieldSpec 3 spectroradiometer) and TN based on spectral reflectance curves of soil samples collected from subsided land which is determined by synthetic aperture radar interferometry (InSAR) technology. Based on the 1655 selected effective bands of the optimal spectrum (OSP) of the first derivate differential of reciprocal logarithm ([log{1/R}]′), (correlation coefficients, p < 0.01), the optimal model of LCMCS method was obtained to determine the final model, which produced lower prediction errors (root mean square error of validation [RMSEV] = 0.89, mean relative error of validation [MREV] = 5.93%) when compared with models built by the local correlation maximization (LCM), complementary superiority (CS) and partial least squares regression (PLS) methods. The predictive effect of LCMCS model was optional in Cangzhou, Renqiu and Fengfeng District. Results indicate that the LCMCS method has great potential to monitor TN in subsided lands caused by the extraction of natural resources including groundwater, oil and coal. PMID:26213935
[Application of wavelet transform-radial basis function neural network in NIRS for determination of rifampicin and isoniazide tablets].

PubMed

Lu, Jia-hui; Zhang, Yi-bo; Zhang, Zhuo-yong; Meng, Qing-fan; Guo, Wei-liang; Teng, Li-rong

2008-06-01

A calibration model (WT-RBFNN) combination of wavelet transform (WT) and radial basis function neural network (RBFNN) was proposed for synchronous and rapid determination of rifampicin and isoniazide in Rifampicin and Isoniazide tablets by near infrared reflectance spectroscopy (NIRS). The approximation coefficients were used for input data in RBFNN. The network parameters including the number of hidden layer neurons and spread constant (SC) were investigated. WT-RBFNN model which compressed the original spectra data, removed the noise and the interference of background, and reduced the randomness, the capabilities of prediction were well optimized. The root mean square errors of prediction (RMSEP) for the determination of rifampicin and isoniazide obtained from the optimum WT-RBFNN model are 0.00639 and 0.00587, and the root mean square errors of cross-calibration (RMSECV) for them are 0.00604 and 0.00457, respectively which are superior to those obtained by the optimum RBFNN and PLS models. Regression coefficient (R) between NIRS predicted values and RP-HPLC values for rifampicin and isoniazide are 0.99522 and 0.99392, respectively and the relative error is lower than 2.300%. It was verified that WT-RBFNN model is a suitable approach to dealing with NIRS. The proposed WT-RBFNN model is convenient, and rapid and with no pollution for the determination of rifampicin and isoniazide tablets.
Online measurement of urea concentration in spent dialysate during hemodialysis.

PubMed

Olesberg, Jonathon T; Arnold, Mark A; Flanigan, Michael J

2004-01-01

We describe online optical measurements of urea in the effluent dialysate line during regular hemodialysis treatment of several patients. Monitoring urea removal can provide valuable information about dialysis efficiency. Spectral measurements were performed with a Fourier-transform infrared spectrometer equipped with a flow-through cell. Spectra were recorded across the 5000-4000 cm(-1) (2.0-2.5 microm) wavelength range at 1-min intervals. Savitzky-Golay filtering was used to remove baseline variations attributable to the temperature dependence of the water absorption spectrum. Urea concentrations were extracted from the filtered spectra by use of partial least-squares regression and the net analyte signal of urea. Urea concentrations predicted by partial least-squares regression matched concentrations obtained from standard chemical assays with a root mean square error of 0.30 mmol/L (0.84 mg/dL urea nitrogen) over an observed concentration range of 0-11 mmol/L. The root mean square error obtained with the net analyte signal of urea was 0.43 mmol/L with a calibration based only on a set of pure-component spectra. The error decreased to 0.23 mmol/L when a slope and offset correction were used. Urea concentrations can be continuously monitored during hemodialysis by near-infrared spectroscopy. Calibrations based on the net analyte signal of urea are particularly appealing because they do not require a training step, as do statistical multivariate calibration procedures such as partial least-squares regression.
Geospatial distribution modeling and determining suitability of groundwater quality for irrigation purpose using geospatial methods and water quality index (WQI) in Northern Ethiopia

NASA Astrophysics Data System (ADS)

Gidey, Amanuel

2018-06-01

Determining suitability and vulnerability of groundwater quality for irrigation use is a key alarm and first aid for careful management of groundwater resources to diminish the impacts on irrigation. This study was conducted to determine the overall suitability of groundwater quality for irrigation use and to generate their spatial distribution maps in Elala catchment, Northern Ethiopia. Thirty-nine groundwater samples were collected to analyze and map the water quality variables. Atomic absorption spectrophotometer, ultraviolet spectrophotometer, titration and calculation methods were used for laboratory groundwater quality analysis. Arc GIS, geospatial analysis tools, semivariogram model types and interpolation methods were used to generate geospatial distribution maps. Twelve and eight water quality variables were used to produce weighted overlay and irrigation water quality index models, respectively. Root-mean-square error, mean square error, absolute square error, mean error, root-mean-square standardized error, measured values versus predicted values were used for cross-validation. The overall weighted overlay model result showed that 146 km2 areas are highly suitable, 135 km2 moderately suitable and 60 km2 area unsuitable for irrigation use. The result of irrigation water quality index confirms 10.26% with no restriction, 23.08% with low restriction, 20.51% with moderate restriction, 15.38% with high restriction and 30.76% with the severe restriction for irrigation use. GIS and irrigation water quality index are better methods for irrigation water resources management to achieve a full yield irrigation production to improve food security and to sustain it for a long period, to avoid the possibility of increasing environmental problems for the future generation.
Methods for estimating the magnitude and frequency of peak streamflows at ungaged sites in and near the Oklahoma Panhandle

USGS Publications Warehouse

Smith, S. Jerrod; Lewis, Jason M.; Graves, Grant M.

2015-09-28

Generalized-least-squares multiple-linear regression analysis was used to formulate regression relations between peak-streamflow frequency statistics and basin characteristics. Contributing drainage area was the only basin characteristic determined to be statistically significant for all percentage of annual exceedance probabilities and was the only basin characteristic used in regional regression equations for estimating peak-streamflow frequency statistics on unregulated streams in and near the Oklahoma Panhandle. The regression model pseudo-coefficient of determination, converted to percent, for the Oklahoma Panhandle regional regression equations ranged from about 38 to 63 percent. The standard errors of prediction and the standard model errors for the Oklahoma Panhandle regional regression equations ranged from about 84 to 148 percent and from about 76 to 138 percent, respectively. These errors were comparable to those reported for regional peak-streamflow frequency regression equations for the High Plains areas of Texas and Colorado. The root mean square errors for the Oklahoma Panhandle regional regression equations (ranging from 3,170 to 92,000 cubic feet per second) were less than the root mean square errors for the Oklahoma statewide regression equations (ranging from 18,900 to 412,000 cubic feet per second); therefore, the Oklahoma Panhandle regional regression equations produce more accurate peak-streamflow statistic estimates for the irrigated period of record in the Oklahoma Panhandle than do the Oklahoma statewide regression equations. The regression equations developed in this report are applicable to streams that are not substantially affected by regulation, impoundment, or surface-water withdrawals. These regression equations are intended for use for stream sites with contributing drainage areas less than or equal to about 2,060 square miles, the maximum value for the independent variable used in the regression analysis.
Determination of heat capacity of ionic liquid based nanofluids using group method of data handling technique

NASA Astrophysics Data System (ADS)

Sadi, Maryam

2018-01-01

In this study a group method of data handling model has been successfully developed to predict heat capacity of ionic liquid based nanofluids by considering reduced temperature, acentric factor and molecular weight of ionic liquids, and nanoparticle concentration as input parameters. In order to accomplish modeling, 528 experimental data points extracted from the literature have been divided into training and testing subsets. The training set has been used to predict model coefficients and the testing set has been applied for model validation. The ability and accuracy of developed model, has been evaluated by comparison of model predictions with experimental values using different statistical parameters such as coefficient of determination, mean square error and mean absolute percentage error. The mean absolute percentage error of developed model for training and testing sets are 1.38% and 1.66%, respectively, which indicate excellent agreement between model predictions and experimental data. Also, the results estimated by the developed GMDH model exhibit a higher accuracy when compared to the available theoretical correlations.
Artificial neural network modelling of a large-scale wastewater treatment plant operation.

PubMed

Güçlü, Dünyamin; Dursun, Sükrü

2010-11-01

Artificial Neural Networks (ANNs), a method of artificial intelligence method, provide effective predictive models for complex processes. Three independent ANN models trained with back-propagation algorithm were developed to predict effluent chemical oxygen demand (COD), suspended solids (SS) and aeration tank mixed liquor suspended solids (MLSS) concentrations of the Ankara central wastewater treatment plant. The appropriate architecture of ANN models was determined through several steps of training and testing of the models. ANN models yielded satisfactory predictions. Results of the root mean square error, mean absolute error and mean absolute percentage error were 3.23, 2.41 mg/L and 5.03% for COD; 1.59, 1.21 mg/L and 17.10% for SS; 52.51, 44.91 mg/L and 3.77% for MLSS, respectively, indicating that the developed model could be efficiently used. The results overall also confirm that ANN modelling approach may have a great implementation potential for simulation, precise performance prediction and process control of wastewater treatment plants.

Methods of automatic nucleotide-sequence analysis. Multicomponent spectrophotometric analysis of mixtures of nucleic acid components by a least-squares procedure

PubMed Central

Lee, Sheila; McMullen, D.; Brown, G. L.; Stokes, A. R.

1965-01-01

1. A theoretical analysis of the errors in multicomponent spectrophotometric analysis of nucleoside mixtures, by a least-squares procedure, has been made to obtain an expression for the error coefficient, relating the error in calculated concentration to the error in extinction measurements. 2. The error coefficients, which depend only on the `library' of spectra used to fit the experimental curves, have been computed for a number of `libraries' containing the following nucleosides found in s-RNA: adenosine, guanosine, cytidine, uridine, 5-ribosyluracil, 7-methylguanosine, 6-dimethylaminopurine riboside, 6-methylaminopurine riboside and thymine riboside. 3. The error coefficients have been used to determine the best conditions for maximum accuracy in the determination of the compositions of nucleoside mixtures. 4. Experimental determinations of the compositions of nucleoside mixtures have been made and the errors found to be consistent with those predicted by the theoretical analysis. 5. It has been demonstrated that, with certain precautions, the multicomponent spectrophotometric method described is suitable as a basis for automatic nucleotide-composition analysis of oligonucleotides containing nine nucleotides. Used in conjunction with continuous chromatography and flow chemical techniques, this method can be applied to the study of the sequence of s-RNA. PMID:14346087
Training the Recurrent neural network by the Fuzzy Min-Max algorithm for fault prediction

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zemouri, Ryad; Racoceanu, Daniel; Zerhouni, Noureddine

2009-03-05

In this paper, we present a training technique of a Recurrent Radial Basis Function neural network for fault prediction. We use the Fuzzy Min-Max technique to initialize the k-center of the RRBF neural network. The k-means algorithm is then applied to calculate the centers that minimize the mean square error of the prediction task. The performances of the k-means algorithm are then boosted by the Fuzzy Min-Max technique.
Near infrared spectrometric technique for testing fruit quality: optimisation of regression models using genetic algorithms

NASA Astrophysics Data System (ADS)

Isingizwe Nturambirwe, J. Frédéric; Perold, Willem J.; Opara, Umezuruike L.

2016-02-01

Near infrared (NIR) spectroscopy has gained extensive use in quality evaluation. It is arguably one of the most advanced spectroscopic tools in non-destructive quality testing of food stuff, from measurement to data analysis and interpretation. NIR spectral data are interpreted through means often involving multivariate statistical analysis, sometimes associated with optimisation techniques for model improvement. The objective of this research was to explore the extent to which genetic algorithms (GA) can be used to enhance model development, for predicting fruit quality. Apple fruits were used, and NIR spectra in the range from 12000 to 4000 cm-1 were acquired on both bruised and healthy tissues, with different degrees of mechanical damage. GAs were used in combination with partial least squares regression methods to develop bruise severity prediction models, and compared to PLS models developed using the full NIR spectrum. A classification model was developed, which clearly separated bruised from unbruised apple tissue. GAs helped improve prediction models by over 10%, in comparison with full spectrum-based models, as evaluated in terms of error of prediction (Root Mean Square Error of Cross-validation). PLS models to predict internal quality, such as sugar content and acidity were developed and compared to the versions optimized by genetic algorithm. Overall, the results highlighted the potential use of GA method to improve speed and accuracy of fruit quality prediction.
Validation of the Kp Geomagnetic Index Forecast at CCMC

NASA Astrophysics Data System (ADS)

Frechette, B. P.; Mays, M. L.

2017-12-01

The Community Coordinated Modeling Center (CCMC) Space Weather Research Center (SWRC) sub-team provides space weather services to NASA robotic mission operators and science campaigns and prototypes new models, forecasting techniques, and procedures. The Kp index is a measure of geomagnetic disturbances for space weather in the magnetosphere such as geomagnetic storms and substorms. In this study, we performed validation on the Newell et al. (2007) Kp prediction equation from December 2010 to July 2017. The purpose of this research is to understand the Kp forecast performance because it's critical for NASA missions to have confidence in the space weather forecast. This research was done by computing the Kp error for each forecast (average, minimum, maximum) and each synoptic period. Then to quantify forecast performance we computed the mean error, mean absolute error, root mean square error, multiplicative bias and correlation coefficient. A contingency table was made for each forecast and skill scores were computed. The results are compared to the perfect score and reference forecast skill score. In conclusion, the skill score and error results show that the minimum of the predicted Kp over each synoptic period from the Newell et al. (2007) Kp prediction equation performed better than the maximum or average of the prediction. However, persistence (reference forecast) outperformed all of the Kp forecasts (minimum, maximum, and average). Overall, the Newell Kp prediction still predicts within a range of 1, even though persistence beats it.
Near-infrared Spectroscopy as a Process Analytical Technology Tool for Monitoring the Parching Process of Traditional Chinese Medicine Based on Two Kinds of Chemical Indicators.

PubMed

Li, Kaiyue; Wang, Weiying; Liu, Yanping; Jiang, Su; Huang, Guo; Ye, Liming

2017-01-01

The active ingredients and thus pharmacological efficacy of traditional Chinese medicine (TCM) at different degrees of parching process vary greatly. Near-infrared spectroscopy (NIR) was used to develop a new method for rapid online analysis of TCM parching process, using two kinds of chemical indicators (5-(hydroxymethyl) furfural [5-HMF] content and 420 nm absorbance) as reference values which were obviously observed and changed in most TCM parching process. Three representative TCMs, Areca ( Areca catechu L.), Malt ( Hordeum Vulgare L.), and Hawthorn ( Crataegus pinnatifida Bge.), were used in this study. With partial least squares regression, calibration models of NIR were generated based on two kinds of reference values, i.e. 5-HMF contents measured by high-performance liquid chromatography (HPLC) and 420 nm absorbance measured by ultraviolet-visible spectroscopy (UV/Vis), respectively. In the optimized models for 5-HMF, the root mean square errors of prediction (RMSEP) for Areca, Malt, and Hawthorn was 0.0192, 0.0301, and 0.2600 and correlation coefficients ( R cal ) were 99.86%, 99.88%, and 99.88%, respectively. Moreover, in the optimized models using 420 nm absorbance as reference values, the RMSEP for Areca, Malt, and Hawthorn was 0.0229, 0.0096, and 0.0409 and R cal were 99.69%, 99.81%, and 99.62%, respectively. NIR models with 5-HMF content and 420 nm absorbance as reference values can rapidly and effectively identify three kinds of TCM in different parching processes. This method has great promise to replace current subjective color judgment and time-consuming HPLC or UV/Vis methods and is suitable for rapid online analysis and quality control in TCM industrial manufacturing process. Near-infrared spectroscopy.(NIR) was used to develop a new method for online analysis of traditional Chinese medicine.(TCM) parching processCalibration and validation models of Areca, Malt, and Hawthorn were generated by partial least squares regression using 5.(hydroxymethyl) furfural contents and 420.nm absorbance as reference values, respectively, which were main indicator components during parching process of most TCMThe established NIR models of three TCMs had low root mean square errors of prediction and high correlation coefficientsThe NIR method has great promise for use in TCM industrial manufacturing processes for rapid online analysis and quality control. Abbreviations used: NIR: Near-infrared Spectroscopy; TCM: Traditional Chinese medicine; Areca: Areca catechu L.; Hawthorn: Crataegus pinnatifida Bge.; Malt: Hordeum vulgare L.; 5-HMF: 5-(hydroxymethyl) furfural; PLS: Partial least squares; D: Dimension faction; SLS: Straight line subtraction, MSC: Multiplicative scatter correction; VN: Vector normalization; RMSECV: Root mean square errors of cross-validation; RMSEP: Root mean square errors of validation; R cal : Correlation coefficients; RPD: Residual predictive deviation; PAT: Process analytical technology; FDA: Food and Drug Administration; ICH: International Conference on Harmonization of Technical Requirements for Registration of Pharmaceuticals for Human Use.
Adaptive control of theophylline therapy: importance of blood sampling times.

PubMed

D'Argenio, D Z; Khakmahd, K

1983-10-01

A two-observation protocol for estimating theophylline clearance during a constant-rate intravenous infusion is used to examine the importance of blood sampling schedules with regard to the information content of resulting concentration data. Guided by a theory for calculating maximally informative sample times, population simulations are used to assess the effect of specific sampling times on the precision of resulting clearance estimates and subsequent predictions of theophylline plasma concentrations. The simulations incorporated noise terms for intersubject variability, dosing errors, sample collection errors, and assay error. Clearance was estimated using Chiou's method, least squares, and a Bayesian estimation procedure. The results of these simulations suggest that clinically significant estimation and prediction errors may result when using the above two-point protocol for estimating theophylline clearance if the time separating the two blood samples is less than one population mean elimination half-life.
Generalized regression neural network (GRNN)-based approach for colored dissolved organic matter (CDOM) retrieval: case study of Connecticut River at Middle Haddam Station, USA.

PubMed

Heddam, Salim

2014-11-01

The prediction of colored dissolved organic matter (CDOM) using artificial neural network approaches has received little attention in the past few decades. In this study, colored dissolved organic matter (CDOM) was modeled using generalized regression neural network (GRNN) and multiple linear regression (MLR) models as a function of Water temperature (TE), pH, specific conductance (SC), and turbidity (TU). Evaluation of the prediction accuracy of the models is based on the root mean square error (RMSE), mean absolute error (MAE), coefficient of correlation (CC), and Willmott's index of agreement (d). The results indicated that GRNN can be applied successfully for prediction of colored dissolved organic matter (CDOM).
Simultaneous learning and filtering without delusions: a Bayes-optimal combination of Predictive Inference and Adaptive Filtering.

PubMed

Kneissler, Jan; Drugowitsch, Jan; Friston, Karl; Butz, Martin V

2015-01-01

Predictive coding appears to be one of the fundamental working principles of brain processing. Amongst other aspects, brains often predict the sensory consequences of their own actions. Predictive coding resembles Kalman filtering, where incoming sensory information is filtered to produce prediction errors for subsequent adaptation and learning. However, to generate prediction errors given motor commands, a suitable temporal forward model is required to generate predictions. While in engineering applications, it is usually assumed that this forward model is known, the brain has to learn it. When filtering sensory input and learning from the residual signal in parallel, a fundamental problem arises: the system can enter a delusional loop when filtering the sensory information using an overly trusted forward model. In this case, learning stalls before accurate convergence because uncertainty about the forward model is not properly accommodated. We present a Bayes-optimal solution to this generic and pernicious problem for the case of linear forward models, which we call Predictive Inference and Adaptive Filtering (PIAF). PIAF filters incoming sensory information and learns the forward model simultaneously. We show that PIAF is formally related to Kalman filtering and to the Recursive Least Squares linear approximation method, but combines these procedures in a Bayes optimal fashion. Numerical evaluations confirm that the delusional loop is precluded and that the learning of the forward model is more than 10-times faster when compared to a naive combination of Kalman filtering and Recursive Least Squares.
Detection of Tetracycline in Milk using NIR Spectroscopy and Partial Least Squares

NASA Astrophysics Data System (ADS)

Wu, Nan; Xu, Chenshan; Yang, Renjie; Ji, Xinning; Liu, Xinyuan; Yang, Fan; Zeng, Ming

2018-02-01

The feasibility of measuring tetracycline in milk was investigated by near infrared (NIR) spectroscopic technique combined with partial least squares (PLS) method. The NIR transmittance spectra of 40 pure milk samples and 40 tetracycline adulterated milk samples with different concentrations (from 0.005 to 40 mg/L) were obtained. The pure milk and tetracycline adulterated milk samples were properly assigned to the categories with 100% accuracy in the calibration set, and the rate of correct classification of 96.3% was obtained in the prediction set. For the quantitation of tetracycline in adulterated milk, the root mean squares errors for calibration and prediction models were 0.61 mg/L and 4.22 mg/L, respectively. The PLS model had good fitting effect in calibration set, however its predictive ability was limited, especially for low tetracycline concentration samples. Totally, this approach can be considered as a promising tool for discrimination of tetracycline adulterated milk, as a supplement to high performance liquid chromatography.
Determination of total phenolic compounds in compost by infrared spectroscopy.

PubMed

Cascant, M M; Sisouane, M; Tahiri, S; Krati, M El; Cervera, M L; Garrigues, S; de la Guardia, M

2016-06-01

Middle and near infrared (MIR and NIR) were applied to determine the total phenolic compounds (TPC) content in compost samples based on models built by using partial least squares (PLS) regression. The multiplicative scatter correction, standard normal variate and first derivative were employed as spectra pretreatment, and the number of latent variable were optimized by leave-one-out cross-validation. The performance of PLS-ATR-MIR and PLS-DR-NIR models was evaluated according to root mean square error of cross validation and prediction (RMSECV and RMSEP), the coefficient of determination for prediction (Rpred(2)) and residual predictive deviation (RPD) being obtained for this latter values of 5.83 and 8.26 for MIR and NIR, respectively. Copyright © 2016 Elsevier B.V. All rights reserved.
Performance analysis of adaptive equalization for coherent acoustic communications in the time-varying ocean environment.

PubMed

Preisig, James C

2005-07-01

Equations are derived for analyzing the performance of channel estimate based equalizers. The performance is characterized in terms of the mean squared soft decision error (sigma2(s)) of each equalizer. This error is decomposed into two components. These are the minimum achievable error (sigma2(0)) and the excess error (sigma2(e)). The former is the soft decision error that would be realized by the equalizer if the filter coefficient calculation were based upon perfect knowledge of the channel impulse response and statistics of the interfering noise field. The latter is the additional soft decision error that is realized due to errors in the estimates of these channel parameters. These expressions accurately predict the equalizer errors observed in the processing of experimental data by a channel estimate based decision feedback equalizer (DFE) and a passive time-reversal equalizer. Further expressions are presented that allow equalizer performance to be predicted given the scattering function of the acoustic channel. The analysis using these expressions yields insights into the features of surface scattering that most significantly impact equalizer performance in shallow water environments and motivates the implementation of a DFE that is robust with respect to channel estimation errors.
The modelling of lead removal from water by deep eutectic solvents functionalized CNTs: artificial neural network (ANN) approach.

PubMed

Fiyadh, Seef Saadi; AlSaadi, Mohammed Abdulhakim; AlOmar, Mohamed Khalid; Fayaed, Sabah Saadi; Hama, Ako R; Bee, Sharifah; El-Shafie, Ahmed

2017-11-01

The main challenge in the lead removal simulation is the behaviour of non-linearity relationships between the process parameters. The conventional modelling technique usually deals with this problem by a linear method. The substitute modelling technique is an artificial neural network (ANN) system, and it is selected to reflect the non-linearity in the interaction among the variables in the function. Herein, synthesized deep eutectic solvents were used as a functionalized agent with carbon nanotubes as adsorbents of Pb 2+ . Different parameters were used in the adsorption study including pH (2.7 to 7), adsorbent dosage (5 to 20 mg), contact time (3 to 900 min) and Pb 2+ initial concentration (3 to 60 mg/l). The number of experimental trials to feed and train the system was 158 runs conveyed in laboratory scale. Two ANN types were designed in this work, the feed-forward back-propagation and layer recurrent; both methods are compared based on their predictive proficiency in terms of the mean square error (MSE), root mean square error, relative root mean square error, mean absolute percentage error and determination coefficient (R 2 ) based on the testing dataset. The ANN model of lead removal was subjected to accuracy determination and the results showed R 2 of 0.9956 with MSE of 1.66 × 10 -4 . The maximum relative error is 14.93% for the feed-forward back-propagation neural network model.
Red Wine Age Estimation by the Alteration of Its Color Parameters: Fourier Transform Infrared Spectroscopy as a Tool to Monitor Wine Maturation Time

PubMed Central

Basalekou, M.; Pappas, C.; Kotseridis, Y.; Tarantilis, P. A.; Kontaxakis, E.

2017-01-01

Color, phenolic content, and chemical age values of red wines made from Cretan grape varieties (Kotsifali, Mandilari) were evaluated over nine months of maturation in different containers for two vintages. The wines differed greatly on their anthocyanin profiles. Mid-IR spectra were also recorded with the use of a Fourier Transform Infrared Spectrophotometer in ZnSe disk mode. Analysis of Variance was used to explore the parameter's dependency on time. Determination models were developed for the chemical age indexes using Partial Least Squares (PLS) (TQ Analyst software) considering the spectral region 1830–1500 cm−1. The correlation coefficients (r) for chemical age index i were 0.86 for Kotsifali (Root Mean Square Error of Calibration (RMSEC) = 0.067, Root Mean Square Error of Prediction (RMSEP) = 0,115, and Root Mean Square Error of Validation (RMSECV) = 0.164) and 0.90 for Mandilari (RMSEC = 0.050, RMSEP = 0.040, and RMSECV = 0.089). For chemical age index ii the correlation coefficients (r) were 0.86 and 0.97 for Kotsifali (RMSEC 0.044, RMSEP = 0.087, and RMSECV = 0.214) and Mandilari (RMSEC = 0.024, RMSEP = 0.033, and RMSECV = 0.078), respectively. The proposed method is simpler, less time consuming, and more economical and does not require chemical reagents. PMID:29225994
Quick method (FT-NIR) for the determination of oil and major fatty acids content in whole achenes of milk thistle (Silybum marianum (L.) Gaertn.).

PubMed

Koláčková, Pavla; Růžičková, Gabriela; Gregor, Tomáš; Šišperová, Eliška

2015-08-30

Calibration models for the Fourier transform-near infrared (FT-NIR) instrument were developed for quick and non-destructive determination of oil and fatty acids in whole achenes of milk thistle. Samples with a range of oil and fatty acid levels were collected and their transmittance spectra were obtained by the FT-NIR instrument. Based on these spectra and data gained by the means of the reference method - Soxhlet extraction and gas chromatography (GC) - calibration models were created by means of partial least square (PLS) regression analysis. Precision and accuracy of the calibration models was verified via the cross-validation of validation samples whose spectra were not part of the calibration model and also according to the root mean square error of prediction (RMSEP), root mean square error of calibration (RMSEC), root mean square error of cross-validation (RMSECV) and the validation coefficient of determination (R(2) ). R(2) for whole seeds were 0.96, 0.96, 0.83 and 0.67 and the RMSEP values were 0.76, 1.68, 1.24, 0.54 for oil, linoleic (C18:2), oleic (C18:1) and palmitic (C16:0) acids, respectively. The calibration models are appropriate for the non-destructive determination of oil and fatty acids levels in whole seeds of milk thistle. © 2014 Society of Chemical Industry.
Using Least Squares for Error Propagation

ERIC Educational Resources Information Center

Tellinghuisen, Joel

2015-01-01

The method of least-squares (LS) has a built-in procedure for estimating the standard errors (SEs) of the adjustable parameters in the fit model: They are the square roots of the diagonal elements of the covariance matrix. This means that one can use least-squares to obtain numerical values of propagated errors by defining the target quantities as…
Quantitative prediction of ionization effect on human skin permeability.

PubMed

Baba, Hiromi; Ueno, Yusuke; Hashida, Mitsuru; Yamashita, Fumiyoshi

2017-04-30

Although skin permeability of an active ingredient can be severely affected by its ionization in a dose solution, most of the existing prediction models cannot predict such impacts. To provide reliable predictors, we curated a novel large dataset of in vitro human skin permeability coefficients for 322 entries comprising chemically diverse permeants whose ionization fractions can be calculated. Subsequently, we generated thousands of computational descriptors, including LogD (octanol-water distribution coefficient at a specific pH), and analyzed the dataset using nonlinear support vector regression (SVR) and Gaussian process regression (GPR) combined with greedy descriptor selection. The SVR model was slightly superior to the GPR model, with externally validated squared correlation coefficient, root mean square error, and mean absolute error values of 0.94, 0.29, and 0.21, respectively. These models indicate that Log D is effective for a comprehensive prediction of ionization effects on skin permeability. In addition, the proposed models satisfied the statistical criteria endorsed in recent model validation studies. These models can evaluate virtually generated compounds at any pH; therefore, they can be used for high-throughput evaluations of numerous active ingredients and optimization of their skin permeability with respect to permeant ionization. Copyright © 2017 Elsevier B.V. All rights reserved.
Detection and quantification of adulteration in sandalwood oil through near infrared spectroscopy.

PubMed

Kuriakose, Saji; Thankappan, Xavier; Joe, Hubert; Venkataraman, Venkateswaran

2010-10-01

The confirmation of authenticity of essential oils and the detection of adulteration are problems of increasing importance in the perfumes, pharmaceutical, flavor and fragrance industries. This is especially true for 'value added' products like sandalwood oil. A methodical study is conducted here to demonstrate the potential use of Near Infrared (NIR) spectroscopy along with multivariate calibration models like principal component regression (PCR) and partial least square regression (PLSR) as rapid analytical techniques for the qualitative and quantitative determination of adulterants in sandalwood oil. After suitable pre-processing of the NIR raw spectral data, the models are built-up by cross-validation. The lowest Root Mean Square Error of Cross-Validation and Calibration (RMSECV and RMSEC % v/v) are used as a decision supporting system to fix the optimal number of factors. The coefficient of determination (R(2)) and the Root Mean Square Error of Prediction (RMSEP % v/v) in the prediction sets are used as the evaluation parameters (R(2) = 0.9999 and RMSEP = 0.01355). The overall result leads to the conclusion that NIR spectroscopy with chemometric techniques could be successfully used as a rapid, simple, instant and non-destructive method for the detection of adulterants, even 1% of the low-grade oils, in the high quality form of sandalwood oil.
Measurement of process variables in solid-state fermentation of wheat straw using FT-NIR spectroscopy and synergy interval PLS algorithm

NASA Astrophysics Data System (ADS)

Jiang, Hui; Liu, Guohai; Mei, Congli; Yu, Shuang; Xiao, Xiahong; Ding, Yuhan

2012-11-01

The feasibility of rapid determination of the process variables (i.e. pH and moisture content) in solid-state fermentation (SSF) of wheat straw using Fourier transform near infrared (FT-NIR) spectroscopy was studied. Synergy interval partial least squares (siPLS) algorithm was implemented to calibrate regression model. The number of PLS factors and the number of subintervals were optimized simultaneously by cross-validation. The performance of the prediction model was evaluated according to the root mean square error of cross-validation (RMSECV), the root mean square error of prediction (RMSEP) and the correlation coefficient (R). The measurement results of the optimal model were obtained as follows: RMSECV = 0.0776, Rc = 0.9777, RMSEP = 0.0963, and Rp = 0.9686 for pH model; RMSECV = 1.3544% w/w, Rc = 0.8871, RMSEP = 1.4946% w/w, and Rp = 0.8684 for moisture content model. Finally, compared with classic PLS and iPLS models, the siPLS model revealed its superior performance. The overall results demonstrate that FT-NIR spectroscopy combined with siPLS algorithm can be used to measure process variables in solid-state fermentation of wheat straw, and NIR spectroscopy technique has a potential to be utilized in SSF industry.
Estimating Inflows to Lake Okeechobee Using Climate Indices: A Machine Learning Modeling Approach

NASA Astrophysics Data System (ADS)

Kalra, A.; Ahmad, S.

2008-12-01

The operation of regional water management systems that include lakes and storage reservoirs for flood control and water supply can be significantly improved by using climate indices. This research is focused on forecasting Lag 1 annual inflow to Lake Okeechobee, located in South Florida, using annual oceanic- atmospheric indices of Pacific Decadal Oscillation (PDO), North Atlantic Oscillation (NAO), Atlantic Multidecadal Oscillation (AMO), and El Nino-Southern Oscillations (ENSO). Support Vector Machine (SVM) and Least Square Support Vector Machine (LSSVM), belonging to the class of data driven models, are developed to forecast annual lake inflow using annual oceanic-atmospheric indices data from 1914 to 2003. The models were trained with 80 years of data and tested for 10 years of data. Based on Correlation Coefficient, Root Means Square Error, and Mean Absolute Error model predictions were in good agreement with measured inflow volumes. Sensitivity analysis, performed to evaluate the effect of individual and coupled oscillations, revealed a strong signal for AMO and ENSO indices compared to PDO and NAO indices for one year lead-time inflow forecast. Inflow predictions from the SVM models were better when compared with the predictions obtained from feed forward back propagation Artificial Neural Network (ANN) models.
Accuracy of band dendrometers

Treesearch

L. R. Auchmoody

1976-01-01

A study to determine the reliability of first-year growth measurements obtained from aluminum band dendrometers showed that growth was underestimated for black cherry trees growing less than 0.5 inch in diameter or accumulating less than 0.080 square foot of basal area. Prediction equations to correct for these errors are given.

Analysis of a Shock-Associated Noise Prediction Model Using Measured Jet Far-Field Noise Data

NASA Technical Reports Server (NTRS)

Dahl, Milo D.; Sharpe, Jacob A.

2014-01-01

A code for predicting supersonic jet broadband shock-associated noise was assessed using a database containing noise measurements of a jet issuing from a convergent nozzle. The jet was operated at 24 conditions covering six fully expanded Mach numbers with four total temperature ratios. To enable comparisons of the predicted shock-associated noise component spectra with data, the measured total jet noise spectra were separated into mixing noise and shock-associated noise component spectra. Comparisons between predicted and measured shock-associated noise component spectra were used to identify deficiencies in the prediction model. Proposed revisions to the model, based on a study of the overall sound pressure levels for the shock-associated noise component of the measured data, a sensitivity analysis of the model parameters with emphasis on the definition of the convection velocity parameter, and a least-squares fit of the predicted to the measured shock-associated noise component spectra, resulted in a new definition for the source strength spectrum in the model. An error analysis showed that the average error in the predicted spectra was reduced by as much as 3.5 dB for the revised model relative to the average error for the original model.
Prediction of pH of cola beverage using Vis/NIR spectroscopy and least squares-support vector machine

NASA Astrophysics Data System (ADS)

Liu, Fei; He, Yong

2008-02-01

Visible and near infrared (Vis/NIR) transmission spectroscopy and chemometric methods were utilized to predict the pH values of cola beverages. Five varieties of cola were prepared and 225 samples (45 samples for each variety) were selected for the calibration set, while 75 samples (15 samples for each variety) for the validation set. The smoothing way of Savitzky-Golay and standard normal variate (SNV) followed by first-derivative were used as the pre-processing methods. Partial least squares (PLS) analysis was employed to extract the principal components (PCs) which were used as the inputs of least squares-support vector machine (LS-SVM) model according to their accumulative reliabilities. Then LS-SVM with radial basis function (RBF) kernel function and a two-step grid search technique were applied to build the regression model with a comparison of PLS regression. The correlation coefficient (r), root mean square error of prediction (RMSEP) and bias were 0.961, 0.040 and 0.012 for PLS, while 0.975, 0.031 and 4.697x10 -3 for LS-SVM, respectively. Both methods obtained a satisfying precision. The results indicated that Vis/NIR spectroscopy combined with chemometric methods could be applied as an alternative way for the prediction of pH of cola beverages.
State-space prediction model for chaotic time series

NASA Astrophysics Data System (ADS)

Alparslan, A. K.; Sayar, M.; Atilgan, A. R.

1998-08-01

A simple method for predicting the continuation of scalar chaotic time series ahead in time is proposed. The false nearest neighbors technique in connection with the time-delayed embedding is employed so as to reconstruct the state space. A local forecasting model based upon the time evolution of the topological neighboring in the reconstructed phase space is suggested. A moving root-mean-square error is utilized in order to monitor the error along the prediction horizon. The model is tested for the convection amplitude of the Lorenz model. The results indicate that for approximately 100 cycles of the training data, the prediction follows the actual continuation very closely about six cycles. The proposed model, like other state-space forecasting models, captures the long-term behavior of the system due to the use of spatial neighbors in the state space.
Tablet potency of Tianeptine in coated tablets by near infrared spectroscopy: model optimisation, calibration transfer and confidence intervals.

PubMed

Boiret, Mathieu; Meunier, Loïc; Ginot, Yves-Michel

2011-02-20

A near infrared (NIR) method was developed for determination of tablet potency of active pharmaceutical ingredient (API) in a complex coated tablet matrix. The calibration set contained samples from laboratory and production scale batches. The reference values were obtained by high performance liquid chromatography (HPLC) and partial least squares (PLS) regression was used to establish a model. The model was challenged by calculating tablet potency of two external test sets. Root mean square errors of prediction were respectively equal to 2.0% and 2.7%. To use this model with a second spectrometer from the production field, a calibration transfer method called piecewise direct standardisation (PDS) was used. After the transfer, the root mean square error of prediction of the first test set was 2.4% compared to 4.0% without transferring the spectra. A statistical technique using bootstrap of PLS residuals was used to estimate confidence intervals of tablet potency calculations. This method requires an optimised PLS model, selection of the bootstrap number and determination of the risk. In the case of a chemical analysis, the tablet potency value will be included within the confidence interval calculated by the bootstrap method. An easy to use graphical interface was developed to easily determine if the predictions, surrounded by minimum and maximum values, are within the specifications defined by the regulatory organisation. Copyright © 2010 Elsevier B.V. All rights reserved.
Adaptive Trajectory Prediction Algorithm for Climbing Flights

NASA Technical Reports Server (NTRS)

Schultz, Charles Alexander; Thipphavong, David P.; Erzberger, Heinz

2012-01-01

Aircraft climb trajectories are difficult to predict, and large errors in these predictions reduce the potential operational benefits of some advanced features for NextGen. The algorithm described in this paper improves climb trajectory prediction accuracy by adjusting trajectory predictions based on observed track data. It utilizes rate-of-climb and airspeed measurements derived from position data to dynamically adjust the aircraft weight modeled for trajectory predictions. In simulations with weight uncertainty, the algorithm is able to adapt to within 3 percent of the actual gross weight within two minutes of the initial adaptation. The root-mean-square of altitude errors for five-minute predictions was reduced by 73 percent. Conflict detection performance also improved, with a 15 percent reduction in missed alerts and a 10 percent reduction in false alerts. In a simulation with climb speed capture intent and weight uncertainty, the algorithm improved climb trajectory prediction accuracy by up to 30 percent and conflict detection performance, reducing missed and false alerts by up to 10 percent.
Flood loss model transfer: on the value of additional data

NASA Astrophysics Data System (ADS)

Schröter, Kai; Lüdtke, Stefan; Vogel, Kristin; Kreibich, Heidi; Thieken, Annegret; Merz, Bruno

2017-04-01

The transfer of models across geographical regions and flood events is a key challenge in flood loss estimation. Variations in local characteristics and continuous system changes require regional adjustments and continuous updating with current evidence. However, acquiring data on damage influencing factors is expensive and therefore assessing the value of additional data in terms of model reliability and performance improvement is of high relevance. The present study utilizes empirical flood loss data on direct damage to residential buildings available from computer aided telephone interviews that were carried out after the floods in 2002, 2005, 2006, 2010, 2011 and 2013 mainly in the Elbe and Danube catchments in Germany. Flood loss model performance is assessed for incrementally increased numbers of loss data which are differentiated according to region and flood event. Two flood loss modeling approaches are considered: (i) a multi-variable flood loss model approach using Random Forests and (ii) a uni-variable stage damage function. Both model approaches are embedded in a bootstrapping process which allows evaluating the uncertainty of model predictions. Predictive performance of both models is evaluated with regard to mean bias, mean absolute and mean squared errors, as well as hit rate and sharpness. Mean bias and mean absolute error give information about the accuracy of model predictions; mean squared error and sharpness about precision and hit rate is an indicator for model reliability. The results of incremental, regional and temporal updating demonstrate the usefulness of additional data to improve model predictive performance and increase model reliability, particularly in a spatial-temporal transfer setting.
Thickness distribution of a cooling pyroclastic flow deposit on Augustine Volcano, Alaska: Optimization using InSAR, FEMs, and an adaptive mesh algorithm

USGS Publications Warehouse

Masterlark, Timothy; Lu, Zhong; Rykhus, Russell P.

2006-01-01

Interferometric synthetic aperture radar (InSAR) imagery documents the consistent subsidence, during the interval 1992–1999, of a pyroclastic flow deposit (PFD) emplaced during the 1986 eruption of Augustine Volcano, Alaska. We construct finite element models (FEMs) that simulate thermoelastic contraction of the PFD to account for the observed subsidence. Three-dimensional problem domains of the FEMs include a thermoelastic PFD embedded in an elastic substrate. The thickness of the PFD is initially determined from the difference between post- and pre-eruption digital elevation models (DEMs). The initial excess temperature of the PFD at the time of deposition, 640 °C, is estimated from FEM predictions and an InSAR image via standard least-squares inverse methods. Although the FEM predicts the major features of the observed transient deformation, systematic prediction errors (RMSE = 2.2 cm) are most likely associated with errors in the a priori PFD thickness distribution estimated from the DEM differences. We combine an InSAR image, FEMs, and an adaptive mesh algorithm to iteratively optimize the geometry of the PFD with respect to a minimized misfit between the predicted thermoelastic deformation and observed deformation. Prediction errors from an FEM, which includes an optimized PFD geometry and the initial excess PFD temperature estimated from the least-squares analysis, are sub-millimeter (RMSE = 0.3 mm). The average thickness (9.3 m), maximum thickness (126 m), and volume (2.1 × 107m3) of the PFD, estimated using the adaptive mesh algorithm, are about twice as large as the respective estimations for the a priori PFD geometry. Sensitivity analyses suggest unrealistic PFD thickness distributions are required for initial excess PFD temperatures outside of the range 500–800 °C.
Online Soft Sensor of Humidity in PEM Fuel Cell Based on Dynamic Partial Least Squares

PubMed Central

Long, Rong; Chen, Qihong; Zhang, Liyan; Ma, Longhua; Quan, Shuhai

2013-01-01

Online monitoring humidity in the proton exchange membrane (PEM) fuel cell is an important issue in maintaining proper membrane humidity. The cost and size of existing sensors for monitoring humidity are prohibitive for online measurements. Online prediction of humidity using readily available measured data would be beneficial to water management. In this paper, a novel soft sensor method based on dynamic partial least squares (DPLS) regression is proposed and applied to humidity prediction in PEM fuel cell. In order to obtain data of humidity and test the feasibility of the proposed DPLS-based soft sensor a hardware-in-the-loop (HIL) test system is constructed. The time lag of the DPLS-based soft sensor is selected as 30 by comparing the root-mean-square error in different time lag. The performance of the proposed DPLS-based soft sensor is demonstrated by experimental results. PMID:24453923
A New Global Regression Analysis Method for the Prediction of Wind Tunnel Model Weight Corrections

NASA Technical Reports Server (NTRS)

Ulbrich, Norbert Manfred; Bridge, Thomas M.; Amaya, Max A.

2014-01-01

A new global regression analysis method is discussed that predicts wind tunnel model weight corrections for strain-gage balance loads during a wind tunnel test. The method determines corrections by combining "wind-on" model attitude measurements with least squares estimates of the model weight and center of gravity coordinates that are obtained from "wind-off" data points. The method treats the least squares fit of the model weight separate from the fit of the center of gravity coordinates. Therefore, it performs two fits of "wind- off" data points and uses the least squares estimator of the model weight as an input for the fit of the center of gravity coordinates. Explicit equations for the least squares estimators of the weight and center of gravity coordinates are derived that simplify the implementation of the method in the data system software of a wind tunnel. In addition, recommendations for sets of "wind-off" data points are made that take typical model support system constraints into account. Explicit equations of the confidence intervals on the model weight and center of gravity coordinates and two different error analyses of the model weight prediction are also discussed in the appendices of the paper.
Carbon dioxide emission prediction using support vector machine

NASA Astrophysics Data System (ADS)

Saleh, Chairul; Rachman Dzakiyullah, Nur; Bayu Nugroho, Jonathan

2016-02-01

In this paper, the SVM model was proposed for predict expenditure of carbon (CO2) emission. The energy consumption such as electrical energy and burning coal is input variable that affect directly increasing of CO2 emissions were conducted to built the model. Our objective is to monitor the CO2 emission based on the electrical energy and burning coal used from the production process. The data electrical energy and burning coal used were obtained from Alcohol Industry in order to training and testing the models. It divided by cross-validation technique into 90% of training data and 10% of testing data. To find the optimal parameters of SVM model was used the trial and error approach on the experiment by adjusting C parameters and Epsilon. The result shows that the SVM model has an optimal parameter on C parameters 0.1 and 0 Epsilon. To measure the error of the model by using Root Mean Square Error (RMSE) with error value as 0.004. The smallest error of the model represents more accurately prediction. As a practice, this paper was contributing for an executive manager in making the effective decision for the business operation were monitoring expenditure of CO2 emission.
Climate Prediction for Brazil's Nordeste: Performance of Empirical and Numerical Modeling Methods.

NASA Astrophysics Data System (ADS)

Moura, Antonio Divino; Hastenrath, Stefan

2004-07-01

Comparisons of performance of climate forecast methods require consistency in the predictand and a long common reference period. For Brazil's Nordeste, empirical methods developed at the University of Wisconsin use preseason (October January) rainfall and January indices of the fields of meridional wind component and sea surface temperature (SST) in the tropical Atlantic and the equatorial Pacific as input to stepwise multiple regression and neural networking. These are used to predict the March June rainfall at a network of 27 stations. An experiment at the International Research Institute for Climate Prediction, Columbia University, with a numerical model (ECHAM4.5) used global SST information through February to predict the March June rainfall at three grid points in the Nordeste. The predictands for the empirical and numerical model forecasts are correlated at +0.96, and the period common to the independent portion of record of the empirical prediction and the numerical modeling is 1968 99. Over this period, predicted versus observed rainfall are evaluated in terms of correlation, root-mean-square error, absolute error, and bias. Performance is high for both approaches. Numerical modeling produces a correlation of +0.68, moderate errors, and strong negative bias. For the empirical methods, errors and bias are small, and correlations of +0.73 and +0.82 are reached between predicted and observed rainfall.
Modeling RP-1 fuel advanced distillation data using comprehensive two-dimensional gas chromatography coupled with time-of-flight mass spectrometry and partial least squares analysis.

PubMed

Kehimkar, Benjamin; Parsons, Brendon A; Hoggard, Jamin C; Billingsley, Matthew C; Bruno, Thomas J; Synovec, Robert E

2015-01-01

Recent efforts in predicting rocket propulsion (RP-1) fuel performance through modeling put greater emphasis on obtaining detailed and accurate fuel properties, as well as elucidating the relationships between fuel compositions and their properties. Herein, we study multidimensional chromatographic data obtained by comprehensive two-dimensional gas chromatography combined with time-of-flight mass spectrometry (GC × GC-TOFMS) to analyze RP-1 fuels. For GC × GC separations, RTX-Wax (polar stationary phase) and RTX-1 (non-polar stationary phase) columns were implemented for the primary and secondary dimensions, respectively, to separate the chemical compound classes (alkanes, cycloalkanes, aromatics, etc.), providing a significant level of chemical compositional information. The GC × GC-TOFMS data were analyzed using partial least squares regression (PLS) chemometric analysis to model and predict advanced distillation curve (ADC) data for ten RP-1 fuels that were previously analyzed using the ADC method. The PLS modeling provides insight into the chemical species that impact the ADC data. The PLS modeling correlates compositional information found in the GC × GC-TOFMS chromatograms of each RP-1 fuel, and their respective ADC, and allows prediction of the ADC for each RP-1 fuel with good precision and accuracy. The root-mean-square error of calibration (RMSEC) ranged from 0.1 to 0.5 °C, and was typically below ∼0.2 °C, for the PLS calibration of the ADC modeling with GC × GC-TOFMS data, indicating a good fit of the model to the calibration data. Likewise, the predictive power of the overall method via PLS modeling was assessed using leave-one-out cross-validation (LOOCV) yielding root-mean-square error of cross-validation (RMSECV) ranging from 1.4 to 2.6 °C, and was typically below ∼2.0 °C, at each % distilled measurement point during the ADC analysis.
Combining forecast weights: Why and how?

NASA Astrophysics Data System (ADS)

Yin, Yip Chee; Kok-Haur, Ng; Hock-Eam, Lim

2012-09-01

This paper proposes a procedure called forecast weight averaging which is a specific combination of forecast weights obtained from different methods of constructing forecast weights for the purpose of improving the accuracy of pseudo out of sample forecasting. It is found that under certain specified conditions, forecast weight averaging can lower the mean squared forecast error obtained from model averaging. In addition, we show that in a linear and homoskedastic environment, this superior predictive ability of forecast weight averaging holds true irrespective whether the coefficients are tested by t statistic or z statistic provided the significant level is within the 10% range. By theoretical proofs and simulation study, we have shown that model averaging like, variance model averaging, simple model averaging and standard error model averaging, each produces mean squared forecast error larger than that of forecast weight averaging. Finally, this result also holds true marginally when applied to business and economic empirical data sets, Gross Domestic Product (GDP growth rate), Consumer Price Index (CPI) and Average Lending Rate (ALR) of Malaysia.
Predicting ambient aerosol Thermal Optical Reflectance (TOR) measurements from infrared spectra: organic carbon

NASA Astrophysics Data System (ADS)

Dillner, A. M.; Takahama, S.

2014-11-01

Organic carbon (OC) can constitute 50% or more of the mass of atmospheric particulate matter. Typically, the organic carbon concentration is measured using thermal methods such as Thermal-Optical Reflectance (TOR) from quartz fiber filters. Here, methods are presented whereby Fourier Transform Infrared (FT-IR) absorbance spectra from polytetrafluoroethylene (PTFE or Teflon) filters are used to accurately predict TOR OC. Transmittance FT-IR analysis is rapid, inexpensive, and non-destructive to the PTFE filters. To develop and test the method, FT-IR absorbance spectra are obtained from 794 samples from seven Interagency Monitoring of PROtected Visual Environment (IMPROVE) sites sampled during 2011. Partial least squares regression is used to calibrate sample FT-IR absorbance spectra to artifact-corrected TOR OC. The FTIR spectra are divided into calibration and test sets by sampling site and date which leads to precise and accurate OC predictions by FT-IR as indicated by high coefficient of determination (R2; 0.96), low bias (0.02 μg m-3, all μg m-3 values based on the nominal IMPROVE sample volume of 32.8 m-3), low error (0.08 μg m-3) and low normalized error (11%). These performance metrics can be achieved with various degrees of spectral pretreatment (e.g., including or excluding substrate contributions to the absorbances) and are comparable in precision and accuracy to collocated TOR measurements. FT-IR spectra are also divided into calibration and test sets by OC mass and by OM / OC which reflects the organic composition of the particulate matter and is obtained from organic functional group composition; this division also leads to precise and accurate OC predictions. Low OC concentrations have higher bias and normalized error due to TOR analytical errors and artifact correction errors, not due to the range of OC mass of the samples in the calibration set. However, samples with low OC mass can be used to predict samples with high OC mass indicating that the calibration is linear. Using samples in the calibration set that have a different OM / OC or ammonium / OC distributions than the test set leads to only a modest increase in bias and normalized error in the predicted samples. We conclude that FT-IR analysis with partial least squares regression is a robust method for accurately predicting TOR OC in IMPROVE network samples; providing complementary information to the organic functional group composition and organic aerosol mass estimated previously from the same set of sample spectra (Ruthenburg et al., 2014).
[Application of near-infrared spectroscopy technology in extraction and concentration process of Reduning injection].

PubMed

Zhang, Ya-Fei; Zuo, Xiang-Yun; Bi, Yu-An; Wu, Jian-Xiong; Wang, Zhen-Zhong; L, Ping; Xiao, Wei

2014-08-01

To establish a rapid quantitative analysis method for the content of chlorogenic acid and solid content in the extraction liquid concentration process during the production of Reduning injection by using the near-infrared (NIR) spectroscopy, in order to reflect the concentration state in a real-time manner and really realize the quality control of concentrating process of the extraction and concentration process. The samples during the Jinqing extraction liquid concentration process were collected. After the removal of abnormal samples, the spectra pretreatment and the wave band selection, the quantitative calibration model between NIR spectra and chlorogenic acid HPLC analytical value and solid content was established by using PLS algorithm, and unknown samples were predicted. The correlation coefficients between the chlorogenic acid content and the solid content were respectively 0.992 1 and 0.994 0, and the correlation coefficients of the verification model were respectively 0.994 4 and 0.998 4, with the root mean square error of calibration (RMSEC) of 0.814 6 and 2.656 1 and the root mean square error of prediction (RMSEP) of 0.704 6 and 1.876 7 respectively, and the relative standard errors of predictions (RSEP) were 6.01% and 2.93% respectively. The method is simple, rapid, nondestructive, accurate and reliable, thus could be adopted for the fast monitoring of the chlorogenic acid content and the solid content during the concentration process of Reduning injection extraction liquid.
Optimization of Stripping Voltammetric Sensor by a Back Propagation Artificial Neural Network for the Accurate Determination of Pb(II) in the Presence of Cd(II).

PubMed

Zhao, Guo; Wang, Hui; Liu, Gang; Wang, Zhiqiang

2016-09-21

An easy, but effective, method has been proposed to detect and quantify the Pb(II) in the presence of Cd(II) based on a Bi/glassy carbon electrode (Bi/GCE) with the combination of a back propagation artificial neural network (BP-ANN) and square wave anodic stripping voltammetry (SWASV) without further electrode modification. The effects of Cd(II) in different concentrations on stripping responses of Pb(II) was studied. The results indicate that the presence of Cd(II) will reduce the prediction precision of a direct calibration model. Therefore, a two-input and one-output BP-ANN was built for the optimization of a stripping voltammetric sensor, which considering the combined effects of Cd(II) and Pb(II) on the SWASV detection of Pb(II) and establishing the nonlinear relationship between the stripping peak currents of Pb(II) and Cd(II) and the concentration of Pb(II). The key parameters of the BP-ANN and the factors affecting the SWASV detection of Pb(II) were optimized. The prediction performance of direct calibration model and BP-ANN model were tested with regard to the mean absolute error (MAE), root mean square error (RMSE), average relative error (ARE), and correlation coefficient. The results proved that the BP-ANN model exhibited higher prediction accuracy than the direct calibration model. Finally, a real samples analysis was performed to determine trace Pb(II) in some soil specimens with satisfactory results.
Water Level Prediction of Lake Cascade Mahakam Using Adaptive Neural Network Backpropagation (ANNBP)

NASA Astrophysics Data System (ADS)

Mislan; Gaffar, A. F. O.; Haviluddin; Puspitasari, N.

2018-04-01

A natural hazard information and flood events are indispensable as a form of prevention and improvement. One of the causes is flooding in the areas around the lake. Therefore, forecasting the surface of Lake water level to anticipate flooding is required. The purpose of this paper is implemented computational intelligence method namely Adaptive Neural Network Backpropagation (ANNBP) to forecasting the Lake Cascade Mahakam. Based on experiment, performance of ANNBP indicated that Lake water level prediction have been accurate by using mean square error (MSE) and mean absolute percentage error (MAPE). In other words, computational intelligence method can produce good accuracy. A hybrid and optimization of computational intelligence are focus in the future work.
Prediction of atmospheric degradation data for POPs by gene expression programming.

PubMed

Luan, F; Si, H Z; Liu, H T; Wen, Y Y; Zhang, X Y

2008-01-01

Quantitative structure-activity relationship models for the prediction of the mean and the maximum atmospheric degradation half-life values of persistent organic pollutants were developed based on the linear heuristic method (HM) and non-linear gene expression programming (GEP). Molecular descriptors, calculated from the structures alone, were used to represent the characteristics of the compounds. HM was used both to pre-select the whole descriptor sets and to build the linear model. GEP yielded satisfactory prediction results: the square of the correlation coefficient r(2) was 0.80 and 0.81 for the mean and maximum half-life values of the test set, and the root mean square errors were 0.448 and 0.426, respectively. The results of this work indicate that the GEP is a very promising tool for non-linear approximations.
Prediction of p38 map kinase inhibitory activity of 3, 4-dihydropyrido [3, 2-d] pyrimidone derivatives using an expert system based on principal component analysis and least square support vector machine

PubMed Central

Shahlaei, M.; Saghaie, L.

2014-01-01

A quantitative structure–activity relationship (QSAR) study is suggested for the prediction of biological activity (pIC50) of 3, 4-dihydropyrido [3,2-d] pyrimidone derivatives as p38 inhibitors. Modeling of the biological activities of compounds of interest as a function of molecular structures was established by means of principal component analysis (PCA) and least square support vector machine (LS-SVM) methods. The results showed that the pIC50 values calculated by LS-SVM are in good agreement with the experimental data, and the performance of the LS-SVM regression model is superior to the PCA-based model. The developed LS-SVM model was applied for the prediction of the biological activities of pyrimidone derivatives, which were not in the modeling procedure. The resulted model showed high prediction ability with root mean square error of prediction of 0.460 for LS-SVM. The study provided a novel and effective approach for predicting biological activities of 3, 4-dihydropyrido [3,2-d] pyrimidone derivatives as p38 inhibitors and disclosed that LS-SVM can be used as a powerful chemometrics tool for QSAR studies. PMID:26339262
Frequency noise measurement of diode-pumped Nd:YAG ring lasers

NASA Technical Reports Server (NTRS)

Chen, Chien-Chung; Win, Moe Zaw

1990-01-01

The combined frequency noise spectrum of two model 120-01A nonplanar ring oscillator lasers was measured by first heterodyne detecting the IF signal and then measuring the IF frequency noise using an RF frequency discriminator. The results indicated the presence of a 1/f-squared noise component in the power-spectral density of the frequency fluctuations between 1 Hz and 1 kHz. After incorporating this 1/f-squared into the analysis of the optical phase tracking loop, the measured phase error variance closely matches the theoretical predictions.

Demand Forecasting: An Evaluation of DODs Accuracy Metric and Navys Procedures

DTIC Science & Technology

2016-06-01

inventory management improvement plan, mean of absolute scaled error, lead time adjusted squared error, forecast accuracy, benchmarking, naïve method...Manager JASA Journal of the American Statistical Association LASE Lead-time Adjusted Squared Error LCI Life Cycle Indicator MA Moving Average MAE...Mean Squared Error xvi NAVSUP Naval Supply Systems Command NDAA National Defense Authorization Act NIIN National Individual Identification Number
Incorporating a prediction of postgrazing herbage mass into a whole-farm model for pasture-based dairy systems.

PubMed

Gregorini, P; Galli, J; Romera, A J; Levy, G; Macdonald, K A; Fernandez, H H; Beukes, P C

2014-07-01

The DairyNZ whole-farm model (WFM; DairyNZ, Hamilton, New Zealand) consists of a framework that links component models for animal, pastures, crops, and soils. The model was developed to assist with analysis and design of pasture-based farm systems. New (this work) and revised (e.g., cow, pasture, crops) component models can be added to the WFM, keeping the model flexible and up to date. Nevertheless, the WFM does not account for plant-animal relationships determining herbage-depletion dynamics. The user has to preset the maximum allowable level of herbage depletion [i.e., postgrazing herbage mass (residuals)] throughout the year. Because residuals have a direct effect on herbage regrowth, the WFM in its current form does not dynamically simulate the effect of grazing pressure on herbage depletion and consequent effect on herbage regrowth. The management of grazing pressure is a key component of pasture-based dairy systems. Thus, the main objective of the present work was to develop a new version of the WFM able to predict residuals, and thereby simulate related effects of grazing pressure dynamically at the farm scale. This objective was accomplished by incorporating a new component model into the WFM. This model represents plant-animal relationships, for example sward structure and herbage intake rate, and resulting level of herbage depletion. The sensitivity of the new version of the WFM was evaluated and then the new WFM was tested against an experimental data set previously used to evaluate the WFM and to illustrate the adequacy and improvement of the model development. Key outputs variables of the new version pertinent to this work (milk production, herbage dry matter intake, intake rate, harvesting efficiency, and residuals) responded acceptably to a range of input variables. The relative prediction errors for monthly and mean annual residual predictions were 20 and 5%, respectively. Monthly predictions of residuals had a line bias (1.5%), with a proportion of square root of mean square prediction error (RMSPE) due to random error of 97.5%. Predicted monthly herbage growth rates had a line bias of 2%, a proportion of RMSPE due to random error of 96%, and a concordance correlation coefficient of 0.87. Annual herbage production was predicted with an RMSPE of 531 (kg of herbage dry matter/ha per year), a line bias of 11%, a proportion of RMSPE due to random error of 80%, and relative prediction errors of 2%. Annual herbage dry matter intake per cow and hectare, both per year, were predicted with RMSPE, relative prediction error, and concordance correlation coefficient of 169 and 692kg of dry matter, 3 and 4%, and 0.91 and 0.87, respectively. These results indicate that predictions of the new WFM are relatively accurate and precise, with a conclusion that incorporating a plant-animal relationship model into the WFM allows for dynamic predictions of residuals and more realistic simulations of the effect of grazing pressure on herbage production and intake at the farm level without the intervention from the user. Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Real-time prediction and gating of respiratory motion in 3D space using extended Kalman filters and Gaussian process regression network

NASA Astrophysics Data System (ADS)

Bukhari, W.; Hong, S.-M.

2016-03-01

The prediction as well as the gating of respiratory motion have received much attention over the last two decades for reducing the targeting error of the radiation treatment beam due to respiratory motion. In this article, we present a real-time algorithm for predicting respiratory motion in 3D space and realizing a gating function without pre-specifying a particular phase of the patient’s breathing cycle. The algorithm, named EKF-GPRN+ , first employs an extended Kalman filter (EKF) independently along each coordinate to predict the respiratory motion and then uses a Gaussian process regression network (GPRN) to correct the prediction error of the EKF in 3D space. The GPRN is a nonparametric Bayesian algorithm for modeling input-dependent correlations between the output variables in multi-output regression. Inference in GPRN is intractable and we employ variational inference with mean field approximation to compute an approximate predictive mean and predictive covariance matrix. The approximate predictive mean is used to correct the prediction error of the EKF. The trace of the approximate predictive covariance matrix is utilized to capture the uncertainty in EKF-GPRN+ prediction error and systematically identify breathing points with a higher probability of large prediction error in advance. This identification enables us to pause the treatment beam over such instances. EKF-GPRN+ implements a gating function by using simple calculations based on the trace of the predictive covariance matrix. Extensive numerical experiments are performed based on a large database of 304 respiratory motion traces to evaluate EKF-GPRN+ . The experimental results show that the EKF-GPRN+ algorithm reduces the patient-wise prediction error to 38%, 40% and 40% in root-mean-square, compared to no prediction, at lookahead lengths of 192 ms, 384 ms and 576 ms, respectively. The EKF-GPRN+ algorithm can further reduce the prediction error by employing the gating function, albeit at the cost of reduced duty cycle. The error reduction allows the clinical target volume to planning target volume (CTV-PTV) margin to be reduced, leading to decreased normal-tissue toxicity and possible dose escalation. The CTV-PTV margin is also evaluated to quantify clinical benefits of EKF-GPRN+ prediction.
Water-quality models to assess algal community dynamics, water quality, and fish habitat suitability for two agricultural land-use dominated lakes in Minnesota, 2014

USGS Publications Warehouse

Smith, Erik A.; Kiesling, Richard L.; Ziegeweid, Jeffrey R.

2017-07-20

Fish habitat can degrade in many lakes due to summer blue-green algal blooms. Predictive models are needed to better manage and mitigate loss of fish habitat due to these changes. The U.S. Geological Survey (USGS), in cooperation with the Minnesota Department of Natural Resources, developed predictive water-quality models for two agricultural land-use dominated lakes in Minnesota—Madison Lake and Pearl Lake, which are part of Minnesota’s sentinel lakes monitoring program—to assess algal community dynamics, water quality, and fish habitat suitability of these two lakes under recent (2014) meteorological conditions. The interaction of basin processes to these two lakes, through the delivery of nutrient loads, were simulated using CE-QUAL-W2, a carbon-based, laterally averaged, two-dimensional water-quality model that predicts distribution of temperature and oxygen from interactions between nutrient cycling, primary production, and trophic dynamics.The CE-QUAL-W2 models successfully predicted water temperature and dissolved oxygen on the basis of the two metrics of mean absolute error and root mean square error. For Madison Lake, the mean absolute error and root mean square error were 0.53 and 0.68 degree Celsius, respectively, for the vertical temperature profile comparisons; for Pearl Lake, the mean absolute error and root mean square error were 0.71 and 0.95 degree Celsius, respectively, for the vertical temperature profile comparisons. Temperature and dissolved oxygen were key metrics for calibration targets. These calibrated lake models also simulated algal community dynamics and water quality. The model simulations presented potential explanations for persistently large total phosphorus concentrations in Madison Lake, key differences in nutrient concentrations between these lakes, and summer blue-green algal bloom persistence.Fish habitat suitability simulations for cool-water and warm-water fish indicated that, in general, both lakes contained a large proportion of good-growth habitat and a sustained period of optimal growth habitat in the summer, without any periods of lethal oxythermal habitat. For Madison and Pearl Lakes, examples of important cool-water fish, particularly game fish, include northern pike (Esox lucius), walleye (Sander vitreus), and black crappie (Pomoxis nigromaculatus); examples of important warm-water fish include bluegill (Lepomis macrochirus), largemouth bass (Micropterus salmoides), and smallmouth bass (Micropterus dolomieu). Sensitivity analyses were completed to understand lake response effects through the use of controlled departures on certain calibrated model parameters and input nutrient loads. These sensitivity analyses also operated as land-use change scenarios because alterations in agricultural practices, for example, could potentially increase or decrease nutrient loads.
Development of a Direct Spectrophotometric and Chemometric Method for Determining Food Dye Concentrations.

PubMed

Arroz, Erin; Jordan, Michael; Dumancas, Gerard G

2017-07-01

An ultraviolet visible (UV-Vis) spectrophotometric and partial least squares (PLS) chemometric method was developed for the simultaneous determination of erythrosine B (red), Brilliant Blue, and tartrazine (yellow) dyes. A training set (n = 64) was generated using a full factorial design and its accuracy was tested in a test set (n = 13) using a Box-Behnken design. The test set garnered a root mean square error (RMSE) of 1.79 × 10 -7 for blue, 4.59 × 10 -7 for red, and 1.13 × 10 -6 for yellow dyes. The relatively small RMSE suggests only a small difference between predicted versus measured concentrations, demonstrating the accuracy of our model. The relative error of prediction (REP) for the test set were 11.73%, 19.52%, 19.38%, for blue, red, and yellow dyes, respectively. A comparable overlay between the actual candy samples and their replicated synthetic spectra were also obtained indicating the model as a potentially accurate method for determining concentrations of dyes in food samples.
A rapid method for detection of fumonisins B1 and B2 in corn meal using Fourier transform near infrared (FT-NIR) spectroscopy implemented with integrating sphere.

PubMed

Gaspardo, B; Del Zotto, S; Torelli, E; Cividino, S R; Firrao, G; Della Riccia, G; Stefanon, B

2012-12-01

Fourier transform near infrared (FT-NIR) spectroscopy is an analytical procedure generally used to detect organic compounds in food. In this work the ability to predict fumonisin B(1)+B(2) contents in corn meal using an FT-NIR spectrophotometer, equipped with an integration sphere, was assessed. A total of 143 corn meal samples were collected in Friuli Venezia Giulia Region (Italy) and used to define a 15 principal components regression model, applying partial least square regression algorithm with full cross validation as internal validation. External validation was performed to 25 unknown samples. Coefficients of correlation, root mean square error and standard error of calibration were 0.964, 0.630 and 0.632, respectively and the external validation confirmed a fair potential of the model in predicting FB(1)+FB(2) concentration. Results suggest that FT-NIR analysis is a suitable method to detect FB(1)+FB(2) in corn meal and to discriminate safe meals from those contaminated. Copyright © 2012 Elsevier Ltd. All rights reserved.
Study of the convergence behavior of the complex kernel least mean square algorithm.

PubMed

Paul, Thomas K; Ogunfunmi, Tokunbo

2013-09-01

The complex kernel least mean square (CKLMS) algorithm is recently derived and allows for online kernel adaptive learning for complex data. Kernel adaptive methods can be used in finding solutions for neural network and machine learning applications. The derivation of CKLMS involved the development of a modified Wirtinger calculus for Hilbert spaces to obtain the cost function gradient. We analyze the convergence of the CKLMS with different kernel forms for complex data. The expressions obtained enable us to generate theory-predicted mean-square error curves considering the circularity of the complex input signals and their effect on nonlinear learning. Simulations are used for verifying the analysis results.
Validation of the Hong Kong Liver Cancer Staging System in Determining Prognosis of the North American Patients Following Intra-arterial Therapy.

PubMed

Sohn, Jae Ho; Duran, Rafael; Zhao, Yan; Fleckenstein, Florian; Chapiro, Julius; Sahu, Sonia; Schernthaner, Rüdiger E; Qian, Tianchen; Lee, Howard; Zhao, Li; Hamilton, James; Frangakis, Constantine; Lin, MingDe; Salem, Riad; Geschwind, Jean-Francois

2017-05-01

There is debate over the best way to stage hepatocellular carcinoma (HCC). We attempted to validate the prognostic and clinical utility of the recently developed Hong Kong Liver Cancer (HKLC) staging system, a hepatitis B-based model, and compared data with that from the Barcelona Clinic Liver Cancer (BCLC) staging system in a North American population that underwent intra-arterial therapy (IAT). We performed a retrospective analysis of data from 1009 patients with HCC who underwent IAT from 2000 through 2014. Most patients had hepatitis C or unresectable tumors; all patients underwent IAT, with or without resection, transplantation, and/or systemic chemotherapy. We calculated HCC stage for each patient using 5-stage HKLC (HKLC-5) and 9-stage HKLC (HKLC-9) system classifications, and the BCLC system. Survival information was collected up until the end of 2014 at which point living or unconfirmed patients were censored. We compared performance of the BCLC, HKLC-5, and HKLC-9 systems in predicting patient outcomes using Kaplan-Meier estimates, calibration plots, C statistic, Akaike information criterion, and the likelihood ratio test. Median overall survival time, calculated from first IAT until date of death or censorship, for the entire cohort (all stages) was 9.8 months. The BCLC and HKLC staging systems predicted patient survival times with significance (P < .001). HKLC-5 and HKLC-9 each demonstrated good calibration. The HKLC-5 system outperformed the BCLC system in predicting patient survival times (HKLC C = 0.71, Akaike information criterion = 6242; BCLC C = 0.64, Akaike information criterion = 6320), reducing error in predicting survival time (HKLC reduced error by 14%, BCLC reduced error by 12%), and homogeneity (HKLC chi-square = 201, P < .001; BCLC chi-square = 119, P < .001) and monotonicity (HKLC linear trend chi-square = 193, P < .001; BCLC linear trend chi-square = 111, P < .001). Small proportions of patients with HCC of stages IV or V, according to the HKLC system, survived for 6 months and 4 months, respectively. In a retrospective analysis of patients who underwent IAT for unresectable HCC, we found the HKLC-5 staging system to have the best combination of performances in survival separation, calibration, and discrimination; it consistently outperformed the BCLC system in predicting survival times of patients. The HKLC system identified patients with HCC of stages IV and V who are unlikely to benefit from IAT. Copyright © 2017 AGA Institute. Published by Elsevier Inc. All rights reserved.
SU-G-BRB-03: Assessing the Sensitivity and False Positive Rate of the Integrated Quality Monitor (IQM) Large Area Ion Chamber to MLC Positioning Errors

DOE Office of Scientific and Technical Information (OSTI.GOV)

Boehnke, E McKenzie; DeMarco, J; Steers, J

2016-06-15

Purpose: To examine both the IQM’s sensitivity and false positive rate to varying MLC errors. By balancing these two characteristics, an optimal tolerance value can be derived. Methods: An un-modified SBRT Liver IMRT plan containing 7 fields was randomly selected as a representative clinical case. The active MLC positions for all fields were perturbed randomly from a square distribution of varying width (±1mm to ±5mm). These unmodified and modified plans were measured multiple times each by the IQM (a large area ion chamber mounted to a TrueBeam linac head). Measurements were analyzed relative to the initial, unmodified measurement. IQM readingsmore » are analyzed as a function of control points. In order to examine sensitivity to errors along a field’s delivery, each measured field was divided into 5 groups of control points, and the maximum error in each group was recorded. Since the plans have known errors, we compared how well the IQM is able to differentiate between unmodified and error plans. ROC curves and logistic regression were used to analyze this, independent of thresholds. Results: A likelihood-ratio Chi-square test showed that the IQM could significantly predict whether a plan had MLC errors, with the exception of the beginning and ending control points. Upon further examination, we determined there was ramp-up occurring at the beginning of delivery. Once the linac AFC was tuned, the subsequent measurements (relative to a new baseline) showed significant (p <0.005) abilities to predict MLC errors. Using the area under the curve, we show the IQM’s ability to detect errors increases with increasing MLC error (Spearman’s Rho=0.8056, p<0.0001). The optimal IQM count thresholds from the ROC curves are ±3%, ±2%, and ±7% for the beginning, middle 3, and end segments, respectively. Conclusion: The IQM has proven to be able to detect not only MLC errors, but also differences in beam tuning (ramp-up). Partially supported by the Susan Scott Foundation.« less
Hyperspectral Imaging in Tandem with R Statistics and Image Processing for Detection and Visualization of pH in Japanese Big Sausages Under Different Storage Conditions.

PubMed

Feng, Chao-Hui; Makino, Yoshio; Yoshimura, Masatoshi; Thuyet, Dang Quoc; García-Martín, Juan Francisco

2018-02-01

The potential of hyperspectral imaging with wavelengths of 380 to 1000 nm was used to determine the pH of cooked sausages after different storage conditions (4 °C for 1 d, 35 °C for 1, 3, and 5 d). The mean spectra of the sausages were extracted from the hyperspectral images and partial least squares regression (PLSR) model was developed to relate spectral profiles with the pH of the cooked sausages. Eleven important wavelengths were selected based on the regression coefficient values. The PLSR model established using the optimal wavelengths showed good precision being the prediction coefficient of determination (R p 2 ) 0.909 and the root mean square error of prediction 0.035. The prediction map for illustrating pH indices in sausages was for the first time developed by R statistics. The overall results suggested that hyperspectral imaging combined with PLSR and R statistics are capable to quantify and visualize the sausages pH evolution under different storage conditions. In this paper, hyperspectral imaging is for the first time used to detect pH in cooked sausages using R statistics, which provides another useful information for the researchers who do not have the access to Matlab. Eleven optimal wavelengths were successfully selected, which were used for simplifying the PLSR model established based on the full wavelengths. This simplified model achieved a high R p 2 (0.909) and a low root mean square error of prediction (0.035), which can be useful for the design of multispectral imaging systems. © 2017 Institute of Food Technologists®.
Desert soil clay content estimation using reflectance spectroscopy preprocessed by fractional derivative

PubMed Central

Tiyip, Tashpolat; Ding, Jianli; Zhang, Dong; Liu, Wei; Wang, Fei; Tashpolat, Nigara

2017-01-01

Effective pretreatment of spectral reflectance is vital to model accuracy in soil parameter estimation. However, the classic integer derivative has some disadvantages, including spectral information loss and the introduction of high-frequency noise. In this paper, the fractional order derivative algorithm was applied to the pretreatment and partial least squares regression (PLSR) was used to assess the clay content of desert soils. Overall, 103 soil samples were collected from the Ebinur Lake basin in the Xinjiang Uighur Autonomous Region of China, and used as data sets for calibration and validation. Following laboratory measurements of spectral reflectance and clay content, the raw spectral reflectance and absorbance data were treated using the fractional derivative order from the 0.0 to the 2.0 order (order interval: 0.2). The ratio of performance to deviation (RPD), determinant coefficients of calibration (Rc2), root mean square errors of calibration (RMSEC), determinant coefficients of prediction (Rp2), and root mean square errors of prediction (RMSEP) were applied to assess the performance of predicting models. The results showed that models built on the fractional derivative order performed better than when using the classic integer derivative. Comparison of the predictive effects of 22 models for estimating clay content, calibrated by PLSR, showed that those models based on the fractional derivative 1.8 order of spectral reflectance (Rc2 = 0.907, RMSEC = 0.425%, Rp2 = 0.916, RMSEP = 0.364%, and RPD = 2.484 ≥ 2.000) and absorbance (Rc2 = 0.888, RMSEC = 0.446%, Rp2 = 0.918, RMSEP = 0.383% and RPD = 2.511 ≥ 2.000) were most effective. Furthermore, they performed well in quantitative estimations of the clay content of soils in the study area. PMID:28934274
PREVAIL: Predicting Recovery through Estimation and Visualization of Active and Incident Lesions.

PubMed

Dworkin, Jordan D; Sweeney, Elizabeth M; Schindler, Matthew K; Chahin, Salim; Reich, Daniel S; Shinohara, Russell T

2016-01-01

The goal of this study was to develop a model that integrates imaging and clinical information observed at lesion incidence for predicting the recovery of white matter lesions in multiple sclerosis (MS) patients. Demographic, clinical, and magnetic resonance imaging (MRI) data were obtained from 60 subjects with MS as part of a natural history study at the National Institute of Neurological Disorders and Stroke. A total of 401 lesions met the inclusion criteria and were used in the study. Imaging features were extracted from the intensity-normalized T1-weighted (T1w) and T2-weighted sequences as well as magnetization transfer ratio (MTR) sequence acquired at lesion incidence. T1w and MTR signatures were also extracted from images acquired one-year post-incidence. Imaging features were integrated with clinical and demographic data observed at lesion incidence to create statistical prediction models for long-term damage within the lesion. The performance of the T1w and MTR predictions was assessed in two ways: first, the predictive accuracy was measured quantitatively using leave-one-lesion-out cross-validated (CV) mean-squared predictive error. Then, to assess the prediction performance from the perspective of expert clinicians, three board-certified MS clinicians were asked to individually score how similar the CV model-predicted one-year appearance was to the true one-year appearance for a random sample of 100 lesions. The cross-validated root-mean-square predictive error was 0.95 for normalized T1w and 0.064 for MTR, compared to the estimated measurement errors of 0.48 and 0.078 respectively. The three expert raters agreed that T1w and MTR predictions closely resembled the true one-year follow-up appearance of the lesions in both degree and pattern of recovery within lesions. This study demonstrates that by using only information from a single visit at incidence, we can predict how a new lesion will recover using relatively simple statistical techniques. The potential to visualize the likely course of recovery has implications for clinical decision-making, as well as trial enrichment.
Hyperspectral analysis of soil organic matter in coal mining regions using wavelets, correlations, and partial least squares regression.

PubMed

Lin, Lixin; Wang, Yunjia; Teng, Jiyao; Wang, Xuchen

2016-02-01

Hyperspectral estimation of soil organic matter (SOM) in coal mining regions is an important tool for enhancing fertilization in soil restoration programs. The correlation--partial least squares regression (PLSR) method effectively solves the information loss problem of correlation--multiple linear stepwise regression, but results of the correlation analysis must be optimized to improve precision. This study considers the relationship between spectral reflectance and SOM based on spectral reflectance curves of soil samples collected from coal mining regions. Based on the major absorption troughs in the 400-1006 nm spectral range, PLSR analysis was performed using 289 independent bands of the second derivative (SDR) with three levels and measured SOM values. A wavelet-correlation-PLSR (W-C-PLSR) model was then constructed. By amplifying useful information that was previously obscured by noise, the W-C-PLSR model was optimal for estimating SOM content, with smaller prediction errors in both calibration (R(2) = 0.970, root mean square error (RMSEC) = 3.10, and mean relative error (MREC) = 8.75) and validation (RMSEV = 5.85 and MREV = 14.32) analyses, as compared with other models. Results indicate that W-C-PLSR has great potential to estimate SOM in coal mining regions.
Prediction of fat-free body mass from bioelectrical impedance and anthropometry among 3-year-old children using DXA

PubMed Central

Ejlerskov, Katrine T.; Jensen, Signe M.; Christensen, Line B.; Ritz, Christian; Michaelsen, Kim F.; Mølgaard, Christian

2014-01-01

For 3-year-old children suitable methods to estimate body composition are sparse. We aimed to develop predictive equations for estimating fat-free mass (FFM) from bioelectrical impedance (BIA) and anthropometry using dual-energy X-ray absorptiometry (DXA) as reference method using data from 99 healthy 3-year-old Danish children. Predictive equations were derived from two multiple linear regression models, a comprehensive model (height2/resistance (RI), six anthropometric measurements) and a simple model (RI, height, weight). Their uncertainty was quantified by means of 10-fold cross-validation approach. Prediction error of FFM was 3.0% for both equations (root mean square error: 360 and 356 g, respectively). The derived equations produced BIA-based prediction of FFM and FM near DXA scan results. We suggest that the predictive equations can be applied in similar population samples aged 2–4 years. The derived equations may prove useful for studies linking body composition to early risk factors and early onset of obesity. PMID:24463487
Prediction of fat-free body mass from bioelectrical impedance and anthropometry among 3-year-old children using DXA.

PubMed

Ejlerskov, Katrine T; Jensen, Signe M; Christensen, Line B; Ritz, Christian; Michaelsen, Kim F; Mølgaard, Christian

2014-01-27

For 3-year-old children suitable methods to estimate body composition are sparse. We aimed to develop predictive equations for estimating fat-free mass (FFM) from bioelectrical impedance (BIA) and anthropometry using dual-energy X-ray absorptiometry (DXA) as reference method using data from 99 healthy 3-year-old Danish children. Predictive equations were derived from two multiple linear regression models, a comprehensive model (height(2)/resistance (RI), six anthropometric measurements) and a simple model (RI, height, weight). Their uncertainty was quantified by means of 10-fold cross-validation approach. Prediction error of FFM was 3.0% for both equations (root mean square error: 360 and 356 g, respectively). The derived equations produced BIA-based prediction of FFM and FM near DXA scan results. We suggest that the predictive equations can be applied in similar population samples aged 2-4 years. The derived equations may prove useful for studies linking body composition to early risk factors and early onset of obesity.
Direct and simultaneous quantification of tannin mean degree of polymerization and percentage of galloylation in grape seeds using diffuse reflectance fourier transform-infrared spectroscopy.

PubMed

Pappas, Christos; Kyraleou, Maria; Voskidi, Eleni; Kotseridis, Yorgos; Taranilis, Petros A; Kallithraka, Stamatina

2015-02-01

The direct and simultaneous quantitative determination of the mean degree of polymerization (mDP) and the degree of galloylation (%G) in grape seeds were quantified using diffuse reflectance infrared Fourier transform spectroscopy and partial least squares (PLS). The results were compared with those obtained using the conventional analysis employing phloroglucinolysis as pretreatment followed by high performance liquid chromatography-UV and mass spectrometry detection. Infrared spectra were recorded in solid state samples after freeze drying. The 2nd derivative of the 1832 to 1416 and 918 to 739 cm(-1) spectral regions for the quantification of mDP, the 2nd derivative of the 1813 to 607 cm(-1) spectral region for the degree of %G determination and PLS regression were used. The determination coefficients (R(2) ) of mDP and %G were 0.99 and 0.98, respectively. The corresponding values of the root-mean-square error of calibration were found 0.506 and 0.692, the root-mean-square error of cross validation 0.811 and 0.921, and the root-mean-square error of prediction 0.612 and 0.801. The proposed method in comparison with the conventional method is simpler, less time consuming, more economical, and requires reduced quantities of chemical reagents and fewer sample pretreatment steps. It could be a starting point for the design of more specific models according to the requirements of the wineries. © 2015 Institute of Food Technologists®
Rapid discrimination between buffalo and cow milk and detection of adulteration of buffalo milk with cow milk using synchronous fluorescence spectroscopy in combination with multivariate methods.

PubMed

Durakli Velioglu, Serap; Ercioglu, Elif; Boyaci, Ismail Hakki

2017-05-01

This research paper describes the potential of synchronous fluorescence (SF) spectroscopy for authentication of buffalo milk, a favourable raw material in the production of some premium dairy products. Buffalo milk is subjected to fraudulent activities like many other high priced foodstuffs. The current methods widely used for the detection of adulteration of buffalo milk have various disadvantages making them unattractive for routine analysis. Thus, the aim of the present study was to assess the potential of SF spectroscopy in combination with multivariate methods for rapid discrimination between buffalo and cow milk and detection of the adulteration of buffalo milk with cow milk. SF spectra of cow and buffalo milk samples were recorded between 400-550 nm excitation range with Δλ of 10-100 nm, in steps of 10 nm. The data obtained for ∆λ = 10 nm were utilised to classify the samples using principal component analysis (PCA), and detect the adulteration level of buffalo milk with cow milk using partial least square (PLS) methods. Successful discrimination of samples and detection of adulteration of buffalo milk with limit of detection value (LOD) of 6% are achieved with the models having root mean square error of calibration (RMSEC) and the root mean square error of cross-validation (RMSECV) and root mean square error of prediction (RMSEP) values of 2, 7, and 4%, respectively. The results reveal the potential of SF spectroscopy for rapid authentication of buffalo milk.
Quantized kernel least mean square algorithm.

PubMed

Chen, Badong; Zhao, Songlin; Zhu, Pingping; Príncipe, José C

2012-01-01

In this paper, we propose a quantization approach, as an alternative of sparsification, to curb the growth of the radial basis function structure in kernel adaptive filtering. The basic idea behind this method is to quantize and hence compress the input (or feature) space. Different from sparsification, the new approach uses the "redundant" data to update the coefficient of the closest center. In particular, a quantized kernel least mean square (QKLMS) algorithm is developed, which is based on a simple online vector quantization method. The analytical study of the mean square convergence has been carried out. The energy conservation relation for QKLMS is established, and on this basis we arrive at a sufficient condition for mean square convergence, and a lower and upper bound on the theoretical value of the steady-state excess mean square error. Static function estimation and short-term chaotic time-series prediction examples are presented to demonstrate the excellent performance.
Determination of total iron-reactive phenolics, anthocyanins and tannins in wine grapes of skins and seeds based on near-infrared hyperspectral imaging.

PubMed

Zhang, Ni; Liu, Xu; Jin, Xiaoduo; Li, Chen; Wu, Xuan; Yang, Shuqin; Ning, Jifeng; Yanne, Paul

2017-12-15

Phenolics contents in wine grapes are key indicators for assessing ripeness. Near-infrared hyperspectral images during ripening have been explored to achieve an effective method for predicting phenolics contents. Principal component regression (PCR), partial least squares regression (PLSR) and support vector regression (SVR) models were built, respectively. The results show that SVR behaves globally better than PLSR and PCR, except in predicting tannins content of seeds. For the best prediction results, the squared correlation coefficient and root mean square error reached 0.8960 and 0.1069g/L (+)-catechin equivalents (CE), respectively, for tannins in skins, 0.9065 and 0.1776 (g/L CE) for total iron-reactive phenolics (TIRP) in skins, 0.8789 and 0.1442 (g/L M3G) for anthocyanins in skins, 0.9243 and 0.2401 (g/L CE) for tannins in seeds, and 0.8790 and 0.5190 (g/L CE) for TIRP in seeds. Our results indicated that NIR hyperspectral imaging has good prospects for evaluation of phenolics in wine grapes. Copyright © 2017 Elsevier Ltd. All rights reserved.
Two Enhancements of the Logarithmic Least-Squares Method for Analyzing Subjective Comparisons

DTIC Science & Technology

1989-03-25

error term. 1 For this model, the total sum of squares ( SSTO ), defined as n 2 SSTO = E (yi y) i=1 can be partitioned into error and regression sums...of the regression line around the mean value. Mathematically, for the model given by equation A.4, SSTO = SSE + SSR (A.6) A-4 where SSTO is the total...sum of squares (i.e., the variance of the yi’s), SSE is error sum of squares, and SSR is the regression sum of squares. SSTO , SSE, and SSR are given

Visible and Near-Infrared Spectroscopy Analysis of a Polycyclic Aromatic Hydrocarbon in Soils

PubMed Central

Okparanma, Reuben N.; Mouazen, Abdul M.

2013-01-01

Visible and near-infrared (VisNIR) spectroscopy is becoming recognised by soil scientists as a rapid and cost-effective measurement method for hydrocarbons in petroleum-contaminated soils. This study investigated the potential application of VisNIR spectroscopy (350–2500 nm) for the prediction of phenanthrene, a polycyclic aromatic hydrocarbon (PAH), in soils. A total of 150 diesel-contaminated soil samples were used in the investigation. Partial least-squares (PLS) regression analysis with full cross-validation was used to develop models to predict the PAH compound. Results showed that the PAH compound was predicted well with residual prediction deviation of 2.0–2.32, root-mean-square error of prediction of 0.21–0.25 mg kg−1, and coefficient of determination (r 2) of 0.75–0.83. The mechanism of prediction was attributed to covariation of the PAH with clay and soil organic carbon. Overall, the results demonstrated that the methodology may be used for predicting phenanthrene in soils utilizing the interrelationship between clay and soil organic carbon. PMID:24453798
Predictive Behavior of a Computational Foot/Ankle Model through Artificial Neural Networks.

PubMed

Chande, Ruchi D; Hargraves, Rosalyn Hobson; Ortiz-Robinson, Norma; Wayne, Jennifer S

2017-01-01

Computational models are useful tools to study the biomechanics of human joints. Their predictive performance is heavily dependent on bony anatomy and soft tissue properties. Imaging data provides anatomical requirements while approximate tissue properties are implemented from literature data, when available. We sought to improve the predictive capability of a computational foot/ankle model by optimizing its ligament stiffness inputs using feedforward and radial basis function neural networks. While the former demonstrated better performance than the latter per mean square error, both networks provided reasonable stiffness predictions for implementation into the computational model.
Improving probabilistic prediction of daily streamflow by identifying Pareto optimal approaches for modelling heteroscedastic residual errors

NASA Astrophysics Data System (ADS)

David, McInerney; Mark, Thyer; Dmitri, Kavetski; George, Kuczera

2017-04-01

This study provides guidance to hydrological researchers which enables them to provide probabilistic predictions of daily streamflow with the best reliability and precision for different catchment types (e.g. high/low degree of ephemerality). Reliable and precise probabilistic prediction of daily catchment-scale streamflow requires statistical characterization of residual errors of hydrological models. It is commonly known that hydrological model residual errors are heteroscedastic, i.e. there is a pattern of larger errors in higher streamflow predictions. Although multiple approaches exist for representing this heteroscedasticity, few studies have undertaken a comprehensive evaluation and comparison of these approaches. This study fills this research gap by evaluating 8 common residual error schemes, including standard and weighted least squares, the Box-Cox transformation (with fixed and calibrated power parameter, lambda) and the log-sinh transformation. Case studies include 17 perennial and 6 ephemeral catchments in Australia and USA, and two lumped hydrological models. We find the choice of heteroscedastic error modelling approach significantly impacts on predictive performance, though no single scheme simultaneously optimizes all performance metrics. The set of Pareto optimal schemes, reflecting performance trade-offs, comprises Box-Cox schemes with lambda of 0.2 and 0.5, and the log scheme (lambda=0, perennial catchments only). These schemes significantly outperform even the average-performing remaining schemes (e.g., across ephemeral catchments, median precision tightens from 105% to 40% of observed streamflow, and median biases decrease from 25% to 4%). Theoretical interpretations of empirical results highlight the importance of capturing the skew/kurtosis of raw residuals and reproducing zero flows. Recommendations for researchers and practitioners seeking robust residual error schemes for practical work are provided.
Improving probabilistic prediction of daily streamflow by identifying Pareto optimal approaches for modeling heteroscedastic residual errors

NASA Astrophysics Data System (ADS)

McInerney, David; Thyer, Mark; Kavetski, Dmitri; Lerat, Julien; Kuczera, George

2017-03-01

Reliable and precise probabilistic prediction of daily catchment-scale streamflow requires statistical characterization of residual errors of hydrological models. This study focuses on approaches for representing error heteroscedasticity with respect to simulated streamflow, i.e., the pattern of larger errors in higher streamflow predictions. We evaluate eight common residual error schemes, including standard and weighted least squares, the Box-Cox transformation (with fixed and calibrated power parameter λ) and the log-sinh transformation. Case studies include 17 perennial and 6 ephemeral catchments in Australia and the United States, and two lumped hydrological models. Performance is quantified using predictive reliability, precision, and volumetric bias metrics. We find the choice of heteroscedastic error modeling approach significantly impacts on predictive performance, though no single scheme simultaneously optimizes all performance metrics. The set of Pareto optimal schemes, reflecting performance trade-offs, comprises Box-Cox schemes with λ of 0.2 and 0.5, and the log scheme (λ = 0, perennial catchments only). These schemes significantly outperform even the average-performing remaining schemes (e.g., across ephemeral catchments, median precision tightens from 105% to 40% of observed streamflow, and median biases decrease from 25% to 4%). Theoretical interpretations of empirical results highlight the importance of capturing the skew/kurtosis of raw residuals and reproducing zero flows. Paradoxically, calibration of λ is often counterproductive: in perennial catchments, it tends to overfit low flows at the expense of abysmal precision in high flows. The log-sinh transformation is dominated by the simpler Pareto optimal schemes listed above. Recommendations for researchers and practitioners seeking robust residual error schemes for practical work are provided.
Intranasal Pharmacokinetic Data for Triptans Such as Sumatriptan and Zolmitriptan Can Render Area Under the Curve (AUC) Predictions for the Oral Route: Strategy Development and Application.

PubMed

Srinivas, Nuggehally R; Syed, Muzeeb

2016-01-01

Limited pharmacokinetic sampling strategy may be useful for predicting the area under the curve (AUC) for triptans and may have clinical utility as a prospective tool for prediction. Using appropriate intranasal pharmacokinetic data, a Cmax vs. AUC relationship was established by linear regression models for sumatriptan and zolmitriptan. The predictions of the AUC values were performed using published mean/median Cmax data and appropriate regression lines. The quotient of observed and predicted values rendered fold-difference calculation. The mean absolute error (MAE), mean positive error (MPE), mean negative error (MNE), root mean square error (RMSE), correlation coefficient (r), and the goodness of the AUC fold prediction were used to evaluate the two triptans. Also, data from the mean concentration profiles at time points of 1 hour (sumatriptan) and 3 hours (zolmitriptan) were used for the AUC prediction. The Cmax vs. AUC models displayed excellent correlation for both sumatriptan (r = .9997; P < .001) and zolmitriptan (r = .9999; P < .001). Irrespective of the two triptans, the majority of the predicted AUCs (83%-85%) were within 0.76-1.25-fold difference using the regression model. The prediction of AUC values for sumatriptan or zolmitriptan using the concentration data that reflected the Tmax occurrence were in the proximity of the reported values. In summary, the Cmax vs. AUC models exhibited strong correlations for sumatriptan and zolmitriptan. The usefulness of the prediction of the AUC values was established by a rigorous statistical approach.
Application of the hybrid ANFIS models for long term wind power density prediction with extrapolation capability.

PubMed

Hossain, Monowar; Mekhilef, Saad; Afifi, Firdaus; Halabi, Laith M; Olatomiwa, Lanre; Seyedmahmoudian, Mehdi; Horan, Ben; Stojcevski, Alex

2018-01-01

In this paper, the suitability and performance of ANFIS (adaptive neuro-fuzzy inference system), ANFIS-PSO (particle swarm optimization), ANFIS-GA (genetic algorithm) and ANFIS-DE (differential evolution) has been investigated for the prediction of monthly and weekly wind power density (WPD) of four different locations named Mersing, Kuala Terengganu, Pulau Langkawi and Bayan Lepas all in Malaysia. For this aim, standalone ANFIS, ANFIS-PSO, ANFIS-GA and ANFIS-DE prediction algorithm are developed in MATLAB platform. The performance of the proposed hybrid ANFIS models is determined by computing different statistical parameters such as mean absolute bias error (MABE), mean absolute percentage error (MAPE), root mean square error (RMSE) and coefficient of determination (R2). The results obtained from ANFIS-PSO and ANFIS-GA enjoy higher performance and accuracy than other models, and they can be suggested for practical application to predict monthly and weekly mean wind power density. Besides, the capability of the proposed hybrid ANFIS models is examined to predict the wind data for the locations where measured wind data are not available, and the results are compared with the measured wind data from nearby stations.
Application of the hybrid ANFIS models for long term wind power density prediction with extrapolation capability

PubMed Central

Mekhilef, Saad; Afifi, Firdaus; Halabi, Laith M.; Olatomiwa, Lanre; Seyedmahmoudian, Mehdi; Stojcevski, Alex

2018-01-01

In this paper, the suitability and performance of ANFIS (adaptive neuro-fuzzy inference system), ANFIS-PSO (particle swarm optimization), ANFIS-GA (genetic algorithm) and ANFIS-DE (differential evolution) has been investigated for the prediction of monthly and weekly wind power density (WPD) of four different locations named Mersing, Kuala Terengganu, Pulau Langkawi and Bayan Lepas all in Malaysia. For this aim, standalone ANFIS, ANFIS-PSO, ANFIS-GA and ANFIS-DE prediction algorithm are developed in MATLAB platform. The performance of the proposed hybrid ANFIS models is determined by computing different statistical parameters such as mean absolute bias error (MABE), mean absolute percentage error (MAPE), root mean square error (RMSE) and coefficient of determination (R2). The results obtained from ANFIS-PSO and ANFIS-GA enjoy higher performance and accuracy than other models, and they can be suggested for practical application to predict monthly and weekly mean wind power density. Besides, the capability of the proposed hybrid ANFIS models is examined to predict the wind data for the locations where measured wind data are not available, and the results are compared with the measured wind data from nearby stations. PMID:29702645
Measurement of process variables in solid-state fermentation of wheat straw using FT-NIR spectroscopy and synergy interval PLS algorithm.

PubMed

Jiang, Hui; Liu, Guohai; Mei, Congli; Yu, Shuang; Xiao, Xiahong; Ding, Yuhan

2012-11-01

The feasibility of rapid determination of the process variables (i.e. pH and moisture content) in solid-state fermentation (SSF) of wheat straw using Fourier transform near infrared (FT-NIR) spectroscopy was studied. Synergy interval partial least squares (siPLS) algorithm was implemented to calibrate regression model. The number of PLS factors and the number of subintervals were optimized simultaneously by cross-validation. The performance of the prediction model was evaluated according to the root mean square error of cross-validation (RMSECV), the root mean square error of prediction (RMSEP) and the correlation coefficient (R). The measurement results of the optimal model were obtained as follows: RMSECV=0.0776, R(c)=0.9777, RMSEP=0.0963, and R(p)=0.9686 for pH model; RMSECV=1.3544% w/w, R(c)=0.8871, RMSEP=1.4946% w/w, and R(p)=0.8684 for moisture content model. Finally, compared with classic PLS and iPLS models, the siPLS model revealed its superior performance. The overall results demonstrate that FT-NIR spectroscopy combined with siPLS algorithm can be used to measure process variables in solid-state fermentation of wheat straw, and NIR spectroscopy technique has a potential to be utilized in SSF industry. Copyright © 2012 Elsevier B.V. All rights reserved.
Analysis of a Shock-Associated Noise Prediction Model Using Measured Jet Far-Field Noise Data

NASA Technical Reports Server (NTRS)

Dahl, Milo D.; Sharpe, Jacob A.

2014-01-01

A code for predicting supersonic jet broadband shock-associated noise was assessed us- ing a database containing noise measurements of a jet issuing from a convergent nozzle. The jet was operated at 24 conditions covering six fully expanded Mach numbers with four total temperature ratios. To enable comparisons of the predicted shock-associated noise component spectra with data, the measured total jet noise spectra were separated into mixing noise and shock-associated noise component spectra. Comparisons between predicted and measured shock-associated noise component spectra were used to identify de ciencies in the prediction model. Proposed revisions to the model, based on a study of the overall sound pressure levels for the shock-associated noise component of the mea- sured data, a sensitivity analysis of the model parameters with emphasis on the de nition of the convection velocity parameter, and a least-squares t of the predicted to the mea- sured shock-associated noise component spectra, resulted in a new de nition for the source strength spectrum in the model. An error analysis showed that the average error in the predicted spectra was reduced by as much as 3.5 dB for the revised model relative to the average error for the original model.
Prediction of the compression ratio for municipal solid waste using decision tree.

PubMed

Heshmati R, Ali Akbar; Mokhtari, Maryam; Shakiba Rad, Saeed

2014-01-01

The compression ratio of municipal solid waste (MSW) is an essential parameter for evaluation of waste settlement and landfill design. However, no appropriate model has been proposed to estimate the waste compression ratio so far. In this study, a decision tree method was utilized to predict the waste compression ratio (C'c). The tree was constructed using Quinlan's M5 algorithm. A reliable database retrieved from the literature was used to develop a practical model that relates C'c to waste composition and properties, including dry density, dry weight water content, and percentage of biodegradable organic waste using the decision tree method. The performance of the developed model was examined in terms of different statistical criteria, including correlation coefficient, root mean squared error, mean absolute error and mean bias error, recommended by researchers. The obtained results demonstrate that the suggested model is able to evaluate the compression ratio of MSW effectively.
Determination of nutritional parameters of yoghurts by FT Raman spectroscopy

NASA Astrophysics Data System (ADS)

Czaja, Tomasz; Baranowska, Maria; Mazurek, Sylwester; Szostak, Roman

2018-05-01

FT-Raman quantitative analysis of nutritional parameters of yoghurts was performed with the help of partial least squares models. The relative standard errors of prediction for fat, lactose and protein determination in the quantified commercial samples equalled to 3.9, 3.2 and 3.6%, respectively. Models based on attenuated total reflectance spectra of the liquid yoghurt samples and of dried yoghurt films collected with the single reflection diamond accessory showed relative standard errors of prediction values of 1.6-5.0% and 2.7-5.2%, respectively, for the analysed components. Despite a relatively low signal-to-noise ratio in the obtained spectra, Raman spectroscopy, combined with chemometrics, constitutes a fast and powerful tool for macronutrients quantification in yoghurts. Errors received for attenuated total reflectance method were found to be relatively higher than those for Raman spectroscopy due to inhomogeneity of the analysed samples.
[Application of near infrared spectroscopy in rapid and simultaneous determination of essential components in five varieties of anti-tuberculosis tablets].

PubMed

Teng, Le-sheng; Wang, Di; Song, Jia; Zhang, Yi-bo; Guo, Wei-liang; Teng, Li-rong

2008-08-01

Since 1980s, tuberculosis has become increasingly serious. Rifampicin tablets, isoniazide tablets, pyrazinamide tablets, rifampicin and isoniazide tablets and rifampicin isoniazide and pyrazinamide tablets are currently relatively efficacious antituberculosis drugs. In the present paper, near infrared spectroscopy (NIRS) with partial least squares (PLS) was applied to the simultaneous determination of rifampicin (RMP), isoniazide (INH) and pyrazinamide (PZA) contents in 5 varieties of anti-tuberculosis tablets. As the results showed, all of the models for the determination of RMP, INH and PZA contents applied the original NIR spectra. The most efficacious wavelength range for the determination of RMP contents was 1981-2195 nm, it was 1540-1717 nm and 2086-2197 nm for the determination of INH contents, and it was 1460-1537 nm, 1956-2022 nm and 2268-2393 nm for determination of PZA contents. The root mean square error of the calibration set obtained by cross-validation (RMSECV) of the optimum models for the quantitative analysis of RMP, INH and PZA contents was 0.0494, 0.0257 and 0.0307, respectively. Using these optimum models for the determination of RMP, INH and PZA contents in prediction set, the root mean square error of prediction set (RMSEP) was 0.0182, 0.0166 and 0.0134, respectively. The correlation coefficient (r(p)) between the predicted values and actual values was 0.9864, 0.9989 and 0.9993, respectively. These results demonstrated that this method was precise and reliable, and is significative for in situ measurement and the on-line quality control for anti-tuberculosis tablets production.
AKLSQF - LEAST SQUARES CURVE FITTING

NASA Technical Reports Server (NTRS)

Kantak, A. V.

1994-01-01

The Least Squares Curve Fitting program, AKLSQF, computes the polynomial which will least square fit uniformly spaced data easily and efficiently. The program allows the user to specify the tolerable least squares error in the fitting or allows the user to specify the polynomial degree. In both cases AKLSQF returns the polynomial and the actual least squares fit error incurred in the operation. The data may be supplied to the routine either by direct keyboard entry or via a file. AKLSQF produces the least squares polynomial in two steps. First, the data points are least squares fitted using the orthogonal factorial polynomials. The result is then reduced to a regular polynomial using Sterling numbers of the first kind. If an error tolerance is specified, the program starts with a polynomial of degree 1 and computes the least squares fit error. The degree of the polynomial used for fitting is then increased successively until the error criterion specified by the user is met. At every step the polynomial as well as the least squares fitting error is printed to the screen. In general, the program can produce a curve fitting up to a 100 degree polynomial. All computations in the program are carried out under Double Precision format for real numbers and under long integer format for integers to provide the maximum accuracy possible. AKLSQF was written for an IBM PC X/AT or compatible using Microsoft's Quick Basic compiler. It has been implemented under DOS 3.2.1 using 23K of RAM. AKLSQF was developed in 1989.
Prediction of pKa Values for Neutral and Basic Drugs based on Hybrid Artificial Intelligence Methods.

PubMed

Li, Mengshan; Zhang, Huaijing; Chen, Bingsheng; Wu, Yan; Guan, Lixin

2018-03-05

The pKa value of drugs is an important parameter in drug design and pharmacology. In this paper, an improved particle swarm optimization (PSO) algorithm was proposed based on the population entropy diversity. In the improved algorithm, when the population entropy was higher than the set maximum threshold, the convergence strategy was adopted; when the population entropy was lower than the set minimum threshold the divergence strategy was adopted; when the population entropy was between the maximum and minimum threshold, the self-adaptive adjustment strategy was maintained. The improved PSO algorithm was applied in the training of radial basis function artificial neural network (RBF ANN) model and the selection of molecular descriptors. A quantitative structure-activity relationship model based on RBF ANN trained by the improved PSO algorithm was proposed to predict the pKa values of 74 kinds of neutral and basic drugs and then validated by another database containing 20 molecules. The validation results showed that the model had a good prediction performance. The absolute average relative error, root mean square error, and squared correlation coefficient were 0.3105, 0.0411, and 0.9685, respectively. The model can be used as a reference for exploring other quantitative structure-activity relationships.
Road traffic accidents prediction modelling: An analysis of Anambra State, Nigeria.

PubMed

Ihueze, Chukwutoo C; Onwurah, Uchendu O

2018-03-01

One of the major problems in the world today is the rate of road traffic crashes and deaths on our roads. Majority of these deaths occur in low-and-middle income countries including Nigeria. This study analyzed road traffic crashes in Anambra State, Nigeria with the intention of developing accurate predictive models for forecasting crash frequency in the State using autoregressive integrated moving average (ARIMA) and autoregressive integrated moving average with explanatory variables (ARIMAX) modelling techniques. The result showed that ARIMAX model outperformed the ARIMA (1,1,1) model generated when their performances were compared using the lower Bayesian information criterion, mean absolute percentage error, root mean square error; and higher coefficient of determination (R-Squared) as accuracy measures. The findings of this study reveal that incorporating human, vehicle and environmental related factors in time series analysis of crash dataset produces a more robust predictive model than solely using aggregated crash count. This study contributes to the body of knowledge on road traffic safety and provides an approach to forecasting using many human, vehicle and environmental factors. The recommendations made in this study if applied will help in reducing the number of road traffic crashes in Nigeria. Copyright © 2017 Elsevier Ltd. All rights reserved.
[Outlier sample discriminating methods for building calibration model in melons quality detecting using NIR spectra].

PubMed

Tian, Hai-Qing; Wang, Chun-Guang; Zhang, Hai-Jun; Yu, Zhi-Hong; Li, Jian-Kang

2012-11-01

Outlier samples strongly influence the precision of the calibration model in soluble solids content measurement of melons using NIR Spectra. According to the possible sources of outlier samples, three methods (predicted concentration residual test; Chauvenet test; leverage and studentized residual test) were used to discriminate these outliers respectively. Nine suspicious outliers were detected from calibration set which including 85 fruit samples. Considering the 9 suspicious outlier samples maybe contain some no-outlier samples, they were reclaimed to the model one by one to see whether they influence the model and prediction precision or not. In this way, 5 samples which were helpful to the model joined in calibration set again, and a new model was developed with the correlation coefficient (r) 0. 889 and root mean square errors for calibration (RMSEC) 0.6010 Brix. For 35 unknown samples, the root mean square errors prediction (RMSEP) was 0.854 degrees Brix. The performance of this model was more better than that developed with non outlier was eliminated from calibration set (r = 0.797, RMSEC= 0.849 degrees Brix, RMSEP = 1.19 degrees Brix), and more representative and stable with all 9 samples were eliminated from calibration set (r = 0.892, RMSEC = 0.605 degrees Brix, RMSEP = 0.862 degrees).
Statistical variation in progressive scrambling

NASA Astrophysics Data System (ADS)

Clark, Robert D.; Fox, Peter C.

2004-07-01

The two methods most often used to evaluate the robustness and predictivity of partial least squares (PLS) models are cross-validation and response randomization. Both methods may be overly optimistic for data sets that contain redundant observations, however. The kinds of perturbation analysis widely used for evaluating model stability in the context of ordinary least squares regression are only applicable when the descriptors are independent of each other and errors are independent and normally distributed; neither assumption holds for QSAR in general and for PLS in particular. Progressive scrambling is a novel, non-parametric approach to perturbing models in the response space in a way that does not disturb the underlying covariance structure of the data. Here, we introduce adjustments for two of the characteristic values produced by a progressive scrambling analysis - the deprecated predictivity (Q_s^{ast^2}) and standard error of prediction (SDEP s * ) - that correct for the effect of introduced perturbation. We also explore the statistical behavior of the adjusted values (Q_0^{ast^2} and SDEP 0 * ) and the sensitivity to perturbation (d q 2/d r yy ' 2). It is shown that the three statistics are all robust for stable PLS models, in terms of the stochastic component of their determination and of their variation due to sampling effects involved in training set selection.
Direct comparison of low- and mid-frequency Raman spectroscopy for quantitative solid-state pharmaceutical analysis.

PubMed

Lipiäinen, Tiina; Fraser-Miller, Sara J; Gordon, Keith C; Strachan, Clare J

2018-02-05

This study considers the potential of low-frequency (terahertz) Raman spectroscopy in the quantitative analysis of ternary mixtures of solid-state forms. Direct comparison between low-frequency and mid-frequency spectral regions for quantitative analysis of crystal form mixtures, without confounding sampling and instrumental variations, is reported for the first time. Piroxicam was used as a model drug, and the low-frequency spectra of piroxicam forms β, α2 and monohydrate are presented for the first time. These forms show clear spectral differences in both the low- and mid-frequency regions. Both spectral regions provided quantitative models suitable for predicting the mixture compositions using partial least squares regression (PLSR), but the low-frequency data gave better models, based on lower errors of prediction (2.7, 3.1 and 3.2% root-mean-square errors of prediction [RMSEP] values for the β, α2 and monohydrate forms, respectively) than the mid-frequency data (6.3, 5.4 and 4.8%, for the β, α2 and monohydrate forms, respectively). The better performance of low-frequency Raman analysis was attributed to larger spectral differences between the solid-state forms, combined with a higher signal-to-noise ratio. Copyright © 2017 Elsevier B.V. All rights reserved.
Determination of the carmine content based on spectrum fluorescence spectral and PSO-SVM

NASA Astrophysics Data System (ADS)

Wang, Shu-tao; Peng, Tao; Cheng, Qi; Wang, Gui-chuan; Kong, De-ming; Wang, Yu-tian

2018-03-01

Carmine is a widely used food pigment in various food and beverage additives. Excessive consumption of synthetic pigment shall do harm to body seriously. The food is generally associated with a variety of colors. Under the simulation context of various food pigments' coexistence, we adopted the technology of fluorescence spectroscopy, together with the PSO-SVM algorithm, so that to establish a method for the determination of carmine content in mixed solution. After analyzing the prediction results of PSO-SVM, we collected a bunch of data: the carmine average recovery rate was 100.84%, the root mean square error of prediction (RMSEP) for 1.03e-04, 0.999 for the correlation coefficient between the model output and the real value of the forecast. Compared with the prediction results of reverse transmission, the correlation coefficient of PSO-SVM was 2.7% higher, the average recovery rate for 0.6%, and the root mean square error was nearly one order of magnitude lower. According to the analysis results, it can effectively avoid the interference caused by pigment with the combination of the fluorescence spectrum technique and PSO-SVM, accurately determining the content of carmine in mixed solution with an effect better than that of BP.
Determinants of Antibiotic Consumption - Development of a Model using Partial Least Squares Regression based on Data from India.

PubMed

Tamhankar, Ashok J; Karnik, Shreyasee S; Stålsby Lundborg, Cecilia

2018-04-23

Antibiotic resistance, a consequence of antibiotic use, is a threat to health, with severe consequences for resource constrained settings. If determinants for human antibiotic use in India, a lower middle income country, with one of the highest antibiotic consumption in the world could be understood, interventions could be developed, having implications for similar settings. Year wise data for India, for potential determinants and antibiotic consumption, was sourced from publicly available databases for the years 2000-2010. Data was analyzed using Partial Least Squares regression and correlation between determinants and antibiotic consumption was evaluated, formulating 'Predictors' and 'Prediction models'. The 'prediction model' with the statistically most significant predictors (root mean square errors of prediction for train set-377.0 and test set-297.0) formulated from a combination of Health infrastructure + Surface transport infrastructure (HISTI), predicted antibiotic consumption within 95% confidence interval and estimated an antibiotic consumption of 11.6 standard units/person (14.37 billion standard units totally; standard units = number of doses sold in the country; a dose being a pill, capsule, or ampoule) for India for 2014. The HISTI model may become useful in predicting antibiotic consumption for countries/regions having circumstances and data similar to India, but without resources to measure actual data of antibiotic consumption.

Rapid and non-invasive analysis of deoxynivalenol in durum and common wheat by Fourier-Transform Near Infrared (FT-NIR) spectroscopy.

PubMed

De Girolamo, A; Lippolis, V; Nordkvist, E; Visconti, A

2009-06-01

Fourier transform near-infrared spectroscopy (FT-NIR) was used for rapid and non-invasive analysis of deoxynivalenol (DON) in durum and common wheat. The relevance of using ground wheat samples with a homogeneous particle size distribution to minimize measurement variations and avoid DON segregation among particles of different sizes was established. Calibration models for durum wheat, common wheat and durum + common wheat samples, with particle size <500 microm, were obtained by using partial least squares (PLS) regression with an external validation technique. Values of root mean square error of prediction (RMSEP, 306-379 microg kg(-1)) were comparable and not too far from values of root mean square error of cross-validation (RMSECV, 470-555 microg kg(-1)). Coefficients of determination (r(2)) indicated an "approximate to good" level of prediction of the DON content by FT-NIR spectroscopy in the PLS calibration models (r(2) = 0.71-0.83), and a "good" discrimination between low and high DON contents in the PLS validation models (r(2) = 0.58-0.63). A "limited to good" practical utility of the models was ascertained by range error ratio (RER) values higher than 6. A qualitative model, based on 197 calibration samples, was developed to discriminate between blank and naturally contaminated wheat samples by setting a cut-off at 300 microg kg(-1) DON to separate the two classes. The model correctly classified 69% of the 65 validation samples with most misclassified samples (16 of 20) showing DON contamination levels quite close to the cut-off level. These findings suggest that FT-NIR analysis is suitable for the determination of DON in unprocessed wheat at levels far below the maximum permitted limits set by the European Commission.
Development of a transformation model to derive general population-based utility: Mapping the pruritus-visual analog scale (VAS) to the EQ-5D utility.

PubMed

Park, Sun-Young; Park, Eun-Ja; Suh, Hae Sun; Ha, Dongmun; Lee, Eui-Kyung

2017-08-01

Although nonpreference-based disease-specific measures are widely used in clinical studies, they cannot generate utilities for economic evaluation. A solution to this problem is to estimate utilities from disease-specific instruments using the mapping function. This study aimed to develop a transformation model for mapping the pruritus-visual analog scale (VAS) to the EuroQol 5-Dimension 3-Level (EQ-5D-3L) utility index in pruritus. A cross-sectional survey was conducted with a sample (n = 268) drawn from the general population of South Korea. Data were randomly divided into 2 groups, one for estimating and the other for validating mapping models. To select the best model, we developed and compared 3 separate models using demographic information and the pruritus-VAS as independent variables. The predictive performance was assessed using the mean absolute deviation and root mean square error in a separate dataset. Among the 3 models, model 2 using age, age squared, sex, and the pruritus-VAS as independent variables had the best performance based on the goodness of fit and model simplicity, with a log likelihood of 187.13. The 3 models had similar precision errors based on mean absolute deviation and root mean square error in the validation dataset. No statistically significant difference was observed between the mean observed and predicted values in all models. In conclusion, model 2 was chosen as the preferred mapping model. Outcomes measured as the pruritus-VAS can be transformed into the EQ-5D-3L utility index using this mapping model, which makes an economic evaluation possible when only pruritus-VAS data are available. © 2017 John Wiley & Sons, Ltd.
Application of Molecular Dynamics Simulations in Molecular Property Prediction II: Diffusion Coefficient

PubMed Central

Wang, Junmei; Hou, Tingjun

2011-01-01

In this work, we have evaluated how well the General AMBER force field (GAFF) performs in studying the dynamic properties of liquids. Diffusion coefficients (D) have been predicted for 17 solvents, 5 organic compounds in aqueous solutions, 4 proteins in aqueous solutions, and 9 organic compounds in non-aqueous solutions. An efficient sampling strategy has been proposed and tested in the calculation of the diffusion coefficients of solutes in solutions. There are two major findings of this study. First of all, the diffusion coefficients of organic solutes in aqueous solution can be well predicted: the average unsigned error (AUE) and the root-mean-square error (RMSE) are 0.137 and 0.171 ×10−5 cm−2s−1, respectively. Second, although the absolute values of D cannot be predicted, good correlations have been achieved for 8 organic solvents with experimental data (R2 = 0.784), 4 proteins in aqueous solutions (R2 = 0.996) and 9 organic compounds in non-aqueous solutions (R2 = 0.834). The temperature dependent behaviors of three solvents, namely, TIP3P water, dimethyl sulfoxide (DMSO) and cyclohexane have been studied. The major MD settings, such as the sizes of simulation boxes and with/without wrapping the coordinates of MD snapshots into the primary simulation boxes have been explored. We have concluded that our sampling strategy that averaging the mean square displacement (MSD) collected in multiple short-MD simulations is efficient in predicting diffusion coefficients of solutes at infinite dilution. PMID:21953689
Analysis of Lard in Lipstick Formulation Using FTIR Spectroscopy and Multivariate Calibration: A Comparison of Three Extraction Methods.

PubMed

Waskitho, Dri; Lukitaningsih, Endang; Sudjadi; Rohman, Abdul

2016-01-01

Analysis of lard extracted from lipstick formulation containing castor oil has been performed using FTIR spectroscopic method combined with multivariate calibration. Three different extraction methods were compared, namely saponification method followed by liquid/liquid extraction with hexane/dichlorometane/ethanol/water, saponification method followed by liquid/liquid extraction with dichloromethane/ethanol/water, and Bligh & Dyer method using chloroform/methanol/water as extracting solvent. Qualitative and quantitative analysis of lard were performed using principle component (PCA) and partial least square (PLS) analysis, respectively. The results showed that, in all samples prepared by the three extraction methods, PCA was capable of identifying lard at wavelength region of 1200-800 cm -1 with the best result was obtained by Bligh & Dyer method. Furthermore, PLS analysis at the same wavelength region used for qualification showed that Bligh and Dyer was the most suitable extraction method with the highest determination coefficient (R 2 ) and the lowest root mean square error of calibration (RMSEC) as well as root mean square error of prediction (RMSEP) values.
Detection of Glutamic Acid in Oilseed Rape Leaves Using Near Infrared Spectroscopy and the Least Squares-Support Vector Machine

PubMed Central

Bao, Yidan; Kong, Wenwen; Liu, Fei; Qiu, Zhengjun; He, Yong

2012-01-01

Amino acids are quite important indices to indicate the growth status of oilseed rape under herbicide stress. Near infrared (NIR) spectroscopy combined with chemometrics was applied for fast determination of glutamic acid in oilseed rape leaves. The optimal spectral preprocessing method was obtained after comparing Savitzky-Golay smoothing, standard normal variate, multiplicative scatter correction, first and second derivatives, detrending and direct orthogonal signal correction. Linear and nonlinear calibration methods were developed, including partial least squares (PLS) and least squares-support vector machine (LS-SVM). The most effective wavelengths (EWs) were determined by the successive projections algorithm (SPA), and these wavelengths were used as the inputs of PLS and LS-SVM model. The best prediction results were achieved by SPA-LS-SVM (Raw) model with correlation coefficient r = 0.9943 and root mean squares error of prediction (RMSEP) = 0.0569 for prediction set. These results indicated that NIR spectroscopy combined with SPA-LS-SVM was feasible for the fast and effective detection of glutamic acid in oilseed rape leaves. The selected EWs could be used to develop spectral sensors, and the important and basic amino acid data were helpful to study the function mechanism of herbicide. PMID:23203052
Effective thermal conductivity determination for low-density insulating materials

NASA Technical Reports Server (NTRS)

Williams, S. D.; Curry, D. M.

1978-01-01

That nonlinear least squares can be used to determine effective thermal conductivity was demonstrated, and a method for assessing the relative error associated with these predicted values was provided. The differences between dynamic and static determination of effective thermal conductivity of low-density materials that transfer heat by a combination of conduction, convection, and radiation were discussed.
Evaluating and improving the representation of heteroscedastic errors in hydrological models

NASA Astrophysics Data System (ADS)

McInerney, D. J.; Thyer, M. A.; Kavetski, D.; Kuczera, G. A.

2013-12-01

Appropriate representation of residual errors in hydrological modelling is essential for accurate and reliable probabilistic predictions. In particular, residual errors of hydrological models are often heteroscedastic, with large errors associated with high rainfall and runoff events. Recent studies have shown that using a weighted least squares (WLS) approach - where the magnitude of residuals are assumed to be linearly proportional to the magnitude of the flow - captures some of this heteroscedasticity. In this study we explore a range of Bayesian approaches for improving the representation of heteroscedasticity in residual errors. We compare several improved formulations of the WLS approach, the well-known Box-Cox transformation and the more recent log-sinh transformation. Our results confirm that these approaches are able to stabilize the residual error variance, and that it is possible to improve the representation of heteroscedasticity compared with the linear WLS approach. We also find generally good performance of the Box-Cox and log-sinh transformations, although as indicated in earlier publications, the Box-Cox transform sometimes produces unrealistically large prediction limits. Our work explores the trade-offs between these different uncertainty characterization approaches, investigates how their performance varies across diverse catchments and models, and recommends practical approaches suitable for large-scale applications.
Multi-analyte quantification in bioprocesses by Fourier-transform-infrared spectroscopy by partial least squares regression and multivariate curve resolution.

PubMed

Koch, Cosima; Posch, Andreas E; Goicoechea, Héctor C; Herwig, Christoph; Lendl, Bernhard

2014-01-07

This paper presents the quantification of Penicillin V and phenoxyacetic acid, a precursor, inline during Pencillium chrysogenum fermentations by FTIR spectroscopy and partial least squares (PLS) regression and multivariate curve resolution - alternating least squares (MCR-ALS). First, the applicability of an attenuated total reflection FTIR fiber optic probe was assessed offline by measuring standards of the analytes of interest and investigating matrix effects of the fermentation broth. Then measurements were performed inline during four fed-batch fermentations with online HPLC for the determination of Penicillin V and phenoxyacetic acid as reference analysis. PLS and MCR-ALS models were built using these data and validated by comparison of single analyte spectra with the selectivity ratio of the PLS models and the extracted spectral traces of the MCR-ALS models, respectively. The achieved root mean square errors of cross-validation for the PLS regressions were 0.22 g L(-1) for Penicillin V and 0.32 g L(-1) for phenoxyacetic acid and the root mean square errors of prediction for MCR-ALS were 0.23 g L(-1) for Penicillin V and 0.15 g L(-1) for phenoxyacetic acid. A general work-flow for building and assessing chemometric regression models for the quantification of multiple analytes in bioprocesses by FTIR spectroscopy is given. Copyright © 2013 The Authors. Published by Elsevier B.V. All rights reserved.
Prediction of matching condition for a microstrip subsystem using artificial neural network and adaptive neuro-fuzzy inference system

NASA Astrophysics Data System (ADS)

Salehi, Mohammad Reza; Noori, Leila; Abiri, Ebrahim

2016-11-01

In this paper, a subsystem consisting of a microstrip bandpass filter and a microstrip low noise amplifier (LNA) is designed for WLAN applications. The proposed filter has a small implementation area (49 mm2), small insertion loss (0.08 dB) and wide fractional bandwidth (FBW) (61%). To design the proposed LNA, the compact microstrip cells, an field effect transistor, and only a lumped capacitor are used. It has a low supply voltage and a low return loss (-40 dB) at the operation frequency. The matching condition of the proposed subsystem is predicted using subsystem analysis, artificial neural network (ANN) and adaptive neuro-fuzzy inference system (ANFIS). To design the proposed filter, the transmission matrix of the proposed resonator is obtained and analysed. The performance of the proposed ANN and ANFIS models is tested using the numerical data by four performance measures, namely the correlation coefficient (CC), the mean absolute error (MAE), the average percentage error (APE) and the root mean square error (RMSE). The obtained results show that these models are in good agreement with the numerical data, and a small error between the predicted values and numerical solution is obtained.
Study on elevated-temperature flow behavior of Ni-Cr-Mo-B ultra-heavy-plate steel via experiment and modelling

NASA Astrophysics Data System (ADS)

Gao, Zhi-yu; Kang, Yu; Li, Yan-shuai; Meng, Chao; Pan, Tao

2018-04-01

Elevated-temperature flow behavior of a novel Ni-Cr-Mo-B ultra-heavy-plate steel was investigated by conducting hot compressive deformation tests on a Gleeble-3800 thermo-mechanical simulator at a temperature range of 1123 K–1423 K with a strain rate range from 0.01 s‑1 to10 s‑1 and a height reduction of 70%. Based on the experimental results, classic strain-compensated Arrhenius-type, a new revised strain-compensated Arrhenius-type and classic modified Johnson-Cook constitutive models were developed for predicting the high-temperature deformation behavior of the steel. The predictability of these models were comparatively evaluated in terms of statistical parameters including correlation coefficient (R), average absolute relative error (AARE), average root mean square error (RMSE), normalized mean bias error (NMBE) and relative error. The statistical results indicate that the new revised strain-compensated Arrhenius-type model could give prediction of elevated-temperature flow stress for the steel accurately under the entire process conditions. However, the predicted values by the classic modified Johnson-Cook model could not agree well with the experimental values, and the classic strain-compensated Arrhenius-type model could track the deformation behavior more accurately compared with the modified Johnson-Cook model, but less accurately with the new revised strain-compensated Arrhenius-type model. In addition, reasons of differences in predictability of these models were discussed in detail.
Evaluation of procedures for prediction of unconventional gas in the presence of geologic trends

USGS Publications Warehouse

Attanasi, E.D.; Coburn, T.C.

2009-01-01

This study extends the application of local spatial nonparametric prediction models to the estimation of recoverable gas volumes in continuous-type gas plays to regimes where there is a single geologic trend. A transformation is presented, originally proposed by Tomczak, that offsets the distortions caused by the trend. This article reports on numerical experiments that compare predictive and classification performance of the local nonparametric prediction models based on the transformation with models based on Euclidean distance. The transformation offers improvement in average root mean square error when the trend is not severely misspecified. Because of the local nature of the models, even those based on Euclidean distance in the presence of trends are reasonably robust. The tests based on other model performance metrics such as prediction error associated with the high-grade tracts and the ability of the models to identify sites with the largest gas volumes also demonstrate the robustness of both local modeling approaches. ?? International Association for Mathematical Geology 2009.
Seasonal prediction skill of winter temperature over North India

NASA Astrophysics Data System (ADS)

Tiwari, P. R.; Kar, S. C.; Mohanty, U. C.; Dey, S.; Kumari, S.; Sinha, P.

2016-04-01

The climatology, amplitude error, phase error, and mean square skill score (MSSS) of temperature predictions from five different state-of-the-art general circulation models (GCMs) have been examined for the winter (December-January-February) seasons over North India. In this region, temperature variability affects the phenological development processes of wheat crops and the grain yield. The GCM forecasts of temperature for a whole season issued in November from various organizations are compared with observed gridded temperature data obtained from the India Meteorological Department (IMD) for the period 1982-2009. The MSSS indicates that the models have skills of varying degrees. Predictions of maximum and minimum temperature obtained from the National Centers for Environmental Prediction (NCEP) climate forecast system model (NCEP_CFSv2) are compared with station level observations from the Snow and Avalanche Study Establishment (SASE). It has been found that when the model temperatures are corrected to account the bias in the model and actual orography, the predictions are able to delineate the observed trend compared to the trend without orography correction.
Determination of rice syrup adulterant concentration in honey using three-dimensional fluorescence spectra and multivariate calibrations

NASA Astrophysics Data System (ADS)

Chen, Quansheng; Qi, Shuai; Li, Huanhuan; Han, Xiaoyan; Ouyang, Qin; Zhao, Jiewen

2014-10-01

To rapidly and efficiently detect the presence of adulterants in honey, three-dimensional fluorescence spectroscopy (3DFS) technique was employed with the help of multivariate calibration. The data of 3D fluorescence spectra were compressed using characteristic extraction and the principal component analysis (PCA). Then, partial least squares (PLS) and back propagation neural network (BP-ANN) algorithms were used for modeling. The model was optimized by cross validation, and its performance was evaluated according to root mean square error of prediction (RMSEP) and correlation coefficient (R) in prediction set. The results showed that BP-ANN model was superior to PLS models, and the optimum prediction results of the mixed group (sunflower ± longan ± buckwheat ± rape) model were achieved as follow: RMSEP = 0.0235 and R = 0.9787 in the prediction set. The study demonstrated that the 3D fluorescence spectroscopy technique combined with multivariate calibration has high potential in rapid, nondestructive, and accurate quantitative analysis of honey adulteration.
Application of partial least squares near-infrared spectral classification in diabetic identification

NASA Astrophysics Data System (ADS)

Yan, Wen-juan; Yang, Ming; He, Guo-quan; Qin, Lin; Li, Gang

2014-11-01

In order to identify the diabetic patients by using tongue near-infrared (NIR) spectrum - a spectral classification model of the NIR reflectivity of the tongue tip is proposed, based on the partial least square (PLS) method. 39sample data of tongue tip's NIR spectra are harvested from healthy people and diabetic patients , respectively. After pretreatment of the reflectivity, the spectral data are set as the independent variable matrix, and information of classification as the dependent variables matrix, Samples were divided into two groups - i.e. 53 samples as calibration set and 25 as prediction set - then the PLS is used to build the classification model The constructed modelfrom the 53 samples has the correlation of 0.9614 and the root mean square error of cross-validation (RMSECV) of 0.1387.The predictions for the 25 samples have the correlation of 0.9146 and the RMSECV of 0.2122.The experimental result shows that the PLS method can achieve good classification on features of healthy people and diabetic patients.
Soil sail content estimation in the yellow river delta with satellite hyperspectral data

USGS Publications Warehouse

Weng, Yongling; Gong, Peng; Zhu, Zhi-Liang

2008-01-01

Soil salinization is one of the most common land degradation processes and is a severe environmental hazard. The primary objective of this study is to investigate the potential of predicting salt content in soils with hyperspectral data acquired with EO-1 Hyperion. Both partial least-squares regression (PLSR) and conventional multiple linear regression (MLR), such as stepwise regression (SWR), were tested as the prediction model. PLSR is commonly used to overcome the problem caused by high-dimensional and correlated predictors. Chemical analysis of 95 samples collected from the top layer of soils in the Yellow River delta area shows that salt content was high on average, and the dominant chemicals in the saline soil were NaCl and MgCl2. Multivariate models were established between soil contents and hyperspectral data. Our results indicate that the PLSR technique with laboratory spectral data has a strong prediction capacity. Spectral bands at 1487-1527, 1971-1991, 2032-2092, and 2163-2355 nm possessed large absolute values of regression coefficients, with the largest coefficient at 2203 nm. We obtained a root mean squared error (RMSE) for calibration (with 61 samples) of RMSEC = 0.753 (R2 = 0.893) and a root mean squared error for validation (with 30 samples) of RMSEV = 0.574. The prediction model was applied on a pixel-by-pixel basis to a Hyperion reflectance image to yield a quantitative surface distribution map of soil salt content. The result was validated successfully from 38 sampling points. We obtained an RMSE estimate of 1.037 (R2 = 0.784) for the soil salt content map derived by the PLSR model. The salinity map derived from the SWR model shows that the predicted value is higher than the true value. These results demonstrate that the PLSR method is a more suitable technique than stepwise regression for quantitative estimation of soil salt content in a large area. ?? 2008 CASI.
[Main Components of Xinjiang Lavender Essential Oil Determined by Partial Least Squares and Near Infrared Spectroscopy].

PubMed

Liao, Xiang; Wang, Qing; Fu, Ji-hong; Tang, Jun

2015-09-01

This work was undertaken to establish a quantitative analysis model which can rapid determinate the content of linalool, linalyl acetate of Xinjiang lavender essential oil. Totally 165 lavender essential oil samples were measured by using near infrared absorption spectrum (NIR), after analyzing the near infrared spectral absorption peaks of all samples, lavender essential oil have abundant chemical information and the interference of random noise may be relatively low on the spectral intervals of 7100~4500 cm(-1). Thus, the PLS models was constructed by using this interval for further analysis. 8 abnormal samples were eliminated. Through the clustering method, 157 lavender essential oil samples were divided into 105 calibration set samples and 52 validation set samples. Gas chromatography mass spectrometry (GC-MS) was used as a tool to determine the content of linalool and linalyl acetate in lavender essential oil. Then the matrix was established with the GC-MS raw data of two compounds in combination with the original NIR data. In order to optimize the model, different pretreatment methods were used to preprocess the raw NIR spectral to contrast the spectral filtering effect, after analysizing the quantitative model results of linalool and linalyl acetate, the root mean square error prediction (RMSEP) of orthogonal signal transformation (OSC) was 0.226, 0.558, spectrally, it was the optimum pretreatment method. In addition, forward interval partial least squares (FiPLS) method was used to exclude the wavelength points which has nothing to do with determination composition or present nonlinear correlation, finally 8 spectral intervals totally 160 wavelength points were obtained as the dataset. Combining the data sets which have optimized by OSC-FiPLS with partial least squares (PLS) to establish a rapid quantitative analysis model for determining the content of linalool and linalyl acetate in Xinjiang lavender essential oil, numbers of hidden variables of two components were 8 in the model. The performance of the model was evaluated according to root mean square error of cross-validation (RMSECV), root mean square error of prediction (RMSEP). In the model, RESECV of linalool and linalyl acetate were 0.170 and 0.416, respectively; RM-SEP were 0.188 and 0.364. The results indicated that raw data was pretreated by OSC and FiPLS, the NIR-PLS quantitative analysis model with good robustness, high measurement precision; it could quickly determine the content of linalool and linalyl acetate in lavender essential oil. In addition, the model has a favorable prediction ability. The study also provide a new effective method which could rapid quantitative analysis the major components of Xinjiang lavender essential oil.
Comparison of three-dimensional fluorescence analysis methods for predicting formation of trihalomethanes and haloacetic acids.

PubMed

Peleato, Nicolás M; Andrews, Robert C

2015-01-01

This work investigated the application of several fluorescence excitation-emission matrix analysis methods as natural organic matter (NOM) indicators for use in predicting the formation of trihalomethanes (THMs) and haloacetic acids (HAAs). Waters from four different sources (two rivers and two lakes) were subjected to jar testing followed by 24hr disinfection by-product formation tests using chlorine. NOM was quantified using three common measures: dissolved organic carbon, ultraviolet absorbance at 254 nm, and specific ultraviolet absorbance as well as by principal component analysis, peak picking, and parallel factor analysis of fluorescence spectra. Based on multi-linear modeling of THMs and HAAs, principle component (PC) scores resulted in the lowest mean squared prediction error of cross-folded test sets (THMs: 43.7 (μg/L)(2), HAAs: 233.3 (μg/L)(2)). Inclusion of principle components representative of protein-like material significantly decreased prediction error for both THMs and HAAs. Parallel factor analysis did not identify a protein-like component and resulted in prediction errors similar to traditional NOM surrogates as well as fluorescence peak picking. These results support the value of fluorescence excitation-emission matrix-principal component analysis as a suitable NOM indicator in predicting the formation of THMs and HAAs for the water sources studied. Copyright © 2014. Published by Elsevier B.V.
Feasibility of predicting tumor motion using online data acquired during treatment and a generalized neural network optimized with offline patient tumor trajectories.

PubMed

Teo, Troy P; Ahmed, Syed Bilal; Kawalec, Philip; Alayoubi, Nadia; Bruce, Neil; Lyn, Ethan; Pistorius, Stephen

2018-02-01

The accurate prediction of intrafraction lung tumor motion is required to compensate for system latency in image-guided adaptive radiotherapy systems. The goal of this study was to identify an optimal prediction model that has a short learning period so that prediction and adaptation can commence soon after treatment begins, and requires minimal reoptimization for individual patients. Specifically, the feasibility of predicting tumor position using a combination of a generalized (i.e., averaged) neural network, optimized using historical patient data (i.e., tumor trajectories) obtained offline, coupled with the use of real-time online tumor positions (obtained during treatment delivery) was examined. A 3-layer perceptron neural network was implemented to predict tumor motion for a prediction horizon of 650 ms. A backpropagation algorithm and batch gradient descent approach were used to train the model. Twenty-seven 1-min lung tumor motion samples (selected from a CyberKnife patient dataset) were sampled at a rate of 7.5 Hz (0.133 s) to emulate the frame rate of an electronic portal imaging device (EPID). A sliding temporal window was used to sample the data for learning. The sliding window length was set to be equivalent to the first breathing cycle detected from each trajectory. Performing a parametric sweep, an averaged error surface of mean square errors (MSE) was obtained from the prediction responses of seven trajectories used for the training of the model (Group 1). An optimal input data size and number of hidden neurons were selected to represent the generalized model. To evaluate the prediction performance of the generalized model on unseen data, twenty tumor traces (Group 2) that were not involved in the training of the model were used for the leave-one-out cross-validation purposes. An input data size of 35 samples (4.6 s) and 20 hidden neurons were selected for the generalized neural network. An average sliding window length of 28 data samples was used. The average initial learning period prior to the availability of the first predicted tumor position was 8.53 ± 1.03 s. Average mean absolute error (MAE) of 0.59 ± 0.13 mm and 0.56 ± 0.18 mm were obtained from Groups 1 and 2, respectively, giving an overall MAE of 0.57 ± 0.17 mm. Average root-mean-square-error (RMSE) of 0.67 ± 0.36 for all the traces (0.76 ± 0.34 mm, Group 1 and 0.63 ± 0.36 mm, Group 2), is comparable to previously published results. Prediction errors are mainly due to the irregular periodicities between cycles. Since the errors from Groups 1 and 2 are within the same range, it demonstrates that this model can generalize and predict on unseen data. This is a first attempt to use an averaged MSE error surface (obtained from the prediction of different patients' tumor trajectories) to determine the parameters of a generalized neural network. This network could be deployed as a plug-and-play predictor for tumor trajectory during treatment delivery, eliminating the need for optimizing individual networks with pretreatment patient data. © 2017 American Association of Physicists in Medicine.
An analysis of the least-squares problem for the DSN systematic pointing error model

NASA Technical Reports Server (NTRS)

Alvarez, L. S.

1991-01-01

A systematic pointing error model is used to calibrate antennas in the Deep Space Network. The least squares problem is described and analyzed along with the solution methods used to determine the model's parameters. Specifically studied are the rank degeneracy problems resulting from beam pointing error measurement sets that incorporate inadequate sky coverage. A least squares parameter subset selection method is described and its applicability to the systematic error modeling process is demonstrated on Voyager 2 measurement distribution.
Orbit Determination of KOMPSAT-1 and Cryosat-2 Satellites Using Optical Wide-field Patrol Network (OWL-Net) Data with Batch Least Squares Filter

NASA Astrophysics Data System (ADS)

Lee, Eunji; Park, Sang-Young; Shin, Bumjoon; Cho, Sungki; Choi, Eun-Jung; Jo, Junghyun; Park, Jang-Hyun

2017-03-01

The optical wide-field patrol network (OWL-Net) is a Korean optical surveillance system that tracks and monitors domestic satellites. In this study, a batch least squares algorithm was developed for optical measurements and verified by Monte Carlo simulation and covariance analysis. Potential error sources of OWL-Net, such as noise, bias, and clock errors, were analyzed. There is a linear relation between the estimation accuracy and the noise level, and the accuracy significantly depends on the declination bias. In addition, the time-tagging error significantly degrades the observation accuracy, while the time-synchronization offset corresponds to the orbital motion. The Cartesian state vector and measurement bias were determined using the OWL-Net tracking data of the KOMPSAT-1 and Cryosat-2 satellites. The comparison with known orbital information based on two-line elements (TLE) and the consolidated prediction format (CPF) shows that the orbit determination accuracy is similar to that of TLE. Furthermore, the precision and accuracy of OWL-Net observation data were determined to be tens of arcsec and sub-degree level, respectively.

Estimation of the daily global solar radiation based on the Gaussian process regression methodology in the Saharan climate

NASA Astrophysics Data System (ADS)

Guermoui, Mawloud; Gairaa, Kacem; Rabehi, Abdelaziz; Djafer, Djelloul; Benkaciali, Said

2018-06-01

Accurate estimation of solar radiation is the major concern in renewable energy applications. Over the past few years, a lot of machine learning paradigms have been proposed in order to improve the estimation performances, mostly based on artificial neural networks, fuzzy logic, support vector machine and adaptive neuro-fuzzy inference system. The aim of this work is the prediction of the daily global solar radiation, received on a horizontal surface through the Gaussian process regression (GPR) methodology. A case study of Ghardaïa region (Algeria) has been used in order to validate the above methodology. In fact, several combinations have been tested; it was found that, GPR-model based on sunshine duration, minimum air temperature and relative humidity gives the best results in term of mean absolute bias error (MBE), root mean square error (RMSE), relative mean square error (rRMSE), and correlation coefficient ( r) . The obtained values of these indicators are 0.67 MJ/m2, 1.15 MJ/m2, 5.2%, and 98.42%, respectively.
Modeling of surface dust concentrations using neural networks and kriging

NASA Astrophysics Data System (ADS)

Buevich, Alexander G.; Medvedev, Alexander N.; Sergeev, Alexander P.; Tarasov, Dmitry A.; Shichkin, Andrey V.; Sergeeva, Marina V.; Atanasova, T. B.

2016-12-01

Creating models which are able to accurately predict the distribution of pollutants based on a limited set of input data is an important task in environmental studies. In the paper two neural approaches: (multilayer perceptron (MLP)) and generalized regression neural network (GRNN)), and two geostatistical approaches: (kriging and cokriging), are using for modeling and forecasting of dust concentrations in snow cover. The area of study is under the influence of dust emissions from a copper quarry and a several industrial companies. The comparison of two mentioned approaches is conducted. Three indices are used as the indicators of the models accuracy: the mean absolute error (MAE), root mean square error (RMSE) and relative root mean square error (RRMSE). Models based on artificial neural networks (ANN) have shown better accuracy. When considering all indices, the most precision model was the GRNN, which uses as input parameters for modeling the coordinates of sampling points and the distance to the probable emissions source. The results of work confirm that trained ANN may be more suitable tool for modeling of dust concentrations in snow cover.
August Median Streamflow on Ungaged Streams in Eastern Aroostook County, Maine

USGS Publications Warehouse

Lombard, Pamela J.; Tasker, Gary D.; Nielsen, Martha G.

2003-01-01

Methods for estimating August median streamflow were developed for ungaged, unregulated streams in the eastern part of Aroostook County, Maine, with drainage areas from 0.38 to 43 square miles and mean basin elevations from 437 to 1,024 feet. Few long-term, continuous-record streamflow-gaging stations with small drainage areas were available from which to develop the equations; therefore, 24 partial-record gaging stations were established in this investigation. A mathematical technique for estimating a standard low-flow statistic, August median streamflow, at partial-record stations was applied by relating base-flow measurements at these stations to concurrent daily flows at nearby long-term, continuous-record streamflow- gaging stations (index stations). Generalized least-squares regression analysis (GLS) was used to relate estimates of August median streamflow at gaging stations to basin characteristics at these same stations to develop equations that can be applied to estimate August median streamflow on ungaged streams. GLS accounts for varying periods of record at the gaging stations and the cross correlation of concurrent streamflows among gaging stations. Twenty-three partial-record stations and one continuous-record station were used for the final regression equations. The basin characteristics of drainage area and mean basin elevation are used in the calculated regression equation for ungaged streams to estimate August median flow. The equation has an average standard error of prediction from -38 to 62 percent. A one-variable equation uses only drainage area to estimate August median streamflow when less accuracy is acceptable. This equation has an average standard error of prediction from -40 to 67 percent. Model error is larger than sampling error for both equations, indicating that additional basin characteristics could be important to improved estimates of low-flow statistics. Weighted estimates of August median streamflow, which can be used when making estimates at partial-record or continuous-record gaging stations, range from 0.03 to 11.7 cubic feet per second or from 0.1 to 0.4 cubic feet per second per square mile. Estimates of August median streamflow on ungaged streams in the eastern part of Aroostook County, within the range of acceptable explanatory variables, range from 0.03 to 30 cubic feet per second or 0.1 to 0.7 cubic feet per second per square mile. Estimates of August median streamflow per square mile of drainage area generally increase as mean elevation and drainage area increase.
Comparison of in vitro and in situ methods in evaluation of forage digestibility in ruminants.

PubMed

Krizsan, S J; Nyholm, L; Nousiainen, J; Südekum, K-H; Huhtanen, P

2012-09-01

The objective of this study was to compare the application of different in vitro and in situ methods in empirical and mechanistic predictions of in vivo OM digestibility (OMD) and their associations to near-infrared reflectance spectroscopy spectra for a variety of forages. Apparent in vivo OMD of silages made from alfalfa (n = 2), corn (n = 9), corn stover (n = 2), grass (n = 11), whole crops of wheat and barley (n = 8) and red clover (n = 7), and fresh alfalfa (n = 1), grass hays (n = 5), and wheat straws (n = 5) had previously been determined in sheep. Concentrations of indigestible NDF (iNDF) in all forage samples were determined by a 288-h ruminal in situ incubation. Gas production of isolated forage NDF was measured by in vitro incubations for 72 h. In vitro pepsin-cellulase OM solubility (OMS) of the forages was determined by a 2-step gravimetric digestion method. Samples were also subjected to a 2-step determination of in vitro OMD based on buffered rumen fluid and pepsin. Further, rumen fluid digestible OM was determined from a single 96-h incubation at 38°C. Digestibility of OM from the in situ and the in vitro incubations was calculated according to published empirical equations, which were either forage specific or general (1 equation for all forages) within method. Indigestible NDF was also used in a mechanistic model to predict OMD. Predictions of OMD were evaluated by residual analysis using the GLM procedure in SAS. In vitro OMS in a general prediction equation of OMD did not display a significant forage-type effect on the residuals (observed - predicted OMD; P = 0.10). Predictions of OMD within forage types were consistent between iNDF and the 2-step in vitro method based on rumen fluid. Root mean square error of OMD was least (0.032) when the prediction was based on a general forage equation of OMS. However, regenerating a simple regression for iNDF by omitting alfalfa and wheat straw reduced the root mean square error of OMD to 0.025. Indigestible NDF in a general forage equation predicted OMD without any bias (P ≥ 0.16), and root mean square error of prediction was smallest among all methods when alfalfa and wheat straw samples were excluded. Our study suggests that compared with the in vitro laboratory methods, iNDF used in forage-specific equations will improve overall predictions of forage in vivo OMD. The in vitro and in situ methods performed equally well in calibrations of iNDF or OMD by near-infrared reflectance spectroscopy.
Window-Based Channel Impulse Response Prediction for Time-Varying Ultra-Wideband Channels.

PubMed

Al-Samman, A M; Azmi, M H; Rahman, T A; Khan, I; Hindia, M N; Fattouh, A

2016-01-01

This work proposes channel impulse response (CIR) prediction for time-varying ultra-wideband (UWB) channels by exploiting the fast movement of channel taps within delay bins. Considering the sparsity of UWB channels, we introduce a window-based CIR (WB-CIR) to approximate the high temporal resolutions of UWB channels. A recursive least square (RLS) algorithm is adopted to predict the time evolution of the WB-CIR. For predicting the future WB-CIR tap of window wk, three RLS filter coefficients are computed from the observed WB-CIRs of the left wk-1, the current wk and the right wk+1 windows. The filter coefficient with the lowest RLS error is used to predict the future WB-CIR tap. To evaluate our proposed prediction method, UWB CIRs are collected through measurement campaigns in outdoor environments considering line-of-sight (LOS) and non-line-of-sight (NLOS) scenarios. Under similar computational complexity, our proposed method provides an improvement in prediction errors of approximately 80% for LOS and 63% for NLOS scenarios compared with a conventional method.
Window-Based Channel Impulse Response Prediction for Time-Varying Ultra-Wideband Channels

PubMed Central

Al-Samman, A. M.; Azmi, M. H.; Rahman, T. A.; Khan, I.; Hindia, M. N.; Fattouh, A.

2016-01-01

This work proposes channel impulse response (CIR) prediction for time-varying ultra-wideband (UWB) channels by exploiting the fast movement of channel taps within delay bins. Considering the sparsity of UWB channels, we introduce a window-based CIR (WB-CIR) to approximate the high temporal resolutions of UWB channels. A recursive least square (RLS) algorithm is adopted to predict the time evolution of the WB-CIR. For predicting the future WB-CIR tap of window wk, three RLS filter coefficients are computed from the observed WB-CIRs of the left wk−1, the current wk and the right wk+1 windows. The filter coefficient with the lowest RLS error is used to predict the future WB-CIR tap. To evaluate our proposed prediction method, UWB CIRs are collected through measurement campaigns in outdoor environments considering line-of-sight (LOS) and non-line-of-sight (NLOS) scenarios. Under similar computational complexity, our proposed method provides an improvement in prediction errors of approximately 80% for LOS and 63% for NLOS scenarios compared with a conventional method. PMID:27992445
Offline modeling for product quality prediction of mineral processing using modeling error PDF shaping and entropy minimization.

PubMed

Ding, Jinliang; Chai, Tianyou; Wang, Hong

2011-03-01

This paper presents a novel offline modeling for product quality prediction of mineral processing which consists of a number of unit processes in series. The prediction of the product quality of the whole mineral process (i.e., the mixed concentrate grade) plays an important role and the establishment of its predictive model is a key issue for the plantwide optimization. For this purpose, a hybrid modeling approach of the mixed concentrate grade prediction is proposed, which consists of a linear model and a nonlinear model. The least-squares support vector machine is adopted to establish the nonlinear model. The inputs of the predictive model are the performance indices of each unit process, while the output is the mixed concentrate grade. In this paper, the model parameter selection is transformed into the shape control of the probability density function (PDF) of the modeling error. In this context, both the PDF-control-based and minimum-entropy-based model parameter selection approaches are proposed. Indeed, this is the first time that the PDF shape control idea is used to deal with system modeling, where the key idea is to turn model parameters so that either the modeling error PDF is controlled to follow a target PDF or the modeling error entropy is minimized. The experimental results using the real plant data and the comparison of the two approaches are discussed. The results show the effectiveness of the proposed approaches.
Comparison of updates to the Molly cow model to predict methane production from dairy cows fed pasture.

PubMed

Gregorini, P; Beukes, P C; Hanigan, M D; Waghorn, G; Muetzel, S; McNamara, J P

2013-08-01

Molly is a deterministic, mechanistic, dynamic model representing the digestion, metabolism, and production of a dairy cow. This study compared the predictions of enteric methane production from the original version of Molly (MollyOrigin) and 2 new versions of Molly. Updated versions included new ruminal fiber digestive parameters and animal hormonal parameters (Molly84) and a revised version of digestive and ruminal parameters (Molly85), using 3 different ruminal volatile fatty acid (VFA) stoichiometry constructs to describe the VFA pattern and methane (CH4) production (g of CH4/d). The VFA stoichiometry constructs were the original forage and mixed-diet VFA constructs and a new VFA stoichiometry based on a more recent and larger set of data that includes lactate and valerate production, amylolytic and cellulolytic bacteria, as well as protozoal pools. The models' outputs were challenged using data from 16 dairy cattle 26 mo old [standard error of the mean (SEM)=1.7], 82 (SEM=8.7) d in milk, producing 17 (SEM=0.2) kg of milk/d, and fed fresh-cut ryegrass [dry matter intake=12.3 (SEM=0.3) kg of DM/d] in respiration chambers. Mean observed CH4 production was 266±5.6 SEM (g/d). Mean predicted values for CH4 production were 287 and 258 g/d for MollyOrigin without and with the new VFA construct. Model Molly84 predicted 295 and 288 g of CH4/d with and without the new VFA settings. Model Molly85 predicted the same CH4 production (276 g/d) with or without the new VFA construct. The incorporation of the new VFA construct did not consistently reduce the low prediction error across the versions of Molly evaluated in the present study. The improvements in the Molly versions from MollyOrigin to Molly84 to Molly85 resulted in a decrease in mean square prediction error from 8.6 to 8.3 to 4.3% using the forage diet setting. The majority of the mean square prediction error was apportioned to random bias (e.g., 43, 65, and 70% in MollyOrigin, Molly84, and Molly85, respectively, on the forage setting, showing that with the updated versions a greater proportion of error was random). The slope bias was less than 2% in all cases. We concluded that, of the versions of Molly used for pastoral systems, Molly85 has the capability to predict CH4 production from grass-fed dairy cows with the highest accuracy. Copyright © 2013 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Real-time prediction and gating of respiratory motion using an extended Kalman filter and Gaussian process regression

NASA Astrophysics Data System (ADS)

Bukhari, W.; Hong, S.-M.

2015-01-01

Motion-adaptive radiotherapy aims to deliver a conformal dose to the target tumour with minimal normal tissue exposure by compensating for tumour motion in real time. The prediction as well as the gating of respiratory motion have received much attention over the last two decades for reducing the targeting error of the treatment beam due to respiratory motion. In this article, we present a real-time algorithm for predicting and gating respiratory motion that utilizes a model-based and a model-free Bayesian framework by combining them in a cascade structure. The algorithm, named EKF-GPR+, implements a gating function without pre-specifying a particular region of the patient’s breathing cycle. The algorithm first employs an extended Kalman filter (LCM-EKF) to predict the respiratory motion and then uses a model-free Gaussian process regression (GPR) to correct the error of the LCM-EKF prediction. The GPR is a non-parametric Bayesian algorithm that yields predictive variance under Gaussian assumptions. The EKF-GPR+ algorithm utilizes the predictive variance from the GPR component to capture the uncertainty in the LCM-EKF prediction error and systematically identify breathing points with a higher probability of large prediction error in advance. This identification allows us to pause the treatment beam over such instances. EKF-GPR+ implements the gating function by using simple calculations based on the predictive variance with no additional detection mechanism. A sparse approximation of the GPR algorithm is employed to realize EKF-GPR+ in real time. Extensive numerical experiments are performed based on a large database of 304 respiratory motion traces to evaluate EKF-GPR+. The experimental results show that the EKF-GPR+ algorithm effectively reduces the prediction error in a root-mean-square (RMS) sense by employing the gating function, albeit at the cost of a reduced duty cycle. As an example, EKF-GPR+ reduces the patient-wise RMS error to 37%, 39% and 42% in percent ratios relative to no prediction for a duty cycle of 80% at lookahead lengths of 192 ms, 384 ms and 576 ms, respectively. The experiments also confirm that EKF-GPR+ controls the duty cycle with reasonable accuracy.
Prediction skill of rainstorm events over India in the TIGGE weather prediction models

NASA Astrophysics Data System (ADS)

Karuna Sagar, S.; Rajeevan, M.; Vijaya Bhaskara Rao, S.; Mitra, A. K.

2017-12-01

Extreme rainfall events pose a serious threat of leading to severe floods in many countries worldwide. Therefore, advance prediction of its occurrence and spatial distribution is very essential. In this paper, an analysis has been made to assess the skill of numerical weather prediction models in predicting rainstorms over India. Using gridded daily rainfall data set and objective criteria, 15 rainstorms were identified during the monsoon season (June to September). The analysis was made using three TIGGE (THe Observing System Research and Predictability Experiment (THORPEX) Interactive Grand Global Ensemble) models. The models considered are the European Centre for Medium-Range Weather Forecasts (ECMWF), National Centre for Environmental Prediction (NCEP) and the UK Met Office (UKMO). Verification of the TIGGE models for 43 observed rainstorm days from 15 rainstorm events has been made for the period 2007-2015. The comparison reveals that rainstorm events are predictable up to 5 days in advance, however with a bias in spatial distribution and intensity. The statistical parameters like mean error (ME) or Bias, root mean square error (RMSE) and correlation coefficient (CC) have been computed over the rainstorm region using the multi-model ensemble (MME) mean. The study reveals that the spread is large in ECMWF and UKMO followed by the NCEP model. Though the ensemble spread is quite small in NCEP, the ensemble member averages are not well predicted. The rank histograms suggest that the forecasts are under prediction. The modified Contiguous Rain Area (CRA) technique was used to verify the spatial as well as the quantitative skill of the TIGGE models. Overall, the contribution from the displacement and pattern errors to the total RMSE is found to be more in magnitude. The volume error increases from 24 hr forecast to 48 hr forecast in all the three models.
A fast hybrid algorithm combining regularized motion tracking and predictive search for reducing the occurrence of large displacement errors.

PubMed

Jiang, Jingfeng; Hall, Timothy J

2011-04-01

A hybrid approach that inherits both the robustness of the regularized motion tracking approach and the efficiency of the predictive search approach is reported. The basic idea is to use regularized speckle tracking to obtain high-quality seeds in an explorative search that can be used in the subsequent intelligent predictive search. The performance of the hybrid speckle-tracking algorithm was compared with three published speckle-tracking methods using in vivo breast lesion data. We found that the hybrid algorithm provided higher displacement quality metric values, lower root mean squared errors compared with a locally smoothed displacement field, and higher improvement ratios compared with the classic block-matching algorithm. On the basis of these comparisons, we concluded that the hybrid method can further enhance the accuracy of speckle tracking compared with its real-time counterparts, at the expense of slightly higher computational demands. © 2011 IEEE
Modeling and predicting historical volatility in exchange rate markets

NASA Astrophysics Data System (ADS)

Lahmiri, Salim

2017-04-01

Volatility modeling and forecasting of currency exchange rate is an important task in several business risk management tasks; including treasury risk management, derivatives pricing, and portfolio risk evaluation. The purpose of this study is to present a simple and effective approach for predicting historical volatility of currency exchange rate. The approach is based on a limited set of technical indicators as inputs to the artificial neural networks (ANN). To show the effectiveness of the proposed approach, it was applied to forecast US/Canada and US/Euro exchange rates volatilities. The forecasting results show that our simple approach outperformed the conventional GARCH and EGARCH with different distribution assumptions, and also the hybrid GARCH and EGARCH with ANN in terms of mean absolute error, mean of squared errors, and Theil's inequality coefficient. Because of the simplicity and effectiveness of the approach, it is promising for US currency volatility prediction tasks.
Response Surface Analysis of Experiments with Random Blocks

DTIC Science & Technology

1988-09-01

partitioned into a lack of fit sum of squares, SSLOF, and a pure error sum of squares, SSPE . The latter is obtained by pooling the pure error sums of squares...from the blocks. Tests concerning the polynomial effects can then proceed using SSPE as the error term in the denominators of the F test statistics. 3.2...the center point in each of the three blocks is equal to SSPE = 2.0127 with 5 degrees of freedom. Hence, the lack of fit sum of squares is SSLoF
Least-Squares Curve-Fitting Program

NASA Technical Reports Server (NTRS)

Kantak, Anil V.

1990-01-01

Least Squares Curve Fitting program, AKLSQF, easily and efficiently computes polynomial providing least-squares best fit to uniformly spaced data. Enables user to specify tolerable least-squares error in fit or degree of polynomial. AKLSQF returns polynomial and actual least-squares-fit error incurred in operation. Data supplied to routine either by direct keyboard entry or via file. Written for an IBM PC X/AT or compatible using Microsoft's Quick Basic compiler.
A novel auto-tuning PID control mechanism for nonlinear systems.

PubMed

Cetin, Meric; Iplikci, Serdar

2015-09-01

In this paper, a novel Runge-Kutta (RK) discretization-based model-predictive auto-tuning proportional-integral-derivative controller (RK-PID) is introduced for the control of continuous-time nonlinear systems. The parameters of the PID controller are tuned using RK model of the system through prediction error-square minimization where the predicted information of tracking error provides an enhanced tuning of the parameters. Based on the model-predictive control (MPC) approach, the proposed mechanism provides necessary PID parameter adaptations while generating additive correction terms to assist the initially inadequate PID controller. Efficiency of the proposed mechanism has been tested on two experimental real-time systems: an unstable single-input single-output (SISO) nonlinear magnetic-levitation system and a nonlinear multi-input multi-output (MIMO) liquid-level system. RK-PID has been compared to standard PID, standard nonlinear MPC (NMPC), RK-MPC and conventional sliding-mode control (SMC) methods in terms of control performance, robustness, computational complexity and design issue. The proposed mechanism exhibits acceptable tuning and control performance with very small steady-state tracking errors, and provides very short settling time for parameter convergence. Copyright © 2015 ISA. Published by Elsevier Ltd. All rights reserved.
Best of both worlds: combining pharma data and state of the art modeling technology to improve in Silico pKa prediction.

PubMed

Fraczkiewicz, Robert; Lobell, Mario; Göller, Andreas H; Krenz, Ursula; Schoenneis, Rolf; Clark, Robert D; Hillisch, Alexander

2015-02-23

In a unique collaboration between a software company and a pharmaceutical company, we were able to develop a new in silico pKa prediction tool with outstanding prediction quality. An existing pKa prediction method from Simulations Plus based on artificial neural network ensembles (ANNE), microstates analysis, and literature data was retrained with a large homogeneous data set of drug-like molecules from Bayer. The new model was thus built with curated sets of ∼14,000 literature pKa values (∼11,000 compounds, representing literature chemical space) and ∼19,500 pKa values experimentally determined at Bayer Pharma (∼16,000 compounds, representing industry chemical space). Model validation was performed with several test sets consisting of a total of ∼31,000 new pKa values measured at Bayer. For the largest and most difficult test set with >16,000 pKa values that were not used for training, the original model achieved a mean absolute error (MAE) of 0.72, root-mean-square error (RMSE) of 0.94, and squared correlation coefficient (R(2)) of 0.87. The new model achieves significantly improved prediction statistics, with MAE = 0.50, RMSE = 0.67, and R(2) = 0.93. It is commercially available as part of the Simulations Plus ADMET Predictor release 7.0. Good predictions are only of value when delivered effectively to those who can use them. The new pKa prediction model has been integrated into Pipeline Pilot and the PharmacophorInformatics (PIx) platform used by scientists at Bayer Pharma. Different output formats allow customized application by medicinal chemists, physical chemists, and computational chemists.
Documentation of a spreadsheet for time-series analysis and drawdown estimation

USGS Publications Warehouse

Halford, Keith J.

2006-01-01

Drawdowns during aquifer tests can be obscured by barometric pressure changes, earth tides, regional pumping, and recharge events in the water-level record. These stresses can create water-level fluctuations that should be removed from observed water levels prior to estimating drawdowns. Simple models have been developed for estimating unpumped water levels during aquifer tests that are referred to as synthetic water levels. These models sum multiple time series such as barometric pressure, tidal potential, and background water levels to simulate non-pumping water levels. The amplitude and phase of each time series are adjusted so that synthetic water levels match measured water levels during periods unaffected by an aquifer test. Differences between synthetic and measured water levels are minimized with a sum-of-squares objective function. Root-mean-square errors during fitting and prediction periods were compared multiple times at four geographically diverse sites. Prediction error equaled fitting error when fitting periods were greater than or equal to four times prediction periods. The proposed drawdown estimation approach has been implemented in a spreadsheet application. Measured time series are independent so that collection frequencies can differ and sampling times can be asynchronous. Time series can be viewed selectively and magnified easily. Fitting and prediction periods can be defined graphically or entered directly. Synthetic water levels for each observation well are created with earth tides, measured time series, moving averages of time series, and differences between measured and moving averages of time series. Selected series and fitting parameters for synthetic water levels are stored and drawdowns are estimated for prediction periods. Drawdowns can be viewed independently and adjusted visually if an anomaly skews initial drawdowns away from 0. The number of observations in a drawdown time series can be reduced by averaging across user-defined periods. Raw or reduced drawdown estimates can be copied from the spreadsheet application or written to tab-delimited ASCII files.
Data assimilation for groundwater flow modelling using Unbiased Ensemble Square Root Filter: Case study in Guantao, North China Plain

NASA Astrophysics Data System (ADS)

Li, N.; Kinzelbach, W.; Li, H.; Li, W.; Chen, F.; Wang, L.

2017-12-01

Data assimilation techniques are widely used in hydrology to improve the reliability of hydrological models and to reduce model predictive uncertainties. This provides critical information for decision makers in water resources management. This study aims to evaluate a data assimilation system for the Guantao groundwater flow model coupled with a one-dimensional soil column simulation (Hydrus 1D) using an Unbiased Ensemble Square Root Filter (UnEnSRF) originating from the Ensemble Kalman Filter (EnKF) to update parameters and states, separately or simultaneously. To simplify the coupling between unsaturated and saturated zone, a linear relationship obtained from analyzing inputs to and outputs from Hydrus 1D is applied in the data assimilation process. Unlike EnKF, the UnEnSRF updates parameter ensemble mean and ensemble perturbations separately. In order to keep the ensemble filter working well during the data assimilation, two factors are introduced in the study. One is called damping factor to dampen the update amplitude of the posterior ensemble mean to avoid nonrealistic values. The other is called inflation factor to relax the posterior ensemble perturbations close to prior to avoid filter inbreeding problems. The sensitivities of the two factors are studied and their favorable values for the Guantao model are determined. The appropriate observation error and ensemble size were also determined to facilitate the further analysis. This study demonstrated that the data assimilation of both model parameters and states gives a smaller model prediction error but with larger uncertainty while the data assimilation of only model states provides a smaller predictive uncertainty but with a larger model prediction error. Data assimilation in a groundwater flow model will improve model prediction and at the same time make the model converge to the true parameters, which provides a successful base for applications in real time modelling or real time controlling strategies in groundwater resources management.
[Discrimination of types of polyacrylamide based on near infrared spectroscopy coupled with least square support vector machine].

PubMed

Zhang, Hong-Guang; Yang, Qin-Min; Lu, Jian-Gang

2014-04-01

In this paper, a novel discriminant methodology based on near infrared spectroscopic analysis technique and least square support vector machine was proposed for rapid and nondestructive discrimination of different types of Polyacrylamide. The diffuse reflectance spectra of samples of Non-ionic Polyacrylamide, Anionic Polyacrylamide and Cationic Polyacrylamide were measured. Then principal component analysis method was applied to reduce the dimension of the spectral data and extract of the principal compnents. The first three principal components were used for cluster analysis of the three different types of Polyacrylamide. Then those principal components were also used as inputs of least square support vector machine model. The optimization of the parameters and the number of principal components used as inputs of least square support vector machine model was performed through cross validation based on grid search. 60 samples of each type of Polyacrylamide were collected. Thus a total of 180 samples were obtained. 135 samples, 45 samples for each type of Polyacrylamide, were randomly split into a training set to build calibration model and the rest 45 samples were used as test set to evaluate the performance of the developed model. In addition, 5 Cationic Polyacrylamide samples and 5 Anionic Polyacrylamide samples adulterated with different proportion of Non-ionic Polyacrylamide were also prepared to show the feasibilty of the proposed method to discriminate the adulterated Polyacrylamide samples. The prediction error threshold for each type of Polyacrylamide was determined by F statistical significance test method based on the prediction error of the training set of corresponding type of Polyacrylamide in cross validation. The discrimination accuracy of the built model was 100% for prediction of the test set. The prediction of the model for the 10 mixing samples was also presented, and all mixing samples were accurately discriminated as adulterated samples. The overall results demonstrate that the discrimination method proposed in the present paper can rapidly and nondestructively discriminate the different types of Polyacrylamide and the adulterated Polyacrylamide samples, and offered a new approach to discriminate the types of Polyacrylamide.
Criterion for evaluating the predictive ability of nonlinear regression models without cross-validation.

PubMed

Kaneko, Hiromasa; Funatsu, Kimito

2013-09-23

We propose predictive performance criteria for nonlinear regression models without cross-validation. The proposed criteria are the determination coefficient and the root-mean-square error for the midpoints between k-nearest-neighbor data points. These criteria can be used to evaluate predictive ability after the regression models are updated, whereas cross-validation cannot be performed in such a situation. The proposed method is effective and helpful in handling big data when cross-validation cannot be applied. By analyzing data from numerical simulations and quantitative structural relationships, we confirm that the proposed criteria enable the predictive ability of the nonlinear regression models to be appropriately quantified.

Simulation of large particle transport near the surface under stable conditions: comparison with the Hanford tracer experiments

NASA Astrophysics Data System (ADS)

Kim, Eugene; Larson, Timothy

A plume model is presented describing the downwind transport of large particles (1-100 μm) under stable conditions. The model includes both vertical variations in wind speed and turbulence intensity as well as an algorithm for particle deposition at the surface. Model predictions compare favorably with the Hanford single and dual tracer experiments of crosswind integrated concentration (for particles: relative bias=-0.02 and 0.16, normalized mean square error=0.61 and 0.14, for the single and dual tracer experiments, respectively), whereas the US EPA's fugitive dust model consistently overestimates the observed concentrations at downwind distances beyond several hundred meters (for particles: relative bias=0.31 and 2.26, mean square error=0.42 and 1.71, respectively). For either plume model, the measured ratio of particle to gas concentration is consistently overestimated when using the deposition velocity algorithm of Sehmel and Hodgson (1978. DOE Report PNL-SA-6721, Pacific Northwest Laboratories, Richland, WA). In contrast, these same ratios are predicted with relatively little bias when using the algorithm of Kim et al. (2000. Atmospheric Environment 34 (15), 2387-2397).
Partial Least Squares Regression Calibration of an Ultraviolet-Visible Spectrophotometer for Measurements of Chemical Oxygen Demand in Dye Wastewater

NASA Astrophysics Data System (ADS)

Mai, W.; Zhang, J.-F.; Zhao, X.-M.; Li, Z.; Xu, Z.-W.

2017-11-01

Wastewater from the dye industry is typically analyzed using a standard method for measurement of chemical oxygen demand (COD) or by a single-wavelength spectroscopic method. To overcome the disadvantages of these methods, ultraviolet-visible (UV-Vis) spectroscopy was combined with principal component regression (PCR) and partial least squares regression (PLSR) in this study. Unlike the standard method, this method does not require digestion of the samples for preparation. Experiments showed that the PLSR model offered high prediction performance for COD, with a mean relative error of about 5% for two dyes. This error is similar to that obtained with the standard method. In this study, the precision of the PLSR model decreased with the number of dye compounds present. It is likely that multiple models will be required in reality, and the complexity of a COD monitoring system would be greatly reduced if the PLSR model is used because it can include several dyes. UV-Vis spectroscopy with PLSR successfully enhanced the performance of COD prediction for dye wastewater and showed good potential for application in on-line water quality monitoring.
Prediction of future uniform milk prices in Florida federal milk marketing order 6 from milk futures markets.

PubMed

De Vries, A; Feleke, S

2008-12-01

This study assessed the accuracy of 3 methods that predict the uniform milk price in Federal Milk Marketing Order 6 (Florida). Predictions were made for 1 to 12 mo into the future. Data were from January 2003 to May 2007. The CURRENT method assumed that future uniform milk prices were equal to the last announced uniform milk price. The F+BASIS and F+UTIL methods were based on the milk futures markets because the futures prices reflect the market's expectation of the class III and class IV cash prices that are announced monthly by USDA. The F+BASIS method added an exponentially weighted moving average of the difference between the class III cash price and the historical uniform milk price (also known as basis) to the class III futures price. The F+UTIL method used the class III and class IV futures prices, the most recently announced butter price, and historical utilizations to predict the skim milk prices, butterfat prices, and utilizations in all 4 classes. Predictions of future utilizations were made with a Holt-Winters smoothing method. Federal Milk Marketing Order 6 had high class I utilization (85 +/- 4.8%). Mean and standard deviation of the class III and class IV cash prices were $13.39 +/- 2.40/cwt (1 cwt = 45.36 kg) and $12.06 +/- 1.80/cwt, respectively. The actual uniform price in Tampa, Florida, was $16.62 +/- 2.16/cwt. The basis was $3.23 +/- 1.23/cwt. The F+BASIS and F+UTIL predictions were generally too low during the period considered because the class III cash prices were greater than the corresponding class III futures prices. For the 1- to 6-mo-ahead predictions, the root of the mean squared prediction errors from the F+BASIS method were $1.12, $1.20, $1.55, $1.91, $2.16, and $2.34/cwt, respectively. The root of the mean squared prediction errors ranged from $2.50 to $2.73/cwt for predictions up to 12 mo ahead. Results from the F+UTIL method were similar. The accuracies of the F+BASIS and F+UTIL methods for all 12 fore-cast horizons were not significantly different. Application of the modified Mariano-Diebold tests showed that no method included all the information contained in the other methods. In conclusion, both F+BASIS and F+UTIL methods tended to more accurately predict the future uniform milk prices than the CURRENT method, but prediction errors could be substantial even a few months into the future. The majority of the prediction error was caused by the inefficiency of the futures markets to predict the class III cash prices.
Toward autonomous spacecraft

NASA Technical Reports Server (NTRS)

Fogel, L. J.; Calabrese, P. G.; Walsh, M. J.; Owens, A. J.

1982-01-01

Ways in which autonomous behavior of spacecraft can be extended to treat situations wherein a closed loop control by a human may not be appropriate or even possible are explored. Predictive models that minimize mean least squared error and arbitrary cost functions are discussed. A methodology for extracting cyclic components for an arbitrary environment with respect to usual and arbitrary criteria is developed. An approach to prediction and control based on evolutionary programming is outlined. A computer program capable of predicting time series is presented. A design of a control system for a robotic dense with partially unknown physical properties is presented.
Physical characteristics cannot be used to predict cooling time using cold-water immersion as a treatment for exertional hyperthermia.

PubMed

Poirier, Martin P; Notley, Sean R; Flouris, Andreas D; Kenny, Glen P

2018-03-12

We examined if physical characteristics could be used to predict cooling time during cold water immersion (CWI, 2°C) following exertional hyperthermia (rectal temperature ≥39.5°C) in a physically heterogeneous group of men and women (n=62). Lean body mass was the only significant predictor of cooling time following CWI (R²=0.137; P<0.001); however that prediction did not provide the precision (mean residual square error: 3.18±2.28 min) required to act as a safe alternative to rectal temperature measurements.
Augmented classical least squares multivariate spectral analysis

DOEpatents

Haaland, David M.; Melgaard, David K.

2004-02-03

A method of multivariate spectral analysis, termed augmented classical least squares (ACLS), provides an improved CLS calibration model when unmodeled sources of spectral variation are contained in a calibration sample set. The ACLS methods use information derived from component or spectral residuals during the CLS calibration to provide an improved calibration-augmented CLS model. The ACLS methods are based on CLS so that they retain the qualitative benefits of CLS, yet they have the flexibility of PLS and other hybrid techniques in that they can define a prediction model even with unmodeled sources of spectral variation that are not explicitly included in the calibration model. The unmodeled sources of spectral variation may be unknown constituents, constituents with unknown concentrations, nonlinear responses, non-uniform and correlated errors, or other sources of spectral variation that are present in the calibration sample spectra. Also, since the various ACLS methods are based on CLS, they can incorporate the new prediction-augmented CLS (PACLS) method of updating the prediction model for new sources of spectral variation contained in the prediction sample set without having to return to the calibration process. The ACLS methods can also be applied to alternating least squares models. The ACLS methods can be applied to all types of multivariate data.
Augmented Classical Least Squares Multivariate Spectral Analysis

DOEpatents

Haaland, David M.; Melgaard, David K.

2005-07-26

A method of multivariate spectral analysis, termed augmented classical least squares (ACLS), provides an improved CLS calibration model when unmodeled sources of spectral variation are contained in a calibration sample set. The ACLS methods use information derived from component or spectral residuals during the CLS calibration to provide an improved calibration-augmented CLS model. The ACLS methods are based on CLS so that they retain the qualitative benefits of CLS, yet they have the flexibility of PLS and other hybrid techniques in that they can define a prediction model even with unmodeled sources of spectral variation that are not explicitly included in the calibration model. The unmodeled sources of spectral variation may be unknown constituents, constituents with unknown concentrations, nonlinear responses, non-uniform and correlated errors, or other sources of spectral variation that are present in the calibration sample spectra. Also, since the various ACLS methods are based on CLS, they can incorporate the new prediction-augmented CLS (PACLS) method of updating the prediction model for new sources of spectral variation contained in the prediction sample set without having to return to the calibration process. The ACLS methods can also be applied to alternating least squares models. The ACLS methods can be applied to all types of multivariate data.
Augmented Classical Least Squares Multivariate Spectral Analysis

DOEpatents

Haaland, David M.; Melgaard, David K.

2005-01-11

A method of multivariate spectral analysis, termed augmented classical least squares (ACLS), provides an improved CLS calibration model when unmodeled sources of spectral variation are contained in a calibration sample set. The ACLS methods use information derived from component or spectral residuals during the CLS calibration to provide an improved calibration-augmented CLS model. The ACLS methods are based on CLS so that they retain the qualitative benefits of CLS, yet they have the flexibility of PLS and other hybrid techniques in that they can define a prediction model even with unmodeled sources of spectral variation that are not explicitly included in the calibration model. The unmodeled sources of spectral variation may be unknown constituents, constituents with unknown concentrations, nonlinear responses, non-uniform and correlated errors, or other sources of spectral variation that are present in the calibration sample spectra. Also, since the various ACLS methods are based on CLS, they can incorporate the new prediction-augmented CLS (PACLS) method of updating the prediction model for new sources of spectral variation contained in the prediction sample set without having to return to the calibration process. The ACLS methods can also be applied to alternating least squares models. The ACLS methods can be applied to all types of multivariate data.
Error quantification of abnormal extreme high waves in Operational Oceanographic System in Korea

NASA Astrophysics Data System (ADS)

Jeong, Sang-Hun; Kim, Jinah; Heo, Ki-Young; Park, Kwang-Soon

2017-04-01

In winter season, large-height swell-like waves have occurred on the East coast of Korea, causing property damages and loss of human life. It is known that those waves are generated by a local strong wind made by temperate cyclone moving to eastward in the East Sea of Korean peninsula. Because the waves are often occurred in the clear weather, in particular, the damages are to be maximized. Therefore, it is necessary to predict and forecast large-height swell-like waves to prevent and correspond to the coastal damages. In Korea, an operational oceanographic system (KOOS) has been developed by the Korea institute of ocean science and technology (KIOST) and KOOS provides daily basis 72-hours' ocean forecasts such as wind, water elevation, sea currents, water temperature, salinity, and waves which are computed from not only meteorological and hydrodynamic model (WRF, ROMS, MOM, and MOHID) but also wave models (WW-III and SWAN). In order to evaluate the model performance and guarantee a certain level of accuracy of ocean forecasts, a Skill Assessment (SA) system was established as a one of module in KOOS. It has been performed through comparison of model results with in-situ observation data and model errors have been quantified with skill scores. Statistics which are used in skill assessment are including a measure of both errors and correlations such as root-mean-square-error (RMSE), root-mean-square-error percentage (RMSE%), mean bias (MB), correlation coefficient (R), scatter index (SI), circular correlation (CC) and central frequency (CF) that is a frequency with which errors lie within acceptable error criteria. It should be utilized simultaneously not only to quantify an error but also to improve an accuracy of forecasts by providing a feedback interactively. However, in an abnormal phenomena such as high-height swell-like waves in the East coast of Korea, it requires more advanced and optimized error quantification method that allows to predict the abnormal waves well and to improve the accuracy of forecasts by supporting modification of physics and numeric on numerical models through sensitivity test. In this study, we proposed an appropriate method of error quantification especially on abnormal high waves which are occurred by local weather condition. Furthermore, we introduced that how the quantification errors are contributed to improve wind-wave modeling by applying data assimilation and utilizing reanalysis data.
Quantitative analysis of essential oils in perfume using multivariate curve resolution combined with comprehensive two-dimensional gas chromatography.

PubMed

de Godoy, Luiz Antonio Fonseca; Hantao, Leandro Wang; Pedroso, Marcio Pozzobon; Poppi, Ronei Jesus; Augusto, Fabio

2011-08-05

The use of multivariate curve resolution (MCR) to build multivariate quantitative models using data obtained from comprehensive two-dimensional gas chromatography with flame ionization detection (GC×GC-FID) is presented and evaluated. The MCR algorithm presents some important features, such as second order advantage and the recovery of the instrumental response for each pure component after optimization by an alternating least squares (ALS) procedure. A model to quantify the essential oil of rosemary was built using a calibration set containing only known concentrations of the essential oil and cereal alcohol as solvent. A calibration curve correlating the concentration of the essential oil of rosemary and the instrumental response obtained from the MCR-ALS algorithm was obtained, and this calibration model was applied to predict the concentration of the oil in complex samples (mixtures of the essential oil, pineapple essence and commercial perfume). The values of the root mean square error of prediction (RMSEP) and of the root mean square error of the percentage deviation (RMSPD) obtained were 0.4% (v/v) and 7.2%, respectively. Additionally, a second model was built and used to evaluate the accuracy of the method. A model to quantify the essential oil of lemon grass was built and its concentration was predicted in the validation set and real perfume samples. The RMSEP and RMSPD obtained were 0.5% (v/v) and 6.9%, respectively, and the concentration of the essential oil of lemon grass in perfume agreed to the value informed by the manufacturer. The result indicates that the MCR algorithm is adequate to resolve the target chromatogram from the complex sample and to build multivariate models of GC×GC-FID data. Copyright © 2011 Elsevier B.V. All rights reserved.
Rapid investigation of α-glucosidase inhibitory activity of Phaleria macrocarpa extracts using FTIR-ATR based fingerprinting.

PubMed

Easmin, Sabina; Sarker, Md Zaidul Islam; Ghafoor, Kashif; Ferdosh, Sahena; Jaffri, Juliana; Ali, Md Eaqub; Mirhosseini, Hamed; Al-Juhaimi, Fahad Y; Perumal, Vikneswari; Khatib, Alfi

2017-04-01

Phaleria macrocarpa, known as "Mahkota Dewa", is a widely used medicinal plant in Malaysia. This study focused on the characterization of α-glucosidase inhibitory activity of P. macrocarpa extracts using Fourier transform infrared spectroscopy (FTIR)-based metabolomics. P. macrocarpa and its extracts contain thousands of compounds having synergistic effect. Generally, their variability exists, and there are many active components in meager amounts. Thus, the conventional measurement methods of a single component for the quality control are time consuming, laborious, expensive, and unreliable. It is of great interest to develop a rapid prediction method for herbal quality control to investigate the α-glucosidase inhibitory activity of P. macrocarpa by multicomponent analyses. In this study, a rapid and simple analytical method was developed using FTIR spectroscopy-based fingerprinting. A total of 36 extracts of different ethanol concentrations were prepared and tested on inhibitory potential and fingerprinted using FTIR spectroscopy, coupled with chemometrics of orthogonal partial least square (OPLS) at the 4000-400 cm -1 frequency region and resolution of 4 cm -1 . The OPLS model generated the highest regression coefficient with R 2 Y = 0.98 and Q 2 Y = 0.70, lowest root mean square error estimation = 17.17, and root mean square error of cross validation = 57.29. A five-component (1+4+0) predictive model was build up to correlate FTIR spectra with activity, and the responsible functional groups, such as -CH, -NH, -COOH, and -OH, were identified for the bioactivity. A successful multivariate model was constructed using FTIR-attenuated total reflection as a simple and rapid technique to predict the inhibitory activity. Copyright © 2016. Published by Elsevier B.V.
Application of FTIR-ATR spectroscopy coupled with multivariate analysis for rapid estimation of butter adulteration.

PubMed

Fadzlillah, Nurrulhidayah Ahmad; Rohman, Abdul; Ismail, Amin; Mustafa, Shuhaimi; Khatib, Alfi

2013-01-01

In dairy product sector, butter is one of the potential sources of fat soluble vitamins, namely vitamin A, D, E, K; consequently, butter is taken into account as high valuable price from other dairy products. This fact has attracted unscrupulous market players to blind butter with other animal fats to gain economic profit. Animal fats like mutton fat (MF) are potential to be mixed with butter due to the similarity in terms of fatty acid composition. This study focused on the application of FTIR-ATR spectroscopy in conjunction with chemometrics for classification and quantification of MF as adulterant in butter. The FTIR spectral region of 3910-710 cm⁻¹ was used for classification between butter and butter blended with MF at various concentrations with the aid of discriminant analysis (DA). DA is able to classify butter and adulterated butter without any mistakenly grouped. For quantitative analysis, partial least square (PLS) regression was used to develop a calibration model at the frequency regions of 3910-710 cm⁻¹. The equation obtained for the relationship between actual value of MF and FTIR predicted values of MF in PLS calibration model was y = 0.998x + 1.033, with the values of coefficient of determination (R²) and root mean square error of calibration are 0.998 and 0.046% (v/v), respectively. The PLS calibration model was subsequently used for the prediction of independent samples containing butter in the binary mixtures with MF. Using 9 principal components, root mean square error of prediction (RMSEP) is 1.68% (v/v). The results showed that FTIR spectroscopy can be used for the classification and quantification of MF in butter formulation for verification purposes.
Alternative approaches to predicting methane emissions from dairy cows.

PubMed

Mills, J A N; Kebreab, E; Yates, C M; Crompton, L A; Cammell, S B; Dhanoa, M S; Agnew, R E; France, J

2003-12-01

Previous attempts to apply statistical models, which correlate nutrient intake with methane production, have been of limited value where predictions are obtained for nutrient intakes and diet types outside those used in model construction. Dynamic mechanistic models have proved more suitable for extrapolation, but they remain computationally expensive and are not applied easily in practical situations. The first objective of this research focused on employing conventional techniques to generate statistical models of methane production appropriate to United Kingdom dairy systems. The second objective was to evaluate these models and a model published previously using both United Kingdom and North American data sets. Thirdly, nonlinear models were considered as alternatives to the conventional linear regressions. The United Kingdom calorimetry data used to construct the linear models also were used to develop the three nonlinear alternatives that were all of modified Mitscherlich (monomolecular) form. Of the linear models tested, an equation from the literature proved most reliable across the full range of evaluation data (root mean square prediction error = 21.3%). However, the Mitscherlich models demonstrated the greatest degree of adaptability across diet types and intake level. The most successful model for simulating the independent data was a modified Mitscherlich equation with the steepness parameter set to represent dietary starch-to-ADF ratio (root mean square prediction error = 20.6%). However, when such data were unavailable, simpler Mitscherlich forms relating dry matter or metabolizable energy intake to methane production remained better alternatives relative to their linear counterparts.
Direct Quantification of Cd2+ in the Presence of Cu2+ by a Combination of Anodic Stripping Voltammetry Using a Bi-Film-Modified Glassy Carbon Electrode and an Artificial Neural Network.

PubMed

Zhao, Guo; Wang, Hui; Liu, Gang

2017-07-03

Abstract : In this study, a novel method based on a Bi/glassy carbon electrode (Bi/GCE) for quantitatively and directly detecting Cd 2+ in the presence of Cu 2+ without further electrode modifications by combining square-wave anodic stripping voltammetry (SWASV) and a back-propagation artificial neural network (BP-ANN) has been proposed. The influence of the Cu 2+ concentration on the stripping response to Cd 2+ was studied. In addition, the effect of the ferrocyanide concentration on the SWASV detection of Cd 2+ in the presence of Cu 2+ was investigated. A BP-ANN with two inputs and one output was used to establish the nonlinear relationship between the concentration of Cd 2+ and the stripping peak currents of Cu 2+ and Cd 2+ . The factors affecting the SWASV detection of Cd 2+ and the key parameters of the BP-ANN were optimized. Moreover, the direct calibration model (i.e., adding 0.1 mM ferrocyanide before detection), the BP-ANN model and other prediction models were compared to verify the prediction performance of these models in terms of their mean absolute errors (MAEs), root mean square errors (RMSEs) and correlation coefficients. The BP-ANN model exhibited higher prediction accuracy than the direct calibration model and the other prediction models. Finally, the proposed method was used to detect Cd 2+ in soil samples with satisfactory results.
Global optimization method based on ray tracing to achieve optimum figure error compensation

NASA Astrophysics Data System (ADS)

Liu, Xiaolin; Guo, Xuejia; Tang, Tianjin

2017-02-01

Figure error would degrade the performance of optical system. When predicting the performance and performing system assembly, compensation by clocking of optical components around the optical axis is a conventional but user-dependent method. Commercial optical software cannot optimize this clocking. Meanwhile existing automatic figure-error balancing methods can introduce approximate calculation error and the build process of optimization model is complex and time-consuming. To overcome these limitations, an accurate and automatic global optimization method of figure error balancing is proposed. This method is based on precise ray tracing to calculate the wavefront error, not approximate calculation, under a given elements' rotation angles combination. The composite wavefront error root-mean-square (RMS) acts as the cost function. Simulated annealing algorithm is used to seek the optimal combination of rotation angles of each optical element. This method can be applied to all rotational symmetric optics. Optimization results show that this method is 49% better than previous approximate analytical method.
Novel method to predict body weight in children based on age and morphological facial features.

PubMed

Huang, Ziyin; Barrett, Jeffrey S; Barrett, Kyle; Barrett, Ryan; Ng, Chee M

2015-04-01

A new and novel approach of predicting the body weight of children based on age and morphological facial features using a three-layer feed-forward artificial neural network (ANN) model is reported. The model takes in four parameters, including age-based CDC-inferred median body weight and three facial feature distances measured from digital facial images. In this study, thirty-nine volunteer subjects with age ranging from 6-18 years old and BW ranging from 18.6-96.4 kg were used for model development and validation. The final model has a mean prediction error of 0.48, a mean squared error of 18.43, and a coefficient of correlation of 0.94. The model shows significant improvement in prediction accuracy over several age-based body weight prediction methods. Combining with a facial recognition algorithm that can detect, extract and measure the facial features used in this study, mobile applications that incorporate this body weight prediction method may be developed for clinical investigations where access to scales is limited. © 2014, The American College of Clinical Pharmacology.
Optimizing Blasting’s Air Overpressure Prediction Model using Swarm Intelligence

NASA Astrophysics Data System (ADS)

Nur Asmawisham Alel, Mohd; Ruben Anak Upom, Mark; Asnida Abdullah, Rini; Hazreek Zainal Abidin, Mohd

2018-04-01

Air overpressure (AOp) resulting from blasting can cause damage and nuisance to nearby civilians. Thus, it is important to be able to predict AOp accurately. In this study, 8 different Artificial Neural Network (ANN) were developed for the purpose of prediction of AOp. The ANN models were trained using different variants of Particle Swarm Optimization (PSO) algorithm. AOp predictions were also made using an empirical equation, as suggested by United States Bureau of Mines (USBM), to serve as a benchmark. In order to develop the models, 76 blasting operations in Hulu Langat were investigated. All the ANN models were found to outperform the USBM equation in three performance metrics; root mean square error (RMSE), mean absolute percentage error (MAPE) and coefficient of determination (R2). Using a performance ranking method, MSO-Rand-Mut was determined to be the best prediction model for AOp with a performance metric of RMSE=2.18, MAPE=1.73% and R2=0.97. The result shows that ANN models trained using PSO are capable of predicting AOp with great accuracy.
Estimation of perceptible water vapor of atmosphere using artificial neural network, support vector machine and multiple linear regression algorithm and their comparative study

NASA Astrophysics Data System (ADS)

Shastri, Niket; Pathak, Kamlesh

2018-05-01

The water vapor content in atmosphere plays very important role in climate. In this paper the application of GPS signal in meteorology is discussed, which is useful technique that is used to estimate the perceptible water vapor of atmosphere. In this paper various algorithms like artificial neural network, support vector machine and multiple linear regression are use to predict perceptible water vapor. The comparative studies in terms of root mean square error and mean absolute errors are also carried out for all the algorithms.
Application of Visible and Near-Infrared Hyperspectral Imaging to Determine Soluble Protein Content in Oilseed Rape Leaves

PubMed Central

Zhang, Chu; Liu, Fei; Kong, Wenwen; He, Yong

2015-01-01

Visible and near-infrared hyperspectral imaging covering spectral range of 380–1030 nm as a rapid and non-destructive method was applied to estimate the soluble protein content of oilseed rape leaves. Average spectrum (500–900 nm) of the region of interest (ROI) of each sample was extracted, and four samples out of 128 samples were defined as outliers by Monte Carlo-partial least squares (MCPLS). Partial least squares (PLS) model using full spectra obtained dependable performance with the correlation coefficient (rp) of 0.9441, root mean square error of prediction (RMSEP) of 0.1658 mg/g and residual prediction deviation (RPD) of 2.98. The weighted regression coefficient (Bw), successive projections algorithm (SPA) and genetic algorithm-partial least squares (GAPLS) selected 18, 15, and 16 sensitive wavelengths, respectively. SPA-PLS model obtained the best performance with rp of 0.9554, RMSEP of 0.1538 mg/g and RPD of 3.25. Distribution of protein content within the rape leaves were visualized and mapped on the basis of the SPA-PLS model. The overall results indicated that hyperspectral imaging could be used to determine and visualize the soluble protein content of rape leaves. PMID:26184198
Blind prediction of cyclohexane-water distribution coefficients from the SAMPL5 challenge.

PubMed

Bannan, Caitlin C; Burley, Kalistyn H; Chiu, Michael; Shirts, Michael R; Gilson, Michael K; Mobley, David L

2016-11-01

In the recent SAMPL5 challenge, participants submitted predictions for cyclohexane/water distribution coefficients for a set of 53 small molecules. Distribution coefficients (log D) replace the hydration free energies that were a central part of the past five SAMPL challenges. A wide variety of computational methods were represented by the 76 submissions from 18 participating groups. Here, we analyze submissions by a variety of error metrics and provide details for a number of reference calculations we performed. As in the SAMPL4 challenge, we assessed the ability of participants to evaluate not just their statistical uncertainty, but their model uncertainty-how well they can predict the magnitude of their model or force field error for specific predictions. Unfortunately, this remains an area where prediction and analysis need improvement. In SAMPL4 the top performing submissions achieved a root-mean-squared error (RMSE) around 1.5 kcal/mol. If we anticipate accuracy in log D predictions to be similar to the hydration free energy predictions in SAMPL4, the expected error here would be around 1.54 log units. Only a few submissions had an RMSE below 2.5 log units in their predicted log D values. However, distribution coefficients introduced complexities not present in past SAMPL challenges, including tautomer enumeration, that are likely to be important in predicting biomolecular properties of interest to drug discovery, therefore some decrease in accuracy would be expected. Overall, the SAMPL5 distribution coefficient challenge provided great insight into the importance of modeling a variety of physical effects. We believe these types of measurements will be a promising source of data for future blind challenges, especially in view of the relatively straightforward nature of the experiments and the level of insight provided.

Blind prediction of cyclohexane-water distribution coefficients from the SAMPL5 challenge

PubMed Central

Bannan, Caitlin C.; Burley, Kalistyn H.; Chiu, Michael; Shirts, Michael R.; Gilson, Michael K.; Mobley, David L.

2016-01-01

In the recent SAMPL5 challenge, participants submitted predictions for cyclohexane/water distribution coefficients for a set of 53 small molecules. Distribution coefficients (log D) replace the hydration free energies that were a central part of the past five SAMPL challenges. A wide variety of computational methods were represented by the 76 submissions from 18 participating groups. Here, we analyze submissions by a variety of error metrics and provide details for a number of reference calculations we performed. As in the SAMPL4 challenge, we assessed the ability of participants to evaluate not just their statistical uncertainty, but their model uncertainty – how well they can predict the magnitude of their model or force field error for specific predictions. Unfortunately, this remains an area where prediction and analysis need improvement. In SAMPL4 the top performing submissions achieved a root-mean-squared error (RMSE) around 1.5 kcal/mol. If we anticipate accuracy in log D predictions to be similar to the hydration free energy predictions in SAMPL4, the expected error here would be around 1.54 log units. Only a few submissions had an RMSE below 2.5 log units in their predicted log D values. However, distribution coefficients introduced complexities not present in past SAMPL challenges, including tautomer enumeration, that are likely to be important in predicting biomolecular properties of interest to drug discovery, therefore some decrease in accuracy would be expected. Overall, the SAMPL5 distribution coefficient challenge provided great insight into the importance of modeling a variety of physical effects. We believe these types of measurements will be a promising source of data for future blind challenges, especially in view of the relatively straightforward nature of the experiments and the level of insight provided. PMID:27677750
Weighted partial least squares based on the error and variance of the recovery rate in calibration set.

PubMed

Yu, Shaohui; Xiao, Xue; Ding, Hong; Xu, Ge; Li, Haixia; Liu, Jing

2017-08-05

The quantitative analysis is very difficult for the emission-excitation fluorescence spectroscopy of multi-component mixtures whose fluorescence peaks are serious overlapping. As an effective method for the quantitative analysis, partial least squares can extract the latent variables from both the independent variables and the dependent variables, so it can model for multiple correlations between variables. However, there are some factors that usually affect the prediction results of partial least squares, such as the noise, the distribution and amount of the samples in calibration set etc. This work focuses on the problems in the calibration set that are mentioned above. Firstly, the outliers in the calibration set are removed by leave-one-out cross-validation. Then, according to two different prediction requirements, the EWPLS method and the VWPLS method are proposed. The independent variables and dependent variables are weighted in the EWPLS method by the maximum error of the recovery rate and weighted in the VWPLS method by the maximum variance of the recovery rate. Three organic matters with serious overlapping excitation-emission fluorescence spectroscopy are selected for the experiments. The step adjustment parameter, the iteration number and the sample amount in the calibration set are discussed. The results show the EWPLS method and the VWPLS method are superior to the PLS method especially for the case of small samples in the calibration set. Copyright © 2017 Elsevier B.V. All rights reserved.
Statistical Modeling and Prediction for Tourism Economy Using Dendritic Neural Network

PubMed Central

Yu, Ying; Wang, Yirui; Tang, Zheng

2017-01-01

With the impact of global internationalization, tourism economy has also been a rapid development. The increasing interest aroused by more advanced forecasting methods leads us to innovate forecasting methods. In this paper, the seasonal trend autoregressive integrated moving averages with dendritic neural network model (SA-D model) is proposed to perform the tourism demand forecasting. First, we use the seasonal trend autoregressive integrated moving averages model (SARIMA model) to exclude the long-term linear trend and then train the residual data by the dendritic neural network model and make a short-term prediction. As the result showed in this paper, the SA-D model can achieve considerably better predictive performances. In order to demonstrate the effectiveness of the SA-D model, we also use the data that other authors used in the other models and compare the results. It also proved that the SA-D model achieved good predictive performances in terms of the normalized mean square error, absolute percentage of error, and correlation coefficient. PMID:28246527
Statistical Modeling and Prediction for Tourism Economy Using Dendritic Neural Network.

PubMed

Yu, Ying; Wang, Yirui; Gao, Shangce; Tang, Zheng

2017-01-01

With the impact of global internationalization, tourism economy has also been a rapid development. The increasing interest aroused by more advanced forecasting methods leads us to innovate forecasting methods. In this paper, the seasonal trend autoregressive integrated moving averages with dendritic neural network model (SA-D model) is proposed to perform the tourism demand forecasting. First, we use the seasonal trend autoregressive integrated moving averages model (SARIMA model) to exclude the long-term linear trend and then train the residual data by the dendritic neural network model and make a short-term prediction. As the result showed in this paper, the SA-D model can achieve considerably better predictive performances. In order to demonstrate the effectiveness of the SA-D model, we also use the data that other authors used in the other models and compare the results. It also proved that the SA-D model achieved good predictive performances in terms of the normalized mean square error, absolute percentage of error, and correlation coefficient.
Prediction models for Arabica coffee beverage quality based on aroma analyses and chemometrics.

PubMed

Ribeiro, J S; Augusto, F; Salva, T J G; Ferreira, M M C

2012-11-15

In this work, soft modeling based on chemometric analyses of coffee beverage sensory data and the chromatographic profiles of volatile roasted coffee compounds is proposed to predict the scores of acidity, bitterness, flavor, cleanliness, body, and overall quality of the coffee beverage. A partial least squares (PLS) regression method was used to construct the models. The ordered predictor selection (OPS) algorithm was applied to select the compounds for the regression model of each sensory attribute in order to take only significant chromatographic peaks into account. The prediction errors of these models, using 4 or 5 latent variables, were equal to 0.28, 0.33, 0.35, 0.33, 0.34 and 0.41, for each of the attributes and compatible with the errors of the mean scores of the experts. Thus, the results proved the feasibility of using a similar methodology in on-line or routine applications to predict the sensory quality of Brazilian Arabica coffee. Copyright © 2012 Elsevier B.V. All rights reserved.
Muscle Synergies May Improve Optimization Prediction of Knee Contact Forces During Walking

PubMed Central

Walter, Jonathan P.; Kinney, Allison L.; Banks, Scott A.; D'Lima, Darryl D.; Besier, Thor F.; Lloyd, David G.; Fregly, Benjamin J.

2014-01-01

The ability to predict patient-specific joint contact and muscle forces accurately could improve the treatment of walking-related disorders. Muscle synergy analysis, which decomposes a large number of muscle electromyographic (EMG) signals into a small number of synergy control signals, could reduce the dimensionality and thus redundancy of the muscle and contact force prediction process. This study investigated whether use of subject-specific synergy controls can improve optimization prediction of knee contact forces during walking. To generate the predictions, we performed mixed dynamic muscle force optimizations (i.e., inverse skeletal dynamics with forward muscle activation and contraction dynamics) using data collected from a subject implanted with a force-measuring knee replacement. Twelve optimization problems (three cases with four subcases each) that minimized the sum of squares of muscle excitations were formulated to investigate how synergy controls affect knee contact force predictions. The three cases were: (1) Calibrate+Match where muscle model parameter values were calibrated and experimental knee contact forces were simultaneously matched, (2) Precalibrate+Predict where experimental knee contact forces were predicted using precalibrated muscle model parameters values from the first case, and (3) Calibrate+Predict where muscle model parameter values were calibrated and experimental knee contact forces were simultaneously predicted, all while matching inverse dynamic loads at the hip, knee, and ankle. The four subcases used either 44 independent controls or five synergy controls with and without EMG shape tracking. For the Calibrate+Match case, all four subcases closely reproduced the measured medial and lateral knee contact forces (R2 ≥ 0.94, root-mean-square (RMS) error < 66 N), indicating sufficient model fidelity for contact force prediction. For the Precalibrate+Predict and Calibrate+Predict cases, synergy controls yielded better contact force predictions (0.61 < R2 < 0.90, 83 N < RMS error < 161 N) than did independent controls (-0.15 < R2 < 0.79, 124 N < RMS error < 343 N) for corresponding subcases. For independent controls, contact force predictions improved when precalibrated model parameter values or EMG shape tracking was used. For synergy controls, contact force predictions were relatively insensitive to how model parameter values were calibrated, while EMG shape tracking made lateral (but not medial) contact force predictions worse. For the subject and optimization cost function analyzed in this study, use of subject-specific synergy controls improved the accuracy of knee contact force predictions, especially for lateral contact force when EMG shape tracking was omitted, and reduced prediction sensitivity to uncertainties in muscle model parameter values. PMID:24402438
Muscle synergies may improve optimization prediction of knee contact forces during walking.

PubMed

Walter, Jonathan P; Kinney, Allison L; Banks, Scott A; D'Lima, Darryl D; Besier, Thor F; Lloyd, David G; Fregly, Benjamin J

2014-02-01

The ability to predict patient-specific joint contact and muscle forces accurately could improve the treatment of walking-related disorders. Muscle synergy analysis, which decomposes a large number of muscle electromyographic (EMG) signals into a small number of synergy control signals, could reduce the dimensionality and thus redundancy of the muscle and contact force prediction process. This study investigated whether use of subject-specific synergy controls can improve optimization prediction of knee contact forces during walking. To generate the predictions, we performed mixed dynamic muscle force optimizations (i.e., inverse skeletal dynamics with forward muscle activation and contraction dynamics) using data collected from a subject implanted with a force-measuring knee replacement. Twelve optimization problems (three cases with four subcases each) that minimized the sum of squares of muscle excitations were formulated to investigate how synergy controls affect knee contact force predictions. The three cases were: (1) Calibrate+Match where muscle model parameter values were calibrated and experimental knee contact forces were simultaneously matched, (2) Precalibrate+Predict where experimental knee contact forces were predicted using precalibrated muscle model parameters values from the first case, and (3) Calibrate+Predict where muscle model parameter values were calibrated and experimental knee contact forces were simultaneously predicted, all while matching inverse dynamic loads at the hip, knee, and ankle. The four subcases used either 44 independent controls or five synergy controls with and without EMG shape tracking. For the Calibrate+Match case, all four subcases closely reproduced the measured medial and lateral knee contact forces (R2 ≥ 0.94, root-mean-square (RMS) error < 66 N), indicating sufficient model fidelity for contact force prediction. For the Precalibrate+Predict and Calibrate+Predict cases, synergy controls yielded better contact force predictions (0.61 < R2 < 0.90, 83 N < RMS error < 161 N) than did independent controls (-0.15 < R2 < 0.79, 124 N < RMS error < 343 N) for corresponding subcases. For independent controls, contact force predictions improved when precalibrated model parameter values or EMG shape tracking was used. For synergy controls, contact force predictions were relatively insensitive to how model parameter values were calibrated, while EMG shape tracking made lateral (but not medial) contact force predictions worse. For the subject and optimization cost function analyzed in this study, use of subject-specific synergy controls improved the accuracy of knee contact force predictions, especially for lateral contact force when EMG shape tracking was omitted, and reduced prediction sensitivity to uncertainties in muscle model parameter values.
Arima model and exponential smoothing method: A comparison

NASA Astrophysics Data System (ADS)

Wan Ahmad, Wan Kamarul Ariffin; Ahmad, Sabri

2013-04-01

This study shows the comparison between Autoregressive Moving Average (ARIMA) model and Exponential Smoothing Method in making a prediction. The comparison is focused on the ability of both methods in making the forecasts with the different number of data sources and the different length of forecasting period. For this purpose, the data from The Price of Crude Palm Oil (RM/tonne), Exchange Rates of Ringgit Malaysia (RM) in comparison to Great Britain Pound (GBP) and also The Price of SMR 20 Rubber Type (cents/kg) with three different time series are used in the comparison process. Then, forecasting accuracy of each model is measured by examinethe prediction error that producedby using Mean Squared Error (MSE), Mean Absolute Percentage Error (MAPE), and Mean Absolute deviation (MAD). The study shows that the ARIMA model can produce a better prediction for the long-term forecasting with limited data sources, butcannot produce a better prediction for time series with a narrow range of one point to another as in the time series for Exchange Rates. On the contrary, Exponential Smoothing Method can produce a better forecasting for Exchange Rates that has a narrow range of one point to another for its time series, while itcannot produce a better prediction for a longer forecasting period.
Free variable selection QSPR study to predict 19F chemical shifts of some fluorinated organic compounds using Random Forest and RBF-PLS methods

NASA Astrophysics Data System (ADS)

Goudarzi, Nasser

2016-04-01

In this work, two new and powerful chemometrics methods are applied for the modeling and prediction of the 19F chemical shift values of some fluorinated organic compounds. The radial basis function-partial least square (RBF-PLS) and random forest (RF) are employed to construct the models to predict the 19F chemical shifts. In this study, we didn't used from any variable selection method and RF method can be used as variable selection and modeling technique. Effects of the important parameters affecting the ability of the RF prediction power such as the number of trees (nt) and the number of randomly selected variables to split each node (m) were investigated. The root-mean-square errors of prediction (RMSEP) for the training set and the prediction set for the RBF-PLS and RF models were 44.70, 23.86, 29.77, and 23.69, respectively. Also, the correlation coefficients of the prediction set for the RBF-PLS and RF models were 0.8684 and 0.9313, respectively. The results obtained reveal that the RF model can be used as a powerful chemometrics tool for the quantitative structure-property relationship (QSPR) studies.
Application of molecular dynamics simulations in molecular property prediction II: diffusion coefficient.

PubMed

Wang, Junmei; Hou, Tingjun

2011-12-01

In this work, we have evaluated how well the general assisted model building with energy refinement (AMBER) force field performs in studying the dynamic properties of liquids. Diffusion coefficients (D) have been predicted for 17 solvents, five organic compounds in aqueous solutions, four proteins in aqueous solutions, and nine organic compounds in nonaqueous solutions. An efficient sampling strategy has been proposed and tested in the calculation of the diffusion coefficients of solutes in solutions. There are two major findings of this study. First of all, the diffusion coefficients of organic solutes in aqueous solution can be well predicted: the average unsigned errors and the root mean square errors are 0.137 and 0.171 × 10(-5) cm(-2) s(-1), respectively. Second, although the absolute values of D cannot be predicted, good correlations have been achieved for eight organic solvents with experimental data (R(2) = 0.784), four proteins in aqueous solutions (R(2) = 0.996), and nine organic compounds in nonaqueous solutions (R(2) = 0.834). The temperature dependent behaviors of three solvents, namely, TIP3P water, dimethyl sulfoxide, and cyclohexane have been studied. The major molecular dynamics (MD) settings, such as the sizes of simulation boxes and with/without wrapping the coordinates of MD snapshots into the primary simulation boxes have been explored. We have concluded that our sampling strategy that averaging the mean square displacement collected in multiple short-MD simulations is efficient in predicting diffusion coefficients of solutes at infinite dilution. Copyright © 2011 Wiley Periodicals, Inc.
A Novel Hybrid Data-Driven Model for Daily Land Surface Temperature Forecasting Using Long Short-Term Memory Neural Network Based on Ensemble Empirical Mode Decomposition

PubMed Central

Zhang, Xike; Zhang, Qiuwen; Zhang, Gui; Nie, Zhiping; Gui, Zifan; Que, Huafei

2018-01-01

Daily land surface temperature (LST) forecasting is of great significance for application in climate-related, agricultural, eco-environmental, or industrial studies. Hybrid data-driven prediction models using Ensemble Empirical Mode Composition (EEMD) coupled with Machine Learning (ML) algorithms are useful for achieving these purposes because they can reduce the difficulty of modeling, require less history data, are easy to develop, and are less complex than physical models. In this article, a computationally simple, less data-intensive, fast and efficient novel hybrid data-driven model called the EEMD Long Short-Term Memory (LSTM) neural network, namely EEMD-LSTM, is proposed to reduce the difficulty of modeling and to improve prediction accuracy. The daily LST data series from the Mapoling and Zhijiang stations in the Dongting Lake basin, central south China, from 1 January 2014 to 31 December 2016 is used as a case study. The EEMD is firstly employed to decompose the original daily LST data series into many Intrinsic Mode Functions (IMFs) and a single residue item. Then, the Partial Autocorrelation Function (PACF) is used to obtain the number of input data sample points for LSTM models. Next, the LSTM models are constructed to predict the decompositions. All the predicted results of the decompositions are aggregated as the final daily LST. Finally, the prediction performance of the hybrid EEMD-LSTM model is assessed in terms of the Mean Square Error (MSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), Root Mean Square Error (RMSE), Pearson Correlation Coefficient (CC) and Nash-Sutcliffe Coefficient of Efficiency (NSCE). To validate the hybrid data-driven model, the hybrid EEMD-LSTM model is compared with the Recurrent Neural Network (RNN), LSTM and Empirical Mode Decomposition (EMD) coupled with RNN, EMD-LSTM and EEMD-RNN models, and their comparison results demonstrate that the hybrid EEMD-LSTM model performs better than the other five models. The scatterplots of the predicted results of the six models versus the original daily LST data series show that the hybrid EEMD-LSTM model is superior to the other five models. It is concluded that the proposed hybrid EEMD-LSTM model in this study is a suitable tool for temperature forecasting. PMID:29883381
A Novel Hybrid Data-Driven Model for Daily Land Surface Temperature Forecasting Using Long Short-Term Memory Neural Network Based on Ensemble Empirical Mode Decomposition.

PubMed

Zhang, Xike; Zhang, Qiuwen; Zhang, Gui; Nie, Zhiping; Gui, Zifan; Que, Huafei

2018-05-21

Daily land surface temperature (LST) forecasting is of great significance for application in climate-related, agricultural, eco-environmental, or industrial studies. Hybrid data-driven prediction models using Ensemble Empirical Mode Composition (EEMD) coupled with Machine Learning (ML) algorithms are useful for achieving these purposes because they can reduce the difficulty of modeling, require less history data, are easy to develop, and are less complex than physical models. In this article, a computationally simple, less data-intensive, fast and efficient novel hybrid data-driven model called the EEMD Long Short-Term Memory (LSTM) neural network, namely EEMD-LSTM, is proposed to reduce the difficulty of modeling and to improve prediction accuracy. The daily LST data series from the Mapoling and Zhijaing stations in the Dongting Lake basin, central south China, from 1 January 2014 to 31 December 2016 is used as a case study. The EEMD is firstly employed to decompose the original daily LST data series into many Intrinsic Mode Functions (IMFs) and a single residue item. Then, the Partial Autocorrelation Function (PACF) is used to obtain the number of input data sample points for LSTM models. Next, the LSTM models are constructed to predict the decompositions. All the predicted results of the decompositions are aggregated as the final daily LST. Finally, the prediction performance of the hybrid EEMD-LSTM model is assessed in terms of the Mean Square Error (MSE), Mean Absolute Error (MAE), Mean Absolute Percentage Error (MAPE), Root Mean Square Error (RMSE), Pearson Correlation Coefficient (CC) and Nash-Sutcliffe Coefficient of Efficiency (NSCE). To validate the hybrid data-driven model, the hybrid EEMD-LSTM model is compared with the Recurrent Neural Network (RNN), LSTM and Empirical Mode Decomposition (EMD) coupled with RNN, EMD-LSTM and EEMD-RNN models, and their comparison results demonstrate that the hybrid EEMD-LSTM model performs better than the other five models. The scatterplots of the predicted results of the six models versus the original daily LST data series show that the hybrid EEMD-LSTM model is superior to the other five models. It is concluded that the proposed hybrid EEMD-LSTM model in this study is a suitable tool for temperature forecasting.
Phase measurement error in summation of electron holography series.

PubMed

McLeod, Robert A; Bergen, Michael; Malac, Marek

2014-06-01

Off-axis electron holography is a method for the transmission electron microscope (TEM) that measures the electric and magnetic properties of a specimen. The electrostatic and magnetic potentials modulate the electron wavefront phase. The error in measurement of the phase therefore determines the smallest observable changes in electric and magnetic properties. Here we explore the summation of a hologram series to reduce the phase error and thereby improve the sensitivity of electron holography. Summation of hologram series requires independent registration and correction of image drift and phase wavefront drift, the consequences of which are discussed. Optimization of the electro-optical configuration of the TEM for the double biprism configuration is examined. An analytical model of image and phase drift, composed of a combination of linear drift and Brownian random-walk, is derived and experimentally verified. The accuracy of image registration via cross-correlation and phase registration is characterized by simulated hologram series. The model of series summation errors allows the optimization of phase error as a function of exposure time and fringe carrier frequency for a target spatial resolution. An experimental example of hologram series summation is provided on WS2 fullerenes. A metric is provided to measure the object phase error from experimental results and compared to analytical predictions. The ultimate experimental object root-mean-square phase error is 0.006 rad (2π/1050) at a spatial resolution less than 0.615 nm and a total exposure time of 900 s. The ultimate phase error in vacuum adjacent to the specimen is 0.0037 rad (2π/1700). The analytical prediction of phase error differs with the experimental metrics by +7% inside the object and -5% in the vacuum, indicating that the model can provide reliable quantitative predictions. Crown Copyright © 2014. Published by Elsevier B.V. All rights reserved.
Two-body potential model based on cosine series expansion for ionic materials

DOE PAGES

Oda, Takuji; Weber, William J.; Tanigawa, Hisashi

2015-09-23

There is a method to construct a two-body potential model for ionic materials with a Fourier series basis and we examine it. For this method, the coefficients of cosine basis functions are uniquely determined by solving simultaneous linear equations to minimize the sum of weighted mean square errors in energy, force and stress, where first-principles calculation results are used as the reference data. As a validation test of the method, potential models for magnesium oxide are constructed. The mean square errors appropriately converge with respect to the truncation of the cosine series. This result mathematically indicates that the constructed potentialmore » model is sufficiently close to the one that is achieved with the non-truncated Fourier series and demonstrates that this potential virtually provides minimum error from the reference data within the two-body representation. The constructed potential models work appropriately in both molecular statics and dynamics simulations, especially if a two-step correction to revise errors expected in the reference data is performed, and the models clearly outperform two existing Buckingham potential models that were tested. Moreover, the good agreement over a broad range of energies and forces with first-principles calculations should enable the prediction of materials behavior away from equilibrium conditions, such as a system under irradiation.« less
[Prediction of soil nutrients spatial distribution based on neural network model combined with goestatistics].

PubMed

Li, Qi-Quan; Wang, Chang-Quan; Zhang, Wen-Jiang; Yu, Yong; Li, Bing; Yang, Juan; Bai, Gen-Chuan; Cai, Yan

2013-02-01

In this study, a radial basis function neural network model combined with ordinary kriging (RBFNN_OK) was adopted to predict the spatial distribution of soil nutrients (organic matter and total N) in a typical hilly region of Sichuan Basin, Southwest China, and the performance of this method was compared with that of ordinary kriging (OK) and regression kriging (RK). All the three methods produced the similar soil nutrient maps. However, as compared with those obtained by multiple linear regression model, the correlation coefficients between the measured values and the predicted values of soil organic matter and total N obtained by neural network model increased by 12. 3% and 16. 5% , respectively, suggesting that neural network model could more accurately capture the complicated relationships between soil nutrients and quantitative environmental factors. The error analyses of the prediction values of 469 validation points indicated that the mean absolute error (MAE) , mean relative error (MRE), and root mean squared error (RMSE) of RBFNN_OK were 6.9%, 7.4%, and 5. 1% (for soil organic matter), and 4.9%, 6.1% , and 4.6% (for soil total N) smaller than those of OK (P<0.01), and 2.4%, 2.6% , and 1.8% (for soil organic matter), and 2.1%, 2.8%, and 2.2% (for soil total N) smaller than those of RK, respectively (P<0.05).
Crop/weed discrimination using near-infrared reflectance spectroscopy (NIRS)

NASA Astrophysics Data System (ADS)

Zhang, Yun; He, Yong

2006-09-01

The traditional uniform herbicide application often results in an over chemical residues on soil, crop plants and agriculture produce, which have imperiled the environment and food security. Near-infrared reflectance spectroscopy (NIRS) offers a promising means for weed detection and site-specific herbicide application. In laboratory, a total of 90 samples (30 for each species) of the detached leaves of two weeds, i.e., threeseeded mercury (Acalypha australis L.) and fourleafed duckweed (Marsilea quadrfolia L.), and one crop soybean (Glycine max) was investigated for NIRS on 325- 1075 nm using a field spectroradiometer. 20 absorbance samples of each species after pretreatment were exported and the lacked Y variables were assigned independent values for partial least squares (PLS) analysis. During the combined principle component analysis (PCA) on 400-1000 nm, the PC1 and PC2 could together explain over 91% of the total variance and detect the three plant species with 98.3% accuracy. The full-cross validation results of PLS, i.e., standard error of prediction (SEP) 0.247, correlation coefficient (r) 0.954 and root mean square error of prediction (RMSEP) 0.245, indicated an optimum model for weed identification. By predicting the remaining 10 samples of each species in the PLS model, the results with deviation presented a 100% crop/weed detection rate. Thus, it could be concluded that PLS was an available alternative of for qualitative weed discrimination on NTRS.
Designing an artificial neural network using radial basis function to model exergetic efficiency of nanofluids in mini double pipe heat exchanger

NASA Astrophysics Data System (ADS)

Ghasemi, Nahid; Aghayari, Reza; Maddah, Heydar

2018-06-01

The present study aims at predicting and optimizing exergetic efficiency of TiO2-Al2O3/water nanofluid at different Reynolds numbers, volume fractions and twisted ratios using Artificial Neural Networks (ANN) and experimental data. Central Composite Design (CCD) and cascade Radial Basis Function (RBF) were used to display the significant levels of the analyzed factors on the exergetic efficiency. The size of TiO2-Al2O3/water nanocomposite was 20-70 nm. The parameters of ANN model were adapted by a training algorithm of radial basis function (RBF) with a wide range of experimental data set. Total mean square error and correlation coefficient were used to evaluate the results which the best result was obtained from double layer perceptron neural network with 30 neurons in which total Mean Square Error(MSE) and correlation coefficient (R2) were equal to 0.002 and 0.999, respectively. This indicated successful prediction of the network. Moreover, the proposed equation for predicting exergetic efficiency was extremely successful. According to the optimal curves, the optimum designing parameters of double pipe heat exchanger with inner twisted tape and nanofluid under the constrains of exergetic efficiency 0.937 are found to be Reynolds number 2500, twisted ratio 2.5 and volume fraction( v/v%) 0.05.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Kwon, Deukwoo; Little, Mark P.; Miller, Donald L.

Purpose: To determine more accurate regression formulas for estimating peak skin dose (PSD) from reference air kerma (RAK) or kerma-area product (KAP). Methods: After grouping of the data from 21 procedures into 13 clinically similar groups, assessments were made of optimal clustering using the Bayesian information criterion to obtain the optimal linear regressions of (log-transformed) PSD vs RAK, PSD vs KAP, and PSD vs RAK and KAP. Results: Three clusters of clinical groups were optimal in regression of PSD vs RAK, seven clusters of clinical groups were optimal in regression of PSD vs KAP, and six clusters of clinical groupsmore » were optimal in regression of PSD vs RAK and KAP. Prediction of PSD using both RAK and KAP is significantly better than prediction of PSD with either RAK or KAP alone. The regression of PSD vs RAK provided better predictions of PSD than the regression of PSD vs KAP. The partial-pooling (clustered) method yields smaller mean squared errors compared with the complete-pooling method.Conclusion: PSD distributions for interventional radiology procedures are log-normal. Estimates of PSD derived from RAK and KAP jointly are most accurate, followed closely by estimates derived from RAK alone. Estimates of PSD derived from KAP alone are the least accurate. Using a stochastic search approach, it is possible to cluster together certain dissimilar types of procedures to minimize the total error sum of squares.« less
Predicting ambient aerosol thermal-optical reflectance (TOR) measurements from infrared spectra: organic carbon

NASA Astrophysics Data System (ADS)

Dillner, A. M.; Takahama, S.

2015-03-01

Organic carbon (OC) can constitute 50% or more of the mass of atmospheric particulate matter. Typically, organic carbon is measured from a quartz fiber filter that has been exposed to a volume of ambient air and analyzed using thermal methods such as thermal-optical reflectance (TOR). Here, methods are presented that show the feasibility of using Fourier transform infrared (FT-IR) absorbance spectra from polytetrafluoroethylene (PTFE or Teflon) filters to accurately predict TOR OC. This work marks an initial step in proposing a method that can reduce the operating costs of large air quality monitoring networks with an inexpensive, non-destructive analysis technique using routinely collected PTFE filter samples which, in addition to OC concentrations, can concurrently provide information regarding the composition of organic aerosol. This feasibility study suggests that the minimum detection limit and errors (or uncertainty) of FT-IR predictions are on par with TOR OC such that evaluation of long-term trends and epidemiological studies would not be significantly impacted. To develop and test the method, FT-IR absorbance spectra are obtained from 794 samples from seven Interagency Monitoring of PROtected Visual Environment (IMPROVE) sites collected during 2011. Partial least-squares regression is used to calibrate sample FT-IR absorbance spectra to TOR OC. The FTIR spectra are divided into calibration and test sets by sampling site and date. The calibration produces precise and accurate TOR OC predictions of the test set samples by FT-IR as indicated by high coefficient of variation (R2; 0.96), low bias (0.02 μg m-3, the nominal IMPROVE sample volume is 32.8 m3), low error (0.08 μg m-3) and low normalized error (11%). These performance metrics can be achieved with various degrees of spectral pretreatment (e.g., including or excluding substrate contributions to the absorbances) and are comparable in precision to collocated TOR measurements. FT-IR spectra are also divided into calibration and test sets by OC mass and by OM / OC ratio, which reflects the organic composition of the particulate matter and is obtained from organic functional group composition; these divisions also leads to precise and accurate OC predictions. Low OC concentrations have higher bias and normalized error due to TOR analytical errors and artifact-correction errors, not due to the range of OC mass of the samples in the calibration set. However, samples with low OC mass can be used to predict samples with high OC mass, indicating that the calibration is linear. Using samples in the calibration set that have different OM / OC or ammonium / OC distributions than the test set leads to only a modest increase in bias and normalized error in the predicted samples. We conclude that FT-IR analysis with partial least-squares regression is a robust method for accurately predicting TOR OC in IMPROVE network samples - providing complementary information to the organic functional group composition and organic aerosol mass estimated previously from the same set of sample spectra (Ruthenburg et al., 2014).
Error propagation of partial least squares for parameters optimization in NIR modeling.

PubMed

Du, Chenzhao; Dai, Shengyun; Qiao, Yanjiang; Wu, Zhisheng

2018-03-05

A novel methodology is proposed to determine the error propagation of partial least-square (PLS) for parameters optimization in near-infrared (NIR) modeling. The parameters include spectral pretreatment, latent variables and variable selection. In this paper, an open source dataset (corn) and a complicated dataset (Gardenia) were used to establish PLS models under different modeling parameters. And error propagation of modeling parameters for water quantity in corn and geniposide quantity in Gardenia were presented by both type І and type II error. For example, when variable importance in the projection (VIP), interval partial least square (iPLS) and backward interval partial least square (BiPLS) variable selection algorithms were used for geniposide in Gardenia, compared with synergy interval partial least squares (SiPLS), the error weight varied from 5% to 65%, 55% and 15%. The results demonstrated how and what extent the different modeling parameters affect error propagation of PLS for parameters optimization in NIR modeling. The larger the error weight, the worse the model. Finally, our trials finished a powerful process in developing robust PLS models for corn and Gardenia under the optimal modeling parameters. Furthermore, it could provide a significant guidance for the selection of modeling parameters of other multivariate calibration models. Copyright © 2017. Published by Elsevier B.V.

Error propagation of partial least squares for parameters optimization in NIR modeling

NASA Astrophysics Data System (ADS)

Du, Chenzhao; Dai, Shengyun; Qiao, Yanjiang; Wu, Zhisheng

2018-03-01

A novel methodology is proposed to determine the error propagation of partial least-square (PLS) for parameters optimization in near-infrared (NIR) modeling. The parameters include spectral pretreatment, latent variables and variable selection. In this paper, an open source dataset (corn) and a complicated dataset (Gardenia) were used to establish PLS models under different modeling parameters. And error propagation of modeling parameters for water quantity in corn and geniposide quantity in Gardenia were presented by both type І and type II error. For example, when variable importance in the projection (VIP), interval partial least square (iPLS) and backward interval partial least square (BiPLS) variable selection algorithms were used for geniposide in Gardenia, compared with synergy interval partial least squares (SiPLS), the error weight varied from 5% to 65%, 55% and 15%. The results demonstrated how and what extent the different modeling parameters affect error propagation of PLS for parameters optimization in NIR modeling. The larger the error weight, the worse the model. Finally, our trials finished a powerful process in developing robust PLS models for corn and Gardenia under the optimal modeling parameters. Furthermore, it could provide a significant guidance for the selection of modeling parameters of other multivariate calibration models.
Determining particle size and water content by near-infrared spectroscopy in the granulation of naproxen sodium.

PubMed

Bär, David; Debus, Heiko; Brzenczek, Sina; Fischer, Wolfgang; Imming, Peter

2018-03-20

Near-infrared spectroscopy is frequently used by the pharmaceutical industry to monitor and optimize several production processes. In combination with chemometrics, a mathematical-statistical technique, the following advantages of near-infrared spectroscopy can be applied: It is a fast, non-destructive, non-invasive, and economical analytical method. One of the most advanced and popular chemometric technique is the partial least square algorithm with its best applicability in routine and its results. The required reference analytic enables the analysis of various parameters of interest, for example, moisture content, particle size, and many others. Parameters like the correlation coefficient, root mean square error of prediction, root mean square error of calibration, and root mean square error of validation have been used for evaluating the applicability and robustness of these analytical methods developed. This study deals with investigating a Naproxen Sodium granulation process using near-infrared spectroscopy and the development of water content and particle-size methods. For the water content method, one should consider a maximum water content of about 21% in the granulation process, which must be confirmed by the loss on drying. Further influences to be considered are the constantly changing product temperature, rising to about 54 °C, the creation of hydrated states of Naproxen Sodium when using a maximum of about 21% water content, and the large quantity of about 87% Naproxen Sodium in the formulation. It was considered to use a combination of these influences in developing the near-infrared spectroscopy method for the water content of Naproxen Sodium granules. The "Root Mean Square Error" was 0.25% for calibration dataset and 0.30% for the validation dataset, which was obtained after different stages of optimization by multiplicative scatter correction and the first derivative. Using laser diffraction, the granules have been analyzed for particle sizes and obtaining the summary sieve sizes of >63 μm and >100 μm. The following influences should be considered for application in routine production: constant changes in water content up to 21% and a product temperature up to 54 °C. The different stages of optimization result in a "Root Mean Square Error" of 2.54% for the calibration data set and 3.53% for the validation set by using the Kubelka-Munk conversion and first derivative for the near-infrared spectroscopy method for a particle size >63 μm. For the near-infrared spectroscopy method using a particle size >100 μm, the "Root Mean Square Error" was 3.47% for the calibration data set and 4.51% for the validation set, while using the same pre-treatments. - The robustness and suitability of this methodology has already been demonstrated by its recent successful implementation in a routine granulate production process. Copyright © 2018 Elsevier B.V. All rights reserved.
Benchmark fragment-based 1H, 13C, 15N and 17O chemical shift predictions in molecular crystals†

PubMed Central

Hartman, Joshua D.; Kudla, Ryan A.; Day, Graeme M.; Mueller, Leonard J.; Beran, Gregory J. O.

2016-01-01

The performance of fragment-based ab initio 1H, 13C, 15N and 17O chemical shift predictions is assessed against experimental NMR chemical shift data in four benchmark sets of molecular crystals. Employing a variety of commonly used density functionals (PBE0, B3LYP, TPSSh, OPBE, PBE, TPSS), we explore the relative performance of cluster, two-body fragment, and combined cluster/fragment models. The hybrid density functionals (PBE0, B3LYP and TPSSh) generally out-perform their generalized gradient approximation (GGA)-based counterparts. 1H, 13C, 15N, and 17O isotropic chemical shifts can be predicted with root-mean-square errors of 0.3, 1.5, 4.2, and 9.8 ppm, respectively, using a computationally inexpensive electrostatically embedded two-body PBE0 fragment model. Oxygen chemical shieldings prove particularly sensitive to local many-body effects, and using a combined cluster/fragment model instead of the simple two-body fragment model decreases the root-mean-square errors to 7.6 ppm. These fragment-based model errors compare favorably with GIPAW PBE ones of 0.4, 2.2, 5.4, and 7.2 ppm for the same 1H, 13C, 15N, and 17O test sets. Using these benchmark calculations, a set of recommended linear regression parameters for mapping between calculated chemical shieldings and observed chemical shifts are provided and their robustness assessed using statistical cross-validation. We demonstrate the utility of these approaches and the reported scaling parameters on applications to 9-tertbutyl anthracene, several histidine co-crystals, benzoic acid and the C-nitrosoarene SnCl2(CH3)2(NODMA)2. PMID:27431490
Benchmark fragment-based (1)H, (13)C, (15)N and (17)O chemical shift predictions in molecular crystals.

PubMed

Hartman, Joshua D; Kudla, Ryan A; Day, Graeme M; Mueller, Leonard J; Beran, Gregory J O

2016-08-21

The performance of fragment-based ab initio(1)H, (13)C, (15)N and (17)O chemical shift predictions is assessed against experimental NMR chemical shift data in four benchmark sets of molecular crystals. Employing a variety of commonly used density functionals (PBE0, B3LYP, TPSSh, OPBE, PBE, TPSS), we explore the relative performance of cluster, two-body fragment, and combined cluster/fragment models. The hybrid density functionals (PBE0, B3LYP and TPSSh) generally out-perform their generalized gradient approximation (GGA)-based counterparts. (1)H, (13)C, (15)N, and (17)O isotropic chemical shifts can be predicted with root-mean-square errors of 0.3, 1.5, 4.2, and 9.8 ppm, respectively, using a computationally inexpensive electrostatically embedded two-body PBE0 fragment model. Oxygen chemical shieldings prove particularly sensitive to local many-body effects, and using a combined cluster/fragment model instead of the simple two-body fragment model decreases the root-mean-square errors to 7.6 ppm. These fragment-based model errors compare favorably with GIPAW PBE ones of 0.4, 2.2, 5.4, and 7.2 ppm for the same (1)H, (13)C, (15)N, and (17)O test sets. Using these benchmark calculations, a set of recommended linear regression parameters for mapping between calculated chemical shieldings and observed chemical shifts are provided and their robustness assessed using statistical cross-validation. We demonstrate the utility of these approaches and the reported scaling parameters on applications to 9-tert-butyl anthracene, several histidine co-crystals, benzoic acid and the C-nitrosoarene SnCl2(CH3)2(NODMA)2.
Computational intelligence models to predict porosity of tablets using minimum features

PubMed Central

Khalid, Mohammad Hassan; Kazemi, Pezhman; Perez-Gandarillas, Lucia; Michrafy, Abderrahim; Szlęk, Jakub; Jachowicz, Renata; Mendyk, Aleksander

2017-01-01

The effects of different formulations and manufacturing process conditions on the physical properties of a solid dosage form are of importance to the pharmaceutical industry. It is vital to have in-depth understanding of the material properties and governing parameters of its processes in response to different formulations. Understanding the mentioned aspects will allow tighter control of the process, leading to implementation of quality-by-design (QbD) practices. Computational intelligence (CI) offers an opportunity to create empirical models that can be used to describe the system and predict future outcomes in silico. CI models can help explore the behavior of input parameters, unlocking deeper understanding of the system. This research endeavor presents CI models to predict the porosity of tablets created by roll-compacted binary mixtures, which were milled and compacted under systematically varying conditions. CI models were created using tree-based methods, artificial neural networks (ANNs), and symbolic regression trained on an experimental data set and screened using root-mean-square error (RMSE) scores. The experimental data were composed of proportion of microcrystalline cellulose (MCC) (in percentage), granule size fraction (in micrometers), and die compaction force (in kilonewtons) as inputs and porosity as an output. The resulting models show impressive generalization ability, with ANNs (normalized root-mean-square error [NRMSE] =1%) and symbolic regression (NRMSE =4%) as the best-performing methods, also exhibiting reliable predictive behavior when presented with a challenging external validation data set (best achieved symbolic regression: NRMSE =3%). Symbolic regression demonstrates the transition from the black box modeling paradigm to more transparent predictive models. Predictive performance and feature selection behavior of CI models hints at the most important variables within this factor space. PMID:28138223
Computational intelligence models to predict porosity of tablets using minimum features.

PubMed

Khalid, Mohammad Hassan; Kazemi, Pezhman; Perez-Gandarillas, Lucia; Michrafy, Abderrahim; Szlęk, Jakub; Jachowicz, Renata; Mendyk, Aleksander

2017-01-01

The effects of different formulations and manufacturing process conditions on the physical properties of a solid dosage form are of importance to the pharmaceutical industry. It is vital to have in-depth understanding of the material properties and governing parameters of its processes in response to different formulations. Understanding the mentioned aspects will allow tighter control of the process, leading to implementation of quality-by-design (QbD) practices. Computational intelligence (CI) offers an opportunity to create empirical models that can be used to describe the system and predict future outcomes in silico. CI models can help explore the behavior of input parameters, unlocking deeper understanding of the system. This research endeavor presents CI models to predict the porosity of tablets created by roll-compacted binary mixtures, which were milled and compacted under systematically varying conditions. CI models were created using tree-based methods, artificial neural networks (ANNs), and symbolic regression trained on an experimental data set and screened using root-mean-square error (RMSE) scores. The experimental data were composed of proportion of microcrystalline cellulose (MCC) (in percentage), granule size fraction (in micrometers), and die compaction force (in kilonewtons) as inputs and porosity as an output. The resulting models show impressive generalization ability, with ANNs (normalized root-mean-square error [NRMSE] =1%) and symbolic regression (NRMSE =4%) as the best-performing methods, also exhibiting reliable predictive behavior when presented with a challenging external validation data set (best achieved symbolic regression: NRMSE =3%). Symbolic regression demonstrates the transition from the black box modeling paradigm to more transparent predictive models. Predictive performance and feature selection behavior of CI models hints at the most important variables within this factor space.
Neural activity during affect labeling predicts expressive writing effects on well-being: GLM and SVM approaches

PubMed Central

Memarian, Negar; Torre, Jared B.; Haltom, Kate E.; Stanton, Annette L.

2017-01-01

Abstract Affect labeling (putting feelings into words) is a form of incidental emotion regulation that could underpin some benefits of expressive writing (i.e. writing about negative experiences). Here, we show that neural responses during affect labeling predicted changes in psychological and physical well-being outcome measures 3 months later. Furthermore, neural activity of specific frontal regions and amygdala predicted those outcomes as a function of expressive writing. Using supervised learning (support vector machines regression), improvements in four measures of psychological and physical health (physical symptoms, depression, anxiety and life satisfaction) after an expressive writing intervention were predicted with an average of 0.85% prediction error [root mean square error (RMSE) %]. The predictions were significantly more accurate with machine learning than with the conventional generalized linear model method (average RMSE: 1.3%). Consistent with affect labeling research, right ventrolateral prefrontal cortex (RVLPFC) and amygdalae were top predictors of improvement in the four outcomes. Moreover, RVLPFC and left amygdala predicted benefits due to expressive writing in satisfaction with life and depression outcome measures, respectively. This study demonstrates the substantial merit of supervised machine learning for real-world outcome prediction in social and affective neuroscience. PMID:28992270
Effects of air-sea interaction on extended-range prediction of geopotential height at 500 hPa over the northern extratropical region

NASA Astrophysics Data System (ADS)

Wang, Xujia; Zheng, Zhihai; Feng, Guolin

2018-04-01

The contribution of air-sea interaction on the extended-range prediction of geopotential height at 500 hPa in the northern extratropical region has been analyzed with a coupled model form Beijing Climate Center and its atmospheric components. Under the assumption of the perfect model, the extended-range prediction skill was evaluated by anomaly correlation coefficient (ACC), root mean square error (RMSE), and signal-to-noise ratio (SNR). The coupled model has a better prediction skill than its atmospheric model, especially, the air-sea interaction in July made a greater contribution for the improvement of prediction skill than other months. The prediction skill of the extratropical region in the coupled model reaches 16-18 days in all months, while the atmospheric model reaches 10-11 days in January, April, and July and only 7-8 days in October, indicating that the air-sea interaction can extend the prediction skill of the atmospheric model by about 1 week. The errors of both the coupled model and the atmospheric model reach saturation in about 20 days, suggesting that the predictable range is less than 3 weeks.
Optimizing finite element predictions of local subchondral bone structural stiffness using neural network-derived density-modulus relationships for proximal tibial subchondral cortical and trabecular bone.

PubMed

Nazemi, S Majid; Amini, Morteza; Kontulainen, Saija A; Milner, Jaques S; Holdsworth, David W; Masri, Bassam A; Wilson, David R; Johnston, James D

2017-01-01

Quantitative computed tomography based subject-specific finite element modeling has potential to clarify the role of subchondral bone alterations in knee osteoarthritis initiation, progression, and pain. However, it is unclear what density-modulus equation(s) should be applied with subchondral cortical and subchondral trabecular bone when constructing finite element models of the tibia. Using a novel approach applying neural networks, optimization, and back-calculation against in situ experimental testing results, the objective of this study was to identify subchondral-specific equations that optimized finite element predictions of local structural stiffness at the proximal tibial subchondral surface. Thirteen proximal tibial compartments were imaged via quantitative computed tomography. Imaged bone mineral density was converted to elastic moduli using multiple density-modulus equations (93 total variations) then mapped to corresponding finite element models. For each variation, root mean squared error was calculated between finite element prediction and in situ measured stiffness at 47 indentation sites. Resulting errors were used to train an artificial neural network, which provided an unlimited number of model variations, with corresponding error, for predicting stiffness at the subchondral bone surface. Nelder-Mead optimization was used to identify optimum density-modulus equations for predicting stiffness. Finite element modeling predicted 81% of experimental stiffness variance (with 10.5% error) using optimized equations for subchondral cortical and trabecular bone differentiated with a 0.5g/cm 3 density. In comparison with published density-modulus relationships, optimized equations offered improved predictions of local subchondral structural stiffness. Further research is needed with anisotropy inclusion, a smaller voxel size and de-blurring algorithms to improve predictions. Copyright © 2016 Elsevier Ltd. All rights reserved.
Application of a bioenergetics model for hatchery production: Largemouth bass fed commercial diets

USGS Publications Warehouse

Csargo, Isak J.; Michael L. Brown,; Chipps, Steven R.

2012-01-01

Fish bioenergetics models based on natural prey items have been widely used to address research and management questions. However, few attempts have been made to evaluate and apply bioenergetics models to hatchery-reared fish receiving commercial feeds that contain substantially higher energy densities than natural prey. In this study, we evaluated a bioenergetics model for age-0 largemouth bass Micropterus salmoidesreared on four commercial feeds. Largemouth bass (n ≈ 3,504) were reared for 70 d at 25°C in sixteen 833-L circular tanks connected in parallel to a recirculation system. Model performance was evaluated using error components (mean, slope, and random) derived from decomposition of the mean square error obtained from regression of observed on predicted values. Mean predicted consumption was only 8.9% lower than mean observed consumption and was similar to error rates observed for largemouth bass consuming natural prey. Model evaluation showed that the 97.5% joint confidence region included the intercept of 0 (−0.43 ± 3.65) and slope of 1 (1.08 ± 0.20), which indicates the model accurately predicted consumption. Moreover model error was similar among feeds (P = 0.98), and most error was probably attributable to sampling error (unconsumed feed), underestimated predator energy densities, or consumption-dependent error, which is common in bioenergetics models. This bioenergetics model could provide a valuable tool in hatchery production of largemouth bass. Furthermore, we believe that bioenergetics modeling could be useful in aquaculture production, particularly for species lacking historical hatchery constants or conventional growth models.
[Rapid discriminating hogwash oil and edible vegetable oil using near infrared optical fiber spectrometer technique].

PubMed

Zhang, Bing-Fang; Yuan, Li-Bo; Kong, Qing-Ming; Shen, Wei-Zheng; Zhang, Bing-Xiu; Liu, Cheng-Hai

2014-10-01

In the present study, a new method using near infrared spectroscopy combined with optical fiber sensing technology was applied to the analysis of hogwash oil in blended oil. The 50 samples were a blend of frying oil and "nine three" soybean oil according to a certain volume ratio. The near infrared transmission spectroscopies were collected and the quantitative analysis model of frying oil was established by partial least squares (PLS) and BP artificial neural network The coefficients of determina- tion of calibration sets were 0.908 and 0.934 respectively. The coefficients of determination of validation sets were 0.961 and 0.952, the root mean square error of calibrations (RMSEC) was 0.184 and 0.136, and the root mean square error of predictions (RMSEP) was all 0.111 6. They conform to the model application requirement. At the same time, frying oil and qualified edible oil were identified with the principal component analysis (PCA), and the accurate rate was 100%. The experiment proved that near infrared spectral technology not only can quickly and accurately identify hogwash oil, but also can quantitatively detect hog- wash oil. This method has a wide application prospect in the detection of oil.
Determination of thiamine HCl and pyridoxine HCl in pharmaceutical preparations using UV-visible spectrophotometry and genetic algorithm based multivariate calibration methods.

PubMed

Ozdemir, Durmus; Dinc, Erdal

2004-07-01

Simultaneous determination of binary mixtures pyridoxine hydrochloride and thiamine hydrochloride in a vitamin combination using UV-visible spectrophotometry and classical least squares (CLS) and three newly developed genetic algorithm (GA) based multivariate calibration methods was demonstrated. The three genetic multivariate calibration methods are Genetic Classical Least Squares (GCLS), Genetic Inverse Least Squares (GILS) and Genetic Regression (GR). The sample data set contains the UV-visible spectra of 30 synthetic mixtures (8 to 40 microg/ml) of these vitamins and 10 tablets containing 250 mg from each vitamin. The spectra cover the range from 200 to 330 nm in 0.1 nm intervals. Several calibration models were built with the four methods for the two components. Overall, the standard error of calibration (SEC) and the standard error of prediction (SEP) for the synthetic data were in the range of <0.01 and 0.43 microg/ml for all the four methods. The SEP values for the tablets were in the range of 2.91 and 11.51 mg/tablets. A comparison of genetic algorithm selected wavelengths for each component using GR method was also included.
Near infrared spectral linearisation in quantifying soluble solids content of intact carambola.

PubMed

Omar, Ahmad Fairuz; MatJafri, Mohd Zubir

2013-04-12

This study presents a novel application of near infrared (NIR) spectral linearisation for measuring the soluble solids content (SSC) of carambola fruits. NIR spectra were measured using reflectance and interactance methods. In this study, only the interactance measurement technique successfully generated a reliable measurement result with a coefficient of determination of (R2) = 0.724 and a root mean square error of prediction for (RMSEP) = 0.461° Brix. The results from this technique produced a highly accurate and stable prediction model compared with multiple linear regression techniques.
Near Infrared Spectral Linearisation in Quantifying Soluble Solids Content of Intact Carambola

PubMed Central

Omar, Ahmad Fairuz; MatJafri, Mohd Zubir

2013-01-01

This study presents a novel application of near infrared (NIR) spectral linearisation for measuring the soluble solids content (SSC) of carambola fruits. NIR spectra were measured using reflectance and interactance methods. In this study, only the interactance measurement technique successfully generated a reliable measurement result with a coefficient of determination of (R2) = 0.724 and a root mean square error of prediction for (RMSEP) = 0.461° Brix. The results from this technique produced a highly accurate and stable prediction model compared with multiple linear regression techniques. PMID:23584118
? filtering for stochastic systems driven by Poisson processes

NASA Astrophysics Data System (ADS)

Song, Bo; Wu, Zheng-Guang; Park, Ju H.; Shi, Guodong; Zhang, Ya

2015-01-01

This paper investigates the ? filtering problem for stochastic systems driven by Poisson processes. By utilising the martingale theory such as the predictable projection operator and the dual predictable projection operator, this paper transforms the expectation of stochastic integral with respect to the Poisson process into the expectation of Lebesgue integral. Then, based on this, this paper designs an ? filter such that the filtering error system is mean-square asymptotically stable and satisfies a prescribed ? performance level. Finally, a simulation example is given to illustrate the effectiveness of the proposed filtering scheme.
[Research on Resistant Starch Content of Rice Grain Based on NIR Spectroscopy Model].

PubMed

Luo, Xi; Wu, Fang-xi; Xie, Hong-guang; Zhu, Yong-sheng; Zhang, Jian-fu; Xie, Hua-an

2016-03-01

A new method based on near-infrared reflectance spectroscopy (NIRS) analysis was explored to determine the content of rice-resistant starch instead of common chemical method which took long time was high-cost. First of all, we collected 62 spectral data which have big differences in terms of resistant starch content of rice, and then the spectral data and detected chemical values are imported chemometrics software. After that a near-infrared spectroscopy calibration model for rice-resistant starch content was constructed with partial least squares (PLS) method. Results are as follows: In respect of internal cross validation, the coefficient of determination (R2) of untreated, pretreatment with MSC+1thD, pretreatment with 1thD+SNV were 0.920 2, 0.967 0 and 0.976 7 respectively. Root mean square error of prediction (RMSEP) were 1.533 7, 1.011 2 and 0.837 1 respectively. In respect of external validation, the coefficient of determination (R2) of untreated, pretreatment with MSC+ 1thD, pretreatment with 1thD+SNV were 0.805, 0.976 and 0.992 respectively. The average absolute error was 1.456, 0.818, 0.515 respectively. There was no significant difference between chemical and predicted values (Turkey multiple comparison), so we think near infrared spectrum analysis is more feasible than chemical measurement. Among the different pretreatment, the first derivation and standard normal variate (1thD+SNV) have higher coefficient of determination (R2) and lower error value whether in internal validation and external validation. In other words, the calibration model has higher precision and less error by pretreatment with 1thD+SNV.
Analysis of basic clustering algorithms for numerical estimation of statistical averages in biomolecules.

PubMed

Anandakrishnan, Ramu; Onufriev, Alexey

2008-03-01

In statistical mechanics, the equilibrium properties of a physical system of particles can be calculated as the statistical average over accessible microstates of the system. In general, these calculations are computationally intractable since they involve summations over an exponentially large number of microstates. Clustering algorithms are one of the methods used to numerically approximate these sums. The most basic clustering algorithms first sub-divide the system into a set of smaller subsets (clusters). Then, interactions between particles within each cluster are treated exactly, while all interactions between different clusters are ignored. These smaller clusters have far fewer microstates, making the summation over these microstates, tractable. These algorithms have been previously used for biomolecular computations, but remain relatively unexplored in this context. Presented here, is a theoretical analysis of the error and computational complexity for the two most basic clustering algorithms that were previously applied in the context of biomolecular electrostatics. We derive a tight, computationally inexpensive, error bound for the equilibrium state of a particle computed via these clustering algorithms. For some practical applications, it is the root mean square error, which can be significantly lower than the error bound, that may be more important. We how that there is a strong empirical relationship between error bound and root mean square error, suggesting that the error bound could be used as a computationally inexpensive metric for predicting the accuracy of clustering algorithms for practical applications. An example of error analysis for such an application-computation of average charge of ionizable amino-acids in proteins-is given, demonstrating that the clustering algorithm can be accurate enough for practical purposes.
Developing a NIR multispectral imaging for prediction and visualization of peanut protein content using variable selection algorithms

NASA Astrophysics Data System (ADS)

Cheng, Jun-Hu; Jin, Huali; Liu, Zhiwei

2018-01-01

The feasibility of developing a multispectral imaging method using important wavelengths from hyperspectral images selected by genetic algorithm (GA), successive projection algorithm (SPA) and regression coefficient (RC) methods for modeling and predicting protein content in peanut kernel was investigated for the first time. Partial least squares regression (PLSR) calibration model was established between the spectral data from the selected optimal wavelengths and the reference measured protein content ranged from 23.46% to 28.43%. The RC-PLSR model established using eight key wavelengths (1153, 1567, 1972, 2143, 2288, 2339, 2389 and 2446 nm) showed the best predictive results with the coefficient of determination of prediction (R2P) of 0.901, and root mean square error of prediction (RMSEP) of 0.108 and residual predictive deviation (RPD) of 2.32. Based on the obtained best model and image processing algorithms, the distribution maps of protein content were generated. The overall results of this study indicated that developing a rapid and online multispectral imaging system using the feature wavelengths and PLSR analysis is potential and feasible for determination of the protein content in peanut kernels.
Evaluation of three statistical prediction models for forensic age prediction based on DNA methylation.

PubMed

Smeers, Inge; Decorte, Ronny; Van de Voorde, Wim; Bekaert, Bram

2018-05-01

DNA methylation is a promising biomarker for forensic age prediction. A challenge that has emerged in recent studies is the fact that prediction errors become larger with increasing age due to interindividual differences in epigenetic ageing rates. This phenomenon of non-constant variance or heteroscedasticity violates an assumption of the often used method of ordinary least squares (OLS) regression. The aim of this study was to evaluate alternative statistical methods that do take heteroscedasticity into account in order to provide more accurate, age-dependent prediction intervals. A weighted least squares (WLS) regression is proposed as well as a quantile regression model. Their performances were compared against an OLS regression model based on the same dataset. Both models provided age-dependent prediction intervals which account for the increasing variance with age, but WLS regression performed better in terms of success rate in the current dataset. However, quantile regression might be a preferred method when dealing with a variance that is not only non-constant, but also not normally distributed. Ultimately the choice of which model to use should depend on the observed characteristics of the data. Copyright © 2018 Elsevier B.V. All rights reserved.
Detection and quantification of adulteration of sesame oils with vegetable oils using gas chromatography and multivariate data analysis.

PubMed

Peng, Dan; Bi, Yanlan; Ren, Xiaona; Yang, Guolong; Sun, Shangde; Wang, Xuede

2015-12-01

This study was performed to develop a hierarchical approach for detection and quantification of adulteration of sesame oil with vegetable oils using gas chromatography (GC). At first, a model was constructed to discriminate the difference between authentic sesame oils and adulterated sesame oils using support vector machine (SVM) algorithm. Then, another SVM-based model is developed to identify the type of adulterant in the mixed oil. At last, prediction models for sesame oil were built for each kind of oil using partial least square method. To validate this approach, 746 samples were prepared by mixing authentic sesame oils with five types of vegetable oil. The prediction results show that the detection limit for authentication is as low as 5% in mixing ratio and the root-mean-square errors for prediction range from 1.19% to 4.29%, meaning that this approach is a valuable tool to detect and quantify the adulteration of sesame oil. Copyright © 2015 Elsevier Ltd. All rights reserved.

Pharmacokinetics of low-dose nedaplatin and validation of AUC prediction in patients with non-small-cell lung carcinoma.

PubMed

Niioka, Takenori; Uno, Tsukasa; Yasui-Furukori, Norio; Takahata, Takenori; Shimizu, Mikiko; Sugawara, Kazunobu; Tateishi, Tomonori

2007-04-01

The aim of this study was to determine the pharmacokinetics of low-dose nedaplatin combined with paclitaxel and radiation therapy in patients having non-small-cell lung carcinoma and establish the optimal dosage regimen for low-dose nedaplatin. We also evaluated predictive accuracy of reported formulas to estimate the area under the plasma concentration-time curve (AUC) of low-dose nedaplatin. A total of 19 patients were administered a constant intravenous infusion of 20 mg/m(2) body surface area (BSA) nedaplatin for an hour, and blood samples were collected at 1, 2, 3, 4, 6, 8, and 19 h after the administration. Plasma concentrations of unbound platinum were measured, and the actual value of platinum AUC (actual AUC) was calculated based on these data. The predicted value of platinum AUC (predicted AUC) was determined by three predictive methods reported in previous studies, consisting of Bayesian method, limited sampling strategies with plasma concentration at a single time point, and simple formula method (SFM) without measured plasma concentration. Three error indices, mean prediction error (ME, measure of bias), mean absolute error (MAE, measure of accuracy), and root mean squared prediction error (RMSE, measure of precision), were obtained from the difference between the actual and the predicted AUC, to compare the accuracy between the three predictive methods. The AUC showed more than threefold inter-patient variation, and there was a favorable correlation between nedaplatin clearance and creatinine clearance (Ccr) (r = 0.832, P < 0.01). In three error indices, MAE and RMSE showed significant difference between the three AUC predictive methods, and the method of SFM had the most favorable results, in which %ME, %MAE, and %RMSE were 5.5, 10.7, and 15.4, respectively. The dosage regimen of low-dose nedaplatin should be established based on Ccr rather than on BSA. Since prediction accuracy of SFM, which did not require measured plasma concentration, was most favorable among the three methods evaluated in this study, SFM could be the most practical method to predict AUC of low-dose nedaplatin in a clinical situation judging from its high accuracy in predicting AUC without measured plasma concentration.
Evaluation of the predicted error of the soil moisture retrieval from C-band SAR by comparison against modelled soil moisture estimates over Australia

PubMed Central

Doubková, Marcela; Van Dijk, Albert I.J.M.; Sabel, Daniel; Wagner, Wolfgang; Blöschl, Günter

2012-01-01

The Sentinel-1 will carry onboard a C-band radar instrument that will map the European continent once every four days and the global land surface at least once every twelve days with finest 5 × 20 m spatial resolution. The high temporal sampling rate and operational configuration make Sentinel-1 of interest for operational soil moisture monitoring. Currently, updated soil moisture data are made available at 1 km spatial resolution as a demonstration service using Global Mode (GM) measurements from the Advanced Synthetic Aperture Radar (ASAR) onboard ENVISAT. The service demonstrates the potential of the C-band observations to monitor variations in soil moisture. Importantly, a retrieval error estimate is also available; these are needed to assimilate observations into models. The retrieval error is estimated by propagating sensor errors through the retrieval model. In this work, the existing ASAR GM retrieval error product is evaluated using independent top soil moisture estimates produced by the grid-based landscape hydrological model (AWRA-L) developed within the Australian Water Resources Assessment system (AWRA). The ASAR GM retrieval error estimate, an assumed prior AWRA-L error estimate and the variance in the respective datasets were used to spatially predict the root mean square error (RMSE) and the Pearson's correlation coefficient R between the two datasets. These were compared with the RMSE calculated directly from the two datasets. The predicted and computed RMSE showed a very high level of agreement in spatial patterns as well as good quantitative agreement; the RMSE was predicted within accuracy of 4% of saturated soil moisture over 89% of the Australian land mass. Predicted and calculated R maps corresponded within accuracy of 10% over 61% of the continent. The strong correspondence between the predicted and calculated RMSE and R builds confidence in the retrieval error model and derived ASAR GM error estimates. The ASAR GM and Sentinel-1 have the same basic physical measurement characteristics, and therefore very similar retrieval error estimation method can be applied. Because of the expected improvements in radiometric resolution of the Sentinel-1 backscatter measurements, soil moisture estimation errors can be expected to be an order of magnitude less than those for ASAR GM. This opens the possibility for operationally available medium resolution soil moisture estimates with very well-specified errors that can be assimilated into hydrological or crop yield models, with potentially large benefits for land-atmosphere fluxes, crop growth, and water balance monitoring and modelling. PMID:23483015
Assessing the external validity of algorithms to estimate EQ-5D-3L from the WOMAC.

PubMed

Kiadaliri, Aliasghar A; Englund, Martin

2016-10-04

The use of mapping algorithms have been suggested as a solution to predict health utilities when no preference-based measure is included in the study. However, validity and predictive performance of these algorithms are highly variable and hence assessing the accuracy and validity of algorithms before use them in a new setting is of importance. The aim of the current study was to assess the predictive accuracy of three mapping algorithms to estimate the EQ-5D-3L from the Western Ontario and McMaster Universities Osteoarthritis Index (WOMAC) among Swedish people with knee disorders. Two of these algorithms developed using ordinary least squares (OLS) models and one developed using mixture model. The data from 1078 subjects mean (SD) age 69.4 (7.2) years with frequent knee pain and/or knee osteoarthritis from the Malmö Osteoarthritis study in Sweden were used. The algorithms' performance was assessed using mean error, mean absolute error, and root mean squared error. Two types of prediction were estimated for mixture model: weighted average (WA), and conditional on estimated component (CEC). The overall mean was overpredicted by an OLS model and underpredicted by two other algorithms (P < 0.001). All predictions but the CEC predictions of mixture model had a narrower range than the observed scores (22 to 90 %). All algorithms suffered from overprediction for severe health states and underprediction for mild health states with lesser extent for mixture model. While the mixture model outperformed OLS models at the extremes of the EQ-5D-3D distribution, it underperformed around the center of the distribution. While algorithm based on mixture model reflected the distribution of EQ-5D-3L data more accurately compared with OLS models, all algorithms suffered from systematic bias. This calls for caution in applying these mapping algorithms in a new setting particularly in samples with milder knee problems than original sample. Assessing the impact of the choice of these algorithms on cost-effectiveness studies through sensitivity analysis is recommended.
Combining the genetic algorithm and successive projection algorithm for the selection of feature wavelengths to evaluate exudative characteristics in frozen-thawed fish muscle.

PubMed

Cheng, Jun-Hu; Sun, Da-Wen; Pu, Hongbin

2016-04-15

The potential use of feature wavelengths for predicting drip loss in grass carp fish, as affected by being frozen at -20°C for 24 h and thawed at 4°C for 1, 2, 4, and 6 days, was investigated. Hyperspectral images of frozen-thawed fish were obtained and their corresponding spectra were extracted. Least-squares support vector machine and multiple linear regression (MLR) models were established using five key wavelengths, selected by combining a genetic algorithm and successive projections algorithm, and this showed satisfactory performance in drip loss prediction. The MLR model with a determination coefficient of prediction (R(2)P) of 0.9258, and lower root mean square error estimated by a prediction (RMSEP) of 1.12%, was applied to transfer each pixel of the image and generate the distribution maps of exudation changes. The results confirmed that it is feasible to identify the feature wavelengths using variable selection methods and chemometric analysis for developing on-line multispectral imaging. Copyright © 2015 Elsevier Ltd. All rights reserved.
A Novel RSSI Prediction Using Imperialist Competition Algorithm (ICA), Radial Basis Function (RBF) and Firefly Algorithm (FFA) in Wireless Networks

PubMed Central

Goudarzi, Shidrokh; Haslina Hassan, Wan; Abdalla Hashim, Aisha-Hassan; Soleymani, Seyed Ahmad; Anisi, Mohammad Hossein; Zakaria, Omar M.

2016-01-01

This study aims to design a vertical handover prediction method to minimize unnecessary handovers for a mobile node (MN) during the vertical handover process. This relies on a novel method for the prediction of a received signal strength indicator (RSSI) referred to as IRBF-FFA, which is designed by utilizing the imperialist competition algorithm (ICA) to train the radial basis function (RBF), and by hybridizing with the firefly algorithm (FFA) to predict the optimal solution. The prediction accuracy of the proposed IRBF–FFA model was validated by comparing it to support vector machines (SVMs) and multilayer perceptron (MLP) models. In order to assess the model’s performance, we measured the coefficient of determination (R2), correlation coefficient (r), root mean square error (RMSE) and mean absolute percentage error (MAPE). The achieved results indicate that the IRBF–FFA model provides more precise predictions compared to different ANNs, namely, support vector machines (SVMs) and multilayer perceptron (MLP). The performance of the proposed model is analyzed through simulated and real-time RSSI measurements. The results also suggest that the IRBF–FFA model can be applied as an efficient technique for the accurate prediction of vertical handover. PMID:27438600
A Novel RSSI Prediction Using Imperialist Competition Algorithm (ICA), Radial Basis Function (RBF) and Firefly Algorithm (FFA) in Wireless Networks.

PubMed

Goudarzi, Shidrokh; Haslina Hassan, Wan; Abdalla Hashim, Aisha-Hassan; Soleymani, Seyed Ahmad; Anisi, Mohammad Hossein; Zakaria, Omar M

2016-01-01

This study aims to design a vertical handover prediction method to minimize unnecessary handovers for a mobile node (MN) during the vertical handover process. This relies on a novel method for the prediction of a received signal strength indicator (RSSI) referred to as IRBF-FFA, which is designed by utilizing the imperialist competition algorithm (ICA) to train the radial basis function (RBF), and by hybridizing with the firefly algorithm (FFA) to predict the optimal solution. The prediction accuracy of the proposed IRBF-FFA model was validated by comparing it to support vector machines (SVMs) and multilayer perceptron (MLP) models. In order to assess the model's performance, we measured the coefficient of determination (R2), correlation coefficient (r), root mean square error (RMSE) and mean absolute percentage error (MAPE). The achieved results indicate that the IRBF-FFA model provides more precise predictions compared to different ANNs, namely, support vector machines (SVMs) and multilayer perceptron (MLP). The performance of the proposed model is analyzed through simulated and real-time RSSI measurements. The results also suggest that the IRBF-FFA model can be applied as an efficient technique for the accurate prediction of vertical handover.
Model assessment using a multi-metric ranking technique

NASA Astrophysics Data System (ADS)

Fitzpatrick, P. J.; Lau, Y.; Alaka, G.; Marks, F.

2017-12-01

Validation comparisons of multiple models presents challenges when skill levels are similar, especially in regimes dominated by the climatological mean. Assessing skill separation will require advanced validation metrics and identifying adeptness in extreme events, but maintain simplicity for management decisions. Flexibility for operations is also an asset. This work postulates a weighted tally and consolidation technique which ranks results by multiple types of metrics. Variables include absolute error, bias, acceptable absolute error percentages, outlier metrics, model efficiency, Pearson correlation, Kendall's Tau, reliability Index, multiplicative gross error, and root mean squared differences. Other metrics, such as root mean square difference and rank correlation were also explored, but removed when the information was discovered to be generally duplicative to other metrics. While equal weights are applied, weights could be altered depending for preferred metrics. Two examples are shown comparing ocean models' currents and tropical cyclone products, including experimental products. The importance of using magnitude and direction for tropical cyclone track forecasts instead of distance, along-track, and cross-track are discussed. Tropical cyclone intensity and structure prediction are also assessed. Vector correlations are not included in the ranking process, but found useful in an independent context, and will be briefly reported.
Prediction and validation of residual feed intake and dry matter intake in Danish lactating dairy cows using mid-infrared spectroscopy of milk.

PubMed

Shetty, N; Løvendahl, P; Lund, M S; Buitenhuis, A J

2017-01-01

The present study explored the effectiveness of Fourier transform mid-infrared (FT-IR) spectral profiles as a predictor for dry matter intake (DMI) and residual feed intake (RFI). The partial least squares regression method was used to develop the prediction models. The models were validated using different external test sets, one randomly leaving out 20% of the records (validation A), the second randomly leaving out 20% of cows (validation B), and a third (for DMI prediction models) randomly leaving out one cow (validation C). The data included 1,044 records from 140 cows; 97 were Danish Holstein and 43 Danish Jersey. Results showed better accuracies for validation A compared with other validation methods. Milk yield (MY) contributed largely to DMI prediction; MY explained 59% of the variation and the validated model error root mean square error of prediction (RMSEP) was 2.24kg. The model was improved by adding live weight (LW) as an additional predictor trait, where the accuracy R 2 increased from 0.59 to 0.72 and error RMSEP decreased from 2.24 to 1.83kg. When only the milk FT-IR spectral profile was used in DMI prediction, a lower prediction ability was obtained, with R 2 =0.30 and RMSEP=2.91kg. However, once the spectral information was added, along with MY and LW as predictors, model accuracy improved and R 2 increased to 0.81 and RMSEP decreased to 1.49kg. Prediction accuracies of RFI changed throughout lactation. The RFI prediction model for the early-lactation stage was better compared with across lactation or mid- and late-lactation stages, with R 2 =0.46 and RMSEP=1.70. The most important spectral wavenumbers that contributed to DMI and RFI prediction models included fat, protein, and lactose peaks. Comparable prediction results were obtained when using infrared-predicted fat, protein, and lactose instead of full spectra, indicating that FT-IR spectral data do not add significant new information to improve DMI and RFI prediction models. Therefore, in practice, if full FT-IR spectral data are not stored, it is possible to achieve similar DMI or RFI prediction results based on standard milk control data. For DMI, the milk fat region was responsible for the major variation in milk spectra; for RFI, the major variation in milk spectra was within the milk protein region. Copyright © 2017 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Linking Field and Satellite Observations to Reveal Differences in Single vs. Double-Cropped Soybean Yields in Central Brazil

NASA Astrophysics Data System (ADS)

Jeffries, G. R.; Cohn, A.

2016-12-01

Soy-corn double cropping (DC) has been widely adopted in Central Brazil alongside single cropped (SC) soybean production. DC involves different cropping calendars, soy varieties, and may be associated with different crop yield patterns and volatility than SC. Study of the performance of the region's agriculture in a changing climate depends on tracking differences in the productivity of SC vs. DC, but has been limited by crop yield data that conflate the two systems. We predicted SC and DC yields across Central Brazil, drawing on field observations and remotely sensed data. We first modeled field yield estimates as a function of remotely sensed DC status and vegetation index (VI) metrics, and other management and biophysical factors. We then used the statistical model estimated to predict SC and DC soybean yields at each 500 m2 grid cell of Central Brazil for harvest years 2001 - 2015. The yield estimation model was constructed using 1) a repeated cross-sectional survey of soybean yields and management factors for years 2007-2015, 2) a custom agricultural land cover classification dataset which assimilates earlier datasets for the region, and 3) 500m 8-day MODIS image composites used to calculate the wide dynamic range vegetation index (WDRVI) and derivative metrics such as area under the curve for WDRVI values in critical crop development periods. A statistical yield estimation model which primarily entails WDRVI metrics, DC status, and spatial fixed effects was developed on a subset of the yield dataset. Model validation was conducted by predicting previously withheld yield records, and then assessing error and goodness-of-fit for predicted values with metrics including root mean squared error (RMSE), mean squared error (MSE), and R2. We found a statistical yield estimation model which incorporates WDRVI and DC status to be way to estimate crop yields over the region. Statistical properties of the resulting gridded yield dataset may be valuable for understanding linkages between crop yields, farm management factors, and climate.
Performance assessment of nitrate leaching models for highly vulnerable soils used in low-input farming based on lysimeter data.

PubMed

Groenendijk, Piet; Heinen, Marius; Klammler, Gernot; Fank, Johann; Kupfersberger, Hans; Pisinaras, Vassilios; Gemitzi, Alexandra; Peña-Haro, Salvador; García-Prats, Alberto; Pulido-Velazquez, Manuel; Perego, Alessia; Acutis, Marco; Trevisan, Marco

2014-11-15

The agricultural sector faces the challenge of ensuring food security without an excessive burden on the environment. Simulation models provide excellent instruments for researchers to gain more insight into relevant processes and best agricultural practices and provide tools for planners for decision making support. The extent to which models are capable of reliable extrapolation and prediction is important for exploring new farming systems or assessing the impacts of future land and climate changes. A performance assessment was conducted by testing six detailed state-of-the-art models for simulation of nitrate leaching (ARMOSA, COUPMODEL, DAISY, EPIC, SIMWASER/STOTRASIM, SWAP/ANIMO) for lysimeter data of the Wagna experimental field station in Eastern Austria, where the soil is highly vulnerable to nitrate leaching. Three consecutive phases were distinguished to gain insight in the predictive power of the models: 1) a blind test for 2005-2008 in which only soil hydraulic characteristics, meteorological data and information about the agricultural management were accessible; 2) a calibration for the same period in which essential information on field observations was additionally available to the modellers; and 3) a validation for 2009-2011 with the corresponding type of data available as for the blind test. A set of statistical metrics (mean absolute error, root mean squared error, index of agreement, model efficiency, root relative squared error, Pearson's linear correlation coefficient) was applied for testing the results and comparing the models. None of the models performed good for all of the statistical metrics. Models designed for nitrate leaching in high-input farming systems had difficulties in accurately predicting leaching in low-input farming systems that are strongly influenced by the retention of nitrogen in catch crops and nitrogen fixation by legumes. An accurate calibration does not guarantee a good predictive power of the model. Nevertheless all models were able to identify years and crops with high- and low-leaching rates. Copyright © 2014 Elsevier B.V. All rights reserved.
August median streamflow on ungaged streams in Eastern Coastal Maine

USGS Publications Warehouse

Lombard, Pamela J.

2004-01-01

Methods for estimating August median streamflow were developed for ungaged, unregulated streams in eastern coastal Maine. The methods apply to streams with drainage areas ranging in size from 0.04 to 73.2 square miles and fraction of basin underlain by a sand and gravel aquifer ranging from 0 to 71 percent. The equations were developed with data from three long-term (greater than or equal to 10 years of record) continuous-record streamflow-gaging stations, 23 partial-record streamflow- gaging stations, and 5 short-term (less than 10 years of record) continuous-record streamflow-gaging stations. A mathematical technique for estimating a standard low-flow statistic, August median streamflow, at partial-record streamflow-gaging stations and short-term continuous-record streamflow-gaging stations was applied by relating base-flow measurements at these stations to concurrent daily streamflows at nearby long-term continuous-record streamflow-gaging stations (index stations). Generalized least-squares regression analysis (GLS) was used to relate estimates of August median streamflow at streamflow-gaging stations to basin characteristics at these same stations to develop equations that can be applied to estimate August median streamflow on ungaged streams. GLS accounts for different periods of record at the gaging stations and the cross correlation of concurrent streamflows among gaging stations. Thirty-one stations were used for the final regression equations. Two basin characteristics?drainage area and fraction of basin underlain by a sand and gravel aquifer?are used in the calculated regression equation to estimate August median streamflow for ungaged streams. The equation has an average standard error of prediction from -27 to 38 percent. A one-variable equation uses only drainage area to estimate August median streamflow when less accuracy is acceptable. This equation has an average standard error of prediction from -30 to 43 percent. Model error is larger than sampling error for both equations, indicating that additional or improved estimates of basin characteristics could be important to improved estimates of low-flow statistics. Weighted estimates of August median streamflow at partial- record or continuous-record gaging stations range from 0.003 to 31.0 cubic feet per second or from 0.1 to 0.6 cubic feet per second per square mile. Estimates of August median streamflow on ungaged streams in eastern coastal Maine, within the range of acceptable explanatory variables, range from 0.003 to 45 cubic feet per second or 0.1 to 0.6 cubic feet per second per square mile. Estimates of August median streamflow per square mile of drainage area generally increase as drainage area and fraction of basin underlain by a sand and gravel aquifer increase.
Three-dimensional holoscopic image coding scheme using high-efficiency video coding with kernel-based minimum mean-square-error estimation

NASA Astrophysics Data System (ADS)

Liu, Deyang; An, Ping; Ma, Ran; Yang, Chao; Shen, Liquan; Li, Kai

2016-07-01

Three-dimensional (3-D) holoscopic imaging, also known as integral imaging, light field imaging, or plenoptic imaging, can provide natural and fatigue-free 3-D visualization. However, a large amount of data is required to represent the 3-D holoscopic content. Therefore, efficient coding schemes for this particular type of image are needed. A 3-D holoscopic image coding scheme with kernel-based minimum mean square error (MMSE) estimation is proposed. In the proposed scheme, the coding block is predicted by an MMSE estimator under statistical modeling. In order to obtain the signal statistical behavior, kernel density estimation (KDE) is utilized to estimate the probability density function of the statistical modeling. As bandwidth estimation (BE) is a key issue in the KDE problem, we also propose a BE method based on kernel trick. The experimental results demonstrate that the proposed scheme can achieve a better rate-distortion performance and a better visual rendering quality.
Radon-222 concentrations in ground water and soil gas on Indian reservations in Wisconsin

USGS Publications Warehouse

DeWild, John F.; Krohelski, James T.

1995-01-01

For sites with wells finished in the sand and gravel aquifer, the coefficient of determination (R2) of the regression of concentration of radon-222 in ground water as a function of well depth is 0.003 and the significance level is 0.32, which indicates that there is not a statistically significant relation between radon-222 concentrations in ground water and well depth. The coefficient of determination of the regression of radon-222 in ground water and soil gas is 0.19 and the root mean square error of the regression line is 271 picocuries per liter. Even though the significance level (0.036) indicates a statistical relation, the root mean square error of the regression is so large that the regression equation would not give reliable predictions. Because of an inadequate number of samples, similar statistical analyses could not be performed for sites with wells finished in the crystalline and sedimentary bedrock aquifers.
Analysis of the quality of image data acquired by the LANDSAT-4 Thematic Mapper and Multispectral Scanners

NASA Technical Reports Server (NTRS)

Colwell, R. N. (Principal Investigator)

1984-01-01

The geometric quality of TM film and digital products is evaluated by making selective photomeasurements and by measuring the coordinates of known features on both the TM products and map products. These paired observations are related using a standard linear least squares regression approach. Using regression equations and coefficients developed from 225 (TM film product) and 20 (TM digital product) control points, map coordinates of test points are predicted. The residual error vectors and analysis of variance (ANOVA) were performed on the east and north residual using nine image segments (blocks) as treatments. Based on the root mean square error of the 223 (TM film product) and 22 (TM digital product) test points, users of TM data expect the planimetric accuracy of mapped points to be within 91 meters and within 117 meters for the film products, and to be within 12 meters and within 14 meters for the digital products.
Measuring a diffusion coefficient by single-particle tracking: statistical analysis of experimental mean squared displacement curves.

PubMed

Ernst, Dominique; Köhler, Jürgen

2013-01-21

We provide experimental results on the accuracy of diffusion coefficients obtained by a mean squared displacement (MSD) analysis of single-particle trajectories. We have recorded very long trajectories comprising more than 1.5 × 10(5) data points and decomposed these long trajectories into shorter segments providing us with ensembles of trajectories of variable lengths. This enabled a statistical analysis of the resulting MSD curves as a function of the lengths of the segments. We find that the relative error of the diffusion coefficient can be minimized by taking an optimum number of points into account for fitting the MSD curves, and that this optimum does not depend on the segment length. Yet, the magnitude of the relative error for the diffusion coefficient does, and achieving an accuracy in the order of 10% requires the recording of trajectories with about 1000 data points. Finally, we compare our results with theoretical predictions and find very good qualitative and quantitative agreement between experiment and theory.
Improvement of near infrared spectroscopic (NIRS) analysis of caffeine in roasted Arabica coffee by variable selection method of stability competitive adaptive reweighted sampling (SCARS).

PubMed

Zhang, Xuan; Li, Wei; Yin, Bin; Chen, Weizhong; Kelly, Declan P; Wang, Xiaoxin; Zheng, Kaiyi; Du, Yiping

2013-10-01

Coffee is the most heavily consumed beverage in the world after water, for which quality is a key consideration in commercial trade. Therefore, caffeine content which has a significant effect on the final quality of the coffee products requires to be determined fast and reliably by new analytical techniques. The main purpose of this work was to establish a powerful and practical analytical method based on near infrared spectroscopy (NIRS) and chemometrics for quantitative determination of caffeine content in roasted Arabica coffees. Ground coffee samples within a wide range of roasted levels were analyzed by NIR, meanwhile, in which the caffeine contents were quantitative determined by the most commonly used HPLC-UV method as the reference values. Then calibration models based on chemometric analyses of the NIR spectral data and reference concentrations of coffee samples were developed. Partial least squares (PLS) regression was used to construct the models. Furthermore, diverse spectra pretreatment and variable selection techniques were applied in order to obtain robust and reliable reduced-spectrum regression models. Comparing the respective quality of the different models constructed, the application of second derivative pretreatment and stability competitive adaptive reweighted sampling (SCARS) variable selection provided a notably improved regression model, with root mean square error of cross validation (RMSECV) of 0.375 mg/g and correlation coefficient (R) of 0.918 at PLS factor of 7. An independent test set was used to assess the model, with the root mean square error of prediction (RMSEP) of 0.378 mg/g, mean relative error of 1.976% and mean relative standard deviation (RSD) of 1.707%. Thus, the results provided by the high-quality calibration model revealed the feasibility of NIR spectroscopy for at-line application to predict the caffeine content of unknown roasted coffee samples, thanks to the short analysis time of a few seconds and non-destructive advantages of NIRS. Copyright © 2013 Elsevier B.V. All rights reserved.
Improvement of near infrared spectroscopic (NIRS) analysis of caffeine in roasted Arabica coffee by variable selection method of stability competitive adaptive reweighted sampling (SCARS)

NASA Astrophysics Data System (ADS)

Zhang, Xuan; Li, Wei; Yin, Bin; Chen, Weizhong; Kelly, Declan P.; Wang, Xiaoxin; Zheng, Kaiyi; Du, Yiping

2013-10-01

Coffee is the most heavily consumed beverage in the world after water, for which quality is a key consideration in commercial trade. Therefore, caffeine content which has a significant effect on the final quality of the coffee products requires to be determined fast and reliably by new analytical techniques. The main purpose of this work was to establish a powerful and practical analytical method based on near infrared spectroscopy (NIRS) and chemometrics for quantitative determination of caffeine content in roasted Arabica coffees. Ground coffee samples within a wide range of roasted levels were analyzed by NIR, meanwhile, in which the caffeine contents were quantitative determined by the most commonly used HPLC-UV method as the reference values. Then calibration models based on chemometric analyses of the NIR spectral data and reference concentrations of coffee samples were developed. Partial least squares (PLS) regression was used to construct the models. Furthermore, diverse spectra pretreatment and variable selection techniques were applied in order to obtain robust and reliable reduced-spectrum regression models. Comparing the respective quality of the different models constructed, the application of second derivative pretreatment and stability competitive adaptive reweighted sampling (SCARS) variable selection provided a notably improved regression model, with root mean square error of cross validation (RMSECV) of 0.375 mg/g and correlation coefficient (R) of 0.918 at PLS factor of 7. An independent test set was used to assess the model, with the root mean square error of prediction (RMSEP) of 0.378 mg/g, mean relative error of 1.976% and mean relative standard deviation (RSD) of 1.707%. Thus, the results provided by the high-quality calibration model revealed the feasibility of NIR spectroscopy for at-line application to predict the caffeine content of unknown roasted coffee samples, thanks to the short analysis time of a few seconds and non-destructive advantages of NIRS.
Evaluation of single-point sampling strategies for the estimation of moclobemide exposure in depressive patients.

PubMed

Ignjatovic, Anita Rakic; Miljkovic, Branislava; Todorovic, Dejan; Timotijevic, Ivana; Pokrajac, Milena

2011-05-01

Because moclobemide pharmacokinetics vary considerably among individuals, monitoring of plasma concentrations lends insight into its pharmacokinetic behavior and enhances its rational use in clinical practice. The aim of this study was to evaluate whether single concentration-time points could adequately predict moclobemide systemic exposure. Pharmacokinetic data (full 7-point pharmacokinetic profiles), obtained from 21 depressive inpatients receiving moclobemide (150 mg 3 times daily), were randomly split into development (n = 18) and validation (n = 16) sets. Correlations between the single concentration-time points and the area under the concentration-time curve within a 6-hour dosing interval at steady-state (AUC(0-6)) were assessed by linear regression analyses. The predictive performance of single-point sampling strategies was evaluated in the validation set by mean prediction error, mean absolute error, and root mean square error. Plasma concentrations in the absorption phase yielded unsatisfactory predictions of moclobemide AUC(0-6). The best estimation of AUC(0-6) was achieved from concentrations at 4 and 6 hours following dosing. As the most reliable surrogate for moclobemide systemic exposure, concentrations at 4 and 6 hours should be used instead of predose trough concentrations as an indicator of between-patient variability and a guide for dose adjustments in specific clinical situations.
Application of Artificial Neural Network and Response Surface Methodology in Modeling of Surface Roughness in WS2 Solid Lubricant Assisted MQL Turning of Inconel 718

NASA Astrophysics Data System (ADS)

Maheshwera Reddy Paturi, Uma; Devarasetti, Harish; Abimbola Fadare, David; Reddy Narala, Suresh Kumar

2018-04-01

In the present paper, the artificial neural network (ANN) and response surface methodology (RSM) are used in modeling of surface roughness in WS2 (tungsten disulphide) solid lubricant assisted minimal quantity lubrication (MQL) machining. The real time MQL turning of Inconel 718 experimental data considered in this paper was available in the literature [1]. In ANN modeling, performance parameters such as mean square error (MSE), mean absolute percentage error (MAPE) and average error in prediction (AEP) for the experimental data were determined based on Levenberg–Marquardt (LM) feed forward back propagation training algorithm with tansig as transfer function. The MATLAB tool box has been utilized in training and testing of neural network model. Neural network model with three input neurons, one hidden layer with five neurons and one output neuron (3-5-1 architecture) is found to be most confidence and optimal. The coefficient of determination (R2) for both the ANN and RSM model were seen to be 0.998 and 0.982 respectively. The surface roughness predictions from ANN and RSM model were related with experimentally measured values and found to be in good agreement with each other. However, the prediction efficacy of ANN model is relatively high when compared with RSM model predictions.
Carbon Nanotubes' Effect on Mitochondrial Oxygen Flux Dynamics: Polarography Experimental Study and Machine Learning Models using Star Graph Trace Invariants of Raman Spectra.

PubMed

González-Durruthy, Michael; Monserrat, Jose M; Rasulev, Bakhtiyor; Casañola-Martín, Gerardo M; Barreiro Sorrivas, José María; Paraíso-Medina, Sergio; Maojo, Víctor; González-Díaz, Humberto; Pazos, Alejandro; Munteanu, Cristian R

2017-11-11

This study presents the impact of carbon nanotubes (CNTs) on mitochondrial oxygen mass flux ( J m ) under three experimental conditions. New experimental results and a new methodology are reported for the first time and they are based on CNT Raman spectra star graph transform (spectral moments) and perturbation theory. The experimental measures of J m showed that no tested CNT family can inhibit the oxygen consumption profiles of mitochondria. The best model for the prediction of J m for other CNTs was provided by random forest using eight features, obtaining test R-squared ( R ²) of 0.863 and test root-mean-square error (RMSE) of 0.0461. The results demonstrate the capability of encoding CNT information into spectral moments of the Raman star graphs (SG) transform with a potential applicability as predictive tools in nanotechnology and material risk assessments.

Simultaneous determination of three herbicides by differential pulse voltammetry and chemometrics.

PubMed

Ni, Yongnian; Wang, Lin; Kokot, Serge

2011-01-01

A novel differential pulse voltammetry method (DPV) was researched and developed for the simultaneous determination of Pendimethalin, Dinoseb and sodium 5-nitroguaiacolate (5NG) with the aid of chemometrics. The voltammograms of these three compounds overlapped significantly, and to facilitate the simultaneous determination of the three analytes, chemometrics methods were applied. These included classical least squares (CLS), principal component regression (PCR), partial least squares (PLS) and radial basis function-artificial neural networks (RBF-ANN). A separately prepared verification data set was used to confirm the calibrations, which were built from the original and first derivative data matrices of the voltammograms. On the basis relative prediction errors and recoveries of the analytes, the RBF-ANN and the DPLS (D - first derivative spectra) models performed best and are particularly recommended for application. The DPLS calibration model was applied satisfactorily for the prediction of the three analytes from market vegetables and lake water samples.
Submillimeter, millimeter, and microwave spectral line catalogue

NASA Technical Reports Server (NTRS)

Poynter, R. L.; Pickett, H. M.

1980-01-01

A computer accessible catalogue of submillimeter, millimeter, and microwave spectral lines in the frequency range between O and 3000 GHz (such as; wavelengths longer than 100 m) is discussed. The catalogue was used as a planning guide and as an aid in the identification and analysis of observed spectral lines. The information listed for each spectral line includes the frequency and its estimated error, the intensity, lower state energy, and quantum number assignment. The catalogue was constructed by using theoretical least squares fits of published spectral lines to accepted molecular models. The associated predictions and their estimated errors are based upon the resultant fitted parameters and their covariances.
Predicting enteric methane emission of dairy cows with milk Fourier-transform infrared spectra and gas chromatography-based milk fatty acid profiles.

PubMed

van Gastelen, S; Mollenhorst, H; Antunes-Fernandes, E C; Hettinga, K A; van Burgsteden, G G; Dijkstra, J; Rademaker, J L W

2018-06-01

The objective of the present study was to compare the prediction potential of milk Fourier-transform infrared spectroscopy (FTIR) for CH 4 emissions of dairy cows with that of gas chromatography (GC)-based milk fatty acids (MFA). Data from 9 experiments with lactating Holstein-Friesian cows, with a total of 30 dietary treatments and 218 observations, were used. Methane emissions were measured for 3 consecutive days in climate respiration chambers and expressed as production (g/d), yield (g/kg of dry matter intake; DMI), and intensity (g/kg of fat- and protein-corrected milk; FPCM). Dry matter intake was 16.3 ± 2.18 kg/d (mean ± standard deviation), FPCM yield was 25.9 ± 5.06 kg/d, CH 4 production was 366 ± 53.9 g/d, CH 4 yield was 22.5 ± 2.10 g/kg of DMI, and CH 4 intensity was 14.4 ± 2.58 g/kg of FPCM. Milk was sampled during the same days and analyzed by GC and by FTIR. Multivariate GC-determined MFA-based and FTIR-based CH 4 prediction models were developed, and subsequently, the final CH 4 prediction models were evaluated with root mean squared error of prediction and concordance correlation coefficient analysis. Further, we performed a random 10-fold cross validation to calculate the performance parameters of the models (e.g., the coefficient of determination of cross validation). The final GC-determined MFA-based CH 4 prediction models estimate CH 4 production, yield, and intensity with a root mean squared error of prediction of 35.7 g/d, 1.6 g/kg of DMI, and 1.6 g/kg of FPCM and with a concordance correlation coefficient of 0.72, 0.59, and 0.77, respectively. The final FTIR-based CH 4 prediction models estimate CH 4 production, yield, and intensity with a root mean squared error of prediction of 43.2 g/d, 1.9 g/kg of DMI, and 1.7 g/kg of FPCM and with a concordance correlation coefficient of 0.52, 0.40, and 0.72, respectively. The GC-determined MFA-based prediction models described a greater part of the observed variation in CH 4 emission than did the FTIR-based models. The cross validation results indicate that all CH 4 prediction models (both GC-determined MFA-based and FTIR-based models) are robust; the difference between the coefficient of determination and the coefficient of determination of cross validation ranged from 0.01 to 0.07. The results indicate that GC-determined MFA have a greater potential than FTIR spectra to estimate CH 4 production, yield, and intensity. Both techniques hold potential but may not yet be ready to predict CH 4 emission of dairy cows in practice. Additional CH 4 measurements are needed to improve the accuracy and robustness of GC-determined MFA and FTIR spectra for CH 4 prediction. Copyright © 2018 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Multivariate Adaptive Regression Splines (Preprint)

DTIC Science & Technology

1990-08-01

fold cross -validation would take about ten time as long, and MARS is not all that fast to begin with. Friedman has a number of examples showing...standardized mean squared error of prediction (MSEP), the generalized cross validation (GCV), and the number of selected terms (TERMS). In accordance with...and mi= 10 case were almost exclusively spurious cross product terms and terms involving the nuisance variables x6 through xlo. This large number of
Quantitative assessment of copper proteinates used as animal feed additives using ATR-FTIR spectroscopy and powder X-ray diffraction (PXRD) analysis.

PubMed

Cantwell, Caoimhe A; Byrne, Laurann A; Connolly, Cathal D; Hynes, Michael J; McArdle, Patrick; Murphy, Richard A

2017-08-01

The aim of the present work was to establish a reliable analytical method to determine the degree of complexation in commercial metal proteinates used as feed additives in the solid state. Two complementary techniques were developed. Firstly, a quantitative attenuated total reflectance Fourier transform infrared (ATR-FTIR) spectroscopic method investigated modifications in vibrational absorption bands of the ligand on complex formation. Secondly, a powder X-ray diffraction (PXRD) method to quantify the amount of crystalline material in the proteinate product was developed. These methods were developed in tandem and cross-validated with each other. Multivariate analysis (MVA) was used to develop validated calibration and prediction models. The FTIR and PXRD calibrations showed excellent linearity (R 2 > 0.99). The diagnostic model parameters showed that the FTIR and PXRD methods were robust with a root mean square error of calibration RMSEC ≤3.39% and a root mean square error of prediction RMSEP ≤7.17% respectively. Comparative statistics show excellent agreement between the MVA packages assessed and between the FTIR and PXRD methods. The methods can be used to determine the degree of complexation in complexes of both protein hydrolysates and pure amino acids.
On a stronger-than-best property for best prediction

NASA Astrophysics Data System (ADS)

Teunissen, P. J. G.

2008-03-01

The minimum mean squared error (MMSE) criterion is a popular criterion for devising best predictors. In case of linear predictors, it has the advantage that no further distributional assumptions need to be made, other then about the first- and second-order moments. In the spatial and Earth sciences, it is the best linear unbiased predictor (BLUP) that is used most often. Despite the fact that in this case only the first- and second-order moments need to be known, one often still makes statements about the complete distribution, in particular when statistical testing is involved. For such cases, one can do better than the BLUP, as shown in Teunissen (J Geod. doi: 10.1007/s00190-007-0140-6, 2006), and thus devise predictors that have a smaller MMSE than the BLUP. Hence, these predictors are to be preferred over the BLUP, if one really values the MMSE-criterion. In the present contribution, we will show, however, that the BLUP has another optimality property than the MMSE-property, provided that the distribution is Gaussian. It will be shown that in the Gaussian case, the prediction error of the BLUP has the highest possible probability of all linear unbiased predictors of being bounded in the weighted squared norm sense. This is a stronger property than the often advertised MMSE-property of the BLUP.
Accounting for measurement error in log regression models with applications to accelerated testing.

PubMed

Richardson, Robert; Tolley, H Dennis; Evenson, William E; Lunt, Barry M

2018-01-01

In regression settings, parameter estimates will be biased when the explanatory variables are measured with error. This bias can significantly affect modeling goals. In particular, accelerated lifetime testing involves an extrapolation of the fitted model, and a small amount of bias in parameter estimates may result in a significant increase in the bias of the extrapolated predictions. Additionally, bias may arise when the stochastic component of a log regression model is assumed to be multiplicative when the actual underlying stochastic component is additive. To account for these possible sources of bias, a log regression model with measurement error and additive error is approximated by a weighted regression model which can be estimated using Iteratively Re-weighted Least Squares. Using the reduced Eyring equation in an accelerated testing setting, the model is compared to previously accepted approaches to modeling accelerated testing data with both simulations and real data.
Kinetic modelling for zinc (II) ions biosorption onto Luffa cylindrica

DOE Office of Scientific and Technical Information (OSTI.GOV)

Oboh, I., E-mail: innocentoboh@uniuyo.edu.ng; Aluyor, E.; Audu, T.

The biosorption of Zinc (II) ions onto a biomaterial - Luffa cylindrica has been studied. This biomaterial was characterized by elemental analysis, surface area, pore size distribution, scanning electron microscopy, and the biomaterial before and after sorption, was characterized by Fourier Transform Infra Red (FTIR) spectrometer. The kinetic nonlinear models fitted were Pseudo-first order, Pseudo-second order and Intra-particle diffusion. A comparison of non-linear regression method in selecting the kinetic model was made. Four error functions, namely coefficient of determination (R{sup 2}), hybrid fractional error function (HYBRID), average relative error (ARE), and sum of the errors squared (ERRSQ), were used tomore » predict the parameters of the kinetic models. The strength of this study is that a biomaterial with wide distribution particularly in the tropical world and which occurs as waste material could be put into effective utilization as a biosorbent to address a crucial environmental problem.« less
Kinetic modelling for zinc (II) ions biosorption onto Luffa cylindrica

NASA Astrophysics Data System (ADS)

Oboh, I.; Aluyor, E.; Audu, T.

2015-03-01

The biosorption of Zinc (II) ions onto a biomaterial - Luffa cylindrica has been studied. This biomaterial was characterized by elemental analysis, surface area, pore size distribution, scanning electron microscopy, and the biomaterial before and after sorption, was characterized by Fourier Transform Infra Red (FTIR) spectrometer. The kinetic nonlinear models fitted were Pseudo-first order, Pseudo-second order and Intra-particle diffusion. A comparison of non-linear regression method in selecting the kinetic model was made. Four error functions, namely coefficient of determination (R2), hybrid fractional error function (HYBRID), average relative error (ARE), and sum of the errors squared (ERRSQ), were used to predict the parameters of the kinetic models. The strength of this study is that a biomaterial with wide distribution particularly in the tropical world and which occurs as waste material could be put into effective utilization as a biosorbent to address a crucial environmental problem.
Lameness detection challenges in automated milking systems addressed with partial least squares discriminant analysis.

PubMed

Garcia, E; Klaas, I; Amigo, J M; Bro, R; Enevoldsen, C

2014-12-01

Lameness causes decreased animal welfare and leads to higher production costs. This study explored data from an automatic milking system (AMS) to model on-farm gait scoring from a commercial farm. A total of 88 cows were gait scored once per week, for 2 5-wk periods. Eighty variables retrieved from AMS were summarized week-wise and used to predict 2 defined classes: nonlame and clinically lame cows. Variables were represented with 2 transformations of the week summarized variables, using 2-wk data blocks before gait scoring, totaling 320 variables (2 × 2 × 80). The reference gait scoring error was estimated in the first week of the study and was, on average, 15%. Two partial least squares discriminant analysis models were fitted to parity 1 and parity 2 groups, respectively, to assign the lameness class according to the predicted probability of being lame (score 3 or 4/4) or not lame (score 1/4). Both models achieved sensitivity and specificity values around 80%, both in calibration and cross-validation. At the optimum values in the receiver operating characteristic curve, the false-positive rate was 28% in the parity 1 model, whereas in the parity 2 model it was about half (16%), which makes it more suitable for practical application; the model error rates were, 23 and 19%, respectively. Based on data registered automatically from one AMS farm, we were able to discriminate nonlame and lame cows, where partial least squares discriminant analysis achieved similar performance to the reference method. Copyright © 2014 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Modeling and forecasting of KLCI weekly return using WT-ANN integrated model

NASA Astrophysics Data System (ADS)

Liew, Wei-Thong; Liong, Choong-Yeun; Hussain, Saiful Izzuan; Isa, Zaidi

2013-04-01

The forecasting of weekly return is one of the most challenging tasks in investment since the time series are volatile and non-stationary. In this study, an integrated model of wavelet transform and artificial neural network, WT-ANN is studied for modeling and forecasting of KLCI weekly return. First, the WT is applied to decompose the weekly return time series in order to eliminate noise. Then, a mathematical model of the time series is constructed using the ANN. The performance of the suggested model will be evaluated by root mean squared error (RMSE), mean absolute error (MAE), mean absolute percentage error (MAPE). The result shows that the WT-ANN model can be considered as a feasible and powerful model for time series modeling and prediction.
[Model of multiple seasonal autoregressive integrated moving average model and its application in prediction of the hand-foot-mouth disease incidence in Changsha].

PubMed

Tan, Ting; Chen, Lizhang; Liu, Fuqiang

2014-11-01

To establish multiple seasonal autoregressive integrated moving average model (ARIMA) according to the hand-foot-mouth disease incidence in Changsha, and to explore the feasibility of the multiple seasonal ARIMA in predicting the hand-foot-mouth disease incidence. EVIEWS 6.0 was used to establish multiple seasonal ARIMA according to the hand-foot- mouth disease incidence from May 2008 to August 2013 in Changsha, and the data of the hand- foot-mouth disease incidence from September 2013 to February 2014 were served as the examined samples of the multiple seasonal ARIMA, then the errors were compared between the forecasted incidence and the real value. Finally, the incidence of hand-foot-mouth disease from March 2014 to August 2014 was predicted by the model. After the data sequence was handled by smooth sequence, model identification and model diagnosis, the multiple seasonal ARIMA (1, 0, 1)×(0, 1, 1)12 was established. The R2 value of the model fitting degree was 0.81, the root mean square prediction error was 8.29 and the mean absolute error was 5.83. The multiple seasonal ARIMA is a good prediction model, and the fitting degree is good. It can provide reference for the prevention and control work in hand-foot-mouth disease.
Intelligent sensing sensory quality of Chinese rice wine using near infrared spectroscopy and nonlinear tools

NASA Astrophysics Data System (ADS)

Ouyang, Qin; Chen, Quansheng; Zhao, Jiewen

2016-02-01

The approach presented herein reports the application of near infrared (NIR) spectroscopy, in contrast with human sensory panel, as a tool for estimating Chinese rice wine quality; concretely, to achieve the prediction of the overall sensory scores assigned by the trained sensory panel. Back propagation artificial neural network (BPANN) combined with adaptive boosting (AdaBoost) algorithm, namely BP-AdaBoost, as a novel nonlinear algorithm, was proposed in modeling. First, the optimal spectra intervals were selected by synergy interval partial least square (Si-PLS). Then, BP-AdaBoost model based on the optimal spectra intervals was established, called Si-BP-AdaBoost model. These models were optimized by cross validation, and the performance of each final model was evaluated according to correlation coefficient (Rp) and root mean square error of prediction (RMSEP) in prediction set. Si-BP-AdaBoost showed excellent performance in comparison with other models. The best Si-BP-AdaBoost model was achieved with Rp = 0.9180 and RMSEP = 2.23 in the prediction set. It was concluded that NIR spectroscopy combined with Si-BP-AdaBoost was an appropriate method for the prediction of the sensory quality in Chinese rice wine.
Using FT-NIR spectroscopy technique to determine arginine content in fermented Cordyceps sinensis mycelium.

PubMed

Xie, Chuanqi; Xu, Ning; Shao, Yongni; He, Yong

2015-01-01

This research investigated the feasibility of using Fourier transform near-infrared (FT-NIR) spectral technique for determining arginine content in fermented Cordyceps sinensis (C. sinensis) mycelium. Three different models were carried out to predict the arginine content. Wavenumber selection methods such as competitive adaptive reweighted sampling (CARS) and successive projections algorithm (SPA) were used to identify the most important wavenumbers and reduce the high dimensionality of the raw spectral data. Only a few wavenumbers were selected by CARS and CARS-SPA as the optimal wavenumbers, respectively. Among the prediction models, CARS-least squares-support vector machine (CARS-LS-SVM) model performed best with the highest values of the coefficient of determination of prediction (Rp(2)=0.8370) and residual predictive deviation (RPD=2.4741), the lowest value of root mean square error of prediction (RMSEP=0.0841). Moreover, the number of the input variables was forty-five, which only accounts for 2.04% of that of the full wavenumbers. The results showed that FT-NIR spectral technique has the potential to be an objective and non-destructive method to detect arginine content in fermented C. sinensis mycelium. Copyright © 2015 Elsevier B.V. All rights reserved.
Characterization of Free Phenytoin Concentrations in End-Stage Renal Disease Using the Winter-Tozer Equation.

PubMed

Soriano, Vincent V; Tesoro, Eljim P; Kane, Sean P

2017-08-01

The Winter-Tozer (WT) equation has been shown to reliably predict free phenytoin levels in healthy patients. In patients with end-stage renal disease (ESRD), phenytoin-albumin binding is altered and, thus, affects interpretation of total serum levels. Although an ESRD WT equation was historically proposed for this population, there is a lack of data evaluating its accuracy. The objective of this study was to determine the accuracy of the ESRD WT equation in predicting free serum phenytoin concentration in patients with ESRD on hemodialysis (HD). A retrospective analysis of adult patients with ESRD on HD and concurrent free and total phenytoin concentrations was conducted. Each patient's true free phenytoin concentration was compared with a calculated value using the ESRD WT equation and a revised version of the ESRD WT equation. A total of 21 patients were included for analysis. The ESRD WT equation produced a percentage error of 75% and a root mean square error of 1.76 µg/mL. Additionally, 67% of the samples had an error >50% when using the ESRD WT equation. A revised equation was found to have high predictive accuracy, with only 5% of the samples demonstrating >50% error. The ESRD WT equation was not accurate in predicting free phenytoin concentration in patients with ESRD on HD. A revised ESRD WT equation was found to be significantly more accurate. Given the small study sample, further studies are required to fully evaluate the clinical utility of the revised ESRD WT equation.
Irradiation dose detection of irradiated milk powder using visible and near-infrared spectroscopy and chemometrics.

PubMed

Kong, W W; Zhang, C; Liu, F; Gong, A P; He, Y

2013-08-01

The objective of this study was to examine the possibility of applying visible and near-infrared spectroscopy to the quantitative detection of irradiation dose of irradiated milk powder. A total of 150 samples were used: 100 for the calibration set and 50 for the validation set. The samples were irradiated at 5 different dose levels in the dose range 0 to 6.0 kGy. Six different pretreatment methods were compared. The prediction results of full spectra given by linear and nonlinear calibration methods suggested that Savitzky-Golay smoothing and first derivative were suitable pretreatment methods in this study. Regression coefficient analysis was applied to select effective wavelengths (EW). Less than 10 EW were selected and they were useful for portable detection instrument or sensor development. Partial least squares, extreme learning machine, and least squares support vector machine were used. The best prediction performance was achieved by the EW-extreme learning machine model with first-derivative spectra, and correlation coefficients=0.97 and root mean square error of prediction=0.844. This study provided a new approach for the fast detection of irradiation dose of milk powder. The results could be helpful for quality detection and safety monitoring of milk powder. Copyright © 2013 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Near-infrared hyperspectral imaging and partial least squares regression for rapid and reagentless determination of Enterobacteriaceae on chicken fillets.

PubMed

Feng, Yao-Ze; Elmasry, Gamal; Sun, Da-Wen; Scannell, Amalia G M; Walsh, Des; Morcy, Noha

2013-06-01

Bacterial pathogens are the main culprits for outbreaks of food-borne illnesses. This study aimed to use the hyperspectral imaging technique as a non-destructive tool for quantitative and direct determination of Enterobacteriaceae loads on chicken fillets. Partial least squares regression (PLSR) models were established and the best model using full wavelengths was obtained in the spectral range 930-1450 nm with coefficients of determination R(2)≥ 0.82 and root mean squared errors (RMSEs) ≤ 0.47 log(10)CFUg(-1). In further development of simplified models, second derivative spectra and weighted PLS regression coefficients (BW) were utilised to select important wavelengths. However, the three wavelengths (930, 1121 and 1345 nm) selected from BW were competent and more preferred for predicting Enterobacteriaceae loads with R(2) of 0.89, 0.86 and 0.87 and RMSEs of 0.33, 0.40 and 0.45 log(10)CFUg(-1) for calibration, cross-validation and prediction, respectively. Besides, the constructed prediction map provided the distribution of Enterobacteriaceae bacteria on chicken fillets, which cannot be achieved by conventional methods. It was demonstrated that hyperspectral imaging is a potential tool for determining food sanitation and detecting bacterial pathogens on food matrix without using complicated laboratory regimes. Copyright © 2012 Elsevier Ltd. All rights reserved.
Application of mid-infrared spectroscopy to the prediction of maturity and sensory texture attributes of cheddar cheese.

PubMed

Fagan, C C; O'Donnell, C P; O'Callaghan, D J; Downey, G; Sheehan, E M; Delahunty, C M; Everard, C; Guinee, T P; Howard, V

2007-04-01

The objective of this study was to determine the potential of mid-infrared spectroscopy in conjunction with partial least squares (PLS) regression to predict various quality parameters in cheddar cheese. Cheddar cheeses (n= 24) were manufactured and stored at 8 degrees C for 12 mo. Mid-infrared spectra (640 to 4000/cm) were recorded after 4, 6, 9, and 12 mo storage. At 4, 6, and 9 mo, the water-soluble nitrogen (WSN) content of the samples was determined and the samples were also evaluated for 11 sensory texture attributes using descriptive sensory analysis. The mid-infrared spectra were subjected to a number of pretreatments, and predictive models were developed for all parameters. Age was predicted using scatter-corrected, 1st derivative spectra with a root mean square error of cross-validation (RMSECV) of 1 mo, while WSN was predicted using 1st derivative spectra (RMSECV = 2.6%). The sensory texture attributes most successfully predicted were rubbery, crumbly, chewy, and massforming. These attributes were modeled using 2nd derivative spectra and had corresponding RMSECV values in the range of 2.5 to 4.2 on a scale of 0 to 100. It was concluded that mid-infrared spectroscopy has the potential to predict age, WSN, and several sensory texture attributes of cheddar cheese.
Iterative random vs. Kennard-Stone sampling for IR spectrum-based classification task using PLS2-DA

NASA Astrophysics Data System (ADS)

Lee, Loong Chuen; Liong, Choong-Yeun; Jemain, Abdul Aziz

2018-04-01

External testing (ET) is preferred over auto-prediction (AP) or k-fold-cross-validation in estimating more realistic predictive ability of a statistical model. With IR spectra, Kennard-stone (KS) sampling algorithm is often used to split the data into training and test sets, i.e. respectively for model construction and for model testing. On the other hand, iterative random sampling (IRS) has not been the favored choice though it is theoretically more likely to produce reliable estimation. The aim of this preliminary work is to compare performances of KS and IRS in sampling a representative training set from an attenuated total reflectance - Fourier transform infrared spectral dataset (of four varieties of blue gel pen inks) for PLS2-DA modeling. The `best' performance achievable from the dataset is estimated with AP on the full dataset (APF, error). Both IRS (n = 200) and KS were used to split the dataset in the ratio of 7:3. The classic decision rule (i.e. maximum value-based) is employed for new sample prediction via partial least squares - discriminant analysis (PLS2-DA). Error rate of each model was estimated repeatedly via: (a) AP on full data (APF, error); (b) AP on training set (APS, error); and (c) ET on the respective test set (ETS, error). A good PLS2-DA model is expected to produce APS, error and EVS, error that is similar to the APF, error. Bearing that in mind, the similarities between (a) APS, error vs. APF, error; (b) ETS, error vs. APF, error and; (c) APS, error vs. ETS, error were evaluated using correlation tests (i.e. Pearson and Spearman's rank test), using series of PLS2-DA models computed from KS-set and IRS-set, respectively. Overall, models constructed from IRS-set exhibits more similarities between the internal and external error rates than the respective KS-set, i.e. less risk of overfitting. In conclusion, IRS is more reliable than KS in sampling representative training set.
[Near infrared spectroscopy study on water content in turbine oil].

PubMed

Chen, Bin; Liu, Ge; Zhang, Xian-Ming

2013-11-01

Near infrared (NIR) spectroscopy combined with successive projections algorithm (SPA) was investigated for determination of water content in turbine oil. Through the 57 samples of different water content in turbine oil scanned applying near infrared (NIR) spectroscopy, with the water content in the turbine oil of 0-0.156%, different pretreatment methods such as the original spectra, first derivative spectra and differential polynomial least squares fitting algorithm Savitzky-Golay (SG), and successive projections algorithm (SPA) were applied for the extraction of effective wavelengths, the correlation coefficient (R) and root mean square error (RMSE) were used as the model evaluation indices, accordingly water content in turbine oil was investigated. The results indicated that the original spectra with different water content in turbine oil were pretreated by the performance of first derivative + SG pretreatments, then the selected effective wavelengths were used as the inputs of least square support vector machine (LS-SVM). A total of 16 variables selected by SPA were employed to construct the model of SPA and least square support vector machine (SPA-LS-SVM). There is 9 as The correlation coefficient was 0.975 9 and the root of mean square error of validation set was 2.655 8 x 10(-3) using the model, and it is feasible to determine the water content in oil using near infrared spectroscopy and SPA-LS-SVM, and an excellent prediction precision was obtained. This study supplied a new and alternative approach to the further application of near infrared spectroscopy in on-line monitoring of contamination such as water content in oil.

Optimal information transfer in enzymatic networks: A field theoretic formulation

NASA Astrophysics Data System (ADS)

Samanta, Himadri S.; Hinczewski, Michael; Thirumalai, D.

2017-07-01

Signaling in enzymatic networks is typically triggered by environmental fluctuations, resulting in a series of stochastic chemical reactions, leading to corruption of the signal by noise. For example, information flow is initiated by binding of extracellular ligands to receptors, which is transmitted through a cascade involving kinase-phosphatase stochastic chemical reactions. For a class of such networks, we develop a general field-theoretic approach to calculate the error in signal transmission as a function of an appropriate control variable. Application of the theory to a simple push-pull network, a module in the kinase-phosphatase cascade, recovers the exact results for error in signal transmission previously obtained using umbral calculus [Hinczewski and Thirumalai, Phys. Rev. X 4, 041017 (2014), 10.1103/PhysRevX.4.041017]. We illustrate the generality of the theory by studying the minimal errors in noise reduction in a reaction cascade with two connected push-pull modules. Such a cascade behaves as an effective three-species network with a pseudointermediate. In this case, optimal information transfer, resulting in the smallest square of the error between the input and output, occurs with a time delay, which is given by the inverse of the decay rate of the pseudointermediate. Surprisingly, in these examples the minimum error computed using simulations that take nonlinearities and discrete nature of molecules into account coincides with the predictions of a linear theory. In contrast, there are substantial deviations between simulations and predictions of the linear theory in error in signal propagation in an enzymatic push-pull network for a certain range of parameters. Inclusion of second-order perturbative corrections shows that differences between simulations and theoretical predictions are minimized. Our study establishes that a field theoretic formulation of stochastic biological signaling offers a systematic way to understand error propagation in networks of arbitrary complexity.
Microwave Photonic Architecture for Direction Finding of LPI Emitters: Post-Processing for Angle of Arrival Estimation

DTIC Science & Technology

2016-09-01

mean- square (RMS) error of 0.29° at ə° resolution. For a P4 coded signal, the RMS error in estimating the AOA is 0.32° at 1° resolution. 14...FMCW signal, it was demonstrated that the system is capable of estimating the AOA with a root-mean- square (RMS) error of 0.29° at ə° resolution. For a...Modulator PCB printed circuit board PD photodetector RF radio frequency RMS root-mean- square xvi THIS PAGE INTENTIONALLY LEFT BLANK xvii
Analysis of the Magnitude and Frequency of Peak Discharges for the Navajo Nation in Arizona, Utah, Colorado, and New Mexico

USGS Publications Warehouse

Waltemeyer, Scott D.

2006-01-01

Estimates of the magnitude and frequency of peak discharges are necessary for the reliable flood-hazard mapping in the Navajo Nation in Arizona, Utah, Colorado, and New Mexico. The Bureau of Indian Affairs, U.S. Army Corps of Engineers, and Navajo Nation requested that the U.S. Geological Survey update estimates of peak discharge magnitude for gaging stations in the region and update regional equations for estimation of peak discharge and frequency at ungaged sites. Equations were developed for estimating the magnitude of peak discharges for recurrence intervals of 2, 5, 10, 25, 50, 100, and 500 years at ungaged sites using data collected through 1999 at 146 gaging stations, an additional 13 years of peak-discharge data since a 1997 investigation, which used gaging-station data through 1986. The equations for estimation of peak discharges at ungaged sites were developed for flood regions 8, 11, high elevation, and 6 and are delineated on the basis of the hydrologic codes from the 1997 investigation. Peak discharges for selected recurrence intervals were determined at gaging stations by fitting observed data to a log-Pearson Type III distribution with adjustments for a low-discharge threshold and a zero skew coefficient. A low-discharge threshold was applied to frequency analysis of 82 of the 146 gaging stations. This application provides an improved fit of the log-Pearson Type III frequency distribution. Use of the low-discharge threshold generally eliminated the peak discharge having a recurrence interval of less than 1.4 years in the probability-density function. Within each region, logarithms of the peak discharges for selected recurrence intervals were related to logarithms of basin and climatic characteristics using stepwise ordinary least-squares regression techniques for exploratory data analysis. Generalized least-squares regression techniques, an improved regression procedure that accounts for time and spatial sampling errors, then was applied to the same data used in the ordinary least-squares regression analyses. The average standard error of prediction for a peak discharge have a recurrence interval of 100-years for region 8 was 53 percent (average) for the 100-year flood. The average standard of prediction, which includes average sampling error and average standard error of regression, ranged from 45 to 83 percent for the 100-year flood. Estimated standard error of prediction for a hybrid method for region 11 was large in the 1997 investigation. No distinction of floods produced from a high-elevation region was presented in the 1997 investigation. Overall, the equations based on generalized least-squares regression techniques are considered to be more reliable than those in the 1997 report because of the increased length of record and improved GIS method. Techniques for transferring flood-frequency relations to ungaged sites on the same stream can be estimated at an ungaged site by a direct application of the regional regression equation or at an ungaged site on a stream that has a gaging station upstream or downstream by using the drainage-area ratio and the drainage-area exponent from the regional regression equation of the respective region.
Ensemble forecasting of short-term system scale irrigation demands using real-time flow data and numerical weather predictions

NASA Astrophysics Data System (ADS)

Perera, Kushan C.; Western, Andrew W.; Robertson, David E.; George, Biju; Nawarathna, Bandara

2016-06-01

Irrigation demands fluctuate in response to weather variations and a range of irrigation management decisions, which creates challenges for water supply system operators. This paper develops a method for real-time ensemble forecasting of irrigation demand and applies it to irrigation command areas of various sizes for lead times of 1 to 5 days. The ensemble forecasts are based on a deterministic time series model coupled with ensemble representations of the various inputs to that model. Forecast inputs include past flow, precipitation, and potential evapotranspiration. These inputs are variously derived from flow observations from a modernized irrigation delivery system; short-term weather forecasts derived from numerical weather prediction models and observed weather data available from automatic weather stations. The predictive performance for the ensemble spread of irrigation demand was quantified using rank histograms, the mean continuous rank probability score (CRPS), the mean CRPS reliability and the temporal mean of the ensemble root mean squared error (MRMSE). The mean forecast was evaluated using root mean squared error (RMSE), Nash-Sutcliffe model efficiency (NSE) and bias. The NSE values for evaluation periods ranged between 0.96 (1 day lead time, whole study area) and 0.42 (5 days lead time, smallest command area). Rank histograms and comparison of MRMSE, mean CRPS, mean CRPS reliability and RMSE indicated that the ensemble spread is generally a reliable representation of the forecast uncertainty for short lead times but underestimates the uncertainty for long lead times.
A theoretical framework to predict the most likely ion path in particle imaging.

PubMed

Collins-Fekete, Charles-Antoine; Volz, Lennart; Portillo, Stephen K N; Beaulieu, Luc; Seco, Joao

2017-03-07

In this work, a generic rigorous Bayesian formalism is introduced to predict the most likely path of any ion crossing a medium between two detection points. The path is predicted based on a combination of the particle scattering in the material and measurements of its initial and final position, direction and energy. The path estimate's precision is compared to the Monte Carlo simulated path. Every ion from hydrogen to carbon is simulated in two scenarios, (1) where the range is fixed and (2) where the initial velocity is fixed. In the scenario where the range is kept constant, the maximal root-mean-square error between the estimated path and the Monte Carlo path drops significantly between the proton path estimate (0.50 mm) and the helium path estimate (0.18 mm), but less so up to the carbon path estimate (0.09 mm). However, this scenario is identified as the configuration that maximizes the dose while minimizing the path resolution. In the scenario where the initial velocity is fixed, the maximal root-mean-square error between the estimated path and the Monte Carlo path drops significantly between the proton path estimate (0.29 mm) and the helium path estimate (0.09 mm) but increases for heavier ions up to carbon (0.12 mm). As a result, helium is found to be the particle with the most accurate path estimate for the lowest dose, potentially leading to tomographic images of higher spatial resolution.
Applying Intelligent Algorithms to Automate the Identification of Error Factors.

PubMed

Jin, Haizhe; Qu, Qingxing; Munechika, Masahiko; Sano, Masataka; Kajihara, Chisato; Duffy, Vincent G; Chen, Han

2018-05-03

Medical errors are the manifestation of the defects occurring in medical processes. Extracting and identifying defects as medical error factors from these processes are an effective approach to prevent medical errors. However, it is a difficult and time-consuming task and requires an analyst with a professional medical background. The issues of identifying a method to extract medical error factors and reduce the extraction difficulty need to be resolved. In this research, a systematic methodology to extract and identify error factors in the medical administration process was proposed. The design of the error report, extraction of the error factors, and identification of the error factors were analyzed. Based on 624 medical error cases across four medical institutes in both Japan and China, 19 error-related items and their levels were extracted. After which, they were closely related to 12 error factors. The relational model between the error-related items and error factors was established based on a genetic algorithm (GA)-back-propagation neural network (BPNN) model. Additionally, compared to GA-BPNN, BPNN, partial least squares regression and support vector regression, GA-BPNN exhibited a higher overall prediction accuracy, being able to promptly identify the error factors from the error-related items. The combination of "error-related items, their different levels, and the GA-BPNN model" was proposed as an error-factor identification technology, which could automatically identify medical error factors.
Portable visible and near-infrared spectrophotometer for triglyceride measurements.

PubMed

Kobayashi, Takanori; Kato, Yukiko Hakariya; Tsukamoto, Megumi; Ikuta, Kazuyoshi; Sakudo, Akikazu

2009-01-01

An affordable and portable machine is required for the practical use of visible and near-infrared (Vis-NIR) spectroscopy. A portable fruit tester comprising a Vis-NIR spectrophotometer was modified for use in the transmittance mode and employed to quantify triglyceride levels in serum in combination with a chemometric analysis. Transmittance spectra collected in the 600- to 1100-nm region were subjected to a partial least-squares regression analysis and leave-out cross-validation to develop a chemometrics model for predicting triglyceride concentrations in serum. The model yielded a coefficient of determination in cross-validation (R2VAL) of 0.7831 with a standard error of cross-validation (SECV) of 43.68 mg/dl. The detection limit of the model was 148.79 mg/dl. Furthermore, masked samples predicted by the model yielded a coefficient of determination in prediction (R2PRED) of 0.6856 with a standard error of prediction (SEP) and detection limit of 61.54 and 159.38 mg/dl, respectively. The portable Vis-NIR spectrophotometer may prove convenient for the measurement of triglyceride concentrations in serum, although before practical use there remain obstacles, which are discussed.
Protein and oil composition predictions of single soybeans by transmission Raman spectroscopy.

PubMed

Schulmerich, Matthew V; Walsh, Michael J; Gelber, Matthew K; Kong, Rong; Kole, Matthew R; Harrison, Sandra K; McKinney, John; Thompson, Dennis; Kull, Linda S; Bhargava, Rohit

2012-08-22

The soybean industry requires rapid, accurate, and precise technologies for the analyses of seed/grain constituents. While the current gold standard for nondestructive quantification of economically and nutritionally important soybean components is near-infrared spectroscopy (NIRS), emerging technology may provide viable alternatives and lead to next generation instrumentation for grain compositional analysis. In principle, Raman spectroscopy provides the necessary chemical information to generate models for predicting the concentration of soybean constituents. In this communication, we explore the use of transmission Raman spectroscopy (TRS) for nondestructive soybean measurements. We show that TRS uses the light scattering properties of soybeans to effectively homogenize the heterogeneous bulk of a soybean for representative sampling. Working with over 1000 individual intact soybean seeds, we developed a simple partial least-squares model for predicting oil and protein content nondestructively. We find TRS to have a root-mean-standard error of prediction (RMSEP) of 0.89% for oil measurements and 0.92% for protein measurements. In both calibration and validation sets, the predicative capabilities of the model were similar to the error in the reference methods.
Design of experiments-based monitoring of critical quality attributes for the spray-drying process of insulin by NIR spectroscopy.

PubMed

Maltesen, Morten Jonas; van de Weert, Marco; Grohganz, Holger

2012-09-01

Moisture content and aerodynamic particle size are critical quality attributes for spray-dried protein formulations. In this study, spray-dried insulin powders intended for pulmonary delivery were produced applying design of experiments methodology. Near infrared spectroscopy (NIR) in combination with preprocessing and multivariate analysis in the form of partial least squares projections to latent structures (PLS) were used to correlate the spectral data with moisture content and aerodynamic particle size measured by a time of flight principle. PLS models predicting the moisture content were based on the chemical information of the water molecules in the NIR spectrum. Models yielded prediction errors (RMSEP) between 0.39% and 0.48% with thermal gravimetric analysis used as reference method. The PLS models predicting the aerodynamic particle size were based on baseline offset in the NIR spectra and yielded prediction errors between 0.27 and 0.48 μm. The morphology of the spray-dried particles had a significant impact on the predictive ability of the models. Good predictive models could be obtained for spherical particles with a calibration error (RMSECV) of 0.22 μm, whereas wrinkled particles resulted in much less robust models with a Q (2) of 0.69. Based on the results in this study, NIR is a suitable tool for process analysis of the spray-drying process and for control of moisture content and particle size, in particular for smooth and spherical particles.
Neural activity during affect labeling predicts expressive writing effects on well-being: GLM and SVM approaches.

PubMed

Memarian, Negar; Torre, Jared B; Haltom, Kate E; Stanton, Annette L; Lieberman, Matthew D

2017-09-01

Affect labeling (putting feelings into words) is a form of incidental emotion regulation that could underpin some benefits of expressive writing (i.e. writing about negative experiences). Here, we show that neural responses during affect labeling predicted changes in psychological and physical well-being outcome measures 3 months later. Furthermore, neural activity of specific frontal regions and amygdala predicted those outcomes as a function of expressive writing. Using supervised learning (support vector machines regression), improvements in four measures of psychological and physical health (physical symptoms, depression, anxiety and life satisfaction) after an expressive writing intervention were predicted with an average of 0.85% prediction error [root mean square error (RMSE) %]. The predictions were significantly more accurate with machine learning than with the conventional generalized linear model method (average RMSE: 1.3%). Consistent with affect labeling research, right ventrolateral prefrontal cortex (RVLPFC) and amygdalae were top predictors of improvement in the four outcomes. Moreover, RVLPFC and left amygdala predicted benefits due to expressive writing in satisfaction with life and depression outcome measures, respectively. This study demonstrates the substantial merit of supervised machine learning for real-world outcome prediction in social and affective neuroscience. © The Author (2017). Published by Oxford University Press.
Prediction of Safety Stock Using Fuzzy Time Series (FTS) and Technology of Radio Frequency Identification (RFID) for Stock Control at Vendor Managed Inventory (VMI)

NASA Astrophysics Data System (ADS)

Mashuri, Chamdan; Suryono; Suseno, Jatmiko Endro

2018-02-01

This research was conducted by prediction of safety stock using Fuzzy Time Series (FTS) and technology of Radio Frequency Identification (RFID) for stock control at Vendor Managed Inventory (VMI). Well-controlled stock influenced company revenue and minimized cost. It discussed about information system of safety stock prediction developed through programming language of PHP. Input data consisted of demand got from automatic, online and real time acquisition using technology of RFID, then, sent to server and stored at online database. Furthermore, data of acquisition result was predicted by using algorithm of FTS applying universe of discourse defining and fuzzy sets determination. Fuzzy set result was continued to division process of universe of discourse in order to be to final step. Prediction result was displayed at information system dashboard developed. By using 60 data from demand data, prediction score was 450.331 and safety stock was 135.535. Prediction result was done by error deviation validation using Mean Square Percent Error of 15%. It proved that FTS was good enough in predicting demand and safety stock for stock control. For deeper analysis, researchers used data of demand and universe of discourse U varying at FTS to get various result based on test data used.
Predicting root zone soil moisture with soil properties and satellite near-surface moisture data across the conterminous United States

NASA Astrophysics Data System (ADS)

Baldwin, D.; Manfreda, S.; Keller, K.; Smithwick, E. A. H.

2017-03-01

Satellite-based near-surface (0-2 cm) soil moisture estimates have global coverage, but do not capture variations of soil moisture in the root zone (up to 100 cm depth) and may be biased with respect to ground-based soil moisture measurements. Here, we present an ensemble Kalman filter (EnKF) hydrologic data assimilation system that predicts bias in satellite soil moisture data to support the physically based Soil Moisture Analytical Relationship (SMAR) infiltration model, which estimates root zone soil moisture with satellite soil moisture data. The SMAR-EnKF model estimates a regional-scale bias parameter using available in situ data. The regional bias parameter is added to satellite soil moisture retrievals before their use in the SMAR model, and the bias parameter is updated continuously over time with the EnKF algorithm. In this study, the SMAR-EnKF assimilates in situ soil moisture at 43 Soil Climate Analysis Network (SCAN) monitoring locations across the conterminous U.S. Multivariate regression models are developed to estimate SMAR parameters using soil physical properties and the moderate resolution imaging spectroradiometer (MODIS) evapotranspiration data product as covariates. SMAR-EnKF root zone soil moisture predictions are in relatively close agreement with in situ observations when using optimal model parameters, with root mean square errors averaging 0.051 [cm3 cm-3] (standard error, s.e. = 0.005). The average root mean square error associated with a 20-fold cross-validation analysis with permuted SMAR parameter regression models increases moderately (0.082 [cm3 cm-3], s.e. = 0.004). The expected regional-scale satellite correction bias is negative in four out of six ecoregions studied (mean = -0.12 [-], s.e. = 0.002), excluding the Great Plains and Eastern Temperate Forests (0.053 [-], s.e. = 0.001). With its capability of estimating regional-scale satellite bias, the SMAR-EnKF system can predict root zone soil moisture over broad extents and has applications in drought predictions and other operational hydrologic modeling purposes.
Characterization of Used Nuclear Fuel with Multivariate Analysis for Process Monitoring

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dayman, Kenneth J.; Coble, Jamie B.; Orton, Christopher R.

2014-01-01

The Multi-Isotope Process (MIP) Monitor combines gamma spectroscopy and multivariate analysis to detect anomalies in various process streams in a nuclear fuel reprocessing system. Measured spectra are compared to models of nominal behavior at each measurement location to detect unexpected changes in system behavior. In order to improve the accuracy and specificity of process monitoring, fuel characterization may be used to more accurately train subsequent models in a full analysis scheme. This paper presents initial development of a reactor-type classifier that is used to select a reactor-specific partial least squares model to predict fuel burnup. Nuclide activities for prototypic usedmore » fuel samples were generated in ORIGEN-ARP and used to investigate techniques to characterize used nuclear fuel in terms of reactor type (pressurized or boiling water reactor) and burnup. A variety of reactor type classification algorithms, including k-nearest neighbors, linear and quadratic discriminant analyses, and support vector machines, were evaluated to differentiate used fuel from pressurized and boiling water reactors. Then, reactor type-specific partial least squares models were developed to predict the burnup of the fuel. Using these reactor type-specific models instead of a model trained for all light water reactors improved the accuracy of burnup predictions. The developed classification and prediction models were combined and applied to a large dataset that included eight fuel assembly designs, two of which were not used in training the models, and spanned the range of the initial 235U enrichment, cooling time, and burnup values expected of future commercial used fuel for reprocessing. Error rates were consistent across the range of considered enrichment, cooling time, and burnup values. Average absolute relative errors in burnup predictions for validation data both within and outside the training space were 0.0574% and 0.0597%, respectively. The errors seen in this work are artificially low, because the models were trained, optimized, and tested on simulated, noise-free data. However, these results indicate that the developed models may generalize well to new data and that the proposed approach constitutes a viable first step in developing a fuel characterization algorithm based on gamma spectra.« less
Analysis Resilient Algorithm on Artificial Neural Network Backpropagation

NASA Astrophysics Data System (ADS)

Saputra, Widodo; Tulus; Zarlis, Muhammad; Widia Sembiring, Rahmat; Hartama, Dedy

2017-12-01

Prediction required by decision makers to anticipate future planning. Artificial Neural Network (ANN) Backpropagation is one of method. This method however still has weakness, for long training time. This is a reason to improve a method to accelerate the training. One of Artificial Neural Network (ANN) Backpropagation method is a resilient method. Resilient method of changing weights and bias network with direct adaptation process of weighting based on local gradient information from every learning iteration. Predicting data result of Istanbul Stock Exchange training getting better. Mean Square Error (MSE) value is getting smaller and increasing accuracy.
Spatiotemporal prediction of fine particulate matter using high resolution satellite images in the southeastern U.S 2003–2011

PubMed Central

Lee, Mihye; Kloog, Itai; Chudnovsky, Alexandra; Lyapustin, Alexei; Wang, Yujie; Melly, Steven; Coull, Brent; Koutrakis, Petros; Schwartz, Joel

2016-01-01

Numerous studies have demonstrated that fine particulate matter (PM2.5, particles smaller than 2.5 μm in aerodynamic diameter) is associated with adverse health outcomes. The use of ground monitoring stations of PM2.5 to assess personal exposure; however, induces measurement error. Land use regression provides spatially resolved predictions but land use terms do not vary temporally. Meanwhile, the advent of satellite-retrieved aerosol optical depth (AOD) products have made possible to predict the spatial and temporal patterns of PM2.5 exposures. In this paper, we used AOD data with other PM2.5 variables such as meteorological variables, land use regression, and spatial smoothing to predict daily concentrations of PM2.5 at a 1 km2 resolution of the southeastern United States including the seven states of Georgia, North Carolina, South Carolina, Alabama, Tennessee, Mississippi, and Florida for the years from 2003 through 2011. We divided the study area into 3 regions and applied separate mixed-effect models to calibrate AOD using ground PM2.5 measurements and other spatiotemporal predictors. Using 10-fold cross-validation, we obtained out of sample R2 values of 0.77, 0.81, and 0.70 with the square root of the mean squared prediction errors (RMSPE) of 2.89, 2.51, and 2.82 μg/m3 for regions 1, 2, and 3, respectively. The slopes of the relationships between predicted PM2.5 and held out measurements were approximately 1 indicating no bias between the observed and modeled PM2.5 concentrations. Predictions can be used in epidemiological studies investigating the effects of both acute and chronic exposures to PM2.5. Our model results will also extend the existing studies on PM2.5 which have mostly focused on urban areas due to the paucity of monitors in rural areas. PMID:26082149
Spatiotemporal Prediction of Fine Particulate Matter Using High-Resolution Satellite Images in the Southeastern US 2003-2011

NASA Technical Reports Server (NTRS)

Lee, Mihye; Kloog, Itai; Chudnovsky, Alexandra; Lyapustin, Alexei; Wang, Yujie; Melly, Steven; Coull, Brent; Koutrakis, Petros; Schwartz, Joel

2016-01-01

Numerous studies have demonstrated that fine particulate matter (PM(sub 2.5), particles smaller than 2.5 micrometers in aerodynamic diameter) is associated with adverse health outcomes. The use of ground monitoring stations of PM(sub 2.5) to assess personal exposure, however, induces measurement error. Land-use regression provides spatially resolved predictions but land-use terms do not vary temporally. Meanwhile, the advent of satellite-retrieved aerosol optical depth (AOD) products have made possible to predict the spatial and temporal patterns of PM(sub 2.5) exposures. In this paper, we used AOD data with other PM(sub 2.5) variables, such as meteorological variables, land-use regression, and spatial smoothing to predict daily concentrations of PM(sub 2.5) at a 1 sq km resolution of the Southeastern United States including the seven states of Georgia, North Carolina, South Carolina, Alabama, Tennessee, Mississippi, and Florida for the years from 2003 to 2011. We divided the study area into three regions and applied separate mixed-effect models to calibrate AOD using ground PM(sub 2.5) measurements and other spatiotemporal predictors. Using 10-fold cross-validation, we obtained out of sample R2 values of 0.77, 0.81, and 0.70 with the square root of the mean squared prediction errors of 2.89, 2.51, and 2.82 cu micrograms for regions 1, 2, and 3, respectively. The slopes of the relationships between predicted PM2.5 and held out measurements were approximately 1 indicating no bias between the observed and modeled PM(sub 2.5) concentrations. Predictions can be used in epidemiological studies investigating the effects of both acute and chronic exposures to PM(sub 2.5). Our model results will also extend the existing studies on PM(sub 2.5) which have mostly focused on urban areas because of the paucity of monitors in rural areas.
An optimized Nash nonlinear grey Bernoulli model based on particle swarm optimization and its application in prediction for the incidence of Hepatitis B in Xinjiang, China.

PubMed

Zhang, Liping; Zheng, Yanling; Wang, Kai; Zhang, Xueliang; Zheng, Yujian

2014-06-01

In this paper, by using a particle swarm optimization algorithm to solve the optimal parameter estimation problem, an improved Nash nonlinear grey Bernoulli model termed PSO-NNGBM(1,1) is proposed. To test the forecasting performance, the optimized model is applied for forecasting the incidence of hepatitis B in Xinjiang, China. Four models, traditional GM(1,1), grey Verhulst model (GVM), original nonlinear grey Bernoulli model (NGBM(1,1)) and Holt-Winters exponential smoothing method, are also established for comparison with the proposed model under the criteria of mean absolute percentage error and root mean square percent error. The prediction results show that the optimized NNGBM(1,1) model is more accurate and performs better than the traditional GM(1,1), GVM, NGBM(1,1) and Holt-Winters exponential smoothing method. Copyright © 2014. Published by Elsevier Ltd.
Improving operational flood ensemble prediction by the assimilation of satellite soil moisture: comparison between lumped and semi-distributed schemes

NASA Astrophysics Data System (ADS)

Alvarez-Garreton, C.; Ryu, D.; Western, A. W.; Su, C.-H.; Crow, W. T.; Robertson, D. E.; Leahy, C.

2014-09-01

Assimilation of remotely sensed soil moisture data (SM-DA) to correct soil water stores of rainfall-runoff models has shown skill in improving streamflow prediction. In the case of large and sparsely monitored catchments, SM-DA is a particularly attractive tool. Within this context, we assimilate active and passive satellite soil moisture (SSM) retrievals using an ensemble Kalman filter to improve operational flood prediction within a large semi-arid catchment in Australia (>40 000 km2). We assess the importance of accounting for channel routing and the spatial distribution of forcing data by applying SM-DA to a lumped and a semi-distributed scheme of the probability distributed model (PDM). Our scheme also accounts for model error representation and seasonal biases and errors in the satellite data. Before assimilation, the semi-distributed model provided more accurate streamflow prediction (Nash-Sutcliffe efficiency, NS = 0.77) than the lumped model (NS = 0.67) at the catchment outlet. However, this did not ensure good performance at the "ungauged" inner catchments. After SM-DA, the streamflow ensemble prediction at the outlet was improved in both the lumped and the semi-distributed schemes: the root mean square error of the ensemble was reduced by 27 and 31%, respectively; the NS of the ensemble mean increased by 7 and 38%, respectively; the false alarm ratio was reduced by 15 and 25%, respectively; and the ensemble prediction spread was reduced while its reliability was maintained. Our findings imply that even when rainfall is the main driver of flooding in semi-arid catchments, adequately processed SSM can be used to reduce errors in the model soil moisture, which in turn provides better streamflow ensemble prediction. We demonstrate that SM-DA efficacy is enhanced when the spatial distribution in forcing data and routing processes are accounted for. At ungauged locations, SM-DA is effective at improving streamflow ensemble prediction, however, the updated prediction is still poor since SM-DA does not address systematic errors in the model.
Reconsidering the use of rankings in the valuation of health states: a model for estimating cardinal values from ordinal data

PubMed Central

Salomon, Joshua A

2003-01-01

Background In survey studies on health-state valuations, ordinal ranking exercises often are used as precursors to other elicitation methods such as the time trade-off (TTO) or standard gamble, but the ranking data have not been used in deriving cardinal valuations. This study reconsiders the role of ordinal ranks in valuing health and introduces a new approach to estimate interval-scaled valuations based on aggregate ranking data. Methods Analyses were undertaken on data from a previously published general population survey study in the United Kingdom that included rankings and TTO values for hypothetical states described using the EQ-5D classification system. The EQ-5D includes five domains (mobility, self-care, usual activities, pain/discomfort and anxiety/depression) with three possible levels on each. Rank data were analysed using a random utility model, operationalized through conditional logit regression. In the statistical model, probabilities of observed rankings were related to the latent utilities of different health states, modeled as a linear function of EQ-5D domain scores, as in previously reported EQ-5D valuation functions. Predicted valuations based on the conditional logit model were compared to observed TTO values for the 42 states in the study and to predictions based on a model estimated directly from the TTO values. Models were evaluated using the intraclass correlation coefficient (ICC) between predictions and mean observations, and the root mean squared error of predictions at the individual level. Results Agreement between predicted valuations from the rank model and observed TTO values was very high, with an ICC of 0.97, only marginally lower than for predictions based on the model estimated directly from TTO values (ICC = 0.99). Individual-level errors were also comparable in the two models, with root mean squared errors of 0.503 and 0.496 for the rank-based and TTO-based predictions, respectively. Conclusions Modeling health-state valuations based on ordinal ranks can provide results that are similar to those obtained from more widely analyzed valuation techniques such as the TTO. The information content in aggregate ranking data is not currently exploited to full advantage. The possibility of estimating cardinal valuations from ordinal ranks could also simplify future data collection dramatically and facilitate wider empirical study of health-state valuations in diverse settings and population groups. PMID:14687419
Prediction of gas chromatographic retention indices by the use of radial basis function neural networks.

PubMed

Yao, Xiaojun; Zhang, Xiaoyun; Zhang, Ruisheng; Liu, Mancang; Hu, Zhide; Fan, Botao

2002-05-16

A new method for the prediction of retention indices for a diverse set of compounds from their physicochemical parameters has been proposed. The two used input parameters for representing molecular properties are boiling point and molar volume. Models relating relationships between physicochemical parameters and retention indices of compounds are constructed by means of radial basis function neural networks. To get the best prediction results, some strategies are also employed to optimize the topology and learning parameters of the RBFNNs. For the test set, a predictive correlation coefficient R=0.9910 and root mean squared error of 14.1 are obtained. Results show that radial basis function networks can give satisfactory prediction ability and its optimization is less-time consuming and easy to implement.

An improved partial least-squares regression method for Raman spectroscopy

NASA Astrophysics Data System (ADS)

Momenpour Tehran Monfared, Ali; Anis, Hanan

2017-10-01

It is known that the performance of partial least-squares (PLS) regression analysis can be improved using the backward variable selection method (BVSPLS). In this paper, we further improve the BVSPLS based on a novel selection mechanism. The proposed method is based on sorting the weighted regression coefficients, and then the importance of each variable of the sorted list is evaluated using root mean square errors of prediction (RMSEP) criterion in each iteration step. Our Improved BVSPLS (IBVSPLS) method has been applied to leukemia and heparin data sets and led to an improvement in limit of detection of Raman biosensing ranged from 10% to 43% compared to PLS. Our IBVSPLS was also compared to the jack-knifing (simpler) and Genetic Algorithm (more complex) methods. Our method was consistently better than the jack-knifing method and showed either a similar or a better performance compared to the genetic algorithm.
Correcting Four Similar Correlational Measures for Attenuation Due to Errors of Measurement in the Dependent Variable: Eta, Epsilon, Omega, and Intraclass r.

ERIC Educational Resources Information Center

Stanley, Julian C.; Livingston, Samuel A.

Besides the ubiquitous Pearson product-moment r, there are a number of other measures of relationship that are attenuated by errors of measurement and for which the relationship between true measures can be estimated. Among these are the correlation ratio (eta squared), Kelley's unbiased correlation ratio (epsilon squared), Hays' omega squared,…
Leuconostoc mesenteroides growth in food products: prediction and sensitivity analysis by adaptive-network-based fuzzy inference systems.

PubMed

Wang, Hue-Yu; Wen, Ching-Feng; Chiu, Yu-Hsien; Lee, I-Nong; Kao, Hao-Yun; Lee, I-Chen; Ho, Wen-Hsien

2013-01-01

An adaptive-network-based fuzzy inference system (ANFIS) was compared with an artificial neural network (ANN) in terms of accuracy in predicting the combined effects of temperature (10.5 to 24.5°C), pH level (5.5 to 7.5), sodium chloride level (0.25% to 6.25%) and sodium nitrite level (0 to 200 ppm) on the growth rate of Leuconostoc mesenteroides under aerobic and anaerobic conditions. THE ANFIS AND ANN MODELS WERE COMPARED IN TERMS OF SIX STATISTICAL INDICES CALCULATED BY COMPARING THEIR PREDICTION RESULTS WITH ACTUAL DATA: mean absolute percentage error (MAPE), root mean square error (RMSE), standard error of prediction percentage (SEP), bias factor (Bf), accuracy factor (Af), and absolute fraction of variance (R (2)). Graphical plots were also used for model comparison. The learning-based systems obtained encouraging prediction results. Sensitivity analyses of the four environmental factors showed that temperature and, to a lesser extent, NaCl had the most influence on accuracy in predicting the growth rate of Leuconostoc mesenteroides under aerobic and anaerobic conditions. The observed effectiveness of ANFIS for modeling microbial kinetic parameters confirms its potential use as a supplemental tool in predictive mycology. Comparisons between growth rates predicted by ANFIS and actual experimental data also confirmed the high accuracy of the Gaussian membership function in ANFIS. Comparisons of the six statistical indices under both aerobic and anaerobic conditions also showed that the ANFIS model was better than all ANN models in predicting the four kinetic parameters. Therefore, the ANFIS model is a valuable tool for quickly predicting the growth rate of Leuconostoc mesenteroides under aerobic and anaerobic conditions.
Leuconostoc Mesenteroides Growth in Food Products: Prediction and Sensitivity Analysis by Adaptive-Network-Based Fuzzy Inference Systems

PubMed Central

Wang, Hue-Yu; Wen, Ching-Feng; Chiu, Yu-Hsien; Lee, I-Nong; Kao, Hao-Yun; Lee, I-Chen; Ho, Wen-Hsien

2013-01-01

Background An adaptive-network-based fuzzy inference system (ANFIS) was compared with an artificial neural network (ANN) in terms of accuracy in predicting the combined effects of temperature (10.5 to 24.5°C), pH level (5.5 to 7.5), sodium chloride level (0.25% to 6.25%) and sodium nitrite level (0 to 200 ppm) on the growth rate of Leuconostoc mesenteroides under aerobic and anaerobic conditions. Methods The ANFIS and ANN models were compared in terms of six statistical indices calculated by comparing their prediction results with actual data: mean absolute percentage error (MAPE), root mean square error (RMSE), standard error of prediction percentage (SEP), bias factor (Bf), accuracy factor (Af), and absolute fraction of variance (R 2). Graphical plots were also used for model comparison. Conclusions The learning-based systems obtained encouraging prediction results. Sensitivity analyses of the four environmental factors showed that temperature and, to a lesser extent, NaCl had the most influence on accuracy in predicting the growth rate of Leuconostoc mesenteroides under aerobic and anaerobic conditions. The observed effectiveness of ANFIS for modeling microbial kinetic parameters confirms its potential use as a supplemental tool in predictive mycology. Comparisons between growth rates predicted by ANFIS and actual experimental data also confirmed the high accuracy of the Gaussian membership function in ANFIS. Comparisons of the six statistical indices under both aerobic and anaerobic conditions also showed that the ANFIS model was better than all ANN models in predicting the four kinetic parameters. Therefore, the ANFIS model is a valuable tool for quickly predicting the growth rate of Leuconostoc mesenteroides under aerobic and anaerobic conditions. PMID:23705023
Application of RBFN network and GM (1, 1) for groundwater level simulation

NASA Astrophysics Data System (ADS)

Li, Zijun; Yang, Qingchun; Wang, Luchen; Martín, Jordi Delgado

2017-10-01

Groundwater is a prominent resource of drinking and domestic water in the world. In this context, a feasible water resources management plan necessitates acceptable predictions of groundwater table depth fluctuations, which can help ensure the sustainable use of a watershed's aquifers for urban and rural water supply. Due to the difficulties of identifying non-linear model structure and estimating the associated parameters, in this study radial basis function neural network (RBFNN) and GM (1, 1) models are used for the prediction of monthly groundwater level fluctuations in the city of Longyan, Fujian Province (South China). The monthly groundwater level data monitored from January 2003 to December 2011 are used in both models. The error criteria are estimated using the coefficient of determination ( R 2), mean absolute error (E) and root mean squared error (RMSE). The results show that both the models can forecast the groundwater level with fairly high accuracy, but the RBFN network model can be a promising tool to simulate and forecast groundwater level since it has a relatively smaller RMSE and MAE.
Quantitative Determination of Fluorine Content in Blends of Polylactide (PLA)–Talc Using Near Infrared Spectroscopy

PubMed Central

Tamburini, Elena; Tagliati, Chiara; Bonato, Tiziano; Costa, Stefania; Scapoli, Chiara; Pedrini, Paola

2016-01-01

Near-infrared spectroscopy (NIRS) has been widely used for quantitative and/or qualitative determination of a wide range of matrices. The objective of this study was to develop a NIRS method for the quantitative determination of fluorine content in polylactide (PLA)-talc blends. A blending profile was obtained by mixing different amounts of PLA granules and talc powder. The calibration model was built correlating wet chemical data (alkali digestion method) and NIR spectra. Using FT (Fourier Transform)-NIR technique, a Partial Least Squares (PLS) regression model was set-up, in a concentration interval of 0 ppm of pure PLA to 800 ppm of pure talc. Fluorine content prediction (R2cal = 0.9498; standard error of calibration, SEC = 34.77; standard error of cross-validation, SECV = 46.94) was then externally validated by means of a further 15 independent samples (R2EX.V = 0.8955; root mean standard error of prediction, RMSEP = 61.08). A positive relationship between an inorganic component as fluorine and NIR signal has been evidenced, and used to obtain quantitative analytical information from the spectra. PMID:27490548
Effectively parameterizing dissipative particle dynamics using COSMO-SAC: A partition coefficient study

NASA Astrophysics Data System (ADS)

Saathoff, Jonathan

2018-04-01

Dissipative Particle Dynamics (DPD) provides a tool for studying phase behavior and interfacial phenomena for complex mixtures and macromolecules. Methods to quickly and automatically parameterize DPD greatly increase its effectiveness. One such method is to map predicted activity coefficients derived from COSMO-SAC onto DPD parameter sets. However, there are serious limitations to the accuracy of this mapping, including the inability of single DPD beads to reproduce asymmetric infinite dilution activity coefficients, the loss of precision when reusing parameters for different molecular fragments, and the error due to bonding beads together. This report describes these effects in quantitative detail and provides methods to mitigate much of their deleterious effects. This includes a novel approach to remove errors caused by bonding DPD beads together. Using these methods, logarithm hexane/water partition coefficients were calculated for 61 molecules. The root mean-squared error for these calculations was determined to be 0.14—a very low value—with respect to the final mapping procedure. Cognizance of the above limitations can greatly enhance the predictive power of DPD.
Development of variable pathlength UV-vis spectroscopy combined with partial-least-squares regression for wastewater chemical oxygen demand (COD) monitoring.

PubMed

Chen, Baisheng; Wu, Huanan; Li, Sam Fong Yau

2014-03-01

To overcome the challenging task to select an appropriate pathlength for wastewater chemical oxygen demand (COD) monitoring with high accuracy by UV-vis spectroscopy in wastewater treatment process, a variable pathlength approach combined with partial-least squares regression (PLSR) was developed in this study. Two new strategies were proposed to extract relevant information of UV-vis spectral data from variable pathlength measurements. The first strategy was by data fusion with two data fusion levels: low-level data fusion (LLDF) and mid-level data fusion (MLDF). Predictive accuracy was found to improve, indicated by the lower root-mean-square errors of prediction (RMSEP) compared with those obtained for single pathlength measurements. Both fusion levels were found to deliver very robust PLSR models with residual predictive deviations (RPD) greater than 3 (i.e. 3.22 and 3.29, respectively). The second strategy involved calculating the slopes of absorbance against pathlength at each wavelength to generate slope-derived spectra. Without the requirement to select the optimal pathlength, the predictive accuracy (RMSEP) was improved by 20-43% as compared to single pathlength spectroscopy. Comparing to nine-factor models from fusion strategy, the PLSR model from slope-derived spectroscopy was found to be more parsimonious with only five factors and more robust with residual predictive deviation (RPD) of 3.72. It also offered excellent correlation of predicted and measured COD values with R(2) of 0.936. In sum, variable pathlength spectroscopy with the two proposed data analysis strategies proved to be successful in enhancing prediction performance of COD in wastewater and showed high potential to be applied in on-line water quality monitoring. Copyright © 2013 Elsevier B.V. All rights reserved.
On how to avoid input and structural uncertainties corrupt the inference of hydrological parameters using a Bayesian framework

NASA Astrophysics Data System (ADS)

Hernández, Mario R.; Francés, Félix

2015-04-01

One phase of the hydrological models implementation process, significantly contributing to the hydrological predictions uncertainty, is the calibration phase in which values of the unknown model parameters are tuned by optimizing an objective function. An unsuitable error model (e.g. Standard Least Squares or SLS) introduces noise into the estimation of the parameters. The main sources of this noise are the input errors and the hydrological model structural deficiencies. Thus, the biased calibrated parameters cause the divergence model phenomenon, where the errors variance of the (spatially and temporally) forecasted flows far exceeds the errors variance in the fitting period, and provoke the loss of part or all of the physical meaning of the modeled processes. In other words, yielding a calibrated hydrological model which works well, but not for the right reasons. Besides, an unsuitable error model yields a non-reliable predictive uncertainty assessment. Hence, with the aim of prevent all these undesirable effects, this research focuses on the Bayesian joint inference (BJI) of both the hydrological and error model parameters, considering a general additive (GA) error model that allows for correlation, non-stationarity (in variance and bias) and non-normality of model residuals. As hydrological model, it has been used a conceptual distributed model called TETIS, with a particular split structure of the effective model parameters. Bayesian inference has been performed with the aid of a Markov Chain Monte Carlo (MCMC) algorithm called Dream-ZS. MCMC algorithm quantifies the uncertainty of the hydrological and error model parameters by getting the joint posterior probability distribution, conditioned on the observed flows. The BJI methodology is a very powerful and reliable tool, but it must be used correctly this is, if non-stationarity in errors variance and bias is modeled, the Total Laws must be taken into account. The results of this research show that the application of BJI with a GA error model outperforms the hydrological parameters robustness (diminishing the divergence model phenomenon) and improves the reliability of the streamflow predictive distribution, in respect of the results of a bad error model as SLS. Finally, the most likely prediction in a validation period, for both BJI+GA and SLS error models shows a similar performance.
Optical PAyload for Lasercomm Science (OPALS) link validation

NASA Technical Reports Server (NTRS)

Biswas, Abhijit; Oaida, Bogdan V.; Andrews, Kenneth S.; Kovalik, Joseph M.; Abrahamson, Matthew J.; Wright, Malcolm W.

2015-01-01

Recently several day and nighttime links under diverse atmospheric conditions were completed using the Optical Payload for Lasercomm Science (OPALS) flight system on-board the International Space Station (ISS). In this paper we compare measured optical power and its variance at either end of the link with predictions that include atmospheric propagation models. For the 976 nm laser beacon mean power transmitted from the ground to the ISS the predicted mean irradiance of 10's of microwatts per square meter close to zenith and its decrease with range and increased air mass shows good agreement with predictions. The irradiance fluctuations sampled at 100 Hz also follow the expected increase in scintillation with air mass representative of atmospheric coherence lengths at zenith at 500 nm in the 3-8 cm range. The downlink predicted power of 100's of nanowatts was also reconciled within the uncertainty of the atmospheric losses. Expected link performance with uncoded bit-error rates less than 1E-4 required for the Reed-Solomon code to correct errors for video, text and file transmission was verified. The results of predicted and measured powers and fluctuations suggest the need for further study and refinement.
QSAR modeling for predicting mutagenic toxicity of diverse chemicals for regulatory purposes.

PubMed

Basant, Nikita; Gupta, Shikha

2017-06-01

The safety assessment process of chemicals requires information on their mutagenic potential. The experimental determination of mutagenicity of a large number of chemicals is tedious and time and cost intensive, thus compelling for alternative methods. We have established local and global QSAR models for discriminating low and high mutagenic compounds and predicting their mutagenic activity in a quantitative manner in Salmonella typhimurium (TA) bacterial strains (TA98 and TA100). The decision treeboost (DTB)-based classification QSAR models discriminated among two categories with accuracies of >96% and the regression QSAR models precisely predicted the mutagenic activity of diverse chemicals yielding high correlations (R 2 ) between the experimental and model-predicted values in the respective training (>0.96) and test (>0.94) sets. The test set root mean squared error (RMSE) and mean absolute error (MAE) values emphasized the usefulness of the developed models for predicting new compounds. Relevant structural features of diverse chemicals that were responsible and influence the mutagenic activity were identified. The applicability domains of the developed models were defined. The developed models can be used as tools for screening new chemicals for their mutagenicity assessment for regulatory purpose.
Predicting online ratings based on the opinion spreading process

NASA Astrophysics Data System (ADS)

He, Xing-Sheng; Zhou, Ming-Yang; Zhuo, Zhao; Fu, Zhong-Qian; Liu, Jian-Guo

2015-10-01

Predicting users' online ratings is always a challenge issue and has drawn lots of attention. In this paper, we present a rating prediction method by combining the user opinion spreading process with the collaborative filtering algorithm, where user similarity is defined by measuring the amount of opinion a user transfers to another based on the primitive user-item rating matrix. The proposed method could produce a more precise rating prediction for each unrated user-item pair. In addition, we introduce a tunable parameter λ to regulate the preferential diffusion relevant to the degree of both opinion sender and receiver. The numerical results for Movielens and Netflix data sets show that this algorithm has a better accuracy than the standard user-based collaborative filtering algorithm using Cosine and Pearson correlation without increasing computational complexity. By tuning λ, our method could further boost the prediction accuracy when using Mean Absolute Error (MAE) and Root Mean Squared Error (RMSE) as measurements. In the optimal cases, on Movielens and Netflix data sets, the corresponding algorithmic accuracy (MAE and RMSE) are improved 11.26% and 8.84%, 13.49% and 10.52% compared to the item average method, respectively.
Network traffic anomaly prediction using Artificial Neural Network

NASA Astrophysics Data System (ADS)

Ciptaningtyas, Hening Titi; Fatichah, Chastine; Sabila, Altea

2017-03-01

As the excessive increase of internet usage, the malicious software (malware) has also increase significantly. Malware is software developed by hacker for illegal purpose(s), such as stealing data and identity, causing computer damage, or denying service to other user[1]. Malware which attack computer or server often triggers network traffic anomaly phenomena. Based on Sophos's report[2], Indonesia is the riskiest country of malware attack and it also has high network traffic anomaly. This research uses Artificial Neural Network (ANN) to predict network traffic anomaly based on malware attack in Indonesia which is recorded by Id-SIRTII/CC (Indonesia Security Incident Response Team on Internet Infrastructure/Coordination Center). The case study is the highest malware attack (SQL injection) which has happened in three consecutive years: 2012, 2013, and 2014[4]. The data series is preprocessed first, then the network traffic anomaly is predicted using Artificial Neural Network and using two weight update algorithms: Gradient Descent and Momentum. Error of prediction is calculated using Mean Squared Error (MSE) [7]. The experimental result shows that MSE for SQL Injection is 0.03856. So, this approach can be used to predict network traffic anomaly.
Study on the Rationality and Validity of Probit Models of Domino Effect to Chemical Process Equipment caused by Overpressure

NASA Astrophysics Data System (ADS)

Sun, Dongliang; Huang, Guangtuan; Jiang, Juncheng; Zhang, Mingguang; Wang, Zhirong

2013-04-01

Overpressure is one important cause of domino effect in accidents of chemical process equipments. Some models considering propagation probability and threshold values of the domino effect caused by overpressure have been proposed in previous study. In order to prove the rationality and validity of the models reported in the reference, two boundary values of three damage degrees reported were considered as random variables respectively in the interval [0, 100%]. Based on the overpressure data for damage to the equipment and the damage state, and the calculation method reported in the references, the mean square errors of the four categories of damage probability models of overpressure were calculated with random boundary values, and then a relationship of mean square error vs. the two boundary value was obtained, the minimum of mean square error was obtained, compared with the result of the present work, mean square error decreases by about 3%. Therefore, the error was in the acceptable range of engineering applications, the models reported can be considered reasonable and valid.
Multivariate reference technique for quantitative analysis of fiber-optic tissue Raman spectroscopy.

PubMed

Bergholt, Mads Sylvest; Duraipandian, Shiyamala; Zheng, Wei; Huang, Zhiwei

2013-12-03

We report a novel method making use of multivariate reference signals of fused silica and sapphire Raman signals generated from a ball-lens fiber-optic Raman probe for quantitative analysis of in vivo tissue Raman measurements in real time. Partial least-squares (PLS) regression modeling is applied to extract the characteristic internal reference Raman signals (e.g., shoulder of the prominent fused silica boson peak (~130 cm(-1)); distinct sapphire ball-lens peaks (380, 417, 646, and 751 cm(-1))) from the ball-lens fiber-optic Raman probe for quantitative analysis of fiber-optic Raman spectroscopy. To evaluate the analytical value of this novel multivariate reference technique, a rapid Raman spectroscopy system coupled with a ball-lens fiber-optic Raman probe is used for in vivo oral tissue Raman measurements (n = 25 subjects) under 785 nm laser excitation powers ranging from 5 to 65 mW. An accurate linear relationship (R(2) = 0.981) with a root-mean-square error of cross validation (RMSECV) of 2.5 mW can be obtained for predicting the laser excitation power changes based on a leave-one-subject-out cross-validation, which is superior to the normal univariate reference method (RMSE = 6.2 mW). A root-mean-square error of prediction (RMSEP) of 2.4 mW (R(2) = 0.985) can also be achieved for laser power prediction in real time when we applied the multivariate method independently on the five new subjects (n = 166 spectra). We further apply the multivariate reference technique for quantitative analysis of gelatin tissue phantoms that gives rise to an RMSEP of ~2.0% (R(2) = 0.998) independent of laser excitation power variations. This work demonstrates that multivariate reference technique can be advantageously used to monitor and correct the variations of laser excitation power and fiber coupling efficiency in situ for standardizing the tissue Raman intensity to realize quantitative analysis of tissue Raman measurements in vivo, which is particularly appealing in challenging Raman endoscopic applications.
Modeling and Prediction of Solvent Effect on Human Skin Permeability using Support Vector Regression and Random Forest.

PubMed

Baba, Hiromi; Takahara, Jun-ichi; Yamashita, Fumiyoshi; Hashida, Mitsuru

2015-11-01

The solvent effect on skin permeability is important for assessing the effectiveness and toxicological risk of new dermatological formulations in pharmaceuticals and cosmetics development. The solvent effect occurs by diverse mechanisms, which could be elucidated by efficient and reliable prediction models. However, such prediction models have been hampered by the small variety of permeants and mixture components archived in databases and by low predictive performance. Here, we propose a solution to both problems. We first compiled a novel large database of 412 samples from 261 structurally diverse permeants and 31 solvents reported in the literature. The data were carefully screened to ensure their collection under consistent experimental conditions. To construct a high-performance predictive model, we then applied support vector regression (SVR) and random forest (RF) with greedy stepwise descriptor selection to our database. The models were internally and externally validated. The SVR achieved higher performance statistics than RF. The (externally validated) determination coefficient, root mean square error, and mean absolute error of SVR were 0.899, 0.351, and 0.268, respectively. Moreover, because all descriptors are fully computational, our method can predict as-yet unsynthesized compounds. Our high-performance prediction model offers an attractive alternative to permeability experiments for pharmaceutical and cosmetic candidate screening and optimizing skin-permeable topical formulations.
Prediction of texture and colour of dry-cured ham by visible and near infrared spectroscopy using a fiber optic probe.

PubMed

García-Rey, R M; García-Olmo, J; De Pedro, E; Quiles-Zafra, R; Luque de Castro, M D

2005-06-01

The potential of visible and near infrared spectroscopy to predict texture and colour of dry-cured ham samples was investigated. Sensory evaluation was performed on 117 boned and cross-sectioned dry-cured ham samples. Slices of approximate thickness 4cm were cut, vacuum-packaged and kept under frozen storage until spectral analysis. Then, Biceps femoris muscle from the thawed slices was taken and scanned (400-2200nm) using a fiber optic probe. The exploratory analysis using principal component analysis shows that there are two ham groups according to the appearance or not of defects. Then, a K nearest neighbours was used to classify dry-cured hams into defective or no defective classes. The overall accuracy of the classification as a function of pastiness was 88.5%; meanwhile, according to colour was 79.7%. Partial least squares regression was used to formulate prediction equations for pastiness and colour. The correlation coefficients of calibration and cross-validation were 0.97 and 0.86 for optimal equation predicting pastiness, and 0.82 and 0.69 for optimal equation predicting colour. The standard error of cross-validation for predicting pastiness and colour is between 1 and 2 times the standard deviation of the reference method (the error involved in the sensory evaluation by the experts). The magnitude of this error demonstrates the good precision of the methods for predicting pastiness and colour. Furthermore, the samples were classified into defective or no defective classes, with a correct classification of 94.2% according to pasty texture evaluation and 75.7% as regard to colour evaluation.
A data-driven SVR model for long-term runoff prediction and uncertainty analysis based on the Bayesian framework

NASA Astrophysics Data System (ADS)

Liang, Zhongmin; Li, Yujie; Hu, Yiming; Li, Binquan; Wang, Jun

2017-06-01

Accurate and reliable long-term forecasting plays an important role in water resources management and utilization. In this paper, a hybrid model called SVR-HUP is presented to predict long-term runoff and quantify the prediction uncertainty. The model is created based on three steps. First, appropriate predictors are selected according to the correlations between meteorological factors and runoff. Second, a support vector regression (SVR) model is structured and optimized based on the LibSVM toolbox and a genetic algorithm. Finally, using forecasted and observed runoff, a hydrologic uncertainty processor (HUP) based on a Bayesian framework is used to estimate the posterior probability distribution of the simulated values, and the associated uncertainty of prediction was quantitatively analyzed. Six precision evaluation indexes, including the correlation coefficient (CC), relative root mean square error (RRMSE), relative error (RE), mean absolute percentage error (MAPE), Nash-Sutcliffe efficiency (NSE), and qualification rate (QR), are used to measure the prediction accuracy. As a case study, the proposed approach is applied in the Han River basin, South Central China. Three types of SVR models are established to forecast the monthly, flood season and annual runoff volumes. The results indicate that SVR yields satisfactory accuracy and reliability at all three scales. In addition, the results suggest that the HUP cannot only quantify the uncertainty of prediction based on a confidence interval but also provide a more accurate single value prediction than the initial SVR forecasting result. Thus, the SVR-HUP model provides an alternative method for long-term runoff forecasting.
An Update on the Role of Systems Modeling in the Design and Verification of the James Webb Space Telescope

NASA Technical Reports Server (NTRS)

Muheim, Danniella; Menzel, Michael; Mosier, Gary; Irish, Sandra; Maghami, Peiman; Mehalick, Kimberly; Parrish, Keith

2010-01-01

The James Web Space Telescope (JWST) is a large, infrared-optimized space telescope scheduled for launch in 2014. System-level verification of critical performance requirements will rely on integrated observatory models that predict the wavefront error accurately enough to verify that allocated top-level wavefront error of 150 nm root-mean-squared (rms) through to the wave-front sensor focal plane is met. The assembled models themselves are complex and require the insight of technical experts to assess their ability to meet their objectives. This paper describes the systems engineering and modeling approach used on the JWST through the detailed design phase.
Combination of multiple model population analysis and mid-infrared technology for the estimation of copper content in Tegillarca granosa

NASA Astrophysics Data System (ADS)

Hu, Meng-Han; Chen, Xiao-Jing; Ye, Peng-Chao; Chen, Xi; Shi, Yi-Jian; Zhai, Guang-Tao; Yang, Xiao-Kang

2016-11-01

The aim of this study was to use mid-infrared spectroscopy coupled with multiple model population analysis based on Monte Carlo-uninformative variable elimination for rapidly estimating the copper content of Tegillarca granosa. Copper-specific wavelengths were first extracted from the whole spectra, and subsequently, a least square-support vector machine was used to develop the prediction models. Compared with the prediction model based on full wavelengths, models that used 100 multiple MC-UVE selected wavelengths without and with bin operation showed comparable performances with Rp (root mean square error of Prediction) of 0.97 (14.60 mg/kg) and 0.94 (20.85 mg/kg) versus 0.96 (17.27 mg/kg), as well as ratio of percent deviation (number of wavelength) of 2.77 (407) and 1.84 (45) versus 2.32 (1762). The obtained results demonstrated that the mid-infrared technique could be used for estimating copper content in T. granosa. In addition, the proposed multiple model population analysis can eliminate uninformative, weakly informative and interfering wavelengths effectively, that substantially reduced the model complexity and computation time.

[Research on partial least squares for determination of impurities in the presence of high concentration of matrix by ICP-AES].

PubMed

Wang, Yan-peng; Gong, Qi; Yu, Sheng-rong; Liu, You-yan

2012-04-01

A method for detecting trace impurities in high concentration matrix by ICP-AES based on partial least squares (PLS) was established. The research showed that PLS could effectively correct the interference caused by high level of matrix concentration error and could withstand higher concentrations of matrix than multicomponent spectral fitting (MSF). When the mass ratios of matrix to impurities were from 1 000 : 1 to 20 000 : 1, the recoveries of standard addition were between 95% and 105% by PLS. For the system in which interference effect has nonlinear correlation with the matrix concentrations, the prediction accuracy of normal PLS method was poor, but it can be improved greatly by using LIN-PPLS, which was based on matrix transformation of sample concentration. The contents of Co, Pb and Ga in stream sediment (GBW07312) were detected by MSF, PLS and LIN-PPLS respectively. The results showed that the prediction accuracy of LIN-PPLS was better than PLS, and the prediction accuracy of PLS was better than MSF.
Predicting preference-based SF-6D index scores from the SF-8 health survey.

PubMed

Wang, P; Fu, A Z; Wee, H L; Lee, J; Tai, E S; Thumboo, J; Luo, N

2013-09-01

To develop and test functions for predicting the preference-based SF-6D index scores from the SF-8 health survey. This study was a secondary analysis of data collected in a population health survey in which respondents (n = 7,529) completed both the SF-36 and the SF-8 questionnaires. We examined seven ordinary least-square estimators for their performance in predicting SF-6D scores from the SF-8 at both the individual and the group levels. In general, all functions performed similarly well in predicting SF-6D scores, and the predictions at the group level were better than predictions at the individual level. At the individual level, 42.5-51.5% of prediction errors were smaller than the minimally important difference (MID) of the SF-6D scores, depending on the function specifications, while almost all prediction errors of the tested functions were smaller than the MID of SF-6D at the group level. At both individual and group levels, the tested functions predicted lower than actual scores at the higher end of the SF-6D scale. Our study developed functions to generate preference-based SF-6D index scores from the SF-8 health survey, the first of its kind. Further research is needed to evaluate the performance and validity of the prediction functions.
Phytoremediation of palm oil mill secondary effluent (POMSE) by Chrysopogon zizanioides (L.) using artificial neural networks.

PubMed

Darajeh, Negisa; Idris, Azni; Fard Masoumi, Hamid Reza; Nourani, Abolfazl; Truong, Paul; Rezania, Shahabaldin

2017-05-04

Artificial neural networks (ANNs) have been widely used to solve the problems because of their reliable, robust, and salient characteristics in capturing the nonlinear relationships between variables in complex systems. In this study, ANN was applied for modeling of Chemical Oxygen Demand (COD) and biodegradable organic matter (BOD) removal from palm oil mill secondary effluent (POMSE) by vetiver system. The independent variable, including POMSE concentration, vetiver slips density, and removal time, has been considered as input parameters to optimize the network, while the removal percentage of COD and BOD were selected as output. To determine the number of hidden layer nodes, the root mean squared error of testing set was minimized, and the topologies of the algorithms were compared by coefficient of determination and absolute average deviation. The comparison indicated that the quick propagation (QP) algorithm had minimum root mean squared error and absolute average deviation, and maximum coefficient of determination. The importance values of the variables was included vetiver slips density with 42.41%, time with 29.8%, and the POMSE concentration with 27.79%, which showed none of them, is negligible. Results show that the ANN has great potential ability in prediction of COD and BOD removal from POMSE with residual standard error (RSE) of less than 0.45%.
The Detection and Quantification of Adulteration in Ground Roasted Asian Palm Civet Coffee Using Near-Infrared Spectroscopy in Tandem with Chemometrics

NASA Astrophysics Data System (ADS)

Suhandy, D.; Yulia, M.; Ogawa, Y.; Kondo, N.

2018-05-01

In the present research, an evaluation of using near infrared (NIR) spectroscopy in tandem with full spectrum partial least squares (FS-PLS) regression for quantification of degree of adulteration in civet coffee was conducted. A number of 126 ground roasted coffee samples with degree of adulteration 0-51% were prepared. Spectral data were acquired using a NIR spectrometer equipped with an integrating sphere for diffuse reflectance measurement in the range of 1300-2500 nm. The samples were divided into two groups calibration sample set (84 samples) and prediction sample set (42 samples). The calibration model was developed on original spectra using FS-PLS regression with full-cross validation method. The calibration model exhibited the determination coefficient R2=0.96 for calibration and R2=0.92 for validation. The prediction resulted in low root mean square error of prediction (RMSEP) (4.67%) and high ratio prediction to deviation (RPD) (3.75). In conclusion, the degree of adulteration in civet coffee have been quantified successfully by using NIR spectroscopy and FS-PLS regression in a non-destructive, economical, precise, and highly sensitive method, which uses very simple sample preparation.
Enhancing prediction power of chemometric models through manipulation of the fed spectrophotometric data: A comparative study

NASA Astrophysics Data System (ADS)

Saad, Ahmed S.; Hamdy, Abdallah M.; Salama, Fathy M.; Abdelkawy, Mohamed

2016-10-01

Effect of data manipulation in preprocessing step proceeding construction of chemometric models was assessed. The same set of UV spectral data was used for construction of PLS and PCR models directly and after mathematically manipulation as per well known first and second derivatives of the absorption spectra, ratio spectra and first and second derivatives of the ratio spectra spectrophotometric methods, meanwhile the optimal working wavelength ranges were carefully selected for each model and the models were constructed. Unexpectedly, number of latent variables used for models' construction varied among the different methods. The prediction power of the different models was compared using a validation set of 8 mixtures prepared as per the multilevel multifactor design and results were statistically compared using two-way ANOVA test. Root mean squares error of prediction (RMSEP) was used for further comparison of the predictability among different constructed models. Although no significant difference was found between results obtained using Partial Least Squares (PLS) and Principal Component Regression (PCR) models, however, discrepancies among results was found to be attributed to the variation in the discrimination power of adopted spectrophotometric methods on spectral data.
Developing a dengue forecast model using machine learning: A case study in China.

PubMed

Guo, Pi; Liu, Tao; Zhang, Qin; Wang, Li; Xiao, Jianpeng; Zhang, Qingying; Luo, Ganfeng; Li, Zhihao; He, Jianfeng; Zhang, Yonghui; Ma, Wenjun

2017-10-01

In China, dengue remains an important public health issue with expanded areas and increased incidence recently. Accurate and timely forecasts of dengue incidence in China are still lacking. We aimed to use the state-of-the-art machine learning algorithms to develop an accurate predictive model of dengue. Weekly dengue cases, Baidu search queries and climate factors (mean temperature, relative humidity and rainfall) during 2011-2014 in Guangdong were gathered. A dengue search index was constructed for developing the predictive models in combination with climate factors. The observed year and week were also included in the models to control for the long-term trend and seasonality. Several machine learning algorithms, including the support vector regression (SVR) algorithm, step-down linear regression model, gradient boosted regression tree algorithm (GBM), negative binomial regression model (NBM), least absolute shrinkage and selection operator (LASSO) linear regression model and generalized additive model (GAM), were used as candidate models to predict dengue incidence. Performance and goodness of fit of the models were assessed using the root-mean-square error (RMSE) and R-squared measures. The residuals of the models were examined using the autocorrelation and partial autocorrelation function analyses to check the validity of the models. The models were further validated using dengue surveillance data from five other provinces. The epidemics during the last 12 weeks and the peak of the 2014 large outbreak were accurately forecasted by the SVR model selected by a cross-validation technique. Moreover, the SVR model had the consistently smallest prediction error rates for tracking the dynamics of dengue and forecasting the outbreaks in other areas in China. The proposed SVR model achieved a superior performance in comparison with other forecasting techniques assessed in this study. The findings can help the government and community respond early to dengue epidemics.
Total sulfur determination in residues of crude oil distillation using FT-IR/ATR and variable selection methods

NASA Astrophysics Data System (ADS)

Müller, Aline Lima Hermes; Picoloto, Rochele Sogari; Mello, Paola de Azevedo; Ferrão, Marco Flores; dos Santos, Maria de Fátima Pereira; Guimarães, Regina Célia Lourenço; Müller, Edson Irineu; Flores, Erico Marlon Moraes

2012-04-01

Total sulfur concentration was determined in atmospheric residue (AR) and vacuum residue (VR) samples obtained from petroleum distillation process by Fourier transform infrared spectroscopy with attenuated total reflectance (FT-IR/ATR) in association with chemometric methods. Calibration and prediction set consisted of 40 and 20 samples, respectively. Calibration models were developed using two variable selection models: interval partial least squares (iPLS) and synergy interval partial least squares (siPLS). Different treatments and pre-processing steps were also evaluated for the development of models. The pre-treatment based on multiplicative scatter correction (MSC) and the mean centered data were selected for models construction. The use of siPLS as variable selection method provided a model with root mean square error of prediction (RMSEP) values significantly better than those obtained by PLS model using all variables. The best model was obtained using siPLS algorithm with spectra divided in 20 intervals and combinations of 3 intervals (911-824, 823-736 and 737-650 cm-1). This model produced a RMSECV of 400 mg kg-1 S and RMSEP of 420 mg kg-1 S, showing a correlation coefficient of 0.990.
Kinetic approach for the enzymatic determination of levodopa and carbidopa assisted by multivariate curve resolution-alternating least squares.

PubMed

Grünhut, Marcos; Garrido, Mariano; Centurión, Maria E; Fernández Band, Beatriz S

2010-07-12

A combination of kinetic spectroscopic monitoring and multivariate curve resolution-alternating least squares (MCR-ALS) was proposed for the enzymatic determination of levodopa (LVD) and carbidopa (CBD) in pharmaceuticals. The enzymatic reaction process was carried out in a reverse stopped-flow injection system and monitored by UV-vis spectroscopy. The spectra (292-600 nm) were recorded throughout the reaction and were analyzed by multivariate curve resolution-alternating least squares. A small calibration matrix containing nine mixtures was used in the model construction. Additionally, to evaluate the prediction ability of the model, a set with six validation mixtures was used. The lack of fit obtained was 4.3%, the explained variance 99.8% and the overall prediction error 5.5%. Tablets of commercial samples were analyzed and the results were validated by pharmacopeia method (high performance liquid chromatography). No significant differences were found (alpha=0.05) between the reference values and the ones obtained with the proposed method. It is important to note that a unique chemometric model made it possible to determine both analytes simultaneously. Copyright 2010 Elsevier B.V. All rights reserved.
Discrimination and prediction of cultivation age and parts of Panax ginseng by Fourier-transform infrared spectroscopy combined with multivariate statistical analysis.

PubMed

Lee, Byeong-Ju; Kim, Hye-Youn; Lim, Sa Rang; Huang, Linfang; Choi, Hyung-Kyoon

2017-01-01

Panax ginseng C.A. Meyer is a herb used for medicinal purposes, and its discrimination according to cultivation age has been an important and practical issue. This study employed Fourier-transform infrared (FT-IR) spectroscopy with multivariate statistical analysis to obtain a prediction model for discriminating cultivation ages (5 and 6 years) and three different parts (rhizome, tap root, and lateral root) of P. ginseng. The optimal partial-least-squares regression (PLSR) models for discriminating ginseng samples were determined by selecting normalization methods, number of partial-least-squares (PLS) components, and variable influence on projection (VIP) cutoff values. The best prediction model for discriminating 5- and 6-year-old ginseng was developed using tap root, vector normalization applied after the second differentiation, one PLS component, and a VIP cutoff of 1.0 (based on the lowest root-mean-square error of prediction value). In addition, for discriminating among the three parts of P. ginseng, optimized PLSR models were established using data sets obtained from vector normalization, two PLS components, and VIP cutoff values of 1.5 (for 5-year-old ginseng) and 1.3 (for 6-year-old ginseng). To our knowledge, this is the first study to provide a novel strategy for rapidly discriminating the cultivation ages and parts of P. ginseng using FT-IR by selected normalization methods, number of PLS components, and VIP cutoff values.
Discrimination and prediction of cultivation age and parts of Panax ginseng by Fourier-transform infrared spectroscopy combined with multivariate statistical analysis

PubMed Central

Lim, Sa Rang; Huang, Linfang

2017-01-01

Panax ginseng C.A. Meyer is a herb used for medicinal purposes, and its discrimination according to cultivation age has been an important and practical issue. This study employed Fourier-transform infrared (FT-IR) spectroscopy with multivariate statistical analysis to obtain a prediction model for discriminating cultivation ages (5 and 6 years) and three different parts (rhizome, tap root, and lateral root) of P. ginseng. The optimal partial-least-squares regression (PLSR) models for discriminating ginseng samples were determined by selecting normalization methods, number of partial-least-squares (PLS) components, and variable influence on projection (VIP) cutoff values. The best prediction model for discriminating 5- and 6-year-old ginseng was developed using tap root, vector normalization applied after the second differentiation, one PLS component, and a VIP cutoff of 1.0 (based on the lowest root-mean-square error of prediction value). In addition, for discriminating among the three parts of P. ginseng, optimized PLSR models were established using data sets obtained from vector normalization, two PLS components, and VIP cutoff values of 1.5 (for 5-year-old ginseng) and 1.3 (for 6-year-old ginseng). To our knowledge, this is the first study to provide a novel strategy for rapidly discriminating the cultivation ages and parts of P. ginseng using FT-IR by selected normalization methods, number of PLS components, and VIP cutoff values. PMID:29049369
A regression-kriging model for estimation of rainfall in the Laohahe basin

NASA Astrophysics Data System (ADS)

Wang, Hong; Ren, Li L.; Liu, Gao H.

2009-10-01

This paper presents a multivariate geostatistical algorithm called regression-kriging (RK) for predicting the spatial distribution of rainfall by incorporating five topographic/geographic factors of latitude, longitude, altitude, slope and aspect. The technique is illustrated using rainfall data collected at 52 rain gauges from the Laohahe basis in northeast China during 1986-2005 . Rainfall data from 44 stations were selected for modeling and the remaining 8 stations were used for model validation. To eliminate multicollinearity, the five explanatory factors were first transformed using factor analysis with three Principal Components (PCs) extracted. The rainfall data were then fitted using step-wise regression and residuals interpolated using SK. The regression coefficients were estimated by generalized least squares (GLS), which takes the spatial heteroskedasticity between rainfall and PCs into account. Finally, the rainfall prediction based on RK was compared with that predicted from ordinary kriging (OK) and ordinary least squares (OLS) multiple regression (MR). For correlated topographic factors are taken into account, RK improves the efficiency of predictions. RK achieved a lower relative root mean square error (RMSE) (44.67%) than MR (49.23%) and OK (73.60%) and a lower bias than MR and OK (23.82 versus 30.89 and 32.15 mm) for annual rainfall. It is much more effective for the wet season than for the dry season. RK is suitable for estimation of rainfall in areas where there are no stations nearby and where topography has a major influence on rainfall.
An Empirical State Error Covariance Matrix for the Weighted Least Squares Estimation Method

NASA Technical Reports Server (NTRS)

Frisbee, Joseph H., Jr.

2011-01-01

State estimation techniques effectively provide mean state estimates. However, the theoretical state error covariance matrices provided as part of these techniques often suffer from a lack of confidence in their ability to describe the un-certainty in the estimated states. By a reinterpretation of the equations involved in the weighted least squares algorithm, it is possible to directly arrive at an empirical state error covariance matrix. This proposed empirical state error covariance matrix will contain the effect of all error sources, known or not. Results based on the proposed technique will be presented for a simple, two observer, measurement error only problem.
Wavelet-based multiscale performance analysis: An approach to assess and improve hydrological models

NASA Astrophysics Data System (ADS)

Rathinasamy, Maheswaran; Khosa, Rakesh; Adamowski, Jan; ch, Sudheer; Partheepan, G.; Anand, Jatin; Narsimlu, Boini

2014-12-01

The temporal dynamics of hydrological processes are spread across different time scales and, as such, the performance of hydrological models cannot be estimated reliably from global performance measures that assign a single number to the fit of a simulated time series to an observed reference series. Accordingly, it is important to analyze model performance at different time scales. Wavelets have been used extensively in the area of hydrological modeling for multiscale analysis, and have been shown to be very reliable and useful in understanding dynamics across time scales and as these evolve in time. In this paper, a wavelet-based multiscale performance measure for hydrological models is proposed and tested (i.e., Multiscale Nash-Sutcliffe Criteria and Multiscale Normalized Root Mean Square Error). The main advantage of this method is that it provides a quantitative measure of model performance across different time scales. In the proposed approach, model and observed time series are decomposed using the Discrete Wavelet Transform (known as the à trous wavelet transform), and performance measures of the model are obtained at each time scale. The applicability of the proposed method was explored using various case studies-both real as well as synthetic. The synthetic case studies included various kinds of errors (e.g., timing error, under and over prediction of high and low flows) in outputs from a hydrologic model. The real time case studies investigated in this study included simulation results of both the process-based Soil Water Assessment Tool (SWAT) model, as well as statistical models, namely the Coupled Wavelet-Volterra (WVC), Artificial Neural Network (ANN), and Auto Regressive Moving Average (ARMA) methods. For the SWAT model, data from Wainganga and Sind Basin (India) were used, while for the Wavelet Volterra, ANN and ARMA models, data from the Cauvery River Basin (India) and Fraser River (Canada) were used. The study also explored the effect of the choice of the wavelets in multiscale model evaluation. It was found that the proposed wavelet-based performance measures, namely the MNSC (Multiscale Nash-Sutcliffe Criteria) and MNRMSE (Multiscale Normalized Root Mean Square Error), are a more reliable measure than traditional performance measures such as the Nash-Sutcliffe Criteria (NSC), Root Mean Square Error (RMSE), and Normalized Root Mean Square Error (NRMSE). Further, the proposed methodology can be used to: i) compare different hydrological models (both physical and statistical models), and ii) help in model calibration.
A new method to estimate average hourly global solar radiation on the horizontal surface

NASA Astrophysics Data System (ADS)

Pandey, Pramod K.; Soupir, Michelle L.

2012-10-01

A new model, Global Solar Radiation on Horizontal Surface (GSRHS), was developed to estimate the average hourly global solar radiation on the horizontal surfaces (Gh). The GSRHS model uses the transmission function (Tf,ij), which was developed to control hourly global solar radiation, for predicting solar radiation. The inputs of the model were: hour of day, day (Julian) of year, optimized parameter values, solar constant (H0), latitude, and longitude of the location of interest. The parameter values used in the model were optimized at a location (Albuquerque, NM), and these values were applied into the model for predicting average hourly global solar radiations at four different locations (Austin, TX; El Paso, TX; Desert Rock, NV; Seattle, WA) of the United States. The model performance was assessed using correlation coefficient (r), Mean Absolute Bias Error (MABE), Root Mean Square Error (RMSE), and coefficient of determinations (R2). The sensitivities of parameter to prediction were estimated. Results show that the model performed very well. The correlation coefficients (r) range from 0.96 to 0.99, while coefficients of determination (R2) range from 0.92 to 0.98. For daily and monthly prediction, error percentages (i.e. MABE and RMSE) were less than 20%. The approach we proposed here can be potentially useful for predicting average hourly global solar radiation on the horizontal surface for different locations, with the use of readily available data (i.e. latitude and longitude of the location) as inputs.
Predicting cognitive function from clinical measures of physical function and health status in older adults.

PubMed

Bolandzadeh, Niousha; Kording, Konrad; Salowitz, Nicole; Davis, Jennifer C; Hsu, Liang; Chan, Alison; Sharma, Devika; Blohm, Gunnar; Liu-Ambrose, Teresa

2015-01-01

Current research suggests that the neuropathology of dementia-including brain changes leading to memory impairment and cognitive decline-is evident years before the onset of this disease. Older adults with cognitive decline have reduced functional independence and quality of life, and are at greater risk for developing dementia. Therefore, identifying biomarkers that can be easily assessed within the clinical setting and predict cognitive decline is important. Early recognition of cognitive decline could promote timely implementation of preventive strategies. We included 89 community-dwelling adults aged 70 years and older in our study, and collected 32 measures of physical function, health status and cognitive function at baseline. We utilized an L1-L2 regularized regression model (elastic net) to identify which of the 32 baseline measures were strongly predictive of cognitive function after one year. We built three linear regression models: 1) based on baseline cognitive function, 2) based on variables consistently selected in every cross-validation loop, and 3) a full model based on all the 32 variables. Each of these models was carefully tested with nested cross-validation. Our model with the six variables consistently selected in every cross-validation loop had a mean squared prediction error of 7.47. This number was smaller than that of the full model (115.33) and the model with baseline cognitive function (7.98). Our model explained 47% of the variance in cognitive function after one year. We built a parsimonious model based on a selected set of six physical function and health status measures strongly predictive of cognitive function after one year. In addition to reducing the complexity of the model without changing the model significantly, our model with the top variables improved the mean prediction error and R-squared. These six physical function and health status measures can be easily implemented in a clinical setting.
Daily Suspended Sediment Discharge Prediction Using Multiple Linear Regression and Artificial Neural Network

NASA Astrophysics Data System (ADS)

Uca; Toriman, Ekhwan; Jaafar, Othman; Maru, Rosmini; Arfan, Amal; Saleh Ahmar, Ansari

2018-01-01

Prediction of suspended sediment discharge in a catchments area is very important because it can be used to evaluation the erosion hazard, management of its water resources, water quality, hydrology project management (dams, reservoirs, and irrigation) and to determine the extent of the damage that occurred in the catchments. Multiple Linear Regression analysis and artificial neural network can be used to predict the amount of daily suspended sediment discharge. Regression analysis using the least square method, whereas artificial neural networks using Radial Basis Function (RBF) and feedforward multilayer perceptron with three learning algorithms namely Levenberg-Marquardt (LM), Scaled Conjugate Descent (SCD) and Broyden-Fletcher-Goldfarb-Shanno Quasi-Newton (BFGS). The number neuron of hidden layer is three to sixteen, while in output layer only one neuron because only one output target. The mean absolute error (MAE), root mean square error (RMSE), coefficient of determination (R2 ) and coefficient of efficiency (CE) of the multiple linear regression (MLRg) value Model 2 (6 input variable independent) has the lowest the value of MAE and RMSE (0.0000002 and 13.6039) and highest R2 and CE (0.9971 and 0.9971). When compared between LM, SCG and RBF, the BFGS model structure 3-7-1 is the better and more accurate to prediction suspended sediment discharge in Jenderam catchment. The performance value in testing process, MAE and RMSE (13.5769 and 17.9011) is smallest, meanwhile R2 and CE (0.9999 and 0.9998) is the highest if it compared with the another BFGS Quasi-Newton model (6-3-1, 9-10-1 and 12-12-1). Based on the performance statistics value, MLRg, LM, SCG, BFGS and RBF suitable and accurately for prediction by modeling the non-linear complex behavior of suspended sediment responses to rainfall, water depth and discharge. The comparison between artificial neural network (ANN) and MLRg, the MLRg Model 2 accurately for to prediction suspended sediment discharge (kg/day) in Jenderan catchment area.
Analysis of tractable distortion metrics for EEG compression applications.

PubMed

Bazán-Prieto, Carlos; Blanco-Velasco, Manuel; Cárdenas-Barrera, Julián; Cruz-Roldán, Fernando

2012-07-01

Coding distortion in lossy electroencephalographic (EEG) signal compression methods is evaluated through tractable objective criteria. The percentage root-mean-square difference, which is a global and relative indicator of the quality held by reconstructed waveforms, is the most widely used criterion. However, this parameter does not ensure compliance with clinical standard guidelines that specify limits to allowable noise in EEG recordings. As a result, expert clinicians may have difficulties interpreting the resulting distortion of the EEG for a given value of this parameter. Conversely, the root-mean-square error is an alternative criterion that quantifies distortion in understandable units. In this paper, we demonstrate that the root-mean-square error is better suited to control and to assess the distortion introduced by compression methods. The experiments conducted in this paper show that the use of the root-mean-square error as target parameter in EEG compression allows both clinicians and scientists to infer whether coding error is clinically acceptable or not at no cost for the compression ratio.
Modelling the spatial distribution of the nuisance mosquito species Anopheles plumbeus (Diptera: Culicidae) in the Netherlands.

PubMed

Ibañez-Justicia, Adolfo; Cianci, Daniela

2015-05-01

Landscape modifications, urbanization or changes of use of rural-agricultural areas can create more favourable conditions for certain mosquito species and therefore indirectly cause nuisance problems for humans. This could potentially result in mosquito-borne disease outbreaks when the nuisance is caused by mosquito species that can transmit pathogens. Anopheles plumbeus is a nuisance mosquito species and a potential malaria vector. It is one of the most frequently observed species in the Netherlands. Information on the distribution of this species is essential for risk assessments. The purpose of the study was to investigate the potential spatial distribution of An. plumbeus in the Netherlands. Random forest models were used to link the occurrence and the abundance of An. plumbeus with environmental features and to produce distribution maps in the Netherlands. Mosquito data were collected using a cross-sectional study design in the Netherlands, from April to October 2010-2013. The environmental data were obtained from satellite imagery and weather stations. Statistical measures (accuracy for the occurrence model and mean squared error for the abundance model) were used to evaluate the models performance. The models were externally validated. The maps show that forested areas (centre of the Netherlands) and the east of the country were predicted as suitable for An. plumbeus. In particular high suitability and high abundance was predicted in the south-eastern provinces Limburg and North Brabant. Elevation, precipitation, day and night temperature and vegetation indices were important predictors for calculating the probability of occurrence for An. plumbeus. The probability of occurrence, vegetation indices and precipitation were important for predicting its abundance. The AUC value was 0.73 and the error in the validation was 0.29; the mean squared error value was 0.12. The areas identified by the model as suitable and with high abundance of An. plumbeus, are consistent with the areas from which nuisance was reported. Our results can be helpful in the assessment of vector-borne disease risk.
Paleotemperature reconstruction from mammalian phosphate δ18O records - an alternative view on data processing

NASA Astrophysics Data System (ADS)

Skrzypek, Grzegorz; Sadler, Rohan; Wiśniewski, Andrzej

2017-04-01

The stable oxygen isotope composition of phosphates (δ18O) extracted from mammalian bone and teeth material is commonly used as a proxy for paleotemperature. Historically, several different analytical and statistical procedures for determining air paleotemperatures from the measured δ18O of phosphates have been applied. This inconsistency in both stable isotope data processing and the application of statistical procedures has led to large and unwanted differences between calculated results. This study presents the uncertainty associated with two of the most commonly used regression methods: least squares inverted fit and transposed fit. We assessed the performance of these methods by designing and applying calculation experiments to multiple real-life data sets, calculating in reverse temperatures, and comparing them with true recorded values. Our calculations clearly show that the mean absolute errors are always substantially higher for the inverted fit (a causal model), with the transposed fit (a predictive model) returning mean values closer to the measured values (Skrzypek et al. 2015). The predictive models always performed better than causal models, with 12-65% lower mean absolute errors. Moreover, the least-squares regression (LSM) model is more appropriate than Reduced Major Axis (RMA) regression for calculating the environmental water stable oxygen isotope composition from phosphate signatures, as well as for calculating air temperature from the δ18O value of environmental water. The transposed fit introduces a lower overall error than the inverted fit for both the δ18O of environmental water and Tair calculations; therefore, the predictive models are more statistically efficient than the causal models in this instance. The direct comparison of paleotemperature results from different laboratories and studies may only be achieved if a single method of calculation is applied. Reference Skrzypek G., Sadler R., Wiśniewski A., 2016. Reassessment of recommendations for processing mammal phosphate δ18O data for paleotemperature reconstruction. Palaeogeography, Palaeoclimatology, Palaeoecology 446, 162-167.
SU-F-J-65: Prediction of Patient Setup Errors and Errors in the Calibration Curve from Prompt Gamma Proton Range Measurements

DOE Office of Scientific and Technical Information (OSTI.GOV)

Albert, J; Labarbe, R; Sterpin, E

2016-06-15

Purpose: To understand the extent to which the prompt gamma camera measurements can be used to predict the residual proton range due to setup errors and errors in the calibration curve. Methods: We generated ten variations on a default calibration curve (CC) and ten corresponding range maps (RM). Starting with the default RM, we chose a square array of N beamlets, which were then rotated by a random angle θ and shifted by a random vector s. We added a 5% distal Gaussian noise to each beamlet in order to introduce discrepancies that exist between the ranges predicted from themore » prompt gamma measurements and those simulated with Monte Carlo algorithms. For each RM, s, θ, along with an offset u in the CC, were optimized using a simple Euclidian distance between the default ranges and the ranges produced by the given RM. Results: The application of our method lead to the maximal overrange of 2.0mm and underrange of 0.6mm on average. Compared to the situations where s, θ, and u were ignored, these values were larger: 2.1mm and 4.3mm. In order to quantify the need for setup error corrections, we also performed computations in which u was corrected for, but s and θ were not. This yielded: 3.2mm and 3.2mm. The average computation time for 170 beamlets was 65 seconds. Conclusion: These results emphasize the necessity to correct for setup errors and the errors in the calibration curve. The simplicity and speed of our method makes it a good candidate for being implemented as a tool for in-room adaptive therapy. This work also demonstrates that the Prompt gamma range measurements can indeed be useful in the effort to reduce range errors. Given these results, and barring further refinements, this approach is a promising step towards an adaptive proton radiotherapy.« less

Air temperature sensors: dependence of radiative errors on sensor diameter in precision metrology and meteorology

NASA Astrophysics Data System (ADS)

de Podesta, Michael; Bell, Stephanie; Underwood, Robin

2018-04-01

In both meteorological and metrological applications, it is well known that air temperature sensors are susceptible to radiative errors. However, it is not widely known that the radiative error measured by an air temperature sensor in flowing air depends upon the sensor diameter, with smaller sensors reporting values closer to true air temperature. This is not a transient effect related to sensor heat capacity, but a fluid-dynamical effect arising from heat and mass flow in cylindrical geometries. This result has been known historically and is in meteorology text books. However, its significance does not appear to be widely appreciated and, as a consequence, air temperature can be—and probably is being—widely mis-estimated. In this paper, we first review prior descriptions of the ‘sensor size’ effect from the metrological and meteorological literature. We develop a heat transfer model to describe the process for cylindrical sensors, and evaluate the predicted temperature error for a range of sensor sizes and air speeds. We compare these predictions with published predictions and measurements. We report measurements demonstrating this effect in two laboratories at NPL in which the air flow and temperature are exceptionally closely controlled. The results are consistent with the heat-transfer model, and show that the air temperature error is proportional to the square root of the sensor diameter and that, even under good laboratory conditions, it can exceed 0.1 °C for a 6 mm diameter sensor. We then consider the implications of this result. In metrological applications, errors of the order of 0.1 °C are significant, representing limiting uncertainties in dimensional and mass measurements. In meteorological applications, radiative errors can easily be much larger. But in both cases, an understanding of the diameter dependence allows assessment and correction of the radiative error using a multi-sensor technique.
Postoperative refraction in the second eye having cataract surgery.

PubMed

Leffler, Christopher T; Wilkes, Martin; Reeves, Juliana; Mahmood, Muneera A

2011-01-01

Introduction. Previous cataract surgery studies assumed that first-eye predicted and observed postoperative refractions are equally important for predicting second-eye postoperative refraction. Methods. In a retrospective analysis of 173 patients having bilateral sequential phacoemulsification, multivariable linear regression was used to predict the second-eye postoperative refraction based on refractions predicted by the SRK-T formula for both eyes, the first-eye postoperative refraction, and the difference in IOL selected between eyes. Results. The first-eye observed postoperative refraction was an independent predictor of the second eye postoperative refraction (P < 0.001) and was weighted more heavily than the first-eye predicted refraction. Compared with the SRK-T formula, this model reduced the root-mean-squared (RMS) error of the predicted refraction by 11.3%. Conclusions. The first-eye postoperative refraction is an independent predictor of the second-eye postoperative refraction. The first-eye predicted refraction is less important. These findings may be due to interocular symmetry.
The output least-squares approach to estimating Lamé moduli

NASA Astrophysics Data System (ADS)

Gockenbach, Mark S.

2007-12-01

The Lamé moduli of a heterogeneous, isotropic, planar membrane can be estimated by observing the displacement of the membrane under a known edge traction, and choosing estimates of the moduli that best predict the observed displacement under a finite-element simulation. This algorithm converges to the exact moduli given pointwise measurements of the displacement on an increasingly fine mesh. The error estimates that prove this convergence also show the instability of the inverse problem.
ARGOS Testbed: Study of Multidisciplinary Challenges of Future Spaceborne Interferometric Arrays

DTIC Science & Technology

2004-09-01

optimized ex- tensively by ZEMAX . One drawback of the cemented dou- blet is that it has bonded glasses, therefore if there is a change of temperature, the...residual aberrations @root mean square ~rms! wavefront errors predicted by ZEMAX #. The final FK51- BaK2 design achieves 271.6 mm chromatic focal shift...of ZEMAX , a complete ARGOS optics layout is constructed based on the optical specifications of a subaperture, pyramidal mirror, and the beam combining
Fast and robust method for the determination of microstructure and composition in butadiene, styrene-butadiene, and isoprene rubber by near-infrared spectroscopy.

PubMed

Vilmin, Franck; Dussap, Claude; Coste, Nathalie

2006-06-01

In the tire industry, synthetic styrene-butadiene rubber (SBR), butadiene rubber (BR), and isoprene rubber (IR) elastomers are essential for conferring to the product its properties of grip and rolling resistance. Their physical properties depend on their chemical composition, i. e., their microstructure and styrene content, which must be accurately controlled. This paper describes a fast, robust, and highly reproducible near-infrared analytical method for the quantitative determination of the microstructure and styrene content. The quantitative models are calculated with the help of pure spectral profiles estimated from a partial least squares (PLS) regression, using (13)C nuclear magnetic resonance (NMR) as the reference method. This versatile approach allows the models to be applied over a large range of compositions, from a single BR to an SBR-IR blend. The resulting quantitative predictions are independent of the sample path length. As a consequence, the sample preparation is solvent free and simplified with a very fast (five minutes) hot filming step of a bulk polymer piece. No precise thickness control is required. Thus, the operator effect becomes negligible and the method is easily transferable. The root mean square error of prediction, depending on the rubber composition, is between 0.7% and 1.3%. The reproducibility standard error is less than 0.2% in every case.
Estimating the Mechanical Behavior of the Knee Joint during Crouch Gait: Implications for Real-Time Motor Control of Robotic Knee Orthoses

PubMed Central

Damiano, Diane L.; Bulea, Thomas C.

2016-01-01

Individuals with cerebral palsy frequently exhibit crouch gait, a pathological walking pattern characterized by excessive knee flexion. Knowledge of the knee joint moment during crouch gait is necessary for the design and control of assistive devices used for treatment. Our goal was to 1) develop statistical models to estimate knee joint moment extrema and dynamic stiffness during crouch gait, and 2) use the models to estimate the instantaneous joint moment during weight-acceptance. We retrospectively computed knee moments from 10 children with crouch gait and used stepwise linear regression to develop statistical models describing the knee moment features. The models explained at least 90% of the response value variability: peak moment in early (99%) and late (90%) stance, and dynamic stiffness of weight-acceptance flexion (94%) and extension (98%). We estimated knee extensor moment profiles from the predicted dynamic stiffness and instantaneous knee angle. This approach captured the timing and shape of the computed moment (root-mean-squared error: 2.64 Nm); including the predicted early-stance peak moment as a correction factor improved model performance (root-mean-squared error: 1.37 Nm). Our strategy provides a practical, accurate method to estimate the knee moment during crouch gait, and could be used for real-time, adaptive control of robotic orthoses. PMID:27101612
Four-dimensional data coupled to alternating weighted residue constraint quadrilinear decomposition model applied to environmental analysis: Determination of polycyclic aromatic hydrocarbons

NASA Astrophysics Data System (ADS)

Liu, Tingting; Zhang, Ling; Wang, Shutao; Cui, Yaoyao; Wang, Yutian; Liu, Lingfei; Yang, Zhe

2018-03-01

Qualitative and quantitative analysis of polycyclic aromatic hydrocarbons (PAHs) was carried out by three-dimensional fluorescence spectroscopy combining with Alternating Weighted Residue Constraint Quadrilinear Decomposition (AWRCQLD). The experimental subjects were acenaphthene (ANA) and naphthalene (NAP). Firstly, in order to solve the redundant information of the three-dimensional fluorescence spectral data, the wavelet transform was used to compress data in preprocessing. Then, the four-dimensional data was constructed by using the excitation-emission fluorescence spectra of different concentration PAHs. The sample data was obtained from three solvents that are methanol, ethanol and Ultra-pure water. The four-dimensional spectral data was analyzed by AWRCQLD, then the recovery rate of PAHs was obtained from the three solvents and compared respectively. On one hand, the results showed that PAHs can be measured more accurately by the high-order data, and the recovery rate was higher. On the other hand, the results presented that AWRCQLD can better reflect the superiority of four-dimensional algorithm than the second-order calibration and other third-order calibration algorithms. The recovery rate of ANA was 96.5% 103.3% and the root mean square error of prediction was 0.04 μgL- 1. The recovery rate of NAP was 96.7% 115.7% and the root mean square error of prediction was 0.06 μgL- 1.
Estimating patient-specific soft-tissue properties in a TKA knee.

PubMed

Ewing, Joseph A; Kaufman, Michelle K; Hutter, Erin E; Granger, Jeffrey F; Beal, Matthew D; Piazza, Stephen J; Siston, Robert A

2016-03-01

Surgical technique is one factor that has been identified as critical to success of total knee arthroplasty. Researchers have shown that computer simulations can aid in determining how decisions in the operating room generally affect post-operative outcomes. However, to use simulations to make clinically relevant predictions about knee forces and motions for a specific total knee patient, patient-specific models are needed. This study introduces a methodology for estimating knee soft-tissue properties of an individual total knee patient. A custom surgical navigation system and stability device were used to measure the force-displacement relationship of the knee. Soft-tissue properties were estimated using a parameter optimization that matched simulated tibiofemoral kinematics with experimental tibiofemoral kinematics. Simulations using optimized ligament properties had an average root mean square error of 3.5° across all tests while simulations using generic ligament properties taken from literature had an average root mean square error of 8.4°. Specimens showed large variability among ligament properties regardless of similarities in prosthetic component alignment and measured knee laxity. These results demonstrate the importance of soft-tissue properties in determining knee stability, and suggest that to make clinically relevant predictions of post-operative knee motions and forces using computer simulations, patient-specific soft-tissue properties are needed. © 2015 Orthopaedic Research Society. Published by Wiley Periodicals, Inc.
Optimal design of minimum mean-square error noise reduction algorithms using the simulated annealing technique.

PubMed

Bai, Mingsian R; Hsieh, Ping-Ju; Hur, Kur-Nan

2009-02-01

The performance of the minimum mean-square error noise reduction (MMSE-NR) algorithm in conjunction with time-recursive averaging (TRA) for noise estimation is found to be very sensitive to the choice of two recursion parameters. To address this problem in a more systematic manner, this paper proposes an optimization method to efficiently search the optimal parameters of the MMSE-TRA-NR algorithms. The objective function is based on a regression model, whereas the optimization process is carried out with the simulated annealing algorithm that is well suited for problems with many local optima. Another NR algorithm proposed in the paper employs linear prediction coding as a preprocessor for extracting the correlated portion of human speech. Objective and subjective tests were undertaken to compare the optimized MMSE-TRA-NR algorithm with several conventional NR algorithms. The results of subjective tests were processed by using analysis of variance to justify the statistic significance. A post hoc test, Tukey's Honestly Significant Difference, was conducted to further assess the pairwise difference between the NR algorithms.
Chemometric analysis for discrimination of extra virgin olive oils from whole and stoned olive pastes.

PubMed

De Luca, Michele; Restuccia, Donatella; Clodoveo, Maria Lisa; Puoci, Francesco; Ragno, Gaetano

2016-07-01

Chemometric discrimination of extra virgin olive oils (EVOO) from whole and stoned olive pastes was carried out by using Fourier transform infrared (FTIR) data and partial least squares-discriminant analysis (PLS1-DA) approach. Four Italian commercial EVOO brands, all in both whole and stoned version, were considered in this study. The adopted chemometric methodologies were able to describe the different chemical features in phenolic and volatile compounds contained in the two types of oil by using unspecific IR spectral information. Principal component analysis (PCA) was employed in cluster analysis to capture data patterns and to highlight differences between technological processes and EVOO brands. The PLS1-DA algorithm was used as supervised discriminant analysis to identify the different oil extraction procedures. Discriminant analysis was extended to the evaluation of possible adulteration by addition of aliquots of oil from whole paste to the most valuable oil from stoned olives. The statistical parameters from external validation of all the PLS models were very satisfactory, with low root mean square error of prediction (RMSEP) and relative error (RE%). Copyright © 2016 Elsevier Ltd. All rights reserved.
Improvement of GPS radio occultation retrieval error of E region electron density: COSMIC measurement and IRI model simulation

NASA Astrophysics Data System (ADS)

Wu, Kang-Hung; Su, Ching-Lun; Chu, Yen-Hsyang

2015-03-01

In this article, we use the International Reference Ionosphere (IRI) model to simulate temporal and spatial distributions of global E region electron densities retrieved by the FORMOSAT-3/COSMIC satellites by means of GPS radio occultation (RO) technique. Despite regional discrepancies in the magnitudes of the E region electron density, the IRI model simulations can, on the whole, describe the COSMIC measurements in quality and quantity. On the basis of global ionosonde network and the IRI model, the retrieval errors of the global COSMIC-measured E region peak electron density (NmE) from July 2006 to July 2011 are examined and simulated. The COSMIC measurement and the IRI model simulation both reveal that the magnitudes of the percentage error (PE) and root mean-square-error (RMSE) of the relative RO retrieval errors of the NmE values are dependent on local time (LT) and geomagnetic latitude, with minimum in the early morning and at high latitudes and maximum in the afternoon and at middle latitudes. In addition, the seasonal variation of PE and RMSE values seems to be latitude dependent. After removing the IRI model-simulated GPS RO retrieval errors from the original COSMIC measurements, the average values of the annual and monthly mean percentage errors of the RO retrieval errors of the COSMIC-measured E region electron density are, respectively, substantially reduced by a factor of about 2.95 and 3.35, and the corresponding root-mean-square errors show averaged decreases of 15.6% and 15.4%, respectively. It is found that, with this process, the largest reduction in the PE and RMSE of the COSMIC-measured NmE occurs at the equatorial anomaly latitudes 10°N-30°N in the afternoon from 14 to 18 LT, with a factor of 25 and 2, respectively. Statistics show that the residual errors that remained in the corrected COSMIC-measured NmE vary in a range of -20% to 38%, which are comparable to or larger than the percentage errors of the IRI-predicted NmE fluctuating in a range of -6.5% to 20%.
Petroleomics by electrospray ionization FT-ICR mass spectrometry coupled to partial least squares with variable selection methods: prediction of the total acid number of crude oils.

PubMed

Terra, Luciana A; Filgueiras, Paulo R; Tose, Lílian V; Romão, Wanderson; de Souza, Douglas D; de Castro, Eustáquio V R; de Oliveira, Mirela S L; Dias, Júlio C M; Poppi, Ronei J

2014-10-07

Negative-ion mode electrospray ionization, ESI(-), with Fourier transform ion cyclotron resonance mass spectrometry (FT-ICR MS) was coupled to a Partial Least Squares (PLS) regression and variable selection methods to estimate the total acid number (TAN) of Brazilian crude oil samples. Generally, ESI(-)-FT-ICR mass spectra present a power of resolution of ca. 500,000 and a mass accuracy less than 1 ppm, producing a data matrix containing over 5700 variables per sample. These variables correspond to heteroatom-containing species detected as deprotonated molecules, [M - H](-) ions, which are identified primarily as naphthenic acids, phenols and carbazole analog species. The TAN values for all samples ranged from 0.06 to 3.61 mg of KOH g(-1). To facilitate the spectral interpretation, three methods of variable selection were studied: variable importance in the projection (VIP), interval partial least squares (iPLS) and elimination of uninformative variables (UVE). The UVE method seems to be more appropriate for selecting important variables, reducing the dimension of the variables to 183 and producing a root mean square error of prediction of 0.32 mg of KOH g(-1). By reducing the size of the data, it was possible to relate the selected variables with their corresponding molecular formulas, thus identifying the main chemical species responsible for the TAN values.
Development of a partial least squares-artificial neural network (PLS-ANN) hybrid model for the prediction of consumer liking scores of ready-to-drink green tea beverages.

PubMed

Yu, Peigen; Low, Mei Yin; Zhou, Weibiao

2018-01-01

In order to develop products that would be preferred by consumers, the effects of the chemical compositions of ready-to-drink green tea beverages on consumer liking were studied through regression analyses. Green tea model systems were prepared by dosing solutions of 0.1% green tea extract with differing concentrations of eight flavour keys deemed to be important for green tea aroma and taste, based on a D-optimal experimental design, before undergoing commercial sterilisation. Sensory evaluation of the green tea model system was carried out using an untrained consumer panel to obtain hedonic liking scores of the samples. Regression models were subsequently trained to objectively predict the consumer liking scores of the green tea model systems. A linear partial least squares (PLS) regression model was developed to describe the effects of the eight flavour keys on consumer liking, with a coefficient of determination (R 2 ) of 0.733, and a root-mean-square error (RMSE) of 3.53%. The PLS model was further augmented with an artificial neural network (ANN) to establish a PLS-ANN hybrid model. The established hybrid model was found to give a better prediction of consumer liking scores, based on its R 2 (0.875) and RMSE (2.41%). Copyright © 2017 Elsevier Ltd. All rights reserved.
Study on for soluble solids contents measurement of grape juice beverage based on Vis/NIRS and chemomtrics

NASA Astrophysics Data System (ADS)

Wu, Di; He, Yong

2007-11-01

The aim of this study is to investigate the potential of the visible and near infrared spectroscopy (Vis/NIRS) technique for non-destructive measurement of soluble solids contents (SSC) in grape juice beverage. 380 samples were studied in this paper. Smoothing way of Savitzky-Golay and standard normal variate were applied for the pre-processing of spectral data. Least-squares support vector machines (LS-SVM) with RBF kernel function was applied to developing the SSC prediction model based on the Vis/NIRS absorbance data. The determination coefficient for prediction (Rp2) of the results predicted by LS-SVM model was 0. 962 and root mean square error (RMSEP) was 0. 434137. It is concluded that Vis/NIRS technique can quantify the SSC of grape juice beverage fast and non-destructively.. At the same time, LS-SVM model was compared with PLS and back propagation neural network (BP-NN) methods. The results showed that LS-SVM was superior to the conventional linear and non-linear methods in predicting SSC of grape juice beverage. In this study, the generation ability of LS-SVM, PLS and BP-NN models were also investigated. It is concluded that LS-SVM regression method is a promising technique for chemometrics in quantitative prediction.
Intelligent sensing sensory quality of Chinese rice wine using near infrared spectroscopy and nonlinear tools.

PubMed

Ouyang, Qin; Chen, Quansheng; Zhao, Jiewen

2016-02-05

The approach presented herein reports the application of near infrared (NIR) spectroscopy, in contrast with human sensory panel, as a tool for estimating Chinese rice wine quality; concretely, to achieve the prediction of the overall sensory scores assigned by the trained sensory panel. Back propagation artificial neural network (BPANN) combined with adaptive boosting (AdaBoost) algorithm, namely BP-AdaBoost, as a novel nonlinear algorithm, was proposed in modeling. First, the optimal spectra intervals were selected by synergy interval partial least square (Si-PLS). Then, BP-AdaBoost model based on the optimal spectra intervals was established, called Si-BP-AdaBoost model. These models were optimized by cross validation, and the performance of each final model was evaluated according to correlation coefficient (Rp) and root mean square error of prediction (RMSEP) in prediction set. Si-BP-AdaBoost showed excellent performance in comparison with other models. The best Si-BP-AdaBoost model was achieved with Rp=0.9180 and RMSEP=2.23 in the prediction set. It was concluded that NIR spectroscopy combined with Si-BP-AdaBoost was an appropriate method for the prediction of the sensory quality in Chinese rice wine. Copyright © 2015 Elsevier B.V. All rights reserved.
Bitterness intensity prediction of berberine hydrochloride using an electronic tongue and a GA-BP neural network.

PubMed

Liu, Ruixin; Zhang, Xiaodong; Zhang, Lu; Gao, Xiaojie; Li, Huiling; Shi, Junhan; Li, Xuelin

2014-06-01

The aim of this study was to predict the bitterness intensity of a drug using an electronic tongue (e-tongue). The model drug of berberine hydrochloride was used to establish a bitterness prediction model (BPM), based on the taste evaluation of bitterness intensity by a taste panel, the data provided by the e-tongue and a genetic algorithm-back-propagation neural network (GA-BP) modeling method. The modeling characteristics of the GA-BP were compared with those of multiple linear regression, partial least square regression and BP methods. The determination coefficient of the BPM was 0.99965±0.00004, the root mean square error of cross-validation was 0.1398±0.0488 and the correlation coefficient of the cross-validation between the true and predicted values was 0.9959±0.0027. The model is superior to the other three models based on these indicators. In conclusion, the model established in this study has a high fitting degree and may be used for the bitterness prediction modeling of berberine hydrochloride of different concentrations. The model also provides a reference for the generation of BPMs of other drugs. Additionally, the algorithm of the study is able to conduct a rapid and accurate quantitative analysis of the data provided by the e-tongue.
Bitterness intensity prediction of berberine hydrochloride using an electronic tongue and a GA-BP neural network

PubMed Central

LIU, RUIXIN; ZHANG, XIAODONG; ZHANG, LU; GAO, XIAOJIE; LI, HUILING; SHI, JUNHAN; LI, XUELIN

2014-01-01

The aim of this study was to predict the bitterness intensity of a drug using an electronic tongue (e-tongue). The model drug of berberine hydrochloride was used to establish a bitterness prediction model (BPM), based on the taste evaluation of bitterness intensity by a taste panel, the data provided by the e-tongue and a genetic algorithm-back-propagation neural network (GA-BP) modeling method. The modeling characteristics of the GA-BP were compared with those of multiple linear regression, partial least square regression and BP methods. The determination coefficient of the BPM was 0.99965±0.00004, the root mean square error of cross-validation was 0.1398±0.0488 and the correlation coefficient of the cross-validation between the true and predicted values was 0.9959±0.0027. The model is superior to the other three models based on these indicators. In conclusion, the model established in this study has a high fitting degree and may be used for the bitterness prediction modeling of berberine hydrochloride of different concentrations. The model also provides a reference for the generation of BPMs of other drugs. Additionally, the algorithm of the study is able to conduct a rapid and accurate quantitative analysis of the data provided by the e-tongue. PMID:24926369
Ultra-Short-Term Wind Power Prediction Using a Hybrid Model

NASA Astrophysics Data System (ADS)

Mohammed, E.; Wang, S.; Yu, J.

2017-05-01

This paper aims to develop and apply a hybrid model of two data analytical methods, multiple linear regressions and least square (MLR&LS), for ultra-short-term wind power prediction (WPP), for example taking, Northeast China electricity demand. The data was obtained from the historical records of wind power from an offshore region, and from a wind farm of the wind power plant in the areas. The WPP achieved in two stages: first, the ratios of wind power were forecasted using the proposed hybrid method, and then the transformation of these ratios of wind power to obtain forecasted values. The hybrid model combines the persistence methods, MLR and LS. The proposed method included two prediction types, multi-point prediction and single-point prediction. WPP is tested by applying different models such as autoregressive moving average (ARMA), autoregressive integrated moving average (ARIMA) and artificial neural network (ANN). By comparing results of the above models, the validity of the proposed hybrid model is confirmed in terms of error and correlation coefficient. Comparison of results confirmed that the proposed method works effectively. Additional, forecasting errors were also computed and compared, to improve understanding of how to depict highly variable WPP and the correlations between actual and predicted wind power.
A Survey of Terrain Modeling Technologies and Techniques

DTIC Science & Technology

2007-09-01

Washington , DC 20314-1000 ERDC/TEC TR-08-2 ii Abstract: Test planning, rehearsal, and distributed test events for Future Combat Systems (FCS) require...distance) for all five lines of control points. Blue circles are errors of DSM (original data), red squares are DTM (bare Earth, processed by Intermap...circles are DSM, red squares are DTM ........... 8 5 Distribution of errors for line No. 729. Blue circles are DSM, red squares are DTM
Analysis backpropagation methods with neural network for prediction of children's ability in psychomotoric

NASA Astrophysics Data System (ADS)

Izhari, F.; Dhany, H. W.; Zarlis, M.; Sutarman

2018-03-01

A good age in optimizing aspects of development is at the age of 4-6 years, namely with psychomotor development. Psychomotor is broader, more difficult to monitor but has a meaningful value for the child's life because it directly affects his behavior and deeds. Therefore, there is a problem to predict the child's ability level based on psychomotor. This analysis uses backpropagation method analysis with artificial neural network to predict the ability of the child on the psychomotor aspect by generating predictions of the child's ability on psychomotor and testing there is a mean squared error (MSE) value at the end of the training of 0.001. There are 30% of children aged 4-6 years have a good level of psychomotor ability, excellent, less good, and good enough.

Space-Time Joint Interference Cancellation Using Fuzzy-Inference-Based Adaptive Filtering Techniques in Frequency-Selective Multipath Channels

NASA Astrophysics Data System (ADS)

Hu, Chia-Chang; Lin, Hsuan-Yu; Chen, Yu-Fan; Wen, Jyh-Horng

2006-12-01

An adaptive minimum mean-square error (MMSE) array receiver based on the fuzzy-logic recursive least-squares (RLS) algorithm is developed for asynchronous DS-CDMA interference suppression in the presence of frequency-selective multipath fading. This receiver employs a fuzzy-logic control mechanism to perform the nonlinear mapping of the squared error and squared error variation, denoted by ([InlineEquation not available: see fulltext.],[InlineEquation not available: see fulltext.]), into a forgetting factor[InlineEquation not available: see fulltext.]. For the real-time applicability, a computationally efficient version of the proposed receiver is derived based on the least-mean-square (LMS) algorithm using the fuzzy-inference-controlled step-size[InlineEquation not available: see fulltext.]. This receiver is capable of providing both fast convergence/tracking capability as well as small steady-state misadjustment as compared with conventional LMS- and RLS-based MMSE DS-CDMA receivers. Simulations show that the fuzzy-logic LMS and RLS algorithms outperform, respectively, other variable step-size LMS (VSS-LMS) and variable forgetting factor RLS (VFF-RLS) algorithms at least 3 dB and 1.5 dB in bit-error-rate (BER) for multipath fading channels.
Mapping from disease-specific measures to health-state utility values in individuals with migraine.

PubMed

Gillard, Patrick J; Devine, Beth; Varon, Sepideh F; Liu, Lei; Sullivan, Sean D

2012-05-01

The objective of this study was to develop empirical algorithms that estimate health-state utility values from disease-specific quality-of-life scores in individuals with migraine. Data from a cross-sectional, multicountry study were used. Individuals with episodic and chronic migraine were randomly assigned to training or validation samples. Spearman's correlation coefficients between paired EuroQol five-dimensional (EQ-5D) questionnaire utility values and both Headache Impact Test (HIT-6) scores and Migraine-Specific Quality-of-Life Questionnaire version 2.1 (MSQ) domain scores (role restrictive, role preventive, and emotional function) were examined. Regression models were constructed to estimate EQ-5D questionnaire utility values from the HIT-6 score or the MSQ domain scores. Preferred algorithms were confirmed in the validation samples. In episodic migraine, the preferred HIT-6 and MSQ algorithms explained 22% and 25% of the variance (R(2)) in the training samples, respectively, and had similar prediction errors (root mean square errors of 0.30). In chronic migraine, the preferred HIT-6 and MSQ algorithms explained 36% and 45% of the variance in the training samples, respectively, and had similar prediction errors (root mean square errors 0.31 and 0.29). In episodic and chronic migraine, no statistically significant differences were observed between the mean observed and the mean estimated EQ-5D questionnaire utility values for the preferred HIT-6 and MSQ algorithms in the validation samples. The relationship between the EQ-5D questionnaire and the HIT-6 or the MSQ is adequate to use regression equations to estimate EQ-5D questionnaire utility values. The preferred HIT-6 and MSQ algorithms will be useful in estimating health-state utilities in migraine trials in which no preference-based measure is present. Copyright © 2012 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
Comparing Thermal Process Validation Methods for Salmonella Inactivation on Almond Kernels.

PubMed

Jeong, Sanghyup; Marks, Bradley P; James, Michael K

2017-01-01

Ongoing regulatory changes are increasing the need for reliable process validation methods for pathogen reduction processes involving low-moisture products; however, the reliability of various validation methods has not been evaluated. Therefore, the objective was to quantify accuracy and repeatability of four validation methods (two biologically based and two based on time-temperature models) for thermal pasteurization of almonds. Almond kernels were inoculated with Salmonella Enteritidis phage type 30 or Enterococcus faecium (NRRL B-2354) at ~10 8 CFU/g, equilibrated to 0.24, 0.45, 0.58, or 0.78 water activity (a w ), and then heated in a pilot-scale, moist-air impingement oven (dry bulb 121, 149, or 177°C; dew point <33.0, 69.4, 81.6, or 90.6°C; v air = 2.7 m/s) to a target lethality of ~4 log. Almond surface temperatures were measured in two ways, and those temperatures were used to calculate Salmonella inactivation using a traditional (D, z) model and a modified model accounting for process humidity. Among the process validation methods, both methods based on time-temperature models had better repeatability, with replication errors approximately half those of the surrogate ( E. faecium ). Additionally, the modified model yielded the lowest root mean squared error in predicting Salmonella inactivation (1.1 to 1.5 log CFU/g); in contrast, E. faecium yielded a root mean squared error of 1.2 to 1.6 log CFU/g, and the traditional model yielded an unacceptably high error (3.4 to 4.4 log CFU/g). Importantly, the surrogate and modified model both yielded lethality predictions that were statistically equivalent (α = 0.05) to actual Salmonella lethality. The results demonstrate the importance of methodology, a w , and process humidity when validating thermal pasteurization processes for low-moisture foods, which should help processors select and interpret validation methods to ensure product safety.
Obstacle Detection in Indoor Environment for Visually Impaired Using Mobile Camera

NASA Astrophysics Data System (ADS)

Rahman, Samiur; Ullah, Sana; Ullah, Sehat

2018-01-01

Obstacle detection can improve the mobility as well as the safety of visually impaired people. In this paper, we present a system using mobile camera for visually impaired people. The proposed algorithm works in indoor environment and it uses a very simple technique of using few pre-stored floor images. In indoor environment all unique floor types are considered and a single image is stored for each unique floor type. These floor images are considered as reference images. The algorithm acquires an input image frame and then a region of interest is selected and is scanned for obstacle using pre-stored floor images. The algorithm compares the present frame and the next frame and compute mean square error of the two frames. If mean square error is less than a threshold value α then it means that there is no obstacle in the next frame. If mean square error is greater than α then there are two possibilities; either there is an obstacle or the floor type is changed. In order to check if the floor is changed, the algorithm computes mean square error of next frame and all stored floor types. If minimum of mean square error is less than a threshold value α then flour is changed otherwise there exist an obstacle. The proposed algorithm works in real-time and 96% accuracy has been achieved.
Non-invasive prediction of bloodstain age using the principal component and a back propagation artificial neural network

NASA Astrophysics Data System (ADS)

Sun, Huimin; Meng, Yaoyong; Zhang, Pingli; Li, Yajing; Li, Nan; Li, Caiyun; Guo, Zhiyou

2017-09-01

The age determination of bloodstains is an important and immediate challenge for forensic science. No reliable methods are currently available for estimating the age of bloodstains. Here we report a method for determining the age of bloodstains at different storage temperatures. Bloodstains were stored at 37 °C, 25 °C, 4 °C, and -20 °C for 80 d. Bloodstains were measured using Raman spectroscopy at various time points. The principal component and a back propagation artificial neural network model were then established for estimating the age of the bloodstains. The results were ideal; the square of correlation coefficient was up to 0.99 (R 2 > 0.99) and the root mean square error of the prediction at lowest reached 55.9829 h. This method is real-time, non-invasive, non-destructive and highly efficiency. It may well prove that Raman spectroscopy is a promising tool for the estimation of the age of bloodstains.
Physicochemical characterization of Lavandula spp. honey with FT-Raman spectroscopy.

PubMed

Anjos, Ofélia; Santos, António J A; Paixão, Vasco; Estevinho, Letícia M

2018-02-01

This study aimed to evaluate the potential of FT-Raman spectroscopy in the prediction of the chemical composition of Lavandula spp. monofloral honey. Partial Least Squares (PLS) regression models were performed for the quantitative estimation and the results were correlated with those obtained using reference methods. Good calibration models were obtained for electrical conductivity, ash, total acidity, pH, reducing sugars, hydroxymethylfurfural (HMF), proline, diastase index, apparent sucrose, total flavonoids content and total phenol content. On the other hand, the model was less accurate for pH determination. The calibration models had high r 2 (ranging between 92.8% and 99.9%), high residual prediction deviation - RPD (ranging between 4.2 and 26.8) and low root mean square errors. These results confirm the hypothesis that FT-Raman is a useful technique for the quality control and chemical properties' evaluation of Lavandula spp honey. Its application may allow improving the efficiency, speed and cost of the current laboratory analysis. Copyright © 2017 Elsevier B.V. All rights reserved.
Automatic variable selection method and a comparison for quantitative analysis in laser-induced breakdown spectroscopy

NASA Astrophysics Data System (ADS)

Duan, Fajie; Fu, Xiao; Jiang, Jiajia; Huang, Tingting; Ma, Ling; Zhang, Cong

2018-05-01

In this work, an automatic variable selection method for quantitative analysis of soil samples using laser-induced breakdown spectroscopy (LIBS) is proposed, which is based on full spectrum correction (FSC) and modified iterative predictor weighting-partial least squares (mIPW-PLS). The method features automatic selection without artificial processes. To illustrate the feasibility and effectiveness of the method, a comparison with genetic algorithm (GA) and successive projections algorithm (SPA) for different elements (copper, barium and chromium) detection in soil was implemented. The experimental results showed that all the three methods could accomplish variable selection effectively, among which FSC-mIPW-PLS required significantly shorter computation time (12 s approximately for 40,000 initial variables) than the others. Moreover, improved quantification models were got with variable selection approaches. The root mean square errors of prediction (RMSEP) of models utilizing the new method were 27.47 (copper), 37.15 (barium) and 39.70 (chromium) mg/kg, which showed comparable prediction effect with GA and SPA.
A hybrid SVM-FFA method for prediction of monthly mean global solar radiation

NASA Astrophysics Data System (ADS)

Shamshirband, Shahaboddin; Mohammadi, Kasra; Tong, Chong Wen; Zamani, Mazdak; Motamedi, Shervin; Ch, Sudheer

2016-07-01

In this study, a hybrid support vector machine-firefly optimization algorithm (SVM-FFA) model is proposed to estimate monthly mean horizontal global solar radiation (HGSR). The merit of SVM-FFA is assessed statistically by comparing its performance with three previously used approaches. Using each approach and long-term measured HGSR, three models are calibrated by considering different sets of meteorological parameters measured for Bandar Abbass situated in Iran. It is found that the model (3) utilizing the combination of relative sunshine duration, difference between maximum and minimum temperatures, relative humidity, water vapor pressure, average temperature, and extraterrestrial solar radiation shows superior performance based upon all approaches. Moreover, the extraterrestrial radiation is introduced as a significant parameter to accurately estimate the global solar radiation. The survey results reveal that the developed SVM-FFA approach is greatly capable to provide favorable predictions with significantly higher precision than other examined techniques. For the SVM-FFA (3), the statistical indicators of mean absolute percentage error (MAPE), root mean square error (RMSE), relative root mean square error (RRMSE), and coefficient of determination ( R 2) are 3.3252 %, 0.1859 kWh/m2, 3.7350 %, and 0.9737, respectively which according to the RRMSE has an excellent performance. As a more evaluation of SVM-FFA (3), the ratio of estimated to measured values is computed and found that 47 out of 48 months considered as testing data fall between 0.90 and 1.10. Also, by performing a further verification, it is concluded that SVM-FFA (3) offers absolute superiority over the empirical models using relatively similar input parameters. In a nutshell, the hybrid SVM-FFA approach would be considered highly efficient to estimate the HGSR.
Estimation of flood-frequency characteristics of small urban streams in North Carolina

USGS Publications Warehouse

Robbins, J.C.; Pope, B.F.

1996-01-01

A statewide study was conducted to develop methods for estimating the magnitude and frequency of floods of small urban streams in North Carolina. This type of information is critical in the design of bridges, culverts and water-control structures, establishment of flood-insurance rates and flood-plain regulation, and for other uses by urban planners and engineers. Concurrent records of rainfall and runoff data collected in small urban basins were used to calibrate rainfall-runoff models. Historic rain- fall records were used with the calibrated models to synthesize a long- term record of annual peak discharges. The synthesized record of annual peak discharges were used in a statistical analysis to determine flood- frequency distributions. These frequency distributions were used with distributions from previous investigations to develop a database for 32 small urban basins in the Blue Ridge-Piedmont, Sand Hills, and Coastal Plain hydrologic areas. The study basins ranged in size from 0.04 to 41.0 square miles. Data describing the size and shape of the basin, level of urban development, and climate and rural flood charac- teristics also were included in the database. Estimation equations were developed by relating flood-frequency char- acteristics to basin characteristics in a generalized least-squares regression analysis. The most significant basin characteristics are drainage area, impervious area, and rural flood discharge. The model error and prediction errors for the estimating equations were less than those for the national flood-frequency equations previously reported. Resulting equations, which have prediction errors generally less than 40 percent, can be used to estimate flood-peak discharges for 2-, 5-, 10-, 25-, 50-, and 100-year recurrence intervals for small urban basins across the State assuming negligible, sustainable, in- channel detention or basin storage.
Prediction of forest fires occurrences with area-level Poisson mixed models.

PubMed

Boubeta, Miguel; Lombardía, María José; Marey-Pérez, Manuel Francisco; Morales, Domingo

2015-05-01

The number of fires in forest areas of Galicia (north-west of Spain) during the summer period is quite high. Local authorities are interested in analyzing the factors that explain this phenomenon. Poisson regression models are good tools for describing and predicting the number of fires per forest areas. This work employs area-level Poisson mixed models for treating real data about fires in forest areas. A parametric bootstrap method is applied for estimating the mean squared errors of fires predictors. The developed methodology and software are applied to a real data set of fires in forest areas of Galicia. Copyright © 2015 Elsevier Ltd. All rights reserved.
Application of a soft computing technique in predicting the percentage of shear force carried by walls in a rectangular channel with non-homogeneous roughness.

PubMed

Khozani, Zohreh Sheikh; Bonakdari, Hossein; Zaji, Amir Hossein

2016-01-01

Two new soft computing models, namely genetic programming (GP) and genetic artificial algorithm (GAA) neural network (a combination of modified genetic algorithm and artificial neural network methods) were developed in order to predict the percentage of shear force in a rectangular channel with non-homogeneous roughness. The ability of these methods to estimate the percentage of shear force was investigated. Moreover, the independent parameters' effectiveness in predicting the percentage of shear force was determined using sensitivity analysis. According to the results, the GP model demonstrated superior performance to the GAA model. A comparison was also made between the GP program determined as the best model and five equations obtained in prior research. The GP model with the lowest error values (root mean square error ((RMSE) of 0.0515) had the best function compared with the other equations presented for rough and smooth channels as well as smooth ducts. The equation proposed for rectangular channels with rough boundaries (RMSE of 0.0642) outperformed the prior equations for smooth boundaries.
Fiber-optic evanescent-wave spectroscopy for fast multicomponent analysis of human blood

NASA Astrophysics Data System (ADS)

Simhi, Ronit; Gotshal, Yaron; Bunimovich, David; Katzir, Abraham; Sela, Ben-Ami

1996-07-01

A spectral analysis of human blood serum was undertaken by fiber-optic evanescent-wave spectroscopy (FEWS) by the use of a Fourier-transform infrared spectrometer. A special cell for the FEWS measurements was designed and built that incorporates an IR-transmitting silver halide fiber and a means for introducing the blood-serum sample. Further improvements in analysis were obtained by the adoption of multivariate calibration techniques that are already used in clinical chemistry. The partial least-squares algorithm was used to calculate the concentrations of cholesterol, total protein, urea, and uric acid in human blood serum. The estimated prediction errors obtained (in percent from the average value) were 6% for total protein, 15% for cholesterol, 30% for urea, and 30% for uric acid. These results were compared with another independent prediction method that used a neural-network model. This model yielded estimated prediction errors of 8.8% for total protein, 25% for cholesterol, and 21% for uric acid. spectroscopy, fiber-optic evanescent-wave spectroscopy, Fourier-transform infrared spectrometer, blood, multivariate calibration, neural networks.
Depicting mass flow rate of R134a /LPG refrigerant through straight and helical coiled adiabatic capillary tubes of vapor compression refrigeration system using artificial neural network approach

NASA Astrophysics Data System (ADS)

Gill, Jatinder; Singh, Jagdev

2018-07-01

In this work, an experimental investigation is carried out with R134a and LPG refrigerant mixture for depicting mass flow rate through straight and helical coil adiabatic capillary tubes in a vapor compression refrigeration system. Various experiments were conducted under steady-state conditions, by changing capillary tube length, inner diameter, coil diameter and degree of subcooling. The results showed that mass flow rate through helical coil capillary tube was found lower than straight capillary tube by about 5-16%. Dimensionless correlation and Artificial Neural Network (ANN) models were developed to predict mass flow rate. It was found that dimensionless correlation and ANN model predictions agreed well with experimental results and brought out an absolute fraction of variance of 0.961 and 0.988, root mean square error of 0.489 and 0.275 and mean absolute percentage error of 4.75% and 2.31% respectively. The results suggested that ANN model shows better statistical prediction than dimensionless correlation model.
Accuracy of maximum likelihood and least-squares estimates in the lidar slope method with noisy data.

PubMed

Eberhard, Wynn L

2017-04-01

The maximum likelihood estimator (MLE) is derived for retrieving the extinction coefficient and zero-range intercept in the lidar slope method in the presence of random and independent Gaussian noise. Least-squares fitting, weighted by the inverse of the noise variance, is equivalent to the MLE. Monte Carlo simulations demonstrate that two traditional least-squares fitting schemes, which use different weights, are less accurate. Alternative fitting schemes that have some positive attributes are introduced and evaluated. The principal factors governing accuracy of all these schemes are elucidated. Applying these schemes to data with Poisson rather than Gaussian noise alters accuracy little, even when the signal-to-noise ratio is low. Methods to estimate optimum weighting factors in actual data are presented. Even when the weighting estimates are coarse, retrieval accuracy declines only modestly. Mathematical tools are described for predicting retrieval accuracy. Least-squares fitting with inverse variance weighting has optimum accuracy for retrieval of parameters from single-wavelength lidar measurements when noise, errors, and uncertainties are Gaussian distributed, or close to optimum when only approximately Gaussian.
Limb Dominance Results from Asymmetries in Predictive and Impedance Control Mechanisms

PubMed Central

Yadav, Vivek; Sainburg, Robert L.

2014-01-01

Handedness is a pronounced feature of human motor behavior, yet the underlying neural mechanisms remain unclear. We hypothesize that motor lateralization results from asymmetries in predictive control of task dynamics and in control of limb impedance. To test this hypothesis, we present an experiment with two different force field environments, a field with a predictable magnitude that varies with the square of velocity, and a field with a less predictable magnitude that varies linearly with velocity. These fields were designed to be compatible with controllers that are specialized in predicting limb and task dynamics, and modulating position and velocity dependent impedance, respectively. Because the velocity square field does not change the form of the equations of motion for the reaching arm, we reasoned that a forward dynamic-type controller should perform well in this field, while control of linear damping and stiffness terms should be less effective. In contrast, the unpredictable linear field should be most compatible with impedance control, but incompatible with predictive dynamics control. We measured steady state final position accuracy and 3 trajectory features during exposure to these fields: Mean squared jerk, Straightness, and Movement time. Our results confirmed that each arm made straighter, smoother, and quicker movements in its compatible field. Both arms showed similar final position accuracies, which were achieved using more extensive corrective sub-movements when either arm performed in its incompatible field. Finally, each arm showed limited adaptation to its incompatible field. Analysis of the dependence of trajectory errors on field magnitude suggested that dominant arm adaptation occurred by prediction of the mean field, thus exploiting predictive mechanisms for adaptation to the unpredictable field. Overall, our results support the hypothesis that motor lateralization reflects asymmetries in specific motor control mechanisms associated with predictive control of limb and task dynamics, and modulation of limb impedance. PMID:24695543
Study on fast measurement of sugar content of yogurt using Vis/NIR spectroscopy techniques

NASA Astrophysics Data System (ADS)

He, Yong; Feng, Shuijuan; Wu, Di; Li, Xiaoli

2006-09-01

In order to measuring the sugar content of yogurt rapidly, a fast measurement of sugar content of yogurt using Vis/NIR-spectroscopy techniques was established. 25 samples selected separately from five different brands of yogurt were measured by Vis/NIR-spectroscopy. The sugar content of yogurt on positions scanned by spectrum were measured by a sugar content meter. The mathematical model between sugar content and Vis/NIR spectral measurements was established and developed based on partial least squares (PLS). The correlation coefficient of sugar content based on PLS model is more than 0.894, and standard error of calibration (SEC) is 0.356, standard error of prediction (SEP) is 0.389. Through predicting the sugar content quantitatively of 35 samples of yogurt from 5 different brands, the correlation coefficient between predictive value and measured value of those samples is more than 0.934. The results show the good to excellent prediction performance. The Vis/NIR spectroscopy technique had significantly greater accuracy for determining the sugar content. It was concluded that the Vis/NIRS measurement technique seems reliable to assess the fast measurement of sugar content of yogurt, and a new method for the measurement of sugar content of yogurt was established.
Interest rate next-day variation prediction based on hybrid feedforward neural network, particle swarm optimization, and multiresolution techniques

NASA Astrophysics Data System (ADS)

Lahmiri, Salim

2016-02-01

Multiresolution analysis techniques including continuous wavelet transform, empirical mode decomposition, and variational mode decomposition are tested in the context of interest rate next-day variation prediction. In particular, multiresolution analysis techniques are used to decompose interest rate actual variation and feedforward neural network for training and prediction. Particle swarm optimization technique is adopted to optimize its initial weights. For comparison purpose, autoregressive moving average model, random walk process and the naive model are used as main reference models. In order to show the feasibility of the presented hybrid models that combine multiresolution analysis techniques and feedforward neural network optimized by particle swarm optimization, we used a set of six illustrative interest rates; including Moody's seasoned Aaa corporate bond yield, Moody's seasoned Baa corporate bond yield, 3-Month, 6-Month and 1-Year treasury bills, and effective federal fund rate. The forecasting results show that all multiresolution-based prediction systems outperform the conventional reference models on the criteria of mean absolute error, mean absolute deviation, and root mean-squared error. Therefore, it is advantageous to adopt hybrid multiresolution techniques and soft computing models to forecast interest rate daily variations as they provide good forecasting performance.
Prediction of the reference evapotranspiration using a chaotic approach.

PubMed

Wang, Wei-guang; Zou, Shan; Luo, Zhao-hui; Zhang, Wei; Chen, Dan; Kong, Jun

2014-01-01

Evapotranspiration is one of the most important hydrological variables in the context of water resources management. An attempt was made to understand and predict the dynamics of reference evapotranspiration from a nonlinear dynamical perspective in this study. The reference evapotranspiration data was calculated using the FAO Penman-Monteith equation with the observed daily meteorological data for the period 1966-2005 at four meteorological stations (i.e., Baotou, Zhangbei, Kaifeng, and Shaoguan) representing a wide range of climatic conditions of China. The correlation dimension method was employed to investigate the chaotic behavior of the reference evapotranspiration series. The existence of chaos in the reference evapotranspiration series at the four different locations was proved by the finite and low correlation dimension. A local approximation approach was employed to forecast the daily reference evapotranspiration series. Low root mean square error (RSME) and mean absolute error (MAE) (for all locations lower than 0.31 and 0.24, resp.), high correlation coefficient (CC), and modified coefficient of efficiency (for all locations larger than 0.97 and 0.8, resp.) indicate that the predicted reference evapotranspiration agrees well with the observed one. The encouraging results indicate the suitableness of chaotic approach for understanding and predicting the dynamics of the reference evapotranspiration.
Evaluation of a Mysis bioenergetics model

USGS Publications Warehouse

Chipps, S.R.; Bennett, D.H.

2002-01-01

Direct approaches for estimating the feeding rate of the opossum shrimp Mysis relicta can be hampered by variable gut residence time (evacuation rate models) and non-linear functional responses (clearance rate models). Bioenergetics modeling provides an alternative method, but the reliability of this approach needs to be evaluated using independent measures of growth and food consumption. In this study, we measured growth and food consumption for M. relicta and compared experimental results with those predicted from a Mysis bioenergetics model. For Mysis reared at 10??C, model predictions were not significantly different from observed values. Moreover, decomposition of mean square error indicated that 70% of the variation between model predictions and observed values was attributable to random error. On average, model predictions were within 12% of observed values. A sensitivity analysis revealed that Mysis respiration and prey energy density were the most sensitive parameters affecting model output. By accounting for uncertainty (95% CLs) in Mysis respiration, we observed a significant improvement in the accuracy of model output (within 5% of observed values), illustrating the importance of sensitive input parameters for model performance. These findings help corroborate the Mysis bioenergetics model and demonstrate the usefulness of this approach for estimating Mysis feeding rate.
Prediction model of dissolved oxygen in ponds based on ELM neural network

NASA Astrophysics Data System (ADS)

Li, Xinfei; Ai, Jiaoyan; Lin, Chunhuan; Guan, Haibin

2018-02-01

Dissolved oxygen in ponds is affected by many factors, and its distribution is unbalanced. In this paper, in order to improve the imbalance of dissolved oxygen distribution more effectively, the dissolved oxygen prediction model of Extreme Learning Machine (ELM) intelligent algorithm is established, based on the method of improving dissolved oxygen distribution by artificial push flow. Select the Lake Jing of Guangxi University as the experimental area. Using the model to predict the dissolved oxygen concentration of different voltage pumps, the results show that the ELM prediction accuracy is higher than the BP algorithm, and its mean square error is MSEELM=0.0394, the correlation coefficient RELM=0.9823. The prediction results of the 24V voltage pump push flow show that the discrete prediction curve can approximate the measured values well. The model can provide the basis for the artificial improvement of the dissolved oxygen distribution decision.

Gamma model and its analysis for phase measuring profilometry.

PubMed

Liu, Kai; Wang, Yongchang; Lau, Daniel L; Hao, Qi; Hassebrook, Laurence G

2010-03-01

Phase measuring profilometry is a method of structured light illumination whose three-dimensional reconstructions are susceptible to error from nonunitary gamma in the associated optical devices. While the effects of this distortion diminish with an increasing number of employed phase-shifted patterns, gamma distortion may be unavoidable in real-time systems where the number of projected patterns is limited by the presence of target motion. A mathematical model is developed for predicting the effects of nonunitary gamma on phase measuring profilometry, while also introducing an accurate gamma calibration method and two strategies for minimizing gamma's effect on phase determination. These phase correction strategies include phase corrections with and without gamma calibration. With the reduction in noise, for three-step phase measuring profilometry, analysis of the root mean squared error of the corrected phase will show a 60x reduction in phase error when the proposed gamma calibration is performed versus 33x reduction without calibration.
Tourism forecasting using modified empirical mode decomposition and group method of data handling

NASA Astrophysics Data System (ADS)

Yahya, N. A.; Samsudin, R.; Shabri, A.

2017-09-01

In this study, a hybrid model using modified Empirical Mode Decomposition (EMD) and Group Method of Data Handling (GMDH) model is proposed for tourism forecasting. This approach reconstructs intrinsic mode functions (IMFs) produced by EMD using trial and error method. The new component and the remaining IMFs is then predicted respectively using GMDH model. Finally, the forecasted results for each component are aggregated to construct an ensemble forecast. The data used in this experiment are monthly time series data of tourist arrivals from China, Thailand and India to Malaysia from year 2000 to 2016. The performance of the model is evaluated using Root Mean Square Error (RMSE) and Mean Absolute Percentage Error (MAPE) where conventional GMDH model and EMD-GMDH model are used as benchmark models. Empirical results proved that the proposed model performed better forecasts than the benchmarked models.
Robust estimation of thermodynamic parameters (ΔH, ΔS and ΔCp) for prediction of retention time in gas chromatography - Part II (Application).

PubMed

Claumann, Carlos Alberto; Wüst Zibetti, André; Bolzan, Ariovaldo; Machado, Ricardo A F; Pinto, Leonel Teixeira

2015-12-18

For this work, an analysis of parameter estimation for the retention factor in GC model was performed, considering two different criteria: sum of square error, and maximum error in absolute value; relevant statistics are described for each case. The main contribution of this work is the implementation of an initialization scheme (specialized) for the estimated parameters, which features fast convergence (low computational time) and is based on knowledge of the surface of the error criterion. In an application to a series of alkanes, specialized initialization resulted in significant reduction to the number of evaluations of the objective function (reducing computational time) in the parameter estimation. The obtained reduction happened between one and two orders of magnitude, compared with the simple random initialization. Copyright © 2015 Elsevier B.V. All rights reserved.
Compensated Box-Jenkins transfer function for short term load forecast

DOE Office of Scientific and Technical Information (OSTI.GOV)

Breipohl, A.; Yu, Z.; Lee, F.N.

In the past years, the Box-Jenkins ARIMA method and the Box-Jenkins transfer function method (BJTF) have been among the most commonly used methods for short term electrical load forecasting. But when there exists a sudden change in the temperature, both methods tend to exhibit larger errors in the forecast. This paper demonstrates that the load forecasting errors resulting from either the BJ ARIMA model or the BJTF model are not simply white noise, but rather well-patterned noise, and the patterns in the noise can be used to improve the forecasts. Thus a compensated Box-Jenkins transfer method (CBJTF) is proposed tomore » improve the accuracy of the load prediction. Some case studies have been made which result in about a 14-33% reduction of the root mean square (RMS) errors of the forecasts, depending on the compensation time period as well as the compensation method used.« less
Quality evaluation of frozen guava and yellow passion fruit pulps by NIR spectroscopy and chemometrics.

PubMed

Alamar, Priscila D; Caramês, Elem T S; Poppi, Ronei J; Pallone, Juliana A L

2016-07-01

The present study investigated the application of near infrared spectroscopy as a green, quick, and efficient alternative to analytical methods currently used to evaluate the quality (moisture, total sugars, acidity, soluble solids, pH and ascorbic acid) of frozen guava and passion fruit pulps. Fifty samples were analyzed by near infrared spectroscopy (NIR) and reference methods. Partial least square regression (PLSR) was used to develop calibration models to relate the NIR spectra and the reference values. Reference methods indicated adulteration by water addition in 58% of guava pulp samples and 44% of yellow passion fruit pulp samples. The PLS models produced lower values of root mean squares error of calibration (RMSEC), root mean squares error of prediction (RMSEP), and coefficient of determination above 0.7. Moisture and total sugars presented the best calibration models (RMSEP of 0.240 and 0.269, respectively, for guava pulp; RMSEP of 0.401 and 0.413, respectively, for passion fruit pulp) which enables the application of these models to determine adulteration in guava and yellow passion fruit pulp by water or sugar addition. The models constructed for calibration of quality parameters of frozen fruit pulps in this study indicate that NIR spectroscopy coupled with the multivariate calibration technique could be applied to determine the quality of guava and yellow passion fruit pulp. Copyright © 2016 Elsevier Ltd. All rights reserved.
Empirical Correction to the Likelihood Ratio Statistic for Structural Equation Modeling with Many Variables.

PubMed

Yuan, Ke-Hai; Tian, Yubin; Yanagihara, Hirokazu

2015-06-01

Survey data typically contain many variables. Structural equation modeling (SEM) is commonly used in analyzing such data. The most widely used statistic for evaluating the adequacy of a SEM model is T ML, a slight modification to the likelihood ratio statistic. Under normality assumption, T ML approximately follows a chi-square distribution when the number of observations (N) is large and the number of items or variables (p) is small. However, in practice, p can be rather large while N is always limited due to not having enough participants. Even with a relatively large N, empirical results show that T ML rejects the correct model too often when p is not too small. Various corrections to T ML have been proposed, but they are mostly heuristic. Following the principle of the Bartlett correction, this paper proposes an empirical approach to correct T ML so that the mean of the resulting statistic approximately equals the degrees of freedom of the nominal chi-square distribution. Results show that empirically corrected statistics follow the nominal chi-square distribution much more closely than previously proposed corrections to T ML, and they control type I errors reasonably well whenever N ≥ max(50,2p). The formulations of the empirically corrected statistics are further used to predict type I errors of T ML as reported in the literature, and they perform well.
Correlation of sensory bitterness in dairy protein hydrolysates: Comparison of prediction models built using sensory, chromatographic and electronic tongue data.

PubMed

Newman, J; Egan, T; Harbourne, N; O'Riordan, D; Jacquier, J C; O'Sullivan, M

2014-08-01

Sensory evaluation can be problematic for ingredients with a bitter taste during research and development phase of new food products. In this study, 19 dairy protein hydrolysates (DPH) were analysed by an electronic tongue and their physicochemical characteristics, the data obtained from these methods were correlated with their bitterness intensity as scored by a trained sensory panel and each model was also assessed by its predictive capabilities. The physiochemical characteristics of the DPHs investigated were degree of hydrolysis (DH%), and data relating to peptide size and relative hydrophobicity from size exclusion chromatography (SEC) and reverse phase (RP) HPLC. Partial least square regression (PLS) was used to construct the prediction models. All PLS regressions had good correlations (0.78 to 0.93) with the strongest being the combination of data obtained from SEC and RP HPLC. However, the PLS with the strongest predictive power was based on the e-tongue which had the PLS regression with the lowest root mean predicted residual error sum of squares (PRESS) in the study. The results show that the PLS models constructed with the e-tongue and the combination of SEC and RP-HPLC has potential to be used for prediction of bitterness and thus reducing the reliance on sensory analysis in DPHs for future food research. Copyright © 2014 Elsevier B.V. All rights reserved.
Prediction of Moisture Content for Congou Black Tea Withering Leaves Using Image Features and Nonlinear Method.

PubMed

Liang, Gaozhen; Dong, Chunwang; Hu, Bin; Zhu, Hongkai; Yuan, Haibo; Jiang, Yongwen; Hao, Guoshuang

2018-05-18

Withering is the first step in the processing of congou black tea. With respect to the deficiency of traditional water content detection methods, a machine vision based NDT (Non Destructive Testing) method was established to detect the moisture content of withered leaves. First, according to the time sequences using computer visual system collected visible light images of tea leaf surfaces, and color and texture characteristics are extracted through the spatial changes of colors. Then quantitative prediction models for moisture content detection of withered tea leaves was established through linear PLS (Partial Least Squares) and non-linear SVM (Support Vector Machine). The results showed correlation coefficients higher than 0.8 between the water contents and green component mean value (G), lightness component mean value (L * ) and uniformity (U), which means that the extracted characteristics have great potential to predict the water contents. The performance parameters as correlation coefficient of prediction set (Rp), root-mean-square error of prediction (RMSEP), and relative standard deviation (RPD) of the SVM prediction model are 0.9314, 0.0411 and 1.8004, respectively. The non-linear modeling method can better describe the quantitative analytical relations between the image and water content. With superior generalization and robustness, the method would provide a new train of thought and theoretical basis for the online water content monitoring technology of automated production of black tea.
Toward accurate prediction of pKa values for internal protein residues: the importance of conformational relaxation and desolvation energy.

PubMed

Wallace, Jason A; Wang, Yuhang; Shi, Chuanyin; Pastoor, Kevin J; Nguyen, Bao-Linh; Xia, Kai; Shen, Jana K

2011-12-01

Proton uptake or release controls many important biological processes, such as energy transduction, virus replication, and catalysis. Accurate pK(a) prediction informs about proton pathways, thereby revealing detailed acid-base mechanisms. Physics-based methods in the framework of molecular dynamics simulations not only offer pK(a) predictions but also inform about the physical origins of pK(a) shifts and provide details of ionization-induced conformational relaxation and large-scale transitions. One such method is the recently developed continuous constant pH molecular dynamics (CPHMD) method, which has been shown to be an accurate and robust pK(a) prediction tool for naturally occurring titratable residues. To further examine the accuracy and limitations of CPHMD, we blindly predicted the pK(a) values for 87 titratable residues introduced in various hydrophobic regions of staphylococcal nuclease and variants. The predictions gave a root-mean-square deviation of 1.69 pK units from experiment, and there were only two pK(a)'s with errors greater than 3.5 pK units. Analysis of the conformational fluctuation of titrating side-chains in the context of the errors of calculated pK(a) values indicate that explicit treatment of conformational flexibility and the associated dielectric relaxation gives CPHMD a distinct advantage. Analysis of the sources of errors suggests that more accurate pK(a) predictions can be obtained for the most deeply buried residues by improving the accuracy in calculating desolvation energies. Furthermore, it is found that the generalized Born implicit-solvent model underlying the current CPHMD implementation slightly distorts the local conformational environment such that the inclusion of an explicit-solvent representation may offer improvement of accuracy. Copyright © 2011 Wiley-Liss, Inc.
Accuracy of measurement of star images on a pixel array

NASA Technical Reports Server (NTRS)

King, I. R.

1983-01-01

Algorithms are developed for predicting the accuracy with which the brightness of a star can be determined from its image on a digital detector array, as a function of the brightness of the background. The assumption is made that a known profile is being fitted by least squares. The two profiles used correspond to ST images and to ground-based observations. The first result is an approximate rule of thumb for equivalent noise area. More rigorous results are then given in tabular form. The size of the pixels, relative to the image size, is taken into account. Astronometric accuracy is also discussed briefly; the error, relative to image size, is very similar to the photometric error relative to brightness.
Analysis of the Magnitude and Frequency of Peak Discharge and Maximum Observed Peak Discharge in New Mexico and Surrounding Areas

USGS Publications Warehouse

Waltemeyer, Scott D.

2008-01-01

Estimates of the magnitude and frequency of peak discharges are necessary for the reliable design of bridges, culverts, and open-channel hydraulic analysis, and for flood-hazard mapping in New Mexico and surrounding areas. The U.S. Geological Survey, in cooperation with the New Mexico Department of Transportation, updated estimates of peak-discharge magnitude for gaging stations in the region and updated regional equations for estimation of peak discharge and frequency at ungaged sites. Equations were developed for estimating the magnitude of peak discharges for recurrence intervals of 2, 5, 10, 25, 50, 100, and 500 years at ungaged sites by use of data collected through 2004 for 293 gaging stations on unregulated streams that have 10 or more years of record. Peak discharges for selected recurrence intervals were determined at gaging stations by fitting observed data to a log-Pearson Type III distribution with adjustments for a low-discharge threshold and a zero skew coefficient. A low-discharge threshold was applied to frequency analysis of 140 of the 293 gaging stations. This application provides an improved fit of the log-Pearson Type III frequency distribution. Use of the low-discharge threshold generally eliminated the peak discharge by having a recurrence interval of less than 1.4 years in the probability-density function. Within each of the nine regions, logarithms of the maximum peak discharges for selected recurrence intervals were related to logarithms of basin and climatic characteristics by using stepwise ordinary least-squares regression techniques for exploratory data analysis. Generalized least-squares regression techniques, an improved regression procedure that accounts for time and spatial sampling errors, then were applied to the same data used in the ordinary least-squares regression analyses. The average standard error of prediction, which includes average sampling error and average standard error of regression, ranged from 38 to 93 percent (mean value is 62, and median value is 59) for the 100-year flood. The 1996 investigation standard error of prediction for the flood regions ranged from 41 to 96 percent (mean value is 67, and median value is 68) for the 100-year flood that was analyzed by using generalized least-squares regression analysis. Overall, the equations based on generalized least-squares regression techniques are more reliable than those in the 1996 report because of the increased length of record and improved geographic information system (GIS) method to determine basin and climatic characteristics. Flood-frequency estimates can be made for ungaged sites upstream or downstream from gaging stations by using a method that transfers flood-frequency data at the gaging station to the ungaged site by using a drainage-area ratio adjustment equation. The peak discharge for a given recurrence interval at the gaging station, drainage-area ratio, and the drainage-area exponent from the regional regression equation of the respective region is used to transfer the peak discharge for the recurrence interval to the ungaged site. Maximum observed peak discharge as related to drainage area was determined for New Mexico. Extreme events are commonly used in the design and appraisal of bridge crossings and other structures. Bridge-scour evaluations are commonly made by using the 500-year peak discharge for these appraisals. Peak-discharge data collected at 293 gaging stations and 367 miscellaneous sites were used to develop a maximum peak-discharge relation as an alternative method of estimating peak discharge of an extreme event such as a maximum probable flood.
Estimating riparian understory vegetation cover with beta regression and copula models

USGS Publications Warehouse

Eskelson, Bianca N.I.; Madsen, Lisa; Hagar, Joan C.; Temesgen, Hailemariam

2011-01-01

Understory vegetation communities are critical components of forest ecosystems. As a result, the importance of modeling understory vegetation characteristics in forested landscapes has become more apparent. Abundance measures such as shrub cover are bounded between 0 and 1, exhibit heteroscedastic error variance, and are often subject to spatial dependence. These distributional features tend to be ignored when shrub cover data are analyzed. The beta distribution has been used successfully to describe the frequency distribution of vegetation cover. Beta regression models ignoring spatial dependence (BR) and accounting for spatial dependence (BRdep) were used to estimate percent shrub cover as a function of topographic conditions and overstory vegetation structure in riparian zones in western Oregon. The BR models showed poor explanatory power (pseudo-R2 ≤ 0.34) but outperformed ordinary least-squares (OLS) and generalized least-squares (GLS) regression models with logit-transformed response in terms of mean square prediction error and absolute bias. We introduce a copula (COP) model that is based on the beta distribution and accounts for spatial dependence. A simulation study was designed to illustrate the effects of incorrectly assuming normality, equal variance, and spatial independence. It showed that BR, BRdep, and COP models provide unbiased parameter estimates, whereas OLS and GLS models result in slightly biased estimates for two of the three parameters. On the basis of the simulation study, 93–97% of the GLS, BRdep, and COP confidence intervals covered the true parameters, whereas OLS and BR only resulted in 84–88% coverage, which demonstrated the superiority of GLS, BRdep, and COP over OLS and BR models in providing standard errors for the parameter estimates in the presence of spatial dependence.
47 CFR 87.145 - Acceptability of transmitters for licensing.

Code of Federal Regulations, 2014 CFR

2014-10-01

... square error which assumes zero error for the received ground earth station signal and includes the AES transmit/receive frequency reference error and the AES automatic frequency control residual errors.) The...
47 CFR 87.145 - Acceptability of transmitters for licensing.

Code of Federal Regulations, 2013 CFR

2013-10-01

... square error which assumes zero error for the received ground earth station signal and includes the AES transmit/receive frequency reference error and the AES automatic frequency control residual errors.) The...
47 CFR 87.145 - Acceptability of transmitters for licensing.

Code of Federal Regulations, 2012 CFR

2012-10-01

... square error which assumes zero error for the received ground earth station signal and includes the AES transmit/receive frequency reference error and the AES automatic frequency control residual errors.) The...
47 CFR 87.145 - Acceptability of transmitters for licensing.

Code of Federal Regulations, 2011 CFR

2011-10-01

... square error which assumes zero error for the received ground earth station signal and includes the AES transmit/receive frequency reference error and the AES automatic frequency control residual errors.) The...
The importance of molecular structures, endpoints' values, and predictivity parameters in QSAR research: QSAR analysis of a series of estrogen receptor binders.

PubMed

Li, Jiazhong; Gramatica, Paola

2010-11-01

Quantitative structure-activity relationship (QSAR) methodology aims to explore the relationship between molecular structures and experimental endpoints, producing a model for the prediction of new data; the predictive performance of the model must be checked by external validation. Clearly, the qualities of chemical structure information and experimental endpoints, as well as the statistical parameters used to verify the external predictivity have a strong influence on QSAR model reliability. Here, we emphasize the importance of these three aspects by analyzing our models on estrogen receptor binders (Endocrine disruptor knowledge base (EDKB) database). Endocrine disrupting chemicals, which mimic or antagonize the endogenous hormones such as estrogens, are a hot topic in environmental and toxicological sciences. QSAR shows great values in predicting the estrogenic activity and exploring the interactions between the estrogen receptor and ligands. We have verified our previously published model for additional external validation on new EDKB chemicals. Having found some errors in the used 3D molecular conformations, we redevelop a new model using the same data set with corrected structures, the same method (ordinary least-square regression, OLS) and DRAGON descriptors. The new model, based on some different descriptors, is more predictive on external prediction sets. Three different formulas to calculate correlation coefficient for the external prediction set (Q2 EXT) were compared, and the results indicated that the new proposal of Consonni et al. had more reasonable results, consistent with the conclusions from regression line, Williams plot and root mean square error (RMSE) values. Finally, the importance of reliable endpoints values has been highlighted by comparing the classification assignments of EDKB with those of another estrogen receptor binders database (METI): we found that 16.1% assignments of the common compounds were opposite (20 among 124 common compounds). In order to verify the real assignments for these inconsistent compounds, we predicted these samples, as a blind external set, by our regression models and compared the results with the two databases. The results indicated that most of the predictions were consistent with METI. Furthermore, we built a kNN classification model using the 104 consistent compounds to predict those inconsistent ones, and most of the predictions were also in agreement with METI database.
Accuracy and uncertainty analysis of soil Bbf spatial distribution estimation at a coking plant-contaminated site based on normalization geostatistical technologies.

PubMed

Liu, Geng; Niu, Junjie; Zhang, Chao; Guo, Guanlin

2015-12-01

Data distribution is usually skewed severely by the presence of hot spots in contaminated sites. This causes difficulties for accurate geostatistical data transformation. Three types of typical normal distribution transformation methods termed the normal score, Johnson, and Box-Cox transformations were applied to compare the effects of spatial interpolation with normal distribution transformation data of benzo(b)fluoranthene in a large-scale coking plant-contaminated site in north China. Three normal transformation methods decreased the skewness and kurtosis of the benzo(b)fluoranthene, and all the transformed data passed the Kolmogorov-Smirnov test threshold. Cross validation showed that Johnson ordinary kriging has a minimum root-mean-square error of 1.17 and a mean error of 0.19, which was more accurate than the other two models. The area with fewer sampling points and that with high levels of contamination showed the largest prediction standard errors based on the Johnson ordinary kriging prediction map. We introduce an ideal normal transformation method prior to geostatistical estimation for severely skewed data, which enhances the reliability of risk estimation and improves the accuracy for determination of remediation boundaries.
Taxi-Out Time Prediction for Departures at Charlotte Airport Using Machine Learning Techniques

NASA Technical Reports Server (NTRS)

Lee, Hanbong; Malik, Waqar; Jung, Yoon C.

2016-01-01

Predicting the taxi-out times of departures accurately is important for improving airport efficiency and takeoff time predictability. In this paper, we attempt to apply machine learning techniques to actual traffic data at Charlotte Douglas International Airport for taxi-out time prediction. To find the key factors affecting aircraft taxi times, surface surveillance data is first analyzed. From this data analysis, several variables, including terminal concourse, spot, runway, departure fix and weight class, are selected for taxi time prediction. Then, various machine learning methods such as linear regression, support vector machines, k-nearest neighbors, random forest, and neural networks model are applied to actual flight data. Different traffic flow and weather conditions at Charlotte airport are also taken into account for more accurate prediction. The taxi-out time prediction results show that linear regression and random forest techniques can provide the most accurate prediction in terms of root-mean-square errors. We also discuss the operational complexity and uncertainties that make it difficult to predict the taxi times accurately.
Experimental determination of solvent-water partition coefficients and Abraham parameters for munition constituents.

PubMed

Liang, Yuzhen; Kuo, Dave T F; Allen, Herbert E; Di Toro, Dominic M

2016-10-01

There is concern about the environmental fate and effects of munition constituents (MCs). Polyparameter linear free energy relationships (pp-LFERs) that employ Abraham solute parameters can aid in evaluating the risk of MCs to the environment. However, poor predictions using pp-LFERs and ABSOLV estimated Abraham solute parameters are found for some key physico-chemical properties. In this work, the Abraham solute parameters are determined using experimental partition coefficients in various solvent-water systems. The compounds investigated include hexahydro-1,3,5-trinitro-1,3,5-triazacyclohexane (RDX), octahydro-1,3,5,7-tetranitro-1,3,5,7-tetraazacyclooctane (HMX), hexahydro-1-nitroso-3,5-dinitro-1,3,5-triazine (MNX), hexahydro-1,3,5-trinitroso-1,3,5-triazine (TNX), hexahydro-1,3-dinitroso-5- nitro-1,3,5-triazine (DNX), 2,4,6-trinitrotoluene (TNT), 1,3,5-trinitrobenzene (TNB), and 4-nitroanisole. The solvents in the solvent-water systems are hexane, dichloromethane, trichloromethane, octanol, and toluene. The only available reported solvent-water partition coefficients are for octanol-water for some of the investigated compounds and they are in good agreement with the experimental measurements from this study. Solvent-water partition coefficients fitted using experimentally derived solute parameters from this study have significantly smaller root mean square errors (RMSE = 0.38) than predictions using ABSOLV estimated solute parameters (RMSE = 3.56) for the investigated compounds. Additionally, the predictions for various physico-chemical properties using the experimentally derived solute parameters agree with available literature reported values with prediction errors within 0.79 log units except for water solubility of RDX and HMX with errors of 1.48 and 2.16 log units respectively. However, predictions using ABSOLV estimated solute parameters have larger prediction errors of up to 7.68 log units. This large discrepancy is probably due to the missing R2NNO2 and R2NNO2 functional groups in the ABSOLV fragment database. Copyright © 2016. Published by Elsevier Ltd.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.