Using Robust Standard Errors to Combine Multiple Regression Estimates with Meta-Analysis
ERIC Educational Resources Information Center
Williams, Ryan T.
2012-01-01
Combining multiple regression estimates with meta-analysis has continued to be a difficult task. A variety of methods have been proposed and used to combine multiple regression slope estimates with meta-analysis; however, most of these methods have serious methodological and practical limitations. The purpose of this study was to explore the use…
Using Robust Variance Estimation to Combine Multiple Regression Estimates with Meta-Analysis
ERIC Educational Resources Information Center
Williams, Ryan
2013-01-01
The purpose of this study was to explore the use of robust variance estimation for combining commonly specified multiple regression models and for combining sample-dependent focal slope estimates from diversely specified models. The proposed estimator obviates traditionally required information about the covariance structure of the dependent…
Logistic Regression: Concept and Application
ERIC Educational Resources Information Center
Cokluk, Omay
2010-01-01
The main focus of logistic regression analysis is classification of individuals in different groups. The aim of the present study is to explain basic concepts and processes of binary logistic regression analysis intended to determine the combination of independent variables which best explain the membership in certain groups called dichotomous…
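As a minimal sketch of the classification idea described in this abstract, the snippet below turns a linear combination of independent variables into a probability of group membership and then into a dichotomous label. The function names, the coefficient values, and the 0.5 cut-off are illustrative assumptions and are not taken from the study.

```python
import math


def sigmoid(z):
    """Map a real-valued linear predictor to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_group(x, intercept, coefs, threshold=0.5):
    """Return (probability, predicted group) for one individual.

    The group is 1 when the probability reaches the threshold, else 0.
    """
    z = intercept + sum(b * xi for b, xi in zip(coefs, x))
    p = sigmoid(z)
    return p, int(p >= threshold)


# Hypothetical coefficients and predictor values, for illustration only.
prob, group = predict_group([4.0, 2.0], intercept=-1.0, coefs=[0.5, -0.25])
print(round(prob, 4), group)
```

In practice the intercept and coefficients would be estimated from data by maximum likelihood rather than fixed by hand.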
Development of a User Interface for a Regression Analysis Software Tool
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert Manfred; Volden, Thomas R.
2010-01-01
An easy-to-use user interface was implemented in a highly automated regression analysis tool. The user interface was developed from the start to run on computers that use the Windows, Macintosh, Linux, or UNIX operating system. Many user interface features were specifically designed such that a novice or inexperienced user can apply the regression analysis tool with confidence. Therefore, the user interface's design minimizes interactive input from the user. In addition, reasonable default combinations are assigned to those analysis settings that influence the outcome of the regression analysis. These default combinations will lead to a successful regression analysis result for most experimental data sets. The user interface comes in two versions. The text user interface version is used for the ongoing development of the regression analysis tool. The official release of the regression analysis tool, on the other hand, has a graphical user interface that is more efficient to use. This graphical user interface displays all input file names, output file names, and analysis settings for a specific software application mode on a single screen, which makes it easier to generate reliable analysis results and to perform input parameter studies. An object-oriented approach was used for the development of the graphical user interface. This choice keeps future software maintenance costs to a reasonable limit. Examples of both the text user interface and graphical user interface are discussed in order to illustrate the user interface's overall design approach.
Zarb, Francis; McEntee, Mark F; Rainford, Louise
2015-06-01
To evaluate visual grading characteristics (VGC) and ordinal regression analysis during head CT optimisation as a potential alternative to visual grading assessment (VGA), traditionally employed to score anatomical visualisation. Patient images (n = 66) were obtained using current and optimised imaging protocols from two CT suites: a 16-slice scanner at the national Maltese centre for trauma and a 64-slice scanner in a private centre. Local resident radiologists (n = 6) performed VGA followed by VGC and ordinal regression analysis. VGC alone indicated that optimised protocols had similar image quality as current protocols. Ordinal logistic regression analysis provided an in-depth evaluation, criterion by criterion allowing the selective implementation of the protocols. The local radiology review panel supported the implementation of optimised protocols for brain CT examinations (including trauma) in one centre, achieving radiation dose reductions ranging from 24 % to 36 %. In the second centre a 29 % reduction in radiation dose was achieved for follow-up cases. The combined use of VGC and ordinal logistic regression analysis led to clinical decisions being taken on the implementation of the optimised protocols. This improved method of image quality analysis provided the evidence to support imaging protocol optimisation, resulting in significant radiation dose savings. • There is need for scientifically based image quality evaluation during CT optimisation. • VGC and ordinal regression analysis in combination led to better informed clinical decisions. • VGC and ordinal regression analysis led to dose reductions without compromising diagnostic efficacy.
L.R. Grosenbaugh
1967-01-01
Describes an expansible computerized system that provides data needed in regression or covariance analysis of as many as 50 variables, 8 of which may be dependent. Alternatively, it can screen variously generated combinations of independent variables to find the regression with the smallest mean-squared-residual, which will be fitted if desired. The user can easily...
A Technique of Fuzzy C-Mean in Multiple Linear Regression Model toward Paddy Yield
NASA Astrophysics Data System (ADS)
Syazwan Wahab, Nur; Saifullah Rusiman, Mohd; Mohamad, Mahathir; Amira Azmi, Nur; Che Him, Norziha; Ghazali Kamardan, M.; Ali, Maselan
2018-04-01
In this paper, we propose a hybrid model that combines a multiple linear regression model with the fuzzy c-means method. This research examined the relationship between paddy yield and 20 variates of the top soil that were analyzed prior to planting at standard fertilizer rates. Data used were from the multi-location trials for rice carried out by MARDI at major paddy granaries in Peninsular Malaysia during the period from 2009 to 2012. Missing observations were estimated using mean estimation techniques. The data were analyzed using a multiple linear regression model and a combination of the multiple linear regression model and the fuzzy c-means method. Analysis of normality and multicollinearity indicates that the data are normally distributed without multicollinearity among the independent variables. Fuzzy c-means analysis clustered the paddy yield into two clusters before the multiple linear regression model was applied. The comparison between the two methods indicates that the hybrid of the multiple linear regression model and the fuzzy c-means method outperforms the multiple linear regression model alone, with a lower mean square error.
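The clustering step mentioned in this abstract can be sketched with the standard fuzzy c-means update rules: each observation receives a degree of membership in every cluster, and cluster centers are recomputed as membership-weighted means. This is a one-dimensional illustration under stated assumptions; the yield values, the starting centers, and the fuzzifier m = 2 are hypothetical and do not come from the paper.

```python
def fcm_memberships(values, centers, m=2.0):
    """Fuzzy c-means memberships of 1-D values in each cluster.

    u[i][j] = 1 / sum_k (d_ij / d_ik) ** (2 / (m - 1)),
    where d_ij is the distance from value i to center j.
    """
    power = 2.0 / (m - 1.0)
    memberships = []
    for v in values:
        dists = [abs(v - c) for c in centers]
        if 0.0 in dists:
            # A value sitting exactly on a center belongs fully to it
            # (shared equally if several centers coincide).
            row = [1.0 if d == 0.0 else 0.0 for d in dists]
            total = sum(row)
            row = [r / total for r in row]
        else:
            row = [1.0 / sum((dj / dk) ** power for dk in dists)
                   for dj in dists]
        memberships.append(row)
    return memberships


def update_centers(values, memberships, m=2.0):
    """Recompute each center as a membership-weighted mean of the values."""
    centers = []
    for j in range(len(memberships[0])):
        weights = [row[j] ** m for row in memberships]
        centers.append(sum(w * v for w, v in zip(weights, values)) / sum(weights))
    return centers


# Hypothetical yields (t/ha) and two starting centers, for illustration only.
yields = [3.0, 3.2, 3.1, 5.9, 6.1, 6.0]
u = fcm_memberships(yields, centers=[3.0, 6.0])
new_centers = update_centers(yields, u)
```

A full run would alternate the two functions until the centers stop moving, then fit a separate regression within each cluster.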
Singh, Jagmahender; Pathak, R K; Chavali, Krishnadutt H
2011-03-20
Skeletal height estimation from regression analysis of eight sternal lengths in the subjects of Chandigarh zone of Northwest India is the topic of discussion in this study. Analysis of eight sternal lengths (length of manubrium, length of mesosternum, combined length of manubrium and mesosternum, total sternal length and first four intercostals lengths of mesosternum) measured from 252 male and 91 female sternums obtained at postmortems revealed that mean cadaver stature and sternal lengths were more in North Indians and males than the South Indians and females. Except intercostal lengths, all the sternal lengths were positively correlated with stature of the deceased in both sexes (P < 0.001). The multiple regression analysis of sternal lengths was found more useful than the linear regression for stature estimation. Using multivariate regression analysis, the combined length of manubrium and mesosternum in both sexes and the length of manubrium along with 2nd and 3rd intercostal lengths of mesosternum in males were selected as best estimators of stature. Nonetheless, the stature of males can be predicted with SEE of 6.66 (R² = 0.16, r = 0.318) from combination of MBL+BL_3+LM+BL_2, and in females from MBL only, it can be estimated with SEE of 6.65 (R² = 0.10, r = 0.318), whereas from the multiple regression analysis of pooled data, stature can be known with SEE of 6.97 (R² = 0.387, r = 575) from the combination of MBL+LM+BL_2+TSL+BL_3. The R² and F-ratio were found to be statistically significant for almost all the variables in both the sexes, except 4th intercostal length in males and 2nd to 4th intercostal lengths in females. The 'major' sternal lengths were more useful than the 'minor' ones for stature estimation. The universal regression analysis used by Kanchan et al. [39] when applied to sternal lengths, gave satisfactory estimates of stature for males only but female stature was comparatively better estimated from simple linear regressions.
But they are not proposed for the subjects of known sex, as they underestimate the male and overestimate female stature. However, intercostal lengths were found to be the poor estimators of stature (P < 0.05). And also sternal lengths exhibit weaker correlation coefficients and higher standard errors of estimate. Copyright © 2010 Elsevier Ireland Ltd. All rights reserved.
Identification of molecular markers associated with mite resistance in coconut (Cocos nucifera L.).
Shalini, K V; Manjunatha, S; Lebrun, P; Berger, A; Baudouin, L; Pirany, N; Ranganath, R M; Prasad, D Theertha
2007-01-01
Coconut mite (Aceria guerreronis 'Keifer') has become a major threat to Indian coconut (Cocos nucifera L.) cultivators and the processing industry. Chemical and biological control measures have proved to be costly, ineffective, and ecologically undesirable. Planting mite-resistant coconut cultivars is the most effective method of preventing yield loss and should form a major component of any integrated pest management stratagem. Coconut genotypes, and mite-resistant and -susceptible accessions were collected from different parts of South India. Thirty-two simple sequence repeat (SSR) and 7 RAPD primers were used for molecular analyses. In single-marker analysis, 9 SSR and 4 RAPD markers associated with mite resistance were identified. In stepwise multiple regression analysis of SSRs, a combination of 6 markers showed 100% association with mite infestation. Stepwise multiple regression analysis for RAPD data revealed that a combination of 3 markers accounted for 83.86% of mite resistance in the selected materials. Combined stepwise multiple regression analysis of RAPD and SSR data showed that a combination of 5 markers explained 100% of the association with mite resistance in coconut. Markers associated with mite resistance are important in coconut breeding programs and will facilitate the selection of mite-resistant plants at an early stage as well as mother plants for breeding programs.
Mohd Yusof, Mohd Yusmiaidil Putera; Cauwels, Rita; Deschepper, Ellen; Martens, Luc
2015-08-01
Third molar development (TMD) has been widely utilized as one of the radiographic methods for dental age estimation. By using the same radiograph of the same individual, third molar eruption (TME) information can be incorporated into the TMD regression model. This study aims to evaluate the performance of dental age estimation in individual method models and the combined model (TMD and TME) based on the classic regressions of multiple linear and principal component analysis. A sample of 705 digital panoramic radiographs of Malay sub-adults aged between 14.1 and 23.8 years was collected. The techniques described by Gleiser and Hunt (modified by Kohler) and Olze were employed to stage the TMD and TME, respectively. The data were divided to develop three respective models based on the two regressions of multiple linear and principal component analysis. The trained models were then validated on the test sample and the accuracy of age prediction was compared between each model. The coefficient of determination (R²) and root mean square error (RMSE) were calculated. In both genders, adjusted R² yielded an increment in the linear regressions of the combined model as compared to the individual models. An overall decrease in RMSE was detected in the combined model as compared to TMD (0.03-0.06) and TME (0.2-0.8). In principal component regression, a low value of adjusted R² and a high RMSE, except in males, were exhibited in the combined model. Dental age estimation is better predicted using the combined model in multiple linear regression models. Copyright © 2015 Elsevier Ltd and Faculty of Forensic and Legal Medicine. All rights reserved.
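The model comparison in this abstract rests on two standard accuracy measures, adjusted R² and RMSE. The sketch below shows how they are conventionally computed on a validation sample; the ages, the predictions, and the choice of two predictors are hypothetical values used only for illustration.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between observed and predicted values."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(r2, n, p):
    """Penalize R² for the number of predictors p given n observations."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)


# Hypothetical chronological ages and model predictions (years).
ages = [15.0, 17.0, 19.0, 21.0]
predicted_ages = [15.5, 16.5, 19.5, 20.5]

error = rmse(ages, predicted_ages)
r2 = r_squared(ages, predicted_ages)
adj_r2 = adjusted_r_squared(r2, n=len(ages), p=2)
```

Because adjusted R² penalizes extra predictors, a combined model only shows an increment over a single-method model when the added variable explains enough additional variance.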
Resting-state functional magnetic resonance imaging: the impact of regression analysis.
Yeh, Chia-Jung; Tseng, Yu-Sheng; Lin, Yi-Ru; Tsai, Shang-Yueh; Huang, Teng-Yi
2015-01-01
To investigate the impact of regression methods on resting-state functional magnetic resonance imaging (rsfMRI). During rsfMRI preprocessing, regression analysis is considered effective for reducing the interference of physiological noise on the signal time course. However, it is unclear whether the regression method benefits rsfMRI analysis. Twenty volunteers (10 men and 10 women; aged 23.4 ± 1.5 years) participated in the experiments. We used node analysis and functional connectivity mapping to assess the brain default mode network by using five combinations of regression methods. The results show that regressing the global mean plays a major role in the preprocessing steps. When a global regression method is applied, the values of functional connectivity are significantly lower (P ≤ .01) than those calculated without a global regression. This step increases inter-subject variation and produces anticorrelated brain areas. rsfMRI data processed using regression should be interpreted carefully. The significance of the anticorrelated brain areas produced by global signal removal is unclear. Copyright © 2014 by the American Society of Neuroimaging.
Almalik, Osama; Nijhuis, Michiel B; van den Heuvel, Edwin R
2014-01-01
Shelf-life estimation usually requires that at least three registration batches are tested for stability at multiple storage conditions. The shelf-life estimates are often obtained by linear regression analysis per storage condition, an approach implicitly suggested by ICH guideline Q1E. A linear regression analysis combining all data from multiple storage conditions was recently proposed in the literature when variances are homogeneous across storage conditions. The combined analysis is expected to perform better than the separate analysis per storage condition, since pooling data would lead to an improved estimate of the variation and higher numbers of degrees of freedom, but this is not evident for shelf-life estimation. Indeed, the two approaches treat the observed initial batch results, the intercepts in the model, and poolability of batches differently, which may eliminate or reduce the expected advantage of the combined approach with respect to the separate approach. Therefore, a simulation study was performed to compare the distribution of simulated shelf-life estimates on several characteristics between the two approaches and to quantify the difference in shelf-life estimates. In general, the combined statistical analysis does estimate the true shelf life more consistently and precisely than the analysis per storage condition, but it did not outperform the separate analysis in all circumstances.
Likhvantseva, V G; Sokolov, V A; Levanova, O N; Kovelenova, I V
2018-01-01
Prediction of the clinical course of primary open-angle glaucoma (POAG) is one of the main directions in solving the problem of vision loss prevention and stabilization of the pathological process. Simple statistical methods of correlation analysis show the extent of each risk factor's impact, but do not indicate the total impact of these factors in personalized combinations. The relationships between the risk factors are subject to correlation and regression analysis. The regression equation represents the dependence of the mathematical expectation of the resulting sign on the combination of factor signs. To develop a technique for predicting the probability of development and progression of primary open-angle glaucoma based on a personalized combination of risk factors by linear multivariate regression analysis. The study included 66 patients (23 female and 43 male; 132 eyes) with newly diagnosed primary open-angle glaucoma. The control group consisted of 14 patients (8 male and 6 female). Standard ophthalmic examination was supplemented with biochemical study of lacrimal fluid. Concentration of matrix metalloproteinase MMP-2 and MMP-9 in tear fluid in both eyes was determined using 'sandwich' enzyme-linked immunosorbent assay (ELISA) method. The study resulted in the development of regression equations and step-by-step multivariate logistic models that can help calculate the risk of development and progression of POAG. Those models are based on expert evaluation of clinical and instrumental indicators of hydrodynamic disturbances (coefficient of outflow ease - C, volume of intraocular fluid secretion - F, fluctuation of intraocular pressure), as well as personalized morphometric parameters of the retina (central retinal thickness in the macular area) and concentration of MMP-2 and MMP-9 in the tear film.
The newly developed regression equations are highly informative and can be a reliable tool for studying the influence vector and assessing the pathogenic potential of the independent risk factors in specific personalized combinations.
Algorithm For Solution Of Subset-Regression Problems
NASA Technical Reports Server (NTRS)
Verhaegen, Michel
1991-01-01
Reliable and flexible algorithm for solution of subset-regression problem performs QR decomposition with new column-pivoting strategy, enables selection of subset directly from originally defined regression parameters. This feature, in combination with number of extensions, makes algorithm very flexible for use in analysis of subset-regression problems in which parameters have physical meanings. Also extended to enable joint processing of columns contaminated by noise with those free of noise, without using scaling techniques.
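To illustrate the general idea of selecting a regressor subset through column pivoting, the sketch below orders columns with the classical largest-remaining-norm pivoting rule, implemented with modified Gram-Schmidt orthogonalization. This is an assumption-laden stand-in: it is not the new pivoting strategy described in the report, and the function name, tolerance, and example columns are hypothetical.

```python
import math


def pivoted_column_order(columns, tol=1e-10):
    """Order regressor columns by classical column pivoting.

    At each step the column with the largest remaining norm is chosen,
    and its direction is removed from all columns not yet chosen.
    Selection stops when the remaining columns are numerically zero,
    so redundant (linearly dependent) columns are left out.
    """
    residuals = [list(col) for col in columns]
    remaining = list(range(len(columns)))
    order = []
    while remaining:
        norms = {j: math.sqrt(sum(x * x for x in residuals[j]))
                 for j in remaining}
        best = max(remaining, key=lambda j: norms[j])
        if norms[best] <= tol:
            break
        order.append(best)
        remaining.remove(best)
        q = [x / norms[best] for x in residuals[best]]
        for j in remaining:
            proj = sum(a * b for a, b in zip(q, residuals[j]))
            residuals[j] = [a - proj * b for a, b in zip(residuals[j], q)]
    return order


# Hypothetical regressors: column 0 is a multiple of column 1.
cols = [[1.0, 0.0, 0.0, 0.0],
        [2.0, 0.0, 0.0, 0.0],
        [0.0, 3.0, 0.0, 0.0]]
selected = pivoted_column_order(cols)
print(selected)
```

The returned indices refer to the originally defined regressors, which is what allows a selected subset to keep its physical meaning.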
Mainou, Maria; Madenidou, Anastasia-Vasiliki; Liakos, Aris; Paschos, Paschalis; Karagiannis, Thomas; Bekiari, Eleni; Vlachaki, Efthymia; Wang, Zhen; Murad, Mohammad Hassan; Kumar, Shaji; Tsapas, Apostolos
2017-06-01
We performed a systematic review and meta-regression analysis of randomized control trials to investigate the association between response to initial treatment and survival outcomes in patients with newly diagnosed multiple myeloma (MM). Response outcomes included complete response (CR) and the combined outcome of CR or very good partial response (VGPR), while survival outcomes were overall survival (OS) and progression-free survival (PFS). We used random-effect meta-regression models and conducted sensitivity analyses based on definition of CR and study quality. Seventy-two trials were included in the systematic review, 63 of which contributed data in meta-regression analyses. There was no association between OS and CR in patients without autologous stem cell transplant (ASCT) (regression coefficient: .02, 95% confidence interval [CI] -0.06, 0.10), in patients undergoing ASCT (-.11, 95% CI -0.44, 0.22) and in trials comparing ASCT with non-ASCT patients (.04, 95% CI -0.29, 0.38). Similarly, OS did not correlate with the combined metric of CR or VGPR, and no association was evident between response outcomes and PFS. Sensitivity analyses yielded similar results. This meta-regression analysis suggests that there is no association between conventional response outcomes and survival in patients with newly diagnosed MM. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Categorical Variables in Multiple Regression: Some Cautions.
ERIC Educational Resources Information Center
O'Grady, Kevin E.; Medoff, Deborah R.
1988-01-01
Limitations of dummy coding and nonsense coding as methods of coding categorical variables for use as predictors in multiple regression analysis are discussed. The combination of these approaches often yields estimates and tests of significance that are not intended by researchers for inclusion in their models. (SLD)
Ensemble habitat mapping of invasive plant species
Stohlgren, T.J.; Ma, P.; Kumar, S.; Rocca, M.; Morisette, J.T.; Jarnevich, C.S.; Benson, N.
2010-01-01
Ensemble species distribution models combine the strengths of several species environmental matching models, while minimizing the weakness of any one model. Ensemble models may be particularly useful in risk analysis of recently arrived, harmful invasive species because species may not yet have spread to all suitable habitats, leaving species-environment relationships difficult to determine. We tested five individual models (logistic regression, boosted regression trees, random forest, multivariate adaptive regression splines (MARS), and maximum entropy model or Maxent) and ensemble modeling for selected nonnative plant species in Yellowstone and Grand Teton National Parks, Wyoming; Sequoia and Kings Canyon National Parks, California, and areas of interior Alaska. The models are based on field data provided by the park staffs, combined with topographic, climatic, and vegetation predictors derived from satellite data. For the four invasive plant species tested, ensemble models were the only models that ranked in the top three models for both field validation and test data. Ensemble models may be more robust than individual species-environment matching models for risk analysis. © 2010 Society for Risk Analysis.
Combined analysis of magnetic and gravity anomalies using normalized source strength (NSS)
NASA Astrophysics Data System (ADS)
Li, L.; Wu, Y.
2017-12-01
Gravity and magnetic fields are both potential fields, so their interpretation is inherently non-unique. Combined analysis of magnetic and gravity anomalies based on Poisson's relation is used to identify homologous (common-source) gravity and magnetic anomalies and to decrease the ambiguity. The traditional combined analysis uses the linear regression of the reduction-to-pole (RTP) magnetic anomaly on the first-order vertical derivative of the gravity anomaly, and provides a quantitative or semi-quantitative interpretation by calculating the correlation coefficient, slope and intercept. In the calculation process, due to the effect of remanent magnetization, the RTP anomaly still contains the effect of oblique magnetization. In this case homologous gravity and magnetic anomalies appear uncorrelated in the linear regression calculation. The normalized source strength (NSS), which is insensitive to remanence, can be computed from the magnetic tensor matrix. Here we present a new combined analysis using NSS. Based on Poisson's relation, the gravity tensor matrix can be transformed into the pseudomagnetic tensor matrix for magnetization in the direction of the geomagnetic field under the homologous condition. The NSS of the pseudomagnetic tensor matrix and of the original magnetic tensor matrix are calculated and linear regression analysis is carried out. The calculated correlation coefficient, slope and intercept indicate the homology level, Poisson's ratio and the distribution of remanence, respectively. We test the approach using a synthetic model under complex magnetization; the results show that it can still distinguish the same source under the condition of strong remanence, and establish the Poisson's ratio. Finally, this approach is applied in China. The results demonstrate that our approach is feasible.
Optimizing methods for linking cinematic features to fMRI data.
Kauttonen, Janne; Hlushchuk, Yevhen; Tikka, Pia
2015-04-15
One of the challenges of naturalistic neurosciences using movie-viewing experiments is how to interpret observed brain activations in relation to the multiplicity of time-locked stimulus features. As previous studies have shown less inter-subject synchronization across viewers of random video footage than story-driven films, new methods need to be developed for analysis of less story-driven contents. To optimize the linkage between our fMRI data collected during viewing of a deliberately non-narrative silent film 'At Land' by Maya Deren (1944) and its annotated content, we combined the method of elastic-net regularization with the model-driven linear regression and the well-established data-driven independent component analysis (ICA) and inter-subject correlation (ISC) methods. In the linear regression analysis, both IC and region-of-interest (ROI) time-series were fitted with time-series of a total of 36 binary-valued and one real-valued tactile annotation of film features. The elastic-net regularization and cross-validation were applied in the ordinary least-squares linear regression in order to avoid over-fitting due to the multicollinearity of regressors; the results were compared against both the partial least-squares (PLS) regression and the un-regularized full-model regression. A non-parametric permutation testing scheme was applied to evaluate the statistical significance of regression. We found statistically significant correlation between the annotation model and 9 ICs out of 40 ICs. Regression analysis was also repeated for a large set of cubic ROIs covering the grey matter. Both IC- and ROI-based regression analyses revealed activations in parietal and occipital regions, with additional smaller clusters in the frontal lobe. Furthermore, we found elastic-net based regression more sensitive than PLS and un-regularized regression since it detected a larger number of significant ICs and ROIs.
Along with the ISC ranking methods, our regression analysis proved a feasible method for ordering the ICs based on their functional relevance to the annotated cinematic features. The novelty of our method is - in comparison to the hypothesis-driven manual pre-selection and observation of some individual regressors biased by choice - in applying data-driven approach to all content features simultaneously. We found especially the combination of regularized regression and ICA useful when analyzing fMRI data obtained using non-narrative movie stimulus with a large set of complex and correlated features. Copyright © 2015. Published by Elsevier Inc.
Guo, Huey-Ming; Shyu, Yea-Ing Lotus; Chang, Her-Kun
2006-01-01
In this article, the authors provide an overview of a research method to predict quality of care in a home health nursing data set. The results of this study can be visualized through classification and regression tree (CART) graphs. The analysis was more effective, and the results were more informative, when the home health nursing data set was analyzed with a combination of logistic regression and CART, because the two techniques complement each other. The results were also more informative in that more patient characteristics were found to be related to quality of care in home care. The results can help home health nurses predict patient outcomes in case management. Improved prediction is needed for interventions to be appropriately targeted for improved patient outcome and quality of care.
NASA Astrophysics Data System (ADS)
Muji Susantoro, Tri; Wikantika, Ketut; Saepuloh, Asep; Handoyo Harsolumakso, Agus
2018-05-01
Selection of vegetation indices in plant mapping is needed to provide the best information on plant conditions. The methods used in this research are the standard deviation and linear regression. This research tried to determine the vegetation indices to use for mapping sugarcane conditions around oil and gas fields. The data used in this study are Landsat 8 OLI/TIRS. The standard deviation analysis of the 23 vegetation indices with 27 samples yielded the six vegetation indices with the highest standard deviations: GRVI, SR, NLI, SIPI, GEMI and LAI. The standard deviation values are 0.47, 0.43, 0.30, 0.17, 0.16 and 0.13. Regression correlation analysis of the 23 vegetation indices with 280 samples yielded six vegetation indices: NDVI, ENDVI, GDVI, VARI, LAI and SIPI. This selection was based on the regression correlation, using an R² threshold of 0.8. The combined analysis of the standard deviation and the regression correlation yielded five vegetation indices: NDVI, ENDVI, GDVI, LAI and SIPI. The results of the analysis of both methods show that a combination of the two methods is needed to produce a good analysis of sugarcane conditions. This was verified through field surveys, which showed good results for the prediction of microseepages.
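The standard-deviation screening described in this abstract can be sketched as follows: compute each candidate index for every sample, then keep the indices whose values vary most across samples. The NDVI formula is the standard (NIR − Red) / (NIR + Red); the sample values, the index names other than NDVI, and the use of the sample (n − 1) standard deviation are assumptions made for illustration.

```python
import statistics


def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


def rank_by_spread(index_samples, top=6):
    """Return the names of the indices with the largest standard deviation.

    index_samples maps an index name to its values over all samples.
    """
    spreads = {name: statistics.stdev(values)
               for name, values in index_samples.items()}
    return sorted(spreads, key=spreads.get, reverse=True)[:top]


# Hypothetical index values over three sample sites.
samples = {
    "index_a": [0.1, 0.2, 0.3],
    "index_b": [0.1, 0.5, 0.9],
    "index_c": [0.2, 0.2, 0.2],
}
best_two = rank_by_spread(samples, top=2)
print(best_two)
```

An index with a large spread separates plant conditions more clearly, which is the rationale for ranking by standard deviation before applying the regression-based screen.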
Chowdhury, Nilotpal; Sapru, Shantanu
2015-01-01
Introduction Microarray analysis has revolutionized the role of genomic prognostication in breast cancer. However, most studies are single series studies, and suffer from methodological problems. We sought to use a meta-analytic approach in combining multiple publicly available datasets, while correcting for batch effects, to reach a more robust oncogenomic analysis. Aim The aim of the present study was to find gene sets associated with distant metastasis free survival (DMFS) in systemically untreated, node-negative breast cancer patients, from publicly available genomic microarray datasets. Methods Four microarray series (having 742 patients) were selected after a systematic search and combined. Cox regression for each gene was done for the combined dataset (univariate, as well as multivariate – adjusted for expression of Cell cycle related genes) and for the 4 major molecular subtypes. The centre and microarray batch effects were adjusted by including them as random effects variables. The Cox regression coefficients for each analysis were then ranked and subjected to a Gene Set Enrichment Analysis (GSEA). Results Gene sets representing protein translation were independently negatively associated with metastasis in the Luminal A and Luminal B subtypes, but positively associated with metastasis in Basal tumors. Proteinaceous extracellular matrix (ECM) gene set expression was positively associated with metastasis, after adjustment for expression of cell cycle related genes on the combined dataset. Finally, the positive association of the proliferation-related genes with metastases was confirmed. Conclusion To the best of our knowledge, the results depicting mixed prognostic significance of protein translation in breast cancer subtypes are being reported for the first time. We attribute this to our study combining multiple series and performing a more robust meta-analytic Cox regression modeling on the combined dataset, thus discovering 'hidden' associations. 
This methodology seems to yield new and interesting results and may be used as a tool to guide new research. PMID:26080057
Wang, Wanping; Liu, Mingyue; Wang, Jing; Tian, Rui; Dong, Junqiang; Liu, Qi; Zhao, Xianping; Wang, Yuanfang
2014-01-01
Screening indexes of tumor serum markers for benign and malignant solitary pulmonary nodules (SPNs) were analyzed to find the optimum method for diagnosis. Enzyme-linked immunosorbent assays, an automatic immune analyzer and radioimmunoassay methods were used to examine the levels of 8 serum markers in 164 SPN patients, and the sensitivity for differential diagnosis of malignant or benign SPN was compared for detection using a single plasma marker or a combination of markers. The results for serological indicators that closely relate to benign and malignant SPNs were screened using the Fisher discriminant analysis and a non-conditional logistic regression analysis method, respectively. The results were then verified by the k-means clustering analysis method. The sensitivity when using a combination of serum markers to detect SPN was higher than that using a single marker. By Fisher discriminant analysis, cytokeratin 19 fragments (CYFRA21-1), carbohydrate antigen 125 (CA125), squamous cell carcinoma antigen (SCC) and breast cancer antigen (CA153), which relate to the benign and malignant SPNs, were screened. Through non-conditional logistic regression analysis, CYFRA21-1, SCC and CA153 were obtained. Using the k-means clustering analysis, the cophenetic correlation coefficient (0.940) obtained by the Fisher discriminant analysis was higher than that obtained with logistic regression analysis (0.875). This study indicated that the Fisher discriminant analysis functioned better in screening out serum markers to recognize the benign and malignant SPN. The combined detection of CYFRA21-1, CA125, SCC and CA153 is an effective way to distinguish benign and malignant SPN, and will find an important clinical application in the early diagnosis of SPN. © 2014 S. Karger GmbH, Freiburg.
NASA Astrophysics Data System (ADS)
Rajab, Jasim M.; MatJafri, M. Z.; Lim, H. S.
2013-06-01
This study encompasses columnar ozone modelling in the peninsular Malaysia. Data of eight atmospheric parameters [air surface temperature (AST), carbon monoxide (CO), methane (CH4), water vapour (H2Ovapour), skin surface temperature (SSKT), atmosphere temperature (AT), relative humidity (RH), and mean surface pressure (MSP)] data set, retrieved from NASA's Atmospheric Infrared Sounder (AIRS), for the entire period (2003-2008) was employed to develop models to predict the value of columnar ozone (O3) in study area. The combined method, which is based on using both multiple regressions combined with principal component analysis (PCA) modelling, was used to predict columnar ozone. This combined approach was utilized to improve the prediction accuracy of columnar ozone. Separate analysis was carried out for north east monsoon (NEM) and south west monsoon (SWM) seasons. The O3 was negatively correlated with CH4, H2Ovapour, RH, and MSP, whereas it was positively correlated with CO, AST, SSKT, and AT during both the NEM and SWM season periods. Multiple regression analysis was used to fit the columnar ozone data using the atmospheric parameter's variables as predictors. A variable selection method based on high loading of varimax rotated principal components was used to acquire subsets of the predictor variables to be comprised in the linear regression model of the atmospheric parameter's variables. It was found that the increase in columnar O3 value is associated with an increase in the values of AST, SSKT, AT, and CO and with a drop in the levels of CH4, H2Ovapour, RH, and MSP. The result of fitting the best models for the columnar O3 value using eight of the independent variables gave about the same values of the R (≈0.93) and R2 (≈0.86) for both the NEM and SWM seasons. The common variables that appeared in both regression equations were SSKT, CH4 and RH, and the principal precursor of the columnar O3 value in both the NEM and SWM seasons was SSKT.
An INAR(1) Negative Multinomial Regression Model for Longitudinal Count Data.
ERIC Educational Resources Information Center
Bockenholt, Ulf
1999-01-01
Discusses a regression model for the analysis of longitudinal count data in a panel study by adapting an integer-valued first-order autoregressive (INAR(1)) Poisson process to represent time-dependent correlation between counts. Derives a new negative multinomial distribution by combining INAR(1) representation with a random effects approach.…
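The INAR(1) Poisson process named in this abstract can be illustrated with a short simulation. The following is only a minimal sketch of the basic process (binomial thinning plus Poisson innovations), not the negative multinomial model derived in the paper; the function names, parameter values, and the choice of Knuth's Poisson sampler are assumptions made for illustration.

```python
import math
import random


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_inar1(n_steps, alpha, lam, seed=0):
    """Simulate X_t = alpha o X_{t-1} + e_t with e_t ~ Poisson(lam).

    'alpha o X' is binomial thinning: each of the X_{t-1} counts
    survives independently with probability alpha.
    """
    rng = random.Random(seed)
    # Start near the stationary mean lam / (1 - alpha), rounded to an integer.
    x = round(lam / (1.0 - alpha))
    series = []
    for _ in range(n_steps):
        survivors = sum(1 for _ in range(x) if rng.random() < alpha)
        x = survivors + poisson_draw(rng, lam)
        series.append(x)
    return series


if __name__ == "__main__":
    print(simulate_inar1(10, alpha=0.5, lam=2.0, seed=1))
```

Because surviving counts carry over from one period to the next, consecutive values are positively correlated, which is the time dependence the abstract refers to.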
Clustering performance comparison using K-means and expectation maximization algorithms.
Jung, Yong Gyu; Kang, Min Soo; Heo, Jun
2014-11-14
Clustering is an important means of data mining based on separating data categories by similar features. Unlike the classification algorithm, clustering belongs to the unsupervised type of algorithms. Two representatives of the clustering algorithms are the K-means and the expectation maximization (EM) algorithm. Linear regression analysis was extended to the category-type dependent variable, while logistic regression was achieved using a linear combination of independent variables. To predict the possibility of occurrence of an event, a statistical approach is used. However, the classification of all data by means of logistic regression analysis cannot guarantee the accuracy of the results. In this paper, the logistic regression analysis is applied to EM clusters and the K-means clustering method for quality assessment of red wine, and a method is proposed for ensuring the accuracy of the classification results.
Analysis of a Rocket Based Combined Cycle Engine during Rocket Only Operation
NASA Technical Reports Server (NTRS)
Smith, T. D.; Steffen, C. J., Jr.; Yungster, S.; Keller, D. J.
1998-01-01
The all rocket mode of operation is a critical factor in the overall performance of a rocket based combined cycle (RBCC) vehicle. However, outside of performing experiments or a full three dimensional analysis, there are no first order parametric models to estimate performance. As a result, an axisymmetric RBCC engine was used to analytically determine specific impulse efficiency values based upon both full flow and gas generator configurations. Design of experiments methodology was used to construct a test matrix and statistical regression analysis was used to build parametric models. The main parameters investigated in this study were: rocket chamber pressure, rocket exit area ratio, percent of injected secondary flow, mixer-ejector inlet area, mixer-ejector area ratio, and mixer-ejector length-to-inlet diameter ratio. A perfect gas computational fluid dynamics analysis was performed to obtain values of vacuum specific impulse. Statistical regression analysis was performed based on both full flow and gas generator engine cycles. Results were also found to be dependent upon the entire cycle assumptions. The statistical regression analysis determined that there were five significant linear effects, six interactions, and one second-order effect. Two parametric models were created to provide performance assessments of an RBCC engine in the all rocket mode of operation.
NASA Technical Reports Server (NTRS)
Smith, Timothy D.; Steffen, Christopher J., Jr.; Yungster, Shaye; Keller, Dennis J.
1998-01-01
The all rocket mode of operation is shown to be a critical factor in the overall performance of a rocket based combined cycle (RBCC) vehicle. An axisymmetric RBCC engine was used to determine specific impulse efficiency values based upon both full flow and gas generator configurations. Design of experiments methodology was used to construct a test matrix and multiple linear regression analysis was used to build parametric models. The main parameters investigated in this study were: rocket chamber pressure, rocket exit area ratio, injected secondary flow, mixer-ejector inlet area, mixer-ejector area ratio, and mixer-ejector length-to-inlet diameter ratio. A perfect gas computational fluid dynamics analysis, using both the Spalart-Allmaras and k-omega turbulence models, was performed with the NPARC code to obtain values of vacuum specific impulse. Results from the multiple linear regression analysis showed that for both the full flow and gas generator configurations increasing mixer-ejector area ratio and rocket area ratio increase performance, while increasing mixer-ejector inlet area ratio and mixer-ejector length-to-diameter ratio decrease performance. Increasing injected secondary flow increased performance for the gas generator analysis, but was not statistically significant for the full flow analysis. Chamber pressure was found to be not statistically significant.
Regression Model Optimization for the Analysis of Experimental Data
NASA Technical Reports Server (NTRS)
Ulbrich, N.
2009-01-01
A candidate math model search algorithm was developed at Ames Research Center that determines a recommended math model for the multivariate regression analysis of experimental data. The search algorithm is applicable to classical regression analysis problems as well as wind tunnel strain gage balance calibration analysis applications. The algorithm compares the predictive capability of different regression models using the standard deviation of the PRESS residuals of the responses as a search metric. This search metric is minimized during the search. Singular value decomposition is used during the search to reject math models that lead to a singular solution of the regression analysis problem. Two threshold dependent constraints are also applied. The first constraint rejects math models with insignificant terms. The second constraint rejects math models with near-linear dependencies between terms. The math term hierarchy rule may also be applied as an optional constraint during or after the candidate math model search. The final term selection of the recommended math model depends on the regressor and response values of the data set, the user's function class combination choice, the user's constraint selections, and the result of the search metric minimization. A frequently used regression analysis example from the literature is used to illustrate the application of the search algorithm to experimental data.
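The search metric described above, the standard deviation of the PRESS residuals, can be sketched for the simplest case of a straight-line model. This is an illustrative sketch only: it assumes a single regressor, uses the textbook leave-one-out shortcut e_i / (1 - h_i), and takes the sample standard deviation as one plausible reading of the metric. The function names are hypothetical and are not taken from the Ames tool.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx
    return y_bar - b * x_bar, b


def press_residuals(xs, ys):
    """PRESS (leave-one-out) residuals via the shortcut e_i / (1 - h_i)."""
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    a, b = fit_line(xs, ys)
    out = []
    for x, y in zip(xs, ys):
        # Leverage of point i for a straight-line model with intercept.
        leverage = 1.0 / n + (x - x_bar) ** 2 / sxx
        out.append((y - (a + b * x)) / (1.0 - leverage))
    return out


def press_std(xs, ys):
    """Sample standard deviation of the PRESS residuals (assumed metric)."""
    res = press_residuals(xs, ys)
    mean = sum(res) / len(res)
    return math.sqrt(sum((r - mean) ** 2 for r in res) / (len(res) - 1))


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [0.1, 0.9, 2.2, 2.8, 4.1, 5.2]
    print(press_std(xs, ys))
```

In a model search of the kind the abstract describes, a value like this would be computed for each candidate model and the candidate with the smallest value kept, subject to the stated constraints.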
Tang, Kai; Si, Jun-Kang; Guo, Da-Dong; Cui, Yan; Du, Yu-Xiang; Pan, Xue-Mei; Bi, Hong-Sheng
2015-01-01
AIM To compare the efficacy of intravitreal ranibizumab (IVR) alone or in combination with photodynamic therapy (PDT) vs PDT in patients with symptomatic polypoidal choroidal vasculopathy (PCV). METHODS A wide range of databases (including PubMed, EMBASE, Cochrane Library and Web of Science) was systematically searched to identify relevant studies. Both randomized controlled trials (RCTs) and non-RCT studies were included. Methodological quality of the included studies was evaluated according to the Newcastle-Ottawa Scale. RevMan 5.2.7 software was used to perform the meta-analysis. RESULTS Three RCTs and 6 retrospective studies were included. The proportion of patients who achieved complete regression of polyps was significantly higher with PDT monotherapy than with IVR monotherapy at months 3, 6, and 12 (all P≤0.01). However, on the basis of the RCTs, IVR tended to be more effective in improving vision. The proportion of patients who achieved complete regression of polyps did not differ significantly between the combination treatment and PDT monotherapy. The mean change of best-corrected visual acuity (BCVA) from baseline showed that the combination treatment was significantly superior to PDT monotherapy in improving vision at months 3, 6 and 24 (all P<0.05). This comparison was also significant at month 12 (P<0.01) after removal of a heterogeneous study. CONCLUSION IVR is non-inferior to PDT in stabilizing or improving vision, although it can hardly promote the regression of polyps. The combination treatment of PDT and IVR can exert a synergistic effect on regressing polyps and on maintaining or improving visual acuity. Thus, it can be the first-line therapy for PCV. PMID:26558226
Cooley, Richard L.
1982-01-01
Prior information on the parameters of a groundwater flow model can be used to improve parameter estimates obtained from nonlinear regression solution of a modeling problem. Two scales of prior information can be available: (1) prior information having known reliability (that is, bias and random error structure) and (2) prior information consisting of best available estimates of unknown reliability. A regression method that incorporates the second scale of prior information assumes the prior information to be fixed for any particular analysis to produce improved, although biased, parameter estimates. Approximate optimization of two auxiliary parameters of the formulation is used to help minimize the bias, which is almost always much smaller than that resulting from standard ridge regression. It is shown that if both scales of prior information are available, then a combined regression analysis may be made.
Noninvasive glucose monitoring by optical reflective and thermal emission spectroscopic measurements
NASA Astrophysics Data System (ADS)
Saetchnikov, V. A.; Tcherniavskaia, E. A.; Schiffner, G.
2005-08-01
Noninvasive method for blood glucose monitoring in cutaneous tissue based on reflective spectrometry combined with a thermal emission spectroscopy has been developed. Regression analysis, neural network algorithms and cluster analysis are used for data processing.
Zhou, Qing-he; Xiao, Wang-pin; Shen, Ying-yan
2014-07-01
The spread of spinal anesthesia is highly unpredictable. In patients with increased abdominal girth and short stature, a greater cephalad spread after a fixed amount of subarachnoidally administered plain bupivacaine is often observed. We hypothesized that there is a strong correlation between abdominal girth/vertebral column length and cephalad spread. Age, weight, height, body mass index, abdominal girth, and vertebral column length were recorded for 114 patients. The L3-L4 interspace was entered, and 3 mL of 0.5% plain bupivacaine was injected into the subarachnoid space. The cephalad spread (loss of temperature sensation and loss of pinprick discrimination) was assessed 30 minutes after intrathecal injection. Linear regression analysis was performed for age, weight, height, body mass index, abdominal girth, vertebral column length, and the spread of spinal anesthesia, and the combined linear contribution of age up to 55 years, weight, height, abdominal girth, and vertebral column length was tested by multiple regression analysis. Linear regression analysis showed that there was a significant univariate correlation among all 6 patient characteristics evaluated and the spread of spinal anesthesia (all P < 0.039) except for age and loss of temperature sensation (P > 0.068). Multiple regression analysis showed that abdominal girth and the vertebral column length were the key determinants for spinal anesthesia spread (both P < 0.0001), whereas age, weight, and height could be omitted without changing the results (all P > 0.059, all 95% confidence limits < 0.372). Multiple regression analysis revealed that the combination of a patient's 5 general characteristics, especially abdominal girth and vertebral column length, had a high predictive value for the spread of spinal anesthesia after a given dose of plain bupivacaine.
2014-01-01
Background Support vector regression (SVR) and Gaussian process regression (GPR) were used for the analysis of electroanalytical experimental data to estimate diffusion coefficients. Results For simulated cyclic voltammograms based on the EC, Eqr, and EqrC mechanisms these regression algorithms in combination with nonlinear kernel/covariance functions yielded diffusion coefficients with higher accuracy as compared to the standard approach of calculating diffusion coefficients relying on the Nicholson-Shain equation. The level of accuracy achieved by SVR and GPR is virtually independent of the rate constants governing the respective reaction steps. Further, the reduction of high-dimensional voltammetric signals by manual selection of typical voltammetric peak features decreased the performance of both regression algorithms compared to a reduction by downsampling or principal component analysis. After training on simulated data sets, diffusion coefficients were estimated by the regression algorithms for experimental data comprising voltammetric signals for three organometallic complexes. Conclusions Estimated diffusion coefficients closely matched the values determined by the parameter fitting method, but reduced the required computational time considerably for one of the reaction mechanisms. The automated processing of voltammograms according to the regression algorithms yields better results than the conventional analysis of peak-related data. PMID:24987463
Code of Federal Regulations, 2010 CFR
2010-01-01
... systems, building load simulation models, statistical regression analysis, or some combination of these..., excluding any cogeneration process for other than a federally owned building or buildings or other federally...
Analysis of the Effects of the Commander’s Battle Positioning on Unit Combat Performance
1991-03-01
Analysis ... Logistic Regression Analysis ... Canonical Correlation Analysis ... Discriminant Analysis ... entails classifying objects into two or more distinct groups, or responses. Dillon defines discriminant analysis as "deriving linear combinations of the ... object given its predictor variables. The second objective is, through analysis of the parameters of the discriminant functions, determine those
Robust neural network with applications to credit portfolio data analysis.
Feng, Yijia; Li, Runze; Sudjianto, Agus; Zhang, Yiyun
2010-01-01
In this article, we study nonparametric conditional quantile estimation via neural network structure. We propose an estimation method that combines quantile regression and neural network (robust neural network, RNN). It provides good smoothing performance in the presence of outliers and can be used to construct prediction bands. A Majorization-Minimization (MM) algorithm was developed for optimization. A Monte Carlo simulation study is conducted to assess the performance of RNN. Comparison with other nonparametric regression methods (e.g., local linear regression and regression splines) in a real data application demonstrates the advantage of the newly proposed procedure.
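Quantile regression of the kind mentioned above is usually built on the check (pinball) loss. The sketch below shows only that loss and its robustness to an outlier for a constant predictor. It does not implement the neural network or the MM algorithm from the article, and all names are illustrative assumptions.

```python
def pinball_loss(residual, tau):
    """Check (pinball) loss for quantile level tau in (0, 1)."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def mean_pinball(ys, prediction, tau):
    """Average pinball loss of a single constant prediction."""
    return sum(pinball_loss(y - prediction, tau) for y in ys) / len(ys)


def best_constant(ys, tau):
    """Return the observed value that minimizes the mean pinball loss."""
    return min(sorted(ys), key=lambda c: mean_pinball(ys, c, tau))


if __name__ == "__main__":
    data = [1, 2, 3, 4, 100]  # one gross outlier
    print(best_constant(data, 0.5))
```

With tau = 0.5 the minimizer is the sample median, so the outlier at 100 does not pull the fit the way it would under squared error. Fitting the same loss at two different values of tau is one way to obtain the kind of prediction band the abstract mentions.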
Valid Statistical Analysis for Logistic Regression with Multiple Sources
NASA Astrophysics Data System (ADS)
Fienberg, Stephen E.; Nardi, Yuval; Slavković, Aleksandra B.
Considerable effort has gone into understanding issues of privacy protection of individual information in single databases, and various solutions have been proposed depending on the nature of the data, the ways in which the database will be used and the precise nature of the privacy protection being offered. Once data are merged across sources, however, the nature of the problem becomes far more complex and a number of privacy issues arise for the linked individual files that go well beyond those that are considered with regard to the data within individual sources. In the paper, we propose an approach that gives full statistical analysis on the combined database without actually combining it. We focus mainly on logistic regression, but the method and tools described may be applied essentially to other statistical models as well.
Detrended fluctuation analysis as a regression framework: Estimating dependence at different scales
NASA Astrophysics Data System (ADS)
Kristoufek, Ladislav
2015-02-01
We propose a framework combining detrended fluctuation analysis with standard regression methodology. The method is built on detrended variances and covariances and it is designed to estimate regression parameters at different scales and under potential nonstationarity and power-law correlations. The former feature allows for distinguishing between effects for a pair of variables from different temporal perspectives. The latter ones make the method a significant improvement over the standard least squares estimation. Theoretical claims are supported by Monte Carlo simulations. The method is then applied on selected examples from physics, finance, environmental science, and epidemiology. For most of the studied cases, the relationship between variables of interest varies strongly across scales.
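A scale-specific slope built from detrended variances and covariances, as described above, might be sketched as follows. This is a minimal sketch under stated assumptions: non-overlapping boxes, linear detrending within each box, and a slope defined as the ratio of the detrended covariance of the two profiles to the detrended variance of the regressor's profile. The helper names are hypothetical and the details may differ from the paper's estimator.

```python
def _profile(series):
    """Cumulative sum of deviations from the mean (the DFA 'profile')."""
    mean = sum(series) / len(series)
    total, out = 0.0, []
    for value in series:
        total += value - mean
        out.append(total)
    return out


def _detrend(window):
    """Residuals of a least-squares straight line fitted to one window."""
    n = len(window)
    t_bar = (n - 1) / 2.0
    w_bar = sum(window) / n
    stt = sum((t - t_bar) ** 2 for t in range(n))
    slope = sum((t - t_bar) * (w - w_bar) for t, w in enumerate(window)) / stt
    return [w - (w_bar + slope * (t - t_bar)) for t, w in enumerate(window)]


def dfa_slope(x, y, scale):
    """Slope at one scale: detrended covariance over detrended variance."""
    px, py = _profile(x), _profile(y)
    cov_sum = var_sum = 0.0
    # Non-overlapping boxes of length `scale`; a trailing partial box is dropped.
    for start in range(0, len(px) - scale + 1, scale):
        rx = _detrend(px[start:start + scale])
        ry = _detrend(py[start:start + scale])
        cov_sum += sum(a * b for a, b in zip(rx, ry))
        var_sum += sum(a * a for a in rx)
    return cov_sum / var_sum


if __name__ == "__main__":
    x = [((i * 37) % 11) - 5.0 for i in range(60)]
    y = [2.0 * v + 1.0 for v in x]
    print([dfa_slope(x, y, s) for s in (5, 10, 20)])
```

For an exact linear relationship the slope is the same at every scale. A slope that changes with `scale` on real data is the kind of scale dependence the abstract reports.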
NASA Astrophysics Data System (ADS)
Candefjord, Stefan; Nyberg, Morgan; Jalkanen, Ville; Ramser, Kerstin; Lindahl, Olof A.
2010-12-01
Tissue characterization is fundamental for identification of pathological conditions. Raman spectroscopy (RS) and tactile resonance measurement (TRM) are two promising techniques that measure biochemical content and stiffness, respectively. They have potential to complement the gold standard, histological analysis. By combining RS and TRM, complementary information about tissue content can be obtained and specific drawbacks can be avoided. The aim of this study was to develop a multivariate approach to compare RS and TRM information. The approach was evaluated on measurements at the same points on porcine abdominal tissue. The measurement points were divided into five groups by multivariate analysis of the RS data. A regression analysis was performed and receiver operating characteristic (ROC) curves were used to compare the RS and TRM data. TRM identified one group efficiently (area under ROC curve 0.99). The RS data showed that the proportion of saturated fat was high in this group. The regression analysis showed that stiffness was mainly determined by the amount of fat and its composition. We concluded that RS provided additional, important information for tissue identification that was not provided by TRM alone. The results are promising for development of a method combining RS and TRM for intraoperative tissue characterization.
Improved Design Formulae for Buckling of Orthotropic Plates under Combined Loading
NASA Technical Reports Server (NTRS)
Weaver, Paul M.; Nemeth, Michael P.
2008-01-01
Simple, accurate buckling interaction formulae that are suitable for design studies are presented for long orthotropic plates with either simply supported or clamped longitudinal edges under combined loading. The loads include 1) combined uniaxial compression (or tension) and shear, 2) combined pure inplane bending and shear, and 3) combined uniaxial compression (or tension) and pure inplane bending. The interaction formulae are the results of detailed regression analysis of buckling data obtained from a very accurate Rayleigh-Ritz method.
Song, Seung Yeob; Lee, Young Koung; Kim, In-Jung
2016-01-01
A high-throughput screening system for Citrus lines with higher sugar and acid contents was established using Fourier transform infrared (FT-IR) spectroscopy in combination with multivariate analysis. FT-IR spectra confirmed typical spectral differences between the frequency regions of 950-1100 cm⁻¹, 1300-1500 cm⁻¹, and 1500-1700 cm⁻¹. Principal component analysis (PCA) and subsequent partial least square-discriminant analysis (PLS-DA) were able to discriminate five Citrus lines into three separate clusters corresponding to their taxonomic relationships. The quantitative predictive modeling of sugar and acid contents from Citrus fruits was established using partial least square regression algorithms from FT-IR spectra. The regression coefficients (R²) between predicted values and estimated sugar and acid content values were 0.99. These results demonstrate that by using FT-IR spectra and applying quantitative prediction modeling to Citrus sugar and acid contents, excellent Citrus lines can be detected early and with greater accuracy. Copyright © 2015 Elsevier Ltd. All rights reserved.
Regression Model Term Selection for the Analysis of Strain-Gage Balance Calibration Data
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert Manfred; Volden, Thomas R.
2010-01-01
The paper discusses the selection of regression model terms for the analysis of wind tunnel strain-gage balance calibration data. Different function class combinations are presented that may be used to analyze calibration data using either a non-iterative or an iterative method. The role of the intercept term in a regression model of calibration data is reviewed. In addition, useful algorithms and metrics originating from linear algebra and statistics are recommended that will help an analyst (i) to identify and avoid both linear and near-linear dependencies between regression model terms and (ii) to make sure that the selected regression model of the calibration data uses only statistically significant terms. Three different tests are suggested that may be used to objectively assess the predictive capability of the final regression model of the calibration data. These tests use both the original data points and regression model independent confirmation points. Finally, data from a simplified manual calibration of the Ames MK40 balance is used to illustrate the application of some of the metrics and tests to a realistic calibration data set.
Ruan, Cheng-Jiang; Xu, Xue-Xuan; Shao, Hong-Bo; Jaleel, Cheruth Abdul
2010-09-01
In the past 20 years, the major effort in plant breeding has changed from quantitative to molecular genetics with emphasis on quantitative trait loci (QTL) identification and marker assisted selection (MAS). However, results have been modest. This has been due to several factors including absence of tight linkage QTL, non-availability of mapping populations, and substantial time needed to develop such populations. To overcome these limitations, and as an alternative to planned populations, molecular marker-trait associations have been identified by combining germplasm with the regression technique. In the present review, the authors (1) survey the successful applications of germplasm-regression-combined (GRC) molecular marker-trait association identification in plants; (2) describe how to do the GRC analysis and its differences from mapping QTL based on a linkage map reconstructed from the planned populations; (3) consider the factors that affect the GRC association identification, including selections of optimal germplasm and molecular markers and testing of identification efficiency of markers associated with traits; and (4) finally discuss the future prospects of GRC marker-trait association analysis used in plant MAS/QTL breeding programs, especially in long-juvenile woody plants when no other genetic information such as linkage maps and QTL are available.
Shi, J Q; Wang, B; Will, E J; West, R M
2012-11-20
We propose a new semiparametric model for functional regression analysis, combining a parametric mixed-effects model with a nonparametric Gaussian process regression model, namely a mixed-effects Gaussian process functional regression model. The parametric component can provide explanatory information between the response and the covariates, whereas the nonparametric component can add nonlinearity. We can model the mean and covariance structures simultaneously, combining the information borrowed from other subjects with the information collected from each individual subject. We apply the model to dose-response curves that describe changes in the responses of subjects for differing levels of the dose of a drug or agent and have a wide application in many areas. We illustrate the method for the management of renal anaemia. An individual dose-response curve is improved when more information is included by this mechanism from the subject/patient over time, enabling a patient-specific treatment regime. Copyright © 2012 John Wiley & Sons, Ltd.
Caballero, Daniel; Antequera, Teresa; Caro, Andrés; Ávila, María Del Mar; G Rodríguez, Pablo; Perez-Palacios, Trinidad
2017-07-01
Magnetic resonance imaging (MRI) combined with computer vision techniques has been proposed as an alternative or complementary technique to determine the quality parameters of food in a non-destructive way. The aim of this work was to analyze the sensory attributes of dry-cured loins using this technique. For that, different MRI acquisition sequences (spin echo, gradient echo and turbo 3D), algorithms for MRI analysis (GLCM, NGLDM, GLRLM and GLCM-NGLDM-GLRLM) and predictive data mining techniques (multiple linear regression and isotonic regression) were tested. The correlation coefficient (R) and mean absolute error (MAE) were used to validate the prediction results. The combination of spin echo, GLCM and isotonic regression produced the most accurate results. In addition, the MRI data from dry-cured loins seem to be more suitable than the data from fresh loins. The application of predictive data mining techniques on computational texture features from the MRI data of loins enables the determination of the sensory traits of dry-cured loins in a non-destructive way. © 2016 Society of Chemical Industry.
I show that a conditional probability analysis using a stressor-response model based on a logistic regression provides a useful approach for developing candidate water quality criteria from empirical data, such as the Maryland Biological Streams Survey (MBSS) data.
Tanpitukpongse, Teerath P.; Mazurowski, Maciej A.; Ikhena, John; Petrella, Jeffrey R.
2016-01-01
Background and Purpose To assess prognostic efficacy of individual versus combined regional volumetrics in two commercially-available brain volumetric software packages for predicting conversion of patients with mild cognitive impairment to Alzheimer's disease. Materials and Methods Data was obtained through the Alzheimer's Disease Neuroimaging Initiative. 192 subjects (mean age 74.8 years, 39% female) diagnosed with mild cognitive impairment at baseline were studied. All had T1WI MRI sequences at baseline and 3-year clinical follow-up. Analysis was performed with NeuroQuant® and Neuroreader™. Receiver operating characteristic curves assessing the prognostic efficacy of each software package were generated using a univariable approach employing individual regional brain volumes, as well as two multivariable approaches (multiple regression and random forest), combining multiple volumes. Results On univariable analysis of 11 NeuroQuant® and 11 Neuroreader™ regional volumes, hippocampal volume had the highest area under the curve for both software packages (0.69 NeuroQuant®, 0.68 Neuroreader™), and was not significantly different (p > 0.05) between packages. Multivariable analysis did not increase the area under the curve for either package (0.63 logistic regression, 0.60 random forest NeuroQuant®; 0.65 logistic regression, 0.62 random forest Neuroreader™). Conclusion Of the multiple regional volume measures available in FDA-cleared brain volumetric software packages, hippocampal volume remains the best single predictor of conversion of mild cognitive impairment to Alzheimer's disease at 3-year follow-up. Combining volumetrics did not add additional prognostic efficacy. Therefore, future prognostic studies in MCI, combining such tools with demographic and other biomarker measures, are justified in using hippocampal volume as the only volumetric biomarker. PMID:28057634
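The area under the ROC curve reported above can be read as the probability that a randomly chosen converter receives a higher risk score than a randomly chosen non-converter. A minimal sketch of that pairwise calculation follows. The function name and the toy scores are illustrative assumptions, not values from the study, and a volume measure where smaller means higher risk would need its sign flipped before scoring.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a positive outranks a negative.

    Ties between a positive and a negative score count as one half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy risk scores; label 1 marks a (hypothetical) converter.
    print(roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1]))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives context for the 0.60 to 0.69 range quoted in the results.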
Hill, Andrew; Loh, Po-Ru; Bharadwaj, Ragu B.; Pons, Pascal; Shang, Jingbo; Guinan, Eva; Lakhani, Karim; Kilty, Iain
2017-01-01
Abstract Background: The association of differing genotypes with disease-related phenotypic traits offers great potential to both help identify new therapeutic targets and support stratification of patients who would gain the greatest benefit from specific drug classes. Development of low-cost genotyping and sequencing has made collecting large-scale genotyping data routine in population and therapeutic intervention studies. In addition, a range of new technologies is being used to capture numerous new and complex phenotypic descriptors. As a result, genotype and phenotype datasets have grown exponentially. Genome-wide association studies associate genotypes and phenotypes using methods such as logistic regression. As existing tools for association analysis limit the efficiency by which value can be extracted from increasing volumes of data, there is a pressing need for new software tools that can accelerate association analyses on large genotype-phenotype datasets. Results: Using open innovation (OI) and contest-based crowdsourcing, the logistic regression analysis in a leading, community-standard genetics software package (PLINK 1.07) was substantially accelerated. OI allowed us to do this in <6 months by providing rapid access to highly skilled programmers with specialized, difficult-to-find skill sets. Through a crowd-based contest a combination of computational, numeric, and algorithmic approaches was identified that accelerated the logistic regression in PLINK 1.07 by 18- to 45-fold. Combining contest-derived logistic regression code with coarse-grained parallelization, multithreading, and associated changes to data initialization code further developed through distributed innovation, we achieved an end-to-end speedup of 591-fold for a data set size of 6678 subjects by 645 863 variants, compared to PLINK 1.07's logistic regression. This represents a reduction in run time from 4.8 hours to 29 seconds. 
Accelerated logistic regression code developed in this project has been incorporated into the PLINK2 project. Conclusions: Using iterative competition-based OI, we have developed a new, faster implementation of logistic regression for genome-wide association studies analysis. We present lessons learned and recommendations on running a successful OI process for bioinformatics. PMID:28327993
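The reported run times and the 591-fold figure can be cross-checked with simple arithmetic. The sketch below uses only the rounded values quoted in the abstract; the variable names are illustrative. Because 4.8 hours is itself a rounded value, the ratio computed from it need not equal 591 exactly.

```python
# Values quoted in the abstract (rounded as published).
baseline_seconds = 4.8 * 3600      # 4.8 hours expressed in seconds
accelerated_seconds = 29
reported_speedup = 591

# Speedup implied by the rounded run times.
speedup_from_times = baseline_seconds / accelerated_seconds

# Baseline run time implied by the reported speedup and the 29 s run.
implied_baseline_hours = reported_speedup * accelerated_seconds / 3600

print(f"speedup from rounded times: {speedup_from_times:.1f}")
print(f"baseline implied by 591-fold: {implied_baseline_hours:.2f} h")
```

The implied baseline rounds to 4.8 hours, so the two reported figures are mutually consistent.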
Hill, Andrew; Loh, Po-Ru; Bharadwaj, Ragu B; Pons, Pascal; Shang, Jingbo; Guinan, Eva; Lakhani, Karim; Kilty, Iain; Jelinsky, Scott A
2017-05-01
The association of differing genotypes with disease-related phenotypic traits offers great potential to both help identify new therapeutic targets and support stratification of patients who would gain the greatest benefit from specific drug classes. Development of low-cost genotyping and sequencing has made collecting large-scale genotyping data routine in population and therapeutic intervention studies. In addition, a range of new technologies is being used to capture numerous new and complex phenotypic descriptors. As a result, genotype and phenotype datasets have grown exponentially. Genome-wide association studies associate genotypes and phenotypes using methods such as logistic regression. As existing tools for association analysis limit the efficiency by which value can be extracted from increasing volumes of data, there is a pressing need for new software tools that can accelerate association analyses on large genotype-phenotype datasets. Using open innovation (OI) and contest-based crowdsourcing, the logistic regression analysis in a leading, community-standard genetics software package (PLINK 1.07) was substantially accelerated. OI allowed us to do this in <6 months by providing rapid access to highly skilled programmers with specialized, difficult-to-find skill sets. Through a crowd-based contest a combination of computational, numeric, and algorithmic approaches was identified that accelerated the logistic regression in PLINK 1.07 by 18- to 45-fold. Combining contest-derived logistic regression code with coarse-grained parallelization, multithreading, and associated changes to data initialization code further developed through distributed innovation, we achieved an end-to-end speedup of 591-fold for a data set size of 6678 subjects by 645 863 variants, compared to PLINK 1.07's logistic regression. This represents a reduction in run time from 4.8 hours to 29 seconds. 
Accelerated logistic regression code developed in this project has been incorporated into the PLINK2 project. Using iterative competition-based OI, we have developed a new, faster implementation of logistic regression for genome-wide association studies analysis. We present lessons learned and recommendations on running a successful OI process for bioinformatics. © The Author 2017. Published by Oxford University Press.
Sparse partial least squares regression for simultaneous dimension reduction and variable selection
Chun, Hyonho; Keleş, Sündüz
2010-01-01
Partial least squares regression has been an alternative to ordinary least squares for handling multicollinearity in several areas of scientific research since the 1960s. It has recently gained much attention in the analysis of high dimensional genomic data. We show that known asymptotic consistency of the partial least squares estimator for a univariate response does not hold with the very large p and small n paradigm. We derive a similar result for a multivariate response regression with partial least squares. We then propose a sparse partial least squares formulation which aims simultaneously to achieve good predictive performance and variable selection by producing sparse linear combinations of the original predictors. We provide an efficient implementation of sparse partial least squares regression and compare it with well-known variable selection and dimension reduction approaches via simulation experiments. We illustrate the practical utility of sparse partial least squares regression in a joint analysis of gene expression and genomewide binding data. PMID:20107611
Hao, Yong; Sun, Xu-Dong; Yang, Qiang
2012-12-01
A variable selection strategy combined with local linear embedding (LLE) was introduced for the analysis of complex samples using near infrared spectroscopy (NIRS). Three methods, Monte Carlo uninformative variable elimination (MCUVE), the successive projections algorithm (SPA), and MCUVE combined with SPA, were used to eliminate redundant spectral variables. Partial least squares regression (PLSR) and LLE-PLSR were used for modeling complex samples. The results showed that MCUVE can both extract informative variables and improve the precision of models. Compared with PLSR models, LLE-PLSR models can achieve more accurate analysis results. MCUVE combined with LLE-PLSR is an effective modeling method for NIRS quantitative analysis.
Li, Min; Zhang, Lu; Yao, Xiaolong; Jiang, Xingyu
2017-01-01
The emerging membrane introduction mass spectrometry technique has been successfully used to detect benzene, toluene, ethyl benzene and xylene (BTEX), while overlapped spectra have unfortunately hindered its further application to the analysis of mixtures. Multivariate calibration, an efficient method to analyze mixtures, has been widely applied. In this paper, we compared univariate and multivariate analyses for quantification of the individual components of mixture samples. The results showed that the univariate analysis creates poor models with regression coefficients of 0.912, 0.867, 0.440 and 0.351 for BTEX, respectively. For multivariate analysis, a comparison to the partial-least squares (PLS) model shows that the orthogonal partial-least squares (OPLS) regression exhibits an optimal performance with regression coefficients of 0.995, 0.999, 0.980 and 0.976, favorable calibration parameters (RMSEC and RMSECV) and a favorable validation parameter (RMSEP). Furthermore, the OPLS exhibits a good recovery of 73.86 - 122.20% and relative standard deviation (RSD) of the repeatability of 1.14 - 4.87%. Thus, MIMS coupled with the OPLS regression provides an optimal approach for a quantitative BTEX mixture analysis in monitoring and predicting water pollution.
Analysis of Sting Balance Calibration Data Using Optimized Regression Models
NASA Technical Reports Server (NTRS)
Ulbrich, N.; Bader, Jon B.
2010-01-01
Calibration data of a wind tunnel sting balance was processed using a candidate math model search algorithm that recommends an optimized regression model for the data analysis. During the calibration the normal force and the moment at the balance moment center were selected as independent calibration variables. The sting balance itself had two moment gages. Therefore, after analyzing the connection between calibration loads and gage outputs, it was decided to choose the difference and the sum of the gage outputs as the two responses that best describe the behavior of the balance. The math model search algorithm was applied to these two responses. An optimized regression model was obtained for each response. Classical strain gage balance load transformations and the equations of the deflection of a cantilever beam under load are used to show that the search algorithm's two optimized regression models are supported by a theoretical analysis of the relationship between the applied calibration loads and the measured gage outputs. The analysis of the sting balance calibration data set is a rare example of a situation when terms of a regression model of a balance can directly be derived from first principles of physics. In addition, it is interesting to note that the search algorithm recommended the correct regression model term combinations using only a set of statistical quality metrics that were applied to the experimental data during the algorithm's term selection process.
Reference-Free Removal of EEG-fMRI Ballistocardiogram Artifacts with Harmonic Regression
Krishnaswamy, Pavitra; Bonmassar, Giorgio; Poulsen, Catherine; Pierce, Eric T; Purdon, Patrick L.; Brown, Emery N.
2016-01-01
Combining electroencephalogram (EEG) recording and functional magnetic resonance imaging (fMRI) offers the potential for imaging brain activity with high spatial and temporal resolution. This potential remains limited by the significant ballistocardiogram (BCG) artifacts induced in the EEG by cardiac pulsation-related head movement within the magnetic field. We model the BCG artifact using a harmonic basis, pose the artifact removal problem as a local harmonic regression analysis, and develop an efficient maximum likelihood algorithm to estimate and remove BCG artifacts. Our analysis paradigm accounts for time-frequency overlap between the BCG artifacts and neurophysiologic EEG signals, and tracks the spatiotemporal variations in both the artifact and the signal. We evaluate performance on: simulated oscillatory and evoked responses constructed with realistic artifacts; actual anesthesia-induced oscillatory recordings; and actual visual evoked potential recordings. In each case, the local harmonic regression analysis effectively removes the BCG artifacts, and recovers the neurophysiologic EEG signals. We further show that our algorithm outperforms commonly used reference-based and component analysis techniques, particularly in low SNR conditions, the presence of significant time-frequency overlap between the artifact and the signal, and/or large spatiotemporal variations in the BCG. Because our algorithm does not require reference signals and has low computational complexity, it offers a practical tool for removing BCG artifacts from EEG data recorded in combination with fMRI. PMID:26151100
40 CFR 80.48 - Augmentation of the complex emission model by vehicle testing.
Code of Federal Regulations, 2010 CFR
2010-07-01
... section, the analysis shall fit a regression model to a combined data set that includes vehicle testing... logarithm of emissions contained in this combined data set: (A) A term for each vehicle that shall reflect... nearest limit of the data core, using the unaugmented complex model. (B) “B” shall be set equal to the...
40 CFR 80.48 - Augmentation of the complex emission model by vehicle testing.
Code of Federal Regulations, 2012 CFR
2012-07-01
... section, the analysis shall fit a regression model to a combined data set that includes vehicle testing... logarithm of emissions contained in this combined data set: (A) A term for each vehicle that shall reflect... nearest limit of the data core, using the unaugmented complex model. (B) “B” shall be set equal to the...
40 CFR 80.48 - Augmentation of the complex emission model by vehicle testing.
Code of Federal Regulations, 2014 CFR
2014-07-01
... section, the analysis shall fit a regression model to a combined data set that includes vehicle testing... logarithm of emissions contained in this combined data set: (A) A term for each vehicle that shall reflect... nearest limit of the data core, using the unaugmented complex model. (B) “B” shall be set equal to the...
40 CFR 80.48 - Augmentation of the complex emission model by vehicle testing.
Code of Federal Regulations, 2011 CFR
2011-07-01
... section, the analysis shall fit a regression model to a combined data set that includes vehicle testing... logarithm of emissions contained in this combined data set: (A) A term for each vehicle that shall reflect... nearest limit of the data core, using the unaugmented complex model. (B) “B” shall be set equal to the...
40 CFR 80.48 - Augmentation of the complex emission model by vehicle testing.
Code of Federal Regulations, 2013 CFR
2013-07-01
... section, the analysis shall fit a regression model to a combined data set that includes vehicle testing... logarithm of emissions contained in this combined data set: (A) A term for each vehicle that shall reflect... nearest limit of the data core, using the unaugmented complex model. (B) “B” shall be set equal to the...
Milner, Allison; Aitken, Zoe; Kavanagh, Anne; LaMontagne, Anthony D; Pega, Frank; Petrie, Dennis
2017-06-23
Previous studies suggest that poor psychosocial job quality is a risk factor for mental health problems, but they use conventional regression analytic methods that cannot rule out reverse causation, unmeasured time-invariant confounding and reporting bias. This study combines two quasi-experimental approaches to improve causal inference by better accounting for these biases: (i) linear fixed effects regression analysis and (ii) linear instrumental variable analysis. We extract 13 annual waves of national cohort data including 13 260 working-age (18-64 years) employees. The exposure variable is self-reported level of psychosocial job quality. The instruments used are two common workplace entitlements. The outcome variable is the Mental Health Inventory (MHI-5). We adjust for measured time-varying confounders. In the fixed effects regression analysis adjusted for time-varying confounders, a 1-point increase in psychosocial job quality is associated with a 1.28-point improvement in mental health on the MHI-5 scale (95% CI: 1.17, 1.40; P < 0.001). When the fixed effects analysis was combined with the instrumental variable analysis, a 1-point increase in psychosocial job quality was related to a 1.62-point improvement on the MHI-5 scale (95% CI: -0.24, 3.48; P = 0.088). Our quasi-experimental results provide evidence to confirm job stressors as risk factors for mental ill health using methods that improve causal inference. © The Author 2017. Published by Oxford University Press on behalf of Faculty of Public Health. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com
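The fixed effects step described above can be sketched with the within (demeaning) transformation, which removes any time-invariant person-level confounder before estimating the slope. This is a minimal illustration on synthetic data, not the authors' analysis: the function name, the simulated panel, and the use of 1.28 as the true slope are assumptions made for the example, and the instrumental variable step is not shown.

```python
import numpy as np

def within_estimator(person_id, x, y):
    """Fixed effects slope: demean x and y within each person, then OLS."""
    ids = np.asarray(person_id)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_dm = np.empty_like(x)
    y_dm = np.empty_like(y)
    for pid in np.unique(ids):
        mask = ids == pid
        # Subtracting person means removes any time-invariant effect.
        x_dm[mask] = x[mask] - x[mask].mean()
        y_dm[mask] = y[mask] - y[mask].mean()
    return float(np.sum(x_dm * y_dm) / np.sum(x_dm ** 2))

# Synthetic panel (hypothetical): 50 people observed over 13 waves.
rng = np.random.default_rng(0)
n_people, n_waves = 50, 13
person_id = np.repeat(np.arange(n_people), n_waves)
alpha = rng.normal(0, 10, n_people)            # unobserved person effects
# The exposure is correlated with the person effect (confounding).
x = rng.normal(0, 1, n_people * n_waves) + 0.5 * alpha[person_id]
y = alpha[person_id] + 1.28 * x                # noiseless, for clarity
slope = within_estimator(person_id, x, y)
```

A pooled regression of y on x would be biased here because the exposure is correlated with the person effect; the within estimator recovers the assumed slope because that effect cancels in the demeaning.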
Suzuki, Hideaki; Tabata, Takahisa; Koizumi, Hiroki; Hohchi, Nobusuke; Takeuchi, Shoko; Kitamura, Takuro; Fujino, Yoshihisa; Ohbuchi, Toyoaki
2014-12-01
This study aimed to create a multiple regression model for predicting hearing outcomes of idiopathic sudden sensorineural hearing loss (ISSNHL). The participants were 205 consecutive patients (205 ears) with ISSNHL (hearing level ≥ 40 dB, interval between onset and treatment ≤ 30 days). They received systemic steroid administration combined with intratympanic steroid injection. Data were examined by simple and multiple regression analyses. Three hearing indices (percentage hearing improvement, hearing gain, and posttreatment hearing level [HLpost]) and 7 prognostic factors (age, days from onset to treatment, initial hearing level, initial hearing level at low frequencies, initial hearing level at high frequencies, presence of vertigo, and contralateral hearing level) were included in the multiple regression analysis as dependent and explanatory variables, respectively. In the simple regression analysis, the percentage hearing improvement, hearing gain, and HLpost showed significant correlation with 2, 5, and 6 of the 7 prognostic factors, respectively. The multiple correlation coefficients were 0.396, 0.503, and 0.714 for the percentage hearing improvement, hearing gain, and HLpost, respectively. Predicted values of HLpost calculated by the multiple regression equation were reliable with 70% probability with a 40-dB-width prediction interval. Prediction of HLpost by the multiple regression model may be useful to estimate the hearing prognosis of ISSNHL. © The Author(s) 2014.
Machine learning in updating predictive models of planning and scheduling transportation projects
DOT National Transportation Integrated Search
1997-01-01
A method combining machine learning and regression analysis to automatically and intelligently update predictive models used in the Kansas Department of Transportation's (KDOT's) internal management system is presented. The predictive models used...
Logistic regression applied to natural hazards: rare event logistic regression with replications
NASA Astrophysics Data System (ADS)
Guns, M.; Vanacker, V.
2012-06-01
Statistical analysis of natural hazards needs particular attention, as most of these phenomena are rare events. This study shows that the ordinary rare event logistic regression, as it is now commonly used in geomorphologic studies, does not always lead to a robust detection of controlling factors, as the results can be strongly sample-dependent. In this paper, we introduce some concepts of Monte Carlo simulations in rare event logistic regression. This technique, so-called rare event logistic regression with replications, combines the strength of probabilistic and statistical methods, and allows overcoming some of the limitations of previous developments through robust variable selection. This technique was here developed for the analyses of landslide controlling factors, but the concept is widely applicable for statistical analyses of natural hazards.
NASA Astrophysics Data System (ADS)
Lespinats, S.; Meyer-Bäse, Anke; He, Huan; Marshall, Alan G.; Conrad, Charles A.; Emmett, Mark R.
2009-05-01
Partial Least Square Regression (PLSR) and Data-Driven High Dimensional Scaling (DD-HDS) are employed for the prediction and the visualization of changes in polar lipid expression induced by different combinations of wild-type (wt) p53 gene therapy and SN38 chemotherapy of U87 MG glioblastoma cells. A very detailed analysis of the gangliosides reveals that certain gangliosides of GM3 or GD1-type have unique properties not shared by the others. In summary, this preliminary work shows that data mining techniques are able to determine the modulation of gangliosides by different treatment combinations.
Effects of eye artifact removal methods on single trial P300 detection, a comparative study.
Ghaderi, Foad; Kim, Su Kyoung; Kirchner, Elsa Andrea
2014-01-15
Electroencephalographic signals are commonly contaminated by eye artifacts, even if recorded under controlled conditions. The objective of this work was to quantitatively compare standard artifact removal methods (regression, filtered regression, Infomax, and second order blind identification (SOBI)) and two artifact identification approaches for independent component analysis (ICA) methods, i.e. ADJUST and correlation. To this end, eye artifacts were removed and the cleaned datasets were used for single trial classification of P300 (a type of event related potential elicited using the oddball paradigm). Statistical analysis of the results confirms that the combination of Infomax and ADJUST provides a relatively better performance (0.6% improvement on average across all subjects) while the combination of SOBI and correlation performs the worst. Low-pass filtering the data at lower cutoffs (here 4 Hz) can also improve the classification accuracy. Without requiring any artifact reference channel, the combination of Infomax and ADJUST improves the classification performance more than the other methods for both examined filtering cutoffs, i.e., 4 Hz and 25 Hz. Copyright © 2013 Elsevier B.V. All rights reserved.
Parrett, Charles
2006-01-01
To address concerns expressed by the State of Montana about the apportionment of water in the St. Mary and Milk River basins between Canada and the United States, the International Joint Commission requested information from the United States government about water that originates in the United States but does not cross the border into Canada. In response to this request, the U.S. Geological Survey synthesized monthly and annual streamflow records for Big Sandy, Clear, Peoples, and Beaver Creeks, all of which are in the Milk River basin in Montana, for water years 1950-2003. This report presents the synthesized values of monthly and annual streamflow for Big Sandy, Clear, Peoples, and Beaver Creeks in Montana. Synthesized values were derived from recorded and estimated streamflows. Statistics, including long-term medians and averages and flows for various exceedance probabilities, were computed from the synthesized data. Beaver Creek had the largest median annual discharge (19,490 acre-feet), and Clear Creek had the smallest median annual discharge (6,680 acre-feet). Big Sandy Creek, the stream with the largest drainage area, had the second smallest median annual discharge (9,640 acre-feet), whereas Peoples Creek, the stream with the second smallest drainage area, had the second largest median annual discharge (11,700 acre-feet). The combined median annual discharge for the four streams was 45,400 acre-feet. The largest combined median monthly discharge for the four creeks was 6,930 acre-feet in March, and the smallest combined median monthly discharge was 48 acre-feet in January. The combined median monthly values were substantially smaller than the average monthly values. Overall, synthesized flow records for the four creeks are considered to be reasonable given the prevailing climatic conditions in the region during the 1950-2003 base period. Individual estimates of monthly streamflow may have large errors, however. 
Linear regression was used to relate logarithms of combined annual streamflow to water years 1950-2003. The results of the regression analysis indicated a significant downward trend (regression line slope was -0.00977) for combined annual streamflow. A regression analysis using data from 1956-2003 indicated a slight, but not significant, downward trend for combined annual streamflow.
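The trend analysis described above can be sketched as an ordinary least squares fit of log-transformed annual flow against water year. This is only an illustration: the flow series is synthetic, the intercept is arbitrary, and base-10 logarithms are an assumption because the abstract does not state the base. The significance test reported in the abstract is not reproduced.

```python
import numpy as np

# Water years in the base period named in the abstract.
years = np.arange(1950, 2004)
t = years - years[0]                      # years since 1950

# Synthetic (hypothetical) log10 flows that follow the reported slope.
reported_slope = -0.00977
log_flow = 5.0 + reported_slope * t       # 5.0 is an arbitrary intercept

# Fit log10(flow) = intercept + slope * t by least squares.
slope, intercept = np.polyfit(t, log_flow, 1)

# Under the base-10 assumption, the fitted slope implies this ratio of
# end-of-period to start-of-period flow along the regression line.
total_ratio = 10 ** (slope * t[-1])
print(f"slope = {slope:.5f}, flow ratio 2003/1950 = {total_ratio:.2f}")
```

Because the response is a logarithm, the slope describes a constant proportional change per year rather than a constant change in acre-feet.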
Sophocleous, M.
2000-01-01
A practical methodology for recharge characterization was developed based on several years of field-oriented research at 10 sites in the Great Bend Prairie of south-central Kansas. This methodology combines the soil-water budget on a storm-by-storm year-round basis with the resulting watertable rises. The estimated 1985-1992 average annual recharge was less than 50 mm/year with a range from 15 mm/year (during the 1998 drought) to 178 mm/year (during the 1993 flood year). Most of this recharge occurs during the spring months. To regionalize these site-specific estimates, an additional methodology based on multiple (forward) regression analysis combined with classification and GIS overlay analyses was developed and implemented. The multiple regression analysis showed that the most influential variables were, in order of decreasing importance, total annual precipitation, average maximum springtime soil-profile water storage, average shallowest springtime depth to watertable, and average springtime precipitation rate. Therefore, four GIS (ARC/INFO) data "layers" or coverages were constructed for the study region based on these four variables, and each such coverage was classified into the same number of data classes to avoid biasing the results. The normalized regression coefficients were employed to weigh the class rankings of each recharge-affecting variable. This approach resulted in recharge zonations that agreed well with the site recharge estimates. During the "Great Flood of 1993," when rainfall totals exceeded normal levels by ~200% in the northern portion of the study region, the developed regionalization methodology was tested against such extreme conditions, and proved to be both practical, based on readily available or easily measurable data, and robust. It was concluded that the combination of multiple regression and GIS overlay analyses is a powerful and practical approach to regionalizing small samples of recharge estimates.
Forecasting Container Throughput at the Doraleh Port in Djibouti through Time Series Analysis
NASA Astrophysics Data System (ADS)
Mohamed Ismael, Hawa; Vandyck, George Kobina
The Doraleh Container Terminal (DCT) located in Djibouti has been noted as the most technologically advanced container terminal on the African continent. DCT's strategic location at the crossroads of the main shipping lanes connecting Asia, Africa and Europe puts it in a unique position to provide important shipping services to vessels plying that route. This paper aims to forecast container throughput through the Doraleh Container Port in Djibouti by Time Series Analysis. A selection of univariate forecasting models has been used, namely the Triple Exponential Smoothing Model, the Grey Model and the Linear Regression Model. By utilizing the above three models and their combination, the forecast of container throughput through the Doraleh port was realized. A comparison of the different forecasting results of the three models, in addition to the combination forecast, is then undertaken, based on the commonly used evaluation criteria Mean Absolute Deviation (MAD) and Mean Absolute Percentage Error (MAPE). The study found that the Linear Regression forecasting Model was the best prediction method for forecasting the container throughput, since its forecast error was the least. Based on the regression model, a ten (10) year forecast for container throughput at DCT has been made.
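The two evaluation criteria named above, MAD and MAPE, can be written down in a few lines. The sketch below uses their standard definitions; the function names and the throughput figures are hypothetical and are not taken from the paper.

```python
def mean_absolute_deviation(actual, forecast):
    """Average absolute forecast error, in the units of the data."""
    errors = [abs(a - f) for a, f in zip(actual, forecast)]
    return sum(errors) / len(errors)

def mean_absolute_percentage_error(actual, forecast):
    """Average absolute error relative to the actual value, in percent.

    Undefined when any actual value is zero.
    """
    ratios = [abs(a - f) / abs(a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(ratios) / len(ratios)

# Hypothetical throughput values (arbitrary units), NOT data from the paper.
actual = [100.0, 200.0, 400.0]
forecast = [110.0, 190.0, 400.0]
mad = mean_absolute_deviation(actual, forecast)
mape = mean_absolute_percentage_error(actual, forecast)
```

MAD keeps the scale of the series, while MAPE is scale-free, which is why the two are often reported together when comparing forecasting models.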
Van Belle, Vanya; Pelckmans, Kristiaan; Van Huffel, Sabine; Suykens, Johan A K
2011-10-01
To compare and evaluate ranking, regression and combined machine learning approaches for the analysis of survival data. The literature describes two approaches based on support vector machines to deal with censored observations. In the first approach the key idea is to rephrase the task as a ranking problem via the concordance index, a problem which can be solved efficiently in a context of structural risk minimization and convex optimization techniques. In a second approach, one uses a regression approach, dealing with censoring by means of inequality constraints. The goal of this paper is then twofold: (i) introducing a new model combining the ranking and regression strategy, which retains the link with existing survival models such as the proportional hazards model via transformation models; and (ii) comparison of the three techniques on 6 clinical and 3 high-dimensional datasets and discussing the relevance of these techniques over classical approaches for survival data. We compare svm-based survival models based on ranking constraints, based on regression constraints and models based on both ranking and regression constraints. The performance of the models is compared by means of three different measures: (i) the concordance index, measuring the model's discriminating ability; (ii) the logrank test statistic, indicating whether patients with a prognostic index lower than the median prognostic index have a significantly different survival than patients with a prognostic index higher than the median; and (iii) the hazard ratio after normalization to restrict the prognostic index between 0 and 1. Our results indicate a significantly better performance for models including regression constraints above models only based on ranking constraints. This work gives empirical evidence that svm-based models using regression constraints perform significantly better than svm-based models based on ranking constraints. 
Our experiments show a comparable performance for methods including only regression or both regression and ranking constraints on clinical data. On high dimensional data, the former model performs better. However, this approach does not have a theoretical link with standard statistical models for survival data. This link can be made by means of transformation models when ranking constraints are included. Copyright © 2011 Elsevier B.V. All rights reserved.
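The concordance index used as the first performance measure can be illustrated with a minimal sketch of a Harrell-style calculation for right-censored data. The function name and the toy data are hypothetical, and the sketch assumes a risk score where a higher value means an earlier expected event; a prognostic index with the opposite orientation would need to be negated first.

```python
def concordance_index(times, events, risk):
    """Fraction of comparable pairs ordered correctly by the risk score.

    times:  observed follow-up times.
    events: 1 if the event was observed, 0 if censored.
    risk:   model output; higher is assumed to mean earlier event.
    A pair (i, j) is comparable when subject i has an observed event
    and a strictly shorter time than subject j. Tied risk scores
    count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored subject cannot be the earlier member
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data (hypothetical): the third subject is censored.
times = [1, 2, 3, 4]
events = [1, 1, 0, 1]
c_good = concordance_index(times, events, [4, 3, 2, 1])
c_bad = concordance_index(times, events, [1, 2, 3, 4])
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of the comparable pairs.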
NASA Technical Reports Server (NTRS)
Waller, M. C.
1976-01-01
An electro-optical device called an oculometer which tracks a subject's lookpoint as a time function has been used to collect data in a real-time simulation study of instrument landing system (ILS) approaches. The data describing the scanning behavior of a pilot during the instrument approaches have been analyzed by use of a stepwise regression analysis technique. A statistically significant correlation between pilot workload, as indicated by pilot ratings, and scanning behavior has been established. In addition, it was demonstrated that parameters derived from the scanning behavior data can be combined in a mathematical equation to provide a good representation of pilot workload.
Pattullo, Venessa; Thein, Hla-Hla; Heathcote, Elizabeth Jenny; Guindi, Maha
2012-09-01
A fall in hepatic fibrosis stage may be observed in patients with chronic hepatitis C (CHC); however, parenchymal architectural changes may also signify hepatic remodelling associated with fibrosis regression. The aim of this study was to utilize semiquantitative and qualitative methods to report the prevalence and factors associated with fibrosis regression in CHC. Paired liver biopsies were scored for fibrosis (Ishak), and for the presence of eight qualitative features of parenchymal remodelling, to derive a qualitative regression score (QR score). Combined fibrosis regression was defined as ≥2-stage fall in Ishak stage (Reg-I) or <2-stage fall in Ishak stage with a rise in QR score (Reg-Qual). Among 159 patients (biopsy interval 5.4 ± 3.1 years), Reg-I was observed in 12 (7.5%) and Reg-Qual in 26 (16.4%) patients. The combined diagnostic criteria increased the diagnosis rate for fibrosis regression (38 patients, 23.9%) compared with use of Reg-I alone (P < 0.001). Combined fibrosis regression was observed in nine patients (50%) who achieved sustained virological response (SVR), and in 29 of 141 (21%) patients despite persistent viraemia. SVR was the only clinical factor associated independently with combined fibrosis regression (odds ratio 3.05). The combination of semiquantitative measures and qualitative features aids the identification of fibrosis regression in CHC. © 2012 Blackwell Publishing Ltd.
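The counts and percentages in this abstract can be checked against one another with a short calculation. The sketch below uses only figures quoted above, except for the number of patients with sustained virological response, which is not stated directly and is inferred here as 18 (9 patients being 50%, and 159 minus 141); that inference is an assumption.

```python
total = 159
reg_i = 12            # at least a 2-stage fall in Ishak stage
reg_qual = 26         # smaller fall in stage with a rise in QR score
combined = reg_i + reg_qual

def pct(n, d, digits=1):
    """Percentage of n out of d, rounded to the given number of digits."""
    return round(100.0 * n / d, digits)

viraemic_total = 141
svr_total = total - viraemic_total   # inferred, not stated in the abstract
svr_regressed = 9
viraemic_regressed = 29

print(pct(reg_i, total), pct(reg_qual, total), pct(combined, total))
print(pct(svr_regressed, svr_total), pct(viraemic_regressed, viraemic_total, 0))
```

The two subgroup counts (9 and 29) also sum to the 38 patients with combined fibrosis regression, so the reported figures are internally consistent.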
Kuiper, Gerhardus J A J M; Houben, Rik; Wetzels, Rick J H; Verhezen, Paul W M; Oerle, Rene van; Ten Cate, Hugo; Henskens, Yvonne M C; Lancé, Marcus D
2017-11-01
Low platelet counts and hematocrit levels hinder whole blood point-of-care testing of platelet function. Thus far, no reference ranges for MEA (multiple electrode aggregometry) and PFA-100 (platelet function analyzer 100) devices exist for low ranges. Through dilution methods of volunteer whole blood, platelet function at low ranges of platelet count and hematocrit levels was assessed on MEA for four agonists and for PFA-100 in two cartridges. Using (multiple) regression analysis, 95% reference intervals were computed for these low ranges. Low platelet counts affected MEA in a positive correlation (all agonists showed r² ≥ 0.75) and PFA-100 in an inverse correlation (closure times were prolonged with lower platelet counts). Lowered hematocrit did not affect MEA testing, except for arachidonic acid activation (ASPI), which showed a weak positive correlation (r² = 0.14). Closure time on PFA-100 testing was inversely correlated with hematocrit for both cartridges. Regression analysis revealed different 95% reference intervals in comparison with originally established intervals for both MEA and PFA-100 in low platelet or hematocrit conditions. Multiple regression analysis of ASPI and both tests on the PFA-100 for combined low platelet and hematocrit conditions revealed that only PFA-100 testing should be adjusted for both thrombocytopenia and anemia. 95% reference intervals were calculated using multiple regression analysis. However, coefficients of determination of PFA-100 were poor, and some variance remained unexplained. Thus, in this pilot study using (multiple) regression analysis, we could establish reference intervals of platelet function in anemia and thrombocytopenia conditions on PFA-100 and in thrombocytopenia conditions on MEA.
Geographical Text Analysis: A new approach to understanding nineteenth-century mortality.
Porter, Catherine; Atkinson, Paul; Gregory, Ian
2015-11-01
This paper uses a combination of Geographic Information Systems (GIS) and corpus linguistic analysis to extract and analyse disease related keywords from the Registrar-General's Decennial Supplements. Combined with known mortality figures, this provides, for the first time, a spatial picture of the relationship between the Registrar-General's discussion of disease and deaths in England and Wales in the nineteenth and early twentieth centuries. Techniques such as collocation, density analysis, the Hierarchical Regional Settlement matrix and regression analysis are employed to extract and analyse the data resulting in new insight into the relationship between the Registrar-General's published texts and the changing mortality patterns during this time. Copyright © 2015 Elsevier Ltd. All rights reserved.
du Toit, Lisa; Pillay, Viness; Choonara, Yahya
2010-01-01
Dissolution testing with subsequent analysis is considered an imperative tool for quality evaluation of the rifampicin-isoniazid (RIF-INH) combination. Partial least squares (PLS) regression has been successfully undertaken to select suitable predictor variables and to identify outliers for the generation of equations for RIF and INH determination in fixed-dose combinations (FDCs). The aim of this investigation was to ascertain the applicability of the described technique in testing a novel oral FDC anti-TB drug delivery system and currently available two-drug FDCs, in comparison to the United States Pharmacopeial method for analysis of RIF and INH Capsules with chromatographic determination of INH and colorimetric RIF determination. Regression equations generated employing the statistical coefficients satisfactorily predicted RIF release at each sampling point (R² ≥ 0.9350). There was an acceptable degree of correlation between the drug release data, as predicted by regressional analysis of UV spectrophotometric data, and chromatographic and colorimetric determination of INH (R² = 0.9793 and R² = 0.9739) and RIF (R² = 0.9976 and R² = 0.9996) for the two-drug FDC and the novel oral anti-TB drug delivery system, respectively. Regressional analysis of UV spectrophotometric data for simultaneous RIF and INH prediction thus provides a simplified methodology for use in diverse research settings for the assurance of RIF bioavailability from FDC formulations, specifically modified-release forms.
Tanpitukpongse, T P; Mazurowski, M A; Ikhena, J; Petrella, J R
2017-03-01
Alzheimer disease is a prevalent neurodegenerative disease. Computer assessment of brain atrophy patterns can help predict conversion to Alzheimer disease. Our aim was to assess the prognostic efficacy of individual-versus-combined regional volumetrics in 2 commercially available brain volumetric software packages for predicting conversion of patients with mild cognitive impairment to Alzheimer disease. Data were obtained through the Alzheimer's Disease Neuroimaging Initiative. One hundred ninety-two subjects (mean age, 74.8 years; 39% female) diagnosed with mild cognitive impairment at baseline were studied. All had T1-weighted MR imaging sequences at baseline and 3-year clinical follow-up. Analysis was performed with NeuroQuant and Neuroreader. Receiver operating characteristic curves assessing the prognostic efficacy of each software package were generated by using a univariable approach using individual regional brain volumes and 2 multivariable approaches (multiple regression and random forest), combining multiple volumes. On univariable analysis of 11 NeuroQuant and 11 Neuroreader regional volumes, hippocampal volume had the highest area under the curve for both software packages (0.69, NeuroQuant; 0.68, Neuroreader) and was not significantly different ( P > .05) between packages. Multivariable analysis did not increase the area under the curve for either package (0.63, logistic regression; 0.60, random forest NeuroQuant; 0.65, logistic regression; 0.62, random forest Neuroreader). Of the multiple regional volume measures available in FDA-cleared brain volumetric software packages, hippocampal volume remains the best single predictor of conversion of mild cognitive impairment to Alzheimer disease at 3-year follow-up. Combining volumetrics did not add additional prognostic efficacy. 
Therefore, future prognostic studies in mild cognitive impairment, combining such tools with demographic and other biomarker measures, are justified in using hippocampal volume as the only volumetric biomarker. © 2017 by American Journal of Neuroradiology.
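The areas under the curve quoted in this abstract have a direct rank-based reading: the AUC is the probability that a randomly chosen converter receives a higher risk score than a randomly chosen non-converter. The sketch below computes it by pairwise comparison. The function name and the toy scores are hypothetical, and using negated hippocampal volume as the risk score is an assumption made only for illustration.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    scores_pos: risk scores of subjects who had the event (e.g. converted).
    scores_neg: risk scores of subjects who did not.
    Ties between a positive and a negative count as half a win.
    """
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical volumes in mL; a smaller volume is treated as higher risk,
# so the score is the negated volume.
converters = [-3.1, -3.4, -2.9]
non_converters = [-3.6, -3.0, -3.8, -3.5]
print(auc(converters, non_converters))
```

An AUC of 0.5 corresponds to chance-level ranking, which is the baseline against which the reported 0.60 to 0.69 values can be read.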
Planning Skills in Autism Spectrum Disorder across the Lifespan: A Meta-Analysis and Meta-Regression
ERIC Educational Resources Information Center
Olde Dubbelink, Linda M. E.; Geurts, Hilde M.
2017-01-01
Individuals with an autism spectrum disorder (ASD) are thought to encounter planning difficulties, but experimental research regarding the mastery of planning in ASD is inconsistent. By means of a meta-analysis of 50 planning studies with a combined sample size of 1755 individuals with and 1642 without ASD, we aim to determine whether planning…
An Adaptive Cross-Architecture Combination Method for Graph Traversal
DOE Office of Scientific and Technical Information (OSTI.GOV)
You, Yang; Song, Shuaiwen; Kerbyson, Darren J.
2014-06-18
Breadth-First Search (BFS) is widely used in many real-world applications including computational biology, social networks, and electronic design automation. The combination method, using both top-down and bottom-up techniques, is the most effective BFS approach. However, current combination methods rely on trial-and-error and exhaustive search to locate the optimal switching point, which may cause significant runtime overhead. To solve this problem, we design an adaptive method based on regression analysis to predict an optimal switching point for the combination method at runtime within less than 0.1% of the BFS execution time.
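The combination method mentioned here alternates between two ways of expanding a BFS level. The following sketch shows the mechanics on an undirected graph. It is not the authors' implementation: the fixed `switch_fraction` threshold is a hypothetical stand-in for the switching point that the paper predicts with a regression model, and the function name is invented for this example.

```python
def hybrid_bfs(adj, source, switch_fraction=0.05):
    """BFS that switches between top-down and bottom-up level expansion.

    adj: dict mapping each vertex to a list of neighbours (undirected graph).
    switch_fraction: hypothetical threshold; when the frontier holds at least
        this fraction of all vertices, the bottom-up step is used.
    Returns a dict of hop distances from source.
    """
    n = len(adj)
    dist = {source: 0}
    frontier = {source}
    level = 0
    while frontier:
        level += 1
        next_frontier = set()
        if len(frontier) < switch_fraction * n:
            # Top-down: every frontier vertex scans its neighbours.
            for u in frontier:
                for v in adj[u]:
                    if v not in dist:
                        dist[v] = level
                        next_frontier.add(v)
        else:
            # Bottom-up: every unvisited vertex looks for a parent
            # in the current frontier and stops at the first hit.
            for v in adj:
                if v not in dist and any(u in frontier for u in adj[v]):
                    next_frontier.add(v)
            for v in next_frontier:
                dist[v] = level
        frontier = next_frontier
    return dist


graph = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(hybrid_bfs(graph, 0))
```

Both branches produce the same distances; the choice only affects how much work each level costs, which is why the location of the switching point matters for runtime.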
Meta-Analysis of the Reasoned Action Approach (RAA) to Understanding Health Behaviors.
McEachan, Rosemary; Taylor, Natalie; Harrison, Reema; Lawton, Rebecca; Gardner, Peter; Conner, Mark
2016-08-01
Reasoned action approach (RAA) includes subcomponents of attitude (experiential/instrumental), perceived norm (injunctive/descriptive), and perceived behavioral control (capacity/autonomy) to predict intention and behavior. To provide a meta-analysis of the RAA for health behaviors focusing on comparing the pairs of RAA subcomponents and differences between health protection and health-risk behaviors. The present research reports a meta-analysis of correlational tests of RAA subcomponents, examination of moderators, and combined effects of subcomponents on intention and behavior. Regressions were used to predict intention and behavior based on data from studies measuring all variables. Capacity and experiential attitude had large, and other constructs had small-medium-sized correlations with intention; all constructs except autonomy were significant independent predictors of intention in regressions. Intention, capacity, and experiential attitude had medium-large, and other constructs had small-medium-sized correlations with behavior; intention, capacity, experiential attitude, and descriptive norm were significant independent predictors of behavior in regressions. The RAA subcomponents have utility in predicting and understanding health behaviors.
Deep ensemble learning of sparse regression models for brain disease diagnosis.
Suk, Heung-Il; Lee, Seong-Whan; Shen, Dinggang
2017-04-01
Recent studies on brain imaging analysis witnessed the core roles of machine learning techniques in computer-assisted intervention for brain disease diagnosis. Of various machine-learning techniques, sparse regression models have proved their effectiveness in handling high-dimensional data but with a small number of training samples, especially in medical problems. In the meantime, deep learning methods have been making great successes by outperforming the state-of-the-art performances in various applications. In this paper, we propose a novel framework that combines the two conceptually different methods of sparse regression and deep learning for Alzheimer's disease/mild cognitive impairment diagnosis and prognosis. Specifically, we first train multiple sparse regression models, each of which is trained with different values of a regularization control parameter. Thus, our multiple sparse regression models potentially select different feature subsets from the original feature set; thereby they have different powers to predict the response values, i.e., clinical label and clinical scores in our work. By regarding the response values from our sparse regression models as target-level representations, we then build a deep convolutional neural network for clinical decision making, which thus we call 'Deep Ensemble Sparse Regression Network.' To our best knowledge, this is the first work that combines sparse regression models with deep neural network. In our experiments with the ADNI cohort, we validated the effectiveness of the proposed method by achieving the highest diagnostic accuracies in three classification tasks. We also rigorously analyzed our results and compared with the previous studies on the ADNI cohort in the literature. Copyright © 2017 Elsevier B.V. All rights reserved.
Deep ensemble learning of sparse regression models for brain disease diagnosis
Suk, Heung-Il; Lee, Seong-Whan; Shen, Dinggang
2018-01-01
Recent studies on brain imaging analysis witnessed the core roles of machine learning techniques in computer-assisted intervention for brain disease diagnosis. Of various machine-learning techniques, sparse regression models have proved their effectiveness in handling high-dimensional data but with a small number of training samples, especially in medical problems. In the meantime, deep learning methods have been making great successes by outperforming the state-of-the-art performances in various applications. In this paper, we propose a novel framework that combines the two conceptually different methods of sparse regression and deep learning for Alzheimer’s disease/mild cognitive impairment diagnosis and prognosis. Specifically, we first train multiple sparse regression models, each of which is trained with different values of a regularization control parameter. Thus, our multiple sparse regression models potentially select different feature subsets from the original feature set; thereby they have different powers to predict the response values, i.e., clinical label and clinical scores in our work. By regarding the response values from our sparse regression models as target-level representations, we then build a deep convolutional neural network for clinical decision making, which thus we call ‘ Deep Ensemble Sparse Regression Network.’ To our best knowledge, this is the first work that combines sparse regression models with deep neural network. In our experiments with the ADNI cohort, we validated the effectiveness of the proposed method by achieving the highest diagnostic accuracies in three classification tasks. We also rigorously analyzed our results and compared with the previous studies on the ADNI cohort in the literature. PMID:28167394
A psycholinguistic database for traditional Chinese character naming.
Chang, Ya-Ning; Hsu, Chun-Hsien; Tsai, Jie-Li; Chen, Chien-Liang; Lee, Chia-Ying
2016-03-01
In this study, we aimed to provide a large-scale set of psycholinguistic norms for 3,314 traditional Chinese characters, along with their naming reaction times (RTs), collected from 140 Chinese speakers. The lexical and semantic variables in the database include frequency, regularity, familiarity, consistency, number of strokes, homophone density, semantic ambiguity rating, phonetic combinability, semantic combinability, and the number of disyllabic compound words formed by a character. Multiple regression analyses were conducted to examine the predictive powers of these variables for the naming RTs. The results demonstrated that these variables could account for a significant portion of variance (55.8%) in the naming RTs. An additional multiple regression analysis was conducted to demonstrate the effects of consistency and character frequency. Overall, the regression results were consistent with the findings of previous studies on Chinese character naming. This database should be useful for research into Chinese language processing, Chinese education, or cross-linguistic comparisons. The database can be accessed via an online inquiry system (http://ball.ling.sinica.edu.tw/namingdatabase/index.html).
NASA Astrophysics Data System (ADS)
Abunama, Taher; Othman, Faridah
2017-06-01
Analysing the fluctuations of wastewater inflow rates in sewage treatment plants (STPs) is essential to guarantee sufficient treatment of wastewater before it is discharged to the environment. The main objectives of this study are to statistically analyze and forecast the wastewater inflow rates into the Bandar Tun Razak STP in Kuala Lumpur, Malaysia. A time series analysis of three years' weekly influent data (156 weeks) was conducted using the Auto-Regressive Integrated Moving Average (ARIMA) model. Various combinations of ARIMA orders (p, d, q) were tried to select the best-fitting model, which was then used to forecast the wastewater inflow rates. Linear regression analysis was applied to test the correlation between the observed and predicted influents. The ARIMA (3, 1, 3) model was selected because it had the highest significant R-square and the lowest normalized Bayesian Information Criterion (BIC) value, and the wastewater inflow rates were accordingly forecast for an additional 52 weeks. The linear regression analysis between the observed and predicted values of the wastewater inflow rates showed a positive linear correlation with a coefficient of 0.831.
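Two small building blocks of the workflow in this abstract are easy to illustrate without a time-series library: the first differencing implied by d = 1 in ARIMA (p, 1, q), and the correlation coefficient used to compare observed and predicted inflows. The sketch below is a minimal illustration with hypothetical function names and made-up numbers; it does not fit an ARIMA model.

```python
import math


def difference(series):
    """First difference of a series, as applied once when d = 1."""
    return [b - a for a, b in zip(series, series[1:])]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical weekly inflow values and model predictions.
observed = [100.0, 104.0, 103.0, 110.0, 108.0]
predicted = [101.0, 103.0, 105.0, 108.0, 109.0]
print(difference(observed))
print(pearson(observed, predicted))
```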
Lu, Chi-Jie; Chang, Chi-Chang
2014-01-01
Sales forecasting plays an important role in operating a business since it can be used to determine the required inventory level to meet consumer demand and avoid the problem of under/overstocking. Improving the accuracy of sales forecasting has become an important issue of operating a business. This study proposes a hybrid sales forecasting scheme by combining independent component analysis (ICA) with K-means clustering and support vector regression (SVR). The proposed scheme first uses the ICA to extract hidden information from the observed sales data. The extracted features are then applied to K-means algorithm for clustering the sales data into several disjoined clusters. Finally, the SVR forecasting models are applied to each group to generate final forecasting results. Experimental results from information technology (IT) product agent sales data reveal that the proposed sales forecasting scheme outperforms the three comparison models and hence provides an efficient alternative for sales forecasting.
PMID:25045738
AGR-1 Thermocouple Data Analysis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Jeff Einerson
2012-05-01
This report documents an effort to analyze measured and simulated data obtained in the Advanced Gas Reactor (AGR) fuel irradiation test program conducted in the INL's Advanced Test Reactor (ATR) to support the Next Generation Nuclear Plant (NGNP) R&D program. The work follows up on a previous study (Pham and Einerson, 2010), in which statistical analysis methods were applied for AGR-1 thermocouple data qualification. The present work exercises the idea that, while recognizing uncertainties inherent in physics and thermal simulations of the AGR-1 test, results of the numerical simulations can be used in combination with the statistical analysis methods to further improve qualification of measured data. Additionally, the combined analysis of measured and simulation data can generate insights about simulation model uncertainty that can be useful for model improvement. This report also describes an experimental control procedure to maintain fuel target temperature in the future AGR tests using regression relationships that include simulation results. The report is organized into four chapters. Chapter 1 introduces the AGR Fuel Development and Qualification program, AGR-1 test configuration and test procedure, overview of AGR-1 measured data, and overview of physics and thermal simulation, including modeling assumptions and uncertainties. A brief summary of statistical analysis methods developed in (Pham and Einerson 2010) for AGR-1 measured data qualification within NGNP Data Management and Analysis System (NDMAS) is also included for completeness. Chapters 2-3 describe and discuss cases in which the combined use of experimental and simulation data is realized. A set of issues associated with measurement and modeling uncertainties resulting from the combined analysis is identified.
This includes demonstration that such a combined analysis led to important insights for reducing uncertainty in presentation of AGR-1 measured data (Chapter 2) and interpretation of simulation results (Chapter 3). The statistics-based simulation-aided experimental control procedure described for the future AGR tests is developed and demonstrated in Chapter 4. The procedure for controlling the target fuel temperature (capsule peak or average) is based on regression functions of thermocouple readings and other relevant parameters and accounting for possible changes in both physical and thermal conditions and in instrument performance.
Guo, Canyong; Luo, Xuefang; Zhou, Xiaohua; Shi, Beijia; Wang, Juanjuan; Zhao, Jinqi; Zhang, Xiaoxia
2017-06-05
Vibrational spectroscopic techniques such as infrared, near-infrared and Raman spectroscopy have become popular in detecting and quantifying polymorphism of pharmaceutics since they are fast and non-destructive. This study assessed the ability of three vibrational spectroscopic techniques combined with multivariate analysis to quantify a low-content undesired polymorph within a binary polymorphic mixture. Partial least squares (PLS) regression and support vector machine (SVM) regression were employed to build quantitative models. Fusidic acid, a steroidal antibiotic, was used as the model compound. It was found that PLS regression performed slightly better than SVM regression for all three spectroscopic techniques. Root mean square errors of prediction (RMSEP) ranged from 0.48% to 1.17% for diffuse reflectance FTIR spectroscopy, from 1.60% to 1.93% for diffuse reflectance FT-NIR spectroscopy, and from 1.62% to 2.31% for Raman spectroscopy. The results indicate that diffuse reflectance FTIR spectroscopy offers significant advantages in providing accurate measurement of polymorphic content in the fusidic acid binary mixtures, while Raman spectroscopy is the least accurate technique for quantitative analysis of polymorphs. Copyright © 2017 Elsevier B.V. All rights reserved.
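The RMSEP figures used above to rank the three techniques follow the standard root-mean-square definition over a held-out prediction set. A minimal sketch is given below; the function name and the sample values are hypothetical and are not taken from the study.

```python
import math


def rmsep(reference, predicted):
    """Root mean square error of prediction over a validation set.

    reference: known values (e.g. polymorph content in % w/w).
    predicted: model predictions for the same samples.
    """
    if len(reference) != len(predicted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared_errors) / len(reference))


# Hypothetical validation samples, in % w/w of the minor polymorph.
known = [1.0, 2.0, 5.0, 10.0]
model = [1.4, 1.7, 5.5, 9.2]
print(rmsep(known, model))
```

Because RMSEP is expressed in the units of the measured quantity, a value such as 0.48% can be read directly as the typical prediction error in polymorph content.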
NASA Astrophysics Data System (ADS)
Koshigai, Masaru; Marui, Atsunao
The water table provides important information for evaluating groundwater resources. Recently, estimation of the water table over wide areas has been required for effective evaluation of groundwater resources. However, the evaluation process is hampered by technical and economic constraints. Regression analysis that predicts groundwater levels from geomorphologic and geologic conditions is considered a reliable tool for estimating the water table over a wide area. Groundwater level data were extracted from the public database of geotechnical information. Changes in groundwater level were observed to depend on climate conditions, and groundwater levels were confirmed to vary with geomorphologic and geologic conditions. The objective variable of the regression analysis was groundwater level; the explanatory variables were elevation and a dummy variable consisting of the group number. The constructed regression formula was significant according to the determination coefficients and the analysis of variance. Therefore, by combining the regression formula with a mesh map, a statistical method for estimating the water table from geomorphologic and geologic conditions for the whole country could be established.
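A regression of groundwater level on elevation plus group dummy variables amounts to fitting one common slope and a separate intercept per group. The sketch below shows one way to obtain those estimates with the within-group (demeaning) form of ordinary least squares. It assumes a single shared slope for elevation, which the abstract does not state explicitly, and all names and numbers are hypothetical.

```python
from collections import defaultdict


def fit_dummy_regression(elevation, group, level):
    """Least-squares fit of level = intercept[group] + slope * elevation.

    Equivalent to OLS with one dummy variable per group and a common slope.
    Returns (slope, {group: intercept}).
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # sum of x, sum of y, count
    for x, g, y in zip(elevation, group, level):
        s = sums[g]
        s[0] += x
        s[1] += y
        s[2] += 1
    means = {g: (sx / n, sy / n) for g, (sx, sy, n) in sums.items()}

    # Pool the within-group covariation to estimate the common slope.
    sxy = sxx = 0.0
    for x, g, y in zip(elevation, group, level):
        mean_x, mean_y = means[g]
        sxy += (x - mean_x) * (y - mean_y)
        sxx += (x - mean_x) ** 2
    slope = sxy / sxx

    intercepts = {g: my - slope * mx for g, (mx, my) in means.items()}
    return slope, intercepts


# Hypothetical wells: elevation (m), geologic group, groundwater level (m).
elev = [0.0, 10.0, 20.0, 0.0, 10.0]
grp = ["A", "A", "A", "B", "B"]
lvl = [2.0, 7.0, 12.0, 5.0, 10.0]
print(fit_dummy_regression(elev, grp, lvl))
```

Once fitted, the formula can be evaluated for every cell of a mesh map by looking up the cell's elevation and group, which is the combination step the abstract describes.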
1997-09-01
program include the ACEIT software training and the combination of Department of Defense (DOD) application, regression, and statistics. The weaknesses...and Integrated Tools ( ACEIT ) software and training could not be praised enough. AFIT vs. Civilian Institutions. The GCA program provides a Department...very useful to the graduates and beneficial to their careers. The main strengths of the program include the ACEIT software training and the combination
An evaluation of treatment strategies for head and neck cancer in an African American population.
Ignacio, D N; Griffin, J J; Daniel, M G; Serlemitsos-Day, M T; Lombardo, F A; Alleyne, T A
2013-07-01
This study evaluated treatment strategies for head and neck cancers in a predominantly African American population. Data were collected utilizing medical records and the tumour registry at the Howard University Hospital. Kaplan-Meier method was used for survival analysis and Cox proportional hazards regression analysis predicted the hazard of death. Analysis revealed that the main treatment strategy was radiation combined with platinum for all stages except stage I. Cetuximab was employed in only 1% of cases. Kaplan-Meier analysis revealed stage II patients had poorer outcome than stage IV while Cox proportional hazard regression analysis (p = 0.4662) showed that stage I had a significantly lower hazard of death than stage IV (HR = 0.314; p = 0.0272). Contributory factors included tobacco and alcohol but body mass index (BMI) was inversely related to hazard of death. There was no difference in survival using any treatment modality for African Americans.
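The Kaplan-Meier method named in this abstract estimates survival as a running product over the observed death times, with censored patients leaving the risk set without counting as deaths. The sketch below is a minimal illustration with a hypothetical function name and invented follow-up data; it does not reproduce the study's analysis.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times: follow-up time for each patient.
    events: 1 if the patient died at that time, 0 if censored.
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical follow-up in months; the second patient is censored.
print(kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1]))
```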
Kempe, P T; van Oppen, P; de Haan, E; Twisk, J W R; Sluis, A; Smit, J H; van Dyck, R; van Balkom, A J L M
2007-09-01
Two methods for predicting remissions in obsessive-compulsive disorder (OCD) treatment are evaluated. Y-BOCS measurements of 88 patients with a primary OCD (DSM-III-R) diagnosis were performed over a 16-week treatment period, and during three follow-ups. Remission at any measurement was defined as a Y-BOCS score lower than thirteen combined with a reduction of seven points when compared with baseline. Logistic regression models were compared with a Cox regression for recurrent events model. Logistic regression yielded different models at different evaluation times. The recurrent events model remained stable when fewer measurements were used. Higher baseline levels of neuroticism and more severe OCD symptoms were associated with a lower chance of remission, early age of onset and more depressive symptoms with a higher chance. Choice of outcome time affects logistic regression prediction models. Recurrent events analysis uses all information on remissions and relapses. Short- and long-term predictors for OCD remission show overlap.
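The remission definition in this abstract is a two-part rule and can be stated as a single predicate. In the sketch below the function name is hypothetical, and reading "a reduction of seven points" as "at least seven points" is an assumption.

```python
def is_remission(baseline: int, current: int,
                 threshold: int = 13, min_drop: int = 7) -> bool:
    """Remission per the abstract: Y-BOCS score below 13 combined with a
    reduction of seven points from baseline (assumed to mean at least seven).
    """
    return current < threshold and (baseline - current) >= min_drop


# Hypothetical patient: baseline 25, later scores at successive measurements.
scores = [22, 15, 12, 14, 11]
print([is_remission(25, s) for s in scores])
```

Applying the predicate at every measurement gives a sequence of remissions and relapses per patient, which is the kind of repeated outcome a recurrent events model uses and a single-time-point logistic regression does not.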
Weighted functional linear regression models for gene-based association analysis.
Belonogova, Nadezhda M; Svishcheva, Gulnara R; Wilson, James F; Campbell, Harry; Axenovich, Tatiana I
2018-01-01
Functional linear regression models are effectively used in gene-based association analysis of complex traits. These models combine information about individual genetic variants, taking into account their positions and reducing the influence of noise and/or observation errors. To increase the power of methods, where several differently informative components are combined, weights are introduced to give the advantage to more informative components. Allele-specific weights have been introduced to collapsing and kernel-based approaches to gene-based association analysis. Here we have for the first time introduced weights to functional linear regression models adapted for both independent and family samples. Using data simulated on the basis of GAW17 genotypes and weights defined by allele frequencies via the beta distribution, we demonstrated that type I errors correspond to declared values and that increasing the weights of causal variants allows the power of functional linear models to be increased. We applied the new method to real data on blood pressure from the ORCADES sample. Five of the six known genes with P < 0.1 in at least one analysis had lower P values with weighted models. Moreover, we found an association between diastolic blood pressure and the VMP1 gene (P = 8.18×10⁻⁶), when we used a weighted functional model. For this gene, the unweighted functional and weighted kernel-based models had P = 0.004 and 0.006, respectively. The new method has been implemented in the program package FREGAT, which is freely available at https://cran.r-project.org/web/packages/FREGAT/index.html.
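Weights "defined by allele frequencies via the beta distribution" are typically obtained by evaluating a beta density at each variant's minor allele frequency (MAF), so that rarer variants receive larger weights. The sketch below illustrates this idea only. The shape parameters a = 1 and b = 25 are an assumption (a choice often seen in kernel-based tests), not values taken from the abstract, and the function names are hypothetical rather than part of FREGAT.

```python
import math


def beta_pdf(x: float, a: float, b: float) -> float:
    """Density of the Beta(a, b) distribution at x, for 0 < x < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(x) + (b - 1) * math.log(1 - x))


def variant_weights(mafs, a=1.0, b=25.0):
    """Weight for each variant from its minor allele frequency.

    a and b are assumed shape parameters chosen for illustration.
    """
    return [beta_pdf(maf, a, b) for maf in mafs]


# Hypothetical MAFs: a rare, a low-frequency and a common variant.
print(variant_weights([0.01, 0.05, 0.30]))
```

With these parameters the density decreases monotonically in MAF, so the rare variant dominates the weighted combination while the common one contributes almost nothing.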
Seo, Chang-Seob; Kim, Seong-Sil; Ha, Hyekyung
2013-01-01
This study was designed to perform simultaneous determination of three reference compounds in Syzygium aromaticum (SA), gallic acid, ellagic acid, and eugenol, and to investigate the chemical antagonistic effect when combining Curcuma aromatica (CA) with SA, based on chromatographic analysis. The values of LODs and LOQs were 0.01–0.11 μg/mL and 0.03–0.36 μg/mL, respectively. The intraday and interday precisions, expressed as RSD values, were <3.0%, and the recovery was in the range of 92.19–103.24%, with RSD values <3.0%. Repeatability and stability were 0.38–0.73% and 0.49–2.24%, respectively. When the contents of the reference compounds and the relative peaks were compared between SA and SA combined with CA (SAC), the amounts of gallic acid and eugenol were increased, while that of ellagic acid was decreased in SAC, and most of the peak areas in SA were reduced in SAC. Regression analysis of the relative peak areas between SA and SAC showed r² values >0.87, indicating a linear relationship between SA and SAC. These results demonstrate that the components contained in CA could affect the extraction of components of SA mainly in a decreasing manner. The antagonistic effect of CA on SA was verified by chemical analysis. PMID:23878761
NASA Astrophysics Data System (ADS)
Anggraeni, Anni; Arianto, Fernando; Mutalib, Abdul; Pratomo, Uji; Bahti, Husein H.
2017-05-01
Rare earth elements (REE) have many uses, for example in metallurgy, optical devices, and the manufacture of electronic devices. REE occur in minerals, in which the individual elements have similar properties. Currently, REE content is determined with instruments such as ICP-OES, ICP-MS, XRF, and HPLC, but each of these instruments still has some weaknesses. An alternative analytical method for determining rare earth metal content is therefore needed; one option is a combination of UV-Visible spectrophotometry and multivariate analysis, including Principal Component Analysis (PCA), Principal Component Regression (PCR), and Partial Least Squares Regression (PLSR). The purpose of this experiment was to determine the content of light and medium rare earth elements in the mineral monazite without chemical separation by using a combination of multivariate analysis and UV-Visible spectrophotometric methods. A training set of 22 concentration variations was created and its absorbance was measured using a UV-Vis spectrophotometer; the data were then processed by PCA, PCR, and PLSR. The results were compared and validated to obtain the mathematical equation with the smallest percent error. After validation, the equation obtained with the PLS method was better than that obtained with PCR, with RMSE values for La, Ce, Pr, Nd, Gd, Sm, Eu, and Tb of 0.095, 0.573, 0.538, 0.440, 3.387, 1.240, 1.870, and 0.639, respectively.
Multivariate meta-analysis for non-linear and other multi-parameter associations
Gasparrini, A; Armstrong, B; Kenward, M G
2012-01-01
In this paper, we formalize the application of multivariate meta-analysis and meta-regression to synthesize estimates of multi-parameter associations obtained from different studies. This modelling approach extends the standard two-stage analysis used to combine results across different sub-groups or populations. The most straightforward application is for the meta-analysis of non-linear relationships, described for example by regression coefficients of splines or other functions, but the methodology easily generalizes to any setting where complex associations are described by multiple correlated parameters. The modelling framework of multivariate meta-analysis is implemented in the package mvmeta within the statistical environment R. As an illustrative example, we propose a two-stage analysis for investigating the non-linear exposure–response relationship between temperature and non-accidental mortality using time-series data from multiple cities. Multivariate meta-analysis represents a useful analytical tool for studying complex associations through a two-stage procedure. Copyright © 2012 John Wiley & Sons, Ltd. PMID:22807043
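The second stage of a two-stage analysis pools study-specific estimates using their precision as weights. The sketch below shows the simplest case, a univariate fixed-effect pool of a single coefficient. This is a deliberate simplification: the paper's framework handles vectors of correlated parameters with full covariance matrices, and the Python function here is hypothetical and unrelated to the R package mvmeta.

```python
def pool_fixed_effect(estimates, variances):
    """Inverse-variance (fixed-effect) pooling of one parameter.

    estimates: first-stage estimates, one per study or city.
    variances: their estimated sampling variances.
    Returns (pooled estimate, variance of the pooled estimate).
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    return pooled, 1.0 / total


# Hypothetical first-stage slopes from three cities with their variances.
print(pool_fixed_effect([0.8, 1.1, 0.9], [0.04, 0.09, 0.02]))
```

In the multivariate version the scalar weights become inverse covariance matrices, which is what allows spline coefficients describing a non-linear exposure-response curve to be combined jointly rather than one at a time.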
Empirical Likelihood in Nonignorable Covariate-Missing Data Problems.
Xie, Yanmei; Zhang, Biao
2017-04-20
Missing covariate data occurs often in regression analysis, which frequently arises in the health and social sciences as well as in survey sampling. We study methods for the analysis of a nonignorable covariate-missing data problem in an assumed conditional mean function when some covariates are completely observed but other covariates are missing for some subjects. We adopt the semiparametric perspective of Bartlett et al. (Improving upon the efficiency of complete case analysis when covariates are MNAR. Biostatistics 2014;15:719-30) on regression analyses with nonignorable missing covariates, in which they have introduced the use of two working models, the working probability model of missingness and the working conditional score model. In this paper, we study an empirical likelihood approach to nonignorable covariate-missing data problems with the objective of effectively utilizing the two working models in the analysis of covariate-missing data. We propose a unified approach to constructing a system of unbiased estimating equations, where there are more equations than unknown parameters of interest. One useful feature of these unbiased estimating equations is that they naturally incorporate the incomplete data into the data analysis, making it possible to seek efficient estimation of the parameter of interest even when the working regression function is not specified to be the optimal regression function. We apply the general methodology of empirical likelihood to optimally combine these unbiased estimating equations. We propose three maximum empirical likelihood estimators of the underlying regression parameters and compare their efficiencies with other existing competitors. We present a simulation study to compare the finite-sample performance of various methods with respect to bias, efficiency, and robustness to model misspecification. 
The proposed empirical likelihood method is also illustrated by an analysis of a data set from the US National Health and Nutrition Examination Survey (NHANES).
Ahmadi, Mehdi; Shahlaei, Mohsen
2015-01-01
P2X7 antagonist activity for a set of 49 molecules of the P2X7 receptor antagonists, derivatives of purine, was modeled with the aid of chemometric and artificial intelligence techniques. The activity of these compounds was estimated by means of combination of principal component analysis (PCA), as a well-known data reduction method, genetic algorithm (GA), as a variable selection technique, and artificial neural network (ANN), as a non-linear modeling method. First, a linear regression, combined with PCA, (principal component regression) was operated to model the structure–activity relationships, and afterwards a combination of PCA and ANN algorithm was employed to accurately predict the biological activity of the P2X7 antagonist. PCA preserves as much of the information as possible contained in the original data set. Seven most important PC's to the studied activity were selected as the inputs of ANN box by an efficient variable selection method, GA. The best computational neural network model was a fully-connected, feed-forward model with 7−7−1 architecture. The developed ANN model was fully evaluated by different validation techniques, including internal and external validation, and chemical applicability domain. All validations showed that the constructed quantitative structure–activity relationship model suggested is robust and satisfactory. PMID:26600858
Bayesian Analysis of Nonlinear Structural Equation Models with Nonignorable Missing Data
ERIC Educational Resources Information Center
Lee, Sik-Yum
2006-01-01
A Bayesian approach is developed for analyzing nonlinear structural equation models with nonignorable missing data. The nonignorable missingness mechanism is specified by a logistic regression model. A hybrid algorithm that combines the Gibbs sampler and the Metropolis-Hastings algorithm is used to produce the joint Bayesian estimates of…
2011-01-01
Introduction The purpose of this study was to explore a data set of patients with fibromyalgia (FM), rheumatoid arthritis (RA) and systemic lupus erythematosus (SLE) who completed the Revised Fibromyalgia Impact Questionnaire (FIQR) and its variant, the Symptom Impact Questionnaire (SIQR), for discriminating features that could be used to differentiate FM from RA and SLE in clinical surveys. Methods The frequency and means of comparing FM, RA and SLE patients on all pain sites and SIQR variables were calculated. Multiple regression analysis was then conducted to identify the significant pain sites and SIQR predictors of group membership. Thereafter stepwise multiple regression analysis was performed to identify the order of variables in predicting their maximal statistical contribution to group membership. Partial correlations assessed their unique contribution, and, last, two-group discriminant analysis provided a classification table. Results The data set contained information on the SIQR and also pain locations in 202 FM, 31 RA and 20 SLE patients. As the SIQR and pain locations did not differ much between the RA and SLE patients, they were grouped together (RA/SLE) to provide a more robust analysis. The combination of eight SIQR items and seven pain sites correctly classified 99% of FM and 90% of RA/SLE patients in a two-group discriminant analysis. The largest reported SIQR differences (FM minus RA/SLE) were seen for the parameters "tenderness to touch," "difficulty cleaning floors" and "discomfort on sitting for 45 minutes." Combining the SIQR and pain locations in a stepwise multiple regression analysis revealed that the seven most important predictors of group membership were mid-lower back pain (29%; 79% vs. 16%), tenderness to touch (11.5%; 6.86 vs. 3.02), neck pain (6.8%; 91% vs. 39%), hand pain (5%; 64% vs. 77%), arm pain (3%; 69% vs. 18%), outer lower back pain (1.7%; 80% vs. 22%) and sitting for 45 minutes (1.4%; 5.56 vs. 1.49). 
Conclusions A combination of two SIQR questions ("tenderness to touch" and "difficulty sitting for 45 minutes") plus pain in the lower back, neck, hands and arms may be useful in the construction of clinical questionnaires designed for patients with musculoskeletal pain. This combination provided the correct diagnosis in 97% of patients, with only 7 of 253 patients misclassified. PMID:21477308
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kim, J.; Moon, T.J.; Howell, J.R.
This paper presents an analysis of the heat transfer occurring during an in-situ curing process for which infrared energy is provided on the surface of polymer composite during winding. The material system is Hercules prepreg AS4/3501-6. Thermoset composites have an exothermic chemical reaction during the curing process. An Eulerian thermochemical model is developed for the heat transfer analysis of helical winding. The model incorporates heat generation due to the chemical reaction. Several assumptions are made leading to a two-dimensional, thermochemical model. For simplicity, 360° heating around the mandrel is considered. In order to generate the appropriate process windows, the developed heat transfer model is combined with a simple winding time model. The process windows allow for a proper selection of process variables such as infrared energy input and winding velocity to give a desired end-product state. Steady-state temperatures are found for each combination of the process variables. A regression analysis is carried out to relate the process variables to the resulting steady-state temperatures. Using regression equations, process windows for a wide range of cylinder diameters are found. A general procedure to find process windows for Hercules AS4/3501-6 prepreg tape is coded in a FORTRAN program.
Lee, Soon Jae; Cho, Yoo-Kyung; Na, Soo-Young; Choi, Eun Kwang; Boo, Sun Jin; Jeong, Seung Uk; Song, Hyung Joo; Kim, Heung Up; Kim, Bong Soo; Song, Byung-Cheol
2016-09-01
Some recent studies have found regression of liver cirrhosis after antiviral therapy in patients with hepatitis C virus (HCV)-related liver cirrhosis, but there have been no reports of complete regression of esophageal varices after interferon/peg-interferon and ribavirin combination therapy. We describe two cases of complete regression of esophageal varices and splenomegaly after interferon-alpha and ribavirin combination therapy in patients with HCV-related liver cirrhosis. Esophageal varices and splenomegaly regressed after 3 and 8 years of sustained virologic responses in cases 1 and 2, respectively. To our knowledge, this is the first study demonstrating that complications of liver cirrhosis, such as esophageal varices and splenomegaly, can regress after antiviral therapy in patients with HCV-related liver cirrhosis.
Zhang, Yixiang; Liang, Xinqiang; Wang, Zhibo; Xu, Lixian
2015-01-01
High content of organic matter in the downstream of watersheds underscored the severity of non-point source (NPS) pollution. The major objectives of this study were to characterize and quantify dissolved organic matter (DOM) in watersheds affected by NPS pollution, and to apply self-organizing map (SOM) and parallel factor analysis (PARAFAC) to assess fluorescence properties as proxy indicators for NPS pollution and labor-intensive routine water quality indicators. Water from upstreams and downstreams was sampled to measure dissolved organic carbon (DOC) concentrations and excitation-emission matrix (EEM). Five fluorescence components were modeled with PARAFAC. The regression analysis between PARAFAC intensities (Fmax) and raw EEM measurements indicated that several raw fluorescence measurements at target excitation-emission wavelength region could provide similar DOM information to massive EEM measurements combined with PARAFAC. Regression analysis between DOC concentration and raw EEM measurements suggested that some regions in raw EEM could be used as surrogates for labor-intensive routine indicators. SOM can be used to visualize the occurrence of pollution. Relationship between DOC concentration and PARAFAC components analyzed with SOM suggested that PARAFAC component 2 might be the major part of bulk DOC and could be recognized as a proxy indicator to predict the DOC concentration. PMID:26526140
Arenja, Nisha; Riffel, Johannes H; Fritz, Thomas; André, Florian; Aus dem Siepen, Fabian; Mueller-Hennessen, Matthias; Giannitsis, Evangelos; Katus, Hugo A; Friedrich, Matthias G; Buss, Sebastian J
2017-06-01
Purpose To assess the utility of established functional markers versus two additional functional markers derived from standard cardiovascular magnetic resonance (MR) images for their incremental diagnostic and prognostic information in patients with nonischemic dilated cardiomyopathy (NIDCM). Materials and Methods Approval was obtained from the local ethics committee. MR images from 453 patients with NIDCM and 150 healthy control subjects were included between 2005 and 2013 and were analyzed retrospectively. Myocardial contraction fraction (MCF) was calculated by dividing left ventricular (LV) stroke volume by LV myocardial volume, and long-axis strain (LAS) was calculated from the distances between the epicardial border of the LV apex and the midpoint of a line connecting the origins of the mitral valve leaflets at end systole and end diastole. Receiver operating characteristic curve, Kaplan-Meier method, Cox regression, and classification and regression tree (CART) analyses were performed for diagnostic and prognostic performances. Results LAS (area under the receiver operating characteristic curve [AUC] = 0.93, P < .001) and MCF (AUC = 0.92, P < .001) can be used to discriminate patients with NIDCM from age- and sex-matched control subjects. A total of 97 patients reached the combined end point during a median follow-up of 4.8 years. In multivariate Cox regression analysis, only LV ejection fraction (EF) and LAS independently indicated the combined end point (hazard ratio = 2.8 and 1.9, respectively; P < .001 for both). In a risk stratification approach with classification and regression tree analysis, combined LV EF and LAS cutoff values were used to stratify patients into three risk groups (log-rank test, P < .001). Conclusion Cardiovascular MR-derived MCF and LAS serve as reliable diagnostic and prognostic markers in patients with NIDCM. 
LAS, as a marker for longitudinal contractile function, is an independent parameter for outcome and offers incremental information beyond LV EF and the presence of myocardial fibrosis. © RSNA, 2017 Online supplemental material is available for this article.
Predictive Validity of National Basketball Association Draft Combine on Future Performance.
Teramoto, Masaru; Cross, Chad L; Rieger, Randall H; Maak, Travis G; Willick, Stuart E
2018-02-01
Teramoto, M, Cross, CL, Rieger, RH, Maak, TG, and Willick, SE. Predictive validity of national basketball association draft combine on future performance. J Strength Cond Res 32(2): 396-408, 2018. The National Basketball Association (NBA) Draft Combine is an annual event where prospective players are evaluated in terms of their athletic abilities and basketball skills. Data collected at the Combine should help NBA teams select the right players for the upcoming NBA draft; however, its value for predicting future performance of players has not been examined. This study investigated predictive validity of the NBA Draft Combine on future performance of basketball players. We performed a principal component analysis (PCA) on the 2010-2015 Combine data to reduce correlated variables (N = 234), a correlation analysis on the Combine data and future on-court performance to examine relationships (maximum pairwise N = 217), and a robust principal component regression (PCR) analysis to predict first-year and 3-year on-court performance from the Combine measures (N = 148 and 127, respectively). Three components were identified within the Combine data through PCA (= Combine subscales): length-size, power-quickness, and upper-body strength. As per the correlation analysis, the individual Combine items for anthropometrics, including height without shoes, standing reach, weight, wingspan, and hand length, as well as the Combine subscale of length-size, had positive, medium-to-large-sized correlations (r = 0.313-0.545) with defensive performance quantified by Defensive Box Plus/Minus. The robust PCR analysis showed that the Combine subscale of length-size was a predictor most significantly associated with future on-court performance (p ≤ 0.05), including Win Shares, Box Plus/Minus, and Value Over Replacement Player, followed by upper-body strength. In conclusion, the NBA Draft Combine has value for predicting future performance of players.
Kabir, Alamgir; Rahman, Md Jahanur; Shamim, Abu Ahmed; Klemm, Rolf D W; Labrique, Alain B; Rashid, Mahbubur; Christian, Parul; West, Keith P
2017-01-01
Birth weight, length and circumferences of the head, chest and arm are key measures of newborn size and health in developing countries. We assessed maternal socio-demographic factors associated with multiple measures of newborn size in a large rural population in Bangladesh using partial least squares (PLS) regression method. PLS regression, combining features from principal component analysis and multiple linear regression, is a multivariate technique with an ability to handle multicollinearity while simultaneously handling multiple dependent variables. We analyzed maternal and infant data from singletons (n = 14,506) born during a double-masked, cluster-randomized, placebo-controlled maternal vitamin A or β-carotene supplementation trial in rural northwest Bangladesh. PLS regression results identified numerous maternal factors (parity, age, early pregnancy MUAC, living standard index, years of education, number of antenatal care visits, preterm delivery and infant sex) significantly (p<0.001) associated with newborn size. Among them, preterm delivery had the largest negative influence on newborn size (Standardized β = -0.29 − -0.19; p<0.001). Scatter plots of the scores of first two PLS components also revealed an interaction between newborn sex and preterm delivery on birth size. PLS regression was found to be more parsimonious than both ordinary least squares regression and principal component regression. It also provided more stable estimates than the ordinary least squares regression and provided the effect measure of the covariates with greater accuracy as it accounts for the correlation among the covariates and outcomes. Therefore, PLS regression is recommended when either there are multiple outcome measurements in the same study, or the covariates are correlated, or both situations exist in a dataset. PMID:29261760
Van Houtven, George; Powers, John; Jessup, Amber; Yang, Jui-Chen
2006-08-01
Many economists argue that willingness-to-pay (WTP) measures are most appropriate for assessing the welfare effects of health changes. Nevertheless, the health evaluation literature is still dominated by studies estimating nonmonetary health status measures (HSMs), which are often used to assess changes in quality-adjusted life years (QALYs). Using meta-regression analysis, this paper combines results from both WTP and HSM studies applied to acute morbidity, and it tests whether a systematic relationship exists between HSM and WTP estimates. We analyze over 230 WTP estimates from 17 different studies and find evidence that QALY-based estimates of illness severity--as measured by the Quality of Well-Being (QWB) Scale--are significant factors in explaining variation in WTP, as are changes in the duration of illness and the average income and age of the study populations. In addition, we test and reject the assumption of a constant WTP per QALY gain. We also demonstrate how the estimated meta-regression equations can serve as benefit transfer functions for policy analysis. By specifying the change in duration and severity of the acute illness and the characteristics of the affected population, we apply the regression functions to predict average WTP per case avoided. Copyright 2006 John Wiley & Sons, Ltd.
Cameron, Isobel M; Scott, Neil W; Adler, Mats; Reid, Ian C
2014-12-01
It is important for clinical practice and research that measurement scales of well-being and quality of life exhibit only minimal differential item functioning (DIF). DIF occurs where different groups of people endorse items in a scale to different extents after being matched by the intended scale attribute. We investigate the equivalence or otherwise of common methods of assessing DIF. Three methods of measuring age- and sex-related DIF (ordinal logistic regression, Rasch analysis and Mantel χ² procedure) were applied to Hospital Anxiety Depression Scale (HADS) data pertaining to a sample of 1,068 patients consulting primary care practitioners. Three items were flagged by all three approaches as having either age- or sex-related DIF with a consistent direction of effect; a further three items identified did not meet stricter criteria for important DIF using at least one method. When applying strict criteria for significant DIF, ordinal logistic regression was slightly less sensitive. Ordinal logistic regression, Rasch analysis and contingency table methods yielded consistent results when identifying DIF in the HADS depression and HADS anxiety scales. Regardless of methods applied, investigators should use a combination of statistical significance, magnitude of the DIF effect and investigator judgement when interpreting the results.
Wagner, Daniel M.; Krieger, Joshua D.; Veilleux, Andrea G.
2016-08-04
In 2013, the U.S. Geological Survey initiated a study to update regional skew, annual exceedance probability discharges, and regional regression equations used to estimate annual exceedance probability discharges for ungaged locations on streams in the study area with the use of recent geospatial data, new analytical methods, and available annual peak-discharge data through the 2013 water year. An analysis of regional skew using Bayesian weighted least-squares/Bayesian generalized-least squares regression was performed for Arkansas, Louisiana, and parts of Missouri and Oklahoma. The newly developed constant regional skew of -0.17 was used in the computation of annual exceedance probability discharges for 281 streamgages used in the regional regression analysis. Based on analysis of covariance, four flood regions were identified for use in the generation of regional regression models. Thirty-nine basin characteristics were considered as potential explanatory variables, and ordinary least-squares regression techniques were used to determine the optimum combinations of basin characteristics for each of the four regions. Basin characteristics in candidate models were evaluated based on multicollinearity with other basin characteristics (variance inflation factor < 2.5) and statistical significance at the 95-percent confidence level (p ≤ 0.05). Generalized least-squares regression was used to develop the final regression models for each flood region. Average standard errors of prediction of the generalized least-squares models ranged from 32.76 to 59.53 percent, with the largest range in flood region D. Pseudo coefficients of determination of the generalized least-squares models ranged from 90.29 to 97.28 percent, with the largest range also in flood region D. The regional regression equations apply only to locations on streams in Arkansas where annual peak discharges are not substantially affected by regulation, diversion, channelization, backwater, or urbanization. 
The applicability and accuracy of the regional regression equations depend on the basin characteristics measured for an ungaged location on a stream being within range of those used to develop the equations.
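The multicollinearity screen mentioned above (variance inflation factor < 2.5) can be illustrated for the two-predictor case, where the VIF reduces to 1 / (1 - r^2) with r the Pearson correlation between the two predictors. This is a sketch only: the variable names and values are made up and are not basin characteristics from the study, and with more than two predictors r^2 would be replaced by the R^2 from regressing each predictor on all the others.

```python
import math

def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def vif_two_predictors(x1, x2):
    """VIF when the model has exactly two predictors: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)

# Hypothetical, unitless values for two candidate predictors.
drainage_area = [1, 2, 3, 4, 5]
channel_slope = [2, 1, 4, 3, 5]

vif = vif_two_predictors(drainage_area, channel_slope)
passes_screen = vif < 2.5   # threshold quoted in the abstract
print(round(vif, 2), passes_screen)
```

Here r = 0.8, so the VIF is 1/0.36 (about 2.78) and the pair would fail a 2.5 threshold.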
Cognition and Error in Student Writing
ERIC Educational Resources Information Center
Perrault, S. T.
2011-01-01
The author integrates work from cognitive and developmental psychology with studies in writing in order to explain why the quality of student writing sometimes appears to regress to earlier or less proficient levels. Insights from this combined analysis are applied to explain how and why to use specific Writing Across the Curriculum strategies to…
We show that a conditional probability analysis that utilizes a stressor-response model based on a logistic regression provides a useful approach for developing candidate water quality criteria from empirical data. The critical step in this approach is transforming the response ...
Updated generalized biomass equations for North American tree species
David C. Chojnacky; Linda S. Heath; Jennifer C. Jenkins
2014-01-01
Historically, tree biomass at large scales has been estimated by applying dimensional analysis techniques and field measurements such as diameter at breast height (dbh) in allometric regression equations. Equations often have been developed using differing methods and applied only to certain species or isolated areas. We previously had compiled and combined (in meta-...
NASA Astrophysics Data System (ADS)
Yoshida, Kenichiro; Nishidate, Izumi; Ojima, Nobutoshi; Iwata, Kayoko
2014-01-01
To quantitatively evaluate skin chromophores over a wide region of curved skin surface, we propose an approach that suppresses the effect of the shading-derived error in the reflectance on the estimation of chromophore concentrations, without sacrificing the accuracy of that estimation. In our method, we use multiple regression analysis, assuming the absorbance spectrum as the response variable and the extinction coefficients of melanin, oxygenated hemoglobin, and deoxygenated hemoglobin as the predictor variables. The concentrations of melanin and total hemoglobin are determined from the multiple regression coefficients using compensation formulae (CF) based on the diffuse reflectance spectra derived from a Monte Carlo simulation. To suppress the shading-derived error, we investigated three different combinations of multiple regression coefficients for the CF. In vivo measurements with the forearm skin demonstrated that the proposed approach can reduce the estimation errors that are due to shading-derived errors in the reflectance. With the best combination of multiple regression coefficients, we estimated that the ratio of the error to the chromophore concentrations is about 10%. The proposed method does not require any measurements or assumptions about the shape of the subjects; this is an advantage over other studies related to the reduction of shading-derived errors.
On the use of log-transformation vs. nonlinear regression for analyzing biological power laws.
Xiao, Xiao; White, Ethan P; Hooten, Mevin B; Durham, Susan L
2011-10-01
Power-law relationships are among the most well-studied functional relationships in biology. Recently the common practice of fitting power laws using linear regression (LR) on log-transformed data has been criticized, calling into question the conclusions of hundreds of studies. It has been suggested that nonlinear regression (NLR) is preferable, but no rigorous comparison of these two methods has been conducted. Using Monte Carlo simulations, we demonstrate that the error distribution determines which method performs better, with NLR better characterizing data with additive, homoscedastic, normal error and LR better characterizing data with multiplicative, heteroscedastic, lognormal error. Analysis of 471 biological power laws shows that both forms of error occur in nature. While previous analyses based on log-transformation appear to be generally valid, future analyses should choose methods based on a combination of biological plausibility and analysis of the error distribution. We provide detailed guidelines and associated computer code for doing so, including a model averaging approach for cases where the error structure is uncertain.
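The two fitting routes compared above can be sketched as follows. This is not the authors' code: it is a minimal illustration, assuming a power law y = a * x^b, with linear regression on log-transformed data (LR) done in closed form and nonlinear regression (NLR) done with a hand-written Gauss-Newton loop. The data and function names are hypothetical.

```python
import math

def fit_loglog(xs, ys):
    """LR route: ordinary least squares of log(y) on log(x)."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(xs)
    mx, my = sum(lx) / n, sum(ly) / n
    sxx = sum((u - mx) ** 2 for u in lx)
    sxy = sum((u - mx) * (v - my) for u, v in zip(lx, ly))
    b = sxy / sxx
    a = math.exp(my - b * mx)
    return a, b

def fit_nlr(xs, ys, a0, b0, iters=50):
    """NLR route: Gauss-Newton least squares on the original scale."""
    a, b = a0, b0
    for _ in range(iters):
        ja = [x ** b for x in xs]                        # d(model)/da
        jb = [a * (x ** b) * math.log(x) for x in xs]    # d(model)/db
        res = [y - a * (x ** b) for x, y in zip(xs, ys)]
        # Solve the 2x2 normal equations (J^T J) delta = J^T res.
        saa = sum(u * u for u in ja)
        sab = sum(u * v for u, v in zip(ja, jb))
        sbb = sum(v * v for v in jb)
        ga = sum(u * e for u, e in zip(ja, res))
        gb = sum(v * e for v, e in zip(jb, res))
        det = saa * sbb - sab * sab
        if det == 0:
            break
        da = (sbb * ga - sab * gb) / det
        db = (saa * gb - sab * ga) / det
        a, b = a + da, b + db
        if abs(da) + abs(db) < 1e-12:
            break
    return a, b

xs = [1, 2, 4, 8, 16]
exact = [2.0 * x ** 0.75 for x in xs]                    # noiseless power law
a_lr, b_lr = fit_loglog(xs, exact)
a_nlr, b_nlr = fit_nlr(xs, exact, a_lr, b_lr)

# Made-up multiplicative errors: the two routes now give different estimates.
noisy = [y * f for y, f in zip(exact, [1.1, 0.9, 1.05, 0.95, 1.0])]
print(fit_loglog(xs, noisy), fit_nlr(xs, noisy, a_lr, b_lr))
```

On noiseless data both routes recover the same parameters; they diverge once error is added, which is where the assumed error structure (multiplicative versus additive) decides which estimate to trust.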
Jiang, Wei; Xu, Chao-Zhen; Jiang, Si-Zhi; Zhang, Tang-Duo; Wang, Shi-Zhen; Fang, Bai-Shan
2017-04-01
L-tert-Leucine (L-Tle) and its derivatives are extensively used as crucial building blocks for chiral auxiliaries, pharmaceutically active ingredients, and ligands. Combining with formate dehydrogenase (FDH) for regenerating the expensive coenzyme NADH, leucine dehydrogenase (LeuDH) is continually used for synthesizing L-Tle from α-keto acid. A multilevel factorial experimental design was executed for research of this system. In this work, an efficient optimization method for improving the productivity of L-Tle was developed. And the mathematical model between different fermentation conditions and L-Tle yield was also determined in the form of the equation by using uniform design and regression analysis. The multivariate regression equation was conveniently implemented in water, with a space time yield of 505.9 g L⁻¹ day⁻¹ and an enantiomeric excess value of >99%. These results demonstrated that this method might become an ideal protocol for industrial production of chiral compounds and unnatural amino acids such as chiral drug intermediates.
Above-ground biomass of mangrove species. I. Analysis of models
NASA Astrophysics Data System (ADS)
Soares, Mário Luiz Gomes; Schaeffer-Novelli, Yara
2005-10-01
This study analyzes the above-ground biomass of Rhizophora mangle and Laguncularia racemosa located in the mangroves of Bertioga (SP) and Guaratiba (RJ), Southeast Brazil. Its purpose is to determine the best regression model to estimate the total above-ground biomass and compartment (leaves, reproductive parts, twigs, branches, trunk and prop roots) biomass, indirectly. To do this, we used structural measurements such as height, diameter at breast-height (DBH), and crown area. A combination of regression types with several compositions of independent variables generated 2,272 models that were later tested. Subsequent analysis of the models indicated that the biomass of reproductive parts, branches, and prop roots yielded great variability, probably because of environmental factors and seasonality (in the case of reproductive parts). It also indicated the superiority of multiple regression to estimate above-ground biomass as it allows researchers to consider several aspects that affect above-ground biomass, especially the influence of environmental factors. This was confirmed by the models that estimated the biomass of crown compartments.
Song, Minju; Kang, Minji; Kang, Dae Ryong; Jung, Hoi In; Kim, Euiseong
2018-05-01
The purpose of this retrospective clinical study was to evaluate the effect of lesion types related to endodontic microsurgery on the clinical outcome. Patients who underwent endodontic microsurgery between March 2001 and March 2014 with a postoperative follow-up period of at least 1 year were included in the study. Survival analyses were conducted to compare the clinical outcomes between isolated endodontic lesion group (endo group) and endodontic-periodontal combined lesion group (endo-perio group) and to evaluate other clinical variables. To reduce the effect of selection bias in this study, the estimated propensity scores were used to match the cases of the endo group with those of the endo-perio group. Among the 414 eligible cases, the 83 cases in the endo-perio group were matched to 166 out of the 331 cases in the endo group based on propensity score matching (PSM). The cumulated success rates of the endo and endo-perio groups were 87.3 and 72.3%, respectively. The median success period of the endo-perio group was 12 years (95% CI: 5.507, 18.498). Lesion type was found to be significant according to both Log-rank test (P = 0.002) and Cox proportional hazard regression analysis (P = 0.001). Among the other clinical variables, sex (female or male), age, and tooth type (anterior, premolar, or molar) were determined to be significant in Cox regression analysis (P < 0.05). Endodontic-periodontal combined lesions had a negative effect on the clinical outcome based on an analysis that utilized PSM, a useful statistical matching method for observational studies. Lesion type is a significant predictor of the outcome of endodontic microsurgery.
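The matching step described above (83 endo-perio cases matched to 166 endo cases, i.e. two controls per case) can be sketched as follows. The abstract does not state which matching algorithm was used, so this assumes greedy nearest-neighbour matching without replacement on already-estimated propensity scores. The case identifiers and scores are made up.

```python
def match_nearest(treated, controls, ratio=2):
    """Greedy nearest-neighbour matching without replacement.

    treated, controls: dicts mapping case id -> propensity score.
    Returns a dict mapping each treated id to a list of `ratio` control ids.
    """
    available = dict(controls)          # copy so the caller's dict is untouched
    matches = {}
    # Process treated cases in order of increasing score.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        nearest = sorted(available,
                         key=lambda c: abs(available[c] - t_score))[:ratio]
        matches[t_id] = nearest
        for c_id in nearest:            # each control is used at most once
            del available[c_id]
    return matches

# Hypothetical propensity scores for a toy data set.
endo_perio = {"T1": 0.30, "T2": 0.70}
endo = {"C1": 0.28, "C2": 0.35, "C3": 0.50,
        "C4": 0.66, "C5": 0.74, "C6": 0.10}

matches = match_nearest(endo_perio, endo, ratio=2)
print(matches)
```

In practice a caliper (a maximum allowed score difference) is often added so that poor matches are dropped rather than forced; that refinement is omitted here.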
Coates, Laura C; Walsh, Jessica; Haroon, Muhammad; FitzGerald, Oliver; Aslam, Tariq; Al Balushi, Farida; Burden, A D; Burden-Teh, Esther; Caperon, Anna R; Cerio, Rino; Chattopadhyay, Chandrabhusan; Chinoy, Hector; Goodfield, Mark J D; Kay, Lesley; Kelly, Stephen; Kirkham, Bruce W; Lovell, Christopher R; Marzo-Ortega, Helena; McHugh, Neil; Murphy, Ruth; Reynolds, Nick J; Smith, Catherine H; Stewart, Elizabeth J C; Warren, Richard B; Waxman, Robin; Wilson, Hilary E; Helliwell, Philip S
2014-09-01
Several questionnaires have been developed to screen for psoriatic arthritis (PsA), but head-to-head studies have found limitations. This study aimed to develop new questionnaires encompassing the most discriminative questions from existing instruments. Data from the CONTEST study, a head-to-head comparison of 3 existing questionnaires, were used to identify items with a Youden index score of ≥0.1. These were combined using 4 approaches: CONTEST (simple additions of questions), CONTESTw (weighting using logistic regression), CONTESTjt (addition of a joint manikin), and CONTESTtree (additional questions identified by classification and regression tree [CART] analysis). These candidate questionnaires were tested in independent data sets. Twelve individual questions with a Youden index score of ≥0.1 were identified, but 4 of these were excluded due to duplication and redundancy. Weighting for 2 of these questions was included in CONTESTw. Receiver operating characteristic (ROC) curve analysis showed that involvement in 6 joint areas on the manikin was predictive of PsA for inclusion in CONTESTjt. CART analysis identified a further 5 questions for inclusion in CONTESTtree. CONTESTtree was not significant on ROC curve analysis and discarded. The other 3 questionnaires were significant in all data sets, although CONTESTw was slightly inferior to the others in the validation data sets. Potential cut points for referral were also discussed. Of 4 candidate questionnaires combining existing discriminatory items to identify PsA in people with psoriasis, 3 were found to be significant on ROC curve analysis. Testing in independent data sets identified 2 questionnaires (CONTEST and CONTESTjt) that should be pursued for further prospective testing. Copyright © 2014 by the American College of Rheumatology.
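The item-selection step described above (keeping questions with a Youden index of at least 0.1) can be sketched as follows. This is a minimal illustration, not the CONTEST authors' code: the function names and the counts in `example_counts` are hypothetical and invented for this example only.

```python
def youden_index(tp: int, fn: int, tn: int, fp: int) -> float:
    """Youden's J = sensitivity + specificity - 1 for one yes/no item."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity + specificity - 1.0


def select_items(counts: dict, threshold: float = 0.1) -> list:
    """Return the names of items whose Youden index meets the threshold."""
    return [name for name, c in counts.items() if youden_index(*c) >= threshold]


# Hypothetical (tp, fn, tn, fp) counts, invented purely for illustration.
example_counts = {
    "item_a": (30, 20, 80, 20),  # J = 0.60 + 0.80 - 1 = 0.40, kept
    "item_b": (26, 24, 50, 50),  # J = 0.52 + 0.50 - 1 = 0.02, dropped
}
selected = select_items(example_counts)
```

The 0.1 default mirrors the cut-off stated in the abstract; everything else is an assumption.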
Chen, Carla Chia-Ming; Schwender, Holger; Keith, Jonathan; Nunkesser, Robin; Mengersen, Kerrie; Macrossan, Paula
2011-01-01
Due to advancements in computational ability, enhanced technology and a reduction in the price of genotyping, more data are being generated for understanding genetic associations with diseases and disorders. However, with the availability of large data sets come the inherent challenges of new methods of statistical analysis and modeling. Considering that a complex phenotype may be the effect of a combination of multiple loci, various statistical methods have been developed for identifying genetic epistasis effects. Among these methods, logic regression (LR) is an intriguing approach incorporating tree-like structures. Various methods have built on the original LR to improve different aspects of the model. In this study, we review four variations of LR, namely Logic Feature Selection, Monte Carlo Logic Regression, Genetic Programming for Association Studies, and Modified Logic Regression-Gene Expression Programming, and investigate the performance of each method using simulated and real genotype data. We contrast these with another tree-like approach, namely Random Forests, and a Bayesian logistic regression with stochastic search variable selection.
Linden, Ariel
2018-04-01
Interrupted time series analysis (ITSA) is an evaluation methodology in which a single treatment unit's outcome is studied over time and the intervention is expected to "interrupt" the level and/or trend of the outcome. The internal validity is strengthened considerably when the treated unit is contrasted with a comparable control group. In this paper, we introduce a robust evaluation framework that combines the synthetic controls method (SYNTH) to generate a comparable control group and ITSA regression to assess covariate balance and estimate treatment effects. We evaluate the effect of California's Proposition 99 for reducing cigarette sales, by comparing California to other states not exposed to smoking reduction initiatives. SYNTH is used to reweight nontreated units to make them comparable to the treated unit. These weights are then used in ITSA regression models to assess covariate balance and estimate treatment effects. Covariate balance was achieved for all but one covariate. While California experienced a significant decrease in the annual trend of cigarette sales after Proposition 99, there was no statistically significant treatment effect when compared to synthetic controls. The advantage of using this framework over regression alone is that it ensures that a comparable control group is generated. Additionally, it offers a common set of statistical measures familiar to investigators, the capability for assessing covariate balance, and enhancement of the evaluation with a comprehensive set of postestimation measures. Therefore, this robust framework should be considered as a primary approach for evaluating treatment effects in multiple group time series analysis. © 2018 John Wiley & Sons, Ltd.
Neural Network and Regression Methods Demonstrated in the Design Optimization of a Subsonic Aircraft
NASA Technical Reports Server (NTRS)
Hopkins, Dale A.; Lavelle, Thomas M.; Patnaik, Surya
2003-01-01
The neural network and regression methods of NASA Glenn Research Center's COMETBOARDS design optimization testbed were used to generate approximate analysis and design models for a subsonic aircraft operating at Mach 0.85 cruise speed. The analytical model is defined by nine design variables: wing aspect ratio, engine thrust, wing area, sweep angle, chord-thickness ratio, turbine temperature, pressure ratio, bypass ratio, fan pressure; and eight response parameters: weight, landing velocity, takeoff and landing field lengths, approach thrust, overall efficiency, and compressor pressure and temperature. The variables were adjusted to optimally balance the engines to the airframe. The solution strategy included a sensitivity model and the soft analysis model. Researchers generated the sensitivity model by training the approximators to predict an optimum design. The trained neural network predicted all response variables within 5-percent error. This was reduced to 1 percent by the regression method. The soft analysis model was developed to replace aircraft analysis as the reanalyzer in design optimization. Soft models have been generated for a neural network method, a regression method, and a hybrid method obtained by combining the approximators. The performance of the models is graphed for aircraft weight versus thrust as well as for wing area and turbine temperature. The regression method followed the analytical solution with little error. The neural network exhibited 5-percent maximum error over all parameters. Performance of the hybrid method was intermediate in comparison to the individual approximators. Error in the response variable is smaller than that shown in the figure because of a distortion scale factor.
The overall performance of the approximators was considered to be satisfactory because aircraft analysis with NASA Langley Research Center's FLOPS (Flight Optimization System) code is a synthesis of diverse disciplines: weight estimation, aerodynamic analysis, engine cycle analysis, propulsion data interpolation, mission performance, airfield length for landing and takeoff, noise footprint, and others.
Low, Gary Kim-Kuan; Ogston, Simon A; Yong, Mun-Hin; Gan, Seng-Chiew; Chee, Hui-Yee
2018-06-01
Since the introduction of the 2009 WHO dengue case classification, no literature has been found regarding its effect on dengue deaths. This study aimed to evaluate the effect of the 2009 WHO dengue case classification on the dengue case fatality rate. Various databases were searched for relevant articles published since 1995. Studies were included if they were cohort or cross-sectional studies, covered all patients with dengue infection, and reported the number of deaths or the case fatality rate. The Joanna Briggs Institute appraisal checklist was used to evaluate the risk of bias of the full texts. The studies were grouped according to the classification adopted: WHO 1997 and WHO 2009. Meta-regression was employed using a logistic transformation (log-odds) of the case fatality rate. The results of the meta-regression were the adjusted case fatality rate and the odds ratios for the explanatory variables. A total of 77 studies were included in the meta-regression analysis. The case fatality rate for all studies combined was 1.14% with a 95% confidence interval (CI) of 0.82-1.58%. The combined (unadjusted) case fatality rate for the 69 studies that adopted the WHO 1997 dengue case classification was 1.09% with a 95% CI of 0.77-1.55%; for the eight studies with WHO 2009 it was 1.62% with a 95% CI of 0.64-4.02%. The unadjusted and adjusted odds ratios of case fatality using the WHO 2009 dengue case classification were 1.49 (95% CI: 0.52, 4.24) and 0.83 (95% CI: 0.26, 2.63), respectively, compared to the WHO 1997 dengue case classification. There was an apparent increasing trend in the case fatality rate from 1992 to 2016. Neither was statistically significant. The WHO 2009 dengue case classification might have no effect on the case fatality rate, although the adjusted results indicated a lower case fatality rate. Future studies are required to update the meta-regression analysis and confirm the findings. Copyright © 2018 Elsevier B.V. All rights reserved.
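The log-odds (logit) transformation mentioned in the abstract, and the back-transformation of an interval to the rate scale, can be sketched as below. The helper names are hypothetical and this is not the study's analysis code; the only figures taken from the abstract are the two unadjusted case fatality rates (1.62% and 1.09%), used here to show how a crude odds ratio follows from two proportions.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a proportion p, with 0 < p < 1."""
    return math.log(p / (1.0 - p))


def inv_logit(x: float) -> float:
    """Map a log-odds value back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


def ci_on_rate_scale(log_odds: float, se: float, z: float = 1.96) -> tuple:
    """Wald interval built on the log-odds scale, then back-transformed."""
    return inv_logit(log_odds - z * se), inv_logit(log_odds + z * se)


def odds_ratio(p_exposed: float, p_reference: float) -> float:
    """Crude odds ratio implied by two proportions."""
    return math.exp(logit(p_exposed) - logit(p_reference))


# Unadjusted rates quoted in the abstract: WHO 2009 vs WHO 1997.
crude_or = odds_ratio(0.0162, 0.0109)  # about 1.49
```

Working on the log-odds scale keeps back-transformed interval limits inside (0, 1), which is one common reason for choosing it with rates this close to zero.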
NASA Astrophysics Data System (ADS)
Maguen, Ezra I.; Papaioannou, Thanassis; Nesburn, Anthony B.; Salz, James J.; Warren, Cathy; Grundfest, Warren S.
1996-05-01
Multivariable regression analysis was used to evaluate the combined effects of some preoperative and operative variables on the change of refraction following excimer laser photorefractive keratectomy for myopia (PRK). This analysis was performed on 152 eyes (at 6 months postoperatively) and 156 eyes (at 12 months postoperatively). The following variables were considered: intended refractive correction, patient age, treatment zone, central corneal thickness, average corneal curvature, and intraocular pressure. At 6 months after surgery, the cumulative R2 was 0.43 with 0.38 attributed to the intended correction and 0.06 attributed to the preoperative corneal curvature. At 12 months, the cumulative R2 was 0.37 where 0.33 was attributed to the intended correction, 0.02 to the preoperative corneal curvature, and 0.01 to both preoperative corneal thickness and to the patient age. Further model augmentation is necessary to account for the remaining variability and the behavior of the residuals.
Zhang, Xiaona; Sun, Xiaoxuan; Wang, Junhong; Tang, Liou; Xie, Anmu
2017-01-01
Rapid eye movement sleep behavior disorder (RBD) is thought to be one of the most frequent preceding symptoms of Parkinson's disease (PD). However, the prevalence of RBD in PD reported in published studies is still inconsistent. We conducted a meta-analysis and meta-regression analysis to estimate the pooled prevalence. We searched the electronic databases of PubMed, ScienceDirect, EMBASE and EBSCO up to June 2016 for related articles. STATA 12.0 statistics software was used to analyze the available data from each study. The prevalence of RBD in PD patients in each study was combined into a pooled prevalence with a 95% confidence interval (CI). Subgroup analysis and meta-regression analysis were performed to search for the causes of the heterogeneity. A total of 28 studies with 6869 PD cases were deemed eligible and included in our meta-analysis based on the inclusion and exclusion criteria. The pooled prevalence of RBD in PD was 42.3% (95% CI 37.4-47.1%). In subgroup analysis and meta-regression analysis, we found that the important causes of heterogeneity were the diagnostic criteria for RBD and the age of PD patients (P = 0.016, P = 0.019, respectively). The results indicate that nearly half of PD patients suffer from RBD. Older age and longer duration are risk factors for RBD in PD. The minimal diagnostic criteria for RBD according to the International Classification of Sleep Disorders can be used to diagnose RBD patients in daily practice if polysomnography is not necessary.
NASA Astrophysics Data System (ADS)
Liu, Yande; Ying, Yibin; Lu, Huishan; Fu, Xiaping
2005-11-01
A new method is proposed to eliminate the varying background and noise simultaneously for multivariate calibration of Fourier transform near infrared (FT-NIR) spectral signals. An ideal spectrum signal prototype was constructed based on the FT-NIR spectrum of fruit sugar content measurement. The performances of wavelet-based threshold de-noising approaches using different combinations of wavelet base functions were compared. Three families of wavelet base function (Daubechies, Symlets and Coiflets) were applied to estimate the performance of those wavelet bases and threshold selection rules in a series of experiments. The experimental results show that the best de-noising performance is reached with the Daubechies 4 or Symlet 4 wavelet base function. Based on the optimized parameters, wavelet regression models for the sugar content of pear were also developed and resulted in a smaller prediction error than a traditional Partial Least Squares Regression (PLSR) model.
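The general idea of wavelet threshold de-noising (transform, shrink the detail coefficients, invert) can be sketched in a few lines. Note the substitution: the abstract reports Daubechies 4 and Symlet 4 bases, whereas this dependency-free sketch uses a single-level Haar transform with soft thresholding, chosen only because it is short enough to write out by hand. The function names, the threshold and the sample values are all hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)


def soft_threshold(value: float, threshold: float) -> float:
    """Shrink a coefficient toward zero by `threshold` (soft thresholding)."""
    magnitude = abs(value) - threshold
    return math.copysign(magnitude, value) if magnitude > 0 else 0.0


def haar_forward(signal: list) -> tuple:
    """Single-level orthonormal Haar transform of an even-length signal."""
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / SQRT2 for a, b in pairs]
    detail = [(a - b) / SQRT2 for a, b in pairs]
    return approx, detail


def haar_inverse(approx: list, detail: list) -> list:
    """Invert haar_forward."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return out


def denoise(signal: list, threshold: float) -> list:
    """Threshold the detail coefficients and reconstruct the signal."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    approx, detail = haar_forward(signal)
    return haar_inverse(approx, [soft_threshold(d, threshold) for d in detail])


# Hypothetical four-point "spectrum"; a large threshold removes all detail.
noisy = [1.0, 1.2, 0.9, 1.1]
smoothed = denoise(noisy, threshold=1.0)
```

In practice a multi-level decomposition with the wavelet family under study would replace the Haar step; a wavelet library would normally be used for that rather than hand-written filters.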
Procedures for adjusting regional regression models of urban-runoff quality using local data
Hoos, A.B.; Sisolak, J.K.
1993-01-01
Statistical operations termed model-adjustment procedures (MAPs) can be used to incorporate local data into existing regression models to improve the prediction of urban-runoff quality. Each MAP is a form of regression analysis in which the local data base is used as a calibration data set. Regression coefficients are determined from the local data base, and the resulting 'adjusted' regression models can then be used to predict storm-runoff quality at unmonitored sites. The response variable in the regression analyses is the observed load or mean concentration of a constituent in storm runoff for a single storm. The set of explanatory variables used in the regression analyses is different for each MAP, but always includes the predicted value of load or mean concentration from a regional regression model. The four MAPs examined in this study were: single-factor regression against the regional model prediction, P (termed MAP-1F-P), regression against P (termed MAP-R-P), regression against P and additional local variables (termed MAP-R-P+nV), and a weighted combination of P and a local-regression prediction (termed MAP-W). The procedures were tested by means of split-sample analysis, using data from three cities included in the Nationwide Urban Runoff Program: Denver, Colorado; Bellevue, Washington; and Knoxville, Tennessee. The MAP that provided the greatest predictive accuracy for the verification data set differed among the three test data bases and among model types (MAP-W for Denver and Knoxville, MAP-1F-P and MAP-R-P for Bellevue load models, and MAP-R-P+nV for Bellevue concentration models) and, in many cases, was not clearly indicated by the values of standard error of estimate for the calibration data set. A scheme to guide MAP selection, based on exploratory data analysis of the calibration data set, is presented and tested. The MAPs were tested for sensitivity to the size of a calibration data set. As expected, predictive accuracy of all MAPs for the verification data set decreased as the calibration data-set size decreased, but predictive accuracy was not as sensitive for the MAPs as it was for the local regression models.
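Two of the ideas in this abstract, regressing local observations on the regional model prediction P and forming a weighted combination of P with a local-regression prediction, can be sketched as follows. This is only an illustration of the general technique: the abstract does not give the model forms, any data transformation, or how the MAP-W weight is chosen, so the function names, the fixed `weight` argument and the numbers are assumptions.

```python
def fit_adjustment(regional_pred: list, observed: list) -> tuple:
    """Ordinary least squares fit of observed = intercept + slope * P."""
    n = len(regional_pred)
    mean_p = sum(regional_pred) / n
    mean_y = sum(observed) / n
    sxx = sum((p - mean_p) ** 2 for p in regional_pred)
    sxy = sum((p - mean_p) * (y - mean_y)
              for p, y in zip(regional_pred, observed))
    slope = sxy / sxx
    return mean_y - slope * mean_p, slope


def adjusted_prediction(p: float, intercept: float, slope: float) -> float:
    """Apply the locally calibrated adjustment to a regional prediction."""
    return intercept + slope * p


def weighted_combination(p_regional: float, p_local: float,
                         weight: float) -> float:
    """Blend regional and local predictions; `weight` is assumed given."""
    return weight * p_regional + (1.0 - weight) * p_local


# Hypothetical calibration data: regional predictions and local observations.
regional = [1.0, 2.0, 3.0, 4.0]
local_obs = [2.5, 4.5, 6.5, 8.5]
intercept, slope = fit_adjustment(regional, local_obs)  # 0.5 and 2.0
blended = weighted_combination(10.0, 20.0, weight=0.25)  # 17.5
```

The calibrated pair (intercept, slope) would then be applied to regional predictions at unmonitored sites, which is the use the abstract describes for the adjusted models.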
Knowledge Systems and Value Chain Integration: The Case of Linseed Production in Ethiopia
ERIC Educational Resources Information Center
Chagwiza, Clarietta; Muradian, Roldan; Ruben, Ruerd
2017-01-01
Purpose: This study uses data from a sample of 150 oilseed farming households from Arsi Robe, Ethiopia, to assess the impact of different knowledge bases (education, training and experience) and their interactions on linseed productivity. Methodology: A multiple regression analysis was employed to assess the combined effect of the knowledge bases,…
Speech prosody impairment predicts cognitive decline in Parkinson's disease.
Rektorova, Irena; Mekyska, Jiri; Janousova, Eva; Kostalova, Milena; Eliasova, Ilona; Mrackova, Martina; Berankova, Dagmar; Necasova, Tereza; Smekal, Zdenek; Marecek, Radek
2016-08-01
Impairment of speech prosody is characteristic for Parkinson's disease (PD) and does not respond well to dopaminergic treatment. We assessed whether baseline acoustic parameters, alone or in combination with other predominantly non-dopaminergic symptoms may predict global cognitive decline as measured by the Addenbrooke's cognitive examination (ACE-R) and/or worsening of cognitive status as assessed by a detailed neuropsychological examination. Forty-four consecutive non-depressed PD patients underwent clinical and cognitive testing, and acoustic voice analysis at baseline and at the two-year follow-up. Influence of speech and other clinical parameters on worsening of the ACE-R and of the cognitive status was analyzed using linear and logistic regression. The cognitive status (classified as normal cognition, mild cognitive impairment and dementia) deteriorated in 25% of patients during the follow-up. The multivariate linear regression model consisted of the variation in range of the fundamental voice frequency (F0VR) and the REM Sleep Behavioral Disorder Screening Questionnaire (RBDSQ). These parameters explained 37.2% of the variability of the change in ACE-R. The most significant predictors in the univariate logistic regression were the speech index of rhythmicity (SPIR; p = 0.012), disease duration (p = 0.019), and the RBDSQ (p = 0.032). The multivariate regression analysis revealed that SPIR alone led to 73.2% accuracy in predicting a change in cognitive status. Combining SPIR with RBDSQ improved the prediction accuracy of SPIR alone by 7.3%. Impairment of speech prosody together with symptoms of RBD predicted rapid cognitive decline and worsening of PD cognitive status during a two-year period. Copyright © 2016 Elsevier Ltd. All rights reserved.
J Cerqueira, Rui; Melo, Renata; Moreira, Soraia; A Saraiva, Francisca; Andrade, Marta; Salgueiro, Elson; Almeida, Jorge; J Amorim, Mário; Pinho, Paulo; Lourenço, André; F Leite-Moreira, Adelino
2017-01-01
To compare stentless Freedom Solo and stented Trifecta aortic bioprostheses regarding hemodynamic profile, left ventricular mass regression, early and late postoperative outcomes and survival. Longitudinal cohort study of consecutive patients undergoing aortic valve replacement (from 2009 to 2016) with either Freedom Solo or Trifecta at one centre. Local databases and national records were queried. Postoperative echocardiography (3-6 months) was obtained for hemodynamic profile (mean transprosthetic gradient and effective orifice area) and left ventricle mass determination. After propensity score matching (21 covariates), Kaplan-Meier analysis and cumulative incidence analysis were performed for survival and combined outcome of structural valve deterioration and endocarditis, respectively. Hemodynamics and left ventricle mass regression were assessed by a mixed-effects model including propensity score as a covariate. From a total sample of 397 Freedom Solo and 525 Trifecta patients with a median follow-up time of 4.0 (2.2-6.0) and 2.4 (1.4-3.7) years, respectively, a matched sample of 329 pairs was obtained. Well-balanced matched groups showed no difference in survival (hazard ratio=1.04, 95% confidence interval=0.69-1.56) or cumulative hazards of combined outcome (subhazard ratio=0.54, 95% confidence interval=0.21-1.39). Although Trifecta showed improved hemodynamic profile compared to Freedom Solo, no differences were found in left ventricle mass regression. Trifecta has a slightly improved hemodynamic profile compared to Freedom Solo but this does not translate into differences in the extent of mass regression, postoperative outcomes or survival, which were good and comparable for both bioprostheses. Long-term follow-up is needed for comparisons with older models of bioprostheses.
Goldstein, Alisa M; Dondon, Marie-Gabrielle; Andrieu, Nadine
2006-08-01
A design combining both related and unrelated controls, named the case-combined-control design, was recently proposed to increase the power for detecting gene-environment (GxE) interaction. Under a conditional analytic approach, the case-combined-control design appeared to be more efficient and feasible than a classical case-control study for detecting interaction involving rare events. We now propose an unconditional analytic strategy to further increase the power for detecting gene-environment (GxE) interactions. This strategy allows the estimation of GxE interaction and exposure (E) main effects under certain assumptions (e.g. no correlation in E between siblings and the same exposure frequency in both control groups). Only the genetic (G) main effect cannot be estimated because it is biased. Using simulations, we show that unconditional logistic regression analysis is often more efficient than conditional analysis for detecting GxE interaction, particularly for a rare gene and strong effects. The unconditional analysis is also at least as efficient as the conditional analysis when the gene is common and the main and joint effects of E and G are small. Under the required assumptions, the unconditional analysis retains more information than does the conditional analysis for which only discordant case-control pairs are informative leading to more precise estimates of the odds ratios.
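The closing remark that only discordant case-control pairs are informative in a conditional analysis can be illustrated with the simplest setting: 1:1 matched pairs and a binary exposure, where the conditional maximum-likelihood odds ratio is the ratio of the two discordant-pair counts. This sketch is not the authors' GxE model; the function name and the pair data are hypothetical.

```python
def matched_pair_odds_ratio(pairs: list) -> float:
    """Conditional odds ratio for 1:1 matched pairs with a binary exposure.

    Each pair is (case_exposed, control_exposed). Concordant pairs are
    ignored; only discordant pairs enter the estimate.
    """
    case_only = sum(1 for case, control in pairs if case and not control)
    control_only = sum(1 for case, control in pairs if control and not case)
    if control_only == 0:
        raise ValueError("no pairs with only the control exposed")
    return case_only / control_only


# Hypothetical data: 6 case-only, 3 control-only, 10 concordant pairs.
discordant = [(True, False)] * 6 + [(False, True)] * 3
concordant = [(True, True)] * 4 + [(False, False)] * 6
or_all = matched_pair_odds_ratio(discordant + concordant)  # 2.0
or_discordant_only = matched_pair_odds_ratio(discordant)   # also 2.0
```

Adding or removing concordant pairs leaves the estimate unchanged, which is the information loss that an unconditional analysis avoids under the assumptions listed in the abstract.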
Leite, Fábio R M; Nascimento, Gustavo G; Demarco, Flávio F; Gomes, Brenda P F A; Pucci, Cesar R; Martinho, Frederico C
2015-05-01
This systematic review and meta-regression analysis aimed to calculate a combined prevalence estimate and evaluate the prevalence of different Treponema species in primary and secondary endodontic infections, including symptomatic and asymptomatic cases. The MEDLINE/PubMed, Embase, Scielo, Web of Knowledge, and Scopus databases were searched without starting date restriction up to and including March 2014. Only reports in English were included. The selected literature was reviewed by 2 authors and classified as suitable or not to be included in this review. Lists were compared, and, in case of disagreements, decisions were made after a discussion based on inclusion and exclusion criteria. A pooled prevalence of Treponema species in endodontic infections was estimated. Additionally, a meta-regression analysis was performed. Among the 265 articles identified in the initial search, only 51 were included in the final analysis. The studies were classified into 2 different groups according to the type of endodontic infection and whether it was an exclusively primary/secondary study (n = 36) or a primary/secondary comparison (n = 15). The pooled prevalence of Treponema species was 41.5% (95% confidence interval, 35.9-47.0). In the multivariate model of meta-regression analysis, primary endodontic infections (P < .001), acute apical abscess, symptomatic apical periodontitis (P < .001), and concomitant presence of 2 or more species (P = .028) explained the heterogeneity regarding the prevalence rates of Treponema species. Our findings suggest that Treponema species are important pathogens involved in endodontic infections, particularly in cases of primary and acute infections. Copyright © 2015 American Association of Endodontists. Published by Elsevier Inc. All rights reserved.
Integrative eQTL analysis of tumor and host omics data in individuals with bladder cancer.
Pineda, Silvia; Van Steen, Kristel; Malats, Núria
2017-09-01
Integrative analyses of several omics data are emerging. The data are usually generated from the same source material (i.e., tumor sample) representing one level of regulation. However, integrating different regulatory levels (i.e., blood) with those from tumor may also reveal important knowledge about the human genetic architecture. To model this multilevel structure, an integrative-expression quantitative trait loci (eQTL) analysis applying two-stage regression (2SR) was proposed. This approach first regressed tumor gene expression levels on tumor markers, and the adjusted residuals from that model were then regressed on the germline genotypes measured in blood. Previously, we demonstrated that penalized regression methods in combination with a permutation-based MaxT method (Global-LASSO) are a promising tool to address some of the challenges that high-throughput omics data analysis imposes. Here, we assessed whether Global-LASSO can also be applied when tumor and blood omics data are integrated. We further compared our strategy with two 2SR approaches, one using multiple linear regression (2SR-MLR) and the other using LASSO (2SR-LASSO). We applied the three models to integrate genomic, epigenomic, and transcriptomic data from tumor tissue with blood germline genotypes from 181 individuals with bladder cancer included in the TCGA Consortium. Global-LASSO provided a larger list of eQTLs than the 2SR methods, identified a previously reported eQTL in prostate stem cell antigen (PSCA), and provided further clues on the complexity of APOBEC3B loci, with a minimal false-positive rate not achieved by 2SR-MLR. It also represents an important contribution for omics integrative analysis because it is easy to apply and adaptable to any type of data. © 2017 WILEY PERIODICALS, INC.
Li, Li; Nguyen, Kim-Huong; Comans, Tracy; Scuffham, Paul
2018-04-01
Several utility-based instruments have been applied in cost-utility analysis to assess health state values for people with dementia. Nevertheless, concerns and uncertainty regarding their performance for people with dementia have been raised. To assess the performance of available utility-based instruments for people with dementia by comparing their psychometric properties and to explore factors that cause variations in the reported health state values generated from those instruments by conducting meta-regression analyses. A literature search was conducted and psychometric properties were synthesized to demonstrate the overall performance of each instrument. When available, health state values and variables such as the type of instrument and cognitive impairment levels were extracted from each article. A meta-regression analysis was undertaken and available covariates were included in the models. A total of 64 studies providing preference-based values were identified and included. The EuroQol five-dimension questionnaire demonstrated the best combination of feasibility, reliability, and validity. Meta-regression analyses suggested that significant differences exist between instruments, type of respondents, and mode of administration and the variations in estimated utility values had influences on incremental quality-adjusted life-year calculation. This review finds that the EuroQol five-dimension questionnaire is the most valid utility-based instrument for people with dementia, but should be replaced by others under certain circumstances. Although no utility estimates were reported in the article, the meta-regression analyses that examined variations in utility estimates produced by different instruments impact on cost-utility analysis, potentially altering the decision-making process in some circumstances. Copyright © 2018 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.
Mannan, Malik M. Naeem; Jeong, Myung Y.; Kamran, Muhammad A.
2016-01-01
Electroencephalography (EEG) is a portable brain-imaging technique with the advantage of high temporal resolution that can be used to record electrical activity of the brain. However, EEG signals are difficult to analyze because of contamination by ocular artifacts, which can lead to misleading conclusions. Contamination by ocular artifacts has also been shown to reduce the classification accuracy of a brain-computer interface (BCI). It is therefore very important to remove or reduce these artifacts before the analysis of EEG signals for applications like BCI. In this paper, a hybrid framework that combines independent component analysis (ICA), regression and high-order statistics is proposed to identify and eliminate artifactual activities from EEG data. We used simulated, experimental and standard EEG signals to evaluate and analyze the effectiveness of the proposed method. Results demonstrate that the proposed method can effectively remove ocular artifacts while preserving the neuronal signals present in EEG data. A comparison with four methods from the literature, namely ICA, regression analysis, wavelet-ICA (wICA), and regression-ICA (REGICA), confirms the significantly enhanced performance and effectiveness of the proposed method for removal of ocular activities from EEG, in terms of lower mean square error and mean absolute error values and higher mutual information between reconstructed and original EEG. PMID:27199714
Fakayode, Sayo O; Mitchell, Breanna S; Pollard, David A
2014-08-01
Accurate understanding of analyte boiling points (BP) is of critical importance in gas chromatographic (GC) separation and crude oil refinery operation in petrochemical industries. This study reported the first combined use of GC separation and partial-least-square (PLS1) multivariate regression analysis (MRA) of petrochemical structural activity relationship (SAR) for accurate BP determination of two commercially available (D3710 and MA VHP) calibration gas mix samples. The results of the BP determination using PLS1 multivariate regression were further compared with the results of the traditional simulated distillation method of BP determination. The developed PLS1 regression was able to correctly predict analyte BPs in the D3710 and MA VHP calibration gas mix samples, with a root-mean-square-%-relative-error (RMS%RE) of 6.4% and 10.8%, respectively. In contrast, the overall RMS%RE values of 32.9% and 40.4% obtained for BP determination in D3710 and MA VHP, respectively, using a traditional simulated distillation method were approximately four times larger than the corresponding RMS%RE of BP prediction using MRA, demonstrating the better predictive ability of MRA. The reported method is rapid, robust, and promising, and can potentially be used routinely for fast analysis, pattern recognition, and analyte BP determination in petrochemical industries. Copyright © 2014 Elsevier B.V. All rights reserved.
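The error metric quoted above can be sketched under one common reading of the name: the square root of the mean squared percent relative error between predicted and reference values. The abstract does not spell out the formula, so this definition, the function name and the boiling-point values are assumptions made for illustration.

```python
import math


def rms_percent_relative_error(predicted: list, reference: list) -> float:
    """Root-mean-square of the percent relative errors (assumed definition)."""
    if len(predicted) != len(reference) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [((p - r) / r * 100.0) ** 2
               for p, r in zip(predicted, reference)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical boiling points: errors of +10% and -10% give an RMS%RE of 10.
reference_bp = [100.0, 200.0]
predicted_bp = [110.0, 180.0]
rms_re = rms_percent_relative_error(predicted_bp, reference_bp)
```

Because each error is scaled by its own reference value, low-boiling and high-boiling analytes contribute on the same relative footing.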
Zhu, Xiang; Stephens, Matthew
2017-01-01
Bayesian methods for large-scale multiple regression provide attractive approaches to the analysis of genome-wide association studies (GWAS). For example, they can estimate heritability of complex traits, allowing for both polygenic and sparse models; and by incorporating external genomic data into the priors, they can increase power and yield new biological insights. However, these methods require access to individual genotypes and phenotypes, which are often not easily available. Here we provide a framework for performing these analyses without individual-level data. Specifically, we introduce a “Regression with Summary Statistics” (RSS) likelihood, which relates the multiple regression coefficients to univariate regression results that are often easily available. The RSS likelihood requires estimates of correlations among covariates (SNPs), which also can be obtained from public databases. We perform Bayesian multiple regression analysis by combining the RSS likelihood with previously proposed prior distributions, sampling posteriors by Markov chain Monte Carlo. In a wide range of simulations RSS performs similarly to analyses using the individual data, both for estimating heritability and detecting associations. We apply RSS to a GWAS of human height that contains 253,288 individuals typed at 1.06 million SNPs, for which analyses of individual-level data are practically impossible. Estimates of heritability (52%) are consistent with, but more precise, than previous results using subsets of these data. We also identify many previously unreported loci that show evidence for association with height in our analyses. Software is available at https://github.com/stephenslab/rss. PMID:29399241
Zhu, Yu; Xia, Jie-lai; Wang, Jing
2009-09-01
This study applied the single autoregressive integrated moving average (ARIMA) model and the ARIMA-generalized regression neural network (GRNN) combination model to research on the incidence of scarlet fever. An ARIMA model was established from the monthly incidence of scarlet fever in one city from 2000 to 2006. The fitted values of the ARIMA model were used as the input of the GRNN, and the actual values were used as its output. After training the GRNN, the performance of the single ARIMA model and the ARIMA-GRNN combination model was compared. The mean error rates (MER) of the single ARIMA model and the ARIMA-GRNN combination model were 31.6% and 28.7%, respectively, and the coefficients of determination (R²) of the two models were 0.801 and 0.872, respectively. The ARIMA-GRNN combination model fitted the data better than the single ARIMA model and has practical value in research on time series data such as the incidence of scarlet fever.
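The combination step described in this abstract (ARIMA fitted values in, observed values out) can be sketched as follows. This is a minimal illustration only: it assumes the textbook form of a GRNN, which is a Gaussian-kernel weighted average of the training outputs, and the function name `grnn_predict`, the smoothing parameter `sigma`, and the toy numbers are hypothetical and not taken from the study.

```python
import math


def grnn_predict(train_inputs, train_outputs, x, sigma):
    """Predict the output for x with a one-dimensional GRNN.

    Assumption: the GRNN is the standard Gaussian-kernel weighted average
    of the training outputs; sigma is a hypothetical smoothing parameter.
    """
    if len(train_inputs) != len(train_outputs) or not train_inputs:
        raise ValueError("inputs and outputs must be non-empty and equal in length")
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    # Pattern layer: one Gaussian weight per training sample.
    weights = [math.exp(-((x - xi) ** 2) / (2.0 * sigma ** 2)) for xi in train_inputs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("x is too far from all training inputs for this sigma")
    # Summation and output layers: weighted average of the training outputs.
    return sum(w * y for w, y in zip(weights, train_outputs)) / total


# Toy usage: ARIMA fitted values as inputs, observed incidence as outputs.
arima_fitted = [1.0, 2.0, 3.0]      # hypothetical fitted values
observed = [10.0, 20.0, 30.0]       # hypothetical observed values
combined = grnn_predict(arima_fitted, observed, 2.0, sigma=0.1)
```

A small `sigma` makes the prediction follow the nearest training sample closely, while a large `sigma` pulls it toward the overall mean of the outputs; in practice the value would be tuned on held-out data.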
Effect of combined topical heparin and steroid on corneal neovascularization in children.
Michels, Rike; Michels, Stephan; Kaminski, Stephan
2012-01-01
To demonstrate the effect of topical heparin combined with topical steroid on corneal neovascularization (CN) in children. Four children (5 eyes) with new-onset progressive CN in at least one eye received topical rimexolone or dexamethasone in combination with heparin until complete regression of CN was obtained. The regression of CN was documented by slit-lamp or anterior segment photography. All 5 eyes showed complete regression of CN within 5 months. An anti-angiogenic effect was found as early as 1 week after starting topical combination treatment. No ocular and systemic side effects were detected and treatment was well tolerated by all children. In the 3 eyes with involvement of the optical axis, symmetrical visual acuity was obtained by amblyopia treatment. Recurrence of the CN was detectable in 2 eyes at 1 and 6 months, respectively, after ending combination therapy. Both eyes responded favorably to re-treatment. Combination of topical heparin and steroid leads to rapid regression and complete inactivity of CN. This therapeutic approach is promising, especially in children with limited therapeutic alternatives and a high risk for amblyopia. Copyright 2012, SLACK Incorporated.
A single determinant dominates the rate of yeast protein evolution.
Drummond, D Allan; Raval, Alpan; Wilke, Claus O
2006-02-01
A gene's rate of sequence evolution is among the most fundamental evolutionary quantities in common use, but what determines evolutionary rates has remained unclear. Here, we carry out the first combined analysis of seven predictors (gene expression level, dispensability, protein abundance, codon adaptation index, gene length, number of protein-protein interactions, and the gene's centrality in the interaction network) previously reported to have independent influences on protein evolutionary rates. Strikingly, our analysis reveals a single dominant variable linked to the number of translation events which explains 40-fold more variation in evolutionary rate than any other, suggesting that protein evolutionary rate has a single major determinant among the seven predictors. The dominant variable explains nearly half the variation in the rate of synonymous and protein evolution. We show that the two most commonly used methods to disentangle the determinants of evolutionary rate, partial correlation analysis and ordinary multivariate regression, produce misleading or spurious results when applied to noisy biological data. We overcome these difficulties by employing principal component regression, a multivariate regression of evolutionary rate against the principal components of the predictor variables. Our results support the hypothesis that translational selection governs the rate of synonymous and protein sequence evolution in yeast.
Huang, Desheng; Guan, Peng; Guo, Junqiao; Wang, Ping; Zhou, Baosen
2008-01-01
Background The effects of climate variations on bacillary dysentery incidence have recently attracted growing concern. However, the multi-collinearity among meteorological factors affects the accuracy of their correlation with bacillary dysentery incidence. Methods As a remedy, a modified method combining ridge regression and hierarchical cluster analysis was proposed for investigating the effects of climate variations on bacillary dysentery incidence in northeast China. Results Temperature, precipitation, evaporation and relative humidity all showed a positive correlation with the monthly incidence of bacillary dysentery, while air pressure had a negative correlation with the incidence. Ridge regression and hierarchical cluster analysis showed that during 1987–1996, relative humidity, temperatures and air pressure affected the transmission of bacillary dysentery. During this period, the meteorological factors were divided into three categories: relative humidity and precipitation belonged to one class, temperature indexes and evaporation belonged to another class, and air pressure was the third class. Conclusion Meteorological factors have affected the transmission of bacillary dysentery in northeast China. Bacillary dysentery prevention and control would benefit from giving more consideration to local climate variations. PMID:18816415
Kim, Tae-Hyung; Yun, Tae Jin; Park, Chul-Kee; Kim, Tae Min; Kim, Ji-Hoon; Sohn, Chul-Ho; Won, Jae Kyung; Park, Sung-Hye; Kim, Il Han; Choi, Seung Hong
2017-03-21
The purpose was to assess the predictive power for overall survival (OS) and the diagnostic performance of the combination of susceptibility-weighted MRI sequences (SWMRI) and dynamic susceptibility contrast (DSC) perfusion-weighted imaging (PWI) for differentiation of recurrence and radionecrosis in high-grade glioma (HGG). We enrolled 51 patients who underwent radiation therapy or gamma knife surgery followed by resection for HGG and who developed new measurable enhancement more than six months after complete response. The lesions were confirmed as recurrence (n = 32) or radionecrosis (n = 19). The mean and each percentile value from cumulative histograms of normalized CBV (nCBV) and the proportion of dark signal intensity on SWMRI (proSWMRI, %) within the enhancement were compared. Multivariate regression was performed to identify the best differentiator. The cutoff value of the best predictor from ROC analysis was evaluated. OS was determined with the Kaplan-Meier method and log-rank test. Recurrence showed significantly lower proSWMRI and higher mean nCBV and 90th percentile nCBV (nCBV90) than radionecrosis. Regression analysis revealed that both nCBV90 and proSWMRI were independent differentiators. The combination of nCBV90 and proSWMRI achieved 71.9% sensitivity (23/32), 100% specificity (19/19) and 82.3% accuracy (42/51) using the best cut-off values (nCBV90 > 2.07 and proSWMRI ≤ 15.76%) from ROC analysis. In subgroup analysis, radionecrosis with nCBV > 2.07 (n = 5) showed obvious hemorrhage (proSWMRI > 32.9%). Patients with nCBV90 > 2.07 and proSWMRI ≤ 15.76% had significantly shorter OS. In conclusion, compared with DSC PWI alone, the combination of SWMRI and DSC PWI has the potential to be a prognosticator for OS and to lower the false positive rate in differentiation of recurrence and radionecrosis in patients with HGG who develop new measurable enhancement more than six months after complete response.
Karaismailoğlu, Eda; Dikmen, Zeliha Günnur; Akbıyık, Filiz; Karaağaoğlu, Ahmet Ergun
2018-04-30
Background/aim: Myoglobin, cardiac troponin T, B-type natriuretic peptide (BNP), and creatine kinase isoenzyme MB (CK-MB) are frequently used biomarkers for evaluating the risk of patients admitted to an emergency department with chest pain. Recently, time-dependent receiver operating characteristic (ROC) analysis has been used to evaluate the predictive power of biomarkers where disease status can change over time. We aimed to determine the best set of biomarkers for estimating cardiac death during follow-up. We also obtained optimal cut-off values of these biomarkers, which differentiate between patients with and without risk of death. A web tool was developed to estimate time intervals in risk. Materials and methods: A total of 410 patients admitted to the emergency department with chest pain and shortness of breath were included. Cox regression analysis was used to determine an optimal set of biomarkers that can be used for estimating cardiac death and to combine the significant biomarkers. Time-dependent ROC analysis was performed to evaluate the performance of the significant biomarkers and of a combined biomarker during 240 h. The bootstrap method was used to compare statistical significance and the Youden index was used to determine optimal cut-off values. Results: Myoglobin and BNP were significant in multivariate Cox regression analysis. Areas under the time-dependent ROC curves of myoglobin and BNP were about 0.80 during 240 h, and that of the combined biomarker (myoglobin + BNP) increased to 0.90 during the first 180 h. Conclusion: Although myoglobin is not clinically specific to a cardiac event, in our study both myoglobin and BNP were found to be statistically significant for estimating cardiac death. Using this combined biomarker may increase the power of prediction. Our web tool can be useful for evaluating the risk status of new patients and helping clinicians in making decisions.
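The cut-off selection step mentioned in this abstract can be illustrated with a short sketch of the Youden index, J = sensitivity + specificity - 1, maximized over candidate thresholds. This is the ordinary (not time-dependent) version, so it simplifies what the study did; the function name `youden_cutoff`, the rule "value >= cutoff counts as positive", and the toy data are assumptions made for illustration.

```python
def youden_cutoff(values, events):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    Assumptions: events holds 1 for an event and 0 otherwise, and a
    biomarker value >= cutoff is classified as positive.
    """
    if len(values) != len(events):
        raise ValueError("values and events must have the same length")
    positives = sum(1 for e in events if e)
    negatives = len(events) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both outcome classes are required")
    best_cutoff, best_j = None, -1.0
    # Every observed value is tried as a candidate threshold.
    for cutoff in sorted(set(values)):
        true_pos = sum(1 for v, e in zip(values, events) if e and v >= cutoff)
        true_neg = sum(1 for v, e in zip(values, events) if not e and v < cutoff)
        j = true_pos / positives + true_neg / negatives - 1.0
        if j > best_j:
            best_cutoff, best_j = cutoff, j
    return best_cutoff, best_j


# Toy usage with hypothetical biomarker values and outcomes.
cutoff, j_stat = youden_cutoff([1, 2, 3, 4, 5, 6], [0, 0, 0, 1, 1, 1])
```

When several thresholds tie on J, this sketch keeps the smallest one; a real analysis would also report the uncertainty of the chosen cut-off, for example by bootstrap as the abstract describes.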
Mijiti, Peierdun; Yuexin, Zhang; Min, Liu; Wubuli, Maimaitili; Kejun, Pan; Upur, Halmurat
2015-03-01
We retrospectively analysed routinely collected baseline data of 2252 patients with HIV infection registered in the National Free Antiretroviral Treatment Program in Xinjiang province, China, from 2006 to 2011 to estimate the prevalence and predictors of anaemia at the initiation of combined antiretroviral therapy. Anaemia was diagnosed using the criteria set forth by the World Health Organisation, and univariate and multivariate logistic regression analyses were performed to determine its predictors. The prevalences of mild, moderate, and severe anaemia at the initiation of combined antiretroviral therapy were 19.2%, 17.1%, and 2.6%, respectively. Overall, 38.9% of the patients were anaemic at the initiation of combined antiretroviral therapy. The multivariate logistic regression analysis indicated that Uyghur ethnicity, female gender, lower CD4 count, lower body mass index value, self-reported tuberculosis infection, and oral candidiasis were associated with a higher prevalence of anaemia, whereas higher serum alanine aminotransferase level was associated with a lower prevalence of anaemia. The results suggest that the overall prevalence of anaemia at the initiation of combined antiretroviral therapy in patients with HIV infection is high in Xinjiang, China, but severe anaemia is uncommon. Patients in China should be routinely checked for anaemia prior to combined antiretroviral therapy initiation, and healthcare providers should carefully select the appropriate first-line combined antiretroviral therapy regimens for anaemic patients. © The Author(s) 2014 Reprints and permissions: sagepub.co.uk/journalsPermissions.nav.
Tachinami, H; Tomihara, K; Fujiwara, K; Nakamori, K; Noguchi, M
2017-11-01
A retrospective cohort study was performed to assess the clinical usefulness of combination assessment using computed tomography (CT) images in patients undergoing third molar extraction. This study included 85 patients (124 extraction sites). The relationship between cortication status, buccolingual position, and shape of the inferior alveolar canal (IAC) on CT images and the incidence of inferior alveolar nerve (IAN) injury after third molar extraction was evaluated. IAN injury was observed at eight of the 124 sites (6.5%), and in five of 19 sites (26.3%) in which cortication was absent+the IAC had a lingual position+the IAC had a dumbbell shape. Significant relationships were found between IAN injury and the three IAC factors (cortication status, IAC position, and IAC shape; P=0.0001). In patients with the three IAC factors, logistic regression analysis indicated a strong association between these factors and IAN injury (P=0.007). An absence of cortication, a lingually positioned IAC, and a dumbbell-shaped IAC are considered to indicate a high risk of IAN injury according to the logistic regression analysis (P=0.007). These results suggest that a combined assessment of these three IAC factors could be useful for the improved prediction of IAN injury. Copyright © 2017 International Association of Oral and Maxillofacial Surgeons. Published by Elsevier Ltd. All rights reserved.
Robinson, Jo; Spittal, Matthew J; Carter, Greg
2016-01-01
Objective To examine the efficacy of psychological and psychosocial interventions for reductions in repeated self-harm. Design We conducted a systematic review, meta-analysis and meta-regression to examine the efficacy of psychological and psychosocial interventions to reduce repeat self-harm in adults. We included a sensitivity analysis of studies with a low risk of bias for the meta-analysis. For the meta-regression, we examined whether the type, intensity (primary analyses) and other components of intervention or methodology (secondary analyses) modified the overall intervention effect. Data sources A comprehensive search of MEDLINE, PsycInfo and EMBASE (from 1999 to June 2016) was performed. Eligibility criteria for selecting studies Randomised controlled trials of psychological and psychosocial interventions for adult self-harm patients. Results Forty-five trials were included with data available from 36 (7354 participants) for the primary analysis. Meta-analysis showed a significant benefit of all psychological and psychosocial interventions combined (risk ratio 0.84; 95% CI 0.74 to 0.96; number needed to treat=33); however, sensitivity analyses showed that this benefit was non-significant when restricted to a limited number of high-quality studies. Meta-regression showed that the type of intervention did not modify the treatment effects. Conclusions Consideration of a psychological or psychosocial intervention over and above treatment as usual is worthwhile; with the public health benefits of ensuring that this practice is widely adopted potentially worth the investment. However, the specific type and nature of the intervention that should be delivered is not yet clear. Cognitive–behavioural therapy or interventions with an interpersonal focus and targeted on the precipitants to self-harm may be the best candidates on the current evidence. Further research is required. PMID:27660314
Regression rate behaviors of HTPB-based propellant combinations for hybrid rocket motor
NASA Astrophysics Data System (ADS)
Sun, Xingliang; Tian, Hui; Li, Yuelong; Yu, Nanjia; Cai, Guobiao
2016-02-01
The purpose of this paper is to characterize the regression rate behavior of hybrid rocket motor propellant combinations, using hydrogen peroxide (HP), gaseous oxygen (GOX), and nitrous oxide (N2O) as the oxidizers and hydroxyl-terminated polybutadiene (HTPB) as the base fuel. To carry out this research by experiment and simulation, a hybrid rocket motor test system and a numerical simulation model are established. A series of hybrid rocket motor firing tests is conducted burning different propellant combinations, and several of those are used as references for numerical simulations. The numerical simulation model is developed by combining the Navier-Stokes equations with the turbulence model, one-step global reaction model, and solid-gas coupling model. The distribution of regression rate along the axis is determined by applying the simulation model to predict the combustion process and heat transfer inside the hybrid rocket motor. The time-space averaged regression rates from the numerical simulation and the experimental data are in good agreement. The results indicate that the N2O/HTPB and GOX/HTPB propellant combinations have a higher regression rate, since the enhancement effect of the latter is significant due to its higher flame temperature. Furthermore, the inclusion of aluminum (Al) and/or ammonium perchlorate (AP) in the grain does enhance the regression rate, mainly due to the greater energy released inside the chamber and the heat feedback to the grain surface from aluminum combustion.
Luszczki, Jarogniew J; Zagaja, Mirosław; Miziak, Barbara; Kondrat-Wrobel, Maria W; Zaluska, Katarzyna; Wroblewska-Luczka, Paula; Adamczuk, Piotr; Czuczwar, Stanislaw J; Florek-Luszczki, Magdalena
2018-01-01
To isobolographically determine the types of interactions that occur between retigabine and lacosamide (LCM; two third-generation antiepileptic drugs) with respect to their anticonvulsant activity and acute adverse effects (sedation) in the maximal electroshock-induced seizures (MES) and chimney test (motor performance) in adult male Swiss mice. Type I isobolographic analysis for nonparallel dose-response effects for the combination of retigabine with LCM (at the fixed-ratio of 1:1) in both the MES and chimney test in mice was performed. Brain concentrations of retigabine and LCM were measured by high-pressure liquid chromatography (HPLC) to characterize any pharmacokinetic interactions occurring when combining these drugs. Linear regression analysis revealed that retigabine had its dose-response effect line nonparallel to that of LCM in both the MES and chimney tests. The type I isobolographic analysis illustrated that retigabine combined with LCM (fixed-ratio of 1:1) exerted an additive interaction in the mouse MES model and sub-additivity (antagonism) in the chimney test. With HPLC, retigabine and LCM did not mutually change their total brain concentrations, thereby confirming the pharmacodynamic nature of the interaction. LCM combined with retigabine possesses a beneficial preclinical profile (benefit index ranged from 2.07 to 2.50) and this 2-drug combination is worth recommending as treatment plan to patients with pharmacoresistant epilepsy. © 2017 S. Karger AG, Basel.
Zhou, Jinzhe; Zhou, Yanbing; Cao, Shougen; Li, Shikuan; Wang, Hao; Niu, Zhaojian; Chen, Dong; Wang, Dongsheng; Lv, Liang; Zhang, Jian; Li, Yu; Jiao, Xuelong; Tan, Xiaojie; Zhang, Jianli; Wang, Haibo; Zhang, Bingyuan; Lu, Yun; Sun, Zhenqing
2016-01-01
Surgical complications are commonly reported, but few reports provide information about their severity or estimate risk factors for complications, and those that do lack specificity. We retrospectively analyzed data on 2795 gastric cancer patients who underwent a surgical procedure at the Affiliated Hospital of Qingdao University between June 2007 and June 2012, and established a multivariate logistic regression model to predict risk factors related to postoperative complications according to the Clavien-Dindo classification system. Twenty-four of 86 variables were statistically significant in univariate logistic regression analysis, and the 11 significant variables that entered the multivariate analysis were used to produce the risk model. Liver cirrhosis, diabetes mellitus, Child classification, invasion of neighboring organs, combined resection, intraoperative transfusion, Billroth II anastomosis of reconstruction, malnutrition, surgical volume of surgeons, operating time and age were independent risk factors for postoperative complications after gastrectomy. Based on the logistic regression equation p = exp(∑BiXi) / (1 + exp(∑BiXi)), a multivariate logistic regression predictive model that calculates the risk of postoperative morbidity was developed: p = 1/(1 + e^(4.810 - 1.287X1 - 0.504X2 - 0.500X3 - 0.474X4 - 0.405X5 - 0.318X6 - 0.316X7 - 0.305X8 - 0.278X9 - 0.255X10 - 0.138X11)). The accuracy, sensitivity and specificity of the model in predicting postoperative complications were 86.7%, 76.2% and 88.6%, respectively. This risk model, based on the Clavien-Dindo system for grading the severity of complications and on logistic regression analysis, can predict severe morbidity specific to an individual patient's risk factors, can estimate patients' risks and benefits of gastric surgery as an accurate decision-making tool, and may serve as a template for the development of risk models for other surgical groups.
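The reported risk equation can be evaluated directly, as in the sketch below. The intercept and the eleven coefficients are copied from the equation in the abstract; everything else is an assumption for illustration. In particular, the abstract does not state which risk factor corresponds to which X, nor how each factor is coded, so the inputs here are treated as already coded values X1..X11 and the function name `predicted_risk` is hypothetical.

```python
import math

# Intercept and coefficients as printed in the abstract's equation
# p = 1 / (1 + e^(4.810 - 1.287*X1 - ... - 0.138*X11)).
INTERCEPT = 4.810
COEFFICIENTS = [1.287, 0.504, 0.500, 0.474, 0.405, 0.318,
                0.316, 0.305, 0.278, 0.255, 0.138]


def predicted_risk(x):
    """Return p for coded predictor values x = [X1, ..., X11].

    Assumption: the mapping of X1..X11 to clinical factors and their
    coding are not given in the abstract and must come from the paper.
    """
    if len(x) != len(COEFFICIENTS):
        raise ValueError("expected 11 predictor values")
    exponent = INTERCEPT - sum(b * xi for b, xi in zip(COEFFICIENTS, x))
    return 1.0 / (1.0 + math.exp(exponent))


# Hypothetical inputs: no risk factor present vs. every X coded as 1.
baseline = predicted_risk([0] * 11)
all_present = predicted_risk([1] * 11)
```

Because every coefficient enters the exponent with a negative sign, each additional unit of any X can only raise the predicted probability, which matches the abstract's description of all eleven variables as risk factors.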
Intraurban Differences in the Use of Ambulatory Health Services in a Large Brazilian City
Lima-Costa, Maria Fernanda; Proietti, Fernando Augusto; Cesar, Cibele C.; Macinko, James
2010-01-01
A major goal of health systems is to reduce inequities in access to services, that is, to ensure that health care is provided based on health needs rather than social or economic factors. This study aims to identify the determinants of health services utilization among adults in a large Brazilian city and intraurban disparities in health care use. We combine household survey data with census-derived classification of social vulnerability of each household’s census tract. The dependent variable was utilization of physician services in the prior 12 months, and the independent variables included predisposing factors, health needs, enabling factors, and context. Prevalence ratios and 95% confidence intervals were estimated by the Hurdle regression model, which combined Poisson regression analysis of factors associated with any doctor visits (dichotomous variable) and zero-truncated negative binomial regression for the analysis of factors associated with the number of visits among those who had at least one. Results indicate that the use of health services was greater among women and increased with age, and was determined primarily by health needs and whether the individual had a regular doctor, even among those living in areas of the city with the worst socio-environmental indicators. The experience of Belo Horizonte may have implications for other world cities, particularly in the development and use of a comprehensive index to identify populations at risk and in order to guide expansion of primary health care services as a means of enhancing equity in health. PMID:21104332
Hoos, Anne B.; Patel, Anant R.
1996-01-01
Model-adjustment procedures were applied to the combined data bases of storm-runoff quality for Chattanooga, Knoxville, and Nashville, Tennessee, to improve predictive accuracy for storm-runoff quality for urban watersheds in these three cities and throughout Middle and East Tennessee. Data for 45 storms at 15 different sites (five sites in each city) constitute the data base. Comparison of observed values of storm-runoff load and event-mean concentration to the predicted values from the regional regression models for 10 constituents shows prediction errors, as large as 806,000 percent. Model-adjustment procedures, which combine the regional model predictions with local data, are applied to improve predictive accuracy. Standard error of estimate after model adjustment ranges from 67 to 322 percent. Calibration results may be biased due to sampling error in the Tennessee data base. The relatively large values of standard error of estimate for some of the constituent models, although representing significant reduction (at least 50 percent) in prediction error compared to estimation with unadjusted regional models, may be unacceptable for some applications. The user may wish to collect additional local data for these constituents and repeat the analysis, or calibrate an independent local regression model.
Meta-regression approximations to reduce publication selection bias.
Stanley, T D; Doucouliagos, Hristos
2014-03-01
Publication selection bias is a serious challenge to the integrity of all empirical sciences. We derive meta-regression approximations to reduce this bias. Our approach employs Taylor polynomial approximations to the conditional mean of a truncated distribution. A quadratic approximation without a linear term, precision-effect estimate with standard error (PEESE), is shown to have the smallest bias and mean squared error in most cases and to outperform conventional meta-analysis estimators, often by a great deal. Monte Carlo simulations also demonstrate how a new hybrid estimator that conditionally combines PEESE and the Egger regression intercept can provide a practical solution to publication selection bias. PEESE is easily expanded to accommodate systematic heterogeneity along with complex and differential publication selection bias that is related to moderator variables. By providing an intuitive reason for these approximations, we can also explain why the Egger regression works so well and when it does not. These meta-regression methods are applied to several policy-relevant areas of research including antidepressant effectiveness, the value of a statistical life, the minimum wage, and nicotine replacement therapy. Copyright © 2013 John Wiley & Sons, Ltd.
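As a rough illustration of the PEESE idea described above (a quadratic term in the standard error with no linear term), the sketch below fits effect_i = b0 + b1 * SE_i^2 and reads b0 as the selection-corrected estimate. The inverse-variance weights (1 / SE_i^2) are an assumption based on common practice and are not stated in the abstract; the function name `peese` and the toy data are hypothetical.

```python
def peese(effects, standard_errors):
    """Weighted least squares fit of effect_i = b0 + b1 * SE_i**2.

    Assumption: weights are 1 / SE_i**2. Returns (b0, b1), where b0 is
    the estimated effect as the standard error approaches zero.
    """
    if len(effects) != len(standard_errors) or len(effects) < 2:
        raise ValueError("need at least two estimates with standard errors")
    if any(se <= 0 for se in standard_errors):
        raise ValueError("standard errors must be positive")
    x = [se ** 2 for se in standard_errors]   # regressor: squared SE
    w = [1.0 / xi for xi in x]                # inverse-variance weights
    sum_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sum_w
    y_bar = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    if s_xx == 0.0:
        raise ValueError("standard errors must not all be equal")
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar)
               for wi, xi, yi in zip(w, x, effects))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Toy usage: estimates generated exactly as 0.2 + 3 * SE^2.
ses = [0.1, 0.2, 0.3, 0.4]
estimates = [0.2 + 3.0 * se ** 2 for se in ses]
corrected_effect, selection_slope = peese(estimates, ses)
```

The sketch returns point estimates only; a full analysis would also need standard errors for b0 and b1 and, per the abstract, could add moderator variables to the regression.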
Shillcutt, Samuel D; LeFevre, Amnesty E; Fischer-Walker, Christa L; Taneja, Sunita; Black, Robert E; Mazumder, Sarmila
2017-01-01
This study evaluates the cost-effectiveness of the DAZT program for scaling up treatment of acute child diarrhea in Gujarat, India, using a net-benefit regression framework. Costs were calculated from societal and caregivers' perspectives and effectiveness was assessed in terms of coverage of zinc and of both zinc and oral rehydration salts (ORS). Regression models were tested in simple linear regression, with a specified set of covariates, and with a specified set of covariates and interaction terms; linear regression with endogenous treatment effects was used as the reference case. The DAZT program was cost-effective with over 95% certainty above $5.50 and $7.50 per appropriately treated child in the unadjusted and adjusted models, respectively, with specifications including interaction terms being cost-effective with 85-97% certainty. Findings from this study should be combined with other evidence when considering decisions to scale up programs such as the DAZT program to promote the use of ORS and zinc to treat child diarrhea.
Detection of epistatic effects with logic regression and a classical linear regression model.
Malina, Magdalena; Ickstadt, Katja; Schwender, Holger; Posch, Martin; Bogdan, Małgorzata
2014-02-01
To locate multiple interacting quantitative trait loci (QTL) influencing a trait of interest within experimental populations, methods such as Cockerham's model are usually applied. Within this framework, interactions are understood as the part of the joint effect of several genes which cannot be explained as the sum of their additive effects. However, if a change in the phenotype (such as disease) is caused by Boolean combinations of genotypes of several QTLs, Cockerham's approach is often unable to identify them properly. To detect such interactions more efficiently, we propose a logic regression framework. Even though a larger number of models has to be considered with the logic regression approach (requiring more stringent multiple testing correction), the efficient representation of higher order logic interactions in logic regression models leads to a significant increase of power to detect such interactions as compared to Cockerham's approach. The increase in power is demonstrated analytically for a simple two-way interaction model and illustrated in more complex settings with a simulation study and real data analysis.
NASA Astrophysics Data System (ADS)
Fujiki, Shogoro; Okada, Kei-ichi; Nishio, Shogo; Kitayama, Kanehiro
2016-09-01
We developed a new method to estimate stand ages of secondary vegetation in the Bornean montane zone, where local people conduct traditional shifting cultivation and protected areas are surrounded by patches of recovering secondary vegetation of various ages. Identifying stand ages at the landscape level is critical to improve conservation policies. We combined a high-resolution satellite image (WorldView-2) with time-series Landsat images. We extracted stand ages (the time elapsed since the most recent slash and burn) from a change-detection analysis with Landsat time-series images and superimposed the derived stand ages on the segments classified by object-based image analysis using WorldView-2. We regarded stand ages as a response variable, and object-based metrics as independent variables, to develop regression models that explain stand ages. Subsequently, we classified the vegetation of the target area into six age units and one rubber plantation unit (1-3 yr, 3-5 yr, 5-7 yr, 7-30 yr, 30-50 yr, >50 yr and 'rubber plantation') using regression models and linear discriminant analyses. Validation demonstrated an accuracy of 84.3%. Our approach is particularly effective in classifying highly dynamic pioneer vegetation younger than 7 years into 2-yr intervals, suggesting that rapid changes in vegetation canopies can be detected with high accuracy. The combination of a spectral time-series analysis and object-based metrics based on high-resolution imagery enabled the classification of dynamic vegetation under intensive shifting cultivation and yielded an informative land cover map based on stand ages.
NASA Astrophysics Data System (ADS)
Yang, Fan; Xue, Lianqing; Zhang, Luochen; Chen, Xinfang; Chi, Yixia
2017-12-01
This article aims to explore the adaptive utilization strategies of flow regime versus traditional practices in the context of climate change and human activities in the arid area. The study presents a quantitative analysis of the contributions of climatic and anthropogenic factors to streamflow alteration in the Tarim River Basin (TRB) using the Budyko method, and adaptive utilization strategies for the eco-hydrological regime, by comparing the applicability of the autoregressive moving average (ARMA) model and a combined regression model. Our results suggest that human activities played a dominant role in streamflow reduction in the mainstream, with a contribution of 120.7%~190.1%, while in the headstreams climatic variables were the primary determinant of streamflow, accounting for 56.5~152.6% of the increase. The comparison revealed that the combined regression model performed better than the ARMA model, with a qualified rate of 80.49~90.24%. Based on the forecasts of streamflow for different purposes, an adaptive utilization scheme of water flow is established from the perspective of time and space. Our study presents an effective water resources scheduling scheme for the ecological environment and provides references for ecological protection and water allocation in the arid area.
Linge, Annett; Schötz, Ulrike; Löck, Steffen; Lohaus, Fabian; von Neubeck, Cläre; Gudziol, Volker; Nowak, Alexander; Tinhofer, Inge; Budach, Volker; Sak, Ali; Stuschke, Martin; Balermpas, Panagiotis; Rödel, Claus; Bunea, Hatice; Grosu, Anca-Ligia; Abdollahi, Amir; Debus, Jürgen; Ganswindt, Ute; Lauber, Kirsten; Pigorsch, Steffi; Combs, Stephanie E; Mönnich, David; Zips, Daniel; Baretton, Gustavo B; Buchholz, Frank; Krause, Mechthild; Belka, Claus; Baumann, Michael
2018-04-01
To compare six HPV detection methods in pre-treatment FFPE tumour samples from patients with locally advanced head and neck squamous cell carcinoma (HNSCC) who received postoperative (N = 175) or primary (N = 90) radiochemotherapy. HPV analyses included detection of (i) HPV16 E6/E7 RNA, (ii) HPV16 DNA (PCR-based arrays, A-PCR), (iii) HPV DNA (GP5+/GP6+ qPCR, (GP-PCR)), (iv) p16 (immunohistochemistry, p16 IHC), (v) combining p16 IHC and the A-PCR result and (vi) combining p16 IHC and the GP-PCR result. Differences between HPV positive and negative subgroups were evaluated for the primary endpoint loco-regional control (LRC) using Cox regression. Correlation between the HPV detection methods was high (chi-squared test, p < 0.001). While p16 IHC analysis resulted in several false positive classifications, A-PCR, GP-PCR and the combination of p16 IHC and A-PCR or GP-PCR led to results comparable to RNA analysis. In both cohorts, Cox regression analyses revealed significantly prolonged LRC for patients with HPV positive tumours irrespective of the detection method. The most stringent classification was obtained by detection of HPV16 RNA, or combining p16 IHC with A-PCR or GP-PCR. This approach revealed the lowest rate of recurrence in patients with tumours classified as HPV positive and therefore appears most suited for patient stratification in HPV-based clinical studies. Copyright © 2017 Elsevier B.V. All rights reserved.
Spatial Bayesian Latent Factor Regression Modeling of Coordinate-based Meta-analysis Data
Montagna, Silvia; Wager, Tor; Barrett, Lisa Feldman; Johnson, Timothy D.; Nichols, Thomas E.
2017-01-01
Summary Now over 20 years old, functional MRI (fMRI) has a large and growing literature that is best synthesised with meta-analytic tools. As most authors do not share image data, only the peak activation coordinates (foci) reported in the paper are available for Coordinate-Based Meta-Analysis (CBMA). Neuroimaging meta-analysis is used to 1) identify areas of consistent activation; and 2) build a predictive model of task type or cognitive process for new studies (reverse inference). To simultaneously address these aims, we propose a Bayesian point process hierarchical model for CBMA. We model the foci from each study as a doubly stochastic Poisson process, where the study-specific log intensity function is characterised as a linear combination of a high-dimensional basis set. A sparse representation of the intensities is guaranteed through latent factor modeling of the basis coefficients. Within our framework, it is also possible to account for the effect of study-level covariates (meta-regression), significantly expanding the capabilities of the current neuroimaging meta-analysis methods available. We apply our methodology to synthetic data and neuroimaging meta-analysis datasets. PMID:28498564
Zhang, Qun; Zhang, Qunzhi; Sornette, Didier
2016-01-01
We augment the existing literature using the Log-Periodic Power Law Singular (LPPLS) structures in the log-price dynamics to diagnose financial bubbles by providing three main innovations. First, we introduce the quantile regression to the LPPLS detection problem. This allows us to disentangle (at least partially) the genuine LPPLS signal and the a priori unknown complicated residuals. Second, we propose to combine the many quantile regressions with a multi-scale analysis, which aggregates and consolidates the obtained ensembles of scenarios. Third, we define and implement the so-called DS LPPLS Confidence™ and Trust™ indicators that enrich considerably the diagnostic of bubbles. Using a detailed study of the "S&P 500 1987" bubble and presenting analyses of 16 historical bubbles, we show that the quantile regression of LPPLS signals contributes useful early warning signals. The comparison between the constructed signals and the price development in these 16 historical bubbles demonstrates their significant predictive ability around the real critical time when the burst/rally occurs.
Robust Head-Pose Estimation Based on Partially-Latent Mixture of Linear Regressions.
Drouard, Vincent; Horaud, Radu; Deleforge, Antoine; Ba, Sileye; Evangelidis, Georgios
2017-03-01
Head-pose estimation has many applications, such as social event analysis, human-robot and human-computer interaction, driving assistance, and so forth. Head-pose estimation is challenging, because it must cope with changing illumination conditions, variabilities in face orientation and in appearance, partial occlusions of facial landmarks, as well as bounding-box-to-face alignment errors. We propose to use a mixture of linear regressions with partially-latent output. This regression method learns to map high-dimensional feature vectors (extracted from bounding boxes of faces) onto the joint space of head-pose angles and bounding-box shifts, such that they are robustly predicted in the presence of unobservable phenomena. We describe in detail the mapping method that combines the merits of unsupervised manifold learning techniques and of mixtures of regressions. We validate our method with three publicly available data sets and we thoroughly benchmark four variants of the proposed algorithm with several state-of-the-art head-pose estimation methods.
Jang, Seung-Ho; Ryu, Han-Seung; Choi, Suck-Chei; Lee, Sang-Yeol
2016-10-01
The purpose of this study was to examine psychosocial factors related to gastroesophageal reflux disease (GERD) and their effects on quality of life (QOL) in firefighters. Data were collected from 1217 firefighters in a Korean province. We measured psychological symptoms using the scale. In order to observe the influence of the high-risk group on occupational stress, we conducted logistic multiple linear regression. The correlation between psychological factors and QOL was also analyzed, and a hierarchical regression analysis was performed. GERD was observed in 32.2% of subjects. Subjects with GERD showed higher depressive symptom, anxiety and occupational stress scores, and lower self-esteem and QOL scores relative to those observed in GERD-negative subjects. GERD risk was higher for the following occupational stress subcategories: job demand, lack of reward, interpersonal conflict, and occupational climate. The stepwise regression analysis showed that depressive symptoms, occupational stress, self-esteem, and anxiety were the best predictors of QOL. The results suggest that psychological and medical approaches should be combined in GERD assessment.
An Analysis of the Number of Medical Malpractice Claims and Their Amounts
Bonetti, Marco; Cirillo, Pasquale; Musile Tanzi, Paola; Trinchero, Elisabetta
2016-01-01
Starting from an extensive database, pooling 9 years of data from the top three insurance brokers in Italy, and containing 38125 reported claims due to alleged cases of medical malpractice, we use an inhomogeneous Poisson process to model the number of medical malpractice claims in Italy. The intensity of the process is allowed to vary over time, and it depends on a set of covariates, like the size of the hospital, the medical department and the complexity of the medical operations performed. We choose the combination medical department by hospital as the unit of analysis. Together with the number of claims, we also model the associated amounts paid by insurance companies, using a two-stage regression model. In particular, we use logistic regression for the probability that a claim is closed with a zero payment, whereas, conditionally on the fact that an amount is strictly positive, we make use of lognormal regression to model it as a function of several covariates. The model produces estimates and forecasts that are relevant to both insurance companies and hospitals, for quality assurance, service improvement and cost reduction. PMID:27077661
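The two-stage structure described in this abstract can be illustrated with a minimal sketch. It assumes the first stage has already produced a probability `p_zero` that a claim closes with zero payment, and that the second stage has produced a mean `mu_log` and standard deviation `sigma_log` of the log-amount for strictly positive payments. These names, and the step of combining the two stages into an unconditional expected payment, are illustrative assumptions and are not taken from the paper.

```python
import math


def expected_claim_amount(p_zero: float, mu_log: float, sigma_log: float) -> float:
    """Unconditional expected payment under a hypothetical two-stage model.

    p_zero    : stage-1 probability that the claim closes with zero payment
    mu_log    : stage-2 mean of log(amount), given amount > 0
    sigma_log : stage-2 standard deviation of log(amount), given amount > 0
    """
    if not 0.0 <= p_zero <= 1.0:
        raise ValueError("p_zero must lie in [0, 1]")
    # Mean of a lognormal variable: exp(mu + sigma^2 / 2).
    mean_positive = math.exp(mu_log + 0.5 * sigma_log ** 2)
    # Zero payments contribute nothing, so weight by P(amount > 0).
    return (1.0 - p_zero) * mean_positive
```

In a full analysis both `p_zero` and `mu_log` would presumably be functions of the covariates (hospital size, department, and so on); the sketch only shows how the two fitted pieces might fit together for a single claim.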
Prediction of strontium bromide laser efficiency using cluster and decision tree analysis
NASA Astrophysics Data System (ADS)
Iliev, Iliycho; Gocheva-Ilieva, Snezhana; Kulin, Chavdar
2018-01-01
The subject of investigation is a new high-powered strontium bromide (SrBr2) vapor laser emitting in a multiline region of wavelengths. The laser is an alternative to atomic strontium lasers and free-electron lasers, especially at the 6.45 μm line, which is used in surgery for medical processing of biological tissues and bones with minimal damage. In this paper the experimental data from measurements of operational and output characteristics of the laser are statistically processed by means of cluster analysis and tree-based regression techniques. The aim is to extract from the available data the more important relationships and dependences that influence the increase of the overall laser efficiency. A set of cluster models is constructed and analyzed. Using different cluster methods, it is shown that the seven investigated operational characteristics (laser tube diameter, length, supplied electrical power, and others) and laser efficiency are combined in 2 clusters. Regression tree models built with the Classification and Regression Trees (CART) technique yield dependences that predict the values of efficiency, and especially the maximum efficiency, with over 95% accuracy.
Regression equations for disinfection by-products for the Mississippi, Ohio and Missouri rivers
Rathbun, R.E.
1996-01-01
Trihalomethane and nonpurgeable total organic-halide formation potentials were determined for the chlorination of water samples from the Mississippi, Ohio and Missouri Rivers. Samples were collected during the summer and fall of 1991 and the spring of 1992 at twelve locations on the Mississippi from New Orleans to Minneapolis, and on the Ohio and Missouri 1.6 km upstream from their confluences with the Mississippi. Formation potentials were determined as a function of pH, initial free-chlorine concentration, and reaction time. Multiple linear regression analysis of the data indicated that pH, reaction time, and the dissolved organic carbon concentration and/or the ultraviolet absorbance of the water were the most significant variables. The initial free-chlorine concentration had less significance and bromide concentration had little or no significance. Analysis of combinations of the dissolved organic carbon concentration and the ultraviolet absorbance indicated that use of the ultraviolet absorbance alone provided the best prediction of the experimental data. Regression coefficients for the variables were generally comparable to coefficients previously presented in the literature for waters from other parts of the United States.
Jang, Seung-Ho; Ryu, Han-Seung; Choi, Suck-Chei; Lee, Sang-Yeol
2016-01-01
Objectives: The purpose of this study was to examine psychosocial factors related to gastroesophageal reflux disease (GERD) and their effects on quality of life (QOL) in firefighters. Methods: Data were collected from 1217 firefighters in a Korean province. We measured psychological symptoms using the scale. In order to observe the influence of the high-risk group on occupational stress, we conducted logistic multiple linear regression. The correlation between psychological factors and QOL was also analyzed, and a hierarchical regression analysis was performed. Results: GERD was observed in 32.2% of subjects. Subjects with GERD showed higher depressive symptom, anxiety and occupational stress scores, and lower self-esteem and QOL scores relative to those observed in GERD-negative subjects. GERD risk was higher for the following occupational stress subcategories: job demand, lack of reward, interpersonal conflict, and occupational climate. The stepwise regression analysis showed that depressive symptoms, occupational stress, self-esteem, and anxiety were the best predictors of QOL. Conclusions: The results suggest that psychological and medical approaches should be combined in GERD assessment. PMID:27691373
On the use of log-transformation vs. nonlinear regression for analyzing biological power laws
Xiao, X.; White, E.P.; Hooten, M.B.; Durham, S.L.
2011-01-01
Power-law relationships are among the most well-studied functional relationships in biology. Recently the common practice of fitting power laws using linear regression (LR) on log-transformed data has been criticized, calling into question the conclusions of hundreds of studies. It has been suggested that nonlinear regression (NLR) is preferable, but no rigorous comparison of these two methods has been conducted. Using Monte Carlo simulations, we demonstrate that the error distribution determines which method performs better, with NLR better characterizing data with additive, homoscedastic, normal error and LR better characterizing data with multiplicative, heteroscedastic, lognormal error. Analysis of 471 biological power laws shows that both forms of error occur in nature. While previous analyses based on log-transformation appear to be generally valid, future analyses should choose methods based on a combination of biological plausibility and analysis of the error distribution. We provide detailed guidelines and associated computer code for doing so, including a model averaging approach for cases where the error structure is uncertain. © 2011 by the Ecological Society of America.
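The two fitting strategies contrasted in this abstract can be sketched side by side. This is a minimal illustration and not the authors' published code: the function names are hypothetical, the NLR fit uses a simple grid search over the exponent (with the closed-form best coefficient for each candidate exponent) instead of a general nonlinear optimizer, and the model-averaging step is not reproduced.

```python
import numpy as np


def fit_power_law_lr(x, y):
    """Fit y = a * x**b by ordinary least squares on log-transformed data.

    Appropriate when the error is multiplicative (lognormal) on the original scale.
    """
    slope, intercept = np.polyfit(np.log(x), np.log(y), 1)
    return np.exp(intercept), slope


def fit_power_law_nlr(x, y, b_grid=None):
    """Fit y = a * x**b by least squares on the original scale.

    Appropriate when the error is additive and homoscedastic. For a fixed
    exponent b the best coefficient a has a closed form, so only b is searched.
    """
    if b_grid is None:
        b_grid = np.linspace(-3.0, 3.0, 6001)  # assumed search range, step 0.001
    best = None
    for b in b_grid:
        xb = x ** b
        a = np.dot(xb, y) / np.dot(xb, xb)     # least-squares a given b
        sse = np.sum((y - a * xb) ** 2)
        if best is None or sse < best[0]:
            best = (sse, a, b)
    return best[1], best[2]
```

On noise-free data both functions recover essentially the same parameters; they diverge once noise is added, and which one is closer to the truth depends on whether that noise is multiplicative or additive, which is the point the abstract makes.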
Giménez-Espert, María Del Carmen; Prado-Gascó, Vicente Javier
2018-03-01
To analyse the link between empathy and emotional intelligence as a predictor of nurses' attitudes towards communication while comparing the contribution of emotional aspects and attitudinal elements on potential behaviour. Nurses' attitudes towards communication, empathy and emotional intelligence are key skills for nurses involved in patient care. There are currently no studies analysing this link, and its investigation is needed because attitudes may influence communication behaviours. Correlational study. To attain this goal, self-reported instruments (attitudes towards communication of nurses, trait emotional intelligence (Trait Emotional Meta-Mood Scale) and Jefferson Scale of Nursing Empathy (Jefferson Scale Nursing Empathy)) were collected from 460 nurses between September 2015 and February 2016. Two different analytical methodologies were used: traditional regression models and fuzzy-set qualitative comparative analysis models. The results of the regression model suggest that cognitive dimensions of attitude are a significant and positive predictor of the behavioural dimension. The perspective-taking dimension of empathy and the emotional-clarity dimension of emotional intelligence were significant positive predictors of the dimensions of attitudes towards communication, except for the affective dimension (for which the association was negative). The results of the fuzzy-set qualitative comparative analysis models confirm that the combination of high levels of cognitive dimension of attitudes, perspective-taking and emotional clarity explained high levels of the behavioural dimension of attitude. Empathy and emotional intelligence are predictors of nurses' attitudes towards communication, and the cognitive dimension of attitude is a good predictor of the behavioural dimension of attitudes towards communication of nurses in both regression models and fuzzy-set qualitative comparative analysis. 
In general, the fuzzy-set qualitative comparative analysis models appear to be better predictors than the regression models are. To evaluate current practices, establish intervention strategies and evaluate their effectiveness. The evaluation of these variables and their relationships are important in creating a satisfied and sustainable workforce and improving quality of care and patient health. © 2018 John Wiley & Sons Ltd.
Classification of sodium MRI data of cartilage using machine learning.
Madelin, Guillaume; Poidevin, Frederick; Makrymallis, Antonios; Regatte, Ravinder R
2015-11-01
To assess the possible utility of machine learning for classifying subjects with and subjects without osteoarthritis using sodium magnetic resonance imaging data. Theory: Support vector machine, k-nearest neighbors, naïve Bayes, discriminant analysis, linear regression, logistic regression, neural networks, decision tree, and tree bagging were tested. Sodium magnetic resonance imaging with and without fluid suppression by inversion recovery was acquired on the knee cartilage of 19 controls and 28 osteoarthritis patients. Sodium concentrations were measured in regions of interest in the knee for both acquisitions. Mean (MEAN) and standard deviation (STD) of these concentrations were measured in each region of interest, and the minimum, maximum, and mean of these two measurements were calculated over all regions of interest for each subject. The resulting 12 variables per subject were used as predictors for classification. Either Min [STD] alone, or in combination with Mean [MEAN] or Min [MEAN], all from fluid suppressed data, were the best predictors with an accuracy >74%, mainly with linear logistic regression and linear support vector machine. Other good classifiers include discriminant analysis, linear regression, and naïve Bayes. Machine learning is a promising technique for classifying osteoarthritis patients and controls from sodium magnetic resonance imaging data. © 2014 Wiley Periodicals, Inc.
Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.
2003-01-01
Logistic regression was used to predict the probability of debris flows occurring in areas recently burned by wildland fires. Multiple logistic regression is conceptually similar to multiple linear regression because statistical relations between one dependent variable and several independent variables are evaluated. In logistic regression, however, the dependent variable is transformed to a binary variable (debris flow did or did not occur), and the actual probability of the debris flow occurring is statistically modeled. Data from 399 basins located within 15 wildland fires that burned during 2000-2002 in Colorado, Idaho, Montana, and New Mexico were evaluated. More than 35 independent variables describing the burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows were delineated from National Elevation Data using a Geographic Information System (GIS). (2) Data describing the burn severity, geology, land surface gradient, rainfall, and soil properties were determined for each basin. These data were then downloaded to a statistics software package for analysis using logistic regression. (3) Relations between the occurrence/non-occurrence of debris flows and burn severity, geology, land surface gradient, rainfall, and soil properties were evaluated and several preliminary multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combination produced the most effective model. The multivariate model that best predicted the occurrence of debris flows was selected. (4) The multivariate logistic regression model was entered into a GIS, and a map showing the probability of debris flows was constructed. 
The most effective model incorporates the percentage of each basin with slope greater than 30 percent, percentage of land burned at medium and high burn severity in each basin, particle size sorting, average storm intensity (millimeters per hour), soil organic matter content, soil permeability, and soil drainage. The results of this study demonstrate that logistic regression is a valuable tool for predicting the probability of debris flows occurring in recently-burned landscapes.
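The core modelling step in this abstract, estimating the probability of a binary outcome from several basin-level predictors, can be sketched as follows. This is a minimal illustration under stated assumptions and not the authors' workflow: the function names are hypothetical, the fit uses a plain Newton-Raphson (IRLS) loop in NumPy where the study used a statistics package, and the all-combinations variable search and GIS mapping steps are not reproduced.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25, ridge=1e-8):
    """Fit a logistic regression by Newton-Raphson (IRLS).

    X : (n, p) array of predictors (an intercept column is added here)
    y : (n,) array of 0/1 outcomes, e.g. debris flow did / did not occur
    Returns the coefficient vector [intercept, b1, ..., bp].
    """
    X1 = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(X1.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X1 @ beta)))          # current fitted probabilities
        grad = X1.T @ (y - p)                           # score vector
        hess = X1.T @ (X1 * (p * (1.0 - p))[:, None])   # observed information matrix
        hess += ridge * np.eye(len(beta))               # tiny ridge for numerical stability
        beta = beta + np.linalg.solve(hess, grad)       # Newton step
    return beta


def predict_probability(X, beta):
    """Return the modelled probability of the event for each row of X."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return 1.0 / (1.0 + np.exp(-(X1 @ beta)))
```

With real data, the columns of `X` would stand for quantities such as the percentage of each basin with slope greater than 30 percent or the average storm intensity, and `predict_probability` would supply the per-basin values that are then mapped.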
Rhodes, Kirsty M; Turner, Rebecca M; White, Ian R; Jackson, Dan; Spiegelhalter, David J; Higgins, Julian P T
2016-12-20
Many meta-analyses combine results from only a small number of studies, a situation in which the between-study variance is imprecisely estimated when standard methods are applied. Bayesian meta-analysis allows incorporation of external evidence on heterogeneity, providing the potential for more robust inference on the effect size of interest. We present a method for performing Bayesian meta-analysis using data augmentation, in which we represent an informative conjugate prior for between-study variance by pseudo data and use meta-regression for estimation. To assist in this, we derive predictive inverse-gamma distributions for the between-study variance expected in future meta-analyses. These may serve as priors for heterogeneity in new meta-analyses. In a simulation study, we compare approximate Bayesian methods using meta-regression and pseudo data against fully Bayesian approaches based on importance sampling techniques and Markov chain Monte Carlo (MCMC). We compare the frequentist properties of these Bayesian methods with those of the commonly used frequentist DerSimonian and Laird procedure. The method is implemented in standard statistical software and provides a less complex alternative to standard MCMC approaches. An importance sampling approach produces almost identical results to standard MCMC approaches, and results obtained through meta-regression and pseudo data are very similar. On average, data augmentation provides closer results to MCMC, if implemented using restricted maximum likelihood estimation rather than DerSimonian and Laird or maximum likelihood estimation. The methods are applied to real datasets, and an extension to network meta-analysis is described. The proposed method facilitates Bayesian meta-analysis in a way that is accessible to applied researchers. © 2016 The Authors. Statistics in Medicine Published by John Wiley & Sons Ltd.
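For orientation, the frequentist DerSimonian and Laird procedure that this abstract uses as a comparator can be sketched in a few lines. This is the standard moment estimator of the between-study variance followed by random-effects pooling; it is not the paper's data-augmentation method, and the function name and example numbers are illustrative assumptions.

```python
import numpy as np


def dersimonian_laird(effects, variances):
    """Random-effects meta-analysis with the DerSimonian-Laird estimator.

    effects   : per-study effect estimates
    variances : per-study within-study variances
    Returns (tau2, pooled_effect, standard_error).
    """
    y = np.asarray(effects, dtype=float)
    v = np.asarray(variances, dtype=float)
    w = 1.0 / v                                   # fixed-effect (inverse-variance) weights
    fixed_mean = np.sum(w * y) / np.sum(w)
    q = np.sum(w * (y - fixed_mean) ** 2)         # Cochran's Q statistic
    c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
    tau2 = max(0.0, (q - (len(y) - 1)) / c)       # truncated at zero
    w_re = 1.0 / (v + tau2)                       # random-effects weights
    pooled = np.sum(w_re * y) / np.sum(w_re)
    se = np.sqrt(1.0 / np.sum(w_re))
    return tau2, pooled, se
```

With only a handful of studies, `tau2` from this estimator is based on very little information, which is the imprecision that motivates bringing in external evidence through an informative prior.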
Dusseldorp, Elise; van Genugten, Lenneke; van Buuren, Stef; Verheijden, Marieke W; van Empelen, Pepijn
2014-12-01
Many health-promoting interventions combine multiple behavior change techniques (BCTs) to maximize effectiveness. Although, in theory, BCTs can amplify each other, the available meta-analyses have not been able to identify specific combinations of techniques that provide synergistic effects. This study overcomes some of the shortcomings in the current methodology by applying classification and regression trees (CART) to meta-analytic data in a special way, referred to as Meta-CART. The aim was to identify particular combinations of BCTs that explain intervention success. A reanalysis of data from Michie, Abraham, Whittington, McAteer, and Gupta (2009) was performed. These data included effect sizes from 122 interventions targeted at physical activity and healthy eating, and the coding of the interventions into 26 BCTs. A CART analysis was performed using the BCTs as predictors and treatment success (i.e., effect size) as outcome. A subgroup meta-analysis using a mixed effects model was performed to compare the treatment effect in the subgroups found by CART. Meta-CART identified the following most effective combinations: Provide information about behavior-health link with Prompt intention formation (mean effect size ḡ = 0.46), and Provide information about behavior-health link with Provide information on consequences and Use of follow-up prompts (ḡ = 0.44). Least effective interventions were those using Provide feedback on performance without using Provide instruction (ḡ = 0.05). Specific combinations of BCTs increase the likelihood of achieving change in health behavior, whereas other combinations decrease this likelihood. Meta-CART successfully identified these combinations and thus provides a viable methodology in the context of meta-analysis.
NASA Astrophysics Data System (ADS)
Jiang, Weiping; Ma, Jun; Li, Zhao; Zhou, Xiaohui; Zhou, Boye
2018-05-01
Analysing the correlations between the noise in different components of GPS stations helps in obtaining more accurate uncertainties for station velocities. Previous research into noise in GPS position time series focused mainly on single-component evaluation, which affects the acquisition of precise station positions, the velocity field, and its uncertainty. In this study, before and after removing the common-mode error (CME), we performed one-dimensional linear regression analysis of the noise amplitude vectors in different components of 126 GPS stations in Southern California, using a combination of white noise, flicker noise, and random walk noise. The results show that, on the one hand, there are above-moderate degrees of correlation between the white noise amplitude vectors in all components of the stations before and after removal of the CME, while the correlations between flicker noise amplitude vectors in horizontal and vertical components are enhanced from uncorrelated to moderately correlated by removing the CME. On the other hand, the significance tests show that all of the obtained linear regression equations, which represent a unique function of the noise amplitude in any two components, are of practical value after removing the CME. According to the noise amplitude estimates in two components and the linear regression equations, more accurate noise amplitudes can be acquired in the two components.
NASA Astrophysics Data System (ADS)
Xu, Wenbo; Jing, Shaocai; Yu, Wenjuan; Wang, Zhaoxian; Zhang, Guoping; Huang, Jianxi
2013-11-01
In this study, the areas of Sichuan Province at high risk of debris flows, Panzhihua and Liangshan Yi Autonomous Prefecture, were taken as the study areas. Using rainfall and environmental factors as predictors, and based on different prior probability combinations of debris flows, the prediction of debris flows in these areas was compared between two statistical methods: logistic regression (LR) and Bayes discriminant analysis (BDA). The comprehensive analysis shows that (a) with a mid-range prior probability, the overall prediction accuracy of BDA is higher than that of LR; (b) with equal and extreme prior probabilities, the overall prediction accuracy of LR is higher than that of BDA; (c) regional prediction models of debris flows that use rainfall factors only perform worse than those that also include environmental factors, and the prediction accuracies for occurrence and nonoccurrence of debris flows change in opposite directions when the supplementary information is added.
Gouvinhas, Irene; Machado, Nelson; Carvalho, Teresa; de Almeida, José M M M; Barros, Ana I R N A
2015-01-01
Extra virgin olive oils produced from three cultivars at different maturation stages were characterized using Raman spectroscopy. Chemometric methods (principal component analysis, discriminant analysis, principal component regression and partial least squares regression) applied to Raman spectral data were utilized to evaluate and quantify the statistical differences between cultivars and their ripening process. The models for predicting the peroxide value and free acidity of olive oils showed good calibration and prediction values and presented high coefficients of determination (>0.933). Both the R² values and the correlation equations between the measured chemical parameters and the values predicted by each approach are presented; these include both PCR and PLS, applied to SNV-normalized Raman data as well as to the first and second derivatives of the spectra. This study demonstrates that a combination of Raman spectroscopy with multivariate analysis methods can be useful to rapidly predict olive oil chemical characteristics during the maturation process. Copyright © 2014 Elsevier B.V. All rights reserved.
NASA Astrophysics Data System (ADS)
Alrehaly, Essa D.
Examination of Saudi Arabian educational practices is scarce, but increasingly important, especially in light of the country's pace in worldwide mathematics and science rankings. The purpose of the study is to understand and evaluate parental influence on male children's science education achievements in Saudi Arabia. Parental level of education and participant's choice of science major were used to identify groups for the purpose of data analysis. Data were gathered using five independent variables concerning parental educational practices (attitude, involvement, autonomy support, structure and control) and the dependent variable of science scores in high school. The sample consisted of 338 participants and was arbitrarily drawn from the science-based colleges (medical, engineering, and natural science) at Jazan University in Saudi Arabia. The data were tested using Pearson's analysis, backward multiple regression, one way ANOVA and independent t-test. The findings of the study reveal significant correlations for all five of the variables. Multiple regressions revealed that all five of the parents' educational practices indicators combined together could explain 19% of the variance in science scores and parental attitude toward science and educational involvement combined accounted for more than 18% of the variance. Analysis indicates that no significant difference is attributable to parental involvement and educational level. This finding is important because it indicates that, in Saudi Arabia, results are not consistent with research in Western or other Asian contexts.
Meta-analysis identifies a MECOM gene as a novel predisposing factor of osteoporotic fracture
Hwang, Joo-Yeon; Lee, Seung Hun; Go, Min Jin; Kim, Beom-Jun; Kou, Ikuyo; Ikegawa, Shiro; Guo, Yan; Deng, Hong-Wen; Raychaudhuri, Soumya; Kim, Young Jin; Oh, Ji Hee; Kim, Youngdoe; Moon, Sanghoon; Kim, Dong-Joon; Koo, Heejo; Cha, My-Jung; Lee, Min Hye; Yun, Ji Young; Yoo, Hye-Sook; Kang, Young-Ah; Cho, Eun-Hee; Kim, Sang-Wook; Oh, Ki Won; Kang, Moo II; Son, Ho Young; Kim, Shin-Yoon; Kim, Ghi Su; Han, Bok-Ghee; Cho, Yoon Shin; Cho, Myeong-Chan; Lee, Jong-Young; Koh, Jung-Min
2014-01-01
Background: Osteoporotic fracture (OF) as a clinical endpoint is a major complication of osteoporosis. To screen for OF susceptibility genes, we performed a genome-wide association study and carried out de novo replication analysis of an East Asian population. Methods: Association was tested using a logistic regression analysis. A meta-analysis was performed on the combined results using effect sizes and standard errors estimated for each study. Results: In a combined meta-analysis of a discovery cohort (288 cases and 1139 controls), three hospital-based sets in replication stage I (462 cases and 1745 controls), and an independent ethnic group in replication stage II (369 cases and 560 controls), we identified a new locus associated with OF (rs784288 in the MECOM gene) that showed genome-wide significance (p = 3.59×10⁻⁸; OR 1.39). RNA interference revealed that a MECOM knockdown suppresses osteoclastogenesis. Conclusions: Our findings provide new insights into the genetic architecture underlying OF in East Asians. PMID:23349225
Revisiting the southern pine growth decline: Where are we 10 years later?
Gary L. Gadbury; Michael S. Williams; Hans T. Schreuder
2004-01-01
This paper evaluates changes in growth of pine stands in the state of Georgia, U.S.A., using USDA Forest Service Forest Inventory and Analysis (FIA) data. In particular, data representing an additional 10-year growth cycle has been added to previously published results from two earlier growth cycles. A robust regression procedure is combined with a bootstrap technique...
Predictive equations for the estimation of body size in seals and sea lions (Carnivora: Pinnipedia)
Churchill, Morgan; Clementz, Mark T; Kohno, Naoki
2014-01-01
Body size plays an important role in pinniped ecology and life history. However, body size data is often absent for historical, archaeological, and fossil specimens. To estimate the body size of pinnipeds (seals, sea lions, and walruses) for today and the past, we used 14 commonly preserved cranial measurements to develop sets of single variable and multivariate predictive equations for pinniped body mass and total length. Principal components analysis (PCA) was used to test whether separate family specific regressions were more appropriate than single predictive equations for Pinnipedia. The influence of phylogeny was tested with phylogenetic independent contrasts (PIC). The accuracy of these regressions was then assessed using a combination of coefficient of determination, percent prediction error, and standard error of estimation. Three different methods of multivariate analysis were examined: bidirectional stepwise model selection using Akaike information criteria; all-subsets model selection using Bayesian information criteria (BIC); and partial least squares regression. The PCA showed clear discrimination between Otariidae (fur seals and sea lions) and Phocidae (earless seals) for the 14 measurements, indicating the need for family-specific regression equations. The PIC analysis found that phylogeny had a minor influence on relationship between morphological variables and body size. The regressions for total length were more accurate than those for body mass, and equations specific to Otariidae were more accurate than those for Phocidae. Of the three multivariate methods, the all-subsets approach required the fewest number of variables to estimate body size accurately. We then used the single variable predictive equations and the all-subsets approach to estimate the body size of two recently extinct pinniped taxa, the Caribbean monk seal (Monachus tropicalis) and the Japanese sea lion (Zalophus japonicus). 
Body size estimates using single variable regressions generally under- or overestimated body size; however, the all-subsets regression produced body size estimates that were close to historically recorded body length for these two species. This indicates that the all-subsets regression equations developed in this study can estimate body size accurately. PMID:24916814
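Two of the accuracy measures named in this abstract, the coefficient of determination and percent prediction error, can be sketched as below. The definition of percent prediction error used here (absolute error divided by the predicted value) is a common convention and is an assumption, since the abstract does not give its formula. The function names and the length values are hypothetical placeholders.

```python
def percent_prediction_error(observed, predicted):
    """Absolute prediction error as a percentage of the predicted value."""
    return abs(observed - predicted) / predicted * 100.0


def r_squared(observed, predicted):
    """Coefficient of determination for paired observed/predicted values."""
    mean_observed = sum(observed) / len(observed)
    ss_residual = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - ss_residual / ss_total


# Placeholder total lengths in cm (illustrative only, not from the study).
observed_length_cm = [180.0, 210.0, 240.0, 150.0]
predicted_length_cm = [175.0, 220.0, 236.0, 158.0]

errors = [percent_prediction_error(o, p)
          for o, p in zip(observed_length_cm, predicted_length_cm)]
mean_ppe = sum(errors) / len(errors)
fit_quality = r_squared(observed_length_cm, predicted_length_cm)
print(f"mean PPE = {mean_ppe:.1f}%, R^2 = {fit_quality:.3f}")
```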
2012-01-01
Background Chronic depression represents a substantial portion of depressive disorders and is associated with severe consequences. This review examined whether the combination of pharmacological treatments and psychotherapy is associated with higher effectiveness than pharmacotherapy alone via meta-analysis, and identified possible treatment effect modifiers via meta-regression analysis. Methods A systematic search was conducted in the following databases: Cochrane Central Register of Controlled Trials (CENTRAL), MEDLINE, EMBASE, ISI Web of Science, BIOSIS, PsycINFO, and CINAHL. The primary efficacy outcome was a response to treatment; the primary acceptance outcome was dropping out of the study. Only randomized controlled trials were considered. Results We identified 8 studies with a total of 9 relevant comparisons. Our analysis revealed small, statistically non-significant effects of combined therapies on outcomes directly related to depression (BR = 1.20) with substantial heterogeneity between studies (I² = 67%). Three treatment effect modifiers were identified: target disorders, the type of psychotherapy and the type of pharmacotherapy. Small but statistically significant effects of combined therapies on quality of life (SMD = 0.18) were revealed. No differences in acceptance rates and long-term effects between combined treatments and pure pharmacological interventions were observed. Conclusions This systematic review could not provide clear evidence for the combination of pharmacotherapy and psychotherapy. However, due to the small number of primary studies, further research is needed for a conclusive decision. PMID:22694751
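The heterogeneity statistic I² reported in this abstract is conventionally derived from Cochran's Q. A minimal sketch follows, assuming inverse-variance weights; the function name `i_squared` and the input values are hypothetical and unrelated to the review's data.

```python
def i_squared(estimates, std_errors):
    """Return I^2 (percent) computed from Cochran's Q with inverse-variance weights."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    q_statistic = sum(w * (y - pooled) ** 2 for w, y in zip(weights, estimates))
    degrees_of_freedom = len(estimates) - 1
    if q_statistic <= 0.0:
        return 0.0  # identical estimates: no observable heterogeneity
    return max(0.0, (q_statistic - degrees_of_freedom) / q_statistic) * 100.0


# Placeholder effect estimates and standard errors (illustrative only).
effects = [0.10, 0.45, 0.05, 0.60]
errors = [0.10, 0.12, 0.15, 0.11]
print(f"I^2 = {i_squared(effects, errors):.0f}%")
```

I² expresses the share of total variation across studies that exceeds what sampling error alone would produce, and it is truncated at zero when Q falls below its degrees of freedom.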
Nixon, R M; Bansback, N; Brennan, A
2007-03-15
Mixed treatment comparison (MTC) is a generalization of meta-analysis. Instead of the same treatment for a disease being tested in a number of studies, a number of different interventions are considered. Meta-regression is also a generalization of meta-analysis where an attempt is made to explain the heterogeneity between the treatment effects in the studies by regressing on study-level covariables. Our focus is where there are several different treatments considered in a number of randomized controlled trials in a specific disease, the same treatment can be applied in several arms within a study, and where differences in efficacy can be explained by differences in the study settings. We develop methods for simultaneously comparing several treatments and adjusting for study-level covariables by combining ideas from MTC and meta-regression. We use a case study from rheumatoid arthritis. We identified relevant trials of biologic versus standard therapy or placebo and extracted the doses, comparators and patient baseline characteristics. Efficacy is measured using the log odds ratio of achieving six-month ACR50 responder status. A random-effects meta-regression model is fitted which adjusts the log odds ratio for study-level prognostic factors. A different random-effect distribution on the log odds ratios is allowed for each different treatment. The odds ratio is found as a function of the prognostic factors for each treatment. The apparent differences in the randomized trials between tumour necrosis factor alpha (TNF-alpha) antagonists are explained by differences in prognostic factors and the analysis suggests that these drugs as a class are not different from each other. Copyright (c) 2006 John Wiley & Sons, Ltd.
Hernández, M M; Martínez-Villar, E; Peace, C; Pérez-Moreno, I; Marco, V
2012-12-01
Laboratory studies were conducted to evaluate the compatibility of flufenoxuron and azadirachtin with Beauveria bassiana against Tetranychus urticae larvae, along with the required Probit analysis of the involved chemicals on all of the life stages of this mite. Flufenoxuron displayed parallel regression lines for the mortality of eggs, deutonymphs and adults. Larvae and protonymphs were the most susceptible life stages. Protonymphs were 35 times more sensitive than eggs and adults. Azadirachtin gave equal mortality on proto- and deutonymphs. The response of eggs and adults was equivalent when treated with azadirachtin. The regression lines for proto- and deutonymphs were parallel to those of adults and eggs, yet the nymphs were three times more sensitive. The effects of separate combinations of the entomopathogenic fungus Beauveria bassiana at its LC(20) with flufenoxuron and azadirachtin at their corresponding LC(40) were evaluated on mite larvae. The application of flufenoxuron with B. bassiana revealed a clear synergy, while the combination of azadirachtin and B. bassiana had an additive effect. These combinations with B. bassiana could improve mite control by contributing to a decline in the likelihood of resistance so often described in the literature.
Pekala, Ronald J; Baglio, Francesca; Cabinio, Monia; Lipari, Susanna; Baglio, Gisella; Mendozzi, Laura; Cecconi, Pietro; Pugnetti, Luigi; Sciaky, Riccardo
2017-01-01
Previous research using stepwise regression analyses found self-reported hypnotic depth (srHD) to be a function of suggestibility, trance state effects, and expectancy. This study sought to replicate and expand that research using a general state measure of hypnotic responsivity, the Phenomenology of Consciousness Inventory: Hypnotic Assessment Procedure (PCI-HAP). Ninety-five participants completed an Italian translation of the PCI-HAP, with srHD scores predicted from the PCI-HAP assessment items. The regression analysis replicated the previous research results. Additionally, stepwise regression analyses were able to predict the srHD score equally well using only the PCI dimension scores. These results not only replicated prior research but suggest how this methodology to assess hypnotic responsivity, when combined with more traditional neurophysiological and cognitive-behavioral methodologies, may allow for a more comprehensive understanding of that enigma called hypnosis.
Classical Testing in Functional Linear Models
Kong, Dehan; Staicu, Ana-Maria; Maity, Arnab
2016-01-01
We extend four tests common in classical regression - Wald, score, likelihood ratio and F tests - to functional linear regression, for testing the null hypothesis, that there is no association between a scalar response and a functional covariate. Using functional principal component analysis, we re-express the functional linear model as a standard linear model, where the effect of the functional covariate can be approximated by a finite linear combination of the functional principal component scores. In this setting, we consider application of the four traditional tests. The proposed testing procedures are investigated theoretically for densely observed functional covariates when the number of principal components diverges. Using the theoretical distribution of the tests under the alternative hypothesis, we develop a procedure for sample size calculation in the context of functional linear regression. The four tests are further compared numerically for both densely and sparsely observed noisy functional data in simulation experiments and using two real data applications. PMID:28955155
An application of robust ridge regression model in the presence of outliers to real data problem
NASA Astrophysics Data System (ADS)
Shariff, N. S. Md.; Ferdaos, N. A.
2017-09-01
Multicollinearity and outliers often lead to inconsistent and unreliable parameter estimates in regression analysis. The well-known procedure that is robust to the multicollinearity problem is the ridge regression method. This method, however, is believed to be affected by the presence of outliers. The combination of GM-estimation and the ridge parameter, which is robust to both problems, is of interest in this study. As such, both techniques are employed to investigate the relationship between stock market price and macroeconomic variables in Malaysia, a data set suspected to involve both multicollinearity and outliers. Four macroeconomic factors are selected for this study: Consumer Price Index (CPI), Gross Domestic Product (GDP), Base Lending Rate (BLR) and Money Supply (M1). The results demonstrate that the proposed procedure is able to produce reliable results in the presence of multicollinearity and outliers in the real data.
Miozzo, Michele; Pulvermüller, Friedemann; Hauk, Olaf
2015-01-01
The time course of brain activation during word production has become an area of increasingly intense investigation in cognitive neuroscience. The predominant view has been that semantic and phonological processes are activated sequentially, at about 150 and 200–400 ms after picture onset. Although evidence from prior studies has been interpreted as supporting this view, these studies were arguably not ideally suited to detect early brain activation of semantic and phonological processes. We here used a multiple linear regression approach to magnetoencephalography (MEG) analysis of picture naming in order to investigate early effects of variables specifically related to visual, semantic, and phonological processing. This was combined with distributed minimum-norm source estimation and region-of-interest analysis. Brain activation associated with visual image complexity appeared in occipital cortex at about 100 ms after picture presentation onset. At about 150 ms, semantic variables became physiologically manifest in left frontotemporal regions. In the same latency range, we found an effect of phonological variables in the left middle temporal gyrus. Our results demonstrate that multiple linear regression analysis is sensitive to early effects of multiple psycholinguistic variables in picture naming. Crucially, our results suggest that access to phonological information might begin in parallel with semantic processing around 150 ms after picture onset. PMID:25005037
Shanks, David R
2017-06-01
Many studies of unconscious processing involve comparing a performance measure (e.g., some assessment of perception or memory) with an awareness measure (such as a verbal report or a forced-choice response) taken either concurrently or separately. Unconscious processing is inferred when above-chance performance is combined with null awareness. Often, however, aggregate awareness is better than chance, and data analysis therefore employs a form of extreme group analysis focusing post hoc on participants, trials, or items where awareness is absent or at chance. The pitfalls of this analytic approach are described with particular reference to recent research on implicit learning and subliminal perception. Because of regression to the mean, the approach can mislead researchers into erroneous conclusions concerning unconscious influences on behavior. Recommendations are made about future use of post hoc selection in research on unconscious cognition.
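The regression-to-the-mean pitfall described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the article's own analysis: every participant has a single latent sensitivity that drives both measures, each measure adds independent noise, and all distribution parameters and the function name `simulate_post_hoc_selection` are invented for illustration.

```python
import random
import statistics


def simulate_post_hoc_selection(n_participants=20000, seed=1):
    """Select participants whose awareness score is at or below chance (0)
    and return the mean awareness and mean performance of that subgroup."""
    rng = random.Random(seed)
    selected_awareness = []
    selected_performance = []
    for _ in range(n_participants):
        latent = rng.gauss(0.5, 0.5)               # one shared underlying process
        awareness = latent + rng.gauss(0.0, 1.0)   # noisy awareness measure
        performance = latent + rng.gauss(0.0, 1.0) # noisy performance measure
        if awareness <= 0.0:                       # post hoc "unaware" subgroup
            selected_awareness.append(awareness)
            selected_performance.append(performance)
    return (statistics.mean(selected_awareness),
            statistics.mean(selected_performance))


mean_awareness, mean_performance = simulate_post_hoc_selection()
print(f"selected subgroup: awareness = {mean_awareness:.2f}, "
      f"performance = {mean_performance:.2f}")
```

In this setup the selected subgroup shows below-chance awareness yet above-chance performance, even though no separate unconscious process exists in the simulation. Selecting on a noisy awareness score picks out participants whose noise happened to be negative, so their true sensitivity is higher than the score suggests.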
Wu, Xue; Sengupta, Kaushik
2018-03-19
This paper demonstrates a methodology to miniaturize THz spectroscopes into a single silicon chip by eliminating traditional solid-state architectural components such as complex tunable THz and optical sources, nonlinear mixing and amplifiers. The proposed method achieves this by extracting incident THz spectral signatures from the surface of an on-chip antenna itself. The information is sensed through the spectrally-sensitive 2D distribution of the impressed current surface under the THz incident field. By converting the antenna from a single-port to a massively multi-port architecture with integrated electronics and deep subwavelength sensing, THz spectral estimation is converted into a linear estimation problem. We employ rigorous regression techniques and analysis to demonstrate a single silicon chip system operating at room temperature across 0.04-0.99 THz with 10 MHz accuracy in spectrum estimation of THz tones across the entire spectrum.
Impact of job characteristics on psychological health of Chinese single working women.
Yeung, D Y; Tang, C S
2001-01-01
This study aims at investigating the impact of individual and contextual job characteristics of control, psychological and physical demand, and security on psychological distress of 193 Chinese single working women in Hong Kong. The mediating role of job satisfaction in the job characteristics-distress relation is also assessed. Multiple regression analysis results show that job satisfaction mediates the effects of job control and security in predicting psychological distress, whereas psychological job demand has an independent effect on mental distress after considering the effect of job satisfaction. This main effect model indicates that psychological distress is best predicted by small company size, high psychological job demand, and low job satisfaction. Results from a separate regression analysis fail to support the overall combined effect of job demand-control on psychological distress. However, a significant physical job demand-control interaction effect on mental distress is noted, which is reduced slightly after controlling for the effect of job satisfaction.
Efficient Regressions via Optimally Combining Quantile Information
Zhao, Zhibiao; Xiao, Zhijie
2014-01-01
We develop a generally applicable framework for constructing efficient estimators of regression models via quantile regressions. The proposed method is based on optimally combining information over multiple quantiles and can be applied to a broad range of parametric and nonparametric settings. When combining information over a fixed number of quantiles, we derive an upper bound on the distance between the efficiency of the proposed estimator and the Fisher information. As the number of quantiles increases, this upper bound decreases and the asymptotic variance of the proposed estimator approaches the Cramér-Rao lower bound under appropriate conditions. In the case of non-regular statistical estimation, the proposed estimator leads to super-efficient estimation. We illustrate the proposed method for several widely used regression models. Both asymptotic theory and Monte Carlo experiments show the superior performance over existing methods. PMID:25484481
The Potential Impact of an Anthrax Attack on Real Estate Prices and Foreclosures in Seattle.
Dormady, Noah; Szelazek, Thomas; Rose, Adam
2014-01-01
This article provides a methodology for the economic analysis of the potential consequences of a simulated anthrax terrorism attack on real estate within the Seattle metropolitan area. We estimate spatially disaggregated impacts on median sales price of residential housing within the Seattle metro area following an attack on the central business district (CBD). Using a combination of longitudinal panel regression and GIS analysis, we find that the median sales price in the CBD could decline by as much as $280,000, and by nearly $100,000 in nearby communities. These results indicate that total residential property values could decrease by over $50 billion for Seattle, or a 33% overall decline. We combine these estimates with HUD's 2009 American Housing Survey (AHS) to further predict 70,000 foreclosures in Seattle spatial zones following the terrorism event. © 2013 Society for Risk Analysis.
The association between intelligence and lifespan is mostly genetic.
Arden, Rosalind; Luciano, Michelle; Deary, Ian J; Reynolds, Chandra A; Pedersen, Nancy L; Plassman, Brenda L; McGue, Matt; Christensen, Kaare; Visscher, Peter M
2016-02-01
Several studies in the new field of cognitive epidemiology have shown that higher intelligence predicts longer lifespan. This positive correlation might arise from socioeconomic status influencing both intelligence and health; intelligence leading to better health behaviours; and/or some shared genetic factors influencing both intelligence and health. Distinguishing among these hypotheses is crucial for medicine and public health, but can only be accomplished by studying a genetically informative sample. We analysed data from three genetically informative samples containing information on intelligence and mortality: Sample 1, 377 pairs of male veterans from the NAS-NRC US World War II Twin Registry; Sample 2, 246 pairs of twins from the Swedish Twin Registry; and Sample 3, 784 pairs of twins from the Danish Twin Registry. The age at which intelligence was measured differed between the samples. We used three methods of genetic analysis to examine the relationship between intelligence and lifespan: we calculated the proportion of the more intelligent twins who outlived their co-twin; we regressed within-twin-pair lifespan differences on within-twin-pair intelligence differences; and we used the resulting regression coefficients to model the additive genetic covariance. We conducted a meta-analysis of the regression coefficients across the three samples. The combined (and all three individual samples) showed a small positive phenotypic correlation between intelligence and lifespan. In the combined sample observed r = .12 (95% confidence interval .06 to .18). The additive genetic covariance model supported a genetic relationship between intelligence and lifespan. In the combined sample the genetic contribution to the covariance was 95%; in the US study, 84%; in the Swedish study, 86%, and in the Danish study, 85%. 
The finding of common genetic effects between lifespan and intelligence has important implications for public health, and for those interested in the genetics of intelligence, lifespan or inequalities in health outcomes including lifespan. © The Author 2015; Published by Oxford University Press on behalf of the International Epidemiological Association.
The association between intelligence and lifespan is mostly genetic
Arden, Rosalind; Deary, Ian J; Reynolds, Chandra A; Pedersen, Nancy L; Plassman, Brenda L; McGue, Matt; Christensen, Kaare; Visscher, Peter M
2016-01-01
Abstract Background: Several studies in the new field of cognitive epidemiology have shown that higher intelligence predicts longer lifespan. This positive correlation might arise from socioeconomic status influencing both intelligence and health; intelligence leading to better health behaviours; and/or some shared genetic factors influencing both intelligence and health. Distinguishing among these hypotheses is crucial for medicine and public health, but can only be accomplished by studying a genetically informative sample. Methods: We analysed data from three genetically informative samples containing information on intelligence and mortality: Sample 1, 377 pairs of male veterans from the NAS-NRC US World War II Twin Registry; Sample 2, 246 pairs of twins from the Swedish Twin Registry; and Sample 3, 784 pairs of twins from the Danish Twin Registry. The age at which intelligence was measured differed between the samples. We used three methods of genetic analysis to examine the relationship between intelligence and lifespan: we calculated the proportion of the more intelligent twins who outlived their co-twin; we regressed within-twin-pair lifespan differences on within-twin-pair intelligence differences; and we used the resulting regression coefficients to model the additive genetic covariance. We conducted a meta-analysis of the regression coefficients across the three samples. Results: The combined (and all three individual samples) showed a small positive phenotypic correlation between intelligence and lifespan. In the combined sample observed r = .12 (95% confidence interval .06 to .18). The additive genetic covariance model supported a genetic relationship between intelligence and lifespan. In the combined sample the genetic contribution to the covariance was 95%; in the US study, 84%; in the Swedish study, 86%, and in the Danish study, 85%. 
Conclusions: The finding of common genetic effects between lifespan and intelligence has important implications for public health, and for those interested in the genetics of intelligence, lifespan or inequalities in health outcomes including lifespan. PMID:26213105
Weaver, J. Curtis; Feaster, Toby D.; Gotvald, Anthony J.
2009-01-01
Reliable estimates of the magnitude and frequency of floods are required for the economical and safe design of transportation and water-conveyance structures. A multistate approach was used to update methods for estimating the magnitude and frequency of floods in rural, ungaged basins in North Carolina, South Carolina, and Georgia that are not substantially affected by regulation, tidal fluctuations, or urban development. In North Carolina, annual peak-flow data available through September 2006 were available for 584 sites; 402 of these sites had a total of 10 or more years of systematic record that is required for at-site, flood-frequency analysis. Following data reviews and the computation of 20 physical and climatic basin characteristics for each station as well as at-site flood-frequency statistics, annual peak-flow data were identified for 363 sites in North Carolina suitable for use in this analysis. Among these 363 sites, 19 sites had records that could be divided into unregulated and regulated/ channelized annual peak discharges, which means peak-flow records were identified for a total of 382 cases in North Carolina. Considering the 382 cases, at-site flood-frequency statistics are provided for 333 unregulated cases (also used for the regression database) and 49 regulated/channelized cases. The flood-frequency statistics for the 333 unregulated sites were combined with data for sites from South Carolina, Georgia, and adjacent parts of Alabama, Florida, Tennessee, and Virginia to create a database of 943 sites considered for use in the regional regression analysis. Flood-frequency statistics were computed by fitting logarithms (base 10) of the annual peak flows to a log-Pearson Type III distribution. As part of the computation process, a new generalized skew coefficient was developed by using a Bayesian generalized least-squares regression model. 
Exploratory regression analyses using ordinary least-squares regression completed on the initial database of 943 sites resulted in defining five hydrologic regions for North Carolina, South Carolina, and Georgia. Stations with drainage areas less than 1 square mile were removed from the database, and a procedure to examine for basin redundancy (based on drainage area and periods of record) also resulted in the removal of some stations from the regression database. Flood-frequency estimates and basin characteristics for 828 gaged stations were combined to form the final database that was used in the regional regression analysis. Regional regression analysis, using generalized least-squares regression, was used to develop a set of predictive equations that can be used for estimating the 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent chance exceedance flows for rural ungaged, basins in North Carolina, South Carolina, and Georgia. The final predictive equations are all functions of drainage area and the percentage of drainage basin within each of the five hydrologic regions. Average errors of prediction for these regression equations range from 34.0 to 47.7 percent. Discharge estimates determined from the systematic records for the current study are, on average, larger in magnitude than those from a previous study for the highest percent chance exceedances (50 and 20 percent) and tend to be smaller than those from the previous study for the lower percent chance exceedances when all sites are considered as a group. For example, mean differences for sites in the Piedmont hydrologic region range from positive 0.5 percent for the 50-percent chance exceedance flow to negative 4.6 percent for the 0.2-percent chance exceedance flow when stations are grouped by hydrologic region. 
Similarly for the same hydrologic region, median differences range from positive 0.9 percent for the 50-percent chance exceedance flow to negative 7.1 percent for the 0.2-percent chance exceedance flow. However, mean and median percentage differences between the estimates from the previous and curre
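The percent chance exceedance levels listed in this abstract map to average recurrence intervals through the standard reciprocal relation T = 100 / p, where p is the annual exceedance probability in percent. The short sketch below only performs that arithmetic; the variable names are illustrative.

```python
# Annual percent chance exceedance levels named in the abstract.
percent_chance_exceedance = [50, 20, 10, 4, 2, 1, 0.5, 0.2]

# Average recurrence interval in years: T = 100 / p (p in percent).
recurrence_interval_years = [round(100 / p) for p in percent_chance_exceedance]

for p, years in zip(percent_chance_exceedance, recurrence_interval_years):
    print(f"{p}-percent chance exceedance ~ {years}-year flood")
```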
Elmunzer, B Joseph; Higgins, Peter D R; Saini, Sameer D; Scheiman, James M; Parker, Robert A; Chak, Amitabh; Romagnuolo, Joseph; Mosler, Patrick; Hayward, Rodney A; Elta, Grace H; Korsnes, Sheryl J; Schmidt, Suzette E; Sherman, Stuart; Lehman, Glen A; Fogel, Evan L
2013-03-01
A recent large-scale randomized controlled trial (RCT) demonstrated that rectal indomethacin administration is effective in addition to pancreatic stent placement (PSP) for preventing post-endoscopic retrograde cholangiopancreatography (ERCP) pancreatitis (PEP) in high-risk cases. We performed a post hoc analysis of this RCT to explore whether rectal indomethacin can replace PSP in the prevention of PEP and to estimate the potential cost savings of such an approach. We retrospectively classified RCT subjects into four prevention groups: (1) no prophylaxis, (2) PSP alone, (3) rectal indomethacin alone, and (4) the combination of PSP and indomethacin. Multivariable logistic regression was used to adjust for imbalances in the prevalence of risk factors for PEP between the groups. Based on these adjusted PEP rates, we conducted an economic analysis comparing the costs associated with PEP prevention strategies employing rectal indomethacin alone, PSP alone, or the combination of both. After adjusting for risk using two different logistic regression models, rectal indomethacin alone appeared to be more effective for preventing PEP than no prophylaxis, PSP alone, and the combination of indomethacin and PSP. Economic analysis revealed that indomethacin alone was a cost-saving strategy in 96% of Monte Carlo trials. A prevention strategy employing rectal indomethacin alone could save approximately $150 million annually in the United States compared with a strategy of PSP alone, and $85 million compared with a strategy of indomethacin and PSP. This hypothesis-generating study suggests that prophylactic rectal indomethacin could replace PSP in patients undergoing high-risk ERCP, potentially improving clinical outcomes and reducing healthcare costs. An RCT comparing rectal indomethacin alone vs. indomethacin plus PSP is needed.
Cao, Zhiyong; Zhang, Ping; He, Zhiqing; Yang, Jing; Liang, Chun; Ren, Yusheng; Wu, Zonggui
2016-05-26
The current study was designed to investigate the effects of obstructive sleep apnea (OSA) combined with dyslipidemia on the prevalence of atherosclerotic cardiovascular diseases (ASCVD). This was a cross-sectional study, and subjects with documented dyslipidemia and without a previous diagnosis of OSA were enrolled. Polysomnography was applied to evaluate the apnea-hypopnea index (AHI). Based on AHI value, subjects were classified into four groups: without OSA, mild, moderate and severe OSA groups. Clinical characteristics and laboratory examination data were recorded. The relationship between AHI events and lipid profiles was analyzed, and logistic regression analysis was used to evaluate the effects of OSA combined with dyslipidemia on ASCVD prevalence. A total of 248 subjects with dyslipidemia were enrolled. Compared to the other 3 groups, subjects with severe OSA were older, predominantly male and had a higher smoking rate. In addition, subjects with severe OSA had higher body mass index, waist-hip ratio, blood pressure, and higher rates of overweight and obesity. Serum levels of fasting plasma glucose, glycated hemoglobin, LDL-C and CRP were all significantly higher. ASCVD prevalence was considerably higher in subjects with severe OSA. AHI in the severe OSA group was up to 35.4 ± 5.1 events per hour, which was significantly higher than in the other groups (P < 0.05 for trend). Pearson correlation analysis showed that only LDL-C was positively correlated with AHI events (r = 0.685, P < 0.05). Logistic regression analysis revealed that in the unadjusted model, compared to the dyslipidemia plus no-OSA group (reference group), OSA enhanced ASCVD risk in subjects with dyslipidemia, regardless of OSA severity. After extensive adjustment for confounding variables, the odds for dyslipidemia plus mild OSA were reduced to insignificance. 
The effects of moderate and severe OSA on promoting ASCVD risk in subjects with dyslipidemia remained significant, however, with severe OSA most prominent (odds ratio: 1.52, 95% confidence interval: 1.13-2.02). OSA combined with dyslipidemia conferred additive adverse effects on the cardiovascular system, with severe OSA most prominent.
Guo, Jin-Cheng; Wu, Yang; Chen, Yang; Pan, Feng; Wu, Zhi-Yong; Zhang, Jia-Sheng; Wu, Jian-Yi; Xu, Xiu-E; Zhao, Jian-Mei; Li, En-Min; Zhao, Yi; Xu, Li-Yan
2018-04-09
Esophageal squamous cell carcinoma (ESCC) is the predominant subtype of esophageal carcinoma in China. This study aimed to develop a staging model to predict outcomes of patients with ESCC. Using Cox regression analysis, principal component analysis (PCA), partitioning clustering, Kaplan-Meier analysis, receiver operating characteristic (ROC) curve analysis, and classification and regression tree (CART) analysis, we mined the Gene Expression Omnibus database to determine the expression profiles of genes in 179 patients with ESCC from the GSE63624 and GSE63622 datasets. Univariate Cox regression analysis of the GSE63624 dataset revealed that 2404 protein-coding genes (PCGs) and 635 long non-coding RNAs (lncRNAs) were associated with the survival of patients with ESCC. PCA categorized these PCGs and lncRNAs into three principal components (PCs), which were used to cluster the patients into three groups. ROC analysis demonstrated that the predictive ability of PCG-lncRNA PCs when applied to new patients was better than that of the tumor-node-metastasis staging (area under ROC curve [AUC]: 0.69 vs. 0.65, P < 0.05). Accordingly, using CART analysis in the GSE63624 dataset, we constructed a molecular disaggregated model comprising one lncRNA and two PCGs, which we designated the LSB staging model. This LSB staging model classified the GSE63622 dataset of patients into three different groups, and its effectiveness was validated by analysis of another cohort of 105 patients. The LSB staging model has clinical significance for the prognosis prediction of patients with ESCC and may serve as a three-gene staging microarray.
Bertrand-Krajewski, J L
2004-01-01
In order to replace traditional sampling and analysis techniques, turbidimeters can be used to estimate TSS concentration in sewers, by means of sensor- and site-specific empirical equations established by linear regression of on-site turbidity values T against TSS concentrations C measured in corresponding samples. As the ordinary least-squares method is not able to account for measurement uncertainties in both T and C variables, an appropriate regression method is used to solve this difficulty and to evaluate correctly the uncertainty in TSS concentrations estimated from measured turbidity. The regression method is described, including detailed calculations of variances and covariance in the regression parameters. An example of application is given for a calibrated turbidimeter used in a combined sewer system, with data collected during three dry weather days. In order to show how the established regression could be used, an independent 24-hour dry weather turbidity data series recorded at a 2 min time interval is transformed into estimated TSS concentrations and compared to TSS concentrations measured in samples. The comparison appears satisfactory and suggests that turbidity measurements could replace traditional samples. Further developments, including wet weather periods and other types of sensors, are suggested.
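The abstract does not name the regression method it uses. As one common way to allow for measurement error in both variables, the sketch below implements Deming regression, which assumes the ratio of the two error variances is known. This is a stand-in technique chosen for illustration, not necessarily the author's method; the function name and the sample turbidity and TSS values are hypothetical, and no parameter uncertainties are computed.

```python
import math

def deming_fit(x, y, delta=1.0):
    """Deming regression of y on x with errors in both variables.

    delta is the assumed ratio var(error in y) / var(error in x).
    Returns (intercept, slope). Requires a non-zero sample covariance.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x) / (n - 1)
    syy = sum((yi - my) ** 2 for yi in y) / (n - 1)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (n - 1)
    d = syy - delta * sxx
    slope = (d + math.sqrt(d * d + 4 * delta * sxy * sxy)) / (2 * sxy)
    return my - slope * mx, slope

# Hypothetical calibration pairs: turbidity T (NTU) and sampled TSS C (mg/L).
T = [52.0, 80.0, 115.0, 150.0, 190.0, 240.0]
C = [61.0, 98.0, 131.0, 182.0, 220.0, 285.0]
a, b = deming_fit(T, C, delta=1.0)
print(f"C = {a:.2f} + {b:.3f} * T")
```

With delta = 1 the fit reduces to orthogonal regression; a larger delta places more of the assumed error on C, and the slope moves toward the ordinary least-squares value.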
NASA Astrophysics Data System (ADS)
Whitehead, James Joshua
The analysis documented herein provides an integrated approach for the conduct of optimization under uncertainty (OUU) using Monte Carlo Simulation (MCS) techniques coupled with response surface-based methods for characterization of mixture-dependent variables. This novel methodology provides an innovative means of conducting optimization studies under uncertainty in propulsion system design. Analytic inputs are based upon empirical regression rate information obtained from design of experiments (DOE) mixture studies utilizing a mixed oxidizer hybrid rocket concept. Hybrid fuel regression rate was selected as the target response variable for optimization under uncertainty, with maximization of regression rate chosen as the driving objective. Characteristic operational conditions and propellant mixture compositions from experimental efforts conducted during previous foundational work were combined with elemental uncertainty estimates as input variables. Response surfaces for mixture-dependent variables and their associated uncertainty levels were developed using quadratic response equations incorporating single and two-factor interactions. These analysis inputs, response surface equations and associated uncertainty contributions were applied to a probabilistic MCS to develop dispersed regression rates as a function of operational and mixture input conditions within design space. Illustrative case scenarios were developed and assessed using this analytic approach including fully and partially constrained operational condition sets over all of design mixture space. In addition, optimization sets were performed across an operationally representative region in operational space and across all investigated mixture combinations. These scenarios were selected as representative examples relevant to propulsion system optimization, particularly for hybrid and solid rocket platforms. 
Ternary diagrams, including contour and surface plots, were developed and utilized to aid in visualization. The concept of Expanded-Durov diagrams was also adopted and adapted to this study to aid in visualization of uncertainty bounds. Regions of maximum regression rate and associated uncertainties were determined for each set of case scenarios. Application of response surface methodology coupled with probabilistic-based MCS allowed for flexible and comprehensive interrogation of mixture and operating design space during optimization cases. Analyses were also conducted to assess sensitivity of uncertainty to variations in key elemental uncertainty estimates. The methodology developed during this research provides an innovative optimization tool for future propulsion design efforts.
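A minimal sketch of the general idea, assuming a three-component mixture and a quadratic response equation with linear and two-factor interaction terms: coefficients are drawn from normal distributions, the response is evaluated on a grid over the mixture simplex, and the blend with the highest mean response is reported with its dispersion. All coefficient values, uncertainties, and names below are hypothetical and are not taken from the dissertation.

```python
import random
import statistics

# Hypothetical response-surface coefficients: (nominal value, standard uncertainty).
COEFFS = {"x1": (1.0, 0.05), "x2": (0.8, 0.05), "x3": (0.6, 0.05),
          "x1x2": (0.9, 0.10), "x1x3": (0.2, 0.10), "x2x3": (0.4, 0.10)}

def regression_rate(x1, x2, x3, c):
    """Quadratic mixture response: linear terms plus two-factor interactions."""
    return (c["x1"] * x1 + c["x2"] * x2 + c["x3"] * x3
            + c["x1x2"] * x1 * x2 + c["x1x3"] * x1 * x3 + c["x2x3"] * x2 * x3)

def simulate(x1, x2, x3, n_draws=500, seed=0):
    """Monte Carlo mean and standard deviation of the response at one mixture."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        c = {name: rng.gauss(mu, sd) for name, (mu, sd) in COEFFS.items()}
        draws.append(regression_rate(x1, x2, x3, c))
    return statistics.mean(draws), statistics.stdev(draws)

def mixture_grid(n=10):
    """Yield every (x1, x2, x3) in steps of 1/n whose fractions sum to 1."""
    for i in range(n + 1):
        for j in range(n + 1 - i):
            yield i / n, j / n, (n - i - j) / n

# Pick the grid mixture with the highest mean simulated response.
best = max(mixture_grid(), key=lambda x: simulate(*x)[0])
mean, sd = simulate(*best)
print(f"best mixture {best}: mean response {mean:.3f}, std dev {sd:.3f}")
```

Reusing the same seed at every grid point applies common random numbers, so differences between mixtures reflect the response surface rather than sampling noise.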
2013-01-01
Background: Developing countries in South Asia, such as Bangladesh, bear a disproportionate burden of diarrhoeal diseases such as cholera, typhoid and paratyphoid. These seem to be aggravated by a number of social and environmental factors such as lack of access to safe drinking water, overcrowding and poor hygiene brought about by poverty. Some socioeconomic data can be obtained from census data whilst others are more difficult to elucidate. This study considers a range of both census data and spatial data from other sources, including remote sensing, as potential predictors of typhoid risk. Typhoid data are aggregated from hospital admission records for the period from 2005 to 2009. The spatial and statistical structures of the data are analysed and Principal Axis Factoring is used to reduce the degree of co-linearity in the data. The resulting factors are combined into a Quality of Life index, which in turn is used in a regression model of typhoid occurrence and risk. Results: The three Principal Factors used together explain 87% of the variance in the initial candidate predictors, which eminently qualifies them for use as a set of uncorrelated explanatory variables in a linear regression model. Initial regression results using Ordinary Least Squares (OLS) were disappointing; this was explainable by analysis of the spatial autocorrelation inherent in the Principal Factors. The use of Geographically Weighted Regression caused a considerable increase in the predictive power of regressions based on these factors. The best prediction, determined by analysis of the Akaike Information Criterion (AIC), was found when the three factors were combined into a Quality of Life index, using a method previously published by others, and had a coefficient of determination of 73%.
Conclusions: The typhoid occurrence/risk prediction equation was used to develop the first risk map showing areas of Dhaka Metropolitan Area whose inhabitants are at greater or lesser risk of typhoid infection. This, coupled with seasonal information on typhoid incidence also reported in this paper, has the potential to advise public health professionals on developing prevention strategies such as targeted vaccination. PMID:23497202
Corner, Robert J; Dewan, Ashraf M; Hashizume, Masahiro
2013-03-16
Developing countries in South Asia, such as Bangladesh, bear a disproportionate burden of diarrhoeal diseases such as cholera, typhoid and paratyphoid. These seem to be aggravated by a number of social and environmental factors such as lack of access to safe drinking water, overcrowding and poor hygiene brought about by poverty. Some socioeconomic data can be obtained from census data whilst others are more difficult to elucidate. This study considers a range of both census data and spatial data from other sources, including remote sensing, as potential predictors of typhoid risk. Typhoid data are aggregated from hospital admission records for the period from 2005 to 2009. The spatial and statistical structures of the data are analysed and principal axis factoring is used to reduce the degree of co-linearity in the data. The resulting factors are combined into a quality of life index, which in turn is used in a regression model of typhoid occurrence and risk. The three principal factors used together explain 87% of the variance in the initial candidate predictors, which eminently qualifies them for use as a set of uncorrelated explanatory variables in a linear regression model. Initial regression results using ordinary least squares (OLS) were disappointing; this was explainable by analysis of the spatial autocorrelation inherent in the principal factors. The use of geographically weighted regression caused a considerable increase in the predictive power of regressions based on these factors. The best prediction, determined by analysis of the Akaike information criterion (AIC), was found when the three factors were combined into a quality of life index, using a method previously published by others, and had a coefficient of determination of 73%. The typhoid occurrence/risk prediction equation was used to develop the first risk map showing areas of Dhaka metropolitan area whose inhabitants are at greater or lesser risk of typhoid infection.
This, coupled with seasonal information on typhoid incidence also reported in this paper, has the potential to advise public health professionals on developing prevention strategies such as targeted vaccination.
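To show why geographically weighted regression can outperform a single global OLS fit when relationships vary over space, the sketch below fits a separate weighted least-squares line at each target location, with weights that decay with distance. It is a single-predictor simplification under assumed settings (Gaussian kernel, fixed bandwidth); the coordinates and values are made up for illustration and are unrelated to the Dhaka data.

```python
import math

def gwr_local_fit(coords, x, y, target, bandwidth):
    """Local weighted least squares of y on x around one target location.

    Weights follow a Gaussian kernel of the distance from each observation
    to the target. Returns (intercept, slope) for that location.
    """
    w = [math.exp(-0.5 * (math.dist(c, target) / bandwidth) ** 2) for c in coords]
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical transect: the x-y relationship is steeper in the east than the west.
coords = [(float(i), 0.0) for i in range(10)]
x = [0.0, 1.0] * 5
y = [xi * (1.0 if i < 5 else 3.0) for i, xi in enumerate(x)]

for target in [(0.0, 0.0), (9.0, 0.0)]:
    a, b = gwr_local_fit(coords, x, y, target, bandwidth=1.5)
    print(f"location {target}: intercept {a:.3f}, slope {b:.3f}")
```

A global fit would report one compromise slope for the whole transect, whereas the local fits recover the different slopes at each end.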
Meyer, Stacy L; Hoffman, Robert P
2011-10-01
Type 2 diabetes mellitus is a growing problem in pediatrics and there is no consensus on the best treatment. We conducted this chart review on newly diagnosed pediatric patients with type 2 diabetes mellitus to compare the effect of treatment regimen on body mass index (BMI) and hemoglobin A1c over a 6-month period. We conducted a retrospective chart review on patients with type 2 DM who presented to Nationwide Children's Hospital. Data were collected on therapy type, BMI, and hemoglobin A1c over a 6-month follow-up. Therapy type was divided into metformin, insulin, or combination insulin and metformin. 1,997 charts were reviewed for inclusion based on ICD-9 codes consistent with a diagnosis of diabetes, abnormal oral glucose tolerance test, or insulin resistance. Of the 47 charts eligible for the review, 26 subjects were treated with metformin 1000-1500 mg daily, 14 patients were treated with insulin therapy, and 7 patients were treated with a combination of insulin and metformin therapy. At baseline, the only significant difference among groups was A1c (P = 0.012). In regression analysis with baseline A1c as a covariate, the only predictor of change in A1c over time was the A1c at onset (P < 0.001). Therapy type was not predictive of change (P = 0.905). Regression analysis showed a greater BMI at onset predicted a greater decrease in BMI (P = 0.006), but therapy type did not predict a change (P = 0.517). Metformin may be as effective as insulin or combination therapy for treatment of diabetes from onset to 6-month follow-up.
Klimek, Ludger; Schumacher, Helmut; Schütt, Tanja; Gräter, Heidemarie; Mueck, Tobias; Michel, Martin C
2017-02-01
The aim of this study was to explore factors affecting efficacy of treatment of common cold symptoms with an over-the-counter ibuprofen/pseudoephedrine combination product. Data from an anonymous survey among 1770 pharmacy customers purchasing the combination product for treatment of own common cold symptoms underwent post-hoc descriptive analysis. Scores of symptoms typically responsive to ibuprofen (headache, pharyngeal pain, joint pain and fever), typically responsive to pseudoephedrine (congested nose, congested sinus and runny nose), considered non-specific (sneezing, fatigue, dry cough, cough with expectoration) and comprising all 11 symptoms were analysed. Multiple regression analysis was applied to explore factors associated with greater reduction in symptom intensity or greater probability of experiencing a symptom reduction of at least 50%. After intake of first dose of medication, typically ibuprofen-sensitive, pseudoephedrine-responsive, non-specific and total symptoms were reduced by 60.0%, 46.3%, 45.4% and 52.8%, respectively. A symptom reduction of at least 50% was reported by 73.6%, 55.1%, 50.9% and 61.6% of participants, respectively. A high baseline score was associated with greater reductions in symptom scores but smaller probability of achieving an improvement of at least 50%. Across both multiple regression approaches, two tablets at first dosing were more effective than one and (except for ibuprofen-sensitive symptoms) starting treatment later than day 2 of the cold was generally less effective. Efficacy of an ibuprofen/pseudoephedrine combination in the treatment of common cold symptoms was dose-dependent and greatest when treatment started within the first 2 days after onset of symptoms. © 2016 The Authors. International Journal of Clinical Practice Published by John Wiley & Sons Ltd.
NASA Astrophysics Data System (ADS)
Shao, G.; Gallion, J.; Fei, S.
2016-12-01
Sound forest aboveground biomass estimation is required to monitor diverse forest ecosystems and their impacts on the changing climate. Lidar-based regression models have provided promising biomass estimates in most forest ecosystems. However, considerable uncertainties in biomass estimates have been reported in temperate hardwood and hardwood-dominated mixed forests. Varied site productivities in temperate hardwood forests largely diversified height and diameter growth rates, which significantly reduced the correlation between tree height and diameter at breast height (DBH) in mature and complex forests. It is, therefore, difficult to utilize height-based lidar metrics to predict DBH-based field-measured biomass through a simple regression model that ignores the variation in site productivity. In this study, we established a multi-dimensional nonlinear regression model incorporating lidar metrics and site productivity classes derived from soil features. In the regression model, lidar metrics provided horizontal and vertical structural information and productivity classes differentiated good and poor forest sites. The selection and combination of lidar metrics were discussed. Multiple regression models were employed and compared. Uncertainty analysis was applied to the best-fit model. The effects of site productivity on the lidar-based biomass model were addressed.
Subsonic Aircraft With Regression and Neural-Network Approximators Designed
NASA Technical Reports Server (NTRS)
Patnaik, Surya N.; Hopkins, Dale A.
2004-01-01
At the NASA Glenn Research Center, NASA Langley Research Center's Flight Optimization System (FLOPS) and the design optimization testbed COMETBOARDS with regression and neural-network-analysis approximators have been coupled to obtain a preliminary aircraft design methodology. For a subsonic aircraft, the optimal design, that is the airframe-engine combination, is obtained by the simulation. The aircraft is powered by two high-bypass-ratio engines with a nominal thrust of about 35,000 lbf. It is to carry 150 passengers at a cruise speed of Mach 0.8 over a range of 3000 n mi and to operate on a 6000-ft runway. The aircraft design utilized a neural network and a regression-approximations-based analysis tool, along with a multioptimizer cascade algorithm that uses sequential linear programming, sequential quadratic programming, the method of feasible directions, and then sequential quadratic programming again. Optimal aircraft weight versus the number of design iterations is shown. The central processing unit (CPU) time to solution is given. It is shown that the regression-method-based analyzer exhibited a smoother convergence pattern than the FLOPS code. The optimum weight obtained by the approximation technique and the FLOPS code differed by 1.3 percent. Prediction by the approximation technique exhibited no error for the aircraft wing area and turbine entry temperature, whereas it was within 2 percent for most other parameters. Cascade strategy was required by FLOPS as well as the approximators. The regression method had a tendency to hug the data points, whereas the neural network exhibited a propensity to follow a mean path. The performance of the neural network and regression methods was considered adequate. It was at about the same level for small, standard, and large models with redundancy ratios (defined as the number of input-output pairs to the number of unknown coefficients) of 14, 28, and 57, respectively. 
On an SGI Octane workstation (Silicon Graphics, Inc., Mountain View, CA), the regression training required a fraction of a CPU second, whereas neural network training took between 1 and 9 min, as given. For a single analysis cycle, the 3-sec CPU time required by the FLOPS code was reduced to milliseconds by the approximators. For design calculations, the time with the FLOPS code was 34 min. It was reduced to 2 sec with the regression method and to 4 min by the neural network technique. The performance of the regression and neural network methods was found to be satisfactory for the analysis and design optimization of the subsonic aircraft.
NASA Astrophysics Data System (ADS)
Shao, Yongni; Jiang, Linjun; Zhou, Hong; Pan, Jian; He, Yong
2016-04-01
In our study, the feasibility of using visible/near infrared hyperspectral imaging technology to detect the changes of the internal components of Chlorella pyrenoidosa so as to determine the varieties of pesticides (such as butachlor, atrazine and glyphosate) at three concentrations (0.6 mg/L, 3 mg/L, 15 mg/L) was investigated. Three models (partial least squares discriminant analysis combined with full wavelengths, FW-PLSDA; partial least squares discriminant analysis combined with competitive adaptive reweighted sampling algorithm, CARS-PLSDA; linear discriminant analysis combined with regression coefficients, RC-LDA) were built from the hyperspectral data of Chlorella pyrenoidosa to find which model produces the best result. The RC-LDA model, which achieved an average correct classification rate of 97.0%, was superior to FW-PLSDA (72.2%) and CARS-PLSDA (84.0%), showing that visible/near infrared hyperspectral imaging could be a rapid and reliable technique to identify pesticide varieties. It also showed that microalgae can be a very promising medium to indicate characteristics of pesticides.
Binary logistic regression-Instrument for assessing museum indoor air impact on exhibits.
Bucur, Elena; Danet, Andrei Florin; Lehr, Carol Blaziu; Lehr, Elena; Nita-Lazar, Mihai
2017-04-01
This paper presents a new way to assess the environmental impact on historical artifacts using binary logistic regression. The impact on the exhibits during certain pollution scenarios (environmental impact) was predicted by a mathematical model based on binary logistic regression; it allows identification, from a multitude of possible parameters, of those environmental parameters with a significant impact on exhibits and ranks them according to the severity of their effect. Air quality (NO2, SO2, O3 and PM2.5) and microclimate parameters (temperature, humidity) monitoring data from a case study conducted within exhibition and storage spaces of the Romanian National Aviation Museum Bucharest have been used for developing and validating the binary logistic regression method and the mathematical model. The logistic regression analysis was applied to 794 data combinations (715 to develop the model and 79 to validate it) using the Statistical Package for the Social Sciences (SPSS 20.0). The results from the binary logistic regression analysis demonstrated that, of the six parameters taken into consideration, four present a significant effect upon exhibits in the following order: O3 > PM2.5 > NO2 > humidity, followed at a significant distance by the effects of SO2 and temperature. The mathematical model developed in this study correctly predicted 95.1% of the cumulated effect of the environmental parameters upon the exhibits. Moreover, this model could also be used in the decisional process regarding the preventive preservation measures that should be implemented within the exhibition space.
The mathematical model developed on the environmental parameters analyzed by the binary logistic regression method could be useful in a decision-making process establishing the best measures for pollution reduction and preventive preservation of exhibits.
2013-01-01
Methods for analysis of network dynamics have seen great progress in the past decade. This article shows how Dynamic Network Logistic Regression techniques (a special case of the Temporal Exponential Random Graph Models) can be used to implement decision theoretic models for network dynamics in a panel data context. We also provide practical heuristics for model building and assessment. We illustrate the power of these techniques by applying them to a dynamic blog network sampled during the 2004 US presidential election cycle. This is a particularly interesting case because it marks the debut of Internet-based media such as blogs and social networking web sites as institutionally recognized features of the American political landscape. Using a longitudinal sample of all Democratic National Convention/Republican National Convention–designated blog citation networks, we are able to test the influence of various strategic, institutional, and balance-theoretic mechanisms as well as exogenous factors such as seasonality and political events on the propensity of blogs to cite one another over time. Using a combination of deviance-based model selection criteria and simulation-based model adequacy tests, we identify the combination of processes that best characterizes the choice behavior of the contending blogs. PMID:24143060
Association analysis of multiple traits by an approach of combining P values.
Chen, Lili; Wang, Yong; Zhou, Yajing
2018-03-01
Increasing evidence shows that one variant can affect multiple traits, which is a widespread phenomenon in complex diseases. Joint analysis of multiple traits can increase the statistical power of association analysis and uncover the underlying genetic mechanism. Although there are many statistical methods to analyse multiple traits, most of these methods are suitable for detecting common variants associated with multiple traits. Because of the low minor allele frequency of rare variants, these methods are not optimal for rare variant association analysis. In this paper, we extend an adaptive combination of P values method (termed ADA) for a single trait to test association between multiple traits and rare variants in a given region. For a given region, we use a reverse regression model to test each rare variant for association with multiple traits and obtain the P value of the single-variant test. Further, we take the weighted combination of these P values as the test statistic. Extensive simulation studies show that our approach is more powerful than several other comparison methods in most cases and is robust to the inclusion of a high proportion of neutral variants and the different directions of effects of causal variants.
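For readers unfamiliar with combining P values, the sketch below shows the classical unweighted Fisher combination for independent tests. It is not the ADA method described above, which uses an adaptive weighted combination and whose details are not given in the abstract; it only illustrates the basic step of turning several single-variant P values into one region-level P value. The function name and the sample P values are hypothetical.

```python
import math

def fisher_combine(p_values):
    """Combine independent P values (each in (0, 1]) with Fisher's method.

    The statistic -2 * sum(log p) follows a chi-square distribution with 2k
    degrees of freedom under the null. For even degrees of freedom the upper
    tail has the closed form exp(-h) * sum_{j<k} h**j / j!, with h = stat / 2.
    """
    k = len(p_values)
    h = -sum(math.log(p) for p in p_values)
    term, total = 1.0, 1.0
    for j in range(1, k):
        term *= h / j
        total += term
    return math.exp(-h) * total

# Hypothetical single-variant P values for one region.
single_variant_p = [0.04, 0.20, 0.65, 0.01]
print(f"combined P = {fisher_combine(single_variant_p):.4f}")
```

Fisher's method assumes the individual tests are independent; variants in linkage disequilibrium violate this, which is one reason permutation-based procedures are often used in practice.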
Hordge, LaQuana N; McDaniel, Kiara L; Jones, Derick D; Fakayode, Sayo O
2016-05-15
The endocrine disruption property of estrogens creates an immediate need for effective monitoring and for the development of analytical protocols for their analysis in biological and human specimens. This study explores the first combined use of steady-state fluorescence spectroscopy and multivariate partial-least-squares (PLS) regression analysis for the simultaneous determination of the concentrations of two estrogens (17α-ethinylestradiol (EE) and norgestimate (NOR)) in bovine serum albumin (BSA) and human serum albumin (HSA) samples. The influence of EE and NOR concentrations and temperature on the emission spectra of EE-HSA, EE-BSA, NOR-HSA, and NOR-BSA complexes was also investigated. The binding of EE with HSA and BSA resulted in an increase in the emission of HSA and BSA and a significant blue spectral shift. In contrast, the interaction of NOR with HSA and BSA quenched the emission of HSA and BSA. The observed emission spectral shifts preclude the effective use of traditional univariate regression analysis of fluorescence data for the determination of EE and NOR concentrations in HSA and BSA samples. Multivariate PLS regression analysis was utilized to correlate the changes in emission spectra with EE and NOR concentrations in HSA and BSA samples. The figures-of-merit of the developed PLS regression models were excellent, with limits of detection as low as 1.6×10^-8 M for EE and 2.4×10^-7 M for NOR and good linearity (R^2 > 0.994985). The PLS models correctly predicted EE and NOR concentrations in independent validation HSA and BSA samples with a root-mean-square-percent-relative-error (RMS%RE) of less than 6.0% at physiological conditions. In contrast, the use of univariate regression resulted in poor predictions of EE and NOR in HSA and BSA samples, with RMS%RE larger than 40% at physiological conditions.
High accuracy, low sensitivity, simplicity, and low cost, with no prior analyte extraction or separation required, make this method a promising, compelling, and attractive alternative for the rapid determination of estrogen concentrations in biomedical and biological specimens, pharmaceuticals, or environmental samples. Published by Elsevier B.V.
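The abstract reports RMS%RE without defining it. A minimal sketch follows, assuming the usual definition: the square root of the mean of the squared percent relative errors between predicted and actual concentrations. The function name and sample concentrations are hypothetical.

```python
import math

def rms_percent_relative_error(actual, predicted):
    """Root-mean-square percent relative error over paired concentrations.

    Assumed definition: sqrt(mean((100 * (pred - actual) / actual) ** 2)).
    Actual values must be non-zero.
    """
    rel = [100.0 * (p - a) / a for a, p in zip(actual, predicted)]
    return math.sqrt(sum(r * r for r in rel) / len(rel))

# Hypothetical validation set: known and model-predicted concentrations (M).
known = [2.0e-7, 4.0e-7, 6.0e-7, 8.0e-7]
pred = [2.1e-7, 3.9e-7, 6.2e-7, 7.7e-7]
print(f"RMS%RE = {rms_percent_relative_error(known, pred):.2f}%")
```

Because each error is scaled by the actual value, the metric weights low and high concentrations equally, which suits validation sets that span orders of magnitude.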
Spatial Bayesian latent factor regression modeling of coordinate-based meta-analysis data.
Montagna, Silvia; Wager, Tor; Barrett, Lisa Feldman; Johnson, Timothy D; Nichols, Thomas E
2018-03-01
Now over 20 years old, functional MRI (fMRI) has a large and growing literature that is best synthesised with meta-analytic tools. As most authors do not share image data, only the peak activation coordinates (foci) reported in the article are available for Coordinate-Based Meta-Analysis (CBMA). Neuroimaging meta-analysis is used to (i) identify areas of consistent activation; and (ii) build a predictive model of task type or cognitive process for new studies (reverse inference). To simultaneously address these aims, we propose a Bayesian point process hierarchical model for CBMA. We model the foci from each study as a doubly stochastic Poisson process, where the study-specific log intensity function is characterized as a linear combination of a high-dimensional basis set. A sparse representation of the intensities is guaranteed through latent factor modeling of the basis coefficients. Within our framework, it is also possible to account for the effect of study-level covariates (meta-regression), significantly expanding the capabilities of the current neuroimaging meta-analysis methods available. We apply our methodology to synthetic data and neuroimaging meta-analysis datasets. © 2017, The International Biometric Society.
A Semiparametric Change-Point Regression Model for Longitudinal Observations.
Xing, Haipeng; Ying, Zhiliang
2012-12-01
Many longitudinal studies involve relating an outcome process to a set of possibly time-varying covariates, giving rise to the usual regression models for longitudinal data. When the purpose of the study is to investigate the covariate effects when experimental environment undergoes abrupt changes or to locate the periods with different levels of covariate effects, a simple and easy-to-interpret approach is to introduce change-points in regression coefficients. In this connection, we propose a semiparametric change-point regression model, in which the error process (stochastic component) is nonparametric and the baseline mean function (functional part) is completely unspecified, the observation times are allowed to be subject-specific, and the number, locations and magnitudes of change-points are unknown and need to be estimated. We further develop an estimation procedure which combines the recent advance in semiparametric analysis based on counting process argument and multiple change-points inference, and discuss its large sample properties, including consistency and asymptotic normality, under suitable regularity conditions. Simulation results show that the proposed methods work well under a variety of scenarios. An application to a real data set is also given.
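The sketch below conveys only the basic idea of a change-point in a regression coefficient, under strong simplifying assumptions: a single change-point, one covariate, no baseline mean function, and ordinary least squares in each segment. It is not the semiparametric estimator proposed above; the function names and data are hypothetical.

```python
def fit_through_origin(x, y):
    """Least-squares slope and residual sum of squares for y = beta * x."""
    sxx = sum(xi * xi for xi in x)  # must be non-zero
    beta = sum(xi * yi for xi, yi in zip(x, y)) / sxx
    sse = sum((yi - beta * xi) ** 2 for xi, yi in zip(x, y))
    return beta, sse

def find_change_point(t, x, y, min_seg=2):
    """Grid-search one change-point over observations sorted by time t.

    Returns (change_time, beta_before, beta_after), where change_time is the
    first observation time belonging to the second regime.
    """
    best = None
    for k in range(min_seg, len(t) - min_seg + 1):
        b1, s1 = fit_through_origin(x[:k], y[:k])
        b2, s2 = fit_through_origin(x[k:], y[k:])
        if best is None or s1 + s2 < best[0]:
            best = (s1 + s2, t[k], b1, b2)
    return best[1:]

# Hypothetical series: the covariate effect jumps from 1 to 3 at time 6.
t = list(range(10))
x = [1.0, 2.0] * 5
y = [xi * (1.0 if ti < 6 else 3.0) for ti, xi in zip(t, x)]
print(find_change_point(t, x, y))
```

Extending the search to several change-points or to subject-specific observation times requires a more careful procedure, which is what the paper addresses.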
Zhang, Qun; Zhang, Qunzhi; Sornette, Didier
2016-01-01
We augment the existing literature using the Log-Periodic Power Law Singular (LPPLS) structures in the log-price dynamics to diagnose financial bubbles by providing three main innovations. First, we introduce the quantile regression to the LPPLS detection problem. This allows us to disentangle (at least partially) the genuine LPPLS signal and the a priori unknown complicated residuals. Second, we propose to combine the many quantile regressions with a multi-scale analysis, which aggregates and consolidates the obtained ensembles of scenarios. Third, we define and implement the so-called DS LPPLS Confidence™ and Trust™ indicators that enrich considerably the diagnostic of bubbles. Using a detailed study of the “S&P 500 1987” bubble and presenting analyses of 16 historical bubbles, we show that the quantile regression of LPPLS signals contributes useful early warning signals. The comparison between the constructed signals and the price development in these 16 historical bubbles demonstrates their significant predictive ability around the real critical time when the burst/rally occurs. PMID:27806093
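Quantile regression replaces the squared-error loss with the asymmetric "pinball" loss. The sketch below fits only a constant, not the LPPLS model discussed above, to show that minimizing this loss yields an empirical quantile and is less affected by an outlying residual than a mean would be. The function names and data are hypothetical.

```python
def pinball_loss(residuals, q):
    """Mean quantile (pinball) loss of residuals at quantile level q in (0, 1)."""
    return sum(q * r if r >= 0 else (q - 1) * r for r in residuals) / len(residuals)

def best_constant(y, q):
    """Observed value minimizing the pinball loss: an empirical q-quantile of y."""
    return min(sorted(y), key=lambda c: pinball_loss([yi - c for yi in y], q))

# Hypothetical observations with one extreme value.
data = [1.0, 2.0, 3.0, 4.0, 100.0]
print("q = 0.5 fit:", best_constant(data, 0.5))
print("mean:", sum(data) / len(data))
```

In a full quantile regression the constant is replaced by the model's prediction and the same loss is minimized over the model parameters for each chosen quantile level.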
Chakraborty, Somsubhra; Weindorf, David C; Li, Bin; Ali Aldabaa, Abdalsamad Abdalsatar; Ghosh, Rakesh Kumar; Paul, Sathi; Nasim Ali, Md
2015-05-01
Using 108 petroleum-contaminated soil samples, this pilot study proposed a new analytical approach combining visible near-infrared diffuse reflectance spectroscopy (VisNIR DRS) and portable X-ray fluorescence spectrometry (PXRF) for rapid and improved quantification of soil petroleum contamination. Results indicated that an advanced fused model, in which VisNIR DRS spectra-based penalized spline regression (PSR) was used to predict total petroleum hydrocarbon and PXRF elemental data-based random forest regression was then used to model the PSR residuals, outperformed (R^2 = 0.78, residual prediction deviation (RPD) = 2.19) all other models tested, even producing better generalization than using VisNIR DRS alone (RPDs of 1.64, 1.86, and 1.96 for random forest, penalized spline regression, and partial least squares regression, respectively). Additionally, unsupervised principal component analysis using the PXRF+VisNIR DRS system qualitatively separated contaminated soils from control samples. Fusion of PXRF elemental data and VisNIR derivative spectra produced an optimized model for total petroleum hydrocarbon quantification in soils. Copyright © 2015 Elsevier B.V. All rights reserved.
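A short sketch of the two quantities this abstract relies on, under stated assumptions: the fused prediction is taken to be the first-stage prediction plus the second-stage prediction of its residuals, and RPD is taken to be the standard deviation of the observed values divided by the RMSE of prediction, a common convention that the abstract does not spell out. Function names and numbers are hypothetical.

```python
import math
import statistics

def fused_prediction(stage1_pred, residual_pred):
    """Add predicted residuals (stage 2) to the base predictions (stage 1)."""
    return [p + r for p, r in zip(stage1_pred, residual_pred)]

def rmse(observed, predicted):
    """Root-mean-square error of prediction."""
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed))

def rpd(observed, predicted):
    """Residual prediction deviation: sample SD of observed values / RMSE."""
    return statistics.stdev(observed) / rmse(observed, predicted)

# Hypothetical validation values (same arbitrary units throughout).
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
stage1 = [1.8, 1.4, 3.9, 3.2, 5.6]       # e.g. spectra-based model
stage2 = [-0.5, 0.4, -0.6, 0.5, -0.3]    # e.g. residuals predicted from elemental data
fused = fused_prediction(stage1, stage2)
print(f"RPD stage 1 = {rpd(observed, stage1):.2f}, RPD fused = {rpd(observed, fused):.2f}")
```

A higher RPD means the prediction error is small relative to the spread of the reference values, which is how the fused and single-sensor models above are compared.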
Held, Elizabeth; Cape, Joshua; Tintle, Nathan
2016-01-01
Machine learning methods continue to show promise in the analysis of data from genetic association studies because of the high number of variables relative to the number of observations. However, few best practices exist for the application of these methods. We extend a recently proposed supervised machine learning approach for predicting disease risk by genotypes to be able to incorporate gene expression data and rare variants. We then apply 2 different versions of the approach (radial and linear support vector machines) to simulated data from Genetic Analysis Workshop 19 and compare performance to logistic regression. Method performance was not radically different across the 3 methods, although the linear support vector machine tended to show small gains in predictive ability relative to a radial support vector machine and logistic regression. Importantly, as the number of genes in the models was increased, even when those genes contained causal rare variants, model predictive ability showed a statistically significant decrease in performance for both the radial support vector machine and logistic regression. The linear support vector machine showed more robust performance to the inclusion of additional genes. Further work is needed to evaluate machine learning approaches on larger samples and to evaluate the relative improvement in model prediction from the incorporation of gene expression data.
NASA Astrophysics Data System (ADS)
Liu, Bilan; Qiu, Xing; Zhu, Tong; Tian, Wei; Hu, Rui; Ekholm, Sven; Schifitto, Giovanni; Zhong, Jianhui
2016-03-01
Subject-specific longitudinal DTI studies are vital for investigation of pathological changes of lesions and disease evolution. Spatial Regression Analysis of Diffusion tensor imaging (SPREAD) is a non-parametric permutation-based statistical framework that combines spatial regression and resampling techniques to achieve effective detection of localized longitudinal diffusion changes within the whole brain at individual level without a priori hypotheses. However, boundary blurring and dislocation limit its sensitivity, especially towards detecting lesions of irregular shapes. In the present study, we propose an improved SPREAD (dubbed improved SPREAD, or iSPREAD) method by incorporating a three-dimensional (3D) nonlinear anisotropic diffusion filtering method, which provides edge-preserving image smoothing through a nonlinear scale space approach. The statistical inference based on iSPREAD was evaluated and compared with the original SPREAD method using both simulated and in vivo human brain data. Results demonstrated that the sensitivity and accuracy of the SPREAD method have been improved substantially by adapting nonlinear anisotropic filtering. iSPREAD identifies subject-specific longitudinal changes in the brain with improved sensitivity, accuracy, and enhanced statistical power, especially when the spatial correlation is heterogeneous among neighboring image pixels in DTI.
Ren, Y Y; Zhou, L C; Yang, L; Liu, P Y; Zhao, B W; Liu, H X
2016-09-01
The paper highlights the use of the logistic regression (LR) method in the construction of acceptable, statistically significant, robust and predictive models for the classification of chemicals according to their aquatic toxic modes of action. Essentials accounting for a reliable model were all considered carefully. The model predictors were selected by stepwise forward linear discriminant analysis (LDA) from a combined pool of experimental data and chemical structure-based descriptors calculated by the CODESSA and DRAGON software packages. Model predictive ability was validated both internally and externally. The applicability domain was checked by the leverage approach to verify prediction reliability. The obtained models are simple and easy to interpret. In general, LR performs much better than LDA and seems to be more attractive for the prediction of the more toxic compounds, i.e. compounds that exhibit excess toxicity versus non-polar narcotic compounds and more reactive compounds versus less reactive compounds. In addition, model fit and regression diagnostics were assessed with an influence plot, which reflects the hat-values, studentized residuals, and Cook's distance statistics of each sample. Overdispersion was also checked for the LR model. The relationships between the descriptors and the aquatic toxic behaviour of compounds are also discussed.
A meta-analysis investigating factors underlying attrition rates in infant ERP studies.
Stets, Manuela; Stahl, Daniel; Reid, Vincent M
2012-01-01
In this meta-analysis, we examined interrelationships between characteristics of infant event-related potential (ERP) studies and their attrition rates. One hundred and forty-nine published studies provided information on 314 experimental groups, of which 181 provided data on attrition. A random effects meta-analysis revealed a high average attrition rate of 49.2%. Additionally, we used meta-regression for 178 groups with attrition data to analyze which variables best explained attrition variance. Our main findings were that the nature of the stimuli (visual, auditory, or combined), as well as whether stimuli were animated, influenced exclusion rates from the final analysis, and that infant age did not alter attrition rates.
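A random effects meta-analysis of the kind mentioned in this abstract pools per-group estimates while allowing for between-group variance. The sketch below uses the DerSimonian-Laird moment estimator as an assumption; the abstract does not name the estimator or the transformation applied to the attrition proportions, and `random_effects_pool` is a hypothetical helper name.

```python
import math


def random_effects_pool(effects, variances):
    """Pool effect estimates with the DerSimonian-Laird estimator (assumed here).

    effects   -- per-group effect estimates (e.g. transformed attrition rates)
    variances -- their within-group sampling variances
    Returns (pooled estimate, standard error, between-group variance tau^2).
    """
    k = len(effects)
    fixed_weights = [1.0 / v for v in variances]
    weight_sum = sum(fixed_weights)
    fixed_mean = sum(w * y for w, y in zip(fixed_weights, effects)) / weight_sum

    # Cochran's Q measures heterogeneity around the fixed-effect mean.
    q = sum(w * (y - fixed_mean) ** 2 for w, y in zip(fixed_weights, effects))
    c = weight_sum - sum(w * w for w in fixed_weights) / weight_sum
    tau_squared = max(0.0, (q - (k - 1)) / c)

    # Random-effects weights add tau^2 to every within-group variance.
    random_weights = [1.0 / (v + tau_squared) for v in variances]
    pooled = sum(w * y for w, y in zip(random_weights, effects)) / sum(random_weights)
    standard_error = math.sqrt(1.0 / sum(random_weights))
    return pooled, standard_error, tau_squared


# Two made-up groups with very different estimates and equal variances.
print(random_effects_pool([0.0, 4.0], [1.0, 1.0]))
```

When the groups disagree strongly, tau^2 grows, the weights become more equal, and the standard error of the pooled estimate widens accordingly.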
Wan, Jian; Chen, Yi-Chieh; Morris, A Julian; Thennadil, Suresh N
2017-07-01
Near-infrared (NIR) spectroscopy is being widely used in various fields ranging from pharmaceutics to the food industry for analyzing chemical and physical properties of the substances concerned. Its advantages over other analytical techniques include available physical interpretation of spectral data, nondestructive nature and high speed of measurements, and little or no need for sample preparation. The successful application of NIR spectroscopy relies on three main aspects: pre-processing of spectral data to eliminate nonlinear variations due to temperature, light scattering effects and many others, selection of those wavelengths that contribute useful information, and identification of suitable calibration models using linear/nonlinear regression. Several methods have been developed for each of these three aspects and many comparative studies of different methods exist for an individual aspect or some combinations. However, there is still a lack of comparative studies for the interactions among these three aspects, which can shed light on what role each aspect plays in the calibration and how to combine various methods of each aspect together to obtain the best calibration model. This paper aims to provide such a comparative study based on four benchmark data sets using three typical pre-processing methods, namely, orthogonal signal correction (OSC), extended multiplicative signal correction (EMSC) and optical path-length estimation and correction (OPLEC); two existing wavelength selection methods, namely, stepwise forward selection (SFS) and genetic algorithm optimization combined with partial least squares regression for spectral data (GAPLSSP); and four popular regression methods, namely, partial least squares (PLS), least absolute shrinkage and selection operator (LASSO), least squares support vector machine (LS-SVM), and Gaussian process regression (GPR).
The comparative study indicates that, in general, pre-processing of spectral data can play a significant role in the calibration while wavelength selection plays a marginal role and the combination of certain pre-processing, wavelength selection, and nonlinear regression methods can achieve superior performance over traditional linear regression-based calibration.
NASA Astrophysics Data System (ADS)
Hasan, Haliza; Ahmad, Sanizah; Osman, Balkish Mohd; Sapri, Shamsiah; Othman, Nadirah
2017-08-01
In regression analysis, missing covariate data has been a common problem. Many researchers use ad hoc methods to overcome this problem due to the ease of implementation. However, these methods require assumptions about the data that rarely hold in practice. Model-based methods such as Maximum Likelihood (ML) using the expectation maximization (EM) algorithm and Multiple Imputation (MI) are more promising when dealing with difficulties caused by missing data. Then again, inappropriate methods of missing value imputation can lead to serious bias that severely affects the parameter estimates. The main objective of this study is to provide a better understanding of missing data concepts that can assist the researcher in selecting appropriate missing data imputation methods. A simulation study was performed to assess the effects of different missing data techniques on the performance of a regression model. The covariate data were generated using an underlying multivariate normal distribution and the dependent variable was generated as a combination of explanatory variables. Missing values in the covariates were simulated using a missing at random (MAR) mechanism. Four levels of missingness (10%, 20%, 30% and 40%) were imposed. ML and MI techniques available within SAS software were investigated. A linear regression analysis was fitted and the model performance measures, MSE and R-Squared, were obtained. Results of the analysis showed that MI is superior in handling missing data, with the highest R-Squared and lowest MSE when the percentage of missingness is less than 30%. Both methods are unable to handle levels of missingness larger than 30%.
Predictive models of energy consumption in multi-family housing in College Station, Texas
NASA Astrophysics Data System (ADS)
Ali, Hikmat Hummad
Patterns of energy consumption in apartment buildings are different than those in single-family houses. Apartment buildings have different physical characteristics, and their inhabitants have different demographic attributes. This study develops models that predict energy usage in apartment buildings in College Station. This is accomplished by analyzing and identifying the predictive variables that affect energy usage, studying the consumption patterns, and creating formulas based on combinations of these variables. According to the hypotheses and the specific research context, a cross-sectional design strategy is adopted. This choice implies analyses across variations within a sample of fourplex apartments in College Station. The data available for analysis include the monthly billing data along with the physical characteristics of the building, climate data for College Station, and occupant demographic characteristics. A simple random sampling procedure is adopted. The sample size of 176 apartments is drawn from the population in such a way that every possible sample has the same chance of being selected. Statistical methods used to interpret the data include univariate analysis (mean, standard deviation, range, and distribution of data), correlation analysis, regression analysis, and ANOVA (analyses of variance). The results show there are significant differences in cooling efficiency and actual energy consumption among different building types, but there are no significant differences in heating consumption. There are no significant differences in actual energy consumption between student and non-student groups or among ethnic groups. The findings indicate that there are significant differences in actual energy consumption among marital status groups and educational level groups. 
The multiple regression procedures show there is a significant relationship between normalized annual consumption and the combined variables of floor area, marital status, dead band, construction material, summer thermostat setting, heating, slope, and base load, as well as a relationship between cooling slope and the combined variables of share wall, floor level, summer thermostat setting, external wall, and American household. In addition, there is a significant relationship between heating slope and the combined variables of winter thermostat setting, market value, student, and rent. The results also indicate there is a relationship between base load and the combined variables of floor area, market value, age of the building, marital status, student, and summer thermostat setting.
Least-squares sequential parameter and state estimation for large space structures
NASA Technical Reports Server (NTRS)
Thau, F. E.; Eliazov, T.; Montgomery, R. C.
1982-01-01
This paper presents the formulation of simultaneous state and parameter estimation problems for flexible structures in terms of least-squares minimization problems. The approach combines an on-line order determination algorithm with least-squares algorithms for finding estimates of modal approximation functions, modal amplitudes, and modal parameters. The approach combines previous results on separable nonlinear least squares estimation with a regression analysis formulation of the state estimation problem. The technique makes use of sequential Householder transformations. This allows for sequential accumulation of matrices required during the identification process. The technique is used to identify the modal parameters of a flexible beam.
Heiss, Christian; Govindarajan, Parameswari; Schlewitz, Gudrun; Hemdan, Nasr Y A; Schliefke, Nathalie; Alt, Volker; Thormann, Ulrich; Lips, Katrin Susanne; Wenisch, Sabine; Langheinrich, Alexander C; Zahner, Daniel; Schnettler, Reinhard
2012-06-01
As women are the population most affected by multifactorial osteoporosis, research is focused on unraveling the underlying mechanism of osteoporosis induction in rats by combining ovariectomy (OVX) either with calcium, phosphorus, vitamin C and vitamin D2/D3 deficiency, or by administration of glucocorticoid (dexamethasone). Different skeletal sites of sham, OVX-Diet and OVX-Steroid rats were analyzed by Dual Energy X-ray Absorptiometry (DEXA) at varied time points of 0, 4 and 12 weeks to determine and compare the osteoporotic factors such as bone mineral density (BMD), bone mineral content (BMC), area, body weight and percent fat among different groups and time points. Comparative analysis and interrelationships among osteoporotic determinants by regression analysis were also determined. T scores were below -2.5 in OVX-Diet rats at 4 and 12 weeks post-OVX. OVX-Diet rats revealed pronounced osteoporotic status with lower BMD and BMC than the steroid counterparts, with the spine and pelvis as the most affected skeletal sites. Increase in percent fat was observed irrespective of the osteoporosis inducers applied. Comparative analysis and interrelationships between osteoporotic determinants that are rarely studied in animals indicate the necessity to analyze BMC and area along with BMD in obtaining meaningful information leading to proper prediction of probability of osteoporotic fractures. Enhanced osteoporotic effect observed in OVX-Diet rats indicates that estrogen dysregulation combined with diet treatment induces and enhances osteoporosis with time when compared to the steroid group. Comparative and regression analysis indicates the need to determine BMC along with BMD and area in osteoporotic determination.
Quantitative Analysis of Land Loss in Coastal Louisiana Using Remote Sensing
NASA Astrophysics Data System (ADS)
Wales, P. M.; Kuszmaul, J.; Roberts, C.
2005-12-01
For the past thirty-five years the land loss along the Louisiana Coast has been recognized as a growing problem. One of the clearest indicators of this land loss is that in 2000 smooth cord grass (Spartina alterniflora) was turning brown well before its normal hibernation period. Over 100,000 acres of marsh were affected by the 2000 browning. In 2001 data were collected using low altitude helicopter based transects of the coast, with 7,400 data points being collected by researchers at the USGS, National Wetlands Research Center, and Louisiana Department of Natural Resources. The surveys contained data describing the characteristics of the marsh, including latitude, longitude, marsh condition, marsh color, percent vegetated, and marsh die-back. Creating a model that combines remote sensing images, field data, and statistical analysis to develop a methodology for estimating the margin of error in measurements of coastal land loss (erosion) is the ultimate goal of the study. A model was successfully created using a series of band combinations (used as predictive variables). The most successful band combinations or predictive variables were the braud value [(Sum Visible TM Bands - Sum Infrared TM Bands)/(Sum Visible TM Bands + Sum Infrared TM Bands)], TM band 7/ TM band 2, brightness, NDVI, wetness, vegetation index, and a 7x7 autocovariate nearest neighbor floating window. The model values were used to generate the logistic regression model. A new image was created based on the logistic regression probability equation where each pixel represents the probability of finding water or non-water at that location in each image. Pixels within each image that have a high probability of representing water have a value close to 1 and pixels with a low probability of representing water have a value close to 0. A logistic regression model is proposed that uses seven independent variables. 
This model yields an accurate classification in 86.5% of the locations considered in the 1997 and 2001 survey locations. When the logistic regression model was applied to the satellite imagery of the entire Louisiana Coast study area, a statewide loss of 358 mi² to 368 mi² was estimated from 1997 to 2001, using two different methods for estimating land loss.
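The per-pixel probability image described in this abstract follows from the usual logistic regression equation. The sketch below shows the idea under loud assumptions: the coefficients, intercept, and reflectance values are invented for illustration, only two of the seven predictors are mimicked, and the function names are hypothetical. The NDVI formula itself is the standard (NIR - Red)/(NIR + Red).

```python
import math


def ndvi(nir, red):
    """Normalized difference vegetation index for one pixel."""
    total = nir + red
    return 0.0 if total == 0 else (nir - red) / total


def water_probability(predictors, coefficients, intercept):
    """Logistic regression probability that a pixel is water.

    Values near 1 indicate likely water; values near 0 indicate non-water.
    """
    linear_score = intercept + sum(b * x for b, x in zip(coefficients, predictors))
    return 1.0 / (1.0 + math.exp(-linear_score))


# Hypothetical pixel: reflectances and coefficients are NOT from the study.
pixel_predictors = [ndvi(nir=0.05, red=0.10), 0.8]   # [NDVI, band ratio]
illustrative_coefficients = [-4.0, 1.5]
print(water_probability(pixel_predictors, illustrative_coefficients, intercept=-0.5))
```

Applying such a function to every pixel and thresholding the result (for example at 0.5, itself a choice the abstract does not specify) gives a water/non-water map that can be differenced between image dates.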
A rational model of function learning.
Lucas, Christopher G; Griffiths, Thomas L; Williams, Joseph J; Kalish, Michael L
2015-10-01
Theories of how people learn relationships between continuous variables have tended to focus on two possibilities: one, that people are estimating explicit functions, or two that they are performing associative learning supported by similarity. We provide a rational analysis of function learning, drawing on work on regression in machine learning and statistics. Using the equivalence of Bayesian linear regression and Gaussian processes, which provide a probabilistic basis for similarity-based function learning, we show that learning explicit rules and using similarity can be seen as two views of one solution to this problem. We use this insight to define a rational model of human function learning that combines the strengths of both approaches and accounts for a wide variety of experimental results.
Optical scatterometry of quarter-micron patterns using neural regression
NASA Astrophysics Data System (ADS)
Bischoff, Joerg; Bauer, Joachim J.; Haak, Ulrich; Hutschenreuther, Lutz; Truckenbrodt, Horst
1998-06-01
With shrinking dimensions and increasing chip areas, a rapid and non-destructive full wafer characterization after every patterning cycle is an inevitable necessity. In former publications it was shown that Optical Scatterometry (OS) has the potential to push the attainable feature limits of optical techniques from 0.8 . . . 0.5 microns for imaging methods down to 0.1 micron and below. Thus the demands of future metrology can be met. Basically being a nonimaging method, OS combines light scatter (or diffraction) measurements with modern data analysis schemes to solve the inverse scatter issue. For very fine patterns with lambda-to-pitch ratios greater than one, the specular reflected light versus the incidence angle is recorded. Usually, the data analysis comprises two steps -- a training cycle connected to a rigorous forward modeling and the prediction itself. Until now, two data analysis schemes are usually applied -- the multivariate regression based Partial Least Squares method (PLS) and a look-up-table technique which is also referred to as Minimum Mean Square Error approach (MMSE). Both methods are afflicted with serious drawbacks. On the one hand, the prediction accuracy of multivariate regression schemes degrades with larger parameter ranges due to the linearization properties of the method. On the other hand, look-up-table methods are rather time consuming during prediction thus prolonging the processing time and reducing the throughput. An alternate method is an Artificial Neural Network (ANN) based regression which combines the advantages of multivariate regression and MMSE. Due to the versatility of a neural network, not only can its structure be adapted more properly to the scatter problem, but also the nonlinearity of the neuronal transfer functions mimics the nonlinear behavior of optical diffraction processes more adequately. In spite of these pleasant properties, the prediction speed of ANN regression is comparable with that of the PLS-method. 
In this paper, the viability and performance of ANN-regression will be demonstrated with the example of sub-quarter-micron resist metrology. To this end, 0.25 micrometer line/space patterns have been printed in positive photoresist by means of DUV projection lithography. In order to evaluate the total metrology chain from light scatter measurement through data analysis, a thorough modeling has been performed. Assuming a trapezoidal shape of the developed resist profile, a training data set was generated by means of the Rigorous Coupled Wave Approach (RCWA). After training the model, a second data set was computed and deteriorated by Gaussian noise to imitate real measuring conditions. Then, these data have been fed into the models established before resulting in a Standard Error of Prediction (SEP) which corresponds to the measuring accuracy. Even with only little effort put into the design of a back-propagation network, the ANN is clearly superior to the PLS-method. Depending on whether a network with one or two hidden layers was used, accuracy gains between 2 and 5 can be achieved compared with PLS regression. Furthermore, the ANN is less noise sensitive, for there is only a doubling of the SEP at 5% noise for ANN whereas for PLS the accuracy degrades rapidly with increasing noise. The accuracy gain also depends on the light polarization and on the measured parameters. Finally, these results have been proven experimentally, where the OS-results are in good accordance with the profiles obtained from cross-sectioning micrographs.
Norisue, Yasuhiro; Tokuda, Yasuharu; Juarez, Mayrol; Uchimido, Ryo; Fujitani, Shigeki; Stoeckel, David A
2017-02-07
Cumulative sum (CUSUM) analysis can be used to continuously monitor the performance of an individual or process and detect deviations from a preset or standard level of achievement. However, no previous study has evaluated the utility of CUSUM analysis in facilitating timely environmental assessment and interventions to improve performance of linear-probe endobronchial ultrasound-guided transbronchial needle aspiration (EBUS-TBNA). The aim of this study was to evaluate the usefulness of combined CUSUM and chronological environmental analysis as a tool to improve the learning environment for EBUS-TBNA trainees. This study was an observational chart review. To determine if performance was acceptable, CUSUM analysis was used to track procedural outcomes of trainees in EBUS-TBNA. To investigate chronological changes in the learning environment, multivariate logistic regression analysis was used to compare several indices before and after time points when significant changes occurred in proficiency. Presence of an additional attending bronchoscopist was inversely associated with nonproficiency (odds ratio, 0.117; 95% confidence interval, 0-0.749; P = 0.019). Other factors, including presence of an on-site cytopathologist and dose of sedatives used, were not significantly associated with duration of nonproficiency. Combined CUSUM and chronological environmental analysis may be useful in hastening interventions that improve performance of EBUS-TBNA.
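The CUSUM tracking described in this abstract can be illustrated with one simple and widely used variant: each procedure adds (1 - p0) to the running sum on failure and subtracts p0 on success, where p0 is the acceptable failure rate. This variant, the value of p0, and the name `cusum_path` are assumptions; the abstract does not give the exact chart or limits the authors used.

```python
def cusum_path(failures, acceptable_failure_rate):
    """Return the running CUSUM score after each consecutive procedure.

    failures -- iterable of booleans, True when the procedure failed
    A rising path signals performance worse than the acceptable rate;
    a falling or flat path signals acceptable performance.
    """
    running_total = 0.0
    path = []
    for failed in failures:
        running_total += (1.0 if failed else 0.0) - acceptable_failure_rate
        path.append(running_total)
    return path


# Made-up sequence of four procedures, with a 25% acceptable failure rate.
print(cusum_path([True, False, False, True], 0.25))
# -> [0.75, 0.5, 0.25, 1.0]
```

In practice the path is compared against preset decision limits, and the dates at which the slope changes are the "time points" that a chronological environmental analysis would then examine.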
Grantz, Erin; Haggard, Brian; Scott, J Thad
2018-06-12
We calculated four median datasets (chlorophyll a, Chl a; total phosphorus, TP; and transparency) using multiple approaches to handling censored observations, including substituting fractions of the quantification limit (QL; dataset 1 = 1QL, dataset 2 = 0.5QL) and statistical methods for censored datasets (datasets 3-4) for approximately 100 Texas, USA reservoirs. Trend analyses of differences between dataset 1 and 3 medians indicated percent difference increased linearly above thresholds in percent censored data (%Cen). This relationship was extrapolated to estimate medians for site-parameter combinations with %Cen > 80%, which were combined with dataset 3 as dataset 4. Changepoint analysis of Chl a- and transparency-TP relationships indicated threshold differences up to 50% between datasets. Recursive analysis identified secondary thresholds in dataset 4. Threshold differences show that information introduced via substitution or missing due to limitations of statistical methods biased values, underestimated error, and inflated the strength of TP thresholds identified in datasets 1-3. Analysis of covariance identified differences in linear regression models relating transparency-TP between datasets 1, 2, and the more statistically robust datasets 3-4. Study findings identify high-risk scenarios for biased analytical outcomes when using substitution. These include high probability of median overestimation when %Cen > 50-60% for a single QL, or when %Cen is as low as 16% for multiple QLs. Changepoint analysis was uniquely vulnerable to substitution effects when using medians from sites with %Cen > 50%. Linear regression analysis was less sensitive to substitution and missing data effects, but differences in model parameters for transparency cannot be discounted and could be magnified by log-transformation of the variables.
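The substitution effect discussed in this abstract is easy to reproduce in miniature. The sketch below replaces censored observations with a fraction of the quantification limit and takes the median; the data values and the helper name `substituted_median` are invented for illustration and are not from the study.

```python
import statistics


def substituted_median(values, censored, quantification_limit, fraction):
    """Median after replacing censored values with fraction * QL.

    values   -- measured values (ignored where the matching flag is True)
    censored -- booleans, True where the observation was below the QL
    """
    filled = [
        quantification_limit * fraction if is_censored else value
        for value, is_censored in zip(values, censored)
    ]
    return statistics.median(filled)


# Hypothetical site with 60% censored data and a single QL of 2.0.
measurements = [None, None, None, 3.0, 5.0]
below_limit = [True, True, True, False, False]
print(substituted_median(measurements, below_limit, 2.0, 1.0))   # 1QL   -> 2.0
print(substituted_median(measurements, below_limit, 2.0, 0.5))   # 0.5QL -> 1.0
```

With more than half of the observations censored, the median is set entirely by the chosen substitution fraction rather than by any measurement, which matches the abstract's warning about %Cen above 50-60%.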
Drivers of Wetland Conversion: a Global Meta-Analysis
van Asselen, Sanneke; Verburg, Peter H.; Vermaat, Jan E.; Janse, Jan H.
2013-01-01
Meta-analysis of case studies has become an important tool for synthesizing case study findings in land change. Meta-analyses of deforestation, urbanization, desertification and change in shifting cultivation systems have been published. This present study adds to this literature, with an analysis of the proximate causes and underlying forces of wetland conversion at a global scale using two complementary approaches of systematic review. Firstly, a meta-analysis of 105 case-study papers describing wetland conversion was performed, showing that different combinations of multiple-factor proximate causes, and underlying forces, drive wetland conversion. Agricultural development has been the main proximate cause of wetland conversion, and economic growth and population density are the most frequently identified underlying forces. Secondly, to add a more quantitative component to the study, a logistic meta-regression analysis was performed to estimate the likelihood of wetland conversion worldwide, using globally-consistent biophysical and socioeconomic location factor maps. Significant factors explaining wetland conversion, in order of importance, are market influence, total wetland area (lower conversion probability), mean annual temperature and cropland or built-up area. The results of the regression analyses support the outcomes of the meta-analysis of the processes of conversion mentioned in the individual case studies. In other meta-analyses of land change, similar factors (e.g., agricultural development, population growth, market/economic factors) are also identified as important causes of various types of land change (e.g., deforestation, desertification). Meta-analysis helps to identify commonalities across the various local case studies and to identify which variables may lead individual cases to behave differently. 
The meta-regression provides maps indicating the likelihood of wetland conversion worldwide based on the location factors that have determined historic conversions. PMID:24282580
Sternberg, Maya R; Schleicher, Rosemary L; Pfeiffer, Christine M
2013-06-01
The collection of articles in this supplement issue provides insight into the association of various covariates with concentrations of biochemical indicators of diet and nutrition (biomarkers), beyond age, race, and sex, using linear regression. We studied 10 specific sociodemographic and lifestyle covariates in combination with 29 biomarkers from NHANES 2003-2006 for persons aged ≥ 20 y. The covariates were organized into 2 sets or "chunks": sociodemographic (age, sex, race-ethnicity, education, and income) and lifestyle (dietary supplement use, smoking, alcohol consumption, BMI, and physical activity) and fit in hierarchical fashion by using each category or set of related variables to determine how covariates, jointly, are related to biomarker concentrations. In contrast to many regression modeling applications, all variables were retained in a full regression model regardless of significance to preserve the interpretation of the statistical properties of β coefficients, P values, and CIs and to keep the interpretation consistent across a set of biomarkers. The variables were preselected before data analysis, and the data analysis plan was designed at the outset to minimize the reporting of false-positive findings by limiting the amount of preliminary hypothesis testing. Although we generally found that demographic differences seen in biomarkers were over- or underestimated when ignoring other key covariates, the demographic differences generally remained significant after adjusting for sociodemographic and lifestyle variables. These articles are intended to provide a foundation to researchers to help them generate hypotheses for future studies or data analyses and/or develop predictive regression models using the wealth of NHANES data.
Cronin, Matthew A.; Amstrup, Steven C.; Durner, George M.; Noel, Lynn E.; McDonald, Trent L.; Ballard, Warren B.
1998-01-01
There is concern that caribou (Rangifer tarandus) may avoid roads and facilities (i.e., infrastructure) in the Prudhoe Bay oil field (PBOF) in northern Alaska, and that this avoidance can have negative effects on the animals. We quantified the relationship between caribou distribution and PBOF infrastructure during the post-calving period (mid-June to mid-August) with aerial surveys from 1990 to 1995. We conducted four to eight surveys per year with complete coverage of the PBOF. We identified active oil field infrastructure and used a geographic information system (GIS) to construct ten 1 km wide concentric intervals surrounding the infrastructure. We tested whether caribou distribution is related to distance from infrastructure with a chi-squared habitat utilization-availability analysis and log-linear regression. We considered bulls, calves, and total caribou of all sex/age classes separately. The habitat utilization-availability analysis indicated there was no consistent trend of attraction to or avoidance of infrastructure. Caribou frequently were more abundant than expected in the intervals close to infrastructure, and this trend was more pronounced for bulls and for total caribou of all sex/age classes than for calves. Log-linear regressions (with Poisson error structure) of numbers of caribou on distance from infrastructure were also done, with and without combining data into the 1 km distance intervals. The analysis without intervals revealed no relationship between caribou distribution and distance from oil field infrastructure, or between caribou distribution and Julian date, year, or distance from the Beaufort Sea coast. The log-linear regression with caribou combined into distance intervals showed the density of bulls and total caribou of all sex/age classes declined with distance from infrastructure.
Our results indicate that during the post-calving period: 1) caribou distribution is largely unrelated to distance from infrastructure; 2) caribou regularly use habitats in the PBOF; 3) caribou often occur close to infrastructure; and 4) caribou do not appear to avoid oil field infrastructure.
Traub, Meike; Lauer, Romy; Kesztyüs, Tibor; Wartha, Olivia; Steinacker, Jürgen Michael; Kesztyüs, Dorothea
2018-03-16
Regular breakfast and well-balanced consumption of soft drinks and screen media are associated with a lower risk of overweight and obesity in schoolchildren. The aim of this research is the combined examination of these three parameters as influencing factors for longitudinal weight development in schoolchildren in order to adapt targeted preventive measures. In the course of the Baden-Württemberg Study, Germany, data from direct measurements (baseline (2010) and follow-up (2011)) at schools were available for 1733 primary schoolchildren aged 7.08 ± 0.6 years (50.8% boys). Anthropometric measurements of the children were taken according to ISAK-standards (International Standard for Anthropometric Assessment) by trained staff. Health and lifestyle characteristics of the children and their parents were assessed in questionnaires. A linear mixed effects regression analysis was conducted to examine influences on changes in waist-to-height-ratio (WHtR), weight, and body mass index (BMI) measures. A generalised linear mixed effects regression analysis was performed to identify the relationship between breakfast, soft drink and screen media consumption with the prevalence of overweight, obesity and abdominal obesity at follow-up. According to the regression analyses, skipping breakfast led to increased changes in WHtR, weight and BMI measures. Skipping breakfast and the overconsumption of screen media at baseline led to higher odds of abdominal obesity and overweight at follow-up. No significant association between soft drink consumption and weight development was found. Targeted prevention for healthy weight status and development in primary schoolchildren should aim towards promoting balanced breakfast habits and a reduction in screen media consumption. Future research on soft drink consumption is needed. Health promoting interventions should synergistically involve children, parents, and schools.
The Baden-Württemberg Study is registered at the German Clinical Trials Register (DRKS) under the DRKS-ID: DRKS00000494.
Regression Analysis of Combined Gene Expression Regulation in Acute Myeloid Leukemia
Li, Yue; Liang, Minggao; Zhang, Zhaolei
2014-01-01
Gene expression is a combinatorial function of genetic/epigenetic factors such as copy number variation (CNV), DNA methylation (DM), transcription factors (TF) occupancy, and microRNA (miRNA) post-transcriptional regulation. At the maturity of microarray/sequencing technologies, large amounts of data measuring the genome-wide signals of those factors became available from Encyclopedia of DNA Elements (ENCODE) and The Cancer Genome Atlas (TCGA). However, there is a lack of an integrative model to take full advantage of these rich yet heterogeneous data. To this end, we developed RACER (Regression Analysis of Combined Expression Regulation), which fits the mRNA expression as response using as explanatory variables, the TF data from ENCODE, and CNV, DM, miRNA expression signals from TCGA. Briefly, RACER first infers the sample-specific regulatory activities by TFs and miRNAs, which are then used as inputs to infer specific TF/miRNA-gene interactions. Such a two-stage regression framework circumvents a common difficulty in integrating ENCODE data measured in generic cell-line with the sample-specific TCGA measurements. As a case study, we integrated Acute Myeloid Leukemia (AML) data from TCGA and the related TF binding data measured in K562 from ENCODE. As a proof-of-concept, we first verified our model formalism by 10-fold cross-validation on predicting gene expression. We next evaluated RACER on recovering known regulatory interactions, and demonstrated its superior statistical power over existing methods in detecting known miRNA/TF targets. Additionally, we developed a feature selection procedure, which identified 18 regulators, whose activities clustered consistently with cytogenetic risk groups. One of the selected regulators is miR-548p, whose inferred targets were significantly enriched for leukemia-related pathway, implicating its novel role in AML pathogenesis. 
Moreover, survival analysis using the inferred activities identified C-Fos as a potential AML prognostic marker. Together, we provided a novel framework that successfully integrated the TCGA and ENCODE data in revealing AML-specific regulatory program at global level. PMID:25340776
NASA Astrophysics Data System (ADS)
Fernández-Manso, O.; Fernández-Manso, A.; Quintano, C.
2014-09-01
Aboveground biomass (AGB) estimation from optical satellite data is usually based on regression models of original or synthetic bands. To overcome the poor relation between AGB and spectral bands due to mixed-pixels when a medium spatial resolution sensor is considered, we propose to base the AGB estimation on fraction images from Linear Spectral Mixture Analysis (LSMA). Our study area is a managed Mediterranean pine woodland (Pinus pinaster Ait.) in central Spain. A total of 1033 circular field plots were used to estimate AGB from Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER) optical data. We applied Pearson correlation statistics and stepwise multiple regression to identify suitable predictors from the set of variables of original bands, fraction imagery, Normalized Difference Vegetation Index and Tasselled Cap components. Four linear models and one nonlinear model were tested. A linear combination of ASTER band 2 (red, 0.630-0.690 μm), band 8 (short wave infrared 5, 2.295-2.365 μm) and green vegetation fraction (from LSMA) was the best AGB predictor (adjusted R² = 0.632; the root-mean-squared error of estimated AGB was 13.3 Mg ha⁻¹ (or 37.7%), resulting from cross-validation), rather than other combinations of the above cited independent variables. Results indicated that using ASTER fraction images in regression models improves the AGB estimation in Mediterranean pine forests. The spatial distribution of the estimated AGB, based on a multiple linear regression model, may be used as baseline information for forest managers in future studies, such as quantifying the regional carbon budget, fuel accumulation or monitoring of management practices.
NASA Astrophysics Data System (ADS)
Hadley, Brian Christopher
This dissertation assessed remotely sensed data and geospatial modeling technique(s) to map the spatial distribution of total above-ground biomass present on the surface of the Savannah River National Laboratory's (SRNL) Mixed Waste Management Facility (MWMF) hazardous waste landfill. Ordinary least squares (OLS) regression, regression kriging, and tree-structured regression were employed to model the empirical relationship between in-situ measured Bahia (Paspalum notatum Flugge) and Centipede [Eremochloa ophiuroides (Munro) Hack.] grass biomass against an assortment of explanatory variables extracted from fine spatial resolution passive optical and LIDAR remotely sensed data. Explanatory variables included: (1) discrete channels of visible, near-infrared (NIR), and short-wave infrared (SWIR) reflectance, (2) spectral vegetation indices (SVI), (3) spectral mixture analysis (SMA) modeled fractions, (4) narrow-band derivative-based vegetation indices, and (5) LIDAR derived topographic variables (i.e. elevation, slope, and aspect). Results showed that a linear combination of the first- (1DZ_DGVI), second- (2DZ_DGVI), and third-derivative of green vegetation indices (3DZ_DGVI) calculated from hyperspectral data recorded over the 400-960 nm wavelengths of the electromagnetic spectrum explained the largest percentage of statistical variation (R² = 0.5184) in the total above-ground biomass measurements. In general, the topographic variables did not correlate well with the MWMF biomass data, accounting for less than five percent of the statistical variation. It was concluded that tree-structured regression represented the optimum geospatial modeling technique due to a combination of model performance and efficiency/flexibility factors.
Liver Rapid Reference Set Application: Kevin Qu-Quest (2011) — EDRN Public Portal
We propose to evaluate the performance of a novel serum biomarker panel for early detection of hepatocellular carcinoma (HCC). This panel is based on markers from the ubiquitin-proteasome system (UPS) in combination with the existing known HCC biomarkers, namely, alpha-fetoprotein (AFP), AFP-L3%, and des-γ-carboxy prothrombin (DCP). To this end, we applied multivariate logistic regression analysis to optimize this biomarker algorithm tool.
Chen, Pei; Jou, Yuh-Shan; Fann, Cathy S J; Chen, Jaw-Wen; Chung, Chia-Min; Lin, Chin-Yu; Wu, Sheng-Yeu; Kang, Mei-Jyh; Chen, Ying-Chuang; Jong, Yuh-Shiun; Lo, Huey-Ming; Kang, Chih-Sen; Chen, Chien-Chung; Chang, Huan-Cheng; Huang, Nai-Kuei; Wu, Yi-Lin; Pan, Wen-Harn
2009-01-01
Previously, we observed that young-onset hypertension was independently associated with elevated plasma triglyceride(s) (TG) levels to a greater extent than other metabolic risk factors. Thus, focusing on the endophenotype--hypertension combined with elevated TG--we designed a family-based haplotype association study to explore its genetic connection with novel genetic variants of lipoprotein lipase gene (LPL), which encodes a major lipid metabolizing enzyme. Young-onset hypertension probands and their families were recruited, numbering 1,002 individuals from 345 families. Single-nucleotide polymorphism discovery for LPL, linkage disequilibrium (LD) analysis, transmission disequilibrium tests (TDT), bin construction, haplotype TDT association and logistic regression analysis were performed. We found that the CC- haplotype (i) spanning from intron 2 to intron 4 and the ACATT haplotype (ii) spanning from intron 5 to intron 6 were significantly associated with hypertension-related phenotypes: hypertension (ii, P=0.05), elevated TG (i, P=0.01), and hypertension combined with elevated TG (i, P=0.001; ii, P<0.0001), according to TDT. The risk of this hypertension subtype increased with the number of risk haplotypes in the two loci, using logistic regression model after adjusting within-family correlation. The relationships between LPL variants and hypertension-related disorders were also confirmed by an independent association study. Finally, we showed a trend that individuals with homozygous risk haplotypes had decreased LPL expression after a fatty meal, as opposed to those with protective haplotypes. In conclusion, this study strongly suggests that two LPL intronic variants may be associated with development of the hypertension endophenotype with elevated TG. Copyright 2008 Wiley-Liss, Inc.
Prolonged grief and post-traumatic growth after loss: Latent class analysis.
Zhou, Ningning; Yu, Wei; Tang, Suqin; Wang, Jianping; Killikelly, Clare
2018-06-06
Bereavement may trigger different psychological outcomes, such as prolonged grief disorder or post-traumatic growth. The relationship between these two outcomes and potential precipitators remains unknown. The current study aimed to identify classes of Chinese bereaved individuals based on prolonged grief symptoms and post-traumatic growth and to examine predictors for these classes. We used data from 273 Chinese individuals who lost a relative due to disease (92.3%), accident (4.4%) and other reasons (1.8%). Latent class analysis revealed three classes: a resilient class, a growth class, and a combined grief/growth class. A higher level of functional impairment was found for the combined grief/growth class than for the other two classes. Membership in the combined grief/growth class was significantly predicted by the younger age of the deceased and the death of a parent, child or spouse. Subjective closeness with the deceased and gender were marginally significant predictors. When the four variables were included in the multinomial regression analysis, death of a parent, child or spouse significantly predicted membership in the combined grief/growth class. These findings provide valuable information for the development of tailored interventions that may build on the bereaved individuals' personal strengths. Copyright © 2018. Published by Elsevier B.V.
Damman, Olga C; Stubbe, Janine H; Hendriks, Michelle; Arah, Onyebuchi A; Spreeuwenberg, Peter; Delnoij, Diana M J; Groenewegen, Peter P
2009-04-01
Ratings on the quality of healthcare from the consumer's perspective need to be adjusted for consumer characteristics to ensure fair and accurate comparisons between healthcare providers or health plans. Although multilevel analysis is already considered an appropriate method for analyzing healthcare performance data, it has rarely been used to assess case-mix adjustment of such data. The purpose of this article is to investigate whether multilevel regression analysis is a useful tool to detect case-mix adjusters in consumer assessment of healthcare. We used data on 11,539 consumers from 27 Dutch health plans, which were collected using the Dutch Consumer Quality Index health plan instrument. We conducted multilevel regression analyses of consumers' responses nested within health plans to assess the effects of consumer characteristics on consumer experience. We compared our findings to the results of another methodology: the impact factor approach, which combines the predictive effect of each case-mix variable with its heterogeneity across health plans. Both multilevel regression and impact factor analyses showed that age and education were the most important case-mix adjusters for consumer experience and ratings of health plans. With the exception of age, case-mix adjustment had little impact on the ranking of health plans. On both theoretical and practical grounds, multilevel modeling is useful for adequate case-mix adjustment and analysis of performance ratings.
Rimaityte, Ingrida; Ruzgas, Tomas; Denafas, Gintaras; Racys, Viktoras; Martuzevicius, Dainius
2012-01-01
Forecasting the generation of municipal solid waste (MSW) in developing countries is often a challenging task due to the lack of data and the difficulty of selecting a suitable forecasting method. This article aimed to select and evaluate several methods for MSW forecasting in a medium-scaled Eastern European city (Kaunas, Lithuania) with a rapidly developing economy, with respect to affluence-related and seasonal impacts. The MSW generation was forecast with respect to the economic activity of the city (regression modelling) and using time series analysis. The modelling based on social-economic indicators (regression implemented in LCA-IWM model) showed particular sensitivity (deviation from actual data in the range from 2.2 to 20.6%) to external factors, such as the synergetic effects of affluence parameters or changes in MSW collection system. For the time series analysis, the combination of autoregressive integrated moving average (ARIMA) and seasonal exponential smoothing (SES) techniques was found to be the most accurate (mean absolute percentage error of 6.5). The time series analysis method was very valuable for forecasting the weekly variation of waste generation data (r² > 0.87), but the forecast yearly increase should be verified against the data obtained by regression modelling. The methods and findings of this study may assist the experts, decision-makers and scientists performing forecasts of MSW generation, especially in developing countries.
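The accuracy figure quoted in this abstract is a mean absolute percentage error (MAPE). As a minimal sketch of how such a figure is computed, the snippet below uses made-up weekly tonnages; the numbers and the function name `mape` are illustrative assumptions, not data or code from the study.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, expressed in percent."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical weekly MSW tonnages (illustrative only, not from the study).
actual = [100.0, 120.0, 80.0, 100.0]
forecast = [90.0, 126.0, 84.0, 100.0]

print(f"MAPE = {mape(actual, forecast):.2f}%")
```

MAPE is undefined when an observed value is zero, which is one reason it suits aggregate waste tonnages better than sparse count data.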
Gordon, Evan M.; Stollstorff, Melanie; Vaidya, Chandan J.
2012-01-01
Many researchers have noted that the functional architecture of the human brain is relatively invariant during task performance and the resting state. Indeed, intrinsic connectivity networks (ICNs) revealed by resting-state functional connectivity analyses are spatially similar to regions activated during cognitive tasks. This suggests that patterns of task-related activation in individual subjects may result from the engagement of one or more of these ICNs; however, this has not been tested. We used a novel analysis, spatial multiple regression, to test whether the patterns of activation during an N-back working memory task could be well described by a linear combination of ICNs delineated using Independent Components Analysis at rest. We found that across subjects, the cingulo-opercular Set Maintenance ICN, as well as right and left Frontoparietal Control ICNs, were reliably activated during working memory, while Default Mode and Visual ICNs were reliably deactivated. Further, involvement of Set Maintenance, Frontoparietal Control, and Dorsal Attention ICNs was sensitive to varying working memory load. Finally, the degree of left Frontoparietal Control network activation predicted response speed, while activation in both left Frontoparietal Control and Dorsal Attention networks predicted task accuracy. These results suggest that a close relationship between resting-state networks and task-evoked activation is functionally relevant for behavior, and that spatial multiple regression analysis is a suitable method for revealing that relationship. PMID:21761505
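The idea of describing one activation map as a linear combination of network maps can be sketched as an ordinary least-squares fit across voxels. The example below is only an illustration under stated assumptions: the "ICN maps" and the "activation map" are simulated random vectors, the weights are invented, and NumPy's `lstsq` stands in for whatever software the authors used.

```python
import numpy as np

rng = np.random.default_rng(0)
n_voxels = 500

# Hypothetical spatial maps for three ICNs (one column per network).
icn_maps = rng.normal(size=(n_voxels, 3))

# Simulated task activation: a weighted mix of the maps plus noise.
true_weights = np.array([1.5, -0.8, 0.0])
activation = icn_maps @ true_weights + 0.1 * rng.normal(size=n_voxels)

# Spatial multiple regression: voxels are the observations,
# ICN maps are the predictors, plus an intercept column.
design = np.column_stack([np.ones(n_voxels), icn_maps])
betas, *_ = np.linalg.lstsq(design, activation, rcond=None)

print("estimated ICN weights:", np.round(betas[1:], 2))
```

A positive weight would correspond to a network that is engaged by the task and a negative weight to one that is deactivated, which is how the abstract's activated and deactivated ICNs can be read.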
Using Time Series Analysis to Predict Cardiac Arrest in a PICU.
Kennedy, Curtis E; Aoki, Noriaki; Mariscalco, Michele; Turley, James P
2015-11-01
To build and test cardiac arrest prediction models in a PICU, using time series analysis as input, and to measure changes in prediction accuracy attributable to different classes of time series data. Retrospective cohort study. Thirty-one bed academic PICU that provides care for medical and general surgical (not congenital heart surgery) patients. Patients experiencing a cardiac arrest in the PICU and requiring external cardiac massage for at least 2 minutes. None. One hundred three cases of cardiac arrest and 109 control cases were used to prepare a baseline dataset that consisted of 1,025 variables in four data classes: multivariate, raw time series, clinical calculations, and time series trend analysis. We trained 20 arrest prediction models using a matrix of five feature sets (combinations of data classes) with four modeling algorithms: linear regression, decision tree, neural network, and support vector machine. The reference model (multivariate data with regression algorithm) had an accuracy of 78% and 87% area under the receiver operating characteristic curve. The best model (multivariate + trend analysis data with support vector machine algorithm) had an accuracy of 94% and 98% area under the receiver operating characteristic curve. Cardiac arrest predictions based on a traditional model built with multivariate data and a regression algorithm misclassified cases 3.7 times more frequently than predictions that included time series trend analysis and built with a support vector machine algorithm. Although the final model lacks the specificity necessary for clinical application, we have demonstrated how information from time series data can be used to increase the accuracy of clinical prediction models.
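The "3.7 times" comparison can be approximately reproduced from the two reported accuracies, assuming that the misclassification rate is simply one minus accuracy (the authors may have worked from raw counts, so this is a rough check and not their calculation):

```python
reference_accuracy = 0.78  # multivariate data, regression algorithm
best_accuracy = 0.94       # multivariate + trend data, support vector machine

reference_error = 1 - reference_accuracy
best_error = 1 - best_accuracy
ratio = reference_error / best_error

print(f"misclassification ratio = {ratio:.2f}")
```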
Analyzing thresholds and efficiency with hierarchical Bayesian logistic regression.
Houpt, Joseph W; Bittner, Jennifer L
2018-07-01
Ideal observer analysis is a fundamental tool used widely in vision science for analyzing the efficiency with which a cognitive or perceptual system uses available information. The performance of an ideal observer provides a formal measure of the amount of information in a given experiment. The ratio of human to ideal performance is then used to compute efficiency, a construct that can be directly compared across experimental conditions while controlling for the differences due to the stimuli and/or task specific demands. In previous research using ideal observer analysis, the effects of varying experimental conditions on efficiency have been tested using ANOVAs and pairwise comparisons. In this work, we present a model that combines Bayesian estimates of psychometric functions with hierarchical logistic regression for inference about both unadjusted human performance metrics and efficiencies. Our approach improves upon the existing methods by constraining the statistical analysis using a standard model connecting stimulus intensity to human observer accuracy and by accounting for variability in the estimates of human and ideal observer performance scores. This allows for both individual and group level inferences. Copyright © 2018 Elsevier Ltd. All rights reserved.
Optimizing Hybrid Metrology: Rigorous Implementation of Bayesian and Combined Regression.
Henn, Mark-Alexander; Silver, Richard M; Villarrubia, John S; Zhang, Nien Fan; Zhou, Hui; Barnes, Bryan M; Ming, Bin; Vladár, András E
2015-01-01
Hybrid metrology, e.g., the combination of several measurement techniques to determine critical dimensions, is an increasingly important approach to meet the needs of the semiconductor industry. A proper use of hybrid metrology may yield not only more reliable estimates for the quantitative characterization of 3-D structures but also a more realistic estimation of the corresponding uncertainties. Recent developments at the National Institute of Standards and Technology (NIST) feature the combination of optical critical dimension (OCD) measurements and scanning electron microscope (SEM) results. The hybrid methodology offers the potential to make measurements of essential 3-D attributes that may not be otherwise feasible. However, combining techniques gives rise to essential challenges in error analysis and comparing results from different instrument models, especially the effect of systematic and highly correlated errors in the measurement on the χ² function that is minimized. Both hypothetical examples and measurement data are used to illustrate solutions to these challenges.
Garnier, Alain; Gaillet, Bruno
2015-12-01
Few mathematical models of fermentation allow analytical solutions of batch process dynamics. The most widely used is the combination of logistic microbial growth kinetics with the Luedeking-Piret bioproduct synthesis relation. However, the logistic equation is principally based on formalistic similarities and only fits a limited range of fermentation types. In this article, we have developed an analytical solution for the combination of Monod growth kinetics with the Luedeking-Piret relation, which can be identified by linear regression and used to simulate batch fermentation evolution. Two classical examples are used to show the quality of fit and the simplicity of the method proposed. A solution for the combination of the Haldane substrate-limited growth model with the Luedeking-Piret relation is also provided. These models could prove useful for the analysis of fermentation data in industry as well as academia. © 2015 Wiley Periodicals, Inc.
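For orientation, the textbook forms of the two ingredients are Monod growth, dX/dt = mu_max * S / (K_S + S) * X, and the Luedeking-Piret relation, dP/dt = alpha * dX/dt + beta * X. The sketch below integrates them numerically with a forward-Euler step. It is not the analytical solution derived in the article, and every parameter value and name (`simulate_batch`, `y_xs`, and so on) is an illustrative assumption.

```python
def simulate_batch(x0, s0, p0, mu_max, k_s, y_xs, alpha, beta,
                   dt=0.01, t_end=10.0):
    """Forward-Euler integration of Monod growth with Luedeking-Piret product formation."""
    x, s, p = x0, s0, p0
    for _ in range(int(round(t_end / dt))):
        mu = mu_max * s / (k_s + s)        # Monod specific growth rate
        dx = mu * x * dt                   # biomass formed in this step
        dx = min(dx, y_xs * s)             # growth cannot exceed remaining substrate
        p += alpha * dx + beta * x * dt    # growth- and non-growth-associated product
        x += dx
        s = max(s - dx / y_xs, 0.0)        # substrate consumed via yield coefficient
    return x, s, p


# Hypothetical parameters (g/L and 1/h), chosen only to give a plausible curve.
x0, s0, p0 = 0.1, 10.0, 0.0
x, s, p = simulate_batch(x0, s0, p0, mu_max=0.5, k_s=0.5,
                         y_xs=0.5, alpha=2.0, beta=0.1)
print(f"final biomass={x:.3f}, substrate={s:.3f}, product={p:.3f}")
```

A numerical scheme like this is useful mainly as a cross-check: a closed-form solution, where available, avoids the step-size dependence of the integration.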
Principal component regression analysis with SPSS.
Liu, R X; Kuang, J; Gong, Q; Hou, X L
2003-06-01
The paper introduces the indices used to diagnose multicollinearity, the basic principle of principal component regression, and a method for determining the 'best' equation. It uses an example to describe how to carry out principal component regression analysis with SPSS 10.0, including all calculation steps of the principal component regression and all operations of the linear regression, factor analysis, descriptives, compute variable and bivariate correlations procedures in SPSS 10.0. Principal component regression analysis can be used to overcome the disturbance caused by multicollinearity. Carrying it out with SPSS makes the analysis simpler, faster and accurate.
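The general procedure (standardize the predictors, extract principal components, regress the response on the leading components, then map the coefficients back) can be sketched outside SPSS as follows. This is a NumPy illustration on simulated collinear data, not a reproduction of the paper's SPSS steps; the function name `pcr_fit` and all data are assumptions.

```python
import numpy as np


def pcr_fit(X, y, n_components):
    """Principal component regression on standardized predictors.

    Returns coefficients on the standardized-predictor scale and the
    eigenvalues of the retained components.
    """
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = (Z.T @ Z) / (len(Z) - 1)                 # correlation matrix
    eigvals, eigvecs = np.linalg.eigh(corr)         # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_components]
    V = eigvecs[:, order]                           # retained loadings
    scores = Z @ V                                  # component scores
    gamma, *_ = np.linalg.lstsq(scores, y - y.mean(), rcond=None)
    return V @ gamma, eigvals[order]


# Simulated, nearly collinear predictors (illustrative only).
rng = np.random.default_rng(1)
x1 = rng.normal(size=200)
x2 = x1 + 0.05 * rng.normal(size=200)
X = np.column_stack([x1, x2])
y = x1 + x2 + 0.1 * rng.normal(size=200)

beta_std, kept_eigvals = pcr_fit(X, y, n_components=1)
print("standardized coefficients:", np.round(beta_std, 3))
print("retained eigenvalue:", np.round(kept_eigvals, 3))
```

Dropping the small-eigenvalue component is what stabilizes the estimates: the two collinear predictors share the effect equally instead of receiving large, offsetting coefficients.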
Mao, Nini; Liu, Yunting; Chen, Kewei; Yao, Li; Wu, Xia
2018-06-05
Multiple neuroimaging modalities have been developed providing various aspects of information on the human brain. Used together and properly, these complementary multimodal neuroimaging data integrate multisource information which can facilitate a diagnosis and improve the diagnostic accuracy. In this study, 3 types of brain imaging data (sMRI, FDG-PET, and florbetapir-PET) were fused in the hope to improve diagnostic accuracy, and multivariate methods (logistic regression) were applied to these trimodal neuroimaging indices. Then, the receiver-operating characteristic (ROC) method was used to analyze the outcomes of the logistic classifier, with either each index, multiples from each modality, or all indices from all 3 modalities, to investigate their differential abilities to identify the disease. With increasing numbers of indices within each modality and across modalities, the accuracy of identifying Alzheimer disease (AD) increases to varying degrees. For example, the area under the ROC curve is above 0.98 when all the indices from the 3 imaging data types are combined. Using a combination of different indices, the results confirmed the initial hypothesis that different biomarkers were potentially complementary, and thus the conjoint analysis of multiple information from multiple sources would improve the capability to identify diseases such as AD and mild cognitive impairment. © 2018 S. Karger AG, Basel.
Sharp, T G
1984-02-01
The study was designed to determine whether any one of seven selected variables or a combination of the variables is predictive of performance on the State Board Test Pool Examination. The selected variables studied were: high school grade point average (HSGPA), The University of Tennessee, Knoxville, College of Nursing grade point average (GPA), and American College Test Assessment (ACT) standard scores (English, ENG; mathematics, MA; social studies, SS; natural sciences, NSC; composite, COMP). Data utilized were from graduates of the baccalaureate program of The University of Tennessee, Knoxville, College of Nursing from 1974 through 1979. The sample of 322 was selected from a total population of 572. The Statistical Analysis System (SAS) was designed to accomplish analysis of the predictive relationship of each of the seven selected variables to State Board Test Pool Examination performance (result of pass or fail), a stepwise discriminant analysis was designed for determining the predictive relationship of the strongest combination of the independent variables to overall State Board Test Pool Examination performance (result of pass or fail), and stepwise multiple regression analysis was designed to determine the strongest predictive combination of selected variables for each of the five subexams of the State Board Test Pool Examination. The selected variables were each found to be predictive of SBTPE performance (result of pass or fail). The strongest combination for predicting SBTPE performance (result of pass or fail) was found to be GPA, MA, and NSC.
Martins, Filipe C; Santiago, Ines de; Trinh, Anne; Xian, Jian; Guo, Anne; Sayal, Karen; Jimenez-Linan, Mercedes; Deen, Suha; Driver, Kristy; Mack, Marie; Aslop, Jennifer; Pharoah, Paul D; Markowetz, Florian; Brenton, James D
2014-12-17
TP53 and BRCA1/2 mutations are the main drivers in high-grade serous ovarian carcinoma (HGSOC). We hypothesise that combining tissue phenotypes from image analysis of tumour sections with genomic profiles could reveal other significant driver events. Automatic estimates of stromal content combined with genomic analysis of TCGA HGSOC tumours show that stroma strongly biases estimates of PTEN expression. Tumour-specific PTEN expression was tested in two independent cohorts using tissue microarrays containing 521 cases of HGSOC. PTEN loss or downregulation occurred in 77% of the first cohort by immunofluorescence and 52% of the validation group by immunohistochemistry, and is associated with worse survival in a multivariate Cox-regression model adjusted for study site, age, stage and grade. Reanalysis of TCGA data shows that hemizygous loss of PTEN is common (36%) and expression of PTEN and expression of androgen receptor are positively associated. Low androgen receptor expression was associated with reduced survival in data from TCGA and immunohistochemical analysis of the first cohort. PTEN loss is a common event in HGSOC and defines a subgroup with significantly worse prognosis, suggesting the rational use of drugs to target PI3K and androgen receptor pathways for HGSOC. This work shows that integrative approaches combining tissue phenotypes from images with genomic analysis can resolve confounding effects of tissue heterogeneity and should be used to identify new drivers in other cancers.
Arano, Ichiro; Sugimoto, Tomoyuki; Hamasaki, Toshimitsu; Ohno, Yuko
2010-04-23
Survival analysis methods such as the Kaplan-Meier method, log-rank test, and Cox proportional hazards regression (Cox regression) are commonly used to analyze data from randomized withdrawal studies in patients with major depressive disorder. However, such methods may be inappropriate when long-term censored relapse-free times appear in the data, because they assume that, if complete follow-up were possible for all individuals, each would eventually experience the event of interest. In this paper, to analyse data including such long-term censored relapse-free times, we discuss a semi-parametric cure regression (Cox cure regression), which combines a logistic formulation for the probability of occurrence of an event with a Cox proportional hazards specification for the time of occurrence of the event. In specifying the treatment's effect on disease-free survival, we consider the fraction of long-term survivors and the risks associated with a relapse of the disease. In addition, we develop a tree-based method for time-to-event data to identify groups of patients with differing prognoses (cure survival CART). Although analysis methods typically adapt the log-rank statistic for recursive partitioning procedures, the method applied here used a likelihood ratio (LR) test statistic from fitting a cure survival regression assuming exponential and Weibull distributions for the latency time of relapse. The method is illustrated using data from a sertraline randomized withdrawal study in patients with major depressive disorder.
We concluded that Cox cure regression reveals who may be cured and how the treatment and other factors affect the cure incidence and the relapse time of uncured patients, and that cure survival CART output provides easily understandable and interpretable information, useful both in identifying groups of patients with differing prognoses and in applying Cox cure regression models that lead to meaningful interpretations.
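The mixture structure behind a cure model can be illustrated with a minimal sketch. It assumes a logistic model for the probability of eventually relapsing and, purely for brevity, an exponential latency distribution in place of the semi-parametric Cox component. The function names and coefficient values are hypothetical and are not taken from the study.

```python
import math


def uncured_probability(intercept, beta, x):
    """Logistic model for the probability that a patient will eventually relapse."""
    return 1.0 / (1.0 + math.exp(-(intercept + beta * x)))


def population_survival(t, p_uncured, hazard):
    """Mixture cure survival: cured patients never relapse,
    uncured patients follow an exponential latency distribution."""
    latency_survival = math.exp(-hazard * t)
    return (1.0 - p_uncured) + p_uncured * latency_survival


# Hypothetical coefficients; x = 1.0 stands for the active-treatment arm.
p = uncured_probability(intercept=0.0, beta=-1.0, x=1.0)
for t in (0.0, 10.0, 1000.0):
    print(t, round(population_survival(t, p, hazard=0.1), 4))
```

With long follow-up the survival curve levels off at the cured fraction 1 - p instead of falling to zero, which is the plateau that standard Cox regression cannot represent.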
NASA Astrophysics Data System (ADS)
Kamaruddin, Ainur Amira; Ali, Zalila; Noor, Norlida Mohd.; Baharum, Adam; Ahmad, Wan Muhamad Amir W.
2014-07-01
Logistic regression analysis examines the influence of various factors on a dichotomous outcome by estimating the probability of the event's occurrence. Logistic regression, also called a logit model, is a statistical procedure used to model dichotomous outcomes. In the logit model the log odds of the dichotomous outcome is modeled as a linear combination of the predictor variables. The log odds ratio in logistic regression provides a description of the probabilistic relationship of the variables and the outcome. In conducting logistic regression, selection procedures are used to select important predictor variables; diagnostics are used to check that the assumptions are valid, including independence of errors, linearity in the logit for continuous variables, absence of multicollinearity, and lack of strongly influential outliers; and a test statistic is calculated to determine the aptness of the model. This study used the binary logistic regression model to investigate overweight and obesity among rural secondary school students on the basis of their demographic profile, medical history, diet and lifestyle. The results indicate that overweight and obesity of students are influenced by obesity in family and the interaction between a student's ethnicity and routine meals intake. The odds of a student being overweight and obese are higher for a student having a family history of obesity and for a non-Malay student who frequently takes routine meals as compared to a Malay student.
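The relationship between log odds, probability and odds ratio described above can be sketched in a few lines. The intercept and coefficient below are hypothetical illustration values, not estimates from the study.

```python
import math


def log_odds(intercept, coefs, x):
    """Linear predictor of the logit model."""
    return intercept + sum(b * v for b, v in zip(coefs, x))


def probability(intercept, coefs, x):
    """Inverse logit: convert log odds into an event probability."""
    return 1.0 / (1.0 + math.exp(-log_odds(intercept, coefs, x)))


def odds_ratio(coef):
    """Multiplicative change in the odds for a one-unit increase in a predictor."""
    return math.exp(coef)


# Hypothetical model with a single binary predictor (1 = family history of obesity).
INTERCEPT, COEFS = -1.5, [0.8]
p_without = probability(INTERCEPT, COEFS, [0])
p_with = probability(INTERCEPT, COEFS, [1])
print(round(p_without, 3), round(p_with, 3), round(odds_ratio(COEFS[0]), 3))
```

The ratio of the two odds, p/(1 - p) with and without the predictor, equals exp(coefficient), which is why fitted coefficients are usually reported as odds ratios.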
Extreme learning machine for ranking: generalization analysis and applications.
Chen, Hong; Peng, Jiangtao; Zhou, Yicong; Li, Luoqing; Pan, Zhibin
2014-05-01
The extreme learning machine (ELM) has attracted increasing attention recently with its successful applications in classification and regression. In this paper, we investigate the generalization performance of ELM-based ranking. A new regularized ranking algorithm is proposed based on the combinations of activation functions in ELM. The generalization analysis is established for the ELM-based ranking (ELMRank) in terms of the covering numbers of hypothesis space. Empirical results on the benchmark datasets show the competitive performance of the ELMRank over the state-of-the-art ranking methods. Copyright © 2014 Elsevier Ltd. All rights reserved.
Exhaustive Search for Sparse Variable Selection in Linear Regression
NASA Astrophysics Data System (ADS)
Igarashi, Yasuhiko; Takenaka, Hikaru; Nakanishi-Ohno, Yoshinori; Uemura, Makoto; Ikeda, Shiro; Okada, Masato
2018-04-01
We propose a K-sparse exhaustive search (ES-K) method and a K-sparse approximate exhaustive search method (AES-K) for selecting variables in linear regression. With these methods, K-sparse combinations of variables are tested exhaustively assuming that the optimal combination of explanatory variables is K-sparse. By collecting the results of exhaustively computing ES-K, various approximate methods for selecting sparse variables can be summarized as density of states. With this density of states, we can compare different methods for selecting sparse variables such as relaxation and sampling. For large problems where the combinatorial explosion of explanatory variables is crucial, the AES-K method enables density of states to be effectively reconstructed by using the replica-exchange Monte Carlo method and the multiple histogram method. Applying the ES-K and AES-K methods to type Ia supernova data, we confirmed the conventional understanding in astronomy when an appropriate K is given beforehand. However, we found it difficult to determine K from the data. Using virtual measurement and analysis, we argue that this is caused by data shortage.
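The exhaustive part of the idea can be sketched as follows: every subset of K explanatory variables is fitted by ordinary least squares and the subset with the smallest residual sum of squares is kept. This is only a toy illustration on made-up data with no intercept. The helper names are hypothetical, and the density-of-states summary and the AES-K sampling step are not reproduced.

```python
from itertools import combinations


def solve(a, b):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def rss(rows, y, subset):
    """Residual sum of squares of the least-squares fit restricted to `subset`."""
    cols = [[row[j] for row in rows] for j in subset]
    gram = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    rhs = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    beta = solve(gram, rhs)  # normal equations
    return sum((yi - sum(b * row[j] for b, j in zip(beta, subset))) ** 2
               for row, yi in zip(rows, y))


def exhaustive_search(rows, y, k):
    """Return the K-variable subset with the smallest residual sum of squares."""
    return min(combinations(range(len(rows[0])), k), key=lambda s: rss(rows, y, s))


# Made-up data: y depends only on variables 0 and 2 (y = 2*x0 + 3*x2).
X = [[1, 0, 0, 1], [0, 1, 0, 1], [0, 0, 1, 0],
     [1, 1, 0, 0], [1, 0, 1, 1], [0, 1, 1, 0]]
y = [2 * row[0] + 3 * row[2] for row in X]
print(exhaustive_search(X, y, k=2))
```

The number of subsets grows combinatorially with the number of variables, which is the motivation the abstract gives for the approximate AES-K variant.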
Drying kinetics and characteristics of combined infrared-vacuum drying of button mushroom slices
NASA Astrophysics Data System (ADS)
Salehi, Fakhreddin; Kashaninejad, Mahdi; Jafarianlari, Ali
2017-05-01
Infrared-vacuum drying characteristics of button mushroom (Agaricus bisporus) were evaluated in a combined dryer system. The effects of drying parameters, including infrared radiation power (150-375 W), system pressure (5-15 kPa) and time (0-160 min), on the drying kinetics and characteristics of button mushroom slices were investigated. Both the infrared lamp power and vacuum pressure influenced the drying time of button mushroom slices. The rate constants of nine different kinetic models for thin-layer drying were established by nonlinear regression analysis of the experimental data, which were found to be affected mainly by the infrared power level, while system pressure had little effect on the moisture ratios. The regression results showed that the Page model satisfactorily described the drying behavior of button mushroom slices, with the highest R value and lowest SE values. The effective moisture diffusivity increased with power and ranged between 0.83 × 10⁻⁹ and 2.33 × 10⁻⁹ m²/s. Raising the infrared power had a negative effect on ΔE, which increased with increasing infrared radiation power.
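For orientation, the Page thin-layer model expresses the moisture ratio as MR = exp(-k t^n). The sketch below fits k and n with the log-linearised form ln(-ln MR) = ln k + n ln t. This is a simpler stand-in for the nonlinear regression used in the study, and the data points are synthetic values generated from assumed parameters, not measurements.

```python
import math


def page_model(t, k, n_exp):
    """Page thin-layer drying model: moisture ratio at time t."""
    return math.exp(-k * t ** n_exp)


def fit_page_linearized(times, moisture_ratios):
    """Estimate (k, n) by ordinary least squares on ln(-ln MR) versus ln t.
    Requires t > 0 and 0 < MR < 1 for every observation."""
    xs = [math.log(t) for t in times]
    ys = [math.log(-math.log(mr)) for mr in moisture_ratios]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (v - y_mean) for x, v in zip(xs, ys))
    n_exp = sxy / sxx
    k = math.exp(y_mean - n_exp * x_mean)
    return k, n_exp


# Synthetic drying curve from assumed parameters (times in minutes).
TRUE_K, TRUE_N = 0.01, 1.2
times = [10.0, 20.0, 40.0, 80.0, 160.0]
ratios = [page_model(t, TRUE_K, TRUE_N) for t in times]
k_hat, n_hat = fit_page_linearized(times, ratios)
print(k_hat, n_hat)
```

On noisy measurements the linearised fit weights the errors differently from a direct nonlinear fit, so it is best treated as a way to obtain starting values.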
Bumps in river profiles: uncertainty assessment and smoothing using quantile regression techniques
NASA Astrophysics Data System (ADS)
Schwanghart, Wolfgang; Scherler, Dirk
2017-12-01
The analysis of longitudinal river profiles is an important tool for studying landscape evolution. However, characterizing river profiles based on digital elevation models (DEMs) suffers from errors and artifacts that particularly prevail along valley bottoms. The aim of this study is to characterize uncertainties that arise from the analysis of river profiles derived from different, near-globally available DEMs. We devised new algorithms - quantile carving and the CRS algorithm - that rely on quantile regression to enable hydrological correction and the uncertainty quantification of river profiles. We find that globally available DEMs commonly overestimate river elevations in steep topography. The distributions of elevation errors become increasingly wider and right skewed if adjacent hillslope gradients are steep. Our analysis indicates that the AW3D DEM has the highest precision and lowest bias for the analysis of river profiles in mountainous topography. The new 12 m resolution TanDEM-X DEM has a very low precision, most likely due to the combined effect of steep valley walls and the presence of water surfaces in valley bottoms. Compared to the conventional approaches of carving and filling, we find that our new approach is able to reduce the elevation bias and errors in longitudinal river profiles.
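Quantile regression rests on the asymmetric "pinball" (check) loss. The sketch below shows only that building block on made-up elevation values: minimising the loss over a constant recovers the requested sample quantile. It is not an implementation of quantile carving or of the CRS algorithm, and the function names are hypothetical.

```python
def pinball_loss(residual, tau):
    """Check loss: positive residuals are weighted by tau,
    negative residuals by (1 - tau)."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual


def total_loss(values, candidate, tau):
    """Summed pinball loss of a constant prediction `candidate`."""
    return sum(pinball_loss(v - candidate, tau) for v in values)


def best_constant(values, tau):
    """A minimiser of the pinball loss always exists among the data points."""
    return min(values, key=lambda c: total_loss(values, c, tau))


# Made-up elevations (m) with one high outlier.
elevations = [100.0, 101.0, 102.0, 103.0, 110.0]
print(best_constant(elevations, 0.5), best_constant(elevations, 0.1))
```

Choosing a low tau pulls the fit toward the lower envelope of the data, which is relevant when the errors are right-skewed as reported for steep topography.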
Effects of climate change on Salmonella infections.
Akil, Luma; Ahmad, H Anwar; Reddy, Remata S
2014-12-01
Climate change and global warming have been reported to increase spread of foodborne pathogens. To understand these effects on Salmonella infections, modeling approaches such as regression analysis and neural network (NN) were used. Monthly data for Salmonella outbreaks in Mississippi (MS), Tennessee (TN), and Alabama (AL) were analyzed from 2002 to 2011 using analysis of variance and time series analysis. Meteorological data were collected and the correlation with salmonellosis was examined using regression analysis and NN. A seasonal trend in Salmonella infections was observed (p<0.001). Strong positive correlation was found between high temperature and Salmonella infections in MS and for the combined states (MS, TN, AL) models (R² = 0.554 and R² = 0.415, respectively). NN models showed a strong effect of rise in temperature on the Salmonella outbreaks. In this study, an increase of 1°F was shown to result in four additional cases of Salmonella infection in MS. However, no correlation between monthly average precipitation rate and Salmonella infections was observed. There is consistent evidence that gastrointestinal infection with bacterial pathogens is positively correlated with ambient temperature, as warmer temperatures enable more rapid replication. Warming trends in the United States and specifically in the southern states may increase rates of Salmonella infections.
Li, Hong Zhi; Tao, Wei; Gao, Ting; Li, Hui; Lu, Ying Hua; Su, Zhong Min
2011-01-01
We propose a generalized regression neural network (GRNN) approach based on grey relational analysis (GRA) and principal component analysis (PCA) (GP-GRNN) to improve the accuracy of density functional theory (DFT) calculation for homolysis bond dissociation energies (BDE) of Y-NO bond. As a demonstration, this combined quantum chemistry calculation with the GP-GRNN approach has been applied to evaluate the homolysis BDE of 92 Y-NO organic molecules. The results show that the full-descriptor GRNN without GRA and PCA (F-GRNN) and with GRA (G-GRNN) approaches reduce the root-mean-square (RMS) of the calculated homolysis BDE of 92 organic molecules from 5.31 to 0.49 and 0.39 kcal mol(-1) for the B3LYP/6-31G (d) calculation. Then the newly developed GP-GRNN approach further reduces the RMS to 0.31 kcal mol(-1). Thus, the GP-GRNN correction on top of B3LYP/6-31G (d) can improve the accuracy of calculating the homolysis BDE in quantum chemistry and can predict homolysis BDE which cannot be obtained experimentally.
IPMP Global Fit - A one-step direct data analysis tool for predictive microbiology.
Huang, Lihan
2017-12-04
The objective of this work is to develop and validate a unified optimization algorithm for performing one-step global regression analysis of isothermal growth and survival curves for determination of kinetic parameters in predictive microbiology. The algorithm is incorporated with user-friendly graphical interfaces (GUIs) to develop a data analysis tool, the USDA IPMP-Global Fit. The GUIs are designed to guide the users to easily navigate through the data analysis process and properly select the initial parameters for different combinations of mathematical models. The software is developed for one-step kinetic analysis to directly construct tertiary models by minimizing the global error between the experimental observations and mathematical models. The current version of the software is specifically designed for constructing tertiary models with time and temperature as the independent model parameters in the package. The software is tested with a total of 9 different combinations of primary and secondary models for growth and survival of various microorganisms. The results of data analysis show that this software provides accurate estimates of kinetic parameters. In addition, it can be used to improve the experimental design and data collection for more accurate estimation of kinetic parameters. IPMP-Global Fit can be used in combination with the regular USDA-IPMP for solving the inverse problems and developing tertiary models in predictive microbiology. Published by Elsevier B.V.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Nee, K.; Bryan, S.; Levitskaia, T.; ...
2017-12-28
The reliability of chemical processes can be greatly improved by implementing inline monitoring systems. Combining multivariate analysis with non-destructive sensors can enhance the process without interfering with the operation. Here, we present hierarchical models using both principal component analysis and partial least square analysis developed for different chemical components representative of solvent extraction process streams. A training set of 380 samples and an external validation set of 95 samples were prepared, and near-infrared and Raman spectral data as well as conductivity under variable temperature conditions were collected. The results from the models indicate that careful selection of the spectral range is important. By compressing the data through Principal Component Analysis (PCA), we lower the rank of the data set to its most dominant features while maintaining the key principal components to be used in the regression analysis. Within the studied data set, concentrations of five chemical components were modeled: total nitrate (NO3-), total acid (H+), neodymium (Nd3+), sodium (Na+), and ionic strength (I.S.). The best overall model prediction for each of the species studied used a combined data set comprised of complementary techniques including NIR, Raman, and conductivity. Finally, our study shows that chemometric models are powerful but require a significant amount of carefully analyzed data to capture variations in the chemistry.
Kjelstrom, L.C.
1995-01-01
Many individual springs and groups of springs discharge water from volcanic rocks that form the north canyon wall of the Snake River between Milner Dam and King Hill. Previous estimates of annual mean discharge from these springs have been used to understand the hydrology of the eastern part of the Snake River Plain. Four methods that were used in previous studies or developed to estimate annual mean discharge since 1902 were (1) water-budget analysis of the Snake River; (2) correlation of water-budget estimates with discharge from 10 index springs; (3) determination of the combined discharge from individual springs or groups of springs by using annual discharge measurements of 8 springs, gaging-station records of 4 springs and 3 sites on the Malad River, and regression equations developed from 5 of the measured springs; and (4) a single regression equation that correlates gaging-station records of 2 springs with historical water-budget estimates. Comparisons made among the four methods of estimating annual mean spring discharges from 1951 to 1959 and 1963 to 1980 indicated that differences were about equivalent to a measurement error of 2 to 3 percent. The method that best demonstrates the response of annual mean spring discharge to changes in ground-water recharge and discharge is method 3, which combines the measurements and regression estimates of discharge from individual springs.
Newman, J; Egan, T; Harbourne, N; O'Riordan, D; Jacquier, J C; O'Sullivan, M
2014-08-01
Sensory evaluation of ingredients with a bitter taste can be problematic during the research and development phase of new food products. In this study, 19 dairy protein hydrolysates (DPH) were analysed by an electronic tongue and by their physicochemical characteristics. The data obtained from these methods were correlated with bitterness intensity as scored by a trained sensory panel, and each model was also assessed for its predictive capability. The physicochemical characteristics of the DPHs investigated were degree of hydrolysis (DH%), and data relating to peptide size and relative hydrophobicity from size exclusion chromatography (SEC) and reverse phase (RP) HPLC. Partial least square regression (PLS) was used to construct the prediction models. All PLS regressions had good correlations (0.78 to 0.93), with the strongest being the combination of data obtained from SEC and RP HPLC. However, the PLS with the strongest predictive power was based on the e-tongue, which had the PLS regression with the lowest root mean predicted residual error sum of squares (PRESS) in the study. The results show that the PLS models constructed with the e-tongue and with the combination of SEC and RP-HPLC have the potential to be used for prediction of bitterness, thus reducing the reliance on sensory analysis of DPHs in future food research. Copyright © 2014 Elsevier B.V. All rights reserved.
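The PRESS statistic mentioned above is a leave-one-out measure: each sample is predicted by a model fitted without it, and the squared prediction errors are summed. The sketch below computes a root mean PRESS for a one-predictor least-squares line, which stands in for the PLS models of the study. The data values and function names are made up for illustration.

```python
import math


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one predictor."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sum((x - x_mean) * (v - y_mean) for x, v in zip(xs, ys)) / sxx
    return slope, y_mean - slope * x_mean


def root_mean_press(xs, ys):
    """Leave-one-out PRESS, reported as the root of its mean."""
    press = 0.0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        press += (ys[i] - (intercept + slope * xs[i])) ** 2
    return math.sqrt(press / len(xs))


# Made-up instrument responses and panel bitterness scores.
response = [1.0, 2.0, 3.0, 4.0, 5.0]
bitterness = [1.1, 1.9, 3.2, 3.9, 5.1]
print(root_mean_press(response, bitterness))
```

A model can correlate well with its own training data and still have a high PRESS, which is why the abstract distinguishes the strongest correlation from the strongest predictive power.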
Peak-flow characteristics of Virginia streams
Austin, Samuel H.; Krstolic, Jennifer L.; Wiegand, Ute
2011-01-01
Peak-flow annual exceedance probabilities, also called probability-percent chance flow estimates, and regional regression equations are provided describing the peak-flow characteristics of Virginia streams. Statistical methods are used to evaluate peak-flow data. Analysis of Virginia peak-flow data collected from 1895 through 2007 is summarized. Methods are provided for estimating unregulated peak flow of gaged and ungaged streams. Station peak-flow characteristics identified by fitting the logarithms of annual peak flows to a Log Pearson Type III frequency distribution yield annual exceedance probabilities of 0.5, 0.4292, 0.2, 0.1, 0.04, 0.02, 0.01, 0.005, and 0.002 for 476 streamgaging stations. Stream basin characteristics computed using spatial data and a geographic information system are used as explanatory variables in regional regression model equations for six physiographic regions to estimate regional annual exceedance probabilities at gaged and ungaged sites. Weighted peak-flow values that combine annual exceedance probabilities computed from gaging station data and from regional regression equations provide improved peak-flow estimates. Text, figures, and lists are provided summarizing selected peak-flow sites, delineated physiographic regions, peak-flow estimates, basin characteristics, regional regression model equations, error estimates, definitions, data sources, and candidate regression model equations. This study supersedes previous studies of peak flows in Virginia.
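An annual exceedance probability p corresponds to an average recurrence interval of 1/p years. The short sketch below converts the probabilities listed in the abstract; the function name is only illustrative.

```python
# Annual exceedance probabilities listed in the abstract.
AEPS = [0.5, 0.4292, 0.2, 0.1, 0.04, 0.02, 0.01, 0.005, 0.002]


def recurrence_interval_years(aep):
    """Average recurrence interval (years) for an annual exceedance probability."""
    return 1.0 / aep


for aep in AEPS:
    print(f"AEP {aep:<6} -> about {recurrence_interval_years(aep):.2f} years")
```

For example, 0.01 corresponds to the familiar 100-year peak flow and 0.4292 to a recurrence interval of about 2.33 years.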
Feasibility study of palm-based fuels for hybrid rocket motor applications
NASA Astrophysics Data System (ADS)
Tarmizi Ahmad, M.; Abidin, Razali; Taha, A. Latif; Anudip, Amzaryi
2018-02-01
This paper describes a combined analysis of pure palm-based wax for use as a solid fuel in a hybrid rocket engine. The calorific value of pure palm wax was measured using a bomb calorimeter. An experimental rocket engine and static test stand facility were established. After initial measurement and calibration, the test procedures were repeated. The instrumentation allows measurement of fuel regression rates and oxidizer mass flow rates, as well as measurements on stearic acid rocket motors. Similar tests were also carried out with stearic acid (from palm oil by-products) dissolved with nitrocellulose and bee solution. Calculated data and experiments show that regression rates and thrust can be achieved even with the pure palm-based wax tested. Additionally, palm-based wax was mixed with beeswax, which is characterized by a higher nominal melting temperature, to shift the moisturizing points to higher temperatures without affecting regression rate values. Calorimetric measurements and ballistic experiments were performed on this new fuel formulation. The new formulation holds promise for applications over a wide range of temperatures.
Age Estimation of Infants Through Metric Analysis of Developing Anterior Deciduous Teeth.
Viciano, Joan; De Luca, Stefano; Irurita, Javier; Alemán, Inmaculada
2018-01-01
This study provides regression equations for estimation of age of infants from the dimensions of their developing deciduous teeth. The sample comprises 97 individuals of known sex and age (62 boys, 35 girls), aged between 2 days and 1,081 days. The age-estimation equations were obtained for the sexes combined, as well as for each sex separately, thus including "sex" as an independent variable. The values of the correlations and determination coefficients obtained for each regression equation indicate good fits for most of the equations obtained. The "sex" factor was statistically significant when included as an independent variable in seven of the regression equations. However, the "sex" factor provided an advantage for age estimation in only three of the equations, compared to those that did not include "sex" as a factor. These data suggest that the ages of infants can be accurately estimated from measurements of their developing deciduous teeth. © 2017 American Academy of Forensic Sciences.
NASA Astrophysics Data System (ADS)
Buckner, Steven A.
The Helicopter Emergency Medical Service (HEMS) industry has a significant role in the transportation of injured patients, but has experienced more accidents than all other segments of the aviation industry combined. With the objective of addressing this discrepancy, this study assesses the effect of safety management systems (SMS) implementation and aviation technologies utilization on the reduction of HEMS accident rates. Participating were 147 pilots from Federal Aviation Regulations Part 135 HEMS operators, who completed a survey questionnaire based on the Safety Culture and Safety Management System Survey (SCSMSS). The study assessed the predictive value of SMS implementation and aviation technologies for HEMS accident rates with correlation and multiple linear regression. The correlation analysis identified three significant positive relationships. HEMS years of experience had a high significant positive relationship with accident rate (r=.90; p<.05); SMS had a moderate significant positive relationship with Night Vision Goggles (NVG) (r=.38; p<.05); and SMS had a slight significant positive relationship with Terrain Avoidance Warning System (TAWS) (r=.234; p<.05). Multiple regression analysis suggested that when combined with NVG, TAWS, and SMS, HEMS years of experience explained 81.4% of the variance in accident rate scores (p<.05), and HEMS years of experience was found to be a significant predictor of accident rates (p<.05). Additional quantitative regression analysis was recommended to replicate the results of this study, to consider the influence of these variables on the continued reduction of HEMS accidents, and to encourage the implementation of SMS and aviation technologies from a systems engineering perspective. Recommendations for practice included the adoption of existing regulatory guidance for an SMS program. A qualitative analysis was also recommended for future study of SMS implementation and HEMS accident rates from the pilot's perspective.
A quantitative longitudinal study would further explore inferential relationships between the study variables. Current strategies should include the increased utilization of available aviation technology resources as this proactive stance may be beneficial for the establishment of an effective safety culture within the HEMS industry.
Sewage sludge disintegration by combined treatment of alkaline+high pressure homogenization.
Zhang, Yuxuan; Zhang, Panyue; Zhang, Guangming; Ma, Weifang; Wu, Hao; Ma, Boqiang
2012-11-01
Alkaline pretreatment combined with high pressure homogenization (HPH) was applied to promote sewage sludge disintegration. For sewage sludge with a total solid content of 1.82%, sludge disintegration degree (DD(COD)) with combined treatment was higher than the sum of DD(COD) with single alkaline and single HPH treatment. NaOH dosage ⩽0.04mol/L, homogenization pressure ⩽60MPa and a single homogenization cycle were the suitable conditions for combined sludge treatment. The combined sludge treatment showed a maximum DD(COD) of 59.26%. By regression analysis, the combined sludge disintegration model was established as 1/(1-DD(COD)) = 0.713 C^0.334 P^0.234 N^0.119, showing that the effect of operating parameters on sludge disintegration followed the order: NaOH dosage > homogenization pressure > number of homogenization cycles. The energy efficiency with combined sludge treatment significantly increased compared with that with single HPH treatment, and the high energy efficiency was achieved at low homogenization pressure with a single homogenization cycle. Copyright © 2012 Elsevier Ltd. All rights reserved.
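In a power-law model the exponents alone determine the relative influence of each parameter. The sketch below assumes the regression model is read as 1/(1-DD(COD)) = 0.713 C^0.334 P^0.234 N^0.119, with C the NaOH dosage, P the homogenization pressure and N the number of cycles. It reports the factor by which the left-hand side grows when one parameter is doubled, which does not depend on the units used. The dictionary keys and function name are illustrative.

```python
# Exponents of the power-law disintegration model as read from the abstract.
EXPONENTS = {
    "NaOH dosage (C)": 0.334,
    "homogenization pressure (P)": 0.234,
    "homogenization cycles (N)": 0.119,
}


def doubling_factor(exponent):
    """Factor applied to 1/(1 - DD) when the corresponding parameter is doubled."""
    return 2.0 ** exponent


# Rank the parameters from most to least influential.
ranking = sorted(EXPONENTS, key=lambda name: doubling_factor(EXPONENTS[name]), reverse=True)
for name in ranking:
    print(f"{name}: x{doubling_factor(EXPONENTS[name]):.3f}")
```

The resulting ranking reproduces the order stated in the abstract: NaOH dosage, then pressure, then number of cycles.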
Drug Pricing Evolution in Hepatitis C
Vernaz, Nathalie; Girardin, François; Goossens, Nicolas; Brügger, Urs; Riguzzi, Marco; Perrier, Arnaud; Negro, Francesco
2016-01-01
Objective: We aimed to determine the association between the stepwise increase in the sustained viral response (SVR) and Swiss and United States (US) market prices of drug regimens for treatment-naive, genotype 1 chronic hepatitis C virus (HCV) infection in the last 25 years. We identified the following five steps in the development of HCV treatment regimens: 1) interferon (IFN)-α monotherapy in the early '90s, 2) IFN-α in combination with ribavirin (RBV), 3) pegylated (peg) IFN-α in combination with RBV, 4) the first direct acting antivirals (DAAs) (telaprevir and boceprevir) in combination with pegIFN-α and RBV, and 5) newer DAA-based regimens, such as sofosbuvir (which is or is not combined with ledipasvir) and fixed-dose combination of ritonavir-boosted paritaprevir and ombitasvir in combination with dasabuvir. Design: We performed a linear regression and mean cost analysis to test for an association between SVRs and HCV regimen prices. We conducted a sensitivity analysis using US prices at the time of US drug licensing. We selected randomized clinical trials of drugs approved for use in Switzerland from 1997 to July 2015 including treatment-naïve patients with HCV genotype 1 infection. Results: We identified a statistically significant positive relationship between the proportion of patients achieving SVRs and the costs of HCV regimens in Switzerland (with a bivariate ordinary least square regression yielding an R² measure of 0.96) and the US (R² = 0.95). The incremental cost per additional percentage of SVR was 597.14 USD in Switzerland and 1,063.81 USD in the US. Conclusion: The pricing of drugs for HCV regimens follows a value-based model, which has a stable ratio of costs per achieved SVR over 25 years. Health care systems are struggling with the high resource use of these new agents despite their obvious long-term advantages for the overall health of the population. Therefore, the pharmaceutical industry, health care payers and other stakeholders are challenged with finding new drug pricing schemes to treat the entire population infected with HCV. PMID:27310294
Regression Analysis by Example. 5th Edition
ERIC Educational Resources Information Center
Chatterjee, Samprit; Hadi, Ali S.
2012-01-01
Regression analysis is a conceptually simple method for investigating relationships among variables. Carrying out a successful application of regression analysis, however, requires a balance of theoretical results, empirical rules, and subjective judgment. "Regression Analysis by Example, Fifth Edition" has been expanded and thoroughly…
Li, Min; Zhong, Guo-yue; Wu, Ao-lin; Zhang, Shou-wen; Jiang, Wei; Liang, Jian
2015-05-01
To explore the correlation between ecological factors and the contents of podophyllotoxin and total lignans in the root and rhizome of Sinopodophyllum hexandrum, podophyllotoxin in 87 samples (from 5 provinces) was determined by HPLC and total lignans by UV. Correlation and regression analyses were carried out in SPSS 16.0 in combination with ecological factors (terrain, soil and climate). The content determination results showed that podophyllotoxin and total lignan contents differed greatly, ranging over 1.001%-6.230% and 5.350%-16.34%, respectively. The correlation and regression analysis showed a positive linear correlation between the two contents; a strong positive correlation between the contents and both latitude and annual average rainfall within the sampling area; a weak negative correlation with soil pH and soil organic matter; weaker and stronger positive correlations with soil potassium; a weak negative correlation with slope and annual average temperature; and a weaker positive correlation between the podophyllotoxin content and soil potassium.
QSAR Analysis of 2-Amino or 2-Methyl-1-Substituted Benzimidazoles Against Pseudomonas aeruginosa
Podunavac-Kuzmanović, Sanja O.; Cvetković, Dragoljub D.; Barna, Dijana J.
2009-01-01
A set of benzimidazole derivatives were tested for their inhibitory activities against the Gram-negative bacterium Pseudomonas aeruginosa and minimum inhibitory concentrations were determined for all the compounds. Quantitative structure activity relationship (QSAR) analysis was applied to fourteen of the abovementioned derivatives using a combination of various physicochemical, steric, electronic, and structural molecular descriptors. A multiple linear regression (MLR) procedure was used to model the relationships between molecular descriptors and the antibacterial activity of the benzimidazole derivatives. The stepwise regression method was used to derive the most significant models as a calibration model for predicting the inhibitory activity of this class of molecules. The best QSAR models were further validated by a leave one out technique as well as by the calculation of statistical parameters for the established theoretical models. To confirm the predictive power of the models, an external set of molecules was used. High agreement between experimental and predicted inhibitory values, obtained in the validation procedure, indicated the good quality of the derived QSAR models. PMID:19468332
Desire thinking: A risk factor for binge eating?
Spada, Marcantonio M; Caselli, Gabriele; Fernie, Bruce A; Manfredi, Chiara; Boccaletti, Fabio; Dallari, Giulia; Gandini, Federica; Pinna, Eleonora; Ruggiero, Giovanni M; Sassaroli, Sandra
2015-08-01
In the current study we explored the role of desire thinking in predicting binge eating independently of Body Mass Index, negative affect and irrational food beliefs. A sample of binge eaters (n=77) and a sample of non-binge eaters (n=185) completed the following self-report instruments: Hospital Anxiety and Depression Scale, Irrational Food Beliefs Scale, Desire Thinking Questionnaire, and Binge Eating Scale. Mann-Whitney U tests revealed that all variable scores were significantly higher for binge eaters than non-binge eaters. A logistic regression analysis indicated that verbal perseveration was a predictor of classification as a binge eater over and above Body Mass Index, negative affect and irrational food beliefs. A hierarchical regression analysis, on the combined sample, indicated that verbal perseveration predicted levels of binge eating independently of Body Mass Index, negative affect and irrational food beliefs. These results highlight the possible role of desire thinking as a risk factor for binge eating. Copyright © 2015 Elsevier Ltd. All rights reserved.
Edwards, Jeffrey R; Lambert, Lisa Schurer
2007-03-01
Studies that combine moderation and mediation are prevalent in basic and applied psychology research. Typically, these studies are framed in terms of moderated mediation or mediated moderation, both of which involve similar analytical approaches. Unfortunately, these approaches have important shortcomings that conceal the nature of the moderated and the mediated effects under investigation. This article presents a general analytical framework for combining moderation and mediation that integrates moderated regression analysis and path analysis. This framework clarifies how moderator variables influence the paths that constitute the direct, indirect, and total effects of mediated models. The authors empirically illustrate this framework and give step-by-step instructions for estimation and interpretation. They summarize the advantages of their framework over current approaches, explain how it subsumes moderated mediation and mediated moderation, and describe how it can accommodate additional moderator and mediator variables, curvilinear relationships, and structural equation models with latent variables. (c) 2007 APA, all rights reserved.
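How a moderator can influence the direct, indirect, and total effects may be illustrated with a small sketch. It assumes one common way of writing such a model, which the abstract itself does not spell out: M = a0 + a_x·X + a_z·Z + a_xz·XZ + e and Y = b0 + b_x·X + b_m·M + b_z·Z + b_xz·XZ + b_mz·MZ + e, with Z the moderator. The coefficient names and values below are hypothetical.

```python
def conditional_effects(coef, z):
    """Direct, indirect and total effect of X on Y at moderator value z."""
    first_stage = coef["a_x"] + coef["a_xz"] * z    # X -> M path at z
    second_stage = coef["b_m"] + coef["b_mz"] * z   # M -> Y path at z
    direct = coef["b_x"] + coef["b_xz"] * z         # X -> Y path at z
    indirect = first_stage * second_stage           # product of the two stages
    return {"direct": direct, "indirect": indirect, "total": direct + indirect}

# Hypothetical coefficient estimates (not from the article).
coef = {"a_x": 0.5, "a_xz": 0.2, "b_m": 0.4, "b_mz": 0.1, "b_x": 0.3, "b_xz": -0.1}
low = conditional_effects(coef, 0.0)   # effects at a low moderator value
high = conditional_effects(coef, 1.0)  # effects at a high moderator value
```

Comparing `low` and `high` shows which of the paths changes with the moderator, which is the kind of question the framework is meant to answer.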
NASA Astrophysics Data System (ADS)
Wang, Wei; Zhong, Ming; Cheng, Ling; Jin, Lu; Shen, Si
2018-02-01
Against the background of building a global energy internet, forecasting and analysing the ratio of electric energy to terminal energy consumption has both theoretical and practical significance. This paper first analysed the factors influencing the ratio of electric energy to terminal energy and then used a combination method to forecast and analyse the global proportion of electric energy. Next, a cointegration model for the proportion of electric energy was constructed using influencing factors such as the electricity price index, GDP, economic structure, energy use efficiency and total population level. Finally, a prediction map of the proportion of electric energy was obtained using a combination-forecasting model based on the multiple linear regression method, the trend analysis method, and the variance-covariance method. This map describes the development trend of the proportion of electric energy in 2017-2050, and the proportion of electric energy in 2050 was analysed in detail using scenario analysis.
Beer fermentation: monitoring of process parameters by FT-NIR and multivariate data analysis.
Grassi, Silvia; Amigo, José Manuel; Lyndgaard, Christian Bøge; Foschino, Roberto; Casiraghi, Ernestina
2014-07-15
This work investigates the capability of Fourier-Transform near infrared (FT-NIR) spectroscopy to monitor and assess process parameters in beer fermentation at different operative conditions. For this purpose, the fermentation of wort with two different yeast strains and at different temperatures was monitored for nine days by FT-NIR. To correlate the collected spectra with °Brix, pH and biomass, different multivariate data methodologies were applied. Principal component analysis (PCA), partial least squares (PLS) and locally weighted regression (LWR) were used to assess the relationship between FT-NIR spectra and the abovementioned process parameters that define the beer fermentation. The accuracy and robustness of the obtained results clearly show the suitability of FT-NIR spectroscopy, combined with multivariate data analysis, to be used as a quality control tool in the beer fermentation process. FT-NIR spectroscopy, when combined with LWR, proves to be a perfectly suitable quantitative method to be implemented in the production of beer. Copyright © 2014 Elsevier Ltd. All rights reserved.
Cai, Rui; Wang, Shisheng; Tang, Bo; Li, Yueqing; Zhao, Weijie
2018-01-01
Sea cucumber is the major tonic seafood worldwide, and geographical origin traceability is an important part of its quality and safety control. In this work, a non-destructive method for origin traceability of sea cucumber (Apostichopus japonicus) from northern China Sea and East China Sea using near infrared spectroscopy (NIRS) and multivariate analysis methods was proposed. Total fat contents of 189 fresh sea cucumber samples were determined and partial least-squares (PLS) regression was used to establish the quantitative NIRS model. The ordered predictor selection algorithm was performed to select feasible wavelength regions for the construction of PLS and identification models. The identification model was developed by principal component analysis combined with Mahalanobis distance and scaling to the first range algorithms. In the test set of the optimum PLS models, the root mean square error of prediction was 0.45, and correlation coefficient was 0.90. The correct classification rates of 100% were obtained in both identification calibration model and test model. The overall results indicated that NIRS method combined with chemometric analysis was a suitable tool for origin traceability and identification of fresh sea cucumber samples from nine origins in China. PMID:29410795
Huang, Shi; MacKinnon, David P.; Perrino, Tatiana; Gallo, Carlos; Cruden, Gracelyn; Brown, C Hendricks
2016-01-01
Mediation analysis often requires larger sample sizes than main effect analysis to achieve the same statistical power. Combining results across similar trials may be the only practical option for increasing statistical power for mediation analysis in some situations. In this paper, we propose a method to estimate: 1) marginal means for mediation path a, the relation of the independent variable to the mediator; 2) marginal means for path b, the relation of the mediator to the outcome, across multiple trials; and 3) the between-trial level variance-covariance matrix based on a bivariate normal distribution. We present the statistical theory and an R computer program to combine regression coefficients from multiple trials to estimate a combined mediated effect and confidence interval under a random effects model. Values of coefficients a and b, along with their standard errors from each trial are the input for the method. This marginal likelihood based approach with Monte Carlo confidence intervals provides more accurate inference than the standard meta-analytic approach. We discuss computational issues, apply the method to two real-data examples and make recommendations for the use of the method in different settings. PMID:28239330
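The idea of a Monte Carlo confidence interval for a mediated effect a·b can be sketched for a single pair of estimates. This is not the authors' R program or their marginal-likelihood method: it assumes that the estimates of a and b are independent and normally distributed, and the input values are hypothetical.

```python
import random

def monte_carlo_ci(a, se_a, b, se_b, draws=20000, level=0.95, seed=0):
    """Percentile interval for a*b, treating a and b as independent normals."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    products = sorted(rng.gauss(a, se_a) * rng.gauss(b, se_b) for _ in range(draws))
    tail = (1.0 - level) / 2.0
    lower = products[int(tail * draws)]
    upper = products[int((1.0 - tail) * draws) - 1]
    return a * b, lower, upper

# Hypothetical path estimates and standard errors (not from the paper).
estimate, lower, upper = monte_carlo_ci(a=0.4, se_a=0.1, b=0.3, se_b=0.1)
```

Because the product of two normal variables is not itself normal, the resulting interval is generally asymmetric around the point estimate, which is one reason simulation-based intervals are used for mediated effects.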
Diaz Beveridge, Robert; Alcolea, Vicent; Aparicio, Jorge; Segura, Ángel; García, Jose; Corbellas, Miguel; Fonfría, María; Giménez, Alejandra; Montalar, Joaquin
2014-01-10
The combination of gemcitabine and erlotinib is a standard first-line treatment for unresectable, locally advanced or metastatic pancreatic cancer. We reviewed our single centre experience to assess its efficacy and toxicity in clinical practice. Clinical records of patients with unresectable, locally advanced or metastatic pancreatic cancer who were treated with the combination of gemcitabine and erlotinib were reviewed. Univariate survival analysis and multivariate analysis were carried out to identify independent predictive factors of overall survival. Our series included 55 patients. Overall disease control rate was 47%: 5% of patients presented complete response, 20% partial response and 22% stable disease. Median overall survival was 8.3 months. Cox regression analysis indicated that performance status and locally advanced versus metastatic disease were independent factors of overall survival. Patients who developed acne-like rash toxicity, related to erlotinib administration, presented a higher survival than those patients who did not develop this toxicity. Gemcitabine plus erlotinib doublet is active in our series of patients with advanced pancreatic cancer. This study provides efficacy and safety results similar to those of the pivotal phase III clinical trial that tested the same combination.
Guo, Xiuhan; Cai, Rui; Wang, Shisheng; Tang, Bo; Li, Yueqing; Zhao, Weijie
2018-01-01
Sea cucumber is the major tonic seafood worldwide, and geographical origin traceability is an important part of its quality and safety control. In this work, a non-destructive method for origin traceability of sea cucumber (Apostichopus japonicus) from northern China Sea and East China Sea using near infrared spectroscopy (NIRS) and multivariate analysis methods was proposed. Total fat contents of 189 fresh sea cucumber samples were determined and partial least-squares (PLS) regression was used to establish the quantitative NIRS model. The ordered predictor selection algorithm was performed to select feasible wavelength regions for the construction of PLS and identification models. The identification model was developed by principal component analysis combined with Mahalanobis distance and scaling to the first range algorithms. In the test set of the optimum PLS models, the root mean square error of prediction was 0.45, and correlation coefficient was 0.90. The correct classification rates of 100% were obtained in both identification calibration model and test model. The overall results indicated that NIRS method combined with chemometric analysis was a suitable tool for origin traceability and identification of fresh sea cucumber samples from nine origins in China.
Magnitude and Frequency of Floods for Urban and Small Rural Streams in Georgia, 2008
Gotvald, Anthony J.; Knaak, Andrew E.
2011-01-01
A study was conducted that updated methods for estimating the magnitude and frequency of floods in ungaged urban basins in Georgia that are not substantially affected by regulation or tidal fluctuations. Annual peak-flow data for urban streams from September 2008 were analyzed for 50 streamgaging stations (streamgages) in Georgia and 6 streamgages on adjacent urban streams in Florida and South Carolina having 10 or more years of data. Flood-frequency estimates were computed for the 56 urban streamgages by fitting logarithms of annual peak flows for each streamgage to a Pearson Type III distribution. Additionally, basin characteristics for the streamgages were computed by using a geographical information system and computer algorithms. Regional regression analysis, using generalized least-squares regression, was used to develop a set of equations for estimating flows with 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probabilities for ungaged urban basins in Georgia. In addition to the 56 urban streamgages, 171 rural streamgages were included in the regression analysis to maintain continuity between flood estimates for urban and rural basins as the basin characteristics pertaining to urbanization approach zero. Because 21 of the rural streamgages have drainage areas less than 1 square mile, the set of equations developed for this study can also be used for estimating small ungaged rural streams in Georgia. Flood-frequency estimates and basin characteristics for 227 streamgages were combined to form the final database used in the regional regression analysis. Four hydrologic regions were developed for Georgia. The final equations are functions of drainage area and percentage of impervious area for three of the regions and drainage area, percentage of developed land, and mean basin slope for the fourth region. Average standard errors of prediction for these regression equations range from 20.0 to 74.5 percent.
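Fitting logarithms of annual peak flows to a Pearson Type III distribution can be sketched with the method of moments. The sketch below makes several assumptions that the abstract does not confirm: it uses the station skew only (no regional skew weighting), approximates the frequency factor with the Wilson-Hilferty formula, and runs on hypothetical peak flows. The function name `lp3_quantile` is an illustrative choice.

```python
from math import log10, sqrt
from statistics import NormalDist, mean

def lp3_quantile(peaks, aep):
    """Flow with annual exceedance probability `aep` from a log-Pearson III fit."""
    logs = [log10(q) for q in peaks]
    n = len(logs)
    m = mean(logs)
    s = sqrt(sum((v - m) ** 2 for v in logs) / (n - 1))
    # Sample skew coefficient of the log-transformed peaks.
    g = n * sum((v - m) ** 3 for v in logs) / ((n - 1) * (n - 2) * s ** 3)
    z = NormalDist().inv_cdf(1.0 - aep)  # standard normal deviate
    if abs(g) < 1e-9:
        k = z  # zero skew reduces to the log-normal case
    else:
        # Wilson-Hilferty approximation of the Pearson III frequency factor.
        k = (2.0 / g) * ((1.0 + g * z / 6.0 - g * g / 36.0) ** 3 - 1.0)
    return 10.0 ** (m + k * s)

# Hypothetical annual peak flows (not from the report).
peaks = [1200, 850, 2300, 1600, 980, 3100, 1450, 760, 1900, 1250, 2700, 1100]
q_1pct = lp3_quantile(peaks, 0.01)   # 1-percent annual exceedance probability
q_50pct = lp3_quantile(peaks, 0.50)  # 50-percent annual exceedance probability
```

At-site quantiles such as these are the kind of response variable that a regional regression then relates to basin characteristics like drainage area and impervious area.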
NASA Technical Reports Server (NTRS)
MCKissick, Burnell T. (Technical Monitor); Plassman, Gerald E.; Mall, Gerald H.; Quagliano, John R.
2005-01-01
Linear multivariable regression models for predicting day and night Eddy Dissipation Rate (EDR) from available meteorological data sources are defined and validated. Model definition is based on a combination of 1997-2000 Dallas/Fort Worth (DFW) data sources, EDR from Aircraft Vortex Spacing System (AVOSS) deployment data, and regression variables primarily from corresponding Automated Surface Observation System (ASOS) data. Model validation is accomplished through EDR predictions on a similar combination of 1994-1995 Memphis (MEM) AVOSS and ASOS data. Model forms include an intercept plus a single term of fixed optimal power for each of these regression variables: 30-minute forward averaged mean and variance of near-surface wind speed and temperature, variance of wind direction, and a discrete cloud cover metric. Distinct day and night models, regressing on EDR and the natural log of EDR respectively, yield best performance and avoid model discontinuity over day/night data boundaries.
NASA Technical Reports Server (NTRS)
Patniak, Surya N.; Guptill, James D.; Hopkins, Dale A.; Lavelle, Thomas M.
1998-01-01
Nonlinear mathematical-programming-based design optimization can be an elegant method. However, the calculations required to generate the merit function, constraints, and their gradients, which are frequently required, can make the process computationally intensive. The computational burden can be greatly reduced by using approximating analyzers derived from an original analyzer utilizing neural networks and linear regression methods. The experience gained from using both of these approximation methods in the design optimization of a high speed civil transport aircraft is the subject of this paper. The Langley Research Center's Flight Optimization System was selected for the aircraft analysis. This software was exercised to generate a set of training data with which a neural network and a regression method were trained, thereby producing the two approximating analyzers. The derived analyzers were coupled to the Lewis Research Center's CometBoards test bed to provide the optimization capability. With the combined software, both approximation methods were examined for use in aircraft design optimization, and both performed satisfactorily. The CPU time for solution of the problem, which had been measured in hours, was reduced to minutes with the neural network approximation and to seconds with the regression method. Instability encountered in the aircraft analysis software at certain design points was also eliminated. On the other hand, there were costs and difficulties associated with training the approximating analyzers. The CPU time required to generate the input-output pairs and to train the approximating analyzers was seven times that required for solution of the problem.
Creating a non-linear total sediment load formula using polynomial best subset regression model
NASA Astrophysics Data System (ADS)
Okcu, Davut; Pektas, Ali Osman; Uyumaz, Ali
2016-08-01
The aim of this study is to derive a new total sediment load formula that is more accurate and has fewer application constraints than the well-known formulae in the literature. The five best-known stream-power-concept sediment formulae approved by ASCE are used for benchmarking on a wide range of datasets that includes both field and flume (lab) observations. The dimensionless parameters of these widely used formulae are used as inputs in a new regression approach, called polynomial best subset regression (PBSR) analysis. The aim of the PBSR analysis is to fit and test all possible combinations of the input variables and select the best subset. All the input variables, with their second and third powers, are included in the regression to test the possible relation between the explanatory variables and the dependent variable. In selecting the best subset, a multistep approach is used that depends on significance values and also on the degree of multicollinearity of the inputs. The new formula is compared to the others on a holdout dataset, and detailed performance investigations are conducted for the field and lab datasets within this holdout data. Different goodness-of-fit statistics are used, as they represent different perspectives on model accuracy. After detailed comparisons, we identified the most accurate equation that is also applicable to both flume and river data. On the field dataset in particular, the proposed formula outperformed the benchmark formulations.
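The core idea of fitting every combination of polynomial terms and keeping the best subset can be sketched as below. This is a simplified stand-in, not the authors' procedure: it uses one input variable with its first, second and third powers, and it ranks subsets by adjusted R^2 instead of the multistep significance and multicollinearity screening described in the abstract. The data are synthetic and all function names are assumptions.

```python
from itertools import combinations

def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution

def adjusted_r2(xs, ys, powers):
    """Fit y on an intercept plus x**p for p in powers; return adjusted R^2."""
    rows = [[1.0] + [x ** p for p in powers] for x in xs]
    k = len(powers) + 1
    # Normal equations (X'X) beta = X'y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    beta = solve(xtx, xty)
    rss = sum((y - sum(b * v for b, v in zip(beta, r))) ** 2 for r, y in zip(rows, ys))
    y_bar = sum(ys) / len(ys)
    tss = sum((y - y_bar) ** 2 for y in ys)
    n = len(ys)
    return 1.0 - (rss / (n - k)) / (tss / (n - 1))

def best_subset(xs, ys, candidate_powers=(1, 2, 3)):
    """Try every non-empty subset of candidate powers and keep the best one."""
    subsets = [c for size in range(1, len(candidate_powers) + 1)
               for c in combinations(candidate_powers, size)]
    return max(subsets, key=lambda powers: adjusted_r2(xs, ys, powers))

# Synthetic data generated from y = 2 + 3 x^2 plus small fixed perturbations.
xs = [0.5 * i for i in range(1, 13)]
noise = [0.05, -0.04, 0.03, -0.05, 0.02, 0.04, -0.03, 0.01, -0.02, 0.05, -0.01, -0.04]
ys = [2.0 + 3.0 * x ** 2 + e for x, e in zip(xs, noise)]
best = best_subset(xs, ys)
```

With several inputs the number of subsets grows quickly (2^k - 1 for k candidate terms), which is why a screening step based on significance and multicollinearity becomes useful in practice.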
Gerami, Pedram; Cook, Robert W; Russell, Maria C; Wilkinson, Jeff; Amaria, Rodabe N; Gonzalez, Rene; Lyle, Stephen; Jackson, Gilchrist L; Greisinger, Anthony J; Johnson, Clare E; Oelschlager, Kristen M; Stone, John F; Maetzold, Derek J; Ferris, Laura K; Wayne, Jeffrey D; Cooper, Chelsea; Obregon, Roxana; Delman, Keith A; Lawson, David
2015-05-01
A gene expression profile (GEP) test able to accurately identify risk of metastasis for patients with cutaneous melanoma has been clinically validated. We aimed for assessment of the prognostic accuracy of GEP and sentinel lymph node biopsy (SLNB) tests, independently and in combination, in a multicenter cohort of 217 patients. Reverse transcription polymerase chain reaction (RT-PCR) was performed to assess the expression of 31 genes from primary melanoma tumors, and SLNB outcome was determined from clinical data. Prognostic accuracy of each test was determined using Kaplan-Meier and Cox regression analysis of disease-free, distant metastasis-free, and overall survivals. GEP outcome was a more significant and better predictor of each end point in univariate and multivariate regression analysis, compared with SLNB (P < .0001 for all). In combination with SLNB, GEP improved prognostication. For patients with a GEP high-risk outcome and a negative SLNB result, Kaplan-Meier 5-year disease-free, distant metastasis-free, and overall survivals were 35%, 49%, and 54%, respectively. Within the SLNB-negative cohort of patients, overall risk of metastatic events was higher (∼30%) than commonly found in the general population of patients with melanoma. In this study cohort, GEP was an objective tool that accurately predicted metastatic risk in SLNB-eligible patients. Copyright © 2015 American Academy of Dermatology, Inc. Published by Elsevier Inc. All rights reserved.
Heimes, F.J.; Luckey, R.R.; Stephens, D.M.
1986-01-01
Combining estimates of applied irrigation water, determined for selected sample sites, with information on irrigated acreage provides one alternative for developing areal estimates of groundwater pumpage for irrigation. The reliability of this approach was evaluated by comparing estimated pumpage with metered pumpage for two years for a three-county area in southwestern Nebraska. Meters on all irrigation wells in the three counties provided a complete data set for evaluation of equipment and comparison with pumpage estimates. Regression analyses were conducted on discharge, time-of-operation, and pumpage data collected at 52 irrigation sites in 1983 and at 57 irrigation sites in 1984 using data from inline flowmeters as the independent variable. The standard error of the estimate for regression analysis of discharge measurements made using a portable flowmeter was 6.8% of the mean discharge metered by inline flowmeters. The standard error of the estimate for regression analysis of time of operation determined from electric meters was 8.1% of the mean time of operation determined from in-line and 15.1% for engine-hour meters. Sampled pumpage, calculated by multiplying the average discharge obtained from the portable flowmeter by the time of operation obtained from energy or hour meters, was compared with metered pumpage from in-line flowmeters at sample sites. The standard error of the estimate for the regression analysis of sampled pumpage was 10.3% of the mean of the metered pumpage for 1983 and 1984 combined. The difference in the mean of the sampled pumpage and the mean of the metered pumpage was only 1.8% for 1983 and 2.3% for 1984. Estimated pumpage, for each county and for the study area, was calculated by multiplying application (sampled pumpage divided by irrigated acreages at sample sites) by irrigated acreage compiled from Landsat (Land satellite) imagery. Estimated pumpage was compared with total metered pumpage for each county and the study area. 
Estimated pumpage by county varied from 9% less, to 20% more, than metered pumpage in 1983 and from 0 to 15% more than metered pumpage in 1984. Estimated pumpage for the study area was 11% more than metered pumpage in 1983 and 5% more than metered pumpage in 1984. (Author's abstract)
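The arithmetic behind the areal estimate described in this abstract can be sketched as follows. The numbers and units are hypothetical, and the sketch assumes that application is computed as total sampled pumpage divided by total irrigated acreage at the sample sites; the abstract does not say whether a ratio of sums or a mean of site ratios was used.

```python
def estimate_pumpage(sample_sites, irrigated_acres):
    """Areal pumpage estimate from sample sites.

    sample_sites: list of (sampled_pumpage, irrigated_acres_at_site) pairs.
    irrigated_acres: total irrigated acreage of the area (e.g. from imagery).
    """
    total_pumpage = sum(pumpage for pumpage, _ in sample_sites)
    total_acres = sum(acres for _, acres in sample_sites)
    application = total_pumpage / total_acres  # pumpage per irrigated acre
    return application * irrigated_acres

def percent_difference(estimated, metered):
    """Signed difference of the estimate from metered pumpage, in percent."""
    return 100.0 * (estimated - metered) / metered

# Hypothetical values (not from the report).
sites = [(150.0, 120.0), (130.0, 130.0)]
estimated = estimate_pumpage(sites, irrigated_acres=10000.0)
difference = percent_difference(estimated, metered=10000.0)
```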
Luo, Ying-zhen; Tu, Meng; Fan, Fei; Zheng, Jie-qian; Yang, Ming; Li, Tao; Zhang, Kui; Deng, Zhen-hua
2015-06-01
To establish the linear regression equation between body height and combined length of manubrium and mesosternum of sternum measured by CT volume rendering technique (CT-VRT) in southwest Han population. One hundred and sixty subjects, including 80 males and 80 females were selected from southwest Han population for routine CT-VRT (reconstruction thickness 1 mm) examination. The lengths of both manubrium and mesosternum were recorded, and the combined length of manubrium and mesosternum was equal to the algebraic sum of them. The sex-specific linear regression equations between the combined length of manubrium and mesosternum and the real body height of each subject were deduced. The sex-specific simple linear regression equations between the combined length of manubrium and mesosternum (x3) and body height (y) were established (male: y = 135.000+2.118 x3 and female: y = 120.790+2.808 x3). Both equations showed statistical significance (P < 0.05) with a 100% predictive accuracy. CT-VRT is an effective method for measurement of the index of sternum. The combined length of manubrium and mesosternum from CT-VRT can be used for body height estimation in southwest Han population.
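The two regression equations quoted in this abstract can be applied as in the sketch below. The coefficients are those given in the abstract; the units are not stated there, so centimetres are assumed here, and the input lengths are hypothetical.

```python
# (intercept, slope) pairs taken from the abstract; units assumed to be cm.
EQUATIONS = {"male": (135.000, 2.118), "female": (120.790, 2.808)}

def estimate_height(sex, manubrium, mesosternum):
    """Estimate body height from manubrium and mesosternum lengths."""
    intercept, slope = EQUATIONS[sex]
    combined = manubrium + mesosternum  # x3 in the abstract
    return intercept + slope * combined

# Hypothetical measurements (not from the study).
male_height = estimate_height("male", manubrium=5.0, mesosternum=10.0)
female_height = estimate_height("female", manubrium=4.5, mesosternum=9.5)
```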
Regression Verification Using Impact Summaries
NASA Technical Reports Server (NTRS)
Backes, John; Person, Suzette J.; Rungta, Neha; Thachuk, Oksana
2013-01-01
Regression verification techniques are used to prove equivalence of syntactically similar programs. Checking equivalence of large programs, however, can be computationally expensive. Existing regression verification techniques rely on abstraction and decomposition techniques to reduce the computational effort of checking equivalence of the entire program. These techniques are sound but not complete. In this work, we propose a novel approach to improve scalability of regression verification by classifying the program behaviors generated during symbolic execution as either impacted or unimpacted. Our technique uses a combination of static analysis and symbolic execution to generate summaries of impacted program behaviors. The impact summaries are then checked for equivalence using an off-the-shelf decision procedure. We prove that our approach is both sound and complete for sequential programs, with respect to the depth bound of symbolic execution. Our evaluation on a set of sequential C artifacts shows that reducing the size of the summaries can help reduce the cost of software equivalence checking. Various reduction, abstraction, and compositional techniques have been developed to help scale software verification techniques to industrial-sized systems. Although such techniques have greatly increased the size and complexity of systems that can be checked, analysis of large software systems remains costly. Regression analysis techniques, e.g., regression testing [16], regression model checking [22], and regression verification [19], restrict the scope of the analysis by leveraging the differences between program versions. These techniques are based on the idea that if code is checked early in development, then subsequent versions can be checked against a prior (checked) version, leveraging the results of the previous analysis to reduce analysis cost of the current version.
Regression verification addresses the problem of proving equivalence of closely related program versions [19]. These techniques compare two programs with a large degree of syntactic similarity to prove that portions of one program version are equivalent to the other. Regression verification can be used for guaranteeing backward compatibility, and for showing behavioral equivalence in programs with syntactic differences, e.g., when a program is refactored to improve its performance, maintainability, or readability. Existing regression verification techniques leverage similarities between program versions by using abstraction and decomposition techniques to improve scalability of the analysis [10, 12, 19]. The abstractions and decomposition in these techniques, e.g., summaries of unchanged code [12] or semantically equivalent methods [19], compute an over-approximation of the program behaviors. The equivalence checking results of these techniques are sound but not complete: they may characterize programs as not functionally equivalent when, in fact, they are equivalent. In this work we describe a novel approach that leverages the impact of the differences between two programs for scaling regression verification. We partition program behaviors of each version into (a) behaviors impacted by the changes and (b) behaviors not impacted (unimpacted) by the changes. Only the impacted program behaviors are used during equivalence checking. We then prove that checking equivalence of the impacted program behaviors is equivalent to checking equivalence of all program behaviors for a given depth bound. In this work we use symbolic execution to generate the program behaviors and leverage control- and data-dependence information to facilitate the partitioning of program behaviors. The impacted program behaviors are termed as impact summaries.
The dependence analyses that facilitate the generation of the impact summaries, we believe, could be used in conjunction with other abstraction and decomposition based approaches, [10, 12], as a complementary reduction technique. An evaluation of our regression verification technique shows that our approach is capable of leveraging similarities between program versions to reduce the size of the queries and the time required to check for logical equivalence. The main contributions of this work are: - A regression verification technique to generate impact summaries that can be checked for functional equivalence using an off-the-shelf decision procedure. - A proof that our approach is sound and complete with respect to the depth bound of symbolic execution. - An implementation of our technique using the LLVM compiler infrastructure, the klee Symbolic Virtual Machine [4], and a variety of Satisfiability Modulo Theory (SMT) solvers, e.g., STP [7] and Z3 [6]. - An empirical evaluation on a set of C artifacts which shows that the use of impact summaries can reduce the cost of regression verification.
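The idea of restricting an equivalence check to impacted behaviors can be illustrated with a toy sketch. It is not the authors' technique: programs are modelled as straight-line lists of assignments of equal length, impact is tracked by a simple forward data-dependence pass that ignores later redefinitions, and brute-force enumeration over a small input domain stands in for symbolic execution and an SMT solver. All names below are hypothetical.

```python
import ast
from itertools import product

# Two versions of a toy straight-line program: (target, expression) pairs.
V1 = [("a", "x + x"), ("b", "y * 2"), ("c", "a + 1")]
V2 = [("a", "2 * x"), ("b", "y * 2"), ("c", "a + 1")]

def reads(expr):
    """Names of the variables an expression reads."""
    tree = ast.parse(expr, mode="eval")
    return {node.id for node in ast.walk(tree) if isinstance(node, ast.Name)}

def impacted_vars(old, new):
    """Variables written by a changed statement or depending on one."""
    impacted = set()
    for (t_old, e_old), (t_new, e_new) in zip(old, new):
        changed = (t_old, e_old) != (t_new, e_new)
        if changed or (reads(e_old) | reads(e_new)) & impacted:
            impacted.update({t_old, t_new})
    return impacted

def run(program, inputs):
    """Execute the assignments in order and return the final environment."""
    env = dict(inputs)
    for target, expr in program:
        env[target] = eval(expr, {"__builtins__": {}}, env)
    return env

def equivalent_on_impacted(old, new, domain):
    """Compare only the impacted variables on every input pair in the domain."""
    targets = impacted_vars(old, new)
    for x, y in product(domain, repeat=2):
        out_old = run(old, {"x": x, "y": y})
        out_new = run(new, {"x": x, "y": y})
        if any(out_old[v] != out_new[v] for v in targets):
            return False, targets
    return True, targets

result = equivalent_on_impacted(V1, V2, range(-3, 4))
```

Here the unchanged statement for `b` is never compared, which mirrors, in a very reduced form, how dropping unimpacted behaviors shrinks the equivalence query.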
Uncovering the Power of Personality to Shape Income
Denissen, Jaap J. A.; Bleidorn, Wiebke; Hennecke, Marie; Luhmann, Maike; Orth, Ulrich; Specht, Jule; Zimmermann, Julia
2017-01-01
The notion of person-environment fit implies that personal and contextual factors interact in influencing important life outcomes. Using data from 8,458 employed individuals, we examined the combined effects of individuals’ actual personality traits and jobs’ expert-rated personality demands on earnings. Results from a response surface analysis indicated that the fit between individuals’ actual personality and the personality demands of their jobs is a predictor of income. Conclusions of this combined analysis were partly opposite to conclusions reached in previous studies using conventional regression methods. Individuals can earn additional income of more than their monthly salary per year if they hold a job that fits their personality. Thus, at least for some traits, economic success depends not only on having a “successful personality” but also, in part, on finding the best niche for one’s personality. We discuss the findings with regard to labor-market policies and individuals’ job-selection strategies. PMID:29155616
NASA Technical Reports Server (NTRS)
Caldas, M.; Walker, R. T.; Shirota, R.; Perz, S.; Skole, D.
2003-01-01
This paper examines the relationships between the socio-demographic characteristics of small settlers in the Brazilian Amazon and the life cycle hypothesis in the process of deforestation. The analysis was conducted combining remote sensing and geographic data with primary data of 153 small settlers along the TransAmazon Highway. Regression analyses and spatial autocorrelation tests were conducted. The results from the empirical model indicate that socio-demographic characteristics of households as well as institutional and market factors, affect the land use decision. Although remotely sensed information is not very popular among Brazilian social scientists, these results confirm that they can be very useful for this kind of study. Furthermore, the research presented by this paper strongly indicates that family and socio-demographic data, as well as market data, may result in misspecification problems. The same applies to models that do not incorporate spatial analysis.
Magagna, Federico; Guglielmetti, Alessandro; Liberto, Erica; Reichenbach, Stephen E; Allegrucci, Elena; Gobino, Guido; Bicchi, Carlo; Cordero, Chiara
2017-08-02
This study investigates chemical information of volatile fractions of high-quality cocoa (Theobroma cacao L. Malvaceae) from different origins (Mexico, Ecuador, Venezuela, Colombia, Java, Trinidad, and Sao Tomè) produced for fine chocolate. This study explores the evolution of the entire pattern of volatiles in relation to cocoa processing (raw, roasted, steamed, and ground beans). Advanced chemical fingerprinting (e.g., combined untargeted and targeted fingerprinting) with comprehensive two-dimensional gas chromatography coupled with mass spectrometry allows advanced pattern recognition for classification, discrimination, and sensory-quality characterization. The entire data set is analyzed for 595 reliable two-dimensional peak regions, including 130 known analytes and 13 potent odorants. Multivariate analysis with unsupervised exploration (principal component analysis) and simple supervised discrimination methods (Fisher ratios and linear regression trees) reveal informative patterns of similarities and differences and identify characteristic compounds related to sample origin and manufacturing step.
Determination of butter adulteration with margarine using Raman spectroscopy.
Uysal, Reyhan Selin; Boyaci, Ismail Hakki; Genis, Hüseyin Efe; Tamer, Ugur
2013-12-15
In this study, adulteration of butter with margarine was analysed using Raman spectroscopy combined with chemometric methods (principal component analysis (PCA), principal component regression (PCR), partial least squares (PLS)) and artificial neural networks (ANNs). Different butter and margarine samples were mixed at various concentrations ranging from 0% to 100% w/w. PCA analysis was applied for the classification of butters, margarines and mixtures. PCR, PLS and ANN were used for the detection of adulteration ratios of butter. Models were created using a calibration data set and developed models were evaluated using a validation data set. The coefficient of determination (R²) values between actual and predicted values obtained for PCR, PLS and ANN for the validation data set were 0.968, 0.987 and 0.978, respectively. In conclusion, a combination of Raman spectroscopy with chemometrics and ANN methods can be applied for testing butter adulteration. Copyright © 2013 Elsevier Ltd. All rights reserved.
Ren, Jiliang; Yuan, Ying; Wu, Yingwei; Tao, Xiaofeng
2018-05-02
The overlap of morphological features and mean ADC values has restricted the clinical application of MRI in the differential diagnosis of orbital lymphoma and idiopathic orbital inflammatory pseudotumor (IOIP). In this paper, we aimed to retrospectively evaluate the combined diagnostic value of conventional magnetic resonance imaging (MRI) and whole-tumor histogram analysis of apparent diffusion coefficient (ADC) maps in the differentiation of the two lesions. In total, 18 patients with orbital lymphoma and 22 patients with IOIP, who underwent both conventional MRI and diffusion weighted imaging before treatment, were included. Conventional MRI features and histogram parameters derived from ADC maps, including mean ADC (ADCmean), median ADC (ADCmedian), skewness, kurtosis, and the 10th, 25th, 75th and 90th percentiles of ADC (ADC10, ADC25, ADC75, ADC90), were evaluated and compared between orbital lymphoma and IOIP. Multivariate logistic regression analysis was used to identify the most valuable variables for discrimination. A differential model was built upon the selected variables, and receiver operating characteristic (ROC) analysis was performed to determine the differential ability of the model. Multivariate logistic regression showed that ADC10 (P = 0.023) and involvement of the orbit preseptal space (P = 0.029) were the most promising indexes in the discrimination of orbital lymphoma and IOIP. The logistic model defined by ADC10 and involvement of the orbit preseptal space achieved an AUC of 0.939, with a sensitivity of 77.30% and a specificity of 94.40%. The conventional MRI feature of involvement of the orbit preseptal space and the ADC histogram parameter ADC10 are valuable in the differential diagnosis of orbital lymphoma and IOIP.
Heiss, Christian; Govindarajan, Parameswari; Schlewitz, Gudrun; Hemdan, Nasr Y.A.; Schliefke, Nathalie; Alt, Volker; Thormann, Ulrich; Lips, Katrin Susanne; Wenisch, Sabine; Langheinrich, Alexander C.; Zahner, Daniel; Schnettler, Reinhard
2012-01-01
Summary Background As women are the population most affected by multifactorial osteoporosis, research is focused on unraveling the underlying mechanism of osteoporosis induction in rats by combining ovariectomy (OVX) either with calcium, phosphorus, vitamin C and vitamin D2/D3 deficiency, or by administration of glucocorticoid (dexamethasone). Material/Methods Different skeletal sites of sham, OVX-Diet and OVX-Steroid rats were analyzed by Dual Energy X-ray Absorptiometry (DEXA) at varied time points of 0, 4 and 12 weeks to determine and compare the osteoporotic factors such as bone mineral density (BMD), bone mineral content (BMC), area, body weight and percent fat among different groups and time points. Comparative analysis and interrelationships among osteoporotic determinants by regression analysis were also determined. Results T scores were below -2.5 in OVX-Diet rats at 4 and 12 weeks post-OVX. OVX-Diet rats revealed pronounced osteoporotic status with lower BMD and BMC than the steroid counterparts, with the spine and pelvis as the most affected skeletal sites. Increase in percent fat was observed irrespective of the osteoporosis inducers applied. Comparative analysis and interrelationships between osteoporotic determinants that are rarely studied in animals indicate the necessity to analyze BMC and area along with BMD in obtaining meaningful information leading to proper prediction of probability of osteoporotic fractures. Conclusions Enhanced osteoporotic effect observed in OVX-Diet rats indicates that estrogen dysregulation combined with diet treatment induces and enhances osteoporosis with time when compared to the steroid group. Comparative and regression analysis indicates the need to determine BMC along with BMD and area in osteoporotic determination. PMID:22648240
Rationale for hedging initiatives: Empirical evidence from the energy industry
NASA Astrophysics Data System (ADS)
Dhanarajata, Srirajata
Theory offers different rationales for hedging including (i) financial distress and bankruptcy cost, (ii) capacity to capture attractive investment opportunities, (iii) information asymmetry, (iv) economy of scale, (v) substitution for hedging, (vi) managerial risk aversion, and (vii) convexity of tax schedule. The purpose of this dissertation is to empirically test the explanatory power of the first five theoretical rationales on hedging done by oil and gas exploration and production (E&P) companies. The level of hedging is measured by the percentage of production effectively hedged, calculated based on the concept of delta and delta-gamma hedging. I employ Tobit regression, principal components, and panel data analysis on dependent and raw independent variables. Tobit regression is applied because the dependent variable used in the analysis is non-negative. Principal component analysis helps to reduce the dimension of explanatory variables, while panel data analysis pools data that combine time-series and cross-sectional observations. Based on the empirical results, leverage level is consistently found to be a significant factor in hedging activities, either due to an attempt to avoid financial distress by the firm, or an attempt to control agency cost by debtholders, or both. The effects of capital expenditures and discretionary cash flows are both indeterminable, possibly due to a mismatch in the timing of realized cash flow items and hedging decisions. Firm size is found to be positively related to hedging, supporting the economy of scale hypothesis introduced in past literature, as well as the argument that large firms are usually more sophisticated and should be more willing and more comfortable to use hedge instruments than smaller firms.
Preferences for healthy carryout meals in low-income neighborhoods of Baltimore city.
Jeffries, Jayne K; Lee, Seung Hee; Frick, Kevin D; Gittelsohn, Joel
2013-03-01
The nutrition environment is associated with risk of obesity and other diet-related chronic diseases. In Baltimore's low-income areas, carryouts (locally prepared-food sources that offer food "to go") are a common source of food, but they lack a variety of healthy options for purchase. To evaluate individuals' preferences of healthy combination meals sold at carryouts and to identify successful intervention methods to promote healthier foods in carryouts in low-income communities in Baltimore. The study estimated the relationship between combinations of healthier entrées (turkey club, grilled chicken), beverages (diet coke, bottled water), side dishes (watermelon, side salad), price points ($5.00, $7.50), and labeling on consumers' combination meal decisions using a forced-choice conjoint analysis. Logistic regression analysis was used to determine how individuals value different features in combination meals sold in carryouts. There was a statistically significant difference between customer preference for the two entrées, with a turkey club sandwich being preferred over a grilled chicken sandwich (p = .02). Carryout customers (n = 50) preferred water to diet soda (p < .00). Results suggested specific foods to improve the bundling of healthy combination meals. The selection of preferred promotion foods is important in the success of environmental nutrition interventions.
NASA Astrophysics Data System (ADS)
Peterson, K. T.; Wulamu, A.
2017-12-01
Water, essential to all living organisms, is one of the Earth's most precious resources. Remote sensing offers an ideal approach to monitor water quality over traditional in-situ techniques that are highly time and resource consuming. Utilizing a multi-scale approach, data from handheld spectroscopy, UAS-based hyperspectral imaging, and satellite multispectral imagery were collected in coordination with in-situ water quality samples for two midwestern watersheds. The remote sensing data were modeled and correlated to the in-situ water quality variables including chlorophyll content (Chl), turbidity, and total dissolved solids (TDS) using Normalized Difference Spectral Indices (NDSI) and Partial Least Squares Regression (PLSR). The results of the study supported the original hypothesis that correlating water quality variables with remotely sensed data benefits greatly from the use of more complex modeling and regression techniques such as PLSR. The PLSR analysis resulted in much higher R² values for all variables when compared to NDSI. The combination of NDSI and PLSR analysis also identified key wavelengths for identification that aligned with previous studies' findings. This research displays the advantages and future for complex modeling and machine learning techniques to improve water quality variable estimation from spectral data.
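The NDSI step described above can be sketched as follows. This is a minimal illustration, assuming the common definition NDSI(i, j) = (R_i - R_j) / (R_i + R_j) and an exhaustive search over band pairs scored by squared Pearson correlation with the in-situ variable. The function names, band labels and reflectance values are hypothetical and are not taken from the study.

```python
from itertools import combinations
from math import sqrt


def ndsi(r_i, r_j):
    """Normalized difference of two reflectances (assumed NDSI definition)."""
    return (r_i - r_j) / (r_i + r_j)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def best_band_pair(spectra, target):
    """Return (band_i, band_j, r_squared) for the best-correlated NDSI.

    spectra maps a band label to one reflectance value per water sample;
    target holds the matching in-situ measurements.
    """
    best = None
    for band_i, band_j in combinations(sorted(spectra), 2):
        index = [ndsi(a, b) for a, b in zip(spectra[band_i], spectra[band_j])]
        r2 = pearson_r(index, target) ** 2
        if best is None or r2 > best[2]:
            best = (band_i, band_j, r2)
    return best


# Hypothetical reflectances (band label in nm) and chlorophyll values.
spectra = {
    550: [0.10, 0.12, 0.15, 0.20],
    670: [0.08, 0.07, 0.06, 0.05],
    860: [0.30, 0.30, 0.31, 0.30],
}
chlorophyll = [1.0, 2.0, 3.0, 4.0]
print(best_band_pair(spectra, chlorophyll))
```

A two-band index of this kind uses only one pair of wavelengths at a time, which is one reason a full-spectrum method such as PLSR may reach higher R² values.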
Zhao, Zeng-hui; Wang, Wei-ming; Gao, Xin; Yan, Ji-xing
2013-01-01
According to the geological characteristics of Xinjiang Ili mine in the western area of China, a physical model of interstratified strata composed of soft rock and hard coal seam was established. Selecting the tunnel position, deformation modulus, and strength parameters of each layer as influencing factors, the sensitivity coefficient of roadway deformation to each parameter was first analyzed based on a Mohr-Coulomb strain softening model and nonlinear elastic-plastic finite element analysis. Then the effect laws of influencing factors which showed high sensitivity were further discussed. Finally, a regression model for the relationship between roadway displacements and multifactors was obtained by equivalent linear regression under multiple factors. The results show that the roadway deformation is highly sensitive to the depth of coal seam under the floor which should be considered in the layout of coal roadway; deformation modulus and strength of coal seam and floor have a great influence on the global stability of tunnel; on the contrary, roadway deformation is not sensitive to the mechanical parameters of soft roof; roadway deformation under random combinations of multi-factors can be deduced by the regression model. These conclusions provide theoretical significance to the arrangement and stability maintenance of coal roadway. PMID:24459447
Sabes-Figuera, Ramon; McCrone, Paul; Kendricks, Antony
2013-04-01
Economic evaluation analyses can be enhanced by employing regression methods, allowing for the identification of important sub-groups and adjustment for imperfect randomisation in clinical trials or analysis of non-randomised data. To explore the benefits of combining regression techniques and the standard Bayesian approach to refine cost-effectiveness analyses using data from randomised clinical trials. Data from a randomised trial of anti-depressant treatment were analysed and a regression model was used to explore the factors that have an impact on the net benefit (NB) statistic with the aim of using these findings to adjust the cost-effectiveness acceptability curves. Exploratory sub-sample analyses were carried out to explore possible differences in cost-effectiveness. Results The analysis found that having suffered a previous similar depression is strongly correlated with a lower NB, independent of the outcome measure or follow-up point. In patients with previous similar depression, adding a selective serotonin reuptake inhibitor (SSRI) to supportive care for mild-to-moderate depression is probably cost-effective at the level used by the English National Institute for Health and Clinical Excellence to make recommendations. This analysis highlights the need for incorporation of econometric methods into cost-effectiveness analyses using the NB approach.
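The net benefit statistic and the acceptability curve mentioned above can be illustrated with a short sketch. It assumes the usual definition NB = λ × effect − cost, where λ is the willingness-to-pay threshold, and reads one point of the acceptability curve as the share of sampled incremental (effect, cost) pairs with positive incremental NB. All names and numbers are hypothetical and are not trial data.

```python
def net_benefit(effect, cost, willingness_to_pay):
    """Monetary net benefit: willingness-to-pay times effect, minus cost."""
    return willingness_to_pay * effect - cost


def prob_cost_effective(delta_effects, delta_costs, willingness_to_pay):
    """Share of sampled incremental pairs with positive incremental NB.

    Evaluating this over a range of thresholds traces an acceptability curve.
    """
    incremental = [
        net_benefit(e, c, willingness_to_pay)
        for e, c in zip(delta_effects, delta_costs)
    ]
    return sum(1 for value in incremental if value > 0) / len(incremental)


# Hypothetical posterior or bootstrap draws of incremental effect and cost.
delta_effects = [0.02, 0.05, -0.01, 0.03]
delta_costs = [500.0, 600.0, 200.0, 950.0]

for threshold in (20000, 30000):
    share = prob_cost_effective(delta_effects, delta_costs, threshold)
    print(threshold, share)
```

Because NB is a single number per patient, it can serve as the dependent variable in an ordinary regression on patient characteristics, which is the kind of adjustment the abstract describes.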
Evaluation of visual impairment in Usher syndrome 1b and Usher syndrome 2a.
Pennings, Ronald J E; Huygen, Patrick L M; Orten, Dana J; Wagenaar, Mariette; van Aarem, Annelies; Kremer, Hannie; Kimberling, William J; Cremers, Cor W R J; Deutman, August F
2004-04-01
To evaluate visual impairment in Usher syndrome 1b (USH1b) and Usher syndrome 2a (USH2a). We carried out a retrospective study of 19 USH1b patients and 40 USH2a patients. Cross-sectional regression analyses of the functional acuity score (FAS), functional field score (FFS) and functional vision score (FVS) related to age were performed. Statistical tests relating to regression lines and Student's t-test were used to compare between (sub)groups of patients. Parts of the available individual longitudinal data were used to obtain individual estimates of progressive deterioration and compare these to those obtained with cross-sectional analysis. Results were compared between subgroups of USH2a patients pertaining to combinations of different types of mutations. Cross-sectional analyses revealed significant deterioration of the FAS (0.7% per year), FFS (1.0% per year) and FVS (1.5% per year) with advancing age in both patient groups, without a significant difference between the USH1b and USH2a patients. Individual estimates of the deterioration rates were substantially and significantly higher than the cross-sectional estimates in some USH2a cases, including values of about 5% per year (or even higher) for the FAS (age 35-50 years), 3-4% per year for the FFS and 4-5% per year for the FVS (age > 20 years). There was no difference in functional vision score behaviour detected between subgroups of patients pertaining to different biallelic combinations of specific types of mutations. The FAS, FFS and FVS deteriorated significantly by 0.7-1.5% per year according to cross-sectional linear regression analysis in both USH1b and USH2a patients. Higher deterioration rates (3-5% per year) in any of these scores were attained, according to longitudinal data collected from individual USH2a patients. Score behaviour was similar across the patient groups and across different biallelic combinations of various types of mutations. 
However, more elaborate studies, preferably covering longitudinal data, are needed to obtain conclusive evidence.
Wang, Jun; Yang, Dong-Lin; Chen, Zhong-Zhu; Gou, Ben-Fu
2016-06-01
In order to further reveal the differences of association between body mass index (BMI) and cancer incidence across populations, genders, and menopausal status, we performed a comprehensive meta-analysis with eligible citations. The risk ratios (RR) of incidence at 10 different cancer sites (per 5 kg/m² increase in BMI) were quantified separately by employing generalized least-squares to estimate trends, and combined by meta-analyses. We observed significantly stronger association between increased BMI and breast cancer incidence in the Asia-Pacific group (RR 1.18:1.11-1.26) than in European-Australian (1.05:1.00-1.09) and North-American group (1.06:1.03-1.08) (meta-regression p<0.05). No association between increased BMI and pancreatic cancer incidence (0.94:0.71-1.24) was shown in the Asia-Pacific group (meta-regression p<0.05), whereas positive associations were found in other two groups. A significantly higher RR in men was found for colorectal cancer in comparison with women (meta-regression p<0.05). Compared with postmenopausal women, premenopausal women displayed significantly higher RR for ovarian cancer (pre- vs. post-=1.10 vs. 1.01, meta-regression p<0.05), but lower RR for breast cancer (pre- vs. post-=0.99 vs. 1.11, meta-regression p<0.0001). Our results indicate that overweight or obesity is a strong risk factor of cancer incidence at several cancer sites. Genders, populations, and menopausal status are important factors affecting the association between obesity and cancer incidence for certain cancer types. Copyright © 2016 Elsevier Ltd. All rights reserved.
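As a rough check on how two of the subgroup risk ratios quoted above compare, the sketch below recovers a standard error for each log RR from its reported interval and applies a simple two-sample z-test. It assumes the intervals are 95% intervals and that the subgroups are independent. This is not the meta-regression the authors used, only a back-of-the-envelope approximation with hypothetical function names.

```python
from math import erfc, log, sqrt


def log_rr_and_se(rr, ci_low, ci_high, z_crit=1.96):
    """Return (log RR, standard error) recovered from a 95% interval."""
    return log(rr), (log(ci_high) - log(ci_low)) / (2 * z_crit)


def subgroup_z_test(estimate_a, estimate_b):
    """Two-sided z-test for the difference of two independent log RRs."""
    log_a, se_a = log_rr_and_se(*estimate_a)
    log_b, se_b = log_rr_and_se(*estimate_b)
    z = (log_a - log_b) / sqrt(se_a ** 2 + se_b ** 2)
    p_value = erfc(abs(z) / sqrt(2))  # two-sided normal tail probability
    return z, p_value


# Breast cancer RR per 5 kg/m² as quoted in the abstract (RR, lower, upper).
asia_pacific = (1.18, 1.11, 1.26)
north_america = (1.06, 1.03, 1.08)

z, p_value = subgroup_z_test(asia_pacific, north_america)
print(f"z = {z:.2f}, two-sided p = {p_value:.4f}")
```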
Using within-day hive weight changes to measure environmental effects on honey bee colonies
Holst, Niels; Weiss, Milagra; Carroll, Mark J.; McFrederick, Quinn S.; Barron, Andrew B.
2018-01-01
Patterns in within-day hive weight data from two independent datasets in Arizona and California were modeled using piecewise regression, and analyzed with respect to honey bee colony behavior and landscape effects. The regression analysis yielded information on the start and finish of a colony’s daily activity cycle, hive weight change at night, hive weight loss due to departing foragers and weight gain due to returning foragers. Assumptions about the meaning of the timing and size of the morning weight changes were tested in a third study by delaying the forager departure times from one to three hours using screen entrance gates. A regression of planned vs. observed departure delays showed that the initial hive weight loss around dawn was largely due to foragers. In a similar experiment in Australia, hive weight loss due to departing foragers in the morning was correlated with net bee traffic (difference between the number of departing bees and the number of arriving bees) and from those data the payload of the arriving bees was estimated to be 0.02 g. The piecewise regression approach was then used to analyze a fifth study involving hives with and without access to natural forage. The analysis showed that, during a commercial pollination event, hives with previous access to forage had a significantly higher rate of weight gain as the foragers returned in the afternoon, and, in the weeks after the pollination event, a significantly higher rate of weight loss in the morning, as foragers departed. This combination of continuous weight data and piecewise regression proved effective in detecting treatment differences in foraging activity that other methods failed to detect. PMID:29791462
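The idea behind piecewise regression of within-day hive weight can be sketched in a few lines. This minimal version assumes only two segments, fits an independent least-squares line to each side of every candidate breakpoint, and keeps the split with the smallest total squared error. The fitted segments are not forced to join at the breakpoint, and the study's actual model may differ (for example in the number of segments). The function names and weights are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares line; returns (slope, intercept, sse)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))
    return slope, intercept, sse


def two_segment_fit(x, y, min_points=3):
    """Search every split and return (breakpoint, left_fit, right_fit).

    breakpoint is the first x value of the right-hand segment; each fit is
    a (slope, intercept) pair.
    """
    best = None
    for k in range(min_points, len(x) - min_points + 1):
        left = fit_line(x[:k], y[:k])
        right = fit_line(x[k:], y[k:])
        total_sse = left[2] + right[2]
        if best is None or total_sse < best[0]:
            best = (total_sse, x[k], left[:2], right[:2])
    if best is None:
        raise ValueError("not enough points for two segments")
    return best[1], best[2], best[3]


# Hypothetical hourly hive weights (kg): slow night-time loss, then a faster
# loss once foragers start leaving at hour 6.
hours = list(range(12))
weights = [50 - 0.01 * h if h < 6 else 49.9 - 0.2 * (h - 6) for h in hours]

breakpoint_hour, night_fit, morning_fit = two_segment_fit(hours, weights)
print(breakpoint_hour, night_fit[0], morning_fit[0])
```

The breakpoint estimates when the daily activity cycle starts, and the two slopes estimate the rates of weight change before and after it.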
NASA Astrophysics Data System (ADS)
Das, Bappa; Sahoo, Rabi N.; Pargal, Sourabh; Krishna, Gopal; Verma, Rakesh; Chinnusamy, Viswanathan; Sehgal, Vinay K.; Gupta, Vinod K.; Dash, Sushanta K.; Swain, Padmini
2018-03-01
In the present investigation, the changes in sucrose, reducing and total sugar content due to water-deficit stress in rice leaves were modeled using visible, near infrared (VNIR) and shortwave infrared (SWIR) spectroscopy. The objectives of the study were to identify the best vegetation indices and suitable multivariate technique based on precise analysis of hyperspectral data (350 to 2500 nm) and sucrose, reducing sugar and total sugar content measured at different stress levels from 16 different rice genotypes. Spectral data analysis was done to identify suitable spectral indices and models for sucrose estimation. Novel spectral indices in near infrared (NIR) range viz. ratio spectral index (RSI) and normalised difference spectral indices (NDSI) sensitive to sucrose, reducing sugar and total sugar content were identified which were subsequently calibrated and validated. The RSI and NDSI models had R² values of 0.65, 0.71 and 0.67; RPD values of 1.68, 1.95 and 1.66 for sucrose, reducing sugar and total sugar, respectively for validation dataset. Different multivariate spectral models such as artificial neural network (ANN), multivariate adaptive regression splines (MARS), multiple linear regression (MLR), partial least square regression (PLSR), random forest regression (RFR) and support vector machine regression (SVMR) were also evaluated. The best performing multivariate models for sucrose, reducing sugars and total sugars were found to be MARS, ANN and MARS, respectively, with RPD values of 2.08, 2.44, and 1.93. Results indicated that VNIR and SWIR spectroscopy combined with multivariate calibration can be used as a reliable alternative to conventional methods for measurement of sucrose, reducing sugars and total sugars of rice under water-deficit stress as this technique is fast, economic, and noninvasive.
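The RPD values quoted above are not defined in the abstract. A minimal sketch follows, assuming the common chemometric definition of RPD as the standard deviation of the reference measurements divided by the root mean square error of prediction. Some authors use a bias-corrected standard error instead, so this may not match the study exactly. The function name and numbers are hypothetical.

```python
from math import sqrt
from statistics import stdev


def rpd(reference, predicted):
    """Ratio of performance to deviation: SD(reference) / RMSE(prediction)."""
    if len(reference) != len(predicted) or len(reference) < 2:
        raise ValueError("need two or more paired values")
    squared_errors = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    rmse = sqrt(sum(squared_errors) / len(squared_errors))
    return stdev(reference) / rmse


# Hypothetical validation set: measured vs. predicted sugar content.
measured = [10.0, 12.0, 14.0, 16.0, 18.0]
estimated = [11.0, 12.0, 13.0, 16.0, 19.0]
print(round(rpd(measured, estimated), 2))
```

Under this definition a higher RPD means the prediction error is small relative to the natural spread of the reference values.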
Using within-day hive weight changes to measure environmental effects on honey bee colonies.
Meikle, William G; Holst, Niels; Colin, Théotime; Weiss, Milagra; Carroll, Mark J; McFrederick, Quinn S; Barron, Andrew B
2018-01-01
Patterns in within-day hive weight data from two independent datasets in Arizona and California were modeled using piecewise regression, and analyzed with respect to honey bee colony behavior and landscape effects. The regression analysis yielded information on the start and finish of a colony's daily activity cycle, hive weight change at night, hive weight loss due to departing foragers and weight gain due to returning foragers. Assumptions about the meaning of the timing and size of the morning weight changes were tested in a third study by delaying the forager departure times from one to three hours using screen entrance gates. A regression of planned vs. observed departure delays showed that the initial hive weight loss around dawn was largely due to foragers. In a similar experiment in Australia, hive weight loss due to departing foragers in the morning was correlated with net bee traffic (difference between the number of departing bees and the number of arriving bees) and from those data the payload of the arriving bees was estimated to be 0.02 g. The piecewise regression approach was then used to analyze a fifth study involving hives with and without access to natural forage. The analysis showed that, during a commercial pollination event, hives with previous access to forage had a significantly higher rate of weight gain as the foragers returned in the afternoon, and, in the weeks after the pollination event, a significantly higher rate of weight loss in the morning, as foragers departed. This combination of continuous weight data and piecewise regression proved effective in detecting treatment differences in foraging activity that other methods failed to detect.
Cesium and strontium loads into a combined sewer system from rainwater runoff.
Kamei-Ishikawa, Nao; Yoshida, Daiki; Ito, Ayumi; Umita, Teruyuki
2016-12-01
In this study, combined sewage samples were taken with time in several rain events and sanitary sewage samples were taken with time in dry weather to calculate Cs and Sr loads to sewers from rainwater runoff. Cs and Sr in rainwater were present as particulate forms at first flush and the particulate Cs and Sr were mainly bound with inorganic suspended solids such as clay minerals in combined sewage samples. In addition, multiple linear regression analysis showed Cs and Sr loads from rainwater runoff could be estimated by the total amount of rainfall and antecedent dry weather days. The variation of the Sr load from rainwater to sewers was more sensitive to total amount of rainfall and antecedent dry weather days than that of the Cs load. Copyright © 2016 Elsevier Ltd. All rights reserved.
Processing Rhythmic Pattern during Chinese Sentence Reading: An Eye Movement Study
Luo, Yingyi; Duan, Yunyan; Zhou, Xiaolin
2015-01-01
Prosodic constraints play a fundamental role during both spoken sentence comprehension and silent reading. In Chinese, the rhythmic pattern of the verb-object (V-O) combination has been found to rapidly affect the semantic access/integration process during sentence reading (Luo and Zhou, 2010). Rhythmic pattern refers to the combination of words with different syllabic lengths, with certain combinations disallowed (e.g., [2 + 1]; numbers standing for the number of syllables of the verb and the noun respectively) and certain combinations preferred (e.g., [1 + 1] or [2 + 2]). This constraint extends to the situation in which the combination is used to modify other words. A V-O phrase could modify a noun by simply preceding it, forming a V-O-N compound; when the verb is disyllabic, however, the word order has to be O-V-N and the object is preferred to be disyllabic. In this study, we investigated how the reader processes the rhythmic pattern and word order information by recording the reader's eye-movements. We created four types of sentences by crossing rhythmic pattern and word order in compounding. The compound, embedding a disyllabic verb, could be in the correct O-V-N or the incorrect V-O-N order; the object could be disyllabic or monosyllabic. We found that the reader spent more time and made more regressions on and after the compounds when either type of anomaly was detected during the first pass reading. However, during re-reading (after all the words in the sentence have been viewed), fewer regressive eye movements were found for the anomalous rhythmic pattern, relative to the correct pattern; moreover, only the abnormal rhythmic pattern, not the violated word order, influenced the regressive eye movements. 
These results suggest that while the processing of rhythmic pattern and word order information occurs rapidly during the initial reading of the sentence, the process of recovering from the rhythmic pattern anomaly may ease the reanalysis processing at the later stage of sentence integration. Thus, rhythmic pattern in Chinese can dynamically affect both local phrase analysis and global sentence integration during silent reading. PMID:26696942
Processing Rhythmic Pattern during Chinese Sentence Reading: An Eye Movement Study.
Luo, Yingyi; Duan, Yunyan; Zhou, Xiaolin
2015-01-01
Prosodic constraints play a fundamental role during both spoken sentence comprehension and silent reading. In Chinese, the rhythmic pattern of the verb-object (V-O) combination has been found to rapidly affect the semantic access/integration process during sentence reading (Luo and Zhou, 2010). Rhythmic pattern refers to the combination of words with different syllabic lengths, with certain combinations disallowed (e.g., [2 + 1]; numbers standing for the number of syllables of the verb and the noun respectively) and certain combinations preferred (e.g., [1 + 1] or [2 + 2]). This constraint extends to the situation in which the combination is used to modify other words. A V-O phrase could modify a noun by simply preceding it, forming a V-O-N compound; when the verb is disyllabic, however, the word order has to be O-V-N and the object is preferred to be disyllabic. In this study, we investigated how the reader processes the rhythmic pattern and word order information by recording the reader's eye-movements. We created four types of sentences by crossing rhythmic pattern and word order in compounding. The compound, embedding a disyllabic verb, could be in the correct O-V-N or the incorrect V-O-N order; the object could be disyllabic or monosyllabic. We found that the reader spent more time and made more regressions on and after the compounds when either type of anomaly was detected during the first pass reading. However, during re-reading (after all the words in the sentence have been viewed), fewer regressive eye movements were found for the anomalous rhythmic pattern, relative to the correct pattern; moreover, only the abnormal rhythmic pattern, not the violated word order, influenced the regressive eye movements. 
These results suggest that while the processing of rhythmic pattern and word order information occurs rapidly during the initial reading of the sentence, the process of recovering from the rhythmic pattern anomaly may ease the reanalysis processing at the later stage of sentence integration. Thus, rhythmic pattern in Chinese can dynamically affect both local phrase analysis and global sentence integration during silent reading.
Dobashi, Akira; Goda, Kenichi; Yoshimura, Noboru; Ohya, Tomohiko R; Kato, Masayuki; Sumiyama, Kazuki; Matsushima, Masato; Hirooka, Shinichi; Ikegami, Masahiro; Tajiri, Hisao
2016-01-01
AIM To simplify the diagnostic criteria for superficial esophageal squamous cell carcinoma (SESCC) on Narrow Band Imaging combined with magnifying endoscopy (NBI-ME). METHODS This study was based on the post-hoc analysis of a randomized controlled trial. We performed NBI-ME for 147 patients with present or a history of squamous cell carcinoma in the head and neck, or esophagus between January 2009 and June 2011. Two expert endoscopists detected 89 lesions that were suspicious for SESCC lesions, which had been prospectively evaluated for the following 6 NBI-ME findings in real time: “intervascular background coloration”; “proliferation of intrapapillary capillary loops (IPCL)”; and “dilation”, “tortuosity”, “change in caliber”, and “various shapes (VS)” of IPCLs (i.e., Inoue’s tetrad criteria). The histologic examination of specimens was defined as the gold standard for diagnosis. A stepwise logistic regression analysis was used to identify candidates for the simplified criteria from among the 6 NBI-ME findings for diagnosing SESCCs. We evaluated diagnostic performance of the simplified criteria compared with that of Inoue’s criteria. RESULTS Fifty-four lesions (65%) were histologically diagnosed as SESCCs and the others as low-grade intraepithelial neoplasia or inflammation. In the univariate analysis, proliferation, tortuosity, change in caliber, and VS were significantly associated with SESCC (P < 0.01). The combination of VS and proliferation was statistically extracted from the 6 NBI-ME findings by using the stepwise logistic regression model. We defined the combination of VS and proliferation as simplified dyad criteria for SESCC. The areas under the curve of the simplified dyad criteria and Inoue’s tetrad criteria were 0.70 and 0.73, respectively. No significant difference was shown between them. 
The sensitivity, specificity, and accuracy of diagnosis for SESCC were 77.8%, 57.1%, 69.7% and 51.9%, 80.0%, 62.9% for the simplified dyad criteria and Inoue’s tetrad criteria, respectively. CONCLUSION The combination of proliferation and VS may serve as simplified criteria for the diagnosis of SESCC using NBI-ME. PMID:27895406
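The sensitivity, specificity and accuracy figures above follow from a 2×2 table of endoscopic diagnosis against histology. The sketch below shows the arithmetic. The cell counts are not given in the abstract; they are back-calculated here on the assumption of 54 SESCC and 35 non-SESCC lesions (89 in total), and they reproduce the reported percentages. The function name is hypothetical.

```python
def diagnostic_metrics(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from 2x2 table counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Back-calculated counts (assumption): (TP, FN, TN, FP) per set of criteria.
criteria = {
    "simplified dyad": (42, 12, 20, 15),
    "Inoue tetrad": (28, 26, 28, 7),
}

for name, counts in criteria.items():
    sens, spec, acc = diagnostic_metrics(*counts)
    print(f"{name}: sensitivity {sens:.1%}, "
          f"specificity {spec:.1%}, accuracy {acc:.1%}")
```

Run as written, this prints 77.8%, 57.1% and 69.7% for the dyad criteria and 51.9%, 80.0% and 62.9% for the tetrad criteria, which shows the trade-off: the simplified criteria gain sensitivity at the cost of specificity.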
Bao, Jie; Hou, Zhangshuan; Huang, Maoyi; ...
2015-12-04
Here, effective sensitivity analysis approaches are needed to identify important parameters or factors and their uncertainties in complex Earth system models composed of multi-phase multi-component phenomena and multiple biogeophysical-biogeochemical processes. In this study, the impacts of 10 hydrologic parameters in the Community Land Model on simulations of runoff and latent heat flux are evaluated using data from a watershed. Different metrics, including residual statistics, the Nash-Sutcliffe coefficient, and log mean square error, are used as alternative measures of the deviations between the simulated and field observed values. Four sensitivity analysis (SA) approaches, including analysis of variance based on the generalized linear model, generalized cross validation based on the multivariate adaptive regression splines model, standardized regression coefficients based on a linear regression model, and analysis of variance based on support vector machine, are investigated. Results suggest that these approaches show consistent measurement of the impacts of major hydrologic parameters on response variables, but with differences in the relative contributions, particularly for the secondary parameters. The convergence behaviors of the SA with respect to the number of sampling points are also examined with different combinations of input parameter sets and output response variables and their alternative metrics. This study helps identify the optimal SA approach, provides guidance for the calibration of the Community Land Model parameters to improve the model simulations of land surface fluxes, and approximates the magnitudes to be adjusted in the parameter values during parametric model optimization.
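Of the deviation metrics listed above, the Nash-Sutcliffe coefficient has a compact standard form: one minus the ratio of the squared simulation error to the variance of the observations about their mean. A minimal sketch follows; the function name and the runoff values are hypothetical and are not taken from the study.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 matches the mean."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("need equally long, non-empty sequences")
    mean_observed = sum(observed) / len(observed)
    error_ss = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    total_ss = sum((o - mean_observed) ** 2 for o in observed)
    return 1.0 - error_ss / total_ss


# Hypothetical observed and simulated runoff for five time steps.
observed_runoff = [1.0, 2.0, 3.0, 4.0, 5.0]
simulated_runoff = [1.1, 1.9, 3.2, 3.8, 5.0]
print(round(nash_sutcliffe(observed_runoff, simulated_runoff), 3))
```

A value below zero means the simulation performs worse than simply predicting the observed mean, which makes the coefficient a convenient scalar response for a sensitivity analysis over parameter sets.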
Tvete, Ingunn Fride; Natvig, Bent; Gåsemyr, Jørund; Meland, Nils; Røine, Marianne; Klemp, Marianne
2015-01-01
Rheumatoid arthritis patients have been treated with disease modifying anti-rheumatic drugs (DMARDs) and the newer biologic drugs. We sought to compare and rank the biologics with respect to efficacy. We performed a literature search identifying 54 publications encompassing 9 biologics. We conducted a multiple treatment comparison regression analysis letting the number experiencing a 50% improvement on the ACR score be dependent upon dose level and disease duration for assessing the comparable relative effect between biologics and placebo or DMARD. The analysis embraced all treatment and comparator arms over all publications. Hence, all measured effects of any biologic agent contributed to the comparison of all biologic agents relative to each other either given alone or combined with DMARD. We found the drug effect to be dependent on dose level, but not on disease duration, and the impact of a high versus low dose level was the same for all drugs (higher doses indicated a higher frequency of ACR50 scores). The ranking of the drugs when given without DMARD was certolizumab (ranked highest), etanercept, tocilizumab/ abatacept and adalimumab. The ranking of the drugs when given with DMARD was certolizumab (ranked highest), tocilizumab, anakinra/rituximab, golimumab/ infliximab/ abatacept, adalimumab/ etanercept [corrected]. Still, all drugs were effective. All biologic agents were effective compared to placebo, with certolizumab the most effective and adalimumab (without DMARD treatment) and adalimumab/ etanercept (combined with DMARD treatment) the least effective. The drugs were in general more effective, except for etanercept, when given together with DMARDs.
Tvete, Ingunn Fride; Natvig, Bent; Gåsemyr, Jørund; Meland, Nils; Røine, Marianne; Klemp, Marianne
2015-01-01
Rheumatoid arthritis patients have been treated with disease modifying anti-rheumatic drugs (DMARDs) and the newer biologic drugs. We sought to compare and rank the biologics with respect to efficacy. We performed a literature search identifying 54 publications encompassing 9 biologics. We conducted a multiple treatment comparison regression analysis letting the number experiencing a 50% improvement on the ACR score be dependent upon dose level and disease duration for assessing the comparable relative effect between biologics and placebo or DMARD. The analysis embraced all treatment and comparator arms over all publications. Hence, all measured effects of any biologic agent contributed to the comparison of all biologic agents relative to each other either given alone or combined with DMARD. We found the drug effect to be dependent on dose level, but not on disease duration, and the impact of a high versus low dose level was the same for all drugs (higher doses indicated a higher frequency of ACR50 scores). The ranking of the drugs when given without DMARD was certolizumab (ranked highest), etanercept, tocilizumab/ abatacept and adalimumab. The ranking of the drugs when given with DMARD was certolizumab (ranked highest), tocilizumab, anakinra, rituximab, golimumab/ infliximab/ abatacept, adalimumab/ etanercept. Still, all drugs were effective. All biologic agents were effective compared to placebo, with certolizumab the most effective and adalimumab (without DMARD treatment) and adalimumab/ etanercept (combined with DMARD treatment) the least effective. The drugs were in general more effective, except for etanercept, when given together with DMARDs. PMID:26356639
Analysis of hyperspectral field radiometric data for monitoring nitrogen concentration in rice crops
NASA Astrophysics Data System (ADS)
Stroppiana, D.; Boschetti, M.; Confalonieri, R.; Bocchi, S.; Brivio, P. A.
2005-10-01
Monitoring crop conditions and assessing nutrition requirements is fundamental for implementing sustainable agriculture. Rational nitrogen fertilization is of particular importance in rice crops in order to guarantee high production levels while minimising the impact on the environment. In fact, the typical flooded condition of rice fields can be a significant source of greenhouse gases. Information on plant nitrogen concentration can be used, coupled with information about the phenological stage, to plan strategies for a rational and spatially differentiated fertilization schedule. A field experiment was carried out in a rice field in Northern Italy, in order to evaluate the potential of field radiometric measurements for the prediction of rice nitrogen concentration. The results indicate that rice reflectance is influenced by nitrogen supply at certain wavelengths although N concentration cannot be accurately predicted based on the reflectance measured at a given wavelength. Regression analysis highlighted that the visible region of the spectrum is most sensitive to plant nitrogen concentration when reflectance measures are combined into a spectral index. An automated procedure allowed the analysis of all the possible combinations into a Normalized Difference Index (NDI) of the narrow spectral bands derived by spectral resampling of field measurements. The derived index appeared to be least influenced by plant biomass and Leaf Area Index (LAI) providing a useful approach to detect rice nutritional status. The validation of the regressive model showed that the model is able to predict rice N concentration (R² = 0.55 [p < 0.01]; RRMSE = 29.4; modelling efficiency close to the optimum value).
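The exhaustive band-pair search described in the abstract above can be illustrated with a minimal sketch. It assumes the usual normalized-difference form (R1 - R2) / (R1 + R2) and ranks band pairs by the squared Pearson correlation between the index and nitrogen concentration. The band names, reflectance values, and nitrogen values are hypothetical illustration data, not measurements from the study, and the authors' actual procedure may differ.

```python
from itertools import combinations
from math import sqrt


def ndi(r1, r2):
    """Normalized Difference Index of two band reflectances."""
    return (r1 - r2) / (r1 + r2)


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def best_ndi_pair(spectra, target):
    """Try every pair of bands and return (band1, band2, r_squared)
    for the pair whose NDI correlates most strongly with the target.

    spectra: dict mapping band name -> list of reflectances (one per plot)
    target:  list of nitrogen concentrations (one per plot)
    """
    best = None
    for b1, b2 in combinations(sorted(spectra), 2):
        index = [ndi(a, b) for a, b in zip(spectra[b1], spectra[b2])]
        r2 = pearson_r(index, target) ** 2
        if best is None or r2 > best[2]:
            best = (b1, b2, r2)
    return best


# Hypothetical reflectances for three narrow bands over five plots.
spectra = {
    "b550": [0.10, 0.09, 0.08, 0.07, 0.065],
    "b670": [0.06, 0.05, 0.045, 0.04, 0.03],
    "b800": [0.40, 0.42, 0.45, 0.44, 0.48],
}
n_conc = [1.2, 1.5, 1.9, 2.1, 2.6]  # hypothetical N concentrations

best = best_ndi_pair(spectra, n_conc)
print(best)
```

With hundreds of resampled narrow bands the number of pairs grows quadratically, which is why an automated search of this kind is needed.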
Folded concave penalized learning in identifying multimodal MRI marker for Parkinson’s disease
Liu, Hongcheng; Du, Guangwei; Zhang, Lijun; Lewis, Mechelle M.; Wang, Xue; Yao, Tao; Li, Runze; Huang, Xuemei
2016-01-01
Background: Brain MRI holds promise to gauge different aspects of Parkinson’s disease (PD)-related pathological changes. Its analysis, however, is hindered by the high-dimensional nature of the data. New method: This study introduces folded concave penalized (FCP) sparse logistic regression to identify biomarkers for PD from a large number of potential factors. The proposed statistical procedures target the challenges of high-dimensionality with limited data samples acquired. The maximization problem associated with the sparse logistic regression model is solved by local linear approximation. The proposed procedures then are applied to the empirical analysis of multimodal MRI data. Results: From 45 features, the proposed approach identified 15 MRI markers and the UPSIT, which are known to be clinically relevant to PD. By combining the MRI and clinical markers, we can enhance substantially the specificity and sensitivity of the model, as indicated by the ROC curves. Comparison to existing methods: We compare the folded concave penalized learning scheme with both the Lasso penalized scheme and the principal component analysis-based feature selection (PCA) in the Parkinson’s biomarker identification problem that takes into account both the clinical features and MRI markers. The folded concave penalty method demonstrates a substantially better clinical potential than both the Lasso and PCA in terms of specificity and sensitivity. Conclusions: For the first time, we applied the FCP learning method to MRI biomarker discovery in PD. The proposed approach successfully identified MRI markers that are clinically relevant. Combining these biomarkers with clinical features can substantially enhance performance. PMID:27102045
Lofgren, Sarah M; Tadros, Talaat; Herring-Bailey, Gina; Birdsong, George; Mosunjac, Marina; Flowers, Lisa; Nguyen, Minh Ly
2015-05-01
Our objective was to evaluate the progression and regression of cervical dysplasia in human immunodeficiency virus (HIV)-positive women during the late antiretroviral era. Risk factors as well as outcomes after treatment of cancerous or precancerous lesions were examined. This is a longitudinal retrospective review of cervical Pap tests performed on HIV-infected women with an intact cervix between 2004 and 2011. Subjects needed over two Pap tests for at least 2 years of follow-up. Progression was defined as those who developed a squamous intraepithelial lesion (SIL), atypical glandular cells (AGC), had low-grade SIL (LSIL) followed by atypical squamous cells-cannot exclude high-grade SIL (ASC-H) or high-grade SIL (HSIL), or cancer. Regression was defined as an initial SIL with two or more subsequent normal Pap tests. Persistence was defined as having an SIL without progression or regression. High-risk human papillomavirus (HPV) testing started in 2006 on atypical squamous cells of undetermined significance (ASCUS) Pap tests. AGC at enrollment were excluded from progression analysis. Of 1,445 screened, 383 patients had over two Pap tests for a 2-year period. Of those, 309 had an intact cervix. The median age was 40 years and CD4+ cell count was 277 cells/mL. Four had AGC at enrollment. A quarter had persistently normal Pap tests, 64 (31%) regressed, and 50 (24%) progressed. Four developed cancer. The only risk factor associated with progression was CD4 count. In those with treated lesions, 24 (59%) had negative Pap tests at the end of follow-up. More studies are needed to evaluate follow-up strategies of LSIL patients, potentially combined with HPV testing. Guidelines for HIV-seropositive women who are in care, have improved CD4, and have persistently negative Pap tests could likely lengthen the follow-up interval.
Linear models for calculating digestible energy for sheep diets.
Fonnesbeck, P V; Christiansen, M L; Harris, L E
1981-05-01
Equations for estimating the digestible energy (DE) content of sheep diets were generated from the chemical contents and a factorial description of diets fed to lambs in digestion trials. The diet factors were two forages (alfalfa and grass hay), harvested at three stages of maturity (late vegetative, early bloom and full bloom), fed in two ingredient combinations (all hay or a 50:50 hay and corn grain mixture) and prepared by two forage texture processes (coarsely chopped or finely chopped and pelleted). The 2 x 3 x 2 x 2 factorial arrangement produced 24 diet treatments. These were replicated twice, for a total of 48 lamb digestion trials. In model 1 regression equations, DE was calculated directly from chemical composition of the diet. In model 2, regression equations predicted the percentage of digested nutrient from the chemical contents of the diet and then DE of the diet was calculated as the sum of the gross energy of the digested organic components. Expanded forms of model 1 and model 2 were also developed that included diet factors as qualitative indicator variables to adjust the regression constant and regression coefficients for the diet description. The expanded forms of the equations accounted for significantly more variation in DE than did the simple models and more accurately estimated DE of the diet. Information provided by the diet description proved as useful as chemical analyses for the prediction of digestibility of nutrients. The statistics indicate that, with model 1, neutral detergent fiber and plant cell wall analyses provided as much information for the estimation of DE as did model 2 with the combined information from crude protein, available carbohydrate, total lipid, cellulose and hemicellulose. Regression equations are presented for estimating DE with the most currently analyzed organic components, including linear and curvilinear variables and diet factors that significantly reduce the standard error of the estimate. 
To estimate the DE of a diet, the user selects the equation that makes the most effective use of the available chemical analysis information and diet description.
Tadros, Talaat; Herring-Bailey, Gina; Birdsong, George; Mosunjac, Marina; Flowers, Lisa; Nguyen, Minh Ly
2015-01-01
Our objective was to evaluate the progression and regression of cervical dysplasia in human immunodeficiency virus (HIV)-positive women during the late antiretroviral era. Risk factors as well as outcomes after treatment of cancerous or precancerous lesions were examined. This is a longitudinal retrospective review of cervical Pap tests performed on HIV-infected women with an intact cervix between 2004 and 2011. Subjects needed over two Pap tests for at least 2 years of follow-up. Progression was defined as those who developed a squamous intraepithelial lesion (SIL), atypical glandular cells (AGC), had low-grade SIL (LSIL) followed by atypical squamous cells-cannot exclude high-grade SIL (ASC-H) or high-grade SIL (HSIL), or cancer. Regression was defined as an initial SIL with two or more subsequent normal Pap tests. Persistence was defined as having an SIL without progression or regression. High-risk human papillomavirus (HPV) testing started in 2006 on atypical squamous cells of undetermined significance (ASCUS) Pap tests. AGC at enrollment were excluded from progression analysis. Of 1,445 screened, 383 patients had over two Pap tests for a 2-year period. Of those, 309 had an intact cervix. The median age was 40 years and CD4+ cell count was 277 cells/mL. Four had AGC at enrollment. A quarter had persistently normal Pap tests, 64 (31%) regressed, and 50 (24%) progressed. Four developed cancer. The only risk factor associated with progression was CD4 count. In those with treated lesions, 24 (59%) had negative Pap tests at the end of follow-up. More studies are needed to evaluate follow-up strategies of LSIL patients, potentially combined with HPV testing. Guidelines for HIV-seropositive women who are in care, have improved CD4, and have persistently negative Pap tests could likely lengthen the follow-up interval. PMID:25693769
Hoffman, Haydn; Lee, Sunghoon I; Garst, Jordan H; Lu, Derek S; Li, Charles H; Nagasawa, Daniel T; Ghalehsari, Nima; Jahanforouz, Nima; Razaghy, Mehrdad; Espinal, Marie; Ghavamrezaii, Amir; Paak, Brian H; Wu, Irene; Sarrafzadeh, Majid; Lu, Daniel C
2015-09-01
This study introduces the use of multivariate linear regression (MLR) and support vector regression (SVR) models to predict postoperative outcomes in a cohort of patients who underwent surgery for cervical spondylotic myelopathy (CSM). Currently, predicting outcomes after surgery for CSM remains a challenge. We recruited patients who had a diagnosis of CSM and required decompressive surgery with or without fusion. Fine motor function was tested preoperatively and postoperatively with a handgrip-based tracking device that has been previously validated, yielding mean absolute accuracy (MAA) results for two tracking tasks (sinusoidal and step). All patients completed Oswestry disability index (ODI) and modified Japanese Orthopaedic Association questionnaires preoperatively and postoperatively. Preoperative data was utilized in MLR and SVR models to predict postoperative ODI. Predictions were compared to the actual ODI scores with the coefficient of determination (R²) and mean absolute difference (MAD). From this, 20 patients met the inclusion criteria and completed follow-up at least 3 months after surgery. With the MLR model, a combination of the preoperative ODI score, preoperative MAA (step function), and symptom duration yielded the best prediction of postoperative ODI (R² = 0.452; MAD = 0.0887; p = 1.17 × 10⁻³). With the SVR model, a combination of preoperative ODI score, preoperative MAA (sinusoidal function), and symptom duration yielded the best prediction of postoperative ODI (R² = 0.932; MAD = 0.0283; p = 5.73 × 10⁻¹²). The SVR model was more accurate than the MLR model. The SVR can be used preoperatively in risk/benefit analysis and the decision to operate. Copyright © 2015 Elsevier Ltd. All rights reserved.
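The two comparison metrics named in the abstract above, the coefficient of determination and the mean absolute difference, can be computed as in the following sketch. It assumes the common definition R² = 1 - SS_res / SS_tot; the abstract does not say which definition the authors used, and the score values below are hypothetical, not data from the study.

```python
def r_squared(actual, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mean_absolute_difference(actual, predicted):
    """Average absolute gap between predicted and actual scores."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical postoperative ODI scores expressed as fractions.
actual_odi = [0.20, 0.40, 0.60]
flat_prediction = [0.40, 0.40, 0.40]  # always predicts the mean

print(r_squared(actual_odi, flat_prediction))
print(mean_absolute_difference(actual_odi, flat_prediction))
```

A model that always predicts the mean of the observed scores has R² of about zero under this definition, which gives a baseline for reading the two reported R² values.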
Chen, Gang; Wu, Yulian; Wang, Tao; Liang, Jixing; Lin, Wei; Li, Liantao; Wen, Junping; Lin, Lixiang; Huang, Huibin
2012-10-01
The role of the endogenous secretory receptor for advanced glycation end products (esRAGE) in depression of diabetes patients and its clinical significance are unclear. This study investigated the role of serum esRAGE in patients with type 2 diabetes mellitus with depression in the Chinese population. One hundred nineteen hospitalized patients with type 2 diabetes were recruited at Fujian Provincial Hospital (Fuzhou, China) from February 2010 to January 2011. All selected subjects were assessed with the Hamilton Rating Scale for Depression (HAMD). Among them, 71 patients with both type 2 diabetes and depression were included. All selected subjects were examined for the following: esRAGE concentration, glycosylated hemoglobin (HbA1c), blood lipids, C-reactive protein, trace of albumin in urine, and carotid artery intima-media thickness (IMT). Association between serum esRAGE levels and risk of type 2 diabetes mellitus with depression was also analyzed. There were statistically significant differences in gender, age, body mass index, waist circumference, and treatment methods between the group with depression and the group without depression (P<0.05). Multiple linear regression analysis showed that HAMD scores were negatively correlated with esRAGE levels (standard regression coefficient -0.270, P<0.01). HAMD-17 scores were positively correlated with IMT (standard regression coefficient 0.183, P<0.05) and with HbA1c (standard regression coefficient 0.314, P<0.01). Female gender, younger age, obesity, poor glycemic control, complications, and insulin therapy are all risk factors of type 2 diabetes mellitus with combined depression in the Chinese population. Inflammation and atherosclerosis play an important role in the pathogenesis of depression. esRAGE is a protective factor of depression among patients who have type 2 diabetes.
The tolerance of the femoral shaft in combined axial compression and bending loading.
Ivarsson, B Johan; Genovese, Daniel; Crandall, Jeff R; Bolton, James R; Untaroiu, Costin D; Bose, Dipan
2009-11-01
The likelihood of a front seat occupant sustaining a femoral shaft fracture in a frontal crash has traditionally been assessed by an injury criterion relying solely on the axial force in the femur. However, recently published analyses of real world data indicate that femoral shaft fracture occurs at axial load levels below those found experimentally. One hypothesis attempting to explain this discrepancy suggests that femoral shaft fracture tends to occur as a result of combined axial compression and applied bending. The current study aims to evaluate this hypothesis by investigating how these two loading components interact. Femoral shafts harvested from human cadavers were loaded to failure in axial compression, sagittal plane bending, and combined axial compression and sagittal plane bending. All specimens subjected to bending and combined loading fractured midshaft, whereas the specimens loaded in axial compression demonstrated a variety of failure locations including midshaft and distal end. The interaction between the recorded levels of applied moment and axial compression force at fracture was evaluated using two different analysis methods: fitting of an analytical model to the experimental data and multiple regression analysis. The two analysis methods yielded very similar relationships between applied moment and axial compression force at midshaft fracture. The results indicate that posteroanterior bending reduces the tolerance of the femoral shaft to axial compression and that this type of combined loading therefore may contribute to the high prevalence of femoral shaft fracture in frontal crashes.
Stenner, Frank; Chastonay, Rahel; Liewen, Heike; Haile, Sarah R; Cathomas, Richard; Rothermundt, Christian; Siciliano, Raffaele D; Stoll, Susanna; Knuth, Alexander; Buchler, Tomas; Porta, Camillo; Renner, Christoph; Samaras, Panagiotis
2012-01-01
To evaluate the optimal sequence for the receptor tyrosine kinase inhibitors (rTKIs) sorafenib and sunitinib in metastatic renal cell cancer. We performed a retrospective analysis of patients who had received sequential therapy with both rTKIs and integrated these results into a pooled analysis of available data from other publications. Differences in median progression-free survival (PFS) for first- (PFS1) and second-line treatment (PFS2), and for the combined PFS (PFS1 plus PFS2) were examined using weighted linear regression. In the pooled analysis encompassing 853 patients, the median combined PFS for first-line sunitinib and 2nd-line sorafenib (SuSo) was 12.1 months compared with 15.4 months for the reverse sequence (SoSu; 95% CI for difference 1.45-5.12, p = 0.0013). Regarding first-line treatment, no significant difference in PFS1 was noted regardless of which drug was initially used (0.62 months average increase on sorafenib, 95% CI for difference -1.01 to 2.26, p = 0.43). In second-line treatment, sunitinib showed a significantly longer PFS2 than sorafenib (average increase 2.66 months, 95% CI 1.02-4.3, p = 0.003). The SoSu sequence translates into a longer combined PFS compared to the SuSo sequence. Predominantly the superiority of sunitinib regarding PFS2 contributed to the longer combined PFS in sequential use. Copyright © 2012 S. Karger AG, Basel.
Allegrini, Franco; Braga, Jez W B; Moreira, Alessandro C O; Olivieri, Alejandro C
2018-06-29
A new multivariate regression model, named Error Covariance Penalized Regression (ECPR) is presented. Following a penalized regression strategy, the proposed model incorporates information about the measurement error structure of the system, using the error covariance matrix (ECM) as a penalization term. Results are reported from both simulations and experimental data based on replicate mid and near infrared (MIR and NIR) spectral measurements. The results for ECPR are better under non-iid conditions when compared with traditional first-order multivariate methods such as ridge regression (RR), principal component regression (PCR) and partial least-squares regression (PLS). Copyright © 2018 Elsevier B.V. All rights reserved.
Zaggia, Luca; Lorenzetti, Giuliano; Manfé, Giorgia; Scarpa, Gian Marco; Molinaroli, Emanuela; Parnell, Kevin Ellis; Rapaglia, John Paul; Gionta, Maria; Soomere, Tarmo
2017-01-01
An investigation based on in-situ surveys combined with remote sensing and GIS analysis revealed fast shoreline retreat on the side of a major waterway, the Malamocco Marghera Channel, in the Lagoon of Venice, Italy. Monthly and long-term regression rates caused by ship wakes in a reclaimed industrial area were considered. The short-term analysis, based on field surveys carried out between April 2014 and January 2015, revealed that the speed of shoreline regression was insignificantly dependent on the distance from the navigation channel, but was not constant through time. Periods of high water levels due to tidal forcing or storm surges, more common in the winter season, are characterized by faster regression rates. The retreat is a discontinuous process in time and space depending on the morpho-stratigraphy and the vegetation cover of the artificial deposits. A GIS analysis performed with the available imagery shows an average retreat of 3-4 m/yr in the period between 1974 and 2015. Digitization of historical maps and bathymetric surveys made in April 2015 enabled the construction of two digital terrain models for both past and present situations. The two models have been used to calculate the total volume of sediment lost during the period 1968-2015 (1.19 × 10⁶ m³). The results show that in the presence of heavy ship traffic, ship-channel interactions can dominate the morphodynamics of a waterway and its margins. The analysis enables a better understanding of how shallow-water systems react to the human activities in the post-industrial period. An adequate evaluation of the temporal and spatial variation of shoreline position is also crucial for the development of future scenarios and for the sustainable management of port traffic worldwide.
Evaluation of Visual Field Progression in Glaucoma: Quasar Regression Program and Event Analysis.
Díaz-Alemán, Valentín T; González-Hernández, Marta; Perera-Sanz, Daniel; Armas-Domínguez, Karintia
2016-01-01
To determine the sensitivity, specificity and agreement between the Quasar program, glaucoma progression analysis (GPA II) event analysis and expert opinion in the detection of glaucomatous progression. The Quasar program is based on linear regression analysis of both mean defect (MD) and pattern standard deviation (PSD). Each series of visual fields was evaluated by three methods; Quasar, GPA II and four experts. The sensitivity, specificity and agreement (kappa) for each method was calculated, using expert opinion as the reference standard. The study included 439 SITA Standard visual fields of 56 eyes of 42 patients, with a mean of 7.8 ± 0.8 visual fields per eye. When suspected cases of progression were considered stable, sensitivity and specificity of Quasar, GPA II and the experts were 86.6% and 70.7%, 26.6% and 95.1%, and 86.6% and 92.6% respectively. When suspected cases of progression were considered as progressing, sensitivity and specificity of Quasar, GPA II and the experts were 79.1% and 81.2%, 45.8% and 90.6%, and 85.4% and 90.6% respectively. The agreement between Quasar and GPA II when suspected cases were considered stable or progressing was 0.03 and 0.28 respectively. The degree of agreement between Quasar and the experts when suspected cases were considered stable or progressing was 0.472 and 0.507. The degree of agreement between GPA II and the experts when suspected cases were considered stable or progressing was 0.262 and 0.342. The combination of MD and PSD regression analysis in the Quasar program showed better agreement with the experts and higher sensitivity than GPA II.
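The sensitivity, specificity, and kappa figures reported above all derive from a 2 × 2 table of a method's calls against the reference standard. The sketch below shows the standard formulas, with Cohen's kappa assumed as the agreement statistic. The counts are hypothetical and are not taken from the study.

```python
def diagnostic_summary(tp, fp, fn, tn):
    """Return (sensitivity, specificity, kappa) for a 2x2 table.

    tp: method and reference both say 'progressing'
    fp: method says 'progressing', reference says 'stable'
    fn: method says 'stable', reference says 'progressing'
    tn: method and reference both say 'stable'
    """
    n = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    observed = (tp + tn) / n
    # Agreement expected by chance, from the row and column totals.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n ** 2
    kappa = (observed - expected) / (1.0 - expected)
    return sensitivity, specificity, kappa


# Hypothetical counts for one method compared with expert opinion.
sens, spec, kappa = diagnostic_summary(tp=20, fp=5, fn=10, tn=15)
print(sens, spec, kappa)
```

Kappa discounts the agreement that two raters would reach by chance alone, which is why a method can show high specificity yet only modest kappa.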
Lorenzetti, Giuliano; Manfé, Giorgia; Scarpa, Gian Marco; Molinaroli, Emanuela; Parnell, Kevin Ellis; Rapaglia, John Paul; Gionta, Maria; Soomere, Tarmo
2017-01-01
An investigation based on in-situ surveys combined with remote sensing and GIS analysis revealed fast shoreline retreat on the side of a major waterway, the Malamocco Marghera Channel, in the Lagoon of Venice, Italy. Monthly and long-term regression rates caused by ship wakes in a reclaimed industrial area were considered. The short-term analysis, based on field surveys carried out between April 2014 and January 2015, revealed that the speed of shoreline regression was insignificantly dependent on the distance from the navigation channel, but was not constant through time. Periods of high water levels due to tidal forcing or storm surges, more common in the winter season, are characterized by faster regression rates. The retreat is a discontinuous process in time and space depending on the morpho-stratigraphy and the vegetation cover of the artificial deposits. A GIS analysis performed with the available imagery shows an average retreat of 3-4 m/yr in the period between 1974 and 2015. Digitization of historical maps and bathymetric surveys made in April 2015 enabled the construction of two digital terrain models for both past and present situations. The two models have been used to calculate the total volume of sediment lost during the period 1968-2015 (1.19 × 10⁶ m³). The results show that in the presence of heavy ship traffic, ship-channel interactions can dominate the morphodynamics of a waterway and its margins. The analysis enables a better understanding of how shallow-water systems react to the human activities in the post-industrial period. An adequate evaluation of the temporal and spatial variation of shoreline position is also crucial for the development of future scenarios and for the sustainable management of port traffic worldwide. PMID:29088244
Analyzing industrial energy use through ordinary least squares regression models
NASA Astrophysics Data System (ADS)
Golden, Allyson Katherine
Extensive research has been performed using regression analysis and calibrated simulations to create baseline energy consumption models for residential buildings and commercial institutions. However, few attempts have been made to discuss the applicability of these methodologies to establish baseline energy consumption models for industrial manufacturing facilities. In the few studies of industrial facilities, the presented linear change-point and degree-day regression analyses illustrate ideal cases. It follows that there is a need in the established literature to discuss the methodologies and to determine their applicability for establishing baseline energy consumption models of industrial manufacturing facilities. The thesis determines the effectiveness of simple inverse linear statistical regression models when establishing baseline energy consumption models for industrial manufacturing facilities. Ordinary least squares change-point and degree-day regression methods are used to create baseline energy consumption models for nine different case studies of industrial manufacturing facilities located in the southeastern United States. The influence of ambient dry-bulb temperature and production on total facility energy consumption is observed. The energy consumption behavior of industrial manufacturing facilities is only sometimes sufficiently explained by temperature, production, or a combination of the two variables. This thesis also provides methods for generating baseline energy models that are straightforward and accessible to anyone in the industrial manufacturing community. The methods outlined in this thesis may be easily replicated by anyone that possesses basic spreadsheet software and general knowledge of the relationship between energy consumption and weather, production, or other influential variables. 
With the help of simple inverse linear regression models, industrial manufacturing facilities may better understand their energy consumption and production behavior, and identify opportunities for energy and cost savings. This thesis study also utilizes change-point and degree-day baseline energy models to disaggregate facility annual energy consumption into separate industrial end-user categories. The baseline energy model provides a suitable and economical alternative to sub-metering individual manufacturing equipment. One case study describes the conjoined use of baseline energy models and facility information gathered during a one-day onsite visit to perform an end-point energy analysis of an injection molding facility conducted by the Alabama Industrial Assessment Center. Applying baseline regression model results to the end-point energy analysis allowed the AIAC to better approximate the annual energy consumption of the facility's HVAC system.
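As an illustration of the change-point regression named in the abstract above, the following sketch fits a three-parameter model in which energy use is flat below a change-point temperature and rises linearly above it. The model form, the grid search over candidate change points, and the temperature and energy values are all assumptions made for illustration; the thesis abstract does not give the exact formulation or any data.

```python
def ols_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def fit_change_point(temps, energy, candidates):
    """Fit energy = base + slope * max(T - Tcp, 0) for each candidate Tcp
    and return (Tcp, base, slope, sse) with the smallest squared error."""
    best = None
    for cp in candidates:
        x = [max(t - cp, 0.0) for t in temps]
        if max(x) == min(x):
            continue  # no temperature variation above this change point
        base, slope = ols_line(x, energy)
        sse = sum((e - (base + slope * xi)) ** 2 for xi, e in zip(x, energy))
        if best is None or sse < best[3]:
            best = (cp, base, slope, sse)
    return best


# Hypothetical monthly data: flat load of 100 units below 60 degrees,
# plus 2 units per degree above it.
temps = [40, 45, 50, 55, 60, 65, 70, 75, 80, 85]
energy = [100, 100, 100, 100, 100, 110, 120, 130, 140, 150]

cp, base, slope, sse = fit_change_point(temps, energy, range(45, 81, 5))
print(cp, base, slope)
```

The same search is simple enough to reproduce in a spreadsheet, one column per candidate change point, in keeping with the abstract's point that basic spreadsheet software suffices.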
Mi, Jia; Li, Jie; Zhang, Qinglu; Wang, Xing; Liu, Hongyu; Cao, Yanlu; Liu, Xiaoyan; Sun, Xiao; Shang, Mengmeng; Liu, Qing
2016-01-01
The purpose of the study was to establish a mathematical model for correlating the combination of ultrasonography and noncontrast helical computerized tomography (NCHCT) with the total energy of Holmium laser lithotripsy. In this study, from March 2013 to February 2014, 180 patients with single urinary calculus were examined using ultrasonography and NCHCT before Holmium laser lithotripsy. The calculus location and size, acoustic shadowing (AS) level, twinkling artifact intensity (TAI), and CT value were all documented. The total energy of lithotripsy (TEL) and the calculus composition were also recorded postoperatively. Data were analyzed using Spearman's rank correlation coefficient, with the SPSS 17.0 software package. Multiple linear regression was also used for further statistical analysis. A significant difference in the TEL was observed between renal calculi and ureteral calculi (r = –0.565, P < 0.001), and there was a strong correlation between the calculus size and the TEL (r = 0.675, P < 0.001). The difference in the TEL between the calculi with and without AS was highly significant (r = 0.325, P < 0.001). The CT value of the calculi was significantly correlated with the TEL (r = 0.386, P < 0.001). A correlation between the TAI and TEL was also observed (r = 0.391, P < 0.001). Multiple linear regression analysis revealed that the location, size, and TAI of the calculi were related to the TEL, and the location and size were statistically significant predictors (adjusted R² = 0.498, P < 0.001). A mathematical model correlating the combination of ultrasonography and NCHCT with TEL was established; this model may provide a foundation to guide the use of energy in Holmium laser lithotripsy. The TEL can be estimated by the location, size, and TAI of the calculus. PMID:27930563
Zhu, A N; Yang, X X; Sun, M Y; Zhang, Z X; Li, M
2015-03-13
We explored the associations of INSR and mTOR, 2 key genes in the insulin signaling pathway, and the susceptibility to type 2 diabetes mellitus and diabetic nephropathy. Three single-nucleotide polymorphisms (SNPs) (rs1799817, rs1051690, and rs2059806) in INSR and 3 SNPs (rs7211818, rs7212142, and rs9674559) in mTOR were genotyped using the Sequenom MassARRAY iPLEX platform in 89 type 2 diabetes patients without diabetic nephropathy, 134 type 2 diabetes patients with diabetic nephropathy, and 120 healthy control subjects. Statistical analysis based on unconditional logistic regression was carried out to determine the odds ratio (OR) and 95% confidence interval (95%CI) for each SNP. Combination analyses between rs2059806 and rs7212142 were also performed using the χ² test and logistic regression. Among these 6 SNPs, 4 (rs1799817, rs1051690, rs7211818, and rs9674559) showed no association with type 2 diabetes mellitus or diabetic nephropathy. However, rs2059806 in INSR was associated with both type 2 diabetes mellitus (P = 0.033) and type 2 diabetic nephropathy (P = 0.018). The rs7212142 polymorphism in mTOR was associated with type 2 diabetic nephropathy (P = 0.010, OR = 0.501, 95%CI = 0.288-0.871), but showed no relationship with type 2 diabetes mellitus. Combination analysis revealed that rs2059806 and rs7212142 had a combined effect on susceptibility to type 2 diabetes mellitus and diabetic nephropathy. Our results suggest that both INSR and mTOR play a role in the predisposition of the Han Chinese population to type 2 diabetic nephropathy, but the genetic predisposition may show some differences.
Online breakage detection of multitooth tools using classifier ensembles for imbalanced data
NASA Astrophysics Data System (ADS)
Bustillo, Andrés; Rodríguez, Juan J.
2014-12-01
Cutting tool breakage detection is an important task, due to its economic impact on mass production lines in the automobile industry. This task presents a central limitation: real data-sets are extremely imbalanced because breakage occurs in very few cases compared with normal operation of the cutting process. In this paper, we present an analysis of different data-mining techniques applied to the detection of insert breakage in multitooth tools. The analysis applies only one experimental variable: the electrical power consumption of the tool drive. This restriction profiles real industrial conditions more accurately than other physical variables, such as acoustic or vibration signals, which are not so easily measured. Many efforts have been made to design a method that is able to identify breakages with a high degree of reliability within a short period of time. The solution is based on classifier ensembles for imbalanced data-sets. Classifier ensembles are combinations of classifiers, which in many situations are more accurate than individual classifiers. Six different base classifiers are tested: Decision Trees, Rules, Naïve Bayes, Nearest Neighbour, Multilayer Perceptrons and Logistic Regression. Three different balancing strategies are tested with each of the classifier ensembles and compared to their performance with the original data-set: Synthetic Minority Over-Sampling Technique (SMOTE), undersampling and a combination of SMOTE and undersampling. To identify the most suitable data-mining solution, Receiver Operating Characteristics (ROC) graph and Recall-precision graph are generated and discussed. The performance of logistic regression ensembles on the balanced data-set using the combination of SMOTE and undersampling turned out to be the most suitable technique. 
Finally, a comparison using industrial performance measures is presented, which concludes that this technique is also better suited to this industrial problem than the other techniques presented in the bibliography.
A 4-gene panel as a marker at chromosome 8q in Asian gastric cancer patients.
Cheng, Lei; Zhang, Qing; Yang, Sheng; Yang, Yanqing; Zhang, Wen; Gao, Hengjun; Deng, Xiaxing; Zhang, Qinghua
2013-10-01
A widely held viewpoint is that the use of multiple markers, combined in some type of algorithm, will be necessary to provide high enough discrimination between diseased cases and non-diseased. We applied stepwise logistic regression analysis to identify the best combination of the 32 biomarkers at chromosome 8q on an independent public microarray test set of 80 paired gastric samples. A combination of SULF1, INTS8, ATP6V1C1, and GPR172A was identified with a prediction accuracy of 98.0% for discriminating carcinomas from adjacent noncancerous tissues in our previous 25 paired samples. Interestingly, the overexpression of SULF1 was associated with tumor invasion and metastasis. Function prediction analysis revealed that the 4-marker panel was mainly associated with acidification of intracellular compartments. Taken together, we found a 4-gene panel that accurately discriminated gastric carcinomas from adjacent noncancerous tissues and these results had potential clinical significance in the early diagnosis and targeted treatment of gastric cancer. Copyright © 2013 Elsevier Inc. All rights reserved.
A secure distributed logistic regression protocol for the detection of rare adverse drug events
El Emam, Khaled; Samet, Saeed; Arbuckle, Luk; Tamblyn, Robyn; Earle, Craig; Kantarcioglu, Murat
2013-01-01
Background There is limited capacity to assess the comparative risks of medications after they enter the market. For rare adverse events, the pooling of data from multiple sources is necessary to have the power and sufficient population heterogeneity to detect differences in safety and effectiveness in genetic, ethnic and clinically defined subpopulations. However, combining datasets from different data custodians or jurisdictions to perform an analysis on the pooled data creates significant privacy concerns that would need to be addressed. Existing protocols for addressing these concerns can result in reduced analysis accuracy and can allow sensitive information to leak. Objective To develop a secure distributed multi-party computation protocol for logistic regression that provides strong privacy guarantees. Methods We developed a secure distributed logistic regression protocol using a single analysis center with multiple sites providing data. A theoretical security analysis demonstrates that the protocol is robust to plausible collusion attacks and does not allow the parties to gain new information from the data that are exchanged among them. The computational performance and accuracy of the protocol were evaluated on simulated datasets. Results The computational performance scales linearly as the dataset sizes increase. The addition of sites results in an exponential growth in computation time. However, for up to five sites, the time is still short and would not affect practical applications. The model parameters are the same as the results on pooled raw data analyzed in SAS, demonstrating high model accuracy. Conclusion The proposed protocol and prototype system would allow the development of logistic regression models in a secure manner without requiring the sharing of personal health information. This can alleviate one of the key barriers to the establishment of large-scale post-marketing surveillance programs. 
We extended the secure protocol to account for correlations among patients within sites through generalized estimating equations, and to accommodate other link functions by extending it to generalized linear models. PMID:22871397
Rebolledo, Brian J; Bernard, Johnathan A; Werner, Brian C; Finlay, Andrea K; Nwachukwu, Benedict U; Dare, David M; Warren, Russell F; Rodeo, Scott A
2018-04-01
To evaluate the association between serum vitamin D level and the prevalence of lower extremity muscle strains and core muscle injuries in elite level athletes at the National Football League (NFL) combine. During the 2015 NFL combine, all athletes with available serum vitamin D levels were included for study. Baseline data were collected, including age, race, body mass index, position, injury history specific to lower extremity muscle strain or core muscle injury, and Functional Movement Screen scores. Serum 25-hydroxyvitamin D was collected and defined as normal (≥32 ng/mL), insufficient (20-31 ng/mL), and deficient (<20 ng/mL). Univariate regression analysis was used to examine the association of vitamin D level and injury history. Subsequent multivariate regression analysis was used to examine this relation with adjustment for collected baseline data variables. The study population included 214 athletes, including 78% African American athletes and 51% skilled position players. Inadequate vitamin D was present in 59%, including 10% with deficient levels. Lower extremity muscle strain or core muscle injury was present in 50% of athletes, which was associated with lower vitamin D levels (P = .03). Athletes with a positive injury history also showed significantly lower vitamin D levels as compared with uninjured athletes (P = .03). African American/black race (P < .001) and injury history (P < .001) were associated with lower vitamin D. Vitamin D groups showed no differences in age (P = .9), body mass index (P = .9), or Functional Movement Screen testing (P = .2). Univariate analysis of inadequate vitamin D levels showed a 1.86 higher odds of lower extremity strain or core muscle injury (P = .03), and 3.61 higher odds of hamstring injury (P < .001). Multivariate analysis did not reach an independent association of low vitamin D with injury history (P = .07). Inadequate vitamin D levels are a widespread finding in athletes at the NFL combine. 
Players with a history of lower extremity muscle strain and core muscle injury had a higher prevalence of inadequate vitamin D. Level IV, retrospective study-case series. Copyright © 2017 Arthroscopy Association of North America. Published by Elsevier Inc. All rights reserved.
Modeling Complex Phenomena Using Multiscale Time Sequences
2009-08-24
...measures based on Hurst and Hölder exponents, auto-regressive methods and Fourier and wavelet decomposition methods. The applications for this technology...different scales and how these scales relate to each other. This can be done by combining a set of statistical fractal measures based on Hurst and Hölder exponents, auto-regressive...
Rupert, Michael G.; Cannon, Susan H.; Gartner, Joseph E.; Michael, John A.; Helsel, Dennis R.
2008-01-01
Logistic regression was used to develop statistical models that can be used to predict the probability of debris flows in areas recently burned by wildfires by using data from 14 wildfires that burned in southern California during 2003-2006. Twenty-eight independent variables describing the basin morphology, burn severity, rainfall, and soil properties of 306 drainage basins located within those burned areas were evaluated. The models were developed as follows: (1) Basins that did and did not produce debris flows soon after the 2003 to 2006 fires were delineated from data in the National Elevation Dataset using a geographic information system; (2) Data describing the basin morphology, burn severity, rainfall, and soil properties were compiled for each basin. These data were then input to a statistics software package for analysis using logistic regression; and (3) Relations between the occurrence or absence of debris flows and the basin morphology, burn severity, rainfall, and soil properties were evaluated, and five multivariate logistic regression models were constructed. All possible combinations of independent variables were evaluated to determine which combinations produced the most effective models, and the multivariate models that best predicted the occurrence of debris flows were identified. Percentage of high burn severity and 3-hour peak rainfall intensity were significant variables in all models. Soil organic matter content and soil clay content were significant variables in all models except Model 5. Soil slope was a significant variable in all models except Model 4. The most suitable model can be selected from these five models on the basis of the availability of independent variables in the particular area of interest and field checking of probability maps. 
The multivariate logistic regression models can be entered into a geographic information system, and maps showing the probability of debris flows can be constructed in recently burned areas of southern California. This study demonstrates that logistic regression is a valuable tool for developing models that predict the probability of debris flows occurring in recently burned landscapes.
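A fitted logistic regression model turns a linear combination of basin variables into a probability through the logistic function. The sketch below shows only that final step; the intercept, the coefficients, and the variable names are hypothetical placeholders and are not the values of any of the five published models.

```python
import math

def debris_flow_probability(intercept, coefficients, predictors):
    """Return P = 1 / (1 + exp(-x)), where x = intercept + sum(b_i * x_i)."""
    x = intercept + sum(
        coefficients[name] * predictors[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-x))

# Hypothetical model: every number below is illustrative only.
coefficients = {
    "high_burn_severity_pct": 0.03,   # percent of basin burned at high severity
    "rain_3h_peak_mm_per_h": 0.20,    # 3-hour peak rainfall intensity
    "soil_clay_pct": -0.05,           # soil clay content
}
basin = {
    "high_burn_severity_pct": 40.0,
    "rain_3h_peak_mm_per_h": 8.0,
    "soil_clay_pct": 15.0,
}
print(round(debris_flow_probability(-2.0, coefficients, basin), 3))
```

Evaluating such a function for every basin in a geographic information system layer is what produces a probability map of the kind the abstract describes.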
Taljaard, Monica; McKenzie, Joanne E; Ramsay, Craig R; Grimshaw, Jeremy M
2014-06-19
An interrupted time series design is a powerful quasi-experimental approach for evaluating effects of interventions introduced at a specific point in time. To utilize the strength of this design, a modification to standard regression analysis, such as segmented regression, is required. In segmented regression analysis, the change in intercept and/or slope from pre- to post-intervention is estimated and used to test causal hypotheses about the intervention. We illustrate segmented regression using data from a previously published study that evaluated the effectiveness of a collaborative intervention to improve quality in pre-hospital ambulance care for acute myocardial infarction (AMI) and stroke. In the original analysis, a standard regression model was used with time as a continuous variable. We contrast the results from this standard regression analysis with those from segmented regression analysis. We discuss the limitations of the former and advantages of the latter, as well as the challenges of using segmented regression in analysing complex quality improvement interventions. Based on the estimated change in intercept and slope from pre- to post-intervention using segmented regression, we found insufficient evidence of a statistically significant effect on quality of care for stroke, although potential clinically important effects for AMI cannot be ruled out. Segmented regression analysis is the recommended approach for analysing data from an interrupted time series study. Several modifications to the basic segmented regression analysis approach are available to deal with challenges arising in the evaluation of complex quality improvement interventions.
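A common way to write the segmented regression model is y_t = b0 + b1*t + b2*D_t + b3*(t - t0)*D_t, where D_t is 1 from the intervention time t0 onward, b2 is the change in intercept, and b3 is the change in slope. The sketch below fits this model by ordinary least squares using only the standard library. It is an illustration under simplifying assumptions (equally spaced time points, no adjustment for autocorrelation), and the function names and data are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def segmented_fit(y, start):
    """Return [b0, b1, b2, b3] for observations y at times 0, 1, 2, ..."""
    rows = []
    for t in range(len(y)):
        d = 1.0 if t >= start else 0.0
        rows.append([1.0, float(t), d, d * (t - start)])
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * v for r, v in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

# Hypothetical monthly quality scores with an intervention at month 6.
scores = [50, 51, 52, 52, 54, 55, 60, 62, 63, 66, 67, 70]
b0, b1, b2, b3 = segmented_fit(scores, start=6)
print(f"level change = {b2:.2f}, slope change = {b3:.2f}")
```

A standard regression with time as a single continuous variable corresponds to forcing b2 and b3 to zero, which is why it cannot separate a step change from the underlying trend.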
Hubbard, Logan; Lipinski, Jerry; Ziemer, Benjamin; Malkasian, Shant; Sadeghi, Bahman; Javan, Hanna; Groves, Elliott M; Dertli, Brian; Molloi, Sabee
2018-01-01
Purpose To retrospectively validate a first-pass analysis (FPA) technique that combines computed tomographic (CT) angiography and dynamic CT perfusion measurement into one low-dose examination. Materials and Methods The study was approved by the animal care committee. The FPA technique was retrospectively validated in six swine (mean weight, 37.3 kg ± 7.5 [standard deviation]) between April 2015 and October 2016. Four to five intermediate-severity stenoses were generated in the left anterior descending artery (LAD), and 20 contrast material-enhanced volume scans were acquired per stenosis. All volume scans were used for maximum slope model (MSM) perfusion measurement, but only two volume scans were used for FPA perfusion measurement. Perfusion measurements in the LAD, left circumflex artery (LCx), right coronary artery, and all three coronary arteries combined were compared with microsphere perfusion measurements by using regression, root-mean-square error, root-mean-square deviation, Lin concordance correlation, and diagnostic outcomes analysis. The CT dose index and size-specific dose estimate per two-volume FPA perfusion measurement were also determined. Results FPA and MSM perfusion measurements ($P_{\mathrm{FPA}}$ and $P_{\mathrm{MSM}}$) in all three coronary arteries combined were related to reference standard microsphere perfusion measurements ($P_{\mathrm{MICRO}}$), as follows: $P_{\mathrm{FPA\_COMBINED}} = 1.02\,P_{\mathrm{MICRO\_COMBINED}} + 0.11$ ($r = 0.96$) and $P_{\mathrm{MSM\_COMBINED}} = 0.28\,P_{\mathrm{MICRO\_COMBINED}} + 0.23$ ($r = 0.89$). The CT dose index and size-specific dose estimate per two-volume FPA perfusion measurement were 10.8 and 17.8 mGy, respectively. Conclusion The FPA technique was retrospectively validated in a swine model and has the potential to be used for accurate, low-dose vessel-specific morphologic and physiologic assessment of coronary artery disease. © RSNA, 2017.
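Lin's concordance correlation, one of the agreement measures named in the abstract, penalizes both scatter and systematic offset from the identity line, which a Pearson r alone does not. A minimal sketch follows, assuming the usual definition with 1/n moments; the function name and the perfusion values are hypothetical.

```python
def lin_ccc(x, y):
    """Lin's concordance correlation coefficient using 1/n moments.

    ccc = 2 * s_xy / (s_x^2 + s_y^2 + (mean_x - mean_y)^2)
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    return 2 * sxy / (sxx + syy + (mx - my) ** 2)

# Hypothetical perfusion values in mL/min/g, not taken from the study.
reference = [0.8, 1.2, 1.9, 2.5, 3.1]
measured = [0.9, 1.3, 2.0, 2.7, 3.2]
print(round(lin_ccc(reference, measured), 3))
```

A method whose regression slope is far from 1, such as the 0.28 slope reported for the maximum slope model, can keep a high Pearson r while its concordance falls well below 1.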
Passenger comfort during terminal-area flight maneuvers. M.S. Thesis.
NASA Technical Reports Server (NTRS)
Schoonover, W. E., Jr.
1976-01-01
A series of flight experiments was conducted to obtain passenger subjective responses to closely controlled and repeatable flight maneuvers. In 8 test flights, reactions were obtained from 30 passenger subjects to a wide range of terminal-area maneuvers, including descents, turns, decelerations, and combinations thereof. Analysis of the passenger rating variance indicated that the objective of a repeatable flight passenger environment was achieved. Multiple linear regression models developed from the test data were used to define maneuver motion boundaries for specified degrees of passenger acceptance.
NASA Astrophysics Data System (ADS)
Karami, K.; Mohebi, R.
2007-08-01
We introduce a new method to derive the orbital parameters of spectroscopic binary stars by nonlinear least squares of (O-C). Using the measured radial velocity data of the four double-lined spectroscopic binary systems AI Phe, GM Dra, HD 93917 and V502 Oph, we derived both the orbital and combined spectroscopic elements of these systems. Our numerical results are in good agreement with those obtained using the method of Lehmann-Filhé.
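The quantity minimized in an (O-C) fit is the sum of squared differences between observed and computed radial velocities. The sketch below evaluates that objective with the standard Keplerian expression V = gamma + K[cos(theta + omega) + e cos(omega)]; it does not reproduce the authors' minimization scheme, and the function names, parameter ordering, and sample values are assumptions made for illustration.

```python
import math

def eccentric_anomaly(mean_anomaly, e, tol=1e-12):
    """Solve Kepler's equation M = E - e*sin(E) by Newton iteration."""
    ecc_anom = mean_anomaly if e < 0.8 else math.pi
    for _ in range(100):
        step = (ecc_anom - e * math.sin(ecc_anom) - mean_anomaly) / (
            1 - e * math.cos(ecc_anom)
        )
        ecc_anom -= step
        if abs(step) < tol:
            break
    return ecc_anom

def radial_velocity(t, period, t_peri, e, omega, k, gamma):
    """Computed radial velocity at time t for one component."""
    mean_anomaly = 2 * math.pi * (((t - t_peri) / period) % 1.0)
    ecc_anom = eccentric_anomaly(mean_anomaly, e)
    # True anomaly from the eccentric anomaly.
    theta = 2 * math.atan2(
        math.sqrt(1 + e) * math.sin(ecc_anom / 2),
        math.sqrt(1 - e) * math.cos(ecc_anom / 2),
    )
    return gamma + k * (math.cos(theta + omega) + e * math.cos(omega))

def sum_sq_o_minus_c(times, observed, params):
    """Sum of squared (observed - computed) velocities for a parameter tuple."""
    return sum(
        (v - radial_velocity(t, *params)) ** 2 for t, v in zip(times, observed)
    )

# Hypothetical orbit: period 10 d, e = 0.2, K = 50 km/s, gamma = -5 km/s.
params = (10.0, 0.0, 0.2, 1.0, 50.0, -5.0)
times = [0.0, 2.5, 5.0, 7.5]
observed = [radial_velocity(t, *params) + 0.1 for t in times]
print(round(sum_sq_o_minus_c(times, observed, params), 4))
```

Any general-purpose nonlinear optimizer can then be applied to `sum_sq_o_minus_c` over the six parameters, given reasonable starting values.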
Unleashing the power of inhibitors of oncogenic kinases through BH3 mimetics.
Cragg, Mark S; Harris, Claire; Strasser, Andreas; Scott, Clare L
2009-05-01
Therapeutic targeting of tumours on the basis of molecular analysis is a new paradigm for cancer treatment but has yet to fulfil expectations. For many solid tumours, targeted therapeutics, such as inhibitors of oncogenic kinase pathways, elicit predominantly disease-stabilizing, cytostatic responses, rather than tumour regression. Combining oncogenic kinase inhibitors with direct activators of the apoptosis machinery, such as the BH3 mimetic ABT-737, may unlock potent anti-tumour potential to produce durable clinical responses with less collateral damage.
Standardized Regression Coefficients as Indices of Effect Sizes in Meta-Analysis
ERIC Educational Resources Information Center
Kim, Rae Seon
2011-01-01
When conducting a meta-analysis, it is common to find many collected studies that report regression analyses, because multiple regression analysis is widely used in many fields. Meta-analysis uses effect sizes drawn from individual studies as a means of synthesizing a collection of results. However, indices of effect size from regression analyses…
ERIC Educational Resources Information Center
Shih, Ching-Lin; Liu, Tien-Hsiang; Wang, Wen-Chung
2014-01-01
The simultaneous item bias test (SIBTEST) method regression procedure and the differential item functioning (DIF)-free-then-DIF strategy are applied to the logistic regression (LR) method simultaneously in this study. These procedures are used to adjust the effects of matching true score on observed score and to better control the Type I error…
On the reliable and flexible solution of practical subset regression problems
NASA Technical Reports Server (NTRS)
Verhaegen, M. H.
1987-01-01
A new algorithm for solving subset regression problems is described. The algorithm performs a QR decomposition with a new column-pivoting strategy, which permits subset selection directly from the originally defined regression parameters. This, in combination with a number of extensions of the new technique, makes the method a very flexible tool for analyzing subset regression problems in which the parameters have a physical meaning.
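To make the idea of pivoting-based subset selection concrete, the sketch below orders regressor columns with the classical rule of picking, at each step, the remaining column with the largest residual norm after orthogonalization. This is the textbook rule, not the new pivoting strategy of the report, which the abstract does not specify; the function name and data are hypothetical.

```python
import math

def pivoted_column_order(columns, tol=1e-10):
    """Return column indices in the order chosen by greedy norm pivoting.

    Columns whose residual norm falls below tol are treated as linearly
    dependent on those already selected and are left out.
    """
    work = {j: list(col) for j, col in enumerate(columns)}
    order = []
    while work:
        j = max(work, key=lambda c: sum(v * v for v in work[c]))
        norm = math.sqrt(sum(v * v for v in work[j]))
        if norm <= tol:
            break
        q = [v / norm for v in work.pop(j)]
        order.append(j)
        # Remove the component along q from every remaining column.
        for c in list(work):
            proj = sum(a * b for a, b in zip(q, work[c]))
            work[c] = [a - proj * b for a, b in zip(work[c], q)]
    return order

# Hypothetical regressors: column 1 is twice column 0, column 2 is independent.
columns = [[1.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
print(pivoted_column_order(columns))
```

The leading indices form a well-conditioned subset of the original regressors, so the selected parameters keep their physical meaning instead of being replaced by linear combinations.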
Choi, Scott Seung W; Budhathoki, Chakra; Gitlin, Laura N
2017-05-01
To investigate co-occurrences of agitation, aggression, and rejection of care in community-dwelling families living with dementia. Cross-sectional, secondary analysis from a randomized controlled trial testing a nonpharmacological intervention to reduce behavioral symptoms. We examined frequency of occurrence of presenting behaviors at baseline and their combination. Omnibus tests compared those exhibiting combinations of behaviors on contributory factors. Multinomial logistic regression analyses examined relationships of contributory factors to combinations of behaviors. Of 272 persons with dementia (PwDs), 41 (15%) had agitation alone (Agi), 3 (1%) had aggression alone, 5 (2%) had rejection of care alone. For behavioral combinations, 65 (24%) had agitation and aggression (Agi+Aggr), 35 (13%) had agitation and rejection (Agi+Rej), 1 (0%) had aggression and rejection, and 106 (39%) had all three behaviors (All). Four behavioral subgroups (Agi, Agi+Aggr, Agi+Rej, and All) were examined. Kruskal-Wallis tests showed that there were significant group differences in PwD cognition, functional dependence, and caregiver frustration. PwDs in Agi+Rej and All were more cognitively impaired than those in Agi and Agi+Aggr. Also, caregivers in All were more frustrated than those in Agi. In logistic regression analyses, compared with Agi, greater cognitive impairment was a significant predictor of Agi+Rej and All, but not Agi+Aggr. In contrast, greater caregiver frustration was a significant predictor of Agi+Aggr and All, but not Agi+Rej. We found that agitation, aggression, and rejection are common but distinct behaviors. Combinations of these behaviors have different relationships with contributory factors, suggesting the need for targeting treatment approaches to clusters. Copyright © 2016 American Association for Geriatric Psychiatry. Published by Elsevier Inc. All rights reserved.
Simultaneous grouping pursuit and feature selection over an undirected graph*
Zhu, Yunzhang; Shen, Xiaotong; Pan, Wei
2013-01-01
Summary In high-dimensional regression, grouping pursuit and feature selection have their own merits while complementing each other in battling the curse of dimensionality. To seek a parsimonious model, we perform simultaneous grouping pursuit and feature selection over an arbitrary undirected graph with each node corresponding to one predictor. When the corresponding nodes are reachable from each other over the graph, regression coefficients can be grouped, whose absolute values are the same or close. This is motivated from gene network analysis, where genes tend to work in groups according to their biological functionalities. Through a nonconvex penalty, we develop a computational strategy and analyze the proposed method. Theoretical analysis indicates that the proposed method reconstructs the oracle estimator, that is, the unbiased least squares estimator given the true grouping, leading to consistent reconstruction of grouping structures and informative features, as well as to optimal parameter estimation. Simulation studies suggest that the method combines the benefit of grouping pursuit with that of feature selection, and compares favorably against its competitors in selection accuracy and predictive performance. An application to eQTL data is used to illustrate the methodology, where a network is incorporated into analysis through an undirected graph. PMID:24098061
Robust regression for large-scale neuroimaging studies.
Fritsch, Virgile; Da Mota, Benoit; Loth, Eva; Varoquaux, Gaël; Banaschewski, Tobias; Barker, Gareth J; Bokde, Arun L W; Brühl, Rüdiger; Butzek, Brigitte; Conrod, Patricia; Flor, Herta; Garavan, Hugh; Lemaitre, Hervé; Mann, Karl; Nees, Frauke; Paus, Tomas; Schad, Daniel J; Schümann, Gunter; Frouin, Vincent; Poline, Jean-Baptiste; Thirion, Bertrand
2015-05-01
Multi-subject datasets used in neuroimaging group studies have a complex structure, as they exhibit non-stationary statistical properties across regions and display various artifacts. While studies with small sample sizes can rarely be shown to deviate from standard hypotheses (such as the normality of the residuals) due to the poor sensitivity of normality tests with low degrees of freedom, large-scale studies (e.g. >100 subjects) exhibit more obvious deviations from these hypotheses and call for more refined models for statistical inference. Here, we demonstrate the benefits of robust regression as a tool for analyzing large neuroimaging cohorts. First, we use an analytic test based on robust parameter estimates; based on simulations, this procedure is shown to provide an accurate statistical control without resorting to permutations. Second, we show that robust regression yields more detections than standard algorithms using as an example an imaging genetics study with 392 subjects. Third, we show that robust regression can avoid false positives in a large-scale analysis of brain-behavior relationships with over 1500 subjects. Finally we embed robust regression in the Randomized Parcellation Based Inference (RPBI) method and demonstrate that this combination further improves the sensitivity of tests carried out across the whole brain. Altogether, our results show that robust procedures provide important advantages in large-scale neuroimaging group studies. Copyright © 2015 Elsevier Inc. All rights reserved.
Akimov, M A; Gel'fond, M L; Gershanovich, M L; Barchuk, A S
2003-01-01
Thirty-eight patients with disseminated skin melanoma received chemotherapy in conjunction with laser coagulation or interstitial hyperthermia of intra- or subcutaneous metastases. Combination therapy was followed by a rise in total response to 37% and in complete regression to 16%. The greatest effectiveness was attained with the dacarbazine + cisplatin + BCNU + tamoxifen regimen. In this group of 16 patients (46%), total response was 56% and, most significantly, complete regression was 31%. In all cases of apparent response, polychemotherapy was administered both before and after laser coagulation or interstitial hyperthermia.
Sumithran, P; Purcell, K; Kuyruk, S; Proietto, J; Prendergast, L A
2018-02-01
Consistent, strong predictors of obesity treatment outcomes have not been identified. It has been suggested that broadening the range of predictor variables examined may be valuable. We explored methods to predict outcomes of a very-low-energy diet (VLED)-based programme in a clinically comparable setting, using a wide array of pre-intervention biological and psychosocial participant data. A total of 61 women and 39 men (mean ± standard deviation [SD] body mass index: 39.8 ± 7.3 kg/m²) underwent an 8-week VLED and 12-month follow-up. At baseline, participants underwent a blood test and assessment of psychological, social and behavioural factors previously associated with treatment outcomes. Logistic regression, linear discriminant analysis, decision trees and random forests were used to model outcomes from baseline variables. Of the 100 participants, 88 completed the VLED and 42 attended the Week 60 visit. Overall prediction rates for weight loss of ≥10% at weeks 8 and 60, and attrition at Week 60, using combined data were between 77.8 and 87.6% for logistic regression, and lower for other methods. When logistic regression analyses included only baseline demographic and anthropometric variables, prediction rates were 76.2-86.1%. In this population, considering a wide range of biological and psychosocial data did not improve outcome prediction compared to simply obtained baseline characteristics. © 2017 World Obesity Federation.
Using Dominance Analysis to Determine Predictor Importance in Logistic Regression
ERIC Educational Resources Information Center
Azen, Razia; Traxel, Nicole
2009-01-01
This article proposes an extension of dominance analysis that allows researchers to determine the relative importance of predictors in logistic regression models. Criteria for choosing logistic regression R[superscript 2] analogues were determined and measures were selected that can be used to perform dominance analysis in logistic regression. A…
Gray, Michael J; Gong, Jian; Hatch, Michaela M S; Nguyen, Van; Hughes, Christopher C W; Hutchins, Jeff T; Freimark, Bruce D
2016-05-11
The purpose of this study was to investigate the potential of antibody-directed immunotherapy targeting the aminophospholipid phosphatidylserine, which promotes immunosuppression when exposed in the tumor microenvironment, alone and in combination with antibody treatment towards the T-cell checkpoint inhibitor PD-1 in breast carcinomas, including triple-negative breast cancers. Immune-competent mice bearing syngeneic EMT-6 or E0771 tumors were subjected to treatments comprising of a phosphatidylserine-targeting and an anti-PD-1 antibody either as single or combinational treatments. Anti-tumor effects were determined by tumor growth inhibition and changes in overall survival accompanying each treatment. The generation of a tumor-specific immune response in animals undergoing complete tumor regression was assessed by secondary tumor cell challenge and splenocyte-produced IFNγ in the presence or absence of irradiated tumor cells. Changes in the presence of tumor-infiltrating lymphocytes were assessed by flow cytometry, while mRNA-based immune profiling was determined using NanoString PanCancer Immune Profiling Panel analysis. Treatment by a phosphatidylserine-targeting antibody inhibits in-vivo growth and significantly enhances the anti-tumor activity of antibody-mediated PD-1 therapy, including providing a distinct survival advantage over treatment by either single agent. Animals in which complete tumor regression occurred with combination treatments were resistant to secondary tumor challenge and presented heightened expression levels of splenocyte-produced IFNγ. Combinational treatment by a phosphatidylserine-targeting antibody with anti-PD-1 therapy increased the number of tumor-infiltrating lymphocytes more than that observed with single-arm therapies. 
Finally, immunoprofiling analysis revealed that the combination of anti-phosphatidylserine targeting antibody and anti-PD-1 therapy enhanced tumor-infiltrating lymphocytes, and increased expression of pro-immunosurveillance-associated cytokines while significantly decreasing expression of pro-tumorigenic cytokines that were induced by single anti-PD-1 therapy. Our data suggest that antibody therapy targeting phosphatidylserine-associated immunosuppression, which has activity as a single agent, can significantly enhance immunotherapies targeting the PD-1 pathway in murine breast neoplasms, including triple-negative breast cancers.
Effects of Climate Change on Salmonella Infections
Akil, Luma; Reddy, Remata S.
2014-01-01
Abstract Background: Climate change and global warming have been reported to increase spread of foodborne pathogens. To understand these effects on Salmonella infections, modeling approaches such as regression analysis and neural network (NN) were used. Methods: Monthly data for Salmonella outbreaks in Mississippi (MS), Tennessee (TN), and Alabama (AL) were analyzed from 2002 to 2011 using analysis of variance and time series analysis. Meteorological data were collected and the correlation with salmonellosis was examined using regression analysis and NN. Results: A seasonal trend in Salmonella infections was observed (p<0.001). Strong positive correlation was found between high temperature and Salmonella infections in MS and for the combined states (MS, TN, AL) models (R²=0.554; R²=0.415, respectively). NN models showed a strong effect of rise in temperature on the Salmonella outbreaks. In this study, an increase of 1°F was shown to result in an increase of four cases of Salmonella infection in MS. However, no correlation between monthly average precipitation rate and Salmonella infections was observed. Conclusion: There is consistent evidence that gastrointestinal infection with bacterial pathogens is positively correlated with ambient temperature, as warmer temperatures enable more rapid replication. Warming trends in the United States and specifically in the southern states may increase rates of Salmonella infections. PMID:25496072
Dor, Avi; Luo, Qian; Gerstein, Maya Tuchman; Malveaux, Floyd; Mitchell, Herman; Markus, Anne Rossier
We present an incremental cost-effectiveness analysis of an evidence-based childhood asthma intervention (Community Healthcare for Asthma Management and Prevention of Symptoms [CHAMPS]) relative to usual management of childhood asthma in community health centers. Data used in the analysis include household surveys, Medicaid insurance claims, and community health center expenditure reports. We combined our incremental cost-effectiveness analysis with a difference-in-differences multivariate regression framework. We found that CHAMPS reduced symptom days by 29.75 days per child-year and was cost-effective (incremental cost-effectiveness ratio: $28.76 per symptom-free day). Most of the benefits were due to reductions in direct medical costs. Indirect benefits from increased household productivity were relatively small.
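For orientation, the incremental cost-effectiveness ratio quoted above is conventionally defined as the difference in cost divided by the difference in effect between the two arms. The symbols below (C for cost, E for effect measured in symptom-free days) are assumed notation, not taken from the paper, and the difference-in-differences estimation of each term is not reproduced here.

```latex
\mathrm{ICER} = \frac{C_{\text{CHAMPS}} - C_{\text{usual}}}{E_{\text{CHAMPS}} - E_{\text{usual}}}
```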
Network structure and travel time perception.
Parthasarathi, Pavithra; Levinson, David; Hochmair, Hartwig
2013-01-01
The purpose of this research is to test the systematic variation in the perception of travel time among travelers and relate the variation to the underlying street network structure. Travel survey data from the Twin Cities metropolitan area (which includes the cities of Minneapolis and St. Paul) is used for the analysis. Travelers are classified into two groups based on the ratio of perceived and estimated commute travel time. The measures of network structure are estimated using the street network along the identified commute route. T-test comparisons are conducted to identify statistically significant differences in estimated network measures between the two traveler groups. The combined effect of these estimated network measures on travel time is then analyzed using regression models. The results from the t-test and regression analyses confirm the influence of the underlying network structure on the perception of travel time.
Li, Yankun; Shao, Xueguang; Cai, Wensheng
2007-04-15
Consensus modeling, which combines the results of multiple independent models to produce a single prediction, avoids the instability of a single model. Based on the principle of consensus modeling, a consensus least squares support vector regression (LS-SVR) method for calibrating near-infrared (NIR) spectra was proposed. In the proposed approach, NIR spectra of plant samples were first preprocessed using the discrete wavelet transform (DWT) to filter out the spectral background and noise; the consensus LS-SVR technique was then used to build the calibration model. After optimization of the parameters involved in the modeling, a satisfactory model was obtained for predicting the content of reducing sugar in plant samples. The predicted results show that the consensus LS-SVR model is more robust and reliable than the conventional partial least squares (PLS) and LS-SVR methods.
Graphical Evaluation of the Ridge-Type Robust Regression Estimators in Mixture Experiments
Erkoc, Ali; Emiroglu, Esra
2014-01-01
In mixture experiments, estimation of the parameters is generally based on ordinary least squares (OLS). However, in the presence of multicollinearity and outliers, OLS can result in very poor estimates. In this case, effects due to the combined outlier-multicollinearity problem can be reduced to a certain extent by using alternative approaches. One of these approaches is to use biased-robust regression techniques for the estimation of parameters. In this paper, we evaluate various ridge-type robust estimators in the cases where there are multicollinearity and outliers during the analysis of mixture experiments. Also, for selection of the biasing parameter, we use fraction of design space plots for evaluating the effect of the ridge-type robust estimators with respect to the scaled mean squared error of prediction. The suggested graphical approach is illustrated on the Hald cement data set. PMID:25202738
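As background for the "biasing parameter" mentioned above, the ordinary ridge estimator that ridge-type methods build on can be written as follows. The notation (design matrix X, response y, identity I, biasing parameter k) is assumed here rather than taken from the paper, and the robust variants it evaluates modify this form in ways not shown.

```latex
\hat{\beta}_{\mathrm{ridge}}(k) = \left(X^{\top}X + kI\right)^{-1} X^{\top} y, \qquad k \ge 0
```

Setting k = 0 recovers the OLS estimator; larger k trades some bias for reduced variance when the columns of X are nearly collinear.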
Graphical evaluation of the ridge-type robust regression estimators in mixture experiments.
Erkoc, Ali; Emiroglu, Esra; Akay, Kadri Ulas
2014-01-01
In mixture experiments, estimation of the parameters is generally based on ordinary least squares (OLS). However, in the presence of multicollinearity and outliers, OLS can result in very poor estimates. In this case, effects due to the combined outlier-multicollinearity problem can be reduced to a certain extent by using alternative approaches. One of these approaches is to use biased-robust regression techniques for the estimation of parameters. In this paper, we evaluate various ridge-type robust estimators in the cases where there are multicollinearity and outliers during the analysis of mixture experiments. Also, for selection of the biasing parameter, we use fraction of design space plots for evaluating the effect of the ridge-type robust estimators with respect to the scaled mean squared error of prediction. The suggested graphical approach is illustrated on the Hald cement data set.
NASA Technical Reports Server (NTRS)
Whitlock, C. H., III
1977-01-01
Constituents with linear radiance gradients with concentration may be quantified from signals which contain nonlinear atmospheric and surface reflection effects for both homogeneous and non-homogeneous water bodies provided accurate data can be obtained and nonlinearities are constant with wavelength. Statistical parameters must be used which give an indication of bias as well as total squared error to ensure that an equation with an optimum combination of bands is selected. It is concluded that the effect of error in upwelled radiance measurements is to reduce the accuracy of the least squares fitting process and to increase the number of points required to obtain a satisfactory fit. The problem of obtaining a multiple regression equation that is extremely sensitive to error is discussed.
Mukaratirwa, S; Chitanga, S; Chimatira, T; Makuleke, C; Sayi, S T; Bhebhe, E
2009-06-01
Therapeutic efficacy and histological changes after bacillus Calmette-Guerin (BCG), vincristine and BCG/vincristine combination therapy of canine transmissible venereal tumours (CTVT) were studied. Twenty dogs with naturally occurring CTVT in the progression stage were divided into 4 groups and treated with intratumoral BCG, vincristine, BCG/vincristine combination therapy or intratumoral buffered saline (control group). Tumour sizes were determined weekly and tumour response to therapy was assessed. Tumour biopsies were taken weekly to evaluate histological changes. Complete tumour regression was observed in all the dogs treated with BCG, vincristine and BCG/vincristine combination therapy. BCG/vincristine combination therapy had a statistically significantly shorter regression time than BCG or vincristine therapy. No tumour regression was observed in the control group. Intratumoral BCG treatment resulted in the appearance of macrophages and increased numbers of tumour infiltrating lymphocytes (TILs) followed by tumour cell apoptosis and necrosis. Treatment with vincristine resulted in increased tumour cell apoptosis, reduction in the mitotic index and a decrease in the number of TILs. Tumours from dogs on BCG/vincristine combination were characterised by reduction in the mitotic index, and appearance of numerous TILs and macrophages followed by marked tumour cell apoptosis and necrosis. This study indicates that combined BCG and vincristine therapy is more effective than vincristine in treating CTVT, suggesting that the clinical course of this disease may be altered by immunochemotherapy.
Optimizing Hybrid Metrology: Rigorous Implementation of Bayesian and Combined Regression
Henn, Mark-Alexander; Silver, Richard M.; Villarrubia, John S.; Zhang, Nien Fan; Zhou, Hui; Barnes, Bryan M.; Ming, Bin; Vladár, András E.
2015-01-01
Hybrid metrology, e.g., the combination of several measurement techniques to determine critical dimensions, is an increasingly important approach to meet the needs of the semiconductor industry. A proper use of hybrid metrology may yield not only more reliable estimates for the quantitative characterization of 3-D structures but also a more realistic estimation of the corresponding uncertainties. Recent developments at the National Institute of Standards and Technology (NIST) feature the combination of optical critical dimension (OCD) measurements and scanning electron microscope (SEM) results. The hybrid methodology offers the potential to make measurements of essential 3-D attributes that may not be otherwise feasible. However, combining techniques gives rise to essential challenges in error analysis and comparing results from different instrument models, especially the effect of systematic and highly correlated errors in the measurement on the χ2 function that is minimized. Both hypothetical examples and measurement data are used to illustrate solutions to these challenges. PMID:26681991
Weitz, Erica; Kleiboer, Annet; van Straten, Annemieke; Hollon, Steven D; Cuijpers, Pim
2017-02-13
There are many proven treatments (psychotherapy, pharmacotherapy or their combination) for the treatment of depression. Although there is growing evidence for the effectiveness of combination treatment (psychotherapy + pharmacotherapy) over pharmacotherapy alone, psychotherapy alone or psychotherapy plus pill placebo, for depression, little is known about which specific groups of patients may respond best to combined treatment versus monotherapy. Conventional meta-analyses techniques have limitations when tasked with examining whether specific individual characteristics moderate the effect of treatment on depression. Therefore, this protocol outlines an individual patient data (IPD) meta-analysis to explore which patients, with which clinical characteristics, have better outcomes in combined treatment compared with psychotherapy (alone or with pill placebo), pharmacotherapy and pill placebo. Study searches are completed using an established database of randomised controlled trials (RCTs) on the psychological treatment of adult depression that has previously been reported. Searches were conducted in PubMed, PsycInfo, Embase and the Cochrane Central Register of Controlled Trials. RCTs comparing combination treatment (psychotherapy + pharmacotherapy) with psychotherapy (with or without pill placebo), pharmacotherapy or pill placebo for the treatment of adult depression will be included. Study authors of eligible trials will be contacted and asked to contribute IPD. Conventional meta-analysis techniques will be used to examine differences between studies that have contributed data and those that did not. Then, IPD will be harmonised and analysis using multilevel regression will be conducted to examine effect moderators of treatment outcomes. Study results outlined above will be published in peer-reviewed journals. 
Study results will contribute to better understanding whether certain patients respond best to combined treatment or other depression treatments and provide new information on moderators of treatment outcome that can be used by patients, clinicians and researchers. CRD42016039028. Published by the BMJ Publishing Group Limited.
Eng, K.; Milly, P.C.D.; Tasker, Gary D.
2007-01-01
To facilitate estimation of streamflow characteristics at an ungauged site, hydrologists often define a region of influence containing gauged sites hydrologically similar to the estimation site. This region can be defined either in geographic space or in the space of the variables that are used to predict streamflow (predictor variables). These approaches are complementary, and a combination of the two may be superior to either. Here we propose a hybrid region-of-influence (HRoI) regression method that combines the two approaches. The new method was applied with streamflow records from 1,091 gauges in the southeastern United States to estimate the 50-year peak flow (Q50). The HRoI approach yielded lower root-mean-square estimation errors and produced fewer extreme errors than either the predictor-variable or geographic region-of-influence approaches. It is concluded, for Q50 in the study region, that similarity with respect to the basin characteristics considered (area, slope, and annual precipitation) is important, but incomplete, and that the consideration of geographic proximity of stations provides a useful surrogate for characteristics that are not included in the analysis. © 2007 ASCE.
Wang, Yubo; Veluvolu, Kalyana C
2017-06-14
It is often difficult to analyze biological signals because of their nonlinear and non-stationary characteristics. This necessitates the use of time-frequency decomposition methods for analyzing the subtle changes in these signals that are often connected to underlying phenomena. This paper presents a new approach to analyze the time-varying characteristics of such signals by employing a simple truncated Fourier series model, namely the band-limited multiple Fourier linear combiner (BMFLC). In contrast to the earlier designs, we first identified the sparsity imposed on the signal model in order to reformulate the model to a sparse linear regression model. The coefficients of the proposed model are then estimated by a convex optimization algorithm. The performance of the proposed method was analyzed with benchmark test signals. An energy ratio metric is employed to quantify the spectral performance and results show that the proposed method, Sparse-BMFLC, has a high mean energy ratio (0.9976) and outperforms existing methods such as the short-time Fourier transform (STFT), continuous wavelet transform (CWT) and BMFLC Kalman Smoother. Furthermore, the proposed method provides an overall 6.22% in reconstruction error.
A non-linear data mining parameter selection algorithm for continuous variables
Razavi, Marianne; Brady, Sean
2017-01-01
In this article, we propose a new data mining algorithm that can both capture the non-linearity in data and find the best subset model. To produce an enhanced subset of the original variables, a preferred selection method should have the potential of adding a supplementary level of regression analysis that would capture complex relationships in the data via mathematical transformation of the predictors and exploration of synergistic effects of combined variables. The method that we present here has the potential to produce an optimal subset of variables, rendering the overall process of model selection more efficient. This algorithm introduces interpretable parameters by transforming the original inputs while also providing a faithful fit to the data. The core objective of this paper is to introduce a new estimation technique for the classical least squares regression framework. This new automatic variable transformation and model selection method could offer an optimal and stable model that minimizes the mean square error and variability, while combining all-possible-subsets selection methodology with the inclusion of variable transformations and interactions. Moreover, this method controls multicollinearity, leading to an optimal set of explanatory variables. PMID:29131829
Robson, Andrew; Robson, Fiona
2015-01-01
To identify the combination of variables that explain nurses' continuation intention in the UK National Health Service. This alternative arena has permitted the replication of a private sector Australian study. This study provides understanding about the issues that affect nurse retention in a sector where employee attrition is a key challenge, further exacerbated by an ageing workforce. A quantitative study based on a self-completion survey questionnaire completed in 2010. Nurses employed in two UK National Health Service Foundation Trusts were surveyed and assessed using seven work-related constructs and various demographics including age generation. Through correlation, multiple regression and stepwise regression analysis, the potential combined effect of various explanatory variables on continuation intention was assessed, across the entire nursing cohort and in three age-generation groups. Three variables act in combination to explain continuation intention: work-family conflict, work attachment and importance of work to the individual. This combination of significant explanatory variables was consistent across the three generations of nursing employee. Work attachment was identified as the strongest marginal predictor of continuation intention. Work orientation has a greater impact on continuation intention compared with employer-directed interventions such as leader-member exchange, teamwork and autonomy. UK nurses are homogeneous across the three age-generations regarding explanation of continuation intention, with the significant explanatory measures being recognizably narrower in their focus and more greatly concentrated on the individual. This suggests that differentiated approaches to retention should perhaps not be pursued in this sectoral context. © 2014 John Wiley & Sons Ltd.
Lee, Bandy X; Marotta, Phillip L; Blay-Tofey, Morkeh; Wang, Winnie; de Bourmont, Shalila
2014-01-01
Our goal was to identify if there might be advantages to combining two major public health concerns, i.e., homicides and suicides, in an analysis with well-established macro-level economic determinants, i.e., unemployment and inequality. Mortality data, unemployment statistics, and inequality measures were obtained for 40 countries for the years 1962-2008. Rates of combined homicide and suicide, ratio of suicide to combined violent death, and ratio between homicide and suicide were graphed and analyzed. A fixed effects regression model was then performed for unemployment rates and Gini coefficients on homicide, suicide, and combined death rates. For a majority of nation states, suicide comprised a substantial proportion (mean 75.51%; range 0-99%) of the combined rate of homicide and suicide. When combined, a small but significant relationship emerged between logged Gini coefficient and combined death rates (0.0066, p < 0.05), suggesting that the combined rate improves the ability to detect a significant relationship when compared to either rate measurement alone. Results were duplicated by age group, whereby combining death rates into a single measure improved statistical power, provided that the association was strong. Violent deaths, when combined, were associated with an increase in unemployment and an increase in Gini coefficient, creating a more robust variable. As the effects of macro-level factors (e.g., social and economic policies) on violent death rates in a population are shown to be more significant than those of micro-level influences (e.g., individual characteristics), these associations may be useful to discover. An expansion of socioeconomic variables and the inclusion of other forms of violence in future research could help elucidate long-term trends.
Lee, Bandy X.; Marotta, Phillip L.; Blay-Tofey, Morkeh; Wang, Winnie; de Bourmont, Shalila
2015-01-01
Objectives Our goal was to identify if there might be advantages to combining two major public health concerns, i.e., homicides and suicides, in an analysis with well-established macro-level economic determinants, i.e., unemployment and inequality. Methods Mortality data, unemployment statistics, and inequality measures were obtained for 40 countries for the years 1962–2008. Rates of combined homicide and suicide, ratio of suicide to combined violent death, and ratio between homicide and suicide were graphed and analyzed. A fixed effects regression model was then performed for unemployment rates and Gini coefficients on homicide, suicide, and combined death rates. Results For a majority of nation states, suicide comprised a substantial proportion (mean 75.51%; range 0–99%) of the combined rate of homicide and suicide. When combined, a small but significant relationship emerged between logged Gini coefficient and combined death rates (0.0066, p < 0.05), suggesting that the combined rate improves the ability to detect a significant relationship when compared to either rate measurement alone. Results were duplicated by age group, whereby combining death rates into a single measure improved statistical power, provided that the association was strong. Conclusions Violent deaths, when combined, were associated with an increase in unemployment and an increase in Gini coefficient, creating a more robust variable. As the effects of macro-level factors (e.g., social and economic policies) on violent death rates in a population are shown to be more significant than those of micro-level influences (e.g., individual characteristics), these associations may be useful to discover. An expansion of socioeconomic variables and the inclusion of other forms of violence in future research could help elucidate long-term trends. PMID:26028985
A statistical method for predicting seizure onset zones from human single-neuron recordings
NASA Astrophysics Data System (ADS)
Valdez, André B.; Hickman, Erin N.; Treiman, David M.; Smith, Kris A.; Steinmetz, Peter N.
2013-02-01
Objective. Clinicians often use depth-electrode recordings to localize human epileptogenic foci. To advance the diagnostic value of these recordings, we applied logistic regression models to single-neuron recordings from depth-electrode microwires to predict seizure onset zones (SOZs). Approach. We collected data from 17 epilepsy patients at the Barrow Neurological Institute and developed logistic regression models to calculate the odds of observing SOZs in the hippocampus, amygdala and ventromedial prefrontal cortex, based on statistics such as the burst interspike interval (ISI). Main results. Analysis of these models showed that, for a single-unit increase in burst ISI ratio, the left hippocampus was approximately 12 times more likely to contain a SOZ; and the right amygdala, 14.5 times more likely. Our models were most accurate for the hippocampus bilaterally (at 85% average sensitivity), and performance was comparable with current diagnostics such as electroencephalography. Significance. Logistic regression models can be combined with single-neuron recording to predict likely SOZs in epilepsy patients being evaluated for resective surgery, providing an automated source of clinically useful information.
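As a rough illustration of how such figures relate to a fitted model: in logistic regression, a coefficient β on a predictor corresponds to an odds ratio of exp(β) per one-unit increase in that predictor. The sketch below assumes the reported "times more likely" values (12 and 14.5) are odds ratios, which the abstract does not state explicitly; the function names are hypothetical.

```python
import math


def coefficient_from_odds_ratio(odds_ratio):
    """Return the logistic regression coefficient implied by an odds ratio."""
    return math.log(odds_ratio)


def odds_after_increase(baseline_odds, coefficient, delta=1.0):
    """Return the odds after the predictor rises by `delta` units."""
    return baseline_odds * math.exp(coefficient * delta)


# Assumption: the abstract's 12x and 14.5x figures are treated as odds ratios.
beta_left_hippocampus = coefficient_from_odds_ratio(12.0)
beta_right_amygdala = coefficient_from_odds_ratio(14.5)
```

Under that assumption, the implied coefficients are the natural logarithms of 12 and 14.5, and a one-unit rise in burst ISI ratio multiplies any baseline odds by the corresponding odds ratio.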
Suresh, Arumuganainar; Choi, Hong Lim
2011-10-01
Swine waste land application has increased due to organic fertilization, but excess application in an arable system can cause environmental risk. Therefore, in situ characterizations of such resources are important prior to application. To explore this, 41 swine slurry samples were collected from Korea, and wide differences were observed in the physico-biochemical properties. However, significant (P<0.001) multiple property correlations (R²) were obtained between nutrients with specific gravity (SG), electrical conductivity (EC), total solids (TS) and pH. The different combinations of hydrometer, EC meter, drying oven and pH meter were found useful to estimate Mn, Fe, Ca, K, Al, Na, N and 5-day biochemical oxygen demands (BOD₅) at improved R² values of 0.83, 0.82, 0.77, 0.75, 0.67, 0.47, 0.88 and 0.70, respectively. The results from this study suggest that multiple property regressions can facilitate the prediction of micronutrients and organic matter much better than a single property regression for livestock waste. Copyright © 2011 Elsevier Ltd. All rights reserved.
Reconstruction of missing daily streamflow data using dynamic regression models
NASA Astrophysics Data System (ADS)
Tencaliec, Patricia; Favre, Anne-Catherine; Prieur, Clémentine; Mathevet, Thibault
2015-12-01
River discharge is one of the most important quantities in hydrology. It provides fundamental records for water resources management and climate change monitoring. Even very short data-gaps in this information can cause extremely different analysis outputs. Therefore, reconstructing missing data of incomplete data sets is an important step regarding the performance of the environmental models, engineering, and research applications, thus it presents a great challenge. The objective of this paper is to introduce an effective technique for reconstructing missing daily discharge data when one has access to only daily streamflow data. The proposed procedure uses a combination of regression and autoregressive integrated moving average models (ARIMA) called dynamic regression model. This model uses the linear relationship between neighbor and correlated stations and then adjusts the residual term by fitting an ARIMA structure. Application of the model to eight daily streamflow data for the Durance river watershed showed that the model yields reliable estimates for the missing data in the time series. Simulation studies were also conducted to evaluate the performance of the procedure.
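The two-part idea described above (a linear relationship with a neighboring station, plus a time-series structure on the residual term) can be sketched in a heavily simplified form. This is not the authors' procedure: it assumes a single neighbor station, replaces the ARIMA residual model with a plain AR(1), and estimates the two parts in sequence rather than jointly. All function names are hypothetical.

```python
def fit_ols(x, y):
    """Least-squares intercept and slope for y = a + b * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def fit_ar1(pairs):
    """AR(1) coefficient from consecutive residual pairs (e[t-1], e[t])."""
    numerator = sum(prev * cur for prev, cur in pairs)
    denominator = sum(prev * prev for prev, _ in pairs)
    return numerator / denominator if denominator else 0.0


def reconstruct(target, neighbor):
    """Fill None gaps in `target` using a complete `neighbor` series."""
    observed = [(x, y) for x, y in zip(neighbor, target) if y is not None]
    intercept, slope = fit_ols([x for x, _ in observed],
                               [y for _, y in observed])
    # Regression residuals where the target is observed, None in the gaps.
    resid = [None if y is None else y - (intercept + slope * x)
             for x, y in zip(neighbor, target)]
    pairs = [(resid[t - 1], resid[t]) for t in range(1, len(resid))
             if resid[t - 1] is not None and resid[t] is not None]
    phi = fit_ar1(pairs)
    filled = list(target)
    for t, y in enumerate(target):
        if y is None:
            # Propagate the previous residual forward through the AR(1) term.
            previous = resid[t - 1] if t > 0 and resid[t - 1] is not None else 0.0
            resid[t] = phi * previous
            filled[t] = intercept + slope * neighbor[t] + resid[t]
    return filled


# Toy series (invented for illustration): target = 2 * neighbor + 1, one gap.
neighbor_flow = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
target_flow = [3.0, 5.0, None, 9.0, 11.0, 13.0]
filled_flow = reconstruct(target_flow, neighbor_flow)
```

In practice one would fit the regression and the error model jointly with a dedicated time-series library; the sequential version here is only meant to make the structure of the model visible.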
Xie, Jiangan; Zhao, Lili; Zhou, Shangbo; He, Yongqun
2016-01-01
Vaccinations often induce various adverse events (AEs), and sometimes serious AEs (SAEs). While many vaccines are used in combination, the effects of vaccine-vaccine interactions (VVIs) on vaccine AEs are rarely studied. In this study, AE profiles induced by hepatitis A vaccine (Havrix), hepatitis B vaccine (Engerix-B), and hepatitis A and B combination vaccine (Twinrix) were studied using the VAERS data. From May 2001 to January 2015, VAERS recorded 941, 3,885, and 1,624 AE case reports where patients aged at least 18 years old were vaccinated with only Havrix, Engerix-B, and Twinrix, respectively. Using these data, our statistical analysis identified 46, 69, and 82 AEs significantly associated with Havrix, Engerix-B, and Twinrix, respectively. Based on the Ontology of Adverse Events (OAE) hierarchical classification, these AEs were enriched in the AEs related to behavioral and neurological conditions, immune system, and investigation results. Twenty-nine AEs were classified as SAEs and mainly related to immune conditions. Using a logistic regression model accompanied with MCMC sampling, 13 AEs (e.g., hepatosplenomegaly) were identified to result from VVI synergistic effects. Classifications of these 13 AEs using OAE and MedDRA hierarchies confirmed the advantages of the OAE-based method over MedDRA in AE term hierarchical analysis. PMID:27694888
Smith, E M D; Jorgensen, A L; Beresford, M W
2017-10-01
Background Lupus nephritis (LN) affects up to 80% of juvenile-onset systemic lupus erythematosus (JSLE) patients. The value of commonly available biomarkers, such as anti-dsDNA antibodies, complement (C3/C4), ESR and full blood count parameters in the identification of active LN remains uncertain. Methods Participants from the UK JSLE Cohort Study, aged <16 years at diagnosis, were categorized as having active or inactive LN according to the renal domain of the British Isles Lupus Assessment Group score. Classic biomarkers: anti-dsDNA, C3, C4, ESR, CRP, haemoglobin, total white cells, neutrophils, lymphocytes, platelets and immunoglobulins were assessed for their ability to identify active LN using binary logistic regression modeling, with stepAIC function applied to select a final model. Receiver-operating curve analysis was used to assess diagnostic accuracy. Results A total of 370 patients were recruited; 191 (52%) had active LN and 179 (48%) had inactive LN. Binary logistic regression modeling demonstrated a combination of ESR, C3, white cell count, neutrophils, lymphocytes and IgG to be best for the identification of active LN (area under the curve 0.724). Conclusions At best, combining common classic blood biomarkers of lupus activity using multivariate analysis provides a 'fair' ability to identify active LN. Urine biomarkers were not included in these analyses. These results add to the concern that classic blood biomarkers are limited in monitoring discrete JSLE manifestations such as LN.
Lina, Xu; Feng, Li; Yanyun, Zhang; Nan, Gao; Mingfang, Hu
2016-12-01
To explore the phonological characteristics and rehabilitation training of abnormal velar articulation in patients with functional articulation disorders (FAD). The phonological characteristics of velar consonants were observed in eighty-seven patients with FAD. Seventy-two patients with abnormal velar articulation received speech training. Correlation and simple linear regression analyses were carried out on abnormal velar articulation and age. The articulation disorder of /g/ mainly showed replacement by /d/, /b/ or omission. /k/ mainly showed replacement by /d/, /t/, /g/, /p/, /b/. /h/ mainly showed replacement by /g/, /f/, /p/, /b/ or omission. The common erroneous articulation forms of /g/, /k/, /h/ were fronting of the tongue and replacement by bilabial consonants. When velars were combined with vowels containing /a/ and /e/, the main error was fronting of the tongue. When velars were combined with vowels containing /u/, the errors tended to be replacement by bilabial consonants. After 3 to 10 sessions of speech training, the number of erroneous words decreased to (6.24±2.61) from (40.28±6.08) before speech training, and the difference was statistically significant (Z=-7.379, P=0.000). The number of erroneous words was negatively correlated with age (r=-0.691, P=0.000). Simple linear regression analysis gave a coefficient of determination of 0.472. Velar articulation disorders mainly present as replacement, which varies with the accompanying vowel. The targeted rehabilitation training established here is significantly effective. Age plays an important role in the outcome of velar articulation.
An Update on Statistical Boosting in Biomedicine.
Mayr, Andreas; Hofner, Benjamin; Waldmann, Elisabeth; Hepp, Tobias; Meyer, Sebastian; Gefeller, Olaf
2017-01-01
Statistical boosting algorithms have triggered a lot of research during the last decade. They combine a powerful machine learning approach with classical statistical modelling, offering various practical advantages like automated variable selection and implicit regularization of effect estimates. They are extremely flexible, as the underlying base-learners (regression functions defining the type of effect for the explanatory variables) can be combined with any kind of loss function (target function to be optimized, defining the type of regression setting). In this review article, we highlight the most recent methodological developments on statistical boosting regarding variable selection, functional regression, and advanced time-to-event modelling. Additionally, we provide a short overview on relevant applications of statistical boosting in biomedicine.
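The "automated variable selection" property mentioned above can be seen in a minimal sketch of component-wise gradient boosting with squared-error loss and simple linear base-learners. This is a generic textbook form, not code from the review; the parameter names `n_steps` and `nu` (step length) and the toy data are assumptions made for illustration.

```python
def componentwise_l2_boost(X, y, n_steps=200, nu=0.1):
    """Component-wise boosting with squared-error loss.

    X is a list of rows with centered columns; y is a list of responses.
    Returns (intercept, coefficients).
    """
    n, p = len(y), len(X[0])
    intercept = sum(y) / n          # offset: the mean response
    coefs = [0.0] * p
    residuals = [yi - intercept for yi in y]
    columns = [[row[j] for row in X] for j in range(p)]
    for _ in range(n_steps):
        best_j, best_beta, best_sse = None, 0.0, float("inf")
        # Fit every base-learner to the current residuals, keep the best one.
        for j, col in enumerate(columns):
            denom = sum(v * v for v in col)
            if denom == 0.0:
                continue
            beta = sum(v * r for v, r in zip(col, residuals)) / denom
            sse = sum((r - beta * v) ** 2 for v, r in zip(col, residuals))
            if sse < best_sse:
                best_j, best_beta, best_sse = j, beta, sse
        if best_j is None:
            break
        # Update only the selected component by a small step.
        coefs[best_j] += nu * best_beta
        residuals = [r - nu * best_beta * v
                     for v, r in zip(columns[best_j], residuals)]
    return intercept, coefs


# Toy data (invented): y depends on the first column only.
X_toy = [[-2.0, 1.0], [-1.0, -1.0], [0.0, 0.0], [1.0, -1.0], [2.0, 1.0]]
y_toy = [-1.0, 1.0, 3.0, 5.0, 7.0]
intercept_toy, coefs_toy = componentwise_l2_boost(X_toy, y_toy)
```

Because only one component is updated per iteration, a predictor that never gives the best fit keeps a coefficient of exactly zero, and stopping early shrinks the selected coefficients toward zero.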
Kabeshova, A; Annweiler, C; Fantino, B; Philip, T; Gromov, V A; Launay, C P; Beauchet, O
2014-06-01
Regression tree (RT) analyses are particularly adapted to explore the risk of recurrent falling according to various combinations of fall risk factors compared to logistic regression models. The aims of this study were (1) to determine which combinations of fall risk factors were associated with the occurrence of recurrent falls in older community-dwellers, and (2) to compare the efficacy of RT and multiple logistic regression model for the identification of recurrent falls. A total of 1,760 community-dwelling volunteers (mean age ± standard deviation, 71.0 ± 5.1 years; 49.4 % female) were recruited prospectively in this cross-sectional study. Age, gender, polypharmacy, use of psychoactive drugs, fear of falling (FOF), cognitive disorders and sad mood were recorded. In addition, the history of falls within the past year was recorded using a standardized questionnaire. Among 1,760 participants, 19.7 % (n = 346) were recurrent fallers. The RT identified 14 node groups and 8 end nodes with FOF as the first major split. Among participants with FOF, those who had sad mood and polypharmacy formed the end node with the greatest OR for recurrent falls (OR = 6.06 with p < 0.001). Among participants without FOF, those who were male and not sad had the lowest OR for recurrent falls (OR = 0.25 with p < 0.001). The RT correctly classified 1,356 of 1,414 non-recurrent fallers (specificity = 95.6 %), and 65 of 346 recurrent fallers (sensitivity = 18.8 %). The overall classification accuracy was 81.0 %. The multiple logistic regression correctly classified 1,372 of 1,414 non-recurrent fallers (specificity = 97.0 %), and 61 of 346 recurrent fallers (sensitivity = 17.6 %). The overall classification accuracy was 81.4 %. Our results show that RT may identify specific combinations of risk factors for recurrent falls, the combination most associated with recurrent falls involving FOF, sad mood and polypharmacy. The FOF emerged as the risk factor strongly associated with recurrent falls. In addition, RT and multiple logistic regression were not sensitive enough to identify the majority of recurrent fallers but appeared efficient in detecting individuals not at risk of recurrent falls.
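The specificity, sensitivity and accuracy quoted in this abstract are simple ratios of classification counts. As a minimal sketch, the Python snippet below recomputes them from the counts reported for the multiple logistic regression model; the function name `classification_rates` is illustrative and does not come from the study.

```python
def classification_rates(true_neg, total_neg, true_pos, total_pos):
    """Return (specificity, sensitivity, accuracy) as fractions."""
    specificity = true_neg / total_neg   # non-recurrent fallers correctly classified
    sensitivity = true_pos / total_pos   # recurrent fallers correctly classified
    accuracy = (true_neg + true_pos) / (total_neg + total_pos)
    return specificity, sensitivity, accuracy


# Counts reported in the abstract for the multiple logistic regression model.
spec, sens, acc = classification_rates(1372, 1414, 61, 346)
print(f"specificity={spec:.1%} sensitivity={sens:.1%} accuracy={acc:.1%}")
```

High specificity combined with low sensitivity is what the abstract describes: both models rarely flag non-fallers, but they miss most recurrent fallers.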
NASA Astrophysics Data System (ADS)
Srinivas, Kadivendi; Vundavilli, Pandu R.; Manzoor Hussain, M.; Saiteja, M.
2016-09-01
Welding input parameters such as current, gas flow rate and torch angle play a significant role in determining the mechanical properties of a weld joint. Traditionally, the weld input parameters must be determined for every new welded product to obtain a quality weld joint, which is time consuming. In the present work, the effect of plasma arc welding parameters on mild steel was studied using a neural network approach. To obtain a response equation that governs the input-output relationships, conventional regression analysis was also performed. The experimental data were constructed based on a Taguchi design, and the training data required for the neural networks were randomly generated by varying the input variables within their respective ranges. The responses were calculated for each combination of input variables by using the response equations obtained through the conventional regression analysis. The performances of the Levenberg-Marquardt back propagation neural network and the radial basis neural network (RBNN) were compared on various randomly generated test cases, which are different from the training cases. From the results, it is interesting to note that for these test cases the RBNN analysis gave improved training results compared to those of the feed forward back propagation neural network analysis. Also, the RBNN analysis showed a pattern of increasing performance as the data points moved away from the initial input values.
Feng, Yongjiu; Tong, Xiaohua
2017-09-22
Defining transition rules is an important issue in cellular automaton (CA)-based land use modeling because these models incorporate highly correlated driving factors. Multicollinearity among correlated driving factors may produce negative effects that must be eliminated from the modeling. Using exploratory regression under pre-defined criteria, we identified all possible combinations of factors from the candidate factors affecting land use change. Three combinations that incorporate five driving factors meeting pre-defined criteria were assessed. With the selected combinations of factors, three logistic regression-based CA models were built to simulate dynamic land use change in Shanghai, China, from 2000 to 2015. For comparative purposes, a CA model with all candidate factors was also applied to simulate the land use change. Simulations using three CA models with multicollinearity eliminated performed better (with accuracy improvements about 3.6%) than the model incorporating all candidate factors. Our results showed that not all candidate factors are necessary for accurate CA modeling and the simulations were not sensitive to changes in statistically non-significant driving factors. We conclude that exploratory regression is an effective method to search for the optimal combinations of driving factors, leading to better land use change models that are devoid of multicollinearity. We suggest identification of dominant factors and elimination of multicollinearity before building land change models, making it possible to simulate more realistic outcomes.
Ifoulis, A A; Savopoulou-Soultani, M
2006-10-01
The purpose of this research was to quantify the spatial pattern and develop a sampling program for larvae of Lobesia botrana Denis and Schiffermüller (Lepidoptera: Tortricidae), an important vineyard pest in northern Greece. Taylor's power law and Iwao's patchiness regression were used to model the relationship between the mean and the variance of larval counts. Analysis of covariance was carried out, separately for infestation and injury, with combined second and third generation data, for vine and half-vine sample units. Common regression coefficients were estimated to permit use of the sampling plan over a wide range of conditions. Optimum sample sizes for infestation and injury, at three levels of precision, were developed. An investigation of a multistage sampling plan with a nested analysis of variance showed that if the goal of sampling is focusing on larval infestation, three grape clusters should be sampled in a half-vine; if the goal of sampling is focusing on injury, then two grape clusters per half-vine are recommended.
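Taylor's power law relates the variance of counts to their mean as variance = a * mean^b, so a and b can be estimated by ordinary least squares on the log-log scale. The sketch below illustrates that fit in plain Python; the data are hypothetical values generated from a known law, not the larval counts from the study, and the function name is an assumption.

```python
import math


def fit_taylor_power_law(means, variances):
    """Fit variance = a * mean**b by least squares on log(variance) vs log(mean)."""
    xs = [math.log(m) for m in means]
    ys = [math.log(v) for v in variances]
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b = sxy / sxx                      # slope: aggregation index
    a = math.exp(y_bar - b * x_bar)    # back-transformed intercept
    return a, b


# Hypothetical sample means and variances that follow variance = 2 * mean**1.5.
means = [0.5, 1.0, 2.0, 4.0, 8.0]
variances = [2.0 * m ** 1.5 for m in means]
a, b = fit_taylor_power_law(means, variances)
print(f"a={a:.3f}, b={b:.3f}")
```

A slope b greater than 1 is conventionally read as an aggregated spatial pattern, which is why the fitted coefficients feed directly into sample-size calculations.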
Eutrophication risk assessment in coastal embayments using simple statistical models.
Arhonditsis, G; Eleftheriadou, M; Karydis, M; Tsirtsis, G
2003-09-01
A statistical methodology is proposed for assessing the risk of eutrophication in marine coastal embayments. The procedure followed was the development of regression models relating the levels of chlorophyll a (Chl) with the concentration of the limiting nutrient--usually nitrogen--and the renewal rate of the systems. The method was applied in the Gulf of Gera, Island of Lesvos, Aegean Sea and a surrogate for renewal rate was created using the Canberra metric as a measure of the resemblance between the Gulf and the oligotrophic waters of the open sea in terms of their physical, chemical and biological properties. The Chl-total dissolved nitrogen-renewal rate regression model was the most significant, accounting for 60% of the variation observed in Chl. Predicted distributions of Chl for various combinations of the independent variables, based on Bayesian analysis of the models, enabled comparison of the outcomes of specific scenarios of interest as well as further analysis of the system dynamics. The present statistical approach can be used as a methodological tool for testing the resilience of coastal ecosystems under alternative managerial schemes and levels of exogenous nutrient loading.
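The Canberra metric mentioned here is the sum, over variables, of |x_i - y_i| / (|x_i| + |y_i|). The following Python sketch shows the calculation; the two vectors are hypothetical property values for two water bodies, and treating a 0/0 term as zero is a common convention assumed here rather than something stated in the abstract.

```python
def canberra_distance(x, y):
    """Canberra distance between two equal-length numeric sequences."""
    total = 0.0
    for xi, yi in zip(x, y):
        denom = abs(xi) + abs(yi)
        if denom > 0:                  # assumed convention: a 0/0 term contributes 0
            total += abs(xi - yi) / denom
    return total


# Hypothetical measurements of three properties in two water bodies.
gulf = [1.0, 2.0, 3.0]
open_sea = [3.0, 2.0, 1.0]
print(canberra_distance(gulf, open_sea))
```

Because each term is scaled by the magnitude of the values compared, variables measured in different units contribute on a comparable footing, which suits a mix of physical, chemical and biological properties.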
Linking brain-wide multivoxel activation patterns to behaviour: Examples from language and math.
Raizada, Rajeev D S; Tsao, Feng-Ming; Liu, Huei-Mei; Holloway, Ian D; Ansari, Daniel; Kuhl, Patricia K
2010-05-15
A key goal of cognitive neuroscience is to find simple and direct connections between brain and behaviour. However, fMRI analysis typically involves choices between many possible options, with each choice potentially biasing any brain-behaviour correlations that emerge. Standard methods of fMRI analysis assess each voxel individually, but then face the problem of selection bias when combining those voxels into a region-of-interest, or ROI. Multivariate pattern-based fMRI analysis methods use classifiers to analyse multiple voxels together, but can also introduce selection bias via data-reduction steps as feature selection of voxels, pre-selecting activated regions, or principal components analysis. We show here that strong brain-behaviour links can be revealed without any voxel selection or data reduction, using just plain linear regression as a classifier applied to the whole brain at once, i.e. treating each entire brain volume as a single multi-voxel pattern. The brain-behaviour correlations emerged despite the fact that the classifier was not provided with any information at all about subjects' behaviour, but instead was given only the neural data and its condition-labels. Surprisingly, more powerful classifiers such as a linear SVM and regularised logistic regression produce very similar results. We discuss some possible reasons why the very simple brain-wide linear regression model is able to find correlations with behaviour that are as strong as those obtained on the one hand from a specific ROI and on the other hand from more complex classifiers. In a manner which is unencumbered by arbitrary choices, our approach offers a method for investigating connections between brain and behaviour which is simple, rigorous and direct. Copyright (c) 2010 Elsevier Inc. All rights reserved.
Nakajima, Hisato; Yano, Kouya; Nagasawa, Kaoko; Katou, Satoka; Yokota, Kuninobu
2017-01-01
The objective of this study is to examine the factors that influence the operation income and expenditure balance ratio of school corporations running university hospitals by multiple regression analysis. 1. We conducted cluster analysis of the financial ratio and classified the school corporations into those running colleges and universities. 2. We conducted multiple regression analysis using the operation income and expenditure balance ratio of the colleges as the dependent variable and the Diagnosis Procedure Combination data as the explanatory variables. 3. The predictive expression was used for multiple regression analysis. 1. The school corporations were divided into those running universities (7), colleges (20) and others. The medical income ratio and the debt ratio were high and the student payment ratio was low in the colleges. 2. The numbers of emergency care hospitalizations, operations, radiation therapies, and ambulance conveyances, and the complexity index had a positive influence on the operation income and expenditure balance ratio. On the other hand, the number of general anesthesia procedures, the cover rate index, and the emergency care index had a negative influence. 3. The predictive expression was as follows. Operation income and expenditure balance ratio = 0.027 × number of emergency care hospitalizations + 0.005 × number of operations + 0.019 × number of radiation therapies + 0.007 × number of ambulance conveyances - 0.003 × number of general anesthesia procedures + 648.344 × complexity index - 5877.210 × cover rate index - 2746.415 × emergency care index - 38.647. Conclusion: In colleges, the number of emergency care hospitalizations, the number of operations, the number of radiation therapies, and the number of ambulance conveyances and the complexity index were factors for gaining ordinary profit.
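The predictive expression reported in this abstract can be written as a small function. The sketch below copies the coefficients from the abstract; the function and parameter names are illustrative, and the units and scaling of each input are not given in the abstract, so any values passed in are hypothetical.

```python
def predicted_balance_ratio(
    emergency_hospitalizations,
    operations,
    radiation_therapies,
    ambulance_conveyances,
    general_anesthesia_procedures,
    complexity_index,
    cover_rate_index,
    emergency_care_index,
):
    """Evaluate the regression equation reported in the abstract."""
    return (
        0.027 * emergency_hospitalizations
        + 0.005 * operations
        + 0.019 * radiation_therapies
        + 0.007 * ambulance_conveyances
        - 0.003 * general_anesthesia_procedures
        + 648.344 * complexity_index
        - 5877.210 * cover_rate_index
        - 2746.415 * emergency_care_index
        - 38.647
    )


# Holding everything else fixed at zero, 1000 additional operations
# raise the predicted ratio by 0.005 * 1000 = 5 points.
baseline = predicted_balance_ratio(0, 0, 0, 0, 0, 0, 0, 0)
more_operations = predicted_balance_ratio(0, 1000, 0, 0, 0, 0, 0, 0)
print(baseline, more_operations - baseline)
```

Reading each coefficient as the change in the predicted ratio per unit change in one input, with the others held fixed, is the standard interpretation of a multiple regression equation.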
Linking brain-wide multivoxel activation patterns to behaviour: Examples from language and math
Raizada, Rajeev D.S.; Tsao, Feng-Ming; Liu, Huei-Mei; Holloway, Ian D.; Ansari, Daniel; Kuhl, Patricia K.
2010-01-01
A key goal of cognitive neuroscience is to find simple and direct connections between brain and behaviour. However, fMRI analysis typically involves choices between many possible options, with each choice potentially biasing any brain–behaviour correlations that emerge. Standard methods of fMRI analysis assess each voxel individually, but then face the problem of selection bias when combining those voxels into a region-of-interest, or ROI. Multivariate pattern-based fMRI analysis methods use classifiers to analyse multiple voxels together, but can also introduce selection bias via data-reduction steps as feature selection of voxels, pre-selecting activated regions, or principal components analysis. We show here that strong brain–behaviour links can be revealed without any voxel selection or data reduction, using just plain linear regression as a classifier applied to the whole brain at once, i.e. treating each entire brain volume as a single multi-voxel pattern. The brain–behaviour correlations emerged despite the fact that the classifier was not provided with any information at all about subjects' behaviour, but instead was given only the neural data and its condition-labels. Surprisingly, more powerful classifiers such as a linear SVM and regularised logistic regression produce very similar results. We discuss some possible reasons why the very simple brain-wide linear regression model is able to find correlations with behaviour that are as strong as those obtained on the one hand from a specific ROI and on the other hand from more complex classifiers. In a manner which is unencumbered by arbitrary choices, our approach offers a method for investigating connections between brain and behaviour which is simple, rigorous and direct. PMID:20132896
Hyper-Spectral Image Analysis With Partially Latent Regression and Spatial Markov Dependencies
NASA Astrophysics Data System (ADS)
Deleforge, Antoine; Forbes, Florence; Ba, Sileye; Horaud, Radu
2015-09-01
Hyper-spectral data can be analyzed to recover physical properties at large planetary scales. This involves resolving inverse problems which can be addressed within machine learning, with the advantage that, once a relationship between physical parameters and spectra has been established in a data-driven fashion, the learned relationship can be used to estimate physical parameters for new hyper-spectral observations. Within this framework, we propose a spatially-constrained and partially-latent regression method which maps high-dimensional inputs (hyper-spectral images) onto low-dimensional responses (physical parameters such as the local chemical composition of the soil). The proposed regression model comprises two key features. Firstly, it combines a Gaussian mixture of locally-linear mappings (GLLiM) with a partially-latent response model. While the former makes high-dimensional regression tractable, the latter makes it possible to deal with physical parameters that cannot be observed or, more generally, with data contaminated by experimental artifacts that cannot be explained with noise models. Secondly, spatial constraints are introduced in the model through a Markov random field (MRF) prior which provides a spatial structure to the Gaussian-mixture hidden variables. Experiments conducted on a database composed of remotely sensed observations of the planet Mars collected by the Mars Express orbiter demonstrate the effectiveness of the proposed model.
Regression: The Apple Does Not Fall Far From the Tree.
Vetter, Thomas R; Schober, Patrick
2018-05-15
Researchers and clinicians are frequently interested in either: (1) assessing whether there is a relationship or association between 2 or more variables and quantifying this association; or (2) determining whether 1 or more variables can predict another variable. The strength of such an association is mainly described by the correlation. However, regression analysis and regression models can be used not only to identify whether there is a significant relationship or association between variables but also to generate estimations of such a predictive relationship between variables. This basic statistical tutorial discusses the fundamental concepts and techniques related to the most common types of regression analysis and modeling, including simple linear regression, multiple regression, logistic regression, ordinal regression, and Poisson regression, as well as the common yet often underrecognized phenomenon of regression toward the mean. The various types of regression analysis are powerful statistical techniques, which when appropriately applied, can allow for the valid interpretation of complex, multifactorial data. Regression analysis and models can assess whether there is a relationship or association between 2 or more observed variables and estimate the strength of this association, as well as determine whether 1 or more variables can predict another variable. Regression is thus being applied more commonly in anesthesia, perioperative, critical care, and pain research. However, it is crucial to note that regression can identify plausible risk factors; it does not prove causation (a definitive cause and effect relationship). The results of a regression analysis instead identify independent (predictor) variable(s) associated with the dependent (outcome) variable. As with other statistical methods, applying regression requires that certain assumptions be met, which can be tested with specific diagnostics.
Theorizing Land Cover and Land Use Change: The Peasant Economy of Colonization in the Amazon Basin
NASA Technical Reports Server (NTRS)
Caldas, Marcellus; Walker, Robert; Arima, Eugenio; Perz, Stephen; Aldrich, Stephen; Simmons, Cynthia
2007-01-01
This paper addresses deforestation processes in the Amazon basin. It deploys a methodology combining remote sensing and survey-based fieldwork to examine, with regression analysis, the impact of household structure and economic circumstances on deforestation decisions made by colonist farmers in the forest frontiers of Brazil. Unlike most previous regression-based studies, the methodology implemented analyzes behavior at the level of the individual property. The regressions correct for endogenous relationships between key variables, and spatial autocorrelation, as necessary. Variables used in the analysis are specified, in part, by a theoretical development integrating the Chayanovian concept of the peasant household with spatial considerations stemming from von Thuenen. The results from the empirical model indicate that demographic characteristics of households, as well as market factors, affect deforestation in the Amazon. Thus, statistical results from studies that do not include household-scale information may be subject to error. From a policy perspective, the results suggest that environmental policies in the Amazon based on market incentives to small farmers may not be as effective as hoped, given the importance of household factors in catalyzing the demand for land. The paper concludes by noting that household decisions regarding land use and deforestation are not independent of broader social circumstances, and that a full understanding of Amazonian deforestation will require insight into why poor families find it necessary to settle the frontier in the first place.
Data Mining CMMSs: How to Convert Data into Knowledge.
Fennigkoh, Larry; Nanney, D Courtney
2018-01-01
Although the healthcare technology management (HTM) community has decades of accumulated medical device-related maintenance data, little knowledge has been gleaned from these data. Finding and extracting such knowledge requires the use of the well-established, but admittedly somewhat foreign to HTM, application of inferential statistics. This article sought to provide a basic background on inferential statistics and describe a case study of their application, limitations, and proper interpretation. The research question associated with this case study involved examining the effects of ventilator preventive maintenance (PM) labor hours, age, and manufacturer on needed unscheduled corrective maintenance (CM) labor hours. The study sample included more than 21,000 combined PM inspections and CM work orders on 2,045 ventilators from 26 manufacturers during a five-year period (2012-16). A multiple regression analysis revealed that device age, manufacturer, and accumulated PM inspection labor hours all influenced the amount of CM labor significantly (P < 0.001). In essence, CM labor hours increased with increasing PM labor. However, and despite the statistical significance of these predictors, the regression analysis also indicated that ventilator age, manufacturer, and PM labor hours only explained approximately 16% of all variability in CM labor, with the remainder (84%) caused by other factors that were not included in the study. As such, the regression model obtained here is not suitable for predicting ventilator CM labor hours.
Differential gene expression profiles of peripheral blood mononuclear cells in childhood asthma.
Kong, Qian; Li, Wen-Jing; Huang, Hua-Rong; Zhong, Ying-Qiang; Fang, Jian-Pei
2015-05-01
Asthma is a common childhood disease with strong genetic components. This study compared whole-genome expression differences between asthmatic young children and healthy controls to identify gene signatures of childhood asthma. Total RNA extracted from peripheral blood mononuclear cells (PBMC) was subjected to microarray analysis. QRT-PCR was performed to verify the microarray results. Classification and functional characterization of differential genes were illustrated by hierarchical clustering and gene ontology analysis. Multiple logistic regression (MLR) analysis, receiver operating characteristic (ROC) curve analysis, and discriminative power were used to screen for asthma-specific diagnostic markers. For fold-change > 2 and p < 0.05, there were 758 named differential genes. The QRT-PCR results successfully confirmed the array data. Hierarchical clustering divided 29 highly possible genes into seven categories, and the genes in the same cluster were likely to possess similar expression patterns or functions. Gene ontology analysis showed that the differential genes were primarily enriched in the biological processes of immune response, response to stress or stimulus, and regulation of apoptosis. MLR and ROC curve analysis revealed that the combination of ADAM33, Smad7, and LIGHT possessed excellent discriminating power. The combination of ADAM33, Smad7, and LIGHT would be a reliable and useful childhood asthma model for prediction and diagnosis.
Hip fractures are risky business: an analysis of the NSQIP data.
Sathiyakumar, Vasanth; Greenberg, Sarah E; Molina, Cesar S; Thakore, Rachel V; Obremskey, William T; Sethi, Manish K
2015-04-01
Hip fractures are one of the most common types of orthopaedic injury with high rates of morbidity. Currently, no study has compared risk factors and adverse events following the different types of hip fracture surgeries. The purpose of this paper is to investigate the major and minor adverse events and risk factors for complication development associated with five common surgeries for the treatment of hip fractures using the NSQIP database. Using the ACS-NSQIP database, complications for five forms of hip surgeries were selected and categorized into major and minor adverse events. Demographics and clinical variables were collected and unadjusted bivariate logistic regression analyses were performed to determine significant risk factors for adverse events. Five multivariate regressions were run for each surgery as well as a combined regression analysis. A total of 9640 patients undergoing surgery for hip fracture were identified with an adverse events rate of 25.2% (n=2433). Open reduction and internal fixation of a femoral neck fracture had the greatest percentage of all major events (16.6%) and total adverse events (27.4%), whereas partial hip hemiarthroplasty had the greatest percentage of all minor events (11.6%). Mortality was the most common major adverse event (44.9-50.6%). For minor complications, urinary tract infections were the most common minor adverse event (52.7-62.6%). Significant risk factors for development of any adverse event included age, BMI, gender, race, active smoking status, history of COPD, history of CHF, ASA score, dyspnoea, and functional status, with various combinations of these factors significantly affecting complication development for the individual surgeries. Hip fractures are associated with significantly high numbers of adverse events. The type of surgery affects the type of complications developed and also has an effect on what risk factors significantly predict the development of a complication. Concerted efforts from orthopaedists should be made to identify higher risk patients and prevent the most common adverse events that occur postoperatively. Copyright © 2014 Elsevier Ltd. All rights reserved.
Comparison of clinician-predicted to measured low vision outcomes.
Chan, Tiffany L; Goldstein, Judith E; Massof, Robert W
2013-08-01
To compare low-vision rehabilitation (LVR) clinicians' predictions of the probability of success of LVR with patients' self-reported outcomes after provision of usual outpatient LVR services and to determine if patients' traits influence clinician ratings. The Activity Inventory (AI), a self-report visual function questionnaire, was administered pre- and post-LVR to 316 low-vision patients served by 28 LVR centers that participated in a collaborative observational study. The physical component of the Short Form-36, Geriatric Depression Scale, and Telephone Interview for Cognitive Status were also administered pre-LVR to measure physical capability, depression, and cognitive status. After patient evaluation, 38 LVR clinicians estimated the probability of outcome success (POS) using their own criteria. The POS ratings and change in functional ability were used to assess the effects of patients' baseline traits on predicted outcomes. A regression analysis with a hierarchical random-effects model showed no relationship between LVR physician POS estimates and AI-based outcomes. In another analysis, kappa statistics were calculated to determine the probability of agreement between POS and AI-based outcomes for different outcome criteria. Across all comparisons, none of the kappa values were significantly different from 0, which indicates that the rate of agreement is equivalent to chance. In an exploratory analysis, hierarchical mixed-effects regression models show that POS ratings are associated with information about the patient's cognitive functioning and the combination of visual acuity and functional ability, as opposed to visual acuity or functional ability alone. Clinicians' predictions of LVR outcomes seem to be influenced by knowledge of patients' cognitive functioning and the combination of visual acuity and functional ability, information clinicians acquire from the patient's history and examination. However, clinicians' predictions do not agree with observed changes in functional ability from the patient's perspective; they are no better than chance.
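A kappa statistic compares the observed rate of agreement with the rate expected by chance, kappa = (p_o - p_e) / (1 - p_e), so a value near 0 means agreement no better than chance. The sketch below computes Cohen's kappa for two sets of categorical ratings; the abstract does not say which kappa variant was used, so Cohen's form is an assumption, and the example ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two equal-length sequences of categorical ratings."""
    n = len(ratings_a)
    # Observed proportion of cases on which the two ratings agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from the marginal category frequencies.
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)


# Hypothetical data: 1 = success predicted / observed, 0 = no success.
predicted = [1, 1, 0, 0]
observed_outcome = [1, 0, 1, 0]
print(cohens_kappa(predicted, observed_outcome))
```

In this hypothetical example the raters agree on half the cases, which is exactly what the marginals predict by chance, so kappa is 0.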
Wang, L; Qin, X C; Lin, H C; Deng, K F; Luo, Y W; Sun, Q R; Du, Q X; Wang, Z Y; Tuo, Y; Sun, J H
2018-02-01
To analyse the relationship between Fourier transform infrared (FTIR) spectrum of rat's spleen tissue and postmortem interval (PMI) for PMI estimation using FTIR spectroscopy combined with data mining method. Rats were sacrificed by cervical dislocation, and the cadavers were placed at 20 ℃. The FTIR spectrum data of rats' spleen tissues were taken and measured at different time points. After pretreatment, the data was analysed by data mining method. The absorption peak intensity of rat's spleen tissue spectrum changed with the PMI, while the absorption peak position was unchanged. The results of principal component analysis (PCA) showed that the cumulative contribution rate of the first three principal components was 96%. There was an obvious clustering tendency for the spectrum sample at each time point. The methods of partial least squares discriminant analysis (PLS-DA) and support vector machine classification (SVMC) effectively divided the spectrum samples with different PMI into four categories (0-24 h, 48-72 h, 96-120 h and 144-168 h). The determination coefficient (R²) of the PMI estimation model established by PLS regression analysis was 0.96, and the root mean square error of calibration (RMSEC) and root mean square error of cross validation (RMSECV) were 9.90 h and 11.39 h respectively. In prediction set, the R² was 0.97, and the root mean square error of prediction (RMSEP) was 10.49 h. The FTIR spectrum of the rat's spleen tissue can be effectively analyzed qualitatively and quantitatively by the combination of FTIR spectroscopy and data mining method, and the classification and PLS regression models can be established for PMI estimation. Copyright © by the Editorial Department of Journal of Forensic Medicine.
NASA Astrophysics Data System (ADS)
Qie, G.; Wang, G.; Wang, M.
2016-12-01
Mixed pixels and shadows due to buildings in urban areas impede accurate estimation and mapping of city vegetation carbon density. In most previous studies, these factors are ignored, which results in underestimation of city vegetation carbon density. In this study we presented an integrated methodology to improve the accuracy of mapping city vegetation carbon density. Firstly, we applied a linear shadow remove analysis (LSRA) on remotely sensed Landsat 8 images to reduce the shadow effects on carbon estimation. Secondly, we integrated a linear spectral unmixing analysis (LSUA) with a linear stepwise regression (LSR), a logistic model-based stepwise regression (LMSR) and k-Nearest Neighbors (kNN), and utilized and compared the integrated models on shadow-removed images to map vegetation carbon density. This methodology was examined in Shenzhen City of Southeast China. A data set from a total of 175 sample plots measured in 2013 and 2014 was used to train the models. The independent variables that contributed statistically significantly to improving the fit of the models to the data and reducing the sum of squared errors were selected from a total of 608 variables derived from different image band combinations and transformations. The vegetation fraction from LSUA was then added into the models as an important independent variable. The estimates obtained were evaluated using a cross-validation method. Our results showed that higher accuracies were obtained from the integrated models compared with the ones using traditional methods which ignore the effects of mixed pixels and shadows. This study indicates that the integrated method has great potential for improving the accuracy of urban vegetation carbon density estimation. Key words: Urban vegetation carbon, shadow, spectral unmixing, spatial modeling, Landsat 8 images
Wu, Qiong; Nie, Jun; Wu, Fu-Xia; Zou, Xiu-Lan; Chen, Feng-Yi
2017-03-30
BACKGROUND To investigate the prognostic value of procalcitonin (PCT), high-sensitivity C-reactive protein (hs-CRP), and pancreatic stone protein (PSP) in children with sepsis. MATERIAL AND METHODS A total of 214 patients with sepsis during hospitalization were enrolled. Serum levels of PCT, hs-CRP, and PSP were measured on day 1 of hospitalization and the survival rates of children were recorded after a follow-up of 28 days. Pearson's correlation analysis was conducted to test the association of PCT, hs-CRP, and PSP with pediatric critical illness score (PCIS). Logistic regression models were used to analyze the risk factors contributing to patients' death. The AUC was used to determine the value of PCT, hs-CRP, and PSP in the prognosis of patients with sepsis. RESULTS The expression of PCT, hs-CRP, and PSP in the dying patients was higher than in the surviving patients (p<0.001). Pearson's correlation analysis showed that serum PCT, hs-CRP, and PSP levels were negatively correlated with PCIS (p<0.001). Multivariate logistic regression revealed that PCT, hs-CRP, and PSP were independent risk factors for the prognosis of patients with sepsis (p<0.001). ROC analysis showed the AUC values of PCT, hs-CRP, and PSP were 0.83 (95% CI, 0.77-0.88), 0.76 (95% CI, 0.70-0.82), and 0.73 (95% CI, 0.67-0.79), respectively. The combined AUC value of PCT, hs-CRP, and PSP, was 0.92 (95% CI, 0.87-0.95), which was significantly increased compared with PCT, hs-CRP, or PSP (p<0.001). CONCLUSIONS The combination of serum PCT, hs-CRP, and PSP represents a promising biomarker of risk, and is a useful clinical tool for risk stratification of children with sepsis.
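The AUC values reported here can be read as the probability that a randomly chosen non-survivor has a higher marker value than a randomly chosen survivor, with ties counted as one half. The Python sketch below computes the AUC directly from that definition; the marker values are hypothetical, and the abstract does not describe how the three markers were combined into a single score, so only the single-score case is shown.

```python
def auc(scores_pos, scores_neg):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0      # positive case scored higher: correct ordering
            elif p == q:
                wins += 0.5      # ties count as half
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical marker levels for non-survivors (positives) and survivors (negatives).
non_survivors = [8.2, 5.1, 9.4]
survivors = [2.3, 5.1, 1.7, 3.0]
print(auc(non_survivors, survivors))
```

An AUC of 0.5 corresponds to a marker that does not separate the groups at all, and 1.0 to perfect separation, which gives the scale on which the reported 0.73 to 0.92 should be read.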
Innovating patient care delivery: DSRIP's interrupted time series analysis paradigm.
Shenoy, Amrita G; Begley, Charles E; Revere, Lee; Linder, Stephen H; Daiger, Stephen P
2017-12-08
Adoption of the Medicaid Section 1115 waiver is one of many ways of innovating the healthcare delivery system. The Delivery System Reform Incentive Payment (DSRIP) pool, one of the two funding pools of the waiver, has four categories, viz. infrastructure development, program innovation and redesign, quality improvement reporting and, lastly, population health improvement. A metric of the fourth category, the preventable hospitalization (PH) rate, was analyzed in the context of eight conditions for two time periods, pre-reporting years (2010-2012) and post-reporting years (2013-2015), for two hospital cohorts, DSRIP participating and non-participating hospitals. The study explains how DSRIP impacted PH rates of eight conditions for both hospital cohorts within the two time periods. Eight PH rates were regressed as the dependent variable with time, intervention and post-DSRIP intervention as independent variables. PH rates of the eight conditions were then consolidated into one rate and regressed on the above independent variables to evaluate the overall impact of DSRIP. An interrupted time series regression was performed after accounting for auto-correlation, stationarity and seasonality in the dataset. In the individual regression model, PH rates showed statistically significant coefficients for seven out of eight conditions in DSRIP participating hospitals. In the combined regression model, the coefficient of the PH rate showed a statistically significant decrease with negative p-values for regression coefficients in DSRIP participating hospitals compared to positive/increased p-values for regression coefficients in DSRIP non-participating hospitals. Several macro- and micro-level factors may have contributed to DSRIP hospitals outperforming DSRIP non-participating hospitals.
Healthcare organization/provider collaboration, support from healthcare professionals, DSRIP's design, state reimbursement and coordination in care delivery methods may have contributed to the likely success of DSRIP. IV, a retrospective cohort study based on longitudinal data. Copyright © 2017 Elsevier Inc. All rights reserved.
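As a hedged illustration of the regression design described in this abstract (time, intervention, and post-intervention time as regressors), the following minimal Python sketch builds a segmented-regression design matrix and fits it by ordinary least squares. The function names, the 12-period series, and the coefficient values are illustrative assumptions, not the authors' data or code, and the sketch omits the adjustments for auto-correlation, stationarity, and seasonality that the study reports.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ols(X, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    p = len(X[0])
    XtX = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    Xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(XtX, Xty)


def its_design(n_periods, first_post):
    """Columns: intercept, time, intervention dummy, time since intervention."""
    rows = []
    for t in range(n_periods):
        post = 1.0 if t >= first_post else 0.0
        rows.append([1.0, float(t), post, post * (t - first_post)])
    return rows


# Synthetic, noise-free series (assumed values): baseline level 50, pre-trend +0.5,
# level change -4 at the intervention, and slope change -1 afterwards.
X = its_design(n_periods=12, first_post=6)
true_beta = [50.0, 0.5, -4.0, -1.0]
y = [sum(b * x for b, x in zip(true_beta, row)) for row in X]
beta = ols(X, y)
```

In this parameterization the third coefficient is the immediate level change and the fourth is the change in slope after the intervention; a negative value for either would correspond to a decline in the outcome relative to the pre-intervention trend.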
Yamazaki, Takeshi; Takeda, Hisato; Hagiya, Koichi; Yamaguchi, Satoshi; Sasaki, Osamu
2018-03-13
Because lactation periods in dairy cows lengthen with increasing total milk production, it is important to predict individual productivities after 305 days in milk (DIM) to determine the optimal lactation period. We therefore examined whether the random regression (RR) coefficient from 306 to 450 DIM (M2) can be predicted from those during the first 305 DIM (M1) by using a random regression model. We analyzed test-day milk records from 85690 Holstein cows in their first lactations and 131727 cows in their later (second to fifth) lactations. Data in M1 and M2 were analyzed separately by using different single-trait RR animal models. We then performed a multiple regression analysis of the RR coefficients of M2 on those of M1 during the first and later lactations. The first-order Legendre polynomials were practical covariates of random regression for the milk yields of M2. All RR coefficients for the additive genetic (AG) effect and the intercept for the permanent environmental (PE) effect of M2 had moderate to strong correlations with the intercept for the AG effect of M1. The coefficients of determination for multiple regression of the combined intercepts for the AG and PE effects of M2 on the coefficients for the AG effect of M1 were moderate to high. The daily milk yields of M2 predicted by using the RR coefficients for the AG effect of M1 were highly correlated with those obtained by using the coefficients of M2. Milk production after 305 DIM can be predicted by using the RR coefficient estimates of the AG effect during the first 305 DIM.
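The first-order Legendre covariates mentioned in this abstract can be sketched as follows. This is a minimal illustration, assuming the common convention of rescaling days in milk (DIM) to the interval [-1, 1] and using normalized Legendre polynomials; the function names and the example coefficients are hypothetical, and the abstract does not state which normalization the authors used.

```python
import math


def legendre_covariates(dim, dim_min=306, dim_max=450):
    """Return normalized Legendre covariates of order 0 and 1 for one test day.

    DIM is rescaled to x in [-1, 1]; the normalized polynomials are
    phi_0 = sqrt(1/2) and phi_1 = sqrt(3/2) * x.
    """
    x = 2.0 * (dim - dim_min) / (dim_max - dim_min) - 1.0
    return (math.sqrt(0.5), math.sqrt(1.5) * x)


def predicted_deviation(dim, intercept_coef, slope_coef):
    """Animal-specific deviation at a given DIM from two RR coefficients."""
    phi0, phi1 = legendre_covariates(dim)
    return intercept_coef * phi0 + slope_coef * phi1


# Hypothetical RR coefficients for one cow (illustrative values only).
curve = [predicted_deviation(d, intercept_coef=2.0, slope_coef=-0.5)
         for d in (306, 378, 450)]
```

With only an intercept and a linear term, each animal's deviation over 306 to 450 DIM is a straight line, which is what makes the M2 coefficients plausible targets for a multiple regression on the M1 coefficients.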
Use of partial least squares regression to impute SNP genotypes in Italian cattle breeds.
Dimauro, Corrado; Cellesi, Massimo; Gaspa, Giustino; Ajmone-Marsan, Paolo; Steri, Roberto; Marras, Gabriele; Macciotta, Nicolò P P
2013-06-05
The objective of the present study was to test the ability of the partial least squares regression technique to impute genotypes from low density single nucleotide polymorphisms (SNP) panels i.e. 3K or 7K to a high density panel with 50K SNP. No pedigree information was used. Data consisted of 2093 Holstein, 749 Brown Swiss and 479 Simmental bulls genotyped with the Illumina 50K Beadchip. First, a single-breed approach was applied by using only data from Holstein animals. Then, to enlarge the training population, data from the three breeds were combined and a multi-breed analysis was performed. Accuracies of genotypes imputed using the partial least squares regression method were compared with those obtained by using the Beagle software. The impact of genotype imputation on breeding value prediction was evaluated for milk yield, fat content and protein content. In the single-breed approach, the accuracy of imputation using partial least squares regression was around 90 and 94% for the 3K and 7K platforms, respectively; corresponding accuracies obtained with Beagle were around 85% and 90%. Moreover, computing time required by the partial least squares regression method was on average around 10 times lower than computing time required by Beagle. Using the partial least squares regression method in the multi-breed resulted in lower imputation accuracies than using single-breed data. The impact of the SNP-genotype imputation on the accuracy of direct genomic breeding values was small. The correlation between estimates of genetic merit obtained by using imputed versus actual genotypes was around 0.96 for the 7K chip. Results of the present work suggested that the partial least squares regression imputation method could be useful to impute SNP genotypes when pedigree information is not available.
NASA Astrophysics Data System (ADS)
Dalkilic, Turkan Erbay; Apaydin, Aysen
2009-11-01
In a regression analysis, it is assumed that the observations come from a single class in a data cluster and the simple functional relationship between the dependent and independent variables can be expressed using the general model Y = f(X) + ε. However, a data cluster may consist of a combination of observations that have different distributions that are derived from different clusters. When faced with issues of estimating a regression model for fuzzy inputs that have been derived from different distributions, this regression model has been termed the 'switching regression model' and it is expressed with . Here li indicates the class number of each independent variable and p is indicative of the number of independent variables [J.R. Jang, ANFIS: Adaptive-network-based fuzzy inference system, IEEE Transaction on Systems, Man and Cybernetics 23 (3) (1993) 665-685; M. Michel, Fuzzy clustering and switching regression models using ambiguity and distance rejects, Fuzzy Sets and Systems 122 (2001) 363-399; E.Q. Richard, A new approach to estimating switching regressions, Journal of the American Statistical Association 67 (338) (1972) 306-310]. In this study, adaptive networks have been used to construct a model that has been formed by gathering obtained models. There are methods that suggest the class numbers of independent variables heuristically. Alternatively, in defining the optimal class number of independent variables, the use of a suggested validity criterion for fuzzy clustering has been aimed. In the case that independent variables have an exponential distribution, an algorithm has been suggested for defining the unknown parameter of the switching regression model and for obtaining the estimated values after obtaining an optimal membership function, which is suitable for exponential distribution.
Applied Multiple Linear Regression: A General Research Strategy
ERIC Educational Resources Information Center
Smith, Brandon B.
1969-01-01
Illustrates some of the basic concepts and procedures for using regression analysis in experimental design, analysis of variance, analysis of covariance, and curvilinear regression. Applications to evaluation of instruction and vocational education programs are illustrated. (GR)
Ye, Jiang-Feng; Zhao, Yu-Xin; Ju, Jian; Wang, Wei
2017-10-01
To discuss the value of the Bedside Index for Severity in Acute Pancreatitis (BISAP), Modified Early Warning Score (MEWS), serum Ca2+, and red cell distribution width (RDW) for predicting the severity grade of acute pancreatitis (AP) and to develop and verify a more accurate scoring system to predict the severity of AP. In 302 patients with AP, we calculated BISAP and MEWS scores and conducted regression analyses on the relationships of BISAP scoring, RDW, MEWS, and serum Ca2+ with the severity of AP using single-factor logistic regression. The variables with statistical significance in the single-factor logistic regression were used in a multi-factor logistic regression model; forward stepwise regression was used to screen variables and build a multi-factor prediction model. A receiver operating characteristic curve (ROC curve) was constructed, and the significance of multi- and single-factor prediction models in predicting the severity of AP using the area under the ROC curve (AUC) was evaluated. The internal validity of the model was verified through bootstrapping. Among 302 patients with AP, 209 had mild acute pancreatitis (MAP) and 93 had severe acute pancreatitis (SAP). According to single-factor logistic regression analysis, we found that BISAP, MEWS and serum Ca2+ are prediction indexes of the severity of AP (P-value<0.001), whereas RDW is not a prediction index of AP severity (P-value>0.05). The multi-factor logistic regression analysis showed that BISAP and serum Ca2+ are independent prediction indexes of AP severity (P-value<0.001), and MEWS is not an independent prediction index of AP severity (P-value>0.05); BISAP is negatively related to serum Ca2+ (r=-0.330, P-value<0.001). The constructed model is as follows: ln()=7.306+1.151*BISAP-4.516*serum Ca2+. The predictive ability of each model for SAP follows the order of the combined BISAP and serum Ca2+ prediction model>Ca2+>BISAP.
There is no statistical significance for the predictive ability of BISAP and serum Ca2+ (P-value>0.05); however, there is remarkable statistical significance for the predictive ability using the newly built prediction model as well as BISAP and serum Ca2+ individually (P-value<0.01). Verification of the internal validity of the models by bootstrapping is favorable. BISAP and serum Ca2+ have high predictive value for the severity of AP. However, the model built by combining BISAP and serum Ca2+ is remarkably superior to those of BISAP and serum Ca2+ individually. Furthermore, this model is simple, practical and appropriate for clinical use. Copyright © 2016. Published by Elsevier Masson SAS.
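To make the reported linear predictor concrete, the sketch below converts it to a probability with the logistic function. It assumes that the elided left-hand side of the published equation is the log-odds of severe acute pancreatitis, which the abstract does not state explicitly; the function name, the input values, and the units of serum Ca2+ are likewise assumptions. It is an arithmetic illustration only, not a validated clinical tool.

```python
import math


def sap_probability(bisap, serum_ca):
    """Logistic transform of the linear predictor reported in the abstract.

    Assumption: the predictor 7.306 + 1.151*BISAP - 4.516*Ca is the log-odds
    of severe acute pancreatitis.
    """
    eta = 7.306 + 1.151 * bisap - 4.516 * serum_ca
    return 1.0 / (1.0 + math.exp(-eta))


# Illustrative inputs only: a higher BISAP score raises the predicted
# probability, and a higher serum calcium value lowers it.
low_risk = sap_probability(bisap=0, serum_ca=2.0)
high_risk = sap_probability(bisap=3, serum_ca=2.0)
```

The signs of the two coefficients match the direction described in the abstract: BISAP enters positively and serum Ca2+ negatively.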
[Visual field progression in glaucoma: cluster analysis].
Bresson-Dumont, H; Hatton, J; Foucher, J; Fonteneau, M
2012-11-01
Visual field progression analysis is one of the key points in glaucoma monitoring, but distinction between true progression and random fluctuation is sometimes difficult. There are several different algorithms but no real consensus for detecting visual field progression. The trend analysis of global indices (MD, SLV) may miss localized deficits or be affected by media opacities. Conversely, point-by-point analysis makes progression difficult to differentiate from physiological variability, particularly when the sensitivity of a point is already low. The goal of our study was to analyse visual field progression with the EyeSuite™ Octopus Perimetry Clusters algorithm in patients with no significant changes in global indices or worsening of the analysis of pointwise linear regression. We analyzed the visual fields of 162 eyes (100 patients - 58 women, 42 men, average age 66.8 ± 10.91) with ocular hypertension or glaucoma. For inclusion, at least six reliable visual fields per eye were required, and the trend analysis (EyeSuite™ Perimetry) of visual field global indices (MD and SLV) had to show no significant progression. The analysis of changes in cluster mode was then performed. In a second step, eyes with statistically significant worsening of at least one of their clusters were analyzed point-by-point with the Octopus Field Analysis (OFA). Fifty-four eyes (33.33%) had a significant worsening in some clusters, while their global indices remained stable over time. In this group of patients, more advanced glaucoma was present than in the stable group (MD 6.41 dB vs. 2.87); 64.82% (35/54) of those eyes in which the clusters progressed, however, had no statistically significant change in the trend analysis by pointwise linear regression. Most software algorithms for analyzing visual field progression are essentially trend analyses of global indices, or point-by-point linear regression. This study shows the potential role of cluster trend analysis.
However, for best results, it is preferable to compare the analyses of several tests in combination with morphologic exam. Copyright © 2012 Elsevier Masson SAS. All rights reserved.
Saqr, Mohammed; Fors, Uno; Tedre, Matti
2018-02-06
Collaborative learning facilitates reflection, diversifies understanding and stimulates skills of critical and higher-order thinking. Although the benefits of collaborative learning have long been recognized, it is still rarely studied by social network analysis (SNA) in medical education, and the relationship of parameters that can be obtained via SNA with students' performance remains largely unknown. The aim of this work was to assess the potential of SNA for studying online collaborative clinical case discussions in a medical course and to find out which activities correlate with better performance and help predict final grade or explain variance in performance. Interaction data were extracted from the learning management system (LMS) forum module of the Surgery course in Qassim University, College of Medicine. The data were analyzed using social network analysis. The analysis included visual as well as a statistical analysis. Correlation with students' performance was calculated, and automatic linear regression was used to predict students' performance. By using social network analysis, we were able to analyze a large number of interactions in online collaborative discussions and gain an overall insight of the course social structure, track the knowledge flow and the interaction patterns, as well as identify the active participants and the prominent discussion moderators. When augmented with calculated network parameters, SNA offered an accurate view of the course network, each user's position, and level of connectedness. Results from correlation coefficients, linear regression, and logistic regression indicated that a student's position and role in information relay in online case discussions, combined with the strength of that student's network (social capital), can be used as predictors of performance in relevant settings. 
By using social network analysis, researchers can analyze the social structure of an online course and reveal important information about students' and teachers' interactions that can be valuable in guiding teachers, improving students' engagement, and contributing to learning analytics insights.
The Statistical Analysis Techniques to Support the NGNP Fuel Performance Experiments
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bihn T. Pham; Jeffrey J. Einerson
2010-06-01
This paper describes the development and application of statistical analysis techniques to support the AGR experimental program on NGNP fuel performance. The experiments conducted in the Idaho National Laboratory’s Advanced Test Reactor employ fuel compacts placed in a graphite cylinder shrouded by a steel capsule. The tests are instrumented with thermocouples embedded in graphite blocks and the target quantity (fuel/graphite temperature) is regulated by the He-Ne gas mixture that fills the gap volume. Three techniques for statistical analysis, namely control charting, correlation analysis, and regression analysis, are implemented in the SAS-based NGNP Data Management and Analysis System (NDMAS) for automated processing and qualification of the AGR measured data. The NDMAS also stores daily neutronic (power) and thermal (heat transfer) code simulation results along with the measurement data, allowing for their combined use and comparative scrutiny. The ultimate objective of this work includes (a) a multi-faceted system for data monitoring and data accuracy testing, (b) identification of possible modes of diagnostics deterioration and changes in experimental conditions, (c) qualification of data for use in code validation, and (d) identification and use of data trends to support effective control of test conditions with respect to the test target. Analysis results and examples given in the paper show the three statistical analysis techniques providing a complementary capability to warn of thermocouple failures. It also suggests that the regression analysis models relating calculated fuel temperatures and thermocouple readings can enable online regulation of experimental parameters (i.e. gas mixture content), to effectively maintain the target quantity (fuel temperature) within a given range.
General Framework for Meta-analysis of Rare Variants in Sequencing Association Studies
Lee, Seunggeun; Teslovich, Tanya M.; Boehnke, Michael; Lin, Xihong
2013-01-01
We propose a general statistical framework for meta-analysis of gene- or region-based multimarker rare variant association tests in sequencing association studies. In genome-wide association studies, single-marker meta-analysis has been widely used to increase statistical power by combining results via regression coefficients and standard errors from different studies. In analysis of rare variants in sequencing studies, region-based multimarker tests are often used to increase power. We propose meta-analysis methods for commonly used gene- or region-based rare variants tests, such as burden tests and variance component tests. Because estimation of regression coefficients of individual rare variants is often unstable or not feasible, the proposed method avoids this difficulty by calculating score statistics instead that only require fitting the null model for each study and then aggregating these score statistics across studies. Our proposed meta-analysis rare variant association tests are conducted based on study-specific summary statistics, specifically score statistics for each variant and between-variant covariance-type (linkage disequilibrium) relationship statistics for each gene or region. The proposed methods are able to incorporate different levels of heterogeneity of genetic effects across studies and are applicable to meta-analysis of multiple ancestry groups. We show that the proposed methods are essentially as powerful as joint analysis by directly pooling individual level genotype data. We conduct extensive simulations to evaluate the performance of our methods by varying levels of heterogeneity across studies, and we apply the proposed methods to meta-analysis of rare variant effects in a multicohort study of the genetics of blood lipid levels. PMID:23768515
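The aggregation step described in this abstract, combining per-study score statistics and between-variant covariance matrices rather than regression coefficients, can be sketched for a burden-type test as follows. This is a minimal illustration assuming homogeneous genetic effects across studies and equal variant weights; the function names and all numbers are made up, and the sketch does not reproduce the authors' software or their heterogeneity-aware variants.

```python
import math


def meta_burden(scores, covs, weights):
    """Burden-type meta-analysis statistic from study-level summaries.

    scores:  one list of per-variant score statistics per study
    covs:    one variant-by-variant covariance matrix per study
    weights: per-variant weights w
    Returns Q = (w'S)^2 / (w' Phi w), with S and Phi summed over studies.
    """
    m = len(weights)
    S = [sum(s[j] for s in scores) for j in range(m)]
    Phi = [[sum(c[i][j] for c in covs) for j in range(m)] for i in range(m)]
    numerator = sum(w * s for w, s in zip(weights, S)) ** 2
    denominator = sum(weights[i] * Phi[i][j] * weights[j]
                      for i in range(m) for j in range(m))
    return numerator / denominator


def chi2_1df_pvalue(q):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(q / 2.0))


# Two hypothetical studies, two rare variants in one gene.
scores = [[1.0, 2.0], [0.5, -0.5]]
covs = [[[2.0, 0.5], [0.5, 1.0]],
        [[1.0, 0.0], [0.0, 1.0]]]
q_stat = meta_burden(scores, covs, weights=[1.0, 1.0])
p_value = chi2_1df_pvalue(q_stat)
```

Because only the summed scores and covariances are needed, each study has to fit its null model once and share summary statistics, which is the practical advantage the abstract emphasizes over pooling individual-level genotypes.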
Patient casemix classification for medicare psychiatric prospective payment.
Drozd, Edward M; Cromwell, Jerry; Gage, Barbara; Maier, Jan; Greenwald, Leslie M; Goldman, Howard H
2006-04-01
For a proposed Medicare prospective payment system for inpatient psychiatric facility treatment, the authors developed a casemix classification to capture differences in patients' real daily resource use. Primary data on patient characteristics and daily time spent in various activities were collected in a survey of 696 patients from 40 inpatient psychiatric facilities. Survey data were combined with Medicare claims data to estimate intensity-adjusted daily cost. Classification and Regression Trees (CART) analysis of average daily routine and ancillary costs yielded several hierarchical classification groupings. Regression analysis was used to control for facility and day-of-stay effects in order to compare hierarchical models with models based on the recently proposed payment system of the Centers for Medicare & Medicaid Services. CART analysis identified a small set of patient characteristics strongly associated with higher daily costs, including age, psychiatric diagnosis, deficits in daily living activities, and detox or ECT use. A parsimonious, 16-group, fully interactive model that used five major DSM-IV categories and stratified by age, illness severity, deficits in daily living activities, dangerousness, and use of ECT explained 40% (out of a possible 76%) of daily cost variation not attributable to idiosyncratic daily changes within patients. A noninteractive model based on diagnosis-related groups, age, and medical comorbidity had explanatory power of only 32%. A regression model with 16 casemix groups restricted to using "appropriate" payment variables (i.e., those with clinical face validity and low administrative burden that are easily validated and provide proper care incentives) produced more efficient and equitable payments than did a noninteractive system based on diagnosis-related groups.
NASA Astrophysics Data System (ADS)
Buermeyer, Jonas; Gundlach, Matthias; Grund, Anna-Lisa; Grimm, Volker; Spizyn, Alexander; Breckow, Joachim
2016-09-01
This work is part of the analysis of the effects of constructional energy-saving measures on radon concentration levels in dwellings performed on behalf of the German Federal Office for Radiation Protection. In parallel to radon measurements for five buildings, both meteorological data outside the buildings and the indoor climate factors were recorded. In order to assess effects of inhabited buildings, the amount of carbon dioxide (CO2) was measured. For a statistical linear regression model, the data of one object was chosen as an example. Three dummy variables were extracted from the process of the CO2 concentration to provide information on the usage and ventilation of the room. The analysis revealed a highly autoregressive model for the radon concentration with additional influence by the natural environmental factors. The autoregression implies a strong dependency on a radon source since it reflects a backward dependency in time. At this point of the investigation, it cannot be determined whether the influence by outside factors affects the source of radon or the inhabitants' ventilation behavior, resulting in variation of the occurring concentration levels. In any case, the regression analysis might provide further information that would help to distinguish these effects. In the next step, the influence factors will be weighted according to their impact on the concentration levels. This might lead to a model that enables the prediction of radon concentration levels based on the measurement of CO2 in combination with environmental parameters, as well as the development of advice for ventilation.
Straub, D.E.
1998-01-01
The streamflow-gaging station network in Ohio was evaluated for its effectiveness in providing regional streamflow information. The analysis involved application of the principles of generalized least squares regression between streamflow and climatic and basin characteristics. Regression equations were developed for three flow characteristics: (1) the instantaneous peak flow with a 100-year recurrence interval (P100), (2) the mean annual flow (Qa), and (3) the 7-day, 10-year low flow (7Q10). All active and discontinued gaging stations with 5 or more years of unregulated-streamflow data with respect to each flow characteristic were used to develop the regression equations. The gaging-station network was evaluated for the current (1996) condition of the network and estimated conditions of various network strategies if an additional 5 and 20 years of streamflow data were collected. Any active or discontinued gaging station with (1) less than 5 years of unregulated-streamflow record, (2) previously defined basin and climatic characteristics, and (3) the potential for collection of more unregulated-streamflow record were included in the network strategies involving the additional 5 and 20 years of data. The network analysis involved use of the regression equations, in combination with location, period of record, and cost of operation, to determine the contribution of the data for each gaging station to regional streamflow information. The contribution of each gaging station was based on a cost-weighted reduction of the mean square error (average sampling-error variance) associated with each regional estimating equation. All gaging stations included in the network analysis were then ranked according to their contribution to the regional information for each flow characteristic. 
The predictive ability of the regression equations developed from the gaging station network could be improved for all three flow characteristics with the collection of additional streamflow data. The addition of new gaging stations to the network would result in an even greater improvement of the accuracy of the regional regression equations. Typically, continued data collection at stations with unregulated streamflow for all flow conditions that had less than 11 years of record with drainage areas smaller than 200 square miles contributed the largest cost-weighted reduction to the average sampling-error variance of the regional estimating equations. The results of the network analyses can be used to prioritize the continued operation of active gaging stations or the reactivation of discontinued gaging stations if the objective is to maximize the regional information content in the streamflow-gaging station network.
NASA Technical Reports Server (NTRS)
Parsons, Vickie s.
2009-01-01
The request to conduct an independent review of regression models, developed for determining the expected Launch Commit Criteria (LCC) External Tank (ET)-04 cycle count for the Space Shuttle ET tanking process, was submitted to the NASA Engineering and Safety Center NESC on September 20, 2005. The NESC team performed an independent review of regression models documented in Prepress Regression Analysis, Tom Clark and Angela Krenn, 10/27/05. This consultation consisted of a peer review by statistical experts of the proposed regression models provided in the Prepress Regression Analysis. This document is the consultation's final report.
Schramm, Elisabeth; Weitz, Erica S; Salanti, Georgia; Efthimiou, Orestis; Michalak, Johannes; Watanabe, Norio; Keller, Martin B; Kocsis, James H; Klein, Daniel N; Cuijpers, Pim
2016-01-01
Introduction: Despite important advances in psychological and pharmacological treatments of persistent depressive disorders in the past decades, their responses remain typically slow and poor, and differential responses among different modalities of treatments or their combinations are not well understood. Cognitive-Behavioural Analysis System of Psychotherapy (CBASP) is the only psychotherapy that has been specifically designed for chronic depression and has been examined in an increasing number of trials against medications, alone or in combination. When several treatment alternatives are available for a certain condition, network meta-analysis (NMA) provides a powerful tool to examine their relative efficacy by combining all direct and indirect comparisons. Individual participant data (IPD) meta-analysis enables exploration of impacts of individual characteristics that lead to a differentiated approach matching treatments to specific subgroups of patients. Methods and analysis: We will search for all randomised controlled trials that compared CBASP, pharmacotherapy or their combination, in the treatment of patients with persistent depressive disorder, in Cochrane CENTRAL, PUBMED, SCOPUS and PsycINFO, supplemented by personal contacts. Individual participant data will be sought from the principal investigators of all the identified trials. Our primary outcomes are depression severity as measured on a continuous observer-rated scale for depression, and dropouts for any reason as a proxy measure of overall treatment acceptability. We will conduct a one-step IPD-NMA to compare CBASP, medications and their combinations, and also carry out a meta-regression to identify their prognostic factors and effect moderators. The model will be fitted in OpenBUGS, using vague priors for all location parameters. For the heterogeneity we will use a half-normal prior on the SD. Ethics and dissemination: This study requires no ethical approval.
We will publish the findings in a peer-reviewed journal. The study results will contribute to more finely differentiated therapeutics for patients suffering from this chronically disabling disorder. Trial registration number: CRD42016035886. PMID:27147393
Gao, Yongnian; Gao, Junfeng; Yin, Hongbin; Liu, Chuansheng; Xia, Ting; Wang, Jing; Huang, Qi
2015-03-15
Remote sensing has been widely used for water quality monitoring, but most of these monitoring studies have only focused on a few water quality variables, such as chlorophyll-a, turbidity, and total suspended solids, which have typically been considered optically active variables. Remote sensing presents a challenge in estimating the phosphorus concentration in water. The total phosphorus (TP) in lakes has been estimated from remotely sensed observations, primarily using the simple individual band ratio or their natural logarithm and the statistical regression method based on the field TP data and the spectral reflectance. In this study, we investigated the possibility of establishing a spatial modeling scheme to estimate the TP concentration of a large lake from multi-spectral satellite imagery using band combinations and regional multivariate statistical modeling techniques, and we tested the applicability of the spatial modeling scheme. The results showed that HJ-1A CCD multi-spectral satellite imagery can be used to estimate the TP concentration in a lake. The correlation and regression analysis showed a highly significant positive relationship between the TP concentration and certain remotely sensed combination variables. The proposed modeling scheme had a higher accuracy for the TP concentration estimation in the large lake compared with the traditional individual band ratio method and the whole-lake scale regression-modeling scheme. The TP concentration values showed a clear spatial variability and were high in western Lake Chaohu and relatively low in eastern Lake Chaohu. The northernmost portion, the northeastern coastal zone and the southeastern portion of western Lake Chaohu had the highest TP concentrations, and the other regions had the lowest TP concentration values, except for the coastal zone of eastern Lake Chaohu.
These results strongly suggested that the proposed modeling scheme, i.e., the band combinations and the regional multivariate statistical modeling techniques, demonstrated advantages for estimating the TP concentration in a large lake and had a strong potential for universal application for the TP concentration estimation in large lake waters worldwide. Copyright © 2014 Elsevier Ltd. All rights reserved.
Motulsky, Harvey J; Brown, Ronald E
2006-01-01
Background: Nonlinear regression, like linear regression, assumes that the scatter of data around the ideal curve follows a Gaussian or normal distribution. This assumption leads to the familiar goal of regression: to minimize the sum of the squares of the vertical or Y-value distances between the points and the curve. Outliers can dominate the sum-of-the-squares calculation, and lead to misleading results. However, we know of no practical method for routinely identifying outliers when fitting curves with nonlinear regression. Results: We describe a new method for identifying outliers when fitting data with nonlinear regression. We first fit the data using a robust form of nonlinear regression, based on the assumption that scatter follows a Lorentzian distribution. We devised a new adaptive method that gradually becomes more robust as the method proceeds. To define outliers, we adapted the false discovery rate approach to handling multiple comparisons. We then remove the outliers, and analyze the data using ordinary least-squares regression. Because the method combines robust regression and outlier removal, we call it the ROUT method. When analyzing simulated data, where all scatter is Gaussian, our method detects (falsely) one or more outliers in only about 1–3% of experiments. When analyzing data contaminated with one or several outliers, the ROUT method performs well at outlier identification, with an average False Discovery Rate less than 1%. Conclusion: Our method, which combines a new method of robust nonlinear regression with a new method of outlier identification, identifies outliers from nonlinear curve fits with reasonable power and few false positives. PMID:16526949
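The false discovery rate idea that this abstract says was adapted for outlier detection can be illustrated with the standard Benjamini-Hochberg step-up procedure. The sketch below is the textbook procedure applied to a list of per-point p-values, not the authors' ROUT adaptation; the function name, the threshold `q`, and the p-values are assumptions chosen for illustration.

```python
def benjamini_hochberg(p_values, q=0.01):
    """Return indices flagged at false discovery rate q (step-up procedure).

    Sort the p-values, find the largest rank k with p_(k) <= q * k / m,
    and flag the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])


# Hypothetical p-values, one per data point, e.g. from testing each residual
# of a robust fit against the scatter of the remaining points.
p_vals = [0.001, 0.2, 0.004, 0.9, 0.5]
flagged = benjamini_hochberg(p_vals, q=0.05)
```

In an outlier-removal workflow of the kind the abstract describes, the flagged points would be dropped before a final ordinary least-squares fit.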
Combined Effects of High-Speed Railway Noise and Ground Vibrations on Annoyance.
Yokoshima, Shigenori; Morihara, Takashi; Sato, Tetsumi; Yano, Takashi
2017-07-27
The Shinkansen super-express railway system in Japan has greatly increased its capacity and has expanded nationwide. However, many inhabitants in areas along the railways have been disturbed by noise and ground vibration from the trains. Additionally, the Shinkansen railway emits a higher level of ground vibration than conventional railways at the same noise level. These findings imply that building vibrations affect living environments as significantly as the associated noise. Therefore, it is imperative to quantify the effects of noise and vibration exposures on each annoyance under simultaneous exposure. We performed a secondary analysis using individual datasets of exposure and community response associated with Shinkansen railway noise and vibration. The data consisted of six socio-acoustic surveys, which were conducted separately over the last 20 years in Japan. Applying a logistic regression analysis to the datasets, we confirmed the combined effects of vibration/noise exposure on noise/vibration annoyance. Moreover, we proposed a representative relationship between noise and vibration exposures, and the prevalence of each annoyance associated with the Shinkansen railway.
Ignjatović, Aleksandra; Stojanović, Miodrag; Milošević, Zoran; Anđelković Apostolović, Marija
2017-12-02
Interest in developing risk models in medicine is appealing, but it is also associated with many obstacles in different aspects of predictive model development. Initially, the association of a biomarker, or of several markers, with a specific outcome was established by statistical significance, but novel and demanding questions required the development of new and more complex statistical techniques. The progress of statistical analysis in biomedical research can best be observed through the history of the Framingham study and the development of the Framingham score. Evaluation of predictive models comes from a combination of the facts which are results of several metrics. Using logistic regression and Cox proportional hazards regression analysis, the calibration test, and the ROC curve analysis should be mandatory and eliminatory, and the central place should be taken by some new statistical techniques. To obtain complete information about a new marker in the model, it has recently been recommended to use reclassification tables, calculating the net reclassification index and the integrated discrimination improvement. Decision curve analysis is a novel method for evaluating the clinical usefulness of a predictive model. It may be noted that customizing and fine-tuning of the Framingham risk score initiated the development of statistical analysis. A clinically applicable predictive model should be a trade-off between all of the abovementioned statistical metrics: a trade-off between calibration and discrimination, accuracy and decision-making, costs and benefits, and quality and quantity of the patient's life.
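As a worked illustration of one of the metrics named above, the sketch below computes the categorical net reclassification index in its commonly used form, NRI = [P(up | event) - P(down | event)] + [P(down | non-event) - P(up | non-event)]. The function name and the counts are hypothetical; the numbers are not taken from the Framingham study or any other dataset.

```python
def net_reclassification_index(events, nonevents):
    """Categorical NRI from reclassification counts.

    Each argument is a dict giving the number of subjects moved to a
    higher risk category ('up'), a lower one ('down'), or left unchanged
    ('same') when the new marker is added to the model.
    """
    n_event = sum(events.values())
    n_nonevent = sum(nonevents.values())
    # Events should move up; non-events should move down.
    nri_events = (events["up"] - events["down"]) / n_event
    nri_nonevents = (nonevents["down"] - nonevents["up"]) / n_nonevent
    return nri_events + nri_nonevents

# Hypothetical reclassification table.
events = {"up": 30, "down": 10, "same": 60}
nonevents = {"up": 20, "down": 50, "same": 130}
nri = net_reclassification_index(events, nonevents)
print(round(nri, 3))
```

A positive value indicates that, on balance, the new marker moves subjects with the event upward and subjects without it downward.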
Jastreboff, P W
1979-06-01
Time histograms of neural responses evoked by sinusoidal stimulation often contain a slow drift and irregular noise, both of which disturb Fourier analysis of these responses. Section 2 of this paper evaluates the extent to which a linear drift influences the Fourier analysis, and develops a combined Fourier and linear regression analysis for detecting and correcting for such a linear drift. The usefulness of this correction method is demonstrated for the time histograms of actual eye movements and Purkinje cell discharges evoked by sinusoidal rotation of rabbits in the horizontal plane. In Sect. 3, the analysis of variance is adopted for estimating the probability of the random occurrence of the response curve extracted by Fourier analysis from noise. This method proved to be useful for avoiding false judgements as to whether the response curve was meaningful, particularly when the response was small relative to the contaminating noise.
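One way to combine Fourier analysis with linear regression is to fit an offset, a linear drift, and the fundamental sine and cosine terms in a single least-squares step. The sketch below does this on synthetic data; it is an assumption about how such a combined fit can be set up, not a reproduction of the paper's procedure, and the function names, sampling, and signal parameters are hypothetical.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def fit_drift_and_sinusoid(t, y, freq):
    """Least-squares fit of y = a + b*t + c*cos(wt) + d*sin(wt).

    Returns [a, b, c, d]; b is the estimated linear drift.
    """
    w = 2.0 * math.pi * freq
    rows = [[1.0, ti, math.cos(w * ti), math.sin(w * ti)] for ti in t]
    k = 4
    # Normal equations (X^T X) beta = X^T y.
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve_linear(xtx, xty)

# Synthetic response: 2.5 stimulus cycles at 1 Hz with a linear drift.
freq = 1.0
t = [i * 0.01 for i in range(250)]
w = 2.0 * math.pi * freq
y = [5.0 + 0.8 * ti + 2.0 * math.cos(w * ti) + 1.0 * math.sin(w * ti) for ti in t]

offset, drift, c_cos, c_sin = fit_drift_and_sinusoid(t, y, freq)
amplitude = math.hypot(c_cos, c_sin)
print(offset, drift, amplitude)
```

Because the drift is estimated jointly with the sinusoidal terms, it does not leak into the amplitude estimate even when the record does not span a whole number of cycles.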
NASA Astrophysics Data System (ADS)
Asencio-Cortés, G.; Morales-Esteban, A.; Shang, X.; Martínez-Álvarez, F.
2018-06-01
Earthquake magnitude prediction is a challenging problem that has been widely studied during the last decades. Statistical, geophysical and machine learning approaches can be found in the literature, with no particularly satisfactory results. In recent years, powerful computational techniques to analyze big data have emerged, making possible the analysis of massive datasets. These new methods make use of physical resources such as cloud-based architectures. California is known for being one of the regions with the highest seismic activity in the world, and many data are available. In this work, the use of several regression algorithms combined with ensemble learning is explored in the context of big data (a 1 GB catalog is used), in order to predict earthquake magnitude within the next seven days. The Apache Spark framework, the H2O library in the R language and Amazon cloud infrastructure were used, reporting very promising results.
NASA Astrophysics Data System (ADS)
Yin, Jianhua; Xia, Yang
2014-12-01
Fourier transform infrared imaging (FTIRI) combined with principal component regression (PCR) analysis was used to determine the reduction of proteoglycan (PG) in articular cartilage after the transection of the anterior cruciate ligament (ACL). A number of canine knee cartilage sections were harvested from the meniscus-covered and meniscus-uncovered medial tibial locations from the control joints, the ACL joints at three time points after the surgery, and their contralateral joints. The PG loss in the ACL cartilage was positively related to the time elapsed after the surgery. The PG loss in the contralateral knees was less than that of the ACL knees. The PG loss in the meniscus-covered cartilage was less than that of the meniscus-uncovered tissue in both ACL and contralateral knees. The quantitative mapping of PG loss could monitor the disease progression and repair processes in arthritis.
Multi crop area estimation in Idaho using EDITOR
NASA Technical Reports Server (NTRS)
Sheffner, E. J.
1984-01-01
The use of LANDSAT multispectral scanner digital data for multi-crop acreage estimation in the central Snake River Plain of Idaho was examined. Two acquisitions of LANDSAT data covering ground sample units selected from a U.S. Department of Agriculture sampling frame in a four-county study site were used to train a maximum likelihood classifier which, subsequently, classified all picture elements in the study site. Acreage estimates for six major crops, by county and for the four counties combined, were generated from the classification using the Battese-Fuller model for estimation by regression in small areas. Results from the regression analysis were compared to those obtained by direct expansion of the ground data. Using the LANDSAT data significantly decreased the errors associated with the estimates for the three largest acreage crops. The late date of the second LANDSAT acquisition may have contributed to the poor results for three summer crops.
Network Structure and Travel Time Perception
Parthasarathi, Pavithra; Levinson, David; Hochmair, Hartwig
2013-01-01
The purpose of this research is to test the systematic variation in the perception of travel time among travelers and relate the variation to the underlying street network structure. Travel survey data from the Twin Cities metropolitan area (which includes the cities of Minneapolis and St. Paul) is used for the analysis. Travelers are classified into two groups based on the ratio of perceived and estimated commute travel time. The measures of network structure are estimated using the street network along the identified commute route. T-test comparisons are conducted to identify statistically significant differences in estimated network measures between the two traveler groups. The combined effect of these estimated network measures on travel time is then analyzed using regression models. The results from the t-test and regression analyses confirm the influence of the underlying network structure on the perception of travel time. PMID:24204932
Prediction of elemental creep. [steady state and cyclic data from regression analysis
NASA Technical Reports Server (NTRS)
Davis, J. W.; Rummler, D. R.
1975-01-01
Cyclic and steady-state creep tests were performed to provide data which were used to develop predictive equations. These equations, describing creep as a function of stress, temperature, and time, were developed through the use of a least squares regression analysis computer program for both the steady-state and cyclic data sets. Comparison of the data from the two types of tests revealed that there was no significant difference between the cyclic and steady-state creep strains for the L-605 sheet under the experimental conditions investigated (for the same total time at load). Attempts to develop a single linear equation describing the combined steady-state and cyclic creep data resulted in standard errors of estimate higher than those obtained for the individual data sets. A proposed approach to predict elemental creep in metals uses the cyclic creep equation and a computer program which applies strain and time hardening theories of creep accumulation.
Analytical and regression models of glass rod drawing process
NASA Astrophysics Data System (ADS)
Alekseeva, L. B.
2018-03-01
The process of drawing glass rods (light guides) is being studied. The parameters of the process affecting the quality of the light guide have been determined. To solve the problem, mathematical models based on general equations of continuum mechanics are used. The conditions for the stable flow of the drawing process have been found, which are determined by the stability of the motion of the glass mass in the formation zone to small uncontrolled perturbations. The sensitivity of the formation zone to perturbations of the drawing speed and viscosity is estimated. Experimental models of the drawing process, based on the regression analysis methods, have been obtained. These models make it possible to customize a specific production process to obtain light guides of the required quality. They allow one to find the optimum combination of process parameters in the chosen area and to determine the required accuracy of maintaining them at a specified level.
Use of ocean color scanner data in water quality mapping
NASA Technical Reports Server (NTRS)
Khorram, S.
1981-01-01
Remotely sensed data, in combination with in situ data, are used in assessing water quality parameters within the San Francisco Bay-Delta. The parameters include suspended solids, chlorophyll, and turbidity. Regression models are developed between each of the water quality parameter measurements and the Ocean Color Scanner (OCS) data. The models are then extended to the entire study area for mapping water quality parameters. The results include a series of color-coded maps, each pertaining to one of the water quality parameters, and the statistical analysis of the OCS data and regression models. It is found that concurrently collected OCS data and surface truth measurements are highly useful in mapping the selected water quality parameters and locating areas having relatively high biological activity. In addition, it is found to be virtually impossible, at least within this test site, to locate such areas on U-2 color and color-infrared photography.
NASA Astrophysics Data System (ADS)
Singh, Veena D.; Daharwal, Sanjay J.
2017-01-01
Three multivariate calibration spectrophotometric methods were developed for the simultaneous estimation of Paracetamol (PARA), Enalapril maleate (ENM) and Hydrochlorothiazide (HCTZ) in tablet dosage form, namely multi-linear regression calibration (MLRC), trilinear regression calibration (TLRC) and classical least squares (CLS). The selectivity of the proposed methods was studied by analyzing laboratory-prepared ternary mixtures, and the methods were successfully applied to the combined dosage form. The proposed methods were validated as per ICH guidelines, and good accuracy, precision and specificity were confirmed within the concentration ranges of 5-35 μg mL⁻¹, 5-40 μg mL⁻¹ and 5-40 μg mL⁻¹ for PARA, HCTZ and ENM, respectively. The results were statistically compared with a reported HPLC method. Thus, the proposed methods can be used effectively for the routine quality control analysis of these drugs in commercial tablet dosage form.
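The prediction step of a classical least squares (CLS) calibration can be sketched as follows, assuming Beer's-law additivity: the mixture absorbance at each wavelength is the sum of each component's absorptivity times its concentration, and the concentrations are recovered by least squares. The absorptivity matrix, the concentrations, and the function names are hypothetical illustrations and are not values from the study.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def cls_concentrations(k, absorbances):
    """Least-squares solution of A = K c for the concentration vector c.

    k[i][j] is the absorptivity of component j at wavelength i.
    """
    n_comp = len(k[0])
    ktk = [[sum(row[i] * row[j] for row in k) for j in range(n_comp)]
           for i in range(n_comp)]
    kta = [sum(row[i] * a for row, a in zip(k, absorbances))
           for i in range(n_comp)]
    return solve_linear(ktk, kta)

# Hypothetical absorptivities: 4 wavelengths x 3 components (PARA, ENM, HCTZ).
K = [
    [0.060, 0.010, 0.020],
    [0.020, 0.050, 0.015],
    [0.010, 0.015, 0.070],
    [0.030, 0.030, 0.030],
]
true_c = [20.0, 10.0, 25.0]  # hypothetical concentrations, in ug/mL
mixture = [sum(kij * cj for kij, cj in zip(row, true_c)) for row in K]

estimated_c = cls_concentrations(K, mixture)
print([round(c, 3) for c in estimated_c])
```

In practice the absorptivity matrix would itself be estimated from calibration mixtures of known composition, and measurement noise would make the recovered concentrations approximate rather than exact.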
NASA Technical Reports Server (NTRS)
Colwell, R. N. (Principal Investigator)
1983-01-01
The geometric quality of the TM and MSS film products was evaluated by making selective photo measurements, such as scale, linear, and area determinations, and by measuring the coordinates of known features on both the film products and map products and then relating these paired observations using a standard linear least squares regression approach. Quantitative interpretation tests are described which evaluate the quality and utility of the TM film products and various band combinations for detecting and identifying important forest and agricultural features.
Comparison of methods for estimating flood magnitudes on small streams in Georgia
Hess, Glen W.; Price, McGlone
1989-01-01
The U.S. Geological Survey has collected flood data for small, natural streams at many sites throughout Georgia during the past 20 years. Flood-frequency relations were developed for these data using four methods: (1) observed (log-Pearson Type III analysis) data, (2) rainfall-runoff model, (3) regional regression equations, and (4) map-model combination. The results of the latter three methods were compared to the analyses of the observed data in order to quantify the differences in the methods and determine if the differences are statistically significant.
Cumulative Risk and Impact Modeling on Environmental Chemical and Social Stressors.
Huang, Hongtai; Wang, Aolin; Morello-Frosch, Rachel; Lam, Juleen; Sirota, Marina; Padula, Amy; Woodruff, Tracey J
2018-03-01
The goal of this review is to identify cumulative modeling methods used to evaluate combined effects of exposures to environmental chemicals and social stressors. The specific review question is: What are the existing quantitative methods used to examine the cumulative impacts of exposures to environmental chemical and social stressors on health? There has been an increase in literature that evaluates combined effects of exposures to environmental chemicals and social stressors on health using regression models; very few studies applied other data mining and machine learning techniques to this problem. The majority of studies we identified used regression models to evaluate combined effects of multiple environmental and social stressors. With proper study design and appropriate modeling assumptions, additional data mining methods may be useful to examine combined effects of environmental and social stressors.
Liu, Y; Zhu, L; Wang, J; Shen, X; Chen, X
2001-11-01
Twelve polycyclic aromatic hydrocarbons (PAHs) were measured in eight homes in Hangzhou during the summer and autumn of 1999. The sources of PAHs and the contributions of the sources to the total concentration of PAHs in the indoor air were identified by a combination of correlation analysis, factor analysis and multiple regression, and equations relating the concentrations of PAHs in indoor and outdoor air to the factors were obtained. The sources of PAHs in the indoor air were found to be domestic cooking, volatilization of mothballs, cigarette smoke, heating, and vehicle exhaust. In smokers' homes, cigarette smoke was the most important factor, contributing 25.8% of the BaP in the indoor air.
Comparing multiple imputation methods for systematically missing subject-level data.
Kline, David; Andridge, Rebecca; Kaizar, Eloise
2017-06-01
When conducting research synthesis, the collection of studies that will be combined often does not measure the same set of variables, which creates missing data. When the studies to combine are longitudinal, missing data can occur on the observation-level (time-varying) or the subject-level (non-time-varying). Traditionally, the focus of missing data methods for longitudinal data has been on missing observation-level variables. In this paper, we focus on missing subject-level variables and compare two multiple imputation approaches: a joint modeling approach and a sequential conditional modeling approach. We find the joint modeling approach to be preferable to the sequential conditional approach, except when the covariance structure of the repeated outcome for each individual has homogeneous variance and exchangeable correlation. Specifically, the regression coefficient estimates from an analysis incorporating imputed values based on the sequential conditional method are attenuated and less efficient than those from the joint method. Remarkably, the estimates from the sequential conditional method are often less efficient than a complete case analysis, which, in the context of research synthesis, implies that we lose efficiency by combining studies. Copyright © 2015 John Wiley & Sons, Ltd.
Cuesta-Vargas, Antonio I; González-Sánchez, Manuel
2014-03-01
Currently, there are no studies combining electromyography (EMG) and sonography to estimate the absolute and relative strength values of erector spinae (ES) muscles in healthy individuals. The purpose of this study was to establish whether the maximum voluntary contraction (MVC) of the ES during isometric contractions could be predicted from the changes in surface EMG as well as in fiber pennation and thickness as measured by sonography. Thirty healthy adults performed 3 isometric extensions at 45° from the vertical to calculate the MVC force. Contractions at 33% and 100% of the MVC force were then used during sonographic and EMG recordings. These measurements were used to observe the architecture and function of the muscles during contraction. Statistical analysis was performed using bivariate regression and regression equations. The slope for each regression equation was statistically significant (P < .001), with R² values of 0.837 and 0.986 for the right and left ES, respectively. The standard errors of the estimate between the sonographic measurements and the regression-estimated pennation angles for the right and left ES were 0.10 and 0.02, respectively. Erector spinae muscle activation can be predicted from the changes in fiber pennation during isometric contractions at 33% and 100% of the MVC force. These findings could be essential for developing a regression equation that could estimate the level of muscle activation from changes in the muscle architecture.
Crane, Paul K; Gibbons, Laura E; Jolley, Lance; van Belle, Gerald
2006-11-01
We present an ordinal logistic regression model for identification of items with differential item functioning (DIF) and apply this model to a Mini-Mental State Examination (MMSE) dataset. We employ item response theory ability estimation in our models. Three nested ordinal logistic regression models are applied to each item. Model testing begins with examination of the statistical significance of the interaction term between ability and the group indicator, consistent with nonuniform DIF. Then we turn our attention to the coefficient of the ability term in models with and without the group term. If including the group term has a marked effect on that coefficient, we declare that it has uniform DIF. We examined DIF related to language of test administration in addition to self-reported race, Hispanic ethnicity, age, years of education, and sex. We used PARSCALE for IRT analyses and STATA for ordinal logistic regression approaches. We used an iterative technique for adjusting IRT ability estimates on the basis of DIF findings. Five items were found to have DIF related to language. These same items also had DIF related to other covariates. The ordinal logistic regression approach to DIF detection, when combined with IRT ability estimates, provides a reasonable alternative for DIF detection. There appear to be several items with significant DIF related to language of test administration in the MMSE. More attention needs to be paid to the specific criteria used to determine whether an item has DIF, not just the technique used to identify DIF.
Lin, Ching-Yih; Lee, Ying-En; Tian, Yu-Feng; Sun, Ding-Ping; Sheu, Ming-Jen; Lin, Chen-Yi; Li, Chien-Feng; Lee, Sung-Wei; Lin, Li-Ching; Chang, I-Wei; Wang, Chieh-Tien; He, Hong-Lin
2017-01-01
Background: Numerous transmembrane receptor tyrosine kinase pathways have been found to play an important role in tumor progression in some cancers. This study aimed to evaluate the clinical impact of Eph receptor A4 (EphA4) in patients with rectal cancer treated with neoadjuvant concurrent chemoradiotherapy (CCRT) combined with mesorectal excision, with special emphasis on tumor regression. Methods: Analysis of the publicly available expression profiling dataset of rectal cancer disclosed that EphA4 was the top-ranking, significantly upregulated, transmembrane receptor tyrosine kinase pathway-associated gene in the non-responders to CCRT, compared with the responders. Immunohistochemical study was conducted to assess the EphA4 expression in pre-treatment biopsy specimens from 172 rectal cancer patients without distant metastasis. The relationships between EphA4 expression and various clinicopathological factors or survival were statistically analyzed. Results: EphA4 expression was significantly associated with vascular invasion (P = 0.015), post-treatment depth of tumor invasion (P = 0.006), and pre-treatment and post-treatment lymph node metastasis (P = 0.004 and P = 0.011, respectively). More importantly, high EphA4 expression was significantly predictive for lesser degree of tumor regression after CCRT (P = 0.031). At univariate analysis, high EphA4 expression was a negative prognosticator for disease-specific survival (P = 0.0009) and metastasis-free survival (P = 0.0001). At multivariate analysis, high expression of EphA4 still served as an independent adverse prognostic factor for disease-specific survival (HR, 2.528; 95% CI, 1.131-5.651; P = 0.024) and metastasis-free survival (HR, 3.908; 95% CI, 1.590-9.601; P = 0.003). Conclusion: High expression of EphA4 predicted lesser degree of tumor regression after CCRT and served as an independent negative prognostic factor in patients with rectal cancer.
Menon, Ramkumar; Bhat, Geeta; Saade, George R; Spratt, Heidi
2014-04-01
To develop classification models of demographic/clinical factors and biomarker data from spontaneous preterm birth in African Americans and Caucasians. Secondary analysis of biomarker data using multivariate adaptive regression splines (MARS), a supervised machine learning algorithm method. Analysis of data on 36 biomarkers from 191 women was reduced by MARS to develop predictive models for preterm birth in African Americans and Caucasians. Maternal plasma, cord plasma collected at admission for preterm or term labor and amniotic fluid at delivery. Data were partitioned into training and testing sets. Variable importance, a relative indicator (0-100%) and area under the receiver operating characteristic curve (AUC) characterized results. Multivariate adaptive regression splines generated models for combined and racially stratified biomarker data. Clinical and demographic data did not contribute to the model. Racial stratification of data produced distinct models in all three compartments. In African Americans maternal plasma samples IL-1RA, TNF-α, angiopoietin 2, TNFRI, IL-5, MIP1α, IL-1β and TGF-α modeled preterm birth (AUC train: 0.98, AUC test: 0.86). In Caucasians TNFR1, ICAM-1 and IL-1RA contributed to the model (AUC train: 0.84, AUC test: 0.68). African Americans cord plasma samples produced IL-12P70, IL-8 (AUC train: 0.82, AUC test: 0.66). Cord plasma in Caucasians modeled IGFII, PDGFBB, TGF-β1, IL-12P70, and TIMP1 (AUC train: 0.99, AUC test: 0.82). Amniotic fluid in African Americans modeled FasL, TNFRII, RANTES, KGF, IGFI (AUC train: 0.95, AUC test: 0.89) and in Caucasians, TNF-α, MCP3, TGF-β3, TNFR1 and angiopoietin 2 (AUC train: 0.94, AUC test: 0.79). Multivariate adaptive regression splines modeled multiple biomarkers associated with preterm birth and demonstrated racial disparity. © 2014 Nordic Federation of Societies of Obstetrics and Gynecology.
Soccer and sexual health education: a promising approach for reducing adolescent births in Haiti.
Kaplan, Kathryn C; Lewis, Judy; Gebrian, Bette; Theall, Katherine
2015-05-01
To explore the effect of an innovative, integrative program in female sexual reproductive health (SRH) and soccer (or fútbol, in Haitian Creole) in rural Haiti by measuring the rate of births among program participants 15-19 years old and their nonparticipant peers. A retrospective cohort study using 2006-2009 data from the computerized data-tracking system of the Haitian Health Foundation (HHF), a U.S.-based nongovernmental organization serving urban and rural populations in Haiti, was used to assess births among girls 15-19 years old who participated in HHF's GenNext program, a combination education-soccer program for youth, based on SRH classes HHF nurses and community workers had been conducting in Haiti for mothers, fathers, and youth; girl-centered health screenings; and an all-female summer soccer league, during 2006-2009 (n = 4 251). Bivariate and multiple logistic regression analyses were carried out to assess differences in the rate of births among program participants according to their level of participation (SRH component only ("EDU") versus both the SRH and soccer components ("SO")) compared to their village peers who did not participate. Hazard ratios (HRs) of birth rates were estimated using Cox regression analysis of childbearing data for the three different groups. In the multiple logistic regression analysis, only the girls in the "EDU" group had significantly fewer births than the nonparticipants after adjusting for confounders (odds ratio = 0.535; 95% confidence interval (CI) = 0.304, 0.940). The Cox regression analysis demonstrated that those in the EDU group (HR = 0.893; 95% CI = 0.802, 0.994) and to a greater degree those in the SO group (HR = 0.631; 95% CI = 0.558, 0.714) were significantly protected against childbearing between the ages of 15 and 19 years.
HHF's GenNext program demonstrates the effectiveness of utilizing nurse educators, community mobilization, and youth participation in sports, education, and structured youth groups to promote and sustain health for adolescent girls and young women.
Maintenance Operations in Mission Oriented Protective Posture Level IV (MOPPIV)
1987-10-01
Repair FADAC Printed Circuit Board ... 6 3. Data Analysis Techniques ... 6 a. Multiple Linear Regression... ANALYSIS/DISCUSSION ... 12 1. Example of Regression Analysis ... 12 2. Regression results for all tasks... 6 TABLE 9. Task Grouping for Analysis ... 7 TABLE 10. Remove/Replace H60A3 Power Pack ... 8 TABLE
Study on Hyperspectral Characteristics and Estimation Model of Soil Mercury Content
NASA Astrophysics Data System (ADS)
Liu, Jinbao; Dong, Zhenyu; Sun, Zenghui; Ma, Hongchao; Shi, Lei
2017-12-01
In this study, the mercury content of 44 soil samples from the Guan Zhong area of Shaanxi Province was used as the data source, and the soil reflectance spectra were obtained with an ASD Field Spec HR instrument (350-2500 nm). The reflectance characteristics of soils with different mercury contents and the effect of different pre-treatment methods on the soil heavy-metal spectral inversion model were compared. First-order differential, second-order differential and logarithmic reflectance transformations were carried out after NOR, MSC and SNV pre-treatment, and the bands most sensitive to mercury content under each mathematical transformation were selected. A hyperspectral estimation model was established by regression. The results of chemical analysis show that there is serious Hg pollution in the study area. The results show that: (1) reflectance decreases as mercury content increases, and the mercury-sensitive regions are located at 392-455 nm, 923-1040 nm and 1806-1969 nm. (2) Combining the NOR, MSC and SNV transformations with differential transformations can enhance the information on heavy metal elements in the soil, and combining highly correlated bands can improve the stability and predictive ability of the model. (3) The partial least squares regression model based on the logarithm of the original reflectance performs better and has higher precision (Rc² = 0.9912, RMSEC = 0.665; Rv² = 0.9506, RMSEP = 1.93), which enables rapid prediction of the mercury content in this region.
Parodi, Stefano; Dosi, Corrado; Zambon, Antonella; Ferrari, Enrico; Muselli, Marco
2017-12-01
Identifying potential risk factors for problem gambling (PG) is of primary importance for planning preventive and therapeutic interventions. We illustrate a new approach based on the combination of standard logistic regression and an innovative method of supervised data mining (Logic Learning Machine or LLM). Data were taken from a pilot cross-sectional study to identify subjects with PG behaviour, assessed by two internationally validated scales (SOGS and Lie/Bet). Information was obtained from 251 gamblers recruited in six betting establishments. Data on socio-demographic characteristics, lifestyle and cognitive-related factors, and type, place and frequency of preferred gambling were obtained by a self-administered questionnaire. The following variables associated with PG were identified: instant gratification games, alcohol abuse, cognitive distortion, illegal behaviours and having started gambling with a relative or a friend. Furthermore, the combination of LLM and LR indicated the presence of two different types of PG, namely: (a) daily gamblers, more prone to illegal behaviour, with poor money management skills and who started gambling at an early age, and (b) non-daily gamblers, characterised by superstitious beliefs and a higher preference for immediate reward games. Finally, instant gratification games were strongly associated with the number of games usually played. Studies on gamblers habitually frequenting betting shops are rare. The finding of different types of PG by habitual gamblers deserves further analysis in larger studies. Advanced data mining algorithms, like LLM, are powerful tools and potentially useful in identifying risk factors for PG.
Gotvald, Anthony J.; Barth, Nancy A.; Veilleux, Andrea G.; Parrett, Charles
2012-01-01
Methods for estimating the magnitude and frequency of floods in California that are not substantially affected by regulation or diversions have been updated. Annual peak-flow data through water year 2006 were analyzed for 771 streamflow-gaging stations (streamgages) in California having 10 or more years of data. Flood-frequency estimates were computed for the streamgages by using the expected moments algorithm to fit a Pearson Type III distribution to logarithms of annual peak flows for each streamgage. Low-outlier and historic information were incorporated into the flood-frequency analysis, and a generalized Grubbs-Beck test was used to detect multiple potentially influential low outliers. Special methods for fitting the distribution were developed for streamgages in the desert region in southeastern California. Additionally, basin characteristics for the streamgages were computed by using a geographical information system. Regional regression analysis, using generalized least squares regression, was used to develop a set of equations for estimating flows with 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probabilities for ungaged basins in California that are outside of the southeastern desert region. Flood-frequency estimates and basin characteristics for 630 streamgages were combined to form the final database used in the regional regression analysis. Five hydrologic regions were developed for the area of California outside of the desert region. The final regional regression equations are functions of drainage area and mean annual precipitation for four of the five regions. In one region, the Sierra Nevada region, the final equations are functions of drainage area, mean basin elevation, and mean annual precipitation. Average standard errors of prediction for the regression equations in all five regions range from 42.7 to 161.9 percent. 
For the desert region of California, an analysis of 33 streamgages was used to develop regional estimates of all three parameters (mean, standard deviation, and skew) of the log-Pearson Type III distribution. The regional estimates were then used to develop a set of equations for estimating flows with 50-, 20-, 10-, 4-, 2-, 1-, 0.5-, and 0.2-percent annual exceedance probabilities for ungaged basins. The final regional regression equations are functions of drainage area. Average standard errors of prediction for these regression equations range from 214.2 to 856.2 percent. Annual peak-flow data through water year 2006 were analyzed for eight streamgages in California having 10 or more years of data considered to be affected by urbanization. Flood-frequency estimates were computed for the urban streamgages by fitting a Pearson Type III distribution to logarithms of annual peak flows for each streamgage. Regression analysis could not be used to develop flood-frequency estimation equations for urban streams because of the limited number of sites. Flood-frequency estimates for the eight urban sites were graphically compared to flood-frequency estimates for 630 non-urban sites. The regression equations developed from this study will be incorporated into the U.S. Geological Survey (USGS) StreamStats program. The StreamStats program is a Web-based application that provides streamflow statistics and basin characteristics for USGS streamgages and ungaged sites of interest. StreamStats can also compute basin characteristics and provide estimates of streamflow statistics for ungaged sites when users select the location of a site along any stream in California.
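A regional regression equation that depends on drainage area alone is often written as a power law, Q = a * A^b, and fitted as a straight line in log space. The sketch below shows that idea with ordinary least squares on synthetic data. The power-law form, the coefficient values, and the function name are assumptions for illustration; the study itself used generalized least squares, which this sketch does not implement, and its published coefficients are not reproduced here.

```python
import math

def fit_power_law(areas, flows):
    """Ordinary least-squares fit of log10(Q) = log10(a) + b * log10(A).

    Returns (a, b) for the power law Q = a * A**b.
    """
    xs = [math.log10(v) for v in areas]
    ys = [math.log10(v) for v in flows]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = 10.0 ** (mean_y - b * mean_x)
    return a, b

# Synthetic basins: drainage areas and noise-free peak flows generated
# from hypothetical coefficients a = 120, b = 0.55.
areas = [1.0, 5.0, 20.0, 80.0, 300.0]
flows = [120.0 * area ** 0.55 for area in areas]

coef_a, coef_b = fit_power_law(areas, flows)
estimate_at_50 = coef_a * 50.0 ** coef_b  # estimate for an ungaged basin
print(coef_a, coef_b, estimate_at_50)
```

With real streamgage data the fit would not be exact, and the scatter about the line is what the reported standard errors of prediction summarize.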
Li, Xiaomeng; Fang, Dansi; Cong, Xiaodong; Cao, Gang; Cai, Hao; Cai, Baochang
2012-12-01
A method is described using rapid and sensitive Fourier transform near-infrared spectroscopy combined with high-performance liquid chromatography-diode array detection for the simultaneous identification and determination of four bioactive compounds in crude Radix Scrophulariae samples. Partial least squares regression was selected as the analysis type, and multiplicative scatter correction, second derivative, and Savitzky-Golay filtering were adopted for the spectral pretreatment. The correlation coefficients (R) of the calibration models were above 0.96 and the root mean square errors of prediction were under 0.028. The developed models were applied to unknown samples with satisfactory results. The established method was validated and can be applied to the intrinsic quality control of crude Radix Scrophulariae.
Govindarajan, Parameswari; Schlewitz, Gudrun; Schliefke, Nathalie; Weisweiler, David; Alt, Volker; Thormann, Ulrich; Lips, Katrin Susanne; Wenisch, Sabine; Langheinrich, Alexander C.; Zahner, Daniel; Hemdan, Nasr Y.; Böcker, Wolfgang; Schnettler, Reinhard; Heiss, Christian
2013-01-01
Background: Osteoporosis is a multi-factorial, chronic, skeletal disease highly prevalent in post-menopausal women and is influenced by hormonal and dietary factors. Because animal models are imperative for disease diagnostics, the present study establishes and evaluates enhanced osteoporosis obtained through combined ovariectomy and deficient diet by DEXA (dual-energy X-ray absorptiometry) for a prolonged time period. Material/Methods: Sprague-Dawley rats were randomly divided into sham (laparotomized) and OVX-diet (ovariectomized and fed with deficient diet) groups. Different skeletal sites were scanned by DEXA at the following time points: M0 (baseline), M12 (12 months post-surgery), and M14 (14 months post-surgery). Parameters analyzed included BMD (bone mineral density), BMC (bone mineral content), bone area, and fat (%). Regression analysis was performed to determine the interrelationships between BMC, BMD, and bone area from M0 to M14. Results: BMD and BMC were significantly lower in OVX-diet rats at M12 and M14 compared to sham rats. The Z-scores were below −5 in OVX-diet rats at M12, but still decreased at M14 in OVX-diet rats. Bone area and percent fat were significantly lower in OVX-diet rats at M14 compared to sham rats. The regression coefficients for BMD vs. bone area, BMC vs. bone area, and BMC vs. BMD of OVX-diet rats increased with time. This is explained by differential percent change in BMD, BMC, and bone area with respect to time and disease progression. Conclusions: Combined ovariectomy and deficient diet in rats caused significant reduction of BMD, BMC, and bone area, with nearly 40% bone loss after 14 months, indicating the development of severe osteoporosis. An increasing regression coefficient of BMD vs. bone area with disease progression emphasizes bone area as an important parameter, along with BMD and BMC, for prediction of fracture risk. PMID:23446183
Govindarajan, Parameswari; Schlewitz, Gudrun; Schliefke, Nathalie; Weisweiler, David; Alt, Volker; Thormann, Ulrich; Lips, Katrin Susanne; Wenisch, Sabine; Langheinrich, Alexander C; Zahner, Daniel; Hemdan, Nasr Y; Böcker, Wolfgang; Schnettler, Reinhard; Heiss, Christian
2013-02-28
Osteoporosis is a multi-factorial, chronic, skeletal disease highly prevalent in post-menopausal women and is influenced by hormonal and dietary factors. Because animal models are imperative for disease diagnostics, the present study establishes and evaluates enhanced osteoporosis obtained through combined ovariectomy and deficient diet by DEXA (dual-energy X-ray absorptiometry) for a prolonged time period. Sprague-Dawley rats were randomly divided into sham (laparotomized) and OVX-diet (ovariectomized and fed with deficient diet) groups. Different skeletal sites were scanned by DEXA at the following time points: M0 (baseline), M12 (12 months post-surgery), and M14 (14 months post-surgery). Parameters analyzed included BMD (bone mineral density), BMC (bone mineral content), bone area, and fat (%). Regression analysis was performed to determine the interrelationships between BMC, BMD, and bone area from M0 to M14. BMD and BMC were significantly lower in OVX-diet rats at M12 and M14 compared to sham rats. The Z-scores were below -5 in OVX-diet rats at M12, but still decreased at M14 in OVX-diet rats. Bone area and percent fat were significantly lower in OVX-diet rats at M14 compared to sham rats. The regression coefficients for BMD vs. bone area, BMC vs. bone area, and BMC vs. BMD of OVX-diet rats increased with time. This is explained by differential percent change in BMD, BMC, and bone area with respect to time and disease progression. Combined ovariectomy and deficient diet in rats caused significant reduction of BMD, BMC, and bone area, with nearly 40% bone loss after 14 months, indicating the development of severe osteoporosis. An increasing regression coefficient of BMD vs. bone area with disease progression emphasizes bone area as an important parameter, along with BMD and BMC, for prediction of fracture risk.
Explaining Match Outcome During The Men’s Basketball Tournament at The Olympic Games
Leicht, Anthony S.; Gómez, Miguel A.; Woods, Carl T.
2017-01-01
In preparation for the Olympics, there is a limited opportunity for coaches and athletes to interact regularly, with team performance indicators providing important guidance to coaches for enhanced match success at the elite level. This study examined the relationship between match outcome and team performance indicators during men’s basketball tournaments at the Olympic Games. Twelve team performance indicators were collated from all men’s teams and matches during the basketball tournament of the 2004-2016 Olympic Games (n = 156). Linear and non-linear analyses examined the relationship between match outcome and team performance indicator characteristics; namely, binary logistic regression and a conditional inference (CI) classification tree. The most parsimonious logistic regression model retained ‘assists’, ‘defensive rebounds’, ‘field-goal percentage’, ‘fouls’, ‘fouls against’, ‘steals’ and ‘turnovers’ (delta AIC <0.01; Akaike weight = 0.28) with a classification accuracy of 85.5%. Conversely, four performance indicators were retained with the CI classification tree with an average classification accuracy of 81.4%. However, it was the combination of ‘field-goal percentage’ and ‘defensive rebounds’ that provided the greatest probability of winning (93.2%). Match outcome during the men’s basketball tournaments at the Olympic Games was identified by a unique combination of performance indicators. Despite the average model accuracy being marginally higher for the logistic regression analysis, the CI classification tree offered a greater practical utility for coaches through its resolution of non-linear phenomena to guide team success.
Key points: A unique combination of team performance indicators explained 93.2% of winning observations in men’s basketball at the Olympics. Monitoring of these team performance indicators may provide coaches with the capability to devise multiple game plans or strategies to enhance their likelihood of winning. Incorporation of machine learning techniques with team performance indicators may provide a valuable and strategic approach to explain patterns within multivariate datasets in sport science. PMID:29238245
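The abstract above reports a delta AIC and an Akaike weight for the retained logistic regression model. As a rough illustration of how those two quantities relate, here is a minimal sketch of the standard Akaike-weight calculation. The AIC values are hypothetical placeholders, not figures from the study, and the function name is an assumption for illustration.

```python
import math


def akaike_weights(aic_values):
    """Return the Akaike weight of each candidate model.

    delta_i = AIC_i - min(AIC); weight_i = exp(-delta_i / 2) / sum_j exp(-delta_j / 2)
    """
    best = min(aic_values)
    relative_likelihoods = [math.exp(-0.5 * (aic - best)) for aic in aic_values]
    total = sum(relative_likelihoods)
    return [rl / total for rl in relative_likelihoods]


# Hypothetical AIC scores for three candidate models (illustrative only).
candidate_aics = [100.0, 101.2, 103.5]
weights = akaike_weights(candidate_aics)

for aic, weight in zip(candidate_aics, weights):
    print(f"AIC = {aic:6.1f}  weight = {weight:.3f}")
```

The weights sum to one across the candidate set, so a weight of 0.28 for the best model would indicate that several competing models carried non-trivial support.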
Neuberger, Ulf; Kickingereder, Philipp; Helluy, Xavier; Fischer, Manuel; Bendszus, Martin; Heiland, Sabine
2017-12-01
Non-invasive detection of 2-hydroxyglutarate (2HG) by magnetic resonance spectroscopy is attractive since it is related to tumor metabolism. Here, we compare the detection accuracy of 2HG in a controlled phantom setting via widely used localized spectroscopy sequences quantified by linear combination of metabolite signals vs. a more complex approach applying a J-difference editing technique at 9.4T. Different phantoms, composed of a concentration series of 2HG and overlapping brain metabolites, were measured with an optimized point-resolved-spectroscopy sequence (PRESS) and an in-house developed J-difference editing sequence. The acquired spectra were post-processed with LCModel and a simulated metabolite set (PRESS) or with a quantification formula for J-difference editing. Linear regression analysis demonstrated a high correlation of real 2HG values with those measured with the PRESS method (adjusted R-squared: 0.700, p<0.001) as well as with those measured with the J-difference editing method (adjusted R-squared: 0.908, p<0.001). The regression model with the J-difference editing method, however, had a significantly higher explanatory value than the regression model with the PRESS method (p<0.0001). Moreover, with J-difference editing 2HG was discernible down to 1mM, whereas with the PRESS method 2HG values were not discernible below 2mM and showed higher systematic errors, particularly in phantoms with high concentrations of N-acetyl-aspartate (NAA) and glutamate (Glu). In summary, quantification of 2HG with linear combination of metabolite signals shows high systematic errors, particularly at low 2HG concentration and high concentration of confounding metabolites such as NAA and Glu. In contrast, J-difference editing offers a more accurate quantification even at low 2HG concentrations, which outweighs the downsides of longer measurement time and more complex postprocessing. Copyright © 2017. Published by Elsevier GmbH.
Bili, Eleni; Dampala, Kaliopi; Iakovou, Ioannis; Tsolakidis, Dimitrios; Giannakou, Anastasia; Tarlatzis, Basil C
2014-08-01
The aim of this study was to determine the performance of prostate specific antigen (PSA) and ultrasound parameters, such as ovarian volume and outline, in the diagnosis of polycystic ovary syndrome (PCOS). This prospective, observational, case-controlled study included 43 women with PCOS and 40 controls. Between day 3 and 5 of the menstrual cycle, fasting serum samples were collected and transvaginal ultrasound was performed. The diagnostic performance of each parameter [total PSA (tPSA), total-to-free PSA ratio (tPSA:fPSA), ovarian volume, ovarian outline] was estimated by means of receiver operating characteristic (ROC) analysis, along with area under the curve (AUC), threshold, sensitivity, specificity as well as positive (+) and negative (-) likelihood ratios (LRs). Multivariate logistic regression models, using ovarian volume and ovarian outline, were constructed. The tPSA and tPSA:fPSA ratio resulted in AUC of 0.74 and 0.70, respectively, with moderate specificity/sensitivity and insufficient LR+/- values. In the multivariate logistic regression model, the combination of ovarian volume and outline had a sensitivity of 97.7% and a specificity of 97.5% in the diagnosis of PCOS, with +LR and -LR values of 39.1 and 0.02, respectively. In women with PCOS, tPSA and tPSA:fPSA ratio have similar diagnostic performance. The use of a multivariate logistic regression model, incorporating ovarian volume and outline, offers very good diagnostic accuracy in distinguishing women with PCOS from controls. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
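The likelihood ratios quoted in the abstract above follow from the reported sensitivity and specificity through the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. A small sketch of that arithmetic is shown below; the function name is illustrative, and only the two input percentages are taken from the abstract.

```python
def likelihood_ratios(sensitivity, specificity):
    """Positive and negative likelihood ratios from sensitivity and specificity (as fractions)."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


# Sensitivity 97.7% and specificity 97.5%, as reported for ovarian volume plus outline.
lr_pos, lr_neg = likelihood_ratios(0.977, 0.975)
print(f"LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}")
```

With these inputs the rounded values agree with the +LR of 39.1 and -LR of 0.02 given in the abstract.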
NASA Technical Reports Server (NTRS)
Rummler, D. R.
1976-01-01
The results are presented of investigations to apply regression techniques to the development of methodology for creep-rupture data analysis. Regression analysis techniques are applied to the explicit description of the creep behavior of materials for space shuttle thermal protection systems. A regression analysis technique is compared with five parametric methods for analyzing three simulated and twenty real data sets, and a computer program for the evaluation of creep-rupture data is presented.
NASA Technical Reports Server (NTRS)
Bigger, J. T. Jr; Steinman, R. C.; Rolnitzky, L. M.; Fleiss, J. L.; Albrecht, P.; Cohen, R. J.
1996-01-01
BACKGROUND. The purposes of the present study were (1) to establish normal values for the regression of log(power) on log(frequency) for RR-interval fluctuations in healthy middle-aged persons, (2) to determine the effects of myocardial infarction on the regression of log(power) on log(frequency), (3) to determine the effect of cardiac denervation on the regression of log(power) on log(frequency), and (4) to assess the ability of power law regression parameters to predict death after myocardial infarction. METHODS AND RESULTS. We studied three groups: (1) 715 patients with recent myocardial infarction; (2) 274 healthy persons age and sex matched to the infarct sample; and (3) 19 patients with heart transplants. Twenty-four-hour RR-interval power spectra were computed using fast Fourier transforms and log(power) was regressed on log(frequency) between 10(-4) and 10(-2) Hz. There was a power law relation between log(power) and log(frequency). That is, the function described a descending straight line that had a slope of approximately -1 in healthy subjects. For the myocardial infarction group, the regression line for log(power) on log(frequency) was shifted downward and had a steeper negative slope (-1.15). The transplant (denervated) group showed a larger downward shift in the regression line and a much steeper negative slope (-2.08). The correlation between traditional power spectral bands and slope was weak, and that with log(power) at 10(-4) Hz was only moderate. Slope and log(power) at 10(-4) Hz were used to predict mortality and were compared with the predictive value of traditional power spectral bands. Slope and log(power) at 10(-4) Hz were excellent predictors of all-cause mortality or arrhythmic death. To optimize the prediction of death, we calculated a log(power) intercept that was uncorrelated with the slope of the power law regression line.
We found that the combination of slope and zero-correlation log(power) was an outstanding predictor, with a relative risk of > 10, and was better than any combination of the traditional power spectral bands. The combination of slope and log(power) at 10(-4) Hz also was an excellent predictor of death after myocardial infarction. CONCLUSIONS. Myocardial infarction or denervation of the heart causes a steeper slope and decreased height of the power law regression relation between log(power) and log(frequency) of RR-interval fluctuations. Individually and, especially, combined, the power law regression parameters are excellent predictors of death of any cause or arrhythmic death and predict these outcomes better than the traditional power spectral bands.
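The power law regression described in this abstract amounts to an ordinary least-squares line fitted to log(power) against log(frequency) over the band from 10^-4 to 10^-2 Hz. The sketch below illustrates that fit on a synthetic 1/f spectrum. It assumes base-10 logarithms and a spectrum that has already been computed; the function name and the synthetic data are illustrative and are not the authors' code or data.

```python
import math


def power_law_fit(freqs_hz, powers, f_lo=1e-4, f_hi=1e-2):
    """Least-squares fit of log10(power) = intercept + slope * log10(frequency) within a band."""
    points = [
        (math.log10(f), math.log10(p))
        for f, p in zip(freqs_hz, powers)
        if f_lo <= f <= f_hi and p > 0
    ]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic spectrum with power proportional to 1/f (expected slope of -1).
freqs = [10 ** (-4 + 2 * i / 49) for i in range(50)]
powers = [1.0 / f for f in freqs]

slope, intercept = power_law_fit(freqs, powers)
log_power_at_1e4 = intercept + slope * math.log10(1e-4)
print(f"slope = {slope:.2f}, log(power) at 1e-4 Hz = {log_power_at_1e4:.2f}")
```

The two printed quantities correspond to the slope and the log(power) at 10^-4 Hz that the study used as predictors; the zero-correlation intercept mentioned in the abstract is not reproduced here.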
Chang, Brian A; Pearson, William S; Owusu-Edusei, Kwame
2017-04-01
We used a combination of hot spot analysis (HSA) and spatial regression to examine county-level hot spot correlates for the most commonly reported nonviral sexually transmitted infections (STIs) in the 48 contiguous states in the United States (US). We obtained reported county-level total case rates of chlamydia, gonorrhea, and primary and secondary (P&S) syphilis in all counties in the 48 contiguous states from national surveillance data and computed temporally smoothed rates using 2008-2012 data. Covariates were obtained from county-level multiyear (2008-2012) American Community Surveys from the US census. We conducted HSA to identify hot spot counties for all three STIs. We then applied spatial logistic regression with the spatial error model to determine the association between the identified hot spots and the covariates. HSA indicated that ≥84% of hot spots for each STI were in the South. Spatial regression results indicated that a 10-unit increase in the percentage of Black non-Hispanics was associated with a ≈42% (P < 0.01) [≈22% (P < 0.01) for Hispanics] increase in the odds of being a hot spot county for chlamydia and gonorrhea, and ≈27% (P < 0.01) [≈11% (P < 0.01) for Hispanics] for P&S syphilis. Compared with the other regions (West, Midwest, and Northeast), counties in the South were 6.5 (P < 0.01; chlamydia), 9.6 (P < 0.01; gonorrhea), and 4.7 (P < 0.01; P&S syphilis) times more likely to be hot spots. Our study provides important information on hot spot clusters of nonviral STIs in the entire United States, including associations between hot spot counties and sociodemographic factors. Published by Elsevier Inc.
van der Meer, D; Hoekstra, P J; van Donkelaar, M; Bralten, J; Oosterlaan, J; Heslenfeld, D; Faraone, S V; Franke, B; Buitelaar, J K; Hartman, C A
2017-01-01
Identifying genetic variants contributing to attention-deficit/hyperactivity disorder (ADHD) is complicated by the involvement of numerous common genetic variants with small effects, interacting with each other as well as with environmental factors, such as stress exposure. Random forest regression is well suited to explore this complexity, as it allows for the analysis of many predictors simultaneously, taking into account any higher-order interactions among them. Using random forest regression, we predicted ADHD severity, measured by Conners’ Parent Rating Scales, from 686 adolescents and young adults (of which 281 were diagnosed with ADHD). The analysis included 17 374 single-nucleotide polymorphisms (SNPs) across 29 genes previously linked to hypothalamic–pituitary–adrenal (HPA) axis activity, together with information on exposure to 24 individual long-term difficulties or stressful life events. The model explained 12.5% of variance in ADHD severity. The most important SNP, which also showed the strongest interaction with stress exposure, was located in a region regulating the expression of telomerase reverse transcriptase (TERT). Other high-ranking SNPs were found in or near NPSR1, ESR1, GABRA6, PER3, NR3C2 and DRD4. Chronic stressors were more influential than single, severe, life events. Top hits were partly shared with conduct problems. We conclude that random forest regression may be used to investigate how multiple genetic and environmental factors jointly contribute to ADHD. It is able to implicate novel SNPs of interest, interacting with stress exposure, and may explain inconsistent findings in ADHD genetics. This exploratory approach may be best combined with more hypothesis-driven research; top predictors and their interactions with one another should be replicated in independent samples. PMID:28585928
NASA Astrophysics Data System (ADS)
Mekanik, F.; Imteaz, M. A.; Gato-Trinidad, S.; Elmahdi, A.
2013-10-01
In this study, the application of Artificial Neural Networks (ANN) and Multiple Regression analysis (MR) to forecast long-term seasonal spring rainfall in Victoria, Australia was investigated using lagged El Nino Southern Oscillation (ENSO) and Indian Ocean Dipole (IOD) as potential predictors. The use of dual (combined lagged ENSO-IOD) input sets for calibrating and validating ANN and MR models is proposed to investigate the simultaneous effect of past values of these two major climate modes on long-term spring rainfall prediction. The MR models that did not violate the limits of statistical significance and multicollinearity were selected for future spring rainfall forecast. The ANN was developed in the form of a multilayer perceptron using the Levenberg-Marquardt algorithm. Both MR and ANN modelling were assessed statistically using mean square error (MSE), mean absolute error (MAE), Pearson correlation (r) and Willmott index of agreement (d). The developed MR and ANN models were tested on out-of-sample test sets; the MR models showed very poor generalisation ability for east Victoria with correlation coefficients of -0.99 to -0.90 compared to ANN with correlation coefficients of 0.42-0.93; ANN models also showed better generalisation ability for central and west Victoria with correlation coefficients of 0.68-0.85 and 0.58-0.97 respectively. The ability of multiple regression models to forecast out-of-sample sets is compatible with ANN for Daylesford in central Victoria and Kaniva in west Victoria (r = 0.92 and 0.67 respectively). The errors of the testing sets for ANN models are generally lower compared to multiple regression models. The statistical analysis suggests the potential of ANN over MR models for rainfall forecasting using large scale climate modes.
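The four skill measures named in this abstract (MSE, MAE, Pearson r, and the Willmott index of agreement d) can be computed directly from paired observed and predicted values. The sketch below uses the original Willmott formulation, d = 1 - Σ(P - O)² / Σ(|P - Ō| + |O - Ō|)², which is an assumption about the variant the authors used. The rainfall numbers are made up for illustration.

```python
import math


def forecast_metrics(observed, predicted):
    """Return MSE, MAE, Pearson r, and Willmott's index of agreement d."""
    n = len(observed)
    obs_mean = sum(observed) / n
    pred_mean = sum(predicted) / n
    errors = [p - o for o, p in zip(observed, predicted)]

    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n

    covariance = sum((o - obs_mean) * (p - pred_mean) for o, p in zip(observed, predicted))
    ss_obs = sum((o - obs_mean) ** 2 for o in observed)
    ss_pred = sum((p - pred_mean) ** 2 for p in predicted)
    r = covariance / math.sqrt(ss_obs * ss_pred)

    # Willmott's d compares squared error with the "potential error" about the observed mean.
    potential = sum((abs(p - obs_mean) + abs(o - obs_mean)) ** 2 for o, p in zip(observed, predicted))
    d = 1.0 - sum(e * e for e in errors) / potential
    return {"mse": mse, "mae": mae, "r": r, "d": d}


# Hypothetical spring rainfall totals in mm (illustrative only).
observed_mm = [100.0, 150.0, 200.0, 250.0]
predicted_mm = [110.0, 140.0, 210.0, 240.0]
metrics = forecast_metrics(observed_mm, predicted_mm)
print(metrics)
```

A strongly negative r on a test set, as reported for the MR models in east Victoria, means the forecasts moved opposite to the observations even if their magnitude was plausible.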
Prediction of cold and heat patterns using anthropometric measures based on machine learning.
Lee, Bum Ju; Lee, Jae Chul; Nam, Jiho; Kim, Jong Yeol
2018-01-01
To examine the association of body shape with cold and heat patterns, to determine which anthropometric measure is the best indicator for discriminating between the two patterns, and to investigate whether using a combination of measures can improve the predictive power to diagnose these patterns. Based on a total of 4,859 subjects (3,000 women and 1,859 men), statistical analyses using binary logistic regression were performed to assess the significance of the difference and the predictive power of each anthropometric measure, and binary logistic regression and Naive Bayes with the variable selection technique were used to assess the improvement in the predictive power of the patterns using the combined measures. In women, the strongest indicators for determining the cold and heat patterns among anthropometric measures were body mass index (BMI) and rib circumference; in men, the best indicator was BMI. In experiments using a combination of measures, the values of the area under the receiver operating characteristic curve in women were 0.776 by Naive Bayes and 0.772 by logistic regression, and the values in men were 0.788 by Naive Bayes and 0.779 by logistic regression. Individuals with a higher BMI have a tendency toward a heat pattern in both women and men. The use of a combination of anthropometric measures can slightly improve the diagnostic accuracy. Our findings can provide fundamental information for the diagnosis of cold and heat patterns based on body shape for personalized medicine.
Standards for Standardized Logistic Regression Coefficients
ERIC Educational Resources Information Center
Menard, Scott
2011-01-01
Standardized coefficients in logistic regression analysis have the same utility as standardized coefficients in linear regression analysis. Although there has been no consensus on the best way to construct standardized logistic regression coefficients, there is now sufficient evidence to suggest a single best approach to the construction of a…
Linear regression analysis: part 14 of a series on evaluation of scientific publications.
Schneider, Astrid; Hommel, Gerhard; Blettner, Maria
2010-11-01
Regression analysis is an important statistical method for the analysis of medical data. It enables the identification and characterization of relationships among multiple factors. It also enables the identification of prognostically relevant risk factors and the calculation of risk scores for individual prognostication. This article is based on selected textbooks of statistics, a selective review of the literature, and our own experience. After a brief introduction of the uni- and multivariable regression models, illustrative examples are given to explain what the important considerations are before a regression analysis is performed, and how the results should be interpreted. The reader should then be able to judge whether the method has been used correctly and interpret the results appropriately. The performance and interpretation of linear regression analysis are subject to a variety of pitfalls, which are discussed here in detail. The reader is made aware of common errors of interpretation through practical examples. Both the opportunities for applying linear regression analysis and its limitations are presented.
An improved multiple linear regression and data analysis computer program package
NASA Technical Reports Server (NTRS)
Sidik, S. M.
1972-01-01
NEWRAP, an improved version of a previous multiple linear regression program called RAPIER, CREDUC, and CRSPLT, allows for a complete regression analysis including cross plots of the independent and dependent variables, correlation coefficients, regression coefficients, analysis of variance tables, t-statistics and their probability levels, rejection of independent variables, plots of residuals against the independent and dependent variables, and a canonical reduction of quadratic response functions useful in optimum seeking experimentation. A major improvement over RAPIER is that all regression calculations are done in double precision arithmetic.
Mapping soil textural fractions across a large watershed in north-east Florida.
Lamsal, S; Mishra, U
2010-08-01
Assessing regional-scale soil spatial variation and mapping its distribution are constrained by sparse data, which are collected through field surveys that are labor intensive and cost prohibitive. We explored geostatistical (ordinary kriging-OK), regression (Regression Tree-RT), and hybrid methods (RT plus residual Sequential Gaussian Simulation-SGS) to map soil textural fractions across the Santa Fe River Watershed (3585 km²) in north-east Florida. Soil samples collected from four depths (L1: 0-30 cm, L2: 30-60 cm, L3: 60-120 cm, and L4: 120-180 cm) at 141 locations were analyzed for soil textural fractions (sand, silt and clay contents), and combined with textural data (15 profiles) assembled under the Florida Soil Characterization program. Textural fractions in L1 and L2 were autocorrelated, and spatially mapped across the watershed. OK performance was poor, which may be attributed to the sparse sampling. RT model structure varied among textural fractions, and the variation explained by the model ranged from 25% for L1 silt to 61% for L2 clay content. Regression residuals were simulated using SGS, and the average of the simulated residuals was used to approximate a regression residual distribution map, which was added to the regression trend maps. Independent validation of the prediction maps showed that regression models performed slightly better than OK, and regression combined with the average of simulated regression residuals improved predictions beyond the regression model. Sand content >90% in both 0-30 and 30-60 cm covered 80.6% of the watershed area. Copyright 2010 Elsevier Ltd. All rights reserved.
Zang, Qing-Ce; Wang, Jia-Bo; Kong, Wei-Jun; Jin, Cheng; Ma, Zhi-Jie; Chen, Jing; Gong, Qian-Feng; Xiao, Xiao-He
2011-12-01
The fingerprints of artificial Calculus bovis extracts from different solvents were established by ultra-performance liquid chromatography (UPLC) and the anti-bacterial activities of artificial C. bovis extracts on Staphylococcus aureus (S. aureus) growth were studied by microcalorimetry. The UPLC fingerprints were evaluated using hierarchical clustering analysis. Some quantitative parameters obtained from the thermogenic curves of S. aureus growth affected by artificial C. bovis extracts were analyzed using principal component analysis. The spectrum-effect relationships between UPLC fingerprints and anti-bacterial activities were investigated using multi-linear regression analysis. The results showed that peak 1 (taurocholate sodium), peak 3 (unknown compound), peak 4 (cholic acid), and peak 6 (chenodeoxycholic acid) are more significant than the other peaks with the standard parameter estimate 0.453, -0.166, 0.749, 0.025, respectively. So, compounds cholic acid, taurocholate sodium, and chenodeoxycholic acid might be the major anti-bacterial components in artificial C. bovis. Altogether, this work provides a general model of the combination of UPLC chromatography and anti-bacterial effect to study the spectrum-effect relationships of artificial C. bovis extracts, which can be used to discover the main anti-bacterial components in artificial C. bovis or other Chinese herbal medicines with anti-bacterial effects. Copyright © 2011 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Meta-analysis of association between mobile phone use and glioma risk.
Wang, Yabo; Guo, Xiaqing
2016-12-01
The purpose of this study was to evaluate the association between mobile phone use and glioma risk by pooling the published data. By searching the Medline, EMBASE, and CNKI databases with a systematic search strategy, we screened the openly published case-control or cohort studies about mobile phone use and glioma risk. The pooled odds of mobile phone use in glioma patients versus healthy controls were calculated by meta-analysis. The statistical analysis was done with Stata12.0 software (http://www.stata.com). After searching the Medline, EMBASE, and CNKI databases, we ultimately included 11 studies ranging from 2001 to 2008. For the ≥1 year group, the data were pooled by a random effects model. The combined data showed that there was no association between mobile phone use and glioma, odds ratio (OR) = 1.08 (95% confidence interval [CI]: 0.91-1.25, P > 0.05). However, a significant association was found between mobile phone use for more than 5 years and glioma risk, OR = 1.35 (95% CI: 1.09-1.62, P < 0.05). The publication bias of this study was evaluated by funnel plot and linear regression test. The funnel plot and linear regression test (t = 0.25, P = 0.81) did not indicate any publication bias. Long-term mobile phone use may increase the risk of developing glioma according to this meta-analysis.
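The abstract states that odds ratios were pooled with a random effects model but does not name the estimator. One common choice is the DerSimonian-Laird method, sketched below on the log-odds scale. The use of this estimator is an assumption, the study inputs are invented placeholders, and the function name is illustrative rather than a Stata command.

```python
import math


def dersimonian_laird(log_effects, std_errors, z=1.96):
    """Random-effects pooled odds ratio from per-study log odds ratios and standard errors."""
    k = len(log_effects)
    weights = [1.0 / se ** 2 for se in std_errors]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, log_effects)) / sum_w

    # Cochran's Q and the method-of-moments between-study variance tau^2.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, log_effects))
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    re_weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(w * y for w, y in zip(re_weights, log_effects)) / sum(re_weights)
    se_pooled = math.sqrt(1.0 / sum(re_weights))
    return {
        "or": math.exp(pooled),
        "ci_low": math.exp(pooled - z * se_pooled),
        "ci_high": math.exp(pooled + z * se_pooled),
        "tau2": tau2,
        "q": q,
    }


# Hypothetical per-study odds ratios and standard errors of log(OR) (illustrative only).
study_log_ors = [math.log(0.9), math.log(1.1), math.log(1.4), math.log(1.0)]
study_ses = [0.20, 0.15, 0.25, 0.10]
result = dersimonian_laird(study_log_ors, study_ses)
print(f"pooled OR = {result['or']:.2f} "
      f"(95% CI {result['ci_low']:.2f}-{result['ci_high']:.2f}), tau^2 = {result['tau2']:.3f}")
```

Because the interval is built on the log scale and then exponentiated, it is asymmetric around the pooled odds ratio.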
Cheng, Ching-Rong; Sessler, Daniel I.; Apfel, Christian C.
2005-01-01
Neostigmine is used to antagonize neuromuscular blocker-induced residual neuromuscular paralysis. Despite a previous meta-analysis, the effect of neostigmine on postoperative nausea and vomiting (PONV) remains unresolved. We reevaluated the effect of neostigmine on PONV while considering the different anticholinergics as potentially confounding factors. We performed a systematic literature search using Medline, Embase, Cochrane library, reference listings, and hand searching with no language restriction through December 2004 and identified 10 clinical, randomized, controlled trials evaluating neostigmine's effect on PONV. Data on nausea or vomiting from 933 patients were extracted for the early (0-6 h), delayed (6-24 h), and overall postoperative periods (0-24 h) and analyzed with RevMan 4.2 (Cochrane Collaboration, Oxford, UK) and multiple logistic regression analysis. The combination of neostigmine with either atropine or glycopyrrolate did not significantly increase the incidence of overall (0-24 h) vomiting (relative risk (RR) 0.91 [0.70-1.18], P=0.48) or nausea (RR 1.24 [95% CI: 0.98-1.59], P=0.08). Multiple logistic regression analysis indicated that there was not a significant increase in the risk of vomiting with large compared with small doses of neostigmine. In contrast to a previous analysis, we conclude that there is insufficient evidence to conclude that neostigmine increases the risk of PONV. PMID:16243993
NASA Astrophysics Data System (ADS)
Priya, Mallika; Rao, Bola Sadashiva Satish; Chandra, Subhash; Ray, Satadru; Mathew, Stanley; Datta, Anirbit; Nayak, Subramanya G.; Mahato, Krishna Kishore
2016-02-01
In spite of many efforts toward early detection of breast cancer, there is still a lack of technology for immediate implementation. In the present study, the potential of photoacoustic spectroscopy was evaluated for discriminating breast cancer from normal using blood serum samples, with a view to early detection. Three photoacoustic spectra in the time domain were recorded from each of 20 normal and 20 malignant samples under 281 nm pulsed laser excitation, and a total of 120 spectra were generated. The time domain spectra were then Fast Fourier Transformed into the frequency domain, and the 116.5625 - 206.875 kHz region was selected for further analysis using a combinational approach of wavelet, PCA and logistic regression. Initially, wavelet analysis was performed on the FFT data and seven features (mean, median, area under the curve, variance, standard deviation, skewness and kurtosis) were extracted from each. PCA was then performed on the feature matrix (7x120) for discriminating malignant samples from the normal by plotting a decision boundary using logistic regression analysis. The unsupervised mode of classification used in the present study yielded specificity and sensitivity values of 100% each, with a ROC-AUC value of 1. The results obtained have clearly demonstrated the capability of photoacoustic spectroscopy in discriminating cancer from the normal, suggesting its possible clinical implications.
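The seven summary features listed in this abstract can each be computed from a single vector of spectral (or wavelet-coefficient) values. The sketch below shows one way to do so. It assumes population (divide-by-n) moments, non-excess kurtosis, and trapezoidal integration with uniform spacing, none of which is specified in the abstract; the function name and sample values are illustrative.

```python
import math
import statistics


def spectral_features(values, dx=1.0):
    """Seven summary features of a 1-D signal: mean, median, AUC, variance, std, skewness, kurtosis."""
    n = len(values)
    mean = statistics.fmean(values)
    median = statistics.median(values)
    # Trapezoidal area under the curve with uniform sample spacing dx.
    area = sum((values[i] + values[i + 1]) * dx / 2.0 for i in range(n - 1))
    # Central moments using the population (divide-by-n) convention.
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "median": median,
        "auc": area,
        "variance": m2,
        "std": math.sqrt(m2),
        "skewness": m3 / m2 ** 1.5 if m2 > 0 else 0.0,
        "kurtosis": m4 / m2 ** 2 if m2 > 0 else 0.0,
    }


# Illustrative values standing in for one spectrum's coefficients.
features = spectral_features([1.0, 2.0, 3.0, 4.0, 5.0])
print(features)
```

Applying such a function to each of the 120 spectra would give the 7x120 feature matrix that the abstract passes on to PCA.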
Vitamin D and Graves' disease: a meta-analysis update.
Xu, Mei-Yan; Cao, Bing; Yin, Jian; Wang, Dong-Fang; Chen, Kai-Li; Lu, Qing-Bin
2015-05-21
The association between vitamin D levels and Graves' disease is not well studied. This update review aims to further analyze the relationship in order to provide an actual view of estimating the risk. We searched for the publications on vitamin D and Graves' disease in English or Chinese on PubMed, EMBASE, Chinese National Knowledge Infrastructure, China Biology Medical and Wanfang databases. The standardized mean difference (SMD) and 95% confidence interval (CI) were calculated for the vitamin D levels. Pooled odds ratio (OR) and 95% CI were calculated for vitamin D deficiency. We also performed sensitivity analysis and meta-regression. Combining effect sizes from 26 studies for Graves' disease as an outcome found a pooled effect of SMD = -0.77 (95% CI: -1.12, -0.42; p < 0.001) favoring the low vitamin D level by the random effect analysis. The meta-regression found assay method had the definite influence on heterogeneity (p = 0.048). The patients with Graves' disease were more likely to be deficient in vitamin D compared to the controls (OR = 2.24, 95% CI: 1.31, 3.81) with a high heterogeneity (I2 = 84.1%, p < 0.001). We further confirmed that low vitamin D status may increase the risk of Graves' disease.
Vitamin D and Graves’ Disease: A Meta-Analysis Update
Xu, Mei-Yan; Cao, Bing; Yin, Jian; Wang, Dong-Fang; Chen, Kai-Li; Lu, Qing-Bin
2015-01-01
The association between vitamin D levels and Graves’ disease is not well studied. This update review aims to further analyze the relationship in order to provide an actual view of estimating the risk. We searched for the publications on vitamin D and Graves’ disease in English or Chinese on PubMed, EMBASE, Chinese National Knowledge Infrastructure, China Biology Medical and Wanfang databases. The standardized mean difference (SMD) and 95% confidence interval (CI) were calculated for the vitamin D levels. Pooled odds ratio (OR) and 95% CI were calculated for vitamin D deficiency. We also performed sensitivity analysis and meta-regression. Combining effect sizes from 26 studies for Graves’ disease as an outcome found a pooled effect of SMD = −0.77 (95% CI: −1.12, −0.42; p < 0.001) favoring the low vitamin D level by the random effect analysis. The meta-regression found assay method had the definite influence on heterogeneity (p = 0.048). The patients with Graves’ disease were more likely to be deficient in vitamin D compared to the controls (OR = 2.24, 95% CI: 1.31, 3.81) with a high heterogeneity (I2 = 84.1%, p < 0.001). We further confirmed that low vitamin D status may increase the risk of Graves’ disease. PMID:26007334
Quantitative analysis of titanium-induced artifacts and correlated factors during micro-CT scanning.
Li, Jun Yuan; Pow, Edmond Ho Nang; Zheng, Li Wu; Ma, Li; Kwong, Dora Lai Wan; Cheung, Lim Kwong
2014-04-01
To investigate the impact of cover screw, resin embedment, and implant angulation on artifacts in microcomputed tomography (micro-CT) scanning of implants. A total of twelve implants were randomly divided into 4 groups: (i) implant only; (ii) implant with cover screw; (iii) implant with resin embedment; and (iv) implants with cover screw and resin embedment. Implants at angulations of 0°, 45°, and 90° were scanned by micro-CT. Images were assessed, and the ratio of artifact volume to total volume (AV/TV) was calculated. A stepwise multiple regression analysis was used to determine the significance of different factors. One-way ANOVA was performed to identify which combination of factors could minimize the artifact. In the regression analysis, implant angulation was identified as the best predictor for artifact among the factors (P < 0.001). Resin embedment also had a significant effect on artifact volume (P = 0.028), while cover screw did not (P > 0.05). Non-embedded implants with the axis parallel to the X-ray source of the micro-CT produced minimal artifact. Implant angulation and resin embedment affected the artifact volume of micro-CT scanning for implants, while cover screw did not. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
Machine learning of swimming data via wisdom of crowd and regression analysis.
Xie, Jiang; Xu, Junfu; Nie, Celine; Nie, Qing
2017-04-01
Every performance, in an officially sanctioned meet, by a registered USA swimmer is recorded into an online database with times dating back to 1980. For the first time, statistical analysis and machine learning methods are systematically applied to 4,022,631 swim records. In this study, we investigate performance features for all strokes as a function of age and gender. The variances in performance of males and females for different ages and strokes were studied, and the correlations of performances for different ages were estimated using the Pearson correlation. Regression analysis shows the performance trends for both males and females at different ages and suggests critical ages for peak training. Moreover, we assess twelve popular machine learning methods to predict or classify swimmer performance. Each method exhibited different strengths or weaknesses in different cases, indicating no one method could predict well for all strokes. To address this problem, we propose a new method by combining multiple inference methods to derive Wisdom of Crowd Classifier (WoCC). Our simulation experiments demonstrate that the WoCC is a consistent method with better overall prediction accuracy. Our study reveals several new age-dependent trends in swimming and provides an accurate method for classifying and predicting swimming times.
Agreement evaluation of AVHRR and MODIS 16-day composite NDVI data sets
Ji, Lei; Gallo, Kevin P.; Eidenshink, Jeffery C.; Dwyer, John L.
2008-01-01
Satellite-derived normalized difference vegetation index (NDVI) data have been used extensively to detect and monitor vegetation conditions at regional and global levels. A combination of NDVI data sets derived from AVHRR and MODIS can be used to construct a long NDVI time series that may also be extended to VIIRS. Comparative analysis of NDVI data derived from AVHRR and MODIS is critical to understanding the data continuity through the time series. In this study, the AVHRR and MODIS 16-day composite NDVI products were compared using regression and agreement analysis methods. The analysis shows a high agreement between the AVHRR-NDVI and MODIS-NDVI observed from 2002 and 2003 for the conterminous United States, but the difference between the two data sets is appreciable. Twenty per cent of the total difference between the two data sets is due to systematic difference, with the remainder due to unsystematic difference. The systematic difference can be eliminated with a linear regression-based transformation between two data sets, and the unsystematic difference can be reduced partially by applying spatial filters to the data. We conclude that the continuity of NDVI time series from AVHRR to MODIS is satisfactory, but a linear transformation between the two sets is recommended.
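One way to picture the split into systematic and unsystematic difference, and the linear transformation that removes the systematic part, is the ordinary-least-squares decomposition sketched below in Python. This is an assumption made for illustration: the abstract does not state which regression or agreement coefficient the authors used, and the function name and NDVI values are hypothetical.

```python
def decompose_difference(x, y):
    """Fit y = intercept + slope * x by OLS and split the mean squared
    difference between y and x into systematic and unsystematic parts."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    fitted = [intercept + slope * a for a in x]
    msd = sum((b - a) ** 2 for a, b in zip(x, y)) / n
    # Systematic part: distance of the fitted line from the 1:1 line
    msd_sys = sum((f - a) ** 2 for a, f in zip(x, fitted)) / n
    # Unsystematic part: scatter of y around the fitted line
    msd_unsys = sum((b - f) ** 2 for b, f in zip(y, fitted)) / n
    return slope, intercept, msd, msd_sys, msd_unsys

# Hypothetical paired NDVI values from two sensors
ndvi_a = [0.20, 0.40, 0.60, 0.80]
ndvi_b = [0.25, 0.41, 0.66, 0.83]
slope, intercept, msd, msd_sys, msd_unsys = decompose_difference(ndvi_a, ndvi_b)
print(f"systematic share of total difference: {msd_sys / msd:.1%}")
```

With an OLS fit the two parts add up to the total mean squared difference, and applying the fitted line to the first data set removes the systematic part while leaving the scatter untouched.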
Accurate Diabetes Risk Stratification Using Machine Learning: Role of Missing Value and Outliers.
Maniruzzaman, Md; Rahman, Md Jahanur; Al-MehediHasan, Md; Suri, Harman S; Abedin, Md Menhazul; El-Baz, Ayman; Suri, Jasjit S
2018-04-10
Diabetes mellitus is a group of metabolic diseases in which blood sugar levels are too high. About 8.8% of the world was diabetic in 2017. It is projected that this will reach nearly 10% by 2045. The major challenge is that applying machine learning-based classifiers to such data sets for risk stratification leads to lower performance. Thus, our objective is to develop an optimized and robust machine learning (ML) system under the assumption that missing values or outliers if replaced by a median configuration will yield higher risk stratification accuracy. This ML-based risk stratification is designed, optimized and evaluated, where: (i) the features are extracted and optimized from the six feature selection techniques (random forest, logistic regression, mutual information, principal component analysis, analysis of variance, and Fisher discriminant ratio) and combined with ten different types of classifiers (linear discriminant analysis, quadratic discriminant analysis, naïve Bayes, Gaussian process classification, support vector machine, artificial neural network, Adaboost, logistic regression, decision tree, and random forest) under the hypothesis that both missing values and outliers when replaced by computed medians will improve the risk stratification accuracy. The Pima Indian diabetic dataset (768 patients: 268 diabetic and 500 controls) was used. Our results demonstrate that replacing the missing values and outliers by group median and median values, respectively, and further using the combination of random forest feature selection and random forest classification yields an accuracy, sensitivity, specificity, positive predictive value, negative predictive value and area under the curve of 92.26%, 95.96%, 79.72%, 91.14%, 91.20%, and 0.93, respectively. This is an improvement of 10% over previously developed techniques published in the literature. The system was validated for its stability and reliability.
RF-based model showed the best performance when outliers are replaced by median values.
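The median-replacement idea can be sketched as follows in Python. Several details here are assumptions, not taken from the abstract: outliers are flagged with the common 1.5 x IQR rule, a single overall median is used for both missing values and outliers (the study used group medians for missing values), and the function name and glucose readings are hypothetical.

```python
import math
import statistics

def replace_missing_and_outliers(values, k=1.5):
    """Replace None/NaN entries and IQR-rule outliers with the median
    of the observed values."""
    observed = [v for v in values if v is not None and not math.isnan(v)]
    med = statistics.median(observed)
    q1, _, q3 = statistics.quantiles(observed, n=4)
    low, high = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
    cleaned = []
    for v in values:
        if v is None or math.isnan(v) or v < low or v > high:
            cleaned.append(med)   # missing or outlying: use the median
        else:
            cleaned.append(v)
    return cleaned

# Hypothetical feature column with one missing value and one extreme value
cleaned = replace_missing_and_outliers([1.0, 2.0, None, 3.0, 4.0, 100.0, 5.0])
print(cleaned)
```

In a classification setting the same function would be applied column by column before feature selection and model fitting.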
Predictors of protein-energy wasting in haemodialysis patients: a cross-sectional study.
Ruperto, M; Sánchez-Muniz, F J; Barril, G
2016-02-01
Protein-energy wasting (PEW) is a highly prevalent condition in haemodialysis patients (HD). The potential usefulness of nutritional-inflammatory markers in the diagnosis of PEW in chronic kidney disease has not been established completely. We hypothesised that a combination of serum albumin, percentage of mid-arm muscle circumference and standard body weight comprises a better discriminator than either single marker of nutritional status in HD patients. A cross-sectional study was performed in 80 HD patients. Patients were categorised in two groups: well-nourished and PEW. Logistic regression analysis was applied to corroborate the reliability of the three markers of PEW with all the nutritional-inflammatory markers analysed. PEW was identified in 52.5% of HD patients. Compared with the well-nourished patients, PEW patients had lower body mass index, serum pre-albumin and body cell mass (all P < 0.001) and higher C-reactive protein (s-CRP) (P < 0.01). Logistic regression analyses showed that the combination of the three criteria were significantly related with s-CRP >1 mg dL(-1) , phase angle <4°, and serum pre-albumin <30 mg dL(-1) (all P < 0.05). Other indicators, such as lymphocytes <20% and Charlson comorbidity index, were significantly involved (both P < 0.01). A receiver operating characteristic curve (area under the curve) of 0.86 (P < 0.001) was found. The combined utilisation of serum albumin, percentage of mid-arm muscle circumference and standard body weight as PEW markers appears to be useful for nutritional-inflammatory status assessment and adds predictive value to the traditional indicators. Larger studies are needed to achieve the reliability of these predictor combinations and their cut-off values in HD patients and other populations. © 2014 The British Dietetic Association Ltd.
Food combination and Alzheimer disease risk: a protective diet.
Gu, Yian; Nieves, Jeri W; Stern, Yaakov; Luchsinger, Jose A; Scarmeas, Nikolaos
2010-06-01
To assess the association between food combination and Alzheimer disease (AD) risk. Because foods are not consumed in isolation, dietary pattern (DP) analysis of food combination, taking into account the interactions among food components, may offer methodological advantages. Prospective cohort study. Northern Manhattan, New York, New York. Two thousand one hundred forty-eight community-based elderly subjects (aged > or = 65 years) without dementia in New York provided dietary information and were prospectively evaluated with the same standardized neurological and neuropsychological measures approximately every 1.5 years. Using reduced rank regression, we calculated DPs based on their ability to explain variation in 7 potentially AD-related nutrients: saturated fatty acids, monounsaturated fatty acids, omega-3 polyunsaturated fatty acids, omega-6 polyunsaturated fatty acids, vitamin E, vitamin B(12), and folate. The associations of reduced rank regression-derived DPs with AD risk were then examined using a Cox proportional hazards model. Main Outcome Measure Incident AD risk. Two hundred fifty-three subjects developed AD during a follow-up of 3.9 years. We identified a DP strongly associated with lower AD risk: compared with subjects in the lowest tertile of adherence to this pattern, the AD hazard ratio (95% confidence interval) for subjects in the highest DP tertile was 0.62 (0.43-0.89) after multivariable adjustment (P for trend = .01). This DP was characterized by higher intakes of salad dressing, nuts, fish, tomatoes, poultry, cruciferous vegetables, fruits, and dark and green leafy vegetables and a lower intake of high-fat dairy products, red meat, organ meat, and butter. Simultaneous consideration of previous knowledge regarding potentially AD-related nutrients and multiple food groups can aid in identifying food combinations that are associated with AD risk.
[A SAS marco program for batch processing of univariate Cox regression analysis for great database].
Yang, Rendong; Xiong, Jie; Peng, Yangqin; Peng, Xiaoning; Zeng, Xiaomin
2015-02-01
To realize batch processing of univariate Cox regression analysis for large databases with a SAS macro program. We wrote a SAS macro program, which can filter, integrate, and export P values to Excel by SAS9.2. The program was used for screening survival correlated RNA molecules of ovarian cancer. The SAS macro program completed the batch processing of univariate Cox regression analysis, as well as the selection and export of the results. The SAS macro program has potential applications in reducing the workload of statistical analysis and providing a basis for batch processing of univariate Cox regression analysis.
Exact Analysis of Squared Cross-Validity Coefficient in Predictive Regression Models
ERIC Educational Resources Information Center
Shieh, Gwowen
2009-01-01
In regression analysis, the notion of population validity is of theoretical interest for describing the usefulness of the underlying regression model, whereas the presumably more important concept of population cross-validity represents the predictive effectiveness for the regression equation in future research. It appears that the inference…
USDA-ARS's Scientific Manuscript database
Selective principal component regression analysis (SPCR) uses a subset of the original image bands for principal component transformation and regression. For optimal band selection before the transformation, this paper used genetic algorithms (GA). In this case, the GA process used the regression co...
Spatial prediction of soil texture in region Centre (France) from summary data
NASA Astrophysics Data System (ADS)
Dobarco, Mercedes Roman; Saby, Nicolas; Paroissien, Jean-Baptiste; Orton, Tom G.
2015-04-01
Soil texture is a key controlling factor of important soil functions like water and nutrient holding capacity, retention of pollutants, drainage, soil biodiversity, and C cycling. High resolution soil texture maps enhance our understanding of the spatial distribution of soil properties and provide valuable information for decision making and crop management, environmental protection, and hydrological planning. We predicted the soil texture of agricultural topsoils in the Region Centre (France) combining regression and area-to-point kriging. Soil texture data were collected from the French soil-test database (BDAT), which is populated with soil analyses performed at farmers' request. To protect the anonymity of the farms the data were aggregated by commune. In a first step, summary statistics of environmental covariates by commune were used to develop prediction models with Cubist, boosted regression trees, and random forests. In a second step the residuals of each individual observation were summarized by commune and kriged following the method developed by Orton et al. (2012). This approach made it possible to include non-linear relationships among covariates and soil texture while accounting for the uncertainty on areal means in the area-to-point kriging step. Independent validation of the models was done using data from the systematic soil monitoring network of French soils. Future work will compare the performance of these models with a non-stationary variance geostatistical model using the most important covariates and summary statistics of texture data. The results will show whether the latter, statistically more challenging approach significantly improves texture predictions or whether the simpler area-to-point regression kriging can offer satisfactory results.
The application of area-to-point regression kriging at national level using BDAT data has the potential to improve soil texture predictions for agricultural topsoils, especially when combined with existing maps (i.e., model ensemble).
Sá, Michel Pompeu Barros de Oliveira; Ferraz, Paulo Ernando; Escobar, Rodrigo Renda; Martins, Wendell Nunes; Lustosa, Pablo César; Nunes, Eliobas de Oliveira; Vasconcelos, Frederico Pires; Lima, Ricardo Carvalho
2012-12-01
The most recently published meta-analysis of randomized controlled trials (RCTs) showed that off-pump coronary artery bypass graft surgery (CABG) reduces the incidence of stroke by 30% compared with on-pump CABG, but showed no difference in other outcomes. New RCTs have since been published, indicating the need for a new meta-analysis to investigate pooled results including these further studies. MEDLINE, EMBASE, CENTRAL/CCTR, SciELO, LILACS, Google Scholar and reference lists of relevant articles were searched for RCTs that compared outcomes (30-day all-cause mortality, myocardial infarction or stroke) between off-pump and on-pump CABG until May 2012. The principal summary measures were relative risk (RR) with 95% Confidence Interval (CI) and P values (considered statistically significant when <0.05). The RRs were combined across studies using a DerSimonian-Laird random-effects weighted model. Meta-analysis and meta-regression were completed using the software Comprehensive Meta-Analysis version 2 (Biostat Inc., Englewood, New Jersey, USA). Forty-seven RCTs were identified and included 13,524 patients (6,758 for off-pump and 6,766 for on-pump CABG). There was no significant difference between off-pump and on-pump CABG groups in RR for 30-day mortality or myocardial infarction, but there was a difference in stroke in favor of off-pump CABG (RR 0.793, 95% CI 0.660-0.920, P=0.049). No important heterogeneity of effects was observed for any outcome, but publication bias was observed for the outcome "stroke". Meta-regression did not demonstrate an influence of female gender, number of grafts or age on outcomes. Off-pump CABG reduces the incidence of post-operative stroke by 20.7% and has no substantial effect on mortality or myocardial infarction in comparison to on-pump CABG. Patient gender, number of grafts performed and age do not seem to explain the effect of off-pump CABG on mortality, myocardial infarction or stroke.
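The DerSimonian-Laird random-effects pooling named above can be sketched in Python as follows. This is a minimal illustration, not the Comprehensive Meta-Analysis implementation: it pools log relative risks with a method-of-moments estimate of the between-study variance and also reports the I² heterogeneity statistic. The function name and the three study estimates are hypothetical.

```python
import math

def dersimonian_laird(log_rr, var, z=1.96):
    """Pool log relative risks with DerSimonian-Laird random effects.
    Returns (RR, lower, upper, tau2, I2 in percent)."""
    w = [1 / v for v in var]                       # fixed-effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, log_rr)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_rr))  # Cochran's Q
    df = len(log_rr) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)                  # between-study variance
    w_star = [1 / (v + tau2) for v in var]         # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_star, log_rr)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return (math.exp(pooled), math.exp(pooled - z * se),
            math.exp(pooled + z * se), tau2, i2)

# Hypothetical study-level log(RR) values and their variances
rr_pooled, ci_low, ci_high, tau2, i2 = dersimonian_laird(
    [-0.5, 0.0, 0.5], [0.01, 0.01, 0.01])
print(rr_pooled, ci_low, ci_high, tau2, i2)
```

When the studies agree closely, the estimated between-study variance is truncated at zero and the result coincides with the fixed-effect (inverse-variance) estimate.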
Barazani, Oz; Waitz, Yoni; Tugendhaft, Yizhar; Dorman, Michael; Dag, Arnon; Hamidat, Mohammed; Hijawi, Thameen; Kerem, Zohar; Westberg, Erik; Kadereit, Joachim W
2017-02-06
A previous multi-locus lineage (MLL) analysis of SSR-microsatellite data of old olive trees in the southeast Mediterranean area had shown the predominance of the Souri cultivar (MLL1) among grafted trees. The MLL analysis had also identified an MLL (MLL7) that was more common among rootstocks than other MLLs. We here present a comparison of the MLL combinations MLL1 (scion)/MLL7 (rootstock) and MLL1/MLL1 in order to investigate the possible influence of rootstock on scion phenotype. A linear regression analysis demonstrated that the abundance of MLL1/MLL7 trees decreases and of MLL1/MLL1 trees increases along a gradient of increasing aridity. Hypothesizing that grafting on MLL7 provides an advantage under certain conditions, Akaike information criterion (AIC) model selection procedure was used to assess the influence of different environmental conditions on phenotypic characteristics of the fruits and oil of the two MLL combinations. The most parsimonious models indicated differential influences of environmental conditions on parameters of olive oil quality in trees belonging to the MLL1/MLL7 and MLL1/MLL1 combinations, but a similar influence on fruit characteristics and oil content. These results suggest that in certain environments grafting of the local Souri cultivar on MLL7 rootstocks and the MLL1/MLL1 combination result in improved oil quality. The decreasing number of MLL1/MLL7 trees along an aridity gradient suggests that use of this genotype combination in arid sites was not favoured because of sensitivity of MLL7 to drought. Our results thus suggest that MLL1/MLL7 and MLL1/MLL1 combinations were selected by growers in traditional rain-fed cultivation under Mediterranean climate conditions in the southeast Mediterranean area.
Lu, Peng; Chen, Chang; Fu, Meihong; Fang, Jing; Gao, Jian; Zhu, Li; Liang, Rixin; Shen, Xin; Yang, Hongjun
2013-01-01
Recently, the pharmaceutical industry has shifted to pursuing combination therapies that comprise more than one active ingredient. Interestingly, combination therapies have been used for more than 2500 years in traditional Chinese medicine (TCM). Understanding optimal proportions and synergistic mechanisms of multi-component drugs are critical for developing novel strategies to combat complex diseases. A new multi-objective optimization algorithm based on least angle regression-partial least squares was proposed to construct the predictive model to evaluate the synergistic effect of the three components of a novel combination drug Yi-qi-jie-du formula (YJ), which came from clinical TCM prescription for the treatment of encephalopathy. Optimal proportion of the three components, ginsenosides (G), berberine (B) and jasminoidin (J) was determined via particle swarm optimum. Furthermore, the combination mechanisms were interpreted using PLS VIP and principal components analysis. The results showed that YJ had optimal proportion 3(G): 2(B): 0.5(J), and it yielded synergy in the treatment of rats impaired by middle cerebral artery occlusion induced focal cerebral ischemia. YJ with optimal proportion had good pharmacological effects on acute ischemic stroke. The mechanisms study demonstrated that the combination of G, B and J could exhibit the strongest synergistic effect. J might play an indispensable role in the formula, especially when combined with B for the acute stage of stroke. All these data in this study suggested that in the treatment of acute ischemic stroke, besides restoring blood supply and protecting easily damaged cells in the area of the ischemic penumbra as early as possible, we should pay more attention to the removal of the toxic metabolites at the same time. Mathematical system modeling may be an essential tool for the analysis of the complex pharmacological effects of multi-component drug. 
The powerful mathematical analysis method could greatly improve the efficiency in finding new combination drug from TCM. PMID:24236065
Characterization, Operation and Analysis of Test Motors Containing Aluminized Hybrid Fuels
NASA Technical Reports Server (NTRS)
Kibbey, Timothy P.; Cortopassi, Andrew C.; Boyer, J. Eric
2017-01-01
NASA Marshall Space Flight Center's Materials and Processes Department, with support from the Propulsion Systems Department, has renewed the development and maintenance of a hybrid test bed for exposing ablative thermal protection materials to an environment similar to that seen in solid rocket motors (SRM). The Solid Fuel Torch (SFT), operated during the Space Shuttle program, utilized gaseous oxygen for oxidizer and an aluminized hydroxyl-terminated polybutadiene (HTPB) fuel grain to expose a converging section of phenolic material to a 400 psi, 2-phase flow combustion environment. The configuration allows for up to a 2 foot long, 5 inch diameter fuel grain cartridge. Wanting to now test rubber insulation materials with a turn-back feature to mimic the geometry of an aft dome being impinged by alumina particles, the throat area has now been increased by several times to afford flow similarity. Combined with the desire to maintain a higher operating pressure, the oxidizer flow rate is being increased by a factor of 10. Out of these changes has arisen the need to characterize the fuel/oxidizer combination in a higher mass flux condition than has been previously tested at MSFC, and at which the literature has little to no reporting as well. Testing for fuel regression rate comprised a two-level, full factorial design available over Aluminum loading level, mass flow rate, pressure, and diameter. The data taken significantly surpasses the previous available data on regression rate of aluminized HTPB fuel burning with gaseous oxygen. It encompasses higher mass fluxes, and appears to generate more consistent data. The good test article and facility design and testing work of the Penn State HPCL combined with careful analysis of the data and good planning has made this possible. 
This should be able to assist with developing rate laws that are useful both for research planning and for developing flight system sizing relationships that can help optimize hybrid rocket concepts for trade studies. The successful approach of this DOE and test setup is applicable to other propellant combinations as well.
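A two-level full factorial design over the four factors named above has 2^4 = 16 combinations, which the Python sketch below enumerates. The "low"/"high" labels and variable names are placeholders chosen for illustration; the abstract does not give the actual factor levels or say how many of the combinations were tested.

```python
from itertools import product

# Four factors from the abstract, each at two placeholder levels
factors = {
    "aluminum_loading": ("low", "high"),
    "mass_flow_rate": ("low", "high"),
    "pressure": ("low", "high"),
    "diameter": ("low", "high"),
}

# Every combination of levels: one dictionary per candidate test run
runs = [dict(zip(factors, levels)) for levels in product(*factors.values())]

for i, run in enumerate(runs, start=1):
    print(i, run)
```

Replacing the placeholder labels with numeric set points would give a run matrix suitable for fitting a regression-rate model to the measured data.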
Flippin' Fluid Mechanics - Quasi-experimental Pre-test and Post-test Comparison Using Two Groups
NASA Astrophysics Data System (ADS)
Webster, D. R.; Majerich, D. M.; Luo, J.
2014-11-01
A flipped classroom approach has been implemented in an undergraduate fluid mechanics course. Students watch short on-line videos before class, participate in active in-class problem solving (in dyads), and complete individualized on-line quizzes weekly. In-class activities are designed to achieve a trifecta of: 1. developing problem solving skills, 2. learning subject content, and 3. developing inquiry skills. The instructor and assistants provide critical "just-in-time tutoring" during the in-class problem solving sessions. Comparisons are made with a simultaneous section offered in a traditional mode by a different instructor. Regression analysis was used to control for differences among students and to quantify the effect of the flipped fluid mechanics course. The dependent variable was the students' combined final examination and post-concept inventory scores and the independent variables were pre-concept inventory score, gender, major, course section, and (incoming) GPA. The R-square equaled 0.45 indicating that the included variables explain 45% of the variation in the dependent variable. The regression results indicated that if the student took the flipped fluid mechanics course, the dependent variable (i.e., combined final exam and post-concept inventory scores) was raised by 7.25 points. Interestingly, the comparison group reported significantly more often that their course emphasized memorization than did the flipped classroom group.
Dinç, Erdal; Ustündağ, Ozgür; Baleanu, Dumitru
2010-08-01
The sole use of isoniazid during treatment of tuberculosis gives rise to pyridoxine deficiency. Therefore, a combination of pyridoxine hydrochloride and isoniazid is used in pharmaceutical dosage form in tuberculosis treatment to reduce this side effect. In this study, two chemometric methods, partial least squares (PLS) and principal component regression (PCR), were applied to the simultaneous determination of pyridoxine (PYR) and isoniazid (ISO) in their tablets. A concentration training set comprising 20 different binary mixtures of PYR and ISO was randomly prepared in 0.1 M HCl. Both multivariate calibration models were constructed using the relationships between the concentration data set (concentration data matrix) and absorbance data matrix in the spectral region 200-330 nm. The accuracy and the precision of the proposed chemometric methods were validated by analyzing synthetic mixtures containing the investigated drugs. The recovery results obtained by applying PCR and PLS calibrations to the artificial mixtures were found between 100.0 and 100.7%. Satisfactory results were obtained by applying the PLS and PCR methods to both artificial and commercial samples. The results obtained in this manuscript strongly encourage us to use these methods for the quality control and the routine analysis of marketed tablets containing PYR and ISO drugs. Copyright © 2010 John Wiley & Sons, Ltd.
Regression Analysis and the Sociological Imagination
ERIC Educational Resources Information Center
De Maio, Fernando
2014-01-01
Regression analysis is an important aspect of most introductory statistics courses in sociology but is often presented in contexts divorced from the central concerns that bring students into the discipline. Consequently, we present five lesson ideas that emerge from a regression analysis of income inequality and mortality in the USA and Canada.
Influence of Primary Gage Sensitivities on the Convergence of Balance Load Iterations
NASA Technical Reports Server (NTRS)
Ulbrich, Norbert Manfred
2012-01-01
The connection between the convergence of wind tunnel balance load iterations and the existence of the primary gage sensitivities of a balance is discussed. First, basic elements of two load iteration equations that the iterative method uses in combination with results of a calibration data analysis for the prediction of balance loads are reviewed. Then, the connection between the primary gage sensitivities, the load format, the gage output format, and the convergence characteristics of the load iteration equation choices is investigated. A new criterion is also introduced that may be used to objectively determine if the primary gage sensitivity of a balance gage exists. Then, it is shown that both load iteration equations will converge as long as a suitable regression model is used for the analysis of the balance calibration data, the combined influence of non linear terms of the regression model is very small, and the primary gage sensitivities of all balance gages exist. The last requirement is fulfilled, e.g., if force balance calibration data is analyzed in force balance format. Finally, it is demonstrated that only one of the two load iteration equation choices, i.e., the iteration equation used by the primary load iteration method, converges if one or more primary gage sensitivities are missing. This situation may occur, e.g., if force balance calibration data is analyzed in direct read format using the original gage outputs. Data from the calibration of a six component force balance is used to illustrate the connection between the convergence of the load iteration equation choices and the existence of the primary gage sensitivities.
Ultrasonographic Evaluation of Cervical Lymph Nodes in Thyroid Cancer.
Machado, Maria Regina Marrocos; Tavares, Marcos Roberto; Buchpiguel, Carlos Alberto; Chammas, Maria Cristina
2017-02-01
Objective To determine what ultrasonographic features can identify metastatic cervical lymph nodes, both preoperatively and in recurrences after complete thyroidectomy. Study Design Prospective. Setting Outpatient clinic, Department of Head and Neck Surgery, School of Medicine, University of São Paulo, Brazil. Subjects and Methods A total of 1976 lymph nodes were evaluated in 118 patients submitted to total thyroidectomy with or without cervical lymph node dissection. All the patients were examined by cervical ultrasonography, preoperatively and/or postoperatively. The following factors were assessed: number, size, shape, margins, presence of fatty hilum, cortex, echotexture, echogenicity, presence of microcalcification, presence of necrosis, and type of vascularity. The specificity, sensitivity, positive predictive value, and negative predictive value of each variable were calculated. Univariate and multivariate logistic regression analyses were conducted. A receiver operator characteristic (ROC) curve was plotted to determine the best cutoff value for the number of variables to discriminate malignant lymph nodes. Results Significant differences were found between metastatic and benign lymph nodes with regard to all of the variables evaluated ( P < .05). Logistic regression analysis revealed that size and echogenicity were the best combination of altered variables (odds ratio, 40.080 and 7.288, respectively) in discriminating malignancy. The ROC curve analysis showed that 4 was the best cutoff value for the number of altered variables to discriminate malignant lymph nodes, with a combined specificity of 85.7%, sensitivity of 96.4%, and efficiency of 91.0%. Conclusion Greater diagnostic accuracy was achieved by associating the ultrasonographic variables assessed rather than by considering them individually.
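The search for a cutoff on the number of altered ultrasonographic variables can be sketched in Python as below. The selection rule used here, maximizing Youden's J (sensitivity + specificity - 1), is an assumption made for illustration, since the abstract does not say how the best ROC cutoff was chosen; the function name and the eight example nodes are hypothetical.

```python
def best_cutoff(counts, malignant):
    """Try each cutoff c ("node is called malignant if it has >= c altered
    variables") and return (cutoff, sensitivity, specificity) maximizing
    Youden's J. Both classes must be present in the data."""
    best = None
    for c in range(min(counts), max(counts) + 1):
        tp = sum(1 for n, m in zip(counts, malignant) if m and n >= c)
        fn = sum(1 for n, m in zip(counts, malignant) if m and n < c)
        tn = sum(1 for n, m in zip(counts, malignant) if not m and n < c)
        fp = sum(1 for n, m in zip(counts, malignant) if not m and n >= c)
        sens = tp / (tp + fn)
        spec = tn / (tn + fp)
        j = sens + spec - 1
        if best is None or j > best[0]:
            best = (j, c, sens, spec)
    return best[1], best[2], best[3]

# Hypothetical nodes: number of altered variables and histology result
counts = [1, 2, 2, 3, 4, 5, 6, 5]
malignant = [False, False, False, False, True, True, True, False]
cutoff, sens, spec = best_cutoff(counts, malignant)
print(cutoff, sens, spec)
```

Plotting sensitivity against 1 - specificity over all cutoffs would trace the ROC curve from which such a threshold is read.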
Reichert, Stefan; Triebert, Ulrike; Santos, Alexander Navarrete; Hofmann, Britt; Schaller, Hans-Günter; Schlitt, Axel; Schulz, Susanne
2017-11-01
Soluble RAGE (sRAGE) serum level could be a biomarker for atherosclerosis and subsequent diseases such as cardiovascular disease (CVD). Therefore, we wanted to investigate whether peripheral sRAGE level is associated with new cardiovascular events among patients with CVD using Cox regression analysis. In this three-year longitudinal cohort study, 1002 in-patients with angiographically proven CVD were included. In 933 patients, sRAGE levels were determined by a commercially available ELISA kit at the time of baseline examination. The combined endpoint was defined as myocardial infarction, stroke/TIA (non-fatal, fatal), and cardiovascular death. For risk analysis, sRAGE values were divided into quartiles. For generation of adjusted hazard ratios (HR), other risk factors for CVD, such as age, gender, current smoking, body mass index, diabetes, hypertension, dyslipoproteinemia, family history of CVD, severe periodontitis, and serum levels of C-reactive protein and interleukin-6, were recorded. 886 patients completed the 3-year follow-up. The overall incidence of the combined endpoint was 16%. Patients with sRAGE levels >838.19 pg/ml (fourth quartile) had the highest incidence of recurrent CVD events (24.9% versus 13.1%, p < 0.0001). In multivariate Cox regression with respect to further confounders for CVD, the association between sRAGE and new CVD events was confirmed (HR = 1.616, 95% CI 1.027-2.544, p = 0.038). Elevated sRAGE serum level is associated with further adverse events in patients with CVD. Copyright © 2017 Elsevier B.V. All rights reserved.
Espigares, Miguel; Lardelli, Pablo; Ortega, Pedro
2003-10-01
The presence of trihalomethanes (THMs) in potable-water sources is an issue of great interest because of the negative impact THMs have on human health. The objective of this study was to correlate the presence of trihalomethanes with more routinely monitored parameters of water quality, in order to facilitate THM control. Water samples taken at various stages of treatment from a water treatment plant were analyzed for the presence of trihalomethanes with the Fujiwara method. The data collected from these determinations were compared with the values obtained for free-residual-chlorine and combined-residual-chlorine levels as well as standard physico-chemical and microbiological indicators such as chemical oxygen demand (by the KMnO4 method), total chlorophyll, conductivity, pH, alkalinity, turbidity, chlorides, sulfates, nitrates, nitrites, phosphates, ammonia, calcium, magnesium, heterotrophic bacteria count, Pseudomonas spp., total and fecal coliforms, and fecal streptococci. The data from these determinations were compiled, and statistical analysis was performed to determine which variables correlate best with the presence and quantity of trihalomethanes in the samples. Levels of THMs in water seem to correlate directly with levels of combined residual chlorine and nitrates, and inversely with the level of free residual chlorine. Statistical analysis with multiple linear regression was conducted to determine the best-fitting models. The models chosen incorporate between two and four independent variables and include chemical oxygen demand, nitrites, and ammonia. These indicators, which are commonly determined during the water treatment process, demonstrate the strongest correlation with the levels of trihalomethanes in water and offer great utility as an accessible method for THM detection and control.
Liu, Jianhua; Zeng, Weiqiang; Huang, Chengzhi; Wang, Junjiang; Xu, Lishu; Ma, Dong
2018-05-01
The present study aimed to investigate whether c-mesenchymal epithelial transition factor (C-MET) overexpression combined with RAS (including KRAS, NRAS and HRAS) or BRAF mutations was associated with late distant metastases and the prognosis of patients with colorectal cancer (CRC). A total of 374 patients with stage III CRC were classified into 4 groups based on RAS/BRAF and C-MET status for comprehensive analysis. Mutations in RAS/BRAF were determined using Sanger sequencing and C-MET expression was examined using immunohistochemistry. The associations between RAS/BRAF mutations in combination with C-MET overexpression and clinicopathological variables including survival were evaluated. In addition, their predictive value for late distant metastases was statistically analyzed via logistic regression and receiver operating characteristic analysis. Among 374 patients, mutations in KRAS, NRAS, HRAS, BRAF and C-MET overexpression were observed in 43.9, 2.4, 0.3, 5.9 and 71.9% of cases, respectively. Considering RAS/BRAF mutations and C-MET overexpression, vascular invasion (P=0.001), high carcino-embryonic antigen level (P=0.031) and late distant metastases (P<0.001) were more likely to occur in patients of group 4. Furthermore, survival analyses revealed RAS/BRAF mutations may have a more powerful impact on survival than C-MET overexpression, although they were both predictive factors for adverse prognosis. Further logistic regression suggested that RAS/BRAF mutations and C-MET overexpression may predict late distant metastases. In conclusion, RAS/BRAF mutations and C-MET overexpression may serve as predictive indicators for metastatic behavior and poor prognosis of CRC.
A new look at patient satisfaction: learning from self-organizing maps.
Voutilainen, Ari; Kvist, Tarja; Sherwood, Paula R; Vehviläinen-Julkunen, Katri
2014-01-01
To some extent, results always depend on the methods used, and the complete picture of the phenomenon of interest can be drawn only by combining results of different data processing techniques. This emphasizes the use of a wide arsenal of methods for processing and analyzing patient satisfaction surveys. The purpose of this study was to introduce the self-organizing map (SOM) to nursing science and to illustrate the use of the SOM with patient satisfaction data. The SOM is a widely used artificial neural network suitable for clustering and exploring all kinds of data sets. The study was partly a secondary analysis of data collected for the Attractive and Safe Hospital Study from four Finnish hospitals in 2008 and 2010 using the Revised Humane Caring Scale. The sample consisted of 5,283 adult patients. The SOM was used to cluster the data set according to (a) respondents and (b) questionnaire items. The SOM was also used as a preprocessor for multinomial logistic regression. An analysis of missing data was carried out to improve the data interpretation. Combining results of the two SOMs and the logistic regression revealed associations between the level of satisfaction, different components of satisfaction, and item nonresponse. The common conception that the relationship between patient satisfaction and age is positive may partly be due to a positive association between the tendency of item nonresponse and age. The SOM proved to be a useful method for clustering a questionnaire data set even when the data set was low dimensional per se. Inclusion of empty responses in analyses may help to detect possibly misleading noncausal relationships.
Xu, Yaomin; Guo, Xingyi; Sun, Jiayang; Zhao, Zhongming
2015-01-01
Motivation: Large-scale cancer genomic studies, such as The Cancer Genome Atlas (TCGA), have profiled multidimensional genomic data, including mutation and expression profiles on a variety of cancer cell types, to uncover the molecular mechanism of cancerogenesis. More than a hundred driver mutations have been characterized that confer the advantage of cell growth. However, how driver mutations regulate the transcriptome to affect cellular functions remains largely unexplored. Differential analysis of gene expression relative to a driver mutation on patient samples could provide us with new insights in understanding driver mutation dysregulation in tumor genome and developing personalized treatment strategies. Results: Here, we introduce the Snowball approach as a highly sensitive statistical analysis method to identify transcriptional signatures that are affected by a recurrent driver mutation. Snowball utilizes a resampling-based approach and combines a distance-based regression framework to assign a robust ranking index of genes based on their aggregated association with the presence of the mutation, and further selects the top significant genes for downstream data analyses or experiments. In our application of the Snowball approach to both synthesized and TCGA data, we demonstrated that it outperforms the standard methods and provides more accurate inferences to the functional effects and transcriptional dysregulation of driver mutations. Availability and implementation: R package and source code are available from CRAN at http://cran.r-project.org/web/packages/DESnowball, and also available at http://bioinfo.mc.vanderbilt.edu/DESnowball/. Contact: zhongming.zhao@vanderbilt.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25192743
Schoenenberger, A W; Erne, P; Ammann, S; Perrig, M; Bürgi, U; Stuck, A E
2008-01-01
Approximate entropy (ApEn) of blood pressure (BP) can be easily measured based on software analysing 24-h ambulatory BP monitoring (ABPM), but the clinical value of this measure is unknown. In a prospective study we investigated whether ApEn of BP predicts, in addition to average and variability of BP, the risk of hypertensive crisis. In 57 patients with known hypertension we measured ApEn, average and variability of systolic and diastolic BP based on 24-h ABPM. Eight of these fifty-seven patients developed hypertensive crisis during follow-up (mean follow-up duration 726 days). In bivariate regression analysis, ApEn of systolic BP (P<0.01), average of systolic BP (P=0.02) and average of diastolic BP (P=0.03) were significant predictors of hypertensive crisis. The incidence rate ratio of hypertensive crisis was 14.0 (95% confidence interval (CI) 1.8, 631.5; P<0.01) for high ApEn of systolic BP as compared to low values. In multivariable regression analysis, ApEn of systolic (P=0.01) and average of diastolic BP (P<0.01) were independent predictors of hypertensive crisis. A combination of these two measures had a positive predictive value of 75%, and a negative predictive value of 91%, respectively. ApEn, combined with other measures of 24-h ABPM, is a potentially powerful predictor of hypertensive crisis. If confirmed in independent samples, these findings have major clinical implications since measures predicting the risk of hypertensive crisis define patients requiring intensive follow-up and intensified therapy.
Li, Cheng-Wei; Chen, Bor-Sen
2016-01-01
Epigenetic and microRNA (miRNA) regulation are associated with carcinogenesis and the development of cancer. By using the available omics data, including those from next-generation sequencing (NGS), genome-wide methylation profiling, candidate integrated genetic and epigenetic network (IGEN) analysis, and drug response genome-wide microarray analysis, we constructed an IGEN system based on three coupling regression models that characterize protein-protein interaction networks (PPINs), gene regulatory networks (GRNs), miRNA regulatory networks (MRNs), and epigenetic regulatory networks (ERNs). By applying system identification method and principal genome-wide network projection (PGNP) to IGEN analysis, we identified the core network biomarkers to investigate bladder carcinogenic mechanisms and design multiple drug combinations for treating bladder cancer with minimal side-effects. The progression of DNA repair and cell proliferation in stage 1 bladder cancer ultimately results not only in the derepression of miR-200a and miR-200b but also in the regulation of the TNF pathway to metastasis-related genes or proteins, cell proliferation, and DNA repair in stage 4 bladder cancer. We designed a multiple drug combination comprising gefitinib, estradiol, yohimbine, and fulvestrant for treating stage 1 bladder cancer with minimal side-effects, and another multiple drug combination comprising gefitinib, estradiol, chlorpromazine, and LY294002 for treating stage 4 bladder cancer with minimal side-effects.
Yang, Shan-Shan; Guo, Wan-Qian; Cao, Guang-Li; Zheng, He-Shan; Ren, Nan-Qi
2012-11-01
This paper offers an effective pretreatment method that can simultaneously achieve excess sludge reduction and bio-hydrogen production from sludge self-fermentation. Batch tests demonstrated that the combined use of ozone/ultrasound pretreatment had an advantage over the individual ozone and ultrasound pretreatments. The optimal condition (ozone dose of 0.158 g O3/g DS and ultrasound energy density of 1.423 W/mL) was recommended by response surface methodology. The maximum hydrogen yield of 9.28 mL H2/g DS was achieved under the optimal condition. According to the kinetic analysis, the highest hydrogen production rate (1.84 mL/h) was also obtained using combined pretreatment, which fitted the predicted equation well (the squared regression statistic was 0.9969). The disintegration degrees (DD) were limited to 19.57% and 46.10% in the individual ozone and ultrasound pretreatments, whereas DD reached 60.88% in the combined pretreatment. The combined ozone/ultrasound pretreatment provides an ideal and environmentally friendly solution to the problem of sludge disposal. Copyright © 2012 Elsevier Ltd. All rights reserved.
GIS-based spatial statistical analysis of risk areas for liver flukes in Surin Province of Thailand.
Rujirakul, Ratana; Ueng-arporn, Naporn; Kaewpitoon, Soraya; Loyd, Ryan J; Kaewthani, Sarochinee; Kaewpitoon, Natthawut
2015-01-01
It is urgently necessary to be aware of the distribution and risk areas of the liver fluke, Opisthorchis viverrini, for proper allocation of prevention and control measures. This study aimed to investigate the human behavior and environmental factors influencing its distribution in Surin Province of Thailand, and to build a model using stepwise multiple regression analysis with a geographic information system (GIS) on environment and climate data. Human behavior and attitudes (<50%; X111), population density (148-169 pop/km2; X73), and land use as wetland (X64) were correlated with the liver fluke disease distribution at the 0.000, 0.034, and 0.006 levels, respectively. The multiple regression equation OV = -0.599 + 0.005 (population density, 148-169 pop/km2; X73) + 0.040 (human attitude, <50%; X111) + 0.022 (land use, wetland; X64) was used to predict the distribution of liver fluke, where OV denotes patients with liver fluke infection (R square = 0.878; adjusted R square = 0.849). By GIS analysis, we found Si Narong, Sangkha, Phanom Dong Rak, Mueang Surin, Non Narai, Samrong Thap, Chumphon Buri, and Rattanaburi to have the highest distributions in Surin province. In conclusion, the combination of GIS and statistical analysis can help simulate the spatial distribution and risk areas of liver fluke, and thus may be an important tool for future planning of prevention and control measures.
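As a worked illustration, the fitted equation reported in this abstract can be evaluated directly. The coefficients are those quoted above; the units and coding of the three predictors are not given in the abstract, so the input values below are purely hypothetical. The helper `adjusted_r2` applies the standard adjusted R-square formula, with a sample size `n` that is an assumption and is not reported in the abstract.

```python
def predict_ov(pop_density_x73, attitude_x111, wetland_x64):
    """Evaluate OV = -0.599 + 0.005*X73 + 0.040*X111 + 0.022*X64."""
    return (-0.599
            + 0.005 * pop_density_x73
            + 0.040 * attitude_x111
            + 0.022 * wetland_x64)


def adjusted_r2(r2, n, n_predictors):
    """Standard adjustment: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1.0 - (1.0 - r2) * (n - 1) / (n - n_predictors - 1)


# Hypothetical predictor values, chosen only to show the arithmetic.
print(predict_ov(pop_density_x73=100, attitude_x111=10, wetland_x64=5))

# n = 17 is an assumed sample size, used here only to exercise the formula.
print(adjusted_r2(0.878, n=17, n_predictors=3))
```

With all predictors at zero the prediction equals the intercept, -0.599, which is a reminder that a linear model of this kind can return values outside the range observed in the data.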
Furukawa, Toshi A; Schramm, Elisabeth; Weitz, Erica S; Salanti, Georgia; Efthimiou, Orestis; Michalak, Johannes; Watanabe, Norio; Cipriani, Andrea; Keller, Martin B; Kocsis, James H; Klein, Daniel N; Cuijpers, Pim
2016-05-04
Despite important advances in psychological and pharmacological treatments of persistent depressive disorders in the past decades, their responses remain typically slow and poor, and differential responses among different modalities of treatments or their combinations are not well understood. Cognitive-Behavioural Analysis System of Psychotherapy (CBASP) is the only psychotherapy that has been specifically designed for chronic depression and has been examined in an increasing number of trials against medications, alone or in combination. When several treatment alternatives are available for a certain condition, network meta-analysis (NMA) provides a powerful tool to examine their relative efficacy by combining all direct and indirect comparisons. Individual participant data (IPD) meta-analysis enables exploration of impacts of individual characteristics that lead to a differentiated approach matching treatments to specific subgroups of patients. We will search for all randomised controlled trials that compared CBASP, pharmacotherapy or their combination, in the treatment of patients with persistent depressive disorder, in Cochrane CENTRAL, PUBMED, SCOPUS and PsycINFO, supplemented by personal contacts. Individual participant data will be sought from the principal investigators of all the identified trials. Our primary outcomes are depression severity as measured on a continuous observer-rated scale for depression, and dropouts for any reason as a proxy measure of overall treatment acceptability. We will conduct a one-step IPD-NMA to compare CBASP, medications and their combinations, and also carry out a meta-regression to identify their prognostic factors and effect moderators. The model will be fitted in OpenBUGS, using vague priors for all location parameters. For the heterogeneity we will use a half-normal prior on the SD. This study requires no ethical approval. We will publish the findings in a peer-reviewed journal. 
The study results will contribute to more finely differentiated therapeutics for patients suffering from this chronically disabling disorder. CRD42016035886. Published by the BMJ Publishing Group Limited.
Plant, soil, and shadow reflectance components of row crops
NASA Technical Reports Server (NTRS)
Richardson, A. J.; Wiegand, C. L.; Gausman, H. W.; Cuellar, J. A.; Gerbermann, A. H.
1975-01-01
Data from the first Earth Resource Technology Satellite (LANDSAT-1) multispectral scanner (MSS) were used to develop three plant canopy models (Kubelka-Munk (K-M), regression, and combined K-M and regression models) for extracting plant, soil, and shadow reflectance components of cropped fields. The combined model gave the best correlation between MSS data and ground truth, by accounting for essentially all of the reflectance of plants, soil, and shadow between crop rows. The principles presented can be used to better forecast crop yield and to estimate acreage.
Multivariate Regression Analysis and Slaughter Livestock,
AGRICULTURE, *ECONOMICS), (*MEAT, PRODUCTION), MULTIVARIATE ANALYSIS, REGRESSION ANALYSIS , ANIMALS, WEIGHT, COSTS, PREDICTIONS, STABILITY, MATHEMATICAL MODELS, STORAGE, BEEF, PORK, FOOD, STATISTICAL DATA, ACCURACY
Fisz, Jacek J
2006-12-07
An optimization approach based on the genetic algorithm (GA) combined with the multiple linear regression (MLR) method is discussed. The GA-MLR optimizer is designed for nonlinear least-squares problems in which the model functions are linear combinations of nonlinear functions. GA optimizes the nonlinear parameters, and the linear parameters are calculated from MLR. GA-MLR is an intuitive optimization approach and it exploits all advantages of the genetic algorithm technique. This optimization method results from an appropriate combination of two well-known optimization methods. The MLR method is embedded in the GA optimizer, and linear and nonlinear model parameters are optimized in parallel. The MLR method is the only strictly mathematical "tool" involved in GA-MLR. The GA-MLR approach considerably simplifies and accelerates the optimization process because the linear parameters are not fitted but calculated directly. Its properties are exemplified by the analysis of the kinetic biexponential fluorescence decay surface corresponding to a two-excited-state interconversion process. A short discussion of the variable projection (VP) algorithm, designed for the same class of optimization problems, is presented. VP is a very advanced mathematical formalism that involves the methods of nonlinear functionals, the algebra of linear projectors, and the formalism of Fréchet derivatives and pseudo-inverses. Additional explanatory comments are added on the application of the recently introduced GA-NR optimizer to the simultaneous recovery of linear and weakly nonlinear parameters occurring in the same optimization problem together with nonlinear parameters. The GA-NR optimizer combines the GA method with the NR method, in which the minimum-value condition for the quadratic approximation to χ², obtained from the Taylor series expansion of χ², is recovered by means of the Newton-Raphson algorithm. 
The application of the GA-NR optimizer to model functions that are multi-linear combinations of nonlinear functions is indicated. The VP algorithm does not distinguish the weakly nonlinear parameters from the nonlinear ones, and it does not apply to model functions that are multi-linear combinations of nonlinear functions.
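The split between nonlinear and linear parameters described in this abstract can be sketched for a biexponential decay y(t) = a1*exp(-t/tau1) + a2*exp(-t/tau2). This is not the author's implementation: the outer search is a bare-bones evolutionary loop (selection plus mutation, no crossover) standing in for a full GA, the linear step solves the 2x2 normal equations by hand, and all names, search ranges, and data are illustrative assumptions.

```python
import math
import random


def basis(t, tau1, tau2):
    """The two nonlinear basis functions evaluated on the time grid."""
    return ([math.exp(-ti / tau1) for ti in t],
            [math.exp(-ti / tau2) for ti in t])


def linear_amplitudes(t, y, tau1, tau2):
    """Least-squares a1, a2 for fixed lifetimes (the 'MLR' step)."""
    f1, f2 = basis(t, tau1, tau2)
    s11 = sum(u * u for u in f1)
    s12 = sum(u * v for u, v in zip(f1, f2))
    s22 = sum(v * v for v in f2)
    b1 = sum(u * yi for u, yi in zip(f1, y))
    b2 = sum(v * yi for v, yi in zip(f2, y))
    det = s11 * s22 - s12 * s12
    if abs(det) <= 1e-12 * s11 * s22:
        return None  # lifetimes (nearly) identical: amplitudes not identifiable
    return (b1 * s22 - b2 * s12) / det, (s11 * b2 - s12 * b1) / det


def chi2(t, y, tau1, tau2):
    """Residual sum of squares after the amplitudes have been solved for."""
    amps = linear_amplitudes(t, y, tau1, tau2)
    if amps is None:
        return math.inf
    f1, f2 = basis(t, tau1, tau2)
    return sum((yi - amps[0] * u - amps[1] * v) ** 2
               for yi, u, v in zip(y, f1, f2))


def evolve(t, y, generations=60, pop_size=30, seed=0):
    """Search only over (tau1, tau2); the better half survives each round."""
    rng = random.Random(seed)
    pop = [(rng.uniform(0.2, 3.0), rng.uniform(3.0, 10.0))
           for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda p: chi2(t, y, *p))
        history.append(chi2(t, y, *pop[0]))
        parents = pop[: pop_size // 2]
        # Multiplicative mutation keeps the lifetimes positive.
        children = [(p[0] * math.exp(rng.gauss(0.0, 0.1)),
                     p[1] * math.exp(rng.gauss(0.0, 0.1))) for p in parents]
        pop = parents + children
    best = min(pop, key=lambda p: chi2(t, y, *p))
    return best, linear_amplitudes(t, y, *best), history


# Hypothetical noise-free decay with a1=3, tau1=1, a2=1, tau2=5.
t = [0.1 * i for i in range(101)]
y = [3.0 * math.exp(-ti / 1.0) + 1.0 * math.exp(-ti / 5.0) for ti in t]

best_taus, best_amps, history = evolve(t, y)
print(best_taus, best_amps, history[-1])
```

Because the amplitudes are recomputed exactly for every candidate pair of lifetimes, the evolutionary search runs in two dimensions instead of four, which is the simplification the abstract attributes to GA-MLR.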
Regression to the Mean Mimicking Changes in Sexual Arousal to Child Stimuli in Pedophiles.
Mokros, Andreas; Habermeyer, Elmar
2016-10-01
The sexual preference for prepubertal children (pedophilia) is generally assumed to be a lifelong condition. Müller et al. (2014) challenged the notion that pedophilia was stable. Using data from phallometric testing, they found that almost half of 40 adult pedophilic men did not show a corresponding arousal pattern at retest. Critics pointed out that regression to the mean and measurement error might account for these results. Müller et al. contested these explanations. The present study shows that regression to the mean in combination with low reliability does indeed provide an exhaustive explanation for the results. Using a statistical model and an estimate of the retest correlation derived from the data, the relative frequency of cases with an allegedly non-pedophilic arousal pattern was shown to be consistent with chance expectation. A bootstrap simulation showed that this outcome was to be expected under a wide range of retest correlations. A re-analysis of the original data from the study by Müller et al. corroborated the assumption of considerable measurement error. Therefore, the original data do not challenge the view that pedophilic sexual preference is stable.
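The argument in this abstract can be illustrated with a short simulation. It is a sketch under stated assumptions, not the authors' model: test and retest scores are taken to be standard normal with correlation `retest_r`, cases are selected because their first score exceeds a cutoff, and the function reports how many of those fall at or below the cutoff on retest although nothing about the underlying trait has changed. The cutoff, sample size, and correlation values are hypothetical.

```python
import math
import random


def fraction_below_at_retest(retest_r, cutoff=0.0, n=2000, seed=0):
    """Share of cases above the cutoff at test that are not above it at retest."""
    rng = random.Random(seed)
    shared = math.sqrt(retest_r)        # weight of the stable true score
    noise = math.sqrt(1.0 - retest_r)   # weight of occasion-specific error
    selected = 0
    below = 0
    for _ in range(n):
        true_score = rng.gauss(0.0, 1.0)
        test = shared * true_score + noise * rng.gauss(0.0, 1.0)
        retest = shared * true_score + noise * rng.gauss(0.0, 1.0)
        if test > cutoff:
            selected += 1
            if retest <= cutoff:
                below += 1
    if selected == 0:
        raise ValueError("no simulated case exceeded the cutoff")
    return below / selected


for r in (1.0, 0.8, 0.5):
    print(r, fraction_below_at_retest(r))
```

With perfect reliability no selected case crosses the cutoff, and the apparent "change" grows as the retest correlation drops, which is the pattern the abstract attributes to regression to the mean combined with low reliability.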
Guisan, Antoine; Edwards, T.C.; Hastie, T.
2002-01-01
An important statistical development of the last 30 years has been the advance in regression analysis provided by generalized linear models (GLMs) and generalized additive models (GAMs). Here we introduce a series of papers prepared within the framework of an international workshop entitled: Advances in GLMs/GAMs modeling: from species distribution to environmental management, held in Riederalp, Switzerland, 6-11 August 2001. We first discuss some general uses of statistical models in ecology, as well as provide a short review of several key examples of the use of GLMs and GAMs in ecological modeling efforts. We next present an overview of GLMs and GAMs, and discuss some of their related statistics used for predictor selection, model diagnostics, and evaluation. Included is a discussion of several new approaches applicable to GLMs and GAMs, such as ridge regression, an alternative to stepwise selection of predictors, and methods for the identification of interactions by a combined use of regression trees and several other approaches. We close with an overview of the papers and how we feel they advance our understanding of their application to ecological modeling. © 2002 Elsevier Science B.V. All rights reserved.
Ebtehaj, Isa; Bonakdari, Hossein
2016-01-01
Sediment transport without deposition is an essential consideration in the optimum design of sewer pipes. In this study, a novel method based on a combination of support vector regression (SVR) and the firefly algorithm (FFA) is proposed to predict the minimum velocity required to avoid sediment settling in pipe channels, which is expressed as the densimetric Froude number (Fr). The efficiency of support vector machine (SVM) models depends on the suitable selection of SVM parameters. In this study, FFA is used to determine these SVM parameters. The parameters that actually affect the Fr calculation are identified by dimensional analysis. The different dimensionless variables are introduced along with the models. The best performance is attributed to the model that employs the sediment volumetric concentration (C(V)), ratio of relative median diameter of particles to hydraulic radius (d/R), dimensionless particle number (D(gr)) and overall sediment friction factor (λ(s)) parameters to estimate Fr. The performance of the SVR-FFA model is compared with genetic programming, artificial neural network and existing regression-based equations. The results indicate the superior performance of SVR-FFA (mean absolute percentage error = 2.123%; root mean square error = 0.116) compared with other methods.
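The two error measures quoted in this abstract follow standard definitions, sketched below. The function names and the sample values are illustrative only and are not taken from the study.

```python
import math


def mape(observed, predicted):
    """Mean absolute percentage error, in percent (observed values must be nonzero)."""
    return 100.0 * sum(abs((o - p) / o)
                       for o, p in zip(observed, predicted)) / len(observed)


def rmse(observed, predicted):
    """Root mean square error, in the units of the observations."""
    return math.sqrt(sum((o - p) ** 2
                         for o, p in zip(observed, predicted)) / len(observed))


# Hypothetical observed and predicted values.
obs = [100.0, 200.0]
pred = [110.0, 180.0]
print(mape(obs, pred), rmse(obs, pred))
```

MAPE is scale-free, while RMSE keeps the units of the predicted quantity and penalizes large errors more heavily, so the two are usually reported together as in the abstract.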
NASA Astrophysics Data System (ADS)
Xin, Pei; Wang, Shen S. J.; Shen, Chengji; Zhang, Zeyu; Lu, Chunhui; Li, Ling
2018-03-01
Shallow groundwater interacts strongly with surface water across a quarter of global land area, affecting significantly the terrestrial eco-hydrology and biogeochemistry. We examined groundwater behavior subjected to unimodal impulse and irregular surface water fluctuations, combining physical experiments, numerical simulations, and functional data analysis. Both the experiments and numerical simulations demonstrated a damped and delayed response of groundwater table to surface water fluctuations. To quantify this hysteretic shallow groundwater behavior, we developed a regression model with the Gamma distribution functions adopted to account for the dependence of groundwater behavior on antecedent surface water conditions. The regression model fits and predicts well the groundwater table oscillations resulting from propagation of irregular surface water fluctuations in both laboratory and large-scale aquifers. The coefficients of the Gamma distribution function vary spatially, reflecting the hysteresis effect associated with increased amplitude damping and delay as the fluctuation propagates. The regression model, in a relatively simple functional form, has demonstrated its capacity of reproducing high-order nonlinear effects that underpin the surface water and groundwater interactions. The finding has important implications for understanding and predicting shallow groundwater behavior and associated biogeochemical processes, and will contribute broadly to studies of groundwater-dependent ecology and biogeochemistry.
Optimizing separate phase light hydrocarbon recovery from contaminated unconfined aquifers
NASA Astrophysics Data System (ADS)
Cooper, Grant S.; Peralta, Richard C.; Kaluarachchi, Jagath J.
A modeling approach is presented that optimizes separate phase recovery of light non-aqueous phase liquids (LNAPL) for a single dual-extraction well in a homogeneous, isotropic unconfined aquifer. A simulation/regression/optimization (S/R/O) model is developed to predict, analyze, and optimize the oil recovery process. The approach combines detailed simulation, nonlinear regression, and optimization. The S/R/O model utilizes nonlinear regression equations describing system response to time-varying water pumping and oil skimming. Regression equations are developed for residual oil volume and free oil volume. The S/R/O model determines optimized time-varying (stepwise) pumping rates which minimize residual oil volume and maximize free oil recovery while causing free oil volume to decrease a specified amount. This S/R/O modeling approach implicitly immobilizes the free product plume by reversing the water table gradient while achieving containment. Application to a simple representative problem illustrates the S/R/O model utility for problem analysis and remediation design. When compared with the best steady pumping strategies, the optimal stepwise pumping strategy improves free oil recovery by 11.5% and reduces the amount of residual oil left in the system due to pumping by 15%. The S/R/O model approach offers promise for enhancing the design of free phase LNAPL recovery systems and to help in making cost-effective operation and management decisions for hydrogeologists, engineers, and regulators.
Davis, Kelly D.; Zarit, Steven H.; Moen, Phyllis; Hammer, Leslie B.; Almeida, David M.
2016-01-01
Objectives. Women who combine formal and informal caregiving roles represent a unique, understudied population. In the literature, healthcare employees who simultaneously provide unpaid elder care at home have been referred to as double-duty caregivers. The present study broadens this perspective by examining the psychosocial implications of double-duty child care (child care only), double-duty elder care (elder care only), and triple-duty care (both child care and elder care or “sandwiched” care). Method. Drawing from the Work, Family, and Health Study, we focus on a large sample of women working in nursing homes in the United States (n = 1,399). We use multiple regression analysis and analysis of covariance tests to examine a range of psychosocial implications associated with double- and triple-duty care. Results. Compared with nonfamily caregivers, double-duty child caregivers indicated greater family-to-work conflict and poorer partner relationship quality. Double-duty elder caregivers reported more family-to-work conflict, perceived stress, and psychological distress, whereas triple-duty caregivers indicated poorer psychosocial functioning overall. Discussion. Relative to their counterparts without family caregiving roles, women with combined caregiving roles reported poorer psychosocial well-being. Additional research on women with combined caregiving roles, especially triple-duty caregivers, should be a priority amidst an aging population, older workforce, and growing number of working caregivers. PMID:25271309
Zhou, Jin J.; Cho, Michael H.; Lange, Christoph; Lutz, Sharon; Silverman, Edwin K.; Laird, Nan M.
2015-01-01
Many correlated disease variables are analyzed jointly in genetic studies in the hope of increasing power to detect causal genetic variants. One approach involves assessing the relationship between each phenotype and each single nucleotide polymorphism (SNP) individually and using a Bonferroni correction for the effective number of tests conducted. Alternatively, one can apply a multivariate regression or a dimension reduction technique, such as principal component analysis (PCA), and test for the association with the principal components (PC) of the phenotypes rather than the individual phenotypes. Inspired by the previous approaches of combining phenotypes to maximize heritability at individual SNPs, in this paper, we propose to construct a maximally heritable phenotype (MaxH) by taking advantage of the estimated total heritability and co-heritability. The heritability and co-heritability only need to be estimated once, therefore our method is applicable to genome-wide scans. MaxH phenotype is a linear combination of the individual phenotypes with increased heritability and power over the phenotypes being combined. Simulations show that the heritability and power achieved agree well with the theory for large samples and two phenotypes. We compare our approach with commonly used methods and assess both the heritability and the power of the MaxH phenotype. Moreover we provide suggestions for how to choose the phenotypes for combination. An application of our approach to a COPD genome-wide association study shows the practical relevance. PMID:26111731
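The idea of a maximally heritable linear combination can be sketched for two phenotypes. The sketch assumes that the heritability of a combination w is the ratio of its genetic variance to its phenotypic variance, h2(w) = (w' G w) / (w' P w), so that the best weights are the leading eigenvector of inverse(P) G; whether the paper's estimator takes exactly this form is an assumption. The matrices G and P below are invented for illustration.

```python
import math


def quad_form(w, m):
    """w' M w for a symmetric 2x2 matrix M."""
    return (w[0] * w[0] * m[0][0]
            + 2.0 * w[0] * w[1] * m[0][1]
            + w[1] * w[1] * m[1][1])


def heritability(w, g, p):
    """Genetic over phenotypic variance of the combination w."""
    return quad_form(w, g) / quad_form(w, p)


def max_heritable_weights(g, p):
    """Leading eigenpair of inverse(P) G for symmetric 2x2 inputs."""
    det_p = p[0][0] * p[1][1] - p[0][1] * p[0][1]
    if det_p <= 0.0:
        raise ValueError("P must be positive definite")
    m11 = (p[1][1] * g[0][0] - p[0][1] * g[0][1]) / det_p
    m12 = (p[1][1] * g[0][1] - p[0][1] * g[1][1]) / det_p
    m21 = (p[0][0] * g[0][1] - p[0][1] * g[0][0]) / det_p
    m22 = (p[0][0] * g[1][1] - p[0][1] * g[0][1]) / det_p
    trace = m11 + m22
    det_m = m11 * m22 - m12 * m21
    lam = 0.5 * (trace + math.sqrt(max(trace * trace - 4.0 * det_m, 0.0)))
    w = (m12, lam - m11)
    if abs(w[0]) + abs(w[1]) < 1e-15:
        w = (lam - m22, m21)
    if abs(w[0]) + abs(w[1]) < 1e-15:
        w = (1.0, 0.0)  # inverse(P) G is a multiple of the identity
    return w, lam


# Hypothetical genetic (G) and phenotypic (P) covariance matrices.
G = [[0.4, 0.1], [0.1, 0.3]]
P = [[1.0, 0.5], [0.5, 1.0]]

weights, h2_max = max_heritable_weights(G, P)
print(weights, h2_max, heritability(weights, G, P))
```

In this toy case the individual heritabilities are 0.4 and 0.3, and the combined phenotype exceeds both, which matches the abstract's claim of increased heritability over the phenotypes being combined.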
Raines, G.L.; Mihalasky, M.J.
2002-01-01
The U.S. Geological Survey (USGS) is proposing to conduct a global mineral-resource assessment using geologic maps, significant deposits, and exploration history as minimal data requirements. Using a geologic map and locations of significant pluton-related deposits, the pluton-related-deposit tract maps from the USGS national mineral-resource assessment have been reproduced with GIS-based analysis and modeling techniques. Agreement, kappa, and Jaccard's C correlation statistics between the expert USGS and calculated tract maps of 87%, 40%, and 28%, respectively, have been achieved using a combination of weights-of-evidence and weighted logistic regression methods. Between the experts' and calculated maps, the ranking of states measured by total permissive area correlates at 84%. The disagreement between the experts and calculated results can be explained primarily by tracts defined by geophysical evidence not considered in the calculations, generalization of tracts by the experts, differences in map scales, and the experts' inclusion of large tracts that are arguably not permissive. This analysis shows that tracts for regional mineral-resource assessment approximating those delineated by USGS experts can be calculated using weights of evidence and weighted logistic regression, a geologic map, and the location of significant deposits. Weights of evidence and weighted logistic regression applied to a global geologic map could provide quickly a useful reconnaissance definition of tracts for mineral assessment that is tied to the data and is reproducible. © 2002 International Association for Mathematical Geology.
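The three map-comparison statistics named in this abstract can be sketched for two binary (permissive or not) rasters flattened into lists. The definitions used here are the common ones: simple agreement, Cohen's kappa, and the Jaccard coefficient computed on the cells flagged by either map. Whether "Jaccard's C" in the paper matches this exact definition is an assumption, and the cell values are invented.

```python
def map_agreement(expert, calculated):
    """Return (agreement, kappa, jaccard) for two equal-length binary maps."""
    if len(expert) != len(calculated) or not expert:
        raise ValueError("maps must be non-empty and the same length")
    n = len(expert)
    both = sum(1 for e, c in zip(expert, calculated) if e and c)
    only_expert = sum(1 for e, c in zip(expert, calculated) if e and not c)
    only_calc = sum(1 for e, c in zip(expert, calculated) if c and not e)
    neither = n - both - only_expert - only_calc

    observed = (both + neither) / n
    p_expert = (both + only_expert) / n
    p_calc = (both + only_calc) / n
    # Agreement expected by chance if the two maps were independent.
    expected = p_expert * p_calc + (1.0 - p_expert) * (1.0 - p_calc)
    kappa = (observed - expected) / (1.0 - expected)   # undefined if expected == 1
    jaccard = both / (both + only_expert + only_calc)  # undefined if no cell is flagged
    return observed, kappa, jaccard


# Hypothetical 8-cell maps: 1 = permissive tract, 0 = not permissive.
expert_map = [1, 1, 0, 0, 1, 0, 0, 0]
calculated_map = [1, 0, 0, 0, 1, 1, 0, 0]
print(map_agreement(expert_map, calculated_map))
```

Agreement counts the jointly unflagged cells as matches, whereas the Jaccard coefficient ignores them, which is one reason the three figures reported in the abstract can differ so widely for the same pair of maps.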
Protective Effect of HLA-DQB1 Alleles Against Alloimmunization in Patients with Sickle Cell Disease
Tatari-Calderone, Zohreh; Gordish-Dressman, Heather; Fasano, Ross; Riggs, Michael; Fortier, Catherine; Campbell, Andrew D.; Charron, Dominique; Gordeuk, Victor R.; Luban, Naomi L.C.; Vukmanovic, Stanislav; Tamouza, Ryad
2015-01-01
Background: Alloimmunization, or the development of alloantibodies to red blood cell (RBC) antigens, is considered one of the major complications after RBC transfusions in patients with sickle cell disease (SCD) and can lead to both acute and delayed hemolytic reactions. It has been suggested that polymorphisms in HLA genes may play a role in alloimmunization. We conducted a retrospective study analyzing the influence of HLA-DRB1 and DQB1 genetic diversity on RBC alloimmunization. Study design: Two hundred four multi-transfused SCD patients with and without RBC alloimmunization were typed at low/medium resolution by PCR-SSO, using the IMGT-HLA Database. HLA-DRB1 and DQB1 allele frequencies were analyzed using logistic regression models, and a global p-value was calculated using multiple logistic regression. Results: While only trends towards associations between HLA-DR diversity and alloimmunization were observed, analysis of HLA-DQ showed that HLA-DQ2 (p=0.02), -DQ3 (p=0.02) and -DQ5 (p=0.01) alleles were significantly more frequent in non-alloimmunized patients, likely behaving as protective alleles. In addition, multiple logistic regression analysis showed that both the HLA-DQ2/6 (p=0.01) and HLA-DQ5/5 (p=0.03) combinations constitute additional predictors of protective status. Conclusion: Our data suggest that particular HLA-DQ alleles influence the clinical course of RBC transfusion in patients with SCD, which could pave the way towards predictive strategies. PMID:26476208
Tse, Samson; Davidson, Larry; Chung, Ka-Fai; Yu, Chong Ho; Ng, King Lam; Tsoi, Emily
2015-02-01
More mental health services are adopting the recovery paradigm. This study adds to prior research by (a) using measures of stages of recovery and elements of recovery that were designed and validated in a non-Western, Chinese culture and (b) testing which demographic factors predict advanced recovery and whether placing importance on certain elements predicts advanced recovery. We examined recovery and factors associated with recovery among 75 Hong Kong adults who were diagnosed with schizophrenia and assessed to be in clinical remission. Data were collected on socio-demographic factors, recovery stages and elements associated with recovery. Logistic regression analysis was used to identify the variables that best predicted stages of recovery. Receiver operating characteristic curves were used to assess the classification accuracy of the model (i.e. rates of correct classification of stages of recovery). Logistic regression results indicated that stages of recovery could be distinguished with reasonable accuracy for Stage 3 ('living with disability', classification accuracy = 75.45%) and Stage 4 ('living beyond disability', classification accuracy = 75.50%). However, there was insufficient information to predict Combined Stages 1 and 2 ('overwhelmed by disability' and 'struggling with disability'). Having a meaningful role and age were found to be the most important differentiators of recovery stage. Preliminary findings suggest that personally adopting salient life roles is important to recovery and that this component should be incorporated into mental health services. © The Author(s) 2014.
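As a rough illustration of how a receiver operating characteristic analysis summarizes a classifier, the sketch below computes the area under the ROC curve as the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case. The labels and scores are hypothetical, and the study itself reports classification accuracy rather than this exact statistic.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    # Count positive/negative pairs that the scores rank correctly.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical predicted probabilities from a logistic regression model;
# label 1 marks membership in the recovery stage of interest.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
print(auc)
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of the two groups.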